New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
core: Add support for traffic class selection #5148
Conversation
@shefty Who else should we add to this PR as reviewers? This is a first pass at traffic class support for libfabric based on the previous OFIWG meetings. |
I would post an email to the ofiwg mail list and point people to this PR. Then, anyone interested can come have a look. |
Done |
I do plan to squash this into a single commit when review is complete. The small commits are meant to show how the feedback is addressed until then. Just a heads up. |
I'm basically reviewing the merged result only, so whichever is easier for you. I agree this should be squashed when merging. |
a47a04f
to
9822769
Compare
This commit introduces the capability to select a traffic class when creating libfabric objects with the goal to allow applications to utilize traffic shaping mechanisms exposed to providers. In this proposal, the fi_domain, fi_tx_attr, and fi_rx_attributes are modified to add the new fi_traffic_class structure to the structure definition. The fi_traffic_class structure provides a 'type' field, indicating the type of traffic class value that is represented in the 'value' field of the structure. Two different types of traffic class are presented with this commit; FI_TC_TYPE_LABEL, and FI_TC_TYPE_DSCP. Libfabric defines generalized traffic classes via community defined labels and expected behavior for those labels, which fall under the category of FI_TC_TYPE_LABEL. Providers may also define provider specific labels utilizing the most significant 32 bits of the value field. To provide applications with greater control over traffic class selection, this proposal exposes the ability to specify a specify DSCP value to be associated with libfabric objects, using the value field set to the DSCP value, and type set to FI_TC_TYPE_DSCP. Additionally, to simplify the traffic class selection process, this commit introduces a heirarchal inheritance model with the fi_traffic_class structure. When an fi_domain is created, the application can choose to select a traffic class, or allow the provider to select a default traffic class. The traffic class associated with the domain is used as the default for all objects created within the domain. This allows an application to request a general traffic class for most traffic flows, while allowing the application to finely control traffic at the granularity of the tx context or rx context. The original proposal to OFIWG did not include the rx context modification. However, multi-part data transfer operations such as rendezvous messaging benefit from the differentiation of request and response traffic shaping flows. The traffic class field in the rx attribute is only intended to apply to these multi-part data transfer operations and does not imply any relation with the tx attribute of a peer endpoint. Signed-off-by: James Swaro <jswaro@cray.com>
4bd4466
to
847d697
Compare
@shefty Ready for another round of reviews. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this is close. thanks
FYI - I'm pulling this into my tree to make some updates and will open a new PR. I want to get this merged soon, so it's ready before 1.9rc1. |
Before you merge it, I'll squash that last commit. |
I was going to squash it.
|
Ok, all yours. Would you mind copying me to the new PR? |
Yes. I plan on tagging you and requesting your review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See PR #5334. Posting comments that I had written as part of review. Those have been addressed in the updated PR.
info->domain_attr.traffic_class = FI_TC_DSCP_SET(0x2e); | ||
ret = fi_domain(fabric, info, &domain, context); | ||
assert(ret == FI_SUCCESS); | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a sizeable amount of example code. It would be better to have a fabtests example than this much code added to the man page for a single field in the attribute structures. This is particularly true since the use of the field is straightforward and aligns with other attribute fields.
traffic is considered to be undifferentiated. | ||
|
||
Traffic classes can be any DSCP value, any of the following defined classes of | ||
traffic, or any provider-specific definitions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of FI_TC_DSCP_SET/GET need to be defined here.
size_t size; | ||
size_t iov_limit; | ||
size_t rma_iov_limit; | ||
uint32_t tclass; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a 1.2 field
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm slightly confused. Perhaps you can walk me through this, but I thought 1.2 referred to the ABI version, not the API version.
size_t rx_ctx_cnt; | ||
size_t auth_key_size; | ||
uint8_t *auth_key; | ||
uint32_t tclass; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a 1.2 field
size_t auth_key_size; | ||
size_t max_err_data; | ||
size_t mr_cnt; | ||
uint32_t tclass; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not a 1.2 field. All of the above structures should be the exact structure that was defined prior to any changes being made. You can use the 1.8 release structures, for example.
traffic, or any provider-specific definitions. | ||
|
||
All values other than 0 and less than FI_TC_LABEL are treated as DSCP values in | ||
the traffic class field. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this statement is true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, we changed that. Nice catch.
|
||
*FI_TC_REALTIME_STREAM* | ||
: This class is used for data that requires dedicated bandwidth and little | ||
jitter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be removed, since it was removed from the API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
@@ -316,6 +316,22 @@ enum { | |||
FI_PROTO_EFA | |||
}; | |||
|
|||
#define FI_TC_PROV_SPECIFIC (1UL << 31) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FI_PROV_SPECIFIC is already defined by the API. No need for an overlapping definition.
I think we can close this and update 5334. I left the comments here, so you could see why I made the changes I did in 5334. Everything has been addressed there. So please review that PR and respond with any issues. |
Closed. Follow up with #5334 |
This commit introduces the capability to select a traffic class when
creating libfabric objects with the goal to allow applications to
utilize traffic shaping mechanisms exposed to providers.
In this proposal, the fi_domain, fi_tx_attr, and fi_rx_attributes are
modified to add the new fi_traffic_class structure to the structure definition.
The fi_traffic_class structure provides a 'type' field, indicating the type of
traffic class value that is represented in the 'value' field of the structure.
Two different types of traffic class are presented with this commit; FI_TC_TYPE_LABEL,
and FI_TC_TYPE_DSCP. Libfabric defines generalized traffic classes via community
defined labels and expected behavior for those labels, which fall under the category
of FI_TC_TYPE_LABEL. Providers may also define provider specific labels utilizing the
most significant 32 bits of the value field. To provide applications with greater
control over traffic class selection, this proposal exposes the ability to specify
a specify DSCP value to be associated with libfabric objects, using the value field set
to the DSCP value, and type set to FI_TC_TYPE_DSCP.
Additionally, to simplify the traffic class selection process, this commit introduces
a heirarchal inheritance model with the fi_traffic_class structure. When an fi_domain
is created, the application can choose to select a traffic class, or allow the provider
to select a default traffic class. The traffic class associated with the domain is used
as the default for all objects created within the domain. This allows an
application to request a general traffic class for most traffic flows, while
allowing the application to finely control traffic at the granularity of the tx context
or rx context.
The original proposal to OFIWG did not include the rx context modification. However,
multi-part data transfer operations such as rendezvous messaging benefit from the differentiation of
request and response traffic shaping flows. The traffic class field in the rx attribute is
only intended to apply to these multi-part data transfer operations and does not imply any
relation with the tx attribute of a peer endpoint.
Signed-off-by: James Swaro jswaro@cray.com