add NIC_TRAFFIC_CLASS and NIC_SERVICE_LEVEL env vars for DSCP marking#315

Merged: nileshnegi merged 5 commits into ROCm:candidate from paklui:candidate-tc on May 26, 2026
@paklui paklui commented May 26, 2026

Motivation

On the latest cluster, network traffic must carry a specific DSCP value so that it matches the QoS policy configured on the NICs and switches.

Technical Details

Adds support for marking RoCE/IB traffic with specific DSCP/QoS values.

  • NIC_TRAFFIC_CLASS (default=0): sets the DSCP/traffic class byte in the RoCE GRH (grh.traffic_class) when transitioning QPs to the RTR state. This is the equivalent of NCCL_IB_TC in NCCL/RCCL.
  • NIC_SERVICE_LEVEL (default=0): sets the IB service level (ah_attr.sl) on QPs. This applies to both IB and RoCE connections. This is the equivalent of NCCL_IB_SL in NCCL/RCCL.
  • NicOptions: adds uint8_t serviceLevel and uint8_t trafficClass fields.
  • TransitionQpToRtr(): accepts trafficClass and serviceLevel as parameters; sets grh.traffic_class (RoCE only) and ah_attr.sl (all QP types).
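
For readers unfamiliar with where these two values land, here is a minimal sketch of the assignment described above. It is not the code from this PR: the structs are simplified stand-ins that only mirror the field names mentioned here (the real fields live in ibv_qp_attr from libibverbs), and ApplyQos is a hypothetical helper name. The range check on the service level is an added assumption based on the IB SL being a 4-bit field.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in types (assumption): simplified mirrors of the libibverbs fields
// named in the PR description. Real code would use ibv_qp_attr instead.
struct GlobalRoute {
  uint8_t traffic_class = 0;  // DSCP/traffic class byte carried in the GRH
};
struct AddressHandleAttr {
  GlobalRoute grh;
  uint8_t sl = 0;             // IB service level
  uint8_t is_global = 0;      // non-zero when a GRH is present
};
struct QpAttr {
  AddressHandleAttr ah_attr;
};

// Hypothetical helper: applies the two QoS values the way the description
// says TransitionQpToRtr() does. The service level is set for every QP,
// the traffic class only when the link is RoCE.
inline void ApplyQos(QpAttr& attr, bool isRoCE,
                     uint8_t trafficClass, uint8_t serviceLevel)
{
  assert(serviceLevel <= 15);  // assumption: reject values outside 4 bits
  attr.ah_attr.sl = serviceLevel;
  if (isRoCE) {
    attr.ah_attr.is_global = 1;  // RoCE addressing uses the GRH
    attr.ah_attr.grh.traffic_class = trafficClass;
  }
}
```

In the real code path the populated attribute struct would then be passed to ibv_modify_qp() as part of the RTR transition.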

Test Plan

Ran TransferBench with nicp2p as a test, using a non-default traffic class:
NIC_TRAFFIC_CLASS=128 ./TransferBench nicp2p
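
A note on the value used above, in case it helps when matching switch configuration: assuming the usual layout where the traffic class byte carries the DSCP code point in its upper six bits and ECN in its lower two, a traffic class of 128 corresponds to DSCP 32. The helper names below are hypothetical and only illustrate the arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers: convert between the 8-bit traffic class byte and
// its DSCP (upper 6 bits) and ECN (lower 2 bits) components.
inline uint8_t TrafficClassFromDscp(uint8_t dscp, uint8_t ecn = 0)
{
  assert(dscp < 64 && ecn < 4);  // DSCP is 6 bits, ECN is 2 bits
  return static_cast<uint8_t>((dscp << 2) | ecn);
}

inline uint8_t DscpFromTrafficClass(uint8_t tc)
{
  return static_cast<uint8_t>(tc >> 2);
}

inline uint8_t EcnFromTrafficClass(uint8_t tc)
{
  return static_cast<uint8_t>(tc & 0x3);
}
```

So a switch policy that matches on DSCP 32 would need NIC_TRAFFIC_CLASS=128, not 32.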

Test Result

The run completed and appears to use the specified traffic class, as expected.

Submission Checklist

Copilot AI review requested due to automatic review settings May 26, 2026 18:23
@paklui paklui requested a review from a team as a code owner May 26, 2026 18:23

Copilot AI left a comment


Pull request overview

Adds NIC QoS controls to TransferBench’s NIC executor by exposing DSCP/traffic class and IB service level configuration via environment variables and wiring them into QP transition to RTR.

Changes:

  • Added cfg.nic.serviceLevel and cfg.nic.trafficClass fields and included them in cross-rank NIC config consistency checks.
  • Added env var parsing + reporting for NIC_SERVICE_LEVEL and NIC_TRAFFIC_CLASS, and mapped them into ConfigOptions.
  • Extended TransitionQpToRtr() to apply grh.traffic_class (RoCE) and ah_attr.sl (all QPs).
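
As an illustration of the env var parsing step, here is a minimal sketch of reading a small unsigned value from the environment with a range check. This is not the code in EnvVars.hpp: ParseUint8Env is a hypothetical helper, and the upper bounds a caller would pass (255 for the traffic class byte, 15 for the service level) are assumptions about sensible limits rather than something stated in this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: reads an integer env var into a uint8_t.
// Returns false if the value is malformed or outside [0, maxVal];
// falls back to defaultVal when the variable is unset or empty.
inline bool ParseUint8Env(const char* name, uint8_t defaultVal,
                          uint8_t maxVal, uint8_t& out)
{
  assert(name != nullptr);
  const char* raw = std::getenv(name);
  if (raw == nullptr || *raw == '\0') {
    out = defaultVal;
    return true;
  }
  char* end = nullptr;
  // Base 0 accepts decimal, hex (0x80), and octal (0200) spellings
  long value = std::strtol(raw, &end, 0);
  if (end == raw || *end != '\0' || value < 0 || value > maxVal) {
    return false;
  }
  out = static_cast<uint8_t>(value);
  return true;
}
```

Rejecting out-of-range input up front avoids silently truncating a value such as 300 when it is stored in a uint8_t field.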

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/header/TransferBench.hpp | Adds NIC options fields and applies them during QP RTR transition. |
| src/client/EnvVars.hpp | Adds env vars to configure/report NIC service level and traffic class, and maps them into config. |

Comment thread src/client/EnvVars.hpp
Comment thread src/client/EnvVars.hpp
Comment thread src/header/TransferBench.hpp
@paklui paklui requested a review from a team as a code owner May 26, 2026 18:26
Copilot AI review requested due to automatic review settings May 26, 2026 19:50

Copilot AI left a comment


Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

Comment thread src/client/EnvVars.hpp Outdated
Comment thread src/client/EnvVars.hpp Outdated
Comment thread src/client/EnvVars.hpp Outdated
Copilot AI review requested due to automatic review settings May 26, 2026 20:28

Copilot AI left a comment


Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

Comment thread src/client/EnvVars.hpp
Comment thread src/header/TransferBench.hpp
@nileshnegi nileshnegi merged commit 7e8b1bb into ROCm:candidate May 26, 2026
10 checks passed
@nileshnegi nileshnegi mentioned this pull request Jun 1, 2026
1 task