docs: document gpu.clique relationship and non-MNNVL topology source #255
Adds a 'How Topograph Fits in the Kubernetes Topology Stack' section after the Workflow overview, clarifying that Topograph (inter-node network topology discovery) and the kubelet Topology Manager (intra-node NUMA alignment) operate at different scopes and are complementary. Includes a Mermaid architecture diagram. Signed-off-by: Rob Esker <resker@nvidia.com>
The kubelet Topology Manager is a Kubernetes-specific concern and is only relevant to readers of the k8s engine documentation. Moving the comparison table and Mermaid diagram from README to docs/engines/k8s.md, immediately after the label overview, where the context is clearest. Signed-off-by: Rob Esker <resker@nvidia.com>
…/provider docs
- k8s.md: add section on relationship to nvidia.com/gpu.clique (IB provider produces same ClusterUUID.CliqueId value; gpu.clique absent on non-MNNVL systems, making Topograph the only topology source)
- k8s.md: add Mixed Workload Considerations section explaining topology fragmentation when topology-insensitive workloads consume nodes alongside distributed training
- infiniband.md: note accelerator value format (ClusterUUID.CliqueId) matches nvidia.com/gpu.clique in both bm and k8s variants
- netq.md: note NMX DomainUUID differs in format from gpu.clique
- README.md: add sentence on non-MNNVL systems where gpu.clique is absent and Topograph is the only topology source
Signed-off-by: Rob Esker <resker@nvidia.com>
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #255 +/- ##
=======================================
Coverage 68.46% 68.46%
=======================================
Files 82 82
Lines 4842 4842
=======================================
Hits 3315 3315
Misses 1395 1395
Partials 132 132
Greptile Summary
This PR adds documentation clarifying the relationship between Topograph's …
Confidence Score: 5/5
Safe to merge: documentation-only PR with all technical claims verified against the source code.
All four changed files are documentation. Key technical claims (IB provider producing ClusterUUID.CliqueId, NetQ using DomainUUID, topology.KeyNodeClusterID annotation key) were verified directly against bm.go, k8s.go, nmx.go, and topology.go. The gpu.clique value format (ClusterUUID.CliqueId with dot separator) was confirmed via external NVIDIA documentation. No code paths are affected. No P0 or P1 findings.
No files require special attention.
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
HW["GPU Hardware\n(nvidia-smi: ClusterUUID + CliqueId)"]
subgraph MNNVL["MNNVL Systems (GB200 NVL72)"]
DRA["DRA Provider\nreads gpu.clique"]
IBMNNVL["InfiniBand Provider\n(if used)"]
GPU_CLIQUE["nvidia.com/gpu.clique\n= ClusterUUID.CliqueId"]
ACC_MNNVL["accelerator label\n= ClusterUUID.CliqueId"]
end
subgraph NonMNNVL["Non-MNNVL Systems (DGX B200/B300)"]
IB["InfiniBand Provider\n(only topology source)"]
NO_GPU_CLIQUE["nvidia.com/gpu.clique\nnot set"]
ACC_NON["accelerator label\n= ClusterUUID.CliqueId"]
end
subgraph NetQFlow["NetQ / MNNVL via NMX"]
NMX["NetQ NMX API\nDomainUUID"]
ACC_NETQ["accelerator label\n= DomainUUID\n(different format)"]
end
HW -->|"GPU Operator reads"| GPU_CLIQUE
HW -->|"IB provider reads"| IBMNNVL
HW -->|"IB provider reads"| IB
DRA --> GPU_CLIQUE
IBMNNVL --> ACC_MNNVL
GPU_CLIQUE -.->|"same value, correlatable"| ACC_MNNVL
IB --> ACC_NON
IB --> NO_GPU_CLIQUE
NMX --> ACC_NETQ
ACC_NETQ -.->|"same physical domain,\ndifferent string value"| ACC_NON
Summary
Adds technical context clarifying how Topograph's network topology labels relate to nvidia.com/gpu.clique (set by the GPU Operator device plugin) and why Topograph is often the only source of topology on non-MNNVL GPU clusters (DGX B200/B300).
Changes
- docs/engines/k8s.md: new subsections:
  - nvidia.com/gpu.clique: explains that the InfiniBand provider's accelerator label value is derived from the same ClusterUUID.CliqueId hardware identifiers used by gpu.clique on MNNVL systems (correlatable). On non-MNNVL systems gpu.clique is absent because the IMEX labeler requires GPU_FABRIC_STATE_COMPLETED, which non-MNNVL GPUs do not reach.
- docs/providers/infiniband.md: notes that the accelerator value format matches gpu.clique in both infiniband-bm and infiniband-k8s variants.
- docs/providers/netq.md: notes that NMX DomainUUID is a distinct identifier from gpu.clique's ClusterUUID.CliqueId; the values are not directly comparable.
- README.md: adds a brief note that on non-MNNVL GPU clusters (DGX B200/B300 SuperPODs), gpu.clique is not set and Topograph with an InfiniBand provider is the only topology source.
Why this matters
For operators evaluating Topograph: it clarifies when Topograph is essential vs. when it complements existing signals. For developers: it documents a non-obvious invariant about where identifier formats match across providers vs. diverge.
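As a rough illustration of the invariant described above, the following Go sketch classifies a node by which topology labels it carries and whether their values agree. It is not taken from the Topograph source: the `classify` helper and its return strings are hypothetical, and the accelerator label key is an assumption that should be checked against docs/engines/k8s.md.

```go
package main

import "fmt"

const (
	// Label set by the GPU Operator device plugin on MNNVL systems.
	gpuCliqueLabel = "nvidia.com/gpu.clique"
	// ASSUMPTION: key of the accelerator label written by Topograph;
	// verify against the k8s engine documentation before relying on it.
	acceleratorLabel = "network.topology.nvidia.com/accelerator"
)

// classify is a hypothetical helper that reports how the two labels
// relate on a single node.
func classify(labels map[string]string) string {
	clique, hasClique := labels[gpuCliqueLabel]
	acc, hasAcc := labels[acceleratorLabel]
	switch {
	case hasClique && hasAcc && clique == acc:
		// MNNVL system with the InfiniBand provider: same ClusterUUID.CliqueId.
		return "correlated"
	case hasClique && hasAcc:
		// For example, NetQ writes a DomainUUID, which uses a different format.
		return "divergent"
	case hasAcc:
		// Non-MNNVL system: gpu.clique is absent, Topograph is the only source.
		return "topograph-only"
	case hasClique:
		return "clique-only"
	default:
		return "none"
	}
}

func main() {
	// Illustrative label values only; real identifiers come from the hardware.
	node := map[string]string{acceleratorLabel: "example-cluster-uuid.1"}
	fmt.Println(classify(node))
}
```

A check like this could help an operator spot nodes where Topograph is the sole topology signal before relying on gpu.clique in scheduling rules.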
Test plan
- gpu.clique value format verified against NVIDIA/k8s-device-plugin/internal/lm/imex.go (IMEX labeler)
- IsFabricAttached() behavior verified against NVIDIA/go-nvlib/pkg/nvlib/device/device.go
- pkg/providers/infiniband/common.go and bm.go (Cluster.ID() returns UUID + "." + cliqueID)
- pkg/providers/netq/nmx.go (DomainUUID source)
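The identifier format checked in the test plan can be sketched in a few lines of Go. This is a minimal illustration, not the code in pkg/providers/infiniband: the `cluster` type, its field names, and the `splitID` helper are hypothetical, and only the `UUID + "." + cliqueID` composition comes from the description above.

```go
package main

import (
	"fmt"
	"strings"
)

// cluster is a hypothetical stand-in for the provider's cluster type.
type cluster struct {
	UUID     string
	CliqueID string
}

// ID joins the two hardware identifiers with a dot, mirroring the
// ClusterUUID.CliqueId format used by nvidia.com/gpu.clique.
func (c cluster) ID() string {
	return c.UUID + "." + c.CliqueID
}

// splitID is a hypothetical inverse of ID. It splits at the last dot,
// assuming the UUID itself contains no dots after that position.
func splitID(id string) (uuid, cliqueID string, ok bool) {
	i := strings.LastIndex(id, ".")
	if i <= 0 || i == len(id)-1 {
		return "", "", false
	}
	return id[:i], id[i+1:], true
}

func main() {
	// Illustrative values only.
	c := cluster{UUID: "example-cluster-uuid", CliqueID: "1"}
	uuid, clique, ok := splitID(c.ID())
	fmt.Println(c.ID(), uuid, clique, ok)
}
```

Splitting at the last dot keeps the helper tolerant of any separators inside the UUID portion, which is a design choice of this sketch rather than documented behavior.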