fix: disable CDI in GPU Operator for dynamo inference recipes #134

Merged
mchmarny merged 2 commits into NVIDIA:main from yuanchen8911:fix/disable-cdi-gpu-operator
Feb 17, 2026

@yuanchen8911
Contributor

Summary

  • Disable CDI (Container Device Interface) in GPU Operator for both Kind and EKS dynamo inference recipe overlays
  • With CDI enabled, the KAI scheduler binder injects CDI device annotations that fail to resolve on some nodes, which prevents proper GPU resource allocation
  • Add gpu-operator.cdi.enabled: false and gpu-operator.cdi.default: false to the inference overlays
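
As a rough illustration of the change described above, the two settings might look like this in an overlay file. This is only a sketch: the nesting is inferred from the dotted key paths in the summary, and the actual overlay schema and file layout in the repository may differ.

```yaml
# Hypothetical overlay fragment. The surrounding file structure is an
# assumption; only the two cdi keys and their values come from this PR.
gpu-operator:
  cdi:
    enabled: false   # turn off CDI support in the GPU Operator
    default: false   # do not use CDI as the default device-injection mode
```

Per the summary, the same pair of values is applied to both the Kind and the EKS dynamo inference overlays.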

Test plan

  • make lint passes
  • KWOK tests pass with updated overlays
  • Deploy dynamo inference recipe on Kind/EKS and verify GPU pods schedule correctly

🤖 Generated with Claude Code

@yuanchen8911 yuanchen8911 requested a review from a team as a code owner February 17, 2026 23:31
GPU Operator >= v25.10.0 with cdi.enabled=true causes the KAI
scheduler operator to set --cdi-enabled=true on the binder, which
injects management.nvidia.com CDI device annotations. These fail to
resolve on nodes where the CDI spec lacks UUID-based entries.
Disable CDI in both EKS and Kind dynamo overlay recipes.

Signed-off-by: yuanchen97@gmail.com
@yuanchen8911 yuanchen8911 force-pushed the fix/disable-cdi-gpu-operator branch from f2a1829 to 2c3a45a Compare February 17, 2026 23:43
Member

@mchmarny mchmarny left a comment

/lgtm

@mchmarny mchmarny merged commit c8b61c0 into NVIDIA:main Feb 17, 2026
8 of 10 checks passed
@mchmarny mchmarny deleted the fix/disable-cdi-gpu-operator branch February 17, 2026 23:55