feat(api): expose runtimeClassName on InferenceServiceSpec (closes #375) #380

Merged
Defilan merged 1 commit into main from feat/issue-375-runtime-class-name
May 2, 2026

@Defilan Defilan commented May 2, 2026

Closes #375.

Summary

Adds RuntimeClassName *string to InferenceServiceSpec. The deployment builder forwards it to PodSpec.RuntimeClassName. It is most commonly set to "nvidia" on clusters where the NVIDIA Container Runtime is not configured as the cluster default.

Originally surfaced via a Discord question yesterday: a community member's GPU pods were scheduling onto the GPU node but never getting the device files bind-mounted. They needed to set runtimeClassName: nvidia on the Pod, and there was no way to plumb that through InferenceServiceSpec. The available workarounds were either reconfiguring containerd globally or running a Kyverno mutating webhook against the inference label selector. Both are clunkier than having the field on the CRD.
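
For readers who want to see the wire shape of the new field, here is a minimal sketch. It assumes a trimmed-down `InferenceServiceSpec` that contains only the field this PR adds (the real type in api/v1alpha1 has more fields), and `marshalSpec` is a hypothetical helper for illustration. It shows why the pointer plus `omitempty` keeps existing manifests unchanged: an unset field is simply absent from the serialized spec.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// InferenceServiceSpec is a trimmed, hypothetical stand-in for the real type in
// api/v1alpha1. Only the field added by this PR is shown.
type InferenceServiceSpec struct {
	// RuntimeClassName names the RuntimeClass the pod should run with,
	// for example "nvidia". A nil pointer means the field was not set.
	RuntimeClassName *string `json:"runtimeClassName,omitempty"`
}

// marshalSpec is an illustrative helper that returns the JSON encoding of spec.
func marshalSpec(spec InferenceServiceSpec) string {
	out, err := json.Marshal(spec)
	if err != nil {
		panic(err) // cannot happen for this simple struct
	}
	return string(out)
}

func main() {
	nvidia := "nvidia"

	// Unset: the key is omitted entirely.
	fmt.Println(marshalSpec(InferenceServiceSpec{})) // {}

	// Set: the key appears with the chosen runtime class.
	fmt.Println(marshalSpec(InferenceServiceSpec{RuntimeClassName: &nvidia})) // {"runtimeClassName":"nvidia"}
}
```

Using a pointer rather than a plain string lets the API distinguish "not set" from an empty string, which matches how `PodSpec.RuntimeClassName` is typed in Kubernetes.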

Changes

| File | Change |
| --- | --- |
| api/v1alpha1/inferenceservice_types.go | Add `` RuntimeClassName *string `json:"runtimeClassName,omitempty"` `` field with usage docs |
| api/v1alpha1/zz_generated.deepcopy.go | make generate |
| config/crd/bases/... | make manifests |
| charts/llmkube/templates/crds/inferenceservices.yaml | make chart-crds |
| internal/controller/deployment_builder.go | One-line `RuntimeClassName: isvc.Spec.RuntimeClassName` in PodSpec construction |
| internal/controller/inferenceservice_deployment_test.go | Two unit tests: set path asserts `*PodSpec.RuntimeClassName == "nvidia"`, unset path asserts `PodSpec.RuntimeClassName == nil` |

Net: +125 lines, no deletions. Field is optional and nil-safe; existing clusters see no behavior change.
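
The forwarding and the two test paths can be sketched as follows. This is an illustration under stated assumptions, not the code from deployment_builder.go: `InferenceServiceSpec` and `PodSpec` are local stand-ins that carry only the one field (the real `PodSpec` is `corev1.PodSpec`), and `buildPodSpec` and `describe` are hypothetical names.

```go
package main

import "fmt"

// InferenceServiceSpec is a hypothetical stand-in for the CRD spec type;
// only the field relevant to this PR is included.
type InferenceServiceSpec struct {
	RuntimeClassName *string
}

// PodSpec is a local stand-in that mirrors only the RuntimeClassName field
// of corev1.PodSpec, so the sketch runs without Kubernetes dependencies.
type PodSpec struct {
	RuntimeClassName *string
}

// buildPodSpec is a hypothetical name for the builder step. It copies the
// pointer through unchanged, so a nil input yields a nil output.
func buildPodSpec(spec InferenceServiceSpec) PodSpec {
	return PodSpec{
		RuntimeClassName: spec.RuntimeClassName,
	}
}

// describe renders the runtime class for display without dereferencing nil.
func describe(p PodSpec) string {
	if p.RuntimeClassName == nil {
		return "<cluster default>"
	}
	return *p.RuntimeClassName
}

func main() {
	nvidia := "nvidia"

	// Set path: the value reaches the pod spec.
	fmt.Println(describe(buildPodSpec(InferenceServiceSpec{RuntimeClassName: &nvidia})))

	// Unset path: the pod spec field stays nil, so the cluster default applies.
	fmt.Println(describe(buildPodSpec(InferenceServiceSpec{})))
}
```

Because the pointer is passed through as-is, no nil check is needed in the builder itself; only code that dereferences the value has to guard against nil.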

Why now

Tagged for the 0.7.6 release window (the same window that closed #374 via PR #376). It's a small, isolated, user-visible feature that closes a real Discord-reported gap. Doesn't overlap with any other open PR (#340 lives in runtime_*.go; this lives in deployment_builder.go).

Out of scope (deferred to follow-ups if needed)

  • Validating that the named RuntimeClass actually exists in the cluster — Kubernetes returns a clear error at pod admission, no need to duplicate
  • Auto-detecting the cluster's GPU runtime configuration — users should know their cluster
  • A runtimeClass example in the README — happy to add as a one-paragraph follow-up if the team wants it

Test plan

  • make manifests generate fmt vet clean
  • make chart-crds synced (no drift)
  • go test ./internal/controller/... ./pkg/agent/... passes (new tests included)
  • golangci-lint v2.4.0 reports 0 issues
  • Manual verification on a kind cluster: deploy with and without runtimeClassName: nvidia, confirm kubectl get pod -o yaml shows the expected runtimeClassName value

Related

Adds a RuntimeClassName field on InferenceServiceSpec that the deployment
builder forwards directly to PodSpec.RuntimeClassName. Most commonly set
to "nvidia" on clusters where the NVIDIA Container Runtime is not the
cluster default — without it, GPU pods schedule onto the GPU node but
never get the device files bind-mounted, and the container fails at
runtime with "no CUDA-capable device is detected".

Originally surfaced as a Discord question that we filed as #375 yesterday.
The fix is local: one new optional field on the CRD, one new line in the
PodSpec construction, plus a unit test that asserts both the set and
unset paths.

Helm chart CRD synced via make chart-crds. RBAC unchanged. The field is
optional and nil-safe; existing clusters see no behavior change.

Signed-off-by: Christopher Maher <chris@mahercode.io>

codecov Bot commented May 2, 2026

Codecov Report

❌ Patch coverage is 20.00000% with 4 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| api/v1alpha1/zz_generated.deepcopy.go | 0.00% | 4 Missing ⚠️ |

@Defilan Defilan merged commit cc44ff5 into main May 2, 2026
19 checks passed
@Defilan Defilan deleted the feat/issue-375-runtime-class-name branch May 2, 2026 22:15
@github-actions github-actions Bot mentioned this pull request May 2, 2026