
feat(driver-kubernetes): sideload supervisor binary via init container #1154

Merged
TaylorMutch merged 3 commits into main from kube-support/sideload-supervisor/tmutch
May 4, 2026

@TaylorMutch
Collaborator

Summary

Replaces the hostPath volume approach for injecting the supervisor binary into sandbox pods with an init container + emptyDir pattern. The init container pulls a dedicated supervisor image and copies the openshell-sandbox binary into a shared emptyDir volume; the agent container then executes the binary from that volume. This removes the requirement that every k3s node have the supervisor binary pre-installed on the host filesystem.
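As a rough illustration of the pattern described above, the sketch below models the pod pieces with plain Rust structs rather than the real Kubernetes API types. The volume name, mount path, container name, and struct definitions are all hypothetical and chosen only for this example; only the function names `supervisor_volume` and `supervisor_init_container`, the `openshell-sandbox` binary name, and the `command -v` + `cp` approach come from this PR.

```rust
/// Simplified, hypothetical stand-in for a pod volume source.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum VolumeSource {
    /// Old approach: binary must already exist on the node at this path.
    HostPath(String),
    /// New approach: scratch volume that lives as long as the pod.
    EmptyDir,
}

#[allow(dead_code)]
#[derive(Debug)]
struct Volume {
    name: String,
    source: VolumeSource,
}

#[allow(dead_code)]
#[derive(Debug)]
struct Container {
    name: String,
    image: String,
    command: Vec<String>,
    /// Where the shared volume is mounted inside this container.
    mount_path: String,
}

// Assumed names; the PR does not state the actual values.
const VOLUME_NAME: &str = "supervisor-bin";
const MOUNT_PATH: &str = "/opt/openshell/bin";

/// Shared volume that both the init container and the agent container mount.
fn supervisor_volume() -> Volume {
    Volume {
        name: VOLUME_NAME.to_string(),
        source: VolumeSource::EmptyDir,
    }
}

/// Init container that locates the binary on its own PATH and copies it
/// into the shared volume before the agent container starts.
fn supervisor_init_container(image: &str) -> Container {
    let script = format!("cp \"$(command -v openshell-sandbox)\" {}/", MOUNT_PATH);
    Container {
        name: "supervisor-init".to_string(),
        image: image.to_string(),
        command: vec!["sh".to_string(), "-c".to_string(), script],
        mount_path: MOUNT_PATH.to_string(),
    }
}
```

In this model the agent container would mount the same volume and run the binary from the mount path, so the node itself no longer needs a copy of the binary.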

Related Issue

Changes

  • Replace supervisor_volume() hostPath volume with an emptyDir volume
  • Add supervisor_init_container() that copies openshell-sandbox from the supervisor image into the shared volume via command -v + cp
  • Add supervisor_image and supervisor_image_pull_policy fields to KubernetesComputeConfig
  • Wire the OPENSHELL_SUPERVISOR_IMAGE / OPENSHELL_SUPERVISOR_IMAGE_PULL_POLICY env vars into openshell-server and the Kubernetes driver CLI args; when the image is not set, it falls back to DEFAULT_SUPERVISOR_IMAGE
  • Update Helm chart with server.supervisorImage (default ghcr.io/nvidia/openshell/supervisor:latest) and server.supervisorImagePullPolicy values
  • Update unit tests to assert init container presence and emptyDir volume shape instead of hostPath assertions
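The environment variable fallback in the list above could look roughly like the following. This is a minimal sketch, not the code from this PR: the helper names are hypothetical, the value of `DEFAULT_SUPERVISOR_IMAGE` is assumed to match the Helm chart default, and treating an empty value as unset is also an assumption.

```rust
// Assumption: the constant mirrors the Helm chart default shown above.
const DEFAULT_SUPERVISOR_IMAGE: &str = "ghcr.io/nvidia/openshell/supervisor:latest";

/// Hypothetical helper: pick the configured image, or the default when the
/// value is missing or blank (blank-as-unset is an assumption).
fn resolve_supervisor_image(env_value: Option<&str>) -> String {
    match env_value.map(str::trim) {
        Some(value) if !value.is_empty() => value.to_string(),
        _ => DEFAULT_SUPERVISOR_IMAGE.to_string(),
    }
}

/// Hypothetical wrapper that reads the real process environment.
#[allow(dead_code)]
fn supervisor_image_from_env() -> String {
    let raw = std::env::var("OPENSHELL_SUPERVISOR_IMAGE").ok();
    resolve_supervisor_image(raw.as_deref())
}
```

Keeping the resolution logic separate from the environment read makes the fallback easy to unit test without mutating process state.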

Testing

  • mise run pre-commit passes
  • Unit tests added/updated
  • E2E tests added/updated (if applicable)

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)
  • Architecture docs updated (if applicable)

@TaylorMutch TaylorMutch requested a review from a team as a code owner May 4, 2026 19:44
@copy-pr-bot

copy-pr-bot Bot commented May 4, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@TaylorMutch TaylorMutch merged commit d73860f into main May 4, 2026
5 checks passed
@TaylorMutch TaylorMutch deleted the kube-support/sideload-supervisor/tmutch branch May 4, 2026 20:00


2 participants