feat(server): add auto-detection of compute driver at startup #1088

Merged
TaylorMutch merged 1 commit into NVIDIA:main from sjenning:feat/auto-detect-compute-driver
May 1, 2026
@sjenning
Contributor

Summary

Add automatic detection of the appropriate compute driver when no drivers are explicitly configured. The server now checks the runtime environment and selects Kubernetes, Podman, or Docker in priority order — eliminating the need for manual --drivers configuration in most deployments.

Related Issue

None filed.

Changes

  • Added Auto variant to ComputeDriverKind enum (internal-only, skips serialization)
  • Added detect_driver() function with priority: Kubernetes → Podman → Docker
  • Added is_binary_available() helper using Command::new(name).arg("--version").output()
  • Changed default_compute_drivers() to return empty vec, triggering auto-detection
  • Updated configured_compute_driver() to call detect_driver() when config is empty
  • Removed default_value = "kubernetes" from --drivers CLI flag
  • VM excluded from auto-detection (requires explicit --drivers vm)
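
The selection order described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the enum is a stand-in that models only the variants named here, and the `in_kubernetes` and `has_binary` parameters are hypothetical, injected so that the priority order can be exercised without touching the real environment. The real `configured_compute_driver()` may treat a multi-entry driver list differently from the "first entry wins" assumption used below.

```rust
/// Hypothetical stand-in for the PR's `ComputeDriverKind`; only the
/// variants named in the description are modelled.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ComputeDriverKind {
    Kubernetes,
    Podman,
    Docker,
    Vm,
}

/// Picks a driver in the order Kubernetes, Podman, Docker.
/// VM is never returned here, so it can only be chosen explicitly.
/// `in_kubernetes` and `has_binary` are assumptions of this sketch.
fn detect_driver(
    in_kubernetes: bool,
    has_binary: impl Fn(&str) -> bool,
) -> Option<ComputeDriverKind> {
    if in_kubernetes {
        return Some(ComputeDriverKind::Kubernetes);
    }
    if has_binary("podman") {
        return Some(ComputeDriverKind::Podman);
    }
    if has_binary("docker") {
        return Some(ComputeDriverKind::Docker);
    }
    None
}

/// Explicit configuration wins; detection runs only when the list is empty.
/// Taking the first configured entry is an assumption of this sketch.
fn configured_compute_driver(
    configured: &[ComputeDriverKind],
    in_kubernetes: bool,
    has_binary: impl Fn(&str) -> bool,
) -> Option<ComputeDriverKind> {
    match configured.first() {
        Some(kind) => Some(*kind),
        None => detect_driver(in_kubernetes, has_binary),
    }
}
```

With an empty list and both `podman` and `docker` reported as present, this sketch yields `Podman`; passing `&[ComputeDriverKind::Vm]` yields `Vm` regardless of what the probes report.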

Testing

  • mise run rust:lint passes (clippy + format + license headers)
  • mise run rust:check passes (cargo check)
  • Unit tests pass: cargo test -p openshell-core config:: (7/7)
  • Unit tests pass: cargo test -p openshell-server (10/10)
  • E2E tests not run (requires running cluster)

Note: One pre-existing flaky integration test (sandbox_create_keeps_sandbox_with_forwarding) fails due to port 8080 being occupied on this system — unrelated to these changes.

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)
  • Architecture docs updated (if applicable)

@sjenning sjenning requested a review from a team as a code owner April 30, 2026 19:31
@copy-pr-bot

copy-pr-bot Bot commented Apr 30, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Comment thread crates/openshell-core/src/config.rs Outdated
@sjenning sjenning force-pushed the feat/auto-detect-compute-driver branch from 8518427 to 4782f22 Compare April 30, 2026 20:28
TaylorMutch previously approved these changes Apr 30, 2026
@TaylorMutch TaylorMutch added the test:e2e Requires end-to-end coverage label Apr 30, 2026
@github-actions

Label test:e2e applied, but pull-request/1088 is at {"messa while the PR head is 4782f22. A maintainer needs to comment /ok to test 4782f22081d2435e7bcb7af7b30d4804d77747e2 to refresh the mirror. Once the mirror catches up, re-run Branch E2E Checks from the Actions tab.

@TaylorMutch
Collaborator

/ok to test 4782f22

Comment thread crates/openshell-core/src/config.rs Outdated
When no drivers are explicitly configured, the server now automatically
detects the appropriate compute driver by checking the runtime environment:

- Kubernetes: detected via KUBERNETES_SERVICE_HOST env var (set inside pods)
- Podman: detected by checking if podman binary is available on PATH
- Docker: detected by checking if docker binary is available on PATH

Priority order: Kubernetes → Podman → Docker. VM is never auto-detected
and must be selected explicitly via --drivers vm.

The Auto variant is internal-only and does not serialize to config files.
The default --drivers value is now empty, triggering auto-detection.
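
The two environment probes named in the commit message might look something like the sketch below. The `Command::new(name).arg("--version").output()` call and the `KUBERNETES_SERVICE_HOST` variable come from the PR text; the function name `in_kubernetes_pod` is hypothetical, and treating a non-zero exit status as "not available" is an assumption of this sketch, since the PR does not say whether the exit status is checked.

```rust
use std::env;
use std::process::Command;

/// Returns true when `name --version` can be spawned from PATH and exits
/// successfully. Checking the exit status is an assumption of this sketch;
/// a spawn failure (for example, binary not found) maps to false.
fn is_binary_available(name: &str) -> bool {
    Command::new(name)
        .arg("--version")
        .output()
        .map(|out| out.status.success())
        .unwrap_or(false)
}

/// Returns true when KUBERNETES_SERVICE_HOST is set, which Kubernetes
/// does for containers running inside a pod. The function name is
/// hypothetical.
fn in_kubernetes_pod() -> bool {
    env::var_os("KUBERNETES_SERVICE_HOST").is_some()
}
```

Because `output()` captures stdout and stderr rather than inheriting them, the probe does not print the version banner to the server's own terminal.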
@TaylorMutch
Collaborator

/ok to test f7c3dd3

@TaylorMutch TaylorMutch merged commit ea4915a into NVIDIA:main May 1, 2026
66 of 68 checks passed