fix(sandbox): restore GPU filesystem baseline#1522

Open: elezar wants to merge 1 commit into fix/1486-gpu-enrichment-no-network/elezar from fix/1486-gpu-sandbox-filesystem-policy/elezar

@elezar (Member) commented May 22, 2026

Summary

Restore GPU procfs baseline handling on top of #1524 so Docker-backed GPU sandboxes get the procfs write access CUDA workloads require. This fixes the remaining cuda-basic failure after GPU enrichment runs for no-network sandboxes.

Related Issue

Fixes #1486

Depends on #1524.
The no-network enrichment regression is handled in #1524 and was introduced by #158. This PR addresses the follow-up /proc promotion regression introduced by #910, where explicit default read-only paths prevented GPU-required read/write baseline promotion.

Changes

  • Promote default-like GPU read/write paths, including /proc, when a GPU is available and the sandbox uses the default/discovered policy path.
  • Reject custom policies that keep required GPU read/write paths read-only instead of silently overriding them.
  • Update sandbox policy documentation to describe the CUDA procfs tradeoff and GPU policy behavior.
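The promote-or-reject behavior listed above can be sketched roughly as follows. This is an illustrative model only, not the code in this PR: `FilesystemPolicy`, `apply_gpu_baseline`, `PolicyError`, and the `is_custom` flag are hypothetical names, and the required path set is assumed to contain only `/proc` because that is the only path the description names. The sketch also assumes a custom policy without a conflict is left untouched.

```python
from dataclasses import dataclass, field

# Assumption: the PR description names only /proc as a GPU-required
# read/write path; the real baseline may contain more entries.
GPU_READ_WRITE_PATHS = ("/proc",)


class PolicyError(ValueError):
    """Hypothetical error for a custom policy that conflicts with GPU needs."""


@dataclass
class FilesystemPolicy:
    """Hypothetical, simplified stand-in for a sandbox filesystem policy."""
    read_only: list[str] = field(default_factory=list)
    read_write: list[str] = field(default_factory=list)
    is_custom: bool = False  # False models the default/discovered policy path


def apply_gpu_baseline(policy: FilesystemPolicy, gpu_available: bool) -> FilesystemPolicy:
    """Promote GPU-required paths for default policies, reject conflicting custom ones."""
    if not gpu_available:
        return policy

    conflicts = [p for p in GPU_READ_WRITE_PATHS if p in policy.read_only]

    if policy.is_custom:
        if conflicts:
            # Reject rather than silently override the user's explicit choice.
            raise PolicyError(
                f"GPU workloads require read/write access to: {', '.join(conflicts)}"
            )
        return policy

    # Default/discovered policy: move required paths from read-only to read/write.
    read_only = [p for p in policy.read_only if p not in GPU_READ_WRITE_PATHS]
    read_write = list(policy.read_write)
    for path in GPU_READ_WRITE_PATHS:
        if path not in read_write:
            read_write.append(path)
    return FilesystemPolicy(read_only=read_only, read_write=read_write, is_custom=False)
```

Raising for custom policies keeps the user's explicit intent visible: the sandbox fails early with a clear message instead of quietly granting write access the policy author did not ask for.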

Testing

  • mise run pre-commit passes
  • Unit tests added/updated
  • E2E tests added/updated (if applicable)
  • Plain Docker control: docker run --rm --device nvidia.com/gpu=all localhost/openshell/gpu-workload-cuda-basic:785872b4 passed with OPENSHELL_GPU_WORKLOAD_SUCCESS cuda-basic
  • Parent/pre-fix sandbox validation failed as expected with cudaGetDeviceCount returned 304
  • Current stacked branch sandbox validation passed with OPENSHELL_GPU_WORKLOAD_SUCCESS cuda-basic
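For readers reproducing the three manual checks above, a small helper along these lines could classify captured workload output. It is a sketch under assumptions: `classify_workload_output` is a hypothetical name, and the only strings it relies on are the success marker and the `cudaGetDeviceCount returned <code>` message quoted in this description. The actual workload may emit other output.

```python
import re

# Marker and failure text are taken from the testing notes in this PR.
SUCCESS_MARKER = "OPENSHELL_GPU_WORKLOAD_SUCCESS"
_DEVICE_COUNT_FAILURE = re.compile(r"cudaGetDeviceCount returned (\d+)")


def classify_workload_output(output: str, workload: str) -> str:
    """Return 'passed', a 'failed: ...' summary, or 'unknown' for captured output."""
    if f"{SUCCESS_MARKER} {workload}" in output:
        return "passed"
    match = _DEVICE_COUNT_FAILURE.search(output)
    if match:
        return f"failed: cudaGetDeviceCount returned {match.group(1)}"
    return "unknown"
```

Feeding it the output of the plain Docker control run and of the sandbox run gives a quick side-by-side of the pre-fix and post-fix behavior.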

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)
  • Architecture docs updated (if applicable)

Signed-off-by: Evan Lezar <elezar@nvidia.com>
@elezar elezar requested review from a team, derekwaynecarr, maxamillion and mrunalp as code owners May 22, 2026 13:47
@elezar elezar changed the base branch from main to fix/1486-gpu-enrichment-no-network/elezar May 22, 2026 14:06

Development

Successfully merging this pull request may close these issues.

bug: GPU sandboxes miss filesystem access for CUDA workloads
