NVIDIA · elezar · May 20, 2026
@@ -168,6 +168,7 @@ logs/
 tmp/
 temp/
 *.tmp
+e2e/gpu/images/.build/
 
 # Secrets/credentials (should never be committed)
 *.pem

@@ -0,0 +1,113 @@
+<!-- SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -->
+<!-- SPDX-License-Identifier: Apache-2.0 -->
+
+# GPU workload images
+
+This directory defines GPU workload images used by OpenShell GPU e2e tests.
+
+The image definitions live here first so the OpenShell e2e harness can iterate
+against a concrete contract. The long-term image ownership should move to
+`NVIDIA/OpenShell-Community`; OpenShell should then keep the contract, local
+build task, and tests that consume published image refs.
+
+## Contract
+
+Each workload image must:
+
+- Use the OpenShell community base image as its final-stage base.
+- Install the workload at `/usr/local/bin/openshell-gpu-workload`.
+- Run the same workload as the image default entrypoint for direct
+  container-engine validation.
+- Require no network access after the image is pulled.
+- Print `OPENSHELL_GPU_WORKLOAD_SUCCESS` only when validation succeeds.
+- Print `OPENSHELL_GPU_WORKLOAD_FAILURE` and exit non-zero when validation
+  fails.
+- Be usable as an OpenShell sandbox image with `openshell sandbox create
+  --from <image>`.
+
+OpenShell sandbox creation replaces the image entrypoint with the supervisor and
+does not run the OCI image `CMD`. E2e tests that use these images through
+OpenShell should run `/usr/local/bin/openshell-gpu-workload` explicitly.
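The marker rules in the contract can be checked mechanically. The sketch below shows one way a test harness might classify a run from its exit status and captured output. `classify_workload_result` and its three result labels are hypothetical and not part of OpenShell, and treating a success marker with a non-zero exit as a violation is an assumption beyond what the contract states.

```shell
# Hypothetical helper, not part of OpenShell: classify one workload run
# against the marker contract.
#   $1 = exit status of the workload
#   $2 = captured stdout and stderr of the workload
classify_workload_result() {
  local status output has_success has_failure
  status=$1
  output=$2
  has_success=0
  has_failure=0

  case "${output}" in *OPENSHELL_GPU_WORKLOAD_SUCCESS*) has_success=1 ;; esac
  case "${output}" in *OPENSHELL_GPU_WORKLOAD_FAILURE*) has_failure=1 ;; esac

  if [ "${status}" -eq 0 ] && [ "${has_success}" -eq 1 ] && [ "${has_failure}" -eq 0 ]; then
    echo "pass"
  elif [ "${status}" -ne 0 ] && [ "${has_failure}" -eq 1 ] && [ "${has_success}" -eq 0 ]; then
    echo "fail"
  else
    # Marker and exit status disagree, or no marker was printed at all.
    echo "contract-violation"
  fi
}
```

Because sandbox creation does not run the image entrypoint, a harness using this kind of check through OpenShell would feed it the output of an explicit `/usr/local/bin/openshell-gpu-workload` invocation.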
+
+## Images
+
+| Source directory | Image name | Purpose |
+| --- | --- | --- |
+| `smoke-pass` | `gpu-workload-smoke-pass` | Always succeeds and prints the success marker. |
+| `smoke-fail` | `gpu-workload-smoke-fail` | Always fails and prints the failure marker. |
+| `cuda-basic` | `gpu-workload-cuda-basic` | Runs CUDA `deviceQuery` and `vectorAdd` validation. |
+
+## Build
+
+Build all workload images:
+
+```shell
+mise run e2e:gpu:images:build
+```
+
+Build a subset by source directory name:
+
+```shell
+OPENSHELL_GPU_WORKLOAD_IMAGES=smoke-pass,smoke-fail \
+mise run e2e:gpu:images:build
+```
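As an illustration of the subset selection above, a comma-separated `OPENSHELL_GPU_WORKLOAD_IMAGES` value could be split and validated as follows. This is a sketch only: the real build task's parsing is not shown in this change, `select_workload_images` is a hypothetical name, and the default list is simply the three source directories from the Images table.

```shell
# Hypothetical sketch: expand OPENSHELL_GPU_WORKLOAD_IMAGES into one
# directory name per line, rejecting names that are not known workloads.
select_workload_images() {
  local requested name old_ifs
  # Assumption: an unset or empty variable means "build everything".
  requested=${OPENSHELL_GPU_WORKLOAD_IMAGES:-smoke-pass,smoke-fail,cuda-basic}

  old_ifs=$IFS
  IFS=,
  # Unquoted on purpose: split the list on commas into positional parameters.
  set -- ${requested}
  IFS=$old_ifs

  for name in "$@"; do
    case "${name}" in
      smoke-pass|smoke-fail|cuda-basic) printf '%s\n' "${name}" ;;
      *) echo "unknown workload image: ${name}" >&2; return 1 ;;
    esac
  done
}
```

Failing fast on an unknown name avoids a silent no-op build when the variable contains a typo.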
+
+The build task uses `tasks/scripts/container-engine.sh`. Set
+`CONTAINER_ENGINE=docker` or `CONTAINER_ENGINE=podman` to choose an engine
+explicitly. When unset, the helper uses its existing auto-detection behavior.
+
+Local tags use the current commit's short SHA; a dirty working tree appends `-dirty`.
+Set `OPENSHELL_GPU_WORKLOAD_IMAGE_TAG=<tag>` to override the tag.
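The tag rule can be sketched as a small function. This is not the build task's actual implementation: `compute_image_tag` is a hypothetical name, and using `git status --porcelain` to decide whether the tree is dirty (which also counts untracked files) is an assumption.

```shell
# Hypothetical sketch of the local tag rule; the real build task may differ.
compute_image_tag() {
  local tag

  # An explicit override always wins.
  if [ -n "${OPENSHELL_GPU_WORKLOAD_IMAGE_TAG:-}" ]; then
    printf '%s\n' "${OPENSHELL_GPU_WORKLOAD_IMAGE_TAG}"
    return 0
  fi

  # Otherwise start from the short SHA of the current commit.
  tag="$(git rev-parse --short HEAD)" || return 1

  # Assumption: "dirty" means any uncommitted change, including untracked files.
  if [ -n "$(git status --porcelain)" ]; then
    tag="${tag}-dirty"
  fi

  printf '%s\n' "${tag}"
}
```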
+
+The task writes the latest build refs to:
+
+```text
+e2e/gpu/images/.build/latest.env
+```
+
+Use it in later commands:
+
+```shell
+source e2e/gpu/images/.build/latest.env
+```
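Sourcing a missing or partial `latest.env` tends to surface later as an empty image ref in a `docker run` command. A guard like the one below can catch that earlier. It is a sketch: `require_image_refs` is a hypothetical helper, and it assumes the file contains plain `NAME=value` shell assignments, which is implied by the `source` usage but not shown in this change.

```shell
# Hypothetical guard: source an env file and verify that the named
# variables are set and non-empty afterwards.
#   $1   = path to the env file
#   $2.. = names of variables that must be set
require_image_refs() {
  local env_file name value
  env_file=$1
  shift

  if [ ! -f "${env_file}" ]; then
    echo "missing ${env_file}; run the build task first" >&2
    return 1
  fi

  # Load the refs into the current shell.
  . "${env_file}"

  for name in "$@"; do
    # Indirect lookup of the variable called ${name}.
    eval "value=\${${name}:-}"
    if [ -z "${value}" ]; then
      echo "${name} is not set in ${env_file}" >&2
      return 1
    fi
  done
}
```

For example, `require_image_refs e2e/gpu/images/.build/latest.env OPENSHELL_E2E_GPU_SMOKE_PASS_IMAGE` would check only the ref needed for a smoke-pass run, which matters if a subset build writes only some of the refs.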
+
+## Direct Validation
+
+Validate smoke pass:
+
+```shell
+docker run --rm "${OPENSHELL_E2E_GPU_SMOKE_PASS_IMAGE}"
+```
+
+Validate smoke fail:
+
+```shell
+docker run --rm "${OPENSHELL_E2E_GPU_SMOKE_FAIL_IMAGE}"
+```
+
+The smoke fail command should exit non-zero and print
+`OPENSHELL_GPU_WORKLOAD_FAILURE`.
+
+Validate CUDA with Docker CDI:
+
+```shell
+docker run --rm --device nvidia.com/gpu=all \
+  "${OPENSHELL_E2E_GPU_CUDA_WORKLOAD_IMAGE}"
+```
+
+Use `podman run` with the same `--device nvidia.com/gpu=all` option on hosts
+where Podman CDI is configured.
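Since the Docker and Podman invocations differ only in the engine name, the command can be assembled from `CONTAINER_ENGINE`. The sketch below prints the command rather than running it. `cuda_validation_command` is a hypothetical helper, and falling back to `docker` when the variable is unset is an assumption made here for simplicity; the repository's helper script uses its own auto-detection.

```shell
# Hypothetical sketch: print the direct CUDA validation command for the
# selected container engine.
#   $1 = image ref to run
cuda_validation_command() {
  local engine image
  # Assumption: default to docker when CONTAINER_ENGINE is unset.
  engine=${CONTAINER_ENGINE:-docker}
  image=$1

  case "${engine}" in
    docker|podman) ;;
    *) echo "unsupported container engine: ${engine}" >&2; return 1 ;;
  esac

  # Both engines accept the same CDI device request.
  printf '%s run --rm --device nvidia.com/gpu=all %s\n' "${engine}" "${image}"
}
```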
+
+Direct container-engine validation catches image, CDI, CUDA, and host GPU setup
+issues before OpenShell sandbox behavior is involved.
+
+## Publish Guidance
+
+Tests that consume published images should reference immutable image refs:
+
+```shell
+OPENSHELL_E2E_GPU_CUDA_WORKLOAD_IMAGE=ghcr.io/nvidia/openshell-community/sandboxes/gpu-workload-cuda-basic@sha256:<digest>
+```
+
+Mutable tags are acceptable for local iteration. CI should use a digest or an
+immutable release tag once the images are published from OpenShell-Community.
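A CI job could enforce the digest requirement with a simple pattern check before running tests. The sketch below covers only the digest form, not immutable release tags, and `is_digest_ref` is a hypothetical helper name.

```shell
# Hypothetical check: succeed only if the image ref is pinned by a
# sha256 digest (64 lowercase hex characters after "@sha256:").
#   $1 = image ref
is_digest_ref() {
  printf '%s\n' "$1" | grep -Eq '@sha256:[0-9a-f]{64}$'
}
```

A pipeline might call `is_digest_ref "${OPENSHELL_E2E_GPU_CUDA_WORKLOAD_IMAGE}" || exit 1` before starting the GPU tests.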
@@ -0,0 +1,72 @@
+# syntax=docker/dockerfile:1
+
+# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+
+ARG CUDA_BUILD_IMAGE=nvcr.io/nvidia/cuda:12.8.1-base-ubuntu22.04
+ARG OPENSHELL_SANDBOX_BASE_IMAGE=ghcr.io/nvidia/openshell-community/sandboxes/base:latest
+
+FROM ${CUDA_BUILD_IMAGE} AS builder
+
+ARG DEBIAN_FRONTEND=noninteractive
+ARG CUDA_SAMPLES_REF=v12.8
+ARG CUDA_SAMPLES_REPO=https://github.com/NVIDIA/cuda-samples
+
+RUN apt-get update && apt-get install -y --no-install-recommends \
+        build-essential \
+        ca-certificates \
+        cmake \
+        cuda-nvcc-12-8 \
+        curl \
+        g++ \
+    && rm -rf /var/lib/apt/lists/*
+
+WORKDIR /build/cuda-samples
+
+RUN set -eux; \
+    curl -fsSL "${CUDA_SAMPLES_REPO}/archive/refs/tags/${CUDA_SAMPLES_REF}.tar.gz" \
+        -o /tmp/cuda-samples.tar.gz; \
+    tar -xzf /tmp/cuda-samples.tar.gz \
+        --strip-components=1 \
+        --wildcards \
+        '*/Common/*' \
+        '*/cmake/*' \
+        '*/Samples/0_Introduction/vectorAdd/*' \
+        '*/Samples/1_Utilities/deviceQuery/*' \
+        '*/LICENSE'; \
+    sed -i 's/CUDA::cudart/CUDA::cudart_static/g' \
+        Samples/1_Utilities/deviceQuery/CMakeLists.txt; \
+    cmake -S Samples/1_Utilities/deviceQuery -B /tmp/build-device-query \
+        -DCMAKE_BUILD_TYPE=Release \
+        -DCMAKE_CUDA_RUNTIME_LIBRARY=Static; \
+    cmake --build /tmp/build-device-query --parallel; \
+    cmake -S Samples/0_Introduction/vectorAdd -B /tmp/build-vector-add \
+        -DCMAKE_BUILD_TYPE=Release \
+        -DCMAKE_CUDA_RUNTIME_LIBRARY=Static; \
+    cmake --build /tmp/build-vector-add --parallel; \
+    mkdir -p /opt/openshell-gpu-workload; \
+    cp /tmp/build-device-query/deviceQuery /opt/openshell-gpu-workload/deviceQuery; \
+    cp /tmp/build-vector-add/vectorAdd /opt/openshell-gpu-workload/vectorAdd; \
+    cp LICENSE /opt/openshell-gpu-workload/cuda-samples.LICENSE; \
+    rm -f /tmp/cuda-samples.tar.gz
+
+FROM ${OPENSHELL_SANDBOX_BASE_IMAGE}
+
+ARG CUDA_SAMPLES_REF=v12.8
+
+LABEL com.nvidia.openshell.gpu-workload.name="cuda-basic" \
+      com.nvidia.openshell.gpu-workload.cuda-samples-ref="${CUDA_SAMPLES_REF}"
+
+USER root
+RUN mkdir -p /usr/local/lib/openshell-gpu-workload \
+    /usr/local/share/doc/openshell-gpu-workload
+COPY --from=builder /opt/openshell-gpu-workload/deviceQuery /usr/local/lib/openshell-gpu-workload/deviceQuery
+COPY --from=builder /opt/openshell-gpu-workload/vectorAdd /usr/local/lib/openshell-gpu-workload/vectorAdd
+COPY --from=builder /opt/openshell-gpu-workload/cuda-samples.LICENSE /usr/local/share/doc/openshell-gpu-workload/cuda-samples.LICENSE
+COPY workload.sh /usr/local/bin/openshell-gpu-workload
+RUN chmod 0755 /usr/local/bin/openshell-gpu-workload \
+    /usr/local/lib/openshell-gpu-workload/deviceQuery \
+    /usr/local/lib/openshell-gpu-workload/vectorAdd
+
+USER sandbox
+ENTRYPOINT ["/usr/local/bin/openshell-gpu-workload"]
@@ -0,0 +1,42 @@
+<!-- SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -->
+<!-- SPDX-License-Identifier: Apache-2.0 -->
+
+# GPU workload CUDA basic
+
+`cuda-basic` validates that a GPU-enabled environment can run a basic CUDA
+runtime workload. It is a single image that runs two validation steps:
+
+1. `deviceQuery` checks CUDA runtime, driver, and device discovery.
+2. `vectorAdd` checks kernel launch, device memory allocation, host/device
+   copies, synchronization, and result validation.
+
+The image builds the samples from `NVIDIA/cuda-samples` tag `v12.8` in a CUDA
+12.8 builder stage, then copies only the compiled binaries and the samples
+license into a final stage based on the OpenShell community base image.
+
+The workload prints `OPENSHELL_GPU_WORKLOAD_SUCCESS` only after both samples
+pass. On failure it prints `OPENSHELL_GPU_WORKLOAD_FAILURE` and exits non-zero.
+
+Build it with:
+
+```shell
+OPENSHELL_GPU_WORKLOAD_IMAGES=cuda-basic mise run e2e:gpu:images:build
+```
+
+Run it directly with Docker CDI:
+
+```shell
+source e2e/gpu/images/.build/latest.env
+docker run --rm --device nvidia.com/gpu=all \
+  "${OPENSHELL_E2E_GPU_CUDA_WORKLOAD_IMAGE}"
+```
+
+Use `podman run` with the same `--device nvidia.com/gpu=all` option when Podman
+CDI is configured.
+
+The image does not vendor GPU driver libraries such as `libcuda.so.1`. Those
+libraries must be provided by the host GPU runtime or CDI injection.
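When `deviceQuery` fails inside a container, one quick thing to rule out is a missing driver library. The sketch below filters linker cache output for `libcuda.so.1`. `driver_library_visible` is a hypothetical helper, and it assumes `ldconfig -p` is available in the container and that the injected library is registered in the linker cache, neither of which is confirmed by this change.

```shell
# Hypothetical check: read `ldconfig -p` style output on stdin and succeed
# if the CUDA driver library is listed.
driver_library_visible() {
  grep -Fq "libcuda.so.1"
}

# Example usage inside a running container (assumes ldconfig exists there):
#   ldconfig -p | driver_library_visible || echo "libcuda.so.1 not injected" >&2
```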
+
+The CUDA samples are redistributed under the NVIDIA CUDA samples license. The
+license text is copied into the image at
+`/usr/local/share/doc/openshell-gpu-workload/cuda-samples.LICENSE`.
@@ -0,0 +1,40 @@
+#!/usr/bin/env bash
+
+# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+
+set -euo pipefail
+
+readonly SUCCESS_MARKER="OPENSHELL_GPU_WORKLOAD_SUCCESS"
+readonly FAILURE_MARKER="OPENSHELL_GPU_WORKLOAD_FAILURE"
+readonly WORKLOAD_DIR="/usr/local/lib/openshell-gpu-workload"
+
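+# Run one CUDA sample, echo its captured output, and exit 1 with the failure
+# marker unless the sample exits 0 and prints the expected text.
+run_sample() {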
+  local name=$1
+  local expected=$2
+  local binary="${WORKLOAD_DIR}/${name}"
+  local output
+
+  output="$(mktemp)"
+  echo "running CUDA sample: ${name}"
+  if ! "${binary}" >"${output}" 2>&1; then
+    cat "${output}"
+    echo "${FAILURE_MARKER} ${name} exited non-zero" >&2
+    rm -f "${output}"
+    exit 1
+  fi
+
+  cat "${output}"
+  if ! grep -Fq "${expected}" "${output}"; then
+    echo "${FAILURE_MARKER} ${name} did not print expected output: ${expected}" >&2
+    rm -f "${output}"
+    exit 1
+  fi
+
+  rm -f "${output}"
+}
+
+run_sample "deviceQuery" "Result = PASS"
+run_sample "vectorAdd" "Test PASSED"
+
+echo "${SUCCESS_MARKER} cuda-basic"
@@ -0,0 +1,15 @@
+# syntax=docker/dockerfile:1
+
+# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+
+ARG OPENSHELL_SANDBOX_BASE_IMAGE=ghcr.io/nvidia/openshell-community/sandboxes/base:latest
+
+FROM ${OPENSHELL_SANDBOX_BASE_IMAGE}
+
+USER root
+COPY workload.sh /usr/local/bin/openshell-gpu-workload
+RUN chmod 0755 /usr/local/bin/openshell-gpu-workload
+
+USER sandbox
+ENTRYPOINT ["/usr/local/bin/openshell-gpu-workload"]
@@ -0,0 +1,24 @@
+<!-- SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -->
+<!-- SPDX-License-Identifier: Apache-2.0 -->
+
+# GPU workload smoke fail
+
+`smoke-fail` validates negative-path diagnostics in e2e test plumbing.
+
+The workload does not perform GPU-specific work. It writes
+`OPENSHELL_GPU_WORKLOAD_FAILURE` and a stable diagnostic to stderr, then exits `42`.
+
+Build it with:
+
+```shell
+OPENSHELL_GPU_WORKLOAD_IMAGES=smoke-fail mise run e2e:gpu:images:build
+```
+
+Run it directly:
+
+```shell
+source e2e/gpu/images/.build/latest.env
+docker run --rm "${OPENSHELL_E2E_GPU_SMOKE_FAIL_IMAGE}"
+```
+
+The direct run should exit with status `42` and print `OPENSHELL_GPU_WORKLOAD_FAILURE` on stderr.
@@ -0,0 +1,9 @@
+#!/usr/bin/env bash
+
+# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+
+set -euo pipefail
+
+echo "OPENSHELL_GPU_WORKLOAD_FAILURE smoke-fail intentional failure" >&2
+exit 42
@@ -0,0 +1,15 @@
+# syntax=docker/dockerfile:1
+
+# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+
+ARG OPENSHELL_SANDBOX_BASE_IMAGE=ghcr.io/nvidia/openshell-community/sandboxes/base:latest
+
+FROM ${OPENSHELL_SANDBOX_BASE_IMAGE}
+
+USER root
+COPY workload.sh /usr/local/bin/openshell-gpu-workload
+RUN chmod 0755 /usr/local/bin/openshell-gpu-workload
+
+USER sandbox
+ENTRYPOINT ["/usr/local/bin/openshell-gpu-workload"]
@@ -0,0 +1,23 @@
+<!-- SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -->
+<!-- SPDX-License-Identifier: Apache-2.0 -->
+
+# GPU workload smoke pass
+
+`smoke-pass` validates image publishing, sandbox image compatibility, default
+entrypoint execution, and success-marker assertion plumbing.
+
+The workload does not perform GPU-specific work. It prints
+`OPENSHELL_GPU_WORKLOAD_SUCCESS` and exits `0`.
+
+Build it with:
+
+```shell
+OPENSHELL_GPU_WORKLOAD_IMAGES=smoke-pass mise run e2e:gpu:images:build
+```
+
+Run it directly:
+
+```shell
+source e2e/gpu/images/.build/latest.env
+docker run --rm "${OPENSHELL_E2E_GPU_SMOKE_PASS_IMAGE}"
+```
@@ -0,0 +1,8 @@
+#!/usr/bin/env bash
+
+# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+
+set -euo pipefail
+
+echo "OPENSHELL_GPU_WORKLOAD_SUCCESS smoke-pass"