feat: named sandbox templates with git checkout, environment, and bundled providers #1814

## Problem Statement

Sandbox creation currently requires users to manually specify environment variables, clone repos, and attach providers on every invocation. For repeatable workflows — such as running multiple parallel Claude Code sandboxes against the same repo with Vertex AI — this means re-typing the same flags each time, or scripting around the CLI.

Two concrete pain points:

1. **No git checkout on init.** Users who want code in a sandbox must either `--upload` local files or manually `exec` a `git clone` after creation. There is no way to declaratively say "this sandbox starts with repo X checked out."

2. **Provider-adjacent env vars have no home.** If a generic ADC/Google metadata provider were to replace the current `google-vertex-ai` provider (see [discussion](https://github.com/NVIDIA/OpenShell/pull/1752#issuecomment-4635414894)), Anthropic-specific Vertex env vars (`CLAUDE_CODE_USE_VERTEX`, `ANTHROPIC_VERTEX_PROJECT_ID`, etc.) would no longer belong in the provider's resolution logic — they are consumer config that sits on top of the credential layer. Today there is nowhere to put them except manual `exec` or custom images.

A **named sandbox template** stored on the gateway solves both: define once, create many sandboxes from the same template, with environment, git checkout, and default providers bundled together. Templates are purely a convenience — users can always create sandboxes without one, specifying flags inline as they do today.

## Related Issues

- **#1520** — `feat(cli): investigate sandbox specs and openshell apply -f`. Investigates file-based declarative sandbox definitions. Complementary: a YAML file could be the authoring format for a named template, and `apply -f` could create/update templates on the gateway.
- **#863** — `SandboxTemplate.environment` env vars not applied to container. Bug: the existing `SandboxTemplate.environment` proto field doesn't flow through to the sandbox. Must be fixed before templates can set env vars.
- **#1447** — `feat: warmpool support for OpenShell sandboxes`. Warm pools need a specification for *what* to pre-create — named templates are a natural fit. Note: #1447 references the `agents.x-k8s.io` `SandboxTemplate` CRD, which is a Kubernetes-level pod templating concept (image, resources, volumes). The `NamedSandboxTemplate` proposed here is an OpenShell gateway-scoped concept at a higher abstraction level (environment, git checkout, bundled providers). They share a name but operate at different layers and could compose: OpenShell templates define *what* to create, the K8s CRD defines *how* to schedule it.
- **#1268** — Inject read-only files into sandbox at creation time. Related workspace-setup concern (file injection before Landlock).
- **#1706** — Emulate GCE metadata server for Google SDK access. Provider-level feature that templates would reference via bundled providers.
- **#1423** — Make local dev credential discovery first-class. Provider-side complement — auto-discovered credentials compose with templates.
- **#1492** — Add opaque driver-specific passthrough to SandboxTemplate. Actively extending the template proto.

## Technical Context

### Current state of the proto

`SandboxTemplate` already exists at `proto/openshell.proto:335-363` with `image`, `labels`, `annotations`, `environment`, `resources`, and `driver_config` fields. But it is a flat struct embedded in `SandboxSpec` — not a named, reusable, gateway-persisted entity. `SandboxSpec.environment` also exists and flows through the gateway and compute drivers, but is **not exposed** by the CLI `sandbox create` command.

### Current sandbox creation flow

```
CLI (sandbox create --provider X --from image)
  → Gateway: validates providers, resolves image, persists Sandbox, calls ComputeDriver
    → Driver (Docker/K8s/VM): builds env (defaults + template env + spec env + driver overrides), launches workload
      → Supervisor: loads policy, fetches provider credentials via GetSandboxProviderEnvironment RPC, spawns child process with provider env + proxy env + TLS env
```

### Where env vars are set today

Four layers, in precedence order:
1. `SandboxSpec.environment` — proto field exists, flows correctly, **CLI never populates it**
2. `SandboxTemplate.environment` — proto field exists, **#863 reports it's broken**
3. Driver-controlled env — identity, callback, security-critical vars (always override)
4. Provider credential env — fetched at runtime by supervisor, injected as placeholders

### Environment merge semantics with templates

When a named template is used, environment variables merge in this order (later layers win):

1. **Provider credential env** — base layer from bundled (and any additionally attached) providers
2. **Template environment** — merges on top of provider env, allowing templates to override or supplement provider-injected vars
3. **Inline `--env` flags** (future work) — would merge on top of template env for per-sandbox overrides

This means a template can set consumer-specific env vars (e.g., `CLAUDE_CODE_USE_VERTEX=1`) that sit on top of what the provider injects (e.g., `GCP_PROJECT_ID`), without modifying the provider itself.
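The layering above can be sketched as a plain dictionary merge. This is a minimal illustration, not the gateway's implementation: `merge_environment` is a hypothetical helper, and the `demo-project` value is made up for the example.

```python
def merge_environment(provider_env, template_env, inline_env=None):
    """Merge env layers in order; later layers win on key conflicts.

    Hypothetical helper: provider env is the base, template env sits on
    top, and inline --env values (future work) would sit on top of both.
    """
    merged = dict(provider_env)        # 1. provider credential env (base)
    merged.update(template_env)        # 2. template environment
    merged.update(inline_env or {})    # 3. inline --env flags, if any
    return merged


# Example values are illustrative only.
provider_env = {"GCP_PROJECT_ID": "demo-project"}
template_env = {
    "CLAUDE_CODE_USE_VERTEX": "1",
    "ANTHROPIC_VERTEX_PROJECT_ID": "demo-project",
}
merged = merge_environment(provider_env, template_env)
```

Because each layer is applied with `update`, a template key that collides with a provider key replaces it, which matches the "later layers win" rule stated above.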

### Where git checkout happens today

**Nowhere.** The only related mechanism is `--upload` (tar-over-SSH of local files post-creation). No `git clone` runs during init.

## Affected Components

| Component | Key Files | Role |
|-----------|-----------|------|
| Proto definitions | `proto/openshell.proto` | Add `NamedSandboxTemplate` message, `GitCheckout` message, CRUD RPCs |
| Gateway server | `crates/openshell-server/src/grpc/sandbox.rs` | Resolve template by name during sandbox creation |
| Gateway persistence | `crates/openshell-server/src/` (store layer) | Persist named templates |
| CLI | `crates/openshell-cli/src/main.rs`, `crates/openshell-cli/src/run.rs` | `template create/list/get/delete` commands, `sandbox create --template` flag |
| Compute drivers | `crates/openshell-driver-{docker,kubernetes,vm}/` | Pass template env through to sandbox, handle git checkout volume |
| Python SDK | `python/openshell/` | Template CRUD bindings |

## Technical Investigation

### Architecture Overview

Named sandbox templates would be a new gateway-scoped domain object, similar to how providers are managed today:

```
NamedSandboxTemplate (gateway-persisted)
  ├── metadata (name, labels)
  ├── environment: map<string, string>       — env vars merged on top of provider env
  ├── git_checkout: GitCheckout              — repo to clone on init
  ├── providers: repeated string             — default providers to attach
  ├── image: string                          — override default sandbox image
  └── upload: repeated FileUpload            — files to upload on init (composable with --upload)
```

```protobuf
message GitCheckout {
  string url = 1;       // Git repo URL
  string ref = 2;       // Branch, tag, or commit (default: HEAD)
  string path = 3;      // Clone target path inside sandbox (default: /home/user/<repo-name>)
  bool shallow = 4;     // --depth 1 for faster clones
}
```
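To make the defaults in the field comments concrete, here is a rough Python mirror of `GitCheckout` and one way the clone command could be assembled. The dataclass and both helpers (`default_clone_path`, `clone_argv`) are hypothetical names for illustration. The sketch only covers branch and tag refs, since `git clone --branch` does not accept a raw commit SHA; a commit ref would need a separate fetch and checkout step that is not shown.

```python
import posixpath
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class GitCheckout:
    """Hypothetical Python mirror of the proposed proto message."""
    url: str
    ref: str = ""          # empty means the remote's default HEAD
    path: str = ""         # empty means /home/user/<repo-name>
    shallow: bool = False


def default_clone_path(url: str) -> str:
    # Use the last path segment of the URL and drop a trailing ".git".
    name = posixpath.basename(urlparse(url).path.rstrip("/"))
    if name.endswith(".git"):
        name = name[: -len(".git")]
    return f"/home/user/{name}"


def clone_argv(checkout: GitCheckout) -> list:
    """Build the argv for a `git clone` matching the checkout fields."""
    argv = ["git", "clone"]
    if checkout.shallow:
        argv += ["--depth", "1"]
    if checkout.ref:
        # --branch takes a branch or tag name, not a commit SHA.
        argv += ["--branch", checkout.ref]
    target = checkout.path or default_clone_path(checkout.url)
    # "--" ends option parsing so the URL cannot be read as a flag.
    argv += ["--", checkout.url, target]
    return argv


argv = clone_argv(
    GitCheckout(url="https://github.com/NVIDIA/OpenShell.git", ref="main", shallow=True)
)
```

Returning an argv list rather than a shell string keeps the URL and ref out of shell interpolation, which matters if template fields are user-supplied.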

Templates are a convenience, not a requirement. Users who don't need reusability continue using inline flags exactly as they do today. A template simply pre-populates what would otherwise be specified on every `sandbox create` invocation.

### Git checkout: gateway-side clone (outside the sandbox)

**Recommended: the gateway or compute driver clones the repo** using credentials from a bundled provider (e.g., a `github` provider's token), then mounts/copies the checkout into the sandbox filesystem. Benefits:

- **No git credentials inside the sandbox** — the sandbox never sees the git token. The agent can read/edit code but cannot `git push`.
- **Simpler lifecycle** — no dependency on proxy being up, no network namespace concerns.
- **Writable checkout** — the agent can modify files; the `.git` directory could optionally be stripped or made read-only.
- **Composes with `--upload`** — both the template's git checkout and the user's `--upload` path work together (upload overlays on top of checkout).

Trade-off: the sandbox cannot `git pull` or `git push` without separate git access. Agent output would be extracted via `sandbox exec`, file download, or diff export.
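The "upload overlays on top of checkout" composition can be sketched with two directory copies into a staging area. This is only an illustration of the ordering, assuming the gateway or driver stages files on local disk before handing them to the sandbox; `stage_workspace` and `demo` are hypothetical names, and the real mechanism would differ per driver.

```python
import shutil
import tempfile
from pathlib import Path


def stage_workspace(checkout_dir: Path, upload_dir: Path, staging_dir: Path) -> Path:
    """Copy the checkout first, then the upload, so uploads win on conflicts."""
    shutil.copytree(checkout_dir, staging_dir, dirs_exist_ok=True)
    shutil.copytree(upload_dir, staging_dir, dirs_exist_ok=True)
    return staging_dir


def demo() -> dict:
    """Build a tiny checkout and upload, stage them, and return the result."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        checkout = root / "checkout"
        upload = root / "upload"
        staging = root / "staging"
        checkout.mkdir()
        upload.mkdir()
        (checkout / "README.md").write_text("from repo")
        (checkout / "config.toml").write_text("from repo")
        (upload / "config.toml").write_text("from upload")
        stage_workspace(checkout, upload, staging)
        return {p.name: p.read_text() for p in sorted(staging.iterdir())}


staged = demo()
```

Files that exist only in the checkout survive, while a file present in both takes the uploaded content.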

### Code References

| Location | Description |
|----------|-------------|
| `proto/openshell.proto:309-332` | `SandboxSpec`: has `environment`, `template`, `providers` fields |
| `proto/openshell.proto:335-363` | `SandboxTemplate`: flat struct with `environment`, `image`, `resources` |
| `proto/openshell.proto:1089-1103` | `GetSandboxProviderEnvironmentResponse`: provider env resolution |
| `crates/openshell-server/src/grpc/sandbox.rs:117-224` | `handle_create_sandbox_inner()`: validates providers, resolves template |
| `crates/openshell-server/src/grpc/provider.rs:430-539` | `resolve_provider_environment()`: provider credentials → env var map |
| `crates/openshell-cli/src/main.rs:1162-1293` | CLI `sandbox create` command — no `--env` or `--template` flags |
| `crates/openshell-cli/src/run.rs:1740-1920` | `sandbox_create()`: CLI-side creation logic |
| `crates/openshell-driver-docker/src/lib.rs:1634-1710` | Docker driver `build_environment()` |
| `crates/openshell-sandbox/src/lib.rs:370-425` | Supervisor fetches provider env at startup |
| `crates/openshell-sandbox/src/process.rs:194-258` | `inject_provider_env()` + child process spawn |

### Current Behavior

When `sandbox create` is called:
1. CLI parses `--provider`, `--from`, `--policy` flags — no `--env` or `--template` support
2. Gateway validates providers exist, checks for credential key collisions, resolves image
3. `SandboxSpec` is persisted with empty `environment` map (CLI never populates it)
4. Compute driver builds container env from defaults + template env + spec env + driver overrides
5. Supervisor fetches provider credentials at runtime and injects into child process

### What Would Need to Change

**Proto layer:**
- New `NamedSandboxTemplate` message with metadata, environment, git_checkout, providers, image
- New `GitCheckout` message
- CRUD RPCs: `CreateTemplate`, `GetTemplate`, `ListTemplates`, `DeleteTemplate`
- `CreateSandboxRequest` gains a `template_name` field (or reuses the existing `template` field)

**Gateway:**
- Template persistence (same store pattern as providers)
- Template resolution during sandbox creation: merge template fields into SandboxSpec
- Git checkout execution: clone repo using bundled provider credentials, make available to compute driver
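As a rough picture of the template resolution step, the sketch below merges a template into an inline spec. The `Template` and `Spec` classes are hypothetical stand-ins for the proto messages, and the conflict rules (inline values win, providers are unioned in first-seen order) are assumptions, since creation-time overrides are deferred to future work.

```python
from dataclasses import dataclass, field


@dataclass
class Template:
    """Hypothetical stand-in for NamedSandboxTemplate."""
    name: str
    environment: dict = field(default_factory=dict)
    providers: list = field(default_factory=list)
    image: str = ""


@dataclass
class Spec:
    """Hypothetical stand-in for the fields of SandboxSpec used here."""
    environment: dict = field(default_factory=dict)
    providers: list = field(default_factory=list)
    image: str = ""


def resolve_spec(template: Template, inline: Spec) -> Spec:
    """Merge template fields into a spec; inline values win (assumption)."""
    # dict.fromkeys keeps first-seen order while dropping duplicates.
    providers = list(dict.fromkeys(template.providers + inline.providers))
    return Spec(
        environment={**template.environment, **inline.environment},
        providers=providers,
        image=inline.image or template.image,
    )


# Template name and image tag are illustrative only.
template = Template(
    name="vertex-dev",
    environment={"CLAUDE_CODE_USE_VERTEX": "1"},
    providers=["google-vertex-ai", "github"],
    image="example/sandbox:latest",
)
resolved = resolve_spec(template, Spec(providers=["github"]))
```

Resolving into a plain spec before calling the compute driver would mean drivers do not need to know templates exist, apart from the git checkout handoff.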

**CLI:**
- `openshell template create/list/get/delete` commands
- `openshell sandbox create --template <name>` flag
- Template overrides at creation time are deferred to future work

**Compute drivers:**
- Accept git checkout directory/volume and mount into sandbox filesystem
- Docker: bind mount or copy into container
- K8s: init container or volume mount
- VM: include in rootfs or mount

**Python SDK:**
- Template CRUD methods on the client

### Alternative Approaches Considered

**Option A: Extend SandboxSpec inline (no named templates).** Add `--env` and `--git-repo` flags to `sandbox create`. Simpler, but no reusability — you retype everything each time. Doesn't satisfy the "5 parallel sandboxes from the same config" use case well.

**Option B: CLI-side config files only.** Store templates as local YAML files, expand them client-side before sending to the gateway. Simpler (no new RPCs/persistence), but not shareable across machines or team members. #1520 explores this direction — the two could converge.

**Option C: Named gateway-scoped templates (recommended).** Full CRUD on the gateway. Reusable, shareable, composable with providers. More implementation work but the right long-term abstraction.

### Patterns to Follow

- Provider CRUD in `crates/openshell-server/src/grpc/provider.rs` — same pattern for template CRUD
- Provider persistence in the gateway store — same pattern for template persistence
- Provider CLI commands — same pattern for template CLI commands
- `--upload` tar-over-SSH in `crates/openshell-cli/src/run.rs:5669-5706` — git checkout should compose with this

## Proposed Approach

Introduce `NamedSandboxTemplate` as a gateway-persisted domain object with environment, git checkout, bundled providers, and optional image override. Templates are managed via CLI CRUD commands and referenced by name at sandbox creation time. Git checkout happens at the gateway/driver level before the sandbox starts, so git credentials never enter the sandbox. The `--upload` path remains supported and composes with template-driven checkout. Template environment merges on top of provider-injected env vars, giving templates a natural place for consumer-specific config that doesn't belong in the provider itself. Template overrides at sandbox creation time are deferred to future work.

## Scope Assessment

- **Complexity:** High — new domain object, CRUD RPCs, persistence, CLI surface, git checkout orchestration across three compute drivers
- **Confidence:** Medium — core design is clear, but git checkout mechanics vary significantly across Docker/K8s/VM drivers
- **Estimated files to change:** 15-25
- **Issue type:** `feat`

## Risks & Open Questions

- **#863 must be fixed first** — `SandboxTemplate.environment` is currently broken. Template env vars flowing through is a prerequisite.
- **Git checkout across drivers** — Docker bind mounts, K8s init containers/volumes, and VM rootfs injection are three different mechanisms. Need to define the driver interface for "make this directory available in the sandbox."
- **Git credential scoping** — if the template bundles a `github` provider, the gateway can use that provider's token for the clone. But the clone happens outside the sandbox, so the gateway needs access to provider credentials at clone time, not just at sandbox runtime.
- **Template + provider interaction** — when a template bundles providers, does the user need to have those providers already created? Or can the template reference provider *types* and auto-create instances?
- **Upload + git checkout ordering** — if both are specified, upload should overlay on top of the git checkout. Need to define sequencing.
- **Template mutability** — can a template be updated after creation? Do running sandboxes reflect updates, or are they snapshots?
- **Relationship to #1520** — the YAML file format from #1520 could become the authoring surface for templates. Need alignment on whether templates are created via `openshell template create --from-file template.yaml` or `openshell apply -f template.yaml`.

## Test Considerations

- **Unit tests:** Template CRUD operations, template resolution during sandbox creation, environment merging (provider env → template env layering)
- **Unit tests:** Template + provider bundling — verify provider attachment, credential key collision checks
- **Integration tests:** Template persistence round-trip (create, get, list, delete)
- **E2e tests:** Create sandbox from template with env vars — verify env vars appear in sandbox and override provider env where specified
- **E2e tests:** Create sandbox from template with git checkout — verify repo is cloned and writable
- **E2e tests:** Create sandbox from template with bundled providers — verify providers are attached and credentials flow
- **E2e tests:** Template + `--upload` composition — verify both git checkout and uploaded files are present
- **E2e tests:** Create sandbox without a template — verify existing inline workflow is unaffected
- **Negative tests:** Template with nonexistent provider, template with invalid git URL, duplicate template names
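The negative cases above could be exercised against a small in-memory stand-in for the gateway store, sketched below. `TemplateStore`, `TemplateError`, and the accepted URL schemes are all assumptions made for illustration; the real validation rules are not defined in this issue.

```python
from urllib.parse import urlparse

# Assumed set of acceptable schemes; the real policy is an open question.
ALLOWED_GIT_SCHEMES = {"https", "ssh", "git"}


class TemplateError(ValueError):
    """Raised when a template fails validation (hypothetical error type)."""


class TemplateStore:
    """In-memory stand-in for gateway template persistence."""

    def __init__(self, known_providers):
        self._known_providers = set(known_providers)
        self._templates = {}

    def create(self, name, providers=(), git_url=""):
        if name in self._templates:
            raise TemplateError(f"template {name!r} already exists")
        missing = sorted(set(providers) - self._known_providers)
        if missing:
            raise TemplateError(f"unknown providers: {missing}")
        if git_url:
            parsed = urlparse(git_url)
            if parsed.scheme not in ALLOWED_GIT_SCHEMES or not parsed.netloc:
                raise TemplateError(f"invalid git URL: {git_url!r}")
        self._templates[name] = {"providers": list(providers), "git_url": git_url}
        return self._templates[name]

    def get(self, name):
        return self._templates[name]

    def list(self):
        return sorted(self._templates)

    def delete(self, name):
        del self._templates[name]


store = TemplateStore(known_providers=["github", "google-vertex-ai"])
store.create(
    "vertex-dev",
    providers=["github"],
    git_url="https://github.com/NVIDIA/OpenShell.git",
)
```

Each negative test then reduces to asserting that `create` raises `TemplateError` for a duplicate name, an unknown provider, or a malformed URL.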

---
*Created by spike investigation. Use `build-from-issue` to plan and implement.*
