2 changes: 1 addition & 1 deletion architecture/build-containers.md
@@ -9,7 +9,7 @@ The gateway runs the control plane API server. It is deployed as a StatefulSet i
- **Docker target**: `gateway` in `deploy/docker/Dockerfile.images`
- **Registry**: `ghcr.io/nvidia/openshell/gateway:latest`
- **Pulled when**: Cluster startup (the Helm chart triggers the pull)
- - **Entrypoint**: `openshell-gateway --port 8080` (gRPC + HTTP, mTLS)
+ - **Entrypoint**: `openshell-gateway --bind-address 0.0.0.0 --port 8080` (gRPC + HTTP, mTLS)

## Cluster (`openshell/cluster`)

4 changes: 2 additions & 2 deletions architecture/gateway-security.md
@@ -304,9 +304,9 @@ Traffic flows through several layers from the host to the gateway process:
| Container | `30051` | Hardcoded in `crates/openshell-bootstrap/src/docker.rs` |
| k3s NodePort | `30051` | `deploy/helm/openshell/values.yaml` (`service.nodePort`) |
| k3s Service | `8080` | `deploy/helm/openshell/values.yaml` (`service.port`) |
- | Server bind | `8080` | `--port` flag / `OPENSHELL_SERVER_PORT` env var |
+ | Server bind | `0.0.0.0:8080` in deployed containers | `--bind-address 0.0.0.0 --port 8080` / `OPENSHELL_BIND_ADDRESS` + `OPENSHELL_SERVER_PORT` |

- Docker maps `host_port → 30051/tcp`. Inside k3s, the NodePort service maps `30051 → 8080 (pod port)`. The server binds `0.0.0.0:8080`.
+ Docker maps `host_port → 30051/tcp`. Inside k3s, the NodePort service maps `30051 → 8080 (pod port)`. The deployed gateway container binds `0.0.0.0:8080` explicitly.

## Security Model Summary

19 changes: 13 additions & 6 deletions architecture/gateway.md
@@ -107,22 +107,25 @@ The gateway boots in `cli::run_cli` (`crates/openshell-server/src/cli.rs`) and p
- `docker` constructs `openshell-driver-docker` in-process and manages local containers labeled with the configured sandbox namespace.
- `vm` spawns the standalone `openshell-driver-vm` binary as a local compute-driver process, resolves it from `--driver-dir`, conventional libexec install paths, or a sibling of the gateway binary, connects to it over a Unix domain socket, and keeps the libkrun/rootfs runtime out of the gateway binary.
3. Build `ServerState` (shared via `Arc<ServerState>` across all handlers), including a fresh `SupervisorSessionRegistry`.
- 4. **Spawn background tasks**:
+ 4. Resume persisted sandboxes that were stopped during the previous gateway shutdown.
+ 5. **Spawn background tasks**:
- `ComputeRuntime::spawn_watchers` -- consumes the compute-driver watch stream, republishes platform events, and runs a periodic `ListSandboxes` snapshot reconcile.
- `ssh_tunnel::spawn_session_reaper` -- sweeps expired or revoked SSH session tokens from the store hourly.
- `supervisor_session::spawn_relay_reaper` -- sweeps orphaned pending relay channels every 30 seconds.
- 5. Create `MultiplexService`.
- 6. Bind `TcpListener` on `config.bind_address`.
- 7. Optionally create `TlsAcceptor` from cert/key files.
- 8. Enter the accept loop: for each connection, spawn a tokio task that optionally performs a TLS handshake, then calls `MultiplexService::serve()`.
+ 6. Create `MultiplexService`.
+ 7. Bind the primary gateway listener and any compute-driver requested listeners. Docker requests the Docker bridge gateway address with the normal gateway port, so sandbox containers can call back over the bridge without joining the host network.
+ 8. Bind optional health and metrics listeners.
+ 9. Optionally create `TlsAcceptor` from cert/key files.
+ 10. Spawn a task per gateway listener. Each accepted connection optionally performs a TLS handshake, then calls `MultiplexService::serve()`.
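The listener step in the new startup sequence can be sketched as a small planning function. This is only an illustration under stated assumptions: `plan_listeners` and `covers` are hypothetical names, not functions from this change, and the rule they encode is the one stated in the Docker driver notes of this file (an extra address already covered by a wildcard primary listener on the same port is not bound a second time). Treating an IPv6 wildcard as not covering IPv4 addresses is a conservative assumption of the sketch.

```rust
use std::net::{IpAddr, SocketAddr};

/// Returns the addresses to bind: the primary listener first, then any
/// driver-requested extras that an earlier entry does not already cover.
/// Hypothetical helper for illustration only.
fn plan_listeners(primary: SocketAddr, requested: &[SocketAddr]) -> Vec<SocketAddr> {
    let mut plan = vec![primary];
    for &extra in requested {
        if !plan.iter().any(|bound| covers(*bound, extra)) {
            plan.push(extra);
        }
    }
    plan
}

/// A bound address covers a candidate when the ports match and the bound IP
/// is either identical or the wildcard of the same address family.
/// Assumption: an IPv6 wildcard is not treated as covering IPv4 addresses.
fn covers(bound: SocketAddr, candidate: SocketAddr) -> bool {
    if bound.port() != candidate.port() {
        return false;
    }
    if bound.ip() == candidate.ip() {
        return true;
    }
    match (bound.ip(), candidate.ip()) {
        (IpAddr::V4(b), IpAddr::V4(_)) => b.is_unspecified(),
        (IpAddr::V6(b), IpAddr::V6(_)) => b.is_unspecified(),
        _ => false,
    }
}
```

With a loopback primary, a bridge address requested by the driver becomes a second listener. With a `0.0.0.0` primary on the same port, the plan contains only the primary.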

## Configuration

All configuration is via CLI flags with environment variable fallbacks. The `--db-url` flag is required. The `--ssh-handshake-secret` flag is required for non-Docker drivers; Docker sandboxes do not receive a handshake secret.

| Flag | Env Var | Default | Description |
|------|---------|---------|-------------|
- | `--port` | `OPENSHELL_SERVER_PORT` | `8080` | TCP listen port (binds `0.0.0.0`) |
+ | `--bind-address` | `OPENSHELL_BIND_ADDRESS` | `127.0.0.1` | IP address for gateway, health, and metrics listeners. Container deployments pass `0.0.0.0` explicitly. |
+ | `--port` | `OPENSHELL_SERVER_PORT` | `8080` | TCP listen port |
| `--log-level` | `OPENSHELL_LOG_LEVEL` | `info` | Tracing log level filter |
| `--tls-cert` | `OPENSHELL_TLS_CERT` | None | Path to PEM certificate file |
| `--tls-key` | `OPENSHELL_TLS_KEY` | None | Path to PEM private key file |
@@ -135,6 +138,7 @@ All configuration is via CLI flags with environment variable fallbacks. The `--d
| `--sandbox-image` | `OPENSHELL_SANDBOX_IMAGE` | None | Default container image for sandbox pods |
| `--grpc-endpoint` | `OPENSHELL_GRPC_ENDPOINT` | None | gRPC endpoint reachable from within the cluster (for supervisor callbacks) |
| `--drivers` | `OPENSHELL_DRIVERS` | `kubernetes` | Compute backend to use. Current options are `kubernetes`, `docker`, and `vm`. |
+ | `--docker-network-name` | `OPENSHELL_DOCKER_NETWORK_NAME` | `openshell-docker` | Docker bridge network that local Docker sandboxes join |
| `--vm-driver-state-dir` | `OPENSHELL_VM_DRIVER_STATE_DIR` | `target/openshell-vm-driver` | Host directory for VM sandbox rootfs, console logs, and runtime state |
| `--driver-dir` | `OPENSHELL_DRIVER_DIR` | unset | Override directory for `openshell-driver-vm`. When unset, the gateway searches `~/.local/libexec/openshell`, `/usr/local/libexec/openshell`, `/usr/local/libexec`, then a sibling binary. |
| `--vm-krun-log-level` | `OPENSHELL_VM_KRUN_LOG_LEVEL` | `1` | libkrun log level for VM helper processes |
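The flag, environment variable, and default precedence in the table can be sketched for the two listener settings. This is a minimal sketch, not the gateway's actual parsing code: `pick` and `resolve_bind` are hypothetical helpers that take already-collected strings rather than reading the process environment, and the only values taken from the table are the defaults `127.0.0.1` and `8080`.

```rust
use std::net::{IpAddr, SocketAddr};

/// Defaults taken from the configuration table.
const DEFAULT_BIND_IP: &str = "127.0.0.1";
const DEFAULT_PORT: u16 = 8080;

/// Picks the first value present: CLI flag, then environment variable, then default.
fn pick<'a>(flag: Option<&'a str>, env: Option<&'a str>, default: &'a str) -> &'a str {
    flag.or(env).unwrap_or(default)
}

/// Hypothetical helper: resolves the listener address from already-collected
/// `--bind-address` / `OPENSHELL_BIND_ADDRESS` and `--port` / `OPENSHELL_SERVER_PORT` values.
fn resolve_bind(
    flag_ip: Option<&str>,
    env_ip: Option<&str>,
    flag_port: Option<&str>,
    env_port: Option<&str>,
) -> Result<SocketAddr, String> {
    let ip_text = pick(flag_ip, env_ip, DEFAULT_BIND_IP);
    let ip: IpAddr = ip_text
        .parse()
        .map_err(|e| format!("invalid bind address {ip_text:?}: {e}"))?;
    let port = match flag_port.or(env_port) {
        Some(text) => text
            .parse::<u16>()
            .map_err(|e| format!("invalid port {text:?}: {e}"))?,
        None => DEFAULT_PORT,
    };
    Ok(SocketAddr::new(ip, port))
}
```

With no inputs the result is the loopback default. A container deployment that sets only the bind address to `0.0.0.0` keeps the default port.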
@@ -607,6 +611,9 @@ The gateway reaches the sandbox exclusively through the supervisor-initiated `Co
The Docker driver (`crates/openshell-driver-docker/src/lib.rs`) is an in-process compute backend for local standalone gateways. It creates one Docker container per sandbox, labels each container with `openshell.ai/managed-by=openshell`, `openshell.ai/sandbox-id`, `openshell.ai/sandbox-name`, and `openshell.ai/sandbox-namespace`, and bind-mounts a Linux `openshell-sandbox` supervisor binary into the container.

- **Create**: Pulls or validates the sandbox image according to `sandbox_image_pull_policy`, creates a labeled container, mounts the supervisor binary and optional TLS material, and starts the container with the supervisor as entrypoint.
+ - **Bridge networking**: Ensures a local Docker bridge network exists (`openshell-docker` by default) and starts every sandbox container on that network instead of using `network_mode=host`.
+ - **Gateway callback routing**: On native Linux Docker, injects `host.openshell.internal` with the bridge gateway IP and reports that bridge gateway IP plus the normal gateway port to `run_server()` as an extra listener. If the primary listener already binds the wildcard address for that port, the extra address is covered and is not bound a second time. On Docker Desktop, the bridge gateway IP belongs to Docker Desktop's VM rather than the macOS/Windows host, so the driver maps `host.openshell.internal` to Docker's `host-gateway` alias and does not request an extra listener. `OPENSHELL_ENDPOINT` inside Docker sandboxes uses the configured scheme and points at `host.openshell.internal:<gateway-port>` in both cases.
+ - **Environment ownership**: Merges template and spec environment first, then overwrites driver-owned supervisor variables, including `PATH`, `OPENSHELL_ENDPOINT`, `OPENSHELL_SANDBOX_ID`, `OPENSHELL_SSH_SOCKET_PATH`, and `OPENSHELL_SANDBOX_COMMAND`. This keeps privileged supervisor setup from resolving helper binaries through a user-controlled search path.
- **List/Get/Watch**: Reads labeled containers in the configured sandbox namespace and derives driver-native sandbox status from Docker state plus supervisor relay readiness.
- **Stop**: Stops the matching labeled container without deleting it.
- **Delete**: Force-removes the matching labeled container.
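The environment-ownership rule in the list above amounts to a layered merge in which the driver's values are applied last. The sketch below is illustrative only: `merge_sandbox_env` is a hypothetical name, and the assumption that spec values override template values is not stated in the source, which only says the two are merged before the driver-owned variables overwrite them.

```rust
use std::collections::BTreeMap;

/// Merges sandbox environment in increasing priority: template, then spec
/// (assumed order), then driver-owned variables, which always win.
/// Hypothetical helper for illustration only.
fn merge_sandbox_env(
    template: &BTreeMap<String, String>,
    spec: &BTreeMap<String, String>,
    driver_owned: &BTreeMap<String, String>,
) -> BTreeMap<String, String> {
    let mut merged = template.clone();
    // Later layers replace earlier values for the same key.
    merged.extend(spec.iter().map(|(k, v)| (k.clone(), v.clone())));
    merged.extend(driver_owned.iter().map(|(k, v)| (k.clone(), v.clone())));
    merged
}
```

Because the driver-owned map is applied last, a `PATH` or `OPENSHELL_ENDPOINT` supplied by a template or spec cannot survive into the supervisor's environment, while unrelated user variables pass through unchanged.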
11 changes: 10 additions & 1 deletion crates/openshell-core/src/config.rs
@@ -30,6 +30,9 @@ pub const DEFAULT_SSH_HANDSHAKE_SKEW_SECS: u64 = 300;
/// Default Podman bridge network name.
pub const DEFAULT_NETWORK_NAME: &str = "openshell";

+ /// Default Docker bridge network name for local sandboxes.
+ pub const DEFAULT_DOCKER_NETWORK_NAME: &str = "openshell-docker";
+
/// Default OCI image for the openshell-sandbox supervisor binary.
pub const DEFAULT_SUPERVISOR_IMAGE: &str = "openshell/supervisor:latest";

@@ -387,7 +390,7 @@ impl Config {
}

fn default_bind_address() -> SocketAddr {
-     "0.0.0.0:8080".parse().expect("valid default address")
+     "127.0.0.1:8080".parse().expect("valid default address")
}

fn default_log_level() -> String {
@@ -473,6 +476,12 @@ mod tests {
);
}

+ #[test]
+ fn config_defaults_to_loopback_bind_address() {
+     let expected: SocketAddr = "127.0.0.1:8080".parse().expect("valid address");
+     assert_eq!(Config::new(None).bind_address, expected);
+ }
+
#[test]
fn config_new_disables_health_bind_by_default() {
let cfg = Config::new(None);