
[BUG] nemoclaw onboard forces --gpu on WSL2, sandbox DOA (workaround included) #208

@mattezell

Description


Environment

  • OS: Windows 11 → WSL2 Ubuntu 24.04 LTS
  • Docker: Docker Desktop 4.x with WSL2 integration
  • GPU: NVIDIA RTX 5090 Laptop (also confirmed on RTX 5070 Ti — see NVIDIA Forums #363769)
  • NemoClaw: v0.0.7
  • OpenShell: v0.0.7
  • OpenClaw: 2026.3.11

Problem

nemoclaw onboard detects nvidia-smi on WSL2 and forces --gpu on both openshell gateway start and openshell sandbox create. On WSL2 with Docker Desktop, the GPU cannot be passed through to the k3s cluster inside the gateway container. The sandbox reports as created but is immediately dead — every subsequent command returns "sandbox not found".

There is no --no-gpu flag, no environment variable to skip GPU detection, and no config option to override this behavior.

Steps to Reproduce

  1. Install NemoClaw on WSL2 with any NVIDIA GPU present
  2. Run nemoclaw onboard
  3. openshell doctor check passes all prerequisites
  4. Step [3/7] "Creating sandbox" reports success
  5. Immediately after, any sandbox command returns status: NotFound

Expected Behavior

--gpu should be optional. Either:

  • Add a --no-gpu or --skip-gpu flag to nemoclaw onboard
  • Detect WSL2 via /proc/version and skip GPU passthrough automatically
  • Fall back gracefully when GPU passthrough fails instead of creating a dead sandbox
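The WSL2 detection suggested above can be sketched in a few lines of shell. This is an illustration of the requested behavior, not nemoclaw's actual implementation — the `GPU_FLAG` variable and the echoed command are my own:

```shell
#!/bin/sh
# Sketch: detect WSL2 from /proc/version and drop --gpu automatically.
# WSL2 kernels report a version string containing "microsoft".
GPU_FLAG="--gpu"
if grep -qi microsoft /proc/version 2>/dev/null; then
  # Docker Desktop on WSL2 cannot pass the GPU through to the k3s
  # cluster inside the gateway container, so omit the flag.
  GPU_FLAG=""
fi
echo "would run: openshell gateway start --name nemoclaw ${GPU_FLAG}"
```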

Workaround

Bypass nemoclaw onboard entirely and drive openshell directly without --gpu:

# 1. Clean stale state (critical — failed onboard runs corrupt k3s)
openshell sandbox delete <name> 2>/dev/null
openshell gateway destroy --name nemoclaw 2>/dev/null
docker volume rm openshell-cluster-nemoclaw 2>/dev/null

# 2. Start gateway WITHOUT --gpu
openshell gateway start --name nemoclaw

# 3. Create provider BEFORE sandbox (credentials injected at creation time)
openshell provider create --name nvidia-nim --type nvidia \
  --credential NVIDIA_API_KEY=nvapi-xxx

# 4. Set inference route
openshell inference set --provider nvidia-nim \
  --model nvidia/nemotron-3-super-120b-a12b

# 5. Create sandbox WITHOUT --gpu
openshell sandbox create --name my-sandbox --from openclaw

Inside the sandbox, openclaw onboard must use https://inference.local/v1 as the base URL (Custom Provider → OpenAI-compatible) since the sandbox blocks direct outbound network.
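One way to confirm the route is up before running openclaw onboard is a quick curl from inside the sandbox. This check is my addition, not part of the documented workaround; it assumes the gateway's OpenAI-compatible surface includes the standard /v1/models listing, and uses -k in case the gateway presents a self-signed certificate:

```shell
#!/bin/sh
# Sanity check from inside the sandbox: is the inference route answering?
BASE_URL="${BASE_URL:-https://inference.local/v1}"
if curl -fsk --max-time 5 "${BASE_URL}/models" >/dev/null 2>&1; then
  echo "inference route reachable: ${BASE_URL}"
else
  echo "inference route NOT reachable: ${BASE_URL} - re-check 'openshell inference set'" >&2
fi
```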

Full workaround documentation and automated deploy scripts for WSL2 and macOS: thenewguardai/tng-nemoclaw-quickstart — WSL2-WORKAROUND.md

Additional Notes

  • Stale gateway state from failed nemoclaw onboard runs requires docker volume rm openshell-cluster-nemoclaw to fully clear — openshell gateway destroy alone is not sufficient
  • Provider must be created before sandbox creation, otherwise the sandbox lacks inference credentials
  • ANTHROPIC_API_KEY in the sandbox environment causes OpenClaw to silently default to Claude regardless of configured model
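A minimal guard against the last point is to strip the variable for the openclaw invocation only. `env -u` is standard GNU coreutils; the key value below is a placeholder:

```shell
# env -u removes a variable for one command without unsetting it in the
# parent shell. Inside the sandbox, prefer:
#   env -u ANTHROPIC_API_KEY openclaw onboard
# Demonstration with a placeholder value:
export ANTHROPIC_API_KEY="sk-ant-placeholder"
env -u ANTHROPIC_API_KEY sh -c 'echo "key is: ${ANTHROPIC_API_KEY:-<unset>}"'
# → key is: <unset>
```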
