Agent Diagnostic
- Pointed Claude Code at the OpenShell repo (`crates/openshell-sandbox/src/`)
- Read `proxy.rs` to trace the CONNECT handling flow: request parsing (line 301) → `evaluate_opa_tcp()` (line 340) → OPA deny → 403 response (line 414)
- Read `procfs.rs` to understand binary identity resolution: `resolve_tcp_peer_identity()` → `parse_proc_net_tcp()` → `find_pid_by_socket_inode()`
- Identified that `parse_proc_net_tcp()` only reads `/proc/<pid>/net/tcp{,6}` (line 178)
- Compared `/proc/1/net/tcp` contents between macOS Docker (arm64) and WSL2 Docker (amd64): sandbox user connections visible on macOS, absent on WSL2
- Compared `/proc/net/tcp` (global view) on WSL2: sandbox user connections ARE visible there
- Root cause: On WSL2, iptables REDIRECT/DNAT connections from the sandbox network namespace don't appear in per-PID `/proc/<pid>/net/tcp`, only in global `/proc/net/tcp`
- Fix: Added `/proc/net/tcp{,6}` as fallback in `parse_proc_net_tcp()`. Built patched binary on amd64 and verified the fix.
Description
The sandbox egress proxy at 10.200.0.1:3128 returns HTTP 403 Forbidden for all HTTP CONNECT requests when running on Docker Desktop with WSL2 (Windows 11, amd64). The same container image, policy, and binary work correctly on Docker Desktop with Apple Hypervisor (macOS, arm64).
parse_proc_net_tcp() in procfs.rs only reads /proc/<entrypoint_pid>/net/tcp{,6} to resolve the socket inode for the peer connection. On WSL2/Docker Desktop, connections from the sandbox user namespace that are redirected via iptables REDIRECT/DNAT to the proxy are not visible in the per-PID table — they only appear in the global /proc/net/tcp.
As a result, resolve_tcp_peer_identity() fails: the proxy cannot identify the calling binary, so no network policy matches and OPA denies the request.
Forward proxy requests (GET http://...) are also affected since they use the same evaluate_opa_tcp() code path.
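The fallback idea can be sketched in a few lines of Rust. This is a minimal illustration and not the actual patch: `TcpEntry`, `parse_addr`, `parse_tcp_table`, `find_inode`, and `read_tables` are hypothetical names, only IPv4 (`tcp`, not `tcp6`) is handled, a little-endian host is assumed (true for both amd64 and arm64), and matching on the entry's local address is an assumption about how the peer socket is located.

```rust
use std::net::{Ipv4Addr, SocketAddrV4};

/// One row of a /proc/.../net/tcp table (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
struct TcpEntry {
    local: SocketAddrV4,
    remote: SocketAddrV4,
    uid: u32,
    inode: u64,
}

/// Decode "0100007F:0C38" style fields. The kernel prints the IPv4 address
/// as a native-endian u32, so on little-endian hosts the bytes are reversed.
fn parse_addr(field: &str) -> Option<SocketAddrV4> {
    let (ip_hex, port_hex) = field.split_once(':')?;
    if ip_hex.len() != 8 {
        return None; // not an IPv4 entry
    }
    let ip = u32::from_str_radix(ip_hex, 16).ok()?;
    let port = u16::from_str_radix(port_hex, 16).ok()?;
    Some(SocketAddrV4::new(Ipv4Addr::from(ip.to_le_bytes()), port))
}

/// Parse a whole table, skipping the header line and any malformed rows.
fn parse_tcp_table(contents: &str) -> Vec<TcpEntry> {
    contents
        .lines()
        .skip(1)
        .filter_map(|line| {
            let f: Vec<&str> = line.split_whitespace().collect();
            Some(TcpEntry {
                local: parse_addr(f.get(1)?)?,
                remote: parse_addr(f.get(2)?)?,
                uid: f.get(7)?.parse().ok()?,
                inode: f.get(9)?.parse().ok()?,
            })
        })
        .collect()
}

/// Search the tables in order (per-PID first, global last) for the socket
/// whose local address is the peer address the proxy saw on accept().
fn find_inode(tables: &[&str], peer: SocketAddrV4) -> Option<u64> {
    tables
        .iter()
        .flat_map(|t| parse_tcp_table(t))
        .find(|e| e.local == peer)
        .map(|e| e.inode)
}

/// Read the per-PID table and then the global table; unreadable files are skipped.
fn read_tables(entrypoint_pid: u32) -> Vec<String> {
    let paths = [
        format!("/proc/{entrypoint_pid}/net/tcp"),
        "/proc/net/tcp".to_string(),
    ];
    paths
        .iter()
        .filter_map(|p| std::fs::read_to_string(p).ok())
        .collect()
}
```

Keeping the per-PID table first means platforms where it already works behave as before; the global table is consulted only when the per-PID lookup finds nothing.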
Reproduction Steps
- Run OpenShell cluster on Docker Desktop + WSL2 (Windows 11, amd64)
- Create a sandbox with a network policy allowing `socat` and `curl` to connect to a non-loopback IP:

      network_policies:
        my_endpoint:
          endpoints:
            - host: 100.77.240.62
              port: 18789
              enforcement: enforce
              access: full
          binaries:
            - path: /usr/bin/socat
            - path: /usr/bin/curl

- Apply the policy and SSH into the sandbox
- Run: `curl -v -p -x http://10.200.0.1:3128 http://100.77.240.62:18789/`
- Result: `HTTP/1.1 403 Forbidden` (CONNECT tunnel failed)
- Same steps on macOS Docker Desktop (arm64): `HTTP/1.1 200 Connection Established`
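For anyone who wants to script the check instead of reading curl output, the CONNECT probe can be approximated with the Rust standard library. This is a rough sketch under stated assumptions: `connect_request`, `parse_status_code`, and `probe` are hypothetical helpers, not part of OpenShell, and the probe reads only the first response line.

```rust
use std::io::{BufRead, BufReader, Write};
use std::net::TcpStream;

/// Build a minimal HTTP CONNECT request for `target` ("host:port").
fn connect_request(target: &str) -> String {
    format!("CONNECT {target} HTTP/1.1\r\nHost: {target}\r\n\r\n")
}

/// Extract the numeric status code from a status line such as
/// "HTTP/1.1 403 Forbidden". Returns None if the line is not HTTP.
fn parse_status_code(status_line: &str) -> Option<u16> {
    let mut parts = status_line.split_whitespace();
    let version = parts.next()?;
    if !version.starts_with("HTTP/") {
        return None;
    }
    parts.next()?.parse().ok()
}

/// Send a CONNECT through `proxy` ("host:port") and return the status code
/// from the first response line, if one could be parsed.
fn probe(proxy: &str, target: &str) -> std::io::Result<Option<u16>> {
    let mut stream = TcpStream::connect(proxy)?;
    stream.write_all(connect_request(target).as_bytes())?;
    let mut line = String::new();
    BufReader::new(stream).read_line(&mut line)?;
    Ok(parse_status_code(&line))
}
```

Calling `probe("10.200.0.1:3128", "100.77.240.62:18789")` from inside the sandbox would be expected to yield 403 on the failing platform and 200 on the working one, matching the curl results above.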
Diagnostic from inside the pod on WSL2:
# Per-PID view — no sandbox user connections visible
cat /proc/1/net/tcp
# Only shows root-owned listeners and K8s API connections
# Global view — sandbox user connections ARE here
cat /proc/net/tcp
# Shows redirected connections from sandbox namespace
Environment
- Failing: Windows 11, Docker Desktop 4.x with WSL2/Hyper-V backend, amd64
- Working: macOS 15, Docker Desktop with Apple Hypervisor, arm64
- OpenShell: v0.0.16 (`ghcr.io/nvidia/openshell/cluster:0.0.16`)
- Docker: Docker Desktop (both platforms)
Logs
From sandbox pod logs on WSL2 (no deny lines visible — log level filters them):
WARN openshell_sandbox::sandbox::linux::landlock: Landlock filesystem sandbox is UNAVAILABLE
From socat inside the sandbox:
2026/03/30 15:55:37 socat[469] E CONNECT 100.77.240.62:18789: Forbidden
From curl verbose output:
> CONNECT 100.77.240.62:18789 HTTP/1.1
< HTTP/1.1 403 Forbidden
* CONNECT tunnel failed, response 403
**Note**: The OPA deny at `proxy.rs:414` uses `info!()` level logging, but the sandbox defaults to WARN level, so the deny reason (including binary path resolution failure) is not visible in pod logs.
Agent-First Checklist
- I pointed my agent at the repo and had it investigate this issue
- I loaded relevant skills (e.g., `debug-openshell-cluster`, `debug-inference`, `openshell-cli`)
- My agent could not resolve this; the diagnostic above explains why