fix(security): harden sandbox SSH with mandatory HMAC secret, NetworkPolicy, and nonce replay detection #127

Merged
johntmyers merged 2 commits into main from
fix/25-harden-sandbox-ssh/johntmyers
Mar 5, 2026

@johntmyers
Collaborator

🏗️ build-from-issue-agent

Closes #25

Summary

Hardens the SSH connection path between gateway and sandbox pods with three defense-in-depth improvements: mandatory HMAC handshake secret, Kubernetes NetworkPolicy restricting sandbox SSH ingress to the gateway, and nonce replay detection in the NSSH1 handshake. No UX changes — the secret is auto-generated for cluster deployments.
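A minimal sketch of the mandatory-secret check described above, assuming the secret is read from the NEMOCLAW_SSH_HANDSHAKE_SECRET environment variable at startup. The names `MissingSecret`, `require_secret`, and `load_handshake_secret` are hypothetical and are not taken from the PR; the real error type and call site may differ.

```rust
use std::env;
use std::fmt;

/// Hypothetical error type for this sketch; the PR does not show the real one.
#[derive(Debug, PartialEq)]
pub struct MissingSecret;

impl fmt::Display for MissingSecret {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "NEMOCLAW_SSH_HANDSHAKE_SECRET must be set and non-empty")
    }
}

/// Accepts the secret only when it is present and non-empty.
/// An unset variable (`None`) and an empty string are both rejected,
/// instead of silently falling back to a default value.
pub fn require_secret(raw: Option<String>) -> Result<String, MissingSecret> {
    match raw {
        Some(secret) if !secret.is_empty() => Ok(secret),
        _ => Err(MissingSecret),
    }
}

/// Reads the variable from the process environment and validates it.
pub fn load_handshake_secret() -> Result<String, MissingSecret> {
    require_secret(env::var("NEMOCLAW_SSH_HANDSHAKE_SECRET").ok())
}
```

A caller would propagate the error from `load_handshake_secret()` out of its startup path so the process exits rather than running without a secret.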

Changes Made

  • crates/navigator-sandbox/src/lib.rs: Fail startup if NEMOCLAW_SSH_HANDSHAKE_SECRET is empty/unset when SSH is enabled (replaces silent unwrap_or_default())
  • crates/navigator-sandbox/src/ssh.rs: Add NonceCache type (Arc<Mutex<HashMap<String, Instant>>>), background reaper task, nonce replay rejection in verify_preface(), and 6 unit tests
  • crates/navigator-server/src/lib.rs: Add startup validation rejecting empty ssh_handshake_secret
  • crates/navigator-server/src/sandbox/mod.rs: Always inject NEMOCLAW_SSH_HANDSHAKE_SECRET env var (remove conditional guard), add env injection test
  • deploy/helm/navigator/templates/networkpolicy.yaml: New NetworkPolicy restricting sandbox pod port 2222 ingress to gateway pods only (gated by networkPolicy.enabled)
  • deploy/helm/navigator/templates/statefulset.yaml: Wire NEMOCLAW_SSH_HANDSHAKE_SECRET env var into gateway pod
  • deploy/helm/navigator/values.yaml: Add server.sshHandshakeSecret (required) and networkPolicy.enabled: true
  • deploy/kube/manifests/navigator-helmchart.yaml: Add sshHandshakeSecret placeholder
  • deploy/docker/cluster-entrypoint.sh: Auto-generate secret via openssl rand -hex 32 at cluster deploy time
  • architecture/sandbox-connect.md: Document mandatory secret, nonce replay detection, and NetworkPolicy
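The nonce replay detection listed for crates/navigator-sandbox/src/ssh.rs can be sketched roughly as follows. Only the `NonceCache` alias matches the type named in the PR description; `check_and_record`, `reap_expired`, and the explicit `now`/`ttl` parameters are assumptions made for illustration, and the real `verify_preface()` and reaper task may be structured differently.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::time::{Duration, Instant};

/// Same shape as the type named in the PR description:
/// nonce string -> time it was first seen.
pub type NonceCache = Arc<Mutex<HashMap<String, Instant>>>;

/// Hypothetical helper: returns `true` and records the nonce if it is fresh,
/// or `false` if the same nonce was already seen less than `ttl` ago.
pub fn check_and_record(cache: &NonceCache, nonce: &str, now: Instant, ttl: Duration) -> bool {
    let mut seen = cache.lock().expect("nonce cache mutex poisoned");
    if let Some(first_seen) = seen.get(nonce) {
        if now.saturating_duration_since(*first_seen) < ttl {
            return false; // replay within the validity window
        }
    }
    seen.insert(nonce.to_owned(), now);
    true
}

/// Hypothetical single reaper pass: drops entries at least `ttl` old and
/// returns how many were removed. A background task could call this on a timer.
pub fn reap_expired(cache: &NonceCache, now: Instant, ttl: Duration) -> usize {
    let mut seen = cache.lock().expect("nonce cache mutex poisoned");
    let before = seen.len();
    seen.retain(|_, first_seen| now.saturating_duration_since(*first_seen) < ttl);
    before - seen.len()
}
```

Passing `now` in explicitly keeps the sketch deterministic under test. Reaping keeps the map bounded, since entries older than the TTL no longer need to be remembered.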

Deviations from Plan

None — implemented as planned.

Tests Added

  • Unit: 6 verify_preface tests in crates/navigator-sandbox/src/ssh.rs (valid preface, replay rejection, expired timestamp, invalid HMAC, malformed input, distinct nonces); 1 env injection test in crates/navigator-server/src/sandbox/mod.rs
  • Integration: N/A — changes are isolated to startup validation and a single function
  • E2E: N/A — no changes under e2e/
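The expired-timestamp case in the unit tests above implies a validity-window check on the preface timestamp. The sketch below shows one way such a check could look. The function name, the use of Unix seconds, and the symmetric window (rejecting timestamps too far in the future as well as too old) are assumptions; the PR does not state the actual window size or comparison.

```rust
/// Hypothetical helper: true when the preface timestamp is within
/// `max_skew_secs` of the current time, in either direction.
/// All values are Unix seconds (an assumption for this sketch).
pub fn timestamp_in_window(now_secs: u64, preface_ts_secs: u64, max_skew_secs: u64) -> bool {
    now_secs.abs_diff(preface_ts_secs) <= max_skew_secs
}
```

If the nonce cache TTL is at least as long as this window, a replayed preface is rejected either by the cache (while its timestamp is still valid) or by the timestamp check (once it has expired).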

Documentation Updated

  • architecture/sandbox-connect.md: Added nonce replay detection section, mandatory handshake secret section, NetworkPolicy in security model, updated config reference

Verification

  • Pre-commit checks passing (unit tests, lint, format)
  • All 14 new/existing tests pass (8 SSH + 6 sandbox)
  • Architecture documentation updated

…Policy, and nonce replay detection

Closes #25

- Make NEMOCLAW_SSH_HANDSHAKE_SECRET mandatory: server and sandbox both
  refuse to start if the secret is empty/unset. Cluster deployments
  auto-generate it via openssl rand in the entrypoint script.
- Add Kubernetes NetworkPolicy restricting sandbox port 2222 ingress to
  the gateway pod only, preventing lateral movement from other cluster
  workloads.
- Add NSSH1 nonce replay detection with a TTL-bounded cache, rejecting
  replayed handshakes within the timestamp validity window.
- Add unit tests for verify_preface (valid, replay, expired, bad HMAC,
  malformed) and env injection.
johntmyers self-assigned this Mar 5, 2026
johntmyers merged commit c6919aa into main Mar 5, 2026
10 checks passed
johntmyers deleted the fix/25-harden-sandbox-ssh/johntmyers branch March 5, 2026 20:21
drew pushed a commit that referenced this pull request Mar 16, 2026
…Policy, and nonce replay detection (#127)

* fix(security): harden sandbox SSH with mandatory HMAC secret, NetworkPolicy, and nonce replay detection

Closes #25

- Make NEMOCLAW_SSH_HANDSHAKE_SECRET mandatory: server and sandbox both
  refuse to start if the secret is empty/unset. Cluster deployments
  auto-generate it via openssl rand in the entrypoint script.
- Add Kubernetes NetworkPolicy restricting sandbox port 2222 ingress to
  the gateway pod only, preventing lateral movement from other cluster
  workloads.
- Add NSSH1 nonce replay detection with a TTL-bounded cache, rejecting
  replayed handshakes within the timestamp validity window.
- Add unit tests for verify_preface (valid, replay, expired, bad HMAC,
  malformed) and env injection.

* fix(deploy): pass sshHandshakeSecret in fast deploy helm upgrade

---------

Co-authored-by: John Myers <johntmyers@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

Sandbox SSH accepts any auth