fix(security): revert gateway auth token externalization by ericksoa · Pull Request #2482 · NVIDIA/NemoClaw

ericksoa · 2026-04-25T21:20:45Z

Summary

- Reverts 51aa6af (feat(security): externalize gateway auth token from openclaw.json (#2378)).
- The externalized token path breaks `openclaw tui` inside the sandbox: OpenClaw 2026.4.9 requires OPENCLAW_GATEWAY_TOKEN, but the runtime injection fails under Landlock (non-root mode) and the token is no longer in openclaw.json, where the TUI and gateway can read it.
- Restores build-time token generation in openclaw.json so gateways authenticate out of the box again.
- The token externalization will be re-introduced in a separate PR with deeper testing across root/non-root modes and OpenClaw 2026.4.9.

Test plan

- `npm run typecheck:cli` passes
- `npx vitest run --project cli`: 2110 tests pass
- All pre-commit and pre-push hooks pass
- Verify `openclaw tui` works inside sandbox after rebuild
- Verify gateway auth works on Spark (non-root mode)
- Verify gateway auth works in root mode

Summary by CodeRabbit

Documentation
- Clarified security guidance: gateway auth tokens are stored in the sandbox configuration and risk notes updated.
Changes
- Token generation moved earlier in the image/build process so auth is present in the sandbox config at runtime.
- Runtime token retrieval simplified and connection instructions updated.
- Gateway token is exported to an environment variable and persisted/removed in users' shell profiles.
Tests
- Tests updated to validate token export, persistence, and retrieval behavior.

Reverts 51aa6af. The externalized token path breaks `openclaw tui` inside the sandbox — OpenClaw 2026.4.9 requires OPENCLAW_GATEWAY_TOKEN but the runtime injection fails under Landlock (non-root) and the token is no longer in openclaw.json where the TUI can read it. Restores build-time token generation in openclaw.json so gateways authenticate out-of-the-box again. The externalization will be re-introduced in a separate PR with deeper testing. Fixes #2480

coderabbitai · 2026-04-25T21:20:55Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 618627f2-b577-4814-aa8c-c3c5e421c14e

📥 Commits

Reviewing files that changed from the base of the PR and between 4752d10 and 72ca3a0.

📒 Files selected for processing (1)

Dockerfile

🚧 Files skipped from review as they are similar to previous changes (1)

Dockerfile

📝 Walkthrough

Gateway token handling was changed: a per-build random token is embedded into /sandbox/.openclaw/openclaw.json at image build. Runtime reads gateway.auth.token from that file and exports OPENCLAW_GATEWAY_TOKEN (persisting to user rc files) instead of creating a separate external token file; host-side retrieval relies only on the config path.
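
The build-time embed and runtime read described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual Dockerfile step or `_read_gateway_token()` implementation: the function names `embed_gateway_token` and `read_gateway_token` are hypothetical, and only `secrets.token_hex(32)` and the `gateway.auth.token` key path come from the PR.

```python
import json
import secrets
from pathlib import Path


def embed_gateway_token(config_path: Path) -> str:
    """Build-time step (hypothetical helper): store a fresh token at gateway.auth.token."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    token = secrets.token_hex(32)  # 32 random bytes rendered as 64 hex characters
    # Create the nested objects if absent, preserving any other settings.
    config.setdefault("gateway", {}).setdefault("auth", {})["token"] = token
    config_path.write_text(json.dumps(config, indent=2))
    return token


def read_gateway_token(config_path: Path) -> str:
    """Runtime step (hypothetical helper): return gateway.auth.token, or "" if unavailable."""
    try:
        with open(config_path) as fh:
            node = json.load(fh)
    except (OSError, json.JSONDecodeError):
        return ""
    # Walk the key path defensively so a malformed config yields "" instead of raising.
    for key in ("gateway", "auth", "token"):
        if not isinstance(node, dict):
            return ""
        node = node.get(key, "")
    return node if isinstance(node, str) else ""
```

Returning an empty string on any read failure lets a caller decide whether to unset the environment variable rather than abort, which matches the "empty-token unset behavior" the tests are said to cover.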

Changes

| Cohort / File(s) | Summary |
|---|---|
| Documentation & Build: `.agents/skills/nemoclaw-user-configure-security/references/best-practices.md`, `docs/security/best-practices.md`, `Dockerfile` | Docs updated to state tokens reside in `.openclaw/openclaw.json`. Dockerfile now generates and embeds a per-build random gateway token (`secrets.token_hex(32)`) into `openclaw.json`, removing runtime token-generation/cleanup steps and related comments. |
| Runtime / Startup Script: `scripts/nemoclaw-start.sh` | Replaced external token file flow with `_read_gateway_token()` that parses `gateway.auth.token` from `/sandbox/.openclaw/openclaw.json`. Added `export_gateway_token()` to export `OPENCLAW_GATEWAY_TOKEN` and persist/remove marked export blocks in `${_SANDBOX_HOME}/.bashrc` and `${_SANDBOX_HOME}/.profile`; startup flows updated to call this. |
| Host-side Onboard Logic: `src/lib/onboard.ts` | Removed kubectl-exec and temp-file search fallbacks; `fetchGatewayAuthTokenFromSandbox` now uses only the openclaw.json download path. Updated fallback help text to instruct manual `jq` extraction from `/sandbox/.openclaw/openclaw.json`. |
| Tests: `test/nemoclaw-start.test.ts` | Reworked tests to validate `export_gateway_token` behavior: rc-file marker persistence/removal, shared `_read_gateway_token()` usage, Python `with open(...)` read, shell-escaping, empty-token unset behavior, and updated startup sequencing expectations. |
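
The "marked export block" behavior listed for `export_gateway_token()` (persist, replace, remove, shell-escape) might look like the sketch below. The real function is a shell function in `scripts/nemoclaw-start.sh`; this Python version only illustrates the idempotent-update idea, and the marker strings and the helper names `strip_marked_block` and `update_rc_text` are assumptions, not taken from the script.

```python
import shlex

# Hypothetical marker strings; the script's actual markers are not shown on this page.
MARKER_BEGIN = "# >>> openclaw gateway token >>>"
MARKER_END = "# <<< openclaw gateway token <<<"


def strip_marked_block(text: str) -> str:
    """Remove any previously written marker block, leaving other lines intact."""
    kept, inside = [], False
    for line in text.splitlines():
        if line == MARKER_BEGIN:
            inside = True
        elif line == MARKER_END:
            inside = False
        elif not inside:
            kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")


def update_rc_text(text: str, token: str) -> str:
    """Return rc-file text with exactly one export block, or none if token is empty."""
    base = strip_marked_block(text)
    if not token:
        return base  # empty token: only remove a stale block
    # shlex.quote guards against tokens containing shell metacharacters.
    export = f"export OPENCLAW_GATEWAY_TOKEN={shlex.quote(token)}"
    return f"{base}{MARKER_BEGIN}\n{export}\n{MARKER_END}\n"
```

Stripping before appending means repeated container starts do not accumulate duplicate export lines, and a rebuilt image with a new token replaces the old value instead of shadowing it.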

Sequence Diagram(s)

sequenceDiagram
    actor Build as Build Time
    participant Docker as Dockerfile
    participant Config as /sandbox/.openclaw/openclaw.json
    participant StartSh as scripts/nemoclaw-start.sh
    participant RcFiles as .bashrc/.profile
    participant UserShell as User Interactive Shell
    participant TUI as openclaw tui

    Build->>Docker: generate per-build random token (secrets.token_hex(32))
    Docker->>Config: embed token in openclaw.json (gateway.auth.token)

    Note over StartSh,Config: container starts
    StartSh->>Config: _read_gateway_token() parses gateway.auth.token
    Config-->>StartSh: token value
    StartSh->>StartSh: export OPENCLAW_GATEWAY_TOKEN
    StartSh->>RcFiles: write/remove marked export blocks via export_gateway_token()
    RcFiles->>UserShell: rc files sourced on new shell
    UserShell->>TUI: openclaw tui (reads $OPENCLAW_GATEWAY_TOKEN)
    TUI-->>TUI: gateway authentication proceeds

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

feat(security): externalize gateway auth token from openclaw.json #2378: Conflicting gateway-token handling changes—externalizes token to /run/nemoclaw/gateway-token versus this PR’s embedding in /sandbox/.openclaw/openclaw.json, touching Dockerfile, scripts, onboard logic, docs, and tests.

Suggested labels

security

Suggested reviewers

brandonpelfrey

Poem

🐰 A tiny token tucked in JSON bright,
Built at image time in the quiet night.
At boot I hop out, export with care,
I nest in rc files so shells find me there,
OpenClaw tui greets me—now we're square. 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix(security): revert gateway auth token externalization' directly summarizes the main change: reverting a previous externalization of the gateway auth token. |
| Linked Issues check | ✅ Passed | The PR addresses issue `#2480` by restoring build-time token generation in openclaw.json, ensuring the token is available for `openclaw tui` and the gateway to authenticate without manual intervention. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to reverting externalized gateway token handling: documentation updates, Dockerfile changes, token reading logic, and test updates align directly with the fix objective. |


coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@Dockerfile`:
- Around line 230-232: The ARG NEMOCLAW_BUILD_ID is declared but never used, so
changing it does not invalidate the token-generation layer; update the
token-generation layer that creates the gateway token (the "token-generation"
RUN/step) to consume NEMOCLAW_BUILD_ID (e.g., reference it in that RUN via ENV
or a no-op echo/printf) so Docker sees the build-arg changes and busts the
cache; ensure you reference ARG NEMOCLAW_BUILD_ID before the token-generation
RUN and use the variable name NEMOCLAW_BUILD_ID in that step so token
regeneration runs on each build-arg change.

In `@scripts/nemoclaw-start.sh`:
- Around line 621-660: The startup currently aborts if writing
${_SANDBOX_HOME}/.bashrc or .profile fails when persisting
OPENCLAW_GATEWAY_TOKEN (snippet using marker_begin/marker_end), which breaks
non-root/sandboxed runs; change the logic to make rc-file writes best-effort by
routing token persistence through the existing /tmp sourced-file pattern (create
a /tmp/openclaw-env-<uid>.sh containing the snippet and ensure rc files source
that file if writable), and if you must directly update ${_SANDBOX_HOME}/.bashrc
or .profile only attempt writes when they are writable and swallow failures (do
not let errors from cat >"$rc_file" or printf >>"$rc_file" abort startup),
leaving the export OPENCLAW_GATEWAY_TOKEN="$token" in the current process
unconditional so gateway startup never depends on rc file writes.
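
The best-effort pattern this comment asks for can be illustrated as below: export in the current process unconditionally, then treat each rc-file write as optional. This is a Python sketch of the idea only. The actual fix belongs in the shell script, and `persist_best_effort` and `export_token` are hypothetical names.

```python
import os
from pathlib import Path


def persist_best_effort(rc_file: Path, snippet: str) -> bool:
    """Append snippet to rc_file; report failure instead of raising."""
    try:
        with open(rc_file, "a") as fh:
            fh.write(snippet)
        return True
    except OSError:
        # Covers permission errors, including the case where a writability
        # pre-check passed but the actual write was still denied.
        return False


def export_token(token: str, rc_files: list[Path], snippet: str) -> list[Path]:
    """Export in-process first, then try each rc file; return the ones that failed."""
    os.environ["OPENCLAW_GATEWAY_TOKEN"] = token  # never depends on rc-file writes
    return [rc for rc in rc_files if not persist_best_effort(rc, snippet)]
```

Returning the list of failed files, rather than raising, lets the caller log a warning and continue startup.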


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 9b2a9d79-dfe8-4da3-93e1-2c11cc9ba0b2

📥 Commits

Reviewing files that changed from the base of the PR and between cc15689 and 1e497c6.

📒 Files selected for processing (6)

.agents/skills/nemoclaw-user-configure-security/references/best-practices.md
Dockerfile
docs/security/best-practices.md
scripts/nemoclaw-start.sh
src/lib/onboard.ts
test/nemoclaw-start.test.ts

…writes

The reverted `export_gateway_token` code predates the Landlock fix in a54f9a3 and lacks `|| true` guards on the .bashrc/.profile writes. Under Landlock enforcement, the DAC check (`[ -w file ]`) passes but the actual write is blocked, which crashes the entrypoint under `set -e`. This is the same failure pattern that caused the 5-day non-root outage. Apply the same `|| true` + `continue` pattern used in `install_configure_guard`.

NEMOCLAW_BUILD_ID was declared as an ARG but never referenced by any downstream instruction, so changing it via --build-arg had no effect on Docker layer caching. Reference it on the token-generation RUN line so Docker sees the value change and invalidates the cached layer, ensuring each build produces a fresh gateway auth token. Pre-existing issue surfaced by CodeRabbit review.

…d cache (#2483)

## Summary
- Fixes 4x build time regression on Spark (400s+ → ~100s) caused by `NEMOCLAW_BUILD_ID` cache-busting the config generation layer, which invalidated the expensive `openclaw doctor --fix` + `openclaw plugins install` layer on every build
- Splits token generation into two steps: config layer writes a placeholder (cacheable), then a late layer injects `secrets.token_hex(32)` (cache-busted but trivially fast)
- The doctor/plugins layer no longer rebuilds on every build

Depends on #2482

## Test plan
- [x] `npx vitest run --project cli`: 1947 tests pass (ssrf-parity skip is pre-existing, needs plugin build)
- [x] All pre-commit and pre-push hooks pass
- [ ] Verify build time improvement on Spark

## Summary by CodeRabbit
* **Chores**
  * Optimized Docker image build layers to improve caching efficiency while ensuring unique credentials are generated for each build.
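
The two-step split described in this commit (a cacheable placeholder, then a late per-build injection) can be sketched as follows. This is an assumption-laden illustration rather than the Dockerfile's actual commands: the placeholder value and the function names `write_cacheable_config` and `inject_token` are hypothetical.

```python
import json
import secrets

PLACEHOLDER = "__GATEWAY_TOKEN_PLACEHOLDER__"  # hypothetical placeholder value


def write_cacheable_config() -> str:
    """Early step: output is identical across builds, so its layer can be cached."""
    return json.dumps({"gateway": {"auth": {"token": PLACEHOLDER}}}, sort_keys=True)


def inject_token(config_text: str) -> str:
    """Late step: cheap to rerun, swaps the placeholder for a fresh random token."""
    config = json.loads(config_text)
    if config["gateway"]["auth"]["token"] != PLACEHOLDER:
        raise ValueError("placeholder not found; refusing to overwrite")
    config["gateway"]["auth"]["token"] = secrets.token_hex(32)
    return json.dumps(config, sort_keys=True)
```

Because the early step is deterministic, only the small injection step needs to be invalidated per build, which is the property the commit relies on to keep the expensive doctor/plugins layer cached.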

Resolves conflicts in Dockerfile and test/nemoclaw-start.test.ts.
- Dockerfile config-generation block: kept the externalized scripts/generate-openclaw-config.py invocation (the PR's purpose) and dropped the inline `python3 -c` block from main.
- Dockerfile token step: dropped the PR's `--clear-token` step and took main's late-layer `secrets.token_hex(32)` injection (#2482 reverted gateway auth token externalization, so the token is again baked at build time).
- scripts/generate-openclaw-config.py: ported the inference_inputs parsing (#2441) and channel healthMonitor field from main; removed the now-obsolete `--clear-token` mode.
- test/nemoclaw-start.test.ts: took main's version, since the PR's token-externalization regression tests no longer match main's reverted design.
- test/generate-openclaw-config.test.ts: removed the `--clear-token` test cases.

coderabbitai Bot reviewed Apr 25, 2026

Comment thread Dockerfile

Comment thread scripts/nemoclaw-start.sh Outdated

ericksoa added 2 commits April 25, 2026 14:26

ericksoa merged commit 31c782c into main Apr 25, 2026
39 checks passed

ericksoa mentioned this pull request Apr 25, 2026

perf(dockerfile): move token injection to late layer to preserve build cache #2483

Merged

3 tasks

ericksoa mentioned this pull request Apr 27, 2026

fix: auto-disable device auth for non-loopback URLs (#2341) #2449

Merged

coderabbitai Bot mentioned this pull request Apr 27, 2026

feat(security): runtime gateway token injection #2485

Open

7 tasks

ericksoa merged 3 commits into main from revert/gateway-token-externalization
