fix: prompt for NGC API key when pulling local NIM images#2043
fix: prompt for NGC API key when pulling local NIM images#2043
Conversation
Signed-off-by: zyang-dev <267119621+zyang-dev@users.noreply.github.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds two NGC Docker auth helpers to Changes
Sequence DiagramsequenceDiagram
participant User
participant Wizard as Onboard Wizard
participant NIM as nim.ts
participant Config as "~/.docker/config.json"
participant Docker as "docker CLI"
User->>Wizard: select NIM-local
Wizard->>NIM: isNgcLoggedIn()
NIM->>Config: read / parse config.json
Config-->>NIM: config contents / error
NIM-->>Wizard: logged-in? (true/false)
alt not logged in
Wizard->>User: prompt for NGC API key (secret)
User-->>Wizard: provide apiKey
Wizard->>NIM: dockerLoginNgc(apiKey)
NIM->>Docker: spawn "docker login nvcr.io -u $oauthtoken --password-stdin"
Docker-->>NIM: exit status / stderr
NIM-->>Wizard: success? (true/false)
alt login failed
Wizard->>User: re-prompt for API key
User-->>Wizard: provide apiKey or cancel
Wizard->>NIM: dockerLoginNgc(apiKey)
NIM-->>Wizard: final result
end
end
alt logged in or login success
Wizard->>NIM: pullNimImage(model)
NIM-->>Wizard: image pulled
else login failure after retry
Wizard-->>User: error and exit
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/lib/nim.ts`:
- Around line 179-189: The current isNgcLoggedIn() does a raw substring search
which yields false positives; change it to parse the Docker config JSON and
check the auths (and optionally credHelpers/credsStore) for an actual entry for
nvcr.io. Specifically, in isNgcLoggedIn() read and JSON.parse the config file,
then confirm that parsed.auths exists and that
Object.keys(parsed.auths).some(key => key.includes("nvcr.io") ||
key.endsWith("nvcr.io")) and that the corresponding parsed.auths[key] has a
non-empty auth/token field; if not found (or JSON parsing fails) return false.
In `@src/lib/onboard.ts`:
- Around line 3958-3978: The NGC login flow currently always calls prompt()
(lines around nim.isNgcLoggedIn and nim.dockerLoginNgc) which will hang in
CI/non-TTY; guard both prompt calls with an isNonInteractive() check and avoid
prompting when non-interactive by failing fast with a clear error (use
processLogger.error or console.error) indicating NGC API key is required or
Docker must be pre-authenticated for non-interactive runs; specifically update
the block that checks nim.isNgcLoggedIn() and the retry branch that calls
prompt() so that if isNonInteractive() is true you skip prompting and
immediately exit (or return a failure) instead of blocking.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 0dad4a20-f07b-4231-80c6-20dcf3b79797
📒 Files selected for processing (3)
src/lib/nim.tssrc/lib/onboard.tstest/onboard-selection.test.ts
…ormalization Signed-off-by: zyang-dev <267119621+zyang-dev@users.noreply.github.com>
ericksoa
left a comment
There was a problem hiding this comment.
Nice work on this PR — the core approach is solid and the security practices (stdin for credentials, masked input, non-interactive guard) are all good. A few things to address before merging:
Should fix
1. Surface Docker error output
dockerLoginNgc() pipes stderr but never shows it to the user. When login fails, the generic "Failed to login" message makes troubleshooting hard (was it a bad key? network error? Docker not running?). Consider logging result.stderr on failure:
if (result.status !== 0 && result.stderr) {
console.error(` Docker login error: ${result.stderr.trim()}`);
}2. Add unit tests for isNgcLoggedIn()
This function has non-trivial JSON parsing logic with multiple code paths (auths, credHelpers, missing file, malformed JSON). Right now the only test coverage is a mock that bypasses it entirely. At minimum, unit tests for the various Docker config shapes would catch regressions.
3. Handle spawnSync error case
If Docker isn't installed or the daemon isn't running, spawnSync returns { status: null, error: ... }. The current code returns false (correct), but the user sees "Failed to login to NGC registry" which is misleading. Checking result.error to surface "Docker not found" or similar would improve the UX.
Nice to have
4. credsStore gap
isNgcLoggedIn() checks credHelpers and auths but not credsStore. Docker Desktop users with a global credential store (e.g., "credsStore": "desktop") may have nvcr.io creds in the OS keychain that don't appear in auths, causing a false negative and unnecessary re-prompt. Worth at least a TODO comment or follow-up issue.
5. Minor nit: $oauthtoken comment
A brief inline comment like // NGC expects literal "$oauthtoken" as username would help future readers unfamiliar with the NGC auth convention.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/lib/onboard.ts (1)
4023-4049: Extract this NGC login block into a helper to keepsetupNim()complexity down.Line 4035 and Line 4043 duplicate prompt/login handling. Moving this block to a small helper will reduce branching in an already very large function and improve testability.
♻️ Proposed refactor
+async function ensureNgcRegistryLoginForNim() { + if (nim.isNgcLoggedIn()) return; + + if (isNonInteractive()) { + console.error( + " Docker is not logged in to nvcr.io. In non-interactive mode, run `docker login nvcr.io` first and retry.", + ); + process.exit(1); + } + + console.log(""); + console.log(" NGC API Key required to pull NIM images."); + console.log(" Get one from: https://org.ngc.nvidia.com/setup/api-key"); + console.log(""); + + for (let attempt = 0; attempt < 2; attempt += 1) { + const ngcKey = normalizeCredentialValue(await prompt(" NGC API Key: ", { secret: true })); + if (!ngcKey) { + console.error(" NGC API Key is required for Local NIM."); + process.exit(1); + } + if (nim.dockerLoginNgc(ngcKey)) return; + console.error(" Failed to login to NGC registry. Check your API key and try again."); + console.log(""); + } + + console.error(" NGC login failed. Cannot pull NIM images."); + process.exit(1); +} ... - // Ensure Docker is logged in to NGC registry before pulling NIM images. - if (!nim.isNgcLoggedIn()) { - ... - } + await ensureNgcRegistryLoginForNim();As per coding guidelines, "Limit cyclomatic complexity to 20 in JavaScript/TypeScript files, with target of 15."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/onboard.ts` around lines 4023 - 4049, Extract the NGC login block inside setupNim() into a small helper (e.g., ensureNgcLoggedIn or handleNgcLogin) that accepts the nim object and uses prompt/normalizeCredentialValue/isNonInteractive to perform the logic currently duplicated around nim.dockerLoginNgc; move all branches there (non-interactive error + exit, displaying instructions, prompting for NGC API Key, retry once, and calling process.exit on failure) so setupNim() calls this helper instead of inline code; reference and reuse nim.dockerLoginNgc, isNonInteractive, prompt, normalizeCredentialValue and keep the same exit/console behavior to preserve semantics while reducing cyclomatic complexity and improving testability.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/lib/onboard.ts`:
- Around line 4023-4049: Extract the NGC login block inside setupNim() into a
small helper (e.g., ensureNgcLoggedIn or handleNgcLogin) that accepts the nim
object and uses prompt/normalizeCredentialValue/isNonInteractive to perform the
logic currently duplicated around nim.dockerLoginNgc; move all branches there
(non-interactive error + exit, displaying instructions, prompting for NGC API
Key, retry once, and calling process.exit on failure) so setupNim() calls this
helper instead of inline code; reference and reuse nim.dockerLoginNgc,
isNonInteractive, prompt, normalizeCredentialValue and keep the same
exit/console behavior to preserve semantics while reducing cyclomatic complexity
and improving testability.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: cb4aabf9-e798-4a47-a9d9-80e8ce31b8d1
📒 Files selected for processing (1)
src/lib/onboard.ts
Signed-off-by: zyang-dev <267119621+zyang-dev@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/lib/nim.ts`:
- Around line 191-194: The isNgcLoggedIn() check currently only returns true for
parsed.credHelpers["nvcr.io"] or explicit auth/token in parsed.auths keys;
update it to also consider a global credential store by returning true if
parsed.credsStore is set (meaning Docker stores credentials externally) and
there is an auths marker for nvcr.io (e.g., auths["nvcr.io"] or
auths["https://nvcr.io"] exists but may be an empty object). In other words,
inside isNgcLoggedIn() use parsed.credsStore || parsed.credHelpers?.["nvcr.io"]
as a positive condition and treat an empty entry in auths as valid when
parsed.credsStore is present, while preserving the existing auth/token checks
for entries.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 02a99a75-6aed-44ce-ab7d-808aec3c62d4
📒 Files selected for processing (2)
src/lib/nim.test.tssrc/lib/nim.ts
✅ Files skipped from review due to trivial changes (1)
- src/lib/nim.test.ts
Docker Desktop writes an empty auths marker after login; creds live in the OS keychain. Treat marker + credsStore as logged in. Signed-off-by: zyang-dev <267119621+zyang-dev@users.noreply.github.com>
There was a problem hiding this comment.
♻️ Duplicate comments (1)
src/lib/nim.ts (1)
192-195:⚠️ Potential issue | 🟠 MajorHandle additional Docker
authskey variants fornvcr.io.Line 193 only checks two exact keys. Docker may store registry keys in alternate forms (for example with
/v2/), which can makeisNgcLoggedIn()returnfalseeven when login exists.🔧 Proposed fix
- const auths = parsed?.auths || {}; - const entry = auths["nvcr.io"] || auths["https://nvcr.io"]; - if (entry?.auth) return true; + const auths = parsed?.auths ?? {}; + const authKey = Object.keys(auths).find((k: string) => /(^|:\/\/)nvcr\.io(\/|$)/i.test(k)); + const entry = authKey ? auths[authKey] : undefined; + if (entry?.auth || entry?.identitytoken || entry?.token) return true; if (entry && parsed?.credsStore) return true;#!/bin/bash set -euo pipefail # Inspect current matching logic in src/lib/nim.ts rg -n -C3 'isNgcLoggedIn|auths\["nvcr\.io"\]|https://nvcr\.io|credsStore|identitytoken|token' src/lib/nim.ts # Verify whether tests cover non-exact nvcr.io auth key variants (e.g. /v2/) rg -n -C2 'nvcr\.io|https://nvcr\.io|/v2/' --iglob '*nim*.test.ts'🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/nim.ts` around lines 192 - 195, The login check in isNgcLoggedIn only tests auths["nvcr.io"] and auths["https://nvcr.io"], which misses variants like "https://nvcr.io/v2/" or other prefixed/suffixed keys; update isNgcLoggedIn to iterate Object.keys(parsed?.auths || {}) and match keys against a regex that normalizes/removes scheme and path (e.g. /^(?:https?:\/\/)?nvcr\.io(?:\/.*)?$/i) to find the nvcr entry, then treat an entry as logged-in if it has auth, identitytoken, token, or if parsed?.credsStore is set; ensure you still use parsed?.auths and parsed?.credsStore and update references to auths/entry accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/lib/nim.ts`:
- Around line 192-195: The login check in isNgcLoggedIn only tests
auths["nvcr.io"] and auths["https://nvcr.io"], which misses variants like
"https://nvcr.io/v2/" or other prefixed/suffixed keys; update isNgcLoggedIn to
iterate Object.keys(parsed?.auths || {}) and match keys against a regex that
normalizes/removes scheme and path (e.g. /^(?:https?:\/\/)?nvcr\.io(?:\/.*)?$/i)
to find the nvcr entry, then treat an entry as logged-in if it has auth,
identitytoken, token, or if parsed?.credsStore is set; ensure you still use
parsed?.auths and parsed?.credsStore and update references to auths/entry
accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: f31d5ba7-3994-4a6b-b9fd-a481b5624263
📒 Files selected for processing (2)
src/lib/nim.test.tssrc/lib/nim.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- src/lib/nim.test.ts
ericksoa
left a comment
There was a problem hiding this comment.
LGTM. Clean implementation of NGC registry auth for local NIM onboarding.
- API key piped via
--password-stdin(never in argv) — good security practice - Docker Desktop credential storage correctly handled (empty marker + credsStore)
- Both
nvcr.ioandhttps://nvcr.iovariants checked inisNgcLoggedIn() - Non-interactive mode exits with clear instructions instead of hanging
- 10 test cases for auth detection cover the key scenarios
- One retry on failure is pragmatic UX
## Summary Catches up the user-facing reference and troubleshooting docs with the CLI and policy behavior changes that landed in v0.0.21. Drafted via the `nemoclaw-contributor-update-docs` skill against commits in `v0.0.20..v0.0.21`, filtered through `docs/.docs-skip`. ## Changes - **`docs/reference/commands.md`** - `nemoclaw list`: session indicator (●) for connected sandboxes (#2117). - `nemoclaw <name> connect`: active-session note; auto-recovery from SSH identity drift after a host reboot (#2117, #2064). - `nemoclaw <name> status`: three-state Inference line (`healthy` / `unreachable` / `not probed`) covering both local and remote providers; new `Connected` line (#2002, #2117). - `nemoclaw <name> destroy` and `rebuild`: active-session warning with second confirm; rebuild reapplies policy presets to the recreated sandbox (#2117, #2026). - `nemoclaw <name> policy-add` and `policy-remove`: positional preset argument and non-interactive flow via `--yes`/`--force`/`NEMOCLAW_NON_INTERACTIVE=1` (#2070). - `nemoclaw <name> policy-list`: registry-vs-gateway desync detection (#2089). - **`docs/reference/troubleshooting.md`** - `Reconnect after a host reboot`: now reflects automatic stale `known_hosts` pruning on `connect` (#2064). - `Running multiple sandboxes simultaneously`: onboard's forward-port collision guard (#2086). - New section: `openclaw config set` or `unset` is blocked inside the sandbox (#2081). - **`docs/network-policy/customize-network-policy.md`**: non-interactive `policy-add`/`policy-remove` form; preset preservation across rebuild (#2070, #2026). - **`docs/inference/use-local-inference.md`**: NIM section now covers the NGC API key prompt with masked input and `docker login nvcr.io --password-stdin` behavior (#2043). - **Generated skills regenerated** to pick up the source changes (`.agents/skills/nemoclaw-user-reference/references/{commands,troubleshooting}.md`, plus minor heading-flow deltas elsewhere). The pre-commit `Regenerate agent skills from docs` hook ran and confirmed source ↔ generated parity. Commits skipped per `docs/.docs-skip` or no doc impact: `bbbaa0fb` (skip-features), `7cb482cb` (skip-features), `8dee23fd` (skip-terms), plus the usual CI / test / refactor / install-plumbing churn. ## Type of Change - [ ] Code change (feature, bug fix, or refactor) - [ ] Code change with doc updates - [ ] Doc only (prose changes, no code sample modifications) - [x] Doc only (includes code sample changes) ## Verification - [x] `npx prek run --all-files` passes for the modified files (the one failing test, `test/cli.test.ts > unknown command exits 1`, also fails on `origin/main` and is unrelated to these markdown-only changes) - [ ] `npm test` passes — skipped; same pre-existing CLI-dispatch test failure unrelated to docs - [ ] Tests added or updated for new or changed behavior — n/a, doc-only - [x] No secrets, API keys, or credentials committed - [x] Docs updated for user-facing behavior changes - [ ] `make docs` builds without warnings (doc changes only) — not run locally - [x] Doc pages follow the [style guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md) (doc changes only) - [ ] New doc pages include SPDX header and frontmatter (new pages only) — n/a, no new pages ## AI Disclosure - [x] AI-assisted — tool: Claude Code --- Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * Multi-session SSH connections with concurrent session support. * Three-state inference health reporting (healthy/unreachable/not probed) across all providers. * Automatic SSH host key rotation detection and recovery. * Non-interactive policy preset management via positional arguments. * Session indicators in sandbox list view. * **Bug Fixes** * Protected destructive operations with active-session warnings. * Policy presets now preserved during sandbox rebuilds. * **Documentation** * NGC registry authentication requirements for container images. * Multi-sandbox onboarding and reconnection guidance. * Troubleshooting updates for port conflicts and SSH issues. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Local NIM onboarding fails at the image pull step because
docker pull nvcr.io/nim/...requires NGC registry authentication. This adds an NGC API key prompt during onboard that runsdocker login nvcr.io --password-stdinbefore pulling the NIM image. The key is masked during input and handled securely via stdin.Related Issue
Based on the investigation in PR #236.
Changes
src/lib/nim.ts: AddisNgcLoggedIn()to check if Docker is already authenticated with nvcr.io, anddockerLoginNgc()to login securely via--password-stdin.src/lib/onboard.ts: Prompt for NGC API key before NIM image pull when not already logged in. Masked input, one retry on failure.test/onboard-selection.test.ts: MockisNgcLoggedInin NIM-local selection test.Type of Change
Verification
npx prek run --all-filespassesnpm testpassesmake docsbuilds without warnings (doc changes only)AI Disclosure
Signed-off-by: zyang-dev 267119621+zyang-dev@users.noreply.github.com
Summary by CodeRabbit
New Features
Bug Fixes / Reliability
Tests