fix(inference): pull NIM by platform digest, use served model id#4641

Merged: cv merged 26 commits into main from fix/nim-pull-attestation-index-digest on Jun 5, 2026

@hunglp6d hunglp6d commented Jun 2, 2026

Summary

Fixes local NVIDIA NIM onboarding (NEMOCLAW_EXPERIMENTAL=1 nemoclaw onboard →
Local NVIDIA NIM) on Docker 29.x with the containerd image store. This PR addresses three
defects on that path, all reproduced and verified end-to-end on DGX Spark (GB10, Docker 29.5.2):

  1. Pull fails with error from registry: Incorrect Repository Format. NIM :latest
    tags are multi-arch OCI indexes that also carry buildkit attestation manifests
    (platform: unknown/unknown). The containerd image store pulls the per-arch layers,
    then fetches the attestation manifest, which nvcr.io rejects — aborting after every
    layer with no usable image. (Older Docker without the containerd store never fetched
    it, so the break tracks the Docker/image-store version, not the host or repo path.)
  2. Health check times out. A 30B NIM loads in ~5 min on GB10, right at the old 300s
    wait; larger models always time out.
  3. Endpoint validation 404s. NIM serves the id from its image config
    (nvidia/nemotron-3-nano), which differs from the catalog name
    (nvidia/nemotron-3-nano-30b-a3b); validating/routing with the catalog id returns 404.
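
To make defect 1 concrete, here is a minimal TypeScript sketch of the index shape described above and of picking the host-arch entry out of it. The digests are placeholders and `pickPlatformDigest` is a hypothetical name used for illustration; it is not the code in nim.ts.

```typescript
// Illustrative only: a multi-arch OCI index that also carries buildkit
// attestation manifests. All digests below are placeholders.
interface IndexEntry {
  digest?: unknown;
  platform?: { architecture?: string; os?: string };
}

const sampleIndex = JSON.stringify({
  schemaVersion: 2,
  mediaType: "application/vnd.oci.image.index.v1+json",
  manifests: [
    { digest: "sha256:" + "a".repeat(64), platform: { architecture: "amd64", os: "linux" } },
    { digest: "sha256:" + "b".repeat(64), platform: { architecture: "arm64", os: "linux" } },
    // Attestation entries: platform unknown/unknown, not runnable images.
    { digest: "sha256:" + "c".repeat(64), platform: { architecture: "unknown", os: "unknown" } },
    { digest: "sha256:" + "d".repeat(64), platform: { architecture: "unknown", os: "unknown" } },
  ],
});

// Hypothetical helper: return the digest of the entry matching arch/os, or
// null when the input is not a usable index (the caller would then fall back
// to a plain tag pull).
function pickPlatformDigest(indexJson: string, arch: string, os: string = "linux"): string | null {
  let parsed: { manifests?: unknown } | null;
  try {
    parsed = JSON.parse(indexJson);
  } catch {
    return null;
  }
  if (parsed === null || typeof parsed !== "object" || !Array.isArray(parsed.manifests)) {
    return null;
  }
  for (const entry of parsed.manifests as IndexEntry[]) {
    // unknown/unknown attestation entries never match a real arch/os pair.
    if (entry?.platform?.architecture !== arch || entry?.platform?.os !== os) continue;
    if (typeof entry.digest === "string" && entry.digest.length > 0) return entry.digest;
  }
  return null;
}

console.log(pickPlatformDigest(sampleIndex, "arm64"));
```

Pulling the returned digest directly means the client never walks the index, so the attestation entries are never requested.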

Related Issue

Fixes #3885

Changes

  • src/lib/inference/nim.ts
    • pullNimImage resolves the index to the host-arch image-manifest digest
      (docker manifest inspect → match platform.architecture/os, skipping
      unknown/unknown attestation entries), pulls that single manifest by digest
      (no index walk → no attestation fetch), and re-tags it to the original ref.
      Falls back to a plain tag pull when the ref is not a resolvable multi-arch index,
      and logs which path was taken.
    • adoptServedModelId reads the served id from /v1/models and uses it for
      validation/route/config when it differs from the catalog name. The served id is
      local-service-controlled, so it is validated with isSafeModelId before adoption
      (mirrors the adjacent local-vLLM detected-model boundary); an unsafe id is ignored
      with a diagnostic and never echoed into logs.
    • waitForNimHealth default raised 300s → 1200s for slow first-loads (no new env var).
  • src/lib/onboard.ts: calls the helpers above; kept net-neutral per
    codebase-growth-guardrails.
  • src/lib/adapters/docker/inspect.ts, image.ts: add dockerManifestInspect
    and dockerTag.
  • Tests: digest resolution, served-id adoption (incl. unsafe-id rejection), thrown
    dockerManifestInspect fallback, and manifest-inspect/tag argv.
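
The served-id adoption rule described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `parseServedId`, `adoptServedId`, and `isSafeModelIdSketch` are hypothetical names, and the allow-list pattern is an assumption because the real isSafeModelId rule is not shown in this description.

```typescript
// Assumed stand-in for the project's isSafeModelId; the real rule may differ.
const SAFE_MODEL_ID = /^[A-Za-z0-9][A-Za-z0-9._\/:-]{0,127}$/;

function isSafeModelIdSketch(id: string): boolean {
  return SAFE_MODEL_ID.test(id);
}

// Read the first model id from an OpenAI-compatible /v1/models response body.
function parseServedId(body: string): string | null {
  try {
    const parsed: unknown = JSON.parse(body);
    const data = (parsed as { data?: unknown } | null)?.data;
    if (!Array.isArray(data)) return null;
    const id = (data[0] as { id?: unknown } | null | undefined)?.id;
    return typeof id === "string" && id.length > 0 ? id : null;
  } catch {
    return null; // unreachable endpoint or non-JSON body: keep the catalog id
  }
}

// Decide which id to use for validation, routing, and config.
function adoptServedId(catalogId: string, body: string): { model: string; note?: string } {
  const served = parseServedId(body);
  if (served === null || served === catalogId) return { model: catalogId };
  if (!isSafeModelIdSketch(served)) {
    // The unsafe value is deliberately left out of the diagnostic.
    return { model: catalogId, note: "served model id rejected as unsafe; keeping catalog id" };
  }
  return { model: served, note: `using served model id ${served}` };
}

const body = JSON.stringify({ object: "list", data: [{ id: "nvidia/nemotron-3-nano" }] });
console.log(adoptServedId("nvidia/nemotron-3-nano-30b-a3b", body).model);
```

Keeping the catalog id on any parse failure means a temporarily unreachable /v1/models endpoint does not change behavior.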

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Hung Le hple@nvidia.com


Summary by CodeRabbit

  • New Features

    • Docker image tagging and manifest inspection for improved image handling.
    • Automatic discovery and adoption of the NIM served-model ID from the local model endpoint.
  • Improvements

    • Prefer host-architecture manifests when pulling multi-arch images and retag to original references.
    • Onboarding now adopts served model IDs earlier and tightens inference API validation.
    • Increased default health-check timeout for model startup.
  • Tests

    • Added unit tests for manifest selection, image pulling, and served-model parsing.


copy-pr-bot Bot commented Jun 2, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.


coderabbitai Bot commented Jun 2, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds Docker manifest inspect/tag helpers; platform-aware OCI manifest digest selection with digest-based pull-and-retag; served-model-id parsing and optional adoption; integrates adoption into onboarding; increases DEFAULT_NIM_HEALTH_TIMEOUT_SECONDS to 1200.

Changes

NIM Platform-Aware Image Pulling and Served Model Discovery

| Layer / File(s) | Summary |
| --- | --- |
| Docker manifest inspection and tagging helpers (src/lib/adapters/docker/image.ts, src/lib/adapters/docker/inspect.ts, src/lib/adapters/docker/index.test.ts) | Adds dockerTag and dockerManifestInspect wrappers around dockerRun/dockerCapture and tests asserting correct CLI argv and options. |
| Platform-aware pull utilities and resolver (src/lib/inference/nim.ts, src/lib/inference/nim.test.ts) | Adds nodeArchToOci, selectPlatformManifestDigest, imageRepository, and a pull resolver that inspects OCI index JSON, selects the host-arch Linux manifest digest, pulls by digest, then retags to the original ref; tests include mock OCI indexes and pull/retag workflows with fallbacks. |
| NIM served model ID resolution (src/lib/inference/nim.ts, src/lib/inference/nim.test.ts) | Exports parseServedModelId, getServedModelId, and adoptServedModelId to parse /v1/models responses and optionally override catalog model ids; tests cover parsing, endpoint responses, adoption rules, and unsafe-id handling. |
| Onboarding integration and API guard (src/lib/inference/nim.ts, src/lib/onboard.ts) | Adopts the served model id during local NIM onboarding and refines the post-validation API forcing guard; updates related health-check comment and timeout constant to 1200 seconds. |
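
As a rough illustration of the pull utilities summarized above, the following TypeScript sketch maps a Node arch to its OCI name, strips the tag from an image ref, and plans the pull-then-retag argv. The function names (`toOciArch`, `repositoryOf`, `planDigestPull`) and the image ref are made up for the example; the actual helpers in nim.ts may behave differently.

```typescript
// Map Node's process.arch values to OCI architecture names. Only the two
// values this sketch cares about are listed; anything else yields null.
function toOciArch(nodeArch: string): string | null {
  switch (nodeArch) {
    case "x64":
      return "amd64";
    case "arm64":
      return "arm64";
    default:
      return null;
  }
}

// Strip a trailing :tag and/or @digest, leaving registry ports intact.
function repositoryOf(ref: string): string {
  const at = ref.indexOf("@");
  const noDigest = at === -1 ? ref : ref.slice(0, at);
  const lastSlash = noDigest.lastIndexOf("/");
  const lastColon = noDigest.lastIndexOf(":");
  // A colon before the last slash belongs to host:port, not to a tag.
  return lastColon > lastSlash ? noDigest.slice(0, lastColon) : noDigest;
}

// Docker argv lists (without the leading "docker"): pull the single
// manifest by digest, then tag it back to the ref the caller asked for.
function planDigestPull(ref: string, digest: string): string[][] {
  const pinned = `${repositoryOf(ref)}@${digest}`;
  return [
    ["pull", pinned],
    ["tag", pinned, ref],
  ];
}

// Placeholder ref and digest, not a real NIM image.
const plan = planDigestPull("registry.example:5000/nim/model:latest", "sha256:" + "b".repeat(64));
for (const argv of plan) console.log(["docker", ...argv].join(" "));
```

Returning argv arrays rather than a command string keeps the values out of any shell, which matches the review note that Docker is invoked via argv arrays.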

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

Suggested labels

bug-fix, platform: dgx-spark, Docker

Suggested reviewers

  • cv

Poem

🐰 I sniffed the manifests at dawn,

Found arch-bound digests, neatly drawn.
I pulled by hash, then gave a tag —
The NIM awoke; no more the snag.
Hooray for carrots, code, and wag!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 45.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the two main changes: pulling NIM by platform digest and using served model IDs. |
| Linked Issues check | ✅ Passed | The PR fully addresses all coding requirements from issue #3885: digest-based pulling for multi-arch indexes, fallback to plain tag pull with logging, served model ID adoption with safety validation, and increased health-check timeout. |
| Out of Scope Changes check | ✅ Passed | All changes directly support the linked issue objectives: docker helper additions, NIM pull/model-id logic, onboarding integration, and comprehensive test coverage for digest resolution and served-id adoption. |


github-actions Bot commented Jun 2, 2026

E2E Advisor Recommendation

Required E2E: onboard-inference-smoke-e2e, inference-routing-e2e, cloud-onboard-e2e
Optional E2E: gpu-e2e, cloud-inference-e2e

Dispatch hint: cloud-onboard-e2e,inference-routing-e2e

Auto-dispatched E2E: inference-routing-e2e, cloud-onboard-e2e via nightly-e2e.yaml at 3b503571a321952bee0ba5d80fc0a8f405e43cf3 (nightly run)

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: medium

Required E2E

  • onboard-inference-smoke-e2e (low): Lightweight regression E2E for onboard inference validation. It is the closest existing focused test for the onboard validation behavior changed in the local NIM path, including failing closed when a configured route cannot serve a real chat completion.
  • inference-routing-e2e (medium): Covers real OpenShell gateway inference routing, credential isolation, and provider error classification. The NIM changes alter local OpenAI-compatible model selection and chat-completions routing, so this should be merge-blocking adjacent confidence.
  • cloud-onboard-e2e (medium): Exercises the install/onboard flow and sandbox health/security checks through a real user path. Although it does not select local NIM, src/lib/onboard.ts changed in the inference-provider setup area and this guards against broad onboarding regressions.

Optional E2E

  • gpu-e2e (high): Optional high-cost local GPU confidence. It does not cover NIM/NGC directly, but it exercises local provider onboarding, Docker/GPU availability, sandbox inference wiring, and real assistant user flow on a GPU runner.
  • cloud-inference-e2e (medium): Optional end-to-end live inference smoke through inference.local and OpenClaw. Useful to confirm the broader inference path still works, but it does not directly exercise local NIM image pull or served model-id adoption.

New E2E recommendations

  • local NIM provider onboarding (high): No existing E2E appears to run NEMOCLAW_PROVIDER=nim-local with a real NGC-backed NIM container. This PR's main behavior—manifest-index digest resolution, digest pull, tag-back, container start, /v1/models served-id adoption, and chat-completions validation—is therefore not directly covered.
    • Suggested test: local-nim-onboard-e2e
  • NIM OCI manifest attestation regression (high): Add a focused E2E or hermetic integration test with an NGC-like multi-arch OCI index containing linux manifests plus unknown/unknown attestation manifests, asserting the CLI pulls the host-arch digest and never bare-pulls the tag before tagging it back.
    • Suggested test: nim-manifest-digest-pull-e2e
  • NIM served model-id safety (medium): Add E2E coverage where a local OpenAI-compatible NIM mock returns a served model id different from the catalog name and then an unsafe id, verifying validation uses the safe served id and refuses unsafe/log-injection-shaped values.
    • Suggested test: nim-served-model-id-validation-e2e

Dispatch hint

  • Workflow: nightly-e2e.yaml
  • jobs input: cloud-onboard-e2e,inference-routing-e2e


github-actions Bot commented Jun 2, 2026

E2E Scenario Advisor Recommendation

Required scenario E2E: None
Optional scenario E2E: None

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required scenario E2E

  • None. The production changes are confined to Docker helper additions and the Local NVIDIA NIM onboarding/runtime path. The dispatchable scenario catalog in e2e-scenarios.yaml/scenarios.yaml has cloud NVIDIA and local Ollama coverage, but no Local NIM/NVCR image-pull scenario that would exercise docker manifest resolution, docker tag, served-model adoption, or the NIM health-timeout change. Unit test changes are outside test/e2e-scenario/. No scenario E2E job would directly validate the changed surface.

Optional scenario E2E

  • None.

Relevant changed files

  • src/lib/adapters/docker/image.ts
  • src/lib/adapters/docker/inspect.ts
  • src/lib/inference/nim.ts
  • src/lib/onboard.ts


github-actions Bot commented Jun 2, 2026

PR Review Advisor

Findings: 1 needs attention, 5 worth checking, 0 nice ideas
Since last review: 1 prior item resolved, 1 still applies, 3 new items found

Review findings

🛠️ Needs attention

  • Offset NIM monolith growth before merge (src/lib/inference/nim.ts:1): This PR adds 132 lines to `src/lib/inference/nim.ts` and 265 lines to `src/lib/inference/nim.test.ts`, both already large hotspots. The deterministic growth guard marks both as blocker-level monolith growth.
    • Recommendation: Extract the new manifest-resolution and served-model-id helpers/tests into smaller focused modules or otherwise offset the hotspot growth before merge.
    • Evidence: `src/lib/inference/nim.ts` grows from 869 to 1001 lines (+132); `src/lib/inference/nim.test.ts` grows from 1814 to 2079 lines (+265).

🔎 Worth checking

  • Source-of-truth review needed: NIM platform digest pull workaround: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: `pullImageResolvingPlatform()` explains the current workaround and fallback, but not the removal condition.
  • Source-of-truth review needed: Served model id adoption: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: `adoptServedModelId()` documents the mismatch but does not state a removal condition.
  • Validate registry digest strings before constructing Docker refs (src/lib/inference/nim.ts:747): `selectPlatformManifestDigest()` accepts any non-empty `entry.digest` from `docker manifest inspect` JSON and `pullImageResolvingPlatform()` uses it to build `repo@digest`. Docker is invoked via argv arrays, so this is not shell injection, but stricter digest validation would reduce malformed-ref and registry-response hardening risk.
    • Recommendation: Accept only canonical OCI digest strings, for example `sha256:<64 hex>`, before returning a digest. Add a negative test where a matching platform entry contains a malformed digest and verify it is rejected or falls back safely.
    • Evidence: `typeof entry.digest === "string" && entry.digest.length > 0` is the only digest check before `${imageRepository(image)}@${digest}` is passed to `dockerPull()` and `dockerTag()`.
  • Add targeted runtime validation for the Docker/NIM pull path (src/lib/inference/nim.ts:776): The unit tests cover the intended argv behavior, but the linked issue is a Docker 29/containerd plus nvcr.io registry interaction and the acceptance clauses include successful pull, container start, and sandbox readiness. Those runtime boundaries are not proven by unit mocks.
    • Recommendation: Add or identify targeted runtime/integration validation that exercises Docker 29/containerd against a NIM-style index: resolve Linux platform digest, avoid pulling the tag index, tag back to the friendly ref, start the local NIM container, and verify onboarding uses the served model id.
    • Evidence: Tests mock `docker manifest inspect`, `docker pull`, `docker tag`, and `/v1/models`; deterministic test-depth context flags runtime/sandbox/infrastructure paths in `image.ts`, `inspect.ts`, `nim.ts`, and `onboard.ts` for behavioral runtime validation.
  • Document removal conditions for localized NIM compatibility workarounds (src/lib/inference/nim.ts:776): The digest-pull workaround and served-model-id adoption clearly identify the invalid external states and have regression tests, but the code does not state when these compatibility paths can be removed. Without that source-of-truth marker, workaround behavior can become permanent even after NGC/Docker/catalog behavior changes.
    • Recommendation: Add concise comments or tracking references describing the removal conditions, such as Docker/NGC no longer requiring digest pulls for NIM attestation indexes and catalog ids being guaranteed to match served `/v1/models` ids.
    • Evidence: `pullImageResolvingPlatform()` and `adoptServedModelId()` are localized fallback/tolerant behaviors with tests, but their comments explain the current workaround and not the condition under which it should be removed.
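
For the digest-validation finding above, a minimal sketch of the suggested check might look like this. It follows the advisor's `sha256:<64 hex>` example only; the names `isCanonicalSha256Digest` and `pinnedRef` are hypothetical, and whether other digest algorithms should also be accepted is left open.

```typescript
// Accept only "sha256:" followed by exactly 64 lowercase hex characters.
const CANONICAL_SHA256 = /^sha256:[a-f0-9]{64}$/;

function isCanonicalSha256Digest(value: unknown): value is string {
  return typeof value === "string" && CANONICAL_SHA256.test(value);
}

// Build repo@digest only when the digest passes the check; null tells the
// caller to take its fallback path instead of constructing a malformed ref.
function pinnedRef(repository: string, digest: unknown): string | null {
  return isCanonicalSha256Digest(digest) ? `${repository}@${digest}` : null;
}

// Placeholder repository and digests for illustration.
console.log(pinnedRef("registry.example/nim/model", "sha256:" + "a".repeat(64)));
console.log(pinnedRef("registry.example/nim/model", "sha256:not-a-digest"));
```

A negative test for this would feed a matching platform entry with a malformed digest and expect the null (fallback) branch.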

🌱 Nice ideas

  • None.
Consider writing more tests for

The new unit tests are targeted and useful, but the highest-risk behavior crosses Docker CLI, Docker 29/containerd image store, nvcr.io registry manifests, local NIM startup, and onboarding provider configuration. Runtime validation is suggested for each of the following:

  • **Runtime validation**: Docker 29/containerd NGC NIM index pull resolves a Linux platform digest, avoids pulling the tag index, tags back, and the tagged image can be used by `docker run`.
  • **Runtime validation**: Local NIM onboarding passes the adopted `/v1/models` served id into validation and final provider config instead of the catalog id.
  • **Runtime validation**: When manifest inspect returns a matching entry with a malformed digest, the pull path rejects it or falls back without constructing `repo@<bad>`.
  • **Runtime validation**: When no platform digest matches the host arch, the diagnostic and fallback path are visible and the original failure is still surfaced clearly.
  • **Runtime validation**: A running local NIM whose `/v1/models` endpoint is temporarily unreachable keeps the catalog model and still follows the expected validation recovery path.
  • **Add targeted runtime validation for the Docker/NIM pull path** — Add or identify targeted runtime/integration validation that exercises Docker 29/containerd against a NIM-style index: resolve Linux platform digest, avoid pulling the tag index, tag back to the friendly ref, start the local NIM container, and verify onboarding uses the served model id.
  • **Acceptance clause:** Image pull completes successfully. — add test evidence or identify existing coverage. `pullNimImage()` now resolves a platform digest with `docker manifest inspect`, pulls `repo@digest`, and tags back to the original ref. Unit tests prove the happy path avoids `docker pull ...:latest`, but no runtime Docker 29/nvcr.io validation evidence is present.
  • **Acceptance clause:** NIM container starts. — add test evidence or identify existing coverage. The digest pull is tagged back to the original image ref used by `startNimContainerByName()`, and the default health timeout is now 1200s. Tests cover timeout and tag argv, but not a real container start from the re-tagged digest image.

This is an automated advisory review. A human maintainer must make the final merge decision.

@hunglp6d hunglp6d self-assigned this Jun 3, 2026
@hunglp6d hunglp6d added the enhancement: inference and VRDC (Issues and PRs submitted by NVIDIA VRDC test team) labels Jun 3, 2026
@hunglp6d hunglp6d marked this pull request as ready for review June 3, 2026 00:06

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/inference/nim.test.ts`:
- Around line 207-245: The test "resolves the host-arch manifest digest..." uses
the real process.arch which makes expectedDigest deterministic only on certain
machines; update the test to pin or stub process.arch (or parameterize per
supported arch) before calling nimModule.nodeArchToOci and pullNimImage so
DIGEST_BY_ARCH[ociArch] is deterministic—e.g., set process.arch temporarily to
"x64" or "arm64" around the calls to nodeArchToOci and pullNimImage (and restore
it in the finally), or loop the test for each supported arch; target symbols:
pullNimImage, nodeArchToOci, DIGEST_BY_ARCH, process.arch,
loadNimWithMockedRunner, and ensure restore() still runs.
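
One way to pin a host property for the duration of a test, as the comment above suggests, is a small wrapper that overrides the property and restores it in a finally block. The sketch below is hypothetical and uses a stand-in `fakeProcess` object plus made-up digests so it runs anywhere; it also assumes the code under test is synchronous, since the restore happens as soon as the callback returns.

```typescript
// Temporarily override one own property on a host object, run a synchronous
// callback, and restore the original descriptor even if the callback throws.
function withPinnedProperty<T extends object, K extends keyof T, R>(
  host: T,
  key: K,
  value: T[K],
  run: () => R,
): R {
  const original = Object.getOwnPropertyDescriptor(host, key);
  Object.defineProperty(host, key, { value, configurable: true, enumerable: true, writable: false });
  try {
    return run();
  } finally {
    if (original) {
      Object.defineProperty(host, key, original);
    } else {
      Reflect.deleteProperty(host, key);
    }
  }
}

// Stand-ins for the fixtures named in the review comment.
const fakeProcess = { arch: "x64" };
const DIGESTS: Record<string, string> = {
  amd64: "sha256:" + "a".repeat(64),
  arm64: "sha256:" + "b".repeat(64),
};
const toOci = (arch: string): string => (arch === "x64" ? "amd64" : arch);

// Loop the same assertion body over every supported arch.
const seen: string[] = [];
for (const arch of ["x64", "arm64"]) {
  seen.push(withPinnedProperty(fakeProcess, "arch", arch, () => DIGESTS[toOci(fakeProcess.arch)]));
}
console.log(seen.length, fakeProcess.arch); // prints: 2 x64
```

In a real test the host object would be `process` and the callback would call the functions under test; that wiring is not shown here because it depends on the test harness.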

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 8d830609-3f2c-49d6-a054-2bc43ce52ba0

📥 Commits

Reviewing files that changed from the base of the PR and between f17a19a and 3d71312.

📒 Files selected for processing (6)
  • src/lib/adapters/docker/image.ts
  • src/lib/adapters/docker/index.test.ts
  • src/lib/adapters/docker/inspect.ts
  • src/lib/inference/nim.test.ts
  • src/lib/inference/nim.ts
  • src/lib/onboard.ts


github-actions Bot commented Jun 3, 2026

Selective E2E Results — ✅ All requested jobs passed

Run: 26855567186
Target ref: 3d713125307111b85f78d3afc2235c3cd34e03c1
Workflow ref: main
Requested jobs: inference-routing-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job Result
inference-routing-e2e ✅ success


hunglp6d commented Jun 3, 2026

Addressed advisor findings:

  • Validate served model id: adoptServedModelId now checks the /v1/models id with isSafeModelId before adopting it (mirrors the local-vLLM boundary); an unsafe id is ignored with a diagnostic and never echoed to logs. +test.
  • Observable digest-pull fallback: pullImageResolvingPlatform now logs when it falls back to a plain tag pull, plus a regression test for a thrown dockerManifestInspect.

@hunglp6d hunglp6d added the v0.0.57 (Release target) label Jun 3, 2026
@cv cv added the v0.0.58 (Release target) label and removed the v0.0.57 (Release target) label Jun 3, 2026

github-actions Bot commented Jun 3, 2026

Selective E2E Results — ⚠️ No requested jobs ran

Run: 26857466199
Target ref: 271f8e3b6441a5cd7343e8daed6621e761594f8a
Workflow ref: main
Requested jobs: gpu-e2e
Summary: 0 passed, 0 failed, 1 skipped

Job Result
gpu-e2e ⏭️ skipped

@wscurran wscurran added the area: inference (Inference routing, serving, model selection, or outputs) and area: onboarding (Onboarding FSM, provider setup, sandbox launch, or first-run flow) labels Jun 3, 2026

github-actions Bot commented Jun 3, 2026

Selective E2E Results — ✅ All requested jobs passed

Run: 26898539348
Target ref: c86f11a03fb480ed9dc1b2b65a632a10654fb599
Workflow ref: main
Requested jobs: inference-routing-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job Result
inference-routing-e2e ✅ success


github-actions Bot commented Jun 3, 2026

Selective E2E Results — ⚠️ No requested jobs ran

Run: 26899287698
Target ref: 550bba2719e12f27ec555daafa4ce03dd7d4b75b
Workflow ref: main
Requested jobs: gpu-e2e
Summary: 0 passed, 0 failed, 1 skipped

Job Result
gpu-e2e ⏭️ skipped

@cv cv added the v0.0.59 (Release target) label and removed the v0.0.58 (Release target) label Jun 4, 2026
@cv cv added the v0.0.60 (Release target) label and removed the v0.0.59 (Release target) label Jun 4, 2026

github-actions Bot commented Jun 4, 2026

Selective E2E Results — ✅ All requested jobs passed

Run: 26981266018
Target ref: e30a51f5d738cf230329b0ebe768a05af9cddc5d
Workflow ref: main
Requested jobs: cloud-onboard-e2e,inference-routing-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job Result
cloud-onboard-e2e ✅ success
inference-routing-e2e ✅ success


github-actions Bot commented Jun 5, 2026

Selective E2E Results — ✅ All requested jobs passed

Run: 27025479068
Target ref: 6ef60d97cff2512a984f8507531669f007553535
Workflow ref: main
Requested jobs: inference-routing-e2e,cloud-onboard-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job Result
cloud-onboard-e2e ✅ success
inference-routing-e2e ✅ success


github-actions Bot commented Jun 5, 2026

Selective E2E Results — ✅ All requested jobs passed

Run: 27026265139
Target ref: ff0fa2ccc7daedaa6105c62d1d94c6b31018c8ae
Workflow ref: main
Requested jobs: inference-routing-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job Result
inference-routing-e2e ✅ success

@wscurran wscurran added the bug-fix (PR fixes a bug or regression) label Jun 5, 2026

github-actions Bot commented Jun 5, 2026

Selective E2E Results — ⚠️ No requested jobs ran

Run: 27036040845
Target ref: daef088ecca8dce092f97b9076fb40d75d088863
Workflow ref: main
Requested jobs: gpu-e2e
Summary: 0 passed, 0 failed, 1 skipped

Job Result
gpu-e2e ⏭️ skipped


github-actions Bot commented Jun 5, 2026

Selective E2E Results — ✅ All requested jobs passed

Run: 27036186305
Target ref: cc5720710d7c22b7bc48a0f0116c6a88a1ab5098
Workflow ref: main
Requested jobs: gpu-e2e,inference-routing-e2e
Summary: 1 passed, 0 failed, 1 skipped

Job Result
gpu-e2e ⏭️ skipped
inference-routing-e2e ✅ success


github-actions Bot commented Jun 5, 2026

Selective E2E Results — ✅ All requested jobs passed

Run: 27038414371
Target ref: 3b503571a321952bee0ba5d80fc0a8f405e43cf3
Workflow ref: main
Requested jobs: inference-routing-e2e,cloud-onboard-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job Result
cloud-onboard-e2e ✅ success
inference-routing-e2e ✅ success

@cv cv merged commit 0b17a14 into main Jun 5, 2026
28 checks passed
@cv cv deleted the fix/nim-pull-attestation-index-digest branch June 5, 2026 22:11
miyoungc added a commit that referenced this pull request Jun 6, 2026
## Summary
- Adds the `v0.0.60` section to `docs/about/release-notes.mdx` using the
dev announcement from discussion #4877.
- Fills the source-doc gaps found during release-prep review across
inference, policy tiers, command behavior, security boundaries, Hermes
dashboard/tooling, runtime context, and troubleshooting.
- Refreshes generated agent skills under `.agents/skills/` from the
current Fern docs output and upgrades Fern from `5.44.3` to `5.45.0`.

## Source summary
- #4037 -> `docs/reference/architecture.mdx`,
`docs/about/how-it-works.mdx`, `docs/about/release-notes.mdx`: Documents
system-only runtime context that stays out of visible chat.
- #4875 -> `docs/reference/architecture.mdx`,
`docs/about/how-it-works.mdx`, `docs/about/release-notes.mdx`: Documents
try-first sandbox network/filesystem guidance and clearer failure
classification.
- #4788 -> `docs/security/best-practices.mdx`,
`docs/about/release-notes.mdx`: Documents shared OpenClaw
device-approval policy for startup and connect.
- #4768 -> `docs/reference/network-policies.mdx`,
`docs/network-policy/integration-policy-examples.mdx`,
`docs/get-started/quickstart.mdx`,
`docs/get-started/quickstart-hermes.mdx`, `docs/reference/commands.mdx`:
Documents `weather`, `public-reference`, and Hermes managed-tool gateway
preset behavior.
- #3788 and #4864 -> `docs/reference/network-policies.mdx`,
`docs/reference/commands.mdx`: Documents non-interactive policy-tier
fail-fast behavior and interactive prompt fallback.
- #4756 and #4866 -> `docs/reference/commands.mdx`: Documents env-aware
default sandbox resolution for `list`, `status`, and `tunnel` commands.
- #4320 -> `docs/reference/commands.mdx`: Documents `$$nemoclaw tunnel
status` behavior.
- #4328 -> `docs/reference/commands.mdx`: Documents line-scoped policy
preset descriptions in `policy-list`.
- #4580 and #4748 -> `docs/reference/architecture.mdx`: Documents
package-managed OpenShell gateway service and Docker-driver
gateway-marker behavior.
- #4598 -> `docs/manage-sandboxes/lifecycle.mdx`: Documents concurrent
gateway/dashboard cleanup isolation by sandbox name and port.
- #4777 -> `docs/reference/troubleshooting.mdx`: Documents Docker GPU
patch rollback behavior.
- #4610 -> `docs/reference/troubleshooting.mdx`,
`docs/reference/commands.mdx`: Keeps mutable OpenClaw config permission
guidance aligned and removes skipped experimental wording.
- #4868 -> `docs/reference/commands.mdx`: Keeps `.dockerignore` handling
for custom `onboard --from <Dockerfile>` contexts in generated skills.
- #4870 -> `docs/reference/commands.mdx`,
`docs/manage-sandboxes/runtime-controls.mdx`: Documents
`NEMOCLAW_MINIMAL_BOOTSTRAP` and generated skill coverage.
- #4641 -> `docs/inference/inference-options.mdx`,
`docs/reference/troubleshooting.mdx`: Documents local NVIDIA NIM
platform-digest pulls and served-model id adoption.
- #4810 and #4867 -> `docs/inference/inference-options.mdx`: Documents
stable NGC managed-vLLM image lineage and DGX Station DeepSeek V4 Flash
coverage.
- #4852 -> `docs/inference/use-local-inference.mdx`,
`docs/reference/troubleshooting.mdx`: Documents Ollama model fit
filtering, 16K context floor, cold-load retry, and failed-model
exclusion.
- #4847 -> `docs/inference/switch-inference-providers.mdx`: Documents
API-family sync, Hermes `api_mode`, and Bedrock Runtime exception.
- #4800 -> `docs/inference/tool-calling-reliability.mdx`: Documents
Nemotron managed-inference native tool-search fallback.
- #4333 -> `docs/inference/switch-inference-providers.mdx`: Documents
interactive multimodal input prompting.
- #4086 -> `docs/reference/troubleshooting.mdx`: Keeps proxy bypass
normalization in generated troubleshooting coverage.
- #4811 and #4855 -> `docs/get-started/quickstart-hermes.mdx`: Documents
prebuilt Hermes dashboard assets and TUI recovery without runtime
rebuilds.
- #4854 -> `docs/inference/switch-inference-providers.mdx`,
`docs/reference/commands.mdx`: Documents Hermes proxy API-key
placeholder preservation during inference switches.
- #4248 -> `docs/manage-sandboxes/messaging-channels.mdx`,
`.agents/skills/`: Keeps messaging enrollment behavior aligned with
manifest-hook implementation.
- #4771 -> `docs/security/best-practices.mdx`,
`docs/security/credential-storage.mdx`: Documents Hermes
placeholder-only secret boundary for sandbox-visible runtime files.
- #4787 -> `docs/security/best-practices.mdx`,
`docs/about/release-notes.mdx`: Documents expanded memory scanner
examples for OpenAI project keys and Slack app-level tokens.
- #4848 -> `docs/reference/commands.mdx`: Documents OpenClaw skill
install mirroring into the agent home directory.
- #4790 -> `docs/about/release-notes.mdx`: Uses the prior release-prep
structure and generated `.agents/skills/` refresh as the template for
this release.
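
The platform-digest pull noted for #4641 above can be illustrated with a minimal sketch. This is not the code in `src/lib/inference/nim.ts`: the `IndexEntry` and `ImageIndex` types, the `pickPlatformDigest` helper, and the sample digests are all hypothetical, and the input is assumed to be the parsed JSON of a multi-arch index as printed by `docker manifest inspect`. It only shows the selection step: match `platform.architecture`/`os` and skip the `unknown/unknown` attestation entries.

```typescript
// Hypothetical minimal shape of an OCI image index; only the fields
// used by this sketch are modelled.
interface IndexEntry {
  digest: string;
  platform?: { architecture: string; os: string };
}

interface ImageIndex {
  manifests?: IndexEntry[];
}

// Return the digest of the image manifest that matches the given platform.
// Entries with platform unknown/unknown (buildkit attestations) are skipped.
// Returns null when no entry matches, so a caller could fall back to a
// plain tag pull.
function pickPlatformDigest(
  index: ImageIndex,
  arch: string,
  os: string,
): string | null {
  for (const entry of index.manifests ?? []) {
    const p = entry.platform;
    if (!p) continue;
    if (p.architecture === "unknown" || p.os === "unknown") continue;
    if (p.architecture === arch && p.os === os) return entry.digest;
  }
  return null;
}

// Made-up sample index; the digests are placeholders, not real images.
const sampleIndex: ImageIndex = {
  manifests: [
    { digest: "sha256:aaa", platform: { architecture: "amd64", os: "linux" } },
    { digest: "sha256:bbb", platform: { architecture: "arm64", os: "linux" } },
    { digest: "sha256:ccc", platform: { architecture: "unknown", os: "unknown" } },
  ],
};

console.log(pickPlatformDigest(sampleIndex, "arm64", "linux")); // sha256:bbb
```

A caller would still need to map the host architecture to the registry's naming (for example, Node's `process.arch` reports `x64` where image indexes use `amd64`) before pulling the selected manifest by digest and re-tagging it.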

## Verification
- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix
nemoclaw-user --doc-platform fern-mdx`
- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ skills/
--prefix nemoclaw-user --doc-platform fern-mdx --dry-run`
- `npm run docs`
- `git diff --check`
- skip-term scan across `docs/`, `.agents/skills/`, and `skills/`
- `npm run build:cli`
- `npm run typecheck:cli`
- Commit and pre-push hook suites, including markdownlint, gitleaks,
env-var docs gate, docs-to-skills verification, and skills YAML tests

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes

* **New Features**
  * DeepSeek-V4-Flash now available as default inference model for DGX
    Station.
  * Hermes dashboard improved with dedicated port and OAuth-authenticated
    tool gateway selection.
  * Added weather and public-reference policy presets for expanded agent
    capabilities.
  * Enhanced Ollama model selection with GPU memory filtering and
    automatic retry for timeouts.

* **Bug Fixes**
  * Improved policy tier validation to prevent invalid configurations.
  * Better sandbox cleanup scoping by port to prevent conflicts across
    deployments.
  * Added GPU patch failure recovery with automatic rollback.

* **Documentation**
  * Expanded troubleshooting guides for inference, security, and sandbox
    lifecycle.
  * Added .dockerignore best practices for custom deployments.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Carlos Villela <cvillela@nvidia.com>

Labels

area: inference (Inference routing, serving, model selection, or outputs)
area: onboarding (Onboarding FSM, provider setup, sandbox launch, or first-run flow)
bug-fix (PR fixes a bug or regression)
feature (PR adds or expands user-visible functionality)
v0.0.60 (Release target)
VRDC (Issues and PRs submitted by NVIDIA VRDC test team)

Projects

None yet

3 participants