fix: resolve silent output and remove nc dependency #3669

Open
Iamkewl wants to merge 1 commit into NVIDIA:main from Iamkewl:fix/issue-3665-pr

Conversation

@Iamkewl
Contributor

@Iamkewl Iamkewl commented May 17, 2026

Summary

Replaces the platform-dependent nc readiness probe with an inline Node TCP probe, aligns timeouts, hardens tests to avoid CI flakes, and adds CI reliability fixes.

Related Issue

Fixes #3665

Changes

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • make docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Your Name your-email@example.com

Summary by CodeRabbit

  • New Features

    • Ollama auth-proxy startup timeout is now configurable via environment variable for slow environments.
  • Bug Fixes

    • Improved CI robustness with retry logic and checksum validation for dependency downloads.
    • Enhanced error handling for filesystem traversal operations.
    • Port reachability checks now use native Node.js for better reliability.
  • Tests

    • Expanded test coverage for port availability checks and conditional test execution based on environment configuration.

@copy-pr-bot

copy-pr-bot Bot commented May 17, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai
Contributor

coderabbitai Bot commented May 17, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds Node type refs and guarded filesystem traversal, guards skills frontmatter tests when generated skills are absent, replaces an external nc port probe with an inline Node net probe invoked via process.execPath, makes the Ollama proxy startup timeout configurable, updates tests to match the new probe, and adds hadolint download retries.
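The probe replacement described in the walkthrough can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `PROBE_SCRIPT`, `isPortOpen`, `sleepMs`, and `waitForPortSketch` are hypothetical names, and the real `waitForPort` signature is not shown on this page. The sketch assumes the probe reports reachability only through the child's exit status.

```typescript
import { spawnSync } from "node:child_process";

// Script executed by the child Node process (hypothetical; the PR's actual
// probe may differ). Exit code 0 means the TCP connection succeeded;
// exit code 1 means it failed or timed out.
const PROBE_SCRIPT = `
const net = require("net");
const [host, port, timeoutMs] = process.argv.slice(1);
const socket = net.connect({ host, port: Number(port) });
socket.setTimeout(Number(timeoutMs));
socket.once("connect", () => { socket.destroy(); process.exit(0); });
socket.once("timeout", () => { socket.destroy(); process.exit(1); });
socket.once("error", () => process.exit(1));
`;

// Hypothetical helper: one synchronous reachability check.
export function isPortOpen(port: number, host = "127.0.0.1", timeoutMs = 1000): boolean {
  const result = spawnSync(
    process.execPath, // the same Node binary that runs the CLI, so nc is not needed
    ["-e", PROBE_SCRIPT, host, String(port), String(timeoutMs)],
    { stdio: "ignore", timeout: timeoutMs + 2000 }, // outer guard in case the child hangs
  );
  return result.status === 0;
}

// Block the current thread without spinning the CPU.
function sleepMs(ms: number): void {
  Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}

// Hypothetical polling loop: probe, sleep, repeat until the deadline passes.
export function waitForPortSketch(port: number, timeoutSeconds: number, intervalMs = 200): boolean {
  const deadline = Date.now() + timeoutSeconds * 1000;
  do {
    if (isPortOpen(port)) return true;
    sleepMs(intervalMs);
  } while (Date.now() < deadline);
  return false;
}
```

Spawning `process.execPath` keeps the check synchronous for callers while avoiding any dependency on a platform-specific `nc` binary; the cost is one short-lived Node process per attempt.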

Changes

Error Resilience Improvements

| Layer | File(s) | Summary |
|---|---|---|
| Node types and guarded directory traversal | scripts/find-source-shape-tests.ts | Added /// <reference types="node" /> and wrapped readdirSync/statSync calls in try/catch to return early on unreadable directories and skip entries when stat fails with ENOENT/EACCES. |
| Test suite existence guard | test/skills-frontmatter.test.ts | Added fs.existsSync(skillsRoot) check that registers a single placeholder assertion and returns early when generated skills are missing, gated by NEMOCLAW_TEST_ALLOW_MISSING_GENERATED_SKILLS. |
| Replace nc-based port probe with Node net probe | src/lib/core/wait.ts | waitForPort now runs a small Node net probe via process.execPath and uses the probe exit status to determine TCP reachability. |
| Test coverage and spawnSync mocks update | test/wait.test.ts, test/onboard-selection.test.ts | Added network-based waitForPort tests with ephemeral servers; updated embedded spawnSync mocks in onboard tests to treat probes run via process.execPath as successful alongside nc. |
| Startup timeout env and CI retry | src/lib/inference/ollama/proxy.ts, ci/env-var-doc-allowlist.json, .github/actions/basic-checks/action.yaml | startOllamaAuthProxy() reads NEMOCLAW_OLLAMA_PROXY_STARTUP_TIMEOUT (default 10s) for readiness checks; CI hadolint download now retries up to 3 times with SHA256 validation; allowlist documents the new env var. |
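For the guarded directory traversal in scripts/find-source-shape-tests.ts, a minimal sketch of the pattern might look like this. The names `walkFiles`, `statKind`, and `isSkippable` are hypothetical and the script's real structure is not shown here; the sketch only assumes the behavior summarized above (return early on an unreadable directory, skip entries whose stat fails with ENOENT or EACCES).

```typescript
import { readdirSync, statSync } from "node:fs";
import { join } from "node:path";

// True for the error codes the summary says are skipped.
function isSkippable(err: unknown): boolean {
  const code = (err as { code?: string } | null)?.code;
  return code === "ENOENT" || code === "EACCES";
}

// Classify a path, treating vanished or unreadable entries as "skip".
function statKind(path: string): "dir" | "file" | "skip" {
  try {
    return statSync(path).isDirectory() ? "dir" : "file";
  } catch (err) {
    if (isSkippable(err)) return "skip";
    throw err; // any other failure is unexpected, so surface it
  }
}

// Hypothetical walker: collects file paths under dir, tolerating races
// (an entry deleted between readdir and stat) and permission errors.
export function walkFiles(dir: string, out: string[] = []): string[] {
  let entries: string[];
  try {
    entries = readdirSync(dir);
  } catch {
    return out; // unreadable or missing directory: return early
  }
  for (const name of entries) {
    const full = join(dir, name);
    const kind = statKind(full);
    if (kind === "dir") walkFiles(full, out);
    else if (kind === "file") out.push(full);
  }
  return out;
}
```

Because `statSync` follows symlinks, a dangling symlink raises ENOENT and is skipped rather than aborting the whole walk.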

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested labels

configuration, enhancement: testing, Platform: All, v0.0.44

Suggested reviewers

  • jyaunches
  • prekshivyas
  • cjagwani

Poem

🐰 I hop through folders, stat and try,
If a port won't answer, I spawn Node and pry.
Tests that miss generated things politely pause,
CI retries dance to checksum laws.
A little rabbit cheered the resilient cause.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title 'fix: resolve silent output and remove nc dependency' directly addresses the main technical change: replacing the nc (netcat) command-line tool with a native Node.js TCP probe. |
| Linked Issues check | ✅ Passed | The PR code changes comprehensively address the linked issue #3665 objectives: it removes the platform-dependent nc dependency with native Node TCP probing, makes Ollama proxy startup timeout configurable for slow/WSL2 environments, hardens directory walking to handle filesystem errors, and adds integration tests for the wait functionality. |
| Out of Scope Changes check | ✅ Passed | All code changes are directly scoped to fixing issue #3665: replacing nc with Node TCP probes, making timeout configurable, hardening error handling in file walking, updating corresponding tests, and improving CI download reliability. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
test/skills-frontmatter.test.ts (1)

48-53: ⚡ Quick win

Remove the duplicated skillsRoot existence guard.

This block is a duplicate of Lines 41-46 and is redundant/unreachable in practice.

Proposed cleanup
-  if (!fs.existsSync(skillsRoot)) {
-    it("no generated skills present (CI) — skipping", () => {
-      expect(true).toBe(true);
-    });
-    return;
-  }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/skills-frontmatter.test.ts` around lines 48 - 53, The duplicate
existence guard for skillsRoot (the block starting with if
(!fs.existsSync(skillsRoot)) { it("no generated skills present (CI) — skipping",
...); return; }) should be removed because it repeats the earlier guard; delete
this second block so only the initial check for skillsRoot remains in the test
file (keep the first occurrence that guards the test suite and remove the
redundant copy to avoid unreachable/duplicated logic).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 829d456c-baa2-4c7c-8cf7-8519bd4d2aa9

📥 Commits

Reviewing files that changed from the base of the PR and between 647566a and efeb4b2.

📒 Files selected for processing (2)
  • scripts/find-source-shape-tests.ts
  • test/skills-frontmatter.test.ts

Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
src/lib/core/wait.ts (1)

52-55: 💤 Low value

Consider aligning socket timeout with the polling interval.

The socket timeout is hardcoded to 1000ms, but the probe is invoked every 200ms (line 64). When a connection attempt hangs instead of being refused, each probe can block for up to 1 second, which reduces the effective polling frequency from the expected 5 Hz to roughly 1 Hz. As a result, fewer probe attempts fit within the overall timeout window.

For tighter polling behavior, consider reducing the socket timeout to 200-500ms to match the polling interval.
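The effect can be made concrete with a little arithmetic. The sketch below assumes a loop that runs one probe, then sleeps for the polling interval, and that every probe hangs until its socket timeout (the worst case); `worstCaseAttempts` is a hypothetical helper, and process spawn overhead is ignored.

```typescript
// Number of full probe-plus-sleep cycles that fit in the overall window
// when every probe blocks for its entire socket timeout.
export function worstCaseAttempts(
  totalTimeoutMs: number,
  socketTimeoutMs: number,
  pollIntervalMs: number,
): number {
  const perAttemptMs = socketTimeoutMs + pollIntervalMs;
  return Math.max(1, Math.floor(totalTimeoutMs / perAttemptMs));
}

// With a 10 s window and a 200 ms polling interval:
console.log(worstCaseAttempts(10_000, 1000, 200)); // 8 attempts with a 1000 ms socket timeout
console.log(worstCaseAttempts(10_000, 200, 200)); // 25 attempts with a 200 ms socket timeout
```

Under these assumptions, shortening the socket timeout roughly triples the number of attempts in the same window, at the cost of giving each individual connection less time to complete.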

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/core/wait.ts` around lines 52 - 55, The socket probe currently uses a
hardcoded 1000ms timeout (socket.setTimeout(1000)), which blocks each probe and
undermines the 200ms polling interval; change socket.setTimeout to match the
polling interval (e.g., 200–500ms) or use the existing poll interval
constant/variable instead of 1000 so socket.setTimeout(...) aligns with the
probe frequency; update the code around socket.setTimeout and its related
handlers (socket.once('connect'), socket.once('timeout'), socket.once('error'))
to use that interval.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: ab07f6d6-012c-433e-91fc-6cd9def6555e

📥 Commits

Reviewing files that changed from the base of the PR and between 0a48cc5 and 59e2d5a.

📒 Files selected for processing (1)
  • src/lib/core/wait.ts

@Iamkewl Iamkewl marked this pull request as draft May 18, 2026 13:36
@Iamkewl Iamkewl marked this pull request as ready for review May 18, 2026 13:49
@Iamkewl
Contributor Author

Iamkewl commented May 18, 2026

@coderabbitai review

@coderabbitai
Contributor

coderabbitai Bot commented May 18, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/actions/basic-checks/action.yaml:
- Around line 22-27: The checksum verification can silently succeed when both
EXPECTED and ACTUAL are empty; update the hadolint download loop to set and
check an explicit verified flag (e.g., verified=1) only when both EXPECTED and
ACTUAL are populated and match, and after the loop replace the current guard
that compares EXPECTED and ACTUAL with a conditional that fails if verified is
not set or if EXPECTED is empty (use [ -n "$EXPECTED" ] and the verified flag)
so the script errors out on missing checksum or failed verification; adjust the
loop body that currently computes EXPECTED and ACTUAL to assign those variables
conditionally and set verified when the hash check passes.

In `@src/lib/inference/ollama/proxy.ts`:
- Around line 213-214: The parsed proxyStartupTimeout can be NaN if
process.env.NEMOCLAW_OLLAMA_PROXY_STARTUP_TIMEOUT is non-numeric, causing
waitForPort(OLLAMA_PROXY_PORT, proxyStartupTimeout) to behave incorrectly;
update the code around Number.parseInt(...) to validate the parsed value (use
Number.parseInt(..., 10) result checked with Number.isFinite/Number.isNaN or
isNaN) and fall back to the default 10 (or clamp to a positive integer) when the
env value is invalid or <= 0 before calling waitForPort; ensure you keep the
existing symbols proxyStartupTimeout, OLLAMA_PROXY_PORT and waitForPort when
making the change.
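The validation requested for the proxy startup timeout could be sketched as below. `parseStartupTimeout` and `DEFAULT_TIMEOUT_SECONDS` are hypothetical names introduced for illustration; the sketch assumes the value is expressed in seconds, matching the documented 10s default.

```typescript
const DEFAULT_TIMEOUT_SECONDS = 10;

// Parse an env-style string into a positive integer number of seconds,
// falling back to the default when the value is missing, non-numeric,
// zero, or negative.
export function parseStartupTimeout(
  raw: string | undefined,
  fallback: number = DEFAULT_TIMEOUT_SECONDS,
): number {
  const parsed = Number.parseInt(raw ?? "", 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

// Example inputs and the values they resolve to:
console.log(parseStartupTimeout(undefined)); // 10
console.log(parseStartupTimeout("abc")); // 10 (parseInt yields NaN)
console.log(parseStartupTimeout("0")); // 10 (non-positive values are rejected)
console.log(parseStartupTimeout("30")); // 30
```

A call site would then pass `parseStartupTimeout(process.env.NEMOCLAW_OLLAMA_PROXY_STARTUP_TIMEOUT)` to `waitForPort`. Note that `Number.parseInt` is lenient about trailing characters, so an input such as "15s" resolves to 15 rather than the fallback.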

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c1dd8a3d-a52d-40bb-aa93-393505d1086b

📥 Commits

Reviewing files that changed from the base of the PR and between 0a48cc5 and 383b070.

📒 Files selected for processing (6)
  • .github/actions/basic-checks/action.yaml
  • ci/env-var-doc-allowlist.json
  • scripts/find-source-shape-tests.ts
  • src/lib/core/wait.ts
  • src/lib/inference/ollama/proxy.ts
  • test/onboard-selection.test.ts
✅ Files skipped from review due to trivial changes (1)
  • ci/env-var-doc-allowlist.json
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/lib/core/wait.ts
  • scripts/find-source-shape-tests.ts

Comment thread .github/actions/basic-checks/action.yaml
Comment thread src/lib/inference/ollama/proxy.ts Outdated
@Iamkewl Iamkewl marked this pull request as draft May 18, 2026 13:59
@wscurran wscurran added the fix and NemoClaw CLI labels May 18, 2026
@wscurran
Contributor

✨ Thanks for submitting this detailed PR to fix the silent output issue and remove the dependency on the nc utility. This change aims to improve the reliability of the CLI by ensuring stdio is flushed and implementing a native Node.js TCP probe.


@Iamkewl Iamkewl marked this pull request as ready for review May 18, 2026 15:13
@Iamkewl
Contributor Author

Iamkewl commented May 18, 2026

@coderabbitai review

@coderabbitai
Contributor

coderabbitai Bot commented May 18, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Labels

fix
NemoClaw CLI: Use this label to identify issues with the NemoClaw command-line interface (CLI).

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Nemoclaw onboarding - gateway connection issue

2 participants