fix(cli): verify configured channels reach OpenClaw runtime (#4156)#4182
fix(cli): verify configured channels reach OpenClaw runtime (#4156)#4182yimoj wants to merge 4 commits into
Conversation
) Onboarding could write Telegram and other channel blocks into `/sandbox/.openclaw/openclaw.json` but the OpenClaw dashboard still rendered "No channels found" because NemoClaw only verified static config and gateway-provider attachment — never that the running OpenClaw process acknowledged each configured channel. Add a two-layer probe that reads the in-sandbox config AND scans the gateway log segment since the most recent launch for channel mentions; surface mismatches as a `messaging` warn diagnostic in both `verifyDeployment` (post-create) and `nemoclaw <sandbox> doctor`. Format the post-deployment summary so warn-level messaging diagnostics also render on the healthy-overall path, where they would otherwise be silently dropped. Focused tests cover the new module (pure parsing + shell-script end-to-end), the verifyDeployment integration (configured-but-not- running, stale-rebuild config drift, log unavailable, malformed config), and a doctor `--json` E2E assertion that the new Runtime channel registry check fires after rebuild. Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughAdds runtime probing of OpenClaw channel config + gateway log, integrates results into doctor and verify-deployment flows, wires onboarding to provide the probe, and adds unit, onboarding, and E2E tests. ChangesMessaging Channel Runtime Verification
Sequence Diagram(s)sequenceDiagram
participant User as Sandbox User
participant Doctor as doctor.ts
participant Verify as verify-deployment.ts
participant Probe as probeChannelRuntimeStatus
participant Config as openclaw.json
participant Log as gateway.log
User->>Doctor: run `nemoclaw doctor`
Doctor->>Probe: probeChannelRuntimeStatus(configFilePath, exec)
Probe->>Config: read+parse openclaw.json -> configuredChannels
Probe->>Log: execute generated scan script -> FOUND: tokens
Log-->>Probe: visibleChannels
Probe-->>Doctor: RuntimeChannelStatus (visible/configured/logProbeOk)
Doctor->>Doctor: compareChannelSets(configured, visible)
Doctor-->>User: report `ok` or `warn` with details
User->>Verify: onboarding finalization -> verifyDeployment
Verify->>Probe: probeChannelRuntimeStatus()
Probe->>Config: extract enabled channels
Probe->>Log: scan for started channels
Probe-->>Verify: RuntimeChannelStatus
Verify-->>User: verification result includes messagingRuntimeChannelsMissing/configMissing
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
ESLint skipped: no ESLint configuration detected in root package.json. To enable, add Warning Review ran into problems🔥 ProblemsStopped waiting for pipeline failures after 30000ms. One of your pipelines takes longer than our 30000ms fetch window to run, so review may not consider pipeline-failure results for inline comments if any failures occurred after the fetch window. Increase the timeout if you want to wait longer or run a Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
test/e2e/test-messaging-providers.sh (1)
2083-2098: ⚡ Quick winConsider detecting JSON parse errors early for clearer diagnostics.
If the Python JSON parsing fails (lines 2074 or 2081),
runtime_checkwill contain{"error":"..."}, which then falls through to line 2097 and fails with an unclear message"Unexpected Runtime channel registry status ''".Proposed improvement
+ if echo "$runtime_check" | grep -q '"error"'; then + skip "RT1: Failed to parse doctor JSON output (${runtime_check:0:300})" + elif echo "$runtime_check" | grep -q '"missing"'; then - if echo "$runtime_check" | grep -q '"missing"'; then skip "RT1: doctor --json had no Runtime channel registry check (no configured channels)" elseThis provides a clearer diagnostic message when the doctor output format changes or JSON parsing fails.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e/test-messaging-providers.sh` around lines 2083 - 2098, The runtime_check JSON parsing can fail and leave rt_status empty, producing an unclear "Unexpected ... ''" error; update the logic around extracting rt_status (the python3 json.load(sys.stdin).get('status','') call) to first detect and surface any JSON parse/doctor errors by checking for an "error" field (or non-JSON output) from runtime_check and include that error text in the diagnostic; in practice modify the rt_status extraction and the else branch that calls fail to first try extracting json.load(...).get('error','') (or detect json decode failure) and if present fail with that error detail, otherwise fall back to the existing status/detail handling so failures show clear JSON/doctor error messages referencing the runtime_check variable.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/lib/channel-runtime-status.ts`:
- Around line 309-316: When logProbeOk is false in the runtime visibility
branch, don't expose configuredChannels as runtime-verified; change the returned
payload (in the branch guarded by logProbeOk) so visibleChannels is an empty
array (or otherwise indicates no runtime confirmation) instead of
configuredChannels. Update the return object in channel-runtime-status.ts (the
block that currently returns ok: true, visibleChannels: configuredChannels,
configuredButNotRunning: [], logProbeOk: false, detail: ...) so visibleChannels
reflects only log-confirmed channels when logProbeOk is false and keep
configuredButNotRunning/logProbeOk/detail as-is.
In `@src/lib/verify-deployment.ts`:
- Around line 365-370: The message returned when messaging.runtimeMissing is
non-empty incorrectly assumes missing startup log entries; update the wording in
the return to be neutral or branch on cause: use the compareChannelSets result
(runtimeMissing from compareChannelSets(channels, runtime.visibleChannels)) to
say channels are "not visible to the OpenClaw runtime" rather than "no startup
entries in /tmp/gateway.log", or add a conditional that checks the specific
cause before mentioning logs (e.g., if you have a separate flag for missing from
openclaw.json vs not present in gateway.log then mention logs only for the
latter). Ensure you modify the return string where messaging.runtimeMissing is
used so it references runtimeMissing and does not always point operators to the
gateway.log when the channel may simply be absent from openclaw.json.
---
Nitpick comments:
In `@test/e2e/test-messaging-providers.sh`:
- Around line 2083-2098: The runtime_check JSON parsing can fail and leave
rt_status empty, producing an unclear "Unexpected ... ''" error; update the
logic around extracting rt_status (the python3
json.load(sys.stdin).get('status','') call) to first detect and surface any JSON
parse/doctor errors by checking for an "error" field (or non-JSON output) from
runtime_check and include that error text in the diagnostic; in practice modify
the rt_status extraction and the else branch that calls fail to first try
extracting json.load(...).get('error','') (or detect json decode failure) and if
present fail with that error detail, otherwise fall back to the existing
status/detail handling so failures show clear JSON/doctor error messages
referencing the runtime_check variable.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 5944d44b-e6c1-4210-8c58-f6f277580cd8
📒 Files selected for processing (7)
src/lib/actions/sandbox/doctor.tssrc/lib/channel-runtime-status.test.tssrc/lib/channel-runtime-status.tssrc/lib/onboard.tssrc/lib/verify-deployment.test.tssrc/lib/verify-deployment.tstest/e2e/test-messaging-providers.sh
- Extract the post-deploy channel-runtime probe wiring into src/lib/onboard/verify-channel-runtime.ts so the entrypoint stays net-neutral (the `onboard-entrypoint-budget` CI gate). - channel-runtime-status: keep `visibleChannels` strictly log-corroborated. When the gateway log is unreadable we no longer return the configured set there; callers that diff against `visibleChannels` would otherwise treat an inconclusive probe as healthy. - channel-runtime-status: expose `configuredChannels` separately so callers can detect stale-rebuild mismatches (registry expects a channel that `openclaw.json` no longer contains) even when the runtime layer is unavailable. - verify-deployment + doctor: branch on `logProbeOk` so the runtime diff only runs with log corroboration, and surface a separate `missing from sandbox config` warning for config-only mismatches when the log is unreadable. - Neutralize the messaging hint so it does not blame the gateway log for what might just be a stale rebuild — the operator is pointed at both `openclaw.json` and the log. - Tests cover the new contract: visibleChannels stays empty when log is unavailable, configuredChannels still surfaces the config-derived set, and verify-deployment / doctor flag a stale rebuild even when the gateway log is missing. Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/lib/onboard.ts`:
- Around line 7434-7438: The new multi-line wiring for probeChannelRuntimeStatus
is causing net growth; collapse it into a single compact expression to eliminate
added lines (or remove the unnecessary "?? undefined"). Replace the block using
probeChannelRuntimeStatus and buildChannelRuntimeProbe with a single-line call:
require("./onboard/verify-channel-runtime").buildChannelRuntimeProbe(agent, {
executeSandboxCommand: (script) => executeSandboxCommandForVerification(name,
script) }) and remove the trailing "?? undefined" so the change is net-neutral;
reference symbols: probeChannelRuntimeStatus, buildChannelRuntimeProbe,
executeSandboxCommandForVerification, agent, name.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: b54082f3-832b-4d58-ad2b-02a6f073c9b0
📒 Files selected for processing (8)
src/lib/actions/sandbox/doctor.tssrc/lib/channel-runtime-status.test.tssrc/lib/channel-runtime-status.tssrc/lib/onboard.tssrc/lib/onboard/verify-channel-runtime.test.tssrc/lib/onboard/verify-channel-runtime.tssrc/lib/verify-deployment.test.tssrc/lib/verify-deployment.ts
✅ Files skipped from review due to trivial changes (1)
- src/lib/onboard/verify-channel-runtime.ts
…iring Collapse the post-deploy channel-runtime probe wiring into a single call site in `onboard.ts` by adding `buildOnboardChannelRuntimeProbe` (which internally binds `executeSandboxCommandForVerification` to the sandbox name) and folding the still-needed `executeSandboxCommand` dep to a direct function reference. Result: `src/lib/onboard.ts` is now +2/-2 instead of +5/-0, satisfying the `onboard-entrypoint-budget` CI gate while preserving behavior. Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
CodeQL flagged the previous `execSync` invocation in
`channel-runtime-status.test.ts` for incomplete string escaping —
backslashes in the embedded gateway-log-scan script were not being
escaped before being re-quoted into a `sh -c "..."` argv. Switch to
`spawnSync("sh", ["-c", script])`, which passes the script verbatim
as an argv element and avoids the per-character escape gymnastics
entirely. Behavior is unchanged and all 34 tests still pass.
Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
|
✨ |
Summary
probeChannelRuntimeStatus) that reads the in-sandbox OpenClaw config AND scans the gateway log segment since the most recent launch for channel mentions, comparing against the registry's expected channels.messagingwarn diagnostic in bothverifyDeployment(printed during onboarding) andnemoclaw <sandbox> doctor, with actionable hints that point at the dashboard, the gateway log, and rebuild.formatVerificationDiagnosticsalso render warn-level diagnostics on the overall-healthy path so the new signal is never silently dropped when gateway + dashboard probes pass.Why
NemoClaw onboarding could bake Telegram (or any other) channel block into
/sandbox/.openclaw/openclaw.json, but the OpenClaw dashboard still showed "No channels found." The pre-existing post-deploy check only verified gateway-provider attachment, so the runtime gap stayed hidden. Fixes #4156.Test plan
npm run typecheck:clisucceeds.npx vitest run --project cli src/lib/channel-runtime-status.test.ts src/lib/verify-deployment.test.ts— 60 tests pass (34 new for the runtime probe, 26 inverify-deployment, covering: config has the channel and log mentions it; configured-but-not-running (No channels found #4156 symptom); stale-rebuild config drift; gateway log unreadable; malformed/empty config; openclaw-weixin→wechat collapse; awk segment isolation in a realsh).cd nemoclaw && npm test.test/e2e/test-messaging-providers.sh) gains a Phase 7b assertion thatdoctor --jsonsurfaces the new Runtime channel registry check after rebuild.messagingChannels: ["telegram"]but no live runtime,node ./bin/nemoclaw.js my-sb doctor --jsonemits aMessaging → Runtime channel registrywarn diagnostic instead of the previous silent pass.Signed-off-by: Yimo Jiang yimoj@nvidia.com
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Bug Fixes
Tests