fix: validate messaging compatible endpoint routing#2849

Merged
cv merged 13 commits into main from fix/telegram-compatible-endpoint-2766
May 1, 2026

Conversation

@ericksoa ericksoa commented May 1, 2026

Summary

Adds a targeted guard for the Telegram plus OpenAI-compatible endpoint regression so onboarding validates the managed inference.local route before a messaging-triggered agent turn fails later with a generic network error.

Related Issue

Fixes #2766

Changes

  • Preserve the managed OpenClaw provider shape for compatible endpoints and assert it in config/inference tests.
  • Add a post-onboard compatible-endpoint smoke check that verifies gateway provider config and performs a bounded sandbox-side chat completion through https://inference.local/v1/chat/completions.
  • Improve Telegram startup and agent-turn diagnostics so provider readiness and inference failures are logged separately without leaking credentials.
  • Add a hermetic messaging-compatible-endpoint-e2e branch-validation job with a local OpenAI-compatible mock endpoint.
  • Remove the PR-template disclosure checkbox and the contributor-skill instruction that forced agents to add it.
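The smoke check described in the second bullet can be pictured as a small request/response contract. The sketch below is illustrative only: `buildSmokeRequest`, `evaluateSmokeResponse`, the token cap, and the timeout are hypothetical and are not the helpers in src/lib/onboard.ts. Only the https://inference.local/v1/chat/completions URL comes from this PR, and the probe prompt is borrowed from the review suggestion further down.

```typescript
// Hypothetical sketch of a bounded smoke check; not the code in src/lib/onboard.ts.
const SMOKE_URL = "https://inference.local/v1/chat/completions";

interface SmokeRequest {
  url: string;
  timeoutMs: number;
  body: {
    model: string;
    max_tokens: number;
    messages: { role: string; content: string }[];
  };
}

// Build a small, bounded request: short prompt, low token cap, explicit timeout.
// The 15 s timeout and 16-token cap are assumed values for illustration.
function buildSmokeRequest(model: string, timeoutMs = 15000): SmokeRequest {
  return {
    url: SMOKE_URL,
    timeoutMs,
    body: {
      model,
      max_tokens: 16,
      messages: [{ role: "user", content: "Reply with exactly: PONG" }],
    },
  };
}

type SmokeResult = { ok: true } | { ok: false; reason: string };

// Decide PASS/FAIL from the HTTP status and raw body of an OpenAI-compatible response.
function evaluateSmokeResponse(status: number, rawBody: string): SmokeResult {
  if (status < 200 || status >= 300) return { ok: false, reason: `HTTP ${status}` };
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody);
  } catch {
    return { ok: false, reason: "response is not JSON" };
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { ok: false, reason: "response is not an object" };
  }
  const choices = (parsed as { choices?: unknown }).choices;
  if (!Array.isArray(choices) || choices.length === 0) {
    return { ok: false, reason: "no choices in response" };
  }
  const content = (choices[0] as { message?: { content?: unknown } } | null)?.message?.content;
  if (typeof content !== "string" || content.trim() === "") {
    return { ok: false, reason: "empty assistant content" };
  }
  return { ok: true };
}
```

Separating request construction from response evaluation keeps the PASS/FAIL decision testable without a sandbox or network access.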

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • make docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

Summary by CodeRabbit

  • New Features

    • Messaging-compatible-endpoint support with onboarding smoke validation and an option to skip Telegram reachability in non-interactive runs.
  • Tests

    • New hermetic end-to-end messaging-compatible-endpoint script, expanded unit/e2e assertions, and selectable nightly test execution.
  • Chores

    • Added nightly E2E workflow job and startup diagnostics to surface Telegram provider readiness and targeted failure breadcrumbs.
  • Documentation

    • Removed AI disclosure requirement from PR/contributor templates.

@ericksoa ericksoa self-assigned this May 1, 2026

coderabbitai Bot commented May 1, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds a messaging-compatible-endpoint E2E job and hermetic test script, onboarding smoke checks that validate managed inference routing and /chat/completions, a Telegram diagnostics Node preload injected at startup, test coverage additions, and PR template removals for AI disclosure.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| CI / Workflow | .coderabbit.yaml, .github/workflows/nightly-e2e.yaml | Adds messaging-compatible-endpoint-e2e mapping and new GitHub Actions job; wires job into workflow dispatch and downstream needs for reporting and aggregation. |
| Onboarding & Inference Config | src/lib/onboard.ts, src/lib/inference-config.test.ts | Adds compatible-endpoint sandbox smoke checks (builds/runs sandbox script that validates managed provider presence, openclaw.json wiring, and /chat/completions response); strengthens inference-config test to assert full selection object. |
| Startup Diagnostics | scripts/nemoclaw-start.sh | Installs a Telegram-specific Node preload when Telegram is configured; preload probes api.telegram.org, emits provider-ready breadcrumbs, monitors stderr for inference/agent failure patterns, and is included in root/non-root flows and permission validation. |
| E2E Tests & Scripts | test/e2e/brev-e2e.test.ts, test/e2e/test-messaging-compatible-endpoint.sh | Adds E2E suite entry for messaging-compatible-endpoint and a hermetic script that boots an OpenAI-compatible mock, provisions a sandbox with compatible-endpoint provider, validates onboarding/openclaw.json, performs sandboxed inference to inference.local, and verifies authenticated mock traffic. |
| Unit / Test Coverage | test/nemoclaw-start.test.ts, test/onboard.test.ts, test/generate-openclaw-config.test.ts | New tests covering Telegram diagnostics installation and logs, compatible-endpoint smoke script generation/conditions, and generated OpenClaw config for a managed-compatible provider. |
| Templates / Docs | .agents/.../SKILL.md, .github/PULL_REQUEST_TEMPLATE.md | Removes the AI Disclosure section and related checklist requirement from PR templates. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Onboard as Onboarding Flow
    participant Gateway
    participant Sandbox as NEMOCLAW Sandbox
    participant Inference as inference.local
    participant Mock as OpenAI Mock Server

    User->>Onboard: Start onboarding (compatible-endpoint)
    Onboard->>Gateway: Query provider presence
    Onboard->>Sandbox: Generate & run smoke test script
    Sandbox->>Sandbox: Read /sandbox/.openclaw/openclaw.json
    Sandbox->>Inference: POST /v1/chat/completions
    Inference->>Mock: Forward authenticated request
    Mock-->>Inference: Return mock response with marker
    Inference-->>Sandbox: Return response
    Sandbox->>Onboard: Report smoke PASS/FAIL
    Onboard-->>User: Continue or abort onboarding
sequenceDiagram
    participant E2E as E2E Runner
    participant Mock as OpenAI Mock Server
    participant Provisioner as NEMOCLAW Provisioner
    participant Sandbox as NEMOCLAW Sandbox
    participant Inference as inference.local
    participant Telegram as Telegram API

    E2E->>Mock: Boot mock server (Bearer auth)
    E2E->>Provisioner: Provision sandbox with compatible-endpoint provider
    Provisioner->>Sandbox: Initialize openclaw.json (managed provider -> inference.local)
    E2E->>Sandbox: Check onboard logs for smoke marker
    E2E->>Sandbox: Inspect openclaw.json provider shape
    Sandbox->>Inference: Call /v1/chat/completions
    Inference->>Mock: Authenticated POST logged
    Mock-->>Inference: Return mock assistant content
    Sandbox-->>E2E: Validate assistant content
    E2E->>Telegram: (optional) perform Telegram round-trip
    E2E-->>E2E: Aggregate PASS/FAIL/SKIP
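
The hermetic flow in the diagram above depends on a mock that only answers authenticated requests and records that it was reached. A minimal sketch of such a handler follows, assuming a hypothetical `handleMockChatCompletion` function, key, and marker string; the actual mock used by test/e2e/test-messaging-compatible-endpoint.sh is not shown in this conversation and may differ.

```typescript
// Hypothetical mock handler for an OpenAI-compatible /v1/chat/completions route.
// Names and response fields are illustrative assumptions, not the PR's script.
interface MockResponse {
  status: number;
  body: string;
}

function handleMockChatCompletion(
  headers: Record<string, string | undefined>,
  expectedKey: string,
  marker: string,
  requestLog: string[],
): MockResponse {
  const auth = headers["authorization"] ?? "";
  if (auth !== `Bearer ${expectedKey}`) {
    // Record the rejection without echoing the presented credential.
    requestLog.push("rejected unauthenticated request");
    return { status: 401, body: JSON.stringify({ error: { message: "unauthorized" } }) };
  }
  // Log only that an authenticated POST arrived, never the key itself.
  requestLog.push("authenticated POST /v1/chat/completions");
  return {
    status: 200,
    body: JSON.stringify({
      object: "chat.completion",
      choices: [
        { index: 0, message: { role: "assistant", content: marker }, finish_reason: "stop" },
      ],
    }),
  };
}
```

Returning a fixed marker as the assistant content lets the E2E runner confirm that the reply it sees in the sandbox really travelled through the mock.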

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I poked the logs and tapped the shell tonight,
a telegram blinked and diagnostics took flight.
A mock reply, a sandbox cheer, a tidy PASS,
smoke checks green, telemetry in glass.
Hooray — the rabbit danced, all systems bright!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 2.94%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix: validate messaging compatible endpoint routing' directly summarizes the main change: adding validation for messaging-compatible endpoint routing through a sandbox smoke check and improved diagnostics. |
| Linked Issues check | ✅ Passed | The PR implements core requirements from #2766: validates managed provider config in onboarding [src/lib/onboard.ts, test/onboard.test.ts], adds sandbox-side inference smoke check [src/lib/onboard.ts, test/e2e/test-messaging-compatible-endpoint.sh], improves Telegram diagnostics [scripts/nemoclaw-start.sh, test/nemoclaw-start.test.ts], and includes E2E regression testing [.github/workflows/nightly-e2e.yaml, test/e2e/brev-e2e.test.ts]. |
| Out of Scope Changes check | ✅ Passed | All changes align with #2766 objectives: config validation, Telegram diagnostics, inference smoke testing, and E2E job setup. The PR template removals (removing AI disclosure checkbox) are minor administrative changes unrelated to the core issue but documented in PR summary. |

Warning

Review ran into problems

🔥 Problems

Timed out fetching pipeline failures after 30000ms



@ericksoa ericksoa added the bug, Integration: Telegram, and v0.0.33 labels May 1, 2026

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (2)
test/nemoclaw-start.test.ts (1)

1129-1159: ⚡ Quick win

Exercise the Telegram preload behavior, not just its source text.

These assertions prove the heredoc is present, but they will not catch a broken sanitizer or a misleading startup-order breadcrumb. This file already executes extracted preloads for the safety net and Slack guard; adding one behavior-level Telegram test here would lock down the critical runtime behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/nemoclaw-start.test.ts` around lines 1129 - 1159, Add a behavior-level
test that actually executes the Telegram diagnostics preload instead of only
checking its text: locate the extracted heredoc for _TELEGRAM_DIAGNOSTICS_SCRIPT
in the test src, write its contents to a temp file, ensure permissions via the
same validate_tmp_permissions flow, then invoke the entrypoint path that calls
install_telegram_diagnostics (the same path matched by
configure_messaging_channels -> install_telegram_diagnostics) or simulate the
connect-shell sourcing behavior so the temp script is required
(NODE_OPTIONS/--require) and assert a side effect (env var, temp file touch, or
stdout marker) the preload sets; this verifies the sanitizer and startup
ordering actually run rather than just existing in the source.
test/onboard.test.ts (1)

292-302: ⚡ Quick win

Also assert the smoke marker/prompt contract in the generated script

This test already checks routing and secret-safety well; adding assertions for the explicit probe prompt and marker would better lock the intended smoke behavior.

Suggested addition
   it("builds a compatible-endpoint smoke script that validates managed inference config", () => {
     const script = buildCompatibleEndpointSandboxSmokeScript("deepseek-ai/DeepSeek-V4-Flash");
 
     assert.match(script, /models\.providers\.inference/);
     assert.match(script, /https:\/\/inference\.local\/v1/);
     assert.match(script, /apiKey.*unused/);
     assert.match(script, /agents\.defaults\.model\.primary/);
+    assert.match(script, /Reply with exactly: PONG/);
+    assert.match(script, /INFERENCE_SMOKE_OK/);
     assert.match(script, /curl[\s\S]*\/chat\/completions/);
     assert.doesNotMatch(script, /COMPATIBLE_API_KEY/);
     assert.doesNotMatch(script, /api\.deepinfra\.com/);
   });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/onboard.test.ts` around lines 292 - 302, The test for
buildCompatibleEndpointSandboxSmokeScript should also assert the smoke probe
prompt and marker contract are present in the generated script; update the test
(in onboard.test.ts) that calls
buildCompatibleEndpointSandboxSmokeScript("deepseek-ai/DeepSeek-V4-Flash") to
add assertions that the script contains the explicit probe prompt text and the
smoke marker/identifier used by the runtime (e.g., the probe prompt string and
the SMOKE marker token your system expects), so add assert.match checks against
those two unique patterns (the probe prompt and the smoke marker) to lock the
intended smoke behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/nemoclaw-start.sh`:
- Around line 866-872: The sanitize function's final regex only matches tokens
composed of a limited charset and stops at colons, leaking colon-delimited
tokens (e.g., Telegram tokens); update the replace call in sanitize (function
sanitize) to consume the entire token value up to whitespace or quotes by
replacing the right-hand character class with a greedy "not whitespace or quote"
pattern (e.g., use [^\s"']+ or similar) so patterns like
(api[_-]?key|token|authorization)["'=:\s]+[^\s"']+ are fully redacted; keep the
/gi flags for case-insensitivity.
- Around line 862-949: The stderr hook can emit a Telegram-specific failure
before Telegram is known ready; change the logic in process.stderr.write (the
handler that checks inferenceLogged and tests text for LLM errors) to only set
inferenceLogged and call emit(...) when readyLogged is true (i.e., gate the
breadcrumb on readyLogged). Locate process.stderr.write, describeRequest,
maybeLogTelegramReady and readyLogged/inferenceLogged, and add the check (skip
early if !readyLogged) so the “[telegram] [default] agent turn failed after
provider startup…” message is only emitted after maybeLogTelegramReady has set
readyLogged.

In `@src/lib/onboard.ts`:
- Around line 8541-8551: The smoke check is using session/remembered channels
(selectedMessagingChannels/recordedMessagingChannels) which can include channels
disabled by createSandbox; change the source to the sandbox's active channels by
retrieving registry.getSandbox(sandboxName)?.messagingChannels and pass that (or
an empty array if undefined) as messagingChannels into
verifyCompatibleEndpointSandboxSmoke instead of smokeMessagingChannels built
from selectedMessagingChannels/recordedMessagingChannels; keep the call to
verifyCompatibleEndpointSandboxSmoke and ensure
sandboxName/provider/model/endpointUrl/credentialEnv/agent are preserved.

In `@test/onboard.test.ts`:
- Around line 44-45: The runtime guard is missing checks for the two new helpers
on OnboardTestInternals; update isOnboardTestInternals to validate that
buildCompatibleEndpointSandboxSmokeScript and buildSandboxConfigSyncScript exist
and are functions (or match their expected types), so the guard fails early with
a clear error instead of causing "is not a function" at runtime when callers of
those helpers (e.g., where OnboardTestInternals is used) invoke them; locate the
isOnboardTestInternals function and add explicit type checks for these two
symbols to keep the guard in sync with the OnboardTestInternals interface.
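
The runtime-guard finding for test/onboard.test.ts can be illustrated with a small type guard. This is a sketch under assumptions: the real `OnboardTestInternals` interface has more members than shown, the parameter list of `buildSandboxConfigSyncScript` is not visible in this conversation (it is typed loosely here), and `REQUIRED_HELPERS` is a hypothetical name.

```typescript
// Trimmed, assumed shape: only the two helpers named in the review comment.
interface OnboardTestInternals {
  buildCompatibleEndpointSandboxSmokeScript: (model: string) => string;
  // Actual signature unknown; typed loosely for illustration.
  buildSandboxConfigSyncScript: (...args: unknown[]) => string;
}

// Hypothetical list kept next to the interface so the guard stays in sync with it.
const REQUIRED_HELPERS: readonly (keyof OnboardTestInternals)[] = [
  "buildCompatibleEndpointSandboxSmokeScript",
  "buildSandboxConfigSyncScript",
];

// Fail early if any expected helper is missing or is not callable.
function isOnboardTestInternals(value: unknown): value is OnboardTestInternals {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return REQUIRED_HELPERS.every((name) => typeof candidate[name] === "function");
}
```

With a guard like this, a missing export surfaces as one clear guard failure rather than a later "is not a function" error inside an individual test.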

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c5b1e550-4cd0-4cbd-961a-3368d3021789

📥 Commits

Reviewing files that changed from the base of the PR and between f9d21af and 93b823d.

📒 Files selected for processing (10)
  • .coderabbit.yaml
  • .github/workflows/nightly-e2e.yaml
  • scripts/nemoclaw-start.sh
  • src/lib/inference-config.test.ts
  • src/lib/onboard.ts
  • test/e2e/brev-e2e.test.ts
  • test/e2e/test-messaging-compatible-endpoint.sh
  • test/generate-openclaw-config.test.ts
  • test/nemoclaw-start.test.ts
  • test/onboard.test.ts

github-actions Bot commented May 1, 2026

Selective E2E Results — ✅ All requested jobs passed

Run: 25224181432
Branch: fix/telegram-compatible-endpoint-2766
Requested jobs: messaging-compatible-endpoint-e2e
Summary: 0 passed, 0 failed, 21 skipped

Job Result
cloud-e2e ⏭️ skipped
cloud-inference-e2e ⏭️ skipped
cloud-onboard-e2e ⏭️ skipped
deployment-services-e2e ⏭️ skipped
diagnostics-e2e ⏭️ skipped
docs-validation-e2e ⏭️ skipped
gpu-e2e ⏭️ skipped
hermes-e2e ⏭️ skipped
inference-routing-e2e ⏭️ skipped
messaging-compatible-endpoint-e2e ⚠️ cancelled
messaging-providers-e2e ⏭️ skipped
network-policy-e2e ⏭️ skipped
overlayfs-autofix-e2e ⏭️ skipped
rebuild-hermes-e2e ⏭️ skipped
rebuild-openclaw-e2e ⏭️ skipped
sandbox-operations-e2e ⏭️ skipped
sandbox-survival-e2e ⏭️ skipped
shields-config-e2e ⏭️ skipped
skill-agent-e2e ⏭️ skipped
snapshot-commands-e2e ⏭️ skipped
token-rotation-e2e ⏭️ skipped
upgrade-stale-sandbox-e2e ⏭️ skipped

github-actions Bot commented May 1, 2026

Selective E2E Results — ❌ Some jobs failed

Run: 25224419429
Branch: fix/telegram-compatible-endpoint-2766
Requested jobs: messaging-compatible-endpoint-e2e
Summary: 0 passed, 1 failed, 21 skipped

Job Result
cloud-e2e ⏭️ skipped
cloud-inference-e2e ⏭️ skipped
cloud-onboard-e2e ⏭️ skipped
deployment-services-e2e ⏭️ skipped
diagnostics-e2e ⏭️ skipped
docs-validation-e2e ⏭️ skipped
gpu-e2e ⏭️ skipped
hermes-e2e ⏭️ skipped
inference-routing-e2e ⏭️ skipped
messaging-compatible-endpoint-e2e ❌ failure
messaging-providers-e2e ⏭️ skipped
network-policy-e2e ⏭️ skipped
overlayfs-autofix-e2e ⏭️ skipped
rebuild-hermes-e2e ⏭️ skipped
rebuild-openclaw-e2e ⏭️ skipped
sandbox-operations-e2e ⏭️ skipped
sandbox-survival-e2e ⏭️ skipped
shields-config-e2e ⏭️ skipped
skill-agent-e2e ⏭️ skipped
snapshot-commands-e2e ⏭️ skipped
token-rotation-e2e ⏭️ skipped
upgrade-stale-sandbox-e2e ⏭️ skipped

Failed jobs: messaging-compatible-endpoint-e2e. Check run artifacts for logs.

@coderabbitai coderabbitai Bot left a comment

♻️ Duplicate comments (1)
src/lib/onboard.ts (1)

8552-8561: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use active registry messaging channels for smoke gating

At Line 8552, smoke gating still falls back to selected/session channels. If a channel is remembered but disabled in the created/reused sandbox, this can incorrectly run the hard-fail smoke check and abort onboarding.

Suggested fix
-    const smokeMessagingChannels =
-      selectedMessagingChannels.length > 0 ? selectedMessagingChannels : recordedMessagingChannels;
+    const activeRegistryChannels = registry.getSandbox(sandboxName)?.messagingChannels;
+    const smokeMessagingChannels = Array.isArray(activeRegistryChannels)
+      ? activeRegistryChannels
+      : selectedMessagingChannels.length > 0
+        ? selectedMessagingChannels
+        : recordedMessagingChannels;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/onboard.ts` around lines 8552 - 8561, The smoke gating currently uses
smokeMessagingChannels which falls back to selectedMessagingChannels or
recordedMessagingChannels and can include channels that are disabled in the
created/reused sandbox; change the call to verifyCompatibleEndpointSandboxSmoke
to pass the sandbox's active messaging channels instead (derive an
activeMessagingChannels list from the sandbox/registry state used when
creating/reusing the sandbox, filtering out any channels marked disabled) so the
smoke check uses only channels actually enabled in the sandbox (replace usage of
smokeMessagingChannels with activeMessagingChannels and ensure any helper that
reads recordedMessagingChannels/selectedMessagingChannels is updated to prefer
sandbox/registry active channels).
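
The fallback order in the suggested fix can be expressed as a small pure function. This is a sketch only: `resolveSmokeChannels` is a hypothetical helper name, and the first argument stands in for `registry.getSandbox(sandboxName)?.messagingChannels`.

```typescript
// Hypothetical helper mirroring the suggested fix: prefer the sandbox's active
// channels from the registry, then the current selection, then remembered channels.
function resolveSmokeChannels(
  activeRegistryChannels: string[] | undefined,
  selectedMessagingChannels: string[],
  recordedMessagingChannels: string[],
): string[] {
  // An empty array from the registry is authoritative: no channels are active.
  if (Array.isArray(activeRegistryChannels)) return activeRegistryChannels;
  return selectedMessagingChannels.length > 0
    ? selectedMessagingChannels
    : recordedMessagingChannels;
}
```

The important case is an empty registry list: it must win over a remembered "telegram" entry so that a disabled channel does not trigger the hard-fail smoke check.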

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 85e66d73-035e-428d-84fc-be7808baa97f

📥 Commits

Reviewing files that changed from the base of the PR and between 655d6cd and 64ce905.

📒 Files selected for processing (2)
  • src/lib/onboard.ts
  • test/onboard.test.ts

github-actions Bot commented May 1, 2026

Selective E2E Results — ❌ Some jobs failed

Run: 25225179236
Branch: fix/telegram-compatible-endpoint-2766
Requested jobs: messaging-compatible-endpoint-e2e
Summary: 0 passed, 1 failed, 21 skipped

Job Result
cloud-e2e ⏭️ skipped
cloud-inference-e2e ⏭️ skipped
cloud-onboard-e2e ⏭️ skipped
deployment-services-e2e ⏭️ skipped
diagnostics-e2e ⏭️ skipped
docs-validation-e2e ⏭️ skipped
gpu-e2e ⏭️ skipped
hermes-e2e ⏭️ skipped
inference-routing-e2e ⏭️ skipped
messaging-compatible-endpoint-e2e ❌ failure
messaging-providers-e2e ⏭️ skipped
network-policy-e2e ⏭️ skipped
overlayfs-autofix-e2e ⏭️ skipped
rebuild-hermes-e2e ⏭️ skipped
rebuild-openclaw-e2e ⏭️ skipped
sandbox-operations-e2e ⏭️ skipped
sandbox-survival-e2e ⏭️ skipped
shields-config-e2e ⏭️ skipped
skill-agent-e2e ⏭️ skipped
snapshot-commands-e2e ⏭️ skipped
token-rotation-e2e ⏭️ skipped
upgrade-stale-sandbox-e2e ⏭️ skipped

Failed jobs: messaging-compatible-endpoint-e2e. Check run artifacts for logs.

github-actions Bot commented May 1, 2026

Selective E2E Results — ✅ All requested jobs passed

Run: 25225962098
Branch: fix/telegram-compatible-endpoint-2766
Requested jobs: messaging-compatible-endpoint-e2e
Summary: 1 passed, 0 failed, 21 skipped

Job Result
cloud-e2e ⏭️ skipped
cloud-inference-e2e ⏭️ skipped
cloud-onboard-e2e ⏭️ skipped
deployment-services-e2e ⏭️ skipped
diagnostics-e2e ⏭️ skipped
docs-validation-e2e ⏭️ skipped
gpu-e2e ⏭️ skipped
hermes-e2e ⏭️ skipped
inference-routing-e2e ⏭️ skipped
messaging-compatible-endpoint-e2e ✅ success
messaging-providers-e2e ⏭️ skipped
network-policy-e2e ⏭️ skipped
overlayfs-autofix-e2e ⏭️ skipped
rebuild-hermes-e2e ⏭️ skipped
rebuild-openclaw-e2e ⏭️ skipped
sandbox-operations-e2e ⏭️ skipped
sandbox-survival-e2e ⏭️ skipped
shields-config-e2e ⏭️ skipped
skill-agent-e2e ⏭️ skipped
snapshot-commands-e2e ⏭️ skipped
token-rotation-e2e ⏭️ skipped
upgrade-stale-sandbox-e2e ⏭️ skipped

github-actions Bot commented May 1, 2026

Selective E2E Results — ✅ All requested jobs passed

Run: 25226844478
Branch: fix/telegram-compatible-endpoint-2766
Requested jobs: all (no filter)
Summary: 0 passed, 0 failed, 1 skipped

Job Result
cloud-e2e ⚠️ cancelled
cloud-inference-e2e ⚠️ cancelled
cloud-onboard-e2e ⚠️ cancelled
deployment-services-e2e ⚠️ cancelled
diagnostics-e2e ⚠️ cancelled
docs-validation-e2e ⚠️ cancelled
gpu-e2e ⏭️ skipped
hermes-e2e ⚠️ cancelled
inference-routing-e2e ⚠️ cancelled
messaging-compatible-endpoint-e2e ⚠️ cancelled
messaging-providers-e2e ⚠️ cancelled
network-policy-e2e ⚠️ cancelled
overlayfs-autofix-e2e ⚠️ cancelled
rebuild-hermes-e2e ⚠️ cancelled
rebuild-openclaw-e2e ⚠️ cancelled
sandbox-operations-e2e ⚠️ cancelled
sandbox-survival-e2e ⚠️ cancelled
shields-config-e2e ⚠️ cancelled
skill-agent-e2e ⚠️ cancelled
snapshot-commands-e2e ⚠️ cancelled
token-rotation-e2e ⚠️ cancelled
upgrade-stale-sandbox-e2e ⚠️ cancelled

@coderabbitai coderabbitai Bot left a comment

♻️ Duplicate comments (2)
scripts/nemoclaw-start.sh (2)

948-956: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Only emit the inference breadcrumb after Telegram is actually ready.

Line 948 marks the provider as started on the plain "starting provider" log line, so the next generic "LLM request failed" error in this process is still attributed to Telegram even if no Bot API call ever succeeded. Gate Line 951 on readyLogged instead so that startup failures and post-ready inference failures stay distinct.

Suggested fix
-      if (providerStarted && /Embedded agent failed before reply|LLM request failed|FailoverError/i.test(text)) {
+      if (readyLogged && /Embedded agent failed before reply|LLM request failed|FailoverError/i.test(text)) {
         inferenceLogged = true;
         var line = text.split(/\r?\n/).find(function (entry) {
           return /Embedded agent failed before reply|LLM request failed|FailoverError/i.test(entry);
         }) || text;
-        emit('[telegram] [default] agent turn failed after provider startup; inference error: ' + sanitize(line).slice(0, 600));
+        emit('[telegram] [default] agent turn failed after provider ready; inference error: ' + sanitize(line).slice(0, 600));
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/nemoclaw-start.sh` around lines 948 - 956, The current inference
breadcrumb uses providerStarted to decide when to emit failures, which falsely
attributes pre-ready LLM failures to Telegram; change the guard to require
readyLogged instead of providerStarted (i.e., only set inferenceLogged and call
emit(...) when readyLogged && /Embedded agent failed before reply|LLM request
failed|FailoverError/i.test(text)); keep the same logic for extracting the
matching line via text.split(...).find(...) and the sanitize(...).slice(0,600)
call, and ensure any tests or downstream logic that relied on inferenceLogged
still behave the same.
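
The gating described in this comment amounts to a two-flag state machine. The sketch below is an illustration rather than the preload in scripts/nemoclaw-start.sh: `createTelegramDiagnostics`, `markReady`, and the "provider ready" breadcrumb text are hypothetical, sanitization is omitted for brevity, and the failure message follows the wording in the suggested fix.

```typescript
// Patterns taken from the suggested fix; no "g" flag, so test() carries no state.
const FAILURE_PATTERN = /Embedded agent failed before reply|LLM request failed|FailoverError/i;

// Hypothetical factory: emits the inference breadcrumb at most once, and only
// after readiness has been recorded.
function createTelegramDiagnostics(emit: (line: string) => void) {
  let readyLogged = false;
  let inferenceLogged = false;
  return {
    // Call after the first successful Bot API round-trip (assumed trigger).
    markReady(): void {
      if (readyLogged) return;
      readyLogged = true;
      emit("[telegram] [default] provider ready");
    },
    // Call for each chunk written to stderr.
    onStderr(text: string): void {
      if (!readyLogged || inferenceLogged || !FAILURE_PATTERN.test(text)) return;
      inferenceLogged = true;
      const line = text.split(/\r?\n/).find((entry) => FAILURE_PATTERN.test(entry)) ?? text;
      emit(
        "[telegram] [default] agent turn failed after provider ready; inference error: " +
          line.slice(0, 600),
      );
    },
  };
}
```

A failure seen before `markReady()` is simply ignored by this hook, which leaves startup problems to be reported by whatever handles provider startup.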

871-874: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Consume the full Bearer token in sanitize().

Line 871 still stops at a colon, and Line 873 only redacts up to the first whitespace after "authorization:". A line like "Authorization: Bearer 123456:ABC" will still leak the suffix. Make the Bearer matcher consume the whole token, or fold "Bearer <token>" into the authorization scrubber as well.

Suggested fix
-    text = text.replace(/Bearer\s+[A-Za-z0-9._~+\/=-]+/g, 'Bearer <redacted>');
+    text = text.replace(/Bearer\s+[^"'\s,)]+/gi, 'Bearer <redacted>');
     text = text.replace(
-      /\b(api[_-]?key|token|authorization)\b(["']?\s*[:=]\s*["']?)[^"'\s,)]+/gi,
+      /\b(api[_-]?key|token|authorization)\b(["']?\s*[:=]\s*["']?)(?:Bearer\s+)?[^"'\s,)]+/gi,
       '$1$2<redacted>'
     );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/nemoclaw-start.sh` around lines 871 - 874, The Bearer-redaction regex
only matches a limited char set and stops at ":" (e.g. "Authorization: Bearer
123:ABC"), so update the sanitizer to consume the entire token by replacing the
first regex with one that uses a non-whitespace match (for example use
text.replace(/Bearer\s+\S+/gi, 'Bearer <redacted>')) or alternatively fold
Bearer handling into the authorization/api key scrubber by expanding the second
regex to also match "Bearer <token>" patterns; target the two occurrences shown
(the line that calls text.replace(/Bearer\s+[A-Za-z0-9._~+\/=-]+/g, ...) and the
authorization/api_key/token scrubber) and ensure the replacement uses
'<redacted>' for the whole token.
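
To make the leak concrete, the two regex pairs from the suggested fix can be run side by side. The wrappers `sanitizeBefore` and `sanitizeAfter` are hypothetical names for this sketch; the regular expressions are copied from the diff above, and the real sanitize() in scripts/nemoclaw-start.sh may contain additional steps.

```typescript
// Regexes as they appear on the "-" lines of the suggested fix.
function sanitizeBefore(text: string): string {
  return text
    .replace(/Bearer\s+[A-Za-z0-9._~+\/=-]+/g, "Bearer <redacted>")
    .replace(
      /\b(api[_-]?key|token|authorization)\b(["']?\s*[:=]\s*["']?)[^"'\s,)]+/gi,
      "$1$2<redacted>",
    );
}

// Regexes as they appear on the "+" lines of the suggested fix.
function sanitizeAfter(text: string): string {
  return text
    .replace(/Bearer\s+[^"'\s,)]+/gi, "Bearer <redacted>")
    .replace(
      /\b(api[_-]?key|token|authorization)\b(["']?\s*[:=]\s*["']?)(?:Bearer\s+)?[^"'\s,)]+/gi,
      "$1$2<redacted>",
    );
}

const sample = "Authorization: Bearer 123456:ABC";
// The old character class stops at ":", so the ":ABC" suffix survives.
const leaked = sanitizeBefore(sample);
// The new patterns consume the whole token and yield "Authorization: <redacted>".
const redacted = sanitizeAfter(sample);
```

The difference comes from the character class: the original Bearer pattern excludes ":", while the replacement matches everything up to whitespace, a quote, a comma, or a closing parenthesis.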

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c04366ed-2338-4a6b-a381-3ca55c5c022a

📥 Commits

Reviewing files that changed from the base of the PR and between bab222f and 7e81b39.

📒 Files selected for processing (4)
  • scripts/nemoclaw-start.sh
  • src/lib/onboard.ts
  • test/nemoclaw-start.test.ts
  • test/onboard.test.ts
✅ Files skipped from review due to trivial changes (1)
  • test/nemoclaw-start.test.ts

@github-actions
Contributor

github-actions Bot commented May 1, 2026

Selective E2E Results — ❌ Some jobs failed

Run: 25226856804
Branch: fix/telegram-compatible-endpoint-2766
Requested jobs: all (no filter)
Summary: 20 passed, 1 failed, 1 skipped

Job Result
cloud-e2e ❌ failure
cloud-inference-e2e ✅ success
cloud-onboard-e2e ✅ success
deployment-services-e2e ✅ success
diagnostics-e2e ✅ success
docs-validation-e2e ✅ success
gpu-e2e ⏭️ skipped
hermes-e2e ✅ success
inference-routing-e2e ✅ success
messaging-compatible-endpoint-e2e ✅ success
messaging-providers-e2e ✅ success
network-policy-e2e ✅ success
overlayfs-autofix-e2e ✅ success
rebuild-hermes-e2e ✅ success
rebuild-openclaw-e2e ✅ success
sandbox-operations-e2e ✅ success
sandbox-survival-e2e ✅ success
shields-config-e2e ✅ success
skill-agent-e2e ✅ success
snapshot-commands-e2e ✅ success
token-rotation-e2e ✅ success
upgrade-stale-sandbox-e2e ✅ success

Failed jobs: cloud-e2e. Check run artifacts for logs.

@github-actions
Contributor

github-actions Bot commented May 1, 2026

Selective E2E Results — ✅ All requested jobs passed

Run: 25228189442
Branch: fix/telegram-compatible-endpoint-2766
Requested jobs: all (no filter)
Summary: 14 passed, 0 failed, 1 skipped

Job Result
cloud-e2e ✅ success
cloud-inference-e2e ✅ success
cloud-onboard-e2e ✅ success
deployment-services-e2e ⚠️ cancelled
diagnostics-e2e ⚠️ cancelled
docs-validation-e2e ✅ success
gpu-e2e ⏭️ skipped
hermes-e2e ✅ success
inference-routing-e2e ✅ success
messaging-compatible-endpoint-e2e ✅ success
messaging-providers-e2e ✅ success
network-policy-e2e ⚠️ cancelled
overlayfs-autofix-e2e ⚠️ cancelled
rebuild-hermes-e2e ✅ success
rebuild-openclaw-e2e ⚠️ cancelled
sandbox-operations-e2e ⚠️ cancelled
sandbox-survival-e2e ✅ success
shields-config-e2e ✅ success
skill-agent-e2e ✅ success
snapshot-commands-e2e ✅ success
token-rotation-e2e ⚠️ cancelled
upgrade-stale-sandbox-e2e ✅ success

@github-actions
Contributor

github-actions Bot commented May 1, 2026

Selective E2E Results — ✅ All requested jobs passed

Run: 25226856804
Branch: fix/telegram-compatible-endpoint-2766
Requested jobs: all (no filter)
Summary: 21 passed, 0 failed, 1 skipped

Job Result
cloud-e2e ✅ success
cloud-inference-e2e ✅ success
cloud-onboard-e2e ✅ success
deployment-services-e2e ✅ success
diagnostics-e2e ✅ success
docs-validation-e2e ✅ success
gpu-e2e ⏭️ skipped
hermes-e2e ✅ success
inference-routing-e2e ✅ success
messaging-compatible-endpoint-e2e ✅ success
messaging-providers-e2e ✅ success
network-policy-e2e ✅ success
overlayfs-autofix-e2e ✅ success
rebuild-hermes-e2e ✅ success
rebuild-openclaw-e2e ✅ success
sandbox-operations-e2e ✅ success
sandbox-survival-e2e ✅ success
shields-config-e2e ✅ success
skill-agent-e2e ✅ success
snapshot-commands-e2e ✅ success
token-rotation-e2e ✅ success
upgrade-stale-sandbox-e2e ✅ success

@github-actions
Contributor

github-actions Bot commented May 1, 2026

Selective E2E Results — ✅ All requested jobs passed

Run: 25229459569
Branch: fix/telegram-compatible-endpoint-2766
Requested jobs: all (no filter)
Summary: 21 passed, 0 failed, 1 skipped

Job Result
cloud-e2e ✅ success
cloud-inference-e2e ✅ success
cloud-onboard-e2e ✅ success
deployment-services-e2e ✅ success
diagnostics-e2e ✅ success
docs-validation-e2e ✅ success
gpu-e2e ⏭️ skipped
hermes-e2e ✅ success
inference-routing-e2e ✅ success
messaging-compatible-endpoint-e2e ✅ success
messaging-providers-e2e ✅ success
network-policy-e2e ✅ success
overlayfs-autofix-e2e ✅ success
rebuild-hermes-e2e ✅ success
rebuild-openclaw-e2e ✅ success
sandbox-operations-e2e ✅ success
sandbox-survival-e2e ✅ success
shields-config-e2e ✅ success
skill-agent-e2e ✅ success
snapshot-commands-e2e ✅ success
token-rotation-e2e ✅ success
upgrade-stale-sandbox-e2e ✅ success

@github-actions
Contributor

github-actions Bot commented May 1, 2026

Selective E2E Results — ❌ Some jobs failed

Run: 25232604443
Branch: fix/telegram-compatible-endpoint-2766
Requested jobs: all (no filter)
Summary: 20 passed, 1 failed, 1 skipped

Job Result
cloud-e2e ✅ success
cloud-inference-e2e ✅ success
cloud-onboard-e2e ✅ success
deployment-services-e2e ✅ success
diagnostics-e2e ✅ success
docs-validation-e2e ✅ success
gpu-e2e ⏭️ skipped
hermes-e2e ✅ success
inference-routing-e2e ✅ success
messaging-compatible-endpoint-e2e ✅ success
messaging-providers-e2e ✅ success
network-policy-e2e ✅ success
overlayfs-autofix-e2e ✅ success
rebuild-hermes-e2e ✅ success
rebuild-openclaw-e2e ✅ success
sandbox-operations-e2e ✅ success
sandbox-survival-e2e ✅ success
shields-config-e2e ✅ success
skill-agent-e2e ✅ success
snapshot-commands-e2e ✅ success
token-rotation-e2e ❌ failure
upgrade-stale-sandbox-e2e ✅ success

Failed jobs: token-rotation-e2e. Check run artifacts for logs.

@ericksoa
Contributor Author

ericksoa commented May 1, 2026

The token-rotation-e2e failure is a known timeout; we can consider it passed for this purpose.

@ericksoa ericksoa requested a review from cv May 1, 2026 21:33
@cv cv merged commit fd240ff into main May 1, 2026
17 checks passed
cv pushed a commit that referenced this pull request May 2, 2026
## Summary
Daily release-prep documentation refresh for merged PRs from the past 24
hours.
This updates user-facing docs for Telegram mention-only mode, in-sandbox
messaging shutdown, Hermes onboarding/runtime behavior, and
compatible-endpoint smoke validation, then bumps the docs metadata to
0.0.33 after tag v0.0.32.

## Related Issue
None.

## Changes
- #2417 / c7e49ad: Document `TELEGRAM_REQUIRE_MENTION` for Telegram
group-chat replies in `docs/manage-sandboxes/messaging-channels.md` and
`docs/reference/commands.md`.
- #1977 / 69403e0: Update `nemoclaw tunnel stop` and deprecated
`nemoclaw stop` docs to explain that NemoClaw also attempts to stop the
in-sandbox OpenClaw gateway and messaging polling.
- #2781 / b83ffe2, #2859 / 4df8be6, and #2846 / 0968dfd: Refresh the
Hermes quickstart for the default `my-hermes` sandbox name, cross-agent
same-name guard, agent type visibility in `nemoclaw list`, Brave prompt
omission, and supported prebaked Hermes integrations.
- #2849 / fd240ff: Document the Telegram plus OpenAI-compatible
endpoint `inference.local` smoke check in inference options and
troubleshooting.
- Bump `docs/versions1.json` and `docs/project.json` from 0.0.32 to
0.0.33 for daily release preparation.

## Type of Change
- [ ] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [ ] Doc only (prose changes, no code sample modifications)
- [x] Doc only (includes code sample changes)

## Verification
- [ ] `npx prek run --all-files` passes
- [ ] `npm test` passes
- [ ] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [x] Docs updated for user-facing behavior changes
- [x] `make docs` builds without warnings (doc changes only)
- [x] Doc pages follow the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md)
(doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only)

Additional checks run:
- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix
nemoclaw-user --dry-run`
- `git diff --check`
- `make docs` passed with the existing local version-switcher read
message.
- Full `npx prek run --all-files` and `npm test` were skipped for this
doc-only automation run. Commit and pre-push hooks otherwise passed
docs, lint, secret, and conversion checks until the local `Test (skills
YAML)` hook failed because `vitest/config` is not installed in this
fresh worktree.

---
Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>


## Summary by CodeRabbit

* **Documentation**
  * Updated Hermes quickstart: default sandbox name is "hermes"; guidance
    to use distinct sandbox names, note same-name reuse is prevented, Hermes
    wizard does not request Brave Web Search, and sandbox listings now show
    agent type.
  * Clarified provider onboarding: bounded in-sandbox smoke check runs
    when Telegram messaging is enabled.
  * Expanded Telegram docs: added TELEGRAM_REQUIRE_MENTION (DMs still
    governed by TELEGRAM_ALLOWED_IDS), onboarding examples,
    stop-messaging/tunnel behavior, and troubleshooting.
  * Promoted docs to version 0.0.33.

---------

Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
cv added a commit that referenced this pull request May 15, 2026
## Summary
`NEMOCLAW_SANDBOX_READY_TIMEOUT` has been a recognised env var since
#2849, but no documentation accompanied it —
`docs/reference/commands.md`, `docs/reference/troubleshooting.md`, and
the inference / deployment guides only mention the companion
`NEMOCLAW_LOCAL_INFERENCE_TIMEOUT` (added in #1620 and documented at
that time). Operators hitting `Sandbox '<name>' was created but did not
become ready within 180s` have no doc-grep path to the workaround, and
the two timeouts are easy to conflate. This closes the documentation gap
left by #2849.
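
To illustrate why the two knobs are easy to conflate, here is a minimal sketch of reading them independently. It is not NemoClaw's implementation: `readTimeoutSeconds` is a hypothetical helper, the values are assumed to be whole seconds (based on the `within 180s` message), and the fallback used for `NEMOCLAW_LOCAL_INFERENCE_TIMEOUT` is a placeholder because its real default is not stated here.

```javascript
// Hypothetical helper: parse a positive timeout in seconds from an
// env-like object, falling back when the value is unset or invalid.
function readTimeoutSeconds(env, name, fallbackSeconds) {
  const raw = env[name];
  if (raw === undefined || raw === "") return fallbackSeconds;
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed > 0
    ? Math.floor(parsed)
    : fallbackSeconds;
}

// Placeholder only: the actual inference-timeout default is not given above.
const ASSUMED_INFERENCE_FALLBACK_SECONDS = 120;

// Example environment where only the readiness budget was raised.
const env = { NEMOCLAW_SANDBOX_READY_TIMEOUT: "600" };

// Readiness wait: 180 is the default quoted in the timeout message.
const readyTimeout = readTimeoutSeconds(
  env,
  "NEMOCLAW_SANDBOX_READY_TIMEOUT",
  180
);

// Inference probe budget: untouched by the readiness setting.
const inferenceTimeout = readTimeoutSeconds(
  env,
  "NEMOCLAW_LOCAL_INFERENCE_TIMEOUT",
  ASSUMED_INFERENCE_FALLBACK_SECONDS
);
```

Raising one variable leaves the other at its fallback, which is the distinction the new documentation draws.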

Originally tried under #3435; closed because that PR mis-framed the docs
as resolving #3344 / #3416 (the root cause of both was the GPU policy
bug fixed in #3436, not a timeout misconfiguration). The docs themselves
still have value as a follow-up to the env-var introductions, so
reopening as a new PR with the correct framing.

## Related Issue
<!-- Not closing any issue; this addresses the doc-gap surfaced while
investigating #3344 and #3416 (both already fixed in code by #3436). -->

## Changes
- `docs/reference/commands.md`: add `NEMOCLAW_SANDBOX_READY_TIMEOUT` and
`NEMOCLAW_LOCAL_INFERENCE_TIMEOUT` to the Onboard Timeouts table.
- `docs/reference/troubleshooting.md`: new troubleshooting entry
"Sandbox onboard times out with 'did not become ready within Ns'" that
distinguishes the readiness wait from the inference-probe budget, with a
worked example.
- `docs/inference/use-local-inference.md`: cross-link the two timeouts
from the existing `NEMOCLAW_LOCAL_INFERENCE_TIMEOUT` section so readers
of either knob land on the other.
- `docs/deployment/deploy-to-remote-gpu.md`: new "First-Run Readiness
Budget" section calling out DGX Station / cloud-VM /
large-quantised-model conditions that exceed the default and showing how
to raise it.

No code changes — the readiness behaviour is unchanged.

## Type of Change

- [ ] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [x] Doc only (prose changes, no code sample modifications)
- [ ] Doc only (includes code sample changes)

## Verification
- [ ] `npx prek run --all-files` passes
- [ ] `npm test` passes
- [ ] Tests added or updated for new or changed behavior
- [ ] No secrets, API keys, or credentials committed
- [ ] Docs updated for user-facing behavior changes
- [x] `make docs` builds without warnings (doc changes only)
- [x] Doc pages follow the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md)
(doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only)

---
Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

## Summary by CodeRabbit

* **Documentation**
  * Added a “First-Run Readiness Budget” note for remote GPU hosts
    explaining longer initial sandbox build/upload times and advice to
    increase NEMOCLAW_SANDBOX_READY_TIMEOUT.
  * Clarified that NEMOCLAW_LOCAL_INFERENCE_TIMEOUT applies to
    inference-server validation while sandbox readiness uses
    NEMOCLAW_SANDBOX_READY_TIMEOUT (default 180s).
  * Expanded examples for exporting both timeouts and onboarding timeout
    messaging.
  * Added troubleshooting guidance and inspection steps when sandbox
    readiness timeouts delete partial sandboxes.


---------

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>

Labels

bug (Something isn't working)
Integration: Telegram (Use this label to identify Telegram bot integration issues with NemoClaw.)

Development

Successfully merging this pull request may close these issues.

Telegram bot hangs at [telegram] [default] starting provider to later crash

3 participants