test(e2e): add openclaw plugin EXDEV guard for #3513 by jyaunches · Pull Request #3761 · NVIDIA/NemoClaw

jyaunches · 2026-05-18T22:03:30Z

Summary

Adds a failing-test-first regression guard for #3513 / #3127 in regression-e2e.yaml.

The new openclaw-plugin-runtime-exdev-e2e job provisions a fresh sandbox and runs the first in-sandbox openclaw agent invocation. On unfixed code, the job is expected to fail because OpenClaw plugin runtime-dependency installation aborts with errors such as EXDEV: cross-device link not permitted, failed to install bundled runtime deps, or PluginLoadFailureError.

Expected state before the fix

The new job is intentionally expected to fail when it is manually dispatched on main-equivalent code. That failure is the executable acceptance criterion for #3513.

Expected RED fragment:

EXDEV: cross-device link not permitted
failed to install bundled runtime deps
PluginLoadFailureError

Expected GREEN after the fix: openclaw agent --agent main --json ... exits 0 and returns a response without plugin runtime-deps EXDEV errors.
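The RED/GREEN distinction above can be sketched as a log scan. This is a minimal illustration, not the PR's actual script: the function name `log_is_red` and the temporary demo log are hypothetical, and only the three pattern strings come from the expected RED fragment.

```shell
#!/bin/sh
# Sketch: classify an agent log as RED (known plugin runtime-deps failure) or not.
# log_is_red and the demo log are illustrative names, not taken from the PR.

# Patterns copied from the expected RED fragment.
RED_PATTERNS='EXDEV: cross-device link not permitted|failed to install bundled runtime deps|PluginLoadFailureError'

# Returns 0 when the log contains at least one RED pattern, 1 otherwise.
log_is_red() {
  grep -Eq "$RED_PATTERNS" "$1"
}

# Demo: a log holding one line of the RED fragment.
demo_log="$(mktemp)"
printf '%s\n' 'EXDEV: cross-device link not permitted' > "$demo_log"
if log_is_red "$demo_log"; then
  echo "RED: plugin runtime-deps failure detected"
else
  echo "GREEN: no plugin runtime-deps failure patterns"
fi
rm -f "$demo_log"
```

A GREEN verdict additionally requires the agent command to exit 0 and return a response, which this scan alone does not check.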

Workflow placement

Workflow: .github/workflows/regression-e2e.yaml
Job: openclaw-plugin-runtime-exdev-e2e
Not scheduled nightly; run only by manual dispatch until explicitly promoted.

Dispatch:

gh workflow run regression-e2e.yaml --repo NVIDIA/NemoClaw \
  -f jobs=openclaw-plugin-runtime-exdev-e2e \
  --ref test/openclaw-plugin-runtime-exdev-e2e-guard
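The `jobs` input in the dispatch command gates which regression jobs run. The workflow's real selector logic is not shown in this thread, so the following is only a sketch of one way such a gate could work, assuming a comma-separated input. The function `job_selected` is hypothetical; the output name `openclaw_plugin_runtime_exdev` is the one CodeRabbit reports for `select_regression_jobs`.

```shell
#!/bin/sh
# Hypothetical selector: does a job name appear in a comma-separated jobs input?

job_selected() {
  # Strip spaces and tabs so "a, b" and "a,b" behave the same.
  _jobs=$(printf '%s' "$1" | tr -d ' \t')
  # Wrap both sides in commas so only whole names match.
  case ",${_jobs}," in
    *",$2,"*) return 0 ;;
    *) return 1 ;;
  esac
}

jobs_input="openclaw-plugin-runtime-exdev-e2e"
if job_selected "$jobs_input" "openclaw-plugin-runtime-exdev-e2e"; then
  echo "openclaw_plugin_runtime_exdev=true"
else
  echo "openclaw_plugin_runtime_exdev=false"
fi
```

In a GitHub Actions step, the `name=value` line would typically be appended to the file named by `$GITHUB_OUTPUT` rather than printed. How the real workflow treats an empty input is not stated here; this sketch treats it as selecting nothing.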

Related: #3513
Also related: #3127

Summary by CodeRabbit

Tests
- Added an end-to-end test that provisions a fresh sandbox, forces cross-device rename conditions, and verifies plugin runtime staging and a successful first-agent bootstrap; test fails on install/runtime errors or missing expected responses.
Chores
- Added a conditional CI regression job to run the new E2E test selectively; uploads redacted logs/artifacts on failure and exposes a selector output to enable the job.

Adds a failing E2E test that demonstrates the bug tracked by #3513. Until the fix lands, the regression-e2e openclaw-plugin-runtime-exdev-e2e job will fail. This is intentional: the failing test is the executable acceptance criterion. Related: #3513

coderabbitai · 2026-05-18T22:03:42Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.


📝 Walkthrough


Adds a gated regression E2E job (openclaw-plugin-runtime-exdev-e2e) to the CI selector and a new script test/e2e/test-openclaw-plugin-runtime-exdev.sh that provisions a fresh sandbox, onboards NemoClaw, forces cross-device staging, captures/redacts logs, and asserts no EXDEV/plugin-install failures and presence of PONG.
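The log-redaction step mentioned in the walkthrough could look roughly like the sketch below. It is an assumption about the general technique, not the script's actual code: `redact_stream` is a hypothetical helper that replaces every literal occurrence of a secret with `[REDACTED]`, using `awk` with `index()` so that regex metacharacters in the secret need no escaping.

```shell
#!/bin/sh
# Hypothetical helper: mask a literal secret in text read from stdin.

redact_stream() {
  _secret="$1"
  # Nothing to mask: pass the input through unchanged.
  [ -n "$_secret" ] || { cat; return 0; }
  # Hand the secret to awk via the environment so it is never parsed as a pattern.
  SECRET="$_secret" awk '
    BEGIN { s = ENVIRON["SECRET"]; n = length(s) }
    {
      line = $0; out = ""
      # Replace each literal occurrence, left to right.
      while ((i = index(line, s)) > 0) {
        out = out substr(line, 1, i - 1) "[REDACTED]"
        line = substr(line, i + n)
      }
      print out line
    }'
}

# Demo with a made-up token value.
printf 'Authorization: Bearer demo-token\n' | redact_stream 'demo-token'
# prints: Authorization: Bearer [REDACTED]
```

Redacting before upload matters here because the job uploads log artifacts on failure.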

Changes

OpenClaw EXDEV Plugin Runtime-Deps Test

| Layer / File(s) | Summary |
| --- | --- |
| Workflow selector and job registration `.github/workflows/regression-e2e.yaml` | Added `openclaw_plugin_runtime_exdev` output to `select_regression_jobs`, extended valid jobs list to include `openclaw-plugin-runtime-exdev-e2e`, and added the conditional E2E job that runs the new test and uploads `/tmp/nemoclaw-e2e-openclaw-plugin-exdev-*` artifacts on failure. |
| E2E test script for EXDEV validation `test/e2e/test-openclaw-plugin-runtime-exdev.sh` | New Bash E2E: provisions fresh sandbox, verifies Docker and API key, installs NemoClaw if needed, runs `nemoclaw onboard --fresh`, captures filesystem evidence, forces staging to `/dev/shm`, bootstraps `openclaw agent --agent main`, redacts secrets from logs, fails on EXDEV/plugin-install patterns or nonzero exit, and asserts `PONG`. |

Sequence Diagram(s)

sequenceDiagram
  participant GitHubActions
  participant Runner
  participant TestScript as test-openclaw-plugin-runtime-exdev.sh
  participant NemoClawSandbox as NemoClaw_Sandbox
  participant OpenClawAgent as openclaw_agent
  participant ArtifactUploader as Artifact_Uploader

  GitHubActions->>Runner: trigger regression-e2e workflow
  Runner->>TestScript: execute test-openclaw-plugin-runtime-exdev.sh
  TestScript->>NemoClawSandbox: prepare sandbox & onboard (nemoclaw onboard --fresh)
  TestScript->>NemoClawSandbox: run df to capture filesystem layout
  TestScript->>OpenClawAgent: set TMPDIR=/dev/shm and start openclaw agent --agent main
  OpenClawAgent-->>TestScript: stdout logs (PONG or errors)
  Runner->>ArtifactUploader: upload /tmp/nemoclaw-e2e-openclaw-plugin-exdev-* on failure
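The diagram's `df` and `TMPDIR=/dev/shm` steps exist because EXDEV arises when a rename crosses filesystems. As a rough illustration of that condition (not code from the test script), the sketch below compares the device IDs of two directories. The helper names `device_id` and `same_device` are hypothetical, and the `stat` fallback assumes either GNU or BSD `stat` is available.

```shell
#!/bin/sh
# Sketch: check whether two paths sit on the same filesystem device.

# Print the device ID of the filesystem holding a path.
# GNU stat uses -c %d; BSD/macOS stat uses -f %d.
device_id() {
  stat -c %d "$1" 2>/dev/null || stat -f %d "$1"
}

# Returns 0 when both paths report the same device ID.
same_device() {
  [ "$(device_id "$1")" = "$(device_id "$2")" ]
}

staging_dir="${TMPDIR:-/tmp}"
target_dir="$PWD"
if same_device "$staging_dir" "$target_dir"; then
  echo "same device: a rename between these paths should not hit EXDEV"
else
  echo "different devices: a rename between these paths fails with EXDEV"
fi
```

Differing device IDs guarantee that a rename fails with EXDEV. Matching IDs are only a strong hint, since bind mounts of one filesystem can still refuse cross-mount renames. Pointing staging at `/dev/shm`, typically a tmpfs, is how the test forces the cross-device case.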

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

[Ubuntu 22.04][Agent] openclaw CLI fails to start in sandbox — plugin runtime install errors #3513: Reproduces and guards against the EXDEV plugin runtime-deps install failure described in this issue.

Possibly related PRs

NVIDIA/NemoClaw#3411: Similar selector/output additions and gated E2E job wiring in regression-e2e.yaml.
NVIDIA/NemoClaw#3478: Similar selector/output changes adding a gated E2E job and selector output.
NVIDIA/NemoClaw#3595: Adds a conditional E2E selector output in the same workflow selector pattern.

Suggested labels

enhancement: testing, E2E, Integration: OpenClaw, CI/CD

Suggested reviewers

ericksoa
cv

Poem

🐰 I hop into a sandbox, tail aflutter bright,
I start the agent, scrub the logs by night,
No EXDEV tumble, PONG answers clear and strong,
Secrets masked, df whispers where files belong,
A rabbit cheers—the bootstrap sang its song.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'test(e2e): add openclaw plugin EXDEV guard for `#3513`' directly and specifically describes the main changes: adding an E2E test guard for an OpenClaw plugin EXDEV issue. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


github-actions · 2026-05-18T22:04:16Z

E2E Advisor Recommendation

Required E2E: openclaw-plugin-runtime-exdev-e2e
Optional E2E: None

Dispatch hint: openclaw-plugin-runtime-exdev-e2e

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

openclaw-plugin-runtime-exdev-e2e (high): This PR adds the regression workflow job and its backing script, so the new job should be run to validate dispatch selection, required secrets/environment, sandbox onboard, and the EXDEV runtime-deps guard itself.

Optional E2E

None.

New E2E recommendations

None.

Dispatch hint

Workflow: .github/workflows/regression-e2e.yaml
jobs input: openclaw-plugin-runtime-exdev-e2e

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e/test-openclaw-plugin-runtime-exdev.sh`:
- Around line 144-148: The current success check greps for any "PONG" in
AGENT_LOG which can match the prompt and produce false positives; update the
conditional that references AGENT_LOG (the grep -qi 'PONG' check) to instead
search for the explicit response JSON field returned by the agent (e.g., a
pattern matching the response key and value like a JSON pair such as "response":
"PONG" or the exact response_token field your agent emits) so the test asserts
the agent returned the expected JSON response token rather than any occurrence
of PONG in the log.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 09946038-dffd-48f2-86b5-ee0248ed6a10

📥 Commits

Reviewing files that changed from the base of the PR and between 30d1be9 and f6b34db.

📒 Files selected for processing (2)

.github/workflows/regression-e2e.yaml
test/e2e/test-openclaw-plugin-runtime-exdev.sh

coderabbitai

♻️ Duplicate comments (1)

test/e2e/test-openclaw-plugin-runtime-exdev.sh (1)

148-149: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Tighten the success assertion to avoid false positives.

On Line 148, grep -qi 'PONG' can match the prompt text and pass even when the agent response token is missing. Match a response JSON field/value pair instead.

Suggested patch

-if grep -qi 'PONG' "$AGENT_LOG"; then
+if grep -qiE '"(content|text|response)"[[:space:]]*:[[:space:]]*"PONG"' "$AGENT_LOG"; then
   pass "openclaw agent completed without plugin runtime-deps EXDEV despite cross-device staging"
 else
   fail "openclaw agent exited 0 but expected response token was missing; see ${AGENT_LOG}"
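The false positive described in this comment can be reproduced in isolation. In the sketch below the regex is the one from the suggested patch, while the sample log content and the helper names `loose_match` and `strict_match` are made up for illustration. The field names `content`, `text`, and `response` are the reviewer's guesses, not a confirmed agent output schema.

```shell
#!/bin/sh
# Sketch: compare the loose PONG check with the stricter JSON field match.

# Regex from the suggested patch; the field names are unconfirmed guesses.
RESPONSE_RE='"(content|text|response)"[[:space:]]*:[[:space:]]*"PONG"'

loose_match()  { grep -qi 'PONG' "$1"; }
strict_match() { grep -qiE "$RESPONSE_RE" "$1"; }

log="$(mktemp)"
# Hypothetical log in which only the echoed prompt mentions PONG.
printf '%s\n' 'prompt: Reply with exactly PONG' '{"response": "I cannot comply"}' > "$log"

loose_match "$log"  && echo "loose: pass (false positive)"
strict_match "$log" || echo "strict: fail (no PONG in a response field)"
rm -f "$log"
```

Both messages print for this sample, which is the behavior the review comment warns about. If the agent's JSON is guaranteed to be well formed, parsing it with a JSON tool would be sturdier than a regex, but that depends on an output format this thread does not specify.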

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e/test-openclaw-plugin-runtime-exdev.sh` around lines 148 - 149, The
current grep check against AGENT_LOG uses a loose 'PONG' match which can hit
unrelated prompt text; replace it with a JSON field/value match so the success
assertion only passes when the agent returned the expected response (e.g., look
for a JSON pair like "response":"PONG" or similar in AGENT_LOG) and update the
conditional that calls pass(...) accordingly; target the grep invocation that
currently searches for 'PONG' and change the pattern to a JSON-aware regex
(matching the exact field name used by the agent response) so false positives
from prompt text are avoided.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 0c10b753-290e-49e1-91f0-69c05691b27e

📥 Commits

Reviewing files that changed from the base of the PR and between f6b34db and 4b736c1.

📒 Files selected for processing (1)

test/e2e/test-openclaw-plugin-runtime-exdev.sh


github-advanced-security AI found potential problems May 18, 2026

Comment thread test/e2e/test-openclaw-plugin-runtime-exdev.sh Fixed

Comment thread test/e2e/test-openclaw-plugin-runtime-exdev.sh Fixed

Comment thread test/e2e/test-openclaw-plugin-runtime-exdev.sh Fixed

coderabbitai Bot reviewed May 18, 2026

Comment thread test/e2e/test-openclaw-plugin-runtime-exdev.sh Outdated

test(e2e): force cross-device plugin staging

4b736c1

coderabbitai Bot reviewed May 18, 2026

jyaunches added 11 commits May 18, 2026 18:32

test(e2e): prepare tmpfs fallback dir

3ae7858

test(e2e): avoid tmpfs fallback setup

0ec0e18

test(e2e): preserve openclaw fallback temp

df4d171

test(e2e): target plugin deps rename helper

a3b736b

test(e2e): encode remote exdev script

cfe3d27

test(e2e): grant shm writes for exdev fixture

9574562

test(e2e): bake shm policy into exdev sandbox

f2e4ff2

test(e2e): widen dev shm sandbox policy

02a4517

Merge remote-tracking branch 'origin/main' into test/openclaw-plugin-…

f9bd64b

…runtime-exdev-e2e-guard

fix(e2e): update exdev guard parity metadata

8786677

fix(e2e): map whatsapp parity assertions

8b632bb

cv approved these changes May 19, 2026

merge main into regression guard

5fa4f80

# Conflicts: # test/e2e/docs/parity-inventory.generated.json

jyaunches enabled auto-merge (squash) May 19, 2026 16:38

jyaunches merged commit c2d3d04 into main May 19, 2026
20 checks passed

jyaunches mentioned this pull request May 19, 2026

fix(openclaw): bump runtime deps EXDEV fix #3820

Open

4 participants
