test(e2e): add openclaw plugin EXDEV guard for #3513 #3761

Merged
jyaunches merged 14 commits into main from test/openclaw-plugin-runtime-exdev-e2e-guard
May 19, 2026

Conversation

@jyaunches
Contributor

@jyaunches jyaunches commented May 18, 2026

Summary

Adds a failing-test-first regression guard for #3513 / #3127 in regression-e2e.yaml.

The new openclaw-plugin-runtime-exdev-e2e job provisions a fresh sandbox and runs the first in-sandbox openclaw agent invocation. On unfixed code, the expected failure is OpenClaw plugin runtime dependency installation aborting with EXDEV: cross-device link not permitted / failed to install bundled runtime deps / PluginLoadFailureError.

Expected state before the fix

The new job is intentionally expected to fail when it is manually dispatched on main-equivalent code. That failure is the executable acceptance criterion for #3513.

Expected RED fragment:

EXDEV: cross-device link not permitted
failed to install bundled runtime deps
PluginLoadFailureError

Expected GREEN after the fix: openclaw agent --agent main --json ... exits 0 and returns a response without plugin runtime-deps EXDEV errors.
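
The RED and GREEN states above can be sketched as a small classifier. This is a minimal illustration, not the test script's actual code: `classify_run` is a hypothetical helper, and the only patterns it knows are the three RED fragments quoted in this description.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify an agent run as RED or GREEN.
# Arguments: path to the captured agent log, and the agent's exit code.
classify_run() {
  local log_file="$1" exit_code="$2"
  # Any of the three RED fragments from the PR description marks the run RED.
  if grep -qE 'EXDEV: cross-device link not permitted|failed to install bundled runtime deps|PluginLoadFailureError' "$log_file"; then
    echo RED
  elif [ "$exit_code" -ne 0 ]; then
    # A nonzero exit is also RED, even without a known pattern in the log.
    echo RED
  else
    echo GREEN
  fi
}
```

Checking the log before the exit code means a run that exits 0 but still logs an EXDEV error is reported as RED.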

Workflow placement

  • Workflow: .github/workflows/regression-e2e.yaml
  • Job: openclaw-plugin-runtime-exdev-e2e
  • Not scheduled nightly; run only by manual dispatch until explicitly promoted.

Dispatch:

gh workflow run regression-e2e.yaml --repo NVIDIA/NemoClaw \
  -f jobs=openclaw-plugin-runtime-exdev-e2e \
  --ref test/openclaw-plugin-runtime-exdev-e2e-guard

Related: #3513
Also related: #3127

Summary by CodeRabbit

  • Tests

    • Added an end-to-end test that provisions a fresh sandbox, forces cross-device rename conditions, and verifies plugin runtime staging and a successful first-agent bootstrap; test fails on install/runtime errors or missing expected responses.
  • Chores

    • Added a conditional CI regression job to run the new E2E test selectively; uploads redacted logs/artifacts on failure and exposes a selector output to enable the job.
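
The selector output mentioned above might work roughly as follows. This is a sketch under assumptions: the function body, the `VALID_JOBS` variable, and the comma-separated input format are hypothetical, and only the job name and the `openclaw_plugin_runtime_exdev` output name come from this PR.

```shell
#!/usr/bin/env bash
# Hypothetical selector sketch; not the workflow's actual implementation.
# The real valid-jobs list contains more entries than shown here.
VALID_JOBS="openclaw-plugin-runtime-exdev-e2e"

# Arguments: comma-separated jobs input, and the file that receives outputs.
select_regression_jobs() {
  local requested="$1" output_file="$2" job selected=false
  local IFS=','
  for job in $requested; do
    # Reject any job name that is not in the valid list.
    case " $VALID_JOBS " in
      *" $job "*) ;;
      *) echo "unknown job: $job" >&2; return 1 ;;
    esac
    if [ "$job" = "openclaw-plugin-runtime-exdev-e2e" ]; then
      selected=true
    fi
  done
  # key=value lines in this file become step outputs in GitHub Actions.
  echo "openclaw_plugin_runtime_exdev=$selected" >> "$output_file"
}
```

In a workflow step this would be called as `select_regression_jobs "$JOBS_INPUT" "$GITHUB_OUTPUT"`, where `JOBS_INPUT` is an assumed variable holding the `jobs` dispatch input.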

Adds a failing E2E test that demonstrates the bug tracked by #3513.

Until the fix lands, the regression-e2e openclaw-plugin-runtime-exdev-e2e job will fail. This is intentional: the failing test is the executable acceptance criterion.

Related: #3513
@coderabbitai
Contributor

coderabbitai Bot commented May 18, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds a gated regression E2E job (openclaw-plugin-runtime-exdev-e2e) to the CI selector and a new script test/e2e/test-openclaw-plugin-runtime-exdev.sh that provisions a fresh sandbox, onboards NemoClaw, forces cross-device staging, captures/redacts logs, and asserts no EXDEV/plugin-install failures and presence of PONG.
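
For context on "forces cross-device staging": `rename(2)` fails with EXDEV when the source and destination are on different mounted filesystems. The sketch below is one way to check ahead of time whether two existing paths sit on different devices. The helper names are hypothetical and this is not taken from the test script. Comparing device IDs is only a heuristic, since a rename across two mount points of the same filesystem can also fail with EXDEV.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the device ID of the filesystem holding an existing path.
device_id() {
  # GNU stat uses -c for a format string; BSD/macOS stat uses -f.
  stat -c '%d' "$1" 2>/dev/null || stat -f '%d' "$1"
}

# Succeed when the two paths report different device IDs, which means a
# plain rename between them would be expected to fail with EXDEV.
crosses_devices() {
  [ "$(device_id "$1")" != "$(device_id "$2")" ]
}
```

For example, `crosses_devices /dev/shm "$HOME" && echo "rename would cross devices"` reports whether staging under `/dev/shm` would cross a filesystem boundary on a given runner.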

Changes

OpenClaw EXDEV Plugin Runtime-Deps Test

| Layer / File(s) | Summary |
| --- | --- |
| Workflow selector and job registration: .github/workflows/regression-e2e.yaml | Added openclaw_plugin_runtime_exdev output to select_regression_jobs, extended valid jobs list to include openclaw-plugin-runtime-exdev-e2e, and added the conditional E2E job that runs the new test and uploads /tmp/nemoclaw-e2e-openclaw-plugin-exdev-* artifacts on failure. |
| E2E test script for EXDEV validation: test/e2e/test-openclaw-plugin-runtime-exdev.sh | New Bash E2E: provisions fresh sandbox, verifies Docker and API key, installs NemoClaw if needed, runs nemoclaw onboard --fresh, captures filesystem evidence, forces staging to /dev/shm, bootstraps openclaw agent --agent main, redacts secrets from logs, fails on EXDEV/plugin-install patterns or nonzero exit, and asserts PONG. |
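
The "redacts secrets from logs" step could look something like the sketch below. `redact_secret` is a hypothetical helper, not the script's real function. It masks literal occurrences of one secret value rather than matching a key format, so it makes no assumption about what the API key looks like.

```shell
#!/usr/bin/env bash
# Hypothetical helper: mask every literal occurrence of a secret in a log stream.
# Reads stdin, writes the redacted stream to stdout.
redact_secret() {
  local secret="$1" line
  # The "|| [ -n "$line" ]" clause keeps a final line that lacks a newline.
  while IFS= read -r line || [ -n "$line" ]; do
    if [ -n "$secret" ]; then
      # Quoting the pattern makes bash treat it literally, not as a glob.
      line="${line//"$secret"/[REDACTED]}"
    fi
    printf '%s\n' "$line"
  done
}
```

Usage would be along the lines of `redact_secret "$API_KEY" < raw.log > redacted.log`, where `API_KEY` and the file names are placeholders.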

Sequence Diagram(s)

sequenceDiagram
  participant GitHubActions
  participant Runner
  participant TestScript as test-openclaw-plugin-runtime-exdev.sh
  participant NemoClawSandbox as NemoClaw_Sandbox
  participant OpenClawAgent as openclaw_agent
  participant ArtifactUploader as Artifact_Uploader

  GitHubActions->>Runner: trigger regression-e2e workflow
  Runner->>TestScript: execute test-openclaw-plugin-runtime-exdev.sh
  TestScript->>NemoClawSandbox: prepare sandbox & onboard (nemoclaw onboard --fresh)
  TestScript->>NemoClawSandbox: run df to capture filesystem layout
  TestScript->>OpenClawAgent: set TMPDIR=/dev/shm and start openclaw agent --agent main
  OpenClawAgent-->>TestScript: stdout logs (PONG or errors)
  Runner->>ArtifactUploader: upload /tmp/nemoclaw-e2e-openclaw-plugin-exdev-* on failure

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

Possibly related PRs

  • NVIDIA/NemoClaw#3411: Similar selector/output additions and gated E2E job wiring in regression-e2e.yaml.
  • NVIDIA/NemoClaw#3478: Similar selector/output changes adding a gated E2E job and selector output.
  • NVIDIA/NemoClaw#3595: Adds a conditional E2E selector output in the same workflow selector pattern.

Suggested labels

enhancement: testing, E2E, Integration: OpenClaw, CI/CD

Suggested reviewers

  • ericksoa
  • cv

Poem

🐰 I hop into a sandbox, tail aflutter bright,
I start the agent, scrub the logs by night,
No EXDEV tumble, PONG answers clear and strong,
Secrets masked, df whispers where files belong,
A rabbit cheers—the bootstrap sang its song.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'test(e2e): add openclaw plugin EXDEV guard for #3513' directly and specifically describes the main changes: adding an E2E test guard for an OpenClaw plugin EXDEV issue. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Comment thread test/e2e/test-openclaw-plugin-runtime-exdev.sh Fixed
Comment thread test/e2e/test-openclaw-plugin-runtime-exdev.sh Fixed
Comment thread test/e2e/test-openclaw-plugin-runtime-exdev.sh Fixed
@github-actions
Contributor

github-actions Bot commented May 18, 2026

E2E Advisor Recommendation

Required E2E: openclaw-plugin-runtime-exdev-e2e
Optional E2E: None

Dispatch hint: openclaw-plugin-runtime-exdev-e2e

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • openclaw-plugin-runtime-exdev-e2e (high): This PR adds the regression workflow job and its backing script, so the new job should be run to validate dispatch selection, required secrets/environment, sandbox onboard, and the EXDEV runtime-deps guard itself.

Optional E2E

  • None.

New E2E recommendations

  • None.

Dispatch hint

  • Workflow: .github/workflows/regression-e2e.yaml
  • jobs input: openclaw-plugin-runtime-exdev-e2e

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e/test-openclaw-plugin-runtime-exdev.sh`:
- Around line 144-148: The current success check greps for any "PONG" in
AGENT_LOG which can match the prompt and produce false positives; update the
conditional that references AGENT_LOG (the grep -qi 'PONG' check) to instead
search for the explicit response JSON field returned by the agent (e.g., a
pattern matching the response key and value like a JSON pair such as "response":
"PONG" or the exact response_token field your agent emits) so the test asserts
the agent returned the expected JSON response token rather than any occurrence
of PONG in the log.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 09946038-dffd-48f2-86b5-ee0248ed6a10

📥 Commits

Reviewing files that changed from the base of the PR and between 30d1be9 and f6b34db.

📒 Files selected for processing (2)
  • .github/workflows/regression-e2e.yaml
  • test/e2e/test-openclaw-plugin-runtime-exdev.sh

Comment thread test/e2e/test-openclaw-plugin-runtime-exdev.sh Outdated
Contributor

@coderabbitai coderabbitai Bot left a comment

♻️ Duplicate comments (1)
test/e2e/test-openclaw-plugin-runtime-exdev.sh (1)

148-149: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Tighten the success assertion to avoid false positives.

On Line 148, grep -qi 'PONG' can match the prompt text and pass even when the agent response token is missing. Match a response JSON field/value pair instead.

Suggested patch
-if grep -qi 'PONG' "$AGENT_LOG"; then
+if grep -qiE '"(content|text|response)"[[:space:]]*:[[:space:]]*"PONG"' "$AGENT_LOG"; then
   pass "openclaw agent completed without plugin runtime-deps EXDEV despite cross-device staging"
 else
   fail "openclaw agent exited 0 but expected response token was missing; see ${AGENT_LOG}"
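
To illustrate why the loose match is risky, here is a small sketch that compares both checks against sample log lines. The field names (`content`, `text`, `response`) come from the suggested patch. The agent's real output schema and the sample log contents are assumptions, and `has_pong_response` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Strict check: PONG must be the value of a JSON response field.
# Field names follow the suggested patch; the real schema is an assumption.
has_pong_response() {
  grep -qiE '"(content|text|response)"[[:space:]]*:[[:space:]]*"PONG"' "$1"
}

# Loose check, as in the original assertion: any PONG anywhere in the log.
has_any_pong() {
  grep -qi 'PONG' "$1"
}
```

A log that contains only the echoed prompt (for example `Reply with exactly PONG`) satisfies `has_any_pong` but not `has_pong_response`, which is the false positive the review comment describes.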
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e/test-openclaw-plugin-runtime-exdev.sh` around lines 148 - 149, The
current grep check against AGENT_LOG uses a loose 'PONG' match which can hit
unrelated prompt text; replace it with a JSON field/value match so the success
assertion only passes when the agent returned the expected response (e.g., look
for a JSON pair like "response":"PONG" or similar in AGENT_LOG) and update the
conditional that calls pass(...) accordingly; target the grep invocation that
currently searches for 'PONG' and change the pattern to a JSON-aware regex
(matching the exact field name used by the agent response) so false positives
from prompt text are avoided.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 0c10b753-290e-49e1-91f0-69c05691b27e

📥 Commits

Reviewing files that changed from the base of the PR and between f6b34db and 4b736c1.

📒 Files selected for processing (1)
  • test/e2e/test-openclaw-plugin-runtime-exdev.sh

# Conflicts:
#	test/e2e/docs/parity-inventory.generated.json
@jyaunches jyaunches enabled auto-merge (squash) May 19, 2026 16:38
@jyaunches jyaunches merged commit c2d3d04 into main May 19, 2026
20 checks passed

4 participants