ci(e2e): add scenario run-all workflow by jyaunches · Pull Request #3920 · NVIDIA/NemoClaw

jyaunches · 2026-05-20T19:38:53Z

Summary

- make the single scenario E2E workflow reusable via workflow_call
- add a manual run-all workflow that fans out every migrated scenario
- preserve the existing single-scenario manual dispatch path

Test plan

npx prettier --check .github/workflows/e2e-scenarios.yaml .github/workflows/e2e-scenarios-all.yaml

Follow-up

dispatch E2E / Scenario Runner / All from this branch to validate the full migrated scenario set

Summary by CodeRabbit

New Features
- Added a reusable scenario-based E2E runner with manual trigger and optional suite filter to run multiple predefined scenarios.
Chores
- Improved CI concurrency handling and scenario fan-out to streamline parallel E2E runs and artifact naming tied to each scenario.
- Explicitly forward required runtime credentials to E2E jobs.
Tests
- Updated E2E workflow tests to match the new scenario-driven workflow behavior.

coderabbitai · 2026-05-20T19:39:04Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 8f0c5e27-7ec9-4a3e-9ade-e28ce33a6bc5

📥 Commits

Reviewing files that changed from the base of the PR and between b23062d and ba32152.

📒 Files selected for processing (1)

.github/workflows/e2e-scenarios-all.yaml

📝 Walkthrough

Walkthrough

Refactors e2e-scenarios.yaml to accept workflow_call inputs (scenario, suite_filter, optional NVIDIA_API_KEY), replaces manual-event input references with inputs.*, updates WSL and non-WSL env/ifs, and adds e2e-scenarios-all.yaml to fan out predefined scenarios to the reusable workflow.

Changes

E2E Scenario Workflow Reusability and Orchestration

| Layer / File(s) | Summary |
| --- | --- |
| Define reusable workflow contract (`.github/workflows/e2e-scenarios.yaml`) | Adds `on.workflow_call` exposing required `scenario` input, optional `suite_filter` input, and optional `NVIDIA_API_KEY` secret. |
| Update concurrency, env, and non-WSL run wiring (`.github/workflows/e2e-scenarios.yaml`) | Replaces `github.event.inputs.` with `inputs.` for concurrency, resolve-runner SCENARIO env, non-WSL gating, coverage env, wires `secrets.NVIDIA_API_KEY` and `E2E_SUITE_FILTER`/`SCENARIO` from `inputs`, and updates artifact naming. |
| WSL conditional and env updates (`.github/workflows/e2e-scenarios.yaml`) | Switches all WSL step `if:` conditions to use `inputs.scenario`, updates WSL run step to pass `inputs.scenario`, exports `NEMOCLAW_RECREATE_SANDBOX` in generated WSL script, and adjusts WSL env wiring to `inputs.*`. |
| Create scenario orchestrator workflow (`.github/workflows/e2e-scenarios-all.yaml`, `test/e2e/scenario-framework-tests/e2e-scenarios-workflow.test.ts`) | Adds manual-dispatch workflow `e2e-scenarios-all.yaml` accepting optional `suite_filter` and invoking the reusable workflow once per predefined scenario while forwarding `secrets.NVIDIA_API_KEY`; updates test to expect artifact name built from `inputs.scenario`. |
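
For orientation, the reusable contract described in the table could look roughly like the fragment below. This is a minimal sketch, not the merged file: the input and secret names (`scenario`, `suite_filter`, `NVIDIA_API_KEY`) come from the walkthrough, while the descriptions, types, and defaults are assumptions.

```yaml
# Sketch of the trigger block in .github/workflows/e2e-scenarios.yaml.
# Names are taken from the walkthrough; descriptions, types, and defaults
# are assumed for illustration.
on:
  # Existing manual path, kept so a single scenario can still be dispatched.
  workflow_dispatch:
    inputs:
      scenario:
        description: Scenario ID to run
        required: true
        type: string
      suite_filter:
        description: Optional suite filter
        required: false
        type: string
        default: ""
  # New reusable path, invoked by caller workflows via `uses:`.
  workflow_call:
    inputs:
      scenario:
        required: true
        type: string
      suite_filter:
        required: false
        type: string
        default: ""
    secrets:
      NVIDIA_API_KEY:
        required: false
```

Because the `inputs` context is populated for both `workflow_dispatch` and `workflow_call`, referencing `inputs.scenario` instead of `github.event.inputs.scenario` lets the same steps serve both entry points.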

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

NVIDIA/NemoClaw#3734: Overlaps with changes to how scenario/SCENARIO is passed into run-scenario.sh and WSL execution paths.
NVIDIA/NemoClaw#3493: Related adjustments routing suite_filter into the runtime via E2E_SUITE_FILTER and shared test coverage.

Suggested labels

CI/CD, E2E, enhancement: testing

Suggested reviewers

ericksoa
cv

Poem

🐰 I hopped through workflows, inputs in paw,
I handed each scenario its suite_filter law,
The orchestrator hums, jobs scatter and run,
Artifacts tagged by scenario—one by one,
A crunchy carrot cheer for CI well-done.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the main change: adding a new workflow for running all E2E scenarios, which is the primary focus of the changeset. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

github-actions · 2026-05-20T19:39:26Z

PR Review Advisor

Recommendation: blocked
Confidence: high
Analyzed HEAD: ba321522fb0781100dfe848a34a6769cefa57a11
Findings: 1 blocker(s), 2 warning(s), 2 suggestion(s)

This is an automated advisory review. A human maintainer must make the final merge decision.

Limitations:
- Review used trusted deterministic PR context plus read-only inspection of the changed files; no tests, scripts, package-manager commands, or workflows were executed.
- GitHub status information is a snapshot and showed multiple checks still pending/in progress for the requested head SHA.
- No linked issues were present, so acceptance mapping is limited to PR body clauses and comments provided in the trusted context.
- E2E Advisor comment exists, but current-head E2E recommendation was still IN_PROGRESS in the status rollup, so E2E completion for ba32152 is not verified.

PR Review Advisor

Base: origin/main
Head: HEAD
Analyzed SHA: ba321522fb0781100dfe848a34a6769cefa57a11
Recommendation: blocked
Confidence: high

Workflow/security fixes look directionally good, but the PR is not merge-ready because current-head CI is pending, mergeability is blocked, and run-all coverage/tests do not yet prove the claimed full scenario fan-out.

Gate status

CI: pending — 12 status context(s) appear pending for ba32152; GraphQL shows E2E recommendation, wsl-e2e, macos-e2e, PR review advisor, CodeQL, ShellCheck, unit-vitest-linux, and sandbox image builds still in progress or queued.
Mergeability: fail — mergeStateStatus=BLOCKED for PR #3920 (ci(e2e): add scenario run-all workflow) at head ba32152.
Review threads: pass — 3 review thread(s), all resolved. Prior CodeRabbit findings for secrets: inherit, unpinned setup-node, and template injection were marked addressed.
Risky code tested: warning — Risky area is workflow/enforcement. A workflow test changed, but semantic coverage is thin: it only updates artifact naming and does not assert the reusable workflow contract, absence of github.event.inputs in workflow_call paths, or the new run-all workflow fan-out/secrets behavior.

🔴 Blockers

Current-head merge gates are not satisfied: The PR cannot be considered merge-ready while mergeability is BLOCKED and multiple current-head checks are pending/in progress. This is especially important because the change touches GitHub Actions workflows that handle secrets and E2E execution routing.
- Recommendation: Wait for all required checks for ba32152 to complete successfully and resolve the mergeStateStatus=BLOCKED condition before merge consideration.
- Evidence: GraphQL: mergeStateStatus=BLOCKED; status rollup includes pending/in-progress E2E recommendation, PR review advisor, CodeQL, ShellCheck, wsl-e2e, macos-e2e, unit-vitest-linux, and sandbox image jobs.

🟡 Warnings

Run-all workflow appears not to cover every setup scenario in the catalog (.github/workflows/e2e-scenarios-all.yaml:25): The new workflow comment and PR clause say it runs every migrated/current setup scenario, but the hardcoded fan-out contains 7 jobs while test/e2e/nemoclaw_scenarios/scenarios.yaml contains additional setup_scenarios such as ubuntu-repo-cloud-openclaw-custom-policies, ubuntu-invalid-nvidia-key-negative, and ubuntu-gateway-port-conflict-negative. If those are migrated scenarios, the run-all workflow is incomplete; if they are intentionally excluded, the workflow name/comment and PR acceptance language overstate coverage.
- Recommendation: Either add the missing setup_scenarios, generate the matrix from scenarios.yaml, or explicitly document and test the intended subset. Add a static test that compares e2e-scenarios-all.yaml jobs against the intended catalog/list.
- Evidence: .github/workflows/e2e-scenarios-all.yaml defines jobs from ubuntu-repo-cloud-openclaw through ubuntu-no-docker-preflight-negative only. scenarios.yaml includes additional setup_scenarios at lines 203, 216, and 224.
Static workflow tests do not cover the new reusable/run-all contract (test/e2e/scenario-framework-tests/e2e-scenarios-workflow.test.ts:56): The test change only updates the artifact-name expectation to inputs.scenario. It does not assert that e2e-scenarios.yaml exposes workflow_call inputs/secrets, that workflow_call paths no longer rely on github.event.inputs, that actions remain pinned, or that e2e-scenarios-all.yaml passes suite_filter and maps only NVIDIA_API_KEY for every fan-out job.
- Recommendation: Add scenario-framework tests for workflow_call inputs/secrets, no github.event.inputs references in reusable paths, explicit NVIDIA_API_KEY secret mapping, pinned actions, and run-all job coverage against the intended scenario catalog.
- Evidence: Diff changes one assertion at test/e2e/scenario-framework-tests/e2e-scenarios-workflow.test.ts for artifact naming; E2E Advisor also recommended static tests for e2e-scenarios-all.yaml fan-out and workflow_call/input usage.

🔵 Suggestions

Workflow comments/tests still describe the scenario workflow as manual-only (.github/workflows/e2e-scenarios.yaml:7): The workflow is now both workflow_dispatch and workflow_call, but the header comment says manual-only and the test name e2e_scenarios_workflow_should_be_manual_only is now imprecise. The assertion only forbids push, pull_request, and schedule, so behavior is probably fine, but naming/comments can mislead future maintainers.
- Recommendation: Update the comment and test name to say manual-or-reusable and still explicitly not automatic on push/PR/schedule.
- Evidence: .github/workflows/e2e-scenarios.yaml adds on.workflow_call while retaining the old 'Manual-only' comment; the test still uses the name e2e_scenarios_workflow_should_be_manual_only.
Active overlap exists with another PR touching the same test file (test/e2e/scenario-framework-tests/e2e-scenarios-workflow.test.ts:1): Codebase drift check found open PR #3819, 'test(e2e): remove parity report workflow', also changing test/e2e/scenario-framework-tests/e2e-scenarios-workflow.test.ts. This does not contradict the current diff by itself, but it increases rebase/conflict risk around workflow tests.
- Recommendation: Before merge, re-check the final base and ensure this test file still reflects the intended workflow coverage after any overlap with PR #3819 is resolved.
- Evidence: Trusted context openPrOverlaps reports PR #3819 (test(e2e): remove parity report workflow) with sameFiles: test/e2e/scenario-framework-tests/e2e-scenarios-workflow.test.ts.

Acceptance coverage

met — make the single scenario E2E workflow reusable via workflow_call: .github/workflows/e2e-scenarios.yaml adds on.workflow_call with required scenario input, optional suite_filter input, and optional NVIDIA_API_KEY secret; workflow internals now reference inputs.scenario/inputs.suite_filter.
partial — add a manual run-all workflow that fans out every migrated scenario: .github/workflows/e2e-scenarios-all.yaml adds workflow_dispatch and fan-out jobs calling ./.github/workflows/e2e-scenarios.yaml, but the hardcoded 7-job list does not appear to include every setup_scenarios entry in scenarios.yaml and there is no test proving intended coverage.
met — preserve the existing single-scenario manual dispatch path: .github/workflows/e2e-scenarios.yaml retains on.workflow_dispatch inputs for scenario and suite_filter while adding workflow_call.
unknown — npx prettier --check .github/workflows/e2e-scenarios.yaml .github/workflows/e2e-scenarios-all.yaml: The PR body lists this test plan, but the trusted context does not include a completed prettier result for the current head SHA; CI is still pending.
missing — dispatch E2E / Scenario Runner / All from this branch to validate the full migrated scenario set: No passed e2e-scenarios-all run for ba32152 is present in the trusted context; the E2E recommendation check for the current head is still IN_PROGRESS.

Security review

pass — Secrets and Credentials: No hardcoded secrets were added. The reusable workflow declares only NVIDIA_API_KEY, and the new run-all caller maps only NVIDIA_API_KEY explicitly instead of using secrets: inherit.
warning — Input Validation and Data Sanitization: The prior shell-template injection pattern was addressed by passing scenario and suite_filter through env vars before shell use. However, scenario and suite_filter remain free-form workflow inputs, and tests do not yet assert no github.event.inputs/template interpolation regressions in reusable workflow paths.
pass — Authentication and Authorization: No application auth endpoints changed. Workflow permissions are contents: read, and the manual workflows do not broaden repository token permissions.
warning — Dependencies and Third-Party Libraries: Actions touched in the changed workflow are pinned to full SHAs and npm ci uses --ignore-scripts. The invoked WSL path still uses apt installs and curl -fsSL https://deb.nodesource.com/setup_22.x | bash -, which is a supply-chain-sensitive installer pattern; this appears pre-existing but is exercised by the new run-all workflow.
pass — Error Handling and Logging: No new secret logging was observed. Workflow summaries print scenario data as text/code-formatted output and do not echo NVIDIA_API_KEY.
pass — Cryptography and Data Protection: Not applicable — no cryptographic operations or data protection mechanisms are changed.
pass — Configuration and Security Headers: Workflow-level permissions remain restrictive with contents: read. Actions checkout/setup-node/upload-artifact are pinned to commit SHAs in the reviewed workflow.
warning — Security Testing: The test update does not cover the security-sensitive workflow changes: explicit secret mapping, pinned actions, avoidance of direct template expansion in run blocks, or run-all workflow coverage. Add static assertions to prevent regressions.
warning — Holistic Security Posture: The PR improves least privilege by replacing broad secret inheritance and fixes template injection, but it changes the trusted-code workflow boundary and fan-out execution paths while current-head CI/E2E validation remains incomplete.

Test / E2E status

Test depth: e2e_required — Runtime/sandbox/infrastructure paths need real execution coverage: .github/workflows/e2e-scenarios-all.yaml, .github/workflows/e2e-scenarios.yaml. Static tests should also be expanded because the current diff only adjusts one artifact-name assertion.
E2E Advisor: ambiguous
Missing for analyzed SHA: E2E recommendation check for ba321522fb0781100dfe848a34a6769cefa57a11 is IN_PROGRESS, so current-head E2E Advisor completion is not established

✅ What looks good

Prior workflow security review items were addressed: broad secrets: inherit was replaced with explicit NVIDIA_API_KEY mapping, setup-node is pinned to a full commit SHA, and scenario interpolation in run blocks now uses env variables.
The reusable workflow keeps permissions limited to contents: read.
npm install steps use npm ci --ignore-scripts, which avoids package lifecycle script execution during dependency installation.
The run-all workflow is manual dispatch only and does not add push or pull_request triggers.
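
Taken together, the points above suggest a caller shaped roughly like the following. This is a hedged sketch rather than the merged `e2e-scenarios-all.yaml`: only one of the seven fan-out jobs is shown, the job and scenario name `ubuntu-repo-cloud-openclaw` is taken from this review, and the workflow `name`, input description, and default are assumptions.

```yaml
# Sketch of .github/workflows/e2e-scenarios-all.yaml with a single fan-out job.
# The workflow name, description, and default are assumed for illustration.
name: E2E / Scenario Runner / All

on:
  # Manual dispatch only: no push, pull_request, or schedule triggers.
  workflow_dispatch:
    inputs:
      suite_filter:
        description: Optional suite filter forwarded to every scenario
        required: false
        type: string
        default: ""

permissions:
  contents: read

jobs:
  ubuntu-repo-cloud-openclaw:
    uses: ./.github/workflows/e2e-scenarios.yaml
    with:
      scenario: ubuntu-repo-cloud-openclaw
      suite_filter: ${{ inputs.suite_filter }}
    # Explicit mapping instead of `secrets: inherit`, so the called workflow
    # only receives the one secret it declares.
    secrets:
      NVIDIA_API_KEY: ${{ secrets.NVIDIA_API_KEY }}
```

Each additional scenario would repeat the job block with its own job ID and `scenario` value.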

Review completeness

Review used trusted deterministic PR context plus read-only inspection of the changed files; no tests, scripts, package-manager commands, or workflows were executed.
GitHub status information is a snapshot and showed multiple checks still pending/in progress for the requested head SHA.
No linked issues were present, so acceptance mapping is limited to PR body clauses and comments provided in the trusted context.
E2E Advisor comment exists, but current-head E2E recommendation was still IN_PROGRESS in the status rollup, so E2E completion for ba32152 is not verified.
Human maintainer review required: yes

github-actions · 2026-05-20T19:40:07Z

E2E Advisor Recommendation

Required E2E: None
Optional E2E: ubuntu-repo-cloud-openclaw, e2e-scenarios-all

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

None. No merge-blocking product E2E is required because the PR changes E2E CI workflow orchestration and its static tests only; it does not modify installer, onboarding, sandbox lifecycle, credentials, network policy, inference routing, deployment artifacts, or assistant runtime code.

Optional E2E

ubuntu-repo-cloud-openclaw (medium): Optional smoke validation that the modified E2E / Scenario Runner still accepts inputs.scenario and executes the baseline Ubuntu cloud OpenClaw scenario end-to-end.
e2e-scenarios-all (high): Optional validation of the new fan-out caller workflow and workflow_call wiring. This is expensive because it dispatches all listed scenarios across Ubuntu, GPU, macOS, WSL, Brev, and negative preflight coverage.

New E2E recommendations

ci-e2e-workflow-orchestration (medium): The static workflow test was updated for the single-scenario artifact name but does not appear to validate the new all-scenarios fan-out workflow, its scenario list parity with test/e2e/nemoclaw_scenarios/scenarios.yaml, or that each job passes suite_filter and NVIDIA_API_KEY through workflow_call.
- Suggested test: Add scenario-framework coverage for .github/workflows/e2e-scenarios-all.yaml fan-out parity and reusable-workflow input/secret wiring.

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/e2e-scenarios-all.yaml:
- Around line 26-31: Replace the broad secrets: inherit in each scenario job
(e.g., job ubuntu-repo-cloud-openclaw) in
.github/workflows/e2e-scenarios-all.yaml with an explicit secrets mapping that
only passes NVIDIA_API_KEY as declared by the reusable workflow contract in
e2e-scenarios.yaml; update all seven job blocks (the ones using
./.github/workflows/e2e-scenarios.yaml) to use a secrets: block that maps
NVIDIA_API_KEY: ${{ secrets.NVIDIA_API_KEY }} instead of inheriting every
secret.

In @.github/workflows/e2e-scenarios.yaml:
- Around line 89-95: Update the "Set up Node" step that currently uses the
mutable tag actions/setup-node@v6 and pin it to the known full commit SHA used
elsewhere (for example
actions/setup-node@48b55a011bda9f5d6aeb4c2d9c7362e8dae4041e) to remove
supply-chain risk; edit the workflow step named "Set up Node" and replace the
uses value actions/setup-node@v6 with the full commit SHA, and make the
identical change in the other listed workflows (e2e-parity-compare.yaml,
e2e-branch-validation.yaml, e2e-advisor.yaml, nightly-e2e.yaml,
commit-lint.yaml).
- Around line 108-115: The workflow directly interpolates inputs.scenario into
the run script which allows shell injection; instead, export the input into the
step environment and reference it as a quoted shell variable. For the non-WSL
step that calls bash test/e2e/runtime/run-scenario.sh and the WSL step that uses
the PowerShell heredoc, add an env entry such as SCENARIO: ${{ inputs.scenario
}} and change the run lines to invoke the script with a quoted variable
("$SCENARIO" in bash, "$env:SCENARIO" or equivalent in PowerShell), so all
occurrences of ${{ inputs.scenario }} in the run blocks are replaced by the
safe env variable reference. Ensure both the step that runs run-scenario.sh and
the WSL heredoc section are updated.
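
The env-variable pattern requested in the last comment can be sketched as follows for the non-WSL step. This is an illustration under assumptions: the step name is hypothetical, and passing the scenario as the first positional argument follows the wording of the comment rather than the script itself.

```yaml
# Sketch of a non-WSL run step that avoids template expansion inside `run:`.
# The step name is hypothetical.
steps:
  - name: Run scenario
    env:
      # Expressions are expanded here, into environment variables,
      # not into the shell script text.
      SCENARIO: ${{ inputs.scenario }}
      E2E_SUITE_FILTER: ${{ inputs.suite_filter }}
      NVIDIA_API_KEY: ${{ secrets.NVIDIA_API_KEY }}
    # The shell sees a quoted variable, so metacharacters in the input
    # are treated as data rather than as commands.
    run: bash test/e2e/runtime/run-scenario.sh "$SCENARIO"
```

With direct interpolation, the expression is substituted into the script before the shell parses it, so an input containing `;` or `$(...)` could run arbitrary commands. Routing the value through `env` removes that path.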

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a0c321e2-6d69-46ba-8da6-da263a4e0dba

📥 Commits

Reviewing files that changed from the base of the PR and between e122450 and bdb8152.

📒 Files selected for processing (2)

.github/workflows/e2e-scenarios-all.yaml
.github/workflows/e2e-scenarios.yaml

…runner # Conflicts: # .github/workflows/e2e-scenarios.yaml

ci(e2e): add scenario run-all workflow

bdb8152

coderabbitai Bot reviewed May 20, 2026

Comment thread .github/workflows/e2e-scenarios-all.yaml Outdated

Comment thread .github/workflows/e2e-scenarios.yaml

Comment thread .github/workflows/e2e-scenarios.yaml

jyaunches added the v0.0.47 Release target label May 20, 2026

cv approved these changes May 20, 2026

jyaunches added 3 commits May 20, 2026 16:21

Merge remote-tracking branch 'origin/main' into ci/e2e-scenarios-all-…

e5d817f

…runner # Conflicts: # .github/workflows/e2e-scenarios.yaml

fix(e2e): align artifact expectation with reusable input

b23062d

fix(ci): limit scenario workflow secrets

ba32152

cv approved these changes May 20, 2026

cv merged commit dc63189 into main May 21, 2026
28 checks passed

coderabbitai Bot mentioned this pull request May 21, 2026

ci(e2e): add scenario advisor comment #4004

Merged
