test(e2e-scenario): delete obsolete run.ts framework, switch CI to run-scenario.sh#4379
jyaunches wants to merge 2 commits into
…n-scenario.sh
The hybrid scenario suite had two runner entrypoints:
- test/e2e-scenario/scenarios/run.ts (OLD: TS DSL + builders)
- test/e2e-scenario/runtime/run-scenario.sh (NEW: YAML metadata + bash)
The new bash runner via runtime/resolver/* is self-contained and does not
import from scenarios/. All recently-developed tests target the new path.
The OLD TS framework was kept only to satisfy CI (which was still calling
run.ts) and a handful of framework-tests that exercised the OLD DSL.
This commit removes the OLD path and migrates CI.
Deleted (OLD framework, only used by run.ts):
- test/e2e-scenario/scenarios/ (entire TS DSL, 33 files)
- test/e2e-scenario/manifests/ (only consumed by scenarios/manifests.ts)
- test/e2e-scenario/onboarding_assertions/ (dead in NEW path)
- 6 framework-tests that imported scenarios/*:
e2e-assertion-modules, e2e-manifests, e2e-migration-inventory-lock,
e2e-phase-orchestrators, e2e-plan-compiler, e2e-scenario-registry
Workflow and validator updates:
- .github/workflows/e2e-scenarios.yaml: replace 3 npx tsx ...run.ts calls
with bash test/e2e-scenario/runtime/run-scenario.sh <id>, looping over
comma-separated IDs in linux + WSL paths.
- tools/e2e-scenarios/workflow-boundary.mts: assert the bash entrypoint.
- test/e2e-scenario/framework-tests/e2e-scenarios-workflow.test.ts:
update fixture to the new entrypoint.
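The workflow and validator updates above can be illustrated with a minimal sketch. It assumes the boundary check works on the raw workflow text. The names `checkWorkflowBoundary`, `splitScenarioIds`, and `BoundaryResult` are hypothetical and are not taken from tools/e2e-scenarios/workflow-boundary.mts; only the entrypoint string comes from this commit.

```typescript
// Hypothetical sketch: none of these names come from workflow-boundary.mts.
// Only the entrypoint string below is taken from the commit message.
const NEW_ENTRYPOINT = "bash test/e2e-scenario/runtime/run-scenario.sh";
// Assumption: any mention of run.ts in the workflow counts as a stale reference.
const OLD_ENTRYPOINT_PATTERN = /\brun\.ts\b/;

interface BoundaryResult {
  ok: boolean;
  problems: string[];
}

// Checks that the workflow text calls the bash runner and no longer
// mentions the deleted TypeScript runner.
function checkWorkflowBoundary(workflowText: string): BoundaryResult {
  const problems: string[] = [];
  if (!workflowText.includes(NEW_ENTRYPOINT)) {
    problems.push(`workflow never calls "${NEW_ENTRYPOINT}"`);
  }
  if (OLD_ENTRYPOINT_PATTERN.test(workflowText)) {
    problems.push("workflow still references the deleted run.ts runner");
  }
  return { ok: problems.length === 0, problems };
}

// Splits a comma-separated scenario ID list, trimming whitespace and
// dropping empty entries, mirroring the per-ID loop described above.
function splitScenarioIds(raw: string): string[] {
  return raw
    .split(",")
    .map((id) => id.trim())
    .filter((id) => id.length > 0);
}
```

Each ID returned by `splitScenarioIds` would correspond to one `bash test/e2e-scenario/runtime/run-scenario.sh <id>` invocation in the workflow loop.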
Bridge edit:
- tools/e2e-advisor/scenarios.mts: swap listScenarios() (OLD registry)
for loadMetadataFromDir() + resolveScenario() (NEW YAML resolver),
so the deterministic scenario advisor still compiles after the deletion.
Follow-up:
- test/e2e-scenario-advisor.test.ts: 2 cases skipped pending #4378.
nemoclaw_scenarios/scenarios.yaml setup_scenarios is missing user-friendly
aliases for 13 layered test_plans (telegram, discord, slack, brave,
resume, repair, double-*, token-rotation, openai-compatible,
hermes-discord, hermes-slack). The OLD TS registry was the only thing
that knew those IDs. #4378 (child of epic #3588 Phase 4) tracks adding
the aliases. A separate session is overhauling the deterministic
scenario advisor, which may obsolete this path entirely.
Verification:
- 267 framework + advisor tests pass; 2 skipped with #4378 reference.
- bash test/e2e-scenario/runtime/run-scenario.sh <id> --plan-only succeeds
for all 10 scenarios currently in setup_scenarios.
Refs: #3588, #4378
No actionable comments were generated in the recent review. 🎉
ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 Walkthrough
Migrates e2e scenario execution from a TypeScript registry/runner to a shell script dispatcher, updating GH Actions to call test/e2e-scenario/runtime/run-scenario.sh per scenario, adjusting tools/validators, skipping two advisor tests, and removing legacy assertion/client exports and one manifest.
Changes
E2E Scenario Execution Modernization
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
E2E Advisor Recommendation
Required E2E: None
Full advisor summary
E2E Recommendation Advisor
Failed: Could not parse JSON from advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/e2e-advisor/e2e-advisor-raw-output.txt
E2E Scenario Advisor Recommendation
Required scenario E2E:
Dispatch required scenario E2E:
Full scenario advisor summary
E2E Scenario Advisor
Base:
Required scenario E2E
Optional scenario E2E
Relevant changed files
🧹 Nitpick comments (1)
tools/e2e-advisor/scenarios.mts (1)
287-295: ⚡ Quick win
Surface skipped scenario resolution failures instead of swallowing them.
At Line 293, the empty catch makes advisor output silently incomplete when resolution breaks. Please at least record skipped IDs/errors so the result is auditable.
Proposed patch
 function loadScenarios(root: string): Record<string, ScenarioEntry> {
   const meta = loadMetadataFromDir(path.join(root, "test/e2e-scenario"));
   const out: Record<string, ScenarioEntry> = {};
+  const skipped: string[] = [];
   for (const id of Object.keys(meta.scenarios.setup_scenarios)) {
     try {
       const plan = resolveScenario(id, meta);
       out[id] = {
         suites: plan.suites.map((s) => s.id),
         runner_requirements: plan.runner_requirements ?? [],
       };
-    } catch {
-      // Skip scenarios that fail to resolve; they are not advisable targets.
+    } catch (error: unknown) {
+      skipped.push(
+        `${id}: ${error instanceof Error ? error.message : String(error)}`,
+      );
     }
   }
+  if (skipped.length > 0) {
+    console.warn(
+      `e2e-advisor: skipped ${skipped.length} unresolved scenario(s)\n- ${skipped.join("\n- ")}`,
+    );
+  }
   return out;
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tools/e2e-advisor/scenarios.mts` around lines 287 - 295, The empty catch currently swallows failures from resolveScenario(id, meta); change it to catch the error (e.g., catch (err)) and record the failure into the output so skipped scenarios are auditable — for example, set out[id] to include an error field (stringified error message/stack) and a flag like skipped: true or add the id+error to a skipped_scenarios list; update the existing try block that builds out[id] (using resolveScenario, plan.suites, plan.runner_requirements) so failures populate out with diagnostic info rather than being ignored.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 0c87d069-8a34-44e5-b663-3f9fd532a80a
📒 Files selected for processing (69)
- .github/workflows/e2e-scenarios.yaml
- test/e2e-scenario-advisor.test.ts
- test/e2e-scenario/framework-tests/e2e-assertion-modules.test.ts
- test/e2e-scenario/framework-tests/e2e-manifests.test.ts
- test/e2e-scenario/framework-tests/e2e-migration-inventory-lock.test.ts
- test/e2e-scenario/framework-tests/e2e-phase-orchestrators.test.ts
- test/e2e-scenario/framework-tests/e2e-plan-compiler.test.ts
- test/e2e-scenario/framework-tests/e2e-scenario-registry.test.ts
- test/e2e-scenario/framework-tests/e2e-scenarios-workflow.test.ts
- test/e2e-scenario/manifests/hermes-nvidia-discord.yaml
- test/e2e-scenario/manifests/hermes-nvidia-slack.yaml
- test/e2e-scenario/manifests/hermes-nvidia.yaml
- test/e2e-scenario/manifests/openclaw-nvidia-brave.yaml
- test/e2e-scenario/manifests/openclaw-nvidia-brev-launchable.yaml
- test/e2e-scenario/manifests/openclaw-nvidia-custom-policies.yaml
- test/e2e-scenario/manifests/openclaw-nvidia-discord.yaml
- test/e2e-scenario/manifests/openclaw-nvidia-double-provider-switch.yaml
- test/e2e-scenario/manifests/openclaw-nvidia-double-same-provider.yaml
- test/e2e-scenario/manifests/openclaw-nvidia-gateway-port-conflict.yaml
- test/e2e-scenario/manifests/openclaw-nvidia-invalid-key.yaml
- test/e2e-scenario/manifests/openclaw-nvidia-macos.yaml
- test/e2e-scenario/manifests/openclaw-nvidia-no-docker-negative.yaml
- test/e2e-scenario/manifests/openclaw-nvidia-repair.yaml
- test/e2e-scenario/manifests/openclaw-nvidia-resume.yaml
- test/e2e-scenario/manifests/openclaw-nvidia-slack.yaml
- test/e2e-scenario/manifests/openclaw-nvidia-telegram.yaml
- test/e2e-scenario/manifests/openclaw-nvidia-token-rotation.yaml
- test/e2e-scenario/manifests/openclaw-nvidia-wsl.yaml
- test/e2e-scenario/manifests/openclaw-nvidia.yaml
- test/e2e-scenario/manifests/openclaw-ollama-gpu.yaml
- test/e2e-scenario/manifests/openclaw-openai-compatible.yaml
- test/e2e-scenario/onboarding_assertions/base/00-cli-installed.sh
- test/e2e-scenario/onboarding_assertions/preflight/00-preflight-expected-failed.sh
- test/e2e-scenario/onboarding_assertions/preflight/00-preflight-passed.sh
- test/e2e-scenario/scenarios/assertions/diagnostics.ts
- test/e2e-scenario/scenarios/assertions/environment.ts
- test/e2e-scenario/scenarios/assertions/hermes.ts
- test/e2e-scenario/scenarios/assertions/inference.ts
- test/e2e-scenario/scenarios/assertions/lifecycle.ts
- test/e2e-scenario/scenarios/assertions/messaging.ts
- test/e2e-scenario/scenarios/assertions/negative.ts
- test/e2e-scenario/scenarios/assertions/onboarding.ts
- test/e2e-scenario/scenarios/assertions/platform.ts
- test/e2e-scenario/scenarios/assertions/registry.ts
- test/e2e-scenario/scenarios/assertions/runtime.ts
- test/e2e-scenario/scenarios/assertions/security.ts
- test/e2e-scenario/scenarios/builder.ts
- test/e2e-scenario/scenarios/clients/agent.ts
- test/e2e-scenario/scenarios/clients/gateway.ts
- test/e2e-scenario/scenarios/clients/host-cli.ts
- test/e2e-scenario/scenarios/clients/provider.ts
- test/e2e-scenario/scenarios/clients/sandbox.ts
- test/e2e-scenario/scenarios/clients/state.ts
- test/e2e-scenario/scenarios/compiler.ts
- test/e2e-scenario/scenarios/js-yaml.d.ts
- test/e2e-scenario/scenarios/manifests.ts
- test/e2e-scenario/scenarios/matrix.ts
- test/e2e-scenario/scenarios/migration-inventory.ts
- test/e2e-scenario/scenarios/orchestrators/environment.ts
- test/e2e-scenario/scenarios/orchestrators/onboarding.ts
- test/e2e-scenario/scenarios/orchestrators/phase.ts
- test/e2e-scenario/scenarios/orchestrators/runner.ts
- test/e2e-scenario/scenarios/orchestrators/runtime.ts
- test/e2e-scenario/scenarios/registry.ts
- test/e2e-scenario/scenarios/run.ts
- test/e2e-scenario/scenarios/scenarios/baseline.ts
- test/e2e-scenario/scenarios/types.ts
- tools/e2e-advisor/scenarios.mts
- tools/e2e-scenarios/workflow-boundary.mts
💤 Files with no reviewable changes (64)
- test/e2e-scenario/scenarios/assertions/platform.ts
- test/e2e-scenario/manifests/openclaw-ollama-gpu.yaml
- test/e2e-scenario/scenarios/assertions/hermes.ts
- test/e2e-scenario/manifests/openclaw-nvidia-repair.yaml
- test/e2e-scenario/scenarios/clients/host-cli.ts
- test/e2e-scenario/manifests/openclaw-nvidia-double-same-provider.yaml
- test/e2e-scenario/manifests/openclaw-nvidia-gateway-port-conflict.yaml
- test/e2e-scenario/manifests/hermes-nvidia-slack.yaml
- test/e2e-scenario/scenarios/builder.ts
- test/e2e-scenario/framework-tests/e2e-assertion-modules.test.ts
- test/e2e-scenario/manifests/openclaw-openai-compatible.yaml
- test/e2e-scenario/scenarios/clients/gateway.ts
- test/e2e-scenario/scenarios/orchestrators/environment.ts
- test/e2e-scenario/manifests/openclaw-nvidia-telegram.yaml
- test/e2e-scenario/scenarios/orchestrators/runner.ts
- test/e2e-scenario/scenarios/clients/sandbox.ts
- test/e2e-scenario/manifests/hermes-nvidia.yaml
- test/e2e-scenario/framework-tests/e2e-plan-compiler.test.ts
- test/e2e-scenario/scenarios/orchestrators/runtime.ts
- test/e2e-scenario/scenarios/orchestrators/phase.ts
- test/e2e-scenario/manifests/openclaw-nvidia-invalid-key.yaml
- test/e2e-scenario/scenarios/assertions/messaging.ts
- test/e2e-scenario/manifests/openclaw-nvidia.yaml
- test/e2e-scenario/manifests/openclaw-nvidia-no-docker-negative.yaml
- test/e2e-scenario/scenarios/manifests.ts
- test/e2e-scenario/manifests/openclaw-nvidia-brev-launchable.yaml
- test/e2e-scenario/scenarios/assertions/negative.ts
- test/e2e-scenario/scenarios/clients/agent.ts
- test/e2e-scenario/manifests/openclaw-nvidia-token-rotation.yaml
- test/e2e-scenario/framework-tests/e2e-manifests.test.ts
- test/e2e-scenario/framework-tests/e2e-scenario-registry.test.ts
- test/e2e-scenario/onboarding_assertions/base/00-cli-installed.sh
- test/e2e-scenario/manifests/openclaw-nvidia-macos.yaml
- test/e2e-scenario/scenarios/assertions/lifecycle.ts
- test/e2e-scenario/scenarios/scenarios/baseline.ts
- test/e2e-scenario/scenarios/assertions/environment.ts
- test/e2e-scenario/scenarios/clients/provider.ts
- test/e2e-scenario/scenarios/assertions/runtime.ts
- test/e2e-scenario/scenarios/types.ts
- test/e2e-scenario/manifests/openclaw-nvidia-wsl.yaml
- test/e2e-scenario/manifests/hermes-nvidia-discord.yaml
- test/e2e-scenario/framework-tests/e2e-migration-inventory-lock.test.ts
- test/e2e-scenario/scenarios/clients/state.ts
- test/e2e-scenario/framework-tests/e2e-phase-orchestrators.test.ts
- test/e2e-scenario/manifests/openclaw-nvidia-brave.yaml
- test/e2e-scenario/scenarios/migration-inventory.ts
- test/e2e-scenario/scenarios/matrix.ts
- test/e2e-scenario/scenarios/assertions/registry.ts
- test/e2e-scenario/scenarios/assertions/diagnostics.ts
- test/e2e-scenario/manifests/openclaw-nvidia-custom-policies.yaml
- test/e2e-scenario/manifests/openclaw-nvidia-double-provider-switch.yaml
- test/e2e-scenario/manifests/openclaw-nvidia-resume.yaml
- test/e2e-scenario/onboarding_assertions/preflight/00-preflight-passed.sh
- test/e2e-scenario/onboarding_assertions/preflight/00-preflight-expected-failed.sh
- test/e2e-scenario/scenarios/js-yaml.d.ts
- test/e2e-scenario/scenarios/assertions/security.ts
- test/e2e-scenario/scenarios/registry.ts
- test/e2e-scenario/manifests/openclaw-nvidia-slack.yaml
- test/e2e-scenario/scenarios/run.ts
- test/e2e-scenario/scenarios/compiler.ts
- test/e2e-scenario/scenarios/assertions/onboarding.ts
- test/e2e-scenario/scenarios/assertions/inference.ts
- test/e2e-scenario/manifests/openclaw-nvidia-discord.yaml
- test/e2e-scenario/scenarios/orchestrators/onboarding.ts
PR Review Advisor
Findings: 2 needs attention, 7 worth checking, 0 nice ideas
Review findings
🛠️ Needs attention
🔎 Worth checking
🌱 Nice ideas
Since last review details
Current findings:
This is an automated advisory review. A human maintainer must make the final merge decision.
…ired flag
Closes two of the three sub-points from the PR Review Advisor's
'Validate scenario IDs against ROUTES' finding. The remaining sub-point
(membership-in-ROUTES allowlist) is deferred to a follow-up that lands
after #4379, since ROUTES will move with that PR and the runtime
resolve-runner job already rejects unknown ids at dispatch.
In tools/e2e-advisor/scenarios.mts:
- e2e-scenarios-all.yaml fan-out is the only valid pairing for the
synthetic id 'e2e-scenarios-all'. Drop incoherent pairings in either
direction so the sticky comment never renders a fan-out command for a
single scenario id, or a single-scenario --field scenarios= line
carrying the synthetic fan-out id.
- Authority for the required boolean is array position: items in
required[] are required; items in optional[] are optional. The model's
per-item value is ignored so it cannot promote/demote a recommendation
against its bucket.
In test/e2e-scenario-advisor.test.ts:
- Negative test: workflow/id pairing mismatches in both directions are
dropped; only the valid (e2e-scenarios-all.yaml, e2e-scenarios-all)
pairing survives.
- Negative test: model-supplied required boolean cannot override the
array position.
Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>
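The two rules in this commit (pairing coherence and bucket position as the authority for the required flag) can be sketched roughly as follows. This is an illustration under assumptions, not the code in tools/e2e-advisor/scenarios.mts: the `RawItem` and `Recommendation` shapes and the function names are hypothetical, and only the fan-out workflow name and synthetic id come from the commit message.

```typescript
// Hypothetical sketch: type and function names are illustrative only.
// The two constants are the fan-out pairing named in the commit message.
const FAN_OUT_WORKFLOW = "e2e-scenarios-all.yaml";
const FAN_OUT_ID = "e2e-scenarios-all";

interface RawItem {
  workflow: string;
  id: string;
  required?: boolean; // model-supplied value, deliberately ignored below
}

interface Recommendation {
  workflow: string;
  id: string;
  required: boolean;
}

// A pairing is coherent when the fan-out workflow and the synthetic
// fan-out id appear together, or neither appears.
function isCoherentPairing(item: RawItem): boolean {
  return (item.workflow === FAN_OUT_WORKFLOW) === (item.id === FAN_OUT_ID);
}

// The bucket an item arrives in decides `required`; the per-item
// boolean from the model never overrides it.
function normalizeRecommendations(raw: {
  required?: RawItem[];
  optional?: RawItem[];
}): Recommendation[] {
  const fromBucket = (items: RawItem[] | undefined, required: boolean): Recommendation[] =>
    (items ?? [])
      .filter(isCoherentPairing)
      .map(({ workflow, id }) => ({ workflow, id, required }));
  return [...fromBucket(raw.required, true), ...fromBucket(raw.optional, false)];
}
```

Comparing the two equality checks with `===` rejects a mismatch in either direction in a single expression, which matches the "either direction" wording of the commit.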
After switching CI from npx tsx run.ts to bash run-scenario.sh in this PR,
the workflow's summary and upload steps still referenced the deleted TS
runner's artifact names (plan.txt, run-plan.json, *.result.json). With
E2E_CONTEXT_DIR overridden to ${{ github.workspace }} (workspace root),
plan.json was not even landing under .e2e/, so the upload step warned
'No files were found with the provided path: .e2e/run-plan.json' on the
failing scenario run, publishing no useful plan artifact for reviewers.
Changes:
- Set E2E_CONTEXT_DIR to ${{ github.workspace }}/.e2e so run-scenario.sh
writes plan.json/expected-state-report.json/expected-vs-actual.json
and install.log/onboard.log/etc. under .e2e/ as the upload glob expects.
- Update the plan summary step to read plan.json (the YAML resolver's
actual output) instead of the deleted plan.txt.
- Trim the upload list to .e2e/ (covers all current runtime artifacts)
plus test/e2e/logs/, removing references to deleted-runner files.
Refs: #4378, advisor finding from PR #4379 review-advisor run.
Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>
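To illustrate why the E2E_CONTEXT_DIR change matters, here is a small sketch that checks which runtime artifacts would fall outside the uploaded .e2e/ directory. The artifact names are taken from the commit message above; the function names, the POSIX-style path handling, and the idea of checking this in TypeScript are assumptions made purely for illustration.

```typescript
// Hypothetical sketch: artifact names come from the commit message,
// everything else (names, string-based path handling) is assumed.
const RUNTIME_ARTIFACTS = [
  "plan.json",
  "expected-state-report.json",
  "expected-vs-actual.json",
  "install.log",
  "onboard.log",
];

// Removes trailing slashes so joined paths do not contain "//".
function stripTrailingSlash(dir: string): string {
  return dir.replace(/\/+$/, "");
}

// Lists where the runner would write each artifact for a given context dir.
function artifactPaths(contextDir: string): string[] {
  const base = stripTrailingSlash(contextDir);
  return RUNTIME_ARTIFACTS.map((name) => `${base}/${name}`);
}

// Returns the artifacts that an upload of <workspace>/<uploadDir>/ would miss.
function missedByUpload(workspace: string, contextDir: string, uploadDir = ".e2e"): string[] {
  const uploadRoot = `${stripTrailingSlash(workspace)}/${uploadDir}/`;
  return artifactPaths(contextDir).filter((p) => !p.startsWith(uploadRoot));
}
```

With the context directory set to the workspace root, every artifact is reported as missed; with it set to the .e2e subdirectory, none are, which is the situation this commit aims for.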
Summary
The hybrid scenario suite under test/e2e-scenario/ had two runner entrypoints:
- test/e2e-scenario/scenarios/run.ts (OLD: TypeScript DSL + builders)
- test/e2e-scenario/runtime/run-scenario.sh (NEW: YAML metadata + bash runtime)
The new bash runner (via runtime/resolver/*) is self-contained and does not import anything from scenarios/. All recently-developed tests target the new path. The OLD TS framework was kept only to satisfy CI (which still called run.ts) and a handful of framework-tests that exercised the OLD DSL.
This PR deletes the OLD path and migrates CI to the new bash entrypoint.
What's deleted
OLD framework, only consumed by run.ts:
- test/e2e-scenario/scenarios/: entire TS DSL (33 files: run.ts, compiler.ts, registry.ts, builder.ts, manifests.ts, matrix.ts, migration-inventory.ts, types.ts, js-yaml.d.ts, orchestrators/, scenarios/baseline.ts, assertions/, clients/)
- test/e2e-scenario/manifests/: only consumed by scenarios/manifests.ts
- test/e2e-scenario/onboarding_assertions/: dead in NEW path (no helper sources or runs them; YAML declares string IDs but the bash runner never invokes the .sh files)
- 6 framework-tests that imported scenarios/*:
  - e2e-assertion-modules.test.ts
  - e2e-manifests.test.ts
  - e2e-migration-inventory-lock.test.ts
  - e2e-phase-orchestrators.test.ts
  - e2e-plan-compiler.test.ts
  - e2e-scenario-registry.test.ts
What's updated
- .github/workflows/e2e-scenarios.yaml: replaced 3 npx tsx ...run.ts invocations with bash test/e2e-scenario/runtime/run-scenario.sh <id>, looping over comma-separated IDs in both the linux and WSL paths.
- tools/e2e-scenarios/workflow-boundary.mts: assert the bash entrypoint instead of run.ts.
- test/e2e-scenario/framework-tests/e2e-scenarios-workflow.test.ts: update fixture to the new entrypoint.
- tools/e2e-advisor/scenarios.mts (bridge edit): swap listScenarios() (OLD registry) for loadMetadataFromDir() + resolveScenario() (NEW YAML resolver), so the deterministic scenario advisor still compiles after the deletion.
Follow-up
- test/e2e-scenario-advisor.test.ts: 2 cases skipped pending #4378.
While doing the bridge edit on scenarios.mts, we discovered that nemoclaw_scenarios/scenarios.yaml's setup_scenarios: is missing user-friendly aliases for 13 layered test_plans (telegram, discord, slack, brave, resume, repair, double-same-provider, double-provider-switch, token-rotation, openai-compatible, hermes-discord, hermes-slack). The OLD TS registry was the only thing that knew those IDs. #4378 (child of epic #3588 Phase 4) tracks adding the aliases. A separate session is also overhauling the deterministic scenario advisor and may obsolete this code path entirely.
Verification
- … #4378 reference.
- bash test/e2e-scenario/runtime/run-scenario.sh <id> --plan-only succeeds for all 10 scenarios currently in setup_scenarios:.
Refs: #3588, #4378