fix: cross-demo script regressions + multi-ECU smoke-test flake#60
Merged
Conversation
The per-demo wrappers source lib/setup-trigger.sh and lib/watch-trigger.sh
from inside demos/<name>/. Both sourced files then re-source
triggers-api.sh with:
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
When sourced, $0 stays the caller's script path - the demo wrapper -
not the lib file's path. So SCRIPT_DIR resolves to demos/<name>/ and
the second source fails with:
demos/<name>/triggers-api.sh: No such file or directory
${BASH_SOURCE[0]} is the path of the file currently executing the line,
which is what we actually want.
…lisation The gateway normalises app IDs with hyphens - the entity is registered as e.g. diagnostic-bridge, manipulation-monitor, not diagnostic_bridge. The trigger wrappers were assuming the ROS-node-name spelling (underscore), which matches the reporting_sources path inside the FaultEvent payload but not the URL-addressable entity ID. setup-triggers returned HTTP 404 from /apps/<underscore_id>/triggers and watch-triggers found no active trigger to subscribe to. Switch every wrapper to the hyphenated form. For turtlebot3 also switch from anomaly-detector to diagnostic-bridge: faults on this stack arrive from /diagnostics through the bridge and bucket against the bridge's faults collection, so the anomaly-detector trigger would never see an event even with the correct ID.
Container scripts in moveit_pick_place and multi_ecu_aggregation start with "set -eu" and then source /opt/ros/jazzy/setup.bash. setup.bash dereferences AMENT_TRACE_SETUP_FILES without guarding for unset, so under nounset the source aborts before any payload runs: /opt/ros/jazzy/setup.bash: line 8: AMENT_TRACE_SETUP_FILES: unbound variable The Scripts API reports this as "Script exited with code 1" and the inject/restore never lands. Wrap the source pair with set +u / set -u so AMENT_* defaults can stay implicit while the rest of the script keeps strict variable checking.
…alization-failure The script reinitialises AMCL (scatters particles) and then asks bt-navigator to navigate. Under scattered particles the action server returns HTTP 400 with x-medkit-ros2-action-rejected - which is the demo's intended failure mode, the whole point of the injection. curl -sf turns that 400 into exit 22 and aborts the script before the final "Localization failure injected." print, and the Scripts API reports the script as failed even though the fault path it was supposed to trigger fired correctly. Drop -f on the navigate_to_pose call and print the response body (matching the pattern used in inject-nav-failure on the same demo).
Four host-side helper scripts lost their executable bit somewhere along the way: demos/sensor_diagnostics/inject-fault-scenario.sh demos/sensor_diagnostics/run-diagnostics.sh demos/moveit_pick_place/arm-self-test.sh demos/moveit_pick_place/planning-benchmark.sh README and run-demo.sh both reference them as ./script.sh, so running the demos as documented hits "Permission denied". chmod +x to restore the bit; no content change.
…ting The readiness gate polled only for the planning ECU's path-planner app, then asserted on all three peer ECUs. When the actuation ECU was slower to aggregate into robot-alpha, the apps assertion ran before its nodes were linked and reported "found 7" (perception + planning only), failing build-and-test-multi-ecu intermittently. The gate now waits for a representative app from each peer ECU (path-planner for planning, motor-controller for actuation) before the discovery assertions run.
There was a problem hiding this comment.
Pull request overview
This PR fixes several shell-script regressions across multiple demos and hardens the multi-ECU smoke test to reduce CI flakiness by gating on aggregation readiness from multiple peer ECUs.
Changes:
- Make trigger helper libs source-safe by resolving their own directory via
BASH_SOURCE[0]. - Align demo trigger scripts to gateway-normalized, hyphenated entity IDs (and update TB3 to use
diagnostic-bridge). - Reduce multi-ECU smoke-test flake by waiting for representative apps from both planning and actuation ECUs before asserting discovery.
- Relax
nounsetaround ROSsetup.bashsourcing in several container scripts.
Reviewed changes
Copilot reviewed 19 out of 23 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
tests/smoke_test_multi_ecu.sh |
Adds a stronger aggregation readiness gate by waiting for both planning + actuation marker apps. |
lib/watch-trigger.sh |
Uses BASH_SOURCE[0] to resolve paths correctly when sourced. |
lib/setup-trigger.sh |
Uses BASH_SOURCE[0] to resolve paths correctly when sourced. |
demos/turtlebot3_integration/watch-triggers.sh |
Switches watched entity ID to diagnostic-bridge (hyphenated). |
demos/turtlebot3_integration/setup-triggers.sh |
Switches trigger target to diagnostic-bridge and updates explanatory comments. |
demos/turtlebot3_integration/container_scripts/nav2-stack/inject-localization-failure/script.bash |
Treats nav2 rejection as expected by removing curl -f and printing the response. |
demos/sensor_diagnostics/watch-triggers.sh |
Switches watched entity ID to diagnostic-bridge (hyphenated). |
demos/sensor_diagnostics/setup-triggers.sh |
Switches trigger target to diagnostic-bridge (hyphenated). |
demos/sensor_diagnostics/run-diagnostics.sh |
Adds a Scripts API wrapper to run diagnostics from the host. |
demos/sensor_diagnostics/inject-fault-scenario.sh |
Adds a Scripts API wrapper to inject a composite fault scenario from the host. |
demos/multi_ecu_aggregation/container_scripts/planning-ecu/restore-normal/script.bash |
Wraps ROS environment sourcing with set +u/set -u to avoid nounset failures. |
demos/multi_ecu_aggregation/container_scripts/planning-ecu/inject-planning-delay/script.bash |
Wraps ROS environment sourcing with set +u/set -u to avoid nounset failures. |
demos/multi_ecu_aggregation/container_scripts/perception-ecu/restore-normal/script.bash |
Wraps ROS environment sourcing with set +u/set -u to avoid nounset failures. |
demos/multi_ecu_aggregation/container_scripts/perception-ecu/inject-sensor-failure/script.bash |
Wraps ROS environment sourcing with set +u/set -u to avoid nounset failures. |
demos/multi_ecu_aggregation/container_scripts/actuation-ecu/restore-normal/script.bash |
Wraps ROS environment sourcing with set +u/set -u to avoid nounset failures. |
demos/multi_ecu_aggregation/container_scripts/actuation-ecu/inject-gripper-jam/script.bash |
Wraps ROS environment sourcing with set +u/set -u to avoid nounset failures. |
demos/moveit_pick_place/watch-triggers.sh |
Switches watched entity ID to manipulation-monitor (hyphenated). |
demos/moveit_pick_place/setup-triggers.sh |
Switches trigger target to manipulation-monitor (hyphenated). |
demos/moveit_pick_place/planning-benchmark.sh |
Adds a Scripts API wrapper for the planning benchmark. |
demos/moveit_pick_place/container_scripts/moveit-planning/restore-normal/script.bash |
Wraps ROS environment sourcing with set +u/set -u to avoid nounset failures. |
demos/moveit_pick_place/container_scripts/moveit-planning/inject-planning-failure/script.bash |
Wraps ROS environment sourcing with set +u/set -u to avoid nounset failures. |
demos/moveit_pick_place/container_scripts/moveit-planning/inject-collision/script.bash |
Wraps ROS environment sourcing with set +u/set -u to avoid nounset failures. |
demos/moveit_pick_place/arm-self-test.sh |
Adds a Scripts API wrapper for the arm self-test. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
mfaferek93
approved these changes
May 30, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Fixes a set of demo-script regressions surfaced while running every demo end-to-end with its documented scripts, plus an intermittent multi-ECU CI flake. None of the script issues are new - they accumulated over time and went unnoticed because no single change exercises every demo end-to-end. Split out from #58 so the OTA demo PR stays scoped to the OTA demo and these unrelated fixes land on their own.
Script regressions
BASH_SOURCEinstead of$0-lib/setup-trigger.shandlib/watch-trigger.shresolvedSCRIPT_DIRfrom$0, which stays the caller's path when the lib is sourced, so the re-source oftriggers-api.shfailed with "No such file or directory".diagnostic-bridge,manipulation-monitor, etc. The per-demo trigger wrappers passed the ROS-node-name spelling (diagnostic_bridge), which matches the reporting source path but not the URL-addressable entity ID, sosetup-triggersreturned HTTP 404. turtlebot3 also switched fromanomaly-detectortodiagnostic-bridgebecause faults bucket against the bridge on that stack.setup.bashsourcing - container scripts inmoveit_pick_placeandmulti_ecu_aggregationstart withset -eu, then source/opt/ros/jazzy/setup.bash, which dereferencesAMENT_TRACE_SETUP_FILESwithout guarding for unset. Every inject/restore aborted withAMENT_TRACE_SETUP_FILES: unbound variablebefore the payload ran. Wrapped each source pair withset +u/set -u.inject-localization-failure- the script scatters AMCL particles then asksbt-navigatorto navigate; under scattered particles the action server returns HTTP 400 (the whole point of the demo), butcurl -sfexit 22 aborted before the final print. Dropped-fto matchinject-nav-failureon the same demo.sensor_diagnosticsandmoveit_pick_placelost their executable bit; README andrun-demo.shreference them as./script.sh.chmod +x, no content change.CI flake fix
tests/smoke_test_multi_ecu.shgated readiness on the planning ECU'spath-planneronly, then asserted on all three peer ECUs. When the actuation ECU was slower to aggregate intorobot-alpha, the apps assertion ran early and reported "found 7" instead of >=10, failingbuild-and-test-multi-ecuintermittently. The gate now waits for a representative app from each peer ECU (path-plannerfor planning,motor-controllerfor actuation) before the discovery assertions run.Test plan
shellcheckclean across all*.sh/*.bash(CI-equivalentfind | xargs shellcheck)multi_ecu_aggregationstack locally and rantests/smoke_test_multi_ecu.sh: 24/24 pass, with the gate now reportingplanning ECU aggregatedandactuation ECU aggregatedbefore the assertionsactuation ECU not aggregated within 60s (motor-controller missing), instead of racing into a confusing "found 7" assertion failureRelated Issue
Split out from #58.
Checklist