System Stepping implemented as Resource #8453

dmlary · 2023-04-21T12:18:57Z

Objective

Add interactive system debugging capabilities to bevy, providing step/break/continue style capabilities to running system schedules.

Original implementation: System & frame stepping support #8063
- ignore_stepping() everywhere was too much complexity
Schedule-config & Resource discussion: Update/FixedUpdate Stepping #8168
- Decided on selective adding of Schedules & Resource-based control

Solution

Created Stepping Resource. This resource can be used to enable stepping on a per-schedule basis. Systems within schedules can be individually configured to:

AlwaysRun: Ignore any stepping state and run every frame
NeverRun: Never run while stepping is enabled
- this allows for disabling of systems while debugging
Break: If we're running the full frame, stop before this system is run

Stepping provides two modes of execution that reflect traditional debuggers:

Step-based: Only execute one system at a time
Continue/Break: Run all systems, but stop before running a system marked as Break

Demo

trimmed.mov

Breakout has been modified to use Stepping. The game runs normally for a couple of seconds, then stepping is enabled and the game appears to pause. A list of Schedules & Systems appears with a cursor at the first System in the list. The demo then steps forward full frames using the spacebar until the ball is about to hit a brick. Then we step system by system as the ball impacts a brick, showing the cursor moving through the individual systems. Finally the demo switches back to frame stepping as the ball changes course.

Limitations

Due to architectural constraints in bevy, there are some cases systems stepping will not function as a user would expect.

Event-driven systems

Stepping does not support systems that are driven by Events as events are flushed after 1-2 frames. Although game systems are not running while stepping, ignored systems are still running every frame, so events will be flushed.

This presents to the user as stepping the event-driven system never executes the system. It does execute, but the events have already been flushed.

This can be resolved by changing event handling to use a buffer for events, and only dropping an event once all readers have read it.

The work-around to allow these systems to properly execute during stepping is to have them ignore stepping: app.add_systems(event_driven_system.ignore_stepping()). This was done in the breakout example to ensure sound played even while stepping.

Conditional Systems

When a system is stepped, it is given an opportunity to run. If the conditions of the system say it should not run, it will not.

Similar to Event-driven systems, if a system is conditional, and that condition is only true for a very small time window, then stepping the system may not execute the system. This includes depending on any sort of external clock.

This exhibits to the user as the system not always running when it is stepped.

A solution to this limitation is to ensure any conditions are consistent while stepping is enabled. For example, all systems that modify any state the condition uses should also enable stepping.

State-transition Systems

Stepping is configured on the per-Schedule level, requiring the user to have a ScheduleLabel.

To support state-transition systems, bevy generates needed schedules dynamically. Currently it’s very difficult (if not impossible, I haven’t verified) for the user to get the labels for these schedules.

Without ready access to the dynamically generated schedules, and a resolution for the Event lifetime, stepping of the state-transition systems is not supported

Changelog

Schedule::run() updated to consult Stepping Resource to determine which Systems to run each frame
Added Schedule.label as a BoxedSystemLabel, along with supporting Schedule::set_label() and Schedule::label() methods
- Stepping needed to know which Schedule was running, and prior to this PR, Schedule didn't track its own label
- Would have preferred to add Schedule::with_label() and remove Schedule::new(), but this PR touches enough already
Added calls to Schedule.set_label() to App and World as needed
Added Stepping resource
Added Stepping::begin_frame() system to MainSchedulePlugin
- Run before Main::run_main()
- Notifies any Stepping Resource a new render frame is starting

Migration Guide

Add a call to Schedule::set_label() for any custom Schedule
- This is only required if the Schedule will be stepped

Needed a way to identify a `Schedule` by `ScheduleLabel` for stepping. This data wasn't currently stored in the `Schedule`, but only used as a key in the `Schedules` resource. That resource (and key) are not available when stepping is evaluating a `Schedule`, so the label needed to be added. Added `label` to `Schedule`, and updated `add_schedule()` functions to set the label. I would have preferred to implement this as `Schedule::with_label()`, and have `add_schedule()` pull the label directly from `Schedule`, but that's a breaking change for the API. I feel like this is a good compromise as it doesn't break the API, but the data still gets where it needs to be.

`Stepping` Resource for performing system stepping & breakpoints.

A list of systems to skip was added to `ScheduleExecutor::run()`. All executors updated to properly apply the list. Tests added to verify all executors apply the skip list.

Got `Stepping::continue_frame()` functioning. Expanded testing of `Stepping`, and added `assert_schedule_runs!` macro to simplify tests. Most of the functionality is present here, but I need to expand testing further, then do an architecture & polish pass on `Stepping`.

This is the first part of support for introspection of `Stepping` by external systems. As `Stepping#build_skip_list()` is called, we insert the various `Schedule`s we see in `Stepping.schedule_order`. This is exposed to users via `Stepping#schedules()`, and will allow them to generate an ordered list of schedules & systems that will be stepped through.

To support predictable system ordering for external consumers of stepping, I moved the stepping frame cursor up into the `Stepping` resource. `Stepping` will now explicitly follow a schedule order when stepping systems. This order is dynamically created based on the order in which schedules get their skip list. Still don't like the names of some key functions like `Stepping::skipped_systems()`.

need to merge in main for wgpu crash fix

Need to rewrite `Stepping::Cursor` to expose NodeId of next system, and ensure it always contains a valid NodeId. This means tracking what the first NodeId of each System will be.

It was challenging to write a stepping UI using justthe label & system index. So I've updated `Stepping.cursor()` to return `(BoxedScheduleLabel, NodeId)`. This required `Stepping` keeping a list of `NodeId`s.

dmlary · 2023-04-21T12:23:49Z

Opening this so I can use github to do my own code-review.

Please hold off on reviewing it until I pull it out of draft mode.

crates/bevy_app/src/app.rs

crates/bevy_ecs/src/schedule/mod.rs

crates/bevy_ecs/src/schedule/schedule.rs

crates/bevy_ecs/src/schedule/stepping.rs

examples/games/stepping.rs

dmlary · 2024-01-06T01:55:58Z

Ok, added a more straightforward example using the Stepping resource. It's incredibly wordy, but exercises the resource directly in what I hope is a clear manner.

There's gonna be a few pushes to get github to run CI as #10902 (comment) is blocking me from running CI locally.

github-actions · 2024-01-06T02:07:53Z

The generated examples/README.md is out of sync with the example metadata in Cargo.toml or the example readme template. Please run cargo run -p build-templated-pages -- update examples to update it, and commit the file change.

Added LogPlugin to system_stepping example so the Stepping log messages appear in the output. This helps clarify what actions stepping is performing during the example.

atlv24

Looks to be well thought out. Love the unit tests. I'd like to see some tests with multiple copies of the same system in a schedule, and some with run conditions too.

crates/bevy_ecs/src/schedule/stepping.rs

crates/bevy_ecs/src/schedule/schedule.rs

dmlary · 2024-01-06T17:56:33Z

Looks to be well thought out. Love the unit tests. I'd like to see some tests with multiple copies of the same system in a schedule, and some with run conditions too.

This exposed a shortcoming in the assert_schedule_runs!() macro. Split that macro out even further to assert_skip_list_eq!() for the duplicate system matching.

Added test to verify we correctly step through duplicate systems (only one instance of the system runs at a time).
Added test to verify behavior of stepping when the stepped system condition returns false (we don't run any other systems, and we step past it).

atlv24

thanks!!

cart

I think this is in a great spot. It is minimally invasive to existing code, well tested, and behaves as expected. Great job! And sorry for the long review cycle!

crates/bevy_ecs/src/schedule/executor/multi_threaded.rs

cart · 2024-02-03T02:36:15Z

crates/bevy_ecs/src/schedule/stepping.rs

+        schedule: impl ScheduleLabel,
+        system: impl IntoSystem<(), (), Marker>,
+    ) -> &mut Self {
+        let type_id = IntoSystem::into_system(system).type_id();


Ideally we don't need to actually construct the system to get the TypeId here. We should be able to rely on TypeId::of::<IntoSys::System>(). However currently that is out of sync with FunctionSystem's type_id() implementation. I'll add a PERF todo.

Definitely not worth blocking on.

james7132

LGTM. It's great that this is minimally invasive and low overhead (compared to, say, run conditions on every system), and will be super useful when used as a part of the editor.

One wishlist item from me is to support a way to step all parallel systems together, so we provide ways to try to replicate the parallel nature of the schedule. For game devs coming from an engine like Unity or Unreal, where everything is inherently serialized, may find it hard to bridge the gap to Bevy, so providing a debugging option that lets you step through groups of parallel systems may help.

cart · 2024-02-03T05:18:32Z

One wishlist item from me is to support a way to step all parallel systems together, so we provide ways to try to replicate the parallel nature of the schedule. For game devs coming from an engine like Unity or Unreal, where everything is inherently serialized, may find it hard to bridge the gap to Bevy, so providing a debugging option that lets you step through groups of parallel systems may help.

Yup this is a good idea. Fortunately I think it plays nicely into the current "skiplist" approach.

# Objective Add interactive system debugging capabilities to bevy, providing step/break/continue style capabilities to running system schedules. * Original implementation: bevyengine#8063 - `ignore_stepping()` everywhere was too much complexity * Schedule-config & Resource discussion: bevyengine#8168 - Decided on selective adding of Schedules & Resource-based control ## Solution Created `Stepping` Resource. This resource can be used to enable stepping on a per-schedule basis. Systems within schedules can be individually configured to: * AlwaysRun: Ignore any stepping state and run every frame * NeverRun: Never run while stepping is enabled - this allows for disabling of systems while debugging * Break: If we're running the full frame, stop before this system is run Stepping provides two modes of execution that reflect traditional debuggers: * Step-based: Only execute one system at a time * Continue/Break: Run all systems, but stop before running a system marked as Break ### Demo https://user-images.githubusercontent.com/857742/233630981-99f3bbda-9ca6-4cc4-a00f-171c4946dc47.mov Breakout has been modified to use Stepping. The game runs normally for a couple of seconds, then stepping is enabled and the game appears to pause. A list of Schedules & Systems appears with a cursor at the first System in the list. The demo then steps forward full frames using the spacebar until the ball is about to hit a brick. Then we step system by system as the ball impacts a brick, showing the cursor moving through the individual systems. Finally the demo switches back to frame stepping as the ball changes course. ### Limitations Due to architectural constraints in bevy, there are some cases systems stepping will not function as a user would expect. #### Event-driven systems Stepping does not support systems that are driven by `Event`s as events are flushed after 1-2 frames. Although game systems are not running while stepping, ignored systems are still running every frame, so events will be flushed. This presents to the user as stepping the event-driven system never executes the system. It does execute, but the events have already been flushed. This can be resolved by changing event handling to use a buffer for events, and only dropping an event once all readers have read it. The work-around to allow these systems to properly execute during stepping is to have them ignore stepping: `app.add_systems(event_driven_system.ignore_stepping())`. This was done in the breakout example to ensure sound played even while stepping. #### Conditional Systems When a system is stepped, it is given an opportunity to run. If the conditions of the system say it should not run, it will not. Similar to Event-driven systems, if a system is conditional, and that condition is only true for a very small time window, then stepping the system may not execute the system. This includes depending on any sort of external clock. This exhibits to the user as the system not always running when it is stepped. A solution to this limitation is to ensure any conditions are consistent while stepping is enabled. For example, all systems that modify any state the condition uses should also enable stepping. #### State-transition Systems Stepping is configured on the per-`Schedule` level, requiring the user to have a `ScheduleLabel`. To support state-transition systems, bevy generates needed schedules dynamically. Currently it’s very difficult (if not impossible, I haven’t verified) for the user to get the labels for these schedules. Without ready access to the dynamically generated schedules, and a resolution for the `Event` lifetime, **stepping of the state-transition systems is not supported** --- ## Changelog - `Schedule::run()` updated to consult `Stepping` Resource to determine which Systems to run each frame - Added `Schedule.label` as a `BoxedSystemLabel`, along with supporting `Schedule::set_label()` and `Schedule::label()` methods - `Stepping` needed to know which `Schedule` was running, and prior to this PR, `Schedule` didn't track its own label - Would have preferred to add `Schedule::with_label()` and remove `Schedule::new()`, but this PR touches enough already - Added calls to `Schedule.set_label()` to `App` and `World` as needed - Added `Stepping` resource - Added `Stepping::begin_frame()` system to `MainSchedulePlugin` - Run before `Main::run_main()` - Notifies any `Stepping` Resource a new render frame is starting ## Migration Guide - Add a call to `Schedule::set_label()` for any custom `Schedule` - This is only required if the `Schedule` will be stepped --------- Co-authored-by: Carter Anderson <mcanders1@gmail.com>

# Objective * Fixes #11932 (performance impact when stepping is disabled) ## Solution The `Option<FixedBitSet>` argument added to `ScheduleExecutor::run()` in #8453 caused a measurable performance impact even when stepping is disabled. This can be seen by the benchmark of running `Schedule:run()` on an empty schedule in a tight loop (#11932 (comment)). I was able to get the same performance results as on 0.12.1 by changing the argument `ScheduleExecutor::run()` from `Option<FixedBitSet>` to `Option<&FixedBitSet>`. The down-side of this change is that `Schedule::run()` now takes about 6% longer (3.7319 ms vs 3.9855ns) when stepping is enabled --- ## Changelog * Change `ScheduleExecutor::run()` `_skipped_systems` from `Option<FixedBitSet>` to `Option<&FixedBitSet>` * Added a few benchmarks to measure `Schedule::run()` performance with various executors

# Objective * Fixes bevyengine#11932 (performance impact when stepping is disabled) ## Solution The `Option<FixedBitSet>` argument added to `ScheduleExecutor::run()` in bevyengine#8453 caused a measurable performance impact even when stepping is disabled. This can be seen by the benchmark of running `Schedule:run()` on an empty schedule in a tight loop (bevyengine#11932 (comment)). I was able to get the same performance results as on 0.12.1 by changing the argument `ScheduleExecutor::run()` from `Option<FixedBitSet>` to `Option<&FixedBitSet>`. The down-side of this change is that `Schedule::run()` now takes about 6% longer (3.7319 ms vs 3.9855ns) when stepping is enabled --- ## Changelog * Change `ScheduleExecutor::run()` `_skipped_systems` from `Option<FixedBitSet>` to `Option<&FixedBitSet>` * Added a few benchmarks to measure `Schedule::run()` performance with various executors

# Objective * Fixes #11932 (performance impact when stepping is disabled) ## Solution The `Option<FixedBitSet>` argument added to `ScheduleExecutor::run()` in #8453 caused a measurable performance impact even when stepping is disabled. This can be seen by the benchmark of running `Schedule:run()` on an empty schedule in a tight loop (#11932 (comment)). I was able to get the same performance results as on 0.12.1 by changing the argument `ScheduleExecutor::run()` from `Option<FixedBitSet>` to `Option<&FixedBitSet>`. The down-side of this change is that `Schedule::run()` now takes about 6% longer (3.7319 ms vs 3.9855ns) when stepping is enabled --- ## Changelog * Change `ScheduleExecutor::run()` `_skipped_systems` from `Option<FixedBitSet>` to `Option<&FixedBitSet>` * Added a few benchmarks to measure `Schedule::run()` performance with various executors

dmlary added 15 commits March 26, 2023 20:36

WIP: Add bevy_ecs::schedule::stepping

8db83c3

`Stepping` Resource for performing system stepping & breakpoints.

Update ScheduleExecutor for stepping

4ac6dfa

A list of systems to skip was added to `ScheduleExecutor::run()`. All executors updated to properly apply the list. Tests added to verify all executors apply the skip list.

Unify Stepping updates into single enum

6714bcf

Merge remote-tracking branch 'upstream/main' into stepping-resource

e5661c7

combine build step list functions

9004864

WIP: add stepping to breakout

9d85b2e

need to merge in main for wgpu crash fix

Merge remote-tracking branch 'upstream' into stepping-resource

62fa980

Added stepping to breakout

bca6c85

Need to rewrite `Stepping::Cursor` to expose NodeId of next system, and ensure it always contains a valid NodeId. This means tracking what the first NodeId of each System will be.

Merge remote-tracking branch 'upstream/main' into stepping-resource

315a88c

Stepping cursor changed to use label & NodeId

625e690

It was challenging to write a stepping UI using justthe label & system index. So I've updated `Stepping.cursor()` to return `(BoxedScheduleLabel, NodeId)`. This required `Stepping` keeping a list of `NodeId`s.

Add cursor tests, and Stepping _node() methods

4a6cea5

dmlary commented Apr 21, 2023

View reviewed changes

This was referenced Apr 22, 2023

System & frame stepping support #8063

Closed

Update/FixedUpdate Stepping #8168

Closed

dmlary added 3 commits April 22, 2023 15:56

Updates from code-review

16a9e59

Merge remote-tracking branch 'upstream/main' into stepping-resource

a9ebeb5

merge fixes, and CI fixes

1cb1c58

dmlary marked this pull request as ready for review April 22, 2023 22:02

dmlary changed the title ~~WIP: System Stepping implemented as Resource~~ System Stepping implemented as Resource Apr 22, 2023

alice-i-cecile added this to the 0.11 milestone Apr 22, 2023

dmlary added 3 commits January 5, 2024 20:10

update examples/README.md; add LogPlugin to system_stepping

4242192

Added LogPlugin to system_stepping example so the Stepping log messages appear in the output. This helps clarify what actions stepping is performing during the example.

Merge remote-tracking branch 'upstream' into stepping-resource

39b48f7

convert bevy_internal use in system_stepping example

5f88d6b

atlv24 approved these changes Jan 6, 2024

View reviewed changes

Fixes per code-review comments

7cc52fa

atlv24 approved these changes Jan 6, 2024

View reviewed changes

dmlary added 2 commits January 9, 2024 16:00

Merge remote-tracking branch 'upstream' into stepping-resource

3c767ab

Merge remote-tracking branch 'upstream' into stepping-resource

6cdc0c5

matiqo15 added the S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it label Jan 29, 2024

james7132 self-requested a review February 2, 2024 23:51

Minor style tweaks

b455384

cart approved these changes Feb 3, 2024

View reviewed changes

james7132 approved these changes Feb 3, 2024

View reviewed changes

cart added this pull request to the merge queue Feb 3, 2024

Merged via the queue into bevyengine:main with commit 5c52d0a Feb 3, 2024
26 of 27 checks passed

This was referenced Feb 3, 2024

Stored TypeId of systems have an unexpected value #11679

Closed

Add the ability to step through groups of parallel systems #11680

Open

hymm mentioned this pull request Feb 18, 2024

Noticeable schedules running overhead #11932

Closed

dmlary mentioned this pull request Feb 18, 2024

Stepping disabled performance fix #11959

Merged

hymm moved this from Open PR to Merged PR in ECS Feb 25, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

System Stepping implemented as Resource #8453

System Stepping implemented as Resource #8453

dmlary commented Apr 21, 2023

dmlary commented Apr 21, 2023

dmlary commented Jan 6, 2024

github-actions bot commented Jan 6, 2024

atlv24 left a comment

dmlary commented Jan 6, 2024

atlv24 left a comment

cart left a comment

cart Feb 3, 2024

cart Feb 3, 2024

james7132 left a comment

cart commented Feb 3, 2024

System Stepping implemented as Resource #8453

System Stepping implemented as Resource #8453

Conversation

dmlary commented Apr 21, 2023

Objective

Solution

Demo

Limitations

Event-driven systems

Conditional Systems

State-transition Systems

Changelog

Migration Guide

dmlary commented Apr 21, 2023

dmlary commented Jan 6, 2024

github-actions bot commented Jan 6, 2024

atlv24 left a comment

Choose a reason for hiding this comment

dmlary commented Jan 6, 2024

atlv24 left a comment

Choose a reason for hiding this comment

cart left a comment

Choose a reason for hiding this comment

cart Feb 3, 2024

Choose a reason for hiding this comment

cart Feb 3, 2024

Choose a reason for hiding this comment

james7132 left a comment

Choose a reason for hiding this comment

cart commented Feb 3, 2024