[Monitor] Support `repeat ...` loops + Update monitor output on WAL examples by ngernest · Pull Request #194 · cucapra/protocols

ngernest · 2026-02-20T20:55:21Z

This PR extends the monitor to support repeat loops by implementing the design discussed in #183 (see this comment for details about the design).

Other miscellaneous changes:

Define a file macros.rs with some helper macros for logging that print out the line number, function name and file name (the default info! macro from the log crate doesn't do this)
Changed the AugmentedTrace type from a type alias to a tuple struct so that we can define methods over it (see types.rs)
Renamed Stmt::BoundedLoop to RepeatLoop in the interpreter code

Current status (EoD 2/26):

1. Loop with assignments
Consider loop_with_assigns.prot:

// Each iteration of the loop performs the full add protocol:
//   assign inputs -> step -> release inputs -> assert output
// This tests that assignments, DontCare releases, and assertions
// all work correctly inside a bounded loop body.
prot loop_add<DUT: Adder>(in a: u32, in b: u32, in num_iters: u64, out s: u32) {
  repeat num_iters iterations {
    DUT.a := a;
    DUT.b := b;
    step();
    DUT.a := X;
    DUT.b := X;
    assert_eq(s, DUT.s);
  }
  fork();
  step();
}

This is the original transaction file supplied to the interpreter:

trace {
    loop_add(1, 2, 3, 3);   // assert 1+2=3 on each of 3 iterations
    loop_add(10, 20, 1, 30); // assert 10+20=30 on 1 iteration
}

The monitor currently produces the following output:

// trace 0
trace {
    loop_add(1, 2, 3, 3);
    loop_add(10, 20, 1, 30);
    loop_add(198490043, 4254467669, 1, 30);
}

// trace 1
trace {
    loop_add(1, 2, 1, 3);
    loop_add(1, 2, 2, 3);
    loop_add(10, 20, 1, 30);
    loop_add(198490043, 4254467669, 1, 30);
}

// trace 2
trace {
    loop_add(1, 2, 2, 3);
    loop_add(1, 2, 1, 3);
    loop_add(10, 20, 1, 30);
    loop_add(198490043, 4254467669, 1, 30);
}

// trace 3
trace {
    loop_add(1, 2, 1, 3);
    loop_add(1, 2, 1, 3);
    loop_add(1, 2, 1, 3);
    loop_add(10, 20, 1, 30);
    loop_add(198490043, 4254467669, 1, 30);
}

The issue is with the final spurious loop_add protocol in each of the traces suggested by the monitor. This is caused after a legitimate loop_call transaction completes and triggers an ExplicitFork (i.e. another loop_add protocol) at time = 4:

This new thread reads the values of DUT.a and DUT.b at time 4, both of which are randomized values (caused by DontCare), it does a step() (time is now 5), but assert_eq(s, DUT.s) succeeds since DUT.s is still 30. This thread finishes and emits loop_add(random_a, random_b, 1, 30).

Edit: we discussed on Monday how this is fine (this is an issue with how the waveform is being produced by the interpreter), so it doesn't block this PR

2: Nested busy wait
Monitor hangs on this protocol due to exponential blowup in the no. of threads (this also happens for push_pop_loop_empty.prot and push_pop_loop_not_empty.prot for the FIFO example):

// Nested bounded loops: the outer loop runs `outer_iters` times,
// and the inner loop runs `inner_iters` times per outer iteration,
// for a total of `outer_iters * inner_iters` steps.
// Since inputs persist on the adder, the final result is
// the same regardless of how many steps we take.
prot nested_busy_wait<DUT: Adder>(in a: u32, in b: u32, in outer_iters: u64, in inner_iters: u64, out s: u32) {
  DUT.a := a;
  DUT.b := b;
  repeat outer_iters iterations {
    repeat inner_iters iterations {
      step();
      assert_eq(s, DUT.s);
    }
    step();
    assert_eq(s, DUT.s);
  }
  DUT.a := X;
  DUT.b := X;
  assert_eq(s, DUT.s);
  fork();
  step();
}

Edit: we discussed on Monday this is fine. I added a 5 second timeout for Turnt for these specific test cases and allowed them to fail in Turnt to allow CI to proceed.

…tion

…epeat_loops

This reverts commit 1667082.

…ueue is empty

…plicitFork

…_repeat_loops

…us guesses for loop_args in repeat loops

ekiwi · 2026-03-04T14:04:10Z

    pub args_to_pins: FxHashMap<SymbolId, SymbolId>,

+    /// Map which tracks how many times an input/output parameter's value
+    /// was updated *after* it was initially inferred due to the corresponding


@ngernest What does it mean for the monitor if an "input/output parameter's value was updated after it was initially inferred"? Can you provide me with an example of why/how that would happen?

Yep! When the monitor is run on busy_wait.prot, previously there was this (wrong) trace being produced:

... // Trace 1 trace { add_busy_wait(4, 5, 3, 3); // no. of loop iterations is 2nd to last arg, so num_cycles = 3 } ...

This trace is wrong, because if we look at the waveform (busy_wait.fst below), we see that the trace should be add_busy_wait(1, 2, 1, 3); add_busy_wait(4, 5, 2, 9) instead, i.e. this trace is missing the first cycle (doesn't cover the whole waveform). This violates the assumption that at any point in the waveform, there must be at most one protocol that can explain the signals (and since the only protocol in busy_wait.prot is add_busy_wait, the monitor is missing the first instance of add_busy_wait).

Looking at the logs, it appears that the trace only containing add_busy_wait(4, 5, 3, 3) was caused by a thread where the input parameters a, b were initially inferred as 1 and 2 (by reading off DUT.a & DUT.b during the first cycle in the waveform), but after a few iterations of the repeat loop, the values of DUT.a and DUT.b changed to 4 and 5 in the waveform, i.e. this is an example where a parameter's value was initially inferred, but then it changed due to the waveform signal changing.

I should mention this rebind_counts map doesn't change the "runtime" behavior of the monitor (i.e. how it infers parameter values) -- it is only used when there are multiple candidate traces and we want to reject some of them (e.g. traces which only contain one instance of add_busy_wait). For other tests where the parameter values do change over time (e.g. the WAL AXI-Stream example), since there is only one trace produced by the monitor for those (since the monitor waits exactly for ready and valid to both become 1 in the same cycle to detect the data transfer), this strategy doesn't get used.

Looking at the logs, it appears that the trace only containing add_busy_wait(4, 5, 3, 3) was caused by a thread where the input parameters a, b were initially inferred as 1 and 2 (by reading off DUT.a & DUT.b during the first cycle in the waveform), but after a few iterations of the repeat loop, the values of DUT.a and DUT.b changed to 4 and 5 in the waveform,

Should that not lead to a failure of the trace?

This is what should happen when there is, e.g., a Dut.a := a in the protocol:

if a is not defined, use the value of Dut.a to define a

if a is defined, check the value of a against Dut.a if they disagree, then the protocol fails.

So then how would you ever have a rebind_count that isn't just zero?

This is what should happen when there is, e.g., a Dut.a := a in the protocol:

if a is not defined, use the value of Dut.a to define a

if a is defined, check the value of a against Dut.a if they disagree, then the protocol fails.

This is mostly what is done in the current implementation in the main branch, although in the main branch of scheduler.rs here, if a is defined and trace(DUT.a) != previous inferred value for a, we update the value for a.

The reason for this is that WAL AXIS test implicitly relies on this behavior in order to work -- here is the protocol:

prot send_data<D: AXIS>(in data: u8) { D.m_axis_tdata := data; D.m_axis_tvalid := 1'b1; while (!(D.m_axis_tready == 1'b1)) { step(); } // one cycle for the data transfer to take place step(); }

When a thread corresponding to send_data begins, it immediately infers data to be the value of D.m_axis_tdata at the time. However, the thread can stay in the while loop for many cycles, and after every step(), the current implementation of the monitor on the main branch re-performs the "compare data against the trace value for D.m_axis_tdata" check, and updates data to match D.m_axis_tdata, and this causes the WAL tests to currently pass. (As a result, rebind_count can be non-zero under this scheme.)

(Yesterday I tried enforcing a thread failure if trace(DUT.a) != previous inferred value for a but this caused the WAL tests to fail, so the rebind_counts approach was mostly a way to keep the existing monitor behavior while filtering out invalid traces from the add_busy_wait example that don't span the entire waveform. I'm happy to revert this change and try something else though!)

The reason for this is that WAL AXIS test implicitly relies on this behavior in order to work -- here is the protocol:

That means that something is wrong with the WAL AXIS protocol.

I looked at the WAL AXIS waveforms again and I noticed a bug that the existing approach might have caused:

Here is part of the trace produced for axis_truncated.prot (in axis_truncated.out) (with idle protocols explicitly printed out)

... idle(); // [time: 1400ns -> 1408ns] send_data(18); // [time: 1408ns -> 3272ns] idle(); // [time: 3272ns -> 3280ns] ...

This is the relevant section of the waveform corresponding to send_data(18) (between the two markers indicating 1408ns - 3272ns):

A thread is spawned at 1408ns, and at this time, data is 16, but because ready & valid are both 0 during this time, the thread keeps waiting (stuck in the while-loop) until 3264 ns (near the orange marker)
when ready and valid both become 1. During this time from 1408 - 3264ns, data changes from 16 -> 17 -> 18, and because the value of data at 3264 ns is 18, data is updated to be 18 after the next step().

The protocol that shows up in the trace (send_data(18)) is correct, since 18 is the value of data when both ready & valid are 1, but the timestamps are incorrect, since the starting timestamp of the thread 1480ns corresponds to when data = 16 still.

If we change the monitor behavior to more strictly align with this

if a is defined, check the value of a against Dut.a if they disagree, then the protocol fails

then the monitor should show a later starting timestamp for send_data(18), and at 1408ns (when data = 16), we would just have another idle protocol.

A thread is spawned at 1408ns, and at this time, data is 16, but because ready & valid are both 0 during this time, the thread keeps waiting (stuck in the while-loop) until 3264 ns (near the orange marker)

When valid is 0, then the send_data protocol should immediately fail. The protocol immediately constrains D.m_axis_tvalid := 1'b1; and that is never changed. So it cannot match any cycle where m_axis_tvalid is not 1.

Gotcha I'll debug this thanks (removed all the rebind_count logic for now and set threads to fail if their inferred parameter values don't match the waveform value)

…line numbers instead

…nfcombinatorial explosion

…es it correctly

…nferred value != waveform value)

…the current / next cycle) before emitting global monitor failure

ngernest · 2026-03-04T20:21:28Z

Latest updates:

The monitor now fails if a parameter is defined (i.e. we've already determined a value for it) and the waveform value doesn't match it in subsequent cycles
All WAL tests now fail as their waveforms are buggy (data is not stable when valid is asserted), the expected output for these tests in Turnt is an error message saying that no transactions match
- This involved updating the error reporting logic so that if a scheduler fails and no other scheduler can make any progress (i.e. its current and next queues are both empty, indicating that there are no more threads that can be run in either the current or the next cycle), then we report the error (see lines 129 - 149 in global_scheduler.rs)

One of the Brave New World tests has also been updated (the AXI-Stream FIFO axis-fifo-d4, [Monitor] AXI FIFO example from Brave New World artifact #140) as the monitor now fails on one of the waveforms. The monitor now reports an error for d4_fixed.prot, which is expected, since the bug involves fixing a FIFO so that we cannot perform any further push-es when it is full (the buggy protocol d4_buggy.prot still permits pushes even when the FIFO is full).

(More details about the bug in this PR comment)

ngernest added 30 commits February 20, 2026 15:54

Add LoopArgState definition

ad0ff22

Handle speculative loop exeuction

d968619

Exit loop when LoopArg becomes Known(n)

e0075a0

Formatting

8458992

Add loop_arg to args_mapping when forking exit_thread

0996bda

Refactor two cases of the RepeatLoop forking logic into a helper func…

da7824d

…tion

Handle case when multiple loops use the same loop arg

4ff13e6

Disallow nested repeat loops that use the same loop arg

4af2f99

Run Cargo in Quiet mode when running Snapshot Tests

cd0bc93

Add Cargo aliases

8c296ef

Update various Turnt.toml files to use Cargo aliases

bca2aa1

Merge remote-tracking branch 'origin/cargo_quiet_mode' into monitor_r…

59974b4

…epeat_loops

Add adder_d1 repeat loop tests

7a70e62

Add adder_d1 loop waveform files

2c7009d

Add Turnt args to adder_d1 loop tests

c8a42e0

Make Interpreter::to_protocol_application return an option

c261d2a

Add expected output for two adder_d1 tests with loops

68d44ba

Optimize Interpreter::to_protocol_application

d903d7a

Intercept fork before handling other statements

1667082

Revert "Intercept fork before handling other statements"

943d665

This reverts commit 1667082.

Add adder_d1 files that trigger monitor bugs

3793e71

Fix waveform file for wait_and_add

f7fbeea

Ensure that interpreter calls simulator.step even when next_threads q…

09200d6

…ueue is empty

Update waveform file for combinational adder in monitor/tests

919b79a

Move monitor adder tests to another branch

23ad90f

Comments

4ea3961

Make sure exited_threads arising from repeat loops are returned as Ex…

fa5654e

…plicitFork

Formatting

e6568e3

Merge remote-tracking branch 'origin/main' into monitor_repeat_loops

ebd30ea

Merge remote-tracking branch 'origin/monitor_adder_bugs' into monitor…

6b35084

…_repeat_loops

ngernest added 7 commits March 3, 2026 21:48

Track the rebind_count of each protocol parameter to eliminate spurio…

7957a68

…us guesses for loop_args in repeat loops

Update expected output for busy_wait.prot

1abc2a5

Refactor AugmentedTrace as tuple struct, add AugmentedTrace methods

6f1997c

Update Cargo.toml

1a4a37f

Add FIFO tests involving repeat loops

4df5e70

Add nested_busy_wait.prot

da31a87

Add blank file needed for Turnt

89df93d

ekiwi reviewed Mar 4, 2026

View reviewed changes

ngernest added 11 commits March 4, 2026 09:49

Clean up logging logic, use stdext::debug_name to extract file name, …

d424751

…line numbers instead

update comments

9bd8d85

Add timeout for individual Turnt tests w/ repeat loops that trigger i…

d079109

…nfcombinatorial explosion

No op to trigger CI again

12f5ea6

Remove trailing space from Turnt per-test-override so that Turnt pars…

b81cf52

…es it correctly

Remove rebind_counts logic + make binding strict (reject threads if i…

663cb82

…nferred value != waveform value)

Update logging for debugging WAL

abecf74

Updated expected output for WAL test cases (they should fail)

37933b9

rm unnecessary log stmt

6d1e2b5

Update expected output for axis-fifo-d4 test

d3e4b80

Check whether all scheduelrs are done (no more threads to execute in …

e97c7e6

…the current / next cycle) before emitting global monitor failure

ngernest marked this pull request as ready for review March 4, 2026 20:21

ngernest changed the title ~~(WIP) [Monitor] Support repeat ... loops~~ [Monitor] Support repeat ... loops + Update monitor output on WAL examples Mar 4, 2026

ngernest added 2 commits March 4, 2026 15:27

Merge remote-tracking branch 'origin/main' into monitor_repeat_loops

88ecbe4

Remove commented-out code

9a3aa36

ngernest mentioned this pull request Mar 5, 2026

(WIP) [Interp] Port axi-stream-s2 BNW bug for interpreter, rewrite using repeat loops #188

Draft

ngernest added 2 commits March 6, 2026 12:10

Remove unused args_to_pins fields in Interpreter / Thread structs

593b2a3

Update comments/logs

7ebcab9

ngernest merged commit 51de153 into main Mar 6, 2026
18 checks passed

ngernest deleted the monitor_repeat_loops branch March 6, 2026 19:41

ngernest mentioned this pull request Mar 17, 2026

[Monitor] Support repeat ... loops #183

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Monitor] Support `repeat ...` loops + Update monitor output on WAL examples#194

[Monitor] Support `repeat ...` loops + Update monitor output on WAL examples#194
ngernest merged 65 commits into
mainfrom
monitor_repeat_loops

ngernest commented Feb 20, 2026 •

edited

Loading

Uh oh!

ekiwi Mar 4, 2026

Uh oh!

ngernest Mar 4, 2026 •

edited

Loading

Uh oh!

ekiwi Mar 4, 2026

Uh oh!

ngernest Mar 4, 2026 •

edited

Loading

Uh oh!

ekiwi Mar 4, 2026

Uh oh!

ngernest Mar 4, 2026 •

edited

Loading

Uh oh!

ekiwi Mar 4, 2026

Uh oh!

ngernest Mar 4, 2026 •

edited

Loading

Uh oh!

ngernest commented Mar 4, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ngernest commented Feb 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ekiwi Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

ngernest Mar 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ekiwi Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

ngernest Mar 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ekiwi Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

ngernest Mar 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ekiwi Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

ngernest Mar 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ngernest commented Mar 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ngernest commented Feb 20, 2026 •

edited

Loading

ngernest Mar 4, 2026 •

edited

Loading

ngernest Mar 4, 2026 •

edited

Loading

ngernest Mar 4, 2026 •

edited

Loading

ngernest Mar 4, 2026 •

edited

Loading

ngernest commented Mar 4, 2026 •

edited

Loading