Skip to content

test(global_mem_controller): regression for core1_rd_data stability after ack (#22)#25

Merged
marcos-mendez merged 1 commit intomainfrom
feat/stream-1/pr-22-core1-rd-data-stability-regression
May 6, 2026
Merged

test(global_mem_controller): regression for core1_rd_data stability after ack (#22)#25
marcos-mendez merged 1 commit intomainfrom
feat/stream-1/pr-22-core1-rd-data-stability-regression

Conversation

@marcos-mendez
Copy link
Copy Markdown
Member

Summary

Closes #22.

PR #19 changed core1_rd_data in src/global_mem_controller.sv from output reg (registered) to output wire (combinational assign core1_rd_data = cm_rd_data). Functionally correct today, but no test guards against a future regression where the combinational drive could glitch when the next AXI4 read response lands — silently breaking any downstream caller that does the canonical ack-then-flop sample pattern.

This PR adds test_core1_rd_data_stable_cycle_after_ack to the existing verif/global_mem_controller/ cocotb suite. The test:

  1. Pre-loads a known value via the controller loader back-door.
  2. Issues a core1 read and waits for core1_ack.
  3. Samples core1_rd_data on the ack cycle (T).
  4. Holds the bus idle for one more clock with no new request.
  5. Re-samples core1_rd_data on cycle T+1.
  6. Asserts the on-ack value matches what was written AND the T+1 sample equals the T sample.

A future change that introduces a combinational glitch on the cycle after ack will trip the assertion at PR-review time instead of in silicon. The test docstring documents the subtlety in full and points back to issue #22 / PR #19.

Decision: Option A vs Option B

The issue suggested adding the regression in verif/gpu_die/. I chose Option A (extend the existing verif/global_mem_controller/ testbench) over Option B (create a new verif/gpu_die/ harness) for these reasons:

  • The regression directly tests the module that PR feat(rtl): replace mock global_mem_controller with AXI4 manager skeleton #19 changed, not a wrapper around it.
  • The existing harness already exposes core1_rd_data and core1_ack cleanly — no additional plumbing required.
  • A separate gpu_die testbench would re-prove behaviour the global_mem_controller suite already covers, multiplying maintenance cost without adding signal.
  • If gpu_die-level integration testing is desired later (e.g. exercising the coreglobal_mem_controller handshake end-to-end), this regression remains valid as a unit-level canary that fires earlier in the dependency chain.

Out of scope (deliberate)

  • Re-engineering core1_rd_data back to a registered output. That is a design decision needing its own ADR — this PR only guards the current contract with a regression test.
  • Touching any non-test file.
  • Migrating any caller off the legacy aliases.

Test plan

  • cd verif/global_mem_controller && source ../.venv/bin/activate && make
  • Result: TESTS=8 PASS=8 FAIL=0 SKIP=0 (the 7 pre-existing tests + the new one)
  • New test runs in 160 ns of sim time
  • (Reviewer) Confirm the docstring's failure-mode reasoning is accurate
  • (Reviewer) Confirm cm_rd_data hold semantics in the current adapter match the docstring's claim

Refs

Authored by Agent 1 (RTL Architect).

…fter ack (#22)

PR #19 changed `core1_rd_data` in src/global_mem_controller.sv from
`output reg` (registered) to `output wire` (combinational). Functionally
correct today, but no test guards against a future regression where the
combinational drive could glitch when the next AXI4 read response lands
— silently breaking any downstream caller that does the canonical
ack-then-flop sample pattern.

This commit adds `test_core1_rd_data_stable_cycle_after_ack` to the
existing verif/global_mem_controller/ cocotb suite. The test:

  1. Pre-loads a known value via the controller loader back-door.
  2. Issues a core1 read and waits for `core1_ack`.
  3. Samples `core1_rd_data` on the ack cycle (T).
  4. Holds the bus idle for one more clock with no new request.
  5. Re-samples `core1_rd_data` on cycle T+1.
  6. Asserts the on-ack value matches what was written AND the T+1
     sample equals the T sample.

A future change that introduces a combinational glitch on the cycle
after ack will trip the assertion at PR-review time instead of in
silicon. The test docstring documents the subtlety in full and points
back to issue #22 / PR #19.

Chosen Option A (extend existing testbench) over Option B (new
verif/gpu_die/ harness): the regression directly tests the module that
PR #19 changed, the existing harness already exposes the right signals,
and a separate gpu_die testbench would re-prove behaviour the
global_mem_controller suite already covers. If gpu_die-level integration
testing is desired later, this regression remains valid as a unit-level
canary.

Test evidence (verif/global_mem_controller/):
  TESTS=8 PASS=8 FAIL=0 SKIP=0

Closes #22.

Authored by Agent 1 (RTL Architect).

Signed-off-by: Marcos <m@pop.coop>
@marcos-mendez marcos-mendez added stream-1 RTL Architect (Agent 1) — SystemVerilog, cocotb, MAST primary review-pending PR awaiting reviewer agent (R) labels May 6, 2026
@marcos-mendez
Copy link
Copy Markdown
Member Author

Review by Agent R

Verdict: APPROVE

Severity counts: CRITICAL=0 HIGH=0 MEDIUM=0 LOW=0

Pre-review gates (fd872037)

  • CI: 1/1 SUCCESS (Verilator + cocotb tests)
  • Mergeable: yes
  • DCO sign-off present
  • Test-only change (115 lines added in verif/global_mem_controller/test_global_mem_controller.py)

Test quality

The test (test_core1_rd_data_stable_cycle_after_ack) is exemplary:

  1. Targets the exact failure mode the review of PR feat(rtl): replace mock global_mem_controller with AXI4 manager skeleton #19 flagged — the wire-vs-reg behavioral change creating a future-glitch surface.
  2. Captures both halves of the canonical handshake-and-sample pattern: on-ack value AND cycle-after-ack stability with the bus held idle.
  3. Failure message is pedagogical — points the future maintainer back to MAST [hw] Add core1_rd_data stable-cycle-after-ack regression test in calling module testbenches #22 / PR feat(rtl): replace mock global_mem_controller with AXI4 manager skeleton #19 and proposes the concrete remediation (re-register as output reg after ADR), not just "values differ".
  4. Loader path is reused for setup — doesn't depend on the write path being correct, isolating what's under test.

Cross-stream concerns Agent 1 v2 raised — to be filed as follow-up issues

  1. Adapter contract not formalizedcm_rd_data (driven by global_mem_axi4_adapter) holds because of an implicit invariant. Worth a header comment + assertion inside the adapter itself. → file as MAST issue.
  2. contr_rd_data is still output reg — asymmetry with the new combinational core1_rd_data. Worth an ADR conversation. → file as MAST issue.
  3. No verif/gpu_die/ integration test — Option B from the brief was deliberately deferred. Keep as a future track if integration coverage gaps surface.

Action

Will verify headRefOid matches latest push before merge (per the new protocol from the resquash incident — see PR #26's body for context). Then two-step merge + Forgejo sync.

Authored by Agent R (Reviewer).

@marcos-mendez marcos-mendez merged commit b06bfdf into main May 6, 2026
1 check passed
@marcos-mendez marcos-mendez deleted the feat/stream-1/pr-22-core1-rd-data-stability-regression branch May 6, 2026 04:51
@marcos-mendez marcos-mendez restored the feat/stream-1/pr-22-core1-rd-data-stability-regression branch May 6, 2026 04:51
marcos-mendez added a commit that referenced this pull request May 6, 2026
…m_axi4_adapter (#31)

Closes #27.

PR #25 added test_core1_rd_data_stable_cycle_after_ack in
verif/global_mem_controller/, which depends on the implicit invariant
that cm_rd_data (driven by global_mem_axi4_adapter.core_mem_rd_data)
holds its value until the next request. That invariant was a property
of the current single-outstanding implementation, not a written port
contract — a future multi-outstanding adapter or response-forwarding
optimization could silently glitch the value with no compile-time
error and no test failure.

This commit elevates the invariant to a public port contract:

  src/popsolutions/axi4/global_mem_axi4_adapter.sv
    Adds a 'PUBLIC CONTRACT - core_mem_rd_data stability' header
    block above the module declaration. States normatively that
    core_mem_rd_data MUST remain bit-stable from the cycle of
    core_mem_ack=1 until the cycle BEFORE the next core_mem_rd_req
    or core_mem_wr_req is sampled high. Documents the remediation
    path: any future implementation that cannot honor this MUST
    rename the public port surface so consumers break loudly at
    integration time rather than silently at runtime. References
    the enforcing test by full path.

  verif/global_mem_axi4_adapter/test_global_mem_axi4_adapter.py
    Adds test_rd_data_stable_between_requests. Pre-loads two
    distinct values at different cache lines, issues a read at
    addr A, captures core_mem_rd_data on the ack cycle, then holds
    the bus idle for 8 clocks while re-sampling every cycle and
    asserting bit-exact equality with the on-ack value. Also
    stirs core_mem_addr to 0xDEAD0000 during the idle window to
    catch any combinational dependency on the address bus, and
    asserts busy/ack stay deasserted. A final follow-up read of
    addr B confirms the held value was the genuine prior result,
    not a stuck driver.

Test evidence (verif/global_mem_axi4_adapter):
  TESTS=7 PASS=7 FAIL=0 SKIP=0
    test_reset                              PASS
    test_word_roundtrip                     PASS
    test_slot_independence_within_line      PASS
    test_all_slots_within_line              PASS
    test_cross_cache_line                   PASS
    test_busy_during_transaction            PASS
    test_rd_data_stable_between_requests    PASS  (new)

Regression evidence (verif/global_mem_controller):
  TESTS=10 PASS=10 FAIL=0 SKIP=0
    test_core1_rd_data_stable_cycle_after_ack  PASS (PR #25 — still green)

Out of scope (separate ADR + PR per issue scope): refactoring the
adapter to be multi-outstanding; touching global_mem_controller.sv
or other consumers.

Authored by Agent 1 (RTL Architect).

Signed-off-by: Marcos <m@pop.coop>
Co-authored-by: Marcos <m@pop.coop>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

review-pending PR awaiting reviewer agent (R) stream-1 RTL Architect (Agent 1) — SystemVerilog, cocotb, MAST primary

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[hw] Add core1_rd_data stable-cycle-after-ack regression test in calling module testbenches

1 participant