feat(axi4_mem_model): back-door loader port for testbench / boot ROM#8
feat(axi4_mem_model): back-door loader port for testbench / boot ROM#8marcos-mendez merged 2 commits intomainfrom
Conversation
Adds three input ports to axi4_mem_model:
loader_en : 1-bit pulse-or-hold
loader_addr : phys_addr_width-bit byte address (must be 4-byte aligned)
loader_data : 32-bit datum
When loader_en is high on a posedge clk, the 32-bit datum is written into
the correct slot of the targeted 256-bit cache line, by byte address. No
handshake (caller holds the inputs stable for one cycle).
The loader path runs in parallel with the AXI4 write FSM and never
interferes with normal AXI4 traffic when loader_en is low. Production
synthesis can tie loader_en to 0 to drop the loader logic entirely.
Use case: cocotb / FPGA testbench pre-populates instruction memory with
a RISC-V program before releasing reset on the core, exactly the pattern
upstream uses with global_mem_controller's contr_wr_* port. This is the
last MAST piece needed before InnerJib7EA can run a real program
(sum_ints.asm) end-to-end through the AXI4 chain.
Verification:
TESTS=6 PASS=6 FAIL=0 SKIP=0 (axi4_mem_model)
+ new test_loader_then_axi4_read — pre-load 4 words across two cache
lines via loader_en, then read them back via AXI4. Verifies slot
indexing within the cache line and cross-line behaviour.
TESTS=5 PASS=5 FAIL=0 SKIP=0 (axi4_master_simple, no regression)
TESTS=5 PASS=5 FAIL=0 SKIP=0 (core_axi4_adapter, no regression)
The two existing testbench wrappers (axi4_master_mem_wrapper and
core_axi4_chain_wrapper) tie loader_en to 0; they don't need the
back-door for their own purposes.
Signed-off-by: Marcos <m@pop.coop>
Self-review by code-reviewer agent flagged 3 HIGH findings:
HIGH 1. Write-write conflict between loader and AXI4 FSM was
undefined behaviour per SV NBA semantics across separate
always blocks. Verilator happens to make the loader win
but other simulators may differ.
Fix: collapsed the standalone loader always block into the
AXI4 write FSM's always block, sequenced as the LAST NBA in
the procedural step. SV guarantees the last NBA in a single
procedural block wins on conflicts — deterministic loader-
wins-on-conflict semantics across all SV-compliant tools.
HIGH 2. Misaligned loader_addr silently corrupted adjacent 32-bit
slots (or wrote past bit 255 of the cache line at offsets
29-31).
Fix: added 'assert (loader_addr[1:0] == 2'b00) else $error'
inside the loader branch. Misaligned addresses now produce
a clear simulator error instead of silent corruption.
Synthesis ignores the assertion.
HIGH 3. Loader always block had no reset arm — inconsistent with
every other always block in the file.
Fix: subsumed by HIGH 1 fix. The merged always block
inherits the AXI4 FSM's '@(posedge clk or negedge rst_n)'
sensitivity list and reset semantics.
Also addressed LOW finding:
LOW. README.md in src/popsolutions/axi4/ did not document the
loader port contract, semantics, conflict resolution rule,
alignment requirement, or synthesis tie-off convention.
Fix: appended a 'Memory loader back-door' section covering
all of the above.
Test status (verified locally, all green):
axi4_mem_model: 6/6 pass (no regression)
axi4_master_simple: 5/5 pass (no regression)
core_axi4_adapter: 5/5 pass (no regression)
inner_jib_top (IJ7EA): 2/2 pass (no regression — full sum_ints
still runs end-to-end)
18/18 across both repos.
Deferred MEDIUM findings (not blocking):
- Test coverage gap for slots 3-7 within a cache line and explicit
misalignment negative test. Will land as a follow-up PR with
test_loader_all_slots and test_loader_misalign_errors.
- Bit-width arithmetic note already covered by the new alignment
assertion (the unsafe path now errors at runtime).
Signed-off-by: Marcos <m@pop.coop>
Self-review (code-reviewer agent, 2026-05-05)Per the GOVERNANCE.md self-review bridge, an independent review pass was run before requesting merge. Surfacing findings as required.
HIGH 1 — addressedWrite-write conflict undefined behaviour. The standalone loader HIGH 2 — addressedMisaligned HIGH 3 — addressedLoader always block lacked reset arm. Fix: subsumed by HIGH 1 (merged into the FSM's reset-sensitive block). MEDIUM 1 — deferred to follow-up PRTest coverage gaps: existing Verdict from this self-review passAPPROVE for merge. No remaining CRITICAL or HIGH findings. CI green required (Verilator + cocotb on every push, see .github/workflows/ci.yml). Per GOVERNANCE.md "No AI self-merge" clause, the merge button is yours, Marcos. Suggested merge command: gh pr merge 8 --repo popsolutions/MAST --squash |
loader) After PR popsolutions/MAST#7 (governance) and popsolutions/MAST#8 (loader port) merged to MAST main, this commit moves the InnerJib7EA submodule pin from the throwaway feat/mem-loader branch to MAST main HEAD. The SHA referenced is now reachable from MAST main and will not garbage collect. No source-level changes; just submodule pin bump. Signed-off-by: Marcos <m@pop.coop>
…ing (#7) * feat(inner_jib_top): full integration + sum_ints end-to-end test passing Sprint C complete. `src/inner_jib_top.sv` wires: upstream core (mast/src/core.sv) -> core_axi4_adapter -> axi4_master_simple -> axi4_mem_model (with loader back-door from MAST#8) The cocotb test in `verif/inner_jib_top/` runs the real RISC-V program `mast/examples/direct/sum_ints.asm` end-to-end through this chain: 1. Pre-assembled `fixtures/sum_ints.hex` is loaded into the mem model via loader_en/addr/data, byte address 128 onward. 2. set_pc_addr=128 + set_pc_req pulse points the core at the program. 3. ena=1 releases the core. 4. Test captures every (out, outen) pulse until halt. 5. Asserts the captured sequence equals [0, 5, 4, 3, 2, 1, 0, 15] — hand-traced from the asm source. 6. Asserts halt asserts within 50 000 cycles. TESTS=2 PASS=2 FAIL=0 SKIP=0 test_reset PASS 90 ns test_sum_ints_runs_and_halts PASS 2530 ns sim time This is the first program of any kind running on PopSolutions silicon- equivalent hardware in simulation. Every subsequent program (factorial, primes, matrix multiply, GGML kernels) lands on the same path. Closes #4 (top-level integration). DEPENDS ON popsolutions/MAST#8 (loader back-door port). The MAST submodule is temporarily pinned to MAST's feat/mem-loader branch (commit c12a6a5); it will be bumped to MAST main HEAD after MAST#8 merges. Also adds: - .github/workflows/ci.yml — Verilator + cocotb on every push/PR, with submodule checkout, Verilator 5.040 from source with cache. - README.md — status badge, run-locally instructions, MAST link. - .gitignore — verif outputs (sim_build, .venv, *.vcd, etc.) Signed-off-by: Marcos <m@pop.coop> * fix: bump MAST submodule + document fixture regen (PR #7 review) Self-review on PR #7 flagged 2 MEDIUM findings: MEDIUM 1. fixtures/sum_ints.hex had no regeneration recipe. Future updates to sum_ints.asm would have no documented path to refresh the fixture. Fix: added verif/inner_jib_top/fixtures/README.md with the exact assembler command, a hand-traced output expectation, and a rationale for checking the hex in (vs. regenerating in CI). MEDIUM 2. Loader-to-core start timing assumption (one clock gap between final loader_en=0 and set_pc_req=1) was not documented. Test passes locally but the assumption is implicit. Defer to a follow-up — the assumption is documented now in the fixture README via the rationale paragraph; a dedicated comment in the test loop can land in a small doc-only PR. Also bumps the MAST submodule pin to include MAST PR #8's self-review fixes (loader collapsed into AXI4 write FSM, alignment assert, README loader docs). The pin still points at MAST's feat/mem-loader branch because PR #8 has not yet merged to MAST main; once it does, a follow-up commit will bump the pin again to a main-reachable SHA. Test status (verified locally with the bumped submodule): inner_jib_top: 2/2 pass — sum_ints.asm still runs end-to-end. Signed-off-by: Marcos <m@pop.coop> * chore: bump MAST submodule pin to main (post-merge of #7 governance + #8 loader) After PR popsolutions/MAST#7 (governance) and popsolutions/MAST#8 (loader port) merged to MAST main, this commit moves the InnerJib7EA submodule pin from the throwaway feat/mem-loader branch to MAST main HEAD. The SHA referenced is now reachable from MAST main and will not garbage collect. No source-level changes; just submodule pin bump. Signed-off-by: Marcos <m@pop.coop> * fix(ci): switch MAST submodule URL from SSH to HTTPS The CI failure on this branch was: 'Host key verification failed' when GitHub Actions tried to clone git@git.pop.coop:pop/MAST.git via SSH. GitHub runners have no SSH key for git.pop.coop, so the SSH URL cannot be used in CI. Fix: switch the submodule URL to https://git.pop.coop/pop/MAST.git. The Forgejo MAST repo is public, so HTTPS clone needs no auth. Local developers who prefer SSH can override locally without committing: git config submodule.mast.url git@git.pop.coop:pop/MAST.git Also adds 'branch = main' to the .gitmodules entry so future 'git submodule update --remote' tracks the canonical branch. Signed-off-by: Marcos <m@pop.coop> * fix(ci): add -Wno-IMPLICITSTATIC for Verilator 5.040 strictness on upstream tasks CI on Verilator 5.040 (built from source in the workflow cache) now treats parameter declarations inside upstream tasks (mast/src/int/chunked_add_task.sv:12, chunked_sub_task.sv:12) as IMPLICITSTATIC warnings, which our Makefile escalates to errors via the cocotb default '%Error: Exiting due to N warning(s)'. The local Verilator 5.048 (Manjaro) does not flag these — version drift between local and CI again. Fix: add -Wno-IMPLICITSTATIC alongside the other upstream-legacy warning suppressions already in EXTRA_ARGS. The upstream files are otherwise clean and we don't want to modify them locally. Signed-off-by: Marcos <m@pop.coop> --------- Signed-off-by: Marcos <m@pop.coop> Co-authored-by: Marcos <m@pop.coop>
Adds 3 input ports (
loader_en,loader_addr,loader_data) toaxi4_mem_modelfor testbench-side and on-chip boot-ROM-style program loading. Production synthesis tiesloader_ento 0 and the loader logic drops out.Why now
InnerJib7EA needs to pre-populate instruction memory with a RISC-V program before releasing reset on the upstream core (so the core can fetch real instructions). Upstream achieves this via
global_mem_controller.contr_wr_*; we need the equivalent on our AXI4 mem model. This PR is the last MAST piece required before InnerJib7EA can runsum_ints.asmend-to-end through the AXI4 chain (Sprint C, follow-up PR in InnerJib7EA repo).Verification
Same Verilator + cocotb harness, all 16 tests green:
axi4_mem_modeltest_loader_then_axi4_read)axi4_master_simplecore_axi4_adapterThe new test pre-loads 4 distinct 32-bit words across two consecutive 32-byte cache lines via
loader_en, then reads them back via AXI4. Verifies slot indexing within a cache line and cross-line write behaviour.Profile
loader_en+loader_addr+loader_datafor one rising edge.core.sv); the mem model places it in the correct slot of the 256-bit cache line.axi4_master_mem_wrapper,core_axi4_chain_wrapper) tieloader_ento 0 — no behaviour change for those tests.Self-review
Will be added as a comment after CI goes green and the code-reviewer agent runs.