Fix: add mandatory arg_index to l3_l2_orch_comm test by ChaoWao · Pull Request #1171 · hw-native-sys/simpler

ChaoWao · 2026-06-26T07:06:19Z

Summary

Fixes the st-sim-a5 breakage on main: test_l3_l2_orch_comm.py calls CoreCallable.build without the mandatory arg_index introduced by #1123, so it raises at build time and fails on every PR merging against current main.

Declares arg_index=[0, 1] for signature=[D.IN, D.OUT] — the explicit per-slot payload mapping every other migrated incore already uses.
Root cause: Add: L3-L2 orchestration communication design #1015 (which added this test) merged after Fix: attribute dumped tensors per-subtask via mandatory arg_index #1123's ~250-site arg_index sweep without reconciling. See st-sim-a5 broken on main: test_l3_l2_orch_comm omits mandatory arg_index (#1015 × #1123 merge skew) #1170 for the full analysis.

Testing

st-sim-a5 (the failing job) — relying on CI; the omitted argument is the sole cause and the value matches the signature=[D.IN, D.OUT] ordering.

Fixes #1170

coderabbitai · 2026-06-26T07:06:33Z

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 38fe3b2a-8863-4e49-923f-f8ae29688c56

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

📝 Walkthrough

Walkthrough

The test case in tests/st/a5/tensormap_and_ringbuffer/l3_l2_orch_comm/test_l3_l2_orch_comm.py now passes arg_index=[0, 1] to CoreCallable.build together with signature=[D.IN, D.OUT].

Changes

L3-L2 orchestration communication test

Layer / File(s)	Summary
Explicit child argument mapping `tests/st/a5/tensormap_and_ringbuffer/l3_l2_orch_comm/test_l3_l2_orch_comm.py`	The `CoreCallable.build(...)` child configuration adds `arg_index=[0, 1]` next to `signature=[D.IN, D.OUT]`.

Sequence Diagram(s)

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Poem

🐰 I hop through tests with ears held high,
An index now sits by IN and OUT nearby.
One tiny tweak, a tidy little clue,
And the orchard path runs straight and true.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly states the main fix: adding the required arg_index in the l3_l2_orch_comm test.
Description check	✅ Passed	The description is directly related to the test fix and explains the breakage and the intended correction.
Linked Issues check	✅ Passed	The change matches `#1170` by adding arg_index=[0, 1] alongside signature=[D.IN, D.OUT] in the affected test.
Out of Scope Changes check	✅ Passed	The PR appears scoped to the single required test change with no unrelated modifications.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

gemini-code-assist

Code Review

This pull request updates the test configuration in test_l3_l2_orch_comm.py by adding the arg_index=[0, 1] parameter to the CoreCallable.build call. There are no review comments, and I have no feedback to provide.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

PR hw-native-sys#1015 added the l3_l2_orch_comm family (a5 scene test + a2a3 stream example) with CoreCallable.build calls that omit arg_index. PR hw-native-sys#1123 had already made arg_index mandatory (parallel to signature, equal length) and removed the contiguous fallback, so the two merged without reconciling. Every PR that merges against current main then fails at build time with: ValueError: CoreCallable.build: arg_index is required and must be parallel to signature (equal length) affecting st-sim-a5 (the a5 test) and st-sim-a2a3 / st-onboard-a2a3 (the a2a3 example). Declare arg_index=[0, 1] for signature=[D.IN, D.OUT] at both sites, matching the explicit slot mapping every other migrated incore uses: - tests/st/a5/.../l3_l2_orch_comm/test_l3_l2_orch_comm.py - examples/a2a3/.../l3_l2_orch_comm_stream/l3_l2_orch_comm_stream.py

The args dump no longer needs a per-incore arg_index to map each declared tensor to a payload slot. Each incore declares its full (mix-task) signature and the dump maps signature entry i to payload slot i positionally; every record carries the task's active-subtask set as a func_id array (its mix membership) rather than a single scalar func_id. This supersedes the hw-native-sys#1123 arg_index mechanism (and the hw-native-sys#1171 follow-up that backfilled arg_index for l3_l2_orch_comm), tailored for the upcoming l0_swimlane tool, which reconstructs and replays a whole mix task rather than one kernel. - dump: a single positional walk over the first active subtask's signature; each payload tensor is emitted once, stamped with the func array (no per-subtask geometry duplication). Slots beyond the payload (a prefix-dispatched task) are skipped. - record/info/DumpedTensor: func_id scalar -> func_ids[3] + func_count (reuses the existing pad, 128B record unchanged; TENSOR_DUMP_MAX_FUNC_IDS is tied to PTO2_SUBTASK_SLOT_COUNT by a static_assert). - args_dump.json: func_id is now an array ([0,1,2] for a mix, [0] for a single-kernel task). - CoreCallable.build / make_callable: drop the arg_index parameter, field, and accessor; scene_test and the binding stop threading it. - migrate every CALLABLE incore / CoreCallable.build repo-wide to drop arg_index; complete the offset-mix signatures (mixed_example a2a3/a5, l2_swimlane_mixed) to full width so positional mapping covers payload. - docs/dfx/args-dump.md updated for the func_id array + positional model. Verified on a2a3 silicon (--dump-args 3, golden PASS, func_id arrays correct, each slot emitted once): mixed_example (offset mix, 108 records), l2_swimlane_mixed, spmd_basic (cooperative mix), l3_l2_orch_comm_stream (hw-native-sys#1171 site), dummy_task.

The args dump no longer needs a per-incore arg_index to map each declared tensor to a payload slot. Each incore declares its full (mix-task) signature and the dump maps signature entry i to payload slot i positionally; every record carries the task's active-subtask set as a func_id array (its mix membership) rather than a single scalar func_id. This supersedes the #1123 arg_index mechanism (and the #1171 follow-up that backfilled arg_index for l3_l2_orch_comm), tailored for the upcoming l0_swimlane tool, which reconstructs and replays a whole mix task rather than one kernel. - dump: a single positional walk over the first active subtask's signature; each payload tensor is emitted once, stamped with the func array (no per-subtask geometry duplication). Slots beyond the payload (a prefix-dispatched task) are skipped. - record/info/DumpedTensor: func_id scalar -> func_ids[3] + func_count (reuses the existing pad, 128B record unchanged; TENSOR_DUMP_MAX_FUNC_IDS is tied to PTO2_SUBTASK_SLOT_COUNT by a static_assert). - args_dump.json: func_id is now an array ([0,1,2] for a mix, [0] for a single-kernel task). - CoreCallable.build / make_callable: drop the arg_index parameter, field, and accessor; scene_test and the binding stop threading it. - migrate every CALLABLE incore / CoreCallable.build repo-wide to drop arg_index; complete the offset-mix signatures (mixed_example a2a3/a5, l2_swimlane_mixed) to full width so positional mapping covers payload. - docs/dfx/args-dump.md updated for the func_id array + positional model. Verified on a2a3 silicon (--dump-args 3, golden PASS, func_id arrays correct, each slot emitted once): mixed_example (offset mix, 108 records), l2_swimlane_mixed, spmd_basic (cooperative mix), l3_l2_orch_comm_stream (#1171 site), dummy_task.

…sys#1171) PR hw-native-sys#1015 added the l3_l2_orch_comm family (a5 scene test + a2a3 stream example) with CoreCallable.build calls that omit arg_index. PR hw-native-sys#1123 had already made arg_index mandatory (parallel to signature, equal length) and removed the contiguous fallback, so the two merged without reconciling. Every PR that merges against current main then fails at build time with: ValueError: CoreCallable.build: arg_index is required and must be parallel to signature (equal length) affecting st-sim-a5 (the a5 test) and st-sim-a2a3 / st-onboard-a2a3 (the a2a3 example). Declare arg_index=[0, 1] for signature=[D.IN, D.OUT] at both sites, matching the explicit slot mapping every other migrated incore uses: - tests/st/a5/.../l3_l2_orch_comm/test_l3_l2_orch_comm.py - examples/a2a3/.../l3_l2_orch_comm_stream/l3_l2_orch_comm_stream.py

…native-sys#1181) The args dump no longer needs a per-incore arg_index to map each declared tensor to a payload slot. Each incore declares its full (mix-task) signature and the dump maps signature entry i to payload slot i positionally; every record carries the task's active-subtask set as a func_id array (its mix membership) rather than a single scalar func_id. This supersedes the hw-native-sys#1123 arg_index mechanism (and the hw-native-sys#1171 follow-up that backfilled arg_index for l3_l2_orch_comm), tailored for the upcoming l0_swimlane tool, which reconstructs and replays a whole mix task rather than one kernel. - dump: a single positional walk over the first active subtask's signature; each payload tensor is emitted once, stamped with the func array (no per-subtask geometry duplication). Slots beyond the payload (a prefix-dispatched task) are skipped. - record/info/DumpedTensor: func_id scalar -> func_ids[3] + func_count (reuses the existing pad, 128B record unchanged; TENSOR_DUMP_MAX_FUNC_IDS is tied to PTO2_SUBTASK_SLOT_COUNT by a static_assert). - args_dump.json: func_id is now an array ([0,1,2] for a mix, [0] for a single-kernel task). - CoreCallable.build / make_callable: drop the arg_index parameter, field, and accessor; scene_test and the binding stop threading it. - migrate every CALLABLE incore / CoreCallable.build repo-wide to drop arg_index; complete the offset-mix signatures (mixed_example a2a3/a5, l2_swimlane_mixed) to full width so positional mapping covers payload. - docs/dfx/args-dump.md updated for the func_id array + positional model. Verified on a2a3 silicon (--dump-args 3, golden PASS, func_id arrays correct, each slot emitted once): mixed_example (offset mix, 108 records), l2_swimlane_mixed, spmd_basic (cooperative mix), l3_l2_orch_comm_stream (hw-native-sys#1171 site), dummy_task.

gemini-code-assist Bot reviewed Jun 26, 2026

View reviewed changes

ChaoWao mentioned this pull request Jun 26, 2026

Add: name the exhausted ring in task-allocator deadlock dump #1168

Closed

3 tasks

ChaoWao force-pushed the fix/l3-l2-orch-comm-arg-index branch from 689d715 to 5f06ff0 Compare June 26, 2026 08:32

ChaoWao mentioned this pull request Jun 26, 2026

st-sim-a5 broken on main: test_l3_l2_orch_comm omits mandatory arg_index (#1015 × #1123 merge skew) #1170

Closed

ChaoWao merged commit a27d9f8 into hw-native-sys:main Jun 26, 2026
16 checks passed

ChaoWao deleted the fix/l3-l2-orch-comm-arg-index branch June 26, 2026 08:45

indigo1973 mentioned this pull request Jun 27, 2026

Refactor: replace arg_index with positional dump + func_id array #1181

Merged

coderabbitai Bot mentioned this pull request Jul 1, 2026

Support: reuse resident prebuilt runtime arenas #1234

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix: add mandatory arg_index to l3_l2_orch_comm test#1171

Fix: add mandatory arg_index to l3_l2_orch_comm test#1171
ChaoWao merged 1 commit into
hw-native-sys:mainfrom
ChaoWao:fix/l3-l2-orch-comm-arg-index

ChaoWao commented Jun 26, 2026

Uh oh!

coderabbitai Bot commented Jun 26, 2026 •

edited

Loading

Review skipped

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

ChaoWao commented Jun 26, 2026

Summary

Testing

Uh oh!

coderabbitai Bot commented Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

coderabbitai Bot commented Jun 26, 2026 •

edited

Loading