test(dfx): mirror a2a3 DFX scene tests to a5#1246
Conversation
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThis PR adds new system-test suites under tensormap_and_ringbuffer/dfx covering args_dump, l2_swimlane, pmu, and scope_stats features. It introduces two orchestration kernels (partial_dump_orch.cpp, chained_mix_orch.cpp), a shared swimlane validation helper, and five new pytest scene-test modules validating dump/profiling artifacts. ChangesDFX test suites
Estimated code review effort: 3 (Moderate) | ~25 minutes Sequence Diagram(s)sequenceDiagram
participant TestArgsDump
participant partial_dump_orch
participant args_dump.json
participant dump_viewer
TestArgsDump->>partial_dump_orch: run kernel with --dump-args
partial_dump_orch->>args_dump.json: emit tensor/scalar dump records
TestArgsDump->>args_dump.json: parse manifest
TestArgsDump->>TestArgsDump: validate func_id, task_id, arg_index
TestArgsDump->>dump_viewer: run subprocess smoke check
sequenceDiagram
participant Test
participant validate_perf_artifact
participant l2_swimlane_records.json
participant sched_overhead_analysis
participant verify_sched_overhead_differential
Test->>validate_perf_artifact: run(case_label)
validate_perf_artifact->>l2_swimlane_records.json: read_perf_data
validate_perf_artifact->>sched_overhead_analysis: run subprocess with deps.json
sched_overhead_analysis-->>validate_perf_artifact: stdout metrics
validate_perf_artifact->>verify_sched_overhead_differential: compare oracle vs stdout
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request adds system tests for DFX profiling features on the tensormap_and_ringbuffer runtime. It introduces orchestration kernels and test cases to validate argument dumping (args_dump), L2 swimlane profiling (l2_swimlane) for AIV-only and mixed AIC+AIV chained workloads, PMU profiling (pmu), and scope statistics (scope_stats). These tests verify the correct generation and schema of profiling artifacts such as args_dump.json, l2_swimlane_records.json, pmu.csv, and scope_stats.jsonl. There are no review comments provided, so I have no feedback to address.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
tests/st/a5/tensormap_and_ringbuffer/dfx/scope_stats/test_scope_stats.py (1)
53-88: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueOptional: mutable default class attributes flagged by Ruff (RUF012).
CALLABLE(Line 53-79) andCASES(Line 81-88) are mutable dict/list class attributes withoutClassVarannotation. Since these are read-only declarative metadata consumed by thescene_testframework and mirror the pattern used across sibling DFX test files, this is low-risk, but you could silence the hint withtyping.ClassVarannotations if the team wants a clean lint pass.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/st/a5/tensormap_and_ringbuffer/dfx/scope_stats/test_scope_stats.py` around lines 53 - 88, The class-level declarative metadata in the test scope stats fixture is being flagged as mutable default attributes by Ruff. Update the test class that defines CALLABLE and CASES to annotate these attributes with typing.ClassVar so Ruff recognizes them as intentional read-only metadata, keeping the existing structure and names used by the scene_test framework.Source: Linters/SAST tools
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/st/a5/tensormap_and_ringbuffer/dfx/l2_swimlane/_swimlane_validate.py`:
- Around line 60-65: The validation helper in validate_perf_artifact currently
returns silently when no matching output directory is found, which lets broken
captures pass unnoticed. Update validate_perf_artifact to fail loudly when
matches is empty instead of returning, so test_l2_swimlane.py and
test_l2_swimlane_mixed.py surface missing output directories or sanitize
mismatches. Keep the existing l2_swimlane_records.json existence check, but make
the no-match path assert with a clear error message tied to _outputs_dir and
_sanitize_for_filename.
In `@tests/st/a5/tensormap_and_ringbuffer/dfx/pmu/test_pmu.py`:
- Line 16: The smoke test in test_pmu.py only checks header membership instead
of the documented ordered prefix, so a reordered header can still pass. Update
the header validation logic in the smoke/assertion path to verify that the
leading columns in the parsed header match the expected prefix in order, using
the existing header-related checks around the documented header assertion rather
than just `header_cols` membership.
- Around line 92-98: The artifact lookup in _validate_pmu_artifact can silently
pass when no output directory is found and can also validate a stale directory
from a prior run. Change the logic so the test fails explicitly if
_outputs_dir().glob(...) returns no matches, and ensure the selected directory
corresponds to the current test invocation rather than just the newest mtime
entry; use the existing _sanitize_for_filename and _outputs_dir helpers to
locate the current PMU run and assert against that exact directory before
checking pmu.csv.
---
Nitpick comments:
In `@tests/st/a5/tensormap_and_ringbuffer/dfx/scope_stats/test_scope_stats.py`:
- Around line 53-88: The class-level declarative metadata in the test scope
stats fixture is being flagged as mutable default attributes by Ruff. Update the
test class that defines CALLABLE and CASES to annotate these attributes with
typing.ClassVar so Ruff recognizes them as intentional read-only metadata,
keeping the existing structure and names used by the scene_test framework.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 844342b8-b714-4e1f-86c1-21f49ff33e5c
📒 Files selected for processing (9)
tests/st/a5/tensormap_and_ringbuffer/dfx/args_dump/kernels/orchestration/partial_dump_orch.cpptests/st/a5/tensormap_and_ringbuffer/dfx/args_dump/test_args_dump.pytests/st/a5/tensormap_and_ringbuffer/dfx/l2_swimlane/__init__.pytests/st/a5/tensormap_and_ringbuffer/dfx/l2_swimlane/_swimlane_validate.pytests/st/a5/tensormap_and_ringbuffer/dfx/l2_swimlane/kernels/orchestration/chained_mix_orch.cpptests/st/a5/tensormap_and_ringbuffer/dfx/l2_swimlane/test_l2_swimlane.pytests/st/a5/tensormap_and_ringbuffer/dfx/l2_swimlane/test_l2_swimlane_mixed.pytests/st/a5/tensormap_and_ringbuffer/dfx/pmu/test_pmu.pytests/st/a5/tensormap_and_ringbuffer/dfx/scope_stats/test_scope_stats.py
| safe_label = _sanitize_for_filename(case_label) | ||
| matches = sorted(_outputs_dir().glob(f"{safe_label}_*"), key=lambda p: p.stat().st_mtime) | ||
| if not matches: | ||
| return | ||
| perf = matches[-1] / "l2_swimlane_records.json" | ||
| assert perf.exists(), f"l2_swimlane_records.json missing under {matches[-1]} — swimlane capture failed?" |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Silent no-op when the output directory isn't found masks broken validation.
If matches is empty (e.g. label/sanitize mismatch, or the case failed to produce an output dir), validate_perf_artifact just returns without asserting anything. Callers (test_l2_swimlane.py, test_l2_swimlane_mixed.py) treat this as a green test, defeating the purpose of the differential gate — a capture regression that also breaks directory creation would silently pass instead of failing loudly.
💡 Suggested fix: fail loudly instead of silently skipping
safe_label = _sanitize_for_filename(case_label)
matches = sorted(_outputs_dir().glob(f"{safe_label}_*"), key=lambda p: p.stat().st_mtime)
- if not matches:
- return
+ assert matches, f"no output directory found for case '{case_label}' (glob: {safe_label}_*)"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| safe_label = _sanitize_for_filename(case_label) | |
| matches = sorted(_outputs_dir().glob(f"{safe_label}_*"), key=lambda p: p.stat().st_mtime) | |
| if not matches: | |
| return | |
| perf = matches[-1] / "l2_swimlane_records.json" | |
| assert perf.exists(), f"l2_swimlane_records.json missing under {matches[-1]} — swimlane capture failed?" | |
| safe_label = _sanitize_for_filename(case_label) | |
| matches = sorted(_outputs_dir().glob(f"{safe_label}_*"), key=lambda p: p.stat().st_mtime) | |
| assert matches, f"no output directory found for case '{case_label}' (glob: {safe_label}_*)" | |
| perf = matches[-1] / "l2_swimlane_records.json" | |
| assert perf.exists(), f"l2_swimlane_records.json missing under {matches[-1]} — swimlane capture failed?" |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/st/a5/tensormap_and_ringbuffer/dfx/l2_swimlane/_swimlane_validate.py`
around lines 60 - 65, The validation helper in validate_perf_artifact currently
returns silently when no matching output directory is found, which lets broken
captures pass unnoticed. Update validate_perf_artifact to fail loudly when
matches is empty instead of returning, so test_l2_swimlane.py and
test_l2_swimlane_mixed.py surface missing output directories or sanitize
mismatches. Keep the existing l2_swimlane_records.json existence check, but make
the no-match path assert with a clear error message tied to _outputs_dir and
_sanitize_for_filename.
There was a problem hiding this comment.
由于是A2A3与A5共有的,后续pr统一修改
There was a problem hiding this comment.
@doraemonmj, understood — deferring this to a follow-up PR that unifies the fix across both a2a3 and a5 makes sense given the shared helper. Let me know if you'd like me to open a tracking issue for this so it doesn't get lost.
🧠 Learnings used
Learnt from: poursoul
Repo: hw-native-sys/simpler PR: 1093
File: python/simpler/task_interface.py:34-44
Timestamp: 2026-06-22T03:34:22.769Z
Learning: In this repository (hw-native-sys/simpler), treat renamed public Python types/classes as a hard rename: when a type is renamed (e.g., ContinuousTensor → Tensor), remove the old name across the repo rather than adding backward-compatibility aliases for the old symbol. Also, when a PR claims “behavioral non-breaking,” interpret that as runtime/wire semantics only (not Python public API name stability); do not flag the PR for breaking Python API symbol names as long as runtime/wire behavior is preserved.
| the AICore counters land in ``<output_prefix>/pmu.csv``, one data row per | ||
| task. The schema is fixed (see docs/dfx/pmu-profiling.md and | ||
| src/a5/platform/shared/host/pmu_collector.cpp's "Build CSV header" | ||
| block). Smoke asserts: file exists, header starts with the documented |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Header check doesn't verify the documented "starts with" ordering.
The docstring claims the smoke test asserts the header "header starts with the documented prefix", but the implementation only checks that each required column exists somewhere in header_cols (membership), not that they appear as a leading, ordered prefix. A header with the same columns in a different order would incorrectly pass.
🐛 Proposed fix to check prefix order
- header_cols = lines[0].split(",")
- for col in _REQUIRED_HEADER_PREFIX:
- assert col in header_cols, f"header missing required column '{col}': {header_cols}"
+ header_cols = lines[0].split(",")
+ actual_prefix = tuple(header_cols[: len(_REQUIRED_HEADER_PREFIX)])
+ assert actual_prefix == _REQUIRED_HEADER_PREFIX, (
+ f"header does not start with required prefix {_REQUIRED_HEADER_PREFIX}: {header_cols}"
+ )Also applies to: 101-103
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/st/a5/tensormap_and_ringbuffer/dfx/pmu/test_pmu.py` at line 16, The
smoke test in test_pmu.py only checks header membership instead of the
documented ordered prefix, so a reordered header can still pass. Update the
header validation logic in the smoke/assertion path to verify that the leading
columns in the parsed header match the expected prefix in order, using the
existing header-related checks around the documented header assertion rather
than just `header_cols` membership.
There was a problem hiding this comment.
由于是A2A3与A5共有的,后续pr统一修改
There was a problem hiding this comment.
@doraemonmj, understood — thanks for confirming this is shared between A2A3 and A5, and that the ordered-prefix fix will be addressed together in a follow-up PR. No action needed here for now.
🧠 Learnings used
Learnt from: poursoul
Repo: hw-native-sys/simpler PR: 1093
File: python/simpler/task_interface.py:34-44
Timestamp: 2026-06-22T03:34:22.769Z
Learning: In this repository (hw-native-sys/simpler), treat renamed public Python types/classes as a hard rename: when a type is renamed (e.g., ContinuousTensor → Tensor), remove the old name across the repo rather than adding backward-compatibility aliases for the old symbol. Also, when a PR claims “behavioral non-breaking,” interpret that as runtime/wire semantics only (not Python public API name stability); do not flag the PR for breaking Python API symbol names as long as runtime/wire behavior is preserved.
| def _validate_pmu_artifact(self, case): | ||
| safe_label = _sanitize_for_filename(f"TestPmu_{case['name']}") | ||
| matches = sorted(_outputs_dir().glob(f"{safe_label}_*"), key=lambda p: p.stat().st_mtime) | ||
| if not matches: | ||
| return | ||
| csv = matches[-1] / "pmu.csv" | ||
| assert csv.exists(), f"pmu.csv missing under {matches[-1]} — PMU capture failed?" |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Silent skip + stale-directory risk in artifact lookup.
Two issues in _validate_pmu_artifact:
- Line 95-96: if no output directory matches the glob, the method silently returns without any assertion — a capture failure that fails to even produce an output dir would pass the test instead of failing it.
- Line 94:
matches[-1]picks the most recent directory by mtime, but does not confirm it corresponds to the current test invocation. A stale directory left over from a prior run/session with the same case label would satisfy the glob and be validated instead of (or in addition to) the current run's actual output, masking a real capture regression in the current run.
🐛 Proposed fix to fail-fast instead of silently skipping
safe_label = _sanitize_for_filename(f"TestPmu_{case['name']}")
matches = sorted(_outputs_dir().glob(f"{safe_label}_*"), key=lambda p: p.stat().st_mtime)
- if not matches:
- return
+ assert matches, f"no output directory found for {safe_label}_* — PMU output not produced?"
csv = matches[-1] / "pmu.csv"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def _validate_pmu_artifact(self, case): | |
| safe_label = _sanitize_for_filename(f"TestPmu_{case['name']}") | |
| matches = sorted(_outputs_dir().glob(f"{safe_label}_*"), key=lambda p: p.stat().st_mtime) | |
| if not matches: | |
| return | |
| csv = matches[-1] / "pmu.csv" | |
| assert csv.exists(), f"pmu.csv missing under {matches[-1]} — PMU capture failed?" | |
| def _validate_pmu_artifact(self, case): | |
| safe_label = _sanitize_for_filename(f"TestPmu_{case['name']}") | |
| matches = sorted(_outputs_dir().glob(f"{safe_label}_*"), key=lambda p: p.stat().st_mtime) | |
| assert matches, f"no output directory found for {safe_label}_* — PMU output not produced?" | |
| csv = matches[-1] / "pmu.csv" | |
| assert csv.exists(), f"pmu.csv missing under {matches[-1]} — PMU capture failed?" |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/st/a5/tensormap_and_ringbuffer/dfx/pmu/test_pmu.py` around lines 92 -
98, The artifact lookup in _validate_pmu_artifact can silently pass when no
output directory is found and can also validate a stale directory from a prior
run. Change the logic so the test fails explicitly if _outputs_dir().glob(...)
returns no matches, and ensure the selected directory corresponds to the current
test invocation rather than just the newest mtime entry; use the existing
_sanitize_for_filename and _outputs_dir helpers to locate the current PMU run
and assert against that exact directory before checking pmu.csv.
There was a problem hiding this comment.
由于是A2A3与A5共有的,后续pr统一修改
There was a problem hiding this comment.
@doraemonmj, understood — thanks for the context. Since this logic is shared between the a2a3 and a5 scenes, deferring the fix to a follow-up PR that updates both consistently makes sense. I'll hold off on further pushback here.
🧠 Learnings used
Learnt from: poursoul
Repo: hw-native-sys/simpler PR: 1093
File: python/simpler/task_interface.py:34-44
Timestamp: 2026-06-22T03:34:22.769Z
Learning: In this repository (hw-native-sys/simpler), treat renamed public Python types/classes as a hard rename: when a type is renamed (e.g., ContinuousTensor → Tensor), remove the old name across the repo rather than adding backward-compatibility aliases for the old symbol. Also, when a PR claims “behavioral non-breaking,” interpret that as runtime/wire semantics only (not Python public API name stability); do not flag the PR for breaking Python API symbol names as long as runtime/wire behavior is preserved.
当前 a5 DFX 用例测试内容梳理本 PR 镜像过来的 7 个用例各自校验的内容(便于后续查漏补缺):
|
a5 only had dep_gen under tensormap_and_ringbuffer/dfx. Mirror the remaining a2a3 DFX cases so the two arches stay aligned: - args_dump (--dump-args capture + manifest schema) - l2_swimlane (swimlane records + mixed AIC/AIV variant) - pmu (--enable-pmu pmu.csv) - scope_stats (--enable-scope-stats NDJSON) Verbatim copies for the platform-agnostic helpers and orchestration kernels; test files only swap the platform tokens (examples/a2a3 -> examples/a5, ["a2a3sim","a2a3"] -> ["a5sim","a5"], src/a2a3 -> src/a5). All required a5 platform collectors and example/mixed_example kernels already exist. Collected clean on a5sim.
What
a5 only had
dep_genundertensormap_and_ringbuffer/dfx. This mirrors theremaining a2a3 DFX scene tests so the two arches stay aligned:
--dump-argscapture + manifest schema--enable-pmupmu.csv--enable-scope-statsNDJSONVerbatim copies for the platform-agnostic helpers and orchestration kernels;
test files only swap the platform tokens (
examples/a2a3→examples/a5,["a2a3sim","a2a3"]→["a5sim","a5"],src/a2a3→src/a5). All requireda5 platform collectors and example/mixed_example kernels already exist.
Test
All 7 a5 DFX cases pass on a5 hardware:
```
7 passed in 41.01s
```
🤖 Generated with Claude Code