feat(runtime): unify runtime_env ring sizing into one int-or-list field#1128
Conversation
hw-native-sys#1099 exposed ring sizing through two near-identical CallConfig.runtime_env names per resource that differ only by a trailing `s` — `ring_task_window` (scalar broadcast) vs `ring_task_windows` (per-ring array), etc. The one-letter difference is an ergonomics footgun and the layered "scalar baseline + per-ring override" semantics it bought are not worth the confusing twin names. Collapse each pair into a single field that accepts EITHER a scalar (broadcast to every ring) OR a 4-entry list (per-ring): cfg.runtime_env.ring_task_window = 128 # broadcast cfg.runtime_env.ring_task_window = [128, 0, 0, 0] # per-ring; 0 falls through Broadcast happens in the Python binding (int -> [v, v, v, v]); the wire format now carries only the three 4-element arrays (12 uint64, down from 15) and the getter always returns a 4-list. A 0 entry falls through to PTO2_RING_* env -> compile-time default; the separate scalar-CallConfig precedence tier is dropped (accepted trade-off — a 0 in a list no longer falls back to a sibling scalar). The internal C-API (run_prepared) and wire layout are internal-only and rebuild together via pip install, so this is a clean break with no back-compat shim. Mirrored across a2a3/a5, both runtimes, bindings, scene-test parsing, docs, unit tests, and the per_task_runtime_env examples. Closes hw-native-sys#1126.
|
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
📝 WalkthroughWalkthrough
ChangesRuntimeEnv ring-field unification
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/ut/py/test_chip_worker.py (1)
84-130: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick winAdd a mixed-zero per-ring case for the new fall-through contract.
Line 86 and the mailbox roundtrip cover all-zero defaults and fully populated lists, but they never exercise the key new behavior from this PR: a 4-entry list with some
0elements should be accepted and preserved so those rings can fall through to env/default resolution. A regression that rejects or rewrites[0, 32, 0, 256]would still pass this suite today.Suggested test shape
def test_runtime_env_defaults_and_roundtrip(self): config = CallConfig() @@ config.runtime_env.ring_dep_pool = [64, 128, 256, 512] @@ assert "runtime_env.ring_dep_pool=[64, 128, 256, 512]" in r + + config.runtime_env.ring_task_window = [0, 32, 0, 256] + config.runtime_env.ring_heap = [0, 2048, 0, 8192] + config.runtime_env.ring_dep_pool = [0, 128, 0, 512] + assert config.runtime_env.ring_task_window == [0, 32, 0, 256] + assert config.runtime_env.ring_heap == [0, 2048, 0, 8192] + assert config.runtime_env.ring_dep_pool == [0, 128, 0, 512] + config.validate()Also applies to: 331-363
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/ut/py/test_chip_worker.py` around lines 84 - 130, Add a test in test_runtime_env_defaults_and_roundtrip (or a nearby RuntimeEnv roundtrip test) that assigns a 4-entry mixed-zero list such as [0, 32, 0, 256] to a per-ring field like ring_task_window or ring_heap and asserts the exact list is preserved after readback and validate(). This should exercise the new fall-through contract without treating 0 as invalid or rewriting it, alongside the existing RuntimeEnv and CallConfig roundtrip checks.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@tests/ut/py/test_chip_worker.py`:
- Around line 84-130: Add a test in test_runtime_env_defaults_and_roundtrip (or
a nearby RuntimeEnv roundtrip test) that assigns a 4-entry mixed-zero list such
as [0, 32, 0, 256] to a per-ring field like ring_task_window or ring_heap and
asserts the exact list is preserved after readback and validate(). This should
exercise the new fall-through contract without treating 0 as invalid or
rewriting it, alongside the existing RuntimeEnv and CallConfig roundtrip checks.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 367cd3e0-dc60-44ef-980d-c817df063da4
📒 Files selected for processing (21)
examples/workers/l2/per_task_runtime_env/README.mdexamples/workers/l2/per_task_runtime_env/main.pyexamples/workers/l3/per_task_runtime_env/README.mdexamples/workers/l3/per_task_runtime_env/main.pypython/bindings/task_interface.cpppython/simpler/worker.pysimpler_setup/scene_test.pysrc/a2a3/runtime/host_build_graph/host/runtime_maker.cppsrc/a2a3/runtime/tensormap_and_ringbuffer/docs/MULTI_RING.mdsrc/a2a3/runtime/tensormap_and_ringbuffer/host/runtime_maker.cppsrc/a5/runtime/host_build_graph/host/runtime_maker.cppsrc/a5/runtime/tensormap_and_ringbuffer/docs/MULTI_RING.mdsrc/a5/runtime/tensormap_and_ringbuffer/host/runtime_maker.cppsrc/common/platform/onboard/host/c_api_shared.cppsrc/common/platform/sim/host/c_api_shared.cppsrc/common/task_interface/call_config.hsrc/common/worker/chip_worker.cppsrc/common/worker/chip_worker.hsrc/common/worker/pto_runtime_c_api.htests/ut/cpp/types/test_call_config.cpptests/ut/py/test_chip_worker.py
Closes #1126.
Problem
#1099 exposed ring sizing through two near-identical
CallConfig.runtime_envnames per resource that differ only by a trailings:ring_task_windowring_task_windowsring_heapring_heapsring_dep_poolring_dep_poolsThe one-letter difference is an ergonomics footgun (easy to mistype, silently accepted), and the layered "scalar baseline + per-ring override" semantics it bought are not worth the confusing twin names for this project's usage.
Change
Collapse each pair into a single field that accepts either an
int(broadcast) or a 4-entry list (per-ring):int→[v, v, v, v]); the wire format now carries only the three 4-element arrays (12 × uint64, down from 15) and the getter always returns a 4-list.0entry falls through toPTO2_RING_*env → compile-time default. The separate scalar-CallConfig precedence tier is intentionally dropped (accepted trade-off): a0in a list can no longer fall back to a sibling scalar, only to env/default.run_prepared) and wire layout are internal-only (no external consumers; everything rebuilds together viapip install), so this is a clean break with no back-compat shim.Surface (mirrored a2a3 ⇄ a5)
Core struct + validate + wire asserts (
call_config.h); Python binding int|list property + repr (task_interface.cpp); wire pack/unpack (worker.py); scene-test parse (scene_test.py); internal C-API (pto_runtime_c_api.h,chip_worker.{h,cpp}, onboard+simc_api_shared.cpp, bothhost_build_graph/runtime_maker.cpp); resolution (bothtensormap_and_ringbuffer/host/runtime_maker.cpp); docs (bothMULTI_RING.md); tests (test_call_config.cpp,test_chip_worker.py); and the l2/l3per_task_runtime_envexamples.Test
tests/ut/py/test_chip_worker.py— 26 passed (defaults/roundtrip, validate rejects, mailbox wire roundtrip, length validation), updated to the unified API.tests/ut/cpp/types/test_call_config.cpp— updated to the array struct (compiles clean; the local cpput run hits a pre-existing, change-unrelated gtestEqFailurelink error that also fails on untouched targets liketest_child_memory).per_task_runtime_envexamples and thepaged_attention*scene tests pass undera2a3sim, exercising the full path: binding → wire → C-API →resolve_ring_config→ runtime → device sim. Scalar inputs correctly arrive as[v, v, v, v]; lists pass through; repr shows lists.