feat(runtime): unify runtime_env ring sizing into one int-or-list field #1128

Merged
ChaoWao merged 1 commit into hw-native-sys:main from ChaoZheng109:feature/unify-runtime-env-ring-sizing on Jun 24, 2026

@ChaoZheng109

Collaborator

Closes #1126.

Problem

#1099 exposed ring sizing through two near-identical CallConfig.runtime_env names per resource that differ only by a trailing s:

| scalar (broadcast) | per-ring array |
| --- | --- |
| ring_task_window | ring_task_windows |
| ring_heap | ring_heaps |
| ring_dep_pool | ring_dep_pools |

The one-letter difference is an ergonomics footgun (easy to mistype, silently accepted), and the layered "scalar baseline + per-ring override" semantics it bought are not worth the confusing twin names for this project's usage.

Change

Collapse each pair into a single field that accepts either an int (broadcast) or a 4-entry list (per-ring):

cfg.runtime_env.ring_task_window = 128             # broadcast to every ring
cfg.runtime_env.ring_task_window = [128, 0, 0, 0]  # per-ring; 0 falls through
  • Broadcast happens in the Python binding (int → [v, v, v, v]); the wire format now carries only the three 4-element arrays (12 × uint64, down from 15) and the getter always returns a 4-list.
  • A 0 entry falls through to PTO2_RING_* env → compile-time default. The separate scalar-CallConfig precedence tier is intentionally dropped (accepted trade-off): a 0 in a list can no longer fall back to a sibling scalar, only to env/default.
  • The C-API (run_prepared) and wire layout are internal-only (no external consumers; everything rebuilds together via pip install), so this is a clean break with no back-compat shim.
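
The broadcast and fall-through rules above can be sketched in plain Python. This is a minimal illustration, not the binding or resolve_ring_config code from this PR: `normalize_ring_field`, `resolve_ring`, the env name `PTO2_RING_EXAMPLE`, and the default of 64 are all hypothetical, and the sketch assumes the env override is a single scalar applied to every ring that falls through.

```python
import os

RING_COUNT = 4  # assumption: mirrors RUNTIME_ENV_RING_COUNT


def normalize_ring_field(value):
    """Turn an int (broadcast) or a 4-entry list into a 4-entry list."""
    if isinstance(value, bool):
        raise TypeError("ring sizing must be an int or a list of ints")
    if isinstance(value, int):
        if value < 0:
            raise ValueError("ring sizing must be non-negative")
        return [value] * RING_COUNT
    entries = list(value)
    if len(entries) != RING_COUNT:
        raise ValueError(f"expected {RING_COUNT} entries, got {len(entries)}")
    for entry in entries:
        if isinstance(entry, bool) or not isinstance(entry, int) or entry < 0:
            raise ValueError("each entry must be a non-negative int")
    return entries


def resolve_ring(entries, env_name, default, environ=None):
    """Per ring: a non-zero entry wins; 0 falls through to env, then default."""
    environ = os.environ if environ is None else environ
    env_value = int(environ.get(env_name) or "0")
    fallback = env_value if env_value else default
    return [entry if entry else fallback for entry in entries]


# Hypothetical env name and default, used only for illustration.
sizes = normalize_ring_field([128, 0, 0, 0])
print(resolve_ring(sizes, "PTO2_RING_EXAMPLE", 64, environ={}))
```

Because normalization always yields four entries, the resolution step only ever has to consider one shape, which is the simplification the dropped scalar tier buys.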

Surface (mirrored a2a3 ⇄ a5)

Core struct + validate + wire asserts (call_config.h); Python binding int|list property + repr (task_interface.cpp); wire pack/unpack (worker.py); scene-test parse (scene_test.py); internal C-API (pto_runtime_c_api.h, chip_worker.{h,cpp}, onboard+sim c_api_shared.cpp, both host_build_graph/runtime_maker.cpp); resolution (both tensormap_and_ringbuffer/host/runtime_maker.cpp); docs (both MULTI_RING.md); tests (test_call_config.cpp, test_chip_worker.py); and the l2/l3 per_task_runtime_env examples.

Test

  • tests/ut/py/test_chip_worker.py — 26 passed (defaults/roundtrip, validate rejects, mailbox wire roundtrip, length validation), updated to the unified API.
  • tests/ut/cpp/types/test_call_config.cpp: updated to the array struct. It compiles clean; the local cpput run hits a pre-existing gtest EqFailure link error, unrelated to this change, that also fails on untouched targets such as test_child_memory.
  • l2 + l3 per_task_runtime_env examples and the paged_attention* scene tests pass under a2a3sim, exercising the full path: binding → wire → C-API → resolve_ring_config → runtime → device sim. Scalar inputs correctly arrive as [v, v, v, v]; lists pass through; repr shows lists.
python examples/workers/l2/per_task_runtime_env/main.py -p a2a3sim -d 0
python examples/workers/l3/per_task_runtime_env/main.py -p a2a3sim -d 0

hw-native-sys#1099 exposed ring sizing through two near-identical CallConfig.runtime_env
names per resource that differ only by a trailing `s` — `ring_task_window`
(scalar broadcast) vs `ring_task_windows` (per-ring array), etc. The one-letter
difference is an ergonomics footgun and the layered "scalar baseline + per-ring
override" semantics it bought are not worth the confusing twin names.

Collapse each pair into a single field that accepts EITHER a scalar (broadcast
to every ring) OR a 4-entry list (per-ring):

    cfg.runtime_env.ring_task_window = 128             # broadcast
    cfg.runtime_env.ring_task_window = [128, 0, 0, 0]  # per-ring; 0 falls through

Broadcast happens in the Python binding (int -> [v, v, v, v]); the wire format
now carries only the three 4-element arrays (12 uint64, down from 15) and the
getter always returns a 4-list. A 0 entry falls through to PTO2_RING_* env ->
compile-time default; the separate scalar-CallConfig precedence tier is dropped
(accepted trade-off — a 0 in a list no longer falls back to a sibling scalar).

The internal C-API (run_prepared) and wire layout are internal-only and rebuild
together via pip install, so this is a clean break with no back-compat shim.
Mirrored across a2a3/a5, both runtimes, bindings, scene-test parsing, docs,
unit tests, and the per_task_runtime_env examples.

Closes hw-native-sys#1126.

@coderabbitai

coderabbitai Bot commented Jun 24, 2026

Walkthrough

CallConfig.runtime_env ring-sizing fields (ring_task_window, ring_heap, ring_dep_pool) are unified from a dual scalar+plural-array design into a single field accepting either an int (broadcast to all rings) or a 4-entry list. The change propagates through the C++ struct, C ABI, runtime resolution in a2a3/a5, Python bindings, wire packing, scene-test parsing, unit tests, docs, and examples.

Changes

RuntimeEnv ring-field unification

| Layer / File(s) | Summary |
| --- | --- |
| RuntimeEnv struct, constants, and validate: src/common/task_interface/call_config.h | Removes RUNTIME_ENV_SCALAR_FIELD_COUNT and RUNTIME_ENV_PER_RING_FIELD_GROUPS, adds RUNTIME_ENV_FIELD_GROUPS=3, restructures RuntimeEnv to hold only per-ring arrays (ring_task_window[N], ring_heap[N], ring_dep_pool[N]), simplifies any() and validate() to iterate those arrays only. |
| C ABI signature updates: src/common/worker/pto_runtime_c_api.h, src/common/worker/chip_worker.h, src/common/worker/chip_worker.cpp, src/common/platform/onboard/host/c_api_shared.cpp, src/common/platform/sim/host/c_api_shared.cpp | run_prepared and bind_callable_to_runtime_impl drop scalar uint64_t ring params and plural ring_*s pointer params, accepting only const uint64_t* for each ring field. ChipWorker::run removes local stack arrays and passes config.runtime_env array pointers directly. |
| resolve_ring_config and runtime_maker stubs (a2a3 & a5): src/a2a3/runtime/tensormap_and_ringbuffer/host/runtime_maker.cpp, src/a2a3/runtime/host_build_graph/host/runtime_maker.cpp, src/a5/runtime/tensormap_and_ringbuffer/host/runtime_maker.cpp, src/a5/runtime/host_build_graph/host/runtime_maker.cpp | resolve_ring_config accepts only per-ring pointer arrays; replaces scalar-broadcast + separate-array logic with a per-ring loop applying overrides when pointer is non-null and value is non-zero. Both host_build_graph stubs update their unused parameter lists to match the new signature. |
| Python bindings, wire packing, and scene-test parsing: python/bindings/task_interface.cpp, python/simpler/worker.py, simpler_setup/scene_test.py | Replaces def_rw scalar + plural vector properties with unified def_prop_rw accepting int (broadcast) or list[int] of length RUNTIME_ENV_RING_COUNT; updates __repr__. Wire unpacking slices a single ring_values sequence into per-ring lists. Scene-test config drops plural key support. |
| C++ and Python unit tests: tests/ut/cpp/types/test_call_config.cpp, tests/ut/py/test_chip_worker.py | Updates all tests to use per-ring array field access, expect 4-entry list defaults, verify scalar broadcasting, and check mailbox roundtrip with singular field names. Removes per_ring_any() assertions. |
| MULTI_RING docs and per-task examples: src/a2a3/runtime/tensormap_and_ringbuffer/docs/MULTI_RING.md, src/a5/runtime/tensormap_and_ringbuffer/docs/MULTI_RING.md, examples/workers/l2/per_task_runtime_env/*, examples/workers/l3/per_task_runtime_env/* | MULTI_RING docs update Runtime Overrides section for scalar-or-4-list semantics with 0 as fall-through. L2/L3 examples update RING_FIELDS to singular keys and populate configs with 4-entry lists under those keys. |
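
As a rough illustration of the wire step summarized above (three 4-element arrays, 12 uint64 values, sliced back into per-ring lists on unpack), here is a self-contained sketch using Python's struct module. It is not the worker.py code: the helper names, the little-endian byte order, and the field order (task window, heap, dep pool) are assumptions made for the example.

```python
import struct

RING_COUNT = 4  # assumption: mirrors RUNTIME_ENV_RING_COUNT
FIELDS = ("ring_task_window", "ring_heap", "ring_dep_pool")  # order assumed
WIRE_FORMAT = f"<{len(FIELDS) * RING_COUNT}Q"  # 12 x uint64, byte order assumed


def pack_runtime_env(env):
    """Flatten the three 4-entry lists into one 12 x uint64 payload."""
    ring_values = [value for name in FIELDS for value in env[name]]
    return struct.pack(WIRE_FORMAT, *ring_values)


def unpack_runtime_env(payload):
    """Slice the flat ring_values sequence back into per-ring lists."""
    ring_values = struct.unpack(WIRE_FORMAT, payload)
    return {
        name: list(ring_values[i * RING_COUNT:(i + 1) * RING_COUNT])
        for i, name in enumerate(FIELDS)
    }


env = {
    "ring_task_window": [128, 0, 0, 0],
    "ring_heap": [0, 2048, 0, 8192],
    "ring_dep_pool": [64, 128, 256, 512],
}
payload = pack_runtime_env(env)
print(len(payload), unpack_runtime_env(payload) == env)
```

Zero entries travel through the payload unchanged, so the fall-through decision stays on the resolving side rather than in the wire layer.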

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • hw-native-sys/simpler#1099: Introduced the original scalar + plural per-ring array dual-field design in RuntimeEnv that this PR directly refactors away.
  • hw-native-sys/simpler#1122: Added the L2/L3 per-task runtime_env examples using the old plural keys (ring_task_windows/ring_heaps/ring_dep_pools) that are updated here to the new singular form.
  • hw-native-sys/simpler#1042: Modified the same ring_task_window/ring_heap/ring_dep_pool plumbing path through run_prepared → chip_worker → c_api_shared → runtime_maker that this PR reshapes.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 30.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the main change: unifying runtime_env ring sizing into a single int-or-list field. |
| Description check | ✅ Passed | The description matches the change set and explains the int-or-list unification, precedence changes, and scope. |
| Linked Issues check | ✅ Passed | The PR implements the #1126 requirements: removes plural variants, supports int broadcast and 4-entry lists, and updates docs/tests/surface. |
| Out of Scope Changes check | ✅ Passed | The changes stay within the ring-sizing API unification and related docs/tests/examples, with no obvious unrelated additions. |

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
tests/ut/py/test_chip_worker.py (1)

84-130: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win

Add a mixed-zero per-ring case for the new fall-through contract.

Line 86 and the mailbox roundtrip cover all-zero defaults and fully populated lists, but they never exercise the key new behavior from this PR: a 4-entry list with some 0 elements should be accepted and preserved so those rings can fall through to env/default resolution. A regression that rejects or rewrites [0, 32, 0, 256] would still pass this suite today.

Suggested test shape
 def test_runtime_env_defaults_and_roundtrip(self):
     config = CallConfig()
@@
     config.runtime_env.ring_dep_pool = [64, 128, 256, 512]
@@
     assert "runtime_env.ring_dep_pool=[64, 128, 256, 512]" in r
+
+    config.runtime_env.ring_task_window = [0, 32, 0, 256]
+    config.runtime_env.ring_heap = [0, 2048, 0, 8192]
+    config.runtime_env.ring_dep_pool = [0, 128, 0, 512]
+    assert config.runtime_env.ring_task_window == [0, 32, 0, 256]
+    assert config.runtime_env.ring_heap == [0, 2048, 0, 8192]
+    assert config.runtime_env.ring_dep_pool == [0, 128, 0, 512]
+    config.validate()

Also applies to: 331-363

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/ut/py/test_chip_worker.py` around lines 84 - 130, Add a test in
test_runtime_env_defaults_and_roundtrip (or a nearby RuntimeEnv roundtrip test)
that assigns a 4-entry mixed-zero list such as [0, 32, 0, 256] to a per-ring
field like ring_task_window or ring_heap and asserts the exact list is preserved
after readback and validate(). This should exercise the new fall-through
contract without treating 0 as invalid or rewriting it, alongside the existing
RuntimeEnv and CallConfig roundtrip checks.

ℹ️ Review info
📥 Commits

Reviewing files that changed from the base of the PR and between ae59a8e and ba585ea.

📒 Files selected for processing (21)
  • examples/workers/l2/per_task_runtime_env/README.md
  • examples/workers/l2/per_task_runtime_env/main.py
  • examples/workers/l3/per_task_runtime_env/README.md
  • examples/workers/l3/per_task_runtime_env/main.py
  • python/bindings/task_interface.cpp
  • python/simpler/worker.py
  • simpler_setup/scene_test.py
  • src/a2a3/runtime/host_build_graph/host/runtime_maker.cpp
  • src/a2a3/runtime/tensormap_and_ringbuffer/docs/MULTI_RING.md
  • src/a2a3/runtime/tensormap_and_ringbuffer/host/runtime_maker.cpp
  • src/a5/runtime/host_build_graph/host/runtime_maker.cpp
  • src/a5/runtime/tensormap_and_ringbuffer/docs/MULTI_RING.md
  • src/a5/runtime/tensormap_and_ringbuffer/host/runtime_maker.cpp
  • src/common/platform/onboard/host/c_api_shared.cpp
  • src/common/platform/sim/host/c_api_shared.cpp
  • src/common/task_interface/call_config.h
  • src/common/worker/chip_worker.cpp
  • src/common/worker/chip_worker.h
  • src/common/worker/pto_runtime_c_api.h
  • tests/ut/cpp/types/test_call_config.cpp
  • tests/ut/py/test_chip_worker.py

@ChaoWao ChaoWao merged commit c635484 into hw-native-sys:main Jun 24, 2026
16 checks passed

Development

Successfully merging this pull request may close these issues.

[Code Health] Unify runtime_env ring sizing into a single int-or-list field (drop the *s plural variants)

2 participants