refactor(pi07_paligemma): layout-agnostic ContextItem prefix/suffix#262
Conversation
There was a problem hiding this comment.
Layout-agnostic prefix/suffix is a clear win — the policy is the single owner of layout, and the per-type dispatcher in _embed_item makes future additions (e.g. a new conditioning block) a one-line change.
A few small things flagged inline. None blocking; mainly docstring/maintenance suggestions plus a minor double-call of prepare_state.
|
[claude-review] summary for commit 96cb533 Round 3 — all four addressable findings from round 2 landed in
The unrelated test-skip suggestion was explicitly dismissed by the author and isn't re-flagged.
|
|
@claude fix |
- addresses @claude (sample_actions double-call): replace second prepare_state call with a direct read of batch["state"].shape (T=1 when ndim==2, else shape[-2]) so the warning derives t_dim without redoing the unsqueeze/pad. - addresses @claude (T=1 video override): _embed_item now raises ValueError if a T=1 video item carries obs_history_is_pad, instead of silently overwriting it with the synthesized temporal mask. Docstring updated. - addresses @claude (SigLIP-skip DDP invariant): _embed_item docstring now spells out that the data-dependent skip requires a video pad_mask derivable from config (not per-sample) so DDP ranks agree and find_unused_parameters=False does not deadlock. - addresses @claude (test parallel layout): _legacy_embed_prefix no longer re-implements the prefix layout. It now builds a fake PI07LowLevelPlannerPolicy via object.__new__, stubs the prepare_* methods to return the test tensors, and dispatches through PI07LowLevelPlannerPolicy._build_prefix_items. Single source of truth for layout; future _build_prefix_items changes flow into the tests. Dropped now-unused ContextItem import. tests: passed — pytest -m "not gpu" -n auto tests/policies/test_pi07_paligemma_low_level_planner.py (24 PaliGemma-gated tests skipped locally for lack of HF_TOKEN; CI has it) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The PI07LowLevelPlannerFlowMatching prefix and suffix layouts were hard-coded inside embed_prefix / embed_suffix, so adding a conditioning block or reordering existing ones required editing both the model and the policy in lockstep. This commit moves all layout decisions into PI07LowLevelPlannerPolicy._build_prefix_items: the policy constructs a list of ContextItem dataclasses (data + item_type + pad_mask + attention mode + exclude_from_cross_attention flag), and the model just dispatches per type, concatenates, and derives the 1-D attention mask. Future layout changes only touch the policy. Behavior is unchanged: same prefix order, same per-block masking semantics, same cross-attention exclusion of the trailing "Action: " + discrete-action block (now derived from the exclude_from_cross_attention flag rather than config-based slicing).
- addresses @claude (sample_actions double-call): replace second prepare_state call with a direct read of batch["state"].shape (T=1 when ndim==2, else shape[-2]) so the warning derives t_dim without redoing the unsqueeze/pad. - addresses @claude (T=1 video override): _embed_item now raises ValueError if a T=1 video item carries obs_history_is_pad, instead of silently overwriting it with the synthesized temporal mask. Docstring updated. - addresses @claude (SigLIP-skip DDP invariant): _embed_item docstring now spells out that the data-dependent skip requires a video pad_mask derivable from config (not per-sample) so DDP ranks agree and find_unused_parameters=False does not deadlock. - addresses @claude (test parallel layout): _legacy_embed_prefix no longer re-implements the prefix layout. It now builds a fake PI07LowLevelPlannerPolicy via object.__new__, stubs the prepare_* methods to return the test tensors, and dispatches through PI07LowLevelPlannerPolicy._build_prefix_items. Single source of truth for layout; future _build_prefix_items changes flow into the tests. Dropped now-unused ContextItem import. tests: passed — pytest -m "not gpu" -n auto tests/policies/test_pi07_paligemma_low_level_planner.py (24 PaliGemma-gated tests skipped locally for lack of HF_TOKEN; CI has it) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bd08132 to
e633dc8
Compare
There was a problem hiding this comment.
Round 2 — addressed feedback from the prior review looks landed (single source of truth via _build_prefix_items, the prepare_state double-call is gone, the DDP invariant is documented). Spotted a few new things on this pass: a missing test for the new cross_att_locked invariant, and the unrelated test-skip changes are worth a closer look — they pull coverage away from nightly regression on g6.12xlarge, not just from a 24GB RTX 3090.
|
@claude fix |
- addresses @claude (cross_att_locked test): added TestPI07EmbedPrefixInvariants in tests/policies/test_pi07_paligemma_low_level_planner.py — pins the raise path (excluded then non-excluded) and the legal trailing-run count via two CPU-only tests reusing the existing mock model. - addresses @claude (suffix att-mask dtype): aligned embed_suffix to build att_masks_flat as torch.bool to match embed_prefix:1744; both paths now use the same dtype side by side. - addresses @claude (redundant state_mask.clone): removed; both branches above already produce fresh tensors. Comment updated. - addresses @claude (fragile _legacy_embed_prefix): the helper now builds the fake policy from a local PI07LowLevelPlannerPolicy subclass whose __getattr__ raises a clear "missing attribute X" error. Any future _build_prefix_items access to an unstubbed attr will now point at the helper instead of bubbling a generic AttributeError up through every test that uses it. Docstring documents the contract. tests: passed — pytest tests/policies/test_pi07_paligemma_low_level_planner.py::TestPI07EmbedPrefixInvariants tests/policies/test_pi07_paligemma_low_level_planner.py::TestPI07LowLevelPlannerStateEmbedding (11/11 — covers the 2 new invariant tests + the 9 state-embedding tests that exercise the strict fake policy). The PaliGemma-gated test classes need HF_TOKEN and are not runnable locally; CI's cpu_test.yml will run them. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
What this does
Refactors
PI07LowLevelPlannerFlowMatchingso that the prefix and suffix layouts (which video / language / state / response / subgoal / metadata / discrete-action blocks appear, in what order, with what masking and attention) live entirely insidePI07LowLevelPlannerPolicy. The model now just consumes alist[ContextItem], dispatches peritem_type, concatenates, and derives the 1-D attention mask from each item'sattentionsetting (continue/bidirectional/causal).Adding a new conditioning block or reordering existing ones now only requires editing the policy's
_build_prefix_items— noFlowMatchingchanges.Specifically:
ContextItemdataclass (data,item_type,pad_mask,attention,exclude_from_cross_attention,obs_history_is_pad).embed_prefix(items)andembed_suffix(items, timestep)are layout-agnostic dispatchers; they returnnum_cross_att_tokensdirectly so the action expert no longer slices viadiscrete_action_indicator_max_length/discrete_action_max_lengthconfig fields.FlowMatching.forward/sample_actionsnow take a prebuilt prefix items list; the policy builds it via_build_prefix_items(batch, include_discrete_actions=...).num_frameszero-pad and the "skip SigLIP backbone when no sample needs this video" optimization are now generic per-type behaviors (no more bespoke subgoal handling)."Action: "indicator + discrete-action block.How it was tested
tests/policies/test_pi07_paligemma_low_level_planner.pypass (response/metadata/subgoal/state masking + state-projection invocation tests).tests/policies/CPU suite passes (228 passed, 2 skipped).How to checkout & try? (for the reviewer)
git checkout refactor/pi07-paligemma-context-items uv sync --extra dev --extra libero pytest tests/policies/test_pi07_paligemma_low_level_planner.py -m "not gpu" -n autoChecklist