TensorAuto · claude · May 7, 2026
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -32,6 +32,8 @@ These override defaults — read them before running anything.
    - **Per-rank branch decisions that fire collectives must be OR-reduced first.** When a `forward` takes a Python-level branch based on what the local micro-batch contains (e.g. `if has_response: embed_language_tokens(...)` in `embed_prefix`), use `_global_or_branch_decisions` in `src/opentau/policies/pi07/low_level/modeling_pi07_low_level.py` — one SUM all-reduce that both OR-reduces the per-rank decisions and asserts cross-rank presence agreement. Adding a new optional branch in distributed `forward` without going through it (or an equivalent pre-branch all-reduce) is the same bug.
    - **Composite forward units must be a single `nn.Module`.** Bundle multi-component decoder steps (e.g. a backbone layer paired with an action-expert layer) into one `nn.Module` so FSDP's all-gather hook prefetches every sub-component together — like `InterleavedDecoderLayer` in `src/opentau/policies/pi07/gemma3_with_expert.py`. Calling sub-components directly on a separately-wrapped layer (`layer.input_layernorm(...)`, `layer.self_attn.q_proj(...)`) bypasses the hook and triggers mismatched all-gather sizes across ranks.
 
+6. **Tests that mutate module-level state must save and restore it via `try`/`finally`.** Module-level dedup flags like `_CONTROL_MODE_WARNED` (set) and `_SKIP_TIMESTAMP_WARNED` (bool) in `src/opentau/datasets/lerobot_dataset.py` persist across tests within the same pytest-xdist worker process. A test that flips the flag to exercise the "first-time" branch and then leaves it flipped will silently mask any later test that wants to assert the warning fires again — a regression that won't show up locally but can flake under different `pytest-xdist` shard distributions. Pattern: capture the original up-front, mutate inside `try`, restore in `finally`. See `test_skip_timestamp_warning_emitted_once_per_process` in `tests/datasets/test_datasets.py` for the canonical shape.
+
 ## Project overview
 
 OpenTau is Tensor's open-source PyTorch training toolchain for vision-language-action (VLA) models — a fork of LeRobot with extra capabilities (heterogeneous-dataset co-training, discrete actions for π₀.₅, knowledge insulation, dropout in PaliGemma, π*₀.₆-style RL, validation splits, profilers). Any LeRobot-compliant policy and dataset works directly. Pinned to **Python 3.10**.