From 65f6b63ab95ccd8ad3bc9f99231e250a282aec63 Mon Sep 17 00:00:00 2001
From: "claude[bot]" <41898282+claude[bot]@users.noreply.github.com>
Date: Thu, 7 May 2026 21:08:10 +0000
Subject: [PATCH] chore(claude): learn from #278

Add a hard rule that tests mutating module-level state (e.g. the
dedup flags `_CONTROL_MODE_WARNED` / `_SKIP_TIMESTAMP_WARNED` in
`lerobot_dataset.py`) must save and restore the original value
via try/finally. Captured from the reviewer's "fix the nit"
feedback on #278.
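
A minimal sketch of the save/restore shape, for illustration only.
`fake_module` and `warn_once` are hypothetical stand-ins and are not
taken from `lerobot_dataset.py`; the real flags and the canonical
test may differ in detail.

```python
import types
import warnings

# Hypothetical stand-in for a module that owns a warn-once dedup flag.
fake_module = types.ModuleType("fake_module")
fake_module._SKIP_TIMESTAMP_WARNED = False


def warn_once(mod):
    """Emit the warning only while the module's flag is still unset."""
    if not mod._SKIP_TIMESTAMP_WARNED:
        mod._SKIP_TIMESTAMP_WARNED = True
        warnings.warn("skipping timestamp check", UserWarning)


def test_warning_emitted_once():
    # Capture the original value before touching anything.
    original = fake_module._SKIP_TIMESTAMP_WARNED
    try:
        # Force the "first-time" branch regardless of earlier tests.
        fake_module._SKIP_TIMESTAMP_WARNED = False
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")
            warn_once(fake_module)
            warn_once(fake_module)
        assert len(caught) == 1
    finally:
        # Restore even if the assertion above fails.
        fake_module._SKIP_TIMESTAMP_WARNED = original
```

For a mutable flag such as a set, a test that mutates it in place
would presumably need to capture a copy (e.g. `set(flag)`) rather
than a reference, since restoring the same object undoes nothing.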

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 CLAUDE.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/CLAUDE.md b/CLAUDE.md
index 988589b9..aea67f37 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -32,6 +32,8 @@ These override defaults — read them before running anything.
    - **Per-rank branch decisions that fire collectives must be OR-reduced first.** When a `forward` takes a Python-level branch based on what the local micro-batch contains (e.g. `if has_response: embed_language_tokens(...)` in `embed_prefix`), use `_global_or_branch_decisions` in `src/opentau/policies/pi07/low_level/modeling_pi07_low_level.py` — one SUM all-reduce that both OR-reduces the per-rank decisions and asserts cross-rank presence agreement. Adding a new optional branch in distributed `forward` without going through it (or an equivalent pre-branch all-reduce) is the same bug.
    - **Composite forward units must be a single `nn.Module`.** Bundle multi-component decoder steps (e.g. a backbone layer paired with an action-expert layer) into one `nn.Module` so FSDP's all-gather hook prefetches every sub-component together — like `InterleavedDecoderLayer` in `src/opentau/policies/pi07/gemma3_with_expert.py`. Calling sub-components directly on a separately-wrapped layer (`layer.input_layernorm(...)`, `layer.self_attn.q_proj(...)`) bypasses the hook and triggers mismatched all-gather sizes across ranks.
 
+6. **Tests that mutate module-level state must save and restore it via `try`/`finally`.** Module-level dedup flags like `_CONTROL_MODE_WARNED` (set) and `_SKIP_TIMESTAMP_WARNED` (bool) in `src/opentau/datasets/lerobot_dataset.py` persist across tests within the same pytest-xdist worker process. A test that flips a flag to exercise the "first-time" branch and then leaves it flipped suppresses the warning for every later test in that worker, so a later test asserting that the warning fires can pass locally yet flake under a different `pytest-xdist` shard distribution. Pattern: capture the original value up-front, mutate inside `try`, restore in `finally`. See `test_skip_timestamp_warning_emitted_once_per_process` in `tests/datasets/test_datasets.py` for the canonical shape.
+
 ## Project overview
 
 OpenTau is Tensor's open-source PyTorch training toolchain for vision-language-action (VLA) models — a fork of LeRobot with extra capabilities (heterogeneous-dataset co-training, discrete actions for π₀.₅, knowledge insulation, dropout in PaliGemma, π*₀.₆-style RL, validation splits, profilers). Any LeRobot-compliant policy and dataset works directly. Pinned to **Python 3.10**.