Skip to content
22 changes: 20 additions & 2 deletions .github/workflows/claude-implement-fixes.yml
Original file line number Diff line number Diff line change
Expand Up @@ -100,6 +100,7 @@ jobs:
with:
anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
claude_args: |
--model ${{ vars.CLAUDE_MODEL || 'claude-opus-4-7[1m]' }}
--permission-mode bypassPermissions
prompt: |
A reviewer asked you to address review feedback on this PR.
Expand All @@ -111,16 +112,33 @@ jobs:
2. For each actionable comment containing `@claude fix`, implement
the fix on the PR's branch.
3. Skip comments that are questions, taste preferences, or already addressed.
4. Run the test command from CLAUDE.md before pushing.
4. Decide whether to run tests:
- If the diff is purely documentation, comments, formatting,
string-literal text, or otherwise CANNOT change runtime behavior,
you MAY skip tests. Be honest about confidence — when in doubt, run.
If you skip, the commit body MUST contain a line of the form:
tests: skipped — <one-sentence reason>
- Otherwise (changes touching imports, function bodies, control
flow, types, configs read at runtime, dependencies, or build
manifests): run `pytest -m "not gpu" -n auto`. Scope to the
changed subtree where possible (e.g. `pytest tests/policies/test_pi05.py`
for a pi05 change) to keep the run fast. The commit body MUST contain:
tests: passed — <exact command run>
If tests fail: do NOT push; reply on the relevant PR comment
explaining the failure and stop.
5. Make ONE commit at the end of the session that addresses every
comment you decided to act on — do NOT push one commit per
comment. Subject line (must be < 80 chars per CLAUDE.md):

[claude-fix] address review feedback on #${{ github.event.issue.number || github.event.pull_request.number }}

Commit body: a bulleted list, one bullet per addressed comment:
Commit body: a bulleted list, one bullet per addressed comment,
followed by the `tests:` line from step 4:

- addresses @<reviewer> (<short topic>): <what you changed>
...
tests: passed — pytest -m "not gpu" tests/policies/test_pi05.py
(or: tests: skipped — comment-only change, no runtime impact)

Then push the single commit to the PR branch.
6. Reply individually to each addressed comment on the PR with
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/claude-pr-review.yml
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,7 @@ jobs:
with:
anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
claude_args: |
--model claude-opus-4-7
--model ${{ vars.CLAUDE_MODEL || 'claude-opus-4-7[1m]' }}
--permission-mode bypassPermissions
prompt: |
You are reviewing PR #${{ github.event.pull_request.number }} in
Expand Down
3 changes: 2 additions & 1 deletion .github/workflows/cpu_test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -102,7 +102,8 @@ jobs:
python3 -c "import sys; print(sys.path)"
python3 -c "import libero.libero" && echo "LIBERO config set successfully."
echo "Running cpu based pytest and generating coverage report..."
pytest -m "not gpu" -n auto -v --cov=lerobot/ --cov-report=xml:cpu_test/cpu_test.xml --ignore=tests/planner/test_planner.py --ignore tests/utils/test_libero_utils.py --deselect=tests/envs/test_factory.py::TestMakeEnv::test_make_env_async_vector_env --deselect=tests/envs/test_factory.py::TestMakeEnv::test_make_env_sync_vector_env tests/
# TODO(#210): drop --ignore=tests/policies/test_pi07_paligemma_low_level_planner.py once pi07 migrates to SpaceTimeSiglipVideoEncoder (#192).
pytest -m "not gpu" -n auto -v --cov=lerobot/ --cov-report=xml:cpu_test/cpu_test.xml --ignore=tests/planner/test_planner.py --ignore tests/utils/test_libero_utils.py --ignore=tests/policies/test_pi07_paligemma_low_level_planner.py --deselect=tests/envs/test_factory.py::TestMakeEnv::test_make_env_async_vector_env --deselect=tests/envs/test_factory.py::TestMakeEnv::test_make_env_sync_vector_env tests/
echo "Pytest execution and coverage report generation completed."

- name: Upload coverage reports
Expand Down
7 changes: 6 additions & 1 deletion .github/workflows/extract-claude-lessons.yml
Original file line number Diff line number Diff line change
Expand Up @@ -25,9 +25,13 @@ permissions:

jobs:
extract-lessons:
# Gate on the head branch name, not user.login: in this repo Claude Code
# pushes to a `claude/*` branch and a human opens the PR, so the PR's
# `user.login` never contains 'claude'. The branch prefix is the reliable
# signal that Claude touched the PR.
if: >-
github.event.pull_request.merged == true
&& contains(github.event.pull_request.user.login, 'claude')
&& startsWith(github.event.pull_request.head.ref, 'claude/')
&& !startsWith(github.event.pull_request.title, 'chore(claude): learn from')
runs-on: ubuntu-latest
timeout-minutes: 20
Expand All @@ -39,6 +43,7 @@ jobs:
with:
anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
claude_args: |
--model ${{ vars.CLAUDE_MODEL || 'claude-opus-4-7[1m]' }}
--permission-mode bypassPermissions
prompt: |
Review the comments on this merged PR. If any reviewer feedback
Expand Down
3 changes: 2 additions & 1 deletion .github/workflows/gpu_test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -91,7 +91,8 @@ jobs:
source .venv/bin/activate
mkdir -p /tmp/libero-assets/libero/libero
export LIBERO_CONFIG_PATH="$(pwd)/.github/assets/libero"
pytest -m "gpu" -n 0 -v tests/
# TODO(#210): drop --ignore=tests/policies/test_pi07_paligemma_low_level_planner.py once pi07 migrates to SpaceTimeSiglipVideoEncoder (#192).
pytest -m "gpu" -n 0 -v --ignore=tests/policies/test_pi07_paligemma_low_level_planner.py tests/

stop-runner:
name: Stop GPU Runner
Expand Down
2 changes: 2 additions & 0 deletions CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,8 @@ These override defaults — read them before running anything.

3. **Verify determinism on any change to the training loop or model.** ML bugs hide in stochasticity: a bad change can still produce loss curves that *look* plausible. After touching anything in `scripts/train.py`, `policies/*/modeling_*.py`, `optim/`, or `datasets/sampler.py`, run a smoke config twice with the same `seed` and confirm the per-step loss series is bit-identical (not just "close"). Seeding utilities live in `src/opentau/utils/random_utils.py` (`set_seed`, `serialize_python_rng_state`, etc.). If two seeded runs diverge, that's a bug — investigate before claiming the change works.

4. **Pin training-path layout fixes with a CPU unit test.** When you fix a detail in `embed_prefix` / `embed_suffix` / position-id construction that shifts a cumsum boundary or alters which tokens fall inside vs. outside a causal block — `att_masks` patterns (e.g. `[1]+[0]*(N-1)` vs `[1]*N` per-token causal blocks), `position_ids` slices, prefix/suffix layout — add a CPU unit test in the policy's `test_*_cpu.py` that asserts the exact pattern (token count, the att_masks tail, indicator boundaries). Determinism (rule 3) only proves two runs agree; it does *not* prove the layout is correct. GPU integration and nightly regression tests run on a delayed schedule, so a layout regression can merge silently if only those gate it. A pinned CPU assertion fails on the same PR.

## Project overview

OpenTau is Tensor's open-source PyTorch training toolchain for vision-language-action (VLA) models — a fork of LeRobot with extra capabilities (heterogeneous-dataset co-training, discrete actions for π₀.₅, knowledge insulation, dropout in PaliGemma, π*₀.₆-style RL, validation splits, profilers). Any LeRobot-compliant policy and dataset works directly. Pinned to **Python 3.10**.
Expand Down
10 changes: 9 additions & 1 deletion src/opentau/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -149,7 +149,15 @@
)

# lists all available policies from `src/opentau/policies`
available_policies = ["pi0", "pi05", "pi05_mem", "pi06", "value"]
available_policies = [
"pi0",
"pi05",
"pi05_mem",
"pi06",
"pi07_high_level",
"pi07_low_level",
"value",
]

# keys and values refer to yaml files
available_policies_per_env = {}
Expand Down
59 changes: 38 additions & 21 deletions src/opentau/datasets/lerobot_dataset.py
Original file line number Diff line number Diff line change
Expand Up @@ -1766,15 +1766,23 @@ def _sample_subgoal_frame(self, ep_idx: int, frame_in_ep: int, *, at_end_of_segm
current segment (clipped to the episode's last frame). Otherwise samples
a timestamp uniformly in ``[t, t + 4s]`` (wall-clock) and converts it to
a frame index, clipping to the current segment end and the episode end.

Episodes that have no ``segments`` annotation in ``episodes.jsonl``
skip segment-aware clipping entirely and fall back to a fixed
~4-seconds-ahead subgoal frame (clipped to the episode end). This
keeps subgoal supervision available on legacy datasets that never
wrote per-episode segment boundaries.
"""
ep_length = self.episode_lengths[ep_idx]
window_frames = int(round(4.0 * self.fps))
if "segments" not in self.meta.episodes[ep_idx]:
return min(frame_in_ep + window_frames, ep_length - 1)
seg_idx = self._lookup_segment_index(ep_idx, frame_in_ep)
seg_end_excl = self._segment_end_in_ep(ep_idx, seg_idx)
upper = min(seg_end_excl, ep_length) - 1 # inclusive upper bound.
upper = max(upper, frame_in_ep)
if at_end_of_segment:
return upper
window_frames = int(round(4.0 * self.fps))
top = min(frame_in_ep + window_frames, upper)
if top <= frame_in_ep:
return frame_in_ep
Expand All @@ -1784,31 +1792,32 @@ def _sample_subgoal_frame(self, ep_idx: int, frame_in_ep: int, *, at_end_of_segm
def _load_subgoal_frames(self, ep_idx: int, frame_in_ep: int) -> dict[str, torch.Tensor]:
"""Decode subgoal frames — one per camera slot — for this sample.

Subgoal image paths must be declared in ``meta/info.json`` under the
``subgoals`` key. When the key is missing (the state of every
LeRobot dataset today), we assume no subgoal images exist and return
``{}``; :meth:`BaseDataset._emit_optional_keys` then emits
``subgoal_is_pad=True`` for every slot. Datasets opt in by adding
the key to info.json.
Subgoal supervision is always-on for any dataset that exposes camera
keys; the dedicated ``subgoals`` info.json declaration that the older
pi07_paligemma path required is no longer consulted. Datasets without
any cameras (``self.num_cams == 0`` or empty
``self.meta.camera_keys``) still return ``{}``, which lets
:meth:`BaseDataset._emit_optional_keys` emit ``subgoal_is_pad=True``
for every slot.

When the key IS present:
Behavior for camera-bearing datasets:
- The at-end-of-segment vs uniform sampling roll happens ONCE per
``__getitem__`` call (shared across all camera slots); each slot
decodes the frame from its own video.
fetches the frame from its own source — video file for ``video``
dtype features, parquet row for ``image`` dtype features (the
latter share the same within-episode frame index).
- Drop-roll is short-circuited here so a dropped subgoal skips the
per-camera ``_query_videos`` decode. When
per-camera decode/lookup. When
``self.enable_optional_key_dropout`` is False (e.g. the validation
subset), drop is never rolled — the frame-selection randomness
stays live because it's about which future frame to read, not
masking.
- Episodes with no ``segments`` entry in ``episodes.jsonl`` still
skip sampling (no segment boundaries → nothing to clip against).
- Episodes with no ``segments`` entry in ``episodes.jsonl`` fall
back to a fixed ~4 s lookahead inside ``_sample_subgoal_frame``
rather than skipping subgoal loading, so legacy datasets without
segment annotations still get supervision.
"""
if self.num_cams <= 0 or len(self.meta.video_keys) == 0:
return {}
if "subgoals" not in self.meta.info:
return {}
if "segments" not in self.meta.episodes[ep_idx]:
if self.num_cams <= 0 or len(self.meta.camera_keys) == 0:
return {}
# Roll drop before any video decoding — at `subgoal_drop_prob=0.75` the
# old ordering threw away 75% of decodes.
Expand All @@ -1818,13 +1827,21 @@ def _load_subgoal_frames(self, ep_idx: int, frame_in_ep: int) -> dict[str, torch
at_end = bool(torch.rand(()) < self.subgoal_end_of_segment_prob)
subgoal_frame = self._sample_subgoal_frame(ep_idx, frame_in_ep, at_end_of_segment=at_end)
ts = subgoal_frame / self.fps
ep_start = int(self.episode_data_index["from"][self.epi2idx[ep_idx]].item())
out: dict[str, torch.Tensor] = {}
for k in range(self.num_cams):
vid_key = name_map.get(f"camera{k}")
if vid_key is None or vid_key not in self.meta.video_keys:
cam_key = name_map.get(f"camera{k}")
if cam_key is None:
continue
frames = self._query_videos({vid_key: np.array([ts])}, ep_idx)
out[f"subgoal{k}_raw"] = frames[vid_key]
if cam_key in self.meta.video_keys:
frames = self._query_videos({cam_key: np.array([ts])}, ep_idx)
out[f"subgoal{k}_raw"] = frames[cam_key]
elif cam_key in self.meta.image_keys:
# Image-dtype cameras are stored per-frame in the parquet
# rows of ``hf_dataset``. The within-episode index returned
# by ``_sample_subgoal_frame`` maps directly to the absolute
# row ``ep_start + subgoal_frame``.
out[f"subgoal{k}_raw"] = self.hf_dataset[ep_start + subgoal_frame][cam_key]
return out

def _add_padding_keys(self, item: dict, padding: dict[str, list[bool]]) -> dict:
Expand Down
34 changes: 28 additions & 6 deletions src/opentau/policies/factory.py
Original file line number Diff line number Diff line change
Expand Up @@ -37,11 +37,17 @@
from opentau.policies.pi05.configuration_pi05 import PI05Config
from opentau.policies.pi05_mem.configuration_pi05 import PI05MemConfig
from opentau.policies.pi06.configuration_pi06 import PI06Config
from opentau.policies.pi07_paligemma.high_level_planner.configuration_pi07_high_level import (
from opentau.policies.pi07.high_level_planner.configuration_pi07_high_level import (
PI07HighLevelPlannerConfig,
)
from opentau.policies.pi07.low_level_planner.configuration_pi07_low_level import (
PI07LowLevelPlannerConfig,
)
from opentau.policies.pi07_paligemma.high_level_planner.configuration_pi07_high_level import (
PI07HighLevelPlannerConfig as PI07PaligemmaHighLevelPlannerConfig,
)
from opentau.policies.pi07_paligemma.low_level_planner.configuration_pi07_low_level import (
PI07lowlevelPlannerConfig,
PI07lowlevelPlannerConfig as PI07PaligemmaLowLevelPlannerConfig,
)
from opentau.policies.pretrained import PreTrainedPolicy
from opentau.policies.value.configuration_value import ValueConfig
Expand Down Expand Up @@ -87,12 +93,24 @@ def get_policy_class(name: str) -> type[PreTrainedPolicy]:
return PI06Policy
elif name == "pi07_paligemma_high_level_planner":
from opentau.policies.pi07_paligemma.high_level_planner.modeling_pi07_high_level import (
PI07HighLevelPlannerPolicy,
PI07HighLevelPlannerPolicy as PI07PaligemmaHighLevelPlannerPolicy,
)

return PI07HighLevelPlannerPolicy
return PI07PaligemmaHighLevelPlannerPolicy
elif name == "pi07_paligemma_low_level_planner":
from opentau.policies.pi07_paligemma.low_level_planner.modeling_pi07_low_level import (
PI07LowLevelPlannerPolicy as PI07PaligemmaLowLevelPlannerPolicy,
)

return PI07PaligemmaLowLevelPlannerPolicy
elif name == "pi07_high_level":
from opentau.policies.pi07.high_level_planner.modeling_pi07_high_level import (
PI07HighLevelPlannerPolicy,
)

return PI07HighLevelPlannerPolicy
elif name == "pi07_low_level":
from opentau.policies.pi07.low_level_planner.modeling_pi07_low_level import (
PI07LowLevelPlannerPolicy,
)

Expand Down Expand Up @@ -135,9 +153,13 @@ def make_policy_config(policy_type: str, **kwargs) -> PreTrainedConfig:
elif policy_type == "pi06":
return PI06Config(**kwargs)
elif policy_type == "pi07_paligemma_high_level_planner":
return PI07HighLevelPlannerConfig(**kwargs)
return PI07PaligemmaHighLevelPlannerConfig(**kwargs)
elif policy_type == "pi07_paligemma_low_level_planner":
return PI07lowlevelPlannerConfig(**kwargs)
return PI07PaligemmaLowLevelPlannerConfig(**kwargs)
elif policy_type == "pi07_high_level":
return PI07HighLevelPlannerConfig(**kwargs)
elif policy_type == "pi07_low_level":
return PI07LowLevelPlannerConfig(**kwargs)
elif policy_type == "value":
return ValueConfig(**kwargs)
else:
Expand Down
13 changes: 13 additions & 0 deletions src/opentau/policies/pi07/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
# Copyright 2026 Tensor Auto Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
Loading
Loading