TensorAuto · shuheng-liu · Mar 26, 2026 · Mar 25, 2026 · Mar 26, 2026 · Mar 26, 2026
diff --git a/README.md b/README.md
@@ -64,6 +64,10 @@ We provide fully functioning $\pi_{0.5}$ checkpoints trained with high success r
 
 | Model Checkpoint              | Description                                                                                                   | Success Rate (%)                                                   |
 |-------------------------------|---------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------|
+| [TensorAuto/Robocasa_navigatekitchen][12] | A $\pi_{0.5}$ model checkpoint trained on the Navigate to Kitchen objects task on RoboCasa. | 97%                                                             |
+| [TensorAuto/Robocasa_Closeupdown][11] | A $\pi_{0.5}$ model checkpoint trained on the Close Oven, Close Toaster, and Close Dishwasher tasks on RoboCasa. | Close Oven : 90% <br> Close Toaster : 70% <br> Close Dishwasher : 90%                                                             |
+| [TensorAuto/robocasa_Closesideways][10]| A $\pi_{0.5}$ model checkpoint trained on the Close Microwave, Close Cabinet, and Close Fridge tasks on RoboCasa. | Close Microwave : 97% <br> Close Cabinet : 65% <br> Close Fridge : 80%                                                             |
+| [TensorAuto/pi05_libero_continuous_state][9]   | A $\pi_{0.5}$ model checkpoint trained on the LIBERO dataset with continuous states (raw proprioceptive states are projected to the model's latent dimension). | 92%                                                         |
 | [TensorAuto/moka_pot_libero_sft][6] <br> [TensorAuto/moka_pot_RECAP_R0][7] <br> [TensorAuto/moka_pot_RECAP_R1][8]   | A $\pi_{0}$ RECAP model checkpoint trained on moka pot task on libero. | 83% <br> 89% <br> 90%                                                             |
 | [TensorAuto/tPi0.5-libero][2] | A $\pi_{0.5}$ model checkpoint trained on the LIBERO dataset with discrete actions and knowledge insulation.  | 98.4% (10) <br> 97.6% (Goal) <br> 100% (Object) <br> 98% (Spatial) |
 | [TensorAuto/pi05_base][5]     | A $\pi_{0.5}$ model checkpoint converted from the official openpi checkpoint, with language embeddings added. | N/A                                                                |
@@ -81,3 +85,7 @@ This project builds on the $\pi$ series of [papers][3] and many other open-sourc
 [6]: https://huggingface.co/TensorAuto/moka_pot_libero_sft
 [7]: https://huggingface.co/TensorAuto/moka_pot_RECAP_R0
 [8]: https://huggingface.co/TensorAuto/moka_pot_RECAP_R1
+[9]: https://huggingface.co/TensorAuto/pi05_libero_continuous_state
+[10]: https://huggingface.co/TensorAuto/robocasa_Closesideways
+[11]: https://huggingface.co/TensorAuto/Robocasa_Closeupdown
+[12]: https://huggingface.co/TensorAuto/Robocasa_navigatekitchen
diff --git a/docs/source/tutorials.rst b/docs/source/tutorials.rst
@@ -16,3 +16,4 @@ This section provides step-by-step guides for common tasks in OpenTau, including
    RL
    tutorials/human_demo
    tutorials/ros_conversion
+   tutorials/robocasa
diff --git a/docs/source/tutorials/robocasa.rst b/docs/source/tutorials/robocasa.rst
@@ -0,0 +1,162 @@
+.. _robocasa:
+
+.. _robocasa_client_gist: https://gist.github.com/akshay18iitg/4d299c135c2d384ceb9a283b745baa01
+
+RoboCasa setup and rollout client
+=================================
+
+This page explains how to set up **RoboCasa** (kitchen simulation) alongside **OpenTau**, run the **policy WebSocket server** that serves an OpenTau checkpoint, and run the **rollout client** against that server.
+
+The rollout client code **is not shipped in the OpenTau repository**. Use the reference implementation in `robocasa_client_gist`_ (RoboCasa policy client: ``client`` and ``client_async``).
+
+.. note::
+   Complete the base :doc:`/installation` steps first. RoboCasa itself is installed **outside** the OpenTau package. OpenTau provides the **policy server**; you run the **client** inside your RoboCasa install (files from the gist, or equivalent).
+
+Overview
+--------
+
+The workflow is usually split across machines or terminals:
+
+1. **OpenTau host** — runs the WebSocket policy server, loads ``policy.pretrained_path`` from a training config, and returns **action chunks** via MessagePack.
+2. **RoboCasa host** — runs the kitchen sim, JPEG-encodes cameras, and talks to the server. Parallel rollouts use a threaded **async** client that **batches** observations for workers that need a new chunk.
+
+**In this repo**
+
+* ``opentau.scripts.robocasa.server`` — WebSocket server (single-observation or batched requests; replies are **action chunks** per request row).
+
+**Outside this repo**
+
+* ``robocasa.scripts.client`` / ``robocasa.scripts.client_async`` — reference rollout scripts from `robocasa_client_gist`_ (place them under your ``robocasa`` package tree or run them as you prefer).
+
+Server dependencies (``websockets``, ``msgpack``) are in OpenTau’s ``pyproject.toml``. The server needs **OpenCV** (``cv2``) to decode JPEG camera inputs.
+
+
+Prerequisites
+-------------
+
+**Hardware and OS**
+
+* Linux with an NVIDIA GPU is recommended for both RoboCasa (MuJoCo) and OpenTau inference.
+* Follow GPU guidance in :doc:`/installation`.
+
+**Python**
+
+* OpenTau targets **Python 3.10** (see ``requires-python`` in the repo root ``pyproject.toml``). Match or reconcile Python versions with your RoboCasa environment.
+
+**RoboCasa simulation**
+
+RoboCasa is not fully installed by ``pip install opentau``. Install the simulator and assets from upstream:
+
+* `RoboCasa installation <https://robocasa.ai/docs/introduction/installation.html>`_
+
+**OpenTau**
+
+Install OpenTau as in :doc:`/installation` (e.g. ``uv sync`` or ``pip install -e .``).
+
+
+Policy server (OpenTau)
+-----------------------
+
+The server accepts WebSocket connections and uses **MessagePack** for request and response bodies.
+
+**Inference**
+
+* Each successful call uses ``policy.sample_actions`` (not ``select_action``): the model predicts a **temporal chunk** of actions. The last dimension is trimmed or zero-padded to ``--robocasa_action_dim``.
+
+**Requests**
+
+* **Single observation:** top-level dict with ``images`` (JPEG bytes per camera name), ``state`` (list of floats), ``prompt`` (string).
+* **Batch:** ``{ "batch": true, "items": [ { ... same fields ... }, ... ] }``.
+
+**Responses**
+
+* **Single:** one chunk as nested lists: ``[[float, ...], ...]`` — shape ``(T, action_dim)`` with ``T`` equal to the policy’s predicted horizon (e.g. ``n_action_steps``).
+* **Batch:** ``[ chunk_0, chunk_1, ... ]`` — one chunk per ``items`` row, same order.
+
+**Entry point**
+
+.. code-block:: bash
+
+   python -m opentau.scripts.robocasa.server \
+       --config_path /path/to/train_config.json
+
+**RoboCasa-specific flags** (must appear **before** normal OpenTau config flags; they are parsed first and stripped from ``sys.argv``):
+
+.. list-table::
+   :header-rows: 1
+   :widths: 28 72
+
+   * - Flag
+     - Meaning
+   * - ``--robocasa_host``
+     - Bind address (default ``0.0.0.0``). Use ``127.0.0.1`` to listen only locally.
+   * - ``--robocasa_port``
+     - TCP port (default ``8765``).
+   * - ``--robocasa_action_dim``
+     - Flat action width for reply padding/trimming (default ``16``; align with RoboCasa env and training).
+   * - ``--robocasa_torch_compile``
+     - ``true`` / ``false`` — whether to compile ``sample_actions`` when supported (default ``true``).
+
+**Example**
+
+.. code-block:: bash
+
+   python -m opentau.scripts.robocasa.server \
+       --robocasa_host 0.0.0.0 \
+       --robocasa_port 8765 \
+       --robocasa_action_dim 16 \
+       --config_path /path/to/train_config.json
+
+The training config must define ``policy.pretrained_path`` and settings compatible with your checkpoint.
+
+
+Rollout client (RoboCasa environment)
+-------------------------------------
+
+Get the client sources from `robocasa_client_gist`_.
+
+Typical layout after copying into a RoboCasa checkout:
+
+* ``robocasa/scripts/client.py`` — single-env style client (if provided in the gist).
+* ``robocasa/scripts/client_async.py`` — threaded client that **batches** observations for workers that need a **new action chunk**, sends one WebSocket message per batch, receives one chunk per batch row, then **steps the simulator for every action in each chunk** before querying the server again.
+
+If your PandaOmron-style env expects actions in a particular layout, the gist may include a ``convert_action_pi05`` helper (or equivalent); wire it to match ``create_env`` / your task.
+
+**Example (async / batched client)**
+
+.. code-block:: bash
+
+   python -m robocasa.scripts.client_async ENV_NAME \
+       --host localhost \
+       --port 8765
+
+Replace ``ENV_NAME`` with a registered RoboCasa kitchen task. Common options (see the gist for the exact CLI):
+
+* ``--num-rollouts`` — total episodes.
+* ``--num-parallel`` — parallel env threads (batch size is at most the count of workers requesting a chunk at once).
+* ``--seed``, ``--split``, ``--output-dir``, ``--max-episode-steps``, ``--render``, ``--jpeg-quality``.
+
+**Environment variables** (if supported by the gist client)
+
+* ``ROBOCASA_POLICY_HOST`` — default host.
+* ``ROBOCASA_POLICY_PORT`` — default port.
+
+
+Protocol and outputs (summary)
+------------------------------
+
+* **Transport:** WebSocket binary frames, MessagePack.
+* **Client → server (batch):** ``{ "batch": true, "items": [ { "images": {...}, "state": [...], "prompt": "..." }, ... ] }``.
+* **Server → client (batch):** list of action chunks; each chunk is ``(T, action_dim)`` as nested lists.
+* **Rollout output:** directory with ``rollouts.json`` and, when not rendering on screen, per-rollout MP4s per camera (behavior as implemented in the gist).
+
+For server implementation details, see ``src/opentau/scripts/robocasa/server.py``. For client behavior and options, see `robocasa_client_gist`_.
+
+
+Troubleshooting
+---------------
+
+* **Import errors for ``robocasa``** — Install RoboCasa per upstream docs; run the client from that environment.
+* **Server JPEG decode errors** — Install OpenCV for Python on the server (``cv2``).
+* **Port in use** — Change ``--robocasa_port`` / client ``--port``.
+* **Action shape / chunk mismatch** — Align ``--robocasa_action_dim`` with the training config and the RoboCasa env; ensure the client consumes **chunks** (multiple simulator steps per server reply) if you use chunked inference.
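
The request, reply, and chunk-handling rules described in `docs/source/tutorials/robocasa.rst` can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the gist client or the OpenTau server code: the helper names (`fit_action_dim`, `build_batch_request`, `consume_chunk`) and the camera key `"cam"` are hypothetical, MessagePack encoding and the WebSocket transport are left out, and stopping early when an episode ends is an assumption that the tutorial does not spell out.

```python
from typing import Callable, Dict, List, Sequence

Chunk = List[List[float]]  # one action chunk: shape (T, action_dim) as nested lists


def fit_action_dim(chunk: Sequence[Sequence[float]], action_dim: int) -> Chunk:
    """Trim or zero-pad the last dimension of every action to action_dim."""
    fitted: Chunk = []
    for action in chunk:
        row = list(action[:action_dim])              # trim if too wide
        row.extend([0.0] * (action_dim - len(row)))  # zero-pad if too narrow
        fitted.append(row)
    return fitted


def build_batch_request(items: Sequence[Dict]) -> Dict:
    """Wrap per-worker observations in the documented batch envelope."""
    for item in items:
        missing = {"images", "state", "prompt"} - item.keys()
        if missing:
            raise ValueError(f"observation is missing fields: {sorted(missing)}")
    return {"batch": True, "items": list(items)}


def consume_chunk(chunk: Chunk, step: Callable[[List[float]], bool]) -> int:
    """Step the simulator once per action in the chunk.

    `step` is a hypothetical callback that applies one action and returns
    True when the episode is done. Returns the number of actions executed.
    """
    executed = 0
    for action in chunk:
        executed += 1
        if step(action):  # assumption: stop consuming once the episode ends
            break
    return executed
```

A client built along these lines would query the server again only after `consume_chunk` returns, and would pair reply chunk `i` with `items[i]`, since the batch reply keeps the request order.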
diff --git a/src/opentau/scripts/robocasa/__init__.py b/src/opentau/scripts/robocasa/__init__.py
@@ -0,0 +1,13 @@
+# Copyright 2026 Tensor Auto Inc. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
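
The tutorial in this PR says the RoboCasa-specific flags are parsed first and stripped from `sys.argv` before the normal OpenTau config flags are read. One common way to get that behavior is `argparse.ArgumentParser.parse_known_args`. The sketch below is an assumption about the approach, not a copy of `server.py`: the function name `split_robocasa_args` is hypothetical, and only the flag names and defaults are taken from the tutorial's flag table.

```python
import argparse
from typing import List, Tuple


def split_robocasa_args(argv: List[str]) -> Tuple[argparse.Namespace, List[str]]:
    """Parse the RoboCasa flags and return them with the untouched remainder."""
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    parser.add_argument("--robocasa_host", default="0.0.0.0")
    parser.add_argument("--robocasa_port", type=int, default=8765)
    parser.add_argument("--robocasa_action_dim", type=int, default=16)
    parser.add_argument(
        "--robocasa_torch_compile",
        type=lambda s: s.lower() == "true",  # accepts "true" / "false"
        default=True,
    )
    # Unknown flags (for example --config_path) are returned, not rejected.
    known, remaining = parser.parse_known_args(argv)
    return known, remaining
```

A server entry point could then call `opts, rest = split_robocasa_args(sys.argv[1:])` and set `sys.argv = [sys.argv[0], *rest]` before handing control to the regular config parser. Note that this sketch accepts the RoboCasa flags in any position, whereas the tutorial asks for them to appear before the OpenTau config flags.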