MIKASA-Robo-VLA v1.0.0

@avanturist322 released this 22 May 16:04
· 4 commits to main since this release

What is MIKASA-Robo-VLA?

MIKASA-Robo-VLA extends the MIKASA-Robo memory benchmark to language-conditioned Vision-Language-Action research. It provides tabletop robotic manipulation environments that require an agent to retain and use information across delayed, occluded, temporal, or multi-stage interactions.

The canonical VLA benchmark contains 90 tasks with natural-language instructions, ManiSkill/Gymnasium environments, and released trajectory datasets for training and evaluation. The benchmark task manifest is mikasa_robo_vla_envs.csv.
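As a quick sanity check, the manifest can be inspected with the Python standard library. This is a minimal sketch that assumes a local copy of mikasa_robo_vla_envs.csv with a header row; the column names are not documented here, so the helpers (`load_manifest` and `summarize_manifest`, both hypothetical names) report whatever columns they find rather than assuming any.

```python
import csv
from pathlib import Path


def load_manifest(path):
    """Read a task manifest CSV into a list of row dictionaries."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def summarize_manifest(rows):
    """Return the number of rows and the column names found in the manifest."""
    columns = list(rows[0].keys()) if rows else []
    return len(rows), columns


# Example (the path is an assumption; point it at your local copy):
# rows = load_manifest("mikasa_robo_vla_envs.csv")
# count, columns = summarize_manifest(rows)
# print(count, columns)
```

If the manifest lists one task per row, the reported count should match the 90 tasks stated above.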

What changed from MIKASA-Robo (RL release)

  • The task set grows from 32 to 90 registered environments and covers 10 memory types (vs 4 in the RL release).
  • Every task ships a natural-language LANGUAGE_INSTRUCTION for VLA conditioning.
  • Episodes are grouped into three horizon splits (Short / Medium / Long) so multi-task training and evaluation are tractable.
  • 22,500 oracle trajectories (generated by PPO or motion planning; more than 6 million transitions in total) are released on Hugging Face in RLDS and LeRobotDataset v3 formats, so no further conversion is needed.
  • Dense and normalised-dense rewards are calibrated for every task, enabling both offline imitation learning and online RL.
  • The original 32-task RL implementation is available from the mikasa-robo-rl branch and remains under mikasa_robo_suite/rl/ for backwards compatibility.
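The headline numbers above imply a rough dataset shape. The sketch below is back-of-the-envelope arithmetic only: it assumes the trajectories are spread evenly across the 90 tasks (the release notes do not say so) and treats "6+ million" as a lower bound of 6,000,000 transitions.

```python
TASKS = 90
TRAJECTORIES = 22_500
MIN_TRANSITIONS = 6_000_000  # "6+ million" taken as a lower bound

# Assumption: an even split of trajectories across tasks.
trajectories_per_task = TRAJECTORIES / TASKS

# Lower bound on the mean trajectory length, in transitions.
min_mean_length = MIN_TRANSITIONS / TRAJECTORIES

print(f"trajectories per task (if evenly split): {trajectories_per_task:.0f}")
print(f"mean trajectory length (at least): {min_mean_length:.1f} transitions")
```

Under these assumptions each task would have 250 trajectories, averaging at least about 267 transitions each.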

Installation

pip install mikasa-robo-suite
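After installing, the following sketch confirms that the distribution is visible to the current Python environment. It uses only the standard library's `importlib.metadata` and the distribution name from the pip command above; `installed_version` is a hypothetical helper name, and the version printed depends on what pip resolved.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("mikasa-robo-suite")
    if found is None:
        print("mikasa-robo-suite is not installed in this environment")
    else:
        print(f"mikasa-robo-suite {found} is installed")
```

This checks package metadata only, so it does not import the suite or validate its simulator dependencies.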

Links