GitHub - SJTU-DENG-Lab/WLA: The official implementation of World-Language-Action Model for Unified World Modeling, Language Reasoning, and Action Synthesis

World-Language-Action Model for Unified World Modeling,
Language Reasoning, and Action Synthesis

If you find our project helpful, please give us a star ⭐ to support us 🙏🙏

📄 Paper | 🤗 Checkpoints | 📜 License

📌 ToDo

Training and evaluation code for LIBERO
Training and evaluation code for RoboTwin 2.0
Training and evaluation code for RMBench (before June 18)
Release code for learning new tasks from videos
Release code for Efficient Mode
Release code for TTS Mode

🤗 Models & Datasets

Model	Note
wla_libero_all_image_action	Trained on all four LIBERO suites
wla_robotwin_all_image_action	Trained across all 50 RoboTwin 2.0 tasks
wla_rmbench_battery_try_image_language_action	will be released before June 18
wla_rmbench_blocks_ranking_try_image_language_action	will be released before June 18
wla_rmbench_cover_blocks_image_language_action	will be released before June 18
wla_rmbench_press_button_image_language_action	will be released before June 18
wla_robotwin_same_emb_videos_cotrain_image_action	Jointly trained on 45 seen tasks and same-embodiment videos of 5 unseen tasks
wla_robotwin_cross_emb_videos_cotrain_image_action	Jointly trained on 45 seen tasks and cross-embodiment videos of 5 unseen tasks

Dataset	Note
LIBERO_LeRobot	The LIBERO dataset in LeRobot v3.0 format
RoboTwin-LeRobot	The RoboTwin 2.0 dataset in LeRobot v3.0 format
RoboTwin-LeRobot-seen-tasks	The 45 seen-task subset of RoboTwin 2.0
RoboTwin-Lerobot-unseen-tasks-same-emb	The 5 unseen-task subset of RoboTwin 2.0 under the same-embodiment setting
RoboTwin-Lerobot-unseen-tasks-cross-emb	The 5 unseen-task subset of RoboTwin 2.0 under the cross-embodiment setting
RMBench-LeRobot	will be released before June 18

📈 Evaluation

LIBERO

First, clone the repository and create the conda environment:

git clone git@github.com:SJTU-DENG-Lab/WLA.git
cd WLA
conda env create -f configs/environment_libero.yml
conda activate wla_libero

Then clone and install the LIBERO repository:

git clone git@github.com:Lifelong-Robot-Learning/LIBERO.git
cd LIBERO
pip install -e .

Install other required packages:

cd ..
pip install -r experiments/libero/libero_requirements.txt

Evaluate the LIBERO benchmark:

bash experiments/libero/run_libero_eval.sh

You can modify the task_suite_name in the script to evaluate different task suites.
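
As a rough guide to the values task_suite_name may take, the sketch below lists the four suite names registered in the LIBERO benchmark and checks a candidate against them. That run_libero_eval.sh expects exactly these strings is an assumption, and validate_suite is a hypothetical helper that is not part of this repository.

```python
# Suite names as registered in the LIBERO benchmark. Whether
# run_libero_eval.sh expects exactly these strings is an assumption.
LIBERO_SUITES = ("libero_spatial", "libero_object", "libero_goal", "libero_10")


def validate_suite(task_suite_name: str) -> str:
    """Return the suite name unchanged, or raise if it is not a known suite."""
    if task_suite_name not in LIBERO_SUITES:
        raise ValueError(
            f"unknown suite {task_suite_name!r}; expected one of {LIBERO_SUITES}"
        )
    return task_suite_name


if __name__ == "__main__":
    # Print each suite name that could be placed in the evaluation script.
    for suite in LIBERO_SUITES:
        print(validate_suite(suite))
```

Checking the name before launching a long evaluation run avoids discovering a misspelled suite only after the environment has been set up.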

RoboTwin 2.0

First, create the conda environment:

conda env create -f configs/environment_robotwin.yml
conda activate wla_robotwin

Next, clone the RoboTwin 2.0 repository:

git clone git@github.com:RoboTwin-Platform/RoboTwin.git
cd RoboTwin

Then, follow the official installation guide to install RoboTwin. Once the installation is complete, you can run the evaluation on the RoboTwin 2.0 benchmark:

bash experiments/robotwin/run_robotwin_eval.sh

Modify TASK_NAME to evaluate different tasks. TASK_CONFIG determines whether to evaluate the demo_clean or demo_randomized setting.

CONTROL_MODE specifies the control mode: eef for 16-dim end-effector actions and states, or joint for 14-dim joint-angle actions and states. wla_robotwin_all_image_action uses eef, while the experiments on learning new tasks from videos use joint.
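
The three variables above can be sanity-checked before a run. The following is a minimal sketch, assuming the script variables map one-to-one onto the fields shown; RoboTwinEvalConfig and check_action are hypothetical names, and "example_task" is a placeholder rather than a real RoboTwin task.

```python
from dataclasses import dataclass
from typing import Sequence

# Action/state dimensionality per control mode, as stated in this README.
ACTION_DIMS = {"eef": 16, "joint": 14}
TASK_CONFIGS = ("demo_clean", "demo_randomized")


@dataclass(frozen=True)
class RoboTwinEvalConfig:
    """Hypothetical container mirroring the variables in run_robotwin_eval.sh."""

    task_name: str
    task_config: str
    control_mode: str

    def __post_init__(self) -> None:
        # Reject values that the README does not list.
        if self.task_config not in TASK_CONFIGS:
            raise ValueError(f"TASK_CONFIG must be one of {TASK_CONFIGS}")
        if self.control_mode not in ACTION_DIMS:
            raise ValueError(f"CONTROL_MODE must be one of {tuple(ACTION_DIMS)}")

    @property
    def action_dim(self) -> int:
        """Expected length of one action (or state) vector."""
        return ACTION_DIMS[self.control_mode]


def check_action(cfg: RoboTwinEvalConfig, action: Sequence[float]) -> None:
    """Raise if the action length does not match the configured control mode."""
    if len(action) != cfg.action_dim:
        raise ValueError(
            f"expected {cfg.action_dim}-dim action for {cfg.control_mode!r}, "
            f"got {len(action)}"
        )


if __name__ == "__main__":
    cfg = RoboTwinEvalConfig("example_task", "demo_clean", "eef")
    check_action(cfg, [0.0] * 16)  # passes silently
    print(cfg.action_dim)
```

A mismatch between the checkpoint's control mode and CONTROL_MODE would show up here as a dimension error rather than as silently wrong rollouts.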

RMBench

Evaluation code for RMBench will be released before June 18.

🔧 Training

First, create the training conda environment:

conda env create -f configs/environment_lerobot.yml
conda activate wla_lerobot

If you want to accelerate training with FlashAttention, run the following command to install it:

wget https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/download/v0.7.16/flash_attn-2.8.3+cu128torch2.8-cp311-cp311-linux_x86_64.whl
pip install ./flash_attn-2.8.3+cu128torch2.8-cp311-cp311-linux_x86_64.whl

Next, modify the attn_implementation parameter in models/model.py:

attn_implementation="eager" => attn_implementation="flash_attention_2"
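
Instead of editing the value by hand, the choice could be made at import time. This is only a sketch of that idea and not how models/model.py is written; pick_attn_implementation is a hypothetical helper, and it checks only that the flash_attn package is importable, not that the GPU supports it.

```python
import importlib.util


def pick_attn_implementation() -> str:
    """Return "flash_attention_2" if flash_attn is installed, else "eager"."""
    # find_spec returns None when the top-level package cannot be located.
    if importlib.util.find_spec("flash_attn") is not None:
        return "flash_attention_2"
    return "eager"


if __name__ == "__main__":
    print(pick_attn_implementation())
```

The returned string would then be passed wherever attn_implementation is currently hard-coded, so the same code runs with or without the wheel installed.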

Then run the training script:

sh train.sh

You can modify the TRAINING_SETTING parameter in the script to train under different settings. The available options are as follows:

libero_all_image_action: Training on all four LIBERO suites
libero_all_action: Training on all four LIBERO suites without the World Expert
robotwin_all_image_action: Training on all 50 RoboTwin 2.0 tasks
robotwin_all_action: Training on all 50 RoboTwin 2.0 tasks without the World Expert
robotwin_seen_tasks_image_action: Training on the 45 seen-task subset of RoboTwin 2.0
robotwin_cross_emb_videos_cotrain_image_action: Joint training on 45 seen tasks and cross-embodiment videos of 5 unseen tasks
robotwin_same_emb_videos_cotrain_image_action: Joint training on 45 seen tasks and same-embodiment videos of 5 unseen tasks
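
The option names follow a visible pattern, which the sketch below makes explicit. It assumes that a name ending in _image_action enables the World Expert and that _videos_cotrain_ marks joint training with videos; this is read off the list above rather than taken from train.sh, and describe_setting is a hypothetical helper.

```python
from dataclasses import dataclass

# The TRAINING_SETTING options listed in this README.
TRAINING_SETTINGS = (
    "libero_all_image_action",
    "libero_all_action",
    "robotwin_all_image_action",
    "robotwin_all_action",
    "robotwin_seen_tasks_image_action",
    "robotwin_cross_emb_videos_cotrain_image_action",
    "robotwin_same_emb_videos_cotrain_image_action",
)


@dataclass(frozen=True)
class SettingInfo:
    benchmark: str           # "libero" or "robotwin"
    uses_world_expert: bool  # assumed from the "_image_action" suffix
    video_cotrain: bool      # assumed from "_videos_cotrain_" in the name


def describe_setting(name: str) -> SettingInfo:
    """Decode a TRAINING_SETTING name under the naming assumptions above."""
    if name not in TRAINING_SETTINGS:
        raise ValueError(f"unknown TRAINING_SETTING {name!r}")
    return SettingInfo(
        benchmark=name.split("_", 1)[0],
        uses_world_expert=name.endswith("_image_action"),
        video_cotrain="_videos_cotrain_" in name,
    )


if __name__ == "__main__":
    for setting in TRAINING_SETTINGS:
        print(setting, describe_setting(setting))
```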

✨ Acknowledgements

Heartfelt thanks to the creators of StarVLA and LeRobot for their open-sourced work!

📝 Citation

If you find our code or models useful in your work, please cite our paper:

@article{yang2026world,
  title={World-Language-Action Model for Unified World Modeling, Language Reasoning, and Action Synthesis},
  author={Yang, Yi and Liu, Zhihong and Kou, Siqi and Chen, Yiyang and Hu, Yanzhe and Zhou, Jianbo and Zhao, Boyuan and Wei, Zhijie and Xia, Xiao and Li, Xueqi and others},
  journal={arXiv preprint arXiv:2606.05979},
  year={2026}
}
