This is the official repository for the paper:
MWM: Mobile World Models for Action-Conditioned Consistent Prediction
Han Yan*, Zishang Xiang*, Zeyu Zhang*†, and Hao Tang‡
School of Computer Science, Peking University
*Equal contribution. †Project lead. ‡Corresponding author
If you find our code or paper helpful, please consider starring ⭐ us and citing:
@article{yan2026mwm,
  title={MWM: Mobile World Models for Action-Conditioned Consistent Prediction},
  author={Yan, Han and Xiang, Zishang and Zhang, Zeyu and Tang, Hao},
  journal={arXiv preprint arXiv:2603.07799},
  year={2026}
}

World models enable planning in an imagined, predicted future space, offering a promising framework for embodied navigation. However, existing navigation world models often lack action-conditioned consistency, so visually plausible predictions can still drift under multi-step rollout and degrade planning. Moreover, efficient deployment requires few-step diffusion inference, but existing distillation methods do not explicitly preserve rollout consistency, creating a training–inference mismatch. To address these challenges, we propose MWM, a mobile world model for planning-based image-goal navigation. Specifically, we introduce a two-stage training framework that combines structure pretraining with Action-Conditioned Consistency (ACC) post-training to improve action-conditioned rollout consistency. We further introduce Inference-Consistent State Distillation (ICSD) for few-step diffusion distillation with improved rollout consistency. Our experiments on benchmark and real-world tasks demonstrate consistent gains in visual fidelity, trajectory accuracy, planning success, and inference efficiency.
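The core idea behind action-conditioned rollout consistency can be illustrated with a toy deterministic world model. The sketch below is purely illustrative: the model, the "one-shot" prediction, and the error metric are hypothetical stand-ins, not the MWM architecture or its ACC loss.

```python
# Toy illustration of action-conditioned rollout consistency (NOT the MWM
# implementation): a one-step world model step(state, action) is applied
# autoregressively, and we measure how far the multi-step rollout drifts
# from a direct "one-shot" prediction conditioned on the same actions.

def step(state, action):
    # Hypothetical linear world model: next state = state + action.
    return [s + a for s, a in zip(state, action)]

def rollout(state, actions):
    # Apply the one-step model autoregressively over an action sequence.
    for a in actions:
        state = step(state, a)
    return state

def direct_jump(state, actions):
    # Hypothetical one-shot prediction conditioned on the summed actions.
    total = [sum(a[i] for a in actions) for i in range(len(state))]
    return [s + t for s, t in zip(state, total)]

def consistency_error(state, actions):
    # Mean absolute gap between autoregressive rollout and one-shot prediction;
    # a consistent model keeps this small as the rollout horizon grows.
    r = rollout(state, actions)
    d = direct_jump(state, actions)
    return sum(abs(x - y) for x, y in zip(r, d)) / len(state)

state = [0.0, 0.0]
actions = [[1.0, 0.5], [0.5, -0.5], [0.25, 1.0]]
print(consistency_error(state, actions))  # 0.0 for this exactly consistent toy model
```

In a learned diffusion world model the rollout and one-shot predictions generally disagree; a consistency objective penalizes that gap during training.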
2026/03/12: 🎉 Our paper has been promoted by Heart of Embodied Intelligence.
- Upload our paper to arXiv and build project pages.
- Upload the code.
- Upload the model.
Clone the repository and create a conda environment:
git clone https://github.com/AIGeeksGroup/MWM.git
cd MWM
conda create -n mwm python=3.10
conda activate mwm
pip install -r requirements.txt

Please follow the official NWM guide for detailed data download and preprocessing instructions.
Two-stage training (Structure Pretraining + Action-Conditioned Consistency (ACC) Post-training)
cd mwm
bash finetune_in_envs.sh

The LoRA adapter fine-tuned with ACC post-training on the SCAND dataset has been uploaded to Hugging Face. It is based on the NWM cdit_xl_100000 checkpoint.
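A LoRA adapter stores only a low-rank update to the base weights rather than a full model copy; at merge time the adapted weight is W' = W + (alpha / r) * B @ A. The tiny example below illustrates that arithmetic with hypothetical shapes; it is not the actual adapter-loading code in this repo.

```python
# Illustration of how a LoRA adapter updates a base weight matrix
# (hypothetical 2x2 weight and rank-1 factors; not the MWM/NWM loading code).
# W' = W + (alpha / r) * B @ A, where B is (d x r) and A is (r x k).

def matmul(B, A):
    # Plain-Python matrix multiply for small matrices.
    return [[sum(B[i][t] * A[t][j] for t in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]

def apply_lora(W, A, B, alpha, r):
    # Merge the scaled low-rank update into the base weight.
    delta = matmul(B, A)
    scale = alpha / r
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

W = [[1.0, 0.0], [0.0, 1.0]]  # base weight (2 x 2)
B = [[1.0], [0.0]]            # up-projection (2 x 1), rank r = 1
A = [[0.5, 0.5]]              # down-projection (1 x 2)
print(apply_lora(W, A, B, alpha=2.0, r=1))  # [[2.0, 1.0], [0.0, 1.0]]
```

Because only A and B (plus the scaling) are stored, the adapter on Hugging Face is far smaller than the base checkpoint it modifies.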
Evaluate ACC and generation quality in SCAND
bash single_frame_evaluation.sh

Evaluate navigation performance in SCAND
bash trajectory_evaluation.sh

cd realworld_deploy

Start the Inference Service
bash start_nwm_infer_service.sh

cd policies/nwm/real

Collect data with MMK2
python record_data.py

Data Processing
python process_episodes.py

The client connects to both the MMK2 robot and the inference server, and is currently supported only on Windows.
First, enable port forwarding:
ssh -p <SSH_PORT> -L 8000:127.0.0.1:8000 <USERNAME>@<SERVER_HOST>
Then, run the client in Windows PowerShell:
cd realworld_deploy/policies/nwm/real
powershell -ExecutionPolicy Bypass -File .\run_client.ps1
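Under the hood, the client sends observations through the forwarded port to the inference service. The snippet below sketches what building such a request could look like; the endpoint path, JSON field names, and encoding are illustrative assumptions, not the repo's actual protocol.

```python
# Hypothetical sketch of packaging an observation for the inference service
# reachable through the SSH tunnel on 127.0.0.1:8000 (endpoint path and JSON
# fields are illustrative assumptions, not the repo's actual protocol).
import base64
import json
import urllib.request

def build_infer_request(image_bytes, goal_pose,
                        url="http://127.0.0.1:8000/infer"):
    # Base64-encode the camera frame so it can travel inside a JSON body.
    payload = {
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "goal_pose": goal_pose,  # e.g. [x, y, yaw] in the robot frame
    }
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )

# Build (but do not send) a request with a dummy frame and goal pose.
req = build_infer_request(b"\x89PNG...", [1.0, 0.5, 0.0])
print(req.full_url)  # http://127.0.0.1:8000/infer
```

Sending the request (e.g. with `urllib.request.urlopen`) only works once the tunnel from the previous step is up.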
We thank the authors of NWM for their open-source code.