Official codebase for the ICLR 2026 paper: WIMLE: Uncertainty-Aware World Models with IMLE for Sample-Efficient Continuous Control.
WIMLE is a model-based reinforcement learning method: it learns stochastic, multi-modal world models and uses predictive uncertainty to emphasize the synthetic rollouts that matter most during training. Evaluated on 40 continuous-control benchmarks—DeepMind Control Suite, HumanoidBench, and MyoSuite—it substantially improves sample efficiency and asymptotic performance, with state-of-the-art results across the board.
DMC (Dog & Humanoid) | DMC (Full) | HumanoidBench | MyoSuite
Note: Our implementation and codebase structure are heavily inspired by the BiggerRegularizedOptimistic (BRO) codebase.
This codebase uses isolated Python virtual environments. To avoid dependency conflicts and JAX/CUDA version mismatches (especially with mujoco and dm_control), we provide separate requirements.txt files for each task suite.
Note on CUDA: The requirements install a local pinned version of nvidia-cuda-nvcc-cu12. To allow JAX to compile using ptxas without a system-wide CUDA installation, simply run the following command in your terminal after activating the environment:
```bash
export PATH=$(python -c "import site; print(site.getsitepackages()[0] + '/nvidia/cuda_nvcc/bin')"):$PATH
```

First, create and activate a fresh Python virtual environment. We recommend naming it `.test`, as our scripts automatically look for it:
```bash
python -m venv .test
source .test/bin/activate
```

Depending on your target benchmark suite, run one of the following inside your active environment:
For DeepMind Control Suite (DMC):

```bash
pip install -r requirements_dmc.txt
```

For HumanoidBench (HB):

```bash
pip install -r requirements_hb.txt
```

For MyoSuite (MYO):

```bash
pip install -r requirements_myo.txt
```

You can easily launch training across any supported benchmark via the provided `scripts/train.sh` launcher. This script takes the benchmark and environment name as arguments, automatically applies the correct rollout horizon for that task, and launches `train_parallel.py`.
```bash
# General Usage:
./scripts/train.sh <benchmark> <env_name>

# Running a HumanoidBench task (e.g., h1-maze-v0)
./scripts/train.sh hb h1-maze-v0

# Running a DMC task (e.g., humanoid-run)
./scripts/train.sh dmc humanoid-run

# Running a MyoSuite manipulation task (e.g., myo-key-turn-hard)
./scripts/train.sh myo myo-key-turn-hard
```

For a complete list of tested environment names, please refer to the configurations inside `scripts/train.sh`.
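Since each run is a single `train.sh` invocation, sweeps compose naturally with a small driver script. Below is a hypothetical sketch (the job list and sequential launch loop are our own, not part of the codebase) that builds the same commands shown above:

```python
import subprocess

# Hypothetical sweep: one run per (benchmark, env_name) pair, sequentially.
# Assumes ./scripts/train.sh takes exactly <benchmark> <env_name>, as above.
jobs = [("dmc", "humanoid-run"), ("hb", "h1-maze-v0")]
for benchmark, env_name in jobs:
    cmd = ["./scripts/train.sh", benchmark, env_name]
    print("launching:", " ".join(cmd))
    # subprocess.run(cmd, check=True)  # uncomment to actually launch
```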
Logs and metrics are recorded using Weights & Biases (wandb). By default, the wandb_mode in the launcher is set to online to automatically sync your runs to the W&B cloud!
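If you prefer to keep runs local (e.g., on a cluster node without internet access), you can either change `wandb_mode` in the launcher or override it with wandb's standard `WANDB_MODE` environment variable before launching; offline runs can be uploaded later with `wandb sync`. A minimal sketch:

```python
import os

# Force wandb into offline mode for this process and its children;
# runs are written to the local wandb/ directory instead of the cloud.
os.environ["WANDB_MODE"] = "offline"
print(os.environ["WANDB_MODE"])  # → offline
```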
The numbers reported in the paper are bundled under results/. Curves use 100 evaluation points at 10k environment-step spacing (1M steps total per run).
- `results/np_results/` — Per-suite NumPy arrays (`dmc/`, `hb/`, `myo/`), one `.npy` file per environment (return curves; rows are seeds, columns are the 10k-step checkpoints).
- `results/wimle.csv` — All runs in a single long-form table (`exp_name`, `env_name`, `seed`, `metric`, `env_step`, `value`).
- `results/wimle.pkl` — The same data as a Python `dict` keyed by `env_name` (each value is the array for that task).
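To illustrate the array layout (using a synthetic stand-in rather than an actual file from `results/np_results/`), averaging a return curve over seeds looks like this:

```python
import numpy as np

# Synthetic stand-in for one results/np_results/<suite>/<env>.npy array:
# rows are seeds, columns are the 100 checkpoints at 10k-step spacing.
rng = np.random.default_rng(0)
curves = rng.normal(size=(5, 100)).cumsum(axis=1)  # 5 seeds x 100 evals

env_steps = np.arange(1, curves.shape[1] + 1) * 10_000  # 10k ... 1M
mean_return = curves.mean(axis=0)  # per-checkpoint mean across seeds
stderr = curves.std(axis=0, ddof=1) / np.sqrt(curves.shape[0])

print(env_steps[0], env_steps[-1], mean_return.shape)  # → 10000 1000000 (100,)
```

Replace the synthetic array with `np.load("results/np_results/dmc/<env>.npy")` to plot a real curve.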
If you find this code or our paper useful in your research, please cite our work:
@inproceedings{aghabozorgi2026wimle,
title={{WIMLE}: Uncertainty-Aware World Models with {IMLE} for Sample-Efficient Continuous Control},
author={Mehran Aghabozorgi and Yanshu Zhang and Alireza Moazeni and Ke Li},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=mzLOnTb3WH}
}




