Skip to content


Folders and files

Last commit message
Last commit date

Latest commit



18 Commits

Repository files navigation

EfficientZero (NeurIPS 2021)

Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021.


EfficientZero requires python3 (>=3.6) and pytorch (>=1.8.0) with the development headers.

We recommend to use torch amp (--amp_type torch_amp) to accelerate training.


Before starting training, you need to build the c++/cython style external packages. (GCC version 7.5+ is required.)

cd core/ctree

The distributed framework of this codebase is built on ray.


As for other packages required for this codebase, please run pip install -r requirements.txt.


Quick start

  • Train: python --env BreakoutNoFrameskip-v4 --case atari --opr train --amp_type torch_amp --num_gpus 1 --num_cpus 10 --cpu_actor 1 --gpu_actor 1 --force
  • Test: python --env BreakoutNoFrameskip-v4 --case atari --opr test --amp_type torch_amp --num_gpus 1 --load_model --model_path model.p \

Bash file

We provide and for training and evaluation.

  • Train:
    • With 4 GPUs (3090): bash
  • Test: bash
Required Arguments Description
--env Name of the environment
--case {atari} It's used for switching between different domains(default: atari)
--opr {train,test} select the operation to be performed
--amp_type {torch_amp,none} use torch amp for acceleration
Other Arguments Description
--force will rewrite the result directory
--num_gpus 4 how many GPUs are available
--num_cpus 96 how many CPUs are available
--cpu_actor 14 how many cpu workers
--gpu_actor 20 how many gpu workers
--seed 0 the seed
--use_priority use priority in replay buffer sampling
--use_max_priority use the max priority for the newly collectted data
--amp_type 'torch_amp' use torch amp for acceleration
--info 'EZ-V0' some tags for you experiments
--p_mcts_num 8 set the parallel number of envs in self-play
--revisit_policy_search_rate 0.99 set the rate of reanalyzing policies
--use_root_value use root values in value targets (require more GPU actors)
--render render in evaluation
--save_video save videos for evaluation

Architecture Designs

The architecture of the training pipeline is shown as follows:

Some suggestions

  • To use a smaller model, you can choose smaller dim of the projection layers (Eg: 256/64) and the LSTM hidden layer (Eg: 64) in the config.
  • For GPUs with 10G memory instead of 20G memory, you can allocate 0.25 gpu for each GPU maker (@ray.remote(num_gpus=0.25)) in core/

New environment registration

If you wan to apply EfficientZero to a new environment like mujoco. Here are the steps for registration:

  1. Follow the directory config/atari and create dir for the env at config/mujoco.
  2. Implement your MujocoConfig(BaseConfig) class and implement the models as well as your environment wrapper.
  3. Register the case at


Evaluation with 32 seeds for 3 different runs (different seeds).


If you find this repo useful, please cite our paper:

  title={Mastering Atari Games with Limited Data},
  author={Weirui Ye, and Shaohuai Liu, and Thanard Kurutach, and Pieter Abbeel, and Yang Gao},


If you have any question or want to use the code, please contact .


We appreciate the following github repos a lot for their valuable code base implementations:


Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021.







No releases published


No packages published