GitHub - initial-h/CEER: Replay Memory as An Empirical MDP: Combining Conservative Estimation with Experience Replay. ICLR 2023

Replay Memory as An Empirical MDP: Combining Conservative Estimation with Experience Replay

Overview

PyTorch implementation of Conservative Estimation with Experience Replay (CEER).
The method is tested on Sokoban, Minigrid, and MinAtar environments.

Installation

pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
pip install -r requirements.txt

Tested with Python 3.7.11 and CUDA 11.4.

Running Experiments

python main.py

To run a different environment, modify atari_name_list in ceer/arguments.py.
For example, 'atari_name_list': ['Sokoban-Push_5x5_1_120'].
Other parameters, such as sample_method_para (alpha) and policy_loss_para (lambda), are also set in ceer/arguments.py.
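
As a rough illustration of the settings described above, the sketch below shows how such entries might look in a Python dict. Only the three key names and the Sokoban environment name come from this README. The dict name `config`, the helper `with_environments`, and both numeric values are assumptions made for the example, not the repository's actual code or defaults, so check ceer/arguments.py for the real structure.

```python
# Hypothetical excerpt. Only the three key names and the Sokoban
# environment name come from the README; everything else is illustrative.
config = {
    'atari_name_list': ['Sokoban-Push_5x5_1_120'],  # environments to run
    'sample_method_para': 0.5,  # alpha (placeholder value, not the default)
    'policy_loss_para': 1.0,    # lambda (placeholder value, not the default)
}


def with_environments(base, env_names):
    """Return a copy of the settings with a different environment list.

    Hypothetical helper: the original dict is left unchanged.
    """
    updated = dict(base)
    updated['atari_name_list'] = list(env_names)
    return updated


if __name__ == '__main__':
    # Build a variant of the settings and print the environments it would run.
    variant = with_environments(config, ['Sokoban-Push_5x5_1_120'])
    print(variant['atari_name_list'])
```

Copying the settings instead of editing them in place keeps one baseline configuration available when comparing several environment lists.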

Bibtex

@inproceedings{
zhang2023replay,
title={Replay Memory as An Empirical {MDP}: Combining Conservative Estimation with Experience Replay},
author={Hongming Zhang and Chenjun Xiao and Han Wang and Jun Jin and Bo Xu and Martin M{\"u}ller},
booktitle={The Eleventh International Conference on Learning Representations},
year={2023},
url={https://openreview.net/forum?id=SjzFVSJUt8S}
}

Acknowledgements

Awesome Environments used for testing:

Sokoban: https://github.com/mpSchrader/gym-sokoban

Minigrid: https://github.com/Farama-Foundation/Minigrid

MinAtar: https://github.com/kenjyoung/MinAtar

Some baselines can be found in the following works:

TER: https://openreview.net/forum?id=OXRZeMmOI7a

Dreamerv2: https://github.com/RajGhugare19/dreamerv2

Tianshou: https://github.com/thu-ml/tianshou
