Offline Reinforcement Learning with Implicit Q-Learning

This repository contains the official implementation of Offline Reinforcement Learning with Implicit Q-Learning by Ilya Kostrikov, Ashvin Nair, and Sergey Levine.

If you use this code for your research, please consider citing the paper:

@article{kostrikov2021iql,
    title={Offline Reinforcement Learning with Implicit Q-Learning},
    author={Ilya Kostrikov and Ashvin Nair and Sergey Levine},
    year={2021},
    eprint={2110.06169},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

For a PyTorch reimplementation, see https://github.com/rail-berkeley/rlkit/tree/master/examples/iql
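At its core, IQL trains a state-value function V(s) with expectile regression against the learned Q-function, then extracts a policy with advantage-weighted regression, so it never has to evaluate Q on out-of-distribution actions. The following is a minimal, illustrative JAX sketch of those two ingredients; the function names and default hyperparameters are stand-ins, not the repository's actual code:

```python
import jax.numpy as jnp

def expectile_loss(diff, expectile=0.7):
    # Asymmetric squared error: positive residuals (Q above V) are
    # weighted by `expectile`, negative ones by (1 - expectile).
    weight = jnp.where(diff > 0, expectile, 1 - expectile)
    return weight * (diff ** 2)

def value_loss(q, v, expectile=0.7):
    # Fit V(s) to an upper expectile of Q(s, a) over dataset actions,
    # approximating a maximum without sampling new actions.
    return expectile_loss(q - v, expectile).mean()

def awr_weights(q, v, temperature=3.0, max_weight=100.0):
    # Advantage-weighted regression: dataset actions are re-weighted by
    # exp(temperature * (Q - V)), clipped for numerical stability; the
    # policy maximizes the weighted log-likelihood of those actions.
    return jnp.minimum(jnp.exp(temperature * (q - v)), max_weight)
```

The Q-function itself is trained with a standard TD backup toward r + γ V(s').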

How to run the code

Install dependencies

pip install --upgrade pip

pip install -r requirements.txt

# Installs the wheel compatible with CUDA 11 and cuDNN 8.
pip install --upgrade "jax[cuda]>=0.2.27" -f https://storage.googleapis.com/jax-releases/jax_releases.html

For other CUDA configurations, see the installation instructions in the JAX README.
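Once installed, a quick way to confirm that JAX picked up the GPU wheel is to list its visible devices:

```python
import jax

# Prints GPU devices if the CUDA wheel installed correctly;
# a CPU-only list means JAX fell back to the CPU backend.
print(jax.devices())
```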

Run training

Locomotion

python train_offline.py --env_name=halfcheetah-medium-expert-v2 --config=configs/mujoco_config.py

AntMaze

python train_offline.py --env_name=antmaze-large-play-v0 --config=configs/antmaze_config.py --eval_episodes=100 --eval_interval=100000

Kitchen and Adroit

python train_offline.py --env_name=pen-human-v0 --config=configs/kitchen_config.py
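The --env_name values above are D4RL dataset names. As a rough sketch of what the training scripts consume (the scripts handle dataset loading internally; this snippet is for illustration only, assuming gym and d4rl from the requirements are installed):

```python
import gym
import d4rl  # noqa: F401 -- importing registers the D4RL environments

env = gym.make('halfcheetah-medium-expert-v2')
# Returns a dict with 'observations', 'actions', 'rewards',
# 'next_observations', and 'terminals' arrays.
dataset = d4rl.qlearning_dataset(env)
print(dataset['observations'].shape)
```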

Finetuning on AntMaze tasks

python train_finetune.py --env_name=antmaze-large-play-v0 --config=configs/antmaze_finetune_config.py --eval_episodes=100 --eval_interval=100000 --replay_buffer_size=2000000

Misc

The implementation is based on JAXRL (https://github.com/ikostrikov/jaxrl).
