Shentao-YANG/SDM-GAN_ICML2022


Model-based Stationary Distribution Regularization

Source code for the experiments in *Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning*.

Installation

1. Install the basic packages, e.g.,

```bash
conda create -n sdmgan python=3.8.5
conda activate sdmgan
pip install numpy matplotlib seaborn gym==0.17.0 torch==1.10.1 cudatoolkit==11.1.74
```

   and add any other required dependencies.
2. Install MuJoCo and mujoco-py.
3. Install D4RL.

Toy Experiment

Source code for the toy experiment is in the `Toy_Experiment` folder. `Toy_Experiment/run.sh` is an example run file; please modify it according to your hardware availability.

Offline RL Experiment

Source code for the offline RL experiment using the D4RL benchmark is in the `D4RL_Experiment` folder.

SDM-GAN Variant

The run files for the SDM-GAN variant are generated by the `D4RL_Experiment/submit_jobs_server_gan.py` file. An example use of this file is

```bash
cd D4RL_Experiment
python submit_jobs_server_gan.py
```

Flags can be provided to the `python` command; please see this file for the available flags.

The location of the generated run files will be printed out.

SDM-WGAN Variant

The file `D4RL_Experiment/submit_jobs_server_w1.py` generates the run files for the SDM-WGAN variant used in the ablation study. Its usage is the same as for the SDM-GAN variant.

Evaluation

The run files will generate a folder for each (dataset, seed) pair. Within each such folder, the file `eval_norm.npy` stores the normalized scores and `eval.npy` records the unnormalized scores. The normalized scores are calculated by the D4RL package.
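The saved score curves can be inspected with plain NumPy. A minimal sketch (the `summarize_scores` helper and the synthetic curve below are illustrative, not part of the repository; real runs write one `eval_norm.npy` per result folder):

```python
import numpy as np

def summarize_scores(path):
    """Load an evaluation curve (one score per evaluation step)
    and return (final score, mean score over the run)."""
    scores = np.load(path)
    return float(scores[-1]), float(scores.mean())

# Illustration with a synthetic normalized-score curve.
np.save("eval_norm.npy", np.array([10.0, 40.0, 70.0]))
final, mean = summarize_scores("eval_norm.npy")
print(final, mean)  # 70.0 40.0
```

The same helper works for `eval.npy`, since both files hold one score per evaluation step.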

License

MIT License.
