JacobHA/RLC-SoftQBounding

Welcome to the repository for the RLC 2024 paper "Boosting Soft Q-Learning by Bounding" by Jacob Adamczyk, Volodymyr Makarenko, Stas Tiomkin, and Rahul Kulkarni. Here you can reproduce the paper's experiments and try bounded soft Q-learning on your own environments.

We use Gymnasium for RL environments and wandb for logging / hyperparameter tuning.
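The paper's central idea is to clip soft Q-learning targets to computable lower and upper bounds on the optimal soft value. As a minimal illustration of that idea (not the repository's implementation; the function names, and the assumption that `lower`/`upper` bounds are given, are hypothetical):

```python
import math

def soft_value(q_values, beta):
    # Soft (log-sum-exp) state value: V(s) = (1/beta) * log sum_a exp(beta * Q(s, a)).
    # Subtract the max for numerical stability.
    m = max(q_values)
    return m + math.log(sum(math.exp(beta * (q - m)) for q in q_values)) / beta

def bounded_target(reward, next_q_values, beta, gamma, lower, upper):
    # Standard soft Q-learning TD target, clipped to assumed bounds [lower, upper]
    # before it is used in the update.
    target = reward + gamma * soft_value(next_q_values, beta)
    return min(max(target, lower), upper)
```

The clipping step is the only change relative to vanilla soft Q-learning, which is what makes the technique easy to try in your own training loop.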

Setup

To prepare an environment with conda:

  1. Create a conda environment: `conda create --name qbounds python=3.10`
  2. Activate it: `conda activate qbounds`
  3. Install the Python requirements: `pip install -r requirements.txt`

Function approximation (FA) experiments

Reproduce the fine-tuned results

  1. Run the best hyperparameters for an environment:
    `python experiments.py --env_id=CartPole-v1`

  2. Plot the results for the specific environment:
    `python tb_plotter.py --env_id=CartPole-v1`

Re-run the hyperparameter sweep

  1. Run the hyperparameter sweep for an environment:
    `python experiments.py --env_id=CartPole-v1 --do_sweep=True`
  2. Extract the best hyperparameters logged with wandb:
    `python wandb_best_hparams.py --env_id=CartPole-v1 --entity=your_wandb_entity`
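wandb sweeps are driven by a configuration dictionary passed to `wandb.sweep`. The sketch below shows the general shape of such a configuration; the search method, metric name, and parameter ranges are illustrative assumptions, not the ones `experiments.py` actually uses:

```python
# Hypothetical wandb sweep configuration (the repository's actual search
# space may differ). A dict like this is passed to wandb.sweep(), and
# wandb.agent() then runs trials sampled from it.
sweep_config = {
    "method": "bayes",  # search strategy: "grid", "random", or "bayes"
    "metric": {"name": "episode_reward", "goal": "maximize"},
    "parameters": {
        "learning_rate": {"min": 1e-5, "max": 1e-2},
        "gamma": {"values": [0.95, 0.99]},
    },
}
```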

Tabular experiments

Reproduce the figures

  1. Run the learning-rate sweep for each of the methods of interest; it will use the 30 mazes contained in the `mazes` directory:
    `python tabular_experiments/tabular_lr_sensitivity.py`
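The structure of such a learning-rate sensitivity sweep can be sketched as follows. Everything here is a self-contained illustration: `train_tabular` is a stand-in for a real tabular training run, and the method names and learning-rate grid are assumptions, not the script's actual values:

```python
import random

def train_tabular(method, maze_id, lr):
    # Placeholder for a tabular soft Q-learning run on one maze; returns a
    # dummy final return (seeded per maze) so the sketch is self-contained.
    random.seed(maze_id)
    return random.random() - (0.0 if method == "bounded_soft_q" else 0.1)

learning_rates = [1e-4, 1e-3, 1e-2, 1e-1]   # assumed sweep grid
methods = ["soft_q", "bounded_soft_q"]      # assumed method names
mazes = range(30)                           # the 30 mazes in `mazes/`

# Mean return per (method, learning rate) over all mazes
mean_return = {
    (m, lr): sum(train_tabular(m, i, lr) for i in mazes) / 30
    for m in methods
    for lr in learning_rates
}
```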
