PID-Accelerated-TD-Learning

Applies ideas from control theory, in particular PID control, with the aim of accelerating the dynamics of temporal-difference (TD) learning.

This work builds on the PID-accelerated value iteration of Farahmand and Ghavamzadeh [1] and carries it over to the reinforcement learning (RL) setting.

[1] A. M. Farahmand and M. Ghavamzadeh, “PID Accelerated Value Iteration Algorithm,” International Conference on Machine Learning (ICML), 2021.
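
To make the idea concrete, here is a minimal sketch of a PID-style value update for evaluating a fixed policy on a toy MDP. It assumes the update form described in [1], where the Bellman residual plays the role of the controller's error signal. The function name, the gain values, and the three-state chain are illustrative assumptions and are not taken from this repository's code.

```python
import numpy as np


def pid_policy_evaluation(P, r, gamma, kappa_p=1.0, kappa_i=0.0, kappa_d=0.0,
                          alpha=0.05, beta=0.95, num_iters=500):
    """Evaluate a fixed policy with a PID-style value update (illustrative sketch).

    P is the (n, n) transition matrix under the policy and r the (n,) vector
    of expected rewards. With kappa_p=1 and kappa_i=kappa_d=0 this reduces to
    the conventional value iteration update V <- r + gamma * P V.
    """
    n = len(r)
    V = np.zeros(n)
    V_prev = np.zeros(n)
    z = np.zeros(n)  # running (discounted) accumulation of Bellman residuals
    for _ in range(num_iters):
        TV = r + gamma * P @ V             # Bellman operator applied to V
        residual = TV - V                  # Bellman residual, the error signal
        z = beta * z + alpha * residual    # integral state
        V_next = ((1.0 - kappa_p) * V + kappa_p * TV   # proportional term
                  + kappa_i * z                        # integral term
                  + kappa_d * (V - V_prev))            # derivative term
        V_prev, V = V, V_next
    return V


# Toy three-state chain (an assumption for illustration only).
P = np.array([[0.5, 0.3, 0.2],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])
r = np.array([1.0, 0.0, -1.0])
gamma = 0.9

V_exact = np.linalg.solve(np.eye(3) - gamma * P, r)        # closed-form solution
V_plain = pid_policy_evaluation(P, r, gamma)                # plain value iteration
V_pd = pid_policy_evaluation(P, r, gamma, kappa_d=0.1)      # with a small derivative gain
```

Both runs should approach the same fixed point as the closed-form solution on this example. Whether a given choice of gains speeds up or destabilizes the iteration depends on the MDP, so the gains here should be read as placeholders rather than recommendations.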

Reproducibility Instructions:

1. Create a directory called outputs at the top level of the repository if one doesn't already exist.
2. Set the base directory in both slurm/setup.sh and globals.py.
3. If needed, edit the learning rates to grid-search over at the top of TabularPID/hyperparameter_tests.py.
4. Each slurm/*.sh file is an experiment. Adjust the parameters at the top of the file as desired; the list below documents their meaning.
5. Run the file. Its output is written to the outputs directory.
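
The first and last steps can be sketched in Python as a small pre-flight helper. This is a hypothetical convenience, not a script shipped with the repository: `prepare_run` is an invented name, and it only creates the outputs directory and lists the shell scripts found under slurm/ so you can pick one to run.

```python
from pathlib import Path


def prepare_run(repo_root):
    """Create outputs/ if it is missing and list the scripts under slurm/.

    Hypothetical helper for illustration; the repository itself does not
    provide this function.
    """
    root = Path(repo_root)
    outputs = root / "outputs"
    outputs.mkdir(exist_ok=True)                      # no error if it already exists
    scripts = sorted((root / "slurm").glob("*.sh"))   # candidate scripts to run
    return outputs, scripts
```

Calling `prepare_run(".")` from the top level of the repository returns the outputs path and the sorted list of scripts. The helper deliberately does not launch anything, since how a script is run depends on your cluster setup.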

Important parameters:

Repeat: The number of times an experiment is repeated on different seeds when calculating the results.
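
As an illustration of what repeating over seeds buys you, the sketch below runs a stand-in experiment once per seed and aggregates the resulting curves into a mean and a standard error. The functions `run_once` and `repeat_experiment`, the synthetic error curve, and the choice of mean and standard error as the aggregate are all assumptions made for this example; the repository may aggregate its results differently.

```python
import numpy as np


def run_once(seed, num_steps=100):
    """Hypothetical stand-in for one experiment: a noisy, decaying error curve."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(scale=0.1, size=num_steps)
    return np.exp(-0.05 * np.arange(num_steps)) + noise


def repeat_experiment(repeat, num_steps=100):
    """Run the experiment on seeds 0..repeat-1 and aggregate across seeds.

    Requires repeat >= 2 so that the sample standard deviation is defined.
    """
    curves = np.stack([run_once(seed, num_steps) for seed in range(repeat)])
    mean = curves.mean(axis=0)
    stderr = curves.std(axis=0, ddof=1) / np.sqrt(repeat)
    return mean, stderr


mean, stderr = repeat_experiment(repeat=20)
```

A larger Repeat value shrinks the standard error of the averaged curve at the cost of proportionally more compute.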

To produce the aggregated Garnet results faster, use run_aggregated_garnets.sh to split the work across different nodes, then run slurm/plot_garnets.sh to plot the results.
