Accompanying code for the paper "Learning and Planning in Average-Reward Markov Decision Processes" by Yi Wan*, Abhishek Naik*, and Rich Sutton.
- `agents/` contains all the algorithms.
- `environments/` contains all the environments.
- `config_files/` contains sample configuration files for various experiments.
- `experiments.py` contains methods to run different kinds of experiments, e.g., prediction and control.
- `run_exp.py` runs an experiment based on the command-line arguments outlined below.
A typical experiment looks like:

```shell
python run_exp.py --exp run_exp_learning_control_no_eval --config-file config_files/control_AccessControl_diff-q.json --output-folder results/control/AccessControl
```
where:

- `--exp`: the experiment to be run. For prediction and control, this will generally be `run_exp_learning_prediction` or `run_exp_learning_control_no_eval`; check `experiments.py` for full documentation and use cases.
- `--config-file`: the file with all the experiment configurations.
- `--output-folder`: the location where all the result logs will be stored.
Optional parameters for deploying experiments at scale:

- `--cfg-start`: the start index of the list of configurations for this script.
- `--cfg-end`: the end index of the list of configurations for this script (refer to `utils/sweeper.py` for more details).
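For instance, a large sweep can be launched in fixed-size chunks by computing `--cfg-start`/`--cfg-end` pairs in a small shell loop. The sketch below is hypothetical: it assumes a sweep of 100 configurations split into chunks of 20, and only prints the launch commands rather than running them; the real configuration count depends on the config file.

```shell
# Hypothetical sweep size and chunk size -- adjust both to the
# actual number of configurations in your config file.
TOTAL=100
CHUNK=20
for START in $(seq 0 "$CHUNK" $((TOTAL - 1))); do
  END=$((START + CHUNK))
  # Print (rather than run) one launch command per chunk; pipe
  # these lines to your cluster's job-submission tool as needed.
  echo "python run_exp.py --exp run_exp_learning_control_no_eval" \
       "--config-file config_files/control_AccessControl_diff-q.json" \
       "--output-folder results/control/AccessControl" \
       "--cfg-start $START --cfg-end $END"
done
```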
Check out the Jupyter notebook `learning_planning_exps.ipynb` for sample experiments and the plots reported in the paper.
Requirements:

- `python3` (tested with 3.7.6)
- `numpy` (tested with 1.18.1)
- `tqdm` (tested with 4.40.2)
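One way to install the Python dependencies (a sketch: the pinned versions above are what the code was tested with, but unpinned installs of recent versions will typically work as well):

```shell
# Install the two Python dependencies; the list above gives the
# versions the code was tested with (numpy 1.18.1, tqdm 4.40.2).
python3 -m pip install numpy tqdm
```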