wenshuoguo/inverse-bandit-code-release


Learning from an Exploring Demonstrator: Optimal Reward Estimation for Bandits

This repository provides the implementation of the experiments described in the paper:

Learning from an Exploring Demonstrator: Optimal Reward Estimation for Bandits
Wenshuo Guo, Kumar Krishna Agrawal, Aditya Grover, Vidya Muthukumar, Ashwin Pananjady

Requirements

pip install -e . -r requirements.txt

Experiments

Experiments are organized as Jupyter notebooks. For example usage, open one of them in JupyterLab:

jupyter-lab two_arms.ipynb

To cite this paper:

@article{guo2021exploringdemonstrator,
    title={Learning from an Exploring Demonstrator: Optimal Reward Estimation for Bandits},
    author={Wenshuo Guo and Kumar Krishna Agrawal and Aditya Grover and Vidya Muthukumar and Ashwin Pananjady},
    journal={arXiv preprint arXiv:2106.14866},
    year={2021}
}

Acknowledgements

The notebook for the error landscape on the battery dataset builds significantly on https://github.com/chueh-ermon/battery-fast-charging-optimization
