wilrev/MultimodalBandits

Code for Multimodal Bandits: Regret Lower Bounds and Optimal Algorithms

This repository contains the Python code used for the runtime and regret experiments in the paper.

Code Structure

➤ The code is organized in the following order:

  1. Helpful auxiliary functions
  2. Main DP algorithm
  3. Improved DP algorithm (see Appendix E in the paper)
  4. Subgradient descent procedure
  5. OSSB implementation
  6. Experiments from the paper

➤ Set the flags RUNTIME_EXPERIMENT, RUNTIME_IMPROVED_DP_EXPERIMENT, and REGRET_EXPERIMENT to True to run the experiments of Appendix A.2, Appendix E.8, and Section 6, respectively.
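
As a rough illustration of how such flags typically gate experiments, here is a minimal sketch. Only the three flag names and the paper sections come from this README; the helper `enabled_experiments` and the print statement are hypothetical placeholders and do not reflect the repository's actual function names or structure.

```python
# Flag names are taken from the README; everything else is a placeholder.
RUNTIME_EXPERIMENT = False              # Appendix A.2
RUNTIME_IMPROVED_DP_EXPERIMENT = False  # Appendix E.8
REGRET_EXPERIMENT = True                # Section 6


def enabled_experiments(runtime, runtime_improved_dp, regret):
    """Return the paper sections whose experiment flag is set to True."""
    sections = []
    if runtime:
        sections.append("Appendix A.2")
    if runtime_improved_dp:
        sections.append("Appendix E.8")
    if regret:
        sections.append("Section 6")
    return sections


if __name__ == "__main__":
    for section in enabled_experiments(
        RUNTIME_EXPERIMENT, RUNTIME_IMPROVED_DP_EXPERIMENT, REGRET_EXPERIMENT
    ):
        # In the real script, the corresponding experiment code would run here.
        print(f"Running the experiment of {section}")
```

With the values above, only the regret experiment of Section 6 would be selected; setting several flags to True selects several experiments in the order listed.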

Every function has a docstring, and the documentation is available in the "docs" folder.

License

MIT License.

Contact

Feel free to contact the authors:

William Réveillard wilrev@kth.se

Richard Combes richard.combes@centralesupelec.fr
