
yanyangbaobeiIsEmma-zz/Reinforcement-Learning-Contextual-Bandits

Reinforcement Learning & Contextual MAB

Contextual Bandits Resources:

ICML 2017 Tutorial: provided by Emma Brunskill 😊 http://hunch.net/~rwil/

Agrawal & Goyal, Thompson Sampling for Contextual Bandits with Linear Payoffs, ICML 2013 https://arxiv.org/abs/1209.3352

Zhou, A Survey on Contextual Multi-armed Bandits, 2015 https://arxiv.org/abs/1508.03326

Kaufmann, Contextual Bandit models for personalized recommendation (Slides), 2014 http://chercheurs.lille.inria.fr/ekaufman/ALICIA120514.pdf

Chapelle and Li, An Empirical Evaluation of Thompson Sampling, NIPS 2011 https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/thompson.pdf

Li et al., A Contextual-Bandit Approach to Personalized News Article Recommendation, WWW 2010 https://arxiv.org/pdf/1003.0146.pdf

Bubeck and Cesa-Bianchi, Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems https://www.microsoft.com/en-us/research/wp-content/uploads/2017/01/SurveyBCB12.pdf

Slivkins, Introduction to Multi-Armed Bandits http://slivkins.com/work/MAB-book.pdf

Russo et al., A Tutorial on Thompson Sampling https://web.stanford.edu/~bvr/pubs/TS_Tutorial.pdf

Deep Reinforcement Learning:

https://www.youtube.com/watch?v=2pWv7GOvuf0&t=2217s

Complementary Materials:

UCB Algorithm: http://banditalgs.com/2016/09/18/the-upper-confidence-bound-algorithm/

Sub-Gaussian random variables (MIT 18.S997 lecture notes, Chapter 1): https://ocw.mit.edu/courses/mathematics/18-s997-high-dimensional-statistics-spring-2015/lecture-notes/MIT18_S997S15_Chapter1.pdf

The first paper proving a regret bound for MAB (Lai and Robbins, 1985): http://www.rci.rutgers.edu/~mnk/papers/Lai_robbins85.pdf

Reinforcement Learning: An Introduction (Richard Sutton and Andrew Barto) https://pdfs.semanticscholar.org/aa32/c33e7c832e76040edc85e8922423b1a1db77.pdf
