Python replication of 'A Tutorial on Thompson Sampling', written by Daniel J. Russo, Benjamin Van Roy, Abbas Kazerouni, Ian Osband and Zheng Wen in 2018.
If anything in the code is unclear or you want to report a bug, please open an issue instead of emailing me directly.
The article, 'A Tutorial on Thompson Sampling' (referred to as 'TS_tutorial' in the files), covers the algorithm and its application, illustrating concepts through a range of examples, including Bernoulli bandit problems, shortest path problems, product recommendation, assortment, active learning with neural networks, and reinforcement learning in Markov decision processes. Most of these problems involve complex information structures, where information revealed by taking an action informs beliefs about other actions. The tutorial also discusses when and why Thompson sampling is or is not effective, and how it relates to alternative algorithms.
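For readers new to the idea, the Bernoulli bandit case mentioned above can be sketched in a few lines. This is only a minimal illustration, not necessarily how the code in this repository is organized: the function name `thompson_sampling_bernoulli`, its parameters, the uniform Beta(1, 1) prior, and the example success probabilities are all assumptions made for this sketch.

```python
import numpy as np


def thompson_sampling_bernoulli(true_probs, n_steps, seed=0):
    """Beta-Bernoulli Thompson sampling on a simulated bandit.

    true_probs: hypothetical success probability of each arm
                (unknown to the agent, used only to simulate rewards).
    Returns the posterior parameters (alpha, beta) and the reward history.
    """
    rng = np.random.default_rng(seed)
    n_arms = len(true_probs)

    # Assumed prior: Beta(1, 1), i.e. uniform, for every arm.
    alpha = np.ones(n_arms)
    beta = np.ones(n_arms)
    rewards = np.zeros(n_steps)

    for t in range(n_steps):
        # Draw one plausible mean reward per arm from its posterior.
        theta = rng.beta(alpha, beta)
        # Act greedily with respect to the sampled means.
        arm = int(np.argmax(theta))
        # Simulate a Bernoulli reward from the chosen arm.
        reward = float(rng.random() < true_probs[arm])
        # Conjugate update: successes go to alpha, failures to beta.
        alpha[arm] += reward
        beta[arm] += 1.0 - reward
        rewards[t] = reward

    return alpha, beta, rewards


if __name__ == "__main__":
    alpha, beta, rewards = thompson_sampling_bernoulli([0.1, 0.5, 0.9], 1000)
    pulls = alpha + beta - 2  # subtract the two prior pseudo-counts per arm
    print("pulls per arm:", pulls)
    print("posterior means:", alpha / (alpha + beta))
    print("average reward:", rewards.mean())
```

Sampling from the posterior, rather than always using the posterior mean, is what lets the agent keep trying arms it is still uncertain about while concentrating most pulls on the arm that looks best.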
All of the code aims to support the study of Thompson sampling (TS). Hopefully, it helps every TS learner, including me, grasp the material faster.
2019.7.18