A package playing with Stochastic Multi-Armed Bandits (MAB) 🎰 Over the last decade, bandits have become a widely used class of machine learning algorithms.
In the following Jupyter notebooks we demonstrate different types of arms, bandit policies and types of Stochastic Multi-Armed Bandits, following Slivkins [1]:
- Non-adaptive policies - Uniform exploration and Epsilon-Greedy policy
- Adaptive policies - Successive Elimination and UCB1 policy (a minimal sketch of one policy from each family follows below)
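
To give a taste of how the two families differ, here is a minimal, self-contained NumPy sketch of one policy from each: Epsilon-Greedy and UCB1 on Bernoulli arms. This is not bandito's actual API; the function names, the fixed epsilon, and the Bernoulli reward model are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)


def epsilon_greedy(means, rounds=10_000, epsilon=0.1):
    """Non-adaptive policy: with probability epsilon pull a uniformly
    random arm, otherwise pull the arm with the best empirical mean."""
    k = len(means)
    pulls = np.zeros(k)
    rewards = np.zeros(k)
    for _ in range(rounds):
        if rng.random() < epsilon:
            arm = int(rng.integers(k))  # explore
        else:
            # exploit: empirical mean, treating unpulled arms as 0
            empirical = np.divide(rewards, pulls, out=np.zeros(k), where=pulls > 0)
            arm = int(np.argmax(empirical))
        pulls[arm] += 1
        rewards[arm] += rng.random() < means[arm]  # Bernoulli reward
    return rewards.sum()


def ucb1(means, rounds=10_000):
    """Adaptive policy: pull the arm maximising the UCB1 index
    empirical mean + sqrt(2 * ln(t) / pulls)."""
    k = len(means)
    pulls = np.zeros(k)
    rewards = np.zeros(k)
    for arm in range(k):  # pull each arm once to initialise
        pulls[arm] += 1
        rewards[arm] += rng.random() < means[arm]
    for t in range(k, rounds):
        index = rewards / pulls + np.sqrt(2 * np.log(t + 1) / pulls)
        arm = int(np.argmax(index))
        pulls[arm] += 1
        rewards[arm] += rng.random() < means[arm]
    return rewards.sum()


arms = [0.3, 0.5, 0.7]  # true Bernoulli means, unknown to the policies
print(epsilon_greedy(arms), ucb1(arms))
```

Note the difference: Epsilon-Greedy explores at a constant rate no matter what it has observed, while UCB1 adapts, shrinking the confidence bonus of frequently pulled arms so that exploration concentrates on arms that are still plausibly optimal.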
The package was developed in typed Python 3.8.0, and the required packages can be found in the requirements folder.
To install bandito from GitHub, run:
pip install git+https://github.com/matejker/bandito.git@master # install the latest [maybe not stable] version
pip install git+https://github.com/matejker/bandito.git@v0.1.0 # install specific version
In this repo we use a few tools to keep the code clean, styled and properly tested:
make lint # runs flake8 and Black check
make autoformat # runs Black formatting
make typecheck # runs mypy
make test # runs unit [py]tests
[1] Slivkins A. (2019), Introduction to Multi-Armed Bandits, arXiv:1904.07272, https://arxiv.org/abs/1904.07272