SMPyBandits

Open-source Python package for single- and multi-player multi-armed bandit algorithms.


This repository contains the code of Lilian Besson's numerical environment, written in Python (2 or 3), for simulations of 🎰 single-player and multi-player Multi-Armed Bandit (MAB) algorithms.

Quick presentation

It contains the most complete collection of single-player (classical) bandit algorithms on the Internet (over 65!), as well as implementations of all the state-of-the-art multi-player algorithms.

I follow the latest publications related to Multi-Armed Bandits (MAB) research very actively, and I usually implement new algorithms quite quickly. For instance, Exp3++, CORRAL and SparseUCB were each introduced by articles presented at COLT in July 2017, LearnExp comes from a NIPS 2017 paper, and kl-UCB++ from an ALT 2017 paper. More recent examples are klUCBswitch, from a paper from May 2018, and MusicalChairNoSensing, from a paper from August 2018.


  • Classical MABs have many applications, from clinical trials and A/B testing to game tree exploration and online content recommendation (my framework does not implement contextual bandits yet).
  • Multi-player MABs have applications in Cognitive Radio, and my framework implements all the collision models found in the literature, as well as all the algorithms from the last 10 years or so (rhoRand from 2009, MEGA from 2015, MusicalChair, our state-of-the-art algorithms RandTopM and MCTopM, and the very recent algorithms SIC-MMAB from arXiv:1809.08151 and MusicalChairNoSensing from arXiv:1808.08416).
  • I'm working on adding clean support for non-stationary MAB problems, and I will soon implement all the state-of-the-art algorithms for them.

With this numerical framework, simulations can run on a single CPU or a multi-core machine, and summary plots are automatically saved as high-quality PNG, PDF and EPS files (ready to be used in research articles). Making new simulations is very easy: one only needs to write a configuration script, and basically no code! See these examples (files named configuration_*.py).
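
For illustration, here is a minimal sketch of such a configuration script. It is only a sketch: the dictionary layout below is an assumption modeled on the configuration_*.py examples, which remain the authoritative reference.

# configuration_example.py -- a hedged sketch of a configuration script,
# assuming the dictionary layout of the configuration_*.py examples.
from SMPyBandits.Arms import Bernoulli
from SMPyBandits.Policies import UCB, klUCB, Thompson

configuration = {
    "horizon": 10000,     # T: number of time steps
    "repetitions": 100,   # N: number of independent repetitions
    "n_jobs": 4,          # number of CPU cores used by joblib
    "environment": [      # one Bernoulli problem with K = 3 arms
        {"arm_type": Bernoulli, "params": [0.1, 0.5, 0.9]},
    ],
    "policies": [         # the algorithms compared on this problem
        {"archtype": UCB, "params": {}},
        {"archtype": klUCB, "params": {}},
        {"archtype": Thompson, "params": {}},
    ],
}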

A complete Sphinx documentation for every algorithm and every piece of code (including constants in the configurations!) is available here: SMPyBandits.GitHub.io. (I will use ReadTheDocs for this project, but I won't use any continuous integration, don't even think of it!)

I (Lilian Besson) started my PhD in October 2016, and this is part of my ongoing research since December 2016.

I launched the documentation in March 2017, wrote my first research articles using this framework in 2017, and decided to (finally) open-source my project in February 2018.


How to cite this work?

If you use this package for your own work, please consider citing it with this piece of BibTeX:

@misc{SMPyBandits,
    title =   {{SMPyBandits: an Open-Source Research Framework for Single and Multi-Players Multi-Arms Bandits (MAB) Algorithms in Python}},
    author =  {Lilian Besson},
    year =    {2018},
    url =     {https://github.com/SMPyBandits/SMPyBandits/},
    howpublished = {Online at: \url{github.com/SMPyBandits/SMPyBandits}},
    note =    {Code at https://github.com/SMPyBandits/SMPyBandits/, documentation at https://smpybandits.github.io/}
}

I also wrote a small paper to present SMPyBandits, and I will send it to JMLR MLOSS. The paper can be consulted here on my website.

A DOI will arrive as soon as possible! I tried to publish a paper on both JOSS and MLOSS.

List of research publications using SMPyBandits

1st article, about a policy aggregation algorithm (a.k.a. model selection)

I designed and added the Aggregator policy, in order to test its validity and performance.

It is a "simple" voting algorithm to combine multiple bandit algorithms into one. Basically, it behaves like a simple MAB bandit just based on empirical means (even simpler than UCB), where arms are the child algorithms A_1 .. A_N, each running in "parallel".

For more details, refer to this file: Aggregation.md and this research article.

PDF: BKM_IEEEWCNC_2018.pdf | HAL notice: BKM_IEEEWCNC_2018 | BibTeX: BKM_IEEEWCNC_2018.bib | Source code and documentation

2nd article, about Multi-players Multi-Armed Bandits

There is another point of view: instead of comparing different single-player policies on the same problem, we can make them play against each other, in a multi-player setting. The basic difference is about collisions: at each time t, if two or more users choose to sense the same channel, there is a collision. Collisions can be handled in different ways, from the base station's point of view and from each player's point of view.
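
For illustration, under the simplest collision model found in the literature (colliding players receive zero reward), one time step looks like the following sketch; the play_one_step helper and the arms' draw() method are assumptions made for this example, not the package's actual simulation loop:

from collections import Counter

def play_one_step(players, channels):
    """One time step under the 'colliding players get zero reward' model."""
    choices = [player.choice() for player in players]  # each player senses a channel
    occupancy = Counter(choices)
    for player, channel in zip(players, choices):
        if occupancy[channel] > 1:
            # two or more users sensed the same channel: collision, no reward
            player.getReward(channel, 0.0)
        else:
            # a lone user on this channel receives the stochastic reward
            player.getReward(channel, channels[channel].draw())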

For more details, refer to this file: MultiPlayers.md and this research article.

PDF: BK__ALT_2018.pdf | HAL notice: BK__ALT_2018 | BibTeX: BK__ALT_2018.bib | Source code and documentation

3rd article, using Doubling Trick for Multi-Armed Bandits

I studied what the Doubling Trick can and cannot do to obtain efficient anytime versions of non-anytime optimal Multi-Armed Bandit algorithms.
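
The trick itself is easy to state: restart a horizon-dependent algorithm on a growing sequence of guessed horizons, for instance the geometric sequence T_i = T_0 Β· 2^i. The wrapper below is a minimal sketch of this idea, assuming a policy factory that takes the horizon as input; it is not the package's actual DoublingTrickWrapper.

class DoublingTrick:
    """Anytime wrapper (sketch): restart a horizon-dependent policy
    with a doubled horizon each time the current guess is exhausted."""
    def __init__(self, make_policy, first_horizon=100):
        self.make_policy = make_policy    # factory: horizon -> fresh policy
        self.horizon = first_horizon      # current guessed horizon T_i
        self.t = 0                        # steps played in the current phase
        self.policy = make_policy(self.horizon)

    def choice(self):
        if self.t >= self.horizon:        # guess exhausted: double and restart
            self.horizon *= 2             # geometric sequence T_i = T_0 * 2^i
            self.t = 0
            self.policy = self.make_policy(self.horizon)
        return self.policy.choice()

    def getReward(self, arm, reward):
        self.t += 1
        self.policy.getReward(arm, reward)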

For more details, refer to this file: DoublingTrick.md and this research article.

PDF: BK__DoublingTricks_2018.pdf | HAL notice: BK__DoublingTricks_2018 | BibTeX: BK__DoublingTricks_2018.bib | Source code and documentation

4th article, about Piece-Wise Stationary Multi-Armed Bandits

With Emilie Kaufmann, we studied the Generalized Likelihood Ratio Test (GLRT) for sub-Bernoulli distributions, and proposed the B-GLRT algorithm for change-point detection in piece-wise stationary one-armed bandit problems. We combined the B-GLRT with the kl-UCB multi-armed bandit algorithm and proposed the GLR-klUCB algorithm for piece-wise stationary multi-armed bandit problems. We prove finite-time guarantees for the B-GLRT and the GLR-klUCB algorithm, and we illustrate their performance with extensive numerical experiments.
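
For reference, here is a pure-Python sketch of the Bernoulli GLR test at the heart of this approach (an illustration, not the paper's exact B-GLRT); the threshold plays the role of the function Ξ²(n, Ξ΄) studied in the paper and is left as a parameter:

import numpy as np

def kl_bernoulli(p, q, eps=1e-12):
    """Binary KL divergence kl(p, q), clipped away from 0 and 1."""
    p, q = np.clip(p, eps, 1 - eps), np.clip(q, eps, 1 - eps)
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

def glr_detects_change(rewards, threshold):
    """Scan all split points s and compare the generalized likelihood
    ratio statistic to the threshold beta(n, delta)."""
    x = np.asarray(rewards, dtype=float)
    n, mean_all = len(x), x.mean()
    for s in range(1, n):
        mu1, mu2 = x[:s].mean(), x[s:].mean()
        glr = s * kl_bernoulli(mu1, mean_all) + (n - s) * kl_bernoulli(mu2, mean_all)
        if glr >= threshold:
            return True   # a change point is detected before time n
    return False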

For more details, refer to this file: NonStationaryBandits.md and this research article.

PDF: BK__COLT_2019.pdf | HAL notice: BK__COLT_2019 | BibTeX: BK__COLT_2019.bib | Source code and documentation


Other interesting things

Single-player Policies

Arms and problems

  • My framework mainly targets stochastic bandits, with arms following Bernoulli, bounded (truncated) or unbounded Gaussian, Exponential, Gamma or Poisson distributions, and more.
  • The default configuration is to use a fixed problem for N repetitions (e.g. 1000 repetitions; use MAB.MAB), but there is also perfect support for "Bayesian" problems, where the mean vector ΞΌ_1, …, ΞΌ_K changes at every repetition (see MAB.DynamicMAB).
  • There is also good support for Markovian problems, see MAB.MarkovianMAB, even though I didn't implement any policies tailored for Markovian problems.
  • I'm actively working on adding very clean support for non-stationary MAB problems, and MAB.PieceWiseStationaryMAB is already working well (see the sketch after this list). Use it with policies designed for piece-wise stationary problems, like Discounted-Thompson, one of the CD-UCB algorithms, M-UCB, SlidingWindowUCB or Discounted-UCB, or SW-UCB#.
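
As announced in the list above, here is a minimal sketch of a piece-wise stationary problem. The "listOfMeans"/"changePoints" layout is an assumption modeled on the package's non-stationary configuration examples, which should be checked for the exact keys.

# A hedged sketch of a piece-wise stationary environment for MAB.PieceWiseStationaryMAB.
from SMPyBandits.Arms import Bernoulli

piecewise_environment = {
    "arm_type": Bernoulli,
    "params": {
        "listOfMeans": [
            [0.1, 0.5, 0.9],        # means of the K = 3 arms until t = 1000
            [0.9, 0.5, 0.1],        # means after the first change point
        ],
        "changePoints": [0, 1000],  # times at which the mean vector changes
    },
}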

Other remarks

  • Everything here is done in an imperative, object-oriented style. The API of the Arms, Policy and MultiPlayersPolicy classes is documented in this file (API.md); a minimal sketch of the Policy API is given after this list.
  • The code is clean, valid for both Python 2 and Python 3.
  • Some pieces of code come from the pymaBandits project, but most of them were refactored. Thanks to the initial project!
  • G.Varoquaux's joblib is used for the Evaluator and EvaluatorMultiPlayers classes, so the simulations are easily parallelized on multi-core machines. (Put n_jobs = -1 or PARALLEL = True in the config file to use all your CPU cores, as is the default.)
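
As promised above, here is a minimal single-player policy written against the documented API. FollowTheLeader is a hypothetical example class; the startGame()/choice()/getReward() method names follow API.md.

import numpy as np

class FollowTheLeader:
    """A deliberately simple policy, sketched to show the documented API."""
    def __init__(self, nbArms):
        self.nbArms = nbArms
        self.startGame()

    def startGame(self):
        """Reset the internal memory at the start of a simulation."""
        self.pulls = np.zeros(self.nbArms, dtype=int)
        self.sums = np.zeros(self.nbArms)

    def choice(self):
        """Pull each arm once, then the one with the best empirical mean."""
        if np.any(self.pulls == 0):
            return int(np.argmin(self.pulls))
        return int(np.argmax(self.sums / self.pulls))

    def getReward(self, arm, reward):
        """Record the reward observed after pulling the chosen arm."""
        self.pulls[arm] += 1
        self.sums[arm] += reward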

How to run the experiments?

See this document: How_to_run_the_code.md for more details (or this documentation page).

TL;DR: this short bash snippet shows how to clone the code, install the requirements for Python 3 (in a virtualenv), and start some simulations, for N=100 repetitions of the default non-Bayesian Bernoulli-distributed problem, with K=9 arms, a horizon of T=10000, and 4 CPUs (it should take about 20 minutes for each simulation):

cd /tmp/  # or wherever you want
git clone -c core.symlinks=true https://GitHub.com/SMPyBandits/SMPyBandits.git
cd SMPyBandits
# just be sure you have the latest virtualenv from Python 3
sudo pip3 install --upgrade --force-reinstall virtualenv
# create and activate the virtualenv
virtualenv venv
. venv/bin/activate
type pip  # check it is /tmp/SMPyBandits/venv/bin/pip
type python  # check it is /tmp/SMPyBandits/venv/bin/python
# install the requirements in the virtualenv
pip install -r requirements_full.txt
# run a single-player simulation!
N=100 T=10000 K=9 N_JOBS=4 make single
# run a multi-player simulation!
N=100 T=10000 M=3 K=9 N_JOBS=4 make moremulti

You can also install it directly with pip from GitHub:

cd /tmp/ ; mkdir SMPyBandits ; cd SMPyBandits/
virtualenv venv
. venv/bin/activate
type pip  # check it is /tmp/SMPyBandits/venv/bin/pip
type python  # check it is /tmp/SMPyBandits/venv/bin/python
pip install git+https://github.com/SMPyBandits/SMPyBandits.git#egg=SMPyBandits[full]
  • If speed matters to you and you want to use algorithms based on kl-UCB, you should take the time to build and install the fast C implementation of the utility KL functions. The default is to use kullback.py, but the C version from Policies/C/ really speeds up the computations; a pure-Python sketch of these utilities is given after this list. Just follow the instructions; it should work well (you need gcc to be installed).
  • And if speed matters, be sure that you have a working version of Numba; it is used by many small functions to (try to automatically) speed up the computations.
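
As mentioned in the list above, here is a pure-Python sketch of what these KL utilities compute: the Bernoulli KL divergence, and the kl-UCB index obtained from it by bisection (a sketch of the standard computation, not the exact code of kullback.py):

from math import log

def klBern(p, q, eps=1e-15):
    """Bernoulli KL divergence kl(p, q), clipped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * log(p / q) + (1 - p) * log((1 - p) / (1 - q))

def klucb_index(mean, pulls, t, precision=1e-6):
    """kl-UCB index: the largest q >= mean such that
    pulls * klBern(mean, q) <= log(t), found by bisection."""
    bound = log(t) / pulls
    low, high = mean, 1.0
    while high - low > precision:
        mid = (low + high) / 2.0
        if klBern(mean, mid) > bound:
            high = mid
        else:
            low = mid
    return (low + high) / 2.0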

Nix

A pinned Nix environment is available for this experimental setup in the nix/pkgs/ directory. From the root of the project:

$ nix-shell
nix-shell$ jupyter_notebook 
nix-shell$ N=100 T=10000 K=9 N_JOBS=4 make single

The following one-liner lets you explore one of the example notebooks from any Nix-enabled machine, without cloning the repository:

$ nix-shell https://github.com/SMPyBandits/SMPyBandits/archive/master.tar.gz --run 'jupyter-notebook $EXAMPLE_NOTEBOOKS/Example_of_a_small_Multi-Player_Simulation__with_Centralized_Algorithms.ipynb'

πŸ’₯ Warning

Contributing?

I don't expect issues or pull requests on this project, but you are welcome to open them.

Contributions (issues, questions, pull requests) are of course welcome, but this project is and will stay a personal environment designed for quick research experiments, and it will never try to be an industry-ready module for applications of Multi-Armed Bandit algorithms. If you want to contribute, please have a look at the CONTRIBUTING.md file, and if you want to be more seriously involved, read the CODE_OF_CONDUCT.md file.

πŸ’₯ TODO

See this file TODO.md, and the issues on GitHub.


πŸ“œ License?

MIT Licensed (file LICENSE).

Β© 2016-2018 Lilian Besson, with help from contributors.

