Contextual: Multi-Armed Bandits in R
R package facilitating the simulation and evaluation of context-free and contextual Multi-Armed Bandit policies.
The package has been developed to:
- Ease the implementation, evaluation and dissemination of both existing and new contextual Multi-Armed Bandit policies.
- Introduce a wider audience to the advanced sequential decision strategies offered by contextual bandit policies.
To install contextual from CRAN:
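This is the standard CRAN installation call. The package name is taken from the sentence above, and loading it afterwards with `library()` is shown only as the usual next step.

```r
# Install the released version from CRAN, then attach it.
install.packages("contextual")
library(contextual)
```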
To install the development version (requires the devtools package):
When working on or extending the package, clone its GitHub repository, then do:
    install.packages("devtools")
    devtools::install_deps(dependencies = TRUE)
    devtools::build()
    devtools::reload()
clean and rebuild...
See the demo directory for practical examples and replications of both synthetic and offline (contextual) bandit policy evaluations.
How to replicate figures from two introductory context-free Multi-Armed Bandits texts:
- Replication of figures from Sutton and Barto, "Reinforcement Learning: An Introduction", Chapter 2
- Replication of figures from "Bandit algorithms for website optimization" by John Miles White
Basic, context-free multi-armed bandit examples:
- Basic MAB Epsilon Greedy evaluation
- Synthetic MAB policy comparison
- Replication of Eckles & Kaptein (Bootstrap Thompson Sampling)
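For readers new to the topic, the idea behind an Epsilon Greedy evaluation on a Bernoulli bandit can be sketched in base R. This is a minimal illustration that does not use the package's classes; the function name `simulate_epsilon_greedy`, the arm weights, and the horizon are all assumptions chosen for the example.

```r
# Minimal epsilon-greedy simulation on a Bernoulli bandit (base R only).
# All names are illustrative; none are part of the contextual API.
simulate_epsilon_greedy <- function(weights, horizon, epsilon = 0.1) {
  k      <- length(weights)
  pulls  <- integer(k)        # number of times each arm was chosen
  means  <- numeric(k)        # running mean reward per arm
  chosen <- integer(horizon)  # arm chosen at each time step
  for (t in seq_len(horizon)) {
    if (runif(1) < epsilon) {
      arm <- sample.int(k, 1)                    # explore: uniformly random arm
    } else {
      best <- which(means == max(means))         # exploit: best estimate so far
      arm  <- best[sample.int(length(best), 1)]  # break ties at random
    }
    reward     <- rbinom(1, 1, weights[arm])     # Bernoulli reward
    pulls[arm] <- pulls[arm] + 1L
    means[arm] <- means[arm] + (reward - means[arm]) / pulls[arm]
    chosen[t]  <- arm
  }
  # Expected regret: gap between the best arm's weight and the chosen arm's weight.
  list(pulls = pulls,
       means = means,
       cumulative_regret = cumsum(max(weights) - weights[chosen]))
}

set.seed(42)
result <- simulate_epsilon_greedy(weights = c(0.6, 0.1, 0.1), horizon = 1000)
result$pulls
```

Plotting `result$cumulative_regret` against the time step gives the kind of regret curve that such evaluations typically report.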
Examples of both synthetic and offline contextual multi-armed bandit evaluations:
Some more extensive vignettes to get you started with the package:
- Getting started: running simulations
- Offline evaluation: replication of Li et al. (2010)
- Class reference
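A common approach to offline evaluation is the replay method: step through logged events that were collected with uniformly random arm choices, and keep only the events where the policy under evaluation picks the same arm as the log. The base R sketch below illustrates that idea with an epsilon-greedy policy. It is an assumption-laden illustration, not the package's own evaluator; `replay_evaluate` and the synthetic log are hypothetical.

```r
# Replay-style offline evaluation of an epsilon-greedy policy (base R only).
# `logged` must have columns `arm` and `reward`, logged under uniformly random arms.
replay_evaluate <- function(logged, k, epsilon = 0.1) {
  pulls        <- integer(k)
  means        <- numeric(k)
  matched      <- 0L
  total_reward <- 0
  for (i in seq_len(nrow(logged))) {
    # The policy chooses an arm without seeing the logged outcome.
    if (runif(1) < epsilon) {
      arm <- sample.int(k, 1)
    } else {
      arm <- which.max(means)
    }
    # Only events where the policy agrees with the log are used.
    if (arm == logged$arm[i]) {
      reward       <- logged$reward[i]
      pulls[arm]   <- pulls[arm] + 1L
      means[arm]   <- means[arm] + (reward - means[arm]) / pulls[arm]
      matched      <- matched + 1L
      total_reward <- total_reward + reward
    }
  }
  list(matched = matched,
       mean_reward = if (matched > 0) total_reward / matched else NA_real_)
}

# Synthetic log with uniformly random arm choices (illustrative weights).
set.seed(7)
n          <- 5000
weights    <- c(0.6, 0.1, 0.1)
logged_arm <- sample.int(3, n, replace = TRUE)
logged     <- data.frame(arm = logged_arm,
                         reward = rbinom(n, 1, weights[logged_arm]))
result <- replay_evaluate(logged, k = 3)
result$mean_reward
```

Because non-matching events are discarded, only a fraction of the log contributes to the estimate, so offline evaluations of this kind need far more logged events than the intended horizon.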
Paper offering a general overview of the package's structure & API:
Overview of core classes
Policies and Bandits
Overview of contextual's growing library of contextual and context-free bandit policies:
- CMAB Naive Epsilon-Greedy
- LinUCB (General, Disjoint, Hybrid)
- Linear Thompson Sampling
- Lock-in Feedback (LiF)
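To give a feel for what a contextual policy such as disjoint LinUCB does, here is a compact base R sketch. It keeps one ridge-regression model per arm and picks the arm with the highest upper confidence bound. The function `linucb_disjoint`, the `reward_fn` callback, and the synthetic coefficients are hypothetical and unrelated to the package's implementation.

```r
# Disjoint LinUCB sketch (base R only): one linear model per arm.
linucb_disjoint <- function(contexts, reward_fn, k, alpha = 1) {
  d      <- ncol(contexts)
  A      <- replicate(k, diag(d), simplify = FALSE)     # per-arm d x d matrices
  b      <- replicate(k, numeric(d), simplify = FALSE)  # per-arm reward-weighted sums
  chosen <- integer(nrow(contexts))
  for (t in seq_len(nrow(contexts))) {
    x <- contexts[t, ]
    ucb <- vapply(seq_len(k), function(a) {
      A_inv <- solve(A[[a]])
      theta <- A_inv %*% b[[a]]                          # ridge estimate for arm a
      as.numeric(crossprod(theta, x) +
                   alpha * sqrt(crossprod(x, A_inv %*% x)))
    }, numeric(1))
    arm       <- which.max(ucb)
    reward    <- reward_fn(arm, x)
    A[[arm]]  <- A[[arm]] + tcrossprod(x)                # add outer product x x'
    b[[arm]]  <- b[[arm]] + reward * x
    chosen[t] <- arm
  }
  list(chosen = chosen, A = A, b = b)
}

# Synthetic example: two arms, two features, illustrative true coefficients.
set.seed(1)
n          <- 500
contexts   <- matrix(rnorm(n * 2), n, 2)
true_theta <- list(c(1, 0), c(0, 1))
reward_fn  <- function(arm, x) sum(true_theta[[arm]] * x) + rnorm(1, sd = 0.1)
fit <- linucb_disjoint(contexts, reward_fn, k = 2, alpha = 1)
table(fit$chosen)
```

Inverting each matrix on every step keeps the sketch short; a practical implementation would usually update the inverse incrementally instead.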
Overview of contextual's bandit library:
| Basic Synthetic | Contextual Synthetic | Offline | Continuous |
|---|---|---|---|
| Basic Bernoulli Bandit, Basic Gaussian Bandit | | | |
Alternative parallel backends
By default, contextual uses R's built-in parallel package to evaluate multiple agents in parallel over repeated simulations. See the demo/alternative_parallel_backends directory for several alternative parallel backends:
- Microsoft Azure VMs using doAzureParallel.
- Redis using doRedis.
- MPI (Message-Passing Interface) using Rmpi and doMPI.
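The general pattern of spreading repeated simulations over local worker processes with R's parallel package can be sketched as follows. This is a rough illustration of the idea, not the package's internal code; `run_simulation` and its parameters are hypothetical.

```r
library(parallel)

# One independent simulation run: play `horizon` rounds of a two-armed
# Bernoulli bandit with uniformly random arm choices and return the total reward.
# `sim_id` is unused but required, since parLapply passes each element of its input.
run_simulation <- function(sim_id, horizon = 100, weights = c(0.6, 0.1)) {
  arms <- sample.int(length(weights), horizon, replace = TRUE)
  sum(rbinom(horizon, 1, weights[arms]))
}

cl <- makeCluster(2)                   # two local worker processes
clusterSetRNGStream(cl, iseed = 123)   # independent, reproducible RNG streams
totals <- parLapply(cl, 1:20, run_simulation)
stopCluster(cl)

mean(unlist(totals))
```

The alternative backends listed above follow the same idea but distribute the work over remote machines rather than local processes.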
- Robin van Emden: author, maintainer
- Maurits Kaptein: supervisor
If you encounter a clear bug, please file a minimal reproducible example on GitHub.