Skip to content

dourouc05/CombinatorialBandits.jl

master
Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
 
 
src
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

CombinatorialBandits

Project Status: Active – The project has reached a stable, usable state and is being actively developed. The MIT License example workflow Coverage Status codecov.io

This package implements several algorithms to deal with combinatorial multi-armed bandit (CMAB), including the first polynomial-time optimum-regret algorithms: AESCB (described in our paper) and AOSSB (article in press).

See also Bandits.jl, focusing on multi-armed bandits (i.e. not combinatorial).

To install:

]add CombinatorialBandits

Example usage:

using CombinatorialBandits, Distributions

n = 20
m = 8
ε = 0.1
distr = Distribution[Bernoulli(.5 + ((i % 3 == 0) ? ε : -ε)) for i in 1:n]

i = MSet(distr, 8, MSetAlgosSolver())
@time simulate(i, ThompsonSampling(), 200)
@time simulate(i, LLR(), 200)
@time simulate(i, CUCB(), 200)
@time simulate(i, ESCB2(ESCB2Budgeted(.1, true)), 200)

Citing

If you use this package in your research, please cite either article:

@article{cuvelier2021aescb,
    author = {Cuvelier, Thibaut and Combes, Richard and Gourdin, Eric},
    title = {Statistically Efficient, Polynomial-Time Algorithms for Combinatorial Semi-Bandits},
    year = {2021},
    issue_date = {March 2021},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    volume = {5},
    number = {1},
    url = {https://doi.org/10.1145/3447387},
    doi = {10.1145/3447387},
    journal = {Proc. ACM Meas. Anal. Comput. Syst.},
    month = feb,
    articleno = {09},
    numpages = {31},
    keywords = {combinatorial bandits, combinatorial optimization, bandits}
}

@article{cuvelier2021glpg,
  title={Asymptotically Optimal Strategies For Combinatorial Semi-Bandits in Polynomial Time},
  author={Cuvelier, Thibaut and Combes, Richard and Gourdin, Eric},
  journal={arXiv preprint arXiv:2102.07254},
  year={2021}
}

About

Implementation of several algorithms for combinatorial bandits, a kind of reinforcement learning.

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Contributors 4

  •  
  •  
  •  
  •