Demo of using skopt "ask-and-tell" API for multi-armed bandit problems #407
Comments
I think we need more examples and narrative documentation, so we should add it to the examples section. Is there someone around who knows enough about multi-armed bandits and the like to judge the content?
After a first read-through, I think we should include a shortened version in the docs and link to the long version.
Hi, I will write a smaller version, with a manual implementation of UCB and Thompson sampling, as short as possible, and try to make it self-contained.
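For concreteness, here is a minimal sketch of what such a manual implementation could look like (not the notebook's actual code): UCB1 and Thompson sampling on Bernoulli arms, with hypothetical function names and a toy reward model chosen for illustration.

```python
import math
import random

def ucb1(means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms with the given true means; return total reward."""
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k   # number of pulls per arm
    sums = [0.0] * k   # cumulative reward per arm
    total = 0.0
    for t in range(1, horizon + 1):
        if t <= k:      # pull each arm once first, so every index is defined
            arm = t - 1
        else:           # then pick the arm maximizing the UCB index
            arm = max(range(k), key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total += reward
    return total

def thompson(means, horizon, seed=0):
    """Thompson sampling with Beta(1, 1) priors on Bernoulli arms."""
    rng = random.Random(seed)
    k = len(means)
    alpha = [1] * k    # Beta prior: successes + 1
    beta = [1] * k     # Beta prior: failures + 1
    total = 0.0
    for _ in range(horizon):
        # Sample one plausible mean per arm, pull the arm whose sample is largest.
        arm = max(range(k), key=lambda a: rng.betavariate(alpha[a], beta[a]))
        reward = 1.0 if rng.random() < means[arm] else 0.0
        alpha[arm] += int(reward)
        beta[arm] += 1 - int(reward)
        total += reward
    return total
```

Both policies spend most of their pulls on the best arm once its estimate separates from the others, which is what makes them competitive baselines for the comparison below.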
Ok. Making it self-contained is good, especially if it otherwise involves code that people can't get :) We should find someone who understands both SMBO and the more classic multi-armed bandit methods so we have some kind of sanity check on the comparisons (comparing A with B almost always upsets someone if you aren't careful :) )
In that small notebook, I compare classical online bandit algorithms (UCB and Thompson sampling) against black-box Bayesian optimization.
Both perform about equally well, with Bayesian optimization being slightly better (on some toy problem) but really slower.
Thanks a lot for the highly detailed notebook. I am not too familiar with multi-armed bandit methods either, but on a quick read, I find that the goal of multi-armed bandits is to maximize the expected cumulative reward after n iterations, while Bayesian optimisation finds the argmin of the expensive function. The two seem to solve different problems, right? Or is the rationale that, in spite of SMBO solving a different problem, it performs as well?
@MechCoder Indeed, the multi-armed bandit problem is a reinforcement learning problem where the task is to maximize the expected reward (or minimize the regret) over an (unknown) horizon n. This is essentially done by finding the best arm and not pulling the sub-optimal arms. Here, the Bayesian optimization task aims at finding the optimal arm by trying all of them, which also results in a small regret. I think what is interesting is to show that black-box Bayesian optimization can also be applied to reinforcement learning experiments, even if the cost in terms of computational complexity is not worth the improvement in terms of reward gain / regret reduction.
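The distinction between the two objectives can be made concrete with a tiny helper (an illustrative sketch, not from the notebook): cumulative pseudo-regret charges every pull of a sub-optimal arm by its gap to the best mean, whereas a pure optimizer only cares about eventually identifying the argmax.

```python
def cumulative_regret(means, pulls):
    """Cumulative pseudo-regret of a sequence of arm pulls:
    the sum over time of (best mean - mean of the pulled arm)."""
    best = max(means)
    return sum(best - means[arm] for arm in pulls)

# A policy that locks onto the best arm pays nothing per pull,
# while uniform exploration pays a constant cost on every bad pull:
always_best = cumulative_regret([0.1, 0.9], [1, 1, 1, 1])   # zero regret
alternating = cumulative_regret([0.1, 0.9], [0, 1, 0, 1])   # regret grows with horizon
```

A bandit algorithm is judged by this running sum, so it must limit exploration; a black-box optimizer that happily spends many evaluations on bad arms can still return the right argmax while accumulating much more regret along the way.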
Hi,
For my research, I wrote this small notebook, showing how to use the (awesome) "ask-and-tell" API for skopt `Optimizer`s. It shows that it is easy to compare a black-box Bayesian optimization algorithm, here based on a `RandomForestRegressor`, on multi-armed bandit problems 🎰. It is compared against classical online algorithms like UCB and Thompson sampling.

I don't know if it can be interesting for `skopt`'s documentation, but I wanted to let you know. ✨ If you want to include it / point to it, don't hesitate to ask me to rewrite, clean, or improve the notebook! The most important part is here 🛠️
Thanks 😃.