Bandits regressors for model selection (new PR to use Github CI/CD) #397
Conversation
As in #391: to put it differently, if there are two arms (A1, A2) and two bandits (B1, B2), and bandit B1 pulls arm A1 while bandit B2 pulls arm A2 at round 0, they won't output the same prediction at round 1 because their internals (in particular …) will differ.
Hi @etiennekintzler. A while ago I had to set the …
Indeed @etiennekintzler, as @smastelini is saying, you need to seed the randomized parts of your code. For instance, instead of calling the module-level random functions directly, you can set self._rng = random.Random(seed), where seed is a parameter of the class. You can then draw every random number through self._rng.
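To make the seeding pattern concrete, here is a minimal sketch; the class and attribute names are illustrative, not the exact ones from this PR:

```python
import random


class EpsilonGreedy:
    """Illustrative only: class and attribute names are assumptions."""

    def __init__(self, epsilon: float = 0.1, seed: int = None):
        self.epsilon = epsilon
        self.seed = seed
        # Seed the randomized parts once; two instances built with the same
        # seed will then behave identically.
        self._rng = random.Random(seed)

    def pull(self, n_arms: int) -> int:
        # Every draw goes through self._rng, never through the module-level
        # random functions.
        if self._rng.random() < self.epsilon:
            return self._rng.randrange(n_arms)  # explore
        return 0  # placeholder for the greedy (exploit) choice
```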
@MaxHalford I misunderstood when you first talked about this in the 1st PR. I thought you meant fixing a different seed for each bandit (for …).
Hello @smastelini, I checked your code and I think I got it now, thank you 👍 Also @MaxHalford, not related to this, but I think the CI was cancelled because of RAM usage (it happens when I run the tests on my computer); you could try to provision more RAM and see if the problem persists.
Codecov Report
@@ Coverage Diff @@
## master #397 +/- ##
==========================================
- Coverage 85.38% 84.75% -0.64%
==========================================
Files 276 276
Lines 13352 13626 +274
==========================================
+ Hits 11401 11549 +148
- Misses 1951 2077 +126
Continue to review full report at Codecov.
Looking good!
Thanks :) Regarding the example in the docstring, which dataset in river should I use?
Not sure, but something like logistic regression / Hoeffding tree / GaussianNB sounds good.
You are talking about methods, no? (I was talking about datasets.) Regarding methods, I would have liked to use online PCA (to do selection on the number of components), but it doesn't seem to exist yet in river (I just saw it on #3).
Lol my bad: use Phishing for binary classification, ImageSegments for multi-class, and TrumpApproval for regression :). Indeed, we haven't implemented online PCA yet :)
Hello @MaxHalford!
Yes I did, this is what I meant when I was talking about the "explore each arm first" strategy in my previous message. After thinking and tinkering with the rewarding system, I found an alternative that seems to work well, which is to get rid of the online scaling for the reward and to use strong discounting for the first n rewards. More specifically:
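The exact scheme isn't reproduced in this thread, but a rough sketch of the idea (down-weighting the first n rewards so early, noisy pulls don't dominate the running averages) could look like this, with the function name and the linear ramp both being assumptions:

```python
def discounted_reward(raw_reward: float, t: int, warmup: int = 100) -> float:
    """Down-weight rewards observed during the first `warmup` rounds.

    `t` is the current round. The linear ramp below is only one possible
    form of "strong discounting"; the PR may use a different one.
    """
    if t < warmup:
        return raw_reward * (t / warmup)
    return raw_reward
```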
The other benefits of abandoning the online scaler for the reward are: …
However, the main drawback is that some bandit models (like UCB) make hypotheses about the distribution of the reward (e.g. sub-Gaussian), which might differ somewhat from what we obtain using the sigmoid.
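For reference, the kind of index those hypotheses underpin is the textbook UCB1 rule, sketched below; this is not necessarily the exact variant implemented in the PR:

```python
import math


def ucb_index(avg_reward: float, n_pulls: int, t: int, delta: float = 2.0) -> float:
    # Classic UCB1-style index: mean reward plus an exploration bonus that
    # shrinks as the arm gets pulled more often. The theory behind the bonus
    # assumes bounded / sub-Gaussian rewards, which is why reshaping rewards
    # with a sigmoid can change the algorithm's behavior.
    if n_pulls == 0:
        return float("inf")  # force every arm to be tried at least once
    return avg_reward + math.sqrt(delta * math.log(t) / n_pulls)
```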
…le to avoid DocTestFailure
…t_after in __init__
Looking good!
One important thing: since you started this PR we've added more thorough code style rules. Essentially, you'll want to run pre-commit install --hook-type pre-push on your workstation. This will run black + flake8 before you git push. You can fix the code style for black by running black river --config .black.
I had done it (as suggested in CONTRIBUTING.md) but got an error when pushing from local. Thinking the issue was on my side, I pushed directly from GitHub (as 7bea5bb was just a name change) and got the same error as the one I had locally, that is:
would reformat /home/runner/work/river/river/river/expert/bandit.py
Oh no! 💥 💔 💥
1 file would be reformatted, 311 files would be left unchanged.
flake8...................................................................Failed
- hook id: flake8
- exit code: 1
river/utils/math.py:335:12: E741 ambiguous variable name 'l'
river/expert/bandit.py:143:5: E303 too many blank lines (2)
Error: Process completed with exit code 1.
Does this mean that E741 and E303 are blocking and that I have to resolve them myself? Also, I don't know what to do with
>>> for x, y in dataset:
...     bandit.learn_one(x=x, y=y)
since learn_one returns the model itself.
Yep, those are blocking. It seems that you also have some …
Alas, you have to assign the output to a variable. Typically I would write bandit = bandit.learn_one(x=x, y=y).
Ok!
Thank you :) It works well after the update.
Yep, or you could just remove the related test (that will run for every …).
It's a bit more complicated than that because we need to update every example.
Oh ok, I see. Should I set it to
>>> for x, y in dataset:
...     bandit = bandit.learn_one(x=x, y=y)
and keep …?
I wouldn't wait for the changes, it'll take some time :)
…_iter extra line; make average_reward 'public'
Hello @MaxHalford! There are tests not related to this PR that are failing. They are located in … Also, there is another test that results in an internal error (locally as well as in the CI/CD pipeline). It happens right after the warning:
_OptionError: invalid module name: 'sklearn.metrics.classification'
Hey mate! I fixed the tests :). It was all down to the new release of scikit-learn (0.24).
Great! Can you push it? :D
Cool, all tests pass now! I've nothing else to add for now, so you can review and merge if it looks good to you :) The main change since your last review is the … Also, I don't know how you want to frame it, but I think it could be good to mark this bandit class as experimental, since this implementation doesn't really stem from theory (hence the importance of having researchers' input on this issue).
Looks really good IMO! Ideally, I would like it if you could add some more comments to the internal functions. Maybe adding a docstring to the Bandit class would help. You could quickly describe the purpose of each function and how they work together. Not sure if I'm clear :). I just want newcomers to be able to grok how we're framing this. Then again, the code is really clear.
# Predict and learn with the chosen model
chosen_model = self[chosen_arm]
y_pred = chosen_model.predict_one(x)
This might be predict_proba for a classifier, right?
Yes, you're right, I didn't anticipate that. Also, for classifiers the whole scaling thing is less of a problem (since the target is {0, 1}).
I guess we can merge now and take care of the classification aspect in another PR?
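A possible shape for that follow-up, sketched under the assumption that the dispatch happens where the chosen model predicts (base.Classifier and predict_proba_one exist in river; the helper itself is hypothetical):

```python
from river import base


def _predict(chosen_model, x):
    # For classifiers we may want class probabilities rather than a hard
    # label, e.g. to build a smoother reward. This dispatch is illustrative;
    # the PR deliberately leaves the classification case for later.
    if isinstance(chosen_model, base.Classifier):
        return chosen_model.predict_proba_one(x)
    return chosen_model.predict_one(x)
```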
Thank you very much! I added some docstrings.
Yes, I think you can merge 👌! The classification aspect in …
Description
The PR introduces bandits (epsilon-greedy and UCB) for model selection (see issue #270). The PR concerns only regressors, but I can add classifiers in a subsequent PR.
The use of the classes is straightforward:
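For instance (a sketch assuming the epsilon-greedy wrapper is exposed as expert.EpsilonGreedyRegressor with models, epsilon, and seed parameters, as suggested by the PR title and discussion):

```python
from river import datasets, expert, linear_model, optim, preprocessing

# Candidate models: the same pipeline with different learning rates.
models = [
    preprocessing.StandardScaler()
    | linear_model.LinearRegression(optimizer=optim.SGD(lr=lr))
    for lr in (0.0001, 0.001, 0.01, 0.1)
]

bandit = expert.EpsilonGreedyRegressor(models=models, epsilon=0.1, seed=42)

for x, y in datasets.TrumpApproval():
    y_pred = bandit.predict_one(x)
    bandit = bandit.learn_one(x=x, y=y)  # learn_one returns the model
```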
There are convenience methods such as:
- percentage_pulled: to get the percentage each arm was pulled
- best_model: to return the model with the highest average reward

Also, I added a method add_models so the user can add models on the fly (see the sketch below for how these might be used).

I am also working on a notebook that studies the behavior of the bandits for model selection. The notebook also includes Exp3, which seems promising but has numerical stability issues and yields counter-intuitive results (see section 3 of the notebook). That's why I kept it out of this PR. More generally, the performance of UCB and epsilon-greedy is rather good, but there seems to be some variance in it.
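Continuing the sketch above, the convenience methods listed here might be used like so (names taken from their descriptions; exact signatures are assumptions):

```python
# Share of pulls per arm (assumed to be a property).
print(bandit.percentage_pulled)

# Model with the highest average reward so far.
print(bandit.best_model)

# Add extra candidate models on the fly.
bandit.add_models([preprocessing.StandardScaler() | linear_model.LinearRegression()])
```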
Improvements
It's still WIP on the following points: