Add Stable-Baselines3 RL Example. #1420
Conversation
EDIT: I installed
First of all, thank you so much for doing this awesome work!!!
I left a couple of comments but most of them are cosmetic.
So, before resolving them, let @hvy review this and make sure we are on the same page. 😃
examples/rl/sb3_simple.py
Outdated
try:
    study.optimize(objective, n_trials=N_TRIALS, n_jobs=N_JOBS)
except KeyboardInterrupt:
    pass
Is it possible to use timeout instead of KeyboardInterrupt?
We can have both, no?
This is more intended to create the report even when the user kills the optimization early.
The timeout would be more for the tests, no?
You are absolutely right, and your use of KeyboardInterrupt is really cool.
That being said, it's a bit embarrassing to ask, but could you also set timeout for faster CI runs, like this?
optuna/examples/pytorch_simple.py
Line 136 in e249aa8
study.optimize(objective, n_trials=100, timeout=600)
Co-authored-by: Masaki Kozuki <masaki.kozuki.2014@gmail.com>
Thanks for your comments; I added your suggestions. In the past there was a CI stage for examples, but apparently it was removed recently.
Good catch! Currently, example runs are daily and not checked in the PR's CI:
optuna/.github/workflows/examples.yml
Lines 3 to 5 in e249aa8
So, I think what we need to do is to add your example to:
optuna/.github/workflows/examples.yml
Lines 38 to 39 in e249aa8
Thanks for your quick action. Took a quick skim through your code and it basically LGTM.
examples/rl/sb3_simple.py
Outdated
self.eval_idx += 1
self.trial.report(self.last_mean_reward, self.eval_idx)
# Prune trial if needed
if self.trial.should_prune(self.eval_idx):
As described here, the step argument has been deprecated for a while and has now been removed. The logic remains unchanged in this case, so let's simply omit it.
- if self.trial.should_prune(self.eval_idx):
+ if self.trial.should_prune():
Thanks, I think I wrote that code a year and a half ago... so things have changed a bit ;)
examples/rl/sb3_simple.py
Outdated
# Sometimes, random hyperparams can generate NaN.
# Prune hyperparams that generate NaNs.
print(e)
raise optuna.exceptions.TrialPruned()
Just a tip, but if applicable you can also, after cleaning up, return a float('nan') from the objective function instead of treating it as a pruned trial. Optuna will treat that trial as a failed trial (https://github.com/optuna/optuna/blob/master/optuna/study.py#L737), and samplers/pruners in Optuna will know how to handle it.
Thanks for the swift action! I have one suggestion around the suggest_* methods.
Co-authored-by: Hideaki Imamura <38826298+HideakiImamura@users.noreply.github.com>
Is there an easy way to display the user attributes for each trial in the terminal? Because for an RL researcher, it's not very intuitive to see "gamma=0.002" in the terminal...
Thanks for the swift action! LGTM except for one minor comment!
IMO, the current visualization of the optimized parameters looks good. If we have a sufficient time budget, it would be a good idea to run the training again with the optimized parameters and evaluate the performance of the parameters.
Co-authored-by: Hideaki Imamura <38826298+HideakiImamura@users.noreply.github.com>
Could you merge the master branch? It will resolve the CI failure.
Thanks for your great effort! LGTM!
Sorry for the late review, and again thanks for your effort. LGTM!
I did not check every detail with stable_baselines3, but the usage of Optuna seems good, and I also verified the example locally.
Note: the only thing missing now is to deactivate tests for Python < 3.6 (I don't know where that should be changed).
Motivation
closes #1314
Description of the changes
Add a hyperparameter tuning example in a reinforcement learning context using Stable-Baselines3.