Fix issue #100 #102

Closed
wants to merge 1 commit into from

Conversation

jkterry1 (Contributor) commented Jun 2, 2021

Close #100

```diff
@@ -65,7 +65,7 @@
     choices=["halving", "median", "none"],
 )
 parser.add_argument("--n-startup-trials", help="Number of trials before using optuna sampler", type=int, default=10)
-parser.add_argument("--n-evaluations", help="Number of evaluations for hyperparameter optimization", type=int, default=20)
+parser.add_argument("--n-evaluations", help="Number of timesteps at which the environment is evaluated during for a single hyperparameter combination", type=int, default=20)
```
Member

that's not what it does...
For hyperparameter optimization, you give each trial a maximum budget of n_timesteps and to decide whether to prune or not, you evaluate each trial every n_timesteps // args.n_evaluations

btw, could you merge your two doc PRs together and update the changelog?
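
To make that schedule concrete, here is a tiny illustrative sketch (the numbers are made up, not the zoo's actual defaults):

```python
# Illustrative numbers only: each trial trains for at most n_timesteps and is
# evaluated every eval_freq steps, i.e. at most n_evaluations times in total.
n_timesteps = 100_000    # training budget per trial (--n-timesteps)
n_evaluations = 20       # evaluations per trial (--n-evaluations)

eval_freq = n_timesteps // n_evaluations  # -> evaluate every 5_000 steps
```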

Contributor Author

"you give each trial a maximum budget of n_timesteps and to decide whether to prune or not"

After reading through the code, either I'm misunderstanding you, I'm going crazy, or that's wrong. n_timesteps is how many timesteps an individual training run for a set of hyperparameters runs for, and that's what's used as the reference in pruning, e.g. model.learn(self.n_timesteps, callback=eval_callback)

However, my new sentence did have some grammar errors; this is what it probably should've said:
"Maximum number of training timesteps for each set of hyperparameters during optimization"

Is that better?
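
For reference, a minimal runnable sketch of the call pattern being described (the environment and numbers are purely illustrative, not the zoo's code):

```python
# Hypothetical sketch: n_timesteps is the training budget for a single set of
# hyperparameters; in the actual optimization the eval callback also reports
# results to Optuna and can stop training early (pruning).
import gym
from stable_baselines3 import PPO
from stable_baselines3.common.callbacks import EvalCallback

n_timesteps = 50_000                               # budget for this one trial
n_evaluations = 20
eval_callback = EvalCallback(gym.make("CartPole-v1"), eval_freq=n_timesteps // n_evaluations)

model = PPO("MlpPolicy", gym.make("CartPole-v1"))  # hyperparameters would be sampled per trial
model.learn(n_timesteps, callback=eval_callback)   # trains for at most n_timesteps
```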

Member

"Maximum number of training timesteps for each set of hyperparameters during optimization"

This is what n_timesteps is when doing hyperparameter optimization.

And using the TrialEvalCallback, we can stop each trial early by returning False according to the pruner:

if self.trial.should_prune():

which results in a trial doing fewer than n_timesteps steps.

And to decide whether to prune or not, you have regular evaluations every n_timesteps // args.n_evaluations, so a maximum of args.n_evaluations in total.

If you want, there is a self contained example in the optuna repo or here: https://github.com/araffin/rl-handson-rlvs21/blob/main/optuna/sb3_simple.py
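
For reference, a condensed sketch of such a trial-aware evaluation callback, simplified from the pattern in the linked example (the real implementation has more options):

```python
import optuna
from stable_baselines3.common.callbacks import EvalCallback


class TrialEvalCallback(EvalCallback):
    """Evaluate the agent periodically and report results to Optuna for pruning."""

    def __init__(self, eval_env, trial: optuna.Trial, n_eval_episodes: int = 5, eval_freq: int = 10_000, **kwargs):
        super().__init__(eval_env, n_eval_episodes=n_eval_episodes, eval_freq=eval_freq, **kwargs)
        self.trial = trial
        self.eval_idx = 0
        self.is_pruned = False

    def _on_step(self) -> bool:
        if self.eval_freq > 0 and self.n_calls % self.eval_freq == 0:
            # The parent class runs the evaluation and updates self.last_mean_reward
            super()._on_step()
            self.eval_idx += 1
            self.trial.report(self.last_mean_reward, self.eval_idx)
            # Returning False stops training early, so the trial does fewer than n_timesteps steps
            if self.trial.should_prune():
                self.is_pruned = True
                return False
        return True
```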

jkterry1 closed this Jun 9, 2021
jkterry1 mentioned this pull request Jun 9, 2021
jkterry1 deleted the patch-2 branch December 16, 2021 21:10

Successfully merging this pull request may close these issues.

[Feature Request] Documentation for n_evaluations flag should be improved