Better per_run_time_limit default #764

Closed
mfeurer opened this issue Jan 20, 2020 · 5 comments

Comments

@mfeurer
Contributor

mfeurer commented Jan 20, 2020

Currently, the hyperparameter per_run_time_limit is set to 360s. However, it should default to 1/10 of the total time limit.

@brunompacheco

@mfeurer Should this take into account the number of jobs? Then per_run_time_limit would be 1/10 of n_jobs * time_left_for_this_task.

@mfeurer
Contributor Author

mfeurer commented Jun 19, 2020

> So per_run_time_limit would be 1/10 of n_jobs * time_left_for_this_task.

Good catch, yes, it should be. On the other hand, it would also be good to enforce that at least two models are built in each process, so that the ensemble has some models to work with. If a single run could use up the entire time limit, as would be the case with 10 processes, the ensemble script would not find any models and would terminate at the same time Auto-sklearn terminates, leaving us without an ensemble.
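
To make the concern concrete, here is a small sketch of the naive rule (1/10 of n_jobs * time_left_for_this_task) evaluated for an assumed 3600 s budget. The helper name `naive_per_run_limit` is made up for illustration and is not part of Auto-sklearn.

```python
def naive_per_run_limit(time_left_for_this_task: int, n_jobs: int) -> float:
    """Hypothetical helper: 1/10 of the aggregate budget across all workers."""
    return n_jobs * time_left_for_this_task / 10


time_left = 3600  # assumed total wall-clock budget in seconds
for n_jobs in (1, 2, 5, 10):
    limit = naive_per_run_limit(time_left, n_jobs)
    # Worst case: how many full-length runs one worker can finish in the budget.
    runs_per_worker = int(time_left // limit)
    print(
        f"n_jobs={n_jobs:2d}  per_run_time_limit={limit:6.0f}s  "
        f"worst-case runs per worker={runs_per_worker}"
    )
```

With 10 workers the per-run limit equals the whole budget, so in the worst case each worker finishes at most one run, and only as the overall time limit expires.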

@franchuterivera
Contributor

Hello! As of now, two arguments control time limit enforcement, and both of them are agnostic to the number of jobs:

  • time_left_for_this_task

  • per_run_time_limit

So if n_jobs > 1, the autosklearn code creates several automl objects, each of which calls SMAC with time_left_for_this_task and per_run_time_limit exactly as provided by the user.

So shouldn't per_run_time_limit be 1/10 of time_left_for_this_task instead of 1/10 of n_jobs * time_left_for_this_task? Otherwise, we should also clarify what the implication is for the other argument, time_left_for_this_task.

@mfeurer
Contributor Author

mfeurer commented Jun 19, 2020

Actually, I would suggest making per_run_time_limit None by default and then computing it based on time_left_for_this_task. Only if the user passes this argument would we use it (and only cap it to be lower than time_left_for_this_task). Furthermore, I would not change the meaning of time_left_for_this_task and would keep it exactly the way it is.

In my opinion, the per_run_time_limit default should be n_jobs * time_left_for_this_task / 10 to ensure that a total of ten configurations are evaluated (which is also the default now). To also make sure that the ensemble script can load some models, I suggest adding an extra constraint to the automatic equation: each worker should have time to evaluate two configurations.
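
A minimal sketch of the proposed default, under stated assumptions: the function name `default_per_run_time_limit` is hypothetical, "time to evaluate two configurations" is interpreted as capping the limit at half of time_left_for_this_task, and the floor of 1 second is an added safeguard. This is not necessarily the logic that was merged.

```python
from typing import Optional


def default_per_run_time_limit(
    time_left_for_this_task: int,
    n_jobs: int = 1,
    per_run_time_limit: Optional[int] = None,
) -> int:
    """Hypothetical helper illustrating the default proposed in this thread."""
    if time_left_for_this_task <= 0 or n_jobs < 1:
        raise ValueError("time_left_for_this_task must be > 0 and n_jobs >= 1")

    # A user-supplied value is respected, only capped by the total budget.
    if per_run_time_limit is not None:
        return min(per_run_time_limit, time_left_for_this_task)

    # Aim for roughly ten configurations in total across all workers.
    ten_configs_total = n_jobs * time_left_for_this_task // 10
    # Assumption: leave each worker enough time for two full-length runs.
    two_per_worker = time_left_for_this_task // 2

    # Floor of 1 second so that tiny budgets never yield a limit of 0.
    return max(1, min(ten_configs_total, two_per_worker))


print(default_per_run_time_limit(3600, n_jobs=1))
print(default_per_run_time_limit(3600, n_jobs=10))
```

In this sketch, the two-runs-per-worker constraint only becomes the binding one when n_jobs exceeds 5; below that, the ten-configurations rule determines the limit.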

@mfeurer
Contributor Author

mfeurer commented Aug 4, 2020

Closing this, as the change was merged via #884.

@mfeurer mfeurer closed this as completed Aug 4, 2020