Can't run more than n trials with trialConcurrency=n > 1 #5689
Comments
I encountered the same problem. Sometimes it stopped after about ten trials, and sometimes after more than 100. I haven't found the cause.
I also have similar problems.
I have the same issue as well and am looking forward to a solution. Environment:
Same issue.
I set trial_concurrency=8 and it always stopped after 10 to 14 trials.
Same issue. I set trial_concurrency=16 and it stopped at ~20 trials; the dispatcher was terminated.
Same issue here on the latest version of NNI. The number of trials it gets through before failing seems random each time. Always
in I think the problem went away after downgrading to
I faced the same problem. In my case, a stopgap solution is to use the "Anneal" tuner instead of the "TPE" tuner.
I found that anything above 2.5 gives me the problem. Version 2.5 has been fine up to the hard-coded memory limit (roughly 45k trials).
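For anyone who wants to try the tuner swap mentioned above, here is a minimal sketch of what it might look like in an NNI v2-style experiment config. The file names, trial command, trial counts, and `optimize_mode` value are placeholders, not taken from the original report, and this is only a possible stopgap, not a confirmed fix.

```yaml
# Hypothetical experiment config; adjust paths and values to your own setup.
searchSpaceFile: search_space.json   # placeholder search space file
trialCommand: python3 trial.py       # placeholder trial script
trialConcurrency: 2                  # any n > 1 triggered the failure in this thread
maxTrialNumber: 50                   # placeholder limit
tuner:
  name: Anneal                       # swapped in for TPE as a stopgap
  classArgs:
    optimize_mode: maximize          # assumption: use minimize if your metric is a loss
trainingService:
  platform: local
```

If the experiment still stops after roughly n trials with this config, the tuner is probably not the only factor, and pinning NNI to 2.5 (as reported above) may be the other thing to test.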
Describe the issue:
When I set trialConcurrency > 1, NNI fails out with
When trialConcurrency = n > 1, NNI runs n trials and then fails out with this error. This happens for every value of n I've tried (2, 5, 10, 100). With trialConcurrency=1, there are no problems.
Environment:
Configuration:
I haven't created a minimal reproducible example yet. I'm hoping someone will recognize this problem, since it seems pretty basic and may just be a version issue somewhere.