allow passing a custom pmap #63

Merged
merged 5 commits into baggepinnen:master on Aug 31, 2021

Conversation

ericphanson
Contributor

closes #59

With thanks to @omus, who helped me figure out a path here.

I think @phyperopt needs more correctness work to go beyond simple samplers like RandomSampler, but this at least lets us use RandomSampler robustly by passing e.g. Parallelism.robust_pmap or similar.
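For reference, any function with the same calling convention as Distributed.pmap should do. A hypothetical sketch of such a "custom pmap" (the name and the retry policy are made up; how the function actually gets handed to the hyperoptimizer is what this PR's diff defines):

```julia
using Distributed

# Hypothetical example of a custom pmap: same calling convention as
# Distributed.pmap, but retries failed evaluations a few times before giving up.
my_robust_pmap(f, args...; kwargs...) =
    pmap(f, args...; retry_delays = ExponentialBackOff(n = 3), kwargs...)
```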

@@ -157,7 +157,7 @@ function optimize(ho::Hyperoptimizer)
    if e isa InterruptException
        @info "Aborting hyperoptimization"
    else
        rethrow(e)
Contributor Author

This will mess with the stack trace and point to this line instead of the original one, so I changed it to a bare rethrow(), which does the right thing in a catch block.
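For illustration (not the diff itself; `run_trials` is a made-up stand-in for the hyperoptimization loop), the resulting pattern looks like this, with a bare rethrow() preserving the original exception and backtrace:

```julia
run_trials() = error("boom")   # hypothetical stand-in for the real work

try
    run_trials()
catch e
    if e isa InterruptException
        @info "Aborting hyperoptimization"
    else
        rethrow()  # re-throws the current exception, keeping the original backtrace
    end
end
```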

Owner

Aha, I learned something new 😃

src/Hyperopt.jl Outdated
res = $(esc(ex.args[2])) # ex.args[2] = Body of the For loop

res, $(Expr(:tuple, esc.(params[2:end])...))
put!(new_history, (i, res, $(Expr(:tuple, esc.(params[2:end])...))))
Contributor Author

Here, we update new_history (which lives on the manager node) from the worker nodes, so that we don't need to care about the return value of pmap. Why? Because if we passed a custom on_error, it would need to return both a loss and the parameters in order to be iterated later, and the user probably doesn't know they can't just return a loss.
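Roughly the pattern in play (a simplified sketch, not the actual Hyperopt.jl code; the channel name and the toy objective are made up):

```julia
using Distributed
addprocs(2)

# A channel living on the manager (process 1); workers push results into it,
# so the return value of pmap itself no longer matters.
new_history = RemoteChannel(() -> Channel{Tuple}(Inf), 1)

pmap(1:10) do i
    res = i^2                      # stand-in for the loop body / objective
    put!(new_history, (i, res))    # record (iteration, result) on the manager
    res
end

# Drain the channel on the manager once pmap has returned.
history = [take!(new_history) for _ in 1:10]
```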

Owner

Does this comment perhaps deserve to be included in the source as well? I can imagine future me scratching my head over this (Hyperopt.jl has a tendency to require a long warm-up before I can do any maintenance on it 🙃)

@ericphanson
Contributor Author

one sec, I had a better idea for how to do this 😄

@ericphanson
Contributor Author

ericphanson commented Aug 10, 2021

I pushed a new commit that takes a better approach, I think: instead of syncing the history and results with a channel, it syncs the whole hyperoptimizer. I think that will allow workers to observe the history when choosing iterates, for any samplers that make use of it.

I think an even better approach might be to do all the sampling on the manager itself, so the workers only need to run the objective function. But that would need more refactoring and I don't know how to do it off the top of my head.

Two tests are failing locally so I’ll have to look more at why tomorrow.

Edit: they aren't failing here so maybe I fixed it? Could've been a Revise issue, since the workers don't always like Revise.

edit edit: they didn’t run at all, I’m just confused lol

@ericphanson ericphanson reopened this Aug 11, 2021
@ericphanson
Contributor Author

@omus I'd be interested in your thoughts on this new approach if you get a chance to have a look (using a RemoteChannel to access the single hyperoptimizer object, instead of using a RemoteChannel to store the history and results while keeping one hyperoptimizer per worker).
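A sketch of that pattern, with a plain mutable struct standing in for the hyperoptimizer (the names and the toy objective are made up, not the PR's code):

```julia
using Distributed
addprocs(2)

# Hypothetical stand-in for the hyperoptimizer; defined on every process so
# it can be serialized back and forth.
@everywhere mutable struct State
    history::Vector{Int}
end

# A capacity-1 channel on the manager holding the single shared object.
# take!/put! act like a lock: whoever holds the object has exclusive access.
ho_channel = RemoteChannel(() -> Channel{State}(1), 1)
put!(ho_channel, State(Int[]))

pmap(1:10) do i
    state = take!(ho_channel)    # check the shared object out
    push!(state.history, i)      # a worker can observe and extend the history
    put!(ho_channel, state)      # check it back in for the next worker
    i^2                          # stand-in for the objective value
end

final_state = take!(ho_channel)  # retrieve the shared object on the manager
```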

@ericphanson
Copy link
Contributor Author

@baggepinnen I'd like to ask for a re-review if you get a chance, since I ended up changing this a fair bit since you reviewed. The new version seems to fix #62 though, and I was able to run a big 5-6 day training job with this (only 12 iterations, but each one is slow...).

@baggepinnen baggepinnen merged commit 23096d8 into baggepinnen:master Aug 31, 2021
@baggepinnen
Owner

Cool approach! Thanks for fixing, Eric 😃

New release will be out soon

@ericphanson
Contributor Author

awesome, thanks!
