Paper: Better and faster hyper-parameter optimization with Dask #464
Co-authors are Tom Augspurger and Matthew Rocklin. The paper will be built on http://procbuild.scipy.org/download/stsievert-dask-ml-model-selection
Hey @stsievert, I'm one of the reviewers for this.
Some high level things:
After a first pass I would say this is 90–95% of the way to meeting all the criteria.
Thanks for the review @NickleDave! I've pushed a couple changes that make the paper easier to interpret IMO.
Before the individual responses, here are my TODOs:
Good suggestion. Done.
I've edited the paragraphs just under the "Hyperband" section title, and added some words too. I think it better introduces the algorithm. Can you give some more feedback? Thank you.
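For readers unfamiliar with the algorithm being introduced, here is a minimal sketch of successive halving, the subroutine Hyperband runs in brackets of varying aggressiveness. The function names and the toy score function are mine for illustration, not code from the paper (the paper uses Dask-ML's implementation).

```python
import random

def successive_halving(configs, n_iters, eta=3, score=None):
    """One bracket of successive halving: evaluate all configs with a
    small iteration budget, keep the best 1/eta fraction, then repeat
    with eta times more iterations until one config remains."""
    rung = list(configs)
    iters = n_iters
    while len(rung) > 1:
        # Rank configs by score at the current training budget.
        ranked = sorted(rung, key=lambda c: score(c, iters), reverse=True)
        # Promote only the top 1/eta fraction to the next rung.
        rung = ranked[: max(1, len(ranked) // eta)]
        iters *= eta
    return rung[0]

# Toy example: each "config" is a scalar hyper-parameter guess; the
# (hypothetical) score rewards closeness to an assumed optimum of 0.7
# and improves with more iterations.
random.seed(0)
configs = [random.random() for _ in range(27)]

def score(config, iters):
    return (1 - abs(config - 0.7)) * (1 - 1 / (iters + 1))

best = successive_halving(configs, n_iters=1, score=score)
```

With 27 configs and `eta=3`, the rung sizes shrink 27 → 9 → 3 → 1 while the per-config budget grows 1 → 3 → 9 iterations, which is the "spend more on promising configs" trade-off Hyperband exploits.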
#221 should be merged shortly. I'll update the link then.
Loss is all I can show because this is a regression problem (thanks – some typos fixed). I show validation loss.
Showing test loss instead of validation loss would require significant implementation work. I'd rather make the validation set very large so it closely approximates the true error.
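To make the "very large validation set" point concrete, here is a sketch of the idea (not the paper's code, and the 50% hold-out fraction is my own illustrative choice): the larger the held-out set, the closer the validation loss is to the true generalization error.

```python
import random

def split_train_validation(data, val_fraction=0.5, seed=42):
    """Hold out a large validation set so that the validation loss
    closely approximates the true (generalization) error."""
    rng = random.Random(seed)
    shuffled = data[:]          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    # First n_val shuffled examples form the validation set.
    return shuffled[n_val:], shuffled[:n_val]

train, val = split_train_validation(list(range(1000)), val_fraction=0.5)
```

Holding out half the data is unusually generous for training, but for benchmarking model-selection algorithms it buys a low-variance estimate of each model's true error without needing a separate test set.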
Are you asking to compare one more model selection algorithm alongside
I can try an equivalent version of
I think future work will be to implement that example in dask-examples. I'm seeing this as a suggestion for documentation improvements, not as an edit for the paper. Am I reading that right?
(I've also added the appendix with the complete code, so now
Thanks @deniederhut! I've added some simulations, re-organized a bit and edited with a fine-toothed comb. I've pushed my changes, and I think they're ready for another look. They're visible at http://procbuild.scipy.org/download/stsievert-dask-ml-model-selection.
The largest edit I've made is adding the simulations suggested by @NickleDave. This was a good suggestion, and it presents one of my results more cleanly. The resulting figure is a good illustration of this:
This enables cleaner presentation of my two main takeaways:
I am working on improving the graphs that show best score. The graph above
I'm glad the comments were helpful.
@deniederhut how fixed is the deadline of the 25th?
Whoa! This is one of the cooler things I've seen come out of our open review process. Come find me at SciPy and I'll buy both of you a beer
We try not to publicize this, but we build in some wiggle room to the deadlines. If Scott is okay getting the feedback on Wed, then it's okay with me
Yup, fine by me. I'll try to reply on Wednesday night.
I think I'm ready for another review, as mentioned in #464 (comment). I'll put in some work to try to update the image denoising figure before Wednesday (though I believe only the image will change).