-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PUBDEV-7197 - Parallel GS threads should call Hyperspace iterator one… #4207
Conversation
@michalkurka I don't understand why this PR is needed. Changes from this PR are already in #4191 . Could you please consider to approve my PR? |
@deil87 will look into it, thanks |
It seems to me your PR is more complex and introduces additional changes and refactoring. This PR is focused on fixing a specific issue. Can we accept @Pscheidl change and include into the next fix release and then merge yours into master? It seems to me that would be the correct approach for the fix release cycle. WDYT? |
@michalkurka we were discussing this issue in my PR. I think it would be right thing to do to discuss it there and with me ( or at least ask me for a review of this PR ). |
@deil87 I would like to include this PR in the fix release because it is focused on just one issue. I do need more time for reviewing your PR as there currently are more pressing matters. Thank you for understanding. |
@michalkurka ok, no problem. But description should reference original PR with corresponding credit. |
@michalkurka also this comment suggests that you were not against merging it into release branch #4191 (comment) . Based on that I have changed base branch of the PR - that could have been avoided I believe. |
@deil87 I am not against anything (yet), I didn't review the PR yet and didn't make up my mind about it. With the change in this PR I now think it might make more sense to merge this into fix release and put your change into master/dev. This is my current thinking. I am also open to including your change into rel-yu if there is a need for it. I see this PR as a no-brainer fix for the reported issue. And we can build on top of that. |
No. This PR is based on observations made during the last weekend spent on benchmarking Parallel GS: https://github.com/Pscheidl/h2o-parallel-grid-search-benchmark In the https://github.com/h2oai/h2o-3/pull/4191/files PR, early stopping logic is being removed from HyperspaceIterator's hashNext method and several other code changes are introduced, including changes breaking API. The author @deil87 seems to add some locking, but completely different than I have (he uses synchronized blocks). As a side effect, he ends up synchronizing the same block of code too, but in a different way using different means. I use my ReentrantLock mutex that has already been there for quite a while. This PR is not based on the above-mentioned PR by @deil87 - it doesn't share any code and has much more narrow focus. I used this code to build a custom H2O jar and use it for the benchmarks and now I'm trying to make it part of our codebase. However, I have no problem to give credit to @deil87. It surely helps the project. Therefore I did.I surely don't want anyone's credit to be forgotten. Thank you @deil87 . |
@Pscheidl we were discussing together synchronisation here #4191 (comment) 5 days ago . Please remove these comments from the code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! Please remove credits from the code.
I wanted to give the missing credit as you pointed out. I'll of course remove the code as you instruct. Not sure what should I do now to correct my mistakes. I'll leave it as-is for now and await further instructions. Not merging the PR for now ! |
… at a time
https://0xdata.atlassian.net/browse/PUBDEV-7197
In the GridSearch class, there is
method which calls hyperspace iterator and asks for next model params. It is perfectly possible (but highly improbable) for two threads accessing the nextModelParams method to obtain the same parameters - hyperspace iterators are not thread safe.
Solution is to use mutually exclusive access.
Currently, the worst thing that can happen is that one model parameters are used twice and thus the same model is built twice. It has no impact on the grid, as all the models are stored in the grid as key -> value, where key is has of the parameters. Therefore, the model will still occur only once in the Grid.