-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PUBDEV-7204: GridSearch over TE parameters is not working #4216
Conversation
@@ -113,7 +113,7 @@ public final Futures delete(Key<Job> job_key, Futures fs, boolean cascade) { | |||
@Override public Lockable atomic(Lockable old) { | |||
_old = old; | |||
if( old != null ) { // Prior Lockable exists? | |||
assert !old.is_wlocked(_job_key) : "Key "+_key+" already locked (or deleted); lks="+Arrays.toString(old._lockers); // No double locking by same job | |||
if(old.is_wlocked(_job_key)) return old; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@michalkurka can we not fail if Lockable is already locked with the same key? This way we can probably include write_lock
inside update
. I don't know if there are situations when we want to update without prior locking. Maybe local updates do not require distributed locks( then it is better not to include write_lock inside update to give flexibility) .
Of course without failing developers are free to abuse this ability and use write_lock unnecessary number of times. But still it is easier than trying to understand why some execution paths have different lock story. I run into such situation : for GBM model parameters process of building a model does not release lock but for TargetEncoder model parameters it does. But it is hard to understand who and where releases the lock in TE path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is outdated. But still interesting to know.
@@ -55,9 +60,8 @@ public void computeImpl() { | |||
disableIgnoreConstColsFeature(); | |||
TargetEncoderModel.TargetEncoderOutput output = new TargetEncoderModel.TargetEncoderOutput(TargetEncoderBuilder.this, _targetEncodingMap, priorMean); | |||
_targetEncoderModel = new TargetEncoderModel(_job._result, _parms, output, tec); | |||
|
|||
_targetEncoderModel.write_lock(_job); | |||
_targetEncoderModel.unlock(_job); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@michalkurka What is the purpose of these two lines? Does it make sure that _targetEncoderModel
is updated globally?
...nsions/target-encoder/src/test/java/ai/h2o/targetencoding/strategy/TargetEncoderRGSTest.java
Outdated
Show resolved
Hide resolved
7a8efb0
to
9ce5a40
Compare
|
||
_targetEncoderModel.write_lock(_job); | ||
_targetEncoderModel.unlock(_job); | ||
_targetEncoderModel = new TargetEncoderModel(_result, _parms, output, tec); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
job._result
was a wrong key to pass to a Model. It was causing key to be removed by Scope. Needed key that is stored in ModelBuilder
's _result
field first, matches keys that are used to retrieve models from the grid ( this way we don't get N null models in the end) and second, this key is being created outside of (before) Scope.enter() and is not being unnecessary deleted from DKV.
I will review this first thing tomorrow - in progress now :) |
@michalkurka I also noticed that
? |
use progressUnits = 1
PUBDEV-7204: Fix grid search for Target Encoder
This PR fixes two issues:
Implementation of TargetEncoderBuilder does not take into account the way how GridSearch is working. Instead of using TargetEncoderBuilder's constructor which sets passed model parameters, GridSearch is using startupOnce constructor with default parameters so that it can later reassign them with values from the grid. Even though initialisation is being called, training frame is not available in default model parameters
Wrong key was passed to TargetEncoderModel
new TargetEncoderModel(_result, ...
. Instead of_result
from ModelBuilder the one from Job was used. See for more details my comment PUBDEV-7204: GridSearch over TE parameters is not working #4216 (comment)