Fix performance issue with sample weights in model.fit() #17357
Conversation
Ensure that handle_partial_sample_weights receives a list-like instead of a tensor.
Pending another reply from @fchollet.
LGTM, thanks
Unfortunately I'm not able to merge, as I'm seeing a lot of test failures:
Can you take a look?
@nershman Would you please take a look at the test failures?
Hi @nershman Can you please check @haifeng-jin's comments and keep us posted? Thank you!
Hi @nershman Any update on this PR? Please. Thank you!
Hi, I have been so busy with work recently, sorry. I have some notes on this and I'll look deeper into it this weekend.
Modify the partial weights check instead of changing the shape of sample_weight, which causes issues downstream.
Wrapping the weights was causing issues further down in the function. I just added a case to the partial sample check at the beginning instead.
Tests pass on my machine now. (data_adapter_test, training_test, metrics_correctness_test, temporal_sample_weights_correctness_test)
Simplified the logic a bit. The first check will always return true because even (None,) != None.
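The point about `(None,) != None` can be verified directly. A minimal sketch (variable names are illustrative, mirroring the checks discussed in this PR):

```python
sample_weights = (None,)

# A non-empty tuple never compares equal to None, so a guard like
# `sample_weights != None` is True for every tuple input and adds nothing.
assert (sample_weights != None) is True

# The meaningful checks are element-wise over the list/tuple:
any_sample_weight = any(w is not None for w in sample_weights)
partial_sample_weight = any_sample_weight and any(w is None for w in sample_weights)

assert any_sample_weight is False
assert partial_sample_weight is False
```

This is why the redundant equality check can be dropped in favor of the element-wise tests alone.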
I am approving this PR to see if the internal tests pass.
Is it reasonable to try to have NumPy process the weights directly if possible, and in doing so give any non-finite weight the same treatment as if `sample_weights` is None:

```
if sample_weights is None:
    any_sample_weight = False
    partial_sample_weight = False
else:
    try:
        sample_weights_isfinite = np.isfinite(sample_weights)
        any_sample_weight = np.any(sample_weights_isfinite)
        if any_sample_weight:
            partial_sample_weight = not np.all(sample_weights_isfinite)
            if partial_sample_weight:
                new_sample_weights = np.nan_to_num(
                    sample_weights, nan=1.0, posinf=1.0, neginf=1.0
                )
                return new_sample_weights, any_sample_weight, partial_sample_weight
    except TypeError:
        if not isinstance(sample_weights, (list, tuple)):
            any_sample_weight = True
            partial_sample_weight = False
        else:
            any_sample_weight = any(w is not None for w in sample_weights)
            partial_sample_weight = any_sample_weight and any(
                w is None for w in sample_weights
            )
if not any_sample_weight:
    return None, any_sample_weight, partial_sample_weight
if not partial_sample_weight:
    return sample_weights, any_sample_weight, partial_sample_weight
```
Imported from GitHub PR #17357

I previously had a PR open for this, but I guess it got automatically closed when I reverted my commits... Previous PR: #16177

@gbaned @fchollet Since the way DataAdapter works is not clear to me, I went back to `training_utils.handle_partial_sample_weights`. The function is being passed a tensor when it should be passed a list. I think we can simply add a typecheck: if a tensor is passed, we wrap it in a list. This will fix the slowdown and also make sure the function checks that sample_weights correspond to inputs and outputs, instead of checking every single sample in the tensor. i.e.

```
if not isinstance(sample_weights, (list, tuple)):
    sample_weights = (sample_weights,)
```

And this will work fine: when the `[sample_weights]` workaround is used in `model.fit()`, this is exactly what it does; it causes a tuple of one tensor to be passed to the function instead of just a tensor. How does that sound?

Copybara import of the project:

--
083b213 by Sherman <sma232@gmail.com>:

Update training_utils.py

Ensure that handle_partial_sample_weights receives a list-like instead of a tensor.

--
82130f7 by Sherman <sma232@gmail.com>:

Update training_utils.py

Modify the partial weights check instead of changing the shape of sample_weight, which causes issues downstream.

--
a2d5ea9 by Sherman <sma232@gmail.com>:

Update training_utils.py

Simplified the logic a bit. The first check will always return true because even (None,) != None.

--

Merging this change closes #17357

FUTURE_COPYBARA_INTEGRATE_REVIEW=#17357 from nershman:master a2d5ea9
PiperOrigin-RevId: 522355426
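The typecheck described above can be sketched as follows. The helper name is hypothetical (the real change lives inside `handle_partial_sample_weights`), and a NumPy array stands in for a tensor:

```python
import numpy as np

def normalize_sample_weights(sample_weights):
    # Hypothetical helper illustrating the fix: a bare array/tensor is
    # wrapped in a one-element tuple so downstream checks iterate over
    # per-output entries rather than over every sample in the tensor.
    if not isinstance(sample_weights, (list, tuple)):
        sample_weights = (sample_weights,)
    return sample_weights

weights = np.ones(100_000)  # one weight per sample, single-output model
wrapped = normalize_sample_weights(weights)

# Element-wise checks now touch 1 item instead of 100,000 samples,
# which is the source of the reported slowdown.
assert len(wrapped) == 1
assert all(w is not None for w in wrapped)
```

This mirrors what the `[sample_weights]` workaround in `model.fit()` achieves by hand.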
@chuckatkins I think you should make a separate bug report for this; I'm not familiar with what you're trying to fix. But my concern with using NumPy would be creating issues with eager execution.