
Fix performance issue with sample weights in model.fit() #17357

Merged
merged 3 commits on Apr 6, 2023

Conversation

nershman
Contributor

@nershman nershman commented Dec 19, 2022

I previously had a PR open for this but I guess it got automatically closed when I reverted my commits...

Previous PR: #16177

@gbaned
@fchollet Since the way `DataAdapter` works is not clear to me, I went back to `training_utils.handle_partial_sample_weights`.

The function is being passed a tensor when it should be passed a list. I think we can simply add a type check: if a tensor is passed, we wrap it in a list. This will fix the slowdown and also ensure the function checks that `sample_weights` correspond to inputs and outputs, instead of checking every single sample in the tensor.

i.e.

```
if not isinstance(sample_weights, (list, tuple)):
    sample_weights = (sample_weights,)
```

And this will work fine: when the `[sample_weights]` workaround is used in `model.fit()`, this is exactly what happens. A tuple containing one tensor is passed to the function instead of a bare tensor.
How is that?
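A minimal NumPy sketch (illustrative only, not the Keras source) of why the distinction matters: iterating a bare array in a Python-level check touches every sample, while iterating a one-element tuple touches a single object.

```python
import numpy as np

# One weight per sample, as model.fit() might receive.
weights_tensor = np.ones(1_000_000)

# Treated directly as a "list of per-output weights", the check
# iterates over all one million samples:
slow_check = any(w is None for w in weights_tensor)

# Wrapped in a tuple, the same check inspects a single element:
wrapped = (weights_tensor,)
fast_check = any(w is None for w in wrapped)

print(slow_check, fast_check)  # False False
```

Both checks return the same answer; only the amount of work differs.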

Ensure that handle_partial_sample_weights receives a list-like instead of a tensor.
@nershman nershman changed the title Update training_utils.py Fix perforamnce issue with sample weights in model.fit() Dec 19, 2022
@gbaned gbaned added this to Assigned Reviewer in PR Queue via automation Dec 19, 2022
@gbaned gbaned requested a review from fchollet December 19, 2022 09:21
@google-ml-butler google-ml-butler bot added the keras-team-review-pending Pending review by a Keras team member. label Dec 19, 2022
@nershman nershman changed the title Fix perforamnce issue with sample weights in model.fit() Fix performance issue with sample weights in model.fit() Dec 20, 2022
@haifeng-jin
Contributor

Pending on another reply from @fchollet .
Hold till 01/09/2023 to add the pending label again if no response.

@haifeng-jin haifeng-jin removed the keras-team-review-pending Pending review by a Keras team member. label Dec 22, 2022
Member

@fchollet fchollet left a comment


LGTM, thanks

PR Queue automation moved this from Assigned Reviewer to Approved by Reviewer Dec 27, 2022
@google-ml-butler google-ml-butler bot added kokoro:force-run ready to pull Ready to be merged into the codebase labels Dec 27, 2022
@gbaned gbaned added ready to pull Ready to be merged into the codebase and removed ready to pull Ready to be merged into the codebase labels Jan 4, 2023
@fchollet
Member

fchollet commented Jan 9, 2023

Unfortunately I'm not able to merge, as I'm seeing a lot of test failures:

keras/engine:data_adapter_test
keras/engine:training_test
keras/metrics:metrics_correctness_test
keras/tests:temporal_sample_weights_correctness_test

Can you take a look?

@haifeng-jin haifeng-jin removed the ready to pull Ready to be merged into the codebase label Feb 3, 2023
@haifeng-jin
Contributor

@nershman Would you please take a look at the test failures?
Thanks!

@gbaned
Collaborator

gbaned commented Feb 15, 2023

Hi @nershman Can you please check @haifeng-jin's comments and keep us posted ? Thank you!

@gbaned
Collaborator

gbaned commented Mar 15, 2023

Hi @nershman Any update on this PR? Please. Thank you!

@nershman
Contributor Author

> Hi @nershman Any update on this PR? Please. Thank you!

Hi, I have been so busy with work recently, sorry. I have some notes on this and I'll look deeper into it this weekend.

modify partial weights check instead of changing shape of sample_weight, it causes issues downstream.
@nershman
Contributor Author

nershman commented Mar 26, 2023

Wrapping the weights was causing issues further down in the function. Instead, I added a case to the partial-sample check at the beginning.

```
if not isinstance(sample_weights, (list, tuple)):
    # wrap the weights during the check instead of overwriting
    any_sample_weight = (sample_weights,) is not None and sample_weights is not None
    partial_sample_weight = any_sample_weight and sample_weights is None
else:  # normal check
    any_sample_weight = sample_weights is not None and any(
        w is not None for w in sample_weights
    )
    partial_sample_weight = any_sample_weight and any(
        w is None for w in sample_weights
    )
```

Tests pass on my machine now. (data_adapter_test, training_test, metrics_correctness_test, temporal_sample_weights_correctness_test)
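A standalone sketch of the check described above, with the redundant first comparison dropped as the follow-up commit notes (the function name and wrapper are illustrative, not the exact Keras source):

```python
def check_sample_weights(sample_weights):
    """Return (any_sample_weight, partial_sample_weight) for the input."""
    if not isinstance(sample_weights, (list, tuple)):
        # A bare tensor either is a weight or it isn't; it can never be
        # a "partial" list of per-output weights.
        any_sample_weight = sample_weights is not None
        partial_sample_weight = False
    else:
        # Normal per-output check over the list of weights.
        any_sample_weight = sample_weights is not None and any(
            w is not None for w in sample_weights
        )
        partial_sample_weight = any_sample_weight and any(
            w is None for w in sample_weights
        )
    return any_sample_weight, partial_sample_weight

print(check_sample_weights(None))           # (False, False)
print(check_sample_weights([None, None]))   # (False, False)
print(check_sample_weights(([1.0], None)))  # (True, True)
```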

simplified logic a bit. the first check will always return true because even (None,) != None.
dnerini added a commit to MeteoSwiss/mlpp-lib that referenced this pull request Mar 31, 2023
@gbaned gbaned requested a review from fchollet April 3, 2023 14:33
@google-ml-butler google-ml-butler bot added the keras-team-review-pending Pending review by a Keras team member. label Apr 3, 2023
Contributor

@haifeng-jin haifeng-jin left a comment


I am approving this PR to see if internal tests pass.

@google-ml-butler google-ml-butler bot added kokoro:force-run ready to pull Ready to be merged into the codebase labels Apr 5, 2023
@chuckatkins

Is it reasonable to try to have NumPy process the weights directly if possible, and in doing so give any non-finite weight the same treatment as None? Something like:

```
if sample_weights is None:
    any_sample_weight = False
    partial_sample_weight = False
else:
    try:
        sample_weights_isfinite = np.isfinite(sample_weights)
        any_sample_weight = np.any(sample_weights_isfinite)
        if any_sample_weight:
            partial_sample_weight = not np.all(sample_weights_isfinite)
            if partial_sample_weight:
                new_sample_weights = np.nan_to_num(
                    sample_weights, nan=1.0, posinf=1.0, neginf=1.0
                )
                return new_sample_weights, any_sample_weight, partial_sample_weight
    except TypeError:
        if not isinstance(sample_weights, (list, tuple)):
            any_sample_weight = True
            partial_sample_weight = False
        else:
            any_sample_weight = any(w is not None for w in sample_weights)
            partial_sample_weight = any_sample_weight and any(
                w is None for w in sample_weights
            )

if not any_sample_weight:
    return None, any_sample_weight, partial_sample_weight

if not partial_sample_weight:
    return sample_weights, any_sample_weight, partial_sample_weight
```

copybara-service bot pushed a commit that referenced this pull request Apr 6, 2023
Imported from GitHub PR #17357

I previously had a PR open for this but I guess it got automatically closed when I reverted my commits...

Previous PR: #16177

@gbaned
@fchollet Since the way DataAdapter works is not clear to me I went back to `training_utils.handle_partial_sample_weights`.

The function is being passed a tensor when it should be passed a list. I think we can simply add a type check: if a tensor is passed, we wrap it in a list. This will fix the slowdown and also ensure the function checks that `sample_weights` correspond to inputs and outputs, instead of checking every single sample in the tensor.

i.e.

```
if not isinstance(sample_weights, (list, tuple)):
    sample_weights = (sample_weights,)
```

And this will work fine: when the `[sample_weights]` workaround is used in `model.fit()`, this is exactly what happens. A tuple containing one tensor is passed to the function instead of a bare tensor.
How is that?

Copybara import of the project:

--
083b213 by Sherman <sma232@gmail.com>:

Update training_utils.py

Ensure that handle_partial_sample_weights receives a list-like instead of a tensor.
--
82130f7 by Sherman <sma232@gmail.com>:

Update training_utils.py

modify partial weights check instead of changing shape of sample_weight, it causes issues downstream.
--
a2d5ea9 by Sherman <sma232@gmail.com>:

Update training_utils.py

simplified logic a bit. the first check will always return true because even (None,) != None.

Merging this change closes #17357

FUTURE_COPYBARA_INTEGRATE_REVIEW=#17357 from nershman:master a2d5ea9
PiperOrigin-RevId: 522355426
@copybara-service copybara-service bot merged commit db138de into keras-team:master Apr 6, 2023
PR Queue automation moved this from Approved by Reviewer to Merged Apr 6, 2023
@nershman
Contributor Author

nershman commented Apr 7, 2023

@chuckatkins I think you should open a separate bug report for this; I'm not familiar with what you're trying to fix. But my concern with using NumPy would be creating issues with eager execution.
