
Avoid NaN values in torch.cdist backward for p<1 #45720

Closed

Conversation

kurtamohler (Collaborator)

Fixes #36493
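For context, here is a small reproduction sketch of the failure mode described in the linked issue. The concrete values (`p=0.5`, a difference of `1e-46` that underflows to exactly zero in float32) are illustrative choices, not taken from the PR:

```python
import torch

# One coordinate differs by an amount that underflows to 0.0 in float32
# (the issue mentions deltas below ~1e-45); the other differs by 1.0 so
# the distance itself is nonzero.
x = torch.tensor([[0.0, 1.0]], requires_grad=True)
y = torch.tensor([[1e-46, 2.0]])

d = torch.cdist(x, y, p=0.5)
d.sum().backward()

print(x.grad)  # contains NaN on builds without this fix; finite afterwards
```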

codecov bot commented Oct 2, 2020

Codecov Report

Merging #45720 into master will decrease coverage by 0.07%.
The diff coverage is 61.70%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #45720      +/-   ##
==========================================
- Coverage   68.61%   68.53%   -0.08%     
==========================================
  Files         405      409       +4     
  Lines       52045    52553     +508     
==========================================
+ Hits        35710    36018     +308     
- Misses      16335    16535     +200     
| Impacted Files | Coverage Δ |
|---|---|
| torch/_torch_docs.py | 100.00% <ø> (ø) |
| torch/cuda/amp/grad_scaler.py | 23.94% <0.00%> (-0.46%) ⬇️ |
| torch/nn/modules/loss.py | 97.76% <ø> (ø) |
| torch/onnx/symbolic_opset9.py | 35.56% <ø> (ø) |
| torch/overrides.py | 97.08% <ø> (ø) |
| torch/quantization/fx/pattern_utils.py | 87.23% <ø> (-1.86%) ⬇️ |
| torch/onnx/symbolic_opset11.py | 21.42% <9.09%> (-1.28%) ⬇️ |
| .../testing/_internal/distributed/distributed_test.py | 30.29% <12.19%> (-0.41%) ⬇️ |
| torch/onnx/symbolic_opset10.py | 40.00% <16.66%> (+3.74%) ⬆️ |
| torch/utils/benchmark/utils/timer.py | 78.82% <25.00%> (-8.85%) ⬇️ |

... and 26 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6acd7b6...476adff.

albanD (Collaborator) left a comment

Looks mostly good.
Added comments inline.

Two review threads on test/test_autograd.py (outdated, resolved)
albanD (Collaborator) left a comment

Looks good.
We can think of more "near to zero" optimization in a follow-up PR if we want to.
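For readers following along, a rough sketch of what handling the zero and near-zero differences can look like, written as plain tensor ops; the function name `cdist_backward_term` is hypothetical and this is not the kernel code touched by this PR:

```python
import torch

def cdist_backward_term(diff, dist, grad, p):
    # Per-coordinate backward term of d = (sum |diff|**p) ** (1/p):
    #   sign(diff) * |diff|**(p-1) * d**(1-p) * grad
    # For p < 1 the factor |diff|**(p-1) diverges as diff -> 0, so the
    # naive product at diff == 0 evaluates to 0 * inf = NaN. Masking the
    # zero-difference entries assigns them a zero gradient instead.
    term = diff.sign() * diff.abs().pow(p - 1) * dist.pow(1 - p) * grad
    return torch.where(diff == 0, torch.zeros_like(term), term)
```

Inside the C++/CUDA kernels the same idea would amount to checking the zero cases before the powers are evaluated.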

facebook-github-bot (Contributor) left a comment

@albanD has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

nikitaved (Collaborator) commented

@albanD, we can modify the formula for the different cases p < 1 and p > 1 to make it more numerically stable. It will be BC-breaking for sure.
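For reference, the per-coordinate derivative behind this discussion (standard calculus for the p-norm distance, not copied from the PR diff):

$$
d(x, y) = \Big(\sum_j |x_j - y_j|^p\Big)^{1/p},
\qquad
\frac{\partial d}{\partial x_i} = \operatorname{sign}(x_i - y_i)\,|x_i - y_i|^{p-1}\, d^{\,1-p}.
$$

For $p \ge 1$ the exponent $p-1$ is non-negative, so the factor $|x_i - y_i|^{p-1}$ stays bounded as a coordinate difference shrinks; for $p < 1$ it is negative, so that factor blows up while $\operatorname{sign}(x_i - y_i)$ goes to zero, and evaluating the product at a zero difference gives $0 \cdot \infty = \text{NaN}$.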

albanD (Collaborator) commented Oct 5, 2020

Not all BC-breaking changes are a no-go. If we change the computed value to be closer to the correct result, that's an OK thing to do.

The main thing is that we should make sure it does not make other regions significantly worse.

facebook-github-bot (Contributor) commented

@albanD merged this pull request in 54aaffb.

Successfully merging this pull request may close these issues.

torch.cdist gradients are NAN for p<1 and very small differences in a given dimension (0<delta<~e-45)