[numpy] torch.lgamma: promote integer inputs to float #50140
Conversation
💊 CI failures summary (as of commit 341664f): 💚 Looks good so far! There are no failures yet. (This comment was automatically generated by Dr. CI.)
dtypesIfCUDA=all_types_and(torch.bool, torch.half),
skips=(
    SkipInfo('TestUnaryUfuncs', 'test_reference_numerics',
             dtypes=[torch.bfloat16]),
Skipping bfloat16: lgamma in bfloat16 is too imprecise to compare against the reference, e.g.
>>> import torch
>>> torch.tensor(500.0, dtype=torch.bfloat16)
tensor(500., dtype=torch.bfloat16)
>>> t = torch.tensor(500.0, dtype=torch.bfloat16)
>>> torch.lgamma(t)
tensor(2608., dtype=torch.bfloat16)
>>> torch.lgamma(t.to(torch.float32))
tensor(2605.1160)
Good skip. This again highlights that we should break up test_reference_numerics into different value ranges (small values, large values, and extremal values, for example).
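A rough sketch of what such a split might look like (the range names, boundaries, and helper below are hypothetical, not the actual test-suite API):

import torch

# Hypothetical named value ranges for the reference-numerics comparison,
# so a dtype with a small representable range can opt out of only the
# ranges it cannot handle.
RANGES = {
    'small': (-2.0, 2.0),
    'large': (-100.0, 100.0),
    'extremal': (float('-inf'), float('inf')),
}

def make_range_inputs(range_name, dtype, device='cpu', n=50):
    lo, hi = RANGES[range_name]
    if range_name == 'extremal':
        return torch.tensor([lo, hi, float('nan')], dtype=dtype, device=device)
    return torch.linspace(lo, hi, n, dtype=dtype, device=device)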
It would be helpful if ranges could be specified per dtype or per group of dtypes (maybe something similar to the precision override), because we don't want to decrease the range for all dtypes just to support one dtype with a smaller range.
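Something along these lines is what I have in mind (hypothetical names, loosely modeled on the existing precision-override mechanism):

import torch

# Hypothetical per-dtype domain override: only the listed dtypes get a
# restricted input range, every other dtype keeps the default test domain.
domain_overrides = {
    torch.bfloat16: (-50.0, 50.0),   # keep lgamma outputs representable in bfloat16
    torch.float16: (-500.0, 500.0),
}

def domain_for(dtype, default=(-1e4, 1e4)):
    return domain_overrides.get(dtype, default)

print(domain_for(torch.bfloat16))  # (-50.0, 50.0)
print(domain_for(torch.float64))   # (-10000.0, 10000.0)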
Codecov Report
@@ Coverage Diff @@
## master #50140 +/- ##
==========================================
- Coverage 80.91% 80.91% -0.01%
==========================================
Files 1926 1926
Lines 210014 210023 +9
==========================================
+ Hits 169943 169945 +2
- Misses 40071 40078 +7
@@ -1260,6 +1260,28 @@ def reference_sigmoid(x):
        return (1 / (1 + np.exp(-x)))
    return scipy.special.expit(x)

def reference_lgamma(x):
    # scipy.special.gammaln returns `-inf` when input is `-inf`.
This is a bug in SciPy.
scipy.special.gammaln is documented as equivalent to math.lgamma, but math.lgamma returns inf when given -inf.
Would you file an issue with SciPy, @kshitij12345?
cc @rgommers
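For reference, a minimal repro of the discrepancy (the SciPy value is as reported in this thread; exact behavior may depend on the SciPy version):

import math
import scipy.special

x = float('-inf')
print(math.lgamma(x))            # inf, matching C/C++ lgamma and torch.lgamma
print(scipy.special.gammaln(x))  # -inf, per the report in this thread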
@@ -1260,6 +1260,28 @@ def reference_sigmoid(x):
        return (1 / (1 + np.exp(-x)))
    return scipy.special.expit(x)

def reference_lgamma(x):
    # scipy.special.gammaln returns `-inf` when input is `-inf`.
    # While Pytorch, C and C++, all return `inf` when input is `-inf`.
This is a good comment. Would you extend it to include Python's math.lgamma, too?
    # https://en.cppreference.com/w/c/numeric/math/lgamma

    # To handle the above discrepancy,
    # we never pass `-inf` to scipy.special.gammaln.
This workaround is more clever than this comment suggests. It's not only that this never passes -inf to scipy.special.gammaln, but that it replaces -inf with inf so the mapping -inf->inf occurs as expected.
Should we phrase it as,
# To handle the above discrepancy,
# we never pass `-inf` to scipy.special.gammaln and replace them with `inf`.
That's pretty good, but it requires the reader to make the connection about how -inf is mapped to inf.
"this replaces -inf with inf so values that were originally -inf map to inf as expected"?
I think it is very verbose but at least it's clear. Will use that, thanks!
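For context, a simplified sketch of what the agreed-upon workaround amounts to (the actual reference_lgamma in the PR may differ in details):

import numpy as np
import scipy.special

def reference_lgamma(x):
    # scipy.special.gammaln maps -inf to -inf, while PyTorch, C, C++ and
    # Python's math.lgamma all map -inf to inf. Replace -inf with inf before
    # calling scipy so values that were originally -inf map to inf as expected.
    x = np.where(x == float('-inf'), np.array(float('inf'), dtype=x.dtype), x)
    return scipy.special.gammaln(x)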
# Reference: https://github.com/pytorch/pytorch/pull/50140#issuecomment-756150214
SkipInfo('TestUnaryUfuncs', 'test_reference_numerics',
         dtypes=[torch.float32, torch.float64], active_if=IS_WINDOWS),
# Backward of `lgamma` uses `digamma` but `digamma`
This is interesting. From the perspective of this PR this skip is correct, but I thought test_variant_consistency_jit wrapped the backward call in a try/except block to catch these errors?
I have verified. I don't see any try/except block in the failing test.
Stack Trace
======================================================================
ERROR: test_variant_consistency_jit_lgamma_cpu_bfloat16 (__main__.TestCommonCPU)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/kshiteej/.conda/envs/PytorchENV/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 284, in instantiated_test
raise rte
File "/home/kshiteej/.conda/envs/PytorchENV/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 279, in instantiated_test
result = test_fn(self, *args)
File "/home/kshiteej/.conda/envs/PytorchENV/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 253, in test_wrapper
return test(*args, **kwargs)
File "test/test_ops.py", line 290, in test_variant_consistency_jit
no_grad=(dtype not in dtypes_to_grad_check))
File "/home/kshiteej/.conda/envs/PytorchENV/lib/python3.7/site-packages/torch/testing/_internal/common_jit.py", line 77, in check_against_reference
allow_unused=allow_unused)
File "/home/kshiteej/.conda/envs/PytorchENV/lib/python3.7/site-packages/torch/autograd/__init__.py", line 225, in grad
inputs, allow_unused, accumulate_grad=False)
RuntimeError: "digamma" not implemented for 'BFloat16'
----------------------------------------------------------------------
Ran 94 tests in 5.282s
FAILED (errors=1, skipped=8)
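For what it's worth, the failure is easy to reproduce outside the test suite (a sketch, as of the state of master described in this thread):

import torch

t = torch.tensor(2.0, dtype=torch.bfloat16, requires_grad=True)
out = torch.lgamma(t)  # forward works for bfloat16
# lgamma's backward computes digamma(input) * grad, and digamma currently has
# no bfloat16 kernel, so this raises:
# RuntimeError: "digamma" not implemented for 'BFloat16'
out.backward()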
Code fails in check_against_reference (Lines 272 to 300 in 5546a12):
with disable_autodiff_subgraph_inlining():
    def fn(*inputs, **kwargs):
        output = func(*inputs, **kwargs)
        return op.output_func(output)

    # bfloat16 grad doesn't work for some operators
    dtypes_to_grad_check = floating_and_complex_types_and(torch.half) \
        if op.skip_bfloat16_grad else floating_and_complex_types_and(torch.half, torch.bfloat16)

    # Check scripted forward, grad, and grad grad
    script_fn = create_script_fn(self, name, func_type, op.output_func)
    check_against_reference(self,
                            script_fn,
                            fn,
                            (*sample.input,) + sample.args,
                            sample.kwargs,
                            no_grad=(dtype not in dtypes_to_grad_check))

    # Check traced forward, grad, and grad grad
    traced_fn = create_traced_fn(self, variant)
    check_against_reference(self,
                            traced_fn,
                            fn,
                            (*sample.input,) + sample.args,
                            sample.kwargs,
                            no_grad=(dtype not in dtypes_to_grad_check))

    # Check alias annotation schema for correctness (make
The code for check_against_reference also doesn't have any try/except (pytorch/torch/testing/_internal/common_jit.py, Lines 39 to 108 in 5546a12):
def check_against_reference(self, func, reference_func, args, kwargs=None,
                            allow_unused=True, check_types=True, no_grad=False):
    kwargs = kwargs if kwargs else {}

    def allSum(vs):
        if isinstance(vs, torch.Tensor):
            vs = (vs,)
        return sum((i + 1) * v.sum()
                   for i, v in enumerate(vs)
                   if v is not None and v.dtype in floating_and_complex_types_and(torch.half, torch.bfloat16))

    def clone_inputs(requires_grad):
        inputs = [
            arg.detach().clone().requires_grad_(requires_grad and arg.requires_grad)
            if isinstance(arg, torch.Tensor) else arg for arg in args
        ]
        return inputs, [input for input in inputs if isinstance(input, torch.Tensor) and input.requires_grad]

    nograd_inputs, nograd_tensors = clone_inputs(False)
    recording_inputs, recording_tensors = clone_inputs(True)

    # test no gradients case
    outputs = self.runAndSaveRNG(reference_func, nograd_inputs, kwargs)
    with enable_profiling_mode_for_profiling_tests():
        outputs_test = self.runAndSaveRNG(func, nograd_inputs, kwargs)
    self.assertEqual(outputs, outputs_test)

    if check_types:
        check_output_types(self, func, outputs_test, nograd_inputs, kwargs)

    if no_grad:
        # skip grad tests
        return

    with enable_profiling_mode_for_profiling_tests():
        # test single grad case
        outputs = self.runAndSaveRNG(reference_func, recording_inputs, kwargs)
        grads = torch.autograd.grad(allSum(outputs), recording_tensors,
                                    allow_unused=allow_unused)
        outputs_test = self.runAndSaveRNG(func, recording_inputs, kwargs)
        grads_test = torch.autograd.grad(allSum(outputs_test), recording_tensors,
                                         allow_unused=allow_unused)
        self.assertEqual(outputs, outputs_test)
        self.assertEqual(grads, grads_test)

        # test the grad grad case
        if self._testMethodName in nn_functional_single_grad:
            return

        outputs = self.runAndSaveRNG(reference_func, recording_inputs, kwargs)
        l1 = allSum(outputs)
        grads = torch.autograd.grad(l1, recording_tensors, create_graph=True,
                                    allow_unused=allow_unused)
        l2 = (allSum(grads) * l1)
        grads2 = torch.autograd.grad(l2, recording_tensors, allow_unused=allow_unused)

        recording_inputs, recording_tensors = clone_inputs(True)

        outputs_test = self.runAndSaveRNG(func, recording_inputs, kwargs)
        l1_test = allSum(outputs_test)
        grads_test = torch.autograd.grad(
            l1_test, recording_tensors, create_graph=True, allow_unused=allow_unused)
        l2_test = (allSum(grads_test) * l1_test)
        grads2_test = torch.autograd.grad(l2_test, recording_tensors, allow_unused=allow_unused)

        self.assertEqual(outputs, outputs_test)
        self.assertEqual(grads, grads_test)
        for g2, g2_test in zip(grads2, grads2_test):
            if g2 is None and g2_test is None:
                continue
            self.assertTrue(torch.allclose(g2, g2_test, atol=5e-4, rtol=1e-4))
Note: the test_variant_consistency_eager test has a try/except block which checks whether eager mode raises or not (Lines 183 to 237 in 5546a12):
# Tests that the forward and backward passes of operations produce the
# same values for the cross-product of op variants (method, inplace)
# against eager's gold standard op function variant
@ops(op_db)
def test_variant_consistency_eager(self, device, dtype, op):
    samples = op.sample_inputs(device, dtype, requires_grad=True)
    if len(samples) == 0:
        self.skipTest("Skipped! No sample inputs!")

    for sample in samples:
        # Acquires variants to test
        method = op.get_method()
        inplace = op.get_inplace()
        variants = (v for v in (method, inplace) if v is not None)

        # Computes expected forward
        # below calls op's function variant
        expected_forward = op(*sample.input, *sample.args, **sample.kwargs)

        # Computes expected backward
        # NOTE: backward may fail for some dtypes
        exception_during_backwards = False
        expected_grad = None
        try:
            expected_forward.sum().backward()
            expected_grad = sample.input.grad
            sample.input.grad = None
        except Exception as e:
            exception_during_backwards = True

        # Test eager consistency
        for variant in variants:
            # Verifies that inplace operations that promote int->float fail
            # on tensors with integer dtypes.
            if (variant is inplace and op.promotes_integers_to_float and
                    dtype in (torch.bool, torch.uint8, torch.int8, torch.int16, torch.int32, torch.int64)):
                try:
                    variant_forward = variant(*(clone_input_helper(input) for input in sample.input),
                                              *sample.args,
                                              **sample.kwargs)
                except Exception as e:
                    continue
                self.fail("Inplace operation on integer tensor that should be promoted to float didn't fail!")

            # Compares variant's forward
            # Note: copy the tensor-type inputs when testing inplace operation
            variant_forward = variant(*(clone_input_helper(input) if variant is inplace else input
                                        for input in sample.input),
                                      *sample.args,
                                      **sample.kwargs)
            self.assertEqual(variant_forward, expected_forward)

            # Compares variant's backward
            if variant is not inplace or op.test_inplace_grad:
                self.check_variant_backward(sample.input, variant_forward,
                                            expected_grad, exception_during_backwards)
Aha! That's why I was confused. Thank you for verifying. I'll update the test suite tracker.
Clever workarounds for that SciPy reference's bug on this one. Nice job, @kshitij12345. I think we can kill this, too: Line 6920 in 78e71ce
Would you also take a look at test_variant_consistency_jit briefly to see what's going on in backward? I really thought we had written that test so backward could fail and we would just verify it failed in both eager mode and the jit.
Thanks for the review. I have made the changes and checked the failing test; see the note above for details. Thanks!
@mruberry PTAL :)
Great work as usual, @kshitij12345. The test failure looks unrelated.
@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
SkipInfo('TestCommon', 'test_variant_consistency_jit',
         dtypes=[torch.bfloat16]),
),
promotes_integers_to_float=True),
This PR has to be rebased because this will cause a logical merge conflict with @peterbell10's PR.
I think it just needs to be changed to safe_casts_outputs.
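Concretely, the flag rename would look roughly like this in the lgamma OpInfo entry (a sketch; only the flag line changes, the surrounding kwargs stay as they are):

-               promotes_integers_to_float=True),
+               safe_casts_outputs=True),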
Thanks! Updated.
@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Reference: #42515