
[numpy] torch.lgamma: promote integer inputs to float #50140

Conversation

kshitij12345 (Collaborator, PR author):

Reference: #42515
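
For context, this is the behavior the PR enables: integer inputs to torch.lgamma are promoted to the default floating-point dtype instead of raising. A minimal illustration (the printed values are ln((n-1)!) and are shown for illustration, not copied from the PR):

>>> import torch
>>> t = torch.arange(1, 5)        # int64 tensor([1, 2, 3, 4])
>>> torch.lgamma(t)               # integer input promoted to the default float dtype
tensor([0.0000, 0.0000, 0.6931, 1.7918])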

facebook-github-bot added the labels cla signed and oncall: jit on Jan 6, 2021.
facebook-github-bot (Contributor) commented on Jan 6, 2021:

💊 CI failures summary and remediations

As of commit 341664f (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI.

dtypesIfCUDA=all_types_and(torch.bool, torch.half),
skips=(
    SkipInfo('TestUnaryUfuncs', 'test_reference_numerics',
             dtypes=[torch.bfloat16]),
kshitij12345 (Collaborator, PR author):

Skipping bfloat16:

>>> import torch
>>> torch.tensor(500.0, dtype=torch.bfloat16)
tensor(500., dtype=torch.bfloat16)
>>> t = torch.tensor(500.0, dtype=torch.bfloat16)
>>> torch.lgamma(t)
tensor(2608., dtype=torch.bfloat16)
>>> torch.lgamma(t.to(torch.float32))
tensor(2605.1160)
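
One plausible reading of this gap (added here for context, not stated in the thread): at this magnitude bfloat16 values are spaced 16 apart, so 2608 is simply the nearest representable bfloat16 to the float32 result 2605.1160; the skip reflects the format's roughly 8-bit precision rather than a kernel bug.

>>> ref = torch.lgamma(torch.tensor(500.0))   # float32 result from above
>>> ref.to(torch.bfloat16)                    # rounds to the nearest representable bfloat16
tensor(2608., dtype=torch.bfloat16)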

Reviewer (Collaborator):

Good skip. This again highlights that we should break up test_reference_numerics into different value ranges (for example, small values, large values, and extremal values).

kshitij12345 (Collaborator, PR author):

It would be helpful if ranges could be specified per dtype or for a group of dtypes (maybe something similar to the precision override), because we don't want to shrink the range for all dtypes just to support one dtype with a smaller range.
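
A hypothetical sketch of what such a per-dtype range override could look like (domain_overrides and make_unary_test_tensor are illustrative names, not existing PyTorch test infrastructure):

import torch

# Hypothetical: per-dtype input ranges, analogous to the precision-override mechanism.
domain_overrides = {
    torch.bfloat16: (-100.0, 100.0),  # narrower range for low-precision dtypes
}

def make_unary_test_tensor(dtype, device='cpu', default_range=(-1e4, 1e4), n=1024):
    low, high = domain_overrides.get(dtype, default_range)
    # Sample in float32 first, then cast, so low-precision dtypes work too.
    return torch.empty(n, device=device).uniform_(low, high).to(dtype)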

codecov bot commented on Jan 11, 2021:

Codecov Report

Merging #50140 (341664f) into master (8690819) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master   #50140      +/-   ##
==========================================
- Coverage   80.91%   80.91%   -0.01%     
==========================================
  Files        1926     1926              
  Lines      210014   210023       +9     
==========================================
+ Hits       169943   169945       +2     
- Misses      40071    40078       +7     

kshitij12345 marked this pull request as ready for review on January 12, 2021.
@@ -1260,6 +1260,28 @@ def reference_sigmoid(x):
        return (1 / (1 + np.exp(-x)))
    return scipy.special.expit(x)

def reference_lgamma(x):
    # scipy.special.gammaln returns `-inf` when input is `-inf`.
Reviewer (Collaborator):

This is a bug in SciPy.

scipy.special.gammaln is documented as equivalent to math.lgamma, but math.lgamma returns inf when given -inf.

Would you file an issue with SciPy, @kshitij12345?

cc @rgommers

@@ -1260,6 +1260,28 @@ def reference_sigmoid(x):
        return (1 / (1 + np.exp(-x)))
    return scipy.special.expit(x)

def reference_lgamma(x):
    # scipy.special.gammaln returns `-inf` when input is `-inf`.
    # While Pytorch, C and C++, all return `inf` when input is `-inf`.
Reviewer (Collaborator):

This is a good comment. Would you extend it to include Python's math.lgamma, too?

# https://en.cppreference.com/w/c/numeric/math/lgamma

# To handle the above discrepancy,
# we never pass `-inf` to scipy.special.gammaln.
Reviewer (Collaborator):

This workaround is more clever than this comment suggests. It's not only that this never passes -inf to scipy.special.gammaln, but that it replaces -inf with inf so the mapping -inf -> inf occurs as expected.

kshitij12345 (Collaborator, PR author):

Should we phrase it as,

# To handle the above discrepancy,
# we never pass `-inf`  to scipy.special.gammaln and replace them with `inf`.

Reviewer (Collaborator):

That's pretty good, but it requires the reader to make the connection about how -inf is mapped to inf.

"this replaces -inf with inf so values that were originally -inf map to inf as expected"?

kshitij12345 (Collaborator, PR author):

I think it is a bit verbose but at least it's clear. Will use that, thanks!
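
For reference, a minimal sketch of the workaround being discussed (not necessarily the exact code merged in this PR): -inf inputs are replaced with inf before calling the SciPy reference, and since gammaln(inf) == inf this reproduces the expected -inf -> inf mapping.

import numpy as np
import scipy.special

def reference_lgamma(x):
    # scipy.special.gammaln maps -inf to -inf, while PyTorch, C, C++ and
    # Python's math.lgamma all map -inf to inf. Replacing -inf with inf
    # before the call preserves the expected mapping, since gammaln(inf) == inf.
    return scipy.special.gammaln(np.where(np.isneginf(x), np.inf, x))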

# Reference: https://github.com/pytorch/pytorch/pull/50140#issuecomment-756150214
SkipInfo('TestUnaryUfuncs', 'test_reference_numerics',
         dtypes=[torch.float32, torch.float64], active_if=IS_WINDOWS),
# Backward of `lgamma` uses `digamma` but `digamma`
Reviewer (Collaborator):

This is interesting. From the perspective of this PR this skip is correct, but I thought test_variant_consistency_jit wrapped the backward call in a try/except block to catch these errors?

kshitij12345 (Collaborator, PR author):

I have verified: I don't see any try/except block in the failing test.

Stack Trace
======================================================================
ERROR: test_variant_consistency_jit_lgamma_cpu_bfloat16 (__main__.TestCommonCPU)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/kshiteej/.conda/envs/PytorchENV/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 284, in instantiated_test
    raise rte
  File "/home/kshiteej/.conda/envs/PytorchENV/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 279, in instantiated_test
    result = test_fn(self, *args)
  File "/home/kshiteej/.conda/envs/PytorchENV/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 253, in test_wrapper
    return test(*args, **kwargs)
  File "test/test_ops.py", line 290, in test_variant_consistency_jit
    no_grad=(dtype not in dtypes_to_grad_check))
  File "/home/kshiteej/.conda/envs/PytorchENV/lib/python3.7/site-packages/torch/testing/_internal/common_jit.py", line 77, in check_against_reference
    allow_unused=allow_unused)
  File "/home/kshiteej/.conda/envs/PytorchENV/lib/python3.7/site-packages/torch/autograd/__init__.py", line 225, in grad
    inputs, allow_unused, accumulate_grad=False)
RuntimeError: "digamma" not implemented for 'BFloat16'

----------------------------------------------------------------------
Ran 94 tests in 5.282s

FAILED (errors=1, skipped=8)

Code fails in check_against_reference

pytorch/test/test_ops.py

Lines 272 to 300 in 5546a12

with disable_autodiff_subgraph_inlining():
    def fn(*inputs, **kwargs):
        output = func(*inputs, **kwargs)
        return op.output_func(output)

    # bfloat16 grad doesn't work for some operators
    dtypes_to_grad_check = floating_and_complex_types_and(torch.half) \
        if op.skip_bfloat16_grad else floating_and_complex_types_and(torch.half, torch.bfloat16)

    # Check scripted forward, grad, and grad grad
    script_fn = create_script_fn(self, name, func_type, op.output_func)
    check_against_reference(self,
                            script_fn,
                            fn,
                            (*sample.input,) + sample.args,
                            sample.kwargs,
                            no_grad=(dtype not in dtypes_to_grad_check))

    # Check traced forward, grad, and grad grad
    traced_fn = create_traced_fn(self, variant)
    check_against_reference(self,
                            traced_fn,
                            fn,
                            (*sample.input,) + sample.args,
                            sample.kwargs,
                            no_grad=(dtype not in dtypes_to_grad_check))

    # Check alias annotation schema for correctness (make

The code for check_against_reference also doesn't have any try/except

def check_against_reference(self, func, reference_func, args, kwargs=None,
                            allow_unused=True, check_types=True, no_grad=False):
    kwargs = kwargs if kwargs else {}

    def allSum(vs):
        if isinstance(vs, torch.Tensor):
            vs = (vs,)
        return sum((i + 1) * v.sum()
                   for i, v in enumerate(vs)
                   if v is not None and v.dtype in floating_and_complex_types_and(torch.half, torch.bfloat16))

    def clone_inputs(requires_grad):
        inputs = [
            arg.detach().clone().requires_grad_(requires_grad and arg.requires_grad)
            if isinstance(arg, torch.Tensor) else arg for arg in args
        ]
        return inputs, [input for input in inputs if isinstance(input, torch.Tensor) and input.requires_grad]

    nograd_inputs, nograd_tensors = clone_inputs(False)
    recording_inputs, recording_tensors = clone_inputs(True)

    # test no gradients case
    outputs = self.runAndSaveRNG(reference_func, nograd_inputs, kwargs)
    with enable_profiling_mode_for_profiling_tests():
        outputs_test = self.runAndSaveRNG(func, nograd_inputs, kwargs)
    self.assertEqual(outputs, outputs_test)

    if check_types:
        check_output_types(self, func, outputs_test, nograd_inputs, kwargs)

    if no_grad:
        # skip grad tests
        return

    with enable_profiling_mode_for_profiling_tests():
        # test single grad case
        outputs = self.runAndSaveRNG(reference_func, recording_inputs, kwargs)
        grads = torch.autograd.grad(allSum(outputs), recording_tensors,
                                    allow_unused=allow_unused)
        outputs_test = self.runAndSaveRNG(func, recording_inputs, kwargs)
        grads_test = torch.autograd.grad(allSum(outputs_test), recording_tensors,
                                         allow_unused=allow_unused)
        self.assertEqual(outputs, outputs_test)
        self.assertEqual(grads, grads_test)

        # test the grad grad case
        if self._testMethodName in nn_functional_single_grad:
            return

        outputs = self.runAndSaveRNG(reference_func, recording_inputs, kwargs)
        l1 = allSum(outputs)
        grads = torch.autograd.grad(l1, recording_tensors, create_graph=True,
                                    allow_unused=allow_unused)
        l2 = (allSum(grads) * l1)
        grads2 = torch.autograd.grad(l2, recording_tensors, allow_unused=allow_unused)

        recording_inputs, recording_tensors = clone_inputs(True)

        outputs_test = self.runAndSaveRNG(func, recording_inputs, kwargs)
        l1_test = allSum(outputs_test)
        grads_test = torch.autograd.grad(
            l1_test, recording_tensors, create_graph=True, allow_unused=allow_unused)
        l2_test = (allSum(grads_test) * l1_test)
        grads2_test = torch.autograd.grad(l2_test, recording_tensors, allow_unused=allow_unused)

        self.assertEqual(outputs, outputs_test)
        self.assertEqual(grads, grads_test)
        for g2, g2_test in zip(grads2, grads2_test):
            if g2 is None and g2_test is None:
                continue
            self.assertTrue(torch.allclose(g2, g2_test, atol=5e-4, rtol=1e-4))

Note: the test_variant_consistency_eager test has a try/except block which checks whether eager mode raises or not.

pytorch/test/test_ops.py

Lines 183 to 237 in 5546a12

# Tests that the forward and backward passes of operations produce the
# same values for the cross-product of op variants (method, inplace)
# against eager's gold standard op function variant
@ops(op_db)
def test_variant_consistency_eager(self, device, dtype, op):
    samples = op.sample_inputs(device, dtype, requires_grad=True)
    if len(samples) == 0:
        self.skipTest("Skipped! No sample inputs!")

    for sample in samples:
        # Acquires variants to test
        method = op.get_method()
        inplace = op.get_inplace()
        variants = (v for v in (method, inplace) if v is not None)

        # Computes expected forward
        # below calls op's function variant
        expected_forward = op(*sample.input, *sample.args, **sample.kwargs)

        # Computes expected backward
        # NOTE: backward may fail for some dtypes
        exception_during_backwards = False
        expected_grad = None
        try:
            expected_forward.sum().backward()
            expected_grad = sample.input.grad
            sample.input.grad = None
        except Exception as e:
            exception_during_backwards = True

        # Test eager consistency
        for variant in variants:
            # Verifies that inplace operations that promote int->float fail
            # on tensors with integer dtypes.
            if (variant is inplace and op.promotes_integers_to_float and
                    dtype in (torch.bool, torch.uint8, torch.int8, torch.int16, torch.int32, torch.int64)):
                try:
                    variant_forward = variant(*(clone_input_helper(input) for input in sample.input),
                                              *sample.args,
                                              **sample.kwargs)
                except Exception as e:
                    continue
                self.fail("Inplace operation on integer tensor that should be promoted to float didn't fail!")

            # Compares variant's forward
            # Note: copy the tensor-type inputs when testing inplace operation
            variant_forward = variant(*(clone_input_helper(input) if variant is inplace else input
                                        for input in sample.input),
                                      *sample.args,
                                      **sample.kwargs)
            self.assertEqual(variant_forward, expected_forward)

            # Compares variant's backward
            if variant is not inplace or op.test_inplace_grad:
                self.check_variant_backward(sample.input, variant_forward,
                                            expected_grad, exception_during_backwards)

Reviewer (Collaborator):

Aha! That's why I was confused. Thank you for verifying. I'll update the test suite tracker.

mruberry (Collaborator):

Clever workarounds for that SciPy reference's bug on this one. Nice job, @kshitij12345. I think we can kill this, too:

('lgamma', '', _small_3d, lambda t, d: [], 1e-2, 1e-1, 1e-5, _float_types_no_half, [torch.bfloat16]),

Would you also take a look at test_variant_consistency_jit briefly to see what's going on in backward? I really thought we had written that test so backwards could fail and we would just verify it failed in both eager mode and the jit.

kshitij12345 (Collaborator, PR author):

@mruberry

Thanks for the review. I have made the necessary changes:

  • Removed the redundant test.
  • Updated the comment.

I have checked the failing test_variant_consistency_jit and it looks like it doesn't have a try/except (test_variant_consistency_eager does check for that, and that test passes). I have replied with more details on the relevant thread.

Note: the clang-format failure looks unrelated, but I'm not sure.

Thanks!

izdeby added the triaged label on Jan 12, 2021.
kshitij12345 (Collaborator, PR author):

@mruberry PTAL :)

mruberry self-requested a review on January 19, 2021.
mruberry (Collaborator) left a review:

Great work as usual, @kshitij12345. The test failure looks unrelated.

facebook-github-bot (Contributor) left a comment:

@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

kshitij12345 mentioned this pull request on Jan 19, 2021.
SkipInfo('TestCommon', 'test_variant_consistency_jit',
         dtypes=[torch.bfloat16]),
),
promotes_integers_to_float=True),
Reviewer (Collaborator):

This PR has to be rebased because it will cause a logical merge conflict with @peterbell10's 0436ea1.

I think it just needs to be changed to safe_casts_outputs.
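
A hedged sketch of what the updated entry might look like after that change, assuming it is declared as a UnaryUfuncInfo (the arguments are assembled from the diff fragments shown in this conversation; the merged code may differ):

UnaryUfuncInfo('lgamma',
               ref=reference_lgamma,
               dtypesIfCUDA=all_types_and(torch.bool, torch.half),
               skips=(
                   SkipInfo('TestUnaryUfuncs', 'test_reference_numerics',
                            dtypes=[torch.bfloat16]),
                   # Reference: https://github.com/pytorch/pytorch/pull/50140#issuecomment-756150214
                   SkipInfo('TestUnaryUfuncs', 'test_reference_numerics',
                            dtypes=[torch.float32, torch.float64], active_if=IS_WINDOWS),
                   SkipInfo('TestCommon', 'test_variant_consistency_jit',
                            dtypes=[torch.bfloat16]),
               ),
               safe_casts_outputs=True),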

kshitij12345 (Collaborator, PR author):

Thanks! Updated.

facebook-github-bot (Contributor) left a comment:

@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot (Contributor):

@mruberry merged this pull request in 6d09809.

Labels: cla signed, Merged, oncall: jit, open source, triaged