torch.polygamma inconsistent with scipy.special.polygamma for n >= 1 #106692

igm503 · 2023-08-07T05:36:43Z

kshitij12345 raised an issue two years ago pointing out that PyTorch's implementation of the first-order polygamma function incorrectly produces finite values at negative integers. That issue was closed, but the problem still seems to be present in the latest nightlies and the latest release of PyTorch.

n = 1

>>> t = torch.tensor([-1.])
>>> torch.polygamma(1, t)
tensor([1.2914e+15])
>>> scipy.special.polygamma(1, t.numpy())
array([inf], dtype=float32)

>>> t = torch.tensor([-501.])
>>> torch.polygamma(1, t)
tensor([2.0831e+09])
>>> scipy.special.polygamma(1, t.numpy())
array([inf], dtype=float32)

>>> t = torch.tensor([-float('inf')])
>>> torch.polygamma(1, t)
tensor([nan])
>>> scipy.special.polygamma(1, t.numpy())
array([inf], dtype=float32)

n > 1

>>> t = torch.tensor([float('inf')])
>>> torch.polygamma(2, t)
tensor([nan])
>>> scipy.special.polygamma(2, t.numpy())
array([-0.], dtype=float32)
>>> torch.polygamma(3, t)
tensor([nan])
>>> scipy.special.polygamma(3, t.numpy())
array([0.], dtype=float32)
>>> torch.polygamma(4, t)
tensor([nan])
>>> scipy.special.polygamma(4, t.numpy())
array([-0.], dtype=float32)
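
For reference, the signed zeros in the scipy output above are what the standard large-argument expansion of the polygamma function predicts. This is the textbook asymptotic form, not something taken from either library's source:

$$
\psi^{(n)}(x) \sim (-1)^{n+1}\left[\frac{(n-1)!}{x^{n}} + \frac{n!}{2\,x^{n+1}} + \cdots\right], \qquad x \to +\infty,\quad n \ge 1
$$

Every term in the bracket vanishes as $x \to +\infty$, so the limit is $0$, approached from below for even $n$ and from above for odd $n$. That is consistent with `-0.`, `0.`, `-0.` for n = 2, 3, 4.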

cc @mruberry @rgommers @kshitij12345

_Originally posted in #55357_

Fixes issue mentioned in #77764, e.g. #77764 (comment).

Adds MPS support for the following ops:

- lgamma
- mvlgamma
- digamma
- polygamma

The lgamma function does not yet have an MPS backend implementation. I've added one using a custom Metal kernel (following John D. Cook's C++ implementation of the log gamma function: https://www.johndcook.com/blog/cpp_gamma/). For the backward pass op, I've added a digamma kernel that follows the cpu+cuda digamma implementation, and for the backward pass of the digamma op, I've added a polygamma + trigamma kernel following, again, the cpu+cuda implementations.

NOTE: The cpu implementation of the polygamma function incorrectly (as far as I can tell) outputs a finite number for order = 1 and x in the negative integers. The mps implementation correctly outputs infinite. (see #106692)

The polygamma tests currently don't pass because of the error in the cpu+cuda kernels, but also because there are smallish discrepancies near the negative integers between the cpu+cuda and the mps polygamma and trigamma kernels. I'm not sure exactly why this is, but let me know if the discrepancies are too big.

Pull Request resolved: #106292
Approved by: https://github.com/kulinseth

pearu · 2024-02-10T16:37:47Z

Re

>>> t = torch.tensor([-1.])
>>> torch.polygamma(1, t)
tensor([1.2914e+15])

The problem with polygamma(1, t) is that it internally calls trigamma, which does not check for non-positive-integer arguments. The reported value comes from the reflection term (pi/sin(pi*t))**2+... of the trigamma(t) approximation, which for the given t evaluates to

>>> (torch.pi/torch.sin(t*torch.pi)) ** 2
tensor(1.2914e+15)

It is finite because sin(pi*t) is never exactly zero for a nonzero floating point value t.
A fix is to insert a std::floor(t) == t check into trigamma and return inf when the check evaluates to true.
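
As a rough illustration of that fix, here is a pure-Python sketch of a trigamma routine with the pole check added. It is not the actual ATen kernel: the function name `trigamma_with_pole_check` is hypothetical, the reflection-plus-asymptotic-series structure is only assumed to resemble what the kernel does, and restricting the check to `x <= 0` (so that positive integers are still evaluated normally) is my assumption about how the check would be scoped.

```python
import math


def trigamma_with_pole_check(x: float) -> float:
    """Sketch of trigamma(x) with an explicit pole check (hypothetical helper)."""
    if math.isnan(x) or x == -math.inf:
        # No limit exists at -inf (poles at every non-positive integer).
        return math.nan
    # The proposed fix: non-positive integers are poles, so return inf
    # instead of relying on sin(pi * x), which is never exactly zero there.
    if x <= 0 and math.floor(x) == x:
        return math.inf

    sign = 1.0
    result = 0.0
    if x < 0.5:
        # Reflection: trigamma(x) = (pi / sin(pi x))**2 - trigamma(1 - x)
        sign = -1.0
        s = math.sin(math.pi * x)
        result -= (math.pi * math.pi) / (s * s)
        x = 1.0 - x
    # Recurrence trigamma(x) = trigamma(x + 1) + 1 / x**2, applied six times.
    for _ in range(6):
        result += 1.0 / (x * x)
        x += 1.0
    # Asymptotic series for the shifted argument.
    ixx = 1.0 / (x * x)
    result += (1.0 + 1.0 / (2.0 * x)
               + ixx * (1.0 / 6.0 - ixx * (1.0 / 30.0 - ixx / 42.0))) / x
    return sign * result


# sin(pi * t) is tiny but nonzero at t = -1, which is why the unchecked
# formula produces a huge finite number instead of inf.
print(math.sin(math.pi * -1.0) != 0.0)
print(trigamma_with_pole_check(-1.0))
print(trigamma_with_pole_check(1.0), math.pi ** 2 / 6)
```

With the check in place, negative integers and zero return inf, while non-integer negative arguments still go through the reflection branch.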

Re

>>> t = torch.tensor([-float('inf')])
>>> torch.polygamma(1, t)
tensor([nan])
>>> scipy.special.polygamma(1, t.numpy())
array([inf], dtype=float32)

and other examples with -inf. Here, the torch.polygamma result is as expected while the scipy.special.polygamma result is incorrect: polygamma has poles at all non-positive integers for any n, so when evaluating polygamma at -inf, the only reasonable result is nan.
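
One way to see why no limit exists at -inf, sketched for n = 1: by the reflection formula, trigamma(x) = (pi/sin(pi*x))**2 - trigamma(1 - x), and trigamma(1 - x) tends to 0 as x goes to -inf, so the behaviour is governed by the reflection term. The helper name `reflection_term` below is hypothetical and the snippet only evaluates that term along two sequences heading to -inf.

```python
import math


def reflection_term(x: float) -> float:
    """Dominant term of trigamma(x) for large negative x (hypothetical helper)."""
    return (math.pi / math.sin(math.pi * x)) ** 2


# Two sequences that both go to -inf but sit at different offsets
# between consecutive poles.
at_half = [reflection_term(-k - 0.5) for k in (10, 100, 1000)]
at_quarter = [reflection_term(-k - 0.25) for k in (10, 100, 1000)]

print(at_half)      # stays near pi**2
print(at_quarter)   # stays near 2 * pi**2
```

The two sequences settle on different values, and sequences that approach the poles grow without bound, so there is no single value to assign at -inf.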

… tensor data."

As in the title.

The aim of this addition is to make debugging certain CI failures (that cannot be reproduced locally) easier. For instance, currently we see messages like

```
Exception: Caused by sample input at index 0: SampleInput(input=Tensor[size=(20,), device="cuda:0", dtype=torch.float64], args=(), kwargs={}, broadcasts_input=False, name='')
```

that is not really useful (as all those sample parameters can often be detected by other means) without showing actual sample data. The sample data can then be related to the `index` part in the error messages like:

```
Mismatched elements: 2 / 20 (10.0%)
Greatest absolute difference: nan at index (10,) (up to 1e-05 allowed)
Greatest relative difference: nan at index (10,) (up to 1e-07 allowed)
```

As an example of usefulness of this PR, consider the following failure message:

```
inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCPU::test_comprehensive_polygamma_polygamma_n_0_cpu_int32 ('RERUN', {'yellow': True}) [1.5510s] [ 70%]
inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCPU::test_comprehensive_polygamma_polygamma_n_0_cpu_int32 ('RERUN', {'yellow': True}) [0.0473s] [ 70%]
inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCPU::test_comprehensive_polygamma_polygamma_n_0_cpu_int32 FAILED [0.0493s] [ 70%]

==================================== RERUNS ====================================
__ TestInductorOpInfoCPU.test_comprehensive_polygamma_polygamma_n_0_cpu_int32 __
Traceback (most recent call last):
<snip>
AssertionError: Tensor-likes are not close!

Mismatched elements: 9 / 25 (36.0%)
Greatest absolute difference: inf at index (0, 0) (up to 1e-05 allowed), inf vs 20177651499008.0
Greatest relative difference: inf at index (0, 0) (up to 1.3e-06 allowed)

The above exception was the direct cause of the following exception:
<snip>
Exception: Caused by sample input at index 0: SampleInput(input=Tensor[size=(5, 5), device="cpu", dtype=torch.int32, data=[-8, 6, 9, 0, 0, 5, 5, 7, 6, 5, 1, -5, 2, -1, 8, -4, 0, -6, 3, -5]], args=(1), kwargs={}, broadcasts_input=False, name='')
```

from which we learn that `torch.polygamma` result is actually correct because `polygamma(0, -8) -> inf` while the used reference value (20177651499008.0) is wrong (see #106692 for more details).

[ghstack-poisoned]
