[MPS] Adding lgamma, digamma, and polygamma implementations #106292
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/106292
Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 8 Unrelated Failures as of commit cc34d98 with merge base 703cdd7:
NEW FAILURES - The following jobs have failed:
FLAKY - The following jobs failed but were likely due to flakiness present on trunk:
UNSTABLE - The following job failed but was likely due to flakiness present on trunk and has been marked as unstable:

This comment was automatically generated by Dr. CI and updates every 15 minutes.
@kulinseth Any chance you can give this a look and advise on whether the test failures are a problem?
This failure seems unrelated to the PR. @igm503, can you please rebase the PR?
Force-pushed from 048b88a to 3938014
@igm503 the assertion is coming from the not-implemented test. Can you check that the lgamma tests are not in that category class in test_mps?
@kulinseth I've fixed the assertion error by swapping another not-yet-implemented op in for lgamma in the not_implemented test.
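(For context, a minimal sketch of the kind of check test_error_on_not_implemented performs, under my assumptions about its shape; the op used as the stand-in here is a placeholder, not necessarily the one actually swapped in:)

```python
import torch

# Ops without an MPS kernel raise NotImplementedError (rather than silently
# falling back to CPU) when PYTORCH_ENABLE_MPS_FALLBACK is unset, and the
# test asserts exactly that for one known-unimplemented op.
def assert_not_implemented_on_mps(fn):
    try:
        fn()
    except NotImplementedError:
        return
    raise AssertionError("expected NotImplementedError on MPS")

x = torch.rand(4, device="mps")
# lgamma used to be the op checked here; with this PR it has to be replaced
# by an op that is still unimplemented (placeholder shown below).
assert_not_implemented_on_mps(lambda: torch.special.gammainc(x, x))
```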
@kulinseth So, at least as I'm typing this, the test errors are now the ones I mentioned in the pull request body: in some cases they're precision issues, but in other cases I think the CPU implementation is incorrect.
The tests now pass on the macOS 13 builds. @kulinseth However, since there are precision issues with test_output_grad_match_polygamma_polygamma_n_0_cpu_float32 on macOS 12 as well, where should I put that exception? I scanned the different XFAILLISTs, and I don't see a clear place for it. Of course, I could put it in the pre-13 XFAIL list, but that would make it seem like it's fixed for >13, which it isn't.
…unction had been made static elsewhere
…or test_error_on_not_implemented
@kulinseth I went ahead and added the failing tests to the MACOS_BEFORE_13_3_XFAILLIST as well. Let me know if there's a more appropriate place to put them.
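(For readers following along, a hedged sketch of the shape of such an entry, assuming the op-name-to-dtypes dict layout that the XFAIL lists in test_mps.py use; the keys shown are illustrative, not copied verbatim from the file:)

```python
import torch

# The XFAIL lists in test_mps.py map op names to the dtypes expected to
# fail; entries along these lines would mark the gamma-family precision
# failures on macOS < 13.3.
MACOS_BEFORE_13_3_XFAILLIST = {
    # ... existing entries ...
    'polygamma': [torch.float32],
    'special.polygamma': [torch.float32],
}
```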
Looks good
@pytorchbot merge -i
Merge started. Your change will be merged while ignoring the following 6 checks:
- pull / linux-focal-py3.8-clang10 / test (default, 2, 3, linux.2xlarge)
- pull / linux-jammy-py3.9-clang12-asan / test (default, 6, 6, linux.4xlarge)
- pull / linux-jammy-py3.8-gcc11 / test (default, 2, 3, linux.2xlarge)
- pull / linux-focal-py3.11-clang10 / test (default, 3, 3, linux.2xlarge)
- pull / linux-focal-cuda12.1-py3.10-gcc9-sm86 / test (default, 4, 5, linux.g5.4xlarge.nvidia.gpu, unstable)
- pull / linux-focal-cuda12.1-py3.10-gcc9 / test (default, 1, 5, linux.4xlarge.nvidia.gpu)

Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: 1 job has failed: trunk / macos-12-py3-arm64 / test (default, 3, 3, macos-m1-12). Details for Dev Infra team: raised by workflow job.
@pytorchbot merge -i
Fixes issue mentioned in #77764
e.g. #77764 (comment)
Adds MPS support for the following ops:
- lgamma
- digamma
- polygamma
The lgamma function does not yet have an MPS backend implementation, so I've added one using a custom Metal kernel (following John D. Cook's C++ implementation of the log gamma function: https://www.johndcook.com/blog/cpp_gamma/). For its backward pass, I've added a digamma kernel that follows the CPU and CUDA digamma implementations, and for the backward pass of the digamma op, I've added polygamma and trigamma kernels, again following the CPU and CUDA implementations.
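(To give a feel for the approach, here's a minimal Python sketch of a Cook-style log gamma for positive arguments. The actual kernel is written in Metal, and Cook's implementation uses a rational approximation on [1, 12); this sketch substitutes the recurrence Γ(x) = Γ(x+1)/x for that branch, so treat it as illustrative rather than as the kernel itself:)

```python
import math

# The eight asymptotic-series coefficients Cook's C++ log gamma uses:
# log Gamma(x) ~ (x - 1/2) ln x - x + ln(2*pi)/2 + sum_k c_k / x^(2k-1).
_COEFFS = [
    1.0 / 12.0, -1.0 / 360.0, 1.0 / 1260.0, -1.0 / 1680.0,
    1.0 / 1188.0, -691.0 / 360360.0, 1.0 / 156.0, -3617.0 / 122400.0,
]

def log_gamma(x: float) -> float:
    """Sketch of log(Gamma(x)) for x > 0; illustrative, not the Metal kernel."""
    assert x > 0.0
    # Shift x into the asymptotic regime via Gamma(x) = Gamma(x + 1) / x,
    # accumulating the logs we divide out along the way.
    shift = 0.0
    while x < 12.0:
        shift -= math.log(x)
        x += 1.0
    # Evaluate the series in powers of 1/x^2 by Horner's rule.
    z = 1.0 / (x * x)
    series = _COEFFS[-1]
    for c in reversed(_COEFFS[:-1]):
        series = series * z + c
    series /= x
    return (x - 0.5) * math.log(x) - x + 0.5 * math.log(2.0 * math.pi) + shift + series
```

A full lgamma additionally needs the reflection formula to handle negative x, and the digamma and trigamma kernels follow the same shift-then-asymptotic-series pattern as the CPU and CUDA implementations.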
NOTE:
The CPU implementation of the polygamma function incorrectly (as far as I can tell) outputs a finite number for order = 1 and x a negative integer; the MPS implementation correctly outputs infinity (see #106692, and the short repro sketched after this note).
The polygamma tests currently don't pass, partly because of that error in the CPU and CUDA kernels, but also because there are small discrepancies near the negative integers between the CPU/CUDA and the MPS polygamma and trigamma kernels. I'm not sure exactly why that is; let me know if the discrepancies are too big.
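(A minimal repro of the first point, assuming a build from around the time of this PR; trigamma, i.e. polygamma of order 1, has a pole at every non-positive integer, so inf is the mathematically expected output:)

```python
import torch

x = torch.tensor([-3.0, -2.0, -1.0])
# polygamma(1, .) diverges at non-positive integers, so every element
# here should come out as inf.
print(torch.polygamma(1, x))  # CPU: finite values (see #106692)
if torch.backends.mps.is_available():
    print(torch.polygamma(1, x.to("mps")))  # MPS kernel from this PR: inf
```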