Fix test_addmm_gelu assertion on Windows CUDA #104031

aakhundov · 2023-06-22T09:19:26Z

Stack from ghstack (oldest at bottom):

-> Fix test_addmm_gelu assertion on Windows CUDA #104031

Summary:

This PR fixes the incorrect assertion in test_addmm_gelu that has been failing in the Windows CUDA CI job since #103811. On Windows, the addmm + GELU fusion is likely not happening (or not using the tanh approximation), so the fused and reference results diverge. See this comment in #103811 for the details of the error.
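To illustrate why the choice of GELU variant matters for the assertion, here is a minimal standard-library sketch comparing the exact (erf-based) GELU with its tanh approximation. The function names and the sampling grid are illustrative assumptions and are not taken from test_linalg.py; the sketch only shows that the two formulas disagree by a small but nonzero amount, which a tight tolerance can expose.

```python
import math


def gelu_exact(x: float) -> float:
    """Exact GELU: 0.5 * x * (1 + erf(x / sqrt(2)))."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def max_abs_gap(points):
    """Largest absolute difference between the two variants over `points`."""
    return max(abs(gelu_exact(x) - gelu_tanh(x)) for x in points)


if __name__ == "__main__":
    # Hypothetical sampling grid on [-4, 4]; not taken from the real test.
    grid = [i / 100.0 for i in range(-400, 401)]
    print(f"max |exact - tanh| on [-4, 4]: {max_abs_gap(grid):.2e}")
```

If one side of a comparison uses the tanh form and the other the exact form, the results differ by this gap regardless of how accurate the matrix multiply is.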

Test Plan:

$ python test/test_linalg.py -k test_addmm_relu -v
test_addmm_relu_cpu_bfloat16 (__main__.TestLinalgCPU.test_addmm_relu_cpu_bfloat16) ... ok
test_addmm_relu_cpu_float32 (__main__.TestLinalgCPU.test_addmm_relu_cpu_float32) ... ok
test_addmm_relu_cpu_float64 (__main__.TestLinalgCPU.test_addmm_relu_cpu_float64) ... ok
test_addmm_relu_cuda_bfloat16 (__main__.TestLinalgCUDA.test_addmm_relu_cuda_bfloat16) ... ok
test_addmm_relu_cuda_float32 (__main__.TestLinalgCUDA.test_addmm_relu_cuda_float32) ... ok
test_addmm_relu_cuda_float64 (__main__.TestLinalgCUDA.test_addmm_relu_cuda_float64) ... ok

----------------------------------------------------------------------
Ran 6 tests in 2.131s

OK

$ python test/test_linalg.py -k test_addmm_gelu -v
test_addmm_gelu_cpu_bfloat16 (__main__.TestLinalgCPU.test_addmm_gelu_cpu_bfloat16) ... ok
test_addmm_gelu_cpu_float32 (__main__.TestLinalgCPU.test_addmm_gelu_cpu_float32) ... ok
test_addmm_gelu_cpu_float64 (__main__.TestLinalgCPU.test_addmm_gelu_cpu_float64) ... ok
test_addmm_gelu_cuda_bfloat16 (__main__.TestLinalgCUDA.test_addmm_gelu_cuda_bfloat16) ... ok
test_addmm_gelu_cuda_float32 (__main__.TestLinalgCUDA.test_addmm_gelu_cuda_float32) ... ok
test_addmm_gelu_cuda_float64 (__main__.TestLinalgCUDA.test_addmm_gelu_cuda_float64) ... ok

----------------------------------------------------------------------
Ran 6 tests in 2.194s

OK

Reviewers: @eellison @huydhn

Subscribers:

Tasks:

Tags:

Differential Revision: D46931688

@eellison


pytorch-bot · 2023-06-22T09:19:29Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/104031


❌ 3 New Failures

As of commit 5a3d309:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

ghstack-source-id: 992faa201a4269bde3859df462cf00761cf8666f
Pull Request resolved: #104031

aakhundov · 2023-06-22T09:25:49Z

@aakhundov has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

huydhn · 2023-06-22T16:06:33Z

@pytorchbot rebase

huydhn · 2023-06-22T16:07:33Z

Just FYI, the Windows build failure https://github.com/pytorch/pytorch/actions/runs/5343807634/jobs/9693267625 is a known issue that can be fixed by rebasing to include the fix commit from trunk.

pytorchmergebot · 2023-06-22T16:08:27Z

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here


pytorchmergebot · 2023-06-22T16:08:44Z

Successfully rebased gh/aakhundov/4/orig onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via ghstack checkout https://github.com/pytorch/pytorch/pull/104031)

ghstack-source-id: 6b3a992609dd70f67a803b497e068d4f46583d4d
Pull Request resolved: #104031

malfet · 2023-06-22T16:38:52Z

@pytorchbot merge -f "Lint is green and It simply skips the test"

pytorchmergebot · 2023-06-22T16:40:57Z

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

pytorchmergebot · 2023-06-22T16:41:01Z

Merge failed

Reason: This PR has internal changes and must be landed via Phabricator

Details for Dev Infra team

Raised by workflow job

aakhundov · 2023-06-22T16:46:00Z

We've discussed the issue with @eellison and agreed that it would be better to switch to the "tanh" approximation for the non-fused GELU in the aten._addmm_activation op, so that the fused and non-fused paths agree. As the API is private, this seems cleaner than patching different cases in the unit test. I will update the PR accordingly soon.
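As a rough illustration of the proposal above, the following pure-Python sketch models a non-fused fallback that computes addmm first and then applies the tanh-approximated GELU, so it would match a fused epilogue that also uses the tanh form. Everything here is an assumption for illustration: the helper names, the nested-list representation, and the ReLU default when use_gelu is false are not taken from the actual ATen implementation.

```python
import math


def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def addmm(bias, mat1, mat2, beta=1.0, alpha=1.0):
    """beta * bias + alpha * (mat1 @ mat2) on nested lists.

    `bias` is a 1-D list broadcast across the rows of the product.
    """
    inner, cols = len(mat2), len(mat2[0])
    return [
        [
            beta * bias[j]
            + alpha * sum(row[k] * mat2[k][j] for k in range(inner))
            for j in range(cols)
        ]
        for row in mat1
    ]


def addmm_activation_reference(bias, mat1, mat2, use_gelu=False):
    """Hypothetical non-fused reference: addmm followed by an activation.

    Assumes ReLU when use_gelu is False and tanh-approximated GELU otherwise.
    """
    act = gelu_tanh if use_gelu else (lambda v: max(v, 0.0))
    return [[act(v) for v in row] for row in addmm(bias, mat1, mat2)]


if __name__ == "__main__":
    bias = [0.0, 0.0]
    mat1 = [[1.0, 0.0]]
    mat2 = [[1.0, -1.0], [0.0, 0.0]]
    print(addmm_activation_reference(bias, mat1, mat2, use_gelu=True))
```

With both paths using the same GELU variant, a test could compare against a single reference on every platform instead of branching per platform.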

eellison · 2023-06-22T16:49:45Z

@aakhundov if HUD is in a bad state, maybe it makes sense to land this if it fixes the tests, then make that update in a separate PR?

aakhundov · 2023-06-22T17:21:25Z

@pytorchbot merge

pytorchmergebot · 2023-06-22T17:23:20Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


pytorchmergebot · 2023-06-22T17:33:29Z

Merge failed

Reason: 1 jobs have failed, first few of them are: periodic / ios-12-5-1-x86-64-coreml / build (default, 1, 1, macos-12)


aakhundov · 2023-06-22T17:36:40Z

@huydhn @malfet @eellison the merge failed again due to the periodic / ios-12-5-1-x86-64-coreml / build (default, 1, 1, macos-12) job, which also seems unrelated. Any hint on how to bypass / fix this? Thanks!

eellison · 2023-06-22T17:37:35Z

You can merge past failures with merge -f {reason}

aakhundov · 2023-06-22T17:39:03Z

@pytorchbot merge -f "The failing CI job seems unrelated, merging to fix the HUD."

pytorchmergebot · 2023-06-22T17:42:28Z

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).


aakhundov requested review from lezcano, nikitaved and IvanYashchuk as code owners June 22, 2023 09:19

pytorch-bot bot added the "topic: not user facing" label Jun 22, 2023

aakhundov requested review from huydhn and eellison June 22, 2023 09:20

aakhundov mentioned this pull request Jun 22, 2023

Enable addmm + GELU epilogue fusion via cuBLASLt #103811

Closed

aakhundov removed the request for review from lezcano June 22, 2023 09:22

aakhundov self-assigned this Jun 22, 2023

huydhn added the ciflow/periodic and test-config/default labels Jun 22, 2023

huydhn approved these changes Jun 22, 2023

malfet approved these changes Jun 22, 2023

pytorchmergebot added the merging label Jun 22, 2023

pytorchmergebot removed the merging label Jun 22, 2023

pytorch-bot bot added the ciflow/trunk label Jun 22, 2023

pytorchmergebot added the merging label Jun 22, 2023

pytorchmergebot removed the merging label Jun 22, 2023

pytorchmergebot added the merging label Jun 22, 2023

pytorchmergebot added Merged and removed merging labels Jun 22, 2023

pytorchmergebot closed this in f818036 Jun 22, 2023

facebook-github-bot deleted the gh/aakhundov/4/head branch June 26, 2023 14:16
