S390x complex division #108516

AlekseiNikiforovIBM
Contributor

@AlekseiNikiforovIBM AlekseiNikiforovIBM commented Sep 4, 2023

Adopt algorithm from AVX2 implementation.
This change fixes test test_complex_div_underflow_overflow_cpu_complex128
from test/test_binary_ufuncs.py

At the same time it breaks some of Arithmetics/*.Division tests
from vec_test_all_types_ZVECTOR,
but it's also broken on AVX2 and AVX512.
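
To illustrate why a scaled algorithm helps with the underflow/overflow test, here is a minimal scalar sketch in C++. It assumes the approach suggested by the AVX2 `scale` line quoted later in this thread: multiply both operands by 1/max(|c|, |d|) before forming c^2 + d^2. The helper names `scaled_complex_div` and `naive_complex_div` are hypothetical, and this is not the vectorized PyTorch code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>

// Hypothetical helper, not a PyTorch function.
// Computes (a + bi) / (c + di) after scaling both operands by
// 1 / max(|c|, |d|), so the squared terms stay in range.
inline std::complex<double> scaled_complex_div(std::complex<double> x,
                                               std::complex<double> y) {
  const double sc = std::max(std::fabs(y.real()), std::fabs(y.imag()));
  // Sketch limitation: a zero divisor would make the scale infinite.
  assert(sc > 0.0);
  const double inv = 1.0 / sc;
  const double a = x.real() * inv, b = x.imag() * inv;
  const double c = y.real() * inv, d = y.imag() * inv;
  const double denom = c * c + d * d;  // roughly in [1, 2] after scaling
  return std::complex<double>((a * c + b * d) / denom,
                              (b * c - a * d) / denom);
}

// Textbook formula for comparison: c^2 + d^2 can overflow or underflow
// even when the true quotient is representable.
inline std::complex<double> naive_complex_div(std::complex<double> x,
                                              std::complex<double> y) {
  const double a = x.real(), b = x.imag();
  const double c = y.real(), d = y.imag();
  const double denom = c * c + d * d;
  return std::complex<double>((a * c + b * d) / denom,
                              (b * c - a * d) / denom);
}
```

Under IEEE 754 double arithmetic, dividing (1e300 + 1e300i) by itself with the textbook formula forms an infinite c^2 + d^2 and yields NaN, while the scaled version stays finite. The sketch does not handle a zero divisor or a scale small enough that 1/sc itself overflows.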

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10

@pytorch-bot

pytorch-bot bot commented Sep 4, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/108516

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit c524f2f with merge base c6f435b:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@github-actions github-actions bot added the module: cpu CPU specific problem (e.g., perf, algorithm) label Sep 4, 2023
@ezyang
Contributor

ezyang commented Sep 5, 2023

So what's the plan for the tests?

@ezyang ezyang added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Sep 5, 2023
@ezyang ezyang self-requested a review September 5, 2023 01:15
@AlekseiNikiforovIBM
Contributor Author

For now, we first need to figure out how these tests are actually supposed to behave on x86.

@ezyang
Contributor

ezyang commented Sep 5, 2023

I think it's definitely possible we did it wrong on X86. Let us know what your analysis concludes.

@AlekseiNikiforovIBM
Contributor Author

AlekseiNikiforovIBM commented Sep 26, 2023

I took one more look, but I don't know where the error is. In the end, it's a matter of precision, and different tests expect different precision. I'd like s390x to behave the same way as x86 for now. Also, I'll split the test update out into a separate PR, since it might break CI.

@AlekseiNikiforovIBM
Contributor Author

I found that the division algorithms were updated in #93277, but the tests were not. I have now updated the tests. They pass on s390x, but unfortunately they still fail on x86_64. The issue is in these 2 lines:

https://github.com/pytorch/pytorch/pull/108516/files#diff-8b434aee6409eded80b20e81b7d0e27336e4f6c7e4bba0f2e50ced64354f23bcR1230

auto scale = _mm256_rcp_ps(_mm256_max_ps(fabs_cd, fabs_dc)); // 1/sc 1/sc

Unfortunately, _mm256_rcp_ps computes only an approximate reciprocal, so its result has lower precision than a true division.
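
The precision gap can be measured with a short sketch. Intel documents a maximum relative error of 1.5 * 2^-12 for the reciprocal-approximation instructions. The code below uses `_mm_rcp_ss`, the scalar SSE counterpart of `_mm256_rcp_ps`, so that it builds without AVX flags; that substitution is an assumption made for portability, and the helper names `approx_rcp` and `rcp_relative_error` are hypothetical. On non-x86 builds the approximation is only simulated by truncating mantissa bits.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>  // _mm_set_ss, _mm_rcp_ss, _mm_cvtss_f32
#define SKETCH_HAVE_RCP 1
#endif

// Hypothetical helper: approximate 1/x for a positive, finite x.
inline float approx_rcp(float x) {
  assert(x > 0.0f && std::isfinite(x));
#ifdef SKETCH_HAVE_RCP
  // Hardware reciprocal approximation (scalar form of the packed rcp).
  return _mm_cvtss_f32(_mm_rcp_ss(_mm_set_ss(x)));
#else
  // Simulation only (assumption): keep the top 12 of 23 mantissa bits
  // of the exact reciprocal to mimic a low-precision result.
  float r = 1.0f / x;
  std::uint32_t bits;
  std::memcpy(&bits, &r, sizeof bits);
  bits &= 0xFFFFF800u;
  std::memcpy(&r, &bits, sizeof r);
  return r;
#endif
}

// Relative error of the approximation against a double-precision 1/x.
inline double rcp_relative_error(float x) {
  const double exact = 1.0 / static_cast<double>(x);
  return std::fabs((static_cast<double>(approx_rcp(x)) - exact) / exact);
}
```

Calling `rcp_relative_error(3.0f)` shows how far the approximation is from a true division; a test that expects results accurate to float rounding (about 2^-24) can fail against a reciprocal that is only guaranteed to about 2^-12.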

@ezyang
Contributor

ezyang commented Nov 7, 2023

Well, if you want to land this for s390x, it's fine to keep the old x86 implementation and ifdef only this change.
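
One possible reading of this suggestion, sketched below, is a compile-time guard that enables the new behavior only on s390x and leaves every other architecture untouched. The function name `complex_div_for_this_arch` is hypothetical, `std::complex` division is only a placeholder for the existing x86 path, and what the PR actually guards may differ. `__s390x__` is the macro GCC and Clang predefine when targeting s390x.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>

// Hypothetical name; stands in for whichever routine gets guarded.
inline std::complex<float> complex_div_for_this_arch(std::complex<float> x,
                                                     std::complex<float> y) {
  // Sketch limitation: a zero divisor is not handled.
  assert(y != std::complex<float>(0.0f, 0.0f));
#if defined(__s390x__)
  // s390x only: scaled division, as in the new algorithm.
  const float sc = std::max(std::fabs(y.real()), std::fabs(y.imag()));
  const float inv = 1.0f / sc;
  const float a = x.real() * inv, b = x.imag() * inv;
  const float c = y.real() * inv, d = y.imag() * inv;
  const float denom = c * c + d * d;
  return std::complex<float>((a * c + b * d) / denom,
                             (b * c - a * d) / denom);
#else
  // Every other architecture keeps its previous behavior
  // (placeholder: plain std::complex division).
  return x / y;
#endif
}
```

The guard keeps the diff for x86 empty, so the change cannot alter results there, at the cost of the two paths diverging until the x86 precision question is settled.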

Division algorithms were updated in
pytorch#93277
but tests were not.
@AlekseiNikiforovIBM
Contributor Author

@ezyang, like this?

Contributor

@ezyang ezyang left a comment

yeah sure

@ezyang
Contributor

ezyang commented Nov 7, 2023

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Nov 7, 2023
@pytorchmergebot
Collaborator

Merge failed

Reason: This PR needs a release notes: label
If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Details for Dev Infra team Raised by workflow job

@ezyang
Contributor

ezyang commented Nov 7, 2023

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot
Collaborator

Merge failed

Reason: 1 mandatory check(s) failed. The first few are:

Dig deeper by viewing the failures on hud

Details for Dev Infra team Raised by workflow job

Failing merge rule: Core Maintainers

@ezyang
Contributor

ezyang commented Nov 8, 2023

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


Skylion007 pushed a commit to Skylion007/pytorch that referenced this pull request Nov 14, 2023
Adopt algorithm from AVX2 implementation.
This change fixes test test_complex_div_underflow_overflow_cpu_complex128
from test/test_binary_ufuncs.py

At the same time it breaks some of Arithmetics/*.Division tests
from vec_test_all_types_ZVECTOR,
but it's also broken on AVX2 and AVX512.

Pull Request resolved: pytorch#108516
Approved by: https://github.com/ezyang
Labels
ciflow/trunk Trigger trunk jobs on your pull request
Merged
module: cpu CPU specific problem (e.g., perf, algorithm)
open source
release notes: performance_as_product release notes category
topic: new features topic category
triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

4 participants