Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong vector shift results on PowerPC #109777

Open
Flamefire opened this issue Sep 21, 2023 · 2 comments
Open

Wrong vector shift results on PowerPC #109777

Flamefire opened this issue Sep 21, 2023 · 2 comments
Labels
module: POWER Issues specific to the POWER/ppc architecture module: vectorization Related to SIMD vectorization, e.g., Vec256 triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Comments

@Flamefire
Copy link
Collaborator

馃悰 Describe the bug

This is basically #70904 which is still present for PPC after #98511 added vector operations.

Basically: Shifting a vector with out-of-bounds values results in undefined behavior and sporadic wrong results.
E.g. test_non_contig_bitwise_right_shift_cpu_int32 fails because of right shifts with too-large values and test_shift_limits_cpu_int16 fails due to left shifts with negative values. The latter is very much understandable given that #98511 uses a reinterpret_cast to the unsigned vector type

Reproduced with python run_test.py -i test_binary_ufuncs which fails

  • test_contig_vs_every_other_bitwise_right_shift_cpu_int16
  • test_contig_vs_every_other_bitwise_right_shift_cpu_int32
  • test_contig_vs_every_other_bitwise_right_shift_cpu_int64
  • test_non_contig_bitwise_right_shift_cpu_int16
  • test_non_contig_bitwise_right_shift_cpu_int32
  • test_non_contig_bitwise_right_shift_cpu_int64
  • test_shift_limits_cpu_int16
  • test_shift_limits_cpu_int32
  • test_shift_limits_cpu_int64

Versions

Using PyTorch 2.0.1 with the above PRs applied as patches.

@ezyang ezyang added triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module module: vectorization Related to SIMD vectorization, e.g., Vec256 module: POWER Issues specific to the POWER/ppc architecture labels Sep 21, 2023
@quickwritereader
Copy link
Contributor

quickwritereader commented Sep 21, 2023

I see. could you remove the reinterpret_cast s and see if it complies to the desired.

@Flamefire
Copy link
Collaborator Author

I see. could you remove the reinterpret_cast s and see if it complies to the desired.

No that does not work because vec_sl is only supported for unsigned rhs which matches C++ specs where negative shifts and shifts greater than the bitsize are UB.

My current workaround is the first approach of #98511 with an #ifdef for the shift kernels using cpu_kernel with the explicit checks for wrong shift values introduced in #96659

An alternative I can see is to implement the vectorized version of the checks present in e.g. the AVX code for VSX. And likely use vec_sla/vec_sra instead of vec_sl/vec_sr as seemingly arithmetic shifts are intended

Flamefire added a commit to Flamefire/pytorch that referenced this issue Sep 22, 2023
Similar to pytorch#96659 this implements the conditionals handling the
out-of-limit values in the shift amounts (rhs) for the vectorized VSX
code using the same logic as the scalar code.

Fixes pytorch#109777
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module: POWER Issues specific to the POWER/ppc architecture module: vectorization Related to SIMD vectorization, e.g., Vec256 triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants