v3.1.0 volk_32fc_s32f_atan2_32f.h avx2 and avx2_fma kernels return NaN for an input element 0+0j #730

Closed
jj1bdx opened this issue Dec 17, 2023 · 1 comment · Fixed by #731

jj1bdx commented Dec 17, 2023

Synopsis

VOLK v3.1.0 volk_32fc_s32f_atan2_32f() returns -nan for the following kernels when an input element is 0+0j:

  • avx2 (a_avx2, u_avx2)
  • avx2_fma (a_avx2_fma, u_avx2_fma)

VOLK v3.1.0 volk_32fc_s32f_atan2_32f() returns zero for the following kernels when an input element is 0+0j:

  • generic
  • polynomial

(where j^2 = -1)

Tested on Ubuntu 22.04.3 x86_64.

Test details are available at: https://gist.github.com/jj1bdx/62e27aac4b54a29dfafe210c73c49b0e

What should be done

The avx2 and avx2_fma kernels should behave the same as the generic and polynomial kernels, i.e., return zero when the input is 0+0j. For now I have to screen for this irregularity in my own application, as in this example, which costs performance and adds unnecessary complexity.
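
For readers who need a stopgap, the kind of screening described above could look roughly like the following. This is a minimal sketch, not the code from the linked example: `replace_nan_with_zero` is a hypothetical helper name, and it assumes the kernel output is already in a plain `float` buffer of `num_points` elements.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper (not part of VOLK): after calling the atan2 kernel,
 * overwrite every NaN in the output buffer with 0.0f so the result matches
 * what the generic kernel returns for a 0+0j input.
 * Returns the number of elements that were replaced. */
size_t replace_nan_with_zero(float* out, size_t num_points)
{
    assert(out != NULL || num_points == 0);
    size_t replaced = 0;
    for (size_t i = 0; i < num_points; i++) {
        if (isnan(out[i])) {
            out[i] = 0.0f;
            replaced++;
        }
    }
    return replaced;
}
```

The extra pass over the output buffer is exactly the overhead this issue asks to remove, and it cannot tell a NaN caused by a 0+0j input apart from a NaN that was already present in the input.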

I would appreciate it if the original author of the kernel would kindly help fix this bug.

References

  • Issue new kernels for atan2 #636
  • Speeding up atan2f by 50x
    • In the Edge Cases section of this article, the author explicitly states that: "All our functions do not handle inputs containing infinity, or when both x and y are 0, while conforming implementations must handle them specially". So, this is assumed to be a known issue.

jdemel commented Jan 7, 2024

Thanks for this detailed bug report. I agree, all kernels should follow the generic behavior.
