New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
volk_32f_cos_32f/volk_32f_sin_32f error #52
Comments
volk version is 1.1 |
Yikes! That looks pretty bad. Thanks for the report. I'll look in to this. Just to clarify, it's passing QA and volk_profile gives no errors, but the result is negative for those 3 chunks of angles? |
Here's the error in tabular format. angle = 0, 0.000000, sin = 0.000000, volk sin = 0.000000, cos = 1.000000, volk cos = 1.000000 |
OK, just got around to this today. My first step whenever I look at this stuff is to figure out why our QA fails. In this case it's because our auto-generated numbers are (-1,1). When I change that to (-pi, pi) I get failures for the trig functions (tan is also off, tanh is too, but that's just a tolerance problem for high input vals). Unfortunately I haven't seen an obvious transition/pattern with where the sign errors are (please tell me if you do, which may help exploring the math). I didn't actually write the original code, so it'll take another night of re-deriving the appromixations. |
For cosine, the sign errors are from 3/8 pi to 1/2 pi and 11/8 pi to 3/2 pi. For sine, the sign errors are from 7/8 pi to pi and 15/8 pi to 2 pi. |
Just looked through the code and tried to fix it, no luck :/ But I don't think that sse 4.1 is necessary, sse 2 should be sufficient for this kernel. I could be wrong! |
The value of q is supposed to be floored rather than rounded. This fixes the issue in the cosine kernel and adds a generic version using the same approximation for the cosine as well as updating some constants to be more precise for the cosine.
@stwunsch, I think you're correct that we only need SSE2 for this, but that's an API change since the name of protokernels do get exported publicly. For now, I found the paper this is based on and discovered a missing floor operation on a value used in a if-branch to flip the sign. This should be fixed. Sorry for the long delay, and I appreciate the patience :-) While I was at it I updated the QA to test from [-pi, pi] so we test all around the unit circle. As a result these moved in to the "inaccurate" section of kernels because the normal tolerance is quite tight, but in reality these are very good approximations. I'm going to close this since I think the issue is fixed. Holler if you see any issues. |
Heh, ok... This will still fail QA because the randomly generated numbers provide some really small angles and the % difference used in the QA makes it tough to do these comparisons. That's been on my TODO list for a while and it looks like I have a good reason to do it now.. For now ignore really small errors even though the QA will fail. Example: |
Can you post the name of the paper? Thx in advance! |
Efficient evaluation methods of elementary functions suitable for SIMD computation. The basic idea is to get the form s + pi/4 * q, where q is an integer that represents the quadrant, then put use cosine double angle identity (3 times in this case) to put s in to a smaller range so it's good for a taylor series approximation and use q to recover the correct sign. Taking advantage of the double-angle multiple times (along with the math to unwind) to get a nicely linearized region is the clever bit. |
output phase error
input is -38.1703:-0.0746:-53.0061
please see the result in the attachment
The text was updated successfully, but these errors were encountered: