Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added tests for avx #49

Merged
merged 9 commits into from
Aug 23, 2020
Merged

Added tests for avx #49

merged 9 commits into from
Aug 23, 2020

Conversation

ronniec95
Copy link
Contributor

Hi Lokathor

I've managed to finish off the work to add avx(2) functionality. The following new types have been added
f32x8
f64x4
i64x4
u64x4

along with their corresponding tests. There's some annoying changes I had to do to make it work and there might be better optimisations that I could not figure out; though I think I've done the best that's possible with safe_arch.

Please review and let me know if there are issues. All the tests work and I've used it for some pricing simulations without issues so far.

@Lokathor
Copy link
Owner

Lokathor commented Aug 22, 2020

So when fusha tried to add acos they had some trouble on the i586 builds with the mask values being stored as floats, and converting some bit patterns to float doesn't preserve the bits exactly because a signaling nan pattern will be turned into a quiet nan pattern.

I won't have much time to check on this today, though I'll have some tomorrow, but that's one guess just looking at the fact that it's an i586 build that's failing.

The fix here is to make the masks their own type of values, and then on i586 the masks can always be stored as unsigned values instead of floats. Obviously this is a breaking change, though it is one i want to get around to doing.

For now... i guess just cfg out any tests that fail and their associated methods unless the appropriate x86/x64 CPU features are enabled.

We'll have to implement #43 and then we can probably remove the cfg restrictions

@ronniec95
Copy link
Contributor Author

ronniec95 commented Aug 22, 2020

Hey Lokathor

Apologies for the multiple commits. I was trying to replicate your build environment locally to ensure the tests passed in the same way. The sse and above functionality worked yesterday, but getting all the tests to pass in all environments and combinations was trickier.

Due to the from_bits/to_bits issue in #43 the Not() method for f64x4 is wrong which fails the ln() tests for that type in i586. For some curious reason for f32x8::Not trait works perfectly fine in the existing tests. Is this because of luck or ?

Outside that for any sse or above instruction set the tests all pass and the difference to std.rs is appropriately small. This seems like a good sign but let's test this on more examples to be sure. I've tried my default go-to pricing library and so far so good.

Feel free to clean things up as you see fit and I'd be interested in your next full release with this functionality in.

@Lokathor
Copy link
Owner

As many commits as you need is fine. I occasionally resort to "CI based coding", so I know how it goes, particularly when you're aiming at a target that your own machine isn't.

I can merge now if you like.

@ronniec95
Copy link
Contributor Author

Yep, great merge this in please and close the PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants