Add AVX/AVX2 version of functions #11

Closed
redorav opened this issue Mar 25, 2019 · 7 comments
@redorav
Owner

redorav commented Mar 25, 2019

No description provided.

@redorav redorav self-assigned this Mar 27, 2019
@diogovtx

I wonder if it would be worth having float8 (for 256-bit AVX) and float16 (for AVX-512) types. They would be a superset of actual HLSL, but could be handy. Maybe even add some calls to query their availability at runtime.

Just wondering if it would be possible and if there are negative implications to going down this path. Asking because hlslpp is a very accessible and readable way to write SIMD code in general. Probably the easiest I've used so far, tbh, and I'm getting great results. This could be a good way to extend it for the future, assuming it's possible.

@redorav
Owner Author

redorav commented Jul 14, 2019

@diogovtx Sure, I've always thought of extending hlsl++ to things that technically aren't in HLSL but derive easily from it. To some extent I already have; take the quaternion class, for example. I would like to do float8 and float16, although I'd need to work out how to support them on platforms that don't have such wide registers.

I'm not quite sure how to do that last part. I started on double vectors and got relatively far using AVX, but I don't have a solution for making them work on NEON. Should they even be supported there? Should I be able to mix and match scalar versions of the library?

@diogovtx

I guess you'd have to rely on the preprocessor to conditionally enable float8 and float16. On targets without wide registers they could simply be unavailable, or emulated using float4.

When they are enabled by the preprocessor, responsibility would fall on the programmer using hlslpp to ensure the supported path is taken at runtime. Some hlslpp runtime query calls could help with that, using API calls outside of the HLSL spec, of course.
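To make the preprocessor idea concrete, here is a minimal sketch, assuming a compiler that defines `__AVX__` when AVX code generation is enabled (GCC, Clang and MSVC do). None of these names come from hlslpp: `vec4_sketch`, `float8_sketch`, `load8`, `store8` and `add8` are hypothetical stand-ins. It shows one 8-wide type backed by a `__m256` register when available and emulated with two 4-wide halves otherwise.

```cpp
#include <cstddef>

// Compile-time selection: only pull in AVX intrinsics when the compiler targets AVX.
#if defined(__AVX__)
#include <immintrin.h>
#define SKETCH_HAS_NATIVE_FLOAT8 1
#else
#define SKETCH_HAS_NATIVE_FLOAT8 0
#endif

// Hypothetical 4-wide vector used for emulation; not hlslpp's float4.
struct vec4_sketch { float v[4]; };

inline vec4_sketch add4(const vec4_sketch& a, const vec4_sketch& b) {
    vec4_sketch r;
    for (std::size_t i = 0; i < 4; ++i) r.v[i] = a.v[i] + b.v[i];
    return r;
}

// Hypothetical 8-wide vector: one 256-bit register, or two 4-wide halves.
struct float8_sketch {
#if SKETCH_HAS_NATIVE_FLOAT8
    __m256 reg;
#else
    vec4_sketch lo, hi;
#endif
};

inline float8_sketch load8(const float* p) {
    float8_sketch r;
#if SKETCH_HAS_NATIVE_FLOAT8
    r.reg = _mm256_loadu_ps(p);  // unaligned load of 8 floats
#else
    for (std::size_t i = 0; i < 4; ++i) { r.lo.v[i] = p[i]; r.hi.v[i] = p[i + 4]; }
#endif
    return r;
}

inline void store8(float* p, const float8_sketch& a) {
#if SKETCH_HAS_NATIVE_FLOAT8
    _mm256_storeu_ps(p, a.reg);  // unaligned store of 8 floats
#else
    for (std::size_t i = 0; i < 4; ++i) { p[i] = a.lo.v[i]; p[i + 4] = a.hi.v[i]; }
#endif
}

inline float8_sketch add8(const float8_sketch& a, const float8_sketch& b) {
    float8_sketch r;
#if SKETCH_HAS_NATIVE_FLOAT8
    r.reg = _mm256_add_ps(a.reg, b.reg);
#else
    r.lo = add4(a.lo, b.lo);
    r.hi = add4(a.hi, b.hi);
#endif
    return r;
}
```

Calling code looks the same on both paths, e.g. `store8(out, add8(load8(a), load8(b)))`. The trade-off is that a binary built with AVX enabled will still fault on a CPU without it, which is why the runtime check stays the caller's responsibility.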

@redorav
Owner Author

redorav commented Jul 14, 2019

I'm not sure I understand what you mean by runtime query calls. Do you mean calls that tell the user whether there is hardware support for the feature they're trying to use?

@diogovtx

Yes. That's what I meant. Sorry if I wasn't clear.
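For illustration, a runtime query along those lines might look roughly like the sketch below. It assumes GCC or Clang on x86, where the `__builtin_cpu_init` and `__builtin_cpu_supports` builtins exist; on any other compiler or architecture it conservatively reports the baseline. `SimdLevel`, `query_simd_level` and `widest_float_lanes` are hypothetical names, not part of hlslpp.

```cpp
// Hypothetical feature levels, ordered from narrowest to widest.
enum class SimdLevel { Baseline, Avx, Avx2, Avx512F };

// Ask the CPU which level it supports at runtime.
// Assumption: GCC/Clang x86 builtins; everything else falls back to Baseline.
inline SimdLevel query_simd_level() {
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();  // harmless to call more than once
    if (__builtin_cpu_supports("avx512f")) return SimdLevel::Avx512F;
    if (__builtin_cpu_supports("avx2"))    return SimdLevel::Avx2;
    if (__builtin_cpu_supports("avx"))     return SimdLevel::Avx;
    return SimdLevel::Baseline;
#else
    return SimdLevel::Baseline;  // conservative answer when no query is available
#endif
}

// Map a level to the widest float vector it can hold in one register.
inline int widest_float_lanes(SimdLevel level) {
    switch (level) {
        case SimdLevel::Avx512F: return 16;  // 512-bit registers
        case SimdLevel::Avx2:
        case SimdLevel::Avx:     return 8;   // 256-bit registers
        default:                 return 4;   // 128-bit baseline
    }
}
```

A caller would branch once at startup, for example taking a float8 code path only when `widest_float_lanes(query_simd_level()) >= 8` and that path was also compiled in.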

@redorav
Owner Author

redorav commented Oct 13, 2019

@diogovtx I have recently introduced the float8 type. Unfortunately, the nice swizzles that were possible with the smaller types would end up in a combinatorial explosion (8^8 combinations), so instead there's a templated swizzle function that serves that purpose. I took a look at OpenCL, but the docs don't make it clear what they provide; I think that, short of language support, this can't be done. Anyhow, the functionality is there, give it a spin if you want :)
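As a rough illustration of the idea (not hlslpp's actual API, whose signature may differ), a templated swizzle can take the eight source indices as template parameters, so a single function template covers all 8^8 = 16,777,216 combinations instead of one named member per combination. The `float8_lanes` type and `swizzle` function below are hypothetical and use a plain array rather than a SIMD register.

```cpp
#include <array>

// Hypothetical 8-wide vector stored as a plain array for portability.
struct float8_lanes { std::array<float, 8> lanes; };

// Compile-time check that every index is a valid lane (0..7).
constexpr bool all_in_range() { return true; }

template <typename... Rest>
constexpr bool all_in_range(int first, Rest... rest) {
    return first >= 0 && first < 8 && all_in_range(rest...);
}

// swizzle<I0, ..., I7>(v) returns a vector whose lane k is v.lanes[Ik].
template <int... I>
float8_lanes swizzle(const float8_lanes& v) {
    static_assert(sizeof...(I) == 8, "swizzle needs exactly 8 indices");
    static_assert(all_in_range(I...), "swizzle indices must be in [0, 7]");
    float8_lanes r = {{ v.lanes[I]... }};  // pack expansion picks one source lane per output lane
    return r;
}
```

For example, `swizzle<7, 6, 5, 4, 3, 2, 1, 0>(v)` reverses the lanes and `swizzle<0, 0, 0, 0, 0, 0, 0, 0>(v)` broadcasts the first one. Because the indices are compile-time constants, a SIMD implementation is free to pick a suitable shuffle or permute instruction per instantiation.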

@redorav
Owner Author

redorav commented Dec 22, 2022

Closing due to inactivity (and AVX has been implemented in many parts of the codebase, such as matrices and float8).

@redorav redorav closed this as completed Dec 22, 2022
2 participants