Add AVX/AVX2 version of functions #11

redorav · 2019-03-25T20:12:22Z

No description provided.

diogovtx · 2019-07-14T16:20:10Z

I wonder if it would be worth having float8 (for 256-bit AVX) and float16 (for AVX-512) sections. It would be a superset of actual HLSL, but could be handy. Maybe even add some calls to query their availability at runtime.

Just wondering whether it would be possible and whether there are negative implications to going down this path. Asking because hlslpp is a very accessible and readable way to write SIMD code in general. Probably the easiest I've used so far, tbh, and I'm getting great results. It could be a good way to extend it into the future, assuming it's possible.

redorav · 2019-07-14T16:49:47Z

@diogovtx Sure, I've always thought of extending hlsl++ to things that technically aren't in HLSL but easily derive from it. To some extent I already have; take, for example, the quaternion class. I would like to do float8 and float16, although I'd need to see how to support them on platforms that don't have such wide registers.

I'm not quite sure how to do that last part. I started doing double vectors and got it relatively far using AVX, but I don't have a solution on how I might make it work on NEON. Should it even be supported? Should I be able to mix and match scalar versions of the library?

diogovtx · 2019-07-14T17:16:58Z

I guess you'd have to rely on the preprocessor to conditionally enable float8 and float16. On platforms without the wide registers, they could simply be unavailable, or be emulated using float4.

When enabled by the preprocessor, responsibility would fall on the programmer using hlslpp to ensure the supported path is taken at runtime. Some hlslpp runtime query calls could help with that, using API calls outside of the HLSL spec, ofc.
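To make the two layers concrete, here is a minimal sketch of a compile-time gate plus a runtime query. It assumes the usual `__AVX__` and `__AVX512F__` predefined macros and the GCC/Clang builtins `__builtin_cpu_init` and `__builtin_cpu_supports` on x86. Every other name (`SKETCH_MAX_FLOAT_WIDTH`, `SimdLevel`, `query_simd_level`, `max_runtime_float_width`) is hypothetical and not part of hlslpp.

```cpp
// Hypothetical names throughout; none of these are hlslpp identifiers.

// Compile-time gate: the widest float vector the build was compiled for.
#if defined(__AVX512F__)
    #define SKETCH_MAX_FLOAT_WIDTH 16
#elif defined(__AVX__)
    #define SKETCH_MAX_FLOAT_WIDTH 8
#else
    #define SKETCH_MAX_FLOAT_WIDTH 4
#endif

enum class SimdLevel { Baseline, AVX, AVX2, AVX512F };

// Runtime query: what the CPU executing the program actually supports.
// Only implemented for GCC/Clang on x86; other targets report Baseline.
inline SimdLevel query_simd_level()
{
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx512f")) return SimdLevel::AVX512F;
    if (__builtin_cpu_supports("avx2"))    return SimdLevel::AVX2;
    if (__builtin_cpu_supports("avx"))     return SimdLevel::AVX;
#endif
    return SimdLevel::Baseline;
}

// Widest float vector that is safe to use at the given level.
inline int max_runtime_float_width(SimdLevel level)
{
    switch (level)
    {
        case SimdLevel::AVX512F: return 16;
        case SimdLevel::AVX2:
        case SimdLevel::AVX:     return 8;
        default:                 return 4;
    }
}
```

A caller would compare `max_runtime_float_width(query_simd_level())` against the width of the code path it wants to run, and fall back to a float4 path when the CPU reports less than the build assumed.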

redorav · 2019-07-14T17:26:53Z

I'm not sure I understand what you mean by runtime query calls. Do you mean informing the user whether there is hardware support for the feature they're trying to use?

diogovtx · 2019-07-14T19:16:06Z

Yes. That's what I meant. Sorry if I wasn't clear.

redorav · 2019-10-13T12:48:10Z

@diogovtx I have recently introduced the float8 type. Unfortunately, the nice swizzles that were possible with smaller types end up in a combinatorial explosion (8^8), so instead there's a templated swizzle function that serves that purpose. I took a look at OpenCL and it's not clear from the docs what they provide; I think short of language support this can't be done. Anyhow, the functionality is there, give it a spin if you want :)
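As an illustration of why a template replaces named swizzle members at this width, here is a minimal scalar sketch. `Float8Sketch` and `swizzle_sketch` are hypothetical stand-ins on a plain array and are not hlslpp's actual float8 type or swizzle signature; a real implementation would presumably map the indices onto SIMD shuffles.

```cpp
#include <cstdint>

// Hypothetical stand-in for an 8-wide float vector (plain array, no SIMD).
struct Float8Sketch { float v[8]; };

// Each of the 8 output lanes can pick any of 8 input lanes, so naming every
// full-width swizzle as a member would need 8^8 distinct names.
constexpr std::uint64_t ipow(std::uint64_t base, unsigned exp)
{
    return exp == 0 ? 1 : base * ipow(base, exp - 1);
}
static_assert(ipow(8, 8) == 16777216, "8^8 full-width swizzle combinations");

// One function template covers all of them: the lane indices are
// compile-time arguments instead of member names.
template <int I0, int I1, int I2, int I3, int I4, int I5, int I6, int I7>
inline Float8Sketch swizzle_sketch(const Float8Sketch& a)
{
    static_assert(I0 >= 0 && I0 < 8 && I1 >= 0 && I1 < 8 &&
                  I2 >= 0 && I2 < 8 && I3 >= 0 && I3 < 8 &&
                  I4 >= 0 && I4 < 8 && I5 >= 0 && I5 < 8 &&
                  I6 >= 0 && I6 < 8 && I7 >= 0 && I7 < 8,
                  "swizzle indices must be in [0, 7]");
    return Float8Sketch{{ a.v[I0], a.v[I1], a.v[I2], a.v[I3],
                          a.v[I4], a.v[I5], a.v[I6], a.v[I7] }};
}
```

For example, `swizzle_sketch<7, 6, 5, 4, 3, 2, 1, 0>(a)` reverses the lanes of `a`, and `swizzle_sketch<0, 0, 0, 0, 1, 1, 1, 1>(a)` broadcasts the first two lanes across each half.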

redorav · 2022-12-22T20:24:45Z

Closing due to inactivity (also, AVX has been implemented in many parts of the codebase, such as matrices and float8).

redorav added the feature label Mar 25, 2019

redorav self-assigned this Mar 27, 2019

redorav added performance and removed feature labels Dec 30, 2019

redorav closed this as completed Dec 22, 2022
