Support all fp types in GPU SparseTensorDenseMatMul #47419
Conversation
I'm investigating the test failures.
- Adds GPU registrations for double and complex types.
- Also adds correct conjugation in the GPU kernel.
Force-pushed from d499701 to e377b73
- Adds support for Eigen::half by accumulating into a temporary buffer of a higher-precision type (float) to maintain precision. Implementing this required some minor refactoring, which inflates the diff a bit.
Force-pushed from e377b73 to ab89002
I've fixed the complex type test failures (I had just missed #47355) and added support for float32 accumulation for float16 input/output, along with tests for float16.
Ah, the unit tests were soft-placing onto the CPU previously? Looks good.
The tests were running on the GPU; it was just that #47355 changed (fixed) the handling of complex conjugates (on CPU), and my local branch didn't have that commit.
When we build internally I see a bunch of references to quiet_NaN not being supported for complex<float> and complex<double>.
I see the float and double specializations there, but nothing for complex types. Do you know where it's supposed to be coming from?
- Eigen::NumTraits<T> is not implemented for std::complex, so we need to special-case it.
I didn't realize there was no NumTraits implementation for complex (I guess compiler warning flags are stricter internally?).
PiperOrigin-RevId: 362105835
Change-Id: I056a02b10f94e5033f940da4e14f630a4ae212b7
Adds Eigen::half for CPU, and Eigen::half, double, and complex types for GPU.
cc @nluehr