[Bug][Feature] Added more missing FP16 specializations #4140
3d913a9 to ad8972c
src/array/cuda/array_index_select.cu
#ifdef USE_FP16
// The initialization constructor for __half is apparently a device-
// only function in some setups, but the current function isn't run
// on the device.
I am not clear which 'current' function we are talking about here.
"the current function" here refers to the function containing this comment, the function that is currently being run (on the host) as this comment is passed. If I referred to it by name instead, i.e. `IndexSelect`, it might sound like the comment is referring to the other function named `IndexSelect`, above. Maybe it would be clear if I included both, though. 🤔 I'll try something when I add the "TODO"s.
I updated the comment. Hopefully it's a bit clearer now. Thanks!
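For readers following the thread, here is a minimal host-only sketch of one possible way to sidestep a device-only `__half` constructor: compute the 16-bit pattern on the host and carry it in a plain struct. This is not necessarily what the PR does. `HalfBits` and `FloatToHalfBits` are hypothetical names (the struct is loosely modelled on CUDA's `__half_raw`, which stores the bits in a member `x`), and the sketch assumes it is acceptable to flush results below the smallest normal half to zero.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a raw 16-bit half value; not part of the PR.
struct HalfBits {
  std::uint16_t x;
};

// Converts a float to IEEE 754 binary16 bits entirely on the host,
// rounding to nearest-even.
// Simplifying assumption: magnitudes below the smallest normal half
// (2^-14) are flushed to signed zero instead of producing subnormals.
inline HalfBits FloatToHalfBits(float f) {
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));

  const std::uint32_t sign = (bits >> 16) & 0x8000u;
  const std::uint32_t mag = bits & 0x7FFFFFFFu;
  std::uint32_t out = 0;

  if (mag > 0x7F800000u) {
    out = 0x7E00u;  // NaN maps to a quiet NaN
  } else if (mag >= 0x38800000u) {
    // Normal half range and above (including infinity).
    std::uint32_t r = mag - 0x38000000u;  // rebias exponent: 127 -> 15
    r += 0x0FFFu + ((r >> 13) & 1u);      // round to nearest, ties to even
    r >>= 13;
    if (r > 0x7C00u) r = 0x7C00u;         // overflow saturates to infinity
    out = r;
  }
  // else: too small for a normal half, flushed to zero (see assumption).

  return HalfBits{static_cast<std::uint16_t>(sign | out)};
}
```

In a CUDA build, a value like this could be passed to a kernel and reinterpreted on the device, so no `__half` object has to be constructed by host code.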
c36e7fb to c425866
Functionally it looks good--just the way some of the errors are reported needs updating.
…IndexSelect`, `Full`, `Scatter_`, `CSRGetData`, `CSRMM`, `CSRSum`, `IndexSelectCPUFromGPU`
* Fixed casting issue in `_LinearSearchKernel` that was preventing it from supporting `__half`
* Added `#if`'d out specializations of `CSRGEMM`, `CSRGEAM`, and `Xgeam`, which would require functions that aren't currently provided by cublas
…ations of Xgeam, CSRGEMM, and CSRGEAM
* Added clearer comment explaining why the cast to long long is necessary
…f can't be constructed on the host side
* Also changed the existing Xgeam function for unsupported data types from LOG(INFO) to LOG(FATAL)
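The cast to `long long` mentioned in the commits above can be illustrated with a small host-only sketch. It assumes a type that, like `__half` in some CUDA versions, has converting constructors for `int` and `long long` but none for `long`. On platforms where `int64_t` is `long`, constructing such a type directly from an `int64_t` is ambiguous, so the value is cast explicitly first. `FakeHalf` and `PickWeightOrIndex` are hypothetical names, not identifiers from the PR.

```cpp
#include <cstdint>

// Hypothetical stand-in for __half: several integer constructors,
// but deliberately none that takes `long`.
struct FakeHalf {
  float value;
  FakeHalf(int v) : value(static_cast<float>(v)) {}
  FakeHalf(unsigned int v) : value(static_cast<float>(v)) {}
  FakeHalf(long long v) : value(static_cast<float>(v)) {}
  FakeHalf(unsigned long long v) : value(static_cast<float>(v)) {}
  FakeHalf(float v) : value(v) {}
};

// Returns weights[v] when weights are given, otherwise the index itself
// converted to DType. Writing DType(v) directly can fail to compile for
// FakeHalf where int64_t is `long`, because every integer constructor is
// an equally good conversion. Casting to long long picks one constructor
// unambiguously on every platform.
template <typename DType>
DType PickWeightOrIndex(const DType* weights, std::int64_t v) {
  using LongLong = long long;
  return weights ? weights[v] : DType(LongLong(v));
}
```

The same template still works for built-in types such as `float`, since the extra cast is a no-op conversion for them.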
7418f5c to 83d41e9
Description
* Added missing specializations for `__half` of `DLDataTypeTraits`, `IndexSelect`, `Full`, `Scatter_`, `CSRGetData`, `CSRMM`, `CSRSum`
* Fixed casting issue in `_LinearSearchKernel` that was preventing it from supporting `__half`
* Added `#if`'d out specializations of `Xgeam`, `CSRGEMM`, and `CSRGEAM`, which would require functions that aren't provided by cublas
Checklist
or have been fixed to be compatible with this change
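As a rough illustration of the `Xgeam` change described above (failing loudly for unsupported data types instead of only logging), here is a host-only sketch of the pattern: a generic template that rejects the type, plus specializations for the types a backend supports. Everything here is an assumption for illustration. `XgeamSketch`, `ScaledAdd`, and `HalfStandIn` are hypothetical names, the exception stands in for `LOG(FATAL)`, and the plain loop stands in for the cuBLAS call.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Generic fallback: no backend routine exists for T, so fail loudly
// (standing in for LOG(FATAL)) rather than logging and continuing.
template <typename T>
void XgeamSketch(T, const std::vector<T>&, T, const std::vector<T>&,
                 std::vector<T>*) {
  throw std::runtime_error("XgeamSketch: unsupported data type");
}

// Shared helper computing c = alpha * a + beta * b element-wise.
// A plain loop is used here in place of a real BLAS call.
template <typename T>
void ScaledAdd(T alpha, const std::vector<T>& a, T beta,
               const std::vector<T>& b, std::vector<T>* c) {
  if (a.size() != b.size()) throw std::invalid_argument("size mismatch");
  c->resize(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    (*c)[i] = alpha * a[i] + beta * b[i];
  }
}

// Supported types get explicit specializations.
template <>
void XgeamSketch<float>(float alpha, const std::vector<float>& a, float beta,
                        const std::vector<float>& b, std::vector<float>* c) {
  ScaledAdd(alpha, a, beta, b, c);
}

template <>
void XgeamSketch<double>(double alpha, const std::vector<double>& a,
                         double beta, const std::vector<double>& b,
                         std::vector<double>* c) {
  ScaledAdd(alpha, a, beta, b, c);
}

// Hypothetical 16-bit type with no specialization, like __half here.
struct HalfStandIn {
  unsigned short bits;
};
```

With this layout, calling `XgeamSketch<HalfStandIn>` compiles but throws at run time, which makes an unsupported code path visible immediately.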