[SYCL] Fix CUDA tests using bfloat16 #1421
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
@JackAKirk - Would you be able to help verify that these changes fix the problems seen in pre-commit CI? For example in intel/llvm#7563.

Yep sure, no problem.

Yes, with your two PRs joint_matrix_tensorcores_legacy.cpp and bfloat16_builtins.cpp are passing. Note that element_wise_all_ops_bf16.cpp does not run on CUDA. The changes in the tests here look good. bfloat16_type.cpp is failing locally for me too, as is Matrix/element_wise_wi_marray_legacy.cpp. Matrix/element_wise_wi_marray_legacy.cpp can probably be removed anyway, since I think we are going to scrap this functionality, at least for the time being: it is not very important, and we would have to create a whole new joint_matrix class in an experimental::cuda namespace to be consistent with scope conventions.

Thanks a ton for checking! 😄

bfloat16_type.cpp seems to have been split into a CUDA variant at some point too, as the CUDA backend needs SM80 on the corresponding device. I've therefore disabled CUDA in that test again and changed its CUDA variant slightly to check the SM version. See #1423. For SYCL/Matrix/element_wise_wi_marray_legacy.cpp I suspect there's just a missing
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
/verify with intel/llvm#7567
* [SYCL] Fix CUDA tests using bfloat16
* Add missing using in element_wise_wi_marray_legacy

Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
SYCL/BFloat16/bfloat16_builtins.cpp, SYCL/Matrix/element_wise_all_ops_bf16.cpp, and SYCL/Matrix/element_wise_wi_marray_legacy.cpp now use the correct extension namespaces, so that both `bfloat16` and the corresponding experimental builtins are available without namespace qualification.

SYCL/Matrix/joint_matrix_tensorcores_legacy.cpp now converts `bfloat` to `float` directly, using the existing `else`-branch, instead of using the removed `bfloat16::raw()` method and a custom conversion.

These test changes require intel/llvm#7567.
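
For readers unfamiliar with the raw-bits approach that the joint_matrix test moves away from, here is a minimal sketch of what a custom conversion over raw bfloat16 storage typically looks like. It is not the test's actual code: the helper names `bf16_bits_to_float` and `float_to_bf16_bits_truncate` are hypothetical, and the sketch assumes a 32-bit IEEE-754 `float`. It relies only on the fact that a bfloat16 value has the same layout as the upper 16 bits of a binary32 float.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper, for illustration only: widen raw bfloat16 storage
// bits to a float by placing them in the upper half of a 32-bit word.
// Assumes float is IEEE-754 binary32.
inline float bf16_bits_to_float(std::uint16_t raw) {
  std::uint32_t widened = static_cast<std::uint32_t>(raw) << 16;
  float result;
  // memcpy avoids the undefined behaviour of type-punning through pointers.
  std::memcpy(&result, &widened, sizeof(result));
  return result;
}

// Hypothetical inverse: keep the upper 16 bits of the float and drop the
// rest. This truncates the mantissa; it does not round to nearest.
inline std::uint16_t float_to_bf16_bits_truncate(float value) {
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  return static_cast<std::uint16_t>(bits >> 16);
}
```

Once the type provides its own conversion to `float`, a test can use that directly, and helpers of this kind (along with any accessor for the raw bits) become unnecessary.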