[CUDA] Failed to build crossEntropy with error in backend: Cannot select: t38: v2f16 = bitcast t37 #5969

skosnits · 2022-04-05T20:45:25Z

Describe the bug
[CUDA] Failed to build crossEntropy with error in backend: Cannot select: t38: v2f16 = bitcast t37

To Reproduce

1. Get a DPC++ build with CUDA support (see the Getting Started Guide).
2. Build the attached source file: crossEntropy_for_external.cpp.txt

clang++ -fsycl -fsycl-unnamed-lambda -fsycl-targets=nvptx64-nvidia-cuda-sycldevice crossEntropy_for_external.cpp -o crossEntropy.run  -lOpenCL

Error message:

clang++: warning: argument 'nvptx64-nvidia-cuda-sycldevice' is deprecated, use 'nvptx64-nvidia-cuda' instead [-Wdeprecated]
warning: linking module 'remangled-l64-signed_char.libspirv-nvptx64--nvidiacl.bc': Linking two modules of different target triples: 'remangled-l64-signed_char.libspirv-nvptx64--nvidiacl.bc' is 'nvptx64-unknown-nvidiacl' whereas 'crossEntropy_for_external.cpp' is 'nvptx64-nvidia-cuda'
 [-Wlinker-warnings]
1 warning generated.
fatal error: error in backend: Cannot select: t38: v2f16 = bitcast t37
  t37: f32,ch = CopyFromReg t0, Register:f32 %7
    t36: f32 = Register %7
In function: _ZZZ15loss_fwd_kernelIN2cl4sycl6detail9half_impl4halfEENS1_5eventERNS1_5queueEPT_PlSA_S9_S9_S9_S9_ENKUlRNS1_7handlerEE_clESC_ENKUlNS1_7nd_itemILi3EEEE_clESF_
llvm-foreach:
clang++: error: clang frontend command failed with exit code 70 (use -v to see invocation)
clang version 15.0.0 (https://github.com/intel/llvm.git 8ec97550b7a84e5d2483f4891059f80abd8db39c)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: <path to built sycl>
clang++: note: diagnostic msg: Error generating preprocessed source(s).

Environment

OS: Linux
Target device and vendor: Nvidia GPU
DPC++ version: https://github.com/intel/llvm.git 8ec9755
CUDA version:

nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+


zjin-lcf · 2022-04-06T03:05:08Z

I didn't observe any compile or run errors using the HIP backend.

bader · 2022-04-06T06:09:43Z

This looks like a bug in the NVPTX LLVM target back-end.

jchlanda · 2022-04-21T13:59:47Z

Wouldn't it be UB to typecast between float and 2 x half, as the example here does?
That said, I think the compiler should do better than hard crash on it. I've submitted a patch against the mainline here: https://reviews.llvm.org/D124171
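
For illustration only, here is a minimal host-side sketch of the distinction, assuming a C++17 compiler. The attached source is not reproduced in this thread, so the type and function names below are hypothetical, and `std::uint16_t` is used as a stand-in for the 16-bit storage of a `half`. Punning a `float` through a cast pointer to a two-half type breaks the aliasing rules, whereas copying the bytes with `std::memcpy` is well-defined and expresses the same 32-bit reinterpretation:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for two packed 16-bit half values (storage only).
struct Half2Bits {
    std::uint16_t elem[2];
};

static_assert(sizeof(Half2Bits) == sizeof(float),
              "a bit copy requires both types to have the same size");

// Well-defined: copy the object representation rather than casting pointers.
inline Half2Bits float_to_half2_bits(float f) {
    Half2Bits h;
    std::memcpy(&h, &f, sizeof(h));
    return h;
}

// Inverse direction: reassemble the float from the two 16-bit halves.
inline float half2_bits_to_float(Half2Bits h) {
    float f;
    std::memcpy(&f, &h, sizeof(f));
    return f;
}
```

Which half ends up in `elem[0]` depends on the byte order of the machine. In C++20 the same copy can be written with `std::bit_cast`, and SYCL 2020 provides `sycl::bit_cast` for device code; whether the kernel in question could use either is not something this thread establishes.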

zjin-lcf · 2022-04-21T14:55:39Z

The CUDA compiler supports the conversion.

jchlanda · 2022-04-21T14:59:44Z

I've run it on a titan machine with DPCPP just now:

# ./crossEntropy.run
Matrix size(bs * W * H) = 128 * 81 * 8732
LossNLL_FWD CPU time(ms)=. 2.01997
LossNLL_FWD GPU time(ms)=. 2.00132
FWDBandWidth = 179.963433 (GB / s), 386722816.000000
LossNLL_BWD CPU time(ms)=. 1.82208
LossNLL_BWD GPU time(ms)=. 1.80568
BWDBandWidth = 214.169755 (GB / s), 386722816.000000

Make sure NVPTX backend can handle bitcasting between `float` and `<2 x half>` types.

This was discovered through: intel/llvm#5969

I'm not suggesting that such bitcasts make much sense, but it feels like the compiler should not hard crash on them.

Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D124171

jchlanda · 2022-05-16T13:07:11Z

This has been merged into the sycl branch in 76d1f5e.


skosnits added the bug, cuda, and compiler labels Apr 5, 2022

skosnits added this to Needs triage in oneAPI DPC++ via automation Apr 5, 2022

skosnits changed the title ~~[CUDA] Failed to build crossEntropy_for_external.cpp with error in backend: Cannot select: t38: v2f16 = bitcast t37~~ [CUDA] Failed to build crossEntropy with error in backend: Cannot select: t38: v2f16 = bitcast t37 Apr 5, 2022

AerialMantis moved this from Needs triage to Selected in oneAPI DPC++ Apr 11, 2022

jchlanda closed this as completed May 16, 2022

oneAPI DPC++ automation moved this from Selected to Closed May 16, 2022
