__hmin/__hmax already defined on compute_cap 75 with newer driver version #762

opfromthestart · 2023-05-03T00:43:02Z

When I try to compile dfdx with the cuda feature enabled, I get the following error:

  --- stderr
  thread 'main' panicked at 'nvcc error while compiling "src/optim/adam/adam.cu":

  # stdout


  # stderr
  src/tensor_ops/utilities/compatibility.cuh(9): error: function "__hmax" has already been defined
    __attribute__((device)) __inline__ __attribute__((always_inline)) __half __hmax(__half a, __half b) {
                                                                             ^

  src/tensor_ops/utilities/compatibility.cuh(12): error: function "__hmin" has already been defined
    __attribute__((device)) __inline__ __attribute__((always_inline)) __half __hmin(__half a, __half b) {
                                                                             ^

  2 errors detected in the compilation of "src/optim/adam/adam.cu".

My guess is that it's related to the compatibility fix for compute_cap 75, which I think I had, but I updated my drivers so now it is already defined.


opfromthestart · 2023-05-03T00:50:51Z

nvidia-smi --query-gpu compute_cap --format=csv gives compute_cap 7.5
nvcc --version gives

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0

nvcc --list-gpu-code gives

sm_50
sm_52
sm_53
sm_60
sm_61
sm_62
sm_70
sm_72
sm_75
sm_80
sm_86
sm_87
sm_89
sm_90

coreylowman · 2023-05-03T12:40:46Z

Can you expand on what you mean by updated your drivers? I guess I had assumed all compute_caps of the same number would have similar issues, but you're still compiling with 75 and getting this error?

opfromthestart · 2023-05-03T13:47:04Z

My drivers were on version 525 and I had version 11.6 and 12.1 of all cuda-related libraries. I installed version 530 of the drivers and removed the 11.6 versions of the libraries, and that made the llama-dfdx example work.

opfromthestart · 2023-05-03T14:03:22Z

When I try to use the 525 version of the drivers, I get the following error:

Caused by:
  process didn't exit successfully: `/home/opfromthestart/rust/game/touhou-diff/target/release/build/dfdx-5455800ceba8656f/build-script-build` (exit status: 101)
  --- stdout
  cargo:rerun-if-changed=build.rs
  cargo:rustc-cfg=feature="nightly"
  cargo:rustc-env=CUDA_INCLUDE_DIR=/usr/local/cuda/include
  cargo:rerun-if-changed=src/tensor_ops/utilities/binary_op_macros.cuh
  cargo:rerun-if-changed=src/tensor_ops/utilities/compatibility.cuh
  cargo:rerun-if-changed=src/tensor_ops/utilities/cuda_utils.cuh
  cargo:rerun-if-changed=src/tensor_ops/utilities/unary_op_macros.cuh

  --- stderr
  thread 'main' panicked at 'assertion failed: `(left == right)`
    left: `"Failed to initialize NVML: Driver/library version mismatch"`,
   right: `"compute_cap"`', /home/opfromthestart/.cargo/git/checkouts/dfdx-318e6e5ad83eea79/5e2b93d/build.rs:132:17

That is why I upgraded to 530.
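
The panic above comes from an assertion on the first line of the nvidia-smi output, so a driver/library mismatch surfaces as an opaque assertion failure. Below is a minimal sketch of how a build script might report it as a readable error instead. It assumes the query output is a `compute_cap` header line followed by one value per GPU (as in the 7.5 result reported earlier). `parse_compute_cap` is a hypothetical helper for illustration, not the code in dfdx's build.rs.

```rust
/// Parse the output of `nvidia-smi --query-gpu=compute_cap --format=csv`.
/// Hypothetical helper for illustration; not the code in dfdx's build.rs.
/// Assumes a `compute_cap` header line followed by one value per GPU.
fn parse_compute_cap(output: &str) -> Result<u32, String> {
    let mut lines = output.lines().map(str::trim).filter(|l| !l.is_empty());

    // The first non-empty line should be the CSV header.
    let header = lines.next().ok_or("nvidia-smi printed nothing")?;
    if header != "compute_cap" {
        // e.g. "Failed to initialize NVML: Driver/library version mismatch"
        return Err(format!("nvidia-smi failed: {header}"));
    }

    // Take the first GPU's value and drop the dot: "7.5" becomes 75.
    let value = lines.next().ok_or("no GPU listed by nvidia-smi")?;
    let digits: String = value.chars().filter(|c| c.is_ascii_digit()).collect();
    digits
        .parse::<u32>()
        .map_err(|e| format!("bad compute_cap {value:?}: {e}"))
}
```

With this shape, the mismatch message would be returned as an `Err` that the build script can print along with a hint, for example to set the CUDA_COMPUTE_CAP environment variable that the build output shows being tracked.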

coreylowman · 2023-05-03T18:32:44Z

Ahh okay, so are you still having the original error then about hmin/hmax?

Maybe we should be hooking into driver versions instead of GPU_ARCH for the ifdefs? I wonder if that's available...

coreylowman · 2023-05-03T18:35:30Z

Hmm, it seems like getting the driver version is limited to runtime. 🤔
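
The driver version is only visible at runtime, but the toolkit release is available at build time from nvcc --version. Below is a minimal sketch of parsing it in a build script, assuming the `release X.Y` line format shown in the nvcc output earlier in this thread. `parse_nvcc_release` is a hypothetical helper. Whether the toolkit release is the right thing to gate the compat functions on is also an assumption, not something confirmed in this thread.

```rust
/// Extract (major, minor) from `nvcc --version` output, e.g. the line
/// "Cuda compilation tools, release 12.1, V12.1.105" yields (12, 1).
/// Hypothetical helper for illustration; not part of dfdx's build.rs.
fn parse_nvcc_release(output: &str) -> Option<(u32, u32)> {
    // Everything after the first "release " marker, e.g. "12.1, V12.1.105\n..."
    let rest = output.split("release ").nth(1)?;
    // Keep only the text before the following comma: "12.1"
    let version = rest.split(',').next()?.trim();
    // Split into major and minor parts at the first dot.
    let (major, minor) = version.split_once('.')?;
    Some((major.parse().ok()?, minor.parse().ok()?))
}
```

A build script could compare the result against a threshold and pass a `-D` define to nvcc so that compatibility.cuh skips its own `__hmax`/`__hmin` definitions. The exact threshold would need to be checked against the toolkit headers.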

9876691 · 2023-05-13T11:35:50Z

I also get this error. I set up a VS Code dev container with the following .devcontainer/devcontainer.json:

{
	"name": "Rust",
	"image": "nvidia/cuda:12.1.1-devel-ubuntu20.04", 
	
	"runArgs": [
		"--gpus",
		"all"
	],
	"features": {
		"ghcr.io/devcontainers/features/rust:1": {}
	}
}

Running nvidia-smi in the container gives:

root@e5d2279e80a7:/workspaces/dfdx# nvidia-smi
Sat May 13 11:23:01 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17   Driver Version: 525.105.17   CUDA Version: 12.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:1D:00.0  On |                  N/A |
|  0%   39C    P0    N/A /  90W |   1273MiB /  4096MiB |      6%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

Running cargo test --features "cuda" gives:

root@e5d2279e80a7:/workspaces/dfdx# cargo test --features "cuda"
   Compiling dfdx v0.11.2 (/workspaces/dfdx)
error: failed to run custom build command for `dfdx v0.11.2 (/workspaces/dfdx)`

Caused by:
  process didn't exit successfully: `/workspaces/dfdx/target/debug/build/dfdx-30e6be024c8b3335/build-script-build` (exit status: 101)
  --- stdout
  cargo:rerun-if-changed=build.rs
  cargo:rustc-env=CUDA_INCLUDE_DIR=/usr/local/cuda/include
  cargo:rerun-if-changed=src/tensor_ops/utilities/binary_op_macros.cuh
  cargo:rerun-if-changed=src/tensor_ops/utilities/compatibility.cuh
  cargo:rerun-if-changed=src/tensor_ops/utilities/cuda_utils.cuh
  cargo:rerun-if-changed=src/tensor_ops/utilities/unary_op_macros.cuh
  cargo:rerun-if-env-changed=CUDA_COMPUTE_CAP
  cargo:rustc-env=CUDA_COMPUTE_CAP=sm_61
  cargo:rerun-if-changed=src/optim/adam/adam.cu
  cargo:rerun-if-changed=src/optim/rmsprop/rmsprop.cu
  cargo:rerun-if-changed=src/optim/sgd/sgd.cu
  cargo:rerun-if-changed=src/tensor_ops/abs/abs.cu
  cargo:rerun-if-changed=src/tensor_ops/add/binary_add.cu
  cargo:rerun-if-changed=src/tensor_ops/add/scalar_add.cu
  cargo:rerun-if-changed=src/tensor_ops/attention_reshape/attention_reshape.cu
  cargo:rerun-if-changed=src/tensor_ops/axpy/axpy.cu
  cargo:rerun-if-changed=src/tensor_ops/bce/bce.cu
  cargo:rerun-if-changed=src/tensor_ops/boolean/boolean.cu
  cargo:rerun-if-changed=src/tensor_ops/choose/choose.cu
  cargo:rerun-if-changed=src/tensor_ops/clamp/clamp.cu
  cargo:rerun-if-changed=src/tensor_ops/cmp/cmp.cu
  cargo:rerun-if-changed=src/tensor_ops/conv2d/conv2d.cu
  cargo:rerun-if-changed=src/tensor_ops/convtrans2d/convtrans2d.cu
  cargo:rerun-if-changed=src/tensor_ops/cos/cos.cu
  cargo:rerun-if-changed=src/tensor_ops/div/binary_div.cu
  cargo:rerun-if-changed=src/tensor_ops/div/scalar_div.cu
  cargo:rerun-if-changed=src/tensor_ops/dropout/dropout.cu
  cargo:rerun-if-changed=src/tensor_ops/exp/exp.cu
  cargo:rerun-if-changed=src/tensor_ops/gelu/gelu.cu
  cargo:rerun-if-changed=src/tensor_ops/huber_error/huber_error.cu
  cargo:rerun-if-changed=src/tensor_ops/ln/ln.cu
  cargo:rerun-if-changed=src/tensor_ops/max_to/max_to.cu
  cargo:rerun-if-changed=src/tensor_ops/maximum/maximum.cu
  cargo:rerun-if-changed=src/tensor_ops/min_to/min_to.cu
  cargo:rerun-if-changed=src/tensor_ops/minimum/minimum.cu
  cargo:rerun-if-changed=src/tensor_ops/mul/binary_mul.cu
  cargo:rerun-if-changed=src/tensor_ops/mul/scalar_mul.cu
  cargo:rerun-if-changed=src/tensor_ops/nans_to/nans_to.cu
  cargo:rerun-if-changed=src/tensor_ops/negate/negate.cu
  cargo:rerun-if-changed=src/tensor_ops/pool2d/pool2d.cu
  cargo:rerun-if-changed=src/tensor_ops/pow/pow.cu
  cargo:rerun-if-changed=src/tensor_ops/recip/recip.cu
  cargo:rerun-if-changed=src/tensor_ops/relu/relu.cu
  cargo:rerun-if-changed=src/tensor_ops/roll/roll.cu
  cargo:rerun-if-changed=src/tensor_ops/select_and_gather/gather.cu
  cargo:rerun-if-changed=src/tensor_ops/select_and_gather/select.cu
  cargo:rerun-if-changed=src/tensor_ops/sigmoid/sigmoid.cu
  cargo:rerun-if-changed=src/tensor_ops/sin/sin.cu
  cargo:rerun-if-changed=src/tensor_ops/slice/slice.cu
  cargo:rerun-if-changed=src/tensor_ops/sqrt/sqrt.cu
  cargo:rerun-if-changed=src/tensor_ops/square/square.cu
  cargo:rerun-if-changed=src/tensor_ops/sub/binary_sub.cu
  cargo:rerun-if-changed=src/tensor_ops/sub/scalar_sub.cu
  cargo:rerun-if-changed=src/tensor_ops/sum_to/sum_to.cu
  cargo:rerun-if-changed=src/tensor_ops/tanh/tanh.cu
  cargo:rerun-if-changed=src/tensor_ops/upscale2d/upscale2d.cu

  --- stderr
  thread 'main' panicked at 'nvcc error while compiling "src/optim/adam/adam.cu":

  # stdout


  # stderr
  src/tensor_ops/utilities/compatibility.cuh(9): error: function "__hmax" has already been defined
    __attribute__((device)) __inline__ __attribute__((always_inline)) __half __hmax(__half a, __half b) {
                                                                             ^

  src/tensor_ops/utilities/compatibility.cuh(12): error: function "__hmin" has already been defined
    __attribute__((device)) __inline__ __attribute__((always_inline)) __half __hmin(__half a, __half b) {
                                                                             ^

  2 errors detected in the compilation of "src/optim/adam/adam.cu".
  ', build.rs:197:17
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

nvcc --version

root@e5d2279e80a7:/workspaces/dfdx# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0

coreylowman changed the title ~~Error compiling CUDA kernels~~ __hmin/__hmax already defined on compute_cap 75 with newer driver version May 5, 2023

coreylowman added the bug label May 5, 2023

coreylowman added a commit that referenced this issue May 18, 2023

#762 Removing __hmax and __hmin compat functions

8f06097

coreylowman mentioned this issue May 18, 2023

Removing __hmax and __hmin compat functions #788

Merged

coreylowman closed this as completed in #788 May 18, 2023

coreylowman added a commit that referenced this issue May 18, 2023

#762 Removing __hmax and __hmin compat functions (#788)

4c604b5

opfromthestart mentioned this issue Jan 15, 2024

CUDA kernels missing __hmin and __hmax #910

Open
