Incorrect PATH in Linux causes build to implode #781

Closed
sdake opened this issue May 11, 2023 · 3 comments · Fixed by #784

Comments

@sdake

sdake commented May 11, 2023

Hi there,

First off, awesome work!

I had not added nvcc to my PATH, so dfdx imploded during the build. It may be worth telling the user that nvcc could not be found.


ubuntu@instance-20230508-1136:~/repos/dfdx$ cargo clean
ubuntu@instance-20230508-1136:~/repos/dfdx$ nvcc
-bash: nvcc: command not found
ubuntu@instance-20230508-1136:~/repos/dfdx$ cargo build -F cuda
    Updating crates.io index
    Updating git repository `https://github.com/coreylowman/cudarc`
    Updating git repository `https://github.com/starkat99/half-rs.git`
  Downloaded cfg-if v1.0.0
  Downloaded either v1.8.1
  Downloaded num-complex v0.4.3
  Downloaded gemm-f64 v0.15.3
  Downloaded gemm-c64 v0.15.3
  Downloaded dyn-stack v0.9.0
  Downloaded ppv-lite86 v0.2.17
  Downloaded rand_chacha v0.3.1
  Downloaded rand v0.8.5
  Downloaded scopeguard v1.1.0
  Downloaded seq-macro v0.3.3
  Downloaded autocfg v1.1.0
  Downloaded bitflags v1.3.2
  Downloaded crossbeam-channel v0.5.8
  Downloaded crossbeam-epoch v0.9.14
  Downloaded rand_core v0.6.4
  Downloaded rand_distr v0.4.3
  Downloaded rayon-core v1.11.0
  Downloaded rayon v1.7.0
  Downloaded raw-cpuid v10.7.0
  Downloaded memoffset v0.8.0
  Downloaded reborrow v0.5.4
  Downloaded gemm-c32 v0.15.3
  Downloaded gemm v0.15.3
  Downloaded bytemuck v1.13.1
  Downloaded gemm-f16 v0.15.3
  Downloaded gemm-f32 v0.15.3
  Downloaded gemm-common v0.15.3
  Downloaded half v2.2.1
  Downloaded libm v0.2.6
  Downloaded lazy_static v1.4.0
  Downloaded glob v0.3.1
  Downloaded crossbeam-utils v0.8.15
  Downloaded crossbeam-deque v0.8.3
  Downloaded num_cpus v1.15.0
  Downloaded paste v1.0.12
  Downloaded libc v0.2.144
  Downloaded num-traits v0.2.15
  Downloaded 38 crates (1.9 MB) in 0.54s
   Compiling autocfg v1.1.0
   Compiling crossbeam-utils v0.8.15
   Compiling cfg-if v1.0.0
   Compiling libm v0.2.6
   Compiling libc v0.2.144
   Compiling scopeguard v1.1.0
   Compiling rayon-core v1.11.0
   Compiling paste v1.0.12
   Compiling either v1.8.1
   Compiling bitflags v1.3.2
   Compiling reborrow v0.5.4
   Compiling bytemuck v1.13.1
   Compiling lazy_static v1.4.0
   Compiling seq-macro v0.3.3
   Compiling rand_core v0.6.4
   Compiling ppv-lite86 v0.2.17
   Compiling cudarc v0.9.8 (https://github.com/coreylowman/cudarc?branch=dfdx-half#bb2d7009)
   Compiling glob v0.3.1
   Compiling raw-cpuid v10.7.0
   Compiling dyn-stack v0.9.0
   Compiling memoffset v0.8.0
   Compiling num-traits v0.2.15
   Compiling crossbeam-epoch v0.9.14
   Compiling dfdx v0.11.2 (/home/ubuntu/repos/dfdx)
   Compiling rand_chacha v0.3.1
   Compiling rand v0.8.5
   Compiling crossbeam-channel v0.5.8
   Compiling num_cpus v1.15.0
error: failed to run custom build command for `dfdx v0.11.2 (/home/ubuntu/repos/dfdx)`

Caused by:
  process didn't exit successfully: `/home/ubuntu/repos/dfdx/target/debug/build/dfdx-30e6be024c8b3335/build-script-build` (exit status: 101)
  --- stdout
  cargo:rerun-if-changed=build.rs
  cargo:rustc-env=CUDA_INCLUDE_DIR=/usr/local/cuda/include
  cargo:rerun-if-changed=src/tensor_ops/utilities/binary_op_macros.cuh
  cargo:rerun-if-changed=src/tensor_ops/utilities/compatibility.cuh
  cargo:rerun-if-changed=src/tensor_ops/utilities/cuda_utils.cuh
  cargo:rerun-if-changed=src/tensor_ops/utilities/unary_op_macros.cuh

  --- stderr
  thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 2, kind: NotFound, message: "No such file or directory" }', build.rs:139:22
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
warning: build failed, waiting for other jobs to finish...
ubuntu@instance-20230508-1136:~/repos/dfdx$ locate nvcc
/home/ubuntu/.local/lib/python3.10/site-packages/cmake/data/share/cmake-3.26/Modules/FindCUDA/run_nvcc.cmake
/home/ubuntu/.local/lib/python3.10/site-packages/torch/share/cmake/Caffe2/Modules_CUDA_fix/upstream/FindCUDA/run_nvcc.cmake
/usr/local/cuda-12.1/bin/__nvcc_device_query
/usr/local/cuda-12.1/bin/nvcc
/usr/local/cuda-12.1/bin/nvcc.profile
/usr/local/lib/python3.10/dist-packages/cmake/data/share/cmake-3.26/Modules/FindCUDA/run_nvcc.cmake
/usr/local/lib/python3.10/dist-packages/torch/share/cmake/Caffe2/Modules_CUDA_fix/upstream/FindCUDA/run_nvcc.cmake
/usr/share/doc/cuda-nvcc-12-1
/usr/share/doc/cuda-nvcc-12-1/changelog.Debian.gz
/usr/share/doc/cuda-nvcc-12-1/copyright
/var/cache/apt/archives/cuda-nvcc-12-1_12.1.105-1_amd64.deb
/var/lib/dpkg/info/cuda-nvcc-12-1.list
/var/lib/dpkg/info/cuda-nvcc-12-1.md5sums
ubuntu@instance-20230508-1136:~/repos/dfdx$ export PATH=$PATH:/usr/local/cuda-12.1/bin
ubuntu@instance-20230508-1136:~/repos/dfdx$ cargo build -F cuda
   Compiling num-traits v0.2.15
   Compiling crossbeam-deque v0.8.3
   Compiling dfdx v0.11.2 (/home/ubuntu/repos/dfdx)
   Compiling rayon-core v1.11.0
   Compiling num-complex v0.4.3
   Compiling half v2.2.1
   Compiling rand_distr v0.4.3
   Compiling rayon v1.7.0
warning: Compiled 48 cuda kernels in 1.152008619s
   Compiling gemm-common v0.15.3
   Compiling gemm-f32 v0.15.3
   Compiling gemm-c32 v0.15.3
   Compiling gemm-c64 v0.15.3
   Compiling gemm-f64 v0.15.3
   Compiling gemm-f16 v0.15.3
   Compiling gemm v0.15.3
    Finished dev [unoptimized + debuginfo] target(s) in 9.71s
ubuntu@instance-20230508-1136:~/repos/dfdx$

Thank you,
-steve

@coreylowman
Owner

Thanks for opening. We are just calling std::process::Command::new("nvcc") in our build.rs script, which depends on nvcc being on PATH.

I see a couple options here:

  1. The simplest is to improve the error message to something very explicit, like "nvcc executable not found on $PATH".
  2. Try calling something like locate nvcc on supported platforms and pass the found path to std::process::Command.
  3. Maybe both of the above

Thoughts?
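
For illustration, option 1 could look roughly like the sketch below. This is not the actual dfdx build.rs: the helper name `run_tool` and the exact message wording are assumptions made up for this example. It relies only on the fact that `std::process::Command` reports a missing executable as an `io::Error` of kind `NotFound`.

```rust
use std::io::ErrorKind;
use std::process::{Command, Output};

/// Hypothetical helper: runs `program` with `args` and converts a missing
/// executable into an explicit, user-facing message instead of a bare
/// "No such file or directory".
fn run_tool(program: &str, args: &[&str]) -> Result<Output, String> {
    Command::new(program).args(args).output().map_err(|err| {
        if err.kind() == ErrorKind::NotFound {
            // The OS could not find the executable in any PATH entry.
            format!("`{program}` executable not found on $PATH; install it or add its directory to PATH")
        } else {
            // Any other spawn failure (permissions, etc.) keeps its original cause.
            format!("failed to run `{program}`: {err}")
        }
    })
}

fn main() {
    match run_tool("nvcc", &["--version"]) {
        Ok(out) => println!("nvcc ran, exit status: {}", out.status),
        Err(msg) => eprintln!("{msg}"),
    }
}
```

In a build script, the `Err` branch would typically become a `panic!` carrying that message, so cargo prints it in the `--- stderr` section in place of the `unwrap()` panic shown in the log above.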

@sdake
Author

sdake commented May 12, 2023 via email

@coreylowman
Owner

For now I've just improved the error messages. I think locate is nice, but Windows doesn't have it, for example, and it's relatively straightforward to just have users add nvcc to their PATH.
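
As a side note on the portability concern, a lookup that does not depend on `locate` can be sketched with the standard library alone by walking the PATH entries. This is only an illustration, not something dfdx does: `find_in_path` is a hypothetical helper, and the `nvcc.exe` name on Windows is an assumption.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::PathBuf;

/// Hypothetical helper: returns the first directory entry in `path_var`
/// that contains a regular file named `exe`.
fn find_in_path(exe: &str, path_var: &OsStr) -> Option<PathBuf> {
    env::split_paths(path_var)
        .map(|dir| dir.join(exe))
        .find(|candidate| candidate.is_file())
}

fn main() {
    // Assumed executable names; adjust for the platform at hand.
    let name = if cfg!(windows) { "nvcc.exe" } else { "nvcc" };
    let path_var = env::var_os("PATH").unwrap_or_default();
    match find_in_path(name, &path_var) {
        Some(p) => println!("found nvcc at {}", p.display()),
        None => eprintln!("nvcc not found on PATH; add the CUDA bin directory to PATH"),
    }
}
```

Unlike `locate`, this only checks directories already on PATH, so it confirms or denies the precondition rather than discovering an install such as /usr/local/cuda-12.1/bin.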
