This repository has been archived by the owner on Mar 28, 2022. It is now read-only.

accel and nvptx64 linker issues #32

Closed
ehsanmok opened this issue Apr 12, 2018 · 11 comments

Comments

@ehsanmok

Hi

I can build the nvptx64 sub-crate, but cargo test fails on it with a "Link" error (when compiling ptx-builder v0.1.0). I have LLVM 6.0 and CUDA 8.0 installed, and I tried changing the linker in nvptx64-nvidia-cuda.json to llvm-linker(?!), but that didn't help.

Is it because of my GPU's architecture (Titan Xp), or something else?

Also, when I run cargo build on the root accel crate, the error is:

error: linking with cc failed: exit code: 1
.... OMITTED ....
note: /usr/bin/ld: cannot find -lcudart
/usr/bin/ld: cannot find -lcublas
collect2: error: ld returned 1 exit status

But I have them in my /usr/local/cuda/lib64 and /usr/local/cuda/include/.

Any idea how to resolve it?
Thanks

@termoshtt
Owner

/usr/bin/ld: cannot find -lcublas

This could be caused by

  • libcublas.a does not exist on your system
  • the environment variable $LIBRARY_PATH does not contain the directory where libcublas.a lives

I guess your $LIBRARY_PATH setting is wrong. If you prefer not to modify it, you can use $CUDA_LIBRARY_PATH instead.
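To illustrate how a variable like $CUDA_LIBRARY_PATH can reach the linker, here is a minimal sketch of a Cargo build script. It assumes the variable holds a list of directories separated the way $PATH is, and that /usr/local/cuda/lib64 is a reasonable fallback; both are assumptions for illustration, and the actual build script used by this repository may differ. The function name link_search_directives is hypothetical.

```rust
use std::env;

/// Turn a PATH-style list of directories into Cargo link-search directives.
/// Empty entries (e.g. from a trailing separator) are skipped.
fn link_search_directives(paths: &str) -> Vec<String> {
    env::split_paths(paths)
        .filter(|p| !p.as_os_str().is_empty())
        .map(|p| format!("cargo:rustc-link-search=native={}", p.display()))
        .collect()
}

fn main() {
    // Assumption: fall back to the conventional CUDA install location.
    let paths = env::var("CUDA_LIBRARY_PATH")
        .unwrap_or_else(|_| "/usr/local/cuda/lib64".to_string());

    // Each directive tells rustc to add one directory to the library search path.
    for directive in link_search_directives(&paths) {
        println!("{}", directive);
    }

    // These are the two libraries the failing link step asked for.
    println!("cargo:rustc-link-lib=dylib=cudart");
    println!("cargo:rustc-link-lib=dylib=cublas");
}
```

With a script along these lines, the "cannot find -lcudart" error goes away once the directory that holds libcudart and libcublas is listed in the variable, without touching system-wide linker settings.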

The Dockerfile of termoshtt/rust-cuda may help you set up your environment.

@ehsanmok
Author

ehsanmok commented Apr 12, 2018

Thanks! I ran ld -lcudart --verbose and saw that it was searching in the wrong places, so I symlinked the library with
sudo ln -s /usr/local/cuda-8.0/lib64/libcublas.so /usr/lib/libcublas.so and then ran sudo ldconfig. I did the same for libcudart and could then build the root accel crate successfully.

However, cargo test fails with

error: custom attribute panicked
 --> examples/add.rs:9:1
  |
9 | #[kernel]
  | ^^^^^^^^^
  |
  = help: message: Failed to compile: IOError((Link, Os { code: 2, kind: NotFound, message: "No such file or directory" }))

error: aborting due to previous error

error: Could not compile `accel`.

It seems to be a linker issue, and cargo test fails for nvptx with this error:

running 2 tests
    Blocking waiting for file lock on nvptx64-nvidia-cuda's sysroot
    Updating registry `https://github.com/rust-lang/crates.io-index`
   Compiling accel-core v0.2.0-alpha
   Compiling ptx-builder v0.1.0 (file:///tmp/ptx-builder.9Vy9DYkX6TOf)
    Finished release [optimized] target(s) in 0.22 secs
test compile_tmp ... FAILED
   Compiling accel-core v0.2.0-alpha
   Compiling ptx-builder v0.1.0 (file:///home/ilab/tmp/rust2ptx)
    Finished release [optimized] target(s) in 0.21 secs
test compile_path ... FAILED

failures:

---- compile_tmp stdout ----
        Already clean (dir = /tmp/ptx-builder.9Vy9DYkX6TOf/target)
thread 'compile_tmp' panicked at 'called `Result::unwrap()` on an `Err` value: IOError((Link, Os { code: 2, kind: NotFound, message: "No such file or directory" }))', libcore/result.rs:945:5
note: Run with `RUST_BACKTRACE=1` for a backtrace.

---- compile_path stdout ----
        thread 'compile_path' panicked at 'called `Result::unwrap()` on an `Err` value: IOError((Link, Os { code: 2, kind: NotFound, message: "No such file or directory" }))', libcore/result.rs:945:5


failures:
    compile_path
    compile_tmp

test result: FAILED. 0 passed; 2 failed; 0 ignored; 0 measured; 0 filtered out

and the accel-core build fails with the following (might be related to the arch?):

LLVM ERROR: Cannot select: intrinsic %llvm.nvvm.read.ptx.sreg.ntid.x
error: Could not compile `accel-core`.

cuda-sys and accel-derive pass their tests, though.

Any idea how to resolve them?
Thanks!

@ehsanmok ehsanmok changed the title nvptx64 link issue nvptx64 linker issue Apr 12, 2018
@ehsanmok ehsanmok changed the title nvptx64 linker issue accel and nvptx64 linker issues Apr 12, 2018
@termoshtt
Owner

Thanks for testing.

error: custom attribute panicked
test compile_tmp ... FAILED
test compile_path ... FAILED

These are raised during the link step. Could you check that the three commands ar, llvm-link, and llc are in your $PATH? If they exist, this is a bug.
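The Os { code: 2, kind: NotFound } payload in these panics is consistent with what Rust's std::process::Command returns when the executable it is asked to launch cannot be found. The sketch below checks the three commands that way. It is only an illustration, not the crate's own code: the helper name tool_is_missing is hypothetical, and passing --version is an assumption about how to probe each tool harmlessly.

```rust
use std::io::ErrorKind;
use std::process::{Command, Stdio};

/// Returns true when `tool` cannot be launched because no such executable
/// is found, which is the same NotFound error seen in the panic message.
fn tool_is_missing(tool: &str) -> bool {
    match Command::new(tool)
        .arg("--version") // assumption: a harmless flag for probing
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
    {
        Err(e) => e.kind() == ErrorKind::NotFound,
        // The process started, so the executable exists (whatever it exited with).
        Ok(_) => false,
    }
}

fn main() {
    for tool in ["ar", "llvm-link", "llc"].iter() {
        let state = if tool_is_missing(tool) { "missing" } else { "found" };
        println!("{}: {}", tool, state);
    }
}
```

Any command reported as missing here would make the link step fail in the same way.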

LLVM ERROR: Cannot select: intrinsic %llvm.nvvm.read.ptx.sreg.ntid.x

You need to compile accel-core for the nvptx64-nvidia-cuda target. This can be done using cargo-nvptx from the nvptx crate (created very recently in #29).

cd nvptx
cargo install  # cargo-nvptx is installed into ~/.cargo/bin
cd ../accel-core
cargo nvptx  # this will show the generated PTX

Be aware that the interface of cargo-nvptx will change (´・ω・`)
You can also build it using xargo directly:

cp nvptx/src/nvptx64-nvidia-cuda.json accel-core/
xargo rustc --target nvptx64-nvidia-cuda

@ehsanmok
Author

Thanks! It's strange that I get mismatched errors:

Could you check that the three commands ar, llvm-link, and llc are in your $PATH? If they exist, this is a bug.

I only have ar in my PATH.

cd nvptx
cargo install  # cargo-nvptx is installed into ~/.cargo/bin
cd ../accel-core
cargo nvptx  # this will show the generated PTX

The nvptx install was successful, though cargo nvptx in accel-core fails with

thread 'main' panicked at 'Link failed: IOError((Link, Os { code: 2, kind: NotFound, message: "No such file or directory" }))', libcore/result.rs:945:5 

I tried xargo rustc --target nvptx64-nvidia-cuda and it was successful,

though cargo test still fails on accel-core with

LLVM ERROR: Cannot select: intrinsic %llvm.nvvm.read.ptx.sreg.ntid.x
error: Could not compile `accel-core`.

and the root test fails as before.

@ehsanmok
Author

Here's my main issue with using nvptx on nightly 1.27. Did you run into such issues? If so, what did you do?

@termoshtt
Owner

The sample at japaric/nvptx uses japaric/core64, which is no longer updated. You can use the latest libcore from the main Rust repository, so I guess you can compile that sample by removing Xargo.toml.

The nvptx install was successful, though cargo nvptx in accel-core fails with

Does cargo test on the nvptx crate work? Anyway, this error message carries no useful information. I have to fix #33 first (´・ω・`)

@shritesh

I went through the same issues with the same error messages.

In my case, llvm-link and llc commands do not exist in my $PATH but they are hardcoded in nvptx::compile::Builder::link. I edited the commands to llvm-link-6.0 and llc-6.0 and it works on my machine. The Dockerfile has RUN bash -c 'for e in $(ls /usr/bin/ll*-6.0); do mv $e ${e%-6.0}; done' to do the same.

Hope this helps.

@termoshtt
Owner

I edited the commands to llvm-link-6.0 and llc-6.0 and it works on my machine.

Thanks for the comment. cargo-nvptx should detect this naming scheme, since such version suffixes are widely used. I will fix it.
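One way such detection could work is to try the bare command name first and then fall back to versioned names such as llc-6.0. The sketch below is only an illustration of that idea and is not the change that was later made to cargo-nvptx; the helper names candidate_names and resolve_tool, the suffix list, and the use of --version as a probe are all assumptions.

```rust
use std::process::{Command, Stdio};

/// Candidate executable names for `base`: the bare name first,
/// then `base-<version>` for each given version suffix.
fn candidate_names(base: &str, versions: &[&str]) -> Vec<String> {
    let mut names = vec![base.to_string()];
    names.extend(versions.iter().map(|v| format!("{}-{}", base, v)));
    names
}

/// Return the first candidate name that can actually be launched, if any.
fn resolve_tool(base: &str, versions: &[&str]) -> Option<String> {
    candidate_names(base, versions).into_iter().find(|name| {
        Command::new(name)
            .arg("--version") // assumption: a harmless flag for probing
            .stdout(Stdio::null())
            .stderr(Stdio::null())
            .status()
            .is_ok()
    })
}

fn main() {
    // Hypothetical suffix list; a real tool would likely cover more versions.
    let versions = ["6.0"];
    for base in ["llvm-link", "llc"].iter() {
        match resolve_tool(base, &versions) {
            Some(name) => println!("{} -> {}", base, name),
            None => println!("{} -> not found", base),
        }
    }
}
```

Trying the unsuffixed name first keeps the behaviour unchanged on systems, such as the Docker image, where the tools have already been renamed.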

@ehsanmok
Author

ehsanmok commented Apr 25, 2018

Does cargo test on the nvptx crate work?

With the changes @shritesh mentioned, the build and tests succeeded for me as well. But when I tried to build accel-core separately, it complained with LLVM ERROR: Cannot select: intrinsic %llvm.nvvm.read.ptx.sreg.ntid.x. The root accel build and tests, however, were successful.

@termoshtt
Owner

termoshtt commented May 12, 2018

llc-6.0-like naming is supported in #39.

LLVM ERROR: Cannot select: intrinsic %llvm.nvvm.read.ptx.sreg.ntid.x

accel-core cannot be compiled for x86. It must be compiled for the NVPTX target (defined in LLVM), and cargo-nvptx is the tool for that.

cd nvptx
cargo install --path .  # install cargo-nvptx into ~/.cargo/bin/
cd ../accel-core
cargo nvptx  # this will show compiled PTX text

as CI does:

cd nvptx
cargo install -f
cd ../accel-core
cargo nvptx

@termoshtt
Owner

Please re-open if this is not resolved.
