
Installation with CUDA11 #20

Closed
constantinpape opened this issue Sep 21, 2021 · 7 comments

@constantinpape

Hi,

I tried to install RAMA with CUDA 11, using the pip installation instructions (python -m pip install git+https://github.com/pawelswoboda/RAMA.git), with:

CUDA version 11.1
GCC version 10.2
CMake version 3.20

This fails with the following error:

  /tmp/pip-req-build-d9hjyjt3/external/cudaMST/gpuMST/gpuMST.cu(138): warning: function "__any"                                         
  /g/easybuild/x86_64/CentOS/7/skylake/software/CUDAcore/11.1.1/bin/../targets/x86_64-linux/include/device_atomic_functions.h(178): here was declared deprecated ("__any() is deprecated in favor of __any_sync() and may be removed in a future release (Use -Wno-deprecated-declarations to suppress this warning).")
  
  /g/easybuild/x86_64/CentOS/7/skylake/software/GCCcore/10.2.0/include/c++/10.2.0/tuple(566): error: pack "_UElements" does not have the same number of elements as "_Elements"
            detected during instantiation of "__nv_bool std::tuple<_Elements...>::__nothrow_constructible<_UElements...>() [with _Elements=<const thrust::device_vector<int, thrust::device_allocator<int>> &, const thrust::device_vector<int, thrust::device_allocator<int>> &, const thrust::device_vector<float, thrust::device_allocator<float>> &>, _UElements=<>]"
  /tmp/pip-req-build-d9hjyjt3/src/multicut_message_passing.cu(350): here
  
  /g/easybuild/x86_64/CentOS/7/skylake/software/GCCcore/10.2.0/include/c++/10.2.0/tuple(566): error: pack "_UElements" does not have the same number of elements as "_Elements"
            detected during:
              instantiation of "__nv_bool std::tuple<_Elements...>::__nothrow_constructible<_UElements...>() [with _Elements=<thrust::device_vector<int, thrust::device_allocator<int>> &, thrust::device_vector<int, thrust::device_allocator<int>> &, thrust::device_vector<float, thrust::device_allocator<float>> &>, _UElements=<>]"
  (1616): here
              instantiation of "std::tuple<_Elements &...> std::tie(_Elements &...) noexcept [with _Elements=<thrust::device_vector<int, thrust::device_allocator<int>>, thrust::device_vector<int, thrust::device_allocator<int>>, thrust::device_vector<float, thrust::device_allocator<float>>>]"
  /tmp/pip-req-build-d9hjyjt3/src/dCOO.cu(166): here

I also tried with CUDA 10, but ran into other issues. CUDA 11 would be much better for me, to stay compatible with our stack.

@aabbas90
Collaborator

aabbas90 commented Sep 21, 2021

Hi,

Thanks for reporting. Would it be possible for you to try with CUDA 11.2? It seems to be a known issue with CUDA 11.1, as reported on the NVIDIA forums. Thanks!

@aabbas90 aabbas90 self-assigned this Sep 21, 2021
@constantinpape
Author

constantinpape commented Sep 21, 2021

I see. Unfortunately we don't have 11.2 available on our cluster yet. I will check with our cluster admins whether it's possible to add 11.2.

@aabbas90
Collaborator

By the way, you can also install the CUDA toolkit locally. The driver does not need to be updated. I did it from:

https://developer.nvidia.com/cuda-11.2.2-download-archive?target_os=Linux&target_arch=x86_64&target_distro=Debian&target_version=10&target_type=runfilelocal

Then run:

bash <PATH_TO_ABOVE_FILE> --toolkit --installpath=<YOURINSTALLDIR>

and add the install path to the PATH and LD_LIBRARY_PATH environment variables.
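As a sketch of that last step (the install prefix below is hypothetical; substitute whatever you passed to --installpath):

```shell
# Hypothetical local install prefix; substitute your actual --installpath.
CUDA_HOME="$HOME/cuda-11.2"
export PATH="$CUDA_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64:${LD_LIBRARY_PATH:-}"
```

Putting these lines in your shell profile (or a module file) makes the local toolkit visible to pip's build step without touching the system CUDA.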

@constantinpape
Author

> By the way, you can also install the CUDA toolkit locally.

That's not really helpful with our cluster set-up: if I can't use one of our precompiled CUDA versions, I can't deploy this, and overall it's much more effort for me to set up.

We have a CUDA 11.4 setup now, and I can run the installation with it, but it fails at runtime:

Going to use GeForce RTX 2080 Ti 7.5, device number 0
Traceback (most recent call last):
  File "test_rama.py", line 10, in <module>
    node_labels = rama_py.rama_cuda(
MemoryError: std::bad_alloc: cudaErrorUnsupportedPtxVersion: the provided PTX was compiled with an unsupported toolchain.

@aabbas90
Collaborator

Sorry for the inconvenience.

The error with CUDA 11.4 suggests the CUDA driver is too old and does not support CUDA 11.4: for 11.4 you need a driver of version >= 470.42.01, as per the CUDA release notes. Could you please check with nvidia-smi whether the driver satisfies this? Thanks!
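A quick sketch of that comparison (on the target machine the version string would come from nvidia-smi, which isn't invoked here):

```shell
# Sketch: version-aware check against the CUDA 11.4 minimum driver (470.42.01).
# On the target machine the argument would come from:
#   nvidia-smi --query-gpu=driver_version --format=csv,noheader
driver_ok() {
    required="470.42.01"
    # sort -V orders version strings numerically; if the required version sorts
    # first (or is equal), the given driver is new enough.
    [ "$(printf '%s\n%s\n' "$required" "$1" | sort -V | head -n1)" = "$required" ]
}

if driver_ok "460.32.03"; then
    echo "460.32.03: ok for CUDA 11.4"
else
    echo "460.32.03: too old for CUDA 11.4 (need >= 470.42.01)"
fi
```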

@constantinpape
Author

> Sorry for the inconvenience.

No worries, that's the downside of custom CUDA code; installation is always a pain.

> The error with CUDA 11.4 seems like the CUDA driver is old and does not support CUDA 11.4. For 11.4 you need a driver of version >=470.42.01 as per CUDA release notes. Could you please check if the driver version satisfies this version using nvidia-smi?

Ok, that makes sense. Indeed, the driver is older (Driver Version: 460.32.03). I will ask our cluster admins about their plans for updating.

@constantinpape
Author

Ok, I managed to install with CUDA 10.2 now. (This failed for me earlier because I didn't use the GCC version that was used to compile this CUDA version.)

Thanks for the help. I will look into running some of my own benchmarks for RAMA and will for sure follow up with some other questions ;).
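For reference, the GCC mismatch mentioned above can be avoided by pointing the build at a host compiler the toolkit supports (CUDA 10.2 officially supports GCC up to version 8; the compiler paths below are hypothetical, adjust them to your system):

```shell
# Hypothetical paths to a CUDA-10.2-compatible GCC; adjust to your system.
export CC=/usr/bin/gcc-8
export CXX=/usr/bin/g++-8
export CUDAHOSTCXX=/usr/bin/g++-8   # CMake passes this to nvcc via -ccbin
# then rebuild, e.g.:
#   python -m pip install git+https://github.com/pawelswoboda/RAMA.git
```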
