[ND4J][CUDA 12] Support nd4j for cuda 12.4.1 ? #10067

Closed
freedom1b2830 opened this issue May 7, 2024 · 3 comments

@freedom1b2830

When will CUDA 12 be supported?

When I add this dependency:

<dependency>
	<groupId>org.nd4j</groupId>
	<artifactId>nd4j-cuda-11.4-platform</artifactId>
	<version>1.0.0-M2.1</version>
</dependency>

An exception is thrown:

[0.001s][warning][logging] No tag set matches selection: native. Did you mean any of the following? native* class+load+cause+native
File downloaded to /tmp/Shakespeare.txt
Loaded and converted file: 965255 valid characters of 969521 total characters (4266 removed)
SLF4J(I): Connected with provider of type [ch.qos.logback.classic.spi.LogbackServiceProvider]
07:40:14.558 [main] INFO org.nd4j.linalg.factory.Nd4jBackend -- Loaded [JCublasBackend] backend
Exception in thread "main" java.lang.ExceptionInInitializerError
	at org.nd4j.jita.concurrency.CudaAffinityManager.getNumberOfDevices(CudaAffinityManager.java:136)
	at org.nd4j.jita.constant.ConstantProtector.purgeProtector(ConstantProtector.java:60)
	at org.nd4j.jita.constant.ConstantProtector.<init>(ConstantProtector.java:53)
	at org.nd4j.jita.constant.ConstantProtector.<clinit>(ConstantProtector.java:41)
	at org.nd4j.jita.constant.ProtectedCudaConstantHandler.<clinit>(ProtectedCudaConstantHandler.java:69)
	at org.nd4j.jita.constant.CudaConstantHandler.<clinit>(CudaConstantHandler.java:38)
	at java.base/java.lang.Class.forName0(Native Method)
	at java.base/java.lang.Class.forName(Class.java:529)
	at java.base/java.lang.Class.forName(Class.java:508)
	at org.nd4j.common.config.ND4JClassLoading.loadClassByName(ND4JClassLoading.java:62)
	at org.nd4j.common.config.ND4JClassLoading.loadClassByName(ND4JClassLoading.java:56)
	at org.nd4j.linalg.factory.Nd4j.initWithBackend(Nd4j.java:5124)
	at org.nd4j.linalg.factory.Nd4j.initContext(Nd4j.java:5064)
	at org.nd4j.linalg.factory.Nd4j.<clinit>(Nd4j.java:284)
	at org.deeplearning4j.nn.conf.NeuralNetConfiguration$Builder.seed(NeuralNetConfiguration.java:655)
	at org.deeplearning4j.examples.advanced.modelling.charmodelling.generatetext.GenerateTxtModel.main(GenerateTxtModel.java:139)
Caused by: java.lang.RuntimeException: ND4J is probably missing dependencies. For more information, please refer to: https://deeplearning4j.konduit.ai/nd4j/backend
	at org.nd4j.nativeblas.NativeOpsHolder.<init>(NativeOpsHolder.java:107)
	at org.nd4j.nativeblas.NativeOpsHolder.<clinit>(NativeOpsHolder.java:41)
	... 16 more
Caused by: java.lang.UnsatisfiedLinkError: no jnind4jcuda in java.library.path: /usr/java/packages/lib:/usr/lib64:/lib64:/lib:/usr/lib
	at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2439)
	at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:916)
	at java.base/java.lang.System.loadLibrary(System.java:2063)
	at org.bytedeco.javacpp.Loader.loadLibrary(Loader.java:1832)
	at org.bytedeco.javacpp.Loader.load(Loader.java:1423)
	at org.bytedeco.javacpp.Loader.load(Loader.java:1234)
	at org.bytedeco.javacpp.Loader.load(Loader.java:1210)
	at org.nd4j.linalg.jcublas.bindings.Nd4jCuda.<clinit>(Nd4jCuda.java:10)
	at java.base/java.lang.Class.forName0(Native Method)
	at java.base/java.lang.Class.forName(Class.java:529)
	at java.base/java.lang.Class.forName(Class.java:508)
	at org.nd4j.common.config.ND4JClassLoading.loadClassByName(ND4JClassLoading.java:62)
	at org.nd4j.common.config.ND4JClassLoading.loadClassByName(ND4JClassLoading.java:56)
	at org.nd4j.nativeblas.NativeOpsHolder.<init>(NativeOpsHolder.java:97)
	... 17 more
Caused by: java.lang.UnsatisfiedLinkError: /home/user/.javacpp/cache/nd4j-cuda-11.4-1.0.0-M2.1-linux-x86_64.jar/org/nd4j/linalg/jcublas/bindings/linux-x86_64/libjnind4jcuda.so: libcublas.so.11: cannot open shared object file: No such file or directory
	at java.base/jdk.internal.loader.NativeLibraries.load(Native Method)
	at java.base/jdk.internal.loader.NativeLibraries$NativeLibraryImpl.open(NativeLibraries.java:331)
	at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:197)
	at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:139)
	at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2399)
	at java.base/java.lang.Runtime.load0(Runtime.java:852)
	at java.base/java.lang.System.load(System.java:2025)
	at org.bytedeco.javacpp.Loader.loadLibrary(Loader.java:1779)
	... 27 more

Linux:

Linux archlinux 6.8.9-arch1-1 #1 SMP PREEMPT_DYNAMIC Thu, 02 May 2024 17:49:46 +0000 x86_64 GNU/Linux

CUDA version: 12.4.1

LANG=C pacman -Si cuda

Repository      : extra
Name            : cuda
Version         : 12.4.1-3
Description     : NVIDIA's GPU programming toolkit
Architecture    : x86_64
URL             : https://developer.nvidia.com/cuda-zone
Licenses        : LicenseRef-NVIDIA-CUDA
Groups          : None
Provides        : cuda-toolkit  cuda-sdk  libcudart.so=12-64  libcublas.so=12-64  libcublas.so=12-64  libcusolver.so=11-64  libcusolver.so=11-64  libcusparse.so=12-64  libcusparse.so=12-64
Depends On      : opencl-nvidia  python  gcc
Optional Deps   : gdb: for cuda-gdb
                  glu: required for some profiling tools in CUPTI
                  nvidia-utils: for NVIDIA drivers (not needed in CDI containers)
                  rdma-core: for GPUDirect Storage (libcufile_rdma.so)
Conflicts With  : None
Replaces        : cuda-toolkit  cuda-sdk  cuda-static
Download Size   : 1804.22 MiB
Installed Size  : 4728.67 MiB
Packager        : Jakub Klinkovsk
Build Date      : Fri May 3 14:55:08 2024
Validated By    : SHA-256 Sum  Signature

find .m2/rep* -name nd4j-cuda*jar

.m2/repository/org/nd4j/nd4j-cuda-11.4-platform/1.0.0-M2.1/nd4j-cuda-11.4-platform-1.0.0-M2.1.jar
.m2/repository/org/nd4j/nd4j-cuda-11.4-platform/1.0.0-M2.1/nd4j-cuda-11.4-platform-1.0.0-M2.1-sources.jar
.m2/repository/org/nd4j/nd4j-cuda-11.4/1.0.0-M2.1/nd4j-cuda-11.4-1.0.0-M2.1.jar
.m2/repository/org/nd4j/nd4j-cuda-11.4/1.0.0-M2.1/nd4j-cuda-11.4-1.0.0-M2.1-windows-x86_64.jar
.m2/repository/org/nd4j/nd4j-cuda-11.4/1.0.0-M2.1/nd4j-cuda-11.4-1.0.0-M2.1-linux-x86_64.jar
.m2/repository/org/nd4j/nd4j-cuda-11.4/1.0.0-M2.1/nd4j-cuda-11.4-1.0.0-M2.1-sources.jar
.m2/repository/org/nd4j/nd4j-cuda-11.4/1.0.0-M2.1/nd4j-cuda-11.4-1.0.0-M2.1-javadoc.jar
.m2/repository/org/nd4j/nd4j-cuda-11.4-preset/1.0.0-M2.1/nd4j-cuda-11.4-preset-1.0.0-M2.1.jar
.m2/repository/org/nd4j/nd4j-cuda-11.4-preset/1.0.0-M2.1/nd4j-cuda-11.4-preset-1.0.0-M2.1-linux-x86_64.jar
.m2/repository/org/nd4j/nd4j-cuda-11.4-preset/1.0.0-M2.1/nd4j-cuda-11.4-preset-1.0.0-M2.1-sources.jar
.m2/repository/org/nd4j/nd4j-cuda-11.4-preset/1.0.0-M2.1/nd4j-cuda-11.4-preset-1.0.0-M2.1-javadoc.jar
@agibsonccc
Contributor

@freedom1b2830 the error there is pretty self-explanatory: the nd4j-cuda-11.4 binaries link against libcublas.so.11, which a CUDA 12 install doesn't provide. You need to use CUDA 11.x for now. CUDA 12 support will come in the next release; unfortunately, I've been in the middle of a big refactoring. If it's urgent for your team, please reach out for a commercial engagement: https://www.konduit.ai/#contacts

@freedom1b2830
Author

I can't downgrade to 11, so I'll wait for the release.

@agibsonccc
Contributor

@freedom1b2830 you should be able to use the CUDA redist artifacts from JavaCPP as a workaround if you want a self-contained CUDA.
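
A minimal sketch of what that workaround might look like in the pom.xml, assuming the redist artifact meant here is `org.bytedeco:cuda-platform-redist` from the JavaCPP Presets. It bundles the CUDA 11 shared libraries (including `libcublas.so.11`) in the jar, so they do not need to be installed system-wide next to CUDA 12. The version string is an assumption based on the presets' naming scheme (CUDA version, cuDNN version, JavaCPP version); check Maven Central for the build that matches nd4j-cuda-11.4 and 1.0.0-M2.1 before relying on it.

```xml
<!-- ND4J CUDA backend, unchanged from the original report -->
<dependency>
	<groupId>org.nd4j</groupId>
	<artifactId>nd4j-cuda-11.4-platform</artifactId>
	<version>1.0.0-M2.1</version>
</dependency>
<!-- Assumed workaround: CUDA 11.4 libraries bundled by the JavaCPP Presets.
     The version below is an assumption; verify it on Maven Central. -->
<dependency>
	<groupId>org.bytedeco</groupId>
	<artifactId>cuda-platform-redist</artifactId>
	<version>11.4-8.2-1.5.6</version>
</dependency>
```

The redist jars only supply the toolkit libraries; the NVIDIA driver itself still has to be installed on the host.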
