Unable to Install and use with GPUs #3

ajtarraga · 2023-03-08T18:45:51Z

Hi @mdolz, I am trying to install and configure PyDTNN in a project with several heterogeneous nodes for supercomputing. These nodes have several GPUs interconnected via GPUDirect Storage and RDMA.

When I try to execute PyDTNN with the following command (it is the example that you give in the code):
$ python3 -Ou pydtnn_benchmark.py --model=vgg16_cifar10 --dataset=cifar10 --dataset_train_path=datasets/cifar-10/cifar-10-batches-bin --dataset_test_path=datasets/cifar-10/cifar-10-batches-bin --evaluate_only=True --batch_size=64 --validation_split=0.2 --weights_and_bias_filename=vgg16-weights-nhwc.npz --tracing=False --profile=False --enable_gpu=True --dtype=float32
I obtain the following output:
/home/ajtarraga/.local/lib/python3.8/site-packages/skcuda/cublas.py:284: UserWarning: creating CUBLAS context to get version number
warnings.warn('creating CUBLAS context to get version number')
Please, install pycuda, skcuda, and cudnn to be able to use the GPUs!

I have installed pycuda, skcuda and cudnn:
$ pip3 install -r requirements_cuda_2.txt
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Requirement already satisfied: pycuda>=2021.1 in /home/ajtarraga/.local/lib/python3.8/site-packages (from -r requirements_cuda_2.txt (line 1)) (2022.2.2)
Requirement already satisfied: scikit-cuda>=0.5.3 in /home/ajtarraga/.local/lib/python3.8/site-packages (from -r requirements_cuda_2.txt (line 2)) (0.5.3)
Requirement already satisfied: nvidia-cudnn>=8.1.1.33 in /home/ajtarraga/.local/lib/python3.8/site-packages (from -r requirements_cuda_2.txt (line 3)) (8.2.0.51)
Requirement already satisfied: appdirs>=1.4.0 in /home/ajtarraga/.local/lib/python3.8/site-packages (from pycuda>=2021.1->-r requirements_cuda_2.txt (line 1)) (1.4.4)
Requirement already satisfied: pytools>=2011.2 in /home/ajtarraga/.local/lib/python3.8/site-packages (from pycuda>=2021.1->-r requirements_cuda_2.txt (line 1)) (2022.1.14)
Requirement already satisfied: mako in /home/ajtarraga/.local/lib/python3.8/site-packages (from pycuda>=2021.1->-r requirements_cuda_2.txt (line 1)) (1.2.4)
Requirement already satisfied: numpy>=1.2.0 in /home/ajtarraga/.local/lib/python3.8/site-packages (from scikit-cuda>=0.5.3->-r requirements_cuda_2.txt (line 2)) (1.24.1)
Requirement already satisfied: wheel in /usr/lib/python3/dist-packages (from nvidia-cudnn>=8.1.1.33->-r requirements_cuda_2.txt (line 3)) (0.34.2)
Requirement already satisfied: setuptools in /home/ajtarraga/.local/lib/python3.8/site-packages (from nvidia-cudnn>=8.1.1.33->-r requirements_cuda_2.txt (line 3)) (65.6.3)
Requirement already satisfied: MarkupSafe>=0.9.2 in /home/ajtarraga/.local/lib/python3.8/site-packages (from mako->pycuda>=2021.1->-r requirements_cuda_2.txt (line 1)) (2.1.1)
Requirement already satisfied: typing-extensions>=4.0 in /home/ajtarraga/.local/lib/python3.8/site-packages (from pytools>=2011.2->pycuda>=2021.1->-r requirements_cuda_2.txt (line 1)) (4.4.0)
Requirement already satisfied: platformdirs>=2.2.0 in /home/ajtarraga/.local/lib/python3.8/site-packages (from pytools>=2011.2->pycuda>=2021.1->-r requirements_cuda_2.txt (line 1)) (3.1.0)

What do you think could be the problem and how can I solve it?

barrachi · 2023-03-10T11:23:26Z

Hi @ajtarraga,

This PyDTNN error message is issued by the following lines of code:

try:
    import pydtnn.backends.gpu.tensor_gpu
    global gpuarray
    import pycuda.gpuarray as gpuarray
    import pycuda.driver as drv
    from pydtnn.backends.gpu.libs import libcudnn as cudnn
    # noinspection PyUnresolvedReferences
    from skcuda import cublas
except (ImportError, ModuleNotFoundError, OSError):
    supported_cudnn = False
    print("Please, install pycuda, skcuda, and cudnn to be able to use the GPUs!")
    sys.exit(-1)

Because the cublas warning appears in your output, we can assume that everything before that line of code ran correctly, so the error must come from importing cublas from skcuda. Could you please open an interactive Python session and run that line alone? Perhaps the error output will give more information about what is failing. Just in case, the scikit-cuda building and installation guide (https://scikit-cuda.readthedocs.io/en/latest/install.html#building-and-installation) states that scikit-cuda searches for CUDA libraries in the system library search path when imported, and explains how to solve this issue.
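
Since scikit-cuda looks for the CUDA libraries in the system library search path at import time, one quick check is whether the dynamic loader can resolve them at all. The following is only a minimal sketch using the Python standard library: the helper name find_cuda_libraries and the list of library names (cudart, cublas, cudnn) are assumptions made for illustration and are not part of PyDTNN or scikit-cuda.

```python
from ctypes.util import find_library


def find_cuda_libraries(names=("cudart", "cublas", "cudnn")):
    """Map each library name to what the system loader resolves, or None.

    The default names are an assumption about what this GPU stack needs.
    """
    return {name: find_library(name) for name in names}


if __name__ == "__main__":
    for name, resolved in find_cuda_libraries().items():
        print(f"{name}: {resolved if resolved else 'NOT FOUND'}")
```

A library reported as NOT FOUND here is not necessarily missing from the machine; it may simply be installed in a directory that is not on the loader's search path.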

Please tell us which message was generated (perhaps PyDTNN should also include that error in its output).

Also, as a general recommendation, when using Python it is convenient to isolate the environment you are using from the one provided by the system, for example with a virtualenv (https://realpython.com/python-virtual-environments-a-primer/). This way, you keep control of which libraries have been installed and when they should be upgraded for a given environment (which can be shared by several applications).
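
To confirm which environment a given interpreter is actually using, a small check like the one below can help. It is a sketch based on the standard sys module; the function name in_virtualenv is hypothetical and chosen only for this example.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv or virtualenv."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base installation.
    base = getattr(sys, "base_prefix", sys.prefix)
    # Older virtualenv releases set sys.real_prefix instead.
    return sys.prefix != base or hasattr(sys, "real_prefix")


if __name__ == "__main__":
    print("virtualenv active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

Running it with the same python3 that launches pydtnn_benchmark.py shows whether the packages are being picked up from an isolated environment or from the system and user site-packages.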

Best regards,

Sergio Barrachina Mir

ajtarraga · 2023-03-14T18:30:54Z

Hi @barrachi

I have cublas installed. I tried it in an interactive Python session and it imported correctly. Here is the output:

$ python3
Python 3.8.10 (default, Nov 14 2022, 12:59:47)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from skcuda import cublas
/home/ajtarraga/.local/lib/python3.8/site-packages/skcuda/cublas.py:284: UserWarning: creating CUBLAS context to get version number
warnings.warn('creating CUBLAS context to get version number')
>>>

Thank you for the recommendation of virtualenv.

What do you think could be the problem with the installation? I think I have installed cublas correctly.

barrachi · 2023-03-16T09:29:26Z

Hi @ajtarraga

Well, that should have been the line that failed, but since it did not, could you please execute the following commands in an interactive shell?

import pydtnn.backends.gpu.tensor_gpu
import pycuda.gpuarray as gpuarray
import pycuda.driver as drv
from pydtnn.backends.gpu.libs import libcudnn as cudnn
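
The same check can be scripted so that the first failing import is reported together with its real exception, which the generic "Please, install pycuda, skcuda, and cudnn" message hides. This is only a sketch: the function name first_failing_import is hypothetical, and it assumes that libcudnn is a submodule of pydtnn.backends.gpu.libs. The cublas import that was already checked is included for completeness.

```python
import importlib
import traceback

# Modules imported inside PyDTNN's try block, in the same order.
# Assumption: libcudnn is a submodule of pydtnn.backends.gpu.libs.
MODULES = (
    "pydtnn.backends.gpu.tensor_gpu",
    "pycuda.gpuarray",
    "pycuda.driver",
    "pydtnn.backends.gpu.libs.libcudnn",
    "skcuda.cublas",
)


def first_failing_import(modules=MODULES):
    """Import each module in order.

    Returns (name, exception) for the first failure, or None if every
    import succeeds.
    """
    for name in modules:
        try:
            importlib.import_module(name)
        except Exception as exc:  # deliberately broad: we want to see anything
            return name, exc
    return None


if __name__ == "__main__":
    failure = first_failing_import()
    if failure is None:
        print("All imports succeeded.")
    else:
        name, exc = failure
        print(f"Import of {name} failed with {type(exc).__name__}: {exc}")
        traceback.print_exception(type(exc), exc, exc.__traceback__)
```

If every import succeeds here but the benchmark still prints the error, the script and the benchmark are probably not running with the same interpreter, environment, or PyDTNN version.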

Another option could be to create a virtualenv and launch PyDTNN from that virtualenv. Do you have the latest version of PyDTNN installed? (Just in case I'm looking at a different version of the code.)

Best regards,

Sergio
