v7.4.0 cupy/cuda/driver.pyx error line 118 #3323
Seems like it is an error doing the cleanup? We have hit some of these before, they should not affect your computations but we should fix this too. |
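For readers unfamiliar with this class of cleanup error, here is a minimal pure-Python sketch of how a `'NoneType' object is not callable` message can come out of a finalizer. It assumes the general CPython behaviour that module globals may already be cleared to None while objects are still being destroyed at interpreter exit. The names `FakeModule`, `check_status` and `simulate_shutdown_unload` are hypothetical stand-ins, not CuPy's implementation, and this is not the change made in the eventual fix.

```python
# Hypothetical stand-ins; none of this is CuPy's real code.
def check_status(status):
    """Raise if a (pretend) driver call returned a non-zero status."""
    if status != 0:
        raise RuntimeError(f"driver error {status}")


class FakeModule:
    """Owns a pretend driver handle that is released in unload()."""

    def __init__(self):
        self.loaded = True

    def unload(self):
        # The global is looked up at call time. During interpreter
        # shutdown it may already have been replaced by None.
        checker = check_status
        if not callable(checker):
            return False  # too late to clean up safely, so skip quietly
        checker(0)  # 0 plays the role of a success status
        self.loaded = False
        return True


def simulate_shutdown_unload():
    """Mimic teardown by clearing the global, then restore it."""
    global check_status
    saved = check_status
    check_status = None
    try:
        return FakeModule().unload()
    finally:
        check_status = saved
```

Without the `callable` guard, the simulated teardown would raise the same TypeError text that appears in the traceback, which is why such errors usually show up only as "Exception ignored" noise at exit.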
It's really odd...I can't reproduce this with v7.4.0 from conda-forge. @emcastillo can you? By the way, just curious (I don't think this is relevant):
How is this combination possible? You run with CUDA 8.0, but the runtime detection is 10.0?! |
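As a reading aid for the version lines in the report: the CUDA build, driver and runtime versions are printed as integers that follow CUDA's `1000 * major + 10 * minor` convention, so 10000 means 10.0 and 10020 means 10.2. "CUDA Root" is a filesystem path and is reported separately from those integers. A small sketch for decoding them (the helper names are illustrative, not part of CuPy):

```python
def decode_cuda_version(code):
    """Split an integer such as 10020 into (major, minor)."""
    major, rest = divmod(code, 1000)
    return major, rest // 10


def format_cuda_version(code):
    """Render the integer in the familiar 'major.minor' form."""
    major, minor = decode_cuda_version(code)
    return f"{major}.{minor}"


# Values taken from the two show_config() outputs in the report.
build_600 = format_cuda_version(10000)   # CuPy 6.0.0 environment
build_740 = format_cuda_version(10020)   # CuPy 7.4.0 environment
```

This decoding applies only to the CUDA version integers; other libraries in the listing (cuDNN, NCCL) use their own encodings.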
OK, looks like something is odd with CUDA 10.1/10.2. I can reproduce this with the following conda environment:
conda create -n CF_cupy_test python=3.7 cupy cudatoolkit=10.2 -c conda-forge
conda activate CF_cupy_test
python cupy740_crash_mre.py
but not with |
Seems to be on |
Yes, I think it's a cleanup bug. Just don't know what exactly triggered it, and why no one had complained until now 😂 Perhaps we need to list a few more common infrastructural objects in |
I will look at it later and try to fix it the same way we did before. |
I take it back, sorry @emcastillo. This is the full output I got (this time with
Note that the earliest error is |
OK so CUDA 9.2 & 10.0 do not trigger this, but 10.1 & 10.2 do. Perhaps we hit another "massive array" bug in cuSOLVER (as in #3127)?! |
could be ... |
I can't reproduce with 10.2
|
@turbach what's the GPU model you are using? |
#3331 fixes this bug; it was an error in the implementation of |
Sorry for the slow reply, I got the error on a TitanX Pascal 12GB, the specs and version particulars are at the bottom of the notebook pdf in the OP. Anything else I can do at this end let me know. Thanks for checking it out. |
We solved the bug and it will be available in the next release (next week) |
Great thanks. I'm in an EEG research lab, we just did a GPU hackathon with NVIDIA and SDSC, our first foray into GPU acceleration. Took CuPy for a test drive, got order of magnitude acceleration on our matrix math changing np to cp. Not bad. |
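The "changing np to cp" swap mentioned above can be sketched as follows. This is an illustrative example, not the MRE from this issue: `get_array_module` and `gram_matrix` are hypothetical helpers, and the code falls back to NumPy when CuPy or a usable GPU is not available, so it runs either way.

```python
import numpy as np


def get_array_module():
    """Return cupy when a usable GPU is present, otherwise numpy."""
    try:
        import cupy as cp
        if cp.cuda.runtime.getDeviceCount() > 0:
            return cp
    except Exception:
        # CuPy is missing or the CUDA runtime could not be initialised.
        pass
    return np


def gram_matrix(xp, rows, cols):
    """Compute A.T @ A for a matrix of ones with the chosen backend."""
    a = xp.ones((rows, cols), dtype=xp.float32)
    return a.T @ a


xp = get_array_module()
g = gram_matrix(xp, 4, 3)
total = float(g.sum())  # float() also copies a GPU scalar back to the host
```

Because the same array code is written against `xp`, the only line that changes between CPU and GPU runs is the one that picks the module. Any speed-up depends on the problem size and hardware.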
Hi,
I'm working in conda envs with conda installs. Hit a snag upgrading from cupy 6.0.0 to 7.4.0 with rapidsai.
The MRE runs in cupy 6.0.0 and crashes in 7.4.0 with this error:
In Jupyter Notebook the MRE errors one line later:
The complete stack trace is in the attached notebook along with additional system and device specs.
Great tool box, thanks.
Tom
cupy740_crash_mre.ipynb.pdf
[sandbox]$ conda activate cupy
(cupy) [sandbox]$ python --version; python -c "import cupy; cupy.show_config()"; python cupy740_crash_mre.py
Python 3.7.7
CuPy Version : 6.0.0
CUDA Root : /usr/local/cuda-8.0
CUDA Build Version : 10000
CUDA Driver Version : 10020
CUDA Runtime Version : 10000
cuDNN Build Version : 7301
cuDNN Version : 7605
NCCL Build Version : 1000
NCCL Runtime Version : (unknown)
MB free 12039 total 12196
MB matrices: 978
(cupy) [sandbox]$ conda deactivate
[sandbox]$ conda activate rapidsai37
(rapidsai37) [sandbox]$ python --version; python -c "import cupy; cupy.show_config()"; python cupy740_crash_mre.py
Python 3.7.6
CuPy Version : 7.4.0
CUDA Root : /home/turbach/.conda/envs/rapidsai37
CUDA Build Version : 10020
CUDA Driver Version : 10020
CUDA Runtime Version : 10020
cuBLAS Version : 10202
cuFFT Version : 10102
cuRAND Version : 10102
cuSOLVER Version : (10, 3, 0)
cuSPARSE Version : 10301
NVRTC Version : (10, 2)
cuDNN Build Version : 7605
cuDNN Version : 7605
NCCL Build Version : 2406
NCCL Runtime Version : 2507
MB free 12027 total 12196
MB matrices: 978
Traceback (most recent call last):
File "cupy/cuda/driver.pyx", line 247, in cupy.cuda.driver.moduleUnload
File "cupy/cuda/driver.pyx", line 118, in cupy.cuda.driver.check_status
TypeError: 'NoneType' object is not callable
Exception ignored in: 'cupy.cuda.function.Module.dealloc'
Traceback (most recent call last):
File "cupy/cuda/driver.pyx", line 247, in cupy.cuda.driver.moduleUnload
File "cupy/cuda/driver.pyx", line 118, in cupy.cuda.driver.check_status
TypeError: 'NoneType' object is not callable
(rapidsai37) [sandbox]$