Running #100

ohcurrent · 2019-02-28T06:53:01Z

Hello, I am using 'gpgpusim-dev ver.' and when I run 'cudnn_samples_v7/mnistCUDNN' I get this error message:

_"CUDNN failure
Error: CUDNN_STATUS_NOT_INITIALIZED"_

My environment is as follows:
(Virtualbox)
Ubuntu 16.04.5 LTS
gcc 5.4.0 g++ 5.4.0
python 3.6.7
cuda 9.1
cudnn 7.0.5

Does anyone know how to solve that error message?

deval281shah · 2019-03-02T21:58:10Z

Hello,
cuDNN is tested with the following configuration:

Ubuntu 16.04.5 LTS
gcc 5.4.0 g++ 5.4.0
CUDA 8.0
cuDNN 7.1.4
Please try with these settings.

Also, you can use the mnist example code for GPGPU-Sim, which includes detailed steps: https://github.com/gpgpu-sim/gpgpu-sim_simulations/tree/master/benchmarks/src/cuda/cudnn/mnist
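
For reference, the gap between the reported environment and the tested configuration can be listed mechanically. This is only an illustrative sketch: the version strings are copied from the two comments above, and the helper name `mismatches` is made up for this example.

```python
# Tested configuration, as listed in the comment above.
TESTED = {"Ubuntu": "16.04.5 LTS", "gcc": "5.4.0", "g++": "5.4.0",
          "CUDA": "8.0", "cuDNN": "7.1.4"}

# Environment reported in the original post.
REPORTED = {"Ubuntu": "16.04.5 LTS", "gcc": "5.4.0", "g++": "5.4.0",
            "CUDA": "9.1", "cuDNN": "7.0.5"}


def mismatches(reported, tested):
    """Return {component: (reported, tested)} for every differing version."""
    return {name: (reported.get(name), want)
            for name, want in tested.items()
            if reported.get(name) != want}


if __name__ == "__main__":
    for name, (have, want) in mismatches(REPORTED, TESTED).items():
        print(f"{name}: reported {have}, tested {want}")
```

With the values from this thread, only the CUDA and cuDNN entries differ.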

deval281shah · 2019-03-10T18:26:04Z

Hi,
Which config files (gpgpusim.config) are you using?

deval281shah · 2019-03-13T20:52:45Z

Hi,
Thanks for reporting the issue. This might be an issue with the config file.
Have you tried SM7_TITANV? It works with that config.

shengyushen · 2019-03-20T08:03:02Z

CUDA 8.0 doesn't support Tensor Cores, which need CUDA 9.0, so how can this be resolved?

gangmul12 · 2019-03-22T05:31:09Z

I think the libcudart path in the PyTorch installation is not that important, because GPGPU-Sim is based on dynamic library hijacking. Does just running PyTorch not work?
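
Since the hijacking happens when the dynamic loader resolves libcudart, one way to see which copy would be picked up is to inspect `ldd` output for the binary or shared library in question. The sketch below only parses text in the usual `ldd` format. The sample output, the directory `/home/user/gpgpu-sim_distribution`, and the helper names are hypothetical; in practice the text would come from running `ldd` on your own binary.

```python
import re


def resolved_libs(ldd_output):
    """Map each library name in `ldd` output to the absolute path it resolved to."""
    libs = {}
    for line in ldd_output.splitlines():
        # Typical line: "<tab>libfoo.so.1 => /path/to/libfoo.so.1 (0x...)".
        # Entries without an absolute path (vdso, "not found") are skipped.
        m = re.match(r"\s*(\S+)\s+=>\s+(/\S+)", line)
        if m:
            libs[m.group(1)] = m.group(2)
    return libs


def cudart_under(ldd_output, root):
    """True if a libcudart entry resolves to a path under `root`."""
    return any(name.startswith("libcudart") and path.startswith(root)
               for name, path in resolved_libs(ldd_output).items())


# Hypothetical sample, for illustration only.
SAMPLE_LDD = (
    "\tlinux-vdso.so.1 =>  (0x00007ffc0a5f2000)\n"
    "\tlibcudart.so.9.0 => /home/user/gpgpu-sim_distribution/lib/libcudart.so.9.0 (0x00007f1c2a000000)\n"
    "\tlibcudnn.so.7 => not found\n"
    "\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1c29c00000)\n"
)

if __name__ == "__main__":
    print(cudart_under(SAMPLE_LDD, "/home/user/gpgpu-sim_distribution"))
```

If libcudart does not appear in the output at all, it may have been linked statically, in which case the loader has nothing to substitute.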

gangmul12 · 2019-03-22T07:26:17Z

I got stuck on another error message, but I think I am a few steps ahead of you, because GPGPU-Sim is loaded successfully when executing "python main.py" in examples/mnist.
What is the result when you just launch python and execute import torch?

My env:
python 2.7.12
gcc 5.4
Ubuntu 16.04.06
CUDA 8.0
cudnn 7.1.4
gpgpu-sim commit 49e95cd

gangmul12 · 2019-03-22T15:46:11Z

@ohcurrent
I wrote issue #101.

I got this message when I tested resnet, and reproduced it when running examples/mnist.

gangmul12 · 2019-03-23T03:58:17Z

@ohcurrent
Maybe you should set the PYTORCH_BIN env variable as GPGPU-Sim indicated.
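
A quick way to see whether the variable is visible to the Python process is sketched below. Only the name PYTORCH_BIN comes from this thread. The helper `unset_vars` and the example path are hypothetical, and the value GPGPU-Sim expects is whatever its own message indicates.

```python
import os


def unset_vars(names, environ=None):
    """Return the names that are missing or empty in the environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = unset_vars(["PYTORCH_BIN"])
    if missing:
        print("Not set:", ", ".join(missing))
    else:
        print("PYTORCH_BIN =", os.environ["PYTORCH_BIN"])
```

Passing a plain dict as `environ` lets the check be tried without touching the real environment, for example `unset_vars(["PYTORCH_BIN"], {"PYTORCH_BIN": "/hypothetical/path"})`.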

gangmul12 · 2019-03-24T15:57:22Z

I think we now have the same problem! However, the attribute error looks like a Python error, not a GPGPU-Sim one. I suggest you test whether your program works without GPGPU-Sim.

gangmul12 · 2019-04-01T05:30:19Z

Long time no see XD
No, I'm not working on this issue currently. Also, I've never seen that issue, so maybe your problem is different from mine! However, again, isn't it a Python error? It doesn't seem like a GPGPU-Sim error...

cng123 · 2019-04-18T21:39:36Z

Hi. This has been a known issue, and it has to do with kernels not being found in libcudnn.so. Please try the instructions in the link below and see if it helps.
https://docs.google.com/document/d/17fSM2vrWodP8rWR7ctpgaggVXEw0uD2VCAh0Gi4Gpb4/edit?usp=sharing

ohcurrent · 2019-04-19T11:15:17Z

@cng123 Did your method succeed in any cudaLaunch for PyTorch examples?

cng123 · 2019-04-20T18:04:29Z

cudaLaunch succeeded for some of the PyTorch examples. However, the kernels might still fail (cuda_status_internal_error). Unfortunately, I have not resolved this issue yet.

ohcurrent · 2019-04-21T14:46:45Z

@cng123 Thank you for sharing the link.
However, how do you know the 'no PTX implementation~' error is related to wrong cuDNN paths?

cng123 · 2019-04-21T23:12:05Z

@ohcurrent It was just based on personal experience from experimenting with different builds and different environment variables. There are probably other problems that may lead to the same error, but the most common one I have seen so far is wrong (or unset) cuDNN paths.

It seems that if the caffe shared library is not statically linked to cudnn (so either dynamically linked, or not linked at all), the 'no PTX implementation' error will occur. This is a problem, since it means that if pytorch is built only with cuda and not with cudnn, it will not work with gpgpu-sim. I am speculating that it has to do with libcublas and other cuda libraries being dynamically linked, but I cannot say for sure.

mivenHan · 2020-03-31T03:59:59Z

@ohcurrent Have you solved the cudnn error with mnistCudnn? I encountered the same error as you.

ohcurrent · 2020-03-31T04:10:05Z

@mivenHan
Yes, you can just edit the Makefile of mnistCudnn as below (the first line is the original, commented out).
The application should be compiled with static libraries:

#LIBRARIES += -LFreeImage/lib/$(TARGET_OS)/$(TARGET_ARCH) -LFreeImage/lib/$(TARGET_OS) -lcudart -lcublas -lcudnn -lfreeimage -lstdc++ -lm

LIBRARIES += -LFreeImage/lib/$(TARGET_OS)/$(TARGET_ARCH) -LFreeImage/lib/$(TARGET_OS) -lcudart -lcublas_static -lcudnn_static_v7 -lculibos -lfreeimage -lstdc++ -lm -ldl -lpthread
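
To make the difference between the two LIBRARIES lines explicit, here is a small sketch that applies the same substitutions to a flag string. The library names are taken from the Makefile lines above; the function name `to_static` is made up for this example, and this is not a general tool for other Makefiles.

```python
# Dynamic-to-static substitutions, as in the edited Makefile line above.
STATIC_NAMES = {"-lcublas": "-lcublas_static", "-lcudnn": "-lcudnn_static_v7"}


def to_static(flags):
    """Rewrite linker flags the same way as the Makefile edit above."""
    out = []
    for tok in flags.split():
        out.append(STATIC_NAMES.get(tok, tok))
        if tok == "-lcudnn":
            # The edited line adds -lculibos right after the static cuDNN library.
            out.append("-lculibos")
    # The edited line also appends -ldl and -lpthread at the end.
    for extra in ("-ldl", "-lpthread"):
        if extra not in out:
            out.append(extra)
    return " ".join(out)


ORIGINAL = ("-LFreeImage/lib/$(TARGET_OS)/$(TARGET_ARCH) -LFreeImage/lib/$(TARGET_OS) "
            "-lcudart -lcublas -lcudnn -lfreeimage -lstdc++ -lm")

if __name__ == "__main__":
    print("LIBRARIES += " + to_static(ORIGINAL))
```

Note that -lcudart is left untouched: only cuBLAS and cuDNN are switched to their static variants.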

mivenHan · 2020-03-31T16:00:07Z

@ohcurrent First, thank you for your reply.

I want to confirm: did you complete mnistCUDNN successfully under CUDA 9.1?
I worked with the mnist example code for GPGPU-Sim: https://github.com/gpgpu-sim/gpgpu-sim_simulations/tree/master/benchmarks/src/cuda/cudnn/mnist

My CUDA version is 9.0,
and I tried cuDNN 7.6.5, cuDNN 7.1.4, and cuDNN 7.0.5.

All failed with a new error:

GPGPU-Sim PTX: WARNING: Asynchronous memset not supported (cudaError_t cudaMemsetAsync(void*, int, size_t, cudaStream_t))
./mnistCUDNN: relocation error: ./mnistCUDNN: symbol cudaFuncSetAttribute, version libcudart.so.9.0 not defined in file libcudart.so.9.0 with link time reference

Have you encountered that error?
Or does the code complete successfully with CUDA 9.1 and cuDNN 7.0.5?

Thank you for your time.
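
The relocation error above names a symbol and the library file that was expected to define it. One plausible reading, not confirmed in this thread, is that the libcudart.so.9.0 found at run time does not export cudaFuncSetAttribute. The sketch below pulls the symbol and file out of such a message and checks the symbol against text in the format printed by `nm -D --defined-only`. The sample `nm` listing and the helper names are hypothetical; real output would come from running `nm` on the libcudart actually being loaded.

```python
import re

RELOC_RE = re.compile(
    r"relocation error: \S+ symbol (?P<symbol>\w+), "
    r"version (?P<version>\S+) not defined in file (?P<file>\S+)"
)


def parse_relocation_error(message):
    """Extract symbol, version and file from a loader relocation error, or None."""
    m = RELOC_RE.search(message)
    return m.groupdict() if m else None


def exported_symbols(nm_output):
    """Collect symbol names from `nm -D --defined-only` style text."""
    names = set()
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) >= 3:  # address, type letter, name
            # Drop any "@VERSION" or "@@VERSION" suffix.
            names.add(parts[-1].split("@")[0])
    return names


# Error message copied from the comment above.
MSG = ("./mnistCUDNN: relocation error: ./mnistCUDNN: symbol cudaFuncSetAttribute, "
       "version libcudart.so.9.0 not defined in file libcudart.so.9.0 "
       "with link time reference")

# Hypothetical nm listing, for illustration only.
SAMPLE_NM = (
    "0000000000031a10 T cudaMalloc\n"
    "0000000000031b40 T cudaMemcpy\n"
    "0000000000032c80 T cudaLaunch\n"
)

if __name__ == "__main__":
    info = parse_relocation_error(MSG)
    print(info["symbol"], "exported:", info["symbol"] in exported_symbols(SAMPLE_NM))
```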

ohcurrent closed this as completed Nov 6, 2019

ohcurrent changed the title ~~Running cudnn_samples_v7~~ Running Nov 6, 2019

mattsinc mentioned this issue Mar 31, 2020

cudnn_samples_v7 mnistCUDNN ERROR WITH CUDNN #172

Open
