Running #100

Closed
ohcurrent opened this issue Feb 28, 2019 · 18 comments

Comments

@ohcurrent

ohcurrent commented Feb 28, 2019

Hello, I am using 'gpgpusim-dev ver.', and when I run 'cudnn_samples_v7/mnistCUDNN' I get this error message.

"CUDNN failure
Error: CUDNN_STATUS_NOT_INITIALIZED"

My environment settings are as follows.
(Virtualbox)
Ubuntu 16.04.5 LTS
gcc 5.4.0 g++ 5.4.0
python 3.6.7
cuda 9.1
cudnn 7.0.5

Does anyone know how to solve that error message?

@deval281shah
Contributor

Hello,
cuDNN is tested with the following configuration:

Ubuntu 16.04.5 LTS
gcc 5.4.0 g++ 5.4.0
CUDA 8.0
cuDNN 7.1.4
Please try with these settings.

Also, you can use the MNIST example code for GPGPU-Sim: https://github.com/gpgpu-sim/gpgpu-sim_simulations/tree/master/benchmarks/src/cuda/cudnn/mnist (it includes detailed steps as well).
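
As a quick way to see how a setup differs from the tested configuration listed above, here is a minimal Python sketch. The version strings are copied from this thread; the dictionary keys and the function name `config_mismatches` are made up for illustration and are not part of GPGPU-Sim.

```python
# Versions quoted above as the tested cuDNN configuration.
TESTED = {"ubuntu": "16.04.5", "gcc": "5.4.0", "cuda": "8.0", "cudnn": "7.1.4"}


def config_mismatches(reported, tested=TESTED):
    """Return {component: (reported, tested)} for every version that differs."""
    return {
        name: (reported.get(name), wanted)
        for name, wanted in tested.items()
        if reported.get(name) != wanted
    }


# Environment from the original report in this issue.
reported = {"ubuntu": "16.04.5", "gcc": "5.4.0", "cuda": "9.1", "cudnn": "7.0.5"}

for name, (got, wanted) in config_mismatches(reported).items():
    print(f"{name}: have {got}, tested with {wanted}")
```

For the reported environment this flags only the CUDA and cuDNN versions.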

@deval281shah
Contributor

Hi,
Which config files (gpgpusim.config) are you using?

@deval281shah
Contributor

Hi,
Thanks for reporting the issue. This might be an issue with the config file.
Have you tried SM7_TITANV? It works with that config.

@shengyushen

CUDA 8.0 doesn't support Tensor Cores, which need CUDA 9.0, so how can this be resolved?

@gangmul12
Contributor

I think the libcudart path in the PyTorch installation is not that important, because GPGPU-Sim is based on dynamic library hijacking. Does simply running PyTorch not work?
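
One way to see which libcudart the dynamic loader would actually pick up is to look at `ldd` output for the binary or shared object in question. The following Python sketch parses that output. The helper names and the sample paths are hypothetical, and it assumes a Linux system where `ldd` is available.

```python
import subprocess


def resolve_library(ldd_output, name):
    """Return the path that `name` resolves to in `ldd` output, or None."""
    for line in ldd_output.splitlines():
        parts = line.split()
        # Typical line: "libcudart.so.8.0 => /some/dir/libcudart.so.8.0 (0x...)"
        if len(parts) >= 3 and parts[1] == "=>" and parts[0].startswith(name):
            # "libfoo.so => not found" means the loader could not resolve it.
            return None if parts[2] == "not" else parts[2]
    return None


def ldd(binary):
    """Run `ldd` on a binary or shared object and return its text output."""
    result = subprocess.run(["ldd", binary], capture_output=True, text=True, check=True)
    return result.stdout


# Hypothetical ldd output, for illustration only.
sample = (
    "\tlibcudart.so.8.0 => /path/to/gpgpu-sim_distribution/lib/libcudart.so.8.0 (0x00007f0000000000)\n"
    "\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0000100000)\n"
)
print(resolve_library(sample, "libcudart"))
```

On a real system, `resolve_library(ldd("./mnistCUDNN"), "libcudart")` would show whether the path points into the simulator's library directory or into the CUDA toolkit.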

@gangmul12
Contributor

I got stuck at another error message, but I think I am a few steps ahead of you, because GPGPU-Sim is loaded successfully when executing "python main.py" in examples/mnist.
What is the result when you just launch python and execute import torch?

My environment:
python 2.7.12
gcc 5.4
Ubuntu 16.04.06
CUDA 8.0
cudnn 7.1.4
gpgpu-sim commit 49e95cd

@gangmul12
Contributor

@ohcurrent
I wrote issue #101.
[image: capture]

I produced this message when testing ResNet, and reproduced it when running examples/mnist.

@gangmul12
Contributor

@ohcurrent
Maybe you should set the PYTORCH_BIN environment variable as GPGPU-Sim indicated.
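
A small Python sketch for checking whether the variable is visible to the process before launching the simulation. It assumes that PYTORCH_BIN holds a filesystem path, which this thread does not state; the function name `check_path_variable` is made up for illustration.

```python
import os


def check_path_variable(name, env=os.environ):
    """Report whether `name` is set in `env` and points at an existing path."""
    value = env.get(name)
    if not value:
        return f"{name} is not set"
    # Assumption: the variable is expected to hold a filesystem path.
    if not os.path.exists(value):
        return f"{name} is set to {value}, but that path does not exist"
    return f"{name} is set to {value}"


print(check_path_variable("PYTORCH_BIN"))
```

Running it in the same shell that launches python shows whether the export actually took effect there.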

@gangmul12
Contributor

I think we now have the same problem! However, the attribute error looks like a Python error, not a GPGPU-Sim one. I suggest you test that your program works without GPGPU-Sim.

@gangmul12
Contributor

Long time no see XD
No, I'm not working on this issue at the moment. Also, I've never seen that error, so your problem may be different from mine. However, again, isn't it a Python error? It doesn't look like a GPGPU-Sim error.

@cng123
Contributor

cng123 commented Apr 18, 2019

Hi. This has been a known issue, and it has to do with kernels not being found in libcudnn.so. Please try the instructions in the link below and see if it helps.
https://docs.google.com/document/d/17fSM2vrWodP8rWR7ctpgaggVXEw0uD2VCAh0Gi4Gpb4/edit?usp=sharing

@ohcurrent
Author

@cng123 Did your method succeed in any cudaLaunch for PyTorch examples?

@cng123
Contributor

cng123 commented Apr 20, 2019

cudaLaunch succeeded for some of the PyTorch examples. However, the kernels might still fail (cuda_status_internal_error). Unfortunately, I have not resolved this issue yet.

@ohcurrent
Author

ohcurrent commented Apr 21, 2019

@cng123 Thank you for sharing the link.
However, how do you know that the 'no PTX implementation~' error is related to wrong cuDNN paths?

@cng123
Contributor

cng123 commented Apr 21, 2019

@ohcurrent It was just based on personal experience from experimenting with different builds and different environment variables. There are probably other problems that can lead to the same error, but the most common one I have seen so far is wrong (or unset) cuDNN paths.

It seems that if the caffe shared library is not statically linked to cuDNN (that is, either dynamically linked or not linked at all), the 'no PTX implementation' error will occur. This is a problem, since it means that if PyTorch is built only with CUDA and not with cuDNN, it will not work with GPGPU-Sim. I am speculating that it has to do with libcublas and other CUDA libraries being dynamically linked, but I cannot say for sure.

ohcurrent changed the title from "Running cudnn_samples_v7" to "Running" on Nov 6, 2019
@mivenHan

@ohcurrent Have you solved the cuDNN error with mnistCUDNN? I encountered the same error as you.

@ohcurrent
Author

@mivenHan
Yes, you can just edit the Makefile of mnistCUDNN as below.
The application should be compiled with the static libraries:

#LIBRARIES += -LFreeImage/lib/$(TARGET_OS)/$(TARGET_ARCH) -LFreeImage/lib/$(TARGET_OS) -lcudart -lcublas -lcudnn -lfreeimage -lstdc++ -lm

LIBRARIES += -LFreeImage/lib/$(TARGET_OS)/$(TARGET_ARCH) -LFreeImage/lib/$(TARGET_OS) -lcudart -lcublas_static -lcudnn_static_v7 -lculibos -lfreeimage -lstdc++ -lm -ldl -lpthread
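
To double-check a link line like the ones above, here is a rough Python sketch that lists which of the cuBLAS and cuDNN libraries are still requested in their dynamic form. The function name `dynamic_cuda_libs` is made up for illustration. It deliberately ignores `-lcudart`, on the assumption (consistent with the Makefile change above, which leaves it untouched) that the runtime library stays dynamically linked.

```python
def dynamic_cuda_libs(flags, names=("cublas", "cudnn")):
    """Return the libraries from `names` that are linked via a plain -l<name> flag.

    Static variants such as -lcublas_static or -lcudnn_static_v7 do not match.
    """
    libs = [token[2:] for token in flags.split() if token.startswith("-l")]
    return [lib for lib in libs if lib in names]


# The two LIBRARIES lines from the Makefile edit above.
before = ("-LFreeImage/lib/$(TARGET_OS)/$(TARGET_ARCH) -LFreeImage/lib/$(TARGET_OS) "
          "-lcudart -lcublas -lcudnn -lfreeimage -lstdc++ -lm")
after = ("-LFreeImage/lib/$(TARGET_OS)/$(TARGET_ARCH) -LFreeImage/lib/$(TARGET_OS) "
         "-lcudart -lcublas_static -lcudnn_static_v7 -lculibos -lfreeimage "
         "-lstdc++ -lm -ldl -lpthread")

print(dynamic_cuda_libs(before))
print(dynamic_cuda_libs(after))
```

The first call reports `cublas` and `cudnn`; the second reports nothing, which matches the intent of the edit.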

@mivenHan

@ohcurrent First, thank you for your reply.

I want to confirm: did you run mnistCUDNN to completion successfully under CUDA 9.1?
I ask because I worked with the MNIST example code for GPGPU-Sim: https://github.com/gpgpu-sim/gpgpu-sim_simulations/tree/master/benchmarks/src/cuda/cudnn/mnist

My CUDA version is 9.0,
and I tried cuDNN 7.6.5, cuDNN 7.1.4, and cuDNN 7.0.5.

All failed with a new error.

GPGPU-Sim PTX: WARNING: Asynchronous memset not supported (cudaError_t cudaMemsetAsync(void*, int, size_t, cudaStream_t))
./mnistCUDNN: relocation error: ./mnistCUDNN: symbol cudaFuncSetAttribute, version libcudart.so.9.0 not defined in file libcudart.so.9.0 with link time reference

Have you encountered that error?
Or does the code run to completion successfully with CUDA 9.1 and cuDNN 7.0.5?

Thank you for your time.
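
The relocation error above says that the libcudart.so.9.0 found at run time does not provide cudaFuncSetAttribute. A rough way to check whether a given shared library exports a symbol by name is the Python sketch below, which uses the standard `ctypes` module. The function name `library_exports` is made up for illustration, and the check looks only at the symbol name, not at the symbol version mentioned in the error message.

```python
import ctypes


def library_exports(path, symbol):
    """Return True or False depending on whether the library exports `symbol`.

    Returns None if the library cannot be loaded at all. Passing path=None
    inspects the symbols already visible to the current process (POSIX only).
    """
    try:
        library = ctypes.CDLL(path)
    except OSError:
        return None
    # ctypes raises AttributeError for a missing symbol, so hasattr() suffices.
    return hasattr(library, symbol)


# Resolved through the normal loader search path, including LD_LIBRARY_PATH.
print(library_exports("libcudart.so.9.0", "cudaFuncSetAttribute"))
```

Because the name is resolved through the loader's usual search path, running this in the same shell as the failing program should inspect the same libcudart.so.9.0 that the program would load.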


6 participants