Requirements #8

Closed
AziziShekoofeh opened this issue Nov 8, 2018 · 5 comments

Comments

@AziziShekoofeh

Hi,

Thanks for providing the package. I have a question about the requirements. I am trying to train the generative inpainting code [https://github.com/JiahuiYu/generative_inpainting], which uses the neuralgym package. At the start of the training procedure [after the GPUs are assigned and even after the graphs are generated], I get a few errors related to the versions of the cuDNN and CUDA libraries that I am using. However, I previously had no issues running the test code of the generative inpainting model.

Ok, long story short, previously I had:
cuDNN v7.1.4
CUDA 9.0
Nvidia Driver 384.130

After the training error, it suggested that I upgrade to cuDNN v7.2.x. Now, with this new version, I am getting:
CUDNN_STATUS_NOT_INITIALIZED

and again, it suggests that I need to upgrade my driver. So I upgraded to Nvidia driver 390.87, but now it fails to detect all of the GPUs and reports that there are not enough GPUs, even though I have enough.

My question is: is there any safe, tested combination of TensorFlow, Nvidia driver, CUDA, and cuDNN versions that works well with neuralgym? I would really appreciate it if you could share your versions.

-Many Thanks, Shek

@JiahuiYu
Owner

JiahuiYu commented Nov 8, 2018

@AziziShekoofeh Can you obtain results with nvidia-smi?
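
For readers who want to capture this check in a script, here is a minimal sketch that reads the GPU list and driver version from `nvidia-smi` in its CSV query mode. The helper names `parse_gpu_rows` and `query_gpus` are hypothetical and are not part of neuralgym; the sketch also assumes that GPU names contain no commas.

```python
import subprocess

# nvidia-smi's CSV query mode prints one line per GPU, e.g. "0, <gpu name>, 396.54".
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,driver_version",
    "--format=csv,noheader",
]


def parse_gpu_rows(text):
    """Turn the CSV text printed by nvidia-smi into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        index, name, driver = [field.strip() for field in line.split(",")]
        rows.append({"index": int(index), "name": name, "driver": driver})
    return rows


def query_gpus():
    """Run nvidia-smi and return one dict per GPU that the driver can see."""
    output = subprocess.check_output(QUERY).decode("utf-8")
    return parse_gpu_rows(output)
```

Calling `query_gpus()` on the training machine shows how many GPUs the driver itself reports, which can then be compared with the number of GPUs the training job asks for.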

@AziziShekoofeh
Author

You mean the versions and so on? Yes, nvidia-smi is working.

@JiahuiYu
Owner

JiahuiYu commented Nov 8, 2018

@AziziShekoofeh OK. I guess the problem is with TensorFlow-GPU. Please refer to issues like this one.

Can you successfully train other code/models with TensorFlow? I wonder whether the problem is on my side or on the TensorFlow/environment side.
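
One quick way to separate the two cases, independent of neuralgym, is to ask TensorFlow which devices it can see. The sketch below assumes a TensorFlow 1.x installation, where `device_lib.list_local_devices()` is available; the helper names `gpu_device_names` and `report_visible_gpus` are hypothetical.

```python
def gpu_device_names(devices):
    """Return the names of all devices whose device_type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


def report_visible_gpus():
    """Print and return the GPUs that TensorFlow itself can see."""
    # Imported here so that gpu_device_names can be used without TensorFlow.
    from tensorflow.python.client import device_lib

    names = gpu_device_names(device_lib.list_local_devices())
    print("TensorFlow sees %d GPU(s): %s" % (len(names), names))
    return names
```

If `report_visible_gpus()` lists fewer GPUs than `nvidia-smi` does, the mismatch is likely in the TensorFlow/CUDA/driver environment rather than in the training code.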

@AziziShekoofeh
Author

Thanks for your prompt responses. Yes, I could.
Anyway, I finally found the proper combination. I am writing it down for other people's reference:

My previous setting [it was fine during testing, but I had an issue with training]:
cuDNN v7.1.4
CUDA 9.0
Nvidia Driver 384.130
TF 1.4

my current setting and upgrades:
cuDNN v7.2.x
CUDA 9.0
Nvidia Driver 396.54
TF 1.4

I think neuralgym needs cuDNN v7.2.x or higher [I don't know exactly where this dependency comes from], and the Nvidia driver should be 396.54 or higher.
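
The two settings above can be compared mechanically. The following is a minimal sketch that checks installed versions against the minimums reported in this comment; these are one user's observations with TF 1.4 and CUDA 9.0, not official requirements. The names `parse_version`, `REPORTED_MINIMUMS` and `below_reported_minimums` are hypothetical.

```python
# Minimums as reported in this thread for TF 1.4 with CUDA 9.0 (not official).
REPORTED_MINIMUMS = {"cudnn": "7.2", "driver": "396.54"}


def parse_version(text):
    """Convert a version string such as 'v7.1.4' into a tuple like (7, 1, 4)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def below_reported_minimums(installed):
    """Return the component names whose installed version is below the minimum."""
    return [
        name
        for name, minimum in sorted(REPORTED_MINIMUMS.items())
        if parse_version(installed[name]) < parse_version(minimum)
    ]
```

For example, `below_reported_minimums({"cudnn": "7.1.4", "driver": "384.130"})` flags both components of the previous setting, while a driver of 390.87 alone would still be flagged as too old.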

Anyway, many thanks for the response and the useful package.

@JiahuiYu
Owner

JiahuiYu commented Nov 8, 2018

@AziziShekoofeh Thanks, and I also appreciate your contribution in sharing your experience.

As I understand it, the neuralgym package itself does not require any cuDNN-related libraries. The cuDNN requirement comes in implicitly when TensorFlow is imported.

@JiahuiYu JiahuiYu closed this as completed Nov 8, 2018