Requirements #8

AziziShekoofeh · 2018-11-08T01:19:51Z

Hi,

Thanks for providing the package. I have a question about the requirements. I am trying to train the generative inpainting code [https://github.com/JiahuiYu/generative_inpainting], which uses the neuralgym package. At the start of training [after the GPUs are assigned and even after the graphs are generated], I get a few errors related to the versions of cuDNN and the CUDA library that I am using. However, I previously had no issues running the test code of the generative inpainting model.

Ok, long story short, previously I had:
cuDNN v7.1.4
CUDA 9.0
Nvidia Driver 384.130

The training error suggested that I upgrade to cuDNN v7.2.x. Now, after that upgrade, I am getting:
CUDNN_STATUS_NOT_INITIALIZED

and again it suggests that I need to upgrade my driver. So I upgraded to Nvidia driver 390.87, but now it fails to detect all of the GPUs and reports that there are not enough GPUs, even though I have enough.

My question is: is there a safe, tested combination of TensorFlow, Nvidia driver, CUDA, and cuDNN versions that works well with neuralgym? I would really appreciate it if you could share your versions.

-Many Thanks, Shek

JiahuiYu · 2018-11-08T01:25:34Z

@AziziShekoofeh Can you obtain results with nvidia-smi?

AziziShekoofeh · 2018-11-08T01:35:52Z

Do you mean the versions and so on? Yes, nvidia-smi is working.
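For readers who want to capture the same information programmatically, here is a minimal sketch that asks nvidia-smi for the GPU list and driver version. It assumes nvidia-smi is on the PATH and supports the `--query-gpu` and `--format` options; the helper names `parse_gpu_query` and `query_gpus` are made up for this illustration and are not part of neuralgym or TensorFlow.

```python
import shutil
import subprocess


def parse_gpu_query(csv_text):
    """Parse 'index, name, driver_version' CSV rows into dictionaries."""
    gpus = []
    for line in csv_text.strip().splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 3:
            continue  # skip rows that do not match the expected layout
        index, name, driver = fields
        gpus.append({"index": int(index), "name": name, "driver": driver})
    return gpus


def query_gpus():
    """Return the GPUs reported by nvidia-smi, or None if it cannot be run."""
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        result = subprocess.run(
            ["nvidia-smi",
             "--query-gpu=index,name,driver_version",
             "--format=csv,noheader"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_gpu_query(result.stdout)
```

Calling `query_gpus()` returns one dictionary per GPU, so `len(query_gpus() or [])` can be compared with the number of GPUs the training configuration asks for. A `None` result only means nvidia-smi could not be run, not that no GPU exists.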

JiahuiYu · 2018-11-08T01:43:10Z

@AziziShekoofeh Ok. I guess the problem is with TensorFlow-GPU. Please refer to issues like this one.

Can you successfully train other code/models with TensorFlow? I wonder whether the problem is on my side or on the TensorFlow/environment side.

AziziShekoofeh · 2018-11-08T01:52:50Z

Thanks for your prompt responses. Yes, I could.
Anyway, I finally found a working combination. I am writing it down for other people's reference:

My previous setting [it was fine during testing, but I had an issue with training]:
cuDNN v7.1.4
CUDA 9.0
Nvidia Driver 384.130
TF 1.4

My current setting after the upgrades:
cuDNN v7.2.x
CUDA 9.0
Nvidia Driver 396.54
TF 1.4

I think neuralgym needs cuDNN v7.2.x or higher [I don't know exactly where this dependency comes from], and the Nvidia driver should be 396.54 or higher.
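The comparison above can be sketched as a small version check. The minimums below are only the values reported in this comment (cuDNN 7.2, driver 396.54), not official requirements of neuralgym or TensorFlow, and the names `parse_version`, `meets_minimum`, `REPORTED_MINIMUMS`, and `check_setup` are hypothetical helpers for this illustration.

```python
def parse_version(text):
    """Turn '7.2.x' or 'v7.1.4' into a tuple of ints, stopping at wildcards."""
    parts = []
    for piece in text.lstrip("vV").split("."):
        if not piece.isdigit():
            break  # stop at non-numeric pieces such as 'x'
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    have, need = parse_version(installed), parse_version(minimum)
    # Pad the shorter tuple with zeros so '7.2' compares like '7.2.0'.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


# Assumption: minimums as reported in this thread, not an official matrix.
REPORTED_MINIMUMS = {"cudnn": "7.2", "driver": "396.54"}


def check_setup(installed):
    """Map each component name to True/False against REPORTED_MINIMUMS."""
    return {
        name: meets_minimum(installed[name], minimum)
        for name, minimum in REPORTED_MINIMUMS.items()
    }
```

With the two settings listed above, `check_setup({"cudnn": "7.1.4", "driver": "384.130"})` flags both components as too old, while a cuDNN 7.2.x release with driver 396.54 passes both checks.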

Anyway, many thanks for the response and the useful package.

JiahuiYu · 2018-11-08T03:09:00Z

@AziziShekoofeh Thanks, and I also appreciate your contribution in sharing your experience.

As I understand it, the neuralgym package itself does not require any cuDNN-related libraries. cuDNN is required only implicitly, when TensorFlow is imported.
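One way to separate a TensorFlow or environment problem from a neuralgym problem is to ask TensorFlow directly which GPUs it can see, without importing neuralgym at all. The sketch below is only a rough check under that assumption: `list_tf_gpus` is a made-up helper name, and listing devices shows GPU visibility but does not itself run a cuDNN operation, so it will not reproduce an error such as CUDNN_STATUS_NOT_INITIALIZED.

```python
def list_tf_gpus():
    """Return names of GPUs visible to TensorFlow, or None if it is not installed."""
    try:
        import tensorflow  # noqa: F401  (loads the CUDA libraries on import)
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return [
        device.name
        for device in device_lib.list_local_devices()
        if device.device_type == "GPU"
    ]
```

If this list is shorter than the number of GPUs shown by nvidia-smi, the mismatch is on the TensorFlow/driver side rather than in neuralgym.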

JiahuiYu closed this as completed Nov 8, 2018
