Can not learn with CUDA #29

vankhoa21991 · 2019-05-29T09:05:03Z

Hello, when I ran cmake, I had to remove everything in the exec folder except n2d2.cpp, because otherwise it failed with this error:
"add_executable cannot create target "n2d2" because another target with the same name already exists."

Then I ran make as usual. But when I tested the model, I ran into the error below with CUDA, and if I switch to Frame only, the model does not learn. Do you know what I did wrong? Thanks

sudo ./build/bin/n2d2 models/mnist24_16c4s2_24c5s2_150_10.ini -learn 40000000 -log 100000
Option -log: number of steps between logs [100000]
Option -learn: number of backprop learning steps [40000000]
Loading network configuration file models/mnist24_16c4s2_24c5s2_150_10.ini
Layer: conv1 [Conv(Frame_CUDA)]
Notice: Could not open configuration file: conv1.cfg

Shared synapses: 256

Virtual synapses: 30976

Inputs dims: 24 24 1

Outputs dims: 11 11 16

Warning: No monitor could be added to Cell: conv1
Layer: conv2 [Conv(Frame_CUDA)]
Notice: Could not open configuration file: conv2.cfg

Shared synapses: 2250

Virtual synapses: 36000

Inputs dims: 11 11 16

Outputs dims: 4 4 24

Warning: No monitor could be added to Cell: conv2
Layer: fc1 [Fc(Frame_CUDA)]
Notice: Could not open configuration file: fc1.cfg

Synapses: 57600

Inputs dims: 4 4 24

Outputs dims: 1 1 150

Warning: No monitor could be added to Cell: fc1
Layer: fc1.drop [Dropout(Frame_CUDA)]
Notice: Could not open configuration file: fc1.drop.cfg

Inputs dims: 1 1 150

Outputs dims: 1 1 150

Warning: No monitor could be added to Cell: fc1.drop
Layer: fc2 [Fc(Frame_CUDA)]
Notice: Could not open configuration file: fc2.cfg

Synapses: 1500

Inputs dims: 1 1 150

Outputs dims: 1 1 10

Warning: No monitor could be added to Cell: fc2
Layer: softmax [Softmax(Frame_CUDA)]
Notice: Could not open configuration file: softmax.cfg

Inputs dims: 1 1 10

Outputs dims: 1 1 10

Target: softmax (target value: 1 / default value: 0 / top-n value: 1)
Warning: No monitor could be added to Cell: softmax
Total number of neurons: 2640
Total number of nodes: 2640
Total number of synapses: 61606
Total number of virtual synapses: 126076
Total number of connections: 126076
Notice: Unused section softmax.Target in INI file
CUDNN failure: CUDNN_STATUS_NOT_INITIALIZED (1) in /home/kevin/IMRA_le/3_Program/SNN/N2D2/include/CudaContext.hpp:58
Time elapsed: 1.79893 s
Error: CUDNN failure: CUDNN_STATUS_NOT_INITIALIZED (1) in /home/kevin/IMRA_le/3_Program/SNN/N2D2/include/CudaContext.hpp:58

vankhoa21991 · 2019-05-29T09:12:31Z

olivierbichler-cea · 2019-05-29T10:09:17Z

Hello,
Do you have CuDNN properly installed? What is your CuDNN version?
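One way to answer the version question independently of the build system is to ask the cuDNN shared library itself. The following is a minimal sketch, not part of N2D2: the function names are made up for illustration, and it assumes the library can be found by the dynamic loader under the name `libcudnn.so` and uses the cuDNN 7.x version encoding (major*1000 + minor*100 + patch). `cudnnGetVersion` is the cuDNN call that returns this integer.

```python
import ctypes


def decode_cudnn_version(raw):
    """Split a cuDNN 7.x version integer (major*1000 + minor*100 + patch)."""
    major, rest = divmod(raw, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def loaded_cudnn_version(lib_name="libcudnn.so"):
    """Return (major, minor, patch) of the cuDNN library the loader finds,
    or None if the library cannot be loaded at all."""
    try:
        lib = ctypes.CDLL(lib_name)
    except OSError:
        return None
    # cudnnGetVersion() returns a size_t holding the encoded version.
    lib.cudnnGetVersion.restype = ctypes.c_size_t
    return decode_cudnn_version(lib.cudnnGetVersion())


if __name__ == "__main__":
    print("cuDNN:", loaded_cudnn_version())
```

A result of `None` would suggest the library is not on the loader path. A version tuple only confirms which cuDNN build is picked up at run time; it says nothing about whether the installed driver supports it.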

vankhoa21991 · 2019-05-29T12:20:55Z

This is the output from cmake:
sudo cmake -DCMAKE_C_COMPILER=gcc-6 -DCMAKE_CXX_COMPILER=g++-6 ..
-- cotire 1.8.0 loaded.
-- No PugiXML found
-- MongoDB not found.
-- CuDNN library status:
-- version: 7.4.1
-- include path: /usr/local/cuda/include
-- libraries: /usr/local/cuda/lib64/libcudnn.so
-- Configuring done
-- Generating done

olivierbichler-cea · 2019-05-29T16:05:45Z

It looks like your driver version is not compatible with your CuDNN version, according to the CuDNN support matrix: https://docs.nvidia.com/deeplearning/sdk/cudnn-support-matrix/index.html

vankhoa21991 · 2019-05-30T07:24:20Z

Thank you. I currently have cuDNN 7.4.1, CUDA 9.2 and driver 390.116. Should I downgrade the driver to 384.11, or downgrade CUDA to 9.0? It looks like my driver is not in this table.

olivierbichler-cea · 2019-06-07T13:46:39Z

According to the table, you should upgrade your driver to r396.26. I recommend upgrading it if you can, rather than downgrading other components.
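The comparison behind this advice can be sketched as a small check. This is an illustrative sketch only: the helper names are hypothetical, and the table holds just the one pairing named in this thread (CUDA 9.2 with driver r396.26); any other entries would have to be taken from the support matrix.

```python
def parse_version(text):
    """Turn a dotted version string such as '390.116' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Minimum driver per CUDA version. Only the pairing quoted in this thread
# is listed; extend it from the cuDNN support matrix as needed.
MIN_DRIVER_FOR_CUDA = {"9.2": "396.26"}


def driver_is_sufficient(driver, cuda):
    """True if the driver is at least the minimum listed for this CUDA version."""
    return parse_version(driver) >= parse_version(MIN_DRIVER_FOR_CUDA[cuda])


print(driver_is_sufficient("390.116", "9.2"))  # False: 390 < 396
```

Versions are compared as integer tuples rather than as strings or floats, so that a minor number such as 116 is treated as larger than 26 within the same major release.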

olivierbichler-cea · 2019-06-13T10:03:24Z

Learning in Frame only should work with the latest version of N2D2. There was a bug, which has since been fixed.

olivierbichler-cea · 2019-07-11T07:45:31Z

Closing the issue, as this is a driver problem. Please feel free to re-open it if necessary.

olivierbichler-cea closed this as completed Jul 11, 2019
