This repository has been archived by the owner on May 24, 2018. It is now read-only.
I've been able to run the kaggle_bowl example, but on the GPU the reported train-error stays flat. When I run the same bowl.conf with only dev=cpu changed, training appears to progress normally (I've appended the output of both runs below). I haven't otherwise changed bowl.conf. I also ran the MNIST example on the GPU and it worked fine.
Do you have any idea what might be going on here? Thanks!
Saw this on the kaggle forums and it fixed it for me:
"I discovered a warning during the compilation of 'tensor_gpu-inl.cuh', which is part of mshadow and located in 'cxxnet-master/mshadow/mshadow/cuda'. The warning said that the CUDA architecture of the GPU could not be determined and would be set to 2.0 automatically. It seemed that the __CUDA_ARCH__ macro, which is checked for the architecture, was broken. Setting this macro by hand like
#define __CUDA_ARCH__ 300
resolved the issue for me (in my case, the 300 comes from the GTX 680 having compute capability 3.0). Maybe this can help with your problem too..."
Except mine is a GTX 760, so I put #define __CUDA_ARCH__ 500. It gave me a redefinition error, but after that everything worked again. So just run make clean, add this line to tensor_gpu-inl.cuh, and then run make again.
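In case it helps with picking the number: as far as I know, the macro nvcc normally sets, `__CUDA_ARCH__`, is the compute capability written as major * 100 + minor * 10, so 3.0 becomes 300. Below is a minimal sketch of that arithmetic. `cuda_arch_value` is a hypothetical helper name of mine, not something from mshadow or cxxnet, and the compute capability itself still has to be looked up for your card. NVIDIA's tables list the GTX 760 as 3.0, which would give 300 rather than 500, and the Tesla C1060 as 1.3.

```cpp
// Hypothetical helper (not part of mshadow or cxxnet): converts a compute
// capability "major.minor" into the three-digit integer form that
// __CUDA_ARCH__ uses, e.g. 3.0 -> 300.
int cuda_arch_value(int major, int minor) {
    return major * 100 + minor * 10;
}
```

For example, `cuda_arch_value(3, 0)` gives 300 for a compute capability 3.0 card, and `cuda_arch_value(1, 3)` gives 130 for a 1.3 card.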
Output from the original report (GPU run first, then the CPU run):
Use CUDA Device 0: Tesla C1060
CXXNetTrainer, devCPU=0
ConvolutionLayer: nstep=256
ConvolutionLayer: nstep=256
ConvolutionLayer: nstep=128
SGDUpdater: eta=0.010000, mom=0.900000
SGDUpdater: eta=0.020000, mom=0.900000
SGDUpdater: eta=0.010000, mom=0.900000
SGDUpdater: eta=0.020000, mom=0.900000
SGDUpdater: eta=0.010000, mom=0.900000
SGDUpdater: eta=0.020000, mom=0.900000
SGDUpdater: eta=0.010000, mom=0.900000
SGDUpdater: eta=0.020000, mom=0.900000
SGDUpdater: eta=0.010000, mom=0.900000
SGDUpdater: eta=0.020000, mom=0.900000
SGDUpdater: eta=0.010000, mom=0.900000
SGDUpdater: eta=0.020000, mom=0.900000
node[0].shape: 256,3,48,48
node[1].shape: 256,96,12,12
node[2].shape: 256,96,12,12
node[3].shape: 256,96,6,6
node[4].shape: 256,128,8,8
node[5].shape: 256,128,8,8
node[6].shape: 256,128,8,8
node[7].shape: 256,128,8,8
node[8].shape: 256,128,4,4
node[9].shape: 1,1,256,2048
node[10].shape: 1,1,256,512
node[11].shape: 1,1,256,512
node[12].shape: 1,1,256,512
node[13].shape: 1,1,256,512
node[14].shape: 1,1,256,121
ThreadImagePageIterator:image_list=./train.lst, bin=./train.bin
loading mean image from models/image_mean.bin
ThreadBufferIterator: buffer_size=2
ThreadImagePageIterator:image_list=./train.lst, bin=./train.bin
loading mean image from models/image_mean.bin
ThreadBufferIterator: buffer_size=2
initializing end, start working
round 0:[ 100] 13 sec elapsed[1] train-error:0.999570
round 1:[ 100] 33 sec elapsed[2] train-error:0.999570
round 2:[ 100] 52 sec elapsed[3] train-error:0.999570
round 3:[ 100] 72 sec elapsed[4] train-error:0.999570
CXXNetTrainer, devCPU=1
ConvolutionLayer: nstep=256
ConvolutionLayer: nstep=256
ConvolutionLayer: nstep=128
SGDUpdater: eta=0.010000, mom=0.900000
SGDUpdater: eta=0.020000, mom=0.900000
SGDUpdater: eta=0.010000, mom=0.900000
SGDUpdater: eta=0.020000, mom=0.900000
SGDUpdater: eta=0.010000, mom=0.900000
SGDUpdater: eta=0.020000, mom=0.900000
SGDUpdater: eta=0.010000, mom=0.900000
SGDUpdater: eta=0.020000, mom=0.900000
SGDUpdater: eta=0.010000, mom=0.900000
SGDUpdater: eta=0.020000, mom=0.900000
SGDUpdater: eta=0.010000, mom=0.900000
SGDUpdater: eta=0.020000, mom=0.900000
node[0].shape: 256,3,48,48
node[1].shape: 256,96,12,12
node[2].shape: 256,96,12,12
node[3].shape: 256,96,6,6
node[4].shape: 256,128,8,8
node[5].shape: 256,128,8,8
node[6].shape: 256,128,8,8
node[7].shape: 256,128,8,8
node[8].shape: 256,128,4,4
node[9].shape: 1,1,256,2048
node[10].shape: 1,1,256,512
node[11].shape: 1,1,256,512
node[12].shape: 1,1,256,512
node[13].shape: 1,1,256,512
node[14].shape: 1,1,256,121
ThreadImagePageIterator:image_list=./train.lst, bin=./train.bin
loading mean image from models/image_mean.bin
ThreadBufferIterator: buffer_size=2
ThreadImagePageIterator:image_list=./train.lst, bin=./train.bin
loading mean image from models/image_mean.bin
ThreadBufferIterator: buffer_size=2
initializing end, start working
round 0:[ 100] 311 sec elapsed[1] train-error:0.776947
round 1:[ 100] 815 sec elapsed[2] train-error:0.689883
update round 2
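For anyone comparing logs like the two runs above, here is a rough sketch of how the "flat train-error" symptom could be checked automatically. It assumes each progress line contains a `train-error:` field as in the output above. `parse_train_error` and `is_flat` are hypothetical helper names, not part of cxxnet, and the tolerance is an arbitrary choice.

```cpp
#include <cmath>
#include <cstddef>
#include <exception>
#include <string>
#include <vector>

// Hypothetical helper: extracts the number after "train-error:" from one log
// line. Returns false if the field is missing or cannot be parsed.
bool parse_train_error(const std::string& line, double& out) {
    const std::string key = "train-error:";
    const std::size_t pos = line.find(key);
    if (pos == std::string::npos) return false;
    try {
        out = std::stod(line.substr(pos + key.size()));
    } catch (const std::exception&) {
        return false;
    }
    return true;
}

// Returns true when there are at least two values and none of them differs
// from the first by more than tol (tol is an assumed threshold).
bool is_flat(const std::vector<double>& errors, double tol = 1e-6) {
    if (errors.size() < 2) return false;
    for (double e : errors) {
        if (std::fabs(e - errors.front()) > tol) return false;
    }
    return true;
}
```

Fed the four GPU rounds above, `is_flat` reports a flat error, while the two CPU rounds (0.776947, then 0.689883) do not count as flat.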