"cudamat.cudamat.CUDAMatException: CUBLAS error." occurs when running the multimodal_dbm example #69

Open
Demoscai opened this issue Jul 25, 2014 · 8 comments

@Demoscai

Hi @nitishsrivastava,
I have a problem when running the multimodal_dbm example:
Train Step: 0
Traceback (most recent call last):
File "/home/meitu299/deepnet/deepnet/trainer.py", line 60, in
main()
File "/home/meitu299/deepnet/deepnet/trainer.py", line 54, in main
model.Train()
File "/home/meitu299/deepnet/deepnet/neuralnet.py", line 631, in Train
self.GetTrainBatch()
File "/home/meitu299/deepnet/deepnet/neuralnet.py", line 524, in GetTrainBatch
self.GetBatch(self.train_data_handler)
File "/home/meitu299/deepnet/deepnet/dbm.py", line 264, in GetBatch
super(DBM, self).GetBatch(handler=handler)
File "/home/meitu299/deepnet/deepnet/neuralnet.py", line 512, in GetBatch
data_list = handler.Get()
File "/home/meitu299/deepnet/deepnet/datahandler.py", line 627, in Get
batch = self.gpu_cache.Get(self.batchsize, get_last_piece=self.get_last_piece)
File "/home/meitu299/deepnet/deepnet/datahandler.py", line 396, in Get
self.LoadData()
File "/home/meitu299/deepnet/deepnet/datahandler.py", line 327, in LoadData
self.data[i] = cm.CUDAMatrix(mat)
File "/home/meitu299/deepnet/cudamat/cudamat.py", line 195, in __init__
raise generate_exception(err_code)
cudamat.cudamat.CUDAMatException: CUBLAS error.

My machine has 8 GB of RAM and a 3 GB GPU, with CUDA 6.0.
I followed your INSTALL instructions, but this always happens.
Could you tell me how to resolve it? Is it a bug?
Thanks.

@cbalint13

Try reducing the batch size from 128 to 100.
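
A minimal sketch of the suggested edit, assuming the example's trainer .pbtxt exposes a batchsize field as the deepnet example configs do (check the exact field name in your own file):

  # Hypothetical fragment of the trainer .pbtxt for this example.
  # The field name 'batchsize' is assumed from the deepnet example configs.
  batchsize: 100  # was 128; smaller batches need less GPU memory per step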

@Demoscai

I have tried reducing the batch size to 50, but it doesn't work.

@jormansa

Try setting the "gpu_memory" value in your .pbtxt file to "2G" or "2.5G".
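
A sketch of the kind of edit meant here, using the field names mentioned in this thread (your .pbtxt may spell them differently, so check before editing):

  # Hypothetical fragment of the .pbtxt; field names are taken from this
  # thread rather than verified against the proto definition.
  gpu_memory: "2G"    # keep well below the card's physical 3G
  main_memory: "4G"

Leaving a margin below the card's physical memory matters because the CUDA context and CUBLAS workspaces take GPU memory on top of whatever deepnet reserves for its own buffers.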

@Demoscai

Thanks, that fixed it.

@tengshaofeng

Thanks in advance.
I have a similar problem.
When I run the ff example, I set the steps from 1000000 to 10000, the batchsize from 100 to 10, the gpu_memory from 2G to 0.1G, and the main_memory from 4G to 0.7G.
But when I reach step 499, it still fails like this:

File "/home/tbq/Downloads/deepnet-master/deepnet/softmax_layer.py", line 65, in GetLoss
perf.correct_preds = temp.sum()
File "/home/tbq/Downloads/deepnet-master/cudamat/cudamat.py", line 720, in sum
return vdot(self, CUDAMatrix.ones.slice(0, self.shape[0]*self.shape[1]))
File "/home/tbq/Downloads/deepnet-master/cudamat/cudamat.py", line 1650, in vdot
raise generate_exception(err_code.value)
cudamat.cudamat.CUDAMatException: CUBLAS error.

My machine has 1 GB of RAM and 256 MB of GPU memory, with CUDA 5.5.

When I try the dbm and rbm examples, the same problem also occurs.
I want to know whether my CPU and GPU simply don't meet the requirements.
Thanks.

@tengshaofeng

Sorry, English is not my mother tongue. In addition, gcc is 4.6.3.

@jnhwkim

jnhwkim commented May 11, 2015

In my case, I decreased gpu_mem to 1G in run_all_dbn.sh, even though my GPU has 4 GB of memory (NVIDIA GeForce GTX 780M, 4096 MB).
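
For anyone looking for where that setting lives, here is a sketch of the kind of change, assuming run_all_dbn.sh defines gpu_mem and main_mem variables that get substituted into the example's .pbtxt configs; the variable names follow this comment rather than a verified copy of the script:

  # Hypothetical excerpt of run_all_dbn.sh; variable names follow the comment above.
  gpu_mem=1G    # reserve less than the card's physical memory (4 GB here)
  main_mem=4G   # host-side cache size; adjust to your RAM
  # The setup step run by the script presumably substitutes these values
  # into the generated .pbtxt files.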

@chaojiewang94

Thank you all. Reducing the gpu_mem really helps and the code starts to work, but at the end of training the first layer the bug happens again. Is the gpu_mem still too large?
And what would happen if I reduced the gpu_mem further?

Thanks a lot to anyone who can help.
