Sample code error on GPU: RuntimeError: cublasSgemm(dev.cublas_handle, #1582

Open

yang182 opened this issue Aug 15, 2019 · 1 comment

yang182 commented Aug 15, 2019

import numpy as np
import dynet_config
dynet_config.set_gpu()

import dynet as dy
from optparse import OptionParser
parser = OptionParser()
parser.add_option("--dynet-gpu", action="store_true")
class OurNetwork(object):
    def __init__(self, pc):
        self.pW = pc.add_parameters((10, 30))
        self.pB = pc.add_parameters(10)
        self.lookup = pc.add_lookup_parameters((500, 10))

    def __call__(self, inputs):
        lookup = self.lookup
        emb_vectors = [lookup[i] for i in inputs]
        net_input = dy.concatenate(emb_vectors)
        net_output = dy.softmax((self.pW * net_input) + self.pB)
        return net_output

    def create_network_return_loss(self, inputs, expected_output):
        dy.renew_cg()
        out = self(inputs)
        loss = -dy.log(dy.pick(out, expected_output))
        return loss

    def create_network_return_best(self, inputs):
        dy.renew_cg()
        out = self(inputs)
        return np.argmax(out.npvalue())

dy.init()
dy.renew_cg()

m = dy.Model()

network = OurNetwork(m)

trainer = dy.SimpleSGDTrainer(m)

for epoch in range(50):
    for inp, lbl in (([1, 2, 3], 1), ([3, 2, 4], 2)):
        loss = network.create_network_return_loss(inp, lbl)
        loss.value()
        loss.backward()
        trainer.update()
        print(loss.value())  # need to run loss.value() for the forward prop

print('Predicted smallest element among {} is {}:'.format([1,2,3], network.create_network_return_best([1,2,3])))

========================== output ==========================
[dynet] initializing CUDA
[dynet] CUDA driver/runtime versions are 10.0/10.0
Request for 1 GPU ...
[dynet] Device Number: 0
[dynet] Device name: GeForce RTX 2080 Ti
[dynet] Memory Clock Rate (KHz): 7000000
[dynet] Memory Bus Width (bits): 352
[dynet] Peak Memory Bandwidth (GB/s): 616
[dynet] Memory Free (GB): 1.65767/11.5229
[dynet]
[dynet] Device(s) selected: 0
[dynet] random seed: 3237171193
[dynet] allocating memory: 512MB
[dynet] memory allocation done.
WARNING: Attempting to initialize dynet twice. Ignoring duplicate initialization.
CUBLAS failure in cublasSgemm(dev.cublas_handle, CUBLAS_OP_N, CUBLAS_OP_N, y.d.rows(), y.d.cols() * y.d.batch_elems(), l.d.cols(), dev.kSCALAR_ONE, l.v, l.d.rows(), r.v, r.d.rows(), acc_scalar, y.v, y.d.rows())
13
Traceback (most recent call last):
File "/data/jupyter/hongchao/text2structure/model2note/model/test_dy_gpu.py", line 57, in
loss.value()
File "_dynet.pyx", line 769, in _dynet.Expression.value
File "_dynet.pyx", line 783, in _dynet.Expression.value
RuntimeError: cublasSgemm(dev.cublas_handle, CUBLAS_OP_N, CUBLAS_OP_N, y.d.rows(), y.d.cols() * y.d.batch_elems(), l.d.cols(), dev.kSCALAR_ONE, l.v, l.d.rows(), r.v, r.d.rows(), acc_scalar, y.v, y.d.rows())

Process finished with exit code 1
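
For context: cuBLAS status 13 is CUBLAS_STATUS_EXECUTION_FAILED, and the log above also shows only ~1.66 GB of the 11.5 GB device memory free, plus an "Attempting to initialize dynet twice" warning. Below is a minimal sketch of the usual initialization pattern for comparison; whether the duplicate init or the low free memory is actually behind the cublasSgemm failure is unverified, so treat this as an assumption to test rather than a fix.

import dynet_config
dynet_config.set_gpu()   # configuration must happen before "import dynet"
import dynet as dy       # the import itself initializes DyNet with the config above
# No explicit dy.init() afterwards: calling it again triggers the
# "Attempting to initialize dynet twice" warning seen in the log.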

cydur (Contributor) commented Feb 19, 2021

I've got a similar problem using C++ with CUDA 10.2.
In my case the forward step works fine and I get a loss value that looks OK, but in the backward step there is an exception saying:
** On entry to SGEMM parameter number 8 had an illegal value
CUBLAS failure in cublasSgemm(dev.cublas_handle, CUBLAS_OP_N, CUBLAS_OP_T, y.d.rows(), y.d.cols(), l.d.cols() * l.d.batch_elems(), dev.kSCALAR_ONE, l.v, l.d.rows(), r.v, r.d.rows(), dev.kSCALAR_ONE, y.v, y.d.rows())
7

This only occurs with dynet-autobatch turned on. Without autobatch it works fine.

Any ideas, anyone?
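
For reference: the "7" after the failure line above is cuBLAS status CUBLAS_STATUS_INVALID_VALUE, consistent with the "parameter number 8 had an illegal value" message. A minimal sketch of how autobatching is toggled from the Python binding follows; the C++ binding reads the same --dynet-autobatch command-line flag via dynet::initialize. Disabling it is only a workaround, not a fix, and the exact dynet_config.set keywords are an assumption taken from the DyNet Python docs.

import dynet_config
dynet_config.set(autobatch=False)  # set True to reproduce; must run before "import dynet"
import dynet as dy
# Equivalently, launch an unmodified script with:
#   python script.py --dynet-autobatch 0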
