failed to run cuBLAS routine cublasSgemm_v2: CUBLAS_STATUS_EXECUTION_FAILED /root/autodl-tmp/MGCNet #57

Open
Lijuming33 opened this issue Apr 3, 2024 · 0 comments


My environment is as follows:
Graphics card: RTX 3080 Ti, CUDA 10.0
tensorflow-gpu==1.13.1 tensorboard==1.13.1 tensorflow-estimator==1.13.0 protobuf==3.20.3 trimesh==2.38.40 h5py==2.10.0 numpy==1.16.6 opencv-python==3.4.2.17 scikit-image==0.14.5 scikit-learn==0.20.3 scipy==1.2.2 face-alignment==1.3.3 packaging==23.1 shapely==2.0.3
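
One thing that may be worth checking in this environment (a guess from the versions listed, not a confirmed diagnosis) is whether the CUDA toolkit is new enough for the GPU architecture. The RTX 3080 Ti is an Ampere card with compute capability 8.6, and to my knowledge toolkit support for that capability first appeared in CUDA 11.1. The sketch below encodes that as a small lookup; the table `MIN_CUDA_FOR_CAPABILITY` and the helper names are hypothetical and cover only a few architectures.

```python
# Hypothetical lookup: the first CUDA toolkit release that can target a
# given compute capability. Only a few entries are listed for illustration.
MIN_CUDA_FOR_CAPABILITY = {
    (7, 5): (10, 0),  # Turing
    (8, 0): (11, 0),  # Ampere GA100
    (8, 6): (11, 1),  # Ampere GA10x, e.g. RTX 3080 Ti
}


def parse_version(text):
    """Turn a version string such as '10.0' into a (major, minor) tuple."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def cuda_supports(capability, cuda_version):
    """Return True if the toolkit version is at least the minimum required."""
    return cuda_version >= MIN_CUDA_FOR_CAPABILITY[capability]


if __name__ == "__main__":
    # Values taken from the environment described in this issue.
    print(cuda_supports((8, 6), parse_version("10.0")))
```

If the check reports a mismatch, the cuBLAS failure could come from the toolkit and GPU combination rather than from the MGCNet code itself, though other causes such as GPU memory exhaustion are not ruled out by this.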

The error output is below. I changed the GPU-related code, but the same errors keep appearing: "failed to run cuBLAS routine cublasSgemm_v2: CUBLAS_STATUS_EXECUTION_FAILED" and "InternalError (see above for traceback): Blas xGEMM launch failed":

2024-04-03 02:43:46.041406: E tensorflow/stream_executor/cuda/cuda_blas.cc:698] failed to run cuBLAS routine cublasSgemm_v2: CUBLAS_STATUS_EXECUTION_FAILED
/root/autodl-tmp/MGCNet
Finish copy
[_DeviceAttributes(/job:localhost/replica:0/task:0/device:CPU:0, CPU, 268435456, 10724500243007706377), _DeviceAttributes(/job:localhost/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 17179869184, 8698154402609908115), _DeviceAttributes(/job:localhost/replica:0/task:0/device:XLA_GPU:0, XLA_GPU, 17179869184, 5407358957467732393), _DeviceAttributes(/job:localhost/replica:0/task:0/device:GPU:0, GPU, 6315147264, 4820318963678066325)]
Traceback (most recent call last):
File "/root/miniforge-pypy3/envs/mgcnet/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/root/miniforge-pypy3/envs/mgcnet/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/root/miniforge-pypy3/envs/mgcnet/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InternalError: Blas xGEMM launch failed : a.shape=[1,35709,4], b.shape=[1,4,4], m=35709, n=4, k=4
[[{{node MatMul_37}}]]
[[{{node convert_image_4}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "test_image.py", line 143, in
pred = system.inference(sess, image_rgb_b)
File "/root/autodl-tmp/MGCNet/src_tfGraph/build_graph.py", line 1103, in inference
results = sess.run(fetches, feed_dict={'pl_input:0':inputs})
File "/root/miniforge-pypy3/envs/mgcnet/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/root/miniforge-pypy3/envs/mgcnet/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/root/miniforge-pypy3/envs/mgcnet/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/root/miniforge-pypy3/envs/mgcnet/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Blas xGEMM launch failed : a.shape=[1,35709,4], b.shape=[1,4,4], m=35709, n=4, k=4
[[node MatMul_37 (defined at /root/autodl-tmp/MGCNet/thirdParty/tf_mesh_renderer/mesh_renderer/rasterize_triangles.py:105) ]]
[[node convert_image_4 (defined at /root/autodl-tmp/MGCNet/src_tfGraph/build_graph.py:1006) ]]

Caused by op 'MatMul_37', defined at:
File "test_image.py", line 92, in
FLAGS, img_height=FLAGS.img_height, img_width=FLAGS.img_width, batch_size=FLAGS.batch_size
File "/root/autodl-tmp/MGCNet/src_tfGraph/build_graph.py", line 949, in build_test_graph
self.build_testVisual_graph()
File "/root/autodl-tmp/MGCNet/src_tfGraph/build_graph.py", line 1039, in build_testVisual_graph
self.gpmm_frustrum, gpmm_tar_mvMain, gpmm_tar_eyeMain, fore=opt.flag_fore, tone=False, background=-1
File "/root/autodl-tmp/MGCNet/src_tfGraph/deep_3dmm_decoder.py", line 309, in decoder_renderColorMesh_gary
mtx_perspect_frustrum, list_mtx_model_view[i], list_cam_position[i], tone, background
File "/root/autodl-tmp/MGCNet/src_tfGraph/deep_3dmm_decoder.py", line 529, in gpmm_render_image_garyLight
opt.img_width, opt.img_height, background=background)
File "/root/autodl-tmp/MGCNet/src_common/geometry/render/api_tf_mesh_render.py", line 262, in mesh_renderer_camera
image_width, image_height, [background] * vertex_attributes.shape[2].value)
File "/root/autodl-tmp/MGCNet/thirdParty/tf_mesh_renderer/mesh_renderer/rasterize_triangles.py", line 105, in rasterize_triangles
vertices_homogeneous, projection_matrices, transpose_b=True)
File "/root/miniforge-pypy3/envs/mgcnet/lib/python3.7/site-packages/tensorflow/python/ops/math_ops.py", line 2417, in matmul
a, b, adj_x=adjoint_a, adj_y=adjoint_b, name=name)
File "/root/miniforge-pypy3/envs/mgcnet/lib/python3.7/site-packages/tensorflow/python/ops/gen_math_ops.py", line 1423, in batch_mat_mul
"BatchMatMul", x=x, y=y, adj_x=adj_x, adj_y=adj_y, name=name)
File "/root/miniforge-pypy3/envs/mgcnet/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "/root/miniforge-pypy3/envs/mgcnet/lib/python3.7/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/root/miniforge-pypy3/envs/mgcnet/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
op_def=op_def)
File "/root/miniforge-pypy3/envs/mgcnet/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
self._traceback = tf_stack.extract_stack()

InternalError (see above for traceback): Blas xGEMM launch failed : a.shape=[1,35709,4], b.shape=[1,4,4], m=35709, n=4, k=4
[[node MatMul_37 (defined at /root/autodl-tmp/MGCNet/thirdParty/tf_mesh_renderer/mesh_renderer/rasterize_triangles.py:105) ]]
[[node convert_image_4 (defined at /root/autodl-tmp/MGCNet/src_tfGraph/build_graph.py:1006) ]]
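
As a quick sanity check on the message above, the reported shapes can be compared with the reported GEMM sizes to see whether this is a shape mismatch or a failure at launch time. The following is a minimal sketch using only the standard library; the helper names `parse_gemm_error` and `shapes_consistent` are hypothetical, and `transpose_b=True` is taken from the `rasterize_triangles.py` call shown in the traceback.

```python
import re

# Error text copied from the traceback in this issue.
MESSAGE = ("Blas xGEMM launch failed : a.shape=[1,35709,4], "
           "b.shape=[1,4,4], m=35709, n=4, k=4")


def parse_gemm_error(message):
    """Extract the operand shapes and the m/n/k sizes from the message."""
    shapes = {
        name: [int(d) for d in dims.split(",")]
        for name, dims in re.findall(r"(\w)\.shape=\[([\d,]+)\]", message)
    }
    sizes = {
        name: int(value)
        for name, value in re.findall(r"\b([mnk])=(\d+)", message)
    }
    return shapes, sizes


def shapes_consistent(shapes, sizes, transpose_b=True):
    """Check that a @ b (optionally with b transposed) matches m, n, k."""
    a, b = shapes["a"], shapes["b"]
    rows_b, cols_b = b[-2], b[-1]
    # With transpose_b the inner dimension is b's last axis.
    inner_b, n = (cols_b, rows_b) if transpose_b else (rows_b, cols_b)
    return (
        a[0] == b[0]                      # same batch size
        and a[-2] == sizes["m"]
        and a[-1] == sizes["k"] == inner_b
        and n == sizes["n"]
    )


if __name__ == "__main__":
    shapes, sizes = parse_gemm_error(MESSAGE)
    print(shapes_consistent(shapes, sizes))
```

The shapes here agree with m, n and k, which suggests the operands themselves are well-formed and the failure happens when cuBLAS executes the kernel on the device.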
