cuda-samples/Samples/cudaTensorCoreGemm #2

Closed
hummingtree opened this issue Jul 30, 2018 · 2 comments

@hummingtree

I get the following error when running cuda-samples/Samples/cudaTensorCoreGemm:

Initializing...
GPU Device 0: "Tesla V100-SXM2-16GB" with compute capability 7.0

M: 4096 (16 x 256)
N: 4096 (16 x 256)
K: 4096 (16 x 256)
Preparing data for GPU...
Required shared memory size: 68 Kb
Computing...
CUDA error at cudaTensorCoreGemm.cu:474 code=77(cudaErrorIllegalAddress) "cudaEventSynchronize(stop)"

The error goes away if I decrease M_TILES, N_TILES, and K_TILES to 168 (from 256).

Any ideas about this?

Thanks.

@mdoijade
Collaborator

@hummingtree we have updated cudaTensorCoreGemm with a few fixes in the v10.0 release of these samples. Please check whether your issue is resolved.

@mdoijade
Collaborator

mdoijade commented Oct 4, 2018

Closing, as there has been no response for more than 2 weeks.

@mdoijade mdoijade closed this as completed Oct 4, 2018
Schabrackentapir added a commit to Schabrackentapir/cuda-samples that referenced this issue Feb 12, 2022
Bugfix NVIDIA#1:  added missing cudaExternalMemoryDedicated flag on cudaExternalMemoryHandleDesc
Bugfix NVIDIA#2:  IsWindows8<xxx>OrGreater queries return false on Windows 10. Always returning Windows 10 values now (might break on older Windows versions)