CUBLAS abort #20

Closed
fernandoFernandeSantos opened this issue Jun 10, 2020 · 3 comments

@fernandoFernandeSantos

I'm trying to run the opcode_hist tool on 0_Simple/matrixMulCUBLAS from the NVIDIA samples on a Tesla K40c. This sample uses sgemm from CUBLAS to perform matrix multiplication. However, I'm having trouble with it. When I try to execute the following line:

eval LD_PRELOAD=.../nvbit_release/tools/opcode_hist/opcode_hist.so ./matrixMulCUBLAS

The output is as follows:

------------- NVBit (NVidia Binary Instrumentation Tool v1.4) Loaded --------------
NVBit core environment variables (mostly for nvbit-devs):
            NVDISASM = nvdisasm - override default nvdisasm found in PATH
            NOBANNER = 0 - if set, does not print this banner
---------------------------------------------------------------------------------
         INSTR_BEGIN = 0 - Beginning of the instruction interval where to apply instrumentation
           INSTR_END = 4294967295 - End of the instruction interval where to apply instrumentation
        KERNEL_BEGIN = 0 - Beginning of the kernel launch interval where to apply instrumentation
          KERNEL_END = 4294967295 - End of the kernel launch interval where to apply instrumentation
        TOOL_VERBOSE = 0 - Enable verbosity inside the tool
    COUNT_WARP_LEVEL = 1 - Count warp level or thread level instructions
    EXCLUDE_PRED_OFF = 0 - Exclude predicated off instruction from count
----------------------------------------------------------------------------------------------------
[Matrix Multiply CUBLAS] - Starting...
GPU Device 0: "Tesla K40c" with compute capability 3.5

GPU Device 0: "Tesla K40c" with compute capability 3.5

MatrixA(640,480), MatrixB(480,320), MatrixC(640,320)
Computing result using CUBLAS...matrixMulCUBLAS: arch/gk11x_hal.cpp:173: void set_imm_relative_control_flow(uint64_t*, int64_t): Assertion `!((((imm)&0xFF000000) != 0) && (((imm)&0xFF000000) != 0xFF000000))' failed.
Aborted (core dumped)

I'm using nvcc 10.1. Without the instrumentation, the code runs as expected.
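
For reference, the condition printed in the assertion message can be read as a range check on the immediate passed to set_imm_relative_control_flow: it holds only when bits 24..31 of imm are all zero or all one, which is what a value that fits in a sign-extended 24-bit field looks like. The sketch below restates just that condition in standalone C++. The names passes_imm_check and kTopByteMask are hypothetical and do not come from NVBit, and the reading that the failing value is a relative branch offset too large to encode is an assumption based on the function name.

```cpp
#include <cstdint>

// Hypothetical names; only the bit test itself is taken from the assertion text.
constexpr std::int64_t kTopByteMask = 0xFF000000;

// Equivalent (by De Morgan) to the asserted condition:
//   !(((imm & 0xFF000000) != 0) && ((imm & 0xFF000000) != 0xFF000000))
// True when bits 24..31 of imm are all zero or all one.
constexpr bool passes_imm_check(std::int64_t imm) {
    return (imm & kTopByteMask) == 0 || (imm & kTopByteMask) == kTopByteMask;
}

// Largest positive and most negative values of a 24-bit-style range pass.
static_assert(passes_imm_check(0x00FFFFFF), "small positive offset passes");
static_assert(passes_imm_check(-0x01000000), "small negative offset passes");

// One step beyond either end sets a mixed pattern in bits 24..31 and fails.
static_assert(!passes_imm_check(0x01000000), "too-large positive offset fails");
static_assert(!passes_imm_check(-0x01000001), "too-large negative offset fails");
```

The mask only covers bits 24..31, so this sketch says nothing about how bits 32 and above are handled.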

@ovilla
Collaborator

ovilla commented Jun 10, 2020

It is definitely a bug, and possibly a hard one to fix. Thanks for reporting.
SM3.5 is relatively old in GPU terms, and a lot of code was not properly tested on it.
We will certainly take a look and try to fix it, but it is possible that this will not get solved and that we will simply deprecate SM3.5 in NVBit.

@AjinkyaBankar

Hi NVBit team,
I am facing a similar problem and get the following error:

lenet: arch/gm10x_hal.cpp:181: void set_imm_relative_control_flow(uint64_t*, int64_t): Assertion `!IS_LARGER_THAN_24BIT(imm)' failed.
Aborted (core dumped)

I have an NVIDIA GeForce 1080Ti in my machine. Have you found a solution? Thanks.

@ovilla
Collaborator

ovilla commented Feb 3, 2022

Version 1.5.5 should have fixed this issue. Closing!

ovilla closed this as completed Feb 3, 2022