NOP instructions in Matrix Multiplication #5

albertorodes · 2021-06-01T21:09:52Z

Hello!

I am trying to run a series of tests to compare the reliability of different versions of matrix multiplication. The kernels I am using have a parameter that changes the thread block size. I ran tests with this parameter set to 32x32 and had no problems or unexpected results. However, when I changed it to 16x16 or 8x8, I started getting results like this:

inspecting: voidmatrixMulCUDA<8>(float*,float*,float*,int,int)
num_static_instrs: 90
maxregs: 30(30)
Injection data
index: 0
kernel_name: voidmatrixMulCUDA<8>(float*,float*,float*,int,int)
ctas: 256
instrs: 10452992
grp 0: 0 grp 1: 2097152 grp 2: 3145728 grp 3: 278528 grp 4: 1671168 grp 5: 3260416 grp 6: 8781824 grp 7: 8503296
mask: 0x0
beforeVal: 0x0;afterVal: 0x0
regNo: -1
opcode: NOP
pcOffset: 0x0
tid: -1
Error not injected

I checked the injection file in the logs and found lines like this one in all the injections that failed:
1;voidmatrixMulCUDA<8>(float*,float*,float*,int,int);0;28898422;0.947758577437;0.204871567272:0x0:NOP: -1:0x0:15.610934:19::value_before0x0:value_after0x0

As I said, these injections on NOP instructions never happened with the 32x32 thread block size, but they happen almost 80% of the time with the other values.

Thank you in advance!

sivahari · 2021-06-01T21:21:24Z

This output is typically printed when nvbitfi could not find the injection site. One possibility is that the profiling run thinks that there are way more instructions than the actual injection run. Did you rerun the profiler when you changed the input?
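To illustrate the possibility described above, here is a minimal toy model in Python. It is not nvbitfi's code: the function names and the numbers are hypothetical, and it assumes the injection target is an instruction index drawn uniformly from the profiled count. Under that assumption, any target beyond the number of instructions the injection run actually executes is never reached, so nothing is injected.

```python
def site_is_reachable(target_index: int, executed_instrs: int) -> bool:
    """True if the chosen instruction index is actually executed in the injection run."""
    return 0 <= target_index < executed_instrs


def expected_miss_fraction(profiled_instrs: int, executed_instrs: int) -> float:
    """Fraction of uniformly chosen targets that fall beyond the executed range."""
    if profiled_instrs <= 0:
        raise ValueError("profiled_instrs must be positive")
    reachable = min(executed_instrs, profiled_instrs)
    return (profiled_instrs - reachable) / profiled_instrs


# Hypothetical counts, chosen only for illustration.
profiled = 1000   # what the profiling run reported
executed = 200    # what the injection run actually executes

print(site_is_reachable(150, executed))            # target inside the executed range
print(site_is_reachable(950, executed))            # target never reached
print(expected_miss_fraction(profiled, executed))  # share of targets that would be missed
```

In this toy model the miss rate depends only on how far the profiled count overshoots the executed one. Whether that is what happens in this issue is not established here.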


albertorodes · 2021-06-03T17:32:04Z

I checked and yes, the profiler is running and generating different results depending on the parameters. However, it is true that the block sizes that produce these errors have a much higher instruction count than the ones that don't.
To be specific, with a 32x32 block size the profiler counts 263168 instructions (no problems), and with an 8x8 block size it counts 1052672 (about 80% of injections end in "Error not injected").
It could be something about the implementation, but the instruction count difference seems too large.
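For reference, the size of that difference can be checked with a few lines of Python. This is only arithmetic on the two profiler counts quoted above. The variable names are illustrative, and the threads-per-block ratio is included purely for comparison, not as an explanation.

```python
# Profiler instruction counts quoted in this comment.
count_32x32 = 263168
count_8x8 = 1052672

# Ratio between the two profiled instruction counts.
count_ratio = count_8x8 / count_32x32

# Ratio of threads per block between the two configurations.
threads_ratio = (32 * 32) / (8 * 8)

print(count_ratio)    # 4.0
print(threads_ratio)  # 16.0
```

The 8x8 configuration is profiled at exactly four times the instruction count of the 32x32 one, while the number of threads per block differs by a factor of 16.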

sergicuen · 2021-12-03T18:22:58Z

The issue was solved using the workaround described here: Error not injected when threads/block different to 1024 #7

sivahari closed this as completed Feb 18, 2022
