New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix various issues with dwarf emission: #7116
Conversation
* Prevents the emission of DWARF information for alloca/zero init locations. * Fixes the DWARF location of variables arriving as arguments to the function.
Demonstration of the behaviour with this patch: $ cat issue7103.py
import numba
import numpy as np
@numba.njit(debug=True)
def foo(a):
b = a + 1
c = a * 2.34
d = (a, b, c)
print(a, b, c, d)
return b, c, d
r = foo(123)
lib = foo.overloads[foo.signatures[0]].library
with open(__file__.replace('.py','.elf'), 'wb') as f:
f.write(lib._get_compiled_object())
$ gdb -q -ex 'b "issue7103.py:__main__::foo$241"' --args python issue7103.py
Reading symbols from python...
No source file named issue7103.py.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 ("issue7103.py:__main__::foo$241") pending.
(gdb) r
Starting program: <path>/python issue7103.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[Detaching after fork from child process 23895]
[Detaching after fork from child process 23896]
[Detaching after fork from child process 23897]
[New Thread 0x7fffe9fb0700 (LWP 23898)]
[New Thread 0x7fffe97af700 (LWP 23899)]
[New Thread 0x7fffd8fae700 (LWP 23900)]
Thread 1 "python" hit Breakpoint 1, __main__::foo$241 () at issue7103.py:4
4 @numba.njit(debug=True)
(gdb) l
1 import numba
2 import numpy as np
3
4 @numba.njit(debug=True)
5 def foo(a):
6 b = a + 1
7 c = a * 2.34
8 d = (a, b, c)
9 print(a, b, c, d)
10 return b, c, d
(gdb) info locals
d = "0/lUUU\000\000\025\260iUUU\000\000\000\000\000\000\000\000\000"
a = <optimized out>
b = <optimized out>
c = <optimized out>
(gdb) n
6 b = a + 1
(gdb) info locals
d = '\000' <repeats 23 times>
a = 123
b = 0
c = 0
(gdb) n
7 c = a * 2.34
(gdb) info locals
d = '\000' <repeats 23 times>
a = 123
b = <optimized out>
c = 0
(gdb) n
8 d = (a, b, c)
(gdb) info locals
d = '\000' <repeats 23 times>
a = 123
b = 124
c = 287.81999999999999
(gdb) n
9 print(a, b, c, d)
(gdb) info locals
d = "{\000\000\000\000\000\000\000|\000\000\000\000\000\000\000\205\353Q\270\036\375q@"
a = 123
b = 124
c = 287.81999999999999
(gdb) n
123 124 287.82 (123, 124, 287.82)
10 return b, c, d
(gdb) info locals
d = "{\000\000\000\000\000\000\000|\000\000\000\000\000\000\000\205\353Q\270\036\375q@"
a = 0
b = 124
c = 287.81999999999999
(gdb) n
PyTuple_New (size=3) at <path>/Objects/tupleobject.c:81
81 <path>/Objects/tupleobject.c: No such file or directory.
(gdb) c
Continuing.
123 124 287.82 (123, 124, 287.82)
[Thread 0x7fffe97af700 (LWP 23917) exited]
[Thread 0x7fffe9fb0700 (LWP 23916) exited]
[Thread 0x7fffd8fae700 (LWP 23918) exited]
[Inferior 1 (process 23909) exited normally] looking at the line info:
compared to #7103 (comment), the "jumps" have gone as the entry block now isn't referenced in the DWARF. |
This PR also contains #7074. |
Suggestions about how to test this are welcomed. Possibilities include:
|
Test suite is currently failing (bf19801) on build
|
# If it's an arg set the location as the function definition line | ||
if name in self.func_ir.arg_names: | ||
loc = self.loc.with_lineno(self.func_ir.loc.line) | ||
else: | ||
loc = self.loc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't see anything that tests this. Do you think the following addition is a reasonable test?
diff --git a/numba/tests/test_debuginfo.py b/numba/tests/test_debuginfo.py
index 7a1aae08f..e9d2fc410 100644
--- a/numba/tests/test_debuginfo.py
+++ b/numba/tests/test_debuginfo.py
@@ -182,6 +182,16 @@ class TestDebugInfoEmission(TestCase):
dilocation_info = metadata_definition_map[k]
self.assertIn(f'line: {line_no}', dilocation_info)
+ # Check that the line number of the argument `a` is the line number
+ # that the function begins on
+ line_re = re.compile(r'.*!DILocalVariable\(name: "a",.*line: ([0-9]*),')
+ for line in metadata:
+ matched = line_re.match(line)
+ if matched:
+ g = matched.groups()
+ self.assertEqual(len(g), 1)
+ self.assertEqual(int(g[0]), pysrc_line_start)
+
if __name__ == '__main__':
unittest.main()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, good point, I'll add something to this effect.
I'm not seeing quite the same behaviour with the example above:
|
Line info from the ELF:
... perhaps I'm doing something wrong / stupid here. |
Did you set |
Thanks, that was it - I assumed you were at |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
This? https://numba.readthedocs.io/en/stable/user/troubleshoot.html#debugging-jit-compiled-code-with-gdb I think this patch fixes the second point under "known issues" with stepping (jumpy source location). I'll remove both that statement and the remark for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks generally good to me and behaves as you suggest it should (when NUMBA_OPT
is set to 0
! :-) )
The location and language tests that are added look sane to me.
What do you think about the additional test for the argument locations? It would be good to have some test of them - is there one that exists already that I missed?
As discussed out-of-band, an alternative might be to set the debug location of the alloca
and store
to the first source line of the function (which will coincide with the @njit
decorator line in most cases), which might help prevent them showing as <optimized out>
when the breakpoint is first hit, but I think there's no need to experiment with this or make changes for this PR - it could be a potential improvement in future.
The test did work for me locally, but now I've kicked off the test again to see if the typeguard problem was transient, I'm a little surprised by the new fails:
|
It's #6720 causing the problems. |
96065e4
to
fc5859b
Compare
Conflicts: numba/core/lowering.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good - lifetime of the variables is looking better with the recent commits:
$ NUMBA_OPT=0 gdb -q -ex 'b "repro.py:__main__::foo$241"' --args python repro.py
Reading symbols from python...done.
No source file named repro.py.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 ("repro.py:__main__::foo$241") pending.
(gdb) run
Starting program: /home/gmarkall/miniconda3/envs/numbanp120/bin/python repro.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Breakpoint 1, __main__::foo$241 () at repro.py:4
4 @numba.njit(debug=True)
(gdb) l
1 import numba
2 import numpy as np
3
4 @numba.njit(debug=True)
5 def foo(a):
6 b = a + 1
7 c = a * 2.34
8 d = (a, b, c)
9 print(a, b, c, d)
10 return b, c, d
(gdb) info locals
a = 140736910562592
b = 93824993559743
c = 6.9533272545908306e-310
d = " \225\217\335\377\177\000\000\374\016\324\335\377\177\000\000\000\021\220\335\377\177\000"
(gdb) n
6 b = a + 1
(gdb) info locals
a = 0
b = 0
c = 0
d = '\000' <repeats 23 times>
(gdb) n
7 c = a * 2.34
(gdb) info locals
a = 123
b = 124
c = 0
d = '\000' <repeats 23 times>
(gdb) n
8 d = (a, b, c)
(gdb) info locals
a = 123
b = 124
c = 287.81999999999999
d = '\000' <repeats 23 times>
(gdb) n
9 print(a, b, c, d)
(gdb) info locals
a = 123
b = 124
c = 287.81999999999999
d = "{\000\000\000\000\000\000\000|\000\000\000\000\000\000\000\205\353Q\270\036\375q@"
(gdb) n
123 124 287.82 (123, 124, 287.82)
10 return b, c, d
(gdb) info locals
a = 0
b = 124
c = 287.81999999999999
d = "{\000\000\000\000\000\000\000|\000\000\000\000\000\000\000\205\353Q\270\036\375q@"
In summary:
- The life of locals seems to last throughout the lifetime of the block
- The exception is the argument
a
which doesn't yet appear initialized when breaking on the function (it shows up as 0 when the breakpoint is hit), and also seems to get clobbered on the last line of the function. (I don't think this is an issue to try to fix in this PR, which already makes a great improvement).
There are a couple of small comments on the diff, but otherwise I think this is looking good.
numba/tests/test_debuginfo.py
Outdated
the entry block has no debug metadata associated with it. In practice, | ||
if this is not the case, it manifests as the prologue_end being | ||
associated with the end_sequence or similar (due to the way code gen | ||
works for the entry block).""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think when you say "In practice, if this is not the case", the meaning is "In practice, if debug metadata is associated with it" - is that understanding correct? If so, could the comment state that explicitly? If not, what does the original phrase mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in c0d9ee8
numba/core/lowering.py
Outdated
if name not in self._singly_assigned_vars: | ||
# If the name is used in multiple blocks or lowering with debuginfo... | ||
flags = utils.ConfigStack.top_or_none() | ||
if (name not in self._singly_assigned_vars) or flags.debuginfo: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's about 6 checks for debug info added in this file - whilst its clear at the time of this review that these are all added to disable the SROA-like optimization in debug mode, I think later down the line looking at these functions in isolation it might not be all that clear why these conditions include flags.debuginfo
or not flags.debuginfo
- is there a way to replace the use of this flag at each site with a variable that makes it clear what its reason is - e.g. named disable_sroa_like_opt
(or something more apt, I'm presently lacking in imagination)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, fixed in 8329107
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates - this is looking good!
Thanks for the review @gmarkall |
locations.
the function.
Fixes #7103