-
Notifications
You must be signed in to change notification settings - Fork 240
Profile all threads, not just the one with the GIL #52
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to do another pass over this later to look at the ptrace/elf stuff more closely, but this is a really exciting and welcome change. What do you think of the issue with code execution I pointed out?
.deps/ | ||
*~ | ||
src/stamp-h1 | ||
src/config.h |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: file should end in a newline
FollowFrame(pid, frame_addr, &stack); | ||
threads.push_back(Thread(id, is_current, stack)); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: extra newline here (this is also detected by clang-format)
@@ -47,6 +48,33 @@ void PtraceDetach(pid_t pid) { | |||
} | |||
} | |||
|
|||
struct user_regs_struct PtraceGetRegs(pid_t pid) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not have strong feeling about the style here, but FYI the struct
keyword is only required in C, not C++.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is purely a matter of ignorance on my part - this is literally the first thing I've written in C++.
@@ -93,4 +150,30 @@ std::unique_ptr<uint8_t[]> PtracePeekBytes(pid_t pid, unsigned long addr, | |||
} | |||
return bytes; | |||
} | |||
|
|||
#ifdef __amd64__ | |||
long PtraceCallFunction(pid_t pid, unsigned long addr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this code is incorrect in certain corner cases. Notably, there's a problem if %rip
is at or near addr
. For instance, let's say you want to call some function foo()
. If the process is already in foo()
when you ptrace attach, then when you insert your new code at %rip
the subsequent call to foo()
will fail: it could deadlock if foo()
is guarded by a mutex, or it could become an infinite loop.
This is a problem I had using the same technique on another project at Uber (unfortunately not open source), and why I didn't implement code execution earlier in Pyflame. In that case I was running into this issue a lot because I was trying to call common libc functions like malloc()
.
I believe the correct way to do this is:
- use PTRACE_POKETEXT to copy the bytes to an unused portion of memory
- use PTRACE_SETREGS to point the program counter to the memory area, and resume the process
One way to do this is to directly make a syscall to mmap(2)
to get a single page of memory, and then use that page. There's also usually room for a few extra bytes of code at the end of the .text
area, but that's not guaranteed.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're almost certainly right. I'm very near the limits of my understanding on this stuff - this code is crudely adapted from some code you wrote. I left out the mmap stuff mostly because this seemed to work, but I did find that if I called this on every profile cycle that sometimes the SIGTRAP would leak out, which this may explain.
@@ -36,4 +36,9 @@ std::string PtracePeekString(pid_t pid, unsigned long addr); | |||
// peek some number of bytes | |||
std::unique_ptr<uint8_t[]> PtracePeekBytes(pid_t pid, unsigned long addr, | |||
size_t nbytes); | |||
|
|||
#ifdef __amd64__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's more correct to put guards into the configure.ac
script, but I can handle that later if the rest of the code looks good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds reasonable. I must confess, I've never really learned autotools, but it's probably the right answer here.
std::vector<Frame> stack; | ||
FollowFrame(pid, frame_addr, &stack); | ||
return stack; | ||
std::vector<Thread> GetThreads(pid_t pid, PyAddresses addrs) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like you can make addrs
a const reference here.
I am getting segfaults somewhat regularly with this branch (not on every run, but easy enough to reproduce). From one window I run
Here's an example backtrace (from the generated core dump):
|
I suspect this is related to the over-simple ptrace function call stuff, that you've already spotted. It looks like the SIGTRAP is escaping from the injected code. I'll take a look at this when I get home.
…________________________________
From: Evan Klitzke <notifications@github.com>
Sent: Monday, February 27, 2017 6:52:39 PM
To: uber/pyflame
Cc: James Pickering; Author
Subject: Re: [uber/pyflame] Profile all threads, not just the one with the GIL (#52)
I am getting segfaults somewhat regularly with this branch (not on every run, but easy enough to reproduce). From one window I run python tests/threaded_sleeper.py and from the other I run pyflame. The threaded sleeper crashes with a segfault, and the pyflame process displays:
Failed to PTRACE_CONT - unexpectedly got status 139
Here's an example backtrace (from the generated core dump):
evan@xenial.ssh ~/pyflame (cb90496...) $ gdb python
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python...Reading symbols from /usr/lib/debug/.build-id/cd/e2c487269892a2815c715667ae336984b82b0c.debug...done.
done.
(gdb) core core
warning: core file may not match specified executable file.
[New LWP 31291]
Core was generated by `python tests/exit_early.py -s'.
Program terminated with signal SIGTRAP, Trace/breakpoint trap.
#0 0x000000000041df10 in convertsimple (freelist=0x7f50143d30c8, bufsize=512, msgbuf=0xa3da80 "\t", flags=<optimized out>, p_va=<optimized out>, p_format=0x7fff9c2ad7f8,
arg=<optimized out>) at ../Python/getargs.c:896
896 ../Python/getargs.c: No such file or directory.
(gdb) bt
#0 0x000000000041df10 in convertsimple (freelist=0x7f50143d30c8, bufsize=512, msgbuf=0xa3da80 "\t", flags=<optimized out>, p_va=<optimized out>, p_format=0x7fff9c2ad7f8,
arg=<optimized out>) at ../Python/getargs.c:896
#1 convertitem.lto_priv.1589 (freelist=0x7f50143d30c8, bufsize=512, msgbuf=0xa3da80 "\t", levels=0x7fff9c2ad808, flags=<optimized out>, p_va=<optimized out>, p_format=0xa3ddc0,
arg=<optimized out>) at ../Python/getargs.c:514
#2 vgetargskeywords.lto_priv.1414 (args=<unknown at remote 0x5a04f0>, keywords=(<type at remote 0xa3dc20>,), format=0x1f <error: Cannot access memory at address 0x1f>, kwlist=0x0,
p_va=0x7f50143d2360, flags=0) at ../Python/getargs.c:1627
#3 0x00007fff9c2add85 in ?? ()
#4 0x00007fff9c2adda7 in ?? ()
#5 0x00007fff9c2addba in ?? ()
#6 0x00007fff9c2addcd in ?? ()
#7 0x00007fff9c2addd7 in ?? ()
#8 0x00007fff9c2ade05 in ?? ()
#9 0x00007fff9c2ade2d in ?? ()
#10 0x00007fff9c2adeb5 in ?? ()
#11 0x00007fff9c2adecf in ?? ()
#12 0x00007fff9c2adee6 in ?? ()
#13 0x00007fff9c2adefa in ?? ()
#14 0x00007fff9c2adf0b in ?? ()
#15 0x00007fff9c2adf1e in ?? ()
#16 0x00007fff9c2adf2e in ?? ()
#17 0x00007fff9c2adf3e in ?? ()
#18 0x00007fff9c2adf46 in ?? ()
#19 0x00007fff9c2adf53 in ?? ()
#20 0x00007fff9c2adf88 in ?? ()
#21 0x00007fff9c2adfa7 in ?? ()
#22 0x0000000000000000 in ?? ()
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub<#52 (comment)>, or mute the thread<https://github.com/notifications/unsubscribe-auth/AAlg68jqhwS1cBz1aOSf4hjemCKP9OXQks5rgxt3gaJpZM4MMJHC>.
|
I'm wrong. It's a threading thing. If another thread reaches the overwritten code before we've cleaned it up, it runs the overwritten code. I'm going to close this PR. It needs a rethink. |
This pull request is the combination of mine and @batterseapower's work on issue #13.