Skip to content
This repository has been archived by the owner on Dec 3, 2019. It is now read-only.

Profile all threads, not just the one with the GIL #52

Closed
wants to merge 12 commits into from

Conversation

jamespic
Copy link
Contributor

This pull request is the combination of mine and @batterseapower's work on issue #13.

@CLAassistant
Copy link

CLAassistant commented Feb 25, 2017

CLA assistant check
All committers have signed the CLA.

Copy link
Collaborator

@eklitzke eklitzke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to do another pass over this later to look at the ptrace/elf stuff more closely, but this is a really exciting and welcome change. What do you think of the issue with code execution I pointed out?

.deps/
*~
src/stamp-h1
src/config.h
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: file should end in a newline

FollowFrame(pid, frame_addr, &stack);
threads.push_back(Thread(id, is_current, stack));
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: extra newline here (this is also detected by clang-format)

@@ -47,6 +48,33 @@ void PtraceDetach(pid_t pid) {
}
}

struct user_regs_struct PtraceGetRegs(pid_t pid) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not have strong feeling about the style here, but FYI the struct keyword is only required in C, not C++.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is purely a matter of ignorance on my part - this is literally the first thing I've written in C++.

@@ -93,4 +150,30 @@ std::unique_ptr<uint8_t[]> PtracePeekBytes(pid_t pid, unsigned long addr,
}
return bytes;
}

#ifdef __amd64__
long PtraceCallFunction(pid_t pid, unsigned long addr) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this code is incorrect in certain corner cases. Notably, there's a problem if %rip is at or near addr. For instance, let's say you want to call some function foo(). If the process is already in foo() when you ptrace attach, then when you insert your new code at %rip the subsequent call to foo() will fail: it could deadlock if foo() is guarded by a mutex, or it could become an infinite loop.

This is a problem I had using the same technique on another project at Uber (unfortunately not open source), and why I didn't implement code execution earlier in Pyflame. In that case I was running into this issue a lot because I was trying to call common libc functions like malloc().

I believe the correct way to do this is:

  • use PTRACE_POKETEXT to copy the bytes to an unused portion of memory
  • use PTRACE_SETREGS to point the program counter to the memory area, and resume the process

One way to do this is to directly make a syscall to mmap(2) to get a single page of memory, and then use that page. There's also usually room for a few extra bytes of code at the end of the .text area, but that's not guaranteed.

What do you think?

Copy link
Contributor Author

@jamespic jamespic Feb 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're almost certainly right. I'm very near the limits of my understanding on this stuff - this code is crudely adapted from some code you wrote. I left out the mmap stuff mostly because this seemed to work, but I did find that if I called this on every profile cycle that sometimes the SIGTRAP would leak out, which this may explain.

@@ -36,4 +36,9 @@ std::string PtracePeekString(pid_t pid, unsigned long addr);
// peek some number of bytes
std::unique_ptr<uint8_t[]> PtracePeekBytes(pid_t pid, unsigned long addr,
size_t nbytes);

#ifdef __amd64__
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's more correct to put guards into the configure.ac script, but I can handle that later if the rest of the code looks good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds reasonable. I must confess, I've never really learned autotools, but it's probably the right answer here.

std::vector<Frame> stack;
FollowFrame(pid, frame_addr, &stack);
return stack;
std::vector<Thread> GetThreads(pid_t pid, PyAddresses addrs) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you can make addrs a const reference here.

@eklitzke
Copy link
Collaborator

I am getting segfaults somewhat regularly with this branch (not on every run, but easy enough to reproduce). From one window I run python tests/threaded_sleeper.py and from the other I run pyflame. The threaded sleeper crashes with a segfault, and the pyflame process displays:

Failed to PTRACE_CONT - unexpectedly got status  139

Here's an example backtrace (from the generated core dump):

evan@xenial.ssh ~/pyflame (cb90496...) $ gdb python
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python...Reading symbols from /usr/lib/debug/.build-id/cd/e2c487269892a2815c715667ae336984b82b0c.debug...done.
done.
(gdb) core core
warning: core file may not match specified executable file.
[New LWP 31291]
Core was generated by `python tests/exit_early.py -s'.
Program terminated with signal SIGTRAP, Trace/breakpoint trap.
#0  0x000000000041df10 in convertsimple (freelist=0x7f50143d30c8, bufsize=512, msgbuf=0xa3da80 "\t", flags=<optimized out>, p_va=<optimized out>, p_format=0x7fff9c2ad7f8, 
    arg=<optimized out>) at ../Python/getargs.c:896
896	../Python/getargs.c: No such file or directory.
(gdb) bt
#0  0x000000000041df10 in convertsimple (freelist=0x7f50143d30c8, bufsize=512, msgbuf=0xa3da80 "\t", flags=<optimized out>, p_va=<optimized out>, p_format=0x7fff9c2ad7f8, 
    arg=<optimized out>) at ../Python/getargs.c:896
#1  convertitem.lto_priv.1589 (freelist=0x7f50143d30c8, bufsize=512, msgbuf=0xa3da80 "\t", levels=0x7fff9c2ad808, flags=<optimized out>, p_va=<optimized out>, p_format=0xa3ddc0, 
    arg=<optimized out>) at ../Python/getargs.c:514
#2  vgetargskeywords.lto_priv.1414 (args=<unknown at remote 0x5a04f0>, keywords=(<type at remote 0xa3dc20>,), format=0x1f <error: Cannot access memory at address 0x1f>, kwlist=0x0, 
    p_va=0x7f50143d2360, flags=0) at ../Python/getargs.c:1627
#3  0x00007fff9c2add85 in ?? ()
#4  0x00007fff9c2adda7 in ?? ()
#5  0x00007fff9c2addba in ?? ()
#6  0x00007fff9c2addcd in ?? ()
#7  0x00007fff9c2addd7 in ?? ()
#8  0x00007fff9c2ade05 in ?? ()
#9  0x00007fff9c2ade2d in ?? ()
#10 0x00007fff9c2adeb5 in ?? ()
#11 0x00007fff9c2adecf in ?? ()
#12 0x00007fff9c2adee6 in ?? ()
#13 0x00007fff9c2adefa in ?? ()
#14 0x00007fff9c2adf0b in ?? ()
#15 0x00007fff9c2adf1e in ?? ()
#16 0x00007fff9c2adf2e in ?? ()
#17 0x00007fff9c2adf3e in ?? ()
#18 0x00007fff9c2adf46 in ?? ()
#19 0x00007fff9c2adf53 in ?? ()
#20 0x00007fff9c2adf88 in ?? ()
#21 0x00007fff9c2adfa7 in ?? ()
#22 0x0000000000000000 in ?? ()

@jamespic
Copy link
Contributor Author

jamespic commented Feb 27, 2017 via email

@jamespic
Copy link
Contributor Author

I'm wrong. It's a threading thing. If another thread reaches the overwritten code before we've cleaned it up, it runs the overwritten code. I'm going to close this PR. It needs a rethink.

@jamespic jamespic closed this Feb 27, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants