Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

attaching to a running process. #360

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

ParkHanbum
Copy link
Contributor

@ParkHanbum ParkHanbum commented May 13, 2018

this PR contains implementation of TODO that attaching the already running process.

usage :

$ uftrace record ---pid=<target process pid> --force

example:

$ ./node_g simple_server.js
# another console
$ ps -ef | grep node
m        123218   1540  0 19:56 pts/1    00:00:00 ./node_g simple_server.js
m        123250   1577  0 19:57 pts/2    00:00:00 grep --color=auto node

# for use ptrace
$ echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope

# without force option, it will be cause the error about mcount symbol.
$ uftrace record -v --pid=123218 --force
uftrace: removing uftrace.data.old directory
uftrace: /home/m/git/node/out/Debug/node
uftrace: skipping perf event due to error: Permission denied
uftrace: please check /proc/sys/kernel/perf_event_paranoid
uftrace: creating 2 thread(s) for recording
uftrace: using /usr/local/lib/libmcount.so library for tracing
uftrace: size of code to inject 209
uftrace: Path of Shared object to be injected : /usr/local/lib/libmcount.so
uftrace: Return address : 7f41f419a94e
uftrace: RIP Register : 308cf2a

@ParkHanbum ParkHanbum force-pushed the PR_FOR_DT branch 2 times, most recently from cd4ebca to 04b19cb Compare May 13, 2018 17:54
@namhyung
Copy link
Owner

namhyung commented May 14, 2018

I like the way you did the job - I'd also like to work on attaching to a process at first. But I don't want to add a new command just for this feature. I prefer having -p <pid> option in the record command similar to other tools like gdb, strace and perf.

I didn't review this PR yet but it seems there are many coding style issues. And you need to ensure to keep the license and copyright notice as is. It'd be nice if you could minimize the code copy. I'll review soon anyway. Many thanks to your great work! :)

@honggyukim
Copy link
Collaborator

Hi @ParkHanbum, I also appreciate your effort for this work.

But as @namhyung mentioned, this PR contains a lot of style issues. In Linux kernel source tree, there is a tool called checkpatch.pl to check this kind of problems.

Here are the error reports of checkpatch.pl.

WARNING: Missing a blank line after declarations
#68: FILE: utils/inject-utils.c:34:
+       char perms[5];
+       sprintf(filename, "/proc/%d/maps", pid);

ERROR: space required before the open parenthesis '('
#70: FILE: utils/inject-utils.c:36:
+       if(fp == NULL)

WARNING: unnecessary whitespace before a quoted newline
#71: FILE: utils/inject-utils.c:37:
+               pr_err("cannot open /proc/%d/maps \n", pid);

ERROR: "foo* bar" should be "foo *bar"
#133: FILE: utils/inject-utils.c:99:
+int checkloaded(pid_t pid, char* libname)

WARNING: line over 80 characters
#449: FILE: utils/ptrace.c:223:
+               pr_dbg("instead of expected SIGTRAP, target stopped with signal %d: %s\n", targetsig.si_signo, strsignal(targetsig.si_signo));

WARNING: unnecessary whitespace before a quoted newline
#88: FILE: arch/x86_64/inject.c:55:
+               "push %r9 \n"

ERROR: Bad function definition - void inject_shared_lib_end() should probably be void inject_shared_lib_end(void)
#163: FILE: arch/x86_64/inject.c:130:
+void inject_shared_lib_end()

        ... many more ...

It shows too many errors and warnings so I've attached the full output so please refer to the report and correct the style first so that @namhyung can focus on the important parts other than the style issues.
checkpatch-for-PR_FOR_DT.txt

For more about the style issues, you can read coding stye doc.

@ParkHanbum
Copy link
Contributor Author

ParkHanbum commented May 15, 2018

@honggyukim I didn't know about that. thank you for let me know. I'll check that!

  • updated
    it had been modified to keep the coding convension. thanks you for your reviewing.

@ParkHanbum ParkHanbum force-pushed the PR_FOR_DT branch 2 times, most recently from ceb8c0e to ce766d2 Compare May 15, 2018 11:14
"\tCannot access to the %s which contain environment variable.\n"


__attribute__((weak)) void build_debug_domain(char *dbg_domain_str)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought this build_debug_domain() can be duplicated with build_debug_domain() in libmcount/misc.c
So I thought we can include internal.h to this file, no ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Taeung if it is possible, I would have done that. "internal.h" have many dependency with source file which located at libmcount. but libtrigger does not need other codes, need only 'build_debug_domain'. so, I have no choice. lately, I'll refactoring this code by decoupling 'build_debug_domain' from 'internal.h' if it need

cmd-dynamic.c Outdated
// settings for using dynamic tracing feature.
void setup_uftrace_environ(struct opts *opts, int pfd)
{
make_uftrace_environ_file(opts, pfd);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it is enough that if there is one of the two functions setup_uftrace_environ and make_uftrace_environ_file, no ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Taeung you're right. I'm considering that remove setup_uftrace_environ.

int fd_envfile;

fd_envfile = open_env_file();
set_env_from_file(fd_envfile);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After pre_startup() does setenv(), mcount_startup() does getenv() to pass various uftrace option values.
But I think we can skip setenv() and getenv().
To sum up, after reading "envfile", we can directly set uftrace option values. What do you think about this ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Taeung you're right, but I'm considering that dynamic tracing feature need more time to stable. for that reason, I want to split dynamic tracing feature from uftrace. it is the reason what I have wrote code this way. if you have more idea about isolate the dynamic tracing feature, let me know please.

@honggyukim
Copy link
Collaborator

I cannot review this PR now but can you rebase it based on the current master? It has some conflicts.

@Taeung
Copy link
Collaborator

Taeung commented May 21, 2018

Hi @ParkHanbum

I think this job is really great !! Thanks for doing this.
I comment several things after briefly reviewing this PR.
I'd appreciate it, if you check my comments.

In addition, It seems that you need to add libtrigger.so to .gitignore file,
or rename it to libmcount-trigger.so.
How about that ?

Because I found the below git output after building this PR source code:

On branch PR_FOR_DT
Your branch is up to date with 'origin/PR_FOR_DT'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	libtrigger.so

nothing added to commit but untracked files present (use "git add" to track)

@namhyung namhyung changed the title implement the attaching to already running process. attaching to a running process. May 23, 2018
Copy link
Owner

@namhyung namhyung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed just some patches but there are issues we need to discuss. Also I don't like the name of libtrigger.so since we use the term "trigger" differently. Anyway nice work, much appreciated!

@@ -8,6 +8,7 @@ include $(srcdir)/Makefile.include
ARCH_ENTRY_SRC = $(wildcard $(sdir)/*.S)
ARCH_MCOUNT_SRC = $(wildcard $(sdir)/mcount-*.c) $(sdir)/regs.c $(sdir)/symbol.c
ARCH_UFTRACE_SRC = $(sdir)/cpuinfo.c $(sdir)/regs.c $(sdir)/symbol.c
ARCH_UFTRACE_SRC += $(sdir)/inject.c
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm.. but you didn't add inject.c file here and it makes this commit unnecessary

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not thoughtful enough. as you say, I'll change this commit to include commit which about inject.c .

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You did too many things in this commit. I suggest you to split it to 3 parts - 1. add ptrace code, 2. add (generic) inject code, 3. add x86 inject code

@@ -0,0 +1,172 @@
#include <stdio.h>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this code also needs to have the copyright and license lines since you copied it. Also this commit should be earlier than the arch code. Also the name can be simply utils/inject.c.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK~! it will be.


int lib_path_length = strlen(lib_path) + 1;
int mypid = getpid();
long mylibcaddr = getlibcaddr(mypid);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It won't compile due to missing functions like above. You need to move commits implement it before this commit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm... how about now? it was implemented at previouse commit.

sprintf(filename, "/proc/%d/maps", pid);
fp = fopen(filename, "r");
if (fp == NULL)
pr_err("cannot open /proc/%d/maps\n", pid);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pr_err usually doesn't need \n at the end since it'll add error string and a newline for you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OH I SEE!! I'll remove them all :D

"\tmake sure that the file exists.\n\n" \
"\t[notification]\n" \
"\twhen you use dynamic tracing to the process, you must specify the\n" \
"\tabsolute path of 'libmcount-dynamic.so' not the relative path.\n\n" \
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where comes this restriction from? Isn't it just a matter of calling realpath()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we can assume that uftrace has been installed, we can get path of libmcount-dynamic.so. otherwise, libtrigger.so has the path as default from attached process located at. so, we cannot get the absolute path of libmcount-dynamic.so.

* can mitigation the symptom.
*/
setvbuf(stdout, NULL, _IONBF, 1024);
setvbuf(stderr, NULL, _IONBF, 1024);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this changes the behavior of the target process which is not desired.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it may. but uftrace and target program could not print any output without that. what choice do I have?

else
pr_err_ns(ERROR_LIBRARY_NOT_EXIST, preload_library_path);

handle = dlopen(preload_library_path, RTLD_LAZY);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm.. so do you want to load libmcount.so again? Why not just doing it in this library?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because libmcount could not loaded by using __libc_dlopen_mode but it can be possible by using dlopen. I don't know what cause this. see below backtrace.

   0x000000000040062a <+19>:    callq  0x400520 <__libc_dlopen_mode@plt>
   0x000000000040062f <+24>:    mov    $0x40071c,%edi
   0x0000000000400634 <+29>:    callq  0x4004f0 <puts@plt>
   0x0000000000400639 <+34>:    mov    0x200458(%rip),%rax        # 0x600a98 <so_name>
   0x0000000000400640 <+41>:    mov    $0x2,%esi
   0x0000000000400645 <+46>:    mov    %rax,%rdi
   0x0000000000400648 <+49>:    callq  0x400510 <dlopen@plt>
   0x000000000040064d <+54>:    mov    $0x0,%eax
   0x0000000000400652 <+59>:    pop    %rbp
   0x0000000000400653 <+60>:    retq
End of assembler dump.
(gdb) b *main+49
Breakpoint 1 at 0x400648
(gdb) r
Starting program: /home/m/git/a.out

Program received signal SIGSEGV, Segmentation fault.
_dl_map_object_from_fd (name=name@entry=0x4006e8 "/home/m/git/_uftrace/libmcount/libmcount-dynamic.so", origname=origname@entry=0x0,
    fd=<optimized out>, fbp=fbp@entry=0x7fffffffdba0, realname=<optimized out>, loader=loader@entry=0x0, l_type=2, mode=268435458,
    stack_endp=0x7fffffffdb98, nsid=0) at dl-load.c:1337
1337    dl-load.c: No such file or directory.
(gdb) bt
#0  _dl_map_object_from_fd (name=name@entry=0x4006e8 "/home/m/git/_uftrace/libmcount/libmcount-dynamic.so", origname=origname@entry=0x0,
    fd=<optimized out>, fbp=fbp@entry=0x7fffffffdba0, realname=<optimized out>, loader=loader@entry=0x0, l_type=2, mode=268435458,
    stack_endp=0x7fffffffdb98, nsid=0) at dl-load.c:1337
#1  0x00007ffff7ddfc27 in _dl_map_object (loader=0x0, loader@entry=0x7ffff7ffe168,
    name=name@entry=0x4006e8 "/home/m/git/_uftrace/libmcount/libmcount-dynamic.so", type=type@entry=2, trace_mode=trace_mode@entry=0,
    mode=mode@entry=268435458, nsid=<optimized out>) at dl-load.c:2498
#2  0x00007ffff7dec577 in dl_open_worker (a=a@entry=0x7fffffffe160) at dl-open.c:237
#3  0x00007ffff7de7564 in _dl_catch_error (objname=objname@entry=0x7fffffffe150, errstring=errstring@entry=0x7fffffffe158,
    mallocedp=mallocedp@entry=0x7fffffffe14f, operate=operate@entry=0x7ffff7dec4d0 <dl_open_worker>, args=args@entry=0x7fffffffe160)
    at dl-error.c:187
#4  0x00007ffff7debda9 in _dl_open (file=0x4006e8 "/home/m/git/_uftrace/libmcount/libmcount-dynamic.so", mode=2,
    caller_dlopen=0x40062f <main+24>, nsid=-2, argc=<optimized out>, argv=<optimized out>, env=0x7fffffffe4a8) at dl-open.c:660
#5  0x00007ffff794c5ad in do_dlopen (ptr=ptr@entry=0x7fffffffe380) at dl-libc.c:87
#6  0x00007ffff7de7564 in _dl_catch_error (objname=0x7fffffffe370, errstring=0x7fffffffe378, mallocedp=0x7fffffffe36f,
    operate=0x7ffff794c570 <do_dlopen>, args=0x7fffffffe380) at dl-error.c:187
#7  0x00007ffff794c664 in dlerror_run (args=0x7fffffffe380, operate=0x7ffff794c570 <do_dlopen>) at dl-libc.c:46
#8  __GI___libc_dlopen_mode (name=<optimized out>, mode=<optimized out>) at dl-libc.c:163
#9  0x000000000040062f in main ()

else
pr_err_ns(ERROR_DATADIR_NOT_EXIST, data_dir_path);

if (stat(preload_library_path, &file) == 0)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LD_PRELOAD can have a list of libraries separated by colons..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, this logic is not enough as you say.

if (stat(data_dir_path, &file) == 0)
pr_dbg("DATA-DIR EXIST : %s\n", data_dir_path);
else
pr_err_ns(ERROR_DATADIR_NOT_EXIST, data_dir_path);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can fail with various reasons so you'd better checking the error code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, I'll update this.

@honggyukim
Copy link
Collaborator

@ParkHanbum Sorry but I just made a change to move all cmd-*.c files into cmds directory so please update your PR as well.

@ParkHanbum
Copy link
Contributor Author

@namhyung thank you for your reviewing! I'll check the source code that you commented.
@honggyukim nope. it is not problem.

@ParkHanbum
Copy link
Contributor Author

how about chagne the name "libtrigger.so" to "libmcount-loader.so" .

@ParkHanbum ParkHanbum force-pushed the PR_FOR_DT branch 2 times, most recently from d502362 to a261550 Compare May 27, 2018 07:55
Copy link
Owner

@namhyung namhyung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need the "uftrace dynamic" command.

utils/inject.c Outdated
* specified process' address space.
*
*/
long freespaceaddr(pid_t pid)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please follow the rules in the other function comments. It looks like:

/**
 * free_space_addr() - find executable region of memory
 * @pid: pid of process to inspect
 *
 * Search the target process' /proc/pid/maps entry and find an executable
 * region of memory that we can use to run code in.
 * This function returns the address of an executable region of memory
 * inside the specified process' address space.
 */

utils/inject.c Outdated
char str[20];
char perms[5];

sprintf(filename, "/proc/%d/maps", pid);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use snprintf() instead of sprintf().

cmds/record.c Outdated
static bool check_linux_schedule_event(char *events,
enum uftrace_pattern_type ptype)
bool check_linux_schedule_event(char *events,
enum uftrace_pattern_type ptype)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check the indentation.

@ParkHanbum ParkHanbum force-pushed the PR_FOR_DT branch 2 times, most recently from 14edc94 to c8389b3 Compare May 29, 2018 05:57
@ParkHanbum
Copy link
Contributor Author

@namhyung I think dynamic tracing feature need more time to stable. until then, How about separating the dynamics from the old ones?

@namhyung
Copy link
Owner

I'm ok with separating dynamic tracing feature for now. And then we can just focus on the attaching itself. I think you don't need to carry related patches. Instead let's finish the design of libtrigger.so first.

@ParkHanbum
Copy link
Contributor Author

@namhyung "Let's finish the design of libtrigger.so.". What does it mean? I think you are not talk about naming only.

@ParkHanbum ParkHanbum force-pushed the PR_FOR_DT branch 2 times, most recently from 4ace3ee to 122bae3 Compare June 2, 2018 10:52
@ParkHanbum
Copy link
Contributor Author

I have updated suggestions what talked from last discussion.
thank you guys, for precious reviewing and teaching.
muchas gracias!

@ParkHanbum
Copy link
Contributor Author

@honggyukim for now, please refer top article of this issue.

# another console
$ ps -ef | grep node
m        123218   1540  0 19:56 pts/1    00:00:00 ./node_g simple_server.js
m        123250   1577  0 19:57 pts/2    00:00:00 grep --color=auto node

# for use ptrace
$ echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope

# without force option, it will be cause the error about mcount symbol.
$ uftrace record -v --pid=123218 --force
uftrace: removing uftrace.data.old directory
uftrace: /home/m/git/node/out/Debug/node
uftrace: skipping perf event due to error: Permission denied
uftrace: please check /proc/sys/kernel/perf_event_paranoid
uftrace: creating 2 thread(s) for recording
uftrace: using /usr/local/lib/libmcount.so library for tracing
uftrace: size of code to inject 209
uftrace: Path of Shared object to be injected : /usr/local/lib/libmcount.so
uftrace: Return address : 7f41f419a94e
uftrace: RIP Register : 308cf2a

@honggyukim
Copy link
Collaborator

honggyukim commented Apr 13, 2019

Wow! I tested it for a simple top command and works fine. But I pressed Ctrl-C in uftrace side, but it also finishes attached top program as well. Is it possible to detach it as well?

@ParkHanbum
Copy link
Contributor Author

@honggyukim for now, only attaching was implemented. to implement detach we need channel to communicate between uftrace and libmcount.

@honggyukim
Copy link
Collaborator

Hmm.. I tested it to node, but it gets crashed right after attaching.
in tty 1

$ node
> require('process').pid
28606

> (... waiting for input ...)

in tty 2

$ uftrace record --force -v --pid 28606

If both are ready, then uftrace attach command runs in tty2. However, it gets crashed right after attaching it as follows.
in tty 2

$ uftrace record --force -v --pid 28606
uftrace: /home/honggyu/usr/bin/node
uftrace: using /home/honggyu/usr/lib/libmcount.so library for tracing
uftrace: creating 10 thread(s) for recording
child terminated by signal: 6: Aborted
uftrace: reading uftrace.data/task.txt file
uftrace: flushing /uftrace-12b15a0abdeb2069-28606-000
symbol: ELF error when loading symbols: invalid `Elf' handle
symbol: error during open symbol file: /etc/ld.so.cache: Bad file descriptor
symbol: ELF error when loading symbols: invalid `Elf' handle
symbol: error during open symbol file: /etc/ld.so.cache: Bad file descriptor
symbol: ELF error when loading symbols: invalid `Elf' handle

in tty 1

$ node
> require('process').pid
28606

> mcount: initializing mcount library
demangle: demangle failed: _ZTCN2v88internal8compiler12_GLOBAL__N_113TurboJsonFileE0_St14basic_ofstreamIcSt11char_traitsIcEE
demangle: demangle failed: _ZTCN2v88internal8compiler12_GLOBAL__N_112TurboCfgFileE0_St14basic_ofstreamIcSt11char_traitsIcEE
demangle: demangle failed: _ZTVN10__cxxabiv121__vmi_class_type_infoE@@CXXABI_1.3
demangle: demangle failed: _ZTVNSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEEE@@GLIBCXX_3.4.21
demangle: demangle failed: _ZTVN10__cxxabiv117__class_type_infoE@@CXXABI_1.3
demangle: demangle failed: _ZTVNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEEE@@GLIBCXX_3.4.21
demangle: demangle failed: _ZTVNSt7__cxx1118basic_stringstreamIcSt11char_traitsIcESaIcEEE@@GLIBCXX_3.4.21
demangle: demangle failed: _ZTTNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEEE@@GLIBCXX_3.4.21
demangle: demangle failed: _ZTVN10__cxxabiv120__si_class_type_infoE@@CXXABI_1.3
demangle: demangle failed: _ZTTNSt7__cxx1118basic_stringstreamIcSt11char_traitsIcESaIcEEE@@GLIBCXX_3.4.21
symbol: ELF error when loading symbols: invalid `Elf' handle
symbol: error during open symbol file: /etc/ld.so.cache: Success
symbol: ELF error when loading symbols: invalid `Elf' handle
symbol: error during open symbol file: /etc/ld.so.cache: Success
symbol: ELF error when loading symbols: invalid `Elf' handle
dwarf: prepare debug info
symbol: ELF error when loading symbols: invalid `Elf' handle
symbol: error during open symbol file: /etc/ld.so.cache: Success
plthook: setup PLT hooking "/home/honggyu/usr/bin/node"
mcount: mcount setup done
mcount: new session started: 12b15a0abdeb2069: node
process crashed by signal 6: Aborted (si_code: -6)
Backtrace from uftrace:
=====================================
[0] (abort[862d10] <= uv__io_poll[1602238])
Aborted (core dumped)

@namhyung
Copy link
Owner

My personal opinion on this is that it's a kind of future work since most use-cases will require (full) dynamic tracing anyway. Could we focus on finishing it first and then comeback to this?

@ParkHanbum
Copy link
Contributor Author

ParkHanbum commented Apr 13, 2019

@honggyukim you got a right way. it need #704 to attaching node. but it work fine with python without #704. or you can trace node with single thread option. try it with '--single-threaded'

@ParkHanbum
Copy link
Contributor Author

this PR has been refreshed.

Add 'inject-content.S' to insert the shared process into the target
process and load the shared object.

inject-content.S consists of the following:
- inject_contents_start
: Declaring the beginning of the whole
- inject_so_loader
: How to Load Shared Objects
- inject_so_loader_ret
: Returns the address of the target process.
- inject_dlopen_addr
: Shared object load function address of target process
- inject_so_path
: Absolute path of shared object (up to 127 characters)
- inject_contents_end
: Declare the end of the whole.

Each part of inject-content.S consists of information read from the
target process and information calculated by uftrace.

When the inject_so_loader is executed in the target process, it will
load the shared object from inject_so_path.

Signed-off-by: Hanbum Park <kese111@gmail.com>
by default, 'uftrace' use environment variable to share options with
target program. because target program be forked from 'uftrace',
target program can be share the environment variable. but this
approach is no longer possible when use dynamic trace because there
is no way to share environment variable between separated process.
therefore, extracted command-line options into a file to share with
the target program. 'env-file' contain some useful function to handle
that file.

Signed-off-by: Hanbum Park <kese111@gmail.com>
@ParkHanbum ParkHanbum force-pushed the PR_FOR_DT branch 2 times, most recently from 868ed62 to 4e8b9aa Compare December 6, 2019 04:56
'ptrace' utility contain some useful functions to use 'ptrace'.
injecting shared object is needs 'ptrace' to write and to read
the target process memory and need it to execute the target process,
also. because there is many duplicate code for using ptrace, so have
been extracted those to the function.

'inject' utility contain some function for injecting shared object.
to inject shared object must implement codes for each different
architecture. but there is common code exist for each, therefore
extract those common code to 'inject' to reduce duplicate.

Signed-off-by: Hanbum Park <kese111@gmail.com>
To use the attach, need to do the following:
-. it removed the validation of the executable file path it received as
an argument previously.
-. Writes the arguments passed to environment variables to a file.
-. Invoke the inject function.

this commit contains the code for the above work.

Signed-off-by: Hanbum Park <kese111@gmail.com>
to use attach, the full path to the directory where you want to save
must be passed. therefore some codes was added.

Signed-off-by: Hanbum Park <kese111@gmail.com>
attaching to already running process require permission for use ptrace
functionality that supports by linux kernel. recently kernel disable
ptrace at default. so, user who want use attaching must enable ptrace.
added message guide user how to do this.

Signed-off-by: Hanbum Park <kese111@gmail.com>
1. brief of currently situlation
-. Currently uftrace use thread local storage variable mtd as type of
thread_data.
-. Each entry function calls the mcount_prepare function, which prepares
mcount_thread_data.

mcount_prepare function has the following code:
```
struct mcount_thread_data * mtdp = & mtd;
```

The compiler translates this code as follows:
```
data16 lea 0x2d6d7 (% rip),% rdi # 35f98 <.got + 0x8>
data16 data16 callq 6400 <__ tls_get_addr @ plt>
```
<__tls_get_addr@plt> function is called to reference the TLS variable
mtd.

2. Problem
Problem will be occurred here, 'malloc@plt' or 'free@plt' function could
be called inside of <__tls_get_addr@plt>. when these are called then
infinity looping start.

because it will lead call to plthook_entry recursive as follows.
[mcount_prepare => __tls_get_addr@plt => malloc@plt => plthook_entry =>
mcount_prepare]

3. Solution
For this reason, changed the model of the TLS variable mtd to
initial-exec. The initial-exec model solves the problem because it is
not make any calls to refer the TLS. the way to refer to variables in
the intial-exec is adding or subtracting offsets from static TLS blocks
as follows.

```
mov 0x2d6a6 (% rip),% rax # 35fc8 <.got + 0x38>
mov% fs: (% rax),% rbx
```

4. Limitation
the initial-exec TLS model have limitation. traditionally, TLS will
allocated by loader and the initial-exec TLS block allocate statically.
it will not effect to uftrace features except the attaching. but after
process already running and the initial-exec allocated, it can be that
there is not enough space in initial-exec static TLS block to allocate
mtd.

4-1. changed things that according to limitation.
therefore, there is need to reduce TLS variable mtd to minimized. so,
changed TLS variable mtd to normal variable. and added new two TLS
variable pointer _mtdp and bool mcount_recursion_marker. it is not
allocate variable mtd to the TLS, instead allocate to heap. and manage
it by *_mtdp which declared as TLS variable. and mcount_recursion_marker
has same role with mcount_recursion_marker that located at inside of TLS
variable mtd.

5. reference
For more information on this, please see the following link:
https://www.fsfla.org/~lxoliva/writeups/TLS/RFC-TLSDESC-x86.txt

If it change to the initial-exec model, it can expect some improvement
in performance compared to the existing one. Please refer to the
following link for more details.
https://die4taoam.tistory.com/37

Signed-off-by: Hanbum Park <kese111@gmail.com>
PAGE_SIZE has been duplicate declared. it is declared in user.h. so,
remove it.

caution: PAGE_SIZE declared in user.h for x86 only, don't try to that at
another architecture.

Signed-off-by: Hanbum Park <kese111@gmail.com>
… change

pos type to unsigned

PAGE_SIZE can be defined under 'arch', PAGE_SIZE be declared only if it is
not declared before. and pos the member of code_page structure mean
offset of current CODECHUNK. therefore, it never can be negative. so,
change pos type to unsigned from singed.

Signed-off-by: Hanbum Park <kese111@gmail.com>
@ParkHanbum
Copy link
Contributor Author

ParkHanbum commented Dec 6, 2019

statistic:
test with simple http server.
nodejs : success
pytohn : success
rust : success
go : particle success
firefox : success
chrome : particle failed (failed by chrome internal)

daemon
apache : failed
[[Thu Dec 05 22:08:04.646853 2019] [core:notice] [pid 28921] AH00060: seg fault or similar nasty error detected in the parent process
-> maybe this is caused by heathy check logic...

etc
tomcat : success
(( notice : some problem with jvm interpreting))

@bernhardkaindl
Copy link
Contributor

@ParkHanbum Is it possible to re-start this feature on attaching to running process, but now using full dynamic tracing?

Thanks, Bernhard

@ParkHanbum
Copy link
Contributor Author

@bernhardkaindl actually, this feature almost done In my opinion. uftrace do many job on module loaded time and it lead crash with some runtime program likely chrome/firefox. for now, some progress working to make uftrace active after running with cli interface as I know. if this has done, then I will update this feature and try to merge ASAP.

@namhyung
Copy link
Owner

@ParkHanbum are you talking about #1643? I'm not sure but I don't think it would help avoiding any existing bugs. Doing the work at runtime would make thing more complicated rather than making life easier. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants