Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

the same usdt probe appearing in multiple binary files not working #1515

Closed
yonghong-song opened this issue Jan 5, 2018 · 4 comments
Closed

Comments

@yonghong-song
Copy link
Collaborator

Here, the same usdt probe refers to probes with the same <provider>:<probe_name>.

Current bcc implementation assumes probes with a particular <probe_name> only appears in one binary. If a probe is defined in a header files, it is possible that eventually, the probe may end up in different binaries. For example,

  define.h: contains FOLLY_SDT(foo, bar, ...);

a.c includes define.h and eventually lands in liba.so and liba.so have foo:bar probe.
b.c include define.h and eventually links with liba.so to produce the binary a.out and
a.out also have foo:bar probe.

In this case, foo:bar appears in two different binary files. I hit this issue when I debug a FB internal issue.

I hacked bcc and the following code change seems working:

diff --git a/src/cc/usdt/usdt.cc b/src/cc/usdt/usdt.cc
index dec0646..98d5935 100644
--- a/src/cc/usdt/usdt.cc
+++ b/src/cc/usdt/usdt.cc
@@ -238,7 +238,8 @@ int Context::_each_module(const char *modpath, uint64_t, uint64_t, uint64_t,
 
 void Context::add_probe(const char *binpath, const struct bcc_elf_usdt *probe) {
   for (auto &p : probes_) {
-    if (p->provider_ == probe->provider && p->name_ == probe->name) {
+    if (p->bin_path_ == binpath && p->provider_ == probe->provider &&
+        p->name_ == probe->name) {
       p->add_location(probe->pc, probe->arg_fmt);
       return;
     }
@@ -277,8 +278,15 @@ bool Context::enable_probe(const std::string &probe_name,
   if (pid_stat_ && pid_stat_->is_stale())
     return false;
 
-  auto p = get(probe_name);
-  return p && p->enable(fn_name);
+  bool found = false;
+  for (auto &p : probes_) {
+    if (p->name_ == probe_name) {
+      p->enable(fn_name);
+      found = true;
+    }
+  }
+
+  return found;
 }
 
 void Context::each(each_cb callback) {
diff --git a/tools/tplist.py b/tools/tplist.py
index db4b68b..04ce05b 100755
--- a/tools/tplist.py
+++ b/tools/tplist.py
@@ -88,10 +88,11 @@ def print_usdt(pid, lib):
         probes_seen = []
         for probe in reader.enumerate_probes():
                 probe_name = probe.short_name()
+                binary_probe_name = probe.bin_path + "_" + probe_name
                 if not args.filter or fnmatch.fnmatch(probe_name, args.filter):
-                        if probe_name in probes_seen:
+                        if binary_probe_name in probes_seen:
                                 continue
-                        probes_seen.append(probe_name)
+                        probes_seen.append(binary_probe_name)
                         print_usdt_details(probe)
 
 if __name__ == "__main__":

Before I go ahead for a formal patch with test cases, any concern for this that I may not foresee?

@goldshtn
Copy link
Collaborator

goldshtn commented Jan 5, 2018

Yes, that certainly makes sense. When testing your patch, can you check if the trace/argdist/funccount etc. support still works without specifying the binary path? It is very convenient to do something like trace u::gc__start -p $(pidof node) without having to specify the full path. Perhaps if the user did not specify a library, we could just search for an appropriate probe without a library name?

@yonghong-song
Copy link
Collaborator Author

Right. If user did not specify a library, we could just search for the probe across all file-backed modules given a pid.

@palmtenor
Copy link
Member

Looking at the patch, those two same probes from different binary will still only generate one readarg_X helper, right? What if this same probe has different format in those two different binaries (and have conflicting virtual address in the switch-case in the readarg_X helper)?

@yonghong-song
Copy link
Collaborator Author

Looking at the patch, those two same probes from different binary will still only generate one readarg_X helper, right?
Right.

What if this same probe has different format in those two different binaries (and have conflicting virtual address in the switch-case in the readarg_X helper)?
Then we even have issues when the same probe appears more than once in the same binary. I guess we could add logic in the code to check format as well and return error, right? it is really not a good practice to have different formats for the same probe name.

The virtual address should be okay as they are all under the same pid address space. However, the virtual address in the .so note section cannot be directly used for code generation ...

yonghong-song added a commit that referenced this issue Jan 10, 2018
Fixing issue #1515.

Currently, each usdt probe (provider, probe_name) can only
have locations from the single binary. It is possible that
USDT probes are defined in a header file which eventually causes
the same usdt probe having locations in several different
binary/shared_objects. In such cases, we are not able to attach
the same bpf program to all these locations.

This patch addresses this issue by defining each location to
be `bin_path + addr_offset` vs. previous `addr_offset` only.
This way, each internal Probe class can represent all locations
for the same (provider, probe_name) pair.

The `tplist.py` output is re-organized with the (provider, probe_name)
in the top level like below:
```
...
rtld:lll_futex_wake [sema 0x0]
  location #1 /usr/lib64/ld-2.17.so 0xaac8
    argument #1 8 unsigned bytes @ di
    argument #2 4 signed   bytes @ 1
    argument #3 4 signed   bytes @ 0
  location #2 /usr/lib64/ld-2.17.so 0xe9b9
    argument #1 8 unsigned bytes @ di
    argument #2 4 signed   bytes @ 1
    argument #3 4 signed   bytes @ 0
  location #3 /usr/lib64/ld-2.17.so 0xef3b
    argument #1 8 unsigned bytes @ di
    argument #2 4 signed   bytes @ 1
    argument #3 4 signed   bytes @ 0
...
```

Tested with the following commands
```
  tplist.py
  trace.py -p <pid> 'u::probe "arg1 = %d", arg1'
  trace.py u:<binary_path>:probe "arg1 = %d", arg1'
  argdist.py -p <pid> 'u::probe():s64:arg1'
  argdist.py -C 'u:<binary_path>:probe():s64:arg1'
  funccount.py -p <pid> 'u:<binary_path>:probe'
  funccount.py 'u:<binary_path>:probe'
```

Signed-off-by: Yonghong Song <yhs@fb.com>
yonghong-song added a commit that referenced this issue Jan 11, 2018
Fixing issue #1515.

Currently, each usdt probe (provider, probe_name) can only
have locations from the single binary. It is possible that
USDT probes are defined in a header file which eventually causes
the same usdt probe having locations in several different
binary/shared_objects. In such cases, we are not able to attach
the same bpf program to all these locations.

This patch addresses this issue by defining each location to
be `bin_path + addr_offset` vs. previous `addr_offset` only.
This way, each internal Probe class can represent all locations
for the same (provider, probe_name) pair.

The `tplist.py` output is re-organized with the (provider, probe_name)
in the top level like below:
```
...
rtld:lll_futex_wake [sema 0x0]
  location #1 /usr/lib64/ld-2.17.so 0xaac8
    argument #1 8 unsigned bytes @ di
    argument #2 4 signed   bytes @ 1
    argument #3 4 signed   bytes @ 0
  location #2 /usr/lib64/ld-2.17.so 0xe9b9
    argument #1 8 unsigned bytes @ di
    argument #2 4 signed   bytes @ 1
    argument #3 4 signed   bytes @ 0
  location #3 /usr/lib64/ld-2.17.so 0xef3b
    argument #1 8 unsigned bytes @ di
    argument #2 4 signed   bytes @ 1
    argument #3 4 signed   bytes @ 0
...
```

Tested with the following commands
```
  tplist.py
  trace.py -p <pid> 'u::probe "arg1 = %d", arg1'
  trace.py u:<binary_path>:probe "arg1 = %d", arg1'
  argdist.py -p <pid> 'u::probe():s64:arg1'
  argdist.py -C 'u:<binary_path>:probe():s64:arg1'
  funccount.py -p <pid> 'u:<binary_path>:probe'
  funccount.py 'u:<binary_path>:probe'
```

Signed-off-by: Yonghong Song <yhs@fb.com>
yonghong-song added a commit that referenced this issue Jan 11, 2018
Fixing issue #1515.

Currently, each usdt probe (provider, probe_name) can only
have locations from the single binary. It is possible that
USDT probes are defined in a header file which eventually causes
the same usdt probe having locations in several different
binary/shared_objects. In such cases, we are not able to attach
the same bpf program to all these locations.

This patch addresses this issue by defining each location to
be `bin_path + addr_offset` vs. previous `addr_offset` only.
This way, each internal Probe class can represent all locations
for the same (provider, probe_name) pair.

The `tplist.py` output is re-organized with the (provider, probe_name)
in the top level like below:
```
...
rtld:lll_futex_wake [sema 0x0]
  location #1 /usr/lib64/ld-2.17.so 0xaac8
    argument #1 8 unsigned bytes @ di
    argument #2 4 signed   bytes @ 1
    argument #3 4 signed   bytes @ 0
  location #2 /usr/lib64/ld-2.17.so 0xe9b9
    argument #1 8 unsigned bytes @ di
    argument #2 4 signed   bytes @ 1
    argument #3 4 signed   bytes @ 0
  location #3 /usr/lib64/ld-2.17.so 0xef3b
    argument #1 8 unsigned bytes @ di
    argument #2 4 signed   bytes @ 1
    argument #3 4 signed   bytes @ 0
...
```

Tested with the following commands
```
  tplist.py
  trace.py -p <pid> 'u::probe "arg1 = %d", arg1'
  trace.py u:<binary_path>:probe "arg1 = %d", arg1'
  argdist.py -p <pid> 'u::probe():s64:arg1'
  argdist.py -C 'u:<binary_path>:probe():s64:arg1'
  funccount.py -p <pid> 'u:<binary_path>:probe'
  funccount.py 'u:<binary_path>:probe'
```

Signed-off-by: Yonghong Song <yhs@fb.com>
yonghong-song added a commit that referenced this issue Jan 12, 2018
Fixing issue #1515.

Currently, each usdt probe (provider, probe_name) can only
have locations from the single binary. It is possible that
USDT probes are defined in a header file which eventually causes
the same usdt probe having locations in several different
binary/shared_objects. In such cases, we are not able to attach
the same bpf program to all these locations.

This patch addresses this issue by defining each location to
be `bin_path + addr_offset` vs. previous `addr_offset` only.
This way, each internal Probe class can represent all locations
for the same (provider, probe_name) pair.

The `tplist.py` output is re-organized with the (provider, probe_name)
in the top level like below:
```
...
rtld:lll_futex_wake [sema 0x0]
  location #1 /usr/lib64/ld-2.17.so 0xaac8
    argument #1 8 unsigned bytes @ di
    argument #2 4 signed   bytes @ 1
    argument #3 4 signed   bytes @ 0
  location #2 /usr/lib64/ld-2.17.so 0xe9b9
    argument #1 8 unsigned bytes @ di
    argument #2 4 signed   bytes @ 1
    argument #3 4 signed   bytes @ 0
  location #3 /usr/lib64/ld-2.17.so 0xef3b
    argument #1 8 unsigned bytes @ di
    argument #2 4 signed   bytes @ 1
    argument #3 4 signed   bytes @ 0
...
```

Tested with the following commands
```
  tplist.py
  trace.py -p <pid> 'u::probe "arg1 = %d", arg1'
  trace.py u:<binary_path>:probe "arg1 = %d", arg1'
  argdist.py -p <pid> 'u::probe():s64:arg1'
  argdist.py -C 'u:<binary_path>:probe():s64:arg1'
  funccount.py -p <pid> 'u:<binary_path>:probe'
  funccount.py 'u:<binary_path>:probe'
```

Signed-off-by: Yonghong Song <yhs@fb.com>
banh-gao pushed a commit to banh-gao/bcc that referenced this issue Jun 21, 2018
Fixing issue iovisor#1515.

Currently, each usdt probe (provider, probe_name) can only
have locations from the single binary. It is possible that
USDT probes are defined in a header file which eventually causes
the same usdt probe having locations in several different
binary/shared_objects. In such cases, we are not able to attach
the same bpf program to all these locations.

This patch addresses this issue by defining each location to
be `bin_path + addr_offset` vs. previous `addr_offset` only.
This way, each internal Probe class can represent all locations
for the same (provider, probe_name) pair.

The `tplist.py` output is re-organized with the (provider, probe_name)
in the top level like below:
```
...
rtld:lll_futex_wake [sema 0x0]
  location iovisor#1 /usr/lib64/ld-2.17.so 0xaac8
    argument iovisor#1 8 unsigned bytes @ di
    argument iovisor#2 4 signed   bytes @ 1
    argument iovisor#3 4 signed   bytes @ 0
  location iovisor#2 /usr/lib64/ld-2.17.so 0xe9b9
    argument iovisor#1 8 unsigned bytes @ di
    argument iovisor#2 4 signed   bytes @ 1
    argument iovisor#3 4 signed   bytes @ 0
  location iovisor#3 /usr/lib64/ld-2.17.so 0xef3b
    argument iovisor#1 8 unsigned bytes @ di
    argument iovisor#2 4 signed   bytes @ 1
    argument iovisor#3 4 signed   bytes @ 0
...
```

Tested with the following commands
```
  tplist.py
  trace.py -p <pid> 'u::probe "arg1 = %d", arg1'
  trace.py u:<binary_path>:probe "arg1 = %d", arg1'
  argdist.py -p <pid> 'u::probe():s64:arg1'
  argdist.py -C 'u:<binary_path>:probe():s64:arg1'
  funccount.py -p <pid> 'u:<binary_path>:probe'
  funccount.py 'u:<binary_path>:probe'
```

Signed-off-by: Yonghong Song <yhs@fb.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants