bcc does not pick the right libraries #525

Open
rnav opened this issue May 4, 2016 · 16 comments

Comments

@rnav
Contributor

rnav commented May 4, 2016

bcc does not consider the encoded hwcap when choosing libraries. As a result, uprobes on shared libraries do not work on powerpc, as seen with the uprobes test:

======================================================================
FAIL: test_simple_library (__main__.TestUprobes)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/bcc/tests/python/test_uprobes.py", line 34, in test_simple_library
    self.assertEqual(b["stats"][ctypes.c_int(0)].value, 2)
AssertionError: 0L != 2

----------------------------------------------------------------------

libc libraries in cache:

# ldconfig -p | grep libc.so
    libc.so.6 (libc6,64bit, hwcap: 0x0000200000000000, OS ABI: Linux 2.6.32) => /lib64/power8/libc.so.6
    libc.so.6 (libc6,64bit, OS ABI: Linux 2.6.32) => /lib64/libc.so.6

bcc always picks the first library here, which won't work on non-power8 machines.

We need to either implement stricter checks (look at hwcap and perhaps the platform) or consider probing on all libraries with the same name.
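
To make the two options above concrete, here is a minimal sketch that lists every cache entry for a given soname together with its hwcap mask, using the `ldconfig -p` output shown earlier as input. `parse_ldconfig` is a hypothetical helper name, not part of bcc, and the sketch only enumerates candidates; it does not decide which one matches the running machine.

```python
import re

# Matches one entry of `ldconfig -p`, for example:
#   libc.so.6 (libc6,64bit, OS ABI: Linux 2.6.32) => /lib64/libc.so.6
_ENTRY = re.compile(r"^\s*(\S+)\s+\(([^)]*)\)\s+=>\s+(\S+)\s*$")


def parse_ldconfig(output, soname):
    """Return every cache entry for `soname` as a list of dicts."""
    entries = []
    for line in output.splitlines():
        m = _ENTRY.match(line)
        if not m or m.group(1) != soname:
            continue
        flags = [flag.strip() for flag in m.group(2).split(",")]
        hwcap = 0  # entries without an explicit hwcap are treated as 0
        for flag in flags:
            if flag.startswith("hwcap:"):
                hwcap = int(flag.split(":", 1)[1].strip(), 16)
        entries.append({"path": m.group(3), "flags": flags, "hwcap": hwcap})
    return entries


# Sample input copied from the `ldconfig -p | grep libc.so` output above.
SAMPLE = """\
    libc.so.6 (libc6,64bit, hwcap: 0x0000200000000000, OS ABI: Linux 2.6.32) => /lib64/power8/libc.so.6
    libc.so.6 (libc6,64bit, OS ABI: Linux 2.6.32) => /lib64/libc.so.6
"""

candidates = parse_ldconfig(SAMPLE, "libc.so.6")
for c in candidates:
    print("%#x %s" % (c["hwcap"], c["path"]))
```

With a list like this, a caller could either filter by hwcap or attach a probe to every returned path.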

@drzaeus77
Collaborator

Not sure I understand the full issue or suggestion here. As it is now, we're just using standard gcc logic when compiling.

If you run make VERBOSE=1, can you capture the line where libbcc.so is linked, and make some modifications that cause it to work for you? If you could for instance identify a problematic gcc line, we could help in working that into the build definitions.

@rnav
Contributor Author

rnav commented May 4, 2016

Oh, I probably should have explained better. The problem actually shows up at runtime and not while building bcc itself. In this case, it is with test_uprobes.py:

# /root/bcc/build/tests/wrapper.sh "py_uprobes" "sudo" "/root/bcc/tests/python/test_uprobes.py"
Python 2.7.5
.Arena 0:
system bytes     =   12648448
in use bytes     =    3013088
Total (incl. mmap):
system bytes     =   16449536
in use bytes     =    6814176
max mmap regions =         12
max mmap bytes   =    4456448
F
======================================================================
FAIL: test_simple_library (__main__.TestUprobes)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/bcc/tests/python/test_uprobes.py", line 34, in test_simple_library
    self.assertEqual(b["stats"][ctypes.c_int(0)].value, 2)
AssertionError: 0L != 2

----------------------------------------------------------------------
Ran 2 tests in 0.510s

FAILED (failures=1)
Failed

In test_uprobes.py, we use the line below to place a probe at malloc_stats() in libc:

b.attach_uprobe(name="c", sym="malloc_stats", fn_name="count")

This triggers a search for libc, which on powerpc ends up picking the wrong library to place the probe (/lib64/power8/libc.so.6 rather than /lib64/libc.so.6). As such, the probe never fires.

We pick the wrong library because we do not consider the hwcap associated with each library.

@drzaeus77
Collaborator

Yes, thanks for explaining, that certainly makes more sense! Actually, the problem seems obvious in retrospect, but it's been a busy morning so far :)

@rnav
Contributor Author

rnav commented May 4, 2016

Sure. It looks like @vmg wrote much of this code. @vmg do you have ideas on how best to address this?

@mprzybylski

mprzybylski commented Jan 11, 2017

Just ran into this issue while building / testing on Debian 8 amd64.

mikep@mv-tricolor:~/bcc/obj-x86_64-linux-gnu$ sudo /usr/bin/ctest --force-new-ctest-process -j1 -V

...

20: Test command: /home/mikep/bcc/obj-x86_64-linux-gnu/tests/wrapper.sh "py_uprobes" "sudo" "/home/mikep/bcc/tests/python/test_uprobes.py"
20: Test timeout computed to be: 9.99988e+06
20: Python 2.7.9
20: .Arena 0:
20: system bytes     =   13799424
20: in use bytes     =    2969696
20: Total (incl. mmap):
20: system bytes     =   14589952
20: in use bytes     =    3760224
20: max mmap regions =          4
20: max mmap bytes   =    1589248
20: F
20: ======================================================================
20: FAIL: test_simple_library (__main__.TestUprobes)
20: ----------------------------------------------------------------------
20: Traceback (most recent call last):
20:   File "/home/mikep/bcc/tests/python/test_uprobes.py", line 34, in test_simple_library
20:     self.assertEqual(b["stats"][ctypes.c_int(0)].value, 2)
20: AssertionError: 0L != 2
20: 
20: ----------------------------------------------------------------------
20: Ran 2 tests in 0.217s
20: 
20: FAILED (failures=1)
20: Failed
20/28 Test #20: py_uprobes .......................***Failed    0.29 sec

Looks like malloc_stats() is missing from this distro and version.

mikep@mv-tricolor:~$ python
Python 2.7.9 (default, Jun 29 2016, 13:08:31) 
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from ctypes.util import find_library
>>> find_library('malloc_stats')
>>> find_library('c')
'libc.so.6'

@rnav, would you mind doing the same check with python and find_library() on the host where you discovered this issue?
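
Note that `find_library()` resolves library names, not symbol names, so an empty result for `find_library('malloc_stats')` does not show that the symbol is missing. A minimal sketch of a symbol check through ctypes follows; `has_symbol` is a hypothetical helper name, and it returns `None` when the library itself cannot be located.

```python
import ctypes
from ctypes.util import find_library


def has_symbol(libname, symbol):
    """Return True/False if `symbol` is exported by lib<libname>, None if the library is not found."""
    path = find_library(libname)
    if path is None:
        return None
    lib = ctypes.CDLL(path)
    # ctypes raises AttributeError when the symbol lookup fails,
    # which is exactly what hasattr() tests for.
    return hasattr(lib, symbol)


print(has_symbol("c", "malloc_stats"))
```

This only says whether the library that ctypes loads exports the symbol; it says nothing about which copy of the library bcc resolves for the probe.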

@mprzybylski

Ugh. That's not it either...

mikep@mv-tricolor:~$ nm -D --defined-only /lib/x86_64-linux-gnu/libc.so.6
...
000000000007ddb0 W malloc_stats
...

It's definitely defined, and after reading the man page for malloc_stats(), it's pretty clear that it did run and this was its output:

20: Python 2.7.9
20: .Arena 0:
20: system bytes     =   13799424
20: in use bytes     =    2969696
20: Total (incl. mmap):
20: system bytes     =   14589952
20: in use bytes     =    3760224
20: max mmap regions =          4
20: max mmap bytes   =    1589248

So something about that bpf probe didn't fire correctly...

@rnav
Contributor Author

rnav commented Jan 12, 2017

@mprzybylski you can put in a sleep() in tests/python/test_uprobes.py and check /sys/kernel/debug/tracing/uprobe_events to see which library is being picked by bcc.

@rnav
Contributor Author

rnav commented Jan 12, 2017

I thought of using the aux vector to figure out the hardware capabilities before picking up the right library, but with 32-bit and 64-bit libraries, that may not be enough. Perhaps we should just put a probe on all matching libraries?
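
For reference, a minimal sketch of reading the hardware capability words from the aux vector of the current process via `/proc/self/auxv`. The `AT_*` values are the standard ones from `elf.h`; `parse_auxv` and `read_own_hwcap` are hypothetical helper names. How these bits map onto the hwcap mask printed by `ldconfig -p` is not covered here, and as noted above this alone does not separate 32-bit from 64-bit libraries.

```python
import struct

# Aux vector keys, as defined in <elf.h>.
AT_NULL = 0
AT_PLATFORM = 15
AT_HWCAP = 16
AT_HWCAP2 = 26


def parse_auxv(data, word_size=struct.calcsize("P")):
    """Parse raw auxv bytes into a {key: value} dict, stopping at AT_NULL."""
    fmt = "=QQ" if word_size == 8 else "=II"  # native byte order, fixed sizes
    entry_size = struct.calcsize(fmt)
    entries = {}
    for offset in range(0, len(data) - entry_size + 1, entry_size):
        key, value = struct.unpack_from(fmt, data, offset)
        if key == AT_NULL:
            break
        entries[key] = value
    return entries


def read_own_hwcap():
    """Return (AT_HWCAP, AT_HWCAP2) for the current process (Linux only)."""
    with open("/proc/self/auxv", "rb") as f:
        aux = parse_auxv(f.read())
    return aux.get(AT_HWCAP, 0), aux.get(AT_HWCAP2, 0)
```

The value stored under `AT_PLATFORM` is a pointer into the process's own memory, so the platform string cannot be recovered from the file contents alone.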

@pchaigno
Contributor

We discussed the same bug in #853. Sorry I didn't drop a note here before.

#875 is a first pull request to improve this situation (I plan to update it this weekend). It tries to attach to the library that's currently loaded into the target process, if one is given.
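
A minimal sketch of the general idea, not necessarily how #875 implements it: scan `/proc/<pid>/maps` for a mapped file whose name looks like the requested library and use that path. `loaded_library_path` and `loaded_library_for_pid` are hypothetical helper names, and the sample maps text is made up for illustration.

```python
import os


def loaded_library_path(maps_text, libname):
    """Return the path of the first mapping named lib<libname>.* or lib<libname>-*, else None."""
    prefixes = ("lib%s." % libname, "lib%s-" % libname)
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no pathname
        path = fields[5]
        if os.path.basename(path).startswith(prefixes):
            return path
    return None


def loaded_library_for_pid(pid, libname):
    """Look up the library actually mapped into process `pid` (Linux only)."""
    with open("/proc/%d/maps" % pid) as f:
        return loaded_library_path(f.read(), libname)


# Hypothetical maps content, for illustration only.
SAMPLE_MAPS = """\
00400000-0040c000 r-xp 00000000 08:01 1234 /bin/cat
7f1c2a000000-7f1c2a1a0000 r-xp 00000000 08:01 5678 /lib/x86_64-linux-gnu/libc-2.19.so
7ffd5a1e0000-7ffd5a201000 rw-p 00000000 00:00 0 [stack]
"""

print(loaded_library_path(SAMPLE_MAPS, "c"))
```

The resulting path could then be passed as the `name` argument of `attach_uprobe()` instead of the short name `"c"`.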

I also started working on the second strategy discussed in #853 (using the running architecture of the bcc process to help select the appropriate library), but I don't have much time, so it might take longer. If anyone else wants to take care of that one, please go ahead 😃
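
One way the second strategy could look, as a rough sketch under my own assumptions rather than the actual patch: compare the ELF class and machine type of each candidate library against the running interpreter binary. The helper names are hypothetical. This tells 64-bit, 32-bit and x32 libraries apart, but it cannot distinguish hwcap variants such as `/lib64/power8` from `/lib64`, since those share the same class and machine.

```python
import struct

EI_CLASS = 4      # index of the class byte in e_ident (1 = 32-bit, 2 = 64-bit)
EI_DATA = 5       # index of the byte-order byte in e_ident
ELFDATA2LSB = 1   # little-endian


def elf_class_and_machine(header):
    """Return (ei_class, e_machine) from the first 20 bytes of an ELF file, or None."""
    header = bytearray(header)
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None
    byte_order = "<" if header[EI_DATA] == ELFDATA2LSB else ">"
    # e_machine is a 16-bit field at offset 18 in both ELF32 and ELF64.
    (e_machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return header[EI_CLASS], e_machine


def same_abi(candidate_path, reference_path="/proc/self/exe"):
    """True if the candidate has the same ELF class and machine as the reference binary."""
    def read(path):
        with open(path, "rb") as f:
            return elf_class_and_machine(f.read(20))
    reference = read(reference_path)
    return reference is not None and read(candidate_path) == reference
```

On the Debian amd64 box discussed in this thread, a check like this would reject `/libx32/libc.so.6` for a 64-bit process because its ELF class is 32-bit even though the machine type is x86-64.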

@mprzybylski

@rnav, sure enough, this is a general problem for multiarch platforms. Thanks for pointing me in the right direction.

root@mv-tricolor:/sys/kernel/debug/tracing# cat uprobe_events 
p:uprobes/p__libx32_libc_so_6_0x76f30 /libx32/libc.so.6:0x0000000000076f30
r:uprobes/r__libx32_libc_so_6_0x76f30 /libx32/libc.so.6:0x0000000000076f30
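
A small sketch for pulling the library path out of `uprobe_events` lines like the ones above, which makes this check easy to script. `parse_uprobe_events` is a hypothetical helper name; reading the real file requires root and a mounted debugfs.

```python
def parse_uprobe_events(text):
    """Parse uprobe_events content into a list of dicts (type, name, path, offset)."""
    probes = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        kind, _, name = fields[0].partition(":")    # "p" or "r", then group/event
        path, _, offset = fields[1].rpartition(":")  # target is PATH:OFFSET
        # Some kernels append "(ref_ctr_offset)" after the offset; drop it.
        offset = offset.split("(", 1)[0]
        probes.append({"type": kind, "name": name,
                       "path": path, "offset": int(offset, 16)})
    return probes


# Sample input copied from the uprobe_events output above.
SAMPLE_EVENTS = """\
p:uprobes/p__libx32_libc_so_6_0x76f30 /libx32/libc.so.6:0x0000000000076f30
r:uprobes/r__libx32_libc_so_6_0x76f30 /libx32/libc.so.6:0x0000000000076f30
"""

probes = parse_uprobe_events(SAMPLE_EVENTS)
print(sorted(set(p["path"] for p in probes)))
```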

@mprzybylski

@pchaigno, Nice work on #875.

I just patched tests/python/test_uprobes.py to take advantage of it and got those tests to pass. I'll submit a pull request with that, and a few other things soon...

@finelli

finelli commented May 1, 2017

Hi @pchaigno, which patches should be applied to make the test pass? Right now on Debian Jessie the build hangs on the py_uprobes test.

@pchaigno
Contributor

pchaigno commented May 2, 2017

@finelli You shouldn't need any patch for the tests to pass. What makes you think it's related to this issue?
