Fix listing / tracing usdt probes in shared libraries #1600
@dalehamel can you take a look at this one? :)
@xdrop can you explain fully the situations that previously failed but which now pass? E.g., "without this patch, the following script fails". Going off of your tests, it looks like https://github.com/iovisor/bpftrace/pull/1600/files#diff-cc0e770d906adc6e162b517dcb638e6a50c2af8f1b5689421719743502eaeae2R137 is the case you are trying to ensure doesn't regress?
I have seen similar issues in the past but thought they had been fixed, but perhaps we are no longer using the bcc APIs, which would enter the mount namespace correctly. I see that you are introducing a map for caching the path lookups; in the past we have had issues with cache coherency when we've done similar things.
Does this only fail if you try to specify a path that explicitly does have "/proc/PID/..." in it? I.e., if you specify "-p" and a path relative to the mount namespace, it still works correctly, right?
Also, if you have two separate fixes (one for the /proc relative path, and one for the shared libraries), they should probably be in two separate PRs, as it will make it easier to review.
The library part of this looks great; I'm still struggling to grasp why we need a path cache for the /proc relative problem though.
Thanks for adding tests for both cases either way!
Sure :) Without 24c1267, this fails: (Under the condition that pid
Without 8151e3a, this fails (or tracing any usdt tracepoint in a shared lib):
I think this fails regardless of whether we are in a different mount namespace or not because without my changes the library tests fail (which of course don't run in a different namespace). But can confirm tomorrow for sure.
Yeah I couldn't figure out any other way to get the tracepoint paths detected by bcc other than to store them in a map during its collection of tracepoints in
Nope it didn't work for me, maybe it's a regression since I might give a go re-enabling it since it has passed at least on my system with my changes (and fails without).
We don't need the cache for the Basically what happens is bcc finds the tracepoint for my pid, and correctly realizes it's in a shared lib (which is say
Lol there it is. I was thinking about https://github.com/iovisor/bcc/pull/2817/files while reading this. There have definitely been a number of changes in this logic in BCC, so it's not surprising that this behavior has broken, but I was surprised because I was quite sure it worked in the past. The bcc that has the fix has long since been released (it is up to 0.17.0 now) so we should separately bump bcc and re-enable the test.
awesome! If we split this up, we can also make an issue for re-enabling that regression test. Thanks for the great work here again, the extra regression test will keep this 💪
Great! I've created an issue for re-enabling the test #1638
PR is now split, this PR contains the fixes + tests for shared libraries and #1637 contains the
usdt_probe_list probes;
for (auto const &usdt_probes : usdt_provider_cache[path])
for (auto const &path : usdt_pid_to_paths_cache[pid])
So we cache pid → path here, and then the existing logic of the path caching is used right?
Yes, exactly
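For readers following the thread, the two-stage lookup confirmed above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `ProbeEntry`, `provider_cache`, `pid_to_paths_cache` and `probes_for_pid_sketch` are hypothetical stand-ins for bpftrace's `usdt_probe_entry`, `usdt_provider_cache`, `usdt_pid_to_paths_cache` and `USDTHelper::probes_for_pid`, and the container types are assumed.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for bpftrace's probe entry type.
struct ProbeEntry {
  std::string path;
  std::string provider;
  std::string name;
};
using ProbeList = std::vector<ProbeEntry>;

// Existing cache (assumed shape): path -> provider -> probes.
static std::map<std::string, std::map<std::string, ProbeList>> provider_cache;
// New cache (assumed shape): pid -> every path bcc reported for that pid.
static std::map<int, std::set<std::string>> pid_to_paths_cache;

// Stage 1: pid -> paths. Stage 2: each path -> providers -> probes.
ProbeList probes_for_pid_sketch(int pid) {
  ProbeList probes;
  auto paths = pid_to_paths_cache.find(pid);
  if (paths == pid_to_paths_cache.end())
    return probes; // nothing was recorded for this pid

  for (auto const &path : paths->second) {
    auto providers = provider_cache.find(path);
    if (providers == provider_cache.end())
      continue; // path known for the pid but no probes cached for it
    for (auto const &entry : providers->second)
      probes.insert(probes.end(), entry.second.begin(), entry.second.end());
  }
  return probes;
}
```

The sketch uses `find` rather than `operator[]` so that a lookup for an unknown pid or path does not insert empty entries into the caches; whether the real code does the same is not stated in the thread.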
Thanks for splitting this up, it is easier to review now. Looks like you need to rebase for the changelog.md anyways. @danobi since you have been into the USDT code more recently than me (and reviewed the other PR he split out of this) can you take a look as well?
@danobi Have you had a chance to take a look? Right now I have to maintain my own fork to use bpftrace with USDT probes in shared libs, and it's not the most convenient option 😄 Let me know if I can be of any help
I hope to have this reviewed before the end of the week. Sorry for the delay
LGTM, really sorry about long delay.
Side note: all the global state in usdt.cpp really sucks. We should fix it at some point.
No problem, thank you for reviewing. I will rebase to fix the changelog conflicts.
Agreed.
ping @fbs I'll merge in a few days if fbs is busy.
I'm not that familiar with the whole usdt part but it looks ok :)
The usdt bin paths may not necessarily correspond to the pid's exe path (they may be in a shared lib). This results in cache misses when the tracepoint is in a shared library object and not in the main binary. Instead, we keep track of the bin paths for each pid (provided by bcc) and do the cache lookup based on those.
Adding runtime tests that ensure we can list and probe tracepoints not present in the main binary, but rather in linked shared libraries.
Issue
Listing or tracing usdt probes contained in linked shared libs doesn't work.
Cause
During read_probes_from_pid in usdt.cpp we collect usdt probes from bcc and maintain a cache of paths --> providers --> tracepoints: https://github.com/iovisor/bpftrace/blob/80d2d3c3144b8c74b84e12a72b6b34004229c2cb/src/usdt.cpp#L23-L29
In the scenario above the cache looks something like:
Later in USDTHelper::probes_for_pid we are given a pid and try to perform a lookup based on the pid's exe path: https://github.com/iovisor/bpftrace/blob/80d2d3c3144b8c74b84e12a72b6b34004229c2cb/src/usdt.cpp#L78-L85
In this case it tries to look for /proc/43133/root/bcc-build/bpftrace/build-mine/tests/testprogs/usdt_lib in the cache, but only /proc/43133/root/bcc-build/bpftrace/build/tests/testlibs/libusdt_tp.so exists there, so it fails to find the probe.
Fix
The fix is that when we iterate the probes given by bcc, we also populate a separate map that records the paths bcc found for each pid. So rather than computing the path to look up in the cache from the pid's exe (https://github.com/iovisor/bpftrace/blob/80d2d3c3144b8c74b84e12a72b6b34004229c2cb/src/usdt.cpp#L78-L80), we can get the paths for the pid by indexing usdt_pid_to_paths_cache[pid]. This way we don't risk a cache miss due to different paths found by bcc / computed by bpftrace.
Checklist
CHANGELOG.md
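To make the cause and fix described above concrete, here is a small sketch of the population step and of why an exe-path lookup misses while a pid-keyed lookup hits. It is only an illustration under assumptions: `ProbeEntry`, `record_probe`, `exe_path_lookup_hits` and `pid_lookup_hits` are hypothetical names, the container types are assumed, and the real code receives probes from bcc during collection rather than through a helper like this.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for bpftrace's probe entry type.
struct ProbeEntry {
  std::string path;     // binary or shared object that contains the probe
  std::string provider;
  std::string name;
};

// Assumed shapes: path -> provider -> probes, and pid -> paths seen for it.
static std::map<std::string, std::map<std::string, std::vector<ProbeEntry>>>
    provider_cache;
static std::map<int, std::set<std::string>> pid_to_paths_cache;

// Called once per probe reported for `pid`. Besides filling the existing
// path-keyed cache, it remembers which path the probe was found under.
void record_probe(int pid, const ProbeEntry &probe) {
  provider_cache[probe.path][probe.provider].push_back(probe);
  pid_to_paths_cache[pid].insert(probe.path);
}

// Old behaviour: look up by the path of the pid's main executable.
bool exe_path_lookup_hits(const std::string &exe_path) {
  return provider_cache.count(exe_path) > 0;
}

// New behaviour: look up by the paths recorded for the pid.
bool pid_lookup_hits(int pid) {
  auto it = pid_to_paths_cache.find(pid);
  if (it == pid_to_paths_cache.end())
    return false;
  for (auto const &path : it->second)
    if (provider_cache.count(path) > 0)
      return true;
  return false;
}
```

With a probe recorded under a shared library path, `exe_path_lookup_hits` on the main binary's path returns false, which is the miss described under Cause, while `pid_lookup_hits` for the same pid returns true.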