Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tracing: Only prepare tracepoint arguments when actually tracing #26593

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

0xB10C
Copy link
Contributor

@0xB10C 0xB10C commented Nov 28, 2022

Currently, if the tracepoints are compiled (e.g. in depends and release builds), we always prepare the tracepoint arguments regardless of the tracepoints being used or not. We made sure that the argument preparation is as cheap as possible, but we can almost completely eliminate any overhead for users not interested in the tracepoints (the vast majority), by gating the tracepoint argument preparation with an if(something is attached to this tracepoint). To achieve this, we use the optional semaphore feature provided by SystemTap.

The first commit simplifies and deduplicates our tracepoint macros from 13 TRACEx macros to a single TRACEPOINT macro. This makes them easier to use and also avoids more duplicate macro definitions in the second commit.

The Linux tracing tools I'm aware of (bcc, bpftrace, libbpf, and systemtap) all support the semaphore gating feature. Thus, all existing tracepoints and their argument preparation is gated in the second commit. For details, please refer to the commit messages and the updated documentation in doc/tracing.md.

Also adding unit tests that include all tracepoint macros to make sure there are no compiler problems with them (e.g. some varadiac extension not supported).

Reviewers might want to check:

  • Do the tracepoints still work for you? Do the examples in contrib/tracing/ run on your system (as bpftrace frequently breaks on every new version, please test master too if it should't work for you)? Do the CI interface tests still pass?
  • Is the new documentation clear?
  • The TRACEPOINT_SEMAPHORE(event, context) macros places global variables in our global namespace. Is this something we strictly want to avoid or maybe move to all TRACEPOINT_SEMAPHOREs to a separate .cpp file or even namespace? I like having the TRACEPOINT_SEMAPHORE() in same file as the TRACEPOINT(), but open for suggestion on alternative approaches.
  • Are newly added tracepoints in the unit tests visible when using readelf -n /src/test/test_bitcoin? You can run the new unit tests with ./src/test/test_bitcoin --run_test=util_trace_tests* --log_level=all.
Two of the added unit tests demonstrate that we are only processing the tracepoint arguments when attached by having a test-failure condition in the tracepoint argument preparation. The following bpftrace script can be used to demonstrate that the tests do indeed fail when attached to the tracepoints.

fail_tests.bt:

#!/usr/bin/env bpftrace

usdt:./src/test/test_bitcoin:test:check_if_attached {
  printf("the 'check_if_attached' test should have failed\n");
}

usdt:./src/test/test_bitcoin:test:expensive_section {
  printf("the 'expensive_section' test should have failed\n");
}

Run the unit tests with ./src/test/test_bitcoin and start bpftrace fail_tests.bt -p $(pidof test_bitcoin) in a separate terminal. The unit tests should fail with:

Running 594 test cases...
test/util_trace_tests.cpp(31): error: in "util_trace_tests/test_tracepoint_check_if_attached": check false has failed
test/util_trace_tests.cpp(51): error: in "util_trace_tests/test_tracepoint_manual_tracepoint_active_check": check false has failed

*** 2 failures are detected in the test module "Bitcoin Core Test Suite"

These links might provide more contextual information for reviewers:

@DrahtBot
Copy link
Contributor

DrahtBot commented Nov 28, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK willcl-ark
Concept ACK jb55, vasild, kouloumos
Approach ACK virtu
Stale ACK laanwj

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #29700 (kernel, refactor: return error status on all fatal errors by ryanofsky)
  • #29432 (Stratum v2 Template Provider (take 3) by Sjors)
  • #29071 (refactor: Remove Span operator==, Use std::ranges::equal by maflcko)
  • #28984 (Cluster size 2 package rbf by instagibbs)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@0xB10C 0xB10C force-pushed the 2022-11-only-prepare-tracepoint-arguments-when-tracing branch from 27bd196 to 94331ba Compare November 28, 2022 20:54
src/util/trace.h Outdated Show resolved Hide resolved
src/util/trace.h Outdated Show resolved Hide resolved
@0xB10C 0xB10C force-pushed the 2022-11-only-prepare-tracepoint-arguments-when-tracing branch from 94331ba to be0a187 Compare November 29, 2022 15:44
@0xB10C
Copy link
Contributor Author

0xB10C commented Nov 29, 2022

  • Addressed tracing: Only prepare tracepoint arguments when actually tracing #26593 (comment) by using __VA_ARGS__. Also using a separate TRACEPOINT0 macro for tracepoints without arguments, as using the TRACEPOINT macro without arguments is only be possible with GCC (clang warns that we are using gcc extensions). Tested with both clang and gcc.
  • Adressed tracing: Only prepare tracepoint arguments when actually tracing #26593 (comment) by dropping the TRACEPOINT_UNCHECKED macro.
  • Added unit tests for the TRACEPOINT macros. The tracepoints in the unit tests don't do anything in particular, but the tests show that the macros work. Through the CI we cover ENABLE_TRACING being defined and undefined. The test case test_tracepoint_check_if_attached also shows that the TRACEPOINT_ACTIVE macro is not broken (it's tested that it's working in the functional tests).

@0xB10C 0xB10C force-pushed the 2022-11-only-prepare-tracepoint-arguments-when-tracing branch from be0a187 to 61ca0f9 Compare November 29, 2022 16:29
@jb55
Copy link
Contributor

jb55 commented Nov 29, 2022

genius. Concept ACK

@vasild
Copy link
Contributor

vasild commented Dec 2, 2022

Concept ACK

@virtu
Copy link
Contributor

virtu commented Dec 12, 2022

Approach ACK

Super cool stuff. This effectively removes any runtime overhead of tracepoints on modern processors with branch prediction and speculative execution when no probes are attached .

Some feedback on testing with 61ca0f9:

  • functional tests executed successfully on x86_64 and arm64

  • demo scripts in contrib/tracing: right now, bpftrace scripts work, bcc scripts don't

  • generated assembly looks as expected:

    • tracing code is semaphore-gated using compare and branch instructions:

      # x86_64
      cmp    WORD PTR [rip+0x8f816b],0x0        # 0x55d32c25d6c0 <net_outbound_message_semaphore>
      jne    0x55d32b965800 <_ZN8CConnman11PushMessageEP5CNodeO17CSerializedNetMsg+1488>
      
      # arm64
      ldrh    w1, [x0, #1776]
      cbnz    w1, 0xaaaab2413fa4 <_ZN8CConnman11PushMessageEP5CNodeO17CSerializedNetMsg+1228>
      
    • tracing code uses nop when no probe is attached vs. int3 (x86_64) / brk (arm64) to trap when probe is attached

  • used gdb to inspect semaphores: they are correctly initialized to zero and set to one if one or more(!) probes are attached. The documentation I read stated semaphores act as probe reference counters, so I was expecting semaphore values to reflect the number of probes. However, it looks like semaphores and counters were separated. This does not impact correct behavior though:

    (gdb) # no probe attached
    (gdb) x/hx &net_outbound_message_semaphore
    0xaaaacc8506f0 <net_outbound_message_semaphore>:        0x0000
    (gdb) # started one instance of contrib/tracing/log_p2p_traffic.bt
    (gdb) x/hx &net_outbound_message_semaphore
    0xaaaacc8506f0 <net_outbound_message_semaphore>:        0x0001
    (gdb) # started another instance of contrib/tracing/log_p2p_traffic.bt
    (gdb) x/hx &net_outbound_message_semaphore
    0xaaaacc8506f0 <net_outbound_message_semaphore>:        0x0001
    (gdb) # stopped first instance
    (gdb) x/hx &net_outbound_message_semaphore
    0xaaaacc8506f0 <net_outbound_message_semaphore>:        0x0001
    (gdb) # stopped second instance
    (gdb) x/hx &net_outbound_message_semaphore
    0xaaaacc8506f0 <net_outbound_message_semaphore>:        0x0000
    

@0xB10C
Copy link
Contributor Author

0xB10C commented Dec 13, 2022

Thanks for the detailed review and testing. I've too noticed that the current example bcc scripts have a problem with the semaphores. The functional tests, also using bcc, work as intended as they use the PID to attach to the bitcoind process. Will add a commit to allow running the example scripts by specifying a PID.

@kouloumos
Copy link
Contributor

Concept ACK!
I've though about something similar while working on #25541 and even more when you mentioned performance impact on #25832.

@0xB10C 0xB10C force-pushed the 2022-11-only-prepare-tracepoint-arguments-when-tracing branch 2 times, most recently from 636d0b8 to 6ac946d Compare December 20, 2022 12:39
@0xB10C
Copy link
Contributor Author

0xB10C commented Dec 20, 2022

demo scripts in contrib/tracing: right now, bpftrace scripts work, bcc scripts don't -- (#26593 (comment))

Rebased and added a commit that fixes the bcc example scripts by using the PID of bitcoind instead of the file path. We already do this in the tests (220a5a2).

@0xB10C 0xB10C force-pushed the 2022-11-only-prepare-tracepoint-arguments-when-tracing branch from 6ac946d to c4db6fb Compare December 20, 2022 12:44
@0xB10C 0xB10C force-pushed the 2022-11-only-prepare-tracepoint-arguments-when-tracing branch from c4db6fb to e35f0b5 Compare January 4, 2023 10:32
@fanquake
Copy link
Member

fanquake commented Mar 6, 2024

@0xB10C can you give a summary of the current state of these changes? It's draft, and from the looks of the most recent comment, that's because it's waiting on support for macOS and FreeBSD based on this branch.. Is that still correct/being worked on?

@0xB10C
Copy link
Contributor Author

0xB10C commented Mar 7, 2024

Thanks for the poke. Undrafting.

I have a branch building on top of this PR adding support for no-dependency no-overhead-if-not-used macOS tracing in a working state and a concept for FreeBSD to do the same. Still some work to do before I can open a PR. This PR was a draft as I wanted to make sure it's compatible with both the macOS and FreeBSD changes. I think it's mostly compatible with the changes here.

Nonetheless, I would prefer review on #25832 first from reviewers interested in tracepoints.

@@ -132,8 +132,9 @@ def print_message(event, inbound):
)


def main(bitcoind_path):
bitcoind_with_usdts = USDT(path=str(bitcoind_path))
def main(pid):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Attaching by pid is a good idea, even aside from semaphores, as it attaches a specific instance instead of all processes sharing the binary, which i think it does if you provide the path.

@laanwj
Copy link
Member

laanwj commented Apr 8, 2024

For some reason, after building with this, I dont get any output from:

$ gdb src/bitcoind
...
(gdb) info tracepoints
No tracepoints.

Tested with gdb 15.0.50.20240219-git and 13.1.

I checked that the tracepoints are included:

$ readelf -n ./src/bitcoind | grep NT_STAPSDT -A 4 -B 2
  stapsdt              0x00000062       NT_STAPSDT (SystemTap probe descriptors)
    Provider: net
    Name: outbound_message
    Location: 0x00000000001809bc, Base: 0x0000000000a2cb60, Semaphore: 0x0000000000b73248
    Arguments: -8@%r14 8@%r15 8@-560(%rbp) 8@24(%rcx) 8@%rax 8@%rdx
...

I don't know if this is an actual problem as we don't rely on gdb for tracing, but if so it's a documentation issue.

Edit: something is broken here, it also doesn't show them for the master branch.

@laanwj
Copy link
Member

laanwj commented Apr 8, 2024

Oh, it's info probes not info tracepoints. Whoops. It checks out now.

Code review and lightly tested ACK 3933bdd

Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left a nit/question

src/util/trace.h Outdated Show resolved Hide resolved
This deduplicates the TRACEx macros by using systemtaps STAP_PROBEV[0]
variadic macro instead of the DTrace compability DTRACE_PROBE[1] macros.
Bitcoin Core never had DTrace tracepoints, so we don't need to use the
drop-in replacement for it. As noted in pr25541[2], these macros aren't
compatibile with DTrace on macOS anyway.

This also renames the TRACEx macro to TRACEPOINT to clarify what the
macro does: inserting a tracepoint vs tracing (logging) something.

[0]: https://sourceware.org/git/?p=systemtap.git;a=blob;f=includes/sys/sdt.h;h=24d5e01c37805e55c36f7202e5d4e821b85167a1;hb=ecab2afea46099b4e7dfd551462689224afdbe3a#l407
[1]: https://sourceware.org/git/?p=systemtap.git;a=blob;f=includes/sys/sdt.h;h=24d5e01c37805e55c36f7202e5d4e821b85167a1;hb=ecab2afea46099b4e7dfd551462689224afdbe3a#l490
[2]: https://github.com/bitcoin/bitcoin/pull/25541/files#diff-553886c5f808e01e3452c7b21e879cc355da388ef7680bf310f6acb926d43266R30-R31

Co-authored-by: Martin Leitner-Ankerl <martin.ankerl@gmail.com>
@0xB10C 0xB10C force-pushed the 2022-11-only-prepare-tracepoint-arguments-when-tracing branch from 3933bdd to 2a2c04e Compare April 9, 2024 15:26
@DrahtBot
Copy link
Contributor

DrahtBot commented Apr 9, 2024

🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.

Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.

Leave a comment here, if you need help tracking down a confusing failure.

Debug: https://github.com/bitcoin/bitcoin/runs/23620364225

Before this commit, we would always prepare tracepoint arguments
regardless of the tracepoint being used or not. While we already made
sure not to include expensive arguments in our tracepoints, this
commit introduces gating to make sure the arguments are only prepared
if the tracepoints are actually used. This is a win-win improvement
to our tracing framework. For users not interested in tracing, the
overhead is reduced to a cheap 'greater than 0' compare. As the
semaphore-gating technique used here is available in bpftrace, bcc,
and libbpf, users interested in tracing don't have to change their
tracing scripts while profiting from potential future tracepoints
passing slightly more expensive arguments. An example are mempool
tracepoints that pass serialized transactions. We've avoided the
serialization in the past as it was too expensive.

Under the hood, the semaphore-gating works by placing a 2-byte
semaphore in the '.probes' ELF section. The address of the semaphore
is contained in the ELF note providing the tracepoint information
(`readelf -n ./src/bitcoind | grep NT_STAPSDT`). Tracing toolkits
like bpftrace, bcc, and libbpf increase the semaphore at the address
upon attaching to the tracepoint. We only prepare the arguments and
reach the tracepoint if the semaphore is greater than zero. The
semaphore is decreased when detaching from the tracepoint.

This also extends the "Adding a new tracepoint" documentation to
include information about the semaphores and updated step-by-step
instructions on how to add a new tracepoint.
BCC needs the PID of a bitcoind process to attach to the tracepoints
(instead of the binary path used before) when the tracepoints have a
semaphore.

For reference, we already use the PID in our tracepoint interface
tests. See 220a5a2.
@0xB10C 0xB10C force-pushed the 2022-11-only-prepare-tracepoint-arguments-when-tracing branch from 2a2c04e to 38e24df Compare April 9, 2024 21:32
@DrahtBot DrahtBot removed the CI failed label Apr 9, 2024
@0xB10C
Copy link
Contributor Author

0xB10C commented Apr 10, 2024

As mentioned in #26593 (comment), with C++20, we can use __VA_OPT__ and drop the changes from #26593 (comment). Done that and rebased.

Copy link
Member

@willcl-ark willcl-ark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 38e24df

I think the macro simplification included in this change is a nice win for simplicity.
Ran all the updated examples in contrib/tracing, and played around with adding/removing various tracepoint arguments to check everything works as expected.

Left one q and one suggestion inline, but neither a blocker for this.

@@ -176,8 +177,8 @@ def handle_outbound(_, data, size):


if __name__ == "__main__":
if len(sys.argv) < 2:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In: 38e24df
If you re-touch these, it might be nice (user-friendly) to add a simple check that we are running as root. Currently without root you see:

Hooking into bitcoind with pid 4039799
Traceback (most recent call last):
  File "/home/will/src/bitcoin/contrib/tracing/p2p_monitor.py", line 258, in <module>
    main(pid)
  File "/home/will/src/bitcoin/contrib/tracing/p2p_monitor.py", line 124, in main
    bitcoind_with_usdts.enable_probe(
  File "/usr/lib/python3/dist-packages/bcc-0.30.0+679166bd-py3.11.egg/bcc/usdt.py", line 161, in enable_probe
bcc.usdt.USDTException: Failed to enable USDT probe 'inbound_message':
the specified pid might not contain the given language's runtime,
or the runtime was not built with the required USDT probes. Look
for a configure flag similar to --with-dtrace or --enable-dtrace.
To check which probes are present in the process, use the tplist tool.

Could improve easily with a root check, e.g.:

diff --git a/contrib/tracing/p2p_monitor.py b/contrib/tracing/p2p_monitor.py
index 6d8ae43c71..d107c5f994 100755
--- a/contrib/tracing/p2p_monitor.py
+++ b/contrib/tracing/p2p_monitor.py
@@ -244,8 +244,13 @@ def render(screen, peers, cur_list_pos, scroll, ROWS_AVALIABLE_FOR_LIST, info_pa
                         i + 3, 1, " %s (%d byte) --->" %
                         (msg.msg_type, msg.size), curses.A_NORMAL)
 
+import os
+def running_as_root() -> bool:
+    return os.getuid() == 0
 
 if __name__ == "__main__":
+    if not running_as_root():
+        print("Detected that we are not running as root, this is likely not what you want")
     if len(sys.argv) != 2:
         print("USAGE:", sys.argv[0], "<pid of bitcoind>")
         exit()

There's of course a counter-argument that anyone hooking into tracepoints will likely understand they need root access to the kernel 😃

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that adding this makes sense as the bcc error message isn't clear. If I retouch, I'll add it.

doc/tracing.md Show resolved Hide resolved
@DrahtBot DrahtBot requested a review from laanwj April 29, 2024 12:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Seeking Concept ACK
Development

Successfully merging this pull request may close these issues.

None yet