tracing: only prepare tracepoint args if attached

Before this commit, we would always prepare tracepoint arguments regardless of the tracepoint being used or not. While we already made sure not to include expensive arguments in our tracepoints, this commit introduces gating to make sure the arguments are only prepared if the tracepoints are actually used. This is a win-win improvement to our tracing framework. For users not interested in tracing, the overhead is reduced to a cheap 'greater than 0' compare. As the semaphore-gating technique used here is available in bpftrace, bcc, and libbpf, users interested in tracing don't have to change their tracing scripts while profiting from potential future tracepoints passing slightly more expensive arguments. In the context of bitcoin#26531's mempool tracepoints, one example could be passing serialized transactions for RBF replacements. Under the hood, the semaphore-gating works by placing a 2-byte semaphore in the '.probes' ELF section. The address of the semaphore is contained in the ELF note providing the tracepoint information (`readelf -n ./src/bitcoind | grep NT_STAPSDT`). Tracing toolkits like bpftrace, bcc, and libbpf increase the semaphore at the address upon attaching to the tracepoint. We only prepare the arguments and reach the tracepoint if the semaphore is greater than zero. The semaphore is decreased when detaching from the tracepoint. The implementation of the TRACEPOINT_ACTIVE macro is flexible enough that it can be adapted for macOS dtrace tracing (bitcoin#25541). With dtrace, [CONTEXT]_[EVENT]_ENABLED() can be used to gate the tracepoints. While the TRACEPOINT0(context, event) macro without arguments doesn't need a check if it's active or not (no arguments can be prepared and passed), it does require a TRACEPOINT_SEMAPHORE() due to _SDT_HAS_SEMAPHORES requiring all tracepoints to have a semaphore. As the check is cheap, a check is added to not cause problems with the semaphore being unused. This also extends the "Adding a new tracepoint" documentation to include information about the semaphores and updated step-by-step instructions on how to add a new tracepoint.
kouloumos · Apr 5, 2023 · d42aa93 · d42aa93
1 parent 383eb5d
commit d42aa93
Show file tree

Hide file tree

Showing 8 changed files with 147 additions and 29 deletions.
diff --git a/doc/tracing.md b/doc/tracing.md
@@ -265,33 +265,103 @@ Arguments passed:
 
 ## Adding tracepoints to Bitcoin Core
 
-To add a new tracepoint, `#include <util/trace.h>` in the compilation unit where
-the tracepoint is inserted and use either the `TRACEPOINT(context, event)` macro
-without arguments or the `TRACEPOINT(context, event, args...)` macro with up to
-12 arguments. The `context` and `event` specify the names by which the
-tracepoint is referred to. Please use `snake_case` and try to make sure that the
-tracepoint names make sense even without detailed knowledge of implementation
-details. Do not forget to update the tracepoint list in this document.
+All tracepoint-related macros are defined in `src/util/trace.h`. To use these
+macros, `#include <util/trace.h>` in the compilation unit where the tracepoint
+is inserted. A tracepoint has a `context` and `event`, which specify the name by
+which a tracepoint is referred to. Make sure the tracepoint names make sense
+even without detailed knowledge of implementation details. Please use
+`snake_case` for the `context` and `event`.
+
+The `trace.h` header file contains the macro definitions
+`TRACEPOINT0(context, event)` for a tracepoint without arguments and
+`TRACEPOINT(context, event, ...)` for a tracepoint with up to twelve arguments
+(`...` is a variadic). These arguments can pass data to tracing scripts and are
+typically boolean, integer, or pointer to bytes or C-style strings. Often, the
+arguments need to be prepared by, for example, converting a C++ `std::string`
+to a C-style string with `c_str()`, calling getter functions, or doing some
+other light calculations. To avoid computational overhead by perparing
+tracepoint arguments for users not using the tracepoints, the arguments should
+only be prepared if there is something attached to the tracepoint.
+
+On Linux, this is solved with a counting semaphore for each tracepoint. When a
+tracing toolkit like bpftrace, bcc, or libbpf attaches to a tracepoint, the
+respective semaphore is increased. It's decreased when the tracing toolkit
+detaches from the tracepoint. By checking if the semaphore is greater than
+zero, we can cheaply gate the preparation of the tracepoint. The macro
+`TRACEPOINT_SEMAPHORE(context, event)` produces such a semaphore as a global
+variable. It must be placed in the same file as the tracepoint macro. The
+`TRACEPOINT(context, event, ...)` macro already includes the check if the
+tracepoint is being used.
 
-For example, the `net:inbound_messsage` tracepoint with 6 arguments:
+```C++
+// The `net:outbound_message` tracepoint with 6 arguments in `src/net.cpp`.
+
+TRACEPOINT_SEMAPHORE(net, outbound_message);
+…
+void CConnman::PushMessage(…) {
+  …
+  TRACEPOINT(net, outbound_message,
+      pnode->GetId(),
+      pnode->m_addr_name.c_str(),
+      pnode->ConnectionTypeAsString().c_str(),
+      sanitizedType.c_str(),
+      msg.data.size(),
+      msg.data.data()
+  );
+  …
+}
+```
+
+If needed, an extra `if(TRACEPOINT_ACTIVE(context, event) {..}` check can be
+used to prepare arguments right before the tracepoint.
 
 ```C++
-TRACEPOINT(net, inbound_message,
-    pnode->GetId(),
-    pnode->m_addr_name.c_str(),
-    pnode->ConnectionTypeAsString().c_str(),
-    sanitizedType.c_str(),
-    msg.data.size(),
-    msg.data.data()
-);
+// A fictitious `multiline:argument_preparation` tracepoint with a single argument.
+
+TRACEPOINT_SEMAPHORE(multiline, argument_preparation);
+…
+if(TRACEPOINT_ACTIVE(multiline, argument_preparation) {
+  argument = slightly_expensive_calulation();
+  TRACEPOINT(multiline, argument_preparation, argument);
+}
 ```
 
+```C++
+// The `test:zero_args` tracepoint without arguments in `src/test/util_trace_tests.cpp`.
+TRACEPOINT_SEMAPHORE(test, zero_args);
+…
+TRACEPOINT0(test, zero_args);
+```
+
+
+Step-by-step orientation for adding a new tracepoint:
+
+1. Familiarize yourself with the tracepoint macros mentioned above and the
+   tracepoint guidelines and best practices below.
+2. Do you need a tracepoint? Tracepoints are a machine-to-machine
+   interface. Would a machine-to-human interface like logging work too? Is this
+   something others could need too?
+3. Think about where to place the new tracepoint and which arguments you want to
+   pass. Where are the arguments already available? How expensive are the
+   arguments to compute?
+4. Pick descriptive names for the `context` and `event`. Check the list of
+   existing tracepoints too. Your tracepoint might fit into an existing context.
+5. Place the tracepoint, `#include <util/trace.h>`, and place a
+   `TRACEPOINT_SEMAPHORE(context, event)`.
+6. Document the tracepoint with its arguments and what event it traces in the
+   list in this document.
+7. Add a functional test for the tracepoint in
+   `test/functional/interface_usdt_{context}.py`.
+8. Add a tracepoint usage example in `contrib/tracing/`.
+
 ### Guidelines and best practices
 
 #### Clear motivation and use case
 Tracepoints need a clear motivation and use case. The motivation should
 outweigh the impact on, for example, code readability. There is no point in
-adding tracepoints that don't end up being used.
+adding tracepoints that don't end up being used. Additionally, adding many of
+lines of code just for tracepoint argument preparation probably comes with a high
+cost on code readability. During review, such a change might be rejected.
 
 #### Provide an example
 When adding a new tracepoint, provide an example. Examples can show the use case
@@ -301,11 +371,10 @@ the tracepoint. See existing examples in [contrib/tracing/].
 
 [contrib/tracing/]: ../contrib/tracing/
 
-#### No expensive computations for tracepoints
-Data passed to the tracepoint should be inexpensive to compute. Although the
-tracepoint itself only has overhead when enabled, the code to compute arguments
-is always run - even if the tracepoint is not used. For example, avoid
-serialization and parsing.
+#### Limit expensive computations for tracepoints
+While the tracepoint arguments are only prepared when we attach something to the
+tracepoint, an argument preparation should never hang the process. Hashing and
+serialization of data structures is probably fine, a `sleep(10s)` not.
 
 #### Semi-stable API
 Tracepoints should have a semi-stable API. Users should be able to rely on the
@@ -370,7 +439,7 @@ Displaying notes found in: .note.stapsdt
   stapsdt              0x0000005d	NT_STAPSDT (SystemTap probe descriptors)
     Provider: net
     Name: outbound_message
-    Location: 0x0000000000107c05, Base: 0x0000000000579c90, Semaphore: 0x0000000000000000
+    Location: 0x0000000000107c05, Base: 0x0000000000579c90, Semaphore: 0x0000000000a69780
     Arguments: -8@%r12 8@%rbx 8@%rdi 8@192(%rsp) 8@%rax 8@%rdx
 …
 ```
@@ -389,7 +458,7 @@ between distributions. For example, on
 
 ```
 $ tplist -l ./src/bitcoind -v
-b'net':b'outbound_message' [sema 0x0]
+b'net':b'outbound_message' [sema 0xa69780]
   1 location(s)
   6 argument(s)
 …

diff --git a/src/coins.cpp b/src/coins.cpp
@@ -10,6 +10,10 @@
 #include <util/trace.h>
 #include <version.h>
 
+TRACEPOINT_SEMAPHORE(utxocache, add);
+TRACEPOINT_SEMAPHORE(utxocache, spent);
+TRACEPOINT_SEMAPHORE(utxocache, uncache);
+
 bool CCoinsView::GetCoin(const COutPoint &outpoint, Coin &coin) const { return false; }
 uint256 CCoinsView::GetBestBlock() const { return uint256(); }
 std::vector<uint256> CCoinsView::GetHeadBlocks() const { return std::vector<uint256>(); }

diff --git a/src/net.cpp b/src/net.cpp
@@ -110,6 +110,9 @@ const std::string NET_MESSAGE_TYPE_OTHER = "*other*";
 static const uint64_t RANDOMIZER_ID_NETGROUP = 0x6c0edd8036ef4036ULL; // SHA256("netgroup")[0:8]
 static const uint64_t RANDOMIZER_ID_LOCALHOSTNONCE = 0xd93e69e2bbfa5735ULL; // SHA256("localhostnonce")[0:8]
 static const uint64_t RANDOMIZER_ID_ADDRCACHE = 0x1cf2e4ddd306dda9ULL; // SHA256("addrcache")[0:8]
+
+TRACEPOINT_SEMAPHORE(net, outbound_message);
+
 //
 // Global state variables
 //

diff --git a/src/net_processing.cpp b/src/net_processing.cpp
@@ -185,6 +185,8 @@ static constexpr size_t MAX_ADDR_PROCESSING_TOKEN_BUCKET{MAX_ADDR_TO_SEND};
 /** The compactblocks version we support. See BIP 152. */
 static constexpr uint64_t CMPCTBLOCKS_VERSION{2};
 
+TRACEPOINT_SEMAPHORE(net, inbound_message);
+
 // Internal stuff
 namespace {
 /** Blocks that are in flight, and that are in the queue to be downloaded. */

diff --git a/src/test/util_trace_tests.cpp b/src/test/util_trace_tests.cpp
@@ -12,6 +12,7 @@ TRACEPOINT_SEMAPHORE(test, zero_args);
 TRACEPOINT_SEMAPHORE(test, one_arg);
 TRACEPOINT_SEMAPHORE(test, six_args);
 TRACEPOINT_SEMAPHORE(test, twelve_args);
+TRACEPOINT_SEMAPHORE(test, check_if_attached);
 
 BOOST_FIXTURE_TEST_SUITE(util_trace_tests, BasicTestingSetup)
 
@@ -29,4 +30,15 @@ BOOST_AUTO_TEST_CASE(test_tracepoint_n_args)
   BOOST_CHECK(true);
 }
 
+BOOST_AUTO_TEST_CASE(test_tracepoint_check_if_attached)
+{
+  // TRACEPOINT_ACTIVE should only return true when were attaching to this
+  // tracepoint, which we don't do here.
+  if (TRACEPOINT_ACTIVE(test, check_if_attached)) {
+    TRACEPOINT(test, check_if_attached, 0);
+    BOOST_CHECK(false);
+  }
+  BOOST_CHECK(true);
+}
+
 BOOST_AUTO_TEST_SUITE_END()
diff --git a/src/util/trace.h b/src/util/trace.h
@@ -11,15 +11,36 @@
 // the optional variadic macros to define tracepoints.
 #define SDT_USE_VARIADIC 1
 
+// Setting _SDT_HAS_SEMAPHORES let's systemtap (sys/sdt.h) know that we want to
+// use the optional semaphore feature for our tracepoints. This feature allows
+// us to check if something is attached to a tracepoint. We only want to prepare
+// some potentially expensive tracepoint arguments, if the tracepoint is being
+// used. Here, an expensive argument preparation could, for example, be
+// calculating a hash or serialization of a data structure.
+#define _SDT_HAS_SEMAPHORES 1
+
 #include <sys/sdt.h>
 
+// Used to define a counting semaphore for a tracepoint. This semaphore is
+// automatically incremented by tracing frameworks (bpftrace, bcc, libbpf, ...)
+// upon attaching to the tracepoint and decremented when detaching. This needs
+// to be a global variable. It's placed in the '.probes' ELF section.
+#define TRACEPOINT_SEMAPHORE(context, event) \
+    unsigned short context##_##event##_semaphore __attribute__((section (".probes")))
+
+// Returns true if something is attached to the tracepoint.
+#define TRACEPOINT_ACTIVE(context, event) context##_##event##_semaphore > 0
+
 // A USDT tracepoint with no arguments.
 #define TRACEPOINT0(context, event) \
-    STAP_PROBE(context, event)
+    if (TRACEPOINT_ACTIVE(context, event)) \
+        STAP_PROBE(context, event)
 
-// A USDT tracepoint with one to twelve arguments.
+// A USDT tracepoint with one to twelve arguments. It's checked that the
+// tracepoint is active before preparing its arguments.
 #define TRACEPOINT(context, event, ...) \
-    STAP_PROBEV(context, event, __VA_ARGS__)
+    if (TRACEPOINT_ACTIVE(context, event)) \
+        STAP_PROBEV(context, event, __VA_ARGS__)
 
 #else
 
@@ -28,7 +49,6 @@
 #define TRACEPOINT0(context, event)
 #define TRACEPOINT(context, event, ...)
 
-#endif
-
+#endif // ENABLE_TRACING
 
 #endif // BITCOIN_UTIL_TRACE_H
diff --git a/src/validation.cpp b/src/validation.cpp
@@ -110,6 +110,9 @@ GlobalMutex g_best_block_mutex;
 std::condition_variable g_best_block_cv;
 uint256 g_best_block;
 
+TRACEPOINT_SEMAPHORE(validation, block_connected);
+TRACEPOINT_SEMAPHORE(utxocache, flush);
+
 const CBlockIndex* Chainstate::FindForkInGlobalIndex(const CBlockLocator& locator) const
 {
     AssertLockHeld(cs_main);

diff --git a/src/wallet/spend.cpp b/src/wallet/spend.cpp
@@ -27,6 +27,11 @@
 
 using interfaces::FoundBlock;
 
+TRACEPOINT_SEMAPHORE(coin_selection, selected_coins);
+TRACEPOINT_SEMAPHORE(coin_selection, normal_create_tx_internal);
+TRACEPOINT_SEMAPHORE(coin_selection, attempting_aps_create_tx);
+TRACEPOINT_SEMAPHORE(coin_selection, aps_create_tx_internal);
+
 namespace wallet {
 static constexpr size_t OUTPUT_GROUP_MAX_ENTRIES{100};