New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mempool: Add mempool tracepoints #26531
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
Concept ACK |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool! concept ACK
Arguments passed: | ||
1. Transaction ID (hash) as `pointer to unsigned chars` (i.e. 32 bytes in little-endian) | ||
2. Transaction virtual size as `uint64` | ||
3. Transaction fee as `int64` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to convey information about the peer (even if just ID) as well? One of the things that I think is potentially valuable e.g. in bmon is tracking which peers fed you which transactions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense.
However, peer information is no longer available in functions that handle low-level mempool operations, so I propose adding a separate mempool:accepted
tracepoint for this in PeerManagerImpl::ProcessMessage
here (which is right where the debug log entries for transactions accepted into the memory pool are created).
The rationale for adding a separate mempool:accepted
tracepoint instead of moving the existing mempool:added
one from CTxMemPool::addUnchecked
is to ensure consistency of added and removed transactions.
Details: The tracepoint call in ProcessMessage
would depend on the success of ProcessTransaction
here. There's an edge case where a transaction added to the mempool in MemPoolAccept::Finalize
here is immediately removed again seven lines below during the call to LimitMempoolSize
, in which case MemPoolAccept::Finalize
will return state.Invalid(TxValidationResult::TX_MEMPOOL_POLICY, "mempool full")
. As expected, in this case ProcessMessage
will not call the tracepoint (or create a debug log entry). LimitMempoolSize
, however, will at some point call CTxMemPool::removeUnchecked
, which calls the mempool:removed
tracepoint, resulting in tracepoint data containing a "transaction removed" event without a preceding "transaction added" event.
Please note that this problem does not arise due to poor placement of the mempool:removed
tracepoint here. The tracepoint is positioned right after GetMainSignals().TransactionRemovedFromMempool
, which leads me to believe that the aforementioned consistency issue currently affects zeromq (cf. here) and the debug log.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that adding transactions we receive over the network is only one option out of several to add something to our mempool. Can also add transactions via RPC, loadMempool()
, wallet, etc..
I agree that it could be useful to have information about peers that send us transactions we end up accepting. Could call a tracepoint something like net:tx_accepted
. Not sure if that's in-scope for this PR though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't this be done by using both net:inbound_message
and mempool:added
at the same time? That is, correlate the txids from both tracepoints to deduce which peer sent the tx.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that works too. Needs transaction decoding in the tracing script though, but possible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK, thanks for working on this!
80a9dc3
to
8b6ac48
Compare
Concept ACK |
Before this commit, we would prepare/calculate tracepoint arguments regadless if the tracepoint was being used or not. While we made sure to not include too expensive-to-calculate arguments in our tracepoints, this commit introduces gating to make sure the arguments are only prepared if the tracepoints are actually used. This is a win-win improvment to our tracing framework. Users not interested in tracing now have their overhead removed to a cheap 'greater than 0' compare. As the semaphore-gating technique used here is availabe in bpftrace, bcc, and libbpf, users interested in tracing don't have to change their tracing scripts while profiting from potential future tracepoints passing sligly more expenive-to-calculate data. One example, in the context of [bitcoin#26531]'s mempool tracepoints could be passing serialized transactions for RBF replacements. TODO: This works by.. [bitcoin#26531]: bitcoin#26531
Before this commit, we would prepare/calculate tracepoint arguments regadless if the tracepoint was being used or not. While we made sure to not include too expensive-to-calculate arguments in our tracepoints, this commit introduces gating to make sure the arguments are only prepared if the tracepoints are actually used. This is a win-win improvment to our tracing framework. Users not interested in tracing now have their overhead removed to a cheap 'greater than 0' compare. As the semaphore-gating technique used here is availabe in bpftrace, bcc, and libbpf, users interested in tracing don't have to change their tracing scripts while profiting from potential future tracepoints passing sligly more expenive-to-calculate data. One example, in the context of [bitcoin#26531]'s mempool tracepoints could be passing serialized transactions for RBF replacements. TODO: This works by.. [bitcoin#26531]: bitcoin#26531
Before this commit, we would prepare/calculate tracepoint arguments regadless if the tracepoint was being used or not. While we made sure to not include too expensive-to-calculate arguments in our tracepoints, this commit introduces gating to make sure the arguments are only prepared if the tracepoints are actually used. This is a win-win improvment to our tracing framework. Users not interested in tracing now have their overhead removed to a cheap 'greater than 0' compare. As the semaphore-gating technique used here is availabe in bpftrace, bcc, and libbpf, users interested in tracing don't have to change their tracing scripts while profiting from potential future tracepoints passing sligly more expenive-to-calculate data. One example, in the context of [bitcoin#26531]'s mempool tracepoints could be passing serialized transactions for RBF replacements. TODO: This works by.. [bitcoin#26531]: bitcoin#26531
Before this commit, we would prepare/calculate tracepoint arguments regadless if the tracepoint was being used or not. While we made sure to not include too expensive-to-calculate arguments in our tracepoints, this commit introduces gating to make sure the arguments are only prepared if the tracepoints are actually used. This is a win-win improvment to our tracing framework. Users not interested in tracing now have their overhead removed to a cheap 'greater than 0' compare. As the semaphore-gating technique used here is availabe in bpftrace, bcc, and libbpf, users interested in tracing don't have to change their tracing scripts while profiting from potential future tracepoints passing sligly more expenive-to-calculate data. One example, in the context of [bitcoin#26531]'s mempool tracepoints could be passing serialized transactions for RBF replacements. TODO: This works by.. [bitcoin#26531]: bitcoin#26531
Before this commit, we would always prepare tracepoint arguments regardless of the tracepoint being used or not. While we already made sure not to include expensive arguments in our tracepoints, this commit introduces gating to make sure the arguments are only prepared if the tracepoints are actually used. This is a win-win improvement to our tracing framework. For users not interested in tracing, the overhead is reduced to a cheap 'greater than 0' compare. As the semaphore-gating technique used here is available in bpftrace, bcc, and libbpf, users interested in tracing don't have to change their tracing scripts while profiting from potential future tracepoints passing slightly more expensive arguments. In the context of bitcoin#26531's mempool tracepoints, one example could be passing serialized transactions for RBF replacements. Under the hood, the semaphore-gating works by placing a 2-byte semaphore in the '.probes' ELF section. The address of the semaphore is contained in the ELF note providing the tracepoint information (`readelf -n ./src/bitcoind | grep NT_STAPSDT`). Tracing toolkits like bpftrace, bcc, and libbpf increase the semaphore at the address upon attaching to the tracepoint. We only prepare the arguments and reach the tracepoint if the semaphore is greater than zero. The semaphore is decreased when detaching from the tracepoint. The implementation of the TRACEPOINT_ACTIVE macro is flexible enough that it can be adapted for macOS dtrace tracing (bitcoin#25541). With dtrace, [CONTEXT]_[EVENT]_ENABLED() can be used to gate the tracepoints. This also extends the "Adding a new tracepoint" documentation to include information about the semaphores and updated step-by-step instructions on how to add a new tracepoint.
8b6ac48
to
3b6d3f0
Compare
Before this commit, we would always prepare tracepoint arguments regardless of the tracepoint being used or not. While we already made sure not to include expensive arguments in our tracepoints, this commit introduces gating to make sure the arguments are only prepared if the tracepoints are actually used. This is a win-win improvement to our tracing framework. For users not interested in tracing, the overhead is reduced to a cheap 'greater than 0' compare. As the semaphore-gating technique used here is available in bpftrace, bcc, and libbpf, users interested in tracing don't have to change their tracing scripts while profiting from potential future tracepoints passing slightly more expensive arguments. In the context of bitcoin#26531's mempool tracepoints, one example could be passing serialized transactions for RBF replacements. Under the hood, the semaphore-gating works by placing a 2-byte semaphore in the '.probes' ELF section. The address of the semaphore is contained in the ELF note providing the tracepoint information (`readelf -n ./src/bitcoind | grep NT_STAPSDT`). Tracing toolkits like bpftrace, bcc, and libbpf increase the semaphore at the address upon attaching to the tracepoint. We only prepare the arguments and reach the tracepoint if the semaphore is greater than zero. The semaphore is decreased when detaching from the tracepoint. The implementation of the TRACEPOINT_ACTIVE macro is flexible enough that it can be adapted for macOS dtrace tracing (bitcoin#25541). With dtrace, [CONTEXT]_[EVENT]_ENABLED() can be used to gate the tracepoints. This also extends the "Adding a new tracepoint" documentation to include information about the semaphores and updated step-by-step instructions on how to add a new tracepoint.
Before this commit, we would always prepare tracepoint arguments regardless of the tracepoint being used or not. While we already made sure not to include expensive arguments in our tracepoints, this commit introduces gating to make sure the arguments are only prepared if the tracepoints are actually used. This is a win-win improvement to our tracing framework. For users not interested in tracing, the overhead is reduced to a cheap 'greater than 0' compare. As the semaphore-gating technique used here is available in bpftrace, bcc, and libbpf, users interested in tracing don't have to change their tracing scripts while profiting from potential future tracepoints passing slightly more expensive arguments. In the context of bitcoin#26531's mempool tracepoints, one example could be passing serialized transactions for RBF replacements. Under the hood, the semaphore-gating works by placing a 2-byte semaphore in the '.probes' ELF section. The address of the semaphore is contained in the ELF note providing the tracepoint information (`readelf -n ./src/bitcoind | grep NT_STAPSDT`). Tracing toolkits like bpftrace, bcc, and libbpf increase the semaphore at the address upon attaching to the tracepoint. We only prepare the arguments and reach the tracepoint if the semaphore is greater than zero. The semaphore is decreased when detaching from the tracepoint. The implementation of the TRACEPOINT_ACTIVE macro is flexible enough that it can be adapted for macOS dtrace tracing (bitcoin#25541). With dtrace, [CONTEXT]_[EVENT]_ENABLED() can be used to gate the tracepoints. This also extends the "Adding a new tracepoint" documentation to include information about the semaphores and updated step-by-step instructions on how to add a new tracepoint.
Before this commit, we would always prepare tracepoint arguments regardless of the tracepoint being used or not. While we already made sure not to include expensive arguments in our tracepoints, this commit introduces gating to make sure the arguments are only prepared if the tracepoints are actually used. This is a win-win improvement to our tracing framework. For users not interested in tracing, the overhead is reduced to a cheap 'greater than 0' compare. As the semaphore-gating technique used here is available in bpftrace, bcc, and libbpf, users interested in tracing don't have to change their tracing scripts while profiting from potential future tracepoints passing slightly more expensive arguments. In the context of bitcoin#26531's mempool tracepoints, one example could be passing serialized transactions for RBF replacements. Under the hood, the semaphore-gating works by placing a 2-byte semaphore in the '.probes' ELF section. The address of the semaphore is contained in the ELF note providing the tracepoint information (`readelf -n ./src/bitcoind | grep NT_STAPSDT`). Tracing toolkits like bpftrace, bcc, and libbpf increase the semaphore at the address upon attaching to the tracepoint. We only prepare the arguments and reach the tracepoint if the semaphore is greater than zero. The semaphore is decreased when detaching from the tracepoint. The implementation of the TRACEPOINT_ACTIVE macro is flexible enough that it can be adapted for macOS dtrace tracing (bitcoin#25541). With dtrace, [CONTEXT]_[EVENT]_ENABLED() can be used to gate the tracepoints. While the TRACEPOINT0(context, event) macro without arguments doesn't need a check if it's active or not (no arguments can be prepared and passed), it does require a TRACEPOINT_SEMAPHORE() due to _SDT_HAS_SEMAPHORES requiring all tracepoints to have a semaphore. As the check is cheap, a check is added to not cause problems with the semaphore being unused. This also extends the "Adding a new tracepoint" documentation to include information about the semaphores and updated step-by-step instructions on how to add a new tracepoint.
Before this commit, we would always prepare tracepoint arguments regardless of the tracepoint being used or not. While we already made sure not to include expensive arguments in our tracepoints, this commit introduces gating to make sure the arguments are only prepared if the tracepoints are actually used. This is a win-win improvement to our tracing framework. For users not interested in tracing, the overhead is reduced to a cheap 'greater than 0' compare. As the semaphore-gating technique used here is available in bpftrace, bcc, and libbpf, users interested in tracing don't have to change their tracing scripts while profiting from potential future tracepoints passing slightly more expensive arguments. In the context of bitcoin#26531's mempool tracepoints, one example could be passing serialized transactions for RBF replacements. Under the hood, the semaphore-gating works by placing a 2-byte semaphore in the '.probes' ELF section. The address of the semaphore is contained in the ELF note providing the tracepoint information (`readelf -n ./src/bitcoind | grep NT_STAPSDT`). Tracing toolkits like bpftrace, bcc, and libbpf increase the semaphore at the address upon attaching to the tracepoint. We only prepare the arguments and reach the tracepoint if the semaphore is greater than zero. The semaphore is decreased when detaching from the tracepoint. The implementation of the TRACEPOINT_ACTIVE macro is flexible enough that it can be adapted for macOS dtrace tracing (bitcoin#25541). With dtrace, [CONTEXT]_[EVENT]_ENABLED() can be used to gate the tracepoints. While the TRACEPOINT0(context, event) macro without arguments doesn't need a check if it's active or not (no arguments can be prepared and passed), it does require a TRACEPOINT_SEMAPHORE() due to _SDT_HAS_SEMAPHORES requiring all tracepoints to have a semaphore. As the check is cheap, a check is added to not cause problems with the semaphore being unused. This also extends the "Adding a new tracepoint" documentation to include information about the semaphores and updated step-by-step instructions on how to add a new tracepoint.
doc/tracing.md
Outdated
3. Replacement transaction fee as `int64` | ||
4. Replaced transaction ID (hash) as `pointer to unsigned chars` (i.e. 32 bytes in little-endian) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I'd find it more intuitive to first pass the replaced transaction and then the replacement.
With #26593: tracing: Only prepare tracepoint arguments when actually tracing, we could also pass the serialized transactions as arguments for the mempool tracepoints. I think this makes these mempool tracepoints even more interesting as we otherwise only have the txids of removed/rejected/replaced transactions and wouldn't need to pair them with e.g. ZMQ I'm currently working on monitoring for fullrbf replacements as seen by my node with the above branch (if the replaced transaction didn't signal optin-RBF, it should be a fullrbf replacement). Edit: It's live on https://fullrbf.mempool.observer. (also sorry for the "0xB10C added a commit to 0xB10C/bitcoin that referenced this pull request" spam. Forgot that this would appear here for every force push...) |
Super cool stuff! One note should removal reason 'BLOCK' be logged? It's the one exclusion from even ZMQ because of the resulting data dump. With eBPF it's the most difficult to log because it will overflow the eBPF ring buffer every time. There is a clever solution to use the eBPF filesystem to store data that would otherwise overflow the ring buffer. ref. #24935 |
Block removal was deliberately included because it makes data collection much easier (inferring removal times from block arrival time involves a certain degree of processing; also, timestamps could be wrong if there was a block reorg and the original block is lost) and the overhead of calling the tracepoint pales in comparison to the rest of the code. But that's not set in stone. Check the |
contrib/tracing/README.md
Outdated
│ 17:09:56Z event=added, txid=0a2f2ef84d77a13162f26bb7a544efbab7f09d0b8a965b6f00e85bbd64cd8152, vsize=183, fee=4968 │ | ||
│ 17:09:56Z event=removed, txid=0356df4450c76d2474eab13fab15bfab843e31617edf9ce88d5d5bea5b19bf3e, reason=replaced, vsi│ | ||
│ ze=110, fee=16655 │ | ||
│ 17:09:56Z event=removed, txid=653db4f9af156626a7359c18f9cb095009583497ef606d7a9d2f093c63d392da, reason=replaced, vsi│ | ||
│ ze=110, fee=4037 │ | ||
│ 17:09:56Z event=rejected, txid=1cd12698577446904e8654372421e1ab5b82bdb8faee3f11f34ed758c17ae85a, reason=bad-txns-inp│ | ||
│ uts-missingorspent, peer_id=2568, peer_addr=158.69.116.169:8333 │ | ||
│ 17:09:56Z event=replaced, replacement_txid=db3c91d7dfff65ea5b85ca58bc3e9a26556eb6506ccb04a573f920d297060218, replace│ | ||
│ ment_vsize=110, replacement_fee=23343, replaced_txid=0356df4450c76d2474eab13fab15bfab843e31617edf9ce88d5d5bea5b19bf3│ | ||
│ e, replaced_vsize=110, replaced_fee=16655 │ | ||
│ 17:09:56Z event=replaced, replacement_txid=9b304b9357df5e2cfcbbbead7ffa67c0efc3c599aac9093362b398478d46a587, replace│ | ||
│ ment_vsize=110, replacement_fee=4268, replaced_txid=653db4f9af156626a7359c18f9cb095009583497ef606d7a9d2f093c63d392da│ | ||
│ , replaced_vsize=110, replaced_fee=4037 │ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Played around with this a bit. Super cool!
nit: I think these could be a bit more human readable instead of just printing the raw values. Examples:
- added 0356df4450c76d2474eab13fab15bfab843e31617edf9ce88d5d5bea5b19bf3e with 27.14 sat/vbyte (4968 sat in fees at 183 vbyte)
- replaced 0356df4450c76d2474eab13fab15bfab843e31617edf9ce88d5d5bea5b19bf3 (110 vbyte, 16655 sat, 151.40 sat/vbyte) with db3c91d7dfff65ea5b85ca58bc3e9a26556eb6506ccb04a573f920d297060218 (110 vbyte, 23343 sat, 212.20 sat/vbyte)
- rejected 1cd12698577446904e8654372421e1ab5b82bdb8faee3f11f34ed758c17ae85a from peer 2568 (158.69.116.169:8333): bad-txns-inputs-missingorspent
doc/tracing.md
Outdated
Arguments passed: | ||
1. Transaction ID (hash) as `pointer to unsigned chars` (i.e. 32 bytes in little-endian) | ||
2. Removal reason as `pointer to C-style String` (max. length 9 characters) | ||
3. Transaction virtual size as `uint64` | ||
4. Transaction fee as `int64` | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably makes sense to also pass the time the transaction was initially added to the mempool here. CTxMempoolEntry
has a GetTime()
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would probably make sense to include the time for all of the tracepoints added in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessarily. Might be redundant. When a transaction is added and you pass the tracepoint arguments directly to a tracing script (can't think of a reason why you would want to store it in a BPF map first for a while), then the time is very close to now()
. Similar for rejected transactions and for the replacement transaction during RBF (the time when the replaced transaction entered is passed).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In reply to the original comment: oversight on my part; will push a fix.
In reply to adding event timestamps for all tracepoints: in omitting the actual event timestamps as tracepoint parameters, I followed the precedent set by other tracepoints, where, as laid out by @0xB10C, event timestamps are generated inside tracing scripts.
doc/tracing.md
Outdated
3. Replacement transaction fee as `int64` | ||
4. Replaced transaction ID (hash) as `pointer to unsigned chars` (i.e. 32 bytes in little-endian) | ||
5. Replaced transaction virtual size as `uint64` | ||
6. Replaced transaction fee as `int64` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably makes sense to also pass the time the replaced
transaction was initially added to the mempool here. CTxMempoolEntry
has a GetTime()
method. See also 0xB10C/mempool-observer-fullrbf-ui#1 (comment)
Before this commit, we would always prepare tracepoint arguments regardless of the tracepoint being used or not. While we already made sure not to include expensive arguments in our tracepoints, this commit introduces gating to make sure the arguments are only prepared if the tracepoints are actually used. This is a win-win improvement to our tracing framework. For users not interested in tracing, the overhead is reduced to a cheap 'greater than 0' compare. As the semaphore-gating technique used here is available in bpftrace, bcc, and libbpf, users interested in tracing don't have to change their tracing scripts while profiting from potential future tracepoints passing slightly more expensive arguments. In the context of bitcoin#26531's mempool tracepoints, one example could be passing serialized transactions for RBF replacements. Under the hood, the semaphore-gating works by placing a 2-byte semaphore in the '.probes' ELF section. The address of the semaphore is contained in the ELF note providing the tracepoint information (`readelf -n ./src/bitcoind | grep NT_STAPSDT`). Tracing toolkits like bpftrace, bcc, and libbpf increase the semaphore at the address upon attaching to the tracepoint. We only prepare the arguments and reach the tracepoint if the semaphore is greater than zero. The semaphore is decreased when detaching from the tracepoint. The implementation of the TRACEPOINT_ACTIVE macro is flexible enough that it can be adapted for macOS dtrace tracing (bitcoin#25541). With dtrace, [CONTEXT]_[EVENT]_ENABLED() can be used to gate the tracepoints. While the TRACEPOINT0(context, event) macro without arguments doesn't need a check if it's active or not (no arguments can be prepared and passed), it does require a TRACEPOINT_SEMAPHORE() due to _SDT_HAS_SEMAPHORES requiring all tracepoints to have a semaphore. As the check is cheap, a check is added to not cause problems with the semaphore being unused. This also extends the "Adding a new tracepoint" documentation to include information about the semaphores and updated step-by-step instructions on how to add a new tracepoint.
Before this commit, we would always prepare tracepoint arguments regardless of the tracepoint being used or not. While we already made sure not to include expensive arguments in our tracepoints, this commit introduces gating to make sure the arguments are only prepared if the tracepoints are actually used. This is a win-win improvement to our tracing framework. For users not interested in tracing, the overhead is reduced to a cheap 'greater than 0' compare. As the semaphore-gating technique used here is available in bpftrace, bcc, and libbpf, users interested in tracing don't have to change their tracing scripts while profiting from potential future tracepoints passing slightly more expensive arguments. In the context of bitcoin#26531's mempool tracepoints, one example could be passing serialized transactions for RBF replacements. Under the hood, the semaphore-gating works by placing a 2-byte semaphore in the '.probes' ELF section. The address of the semaphore is contained in the ELF note providing the tracepoint information (`readelf -n ./src/bitcoind | grep NT_STAPSDT`). Tracing toolkits like bpftrace, bcc, and libbpf increase the semaphore at the address upon attaching to the tracepoint. We only prepare the arguments and reach the tracepoint if the semaphore is greater than zero. The semaphore is decreased when detaching from the tracepoint. The implementation of the TRACEPOINT_ACTIVE macro is flexible enough that it can be adapted for macOS dtrace tracing (bitcoin#25541). With dtrace, [CONTEXT]_[EVENT]_ENABLED() can be used to gate the tracepoints. While the TRACEPOINT0(context, event) macro without arguments doesn't need a check if it's active or not (no arguments can be prepared and passed), it does require a TRACEPOINT_SEMAPHORE() due to _SDT_HAS_SEMAPHORES requiring all tracepoints to have a semaphore. As the check is cheap, a check is added to not cause problems with the semaphore being unused. This also extends the "Adding a new tracepoint" documentation to include information about the semaphores and updated step-by-step instructions on how to add a new tracepoint.
Before this commit, we would always prepare tracepoint arguments regardless of the tracepoint being used or not. While we already made sure not to include expensive arguments in our tracepoints, this commit introduces gating to make sure the arguments are only prepared if the tracepoints are actually used. This is a win-win improvement to our tracing framework. For users not interested in tracing, the overhead is reduced to a cheap 'greater than 0' compare. As the semaphore-gating technique used here is available in bpftrace, bcc, and libbpf, users interested in tracing don't have to change their tracing scripts while profiting from potential future tracepoints passing slightly more expensive arguments. In the context of bitcoin#26531's mempool tracepoints, one example could be passing serialized transactions for RBF replacements. Under the hood, the semaphore-gating works by placing a 2-byte semaphore in the '.probes' ELF section. The address of the semaphore is contained in the ELF note providing the tracepoint information (`readelf -n ./src/bitcoind | grep NT_STAPSDT`). Tracing toolkits like bpftrace, bcc, and libbpf increase the semaphore at the address upon attaching to the tracepoint. We only prepare the arguments and reach the tracepoint if the semaphore is greater than zero. The semaphore is decreased when detaching from the tracepoint. The implementation of the TRACEPOINT_ACTIVE macro is flexible enough that it can be adapted for macOS dtrace tracing (bitcoin#25541). With dtrace, [CONTEXT]_[EVENT]_ENABLED() can be used to gate the tracepoints. While the TRACEPOINT0(context, event) macro without arguments doesn't need a check if it's active or not (no arguments can be prepared and passed), it does require a TRACEPOINT_SEMAPHORE() due to _SDT_HAS_SEMAPHORES requiring all tracepoints to have a semaphore. As the check is cheap, a check is added to not cause problems with the semaphore being unused. This also extends the "Adding a new tracepoint" documentation to include information about the semaphores and updated step-by-step instructions on how to add a new tracepoint.
4f46013
to
e9ad8f1
Compare
4f440df
to
d0c6da9
Compare
Before this commit, we would always prepare tracepoint arguments regardless of the tracepoint being used or not. While we already made sure not to include expensive arguments in our tracepoints, this commit introduces gating to make sure the arguments are only prepared if the tracepoints are actually used. This is a win-win improvement to our tracing framework. For users not interested in tracing, the overhead is reduced to a cheap 'greater than 0' compare. As the semaphore-gating technique used here is available in bpftrace, bcc, and libbpf, users interested in tracing don't have to change their tracing scripts while profiting from potential future tracepoints passing slightly more expensive arguments. In the context of bitcoin#26531's mempool tracepoints, one example could be passing serialized transactions for RBF replacements. Under the hood, the semaphore-gating works by placing a 2-byte semaphore in the '.probes' ELF section. The address of the semaphore is contained in the ELF note providing the tracepoint information (`readelf -n ./src/bitcoind | grep NT_STAPSDT`). Tracing toolkits like bpftrace, bcc, and libbpf increase the semaphore at the address upon attaching to the tracepoint. We only prepare the arguments and reach the tracepoint if the semaphore is greater than zero. The semaphore is decreased when detaching from the tracepoint. The implementation of the TRACEPOINT_ACTIVE macro is flexible enough that it can be adapted for macOS dtrace tracing (bitcoin#25541). With dtrace, [CONTEXT]_[EVENT]_ENABLED() can be used to gate the tracepoints. While the TRACEPOINT0(context, event) macro without arguments doesn't need a check if it's active or not (no arguments can be prepared and passed), it does require a TRACEPOINT_SEMAPHORE() due to _SDT_HAS_SEMAPHORES requiring all tracepoints to have a semaphore. As the check is cheap, a check is added to not cause problems with the semaphore being unused. This also extends the "Adding a new tracepoint" documentation to include information about the semaphores and updated step-by-step instructions on how to add a new tracepoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK d0c6da9
don't have much experience with bcc, so hoping to get another review from @0xB10C. Also pinging @jamesob @dergoegge for re-review?
TRACE2(mempool, rejected, | ||
tx->GetHash().data(), | ||
result.m_state.GetRejectReason().c_str() | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a question about reusing an event name, i.e. for rejections in packages (in a future PR) could/should we add a TRACE2(mempool, rejected, txhash, reason)
or TRACE3(mempool, rejected, txhash, pkghash, reason)
in the package rejection code? Or should we use a different event
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last time I checked event names could not be reused, so the package rejection code would have to use a new event.
On the bright side, having separate rejection events might help differentiating between them and avoid conflating regular vs. package-based rejections.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mm good to know, thanks!
ACK d0c6da9 Code reviewed the mempool tracepoint interface tests and the Two things to note:
I've also setup a quick-and-dirty dashboard using these tracepoints on https://bitcoind.observer/d/jRIa_aaVk/mempool (might turn this off at somepoint again). |
Tracepoints for added, removed, replaced, and rejected transactions. The removal reason is passed as string instead of a numeric value, since the benefits of not having to maintain a redundant enum-string mapping seem to outweigh the small cost of string generation. The reject reason is passed as string as well, although here the string does not have to be generated but is readily available. So far, tracepoint PRs typically included two demo scripts: a naive bpftrace script to show raw tracepoint data and a bcc script for a more refined view. However, as some of the ongoing changes to bpftrace introduce a certain degree of unreliability (running some of the existing bpftrace scripts was not possible with standard kernels and bpftrace packages on latest stable Ubuntu, Debian, and NixOS), this PR includes only a single bcc script that fuses the functionality of former bpftrace and bcc scripts.
129d70e
to
4b7aec2
Compare
|
ACK 4b7aec2 Reviewed changes since my last ACK with
The failing CI job looks unrelated. |
ACK 4b7aec2 |
https://nydig.com/bounties/add-bpf-tracepoints-for-the-mempool This might be of interest. For reference I am no longer and have not been working on my mempool tracepoints PR and I closed it months ago. |
Before this commit, we would always prepare tracepoint arguments regardless of the tracepoint being used or not. While we already made sure not to include expensive arguments in our tracepoints, this commit introduces gating to make sure the arguments are only prepared if the tracepoints are actually used. This is a win-win improvement to our tracing framework. For users not interested in tracing, the overhead is reduced to a cheap 'greater than 0' compare. As the semaphore-gating technique used here is available in bpftrace, bcc, and libbpf, users interested in tracing don't have to change their tracing scripts while profiting from potential future tracepoints passing slightly more expensive arguments. In the context of bitcoin#26531's mempool tracepoints, one example could be passing serialized transactions for RBF replacements. Under the hood, the semaphore-gating works by placing a 2-byte semaphore in the '.probes' ELF section. The address of the semaphore is contained in the ELF note providing the tracepoint information (`readelf -n ./src/bitcoind | grep NT_STAPSDT`). Tracing toolkits like bpftrace, bcc, and libbpf increase the semaphore at the address upon attaching to the tracepoint. We only prepare the arguments and reach the tracepoint if the semaphore is greater than zero. The semaphore is decreased when detaching from the tracepoint. The implementation of the TRACEPOINT_ACTIVE macro is flexible enough that it can be adapted for macOS dtrace tracing (bitcoin#25541). With dtrace, [CONTEXT]_[EVENT]_ENABLED() can be used to gate the tracepoints. While the TRACEPOINT0(context, event) macro without arguments doesn't need a check if it's active or not (no arguments can be prepared and passed), it does require a TRACEPOINT_SEMAPHORE() due to _SDT_HAS_SEMAPHORES requiring all tracepoints to have a semaphore. As the check is cheap, a check is added to not cause problems with the semaphore being unused. This also extends the "Adding a new tracepoint" documentation to include information about the semaphores and updated step-by-step instructions on how to add a new tracepoint.
This PR adds multiple mempool tracepoints.
mempool:added
mempool:removed
mempool:replaced
mempool:rejected
The tracepoints are further documented in
docs/tracing.md
. Usage is demonstrated in the example scriptcontrib/tracing/mempool_monitor.py
. Interface tests are provided intest/functional/interface_usdt_mempool.py
.The rationale for passing the removal reason as a string instead of numerically is that the benefits of not having to maintain a redundant enum-string mapping seem to outweigh the small cost of string generation. The reject reason is passed as string as well, although in this instance the string does not have to be generated but is readily available.