Per-Peer Message Capture #19509
Concept ACK the high-level idea. I think this feature may be quite useful while debugging, and it requires very few lines of code. I have two questions though:
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts: Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
Concept ACK. I'm looking forward to playing around with this.
My first thought is that this would be much more useful if the user was able to specify an individual peer to log messages for, either through a command line argument -logmessages=<ip_addr>
or through an RPC to enable and disable logging. Perhaps that can be done in a follow-up PR?
Force-pushed from dc393b3 to cf5966a
Concept ACK: Very nice feature! Looking forward to using it. Thanks for adding it.
Feature request: Would it be possible to add a flag to the Python script that writes the recorded messages to one file each, in the binary format the ProcessMessage(…) fuzzers expect (src/test/fuzz/process_message.cpp and src/test/fuzz/process_messages.cpp)? Ideally with a file naming convention matching what libFuzzer and many other fuzzers generate: the file name being the SHA-1 of the file content.
That way this feature could be used to record live network messages that could be used directly as fuzzing inputs. A great way to generate a fuzzing seed corpus. That would be super neat! :)
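To make the requested convention concrete, here is a minimal sketch of what such a flag might do: write each captured message payload to its own file, named by the SHA-1 of its content, as libFuzzer does for corpus entries. The function name and directory are hypothetical, not part of this PR.

```python
# Sketch: store each captured message payload as a fuzz seed file named by
# the SHA-1 of its content (libFuzzer's corpus naming convention).
# `write_fuzz_seed` and "seed_corpus" are illustrative names, not PR code.
import hashlib
import os

def write_fuzz_seed(corpus_dir: str, message: bytes) -> str:
    """Store `message` under corpus_dir, named by the SHA-1 of its content."""
    os.makedirs(corpus_dir, exist_ok=True)
    name = hashlib.sha1(message).hexdigest()
    path = os.path.join(corpus_dir, name)
    with open(path, "wb") as f:
        f.write(message)
    return path

# Identical messages hash to the same name, so the corpus deduplicates itself.
seed = write_fuzz_seed("seed_corpus", b"\x00\x02\x00\x00\x00\x00\x00\x00\x00")
print(os.path.basename(seed))
```

A nice side effect of content-addressed naming is that merging corpora from several capture sessions is a plain file copy with no collision handling needed.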
Force-pushed from cf5966a to 9898fa1
Made suggested changes and fixed the Travis problem.
Looks surprisingly simple, so Concept ACK
@naumenkogs thanks for the concept ack!
IMO that format is good for a debug log, where things need to be compact on a single line. For this I much prefer JSON.
At the moment, with this locked behind a debug flag, I'm not worried about DoS vectors. For both debugging and learning someone should be at the keyboard and the node shouldn't be working with anything real. Additionally, I imagine that if writing a very small amount to disk in this way (every time a message is sent or received) is a DoS vector then there are many more serious ones available. That said, this is worth consideration, especially if we think this will one day be used for wider purposes.
Force-pushed from 45db287 to cbb154e

Force-pushed from cbb154e to 863c6b5
Huge Concept ACK! Very nice idea with simple implementation, I'm also pretty sure this will be very helpful for debugging purposes. Will definitely code-review that within the next days. Just one detail I find quite odd: why create two files instead of just one, and not rather include the direction as part of the serialization for each message? Is it only about saving disk space (1 byte per message) or is there another reason I'm not aware of? I've never seen a logging/sniffing tool that divides up into multiple files before and tend to think it would only complicate re-assembling the whole send/receive order later.
@theStack Thanks for the concept ACK!
Good question; I considered a few ways of doing this. Just to be clear, I create two files per peer. Partially, yes, this saves 1 byte per message. I could have also just logged everything into one file, which would have meant saving the peer address alongside each message as well. Ultimately, I chose the current way with the intention of simplifying the usage of the parser. As it stands, the parser is a simple "just give me what you want me to parse" sort of program. This allows the user to take advantage of globbing when selecting which files to parse.
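For the record, parser invocations with shell globs might look like the following. The paths assume the message_capture layout described in this PR; the output filenames are illustrative.

```shell
# Everything from every peer into one JSON file:
./contrib/message-capture/message-capture-parser.py -o all.json \
    message_capture/**/*.dat

# Only received messages, across all peers:
./contrib/message-capture/message-capture-parser.py -o recv.json \
    message_capture/**/msgs_recv.dat

# Both directions for a single peer; the parser interleaves
# the two files automatically in the output:
./contrib/message-capture/message-capture-parser.py -o peer.json \
    message_capture/203.0.113.7_56072/*.dat
```

Note that `**` requires `shopt -s globstar` in bash (it is on by default in zsh).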
Generally I'd say it makes sense to divide up into files/folders by criteria that I would likely use as a frequent search key for accessing the logging data. Would I often want to access logs for a certain peer? Absolutely. Would I want to see only incoming or outgoing messages for a certain peer? Very unlikely, as the resulting data is probably not very valuable without the whole request/response stream in both directions. On the other hand, your arguments for simplification, efficiency (if only looking for incoming or outgoing messages is really such a frequent operation) and flexibility are also very strong, hence this solution is also okay with me. I reviewed the code in more detail and noticed that there may be a serialization endianness problem, see the comments soon following below.
I agree that this seems very nice. I don't have any objection to changing the output to be this way.
I'm less convinced of this. For one, those details can easily be fished out of the received version message in the file. I also like the directory structure of one directory per peer.
Yes, just a suggestion to hear what people think.
Note that I'm talking about the version of the dumping node, not of the peer we're connecting to (that may also be interesting of course, but is a bit harder as it means you only know the filename after receiving VERSION).
If the intent is really "one directory per peer", the port perhaps shouldn't be included (or at least not for incoming connections), as the outgoing TCP port is newly chosen for every connection.
This has been brought up a couple of times now, and it's a great question, so it's worth a full response (that won't be buried in resolved conversations). My priority in this PR is that the capture of messages always occurs in order from an application layer perspective. Because of this, I have to capture messages in the MessageHandler thread. If I were to instead capture incoming messages in the SocketHandler thread, say, right after ReceiveMsgBytes, the following could possibly occur:
whereas on the processing side (since X and Y are in the
So the order would be broken. Remember, this PR isn't just a fun replacement for Wireshark. The point is that someone improving/debugging/learning Bitcoin's p2p protocol can see what's actually going on from the node's perspective, without having to worry about how the timing of the queues is complicating things.
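As a toy illustration of the argument (plain Python, not Bitcoin Core code; the queue and names are simplified stand-ins for vProcessMsg and the MessageHandler thread), capturing at dequeue time makes the captured order identical to the processing order by construction, regardless of how bytes trickled in at the socket layer:

```python
# Minimal sketch: capture a message at the moment it is dequeued for
# processing, so the capture log mirrors the processing order exactly.
import queue

vProcessMsg = queue.Queue()  # stand-in for the per-peer receive queue
capture_log = []
processed = []

def capture(direction, msg):
    capture_log.append((direction, msg))

def message_handler():
    # The socket layer may have reassembled a message from several partial
    # reads while others arrived, but this queue only ever holds complete
    # messages, already in processing order.
    while not vProcessMsg.empty():
        msg = vProcessMsg.get()   # complete message, in processing order
        capture("recv", msg)      # capture at dequeue time...
        processed.append(msg)     # ...so the log matches what was processed

for m in ["version", "verack", "ping"]:
    vProcessMsg.put(m)
message_handler()
assert [m for _, m in capture_log] == processed
print(capture_log)
```

Capturing in SocketHandler instead would tie the log to network timing rather than to what the node actually acted on, which is the property this PR cares about.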
@troygiorshev That's a great point. I was thinking that the difference between net-level and net_processing-level ordering would only be different interleaving across peers, but you're right to point out it even obscures the processing within one peer. I agree it's valuable to dump in processing order.
This is already done for you by the parser: just give it both files and the messages will be interleaved automatically in the output file. If I'm understanding you correctly, the following should do what you're looking for:
That said...
I agree completely, I'll figure this out!
This commit adds the CaptureMessage function. This will later be called when any message is sent or received. The capture directory is fixed, in a new folder "message_capture" in the datadir. Peers will then have their own subfolders, named with their IP address and port, replacing colons with underscores to keep compatibility with Windows. Inside, received and sent messages will be captured into two binary files, msgs_recv.dat and msgs_sent.dat, e.g.

message_capture/203.0.113.7_56072/msgs_recv.dat
message_capture/203.0.113.7_56072/msgs_sent.dat

The format has been designed so as to result in minimal performance impact. A parsing script is added in a later commit.
These calls are toggled by a debug-only "capturemessages" flag. Default disabled.
This commit adds contrib/message-capture/message-capture-parser.py, a python script to be used alongside -capturemessages to parse the captured messages. It is complete with arguments and will parse any file given, sorting the messages in the files when creating the output. If an output file is specified with -o or --output, it will dump the messages in json format to that file, otherwise it will print to stdout. The small change to the unused msg_generic is to bring it in line with the other message classes, purely to avoid a bug in the future.
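To give a feel for what such a parser has to do, here is a stand-alone reader sketch (not the PR's actual script). The header layout used here (an 8-byte little-endian microsecond timestamp, a 12-byte null-padded message type, and a 4-byte little-endian payload length) is an assumption based on the review discussion below; treat it as illustrative, not as the authoritative on-disk format.

```python
# Hedged sketch of reading records from a msgs_recv.dat / msgs_sent.dat
# capture file. Header sizes are assumptions for illustration.
import struct
from io import BytesIO

TIME_SIZE = 8      # timestamp in microseconds, little-endian
MSGTYPE_SIZE = 12  # ASCII command name, null-padded
LENGTH_SIZE = 4    # payload length, little-endian
HEADER_SIZE = TIME_SIZE + MSGTYPE_SIZE + LENGTH_SIZE

def read_capture(f):
    """Yield (timestamp_us, msgtype, payload) tuples until EOF."""
    while True:
        header = f.read(HEADER_SIZE)
        if len(header) < HEADER_SIZE:
            break  # clean EOF (or a truncated trailing record)
        time_us = int.from_bytes(header[:TIME_SIZE], "little")
        msgtype = header[TIME_SIZE:TIME_SIZE + MSGTYPE_SIZE].rstrip(b"\x00")
        length = int.from_bytes(header[-LENGTH_SIZE:], "little")
        yield time_us, msgtype.decode("ascii"), f.read(length)

# Round-trip a synthetic record to show the layout:
record = (struct.pack("<Q", 1_600_000_000_000_000)
          + b"ping".ljust(MSGTYPE_SIZE, b"\x00")
          + struct.pack("<I", 8)
          + b"\x01\x02\x03\x04\x05\x06\x07\x08")
print(list(read_capture(BytesIO(record))))
```

Because each record carries its own timestamp, interleaving msgs_recv.dat and msgs_sent.dat into one chronological stream is a simple merge by the first field.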
Add a functional test for CaptureMessage. This connects and then disconnects a peer so that the handshake can be used to check if capture is being done correctly. Included in a docstring in the test is the following:

From the data file we'll only check the structure. We won't care about things like:
- Deserializing the payload of the message - This is managed by the deserialize methods in test_framework.messages
- The order of the messages - There's no reason why we can't, say, change the order of the messages in the handshake
- Message Type - We can add new message types

We're ignoring these because they're simply too brittle to test here.
This commit adds brief documentation for this feature. Included in the justification is the purpose of this feature as well as usage and functionality tips.
Force-pushed from a17b250 to bff7c66
Thanks so much for the review everyone! Up soon: 1 file = 1 session
re-ACK bff7c66 only some minor changes: 👚
Still need to do a lot of playing around with this. Interested if you have given any thought @troygiorshev on what questions this tool can easily answer (without additional digging) and what is possible for future improvements/future PRs. Don't want to derail review of this PR though so set up a StackExchange question.
re-ACK bff7c66
Verified that the error handling code that was introduced since my last ACK works, by corrupting a message-type in the middle of a random capture file (with a hex editor) and diffing the json outputs of the original and corrupted captures:
s/sendcmpct/sendcmpcx/

```
53,57c53,55
< "msgtype": "sendcmpct",
< "body": {
<   "announce": false,
<   "version": 2
< }
---
> "msgtype": "sendcmpcx",
> "body": "000200000000000000",
> "error": "Unrecognized message type."
```

s/sendcmpct/<random_unprintable_gibberish>/

```
< "msgtype": "sendcmpct",
< "body": {
<   "announce": false,
<   "version": 2
< }
---
> "msgtype": "UNREADABLE",
> "body": "000200000000000000",
> "error": "Unrecognized message type."
```
The nit here can be addressed once you have to rebase: #19509 (comment)
@troygiorshev Are you still planning to do this?
utACK bff7c66 This isn't part of the public interface, so the file layout can be changed in a follow-up.
```python
if not tmp_header_raw:
    break
tmp_header = BytesIO(tmp_header_raw)
int.from_bytes(tmp_header.read(TIME_SIZE), "little")  # type: int
```
Not sure if this is the right place to ask, but when I'm running vulture against this file, it complains:

```
bitcoin/test/functional/p2p_message_capture.py:45: invalid syntax at "int.from_bytes(tmp_header.read(TIME_SIZE), "little") # type: int"
```

Is that a bug in vulture or a problem in the code? If I assign the return value of from_bytes to a variable, it passes.
I think it is a problem in the code. The return value of from_bytes is not being used.
I guess the fix is:

```diff
- int.from_bytes(tmp_header.read(TIME_SIZE), "little") # type: int
+ tmp_header.read(TIME_SIZE)  # skip over the timestamp field
```
I ran message-capture-parser.py and received the following error:

So I changed line 82, and the script worked after this change. The error occurred because the val variable held the value [], and accessing val[0] crashed the script.

I tested it on Ubuntu 20.04 on VMWare.
…rsion de85af5 test: store subversion (user agent) as string in msg_version (Sebastian Falbesoner)

Pull request description: It seems more natural to treat the "subversion" field (=user agent string, see [BIP 14](https://github.com/bitcoin/bips/blob/master/bip-0014.mediawiki#Proposal)) of a node as a pure string rather than a bytestring within the test framework. This is also suggested by the naming prefix in `msg_version.strSubVer`: one probably wouldn't expect a field starting with "str" to be a bytestring that needs further decoding to be useful. This PR moves the encoding/decoding parts to the serialization/deserialization routines so that the user doesn't have to bother with that anymore. Note that currently, in the master branch, `msg_version.strSubVer` is never read (only in `msg_version.__repr__`). However, one issue that is solved by this PR came up while testing #19509 (not merged yet): a decoding script for binary message capture files makes use of the functional test framework to convert captures into JSON format. Bytestrings will be converted to hexstrings, while pure strings will (surprise surprise) end up without modification in the file. So without this patch, we get:

```
$ jq . out.json | grep -m5 strSubVer
"strSubVer": "2f5361746f7368693a32312e39392e302f"
"strSubVer": "2f5361746f7368693a302e32302e312f"
"strSubVer": "2f5361746f7368693a32312e39392e302f"
"strSubVer": "2f5361746f7368693a302e32302e312f"
"strSubVer": "2f5361746f7368693a32312e39392e302f"
```

After this patch:

```
$ jq . out2.json | grep -m5 strSubVer
"strSubVer": "/Satoshi:21.99.0/"
"strSubVer": "/Satoshi:0.20.1/"
"strSubVer": "/Satoshi:21.99.0/"
"strSubVer": "/Satoshi:0.20.1/"
"strSubVer": "/Satoshi:21.99.0/"
```

ACKs for top commit: jnewbery: utACK de85af5

Tree-SHA512: ff23642705c858e8387a625537dfec82e6b8a15da6d99b8d12152560e52d243ba17431b602b26f60996d897e00e3f37dcf8dc8a303ffb1d544df29a5937080f9
Nice find @leonardojobim! You could also test for the list's non-emptiness by checking its booleanness, which is maybe more pythonic:

```python
elif isinstance(val, list) and val and slot in HASH_INT_VECTORS and isinstance(val[0], int):
```

Do you mind opening a PR for this?
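A quick demonstration of why that guard is safe (plain Python, independent of the parser): an empty list is falsy, so `val and ...` short-circuits before `val[0]` is ever evaluated.

```python
# `val and val[0] ...` never raises IndexError on an empty list, because
# the `and` short-circuits on the falsy empty list.
val = []
assert not val                     # empty list is falsy
assert not (val and val[0] == 1)   # short-circuits; val[0] not evaluated
val = [2016]
assert val and isinstance(val[0], int)
print("short-circuit guard ok")
```

This is why the one-token change (`and val`) fixes the crash reported above without any explicit len() check.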
Summary: The purpose and scope of this feature is intentionally limited. It answers a question anyone new to Bitcoin's P2P protocol has had: "Can I see what messages my node is sending and receiving?". When a new debug-only command line argument `capturemessages` is set, any message that the node receives or sends is captured. The capture occurs in the `MessageHandler` thread. When receiving a message, it is captured as soon as the `MessageHandler` thread takes the message off of the `vProcessMsg` queue. When sending, the message is captured just before the message is pushed onto the `vSendMsg` queue. The message capture is as minimal as possible to reduce the performance impact on the node. Messages are captured to a new message_capture folder in the datadir. Each node has their own subfolder named with their IP address and port. Inside, received and sent messages are captured into two binary files, msgs_recv.dat and msgs_sent.dat, like so:

```
message_capture/203.0.113.7:56072/msgs_recv.dat
message_capture/203.0.113.7:56072/msgs_sent.dat
```

Because the messages are raw binary dumps, included in this PR is a Python parsing tool to convert the binary files into human-readable JSON. This script has been placed on its own and out of the way in the new contrib/message-capture folder. Its usage is simple and easily discovered by the autogenerated -h option.

This is a backport of [[bitcoin/bitcoin#19509 | core#19509]]

Test Plan:
`ninja all check-all`
`src/bitcoind -capturemessages`
`./contrib/message-capture/message-capture-parser.py -o out.json /bitcoinddata/message_capture/**/*.dat`

Reviewers: #bitcoin_abc, Fabien
Reviewed By: #bitcoin_abc, Fabien
Subscribers: Fabien
Differential Revision: https://reviews.bitcoinabc.org/D12531
This PR introduces per-peer message capture into Bitcoin Core. 📓
Purpose
The purpose and scope of this feature is intentionally limited. It answers a question anyone new to Bitcoin's P2P protocol has had: "Can I see what messages my node is sending and receiving?".
Functionality
When a new debug-only command line argument capturemessages is set, any message that the node receives or sends is captured. The capture occurs in the MessageHandler thread. When receiving a message, it is captured as soon as the MessageHandler thread takes the message off of the vProcessMsg queue. When sending, the message is captured just before the message is pushed onto the vSendMsg queue.

The message capture is as minimal as possible to reduce the performance impact on the node. Messages are captured to a new message_capture folder in the datadir. Each node has their own subfolder named with their IP address and port. Inside, received and sent messages are captured into two binary files, msgs_recv.dat and msgs_sent.dat, like so:

```
message_capture/203.0.113.7_56072/msgs_recv.dat
message_capture/203.0.113.7_56072/msgs_sent.dat
```

Because the messages are raw binary dumps, included in this PR is a Python parsing tool to convert the binary files into human-readable JSON. This script has been placed on its own and out of the way in the new contrib/message-capture folder. Its usage is simple and easily discovered by the autogenerated -h option.

Future Maintenance
I sympathize greatly with anyone who says "the best code is no code".
The future maintenance of this feature will be minimal. The logic to deserialize the payload of the p2p messages exists in our testing framework. As long as our testing framework works, so will this tool.
Additionally, I hope that the simplicity of this tool will mean that it gets used frequently, so that problems will be discovered and solved when they are small.
FAQ
"Why not just use Wireshark?"
Yes, Wireshark has the ability to filter and decode Bitcoin messages. However, the purpose of the message capture added in this PR is to assist with debugging, primarily for new developers looking to improve their knowledge of the Bitcoin Protocol. This drives the design in a different direction than Wireshark, in two different ways. First, this tool must be convenient and simple to use. Using an external tool, like Wireshark, requires setup and interpretation of the results. To a new user who doesn't necessarily know what to expect, this is unnecessary difficulty. This tool, on the other hand, "just works". Turn on the command line flag, run your node, run the script, read the JSON. Second, because this tool is being used for debugging, we want it to be as close to the true behavior of the node as possible. A lot can happen in the SocketHandler thread that would be missed by Wireshark.
Additionally, if we are to use Wireshark, we are at the mercy of whoever is maintaining the protocol dissector in Wireshark, both in terms of it being accurate and up to date. As can be seen from the many previous attempts to include Bitcoin in Wireshark (google "bitcoin dissector"), this is easier said than done.
Lastly, I truly believe that this tool will be used significantly more by being included in the codebase. It's just that much more discoverable.