forked from bitcoin/bitcoin
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Backport Bitcoin PR#9441: Net: Massive speedup. Net locks overhaul #1586
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Surprisingly this hasn't been causing me any issues while testing, probably because it requires lots of large blocks to be flying around. Send/Recv corks need tests!
This will be needed so that the message processor can cork incoming messages
These conditions are problematic to check without locking, and we shouldn't be relying on the refcount to disconnect.
when vRecvMsg becomes a private buffer, it won't make sense to allow other threads to mess with it anymore.
This is left-over from before there was proper accounting. Hitting 2x the sendbuffer size should not be possible.
…eserialize We'll soon no longer have access to vRecvMsg, and this is more intuitive anyway.
This allows locking to be pushed down to only where it's needed Also reuse the current time rather than checking multiple times.
This may be used publicly in the future
In order to sleep accurately, the message handler needs to know if _any_ node has more processing that it should do before the entire thread sleeps. Rather than returning a value that represents whether ProcessMessages encountered a message that should trigger a disconnnect, interpret the return value as whether or not that node has more work to do. Also, use a global fProcessWake value that can be set by other threads, which takes precedence (for one cycle) over the messagehandler's decision. Note that the previous behavior was to only process one message per loop (except in the case of a bad checksum or invalid header). That was changed in PR dashpay#3180. The only change here in that regard is that the current node now falls to the back of the processing queue for the bad checksum/invalid header cases.
This separates the storage of messages from the net and queued messages for processing, allowing the locks to be split.
Messages are dumped very quickly from the socket handler to the processor, so it's the depth of the processing queue that's interesting. The socket handler checks the process queue's size during the brief message hand-off and pauses if necessary, and the processor possibly unpauses each time a message is popped off of its queue.
Similar to the recv flag, but this one indicates whether or not the net's send buffer is full. The socket handler checks the send queue when a new message is added and pauses if necessary, and possibly unpauses after each message is drained from its buffer.
vRecvMsg is now only touched by the socket handler thread. The accounting vars (nRecvBytes/nLastRecv/mapRecvBytesPerMsgCmd) are also only used by the socket handler thread, with the exception of queries from rpc/gui. These accesses are not threadsafe, but they never were. This needs to be addressed separately. Also, update comment describing data flow
UdjinM6
approved these changes
Aug 22, 2017
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK
EDIT: slightly tested, seems to be working ok
This was referenced Jul 16, 2019
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This is backport of Bitcoin PR bitcoin#9441.
This is the PR that not only significantly improves networking speed, but also it was the original cause that triggered massive code refactoring backport from Bitcoin.
Now we are at the end of this refactoring.
There are still couple of loose ends that will be addressed in subsequent PRs:
bip68-sequence
functional test to hang. Fixed in net: Consistently use GetTimeMicros() for inactivity checks bitcoin/bitcoin#9606.The original PR description follows.
Depends on (includes) bitcoin#9289. This is what I ultimately set out to fix with the net refactor.
In my (short) tests, it cuts network latency to ~50%. In some cases, more like 30%.
Test method: 1 fresh node (macbook) connected to a single other node (desktop). Testnet. Running until synced to block 10000. Results:
The results are very reproducible, always within a few seconds. Not only is it a nice improvement for the fresh node, but it compounds when its peer is running the updated code as well.
I had hoped to have the abstraction complete in time for 0.14, but it looks like that's unlikely at this point. For reference, it would look something more like theuni/bitcoin@1a6b10a.
Anyway, this is an attempt to fix the actual issue in time for 0.14, and putting off the last bit of refactor until after that. This addresses the issue observed in bitcoin#9415, but also cleans up the nasty locking issues.
I labeled this WIP because there are probably still several racy bits. Quite a bit of test-writing remains.
See the individual commits for the details. tl;dr: We currently either process a peer's message or read from their socket, but never both simultaneously. The changes here remove that restriction.