fix(connlib): split TUN send & recv into separate threads#8117
Merged
jamilbk approved these changes on Feb 13, 2025
jamilbk (Member) left a comment:
Tested on Apple. Replicated the 40% uplift, primarily on the upload throughput.
Member
May want to add a changelog note (I noticed customers do read these).
github-merge-queue Bot pushed a commit that referenced this pull request on Feb 17, 2025
Same as done for unix-based operating systems in #8117, we introduce a dedicated "TUN send" thread for Windows in this PR. Not only does this move the syscalls and copying of sending packets away from `connlib`'s main thread but it also establishes backpressure between those threads properly.
WinTUN does not have any ability to signal that it has space in its send buffer. If it fails to allocate a packet for sending, it will return `ERROR_BUFFER_OVERFLOW` [0]. We now handle this case gracefully by suspending the send thread for 10ms and then trying again. This isn't a great way of establishing back-pressure but at least we don't have any packet loss.
To test this, I temporarily lowered the ring buffer size and ran a speed test. In that, I could confirm that `ERROR_BUFFER_OVERFLOW` is indeed emitted and handled as intended.
[0]: https://git.zx2c4.com/wintun/tree/api/session.c#n267
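The retry-on-overflow behaviour described in this commit message can be sketched roughly as follows. This is only an illustration under stated assumptions: `Ring`, `BufferOverflow` and `send_with_backoff` are hypothetical names standing in for the WinTUN send ring and the send thread's loop, and none of it is the actual `connlib` or WinTUN API.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;
use std::thread;
use std::time::Duration;

/// Returned when the ring has no free slot. Hypothetical stand-in for
/// WinTUN's `ERROR_BUFFER_OVERFLOW`.
#[derive(Debug, PartialEq)]
pub struct BufferOverflow;

/// Hypothetical fixed-capacity send ring; NOT the real WinTUN API.
pub struct Ring {
    slots: Mutex<VecDeque<Vec<u8>>>,
    capacity: usize,
}

impl Ring {
    pub fn new(capacity: usize) -> Self {
        Ring { slots: Mutex::new(VecDeque::new()), capacity }
    }

    /// Copies `packet` into the ring, or fails immediately if the ring is full.
    /// Like the behaviour described above, it cannot signal when space frees up.
    pub fn try_send(&self, packet: &[u8]) -> Result<(), BufferOverflow> {
        let mut slots = self.slots.lock().unwrap();
        if slots.len() >= self.capacity {
            return Err(BufferOverflow);
        }
        slots.push_back(packet.to_vec());
        Ok(())
    }

    /// Removes the oldest packet, simulating the driver draining the ring.
    pub fn pop(&self) -> Option<Vec<u8>> {
        self.slots.lock().unwrap().pop_front()
    }
}

/// Keeps retrying until the packet fits, sleeping `backoff` between attempts.
/// Returns how many retries were needed. The packet is never dropped.
pub fn send_with_backoff(ring: &Ring, packet: &[u8], backoff: Duration) -> u32 {
    let mut retries = 0;
    while ring.try_send(packet).is_err() {
        retries += 1;
        thread::sleep(backoff);
    }
    retries
}
```

Called with `Duration::from_millis(10)` from a dedicated send thread, the sleep only stalls that thread. If packets reach the thread over a bounded queue (an assumption here), the queue fills up while the thread sleeps, which is what would propagate the backpressure to the producer.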
We appear to have caused a pretty big performance regression (~40%) in 037a2e6 (identified through `git-bisect`). Specifically, the regression appears to have been caused by aef411a (#7605). Weirdly enough, undoing just that on top of `main` doesn't fix the regression.
My hypothesis is that using the same file descriptor for read AND write interests on the same runtime causes issues because those interests are occasionally cleared (i.e. on false-positive wake-ups).
In this PR, we spawn a dedicated thread each for the sending and receiving operations of the TUN device. On unix-based systems, a TUN device is just a file descriptor and can therefore simply be copied and read & written to from different threads. Most importantly, we only construct the `AsyncFd` within the newly spawned thread and runtime because constructing an `AsyncFd` implicitly registers with the runtime active on the current thread.
As a nice benefit, this allows us to get rid of a `future::select`. Those are always kind of nasty because they cancel the future that wasn't ready. My original intuition was that we drop packets due to cancelled futures there but that could not be confirmed in experiments.
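The idea of copying the file descriptor and giving each direction its own thread can be sketched as below. It is a simplified illustration, not the code from this PR: it uses blocking I/O from the standard library instead of `AsyncFd` on a per-thread runtime, a `UnixDatagram` as a stand-in for the TUN device, and bounded channels of an arbitrarily chosen size. `spawn_io_threads` and the thread names are hypothetical.

```rust
use std::io;
use std::os::unix::net::UnixDatagram;
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};
use std::thread;

/// Spawns one thread that only writes to the device and one that only reads
/// from it. Each thread owns its own duplicate of the file descriptor.
/// `UnixDatagram` is a stand-in for the TUN device (an assumption).
pub fn spawn_io_threads(
    device: UnixDatagram,
) -> io::Result<(SyncSender<Vec<u8>>, Receiver<Vec<u8>>)> {
    // `try_clone` duplicates the descriptor; both copies refer to the same device.
    let reader = device.try_clone()?;
    let writer = device;

    // Bounded channels: a full queue blocks the producer instead of dropping packets.
    let (outbound_tx, outbound_rx) = sync_channel::<Vec<u8>>(16);
    let (inbound_tx, inbound_rx) = sync_channel::<Vec<u8>>(16);

    // Send thread: drains the outbound queue into the device.
    thread::Builder::new().name("tun-send".into()).spawn(move || {
        for packet in outbound_rx {
            if writer.send(&packet).is_err() {
                break;
            }
        }
    })?;

    // Receive thread: reads packets from the device and forwards them.
    thread::Builder::new().name("tun-recv".into()).spawn(move || {
        let mut buf = [0u8; 1500];
        loop {
            match reader.recv(&mut buf) {
                Ok(n) => {
                    if inbound_tx.send(buf[..n].to_vec()).is_err() {
                        break; // the consumer went away
                    }
                }
                Err(_) => break,
            }
        }
    })?;

    Ok((outbound_tx, inbound_rx))
}
```

Because each thread touches the descriptor in one direction only, neither needs to wait on read and write readiness at the same time, so there is nothing to race between with a `select`. In an async version, the equivalent of `reader` and `writer` would be wrapped in `AsyncFd` inside the spawned threads, as the description above notes.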