refactor(relay): reduce allocations during relaying #4453
We need to pass a `BufMut` to `encode_to_slice` to retain the append semantics of `BytesMut`. If we coerce to a slice instead, the encoder expects the slice's length to already be sufficient to hold the data. This is a bit messy but good enough for now.
Performance Test Results (TCP and UDP result tables not captured in this transcript)
I skimmed through it and it looks simple, but I don't actually see where the allocation is removed because I am not familiar with the code yet.
Nice to see some `<'a>`s disappearing.
receiver: channel.peer_address,
});
self.pending_commands
    .push_back(Command::ForwardDataClientToPeer {
Is this just a `Deque`? What was the reason not to limit the capacity?
It is emptied on every iteration of `Eventloop::poll` and only a small number of `Command`s can build up.
How small? 100?
More like 2-3.
Could we panic if it hits 100?
channel,
msg,
length,
}
}

// Panics if self.data.len() > u16::MAX
This comment is outdated, right?
Yes!
@@ -11,6 +11,7 @@ use crate::net_ext::IpAddrExt;
use crate::{ClientSocket, IpStack, PeerSocket, TimeEvents};
use anyhow::Result;
use bytecodec::EncodeExt;
+use bytes::BytesMut;
Is it the use of `bytes` that makes it easier to reduce allocations?
@@ -776,7 +788,7 @@ where

self.pending_commands.push_back(Command::ForwardData {
    id: channel.allocation,
    data: data.to_vec(),
This is one of the allocations we are avoiding.
impl PeerToClient {
    pub fn new(msg: &[u8]) -> Self {
        let mut buf = BytesMut::zeroed(msg.len() + 4);
We still allocate here but we leave enough space to put the channel data header in front of it.
@@ -100,7 +100,7 @@ async fn forward_incoming_relay_data(
tokio::select! {
    result = socket.recv() => {
        let (data, sender) = result?;
        relayed_data_sender.send((data.to_vec(), PeerSocket::new(sender), id)).await?;
Here used to be the first allocation. Note that `PeerToClient::new` still allocates.
@ReactorScram I'd recommend patch-by-patch review. I added some comments on how we are avoiding the allocations :)
Previously, we would allocate each message twice. We can optimise this to only one allocation each, e.g. by allocating a single buffer with room for the `ChannelData` message for traffic from clients to peers.

Local flamegraphing still shows most of the user-space activity as allocations. I did occasionally see a throughput of ~10GBps with these patches. I'd like to still work towards #4095 to ensure we handle anything time-sensitive better.