New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simple benchmarks of a few topologies #52
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took a quick look mostly cuz I was curious but this is looking fantastic!
.and_then(|sink| { | ||
// This waits for FIN from the server so we don't start shutting it down before it's fully received the test data | ||
let socket = sink.into_inner().into_inner(); | ||
socket.shutdown(std::net::Shutdown::Write).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious why you decided not to use sink.close()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's an annoying and complicated race condition here:
If we use sink.close()
, it returns immediately (without waiting for the server to acknowledge that the connection was closed), and the only guarantee we have is that the packets are at least queued up in the kernel's networking buffers. Since the very thing that happens after this is shutting the server down, there a chance that we could actually shut the server down before it's started processing its input.
Shutting down the write half of the socket sends FIN
to the server letting it know the client is done sending data, and once it's read everything the client has sent, it'll respond with its own FIN
(which the reads on the lines after this wait for), letting us be sure that the server has at least picked up all of the input before we start shutting it down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah that makes sense, what about this https://docs.rs/tokio/0.1.13/tokio/io/trait.AsyncWrite.html#tymethod.shutdown?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh sweet! I've never noticed that. Cleaned up a bit with that in 39bd742
🎉 This looks awesome! One thing I'd be curious to see is a comparison against something that's built statically with the stream and sink combinators. That way we can get a rough idea of what (if any) price we're paying for the dynamic topology stuff. |
Closes #39.
These take about a minute to run on my main development machine, which seems like a decent compromise between thoroughness and iteration speed. Their speed can be tweaked with
num_lines
orsample_size
(to change either how much data a single benchmark handles or how many times the benchmark is run).I focused on a few topologies that I expect to have different bottlenecks or be impacted by different types code changes. Let me know if there are any other setups that you think are worth including in here.
One thing that's not in here that I think is probably worth trying at some point: Having some benchmarks with artificial sinks/sources that don't use the network (e.g. a random line generator sink) and use those to detect changes that would otherwise be hidden behind network bottlenecks (or even just the noise of using the network).
The code in these is not super pretty. If you see any easy wins on how to clean them up at all, I would love that feedback.
These are not the purest benchmarks in the world (e.g. the harness itself takes up resources, and the network traffic between the harness and the system-under-test might be a bottleneck at times), but it seems decent enough to identify significant changes in either direction.
Instructions for using this thing:
cargo bench
will run it (and report the time difference between the previous run ofcargo bench
).After
cargo install
ingcritcmp
, it can be used to compare non-consecutive run:(An example of this in action in #53.)
It also seems like we're off to a very good start with our performance.
pipe
(100 byte lines) is doing16.1 MB/sec
, andpipe_with_huge_lines
(100kb lines) is getting288.4 MB/sec
.