
Simple benchmarks of a few topologies #52

Merged: 1 commit merged from benchmarking into master on Dec 11, 2018

Conversation

@michaelfairley (Contributor) commented Dec 11, 2018

Closes #39.

These take about a minute to run on my main development machine, which seems like a decent compromise between thoroughness and iteration speed. Their speed can be tweaked with num_lines or sample_size (to change either how much data a single benchmark handles or how many times the benchmark is run).
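
For context, a minimal sketch of how a Criterion benchmark's sample size can be dialed down to trade accuracy for iteration speed; `run_topology` here is a hypothetical stand-in for driving a topology and is not part of this PR:

```rust
use criterion::{criterion_group, criterion_main, Criterion};

// Hypothetical stand-in for pushing `num_lines` lines through a topology;
// the real benchmarks drive the system under test over the network.
fn run_topology(num_lines: usize) {
    for i in 0..num_lines {
        criterion::black_box(i);
    }
}

fn pipe_benchmark(c: &mut Criterion) {
    c.bench_function("pipe", |b| b.iter(|| run_topology(10_000)));
}

criterion_group! {
    name = benches;
    // Fewer samples per benchmark means a faster (but noisier) run.
    config = Criterion::default().sample_size(10);
    targets = pipe_benchmark
}
criterion_main!(benches);
```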

I focused on a few topologies that I expect to have different bottlenecks or be impacted by different types of code changes. Let me know if there are any other setups that you think are worth including here.

One thing that's not in here that I think is probably worth trying at some point: Having some benchmarks with artificial sinks/sources that don't use the network (e.g. a random line generator sink) and use those to detect changes that would otherwise be hidden behind network bottlenecks (or even just the noise of using the network).
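
As a rough illustration of that idea (purely hypothetical, not part of this PR), a network-free source could be as simple as an iterator of random lines:

```rust
// Hypothetical sketch of a network-free source: an iterator that yields
// random fixed-length lines, so the benchmark isn't bounded by socket I/O.
// Uses the `rand` crate's `rand::random`.
fn random_lines(len: usize) -> impl Iterator<Item = String> {
    std::iter::repeat_with(move || {
        (0..len)
            .map(|_| (b'a' + rand::random::<u8>() % 26) as char)
            .collect()
    })
}
```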

The code in these is not super pretty. If you see any easy wins on how to clean them up at all, I would love that feedback.

These are not the purest benchmarks in the world (e.g. the harness itself takes up resources, and the network traffic between the harness and the system-under-test might be a bottleneck at times), but it seems decent enough to identify significant changes in either direction.

Instructions for using this thing:

`cargo bench` will run it (and report the time difference from the previous run of `cargo bench`).

After `cargo install`ing critcmp, it can be used to compare non-consecutive runs:

```sh
cargo bench --bench bench -- --save-baseline before
# Make changes that impact performance
cargo bench --bench bench -- --save-baseline after
critcmp before after --list
```

(An example of this in action in #53.)

It also seems like we're off to a very good start on performance: pipe (100-byte lines) is doing 16.1 MB/sec, and pipe_with_huge_lines (100 KB lines) is getting 288.4 MB/sec.

@LucioFranco (Contributor) left a comment:

Took a quick look mostly cuz I was curious but this is looking fantastic!

```rust
.and_then(|sink| {
    // This waits for FIN from the server so we don't start shutting it down
    // before it's fully received the test data
    let socket = sink.into_inner().into_inner();
    socket.shutdown(std::net::Shutdown::Write).unwrap();
```
@LucioFranco (Contributor):

Curious why you decided not to use sink.close()?

@michaelfairley (Contributor, Author):

There's an annoying and complicated race condition here:
If we use sink.close(), it returns immediately (without waiting for the server to acknowledge that the connection was closed), and the only guarantee we have is that the packets are at least queued up in the kernel's networking buffers. Since the very next thing that happens after this is shutting the server down, there's a chance that we could actually shut the server down before it's started processing its input.
Shutting down the write half of the socket sends a FIN to the server, letting it know the client is done sending data. Once the server has read everything the client sent, it responds with its own FIN (which the reads on the lines after this wait for), letting us be sure that the server has at least picked up all of the input before we start shutting it down.
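
In plain std terms, the sequence being described looks roughly like this (a hypothetical illustration, not the harness's actual code):

```rust
use std::io::Read;
use std::net::{Shutdown, TcpStream};

// Hypothetical sketch of the half-close handshake described above.
fn finish_sending(mut socket: TcpStream) -> std::io::Result<()> {
    // Half-close: send our FIN so the server knows no more data is coming.
    socket.shutdown(Shutdown::Write)?;

    // Keep reading until the server sends its own FIN (read returns 0),
    // which tells us it has picked up everything we sent.
    let mut buf = [0u8; 1024];
    while socket.read(&mut buf)? > 0 {}
    Ok(())
}
```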

@michaelfairley (Contributor, Author):

Oh sweet! I've never noticed that. Cleaned up a bit with that in 39bd742

@lukesteensen (Member) commented:

🎉 This looks awesome! One thing I'd be curious to see is a comparison against something that's built statically with the stream and sink combinators. That way we can get a rough idea of what (if any) price we're paying for the dynamic topology stuff.

@michaelfairley michaelfairley merged commit 112ec06 into master Dec 11, 2018
@michaelfairley michaelfairley deleted the benchmarking branch December 11, 2018 19:25
syedriko referenced this pull request in syedriko/vector Jun 1, 2022