Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add historical performance benchmark #4083

Merged
merged 25 commits into from
Jul 24, 2023
Merged

Add historical performance benchmark #4083

merged 25 commits into from
Jul 24, 2023

Conversation

tinzh
Copy link
Contributor

@tinzh tinzh commented Jun 29, 2023

Description of changes:

Add historical performance benchmark up to v1.3.16 (Jun 2022), before which there are breaking API changes that require refactor of bench harness. Script checks out old version of s2n-tls, runs benches, and stores results in a csv. The graph_perf binary parses the results csv and generates a graph.

Testing:

This adds no library code, so no new tests are needed. To test, just run benchmarks.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@tinzh tinzh marked this pull request as ready for review June 29, 2023 23:46
@tinzh tinzh requested review from goatgoose and toidiu and removed request for maddeleine and jmayclin July 6, 2023 23:03
bindings/rust/bench/historical-perf/historical-perf.svg Outdated Show resolved Hide resolved
bindings/rust/bench/historical-perf/bench-past.sh Outdated Show resolved Hide resolved
bindings/rust/bench/src/bin/graph_perf.rs Outdated Show resolved Hide resolved
bindings/rust/bench/src/bin/graph_perf.rs Outdated Show resolved Hide resolved
bindings/rust/bench/src/bin/graph_perf.rs Outdated Show resolved Hide resolved
bindings/rust/bench/src/bin/graph_perf.rs Outdated Show resolved Hide resolved
bindings/rust/bench/historical-perf/bench-past.sh Outdated Show resolved Hide resolved
bindings/rust/bench/benches/handshake.rs Outdated Show resolved Hide resolved
@tinzh tinzh requested a review from goatgoose July 7, 2023 21:59
@tinzh tinzh requested a review from goatgoose July 10, 2023 23:55
bindings/rust/bench/Cargo.toml Outdated Show resolved Hide resolved
Comment on lines 57 to 71
#[cfg(not(feature = "s2n-only"))]
{
bench_handshake_for_library::<RustlsHarness>(
&mut bench_group,
"rustls",
handshake_type,
ec_group,
);
bench_handshake_for_library::<OpenSslHarness>(
&mut bench_group,
"openssl",
handshake_type,
ec_group,
);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than doing this as a feature, is it possible to have 3 different bench harness (one for each TLS provider)? Ideally you can only run Openssl or Rustls also.

cargo bench -- s2n --exact --no-fail-fast

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talked offline, decided that keeping the feature was ultimately worth it because historical performance builds/runs faster without compiling the rustls/openssl benches. However, we'll rename the feature to historical-perf to more accurately reflect the purpose of the feature and call out what it does in the readme.

Comment on lines 21 to 22
// generate all inputs (TlsBenchHarness structs) before benchmarking handshakes
// timing only includes negotiation, not config/connection initialization
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// generate all inputs (TlsBenchHarness structs) before benchmarking handshakes
// timing only includes negotiation, not config/connection initialization
// Generate all harnesses (TlsBenchHarness structs) so that benchmark only include
// negotiation and not config/connection initialization

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much time does conn/config initialization take? It should be pretty small since should be using s2n_config_new_minimal by default now. We had a regression with that so it might actually be good to have a benchmark for that separately.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, but that would require a little bit of a refactor to the current benching harness, especially to separate out the config initialization and the conn initialization. I have a refactor planned that includes separating those two out, and I could possibly add a bench for that then.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure that the initialization took quite a bit of time? Taking out the initialization part was one of the first things I did, so it's been a while since I benched with it.

Comment on lines 35 to 37
// if harness invalid, do nothing but don't panic
// useful for historical performance bench to ignore configs
// invalid only for past versions of s2n-tls
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// if harness invalid, do nothing but don't panic
// useful for historical performance bench to ignore configs
// invalid only for past versions of s2n-tls
// Ignore failure since some past versions of s2n-tls have a different API and therefore
// fail to build. Since the results are graphed as part of historical perf, its possible to
// extrapolate any missing data.

Copy link
Contributor Author

@tinzh tinzh Jul 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fail-safe part of the code isn't because of API changes causing build failure, but rather API changes that cause a runtime error on initialization (namely trying to use security policies that were only added recently). I'm going to change the comment to harness with certain parameters fail to initialize for some past versions of s2n-tls, but missing data can be visually interpolated in the historical performance graph.

nit: I've been generally using lowercase for inline comments like // ... but capitalize sentences for doc comments with /// .... Is capitalizing multiline inline comments a specific style I should be following, personal preference, a mix of those, or etc.?

Comment on lines 11 to 12
# immediately bail if any command fails
set -e
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you move this to right above bin/bash

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the shebang needs to be the first thing, but I can move the set -e above the suppressing stdout.

bindings/rust/bench/historical-perf/bench-past.sh Outdated Show resolved Hide resolved
@tinzh tinzh requested a review from toidiu July 14, 2023 22:03
bindings/rust/bench/historical-perf/bench-past.sh Outdated Show resolved Hide resolved
bindings/rust/bench/src/bin/graph_perf.rs Show resolved Hide resolved
};

/// Return (f64, f64) of (mean, standard error) from Criterion result JSON
fn process_single_json(path: &Path) -> (f64, f64) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make this more "rusty", you can define a custom data structure:

struct BenchOutput {
   mean: f64,
   std_err: f64
}

You can also define a custom error and then return a Result<BenchOutput, CustomErr>.

For custom errors checkout https://github.com/dtolnay/thiserror

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could also replace your BenchGroupData

path::Path,
};

/// Return (f64, f64) of (mean, standard error) from Criterion result JSON
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comments should just be 2 //

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been using the /// for doc comments, including on private methods, for readability. Even if it should be //, the rest of the bench codebase already uses /// for all function descriptions (but does use // everywhere else).

}

/// Plots given data with given chart parameters
fn plot_bench_groups<F: Fn(&f64) -> String>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like you are parsing and plotting the data here. you could separate this to make it easier to understand and reason about

@tinzh tinzh requested a review from toidiu July 18, 2023 16:03
bindings/rust/bench/historical-perf/bench-past.sh Outdated Show resolved Hide resolved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you move this to a folder called bench/scripts/bench_historical_perf.sh

Copy link
Contributor Author

@tinzh tinzh Jul 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the script in the separate historical-perf folder mainly as a place to hold the .svg artifacts that the script outputs. If I move the script into a scripts folder where would the .svg outputs live?

Also, if there was a scripts folder, the use-awslc-*.sh scripts and memory/bench-memory.sh from the memory bench PR, and certs/generate_certs.sh should all maybe also be moved there too then (from certs/)? Overall, I feel like these scripts have not a lot in common with each other, and it'd make sense to separate them out.

It probably isn't ideal to have so many scripts strewn everywhere, but a lot of different tools and build configs need to be clobbered together and just using cargo hasn't let me do what we need to bench everything.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The .svg can be output to the target folder since thats where artifacts go. Like you said there are alot of scripts and its good to have them in one place. I would say yea all the scripts should move to the same folder.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll wait for a refactor pr (which should be right after all the current bench prs get merged) then to change it, since this would span multiple pr's and current work.

@tinzh tinzh requested a review from toidiu July 18, 2023 19:26
Comment on lines 245 to 258
let versions = get_unique_versions(&handshake_data)
.into_iter()
.chain(get_unique_versions(&throughput_data).into_iter())
.chain((15..16).chain(30..38).map(|p| Version::new(1, 3, p)))
.collect::<BTreeSet<Version>>()
.into_iter()
.collect::<Vec<Version>>();

// map versions to x coordinates
let version_to_x = versions
.iter()
.enumerate()
.map(|(i, version)| (version, i as i32))
.collect::<HashMap<&Version, i32>>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: wonder if its possible to simplify this. Try to avoid doing .collect more than once, since that allocates. The cool thing about operations on iter (map, chain, filter) is that they are lazy.. meaning that it doest create a collection at each step. A collection is only created when you do collect so its ideal to have only one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I chained the collect::<BTreeSet<Version>>() and collect::Vec<Version>() calls because I need to sort and remove duplicates from the iterator, but I think extending a set from one get_unique_versions() call would be better; just changed it.

- Since the benches are run over a long time, noise on the machine can cause variability, as seen in the throughput graph.
- The variability can be seen with throughput especially because it is calculated as the inverse of time taken.

![historical-perf-handshake](https://github.com/tinzh/s2n-tls/assets/76919968/b6448634-e6d1-4724-ab91-7efc26485274)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Putting the charts in the markdown makes sense I think, since you're able to add some context about the results here and it isn't just an image checked in by itself. However, in this case, I think I'd rather the images be in the s2n-tls repo than linked to in your fork. Maybes in bench/images or something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok got it. Do you see all of the images, including the ones from memory benching, living here? Or just these particular images because they're in the readme (because they take a while to generate?)

Copy link
Contributor

@goatgoose goatgoose Jul 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it necessarily makes sense to have standalone images committed into the repo. It seems better if they're included as part of the documentation, so you can give some context for them like you did in this readme. So I think images should just be included that are used in the readme.

I think it also makes sense particularly for the historical benchmarking because the single snapshot is useful. For other benchmarking, showing results from old versions (after we create releases and don't update the charts) doesn't seem as useful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok got it. I'll add *.svg back into the gitignore but just specifically allowlist the historical-perf svgs. Maybe other generated images can still be put into images/, but they just won't be checked in.

bindings/rust/bench/README.md Show resolved Hide resolved
bindings/rust/bench/src/bin/graph_perf.rs Outdated Show resolved Hide resolved
@tinzh tinzh requested a review from goatgoose July 19, 2023 18:10
Comment on lines +21 to +25
# make Cargo.toml point s2n-tls to the cloned old version
sed -i "s|s2n-tls = .*|s2n-tls = { path = \"target/s2n-tls/bindings/rust/s2n-tls\" }|" Cargo.toml

# ensure Cargo.toml gets changed back on exit; retains original exit status
trap "{ status=$?; sed -i 's|s2n-tls = .*|s2n-tls = { path = \"../s2n-tls\" }|' $bench_path/Cargo.toml; exit $status; }" EXIT
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh I missed this. Modifying the source is pretty hacky. If you really want to do this then I would suggest a Cargo.template which you modify and we can only run the project from the shell scripts.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would the Cargo.template work? Is it just a copy of the Cargo.toml that's modified with the new path? Or is it a special cargo feature?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can specify a different Cargo.toml, but --manifest-path requires the different toml file to be called exactly Cargo.toml, so it'd need to be stored in a different directory. However, storing the Cargo.toml in a different directory changes all of the path in the Cargo.toml since it's all relative to where the Cargo.toml lives. I really couldn't find a better option. Comment above talking about this: #4083 (comment)

Also, I wanted everything to run with cargo bench without a separate run script, since it makes the most sense for the user experience. Yes this is a little hacky, but I think it's well worth the ease of use, and it seems to be the only way to do it.

@tinzh tinzh enabled auto-merge (squash) July 21, 2023 20:55
@dougch dougch added the s2n-core team label Jul 24, 2023
@tinzh tinzh merged commit 20b0174 into aws:main Jul 24, 2023
27 of 28 checks passed
@tinzh tinzh deleted the historical-perf branch July 25, 2023 17:20
goatgoose added a commit to goatgoose/s2n-tls that referenced this pull request Jul 31, 2023
commit eafb8a2
Author: Sam Clark <3758302+goatgoose@users.noreply.github.com>
Date:   Mon Jul 31 18:05:50 2023 -0400

    shell -> bash

commit 12071b8
Author: Sam Clark <3758302+goatgoose@users.noreply.github.com>
Date:   Mon Jul 31 17:59:09 2023 -0400

    add ubuntu quickstart back to readme

commit 10bf557
Author: Sam Clark <3758302+goatgoose@users.noreply.github.com>
Date:   Mon Jul 31 17:52:06 2023 -0400

    fixes

commit 74adf8d
Author: Sam Clark <3758302+goatgoose@users.noreply.github.com>
Date:   Mon Jul 31 16:46:19 2023 -0400

    fixes

commit 0548d07
Author: Sam Clark <3758302+goatgoose@users.noreply.github.com>
Date:   Mon Jul 31 16:43:08 2023 -0400

    consolidate

commit cbe8f2d
Author: Sam Clark <3758302+goatgoose@users.noreply.github.com>
Date:   Mon Jul 31 14:55:30 2023 -0400

    remove old doc sections

commit f194321
Author: Sam Clark <3758302+goatgoose@users.noreply.github.com>
Date:   Mon Jul 31 12:25:28 2023 -0400

    more content

commit 882eb1d
Author: Sam Clark <3758302+goatgoose@users.noreply.github.com>
Date:   Mon Jul 31 09:08:24 2023 -0400

    fixes

commit ce37d0e
Author: Sam Clark <3758302+goatgoose@users.noreply.github.com>
Date:   Mon Jul 31 09:03:45 2023 -0400

    fixes

commit 011d15f
Author: Sam Clark <3758302+goatgoose@users.noreply.github.com>
Date:   Sat Jul 29 22:59:51 2023 -0400

    cmake consuming

commit 7feadc1
Author: Sam Clark <3758302+goatgoose@users.noreply.github.com>
Date:   Sat Jul 29 22:27:02 2023 -0400

    fixes

commit 2914950
Author: Sam Clark <3758302+goatgoose@users.noreply.github.com>
Date:   Sat Jul 29 21:34:24 2023 -0400

    traditional make

commit 02f9841
Author: Sam Clark <3758302+goatgoose@users.noreply.github.com>
Date:   Sat Jul 29 19:45:43 2023 -0400

    s2n-tls build section

commit 86c4983
Author: Sam Clark <3758302+goatgoose@users.noreply.github.com>
Date:   Sat Jul 29 11:56:32 2023 -0400

    Update build documentation

commit ea6d02a
Author: Sam Clark <3758302+goatgoose@users.noreply.github.com>
Date:   Fri Jul 28 16:49:21 2023 -0400

    bindings: release 0.0.35 (aws#4122)

commit 35d08ba
Author: Justin Zhang <76919968+tinzh@users.noreply.github.com>
Date:   Fri Jul 28 12:31:21 2023 -0700

    refactor(bench): separate out client and server connections in benching harness (aws#4113)

    Enables more better control of connections for benching experiments

commit 65e74ca
Author: Lindsay Stewart <slindsay@amazon.com>
Date:   Wed Jul 26 02:26:40 2023 -0700

    Print error for 32bit test (aws#4107)

commit b0b253e
Author: toidiu <apoorv@toidiu.com>
Date:   Wed Jul 26 00:30:44 2023 -0700

    ktls: set keys on socket and enable ktls (aws#4071)

commit 403d5e6
Author: Lindsay Stewart <slindsay@amazon.com>
Date:   Tue Jul 25 16:03:09 2023 -0700

    Trying to use an invalid ticket should not mutate state (aws#4110)

commit bce2b1a
Author: James Mayclin <maycj@amazon.com>
Date:   Tue Jul 25 14:44:33 2023 -0700

    fix: get_session behavior for TLS 1.3 (aws#4104)

commit 6881358
Author: Justin Zhang <76919968+tinzh@users.noreply.github.com>
Date:   Tue Jul 25 10:10:21 2023 -0700

    feat(bench): add different certificate signature algorithms to benchmarks (aws#4080)

commit aab13d5
Author: Justin Zhang <76919968+tinzh@users.noreply.github.com>
Date:   Mon Jul 24 18:17:30 2023 -0700

    feat(bench): add memory bench with valgrind/massif (aws#4081)

commit 20b0174
Author: Justin Zhang <76919968+tinzh@users.noreply.github.com>
Date:   Mon Jul 24 13:26:32 2023 -0700

    feat(bench): add historical performance benchmark (aws#4083)

commit 5cc827d
Author: Doug Chapman <54039637+dougch@users.noreply.github.com>
Date:   Thu Jul 20 11:50:50 2023 -0700

    nix: pin corretto version (aws#4103)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants