
Run load testing benchmarks comparison for each commit via Github Actions #355

Merged: 21 commits merged into static-web-server:master on Apr 24, 2024

Conversation

@palant (Contributor) commented Apr 21, 2024

Description

This is rather crude but it works. The benchmarking results are remarkably consistent; I didn’t expect that. The downside: someone has to go and check/compare them manually. There are textual results in the action output, and there is a downloadable plot and binary data as the action artifact, but there is no overview of how it develops over time. This action also takes a while despite the caching, somewhere between 3 and 4 minutes, even though it only tests two things: serving a static file and listing a directory.
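Roughly, the core of such a benchmark step looks like the following (an illustrative sketch only; the URL, port, rate, duration and file names are placeholders, not the actual perfcheck.yml contents):

```yaml
- name: Run load test with vegeta
  run: |
    # Hit one target at a fixed rate for a fixed duration and keep the
    # raw binary results for later reporting/plotting.
    echo "GET http://localhost:8787/README.md" | \
      vegeta attack -duration=10s -rate=500 > results.bin
    # The textual report goes into the action output ...
    vegeta report results.bin
    # ... and the HTML plot is kept as a downloadable artifact.
    vegeta plot results.bin > plot.html

- name: Upload benchmark artifacts
  uses: actions/upload-artifact@v4
  with:
    name: vegeta-results
    path: |
      results.bin
      plot.html
```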

Vegeta’s choice of colors could be better. In the image below, the lower yellow dots belong to SWS (static file), the upper ones to Apache (directory listing).

The benchmarks clearly show SWS to be the slowest web server tested. In part this may be due to the lack of sendfile and memory caching capabilities. I doubt that Apache has the latter by default, however, and its directory listings are still almost twice as fast. So there is clearly work ahead.

Related Issue

#352

How Has This Been Tested?

I tried it out in my repository fork.

Screenshots (if appropriate):

vegeta-plot


@palant (Contributor, Author) commented Apr 21, 2024

I’ve produced another benchmark for the v2.28.0 tag, just to see whether there was a recent performance regression. No, it doesn’t look like it.
vegeta-plot
I’ve also tried benchmarking the v2.0.3 tag (here without --features=all, since feature selection isn’t supported there), which turned out to be considerably slower.
vegeta-plot

@palant (Contributor, Author) commented Apr 21, 2024

Enabling compression in other web servers moved their results closer to SWS:

vegeta-plot

| Software | Static file [min, mean, 50, 90, 95, 99, max] | Directory listing [min, mean, 50, 90, 95, 99, max] |
| --- | --- | --- |
| SWS | 286.604µs, 352.64µs, 344.91µs, 388.785µs, 413.808µs, 522.805µs, 1.442ms | 565.272µs, 887.106µs, 882.876µs, 951.305µs, 974.732µs, 1.064ms, 1.753ms |
| Apache | 399.944µs, 477.982µs, 467.852µs, 519.752µs, 559.698µs, 666.654µs, 1.404ms | 515.379µs, 576.061µs, 561.651µs, 621.962µs, 694.457µs, 767.767µs, 1.537ms |
| lighttpd | 267.879µs, 342.65µs, 330.91µs, 381.633µs, 421.928µs, 565.361µs, 1.491ms | 389.277µs, 476.196µs, 455.193µs, 524.572µs, 636.127µs, 805.791µs, 2.894ms |
| nginx | 238.107µs, 310.438µs, 301.219µs, 347.77µs, 390.524µs, 486.028µs, 1.21ms | 194.036µs, 240.688µs, 227.065µs, 293.748µs, 347.816µs, 399.542µs, 971.213µs |

The nginx directory listing is remarkably fast; I might check later whether compression really applies there.
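One quick way to check would be to look for a Content-Encoding header on the listing response, e.g. as a throwaway workflow step (a sketch only; the port and path are assumptions):

```yaml
- name: Check whether the nginx directory listing is compressed
  run: |
    # If compression applies, the response should carry a Content-Encoding header.
    curl -sSI -H 'Accept-Encoding: gzip' http://localhost:8088/dir/ \
      | grep -i '^content-encoding' \
      || echo "no Content-Encoding header, so the listing was served uncompressed"
```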

@joseluisq joseluisq added v2 v2 release performance Related to server performance ci Related to CI/CD benchmarks Benchmarks related topic labels Apr 21, 2024
@palant (Contributor, Author) commented Apr 21, 2024

Since the plot produced by vegeta is less than optimal, I used gnuplot to produce overview graphs like the following. I’m not very happy with gnuplot either (e.g. label positions above the bar charts are font-dependent), but I couldn’t find a better tool for this kind of graph.
overview

@joseluisq joseluisq changed the title Run benchmarks for each commit via Github Actions Run load testing benchmarks comparison for each commit via Github Actions Apr 21, 2024
@palant (Contributor, Author) commented Apr 21, 2024

I actually found something better than gnuplot: graph-cli. It made things way simpler; the overview graphs now look like this:
overview

@joseluisq (Collaborator) commented:

I like the idea. I assume that you will continue improving it.
Here are a few general thoughts.

The benchmarks clearly show SWS as being the slowest web server tested. In part this may be due to lack of sendfile and memory caching capabilities.

I wonder whether it is worth enabling sendfile and memory caching for the other web servers in the comparison, given that SWS does not support them yet. Maybe we could enable them once SWS gets those features.

Also, it would be great to consider a benchmark graph that shows the resources used by each server.

And, this other server could be added to the list static-web-server/benchmarks#2

@palant (Contributor, Author) commented Apr 21, 2024

I wonder whether it is worth enabling sendfile and memory caching for the other web servers in the comparison, given that SWS does not support them yet. Maybe we could enable them once SWS gets those features.

Sendfile is no use with dynamic compression, so it’s effectively disabled already. As to memory caching, I didn’t enable it explicitly but maybe it’s enabled by default somewhere.

Frankly, I’m not sure that this is any good for comparing performance of web servers. It’s very hard to produce configurations that would be in any way comparable while also being realistic. For example, how do you compare SWS with --features=all to Apache or lighttpd that have almost no modules loaded? And if you load more modules, is it still a realistic scenario? But it is useful to show the baseline for SWS, e.g. that directory listings can be a lot faster. Primarily it should still indicate changes in SWS performance however.

Also, it would be great to consider a benchmark graph that shows the resources used by each server.

Sure, as soon as I have an idea how one would measure that. 😅

And, this other server could be added to the list static-web-server/benchmarks#2

So far I’ve focused on software that can be installed via the Ubuntu package manager. I’m definitely not going to compile lwan in this Github Action, and I’m not even sure there is a third-party Ubuntu repository with a pre-compiled binary (not that I am very keen on using one if it exists). Also keep in mind: benchmarking another piece of software adds at least half a minute of runtime to this action, so it will delay the checks for pull requests even further.

@joseluisq (Collaborator) commented:

Sendfile is no use with dynamic compression, so it’s effectively disabled already. As to memory caching, I didn’t enable it explicitly but maybe it’s enabled by default somewhere.

OK. About memory caching: as far as I’ve learned, some servers like Nginx use a hybrid approach; see Nginx’s hybrid disk-and-memory cache strategy. I do not know about the others. But in any case, we can keep the feature if it turns out to be the default for those servers.

Frankly, I’m not sure that this is any good for comparing performance of web servers. It’s very hard to produce configurations that would be in any way comparable while also being realistic. For example, how do you compare SWS with --features=all to Apache or lighttpd that have almost no modules loaded? And if you load more modules, is it still a realistic scenario? But it is useful to show the baseline for SWS, e.g. that directory listings can be a lot faster. Primarily it should still indicate changes in SWS performance however.

OK, my point was more that if SWS does not support a feature like, say, memory caching, we should at least keep the benchmark configuration closer to what SWS offers, so that the overall comparison stays reasonably fair.
But regardless, the whole idea is to improve SWS performance rather than to compete here.

So far I’ve focused on software that can be installed via the Ubuntu package manager. I’m definitely not going to compile lwan in this Github Action, and I’m not even sure there is a third-party Ubuntu repository with a pre-compiled binary (not that I am very keen on using one if it exists). Also keep in mind: benchmarking another piece of software adds at least half a minute of runtime to this action, so it will delay the checks for pull requests even further.

I just mentioned it because of a request, but I get it. If there is no official binary available, only source code, then we can skip it for now.

@joseluisq (Collaborator) commented:

Looking at the graph in #355 (comment), SWS has decent performance for the static file compared to, for example, nginx.

Anyway, it would be great to have this new graph also generated for the current v2.28.0 to compare the differences.
I guess the goal is eventually to be able to track performance over time.

And about the directory listing: as you said, there is definitely room for improvement.

@joseluisq (Collaborator) commented:

BTW, in case you’re wondering: the CI FreeBSD tests run every time there is a change, which is not nice. I will take care of that.

@palant (Contributor, Author) commented Apr 22, 2024

Anyway, it would be great to have this new graph also generated for the current v2.28.0 to compare the differences.

Sure, this is the v2.28.0 release:

overview

Also, for comparison, here is the v2.27.0 release; the times are remarkably close:

overview

I think I will now make the test run on an actual test directory rather than the SWS source directory/README. Then I can use a matrix to split it up into five separate tests (sketched below):

  1. Directory listing
  2. Small static file (dynamically compressed)
  3. Small static file (precompressed)
  4. Large static file (dynamically compressed)
  5. Large static file (precompressed)

Each of these tests will produce its own overview. This should make the graphs easier to read, given how much directory listing and file times diverge. Tests 3 and 5 are where SWS should be at a disadvantage due to sendfile.
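For illustration, the matrix split could look roughly like this; a sketch under assumptions only, since the scenario names and the benchmark.sh helper script are made up here rather than taken from the actual workflow:

```yaml
jobs:
  perfcheck:
    runs-on: ubuntu-latest
    strategy:
      # Run all scenarios even if one fails, so the comparison stays complete.
      fail-fast: false
      matrix:
        scenario:
          - directory-listing
          - small-file-dynamic
          - small-file-precompressed
          - large-file-dynamic
          - large-file-precompressed
    steps:
      - uses: actions/checkout@v4
      # Hypothetical helper that starts the servers and runs vegeta
      # against the selected scenario.
      - name: Run benchmark scenario
        run: ./benchmark.sh "${{ matrix.scenario }}"
```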

@palant (Contributor, Author) commented Apr 22, 2024

It’s now nine tests, meaning nine graphs. But the run times actually went down a bit because these tests are run in parallel.

The results largely show SWS performing better than Apache and roughly on par with lighttpd. nginx performs best in every test except for producing a 404 error.

lighttpd has an unusual compression implementation, which complicates matters. It isn’t capable of redirecting to pre-compressed files, which is why the relevant tests have to request these files directly. Yet its dynamic compression caches compressed files, something that cannot be disabled, so it performs great whenever large files have to be dynamically compressed because it only compresses them once. lighttpd is also the software that required me to reduce the request rate here: by the time it manages to cache the compressed file, requests have already piled up and overload the server.

(nine overview plots: overview1 through overview9)

@palant (Contributor, Author) commented Apr 22, 2024

The job summary will now display tables with the benchmarking results, so one no longer needs to dig into logs or download artifacts to get an idea. Unfortunately, this only allows Markdown; I cannot see any way to upload images to embed them there.
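For reference, publishing such a table boils down to appending Markdown to the file behind $GITHUB_STEP_SUMMARY. A minimal sketch (the results.md intermediate file and the column set are assumptions, not the actual workflow contents):

```yaml
- name: Publish benchmark summary
  run: |
    {
      echo "### Benchmark results"
      echo ""
      echo "| Software | min | mean | p50 | p99 | max |"
      echo "| --- | --- | --- | --- | --- | --- |"
      # Table rows produced by an earlier (hypothetical) reporting step.
      cat results.md
    } >> "$GITHUB_STEP_SUMMARY"
```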

@palant (Contributor, Author) commented Apr 22, 2024

I think this is good enough now. For reference, here is a run based on v2.28.0: https://github.com/palant/static-web-server/actions/runs/8790531711. Comparing it to a run on the current development branch, significant differences always affect all web servers in pretty much the same way. So if there were performance regressions since the release, they aren’t significant enough to be measurable.

@joseluisq (Collaborator) left a comment:

Big step overall! We will get useful insights along the way.

I just left a few minor comments to address. Let me know if something is not clear.

Review threads (resolved): .github/workflows/perfcheck.yml (4), .github/workflows/testroot/filler01.txt (1)
@palant palant requested a review from joseluisq April 23, 2024 15:03
@joseluisq (Collaborator) left a comment:

Looks good enough for now.
Thanks!

@joseluisq joseluisq added the enhancement New feature or request label Apr 24, 2024
@joseluisq joseluisq merged commit a197f20 into static-web-server:master Apr 24, 2024
13 checks passed
@palant palant deleted the perfcheck branch April 24, 2024 04:27
@joseluisq joseluisq added this to the v2.30.0 milestone Apr 25, 2024