benchmarks - wrk overload testing files [ci skip] #2695

MSP-Greg · 2021-09-11T03:06:46Z

Description

After working with wrk for a while, I thought having a script that ran several sets would be helpful. Also, after each run, collecting Puma stats and output from smem would show additional info. The summary output is as below:

────────wrk────────  ─Request─time─distribution─(ms)─  Worker─requests  ─wrk─requests─
 -t    -c   req/sec   50%    75%    90%    99%   100%  spread   total     total   bad
 10    50    10504    1.3   24.6   38.8   48.2   58.9   0.29   158708    158708     0
 13    65    10536    1.4   34.1   52.8   64.7   74.4   0.28   159153    159153     0
 17    85    10580    1.5   46.3   71.2   86.4   96.5   0.19   159859    159859     0
 23   115    10627    1.7   65.5  100.1  121.0  131.5   0.27   160631    160631     0
 30   150    10604    2.4   86.9  132.4  159.6  170.9   0.22   160249    160249     0
             10570                                    Totals   798600    798600
══════════════════════════════════════════════════════════════════════════════════════

══════════════════════════════════ Memory Change per Request
WRK Threads            10     13     17     23     30
    Connections        50     65     85    115    150    All
    Av Resp Size    52974  52960  52995  53008  53000  52987
════════════════════════════════════════════════════════════
worker-0-00    USS     42     25     17    -17      9     15
worker-0-01    USS     23     21     19    -12      6     11
All            USS     32     23     18    -15      7     13
════════════════════════════════════════════════════════════

The top section shows data from the five wrk runs, which start at the total number of threads the Puma server has, and stop at three times that number. The 'spread' column shows how consistent the requests are spread between workers. It's interesting that as the load increases, throughput (req/sec) stays about the same, but the response time increases, especially in the slowest responses.

The bottom section shows memory used after each wrk run, and should remain low. Oddly, using this with Ruby head lead to the discover of a memory leak in Ruby head, as the 'All' column was over 500 bytes per request.

Some notes:

Since CI is not consistent enough to check performance, a good wrk script is helpful for determining whether changes improve performance.
The code is commented in overload/benchmarks/local/bench_overload_wrk.rb. It uses smem for memory info, and also uses @ioquatix's fork of wrk. It only works with Puma in cluster mode (with workers).
wrk doesn't support UNIXSockets, so I may try to get code running using some of the files in Add new test framework files, test for PR #2675 #2694, which shares rackup files with this PR.

Your checklist for this pull request

I have reviewed the guidelines for contributing to this repository.
I have added (or updated) appropriate tests if this PR fixes a bug or adds a feature.
My pull request is 100 lines added/removed or less so that it can be easily reviewed.
If this PR doesn't need tests (docs change), I added [ci skip] to the title of the PR.
If this closes any issues, I have added "Closes #issue" to the PR description or my commit messages.
I have updated the documentation accordingly.
All new and existing tests passed, including Rubocop.

jacobherrington

This is a cool idea!

I noticed some nit-picky things (that might have been left in on purpose) and left notes on them, but you're welcome to disregard them. 😄

jacobherrington · 2021-09-24T01:13:40Z

benchmarks/local/bench_base.rb

+      setup_options
+
+      unless File.exist? @state_file
+        puts "can't fined state file '#{@state_file}'"


I noticed a small typo here! 🙈

Suggested change

puts "can't fined state file '#{@state_file}'"

puts "can't find state file '#{@state_file}'"

jacobherrington · 2021-09-24T01:15:36Z

benchmarks/local/bench_base.rb

+
+      print " |#{wrk_data[/^ +\d+ +requests.+/].rstrip}\n"
+
+      # puts '', wrk_data, ''     # for debugging or output format changes


Should this be removed or uncommented?

Suggested change

# puts '', wrk_data, '' # for debugging or output format changes

jacobherrington · 2021-09-24T01:16:29Z

benchmarks/local/bench_base.rb

+
+    def parse_stats
+      obj = @puma_info.run 'stats'
+      # puts ''; pp obj; puts ''


Same as above!

Suggested change

# puts ''; pp obj; puts ''

jacobherrington · 2021-09-24T01:17:56Z

benchmarks/local/bench_base.rb

+      pids = @pids.keys
+
+      smem_info = %x[smem -c 'pid rss pss uss command']
+      # puts '', smem_info, ''


Same as above!

Suggested change

# puts '', smem_info, ''

jacobherrington · 2021-09-24T01:18:17Z

benchmarks/local/bench_overload_wrk.rb

+      overall_summary
+      leak_calc
+      puts RUBY_DESCRIPTION, ''
+      # puts ''; pp @info; puts ''


Same as above!

Suggested change

# puts ''; pp @info; puts ''

ioquatix · 2021-09-24T01:26:09Z

To confirm, by USS you mean unique set size right? i.e. private non-sharable data.

MSP-Greg · 2021-09-25T21:04:01Z

@jacobherrington

Thanks for reviewing and trying it.

I've updated things quite a bit, adding another generic rackup file, which, based on a request header, switches the body type between array, chunked, string, or io, and also the body size. I'll be adding some more benchmark scripts. I'll remove all the debug code as your comments above.

The code currently outputs to the console, but I've set it up so all the measurement data is a Ruby object, so it be can used for 'asserts' or pass/fail code. CI machines vary quite a bit, and I haven't (yet) figured out a way to determine how 'fast' they're running. But, if that can be done, we can add some performance testing to CI...

nateberkopec · 2021-09-26T15:58:04Z

test/rackup/hello.ru

@@ -1 +1,3 @@
-run lambda { |env| [200, {"Content-Type" => "text/plain"}, ["Hello World"]] }
+hdrs = {'Content-Type'.freeze => 'text/plain'.freeze}.freeze


Freezing these is a good call - we're here to benchmark Puma's overhead, not the Rack app 👍

nateberkopec

I'm good with this merging whenever you're ready Greg

MSP-Greg · 2021-09-29T14:50:33Z

@nateberkopec

Thanks. I'm still working on it, what I've expanded it to is a wrk script and a 'ruby connections' script that both are able to test the four types of request bodies (array, chunked, string, and IO), and also allow specifying the size of the body. Without a parameter for the body type/size, they run a matrix of all the combinations, but you can select a vector of the matrix, or a single 'cell/element'. The output shows requests per second, response time, response size, etc.

The 'ruby connections' script was using code that I originally wrote to send a stream of connections for use in CI tests, for things like restarts, shutdown, etc. Without a reasonably fast connection stream, testing for those state changes in Puma might not show real metrics about request handling, etc.

A couple of things:

One might ask 'why the Ruby script?' First, wrk doesn't support UNIXSockets. Secondly, 'requests per second' (RPS) is somewhat separate from response time (RT). With 'wrk', one can fill up the listener backlog. When that happens, RPS stays almost the same, but RT increases, first in the longest 20% or so, then starts to affect all the requests. One can query for the backlog size with a TCP listener, and I added it to stats so I could see it's value. I haven't looked at whether it can be added to stats for production.

Anyway, I'm still working on it, and also working on comments/documentation. Even, today, I'm still renaming methods that aren't clear...

MSP-Greg · 2022-01-01T18:09:46Z

I'll be posting a large update to this, please do not merge! Sorry for the delay...

MSP-Greg · 2022-09-28T03:04:31Z

This code is similar to the code already committed in PR #2895. That code used a constant wrk setup, and allowed a matrix of body types and sizes. The code here increases the load on Puma, and shows behavior in terms of RPS (requests per sec) and metrics on response time. Or, how Puma behaves as the incoming requests increase to the point of overloading it.

Given that there are several other issues that I'd like to work on, and also the merge conflicts shown above, closing with the expectation that I'll hoepfully reopen with updated code.

MSP-Greg added 2 commits September 10, 2021 19:58

benchmarks/local - overload files

af31f32

test/rackup files

dacf8fc

MSP-Greg mentioned this pull request Sep 11, 2021

io_buffer.rb, request.rb - improve handling with a body array/enumeration #2696

Closed

7 tasks

nateberkopec added perf waiting-for-review Waiting on review from anyone labels Sep 16, 2021

jacobherrington reviewed Sep 24, 2021

View reviewed changes

nateberkopec reviewed Sep 26, 2021

View reviewed changes

nateberkopec approved these changes Sep 29, 2021

View reviewed changes

nateberkopec added waiting-for-changes Waiting on changes from the requestor and removed waiting-for-review Waiting on review from anyone labels Jan 18, 2022

MSP-Greg marked this pull request as draft September 27, 2022 18:25

MSP-Greg closed this Sep 28, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

benchmarks - wrk overload testing files [ci skip] #2695

benchmarks - wrk overload testing files [ci skip] #2695

MSP-Greg commented Sep 11, 2021

jacobherrington left a comment

jacobherrington Sep 24, 2021

jacobherrington Sep 24, 2021

jacobherrington Sep 24, 2021

jacobherrington Sep 24, 2021

jacobherrington Sep 24, 2021

ioquatix commented Sep 24, 2021

MSP-Greg commented Sep 25, 2021

nateberkopec Sep 26, 2021

nateberkopec left a comment

MSP-Greg commented Sep 29, 2021

MSP-Greg commented Jan 1, 2022

MSP-Greg commented Sep 28, 2022

	puts "can't fined state file '#{@state_file}'"
	puts "can't find state file '#{@state_file}'"


		print " \|#{wrk_data[/^ +\d+ +requests.+/].rstrip}\n"

		# puts '', wrk_data, '' # for debugging or output format changes

		@@ -1 +1,3 @@
		run lambda { \|env\| [200, {"Content-Type" => "text/plain"}, ["Hello World"]] }
		hdrs = {'Content-Type'.freeze => 'text/plain'.freeze}.freeze

benchmarks - wrk overload testing files [ci skip] #2695

benchmarks - wrk overload testing files [ci skip] #2695

Conversation

MSP-Greg commented Sep 11, 2021

Description

Your checklist for this pull request

jacobherrington left a comment

Choose a reason for hiding this comment

jacobherrington Sep 24, 2021

Choose a reason for hiding this comment

jacobherrington Sep 24, 2021

Choose a reason for hiding this comment

jacobherrington Sep 24, 2021

Choose a reason for hiding this comment

jacobherrington Sep 24, 2021

Choose a reason for hiding this comment

jacobherrington Sep 24, 2021

Choose a reason for hiding this comment

ioquatix commented Sep 24, 2021

MSP-Greg commented Sep 25, 2021

nateberkopec Sep 26, 2021

Choose a reason for hiding this comment

nateberkopec left a comment

Choose a reason for hiding this comment

MSP-Greg commented Sep 29, 2021

MSP-Greg commented Jan 1, 2022

MSP-Greg commented Sep 28, 2022