Increase runtime of performance tests to O(10 ms) to increase SNR #2063

Merged


simonjbeaumont
Contributor

@simonjbeaumont simonjbeaumont commented Mar 9, 2022

Motivation:

The current running time of some of the performance benchmarks is O(1 ms) which can lead to low signal in the results. It would be better if these tests ran for O(10 ms).

Modifications:

  • Increase the iteration counts in all the tests within NIOPerformanceTester to aim for O(10 ms) measurements with the current code.
  • Exception: execute_100k_tasks which is still too quick but the CI cannot handle 1M tasks.

Result:

  • All benchmark bodies should now take O(10 ms).

Notes:

The scaling factors were determined by looking at the current column from a recent report on #2059:

$ python3 scaling_factors.py perf-output
Loading file: perf-output...
Loaded 62 results.
Scaling tests so that runtime is O(10 ms)

name                                                               current_s    O(current_s)    O(desired_s)  scale_by
write_http_headers                                                 0.004221364  1E-3            1E-2          1E1
http_headers_canonical_form                                        0.085858433  1E-2            1E-2          1E0
http_headers_canonical_form_trimming_whitespace                    0.162971487  1E-1            1E-2          1E-1
http_headers_canonical_form_trimming_whitespace_from_short_string  0.148830606  1E-1            1E-2          1E-1
http_headers_canonical_form_trimming_whitespace_from_long_string   0.229662477  1E-1            1E-2          1E-1
bytebuffer_write_12MB_short_string_literals                        0.151619987  1E-1            1E-2          1E-1
bytebuffer_write_12MB_short_calculated_strings                     0.297283243  1E-1            1E-2          1E-1
bytebuffer_write_12MB_medium_string_literals                       0.058404425  1E-2            1E-2          1E0
bytebuffer_write_12MB_medium_calculated_strings                    0.15610171   1E-1            1E-2          1E-1
bytebuffer_write_12MB_large_calculated_strings                     0.142773805  1E-1            1E-2          1E-1
bytebuffer_lots_of_rw                                              0.433748776  1E-1            1E-2          1E-1
bytebuffer_write_http_response_ascii_only_as_string                0.039572042  1E-2            1E-2          1E0
bytebuffer_write_http_response_ascii_only_as_staticstring          0.030672688  1E-2            1E-2          1E0
bytebuffer_write_http_response_some_nonascii_as_string             0.039276667  1E-2            1E-2          1E0
bytebuffer_write_http_response_some_nonascii_as_staticstring       0.030416912  1E-2            1E-2          1E0
no-net_http1_10k_reqs_1_conn                                       0.105070526  1E-1            1E-2          1E-1
http1_10k_reqs_1_conn                                              0.589343003  1E-1            1E-2          1E-1
http1_10k_reqs_100_conns                                           0.581899912  1E-1            1E-2          1E-1
future_whenallsucceed_100k_immediately_succeeded_off_loop          0.070154794  1E-2            1E-2          1E0
future_whenallsucceed_100k_immediately_succeeded_on_loop           0.071029944  1E-2            1E-2          1E0
future_whenallsucceed_100k_deferred_off_loop                       0.331273236  1E-1            1E-2          1E-1
future_whenallsucceed_100k_deferred_on_loop                        0.121564379  1E-1            1E-2          1E-1
future_whenallcomplete_100k_immediately_succeeded_off_loop         0.030060621  1E-2            1E-2          1E0
future_whenallcomplete_100k_immediately_succeeded_on_loop          0.030060624  1E-2            1E-2          1E0
future_whenallcomplete_100k_deferred_off_loop                      0.258080927  1E-1            1E-2          1E-1
future_whenallcomplete_100k_deferred_on_loop                       0.061702521  1E-2            1E-2          1E0
future_reduce_10k_futures                                          0.03736344   1E-2            1E-2          1E0
future_reduce_into_10k_futures                                     0.035719515  1E-2            1E-2          1E0
channel_pipeline_1m_events                                         0.095089666  1E-2            1E-2          1E0
websocket_encode_50b_space_at_front_1m_frames_cow                  0.491598504  1E-1            1E-2          1E-1
websocket_encode_50b_space_at_front_1m_frames_cow_masking          0.065660544  1E-2            1E-2          1E0
websocket_encode_1kb_space_at_front_100k_frames_cow                0.05179348   1E-2            1E-2          1E0
websocket_encode_50b_no_space_at_front_1m_frames_cow               0.492513012  1E-1            1E-2          1E-1
websocket_encode_1kb_no_space_at_front_100k_frames_cow             0.051848542  1E-2            1E-2          1E0
websocket_encode_50b_space_at_front_10k_frames                     0.006663117  1E-3            1E-2          1E1
websocket_encode_50b_space_at_front_10k_frames_masking             0.083932953  1E-2            1E-2          1E0
websocket_encode_1kb_space_at_front_1k_frames                      0.001293064  1E-3            1E-2          1E1
websocket_encode_50b_no_space_at_front_10k_frames                  0.006696183  1E-3            1E-2          1E1
websocket_encode_1kb_no_space_at_front_1k_frames                   0.001222385  1E-3            1E-2          1E1
websocket_decode_125b_100k_frames                                  0.115220387  1E-1            1E-2          1E-1
websocket_decode_125b_with_a_masking_key_100k_frames               0.118480726  1E-1            1E-2          1E-1
websocket_decode_64kb_100k_frames                                  0.118017645  1E-1            1E-2          1E-1
websocket_decode_64kb_with_a_masking_key_100k_frames               0.120845408  1E-1            1E-2          1E-1
websocket_decode_64kb_+1_100k_frames                               0.117700529  1E-1            1E-2          1E-1
websocket_decode_64kb_+1_with_a_masking_key_100k_frames            0.12086373   1E-1            1E-2          1E-1
circular_buffer_into_byte_buffer_1kb                               0.036630641  1E-2            1E-2          1E0
circular_buffer_into_byte_buffer_1mb                               0.07287621   1E-2            1E-2          1E0
byte_buffer_view_iterator_1mb                                      0.020055745  1E-2            1E-2          1E0
byte_buffer_view_contains_12mb                                     0.207476717  1E-1            1E-2          1E-1
byte_to_message_decoder_decode_many_small                          0.170375659  1E-1            1E-2          1E-1
generate_10k_random_request_keys                                   0.089314066  1E-2            1E-2          1E0
bytebuffer_rw_10_uint32s                                           0.28914708   1E-1            1E-2          1E-1
bytebuffer_multi_rw_10_uint32s                                     0.05438756   1E-2            1E-2          1E0
lock_1_thread_10M_ops                                              0.155880855  1E-1            1E-2          1E-1
lock_2_threads_10M_ops                                             0.872071354  1E-1            1E-2          1E-1
lock_4_threads_10M_ops                                             0.944893321  1E-1            1E-2          1E-1
lock_8_threads_10M_ops                                             0.905209719  1E-1            1E-2          1E-1
schedule_10000_tasks                                               0.005108103  1E-3            1E-2          1E1
schedule_and_run_10000_tasks                                       0.031540699  1E-2            1E-2          1E0
execute_10000                                                      0.016732096  1E-2            1E-2          1E0
bytebufferview_copy_to_array_1000_times_1kb                        0.000123618  1E-4            1E-2          1E2
circularbuffer_copy_to_array_1000_times_1kb                        0.001885264  1E-3            1E-2          1E1
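
The `scaling_factors.py` script itself is not shown in this thread. As a rough sketch of the arithmetic behind the table, assuming the `O(...)` columns are simply the measurement rounded down to a power of ten, the factor could be derived like this (the function names are hypothetical):

```python
import math


def order_of_magnitude(seconds: float) -> int:
    """Return the base-10 exponent of a positive duration, rounded down."""
    return math.floor(math.log10(seconds))


def scale_factor_exponent(current_s: float, desired_exp: int = -2) -> int:
    """Exponent by which to scale the iteration count to land at 10**desired_exp seconds."""
    return desired_exp - order_of_magnitude(current_s)


# Measurements copied from the table above.
for name, current_s in [
    ("write_http_headers", 0.004221364),
    ("http_headers_canonical_form", 0.085858433),
    ("bytebuffer_lots_of_rw", 0.433748776),
    ("bytebufferview_copy_to_array_1000_times_1kb", 0.000123618),
]:
    exp = scale_factor_exponent(current_s)
    print(f"{name}: O(1E{order_of_magnitude(current_s)}) -> scale_by 1E{exp}")
```

Note that this only works if `current_s` really is in seconds; feeding it a value in the wrong unit shifts every factor by the same power of ten.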

@simonjbeaumont simonjbeaumont force-pushed the run-longer-benchmarks-to-increase-snr branch 2 times, most recently from 1260844 to a75b7c9 Compare March 9, 2022 12:21
@Lukasa Lukasa added the needs-no-version-bump For PRs that when merged do not need a bump in version number. label Mar 9, 2022
@Lukasa
Contributor

Lukasa commented Mar 9, 2022

@swift-nio-bot test perf please

@swift-server-bot

!!! Couldn't read commit file !!!

@Lukasa
Contributor

Lukasa commented Mar 9, 2022

Huh, I wonder if we overshot: the build timed out after 10 minutes.

@Lukasa
Contributor

Lukasa commented Mar 9, 2022

Ah, I see why: the output numbers in the chart aren't ns, they're fractional seconds. These scaling factors are all about 10^9 too high.

@simonjbeaumont
Contributor Author

Ah, I see why: the output numbers in the chart aren't ns, they're fractional seconds. These scaling factors are all about 10^9 too high.

🤦 Should have traced my way through to the outputs, I just read this which deals in nanoseconds.

@simonjbeaumont simonjbeaumont force-pushed the run-longer-benchmarks-to-increase-snr branch from a75b7c9 to 9d3eb26 Compare March 9, 2022 14:44
@simonjbeaumont
Contributor Author

@swift-nio-bot test perf please

@swift-server-bot

!!! Couldn't read commit file !!!

@simonjbeaumont
Contributor Author

simonjbeaumont commented Mar 9, 2022

@Lukasa I have reworked the script to work out a scaling factor with appropriate units now and updated the PR description with its output. I've also adjusted all the iterations to be in line with what we're looking for: an O(100 ms) hot loop.

the build timed out after 10 minutes.

Well FWICT each test is run 10 times and an average is taken:

14:51:57 measuring: websocket_decode_64kb_with_a_masking_key_100k_frames: 0.125121872, 0.124363388, 0.124045883, 0.12475775, 0.124550551, 0.123625381, 0.124706225, 0.124888407, 0.124184521, 0.124370663, 

There are 62 tests, and each run of a test takes O(100 ms), so a single run is upper-bounded by 1 second. With 10 runs per test that is up to 620 seconds, which means we could slightly overshoot 10 mins in the worst case just with the hot loops. We'd need some headroom to allow for warmups and compile time.

The current run got this far...

14:51:57 measuring: websocket_decode_64kb_+1_100k_frames: 0.12155783, 0.121102395, 0.121057444, 0.120217503, 0.121161953, 0.12120Build step 'Execute shell' marked build as failure

...which was test 44/62.

Looking at the output, it seems that we have managed to get O(100 ms) for all the bodies though, so if we want this, I guess we should increase the CI timeout?

@Lukasa
Contributor

Lukasa commented Mar 9, 2022

Hmm, does that math check out? If each test run is ~100ms, then 10 test runs is 1 second, and 62 tests adds up to 62 seconds, a.k.a 1 minute. That shouldn't be a problem, should it?

@Lukasa
Contributor

Lukasa commented Mar 9, 2022

Ah, but many of these are closer to 1s per run. Can we drop an order of magnitude off them all, so none of them go above 100ms? That should make this fly by.

@simonjbeaumont
Contributor Author

simonjbeaumont commented Mar 9, 2022

Hmm, does that math check out? If each test run is ~100ms, then 10 test runs is 1 second, and 62 tests adds up to 62 seconds, a.k.a 1 minute. That shouldn't be a problem, should it?

The maths checks out with O(100ms) rather than ~100ms. There's an order of magnitude gap in the worst case. I was thinking we were aiming for "hundreds of ms".
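
To make the two readings concrete, here is a back-of-the-envelope sketch of the hot-loop time only. The 62 tests, 10 runs per test, and 10-minute timeout come from this thread; the per-run durations are the two assumptions being compared, and warm-up and compile time are deliberately left out:

```python
TESTS = 62              # benchmarks in the report
RUNS_PER_TEST = 10      # each benchmark is measured 10 times
CI_TIMEOUT_S = 10 * 60  # the build timed out after 10 minutes


def total_hot_loop_seconds(per_run_s: float) -> float:
    """Time spent in measured runs only; warm-up and compilation are excluded."""
    return TESTS * RUNS_PER_TEST * per_run_s


for label, per_run_s in [
    ("~100 ms per run", 0.1),
    ("worst case for O(100 ms), close to 1 s per run", 1.0),
]:
    total = total_hot_loop_seconds(per_run_s)
    verdict = "exceeds" if total > CI_TIMEOUT_S else "fits within"
    print(f"{label}: {total:.0f} s, {verdict} the {CI_TIMEOUT_S} s timeout")
```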

Ah, but many of these are closer to 1s per run. Can we drop an order of magnitude off them all, so none of them go above 100ms? That should make this fly by.

OK... do you want to include reducing existing ones that currently take many hundreds of ms? E.g. the locking ones?

If so, I can rinse and repeat this exercise to shoot for "tens of ms".

@Lukasa
Contributor

Lukasa commented Mar 10, 2022

Yeah, I think we should probably shoot for tens of ms.

@simonjbeaumont simonjbeaumont force-pushed the run-longer-benchmarks-to-increase-snr branch from 9d3eb26 to 079cc84 Compare March 16, 2022 20:59
@simonjbeaumont
Contributor Author

@swift-nio-bot test perf please

@swift-server-bot

!!! Couldn't read commit file !!!

@simonjbeaumont
Contributor Author

@swift-nio-bot test perf please

@simonjbeaumont simonjbeaumont changed the title from "Increase runtime of performance tests to O(100 ms) to increase SNR" to "Increase runtime of performance tests to O(10 ms) to increase SNR" Mar 16, 2022
@swift-server-bot

!!! Couldn't read commit file !!!

@simonjbeaumont
Contributor Author

!!! Couldn't read commit file !!!

@Lukasa what's the story here? I thought maybe this was because it was out of date with main but I merged in main and still get the same error.

@Lukasa
Contributor

Lukasa commented Mar 17, 2022

The run seems to have silently failed. Let's ask it again, but if it happens again we'll need to investigate why the perf runner failed to complete.

@Lukasa
Contributor

Lukasa commented Mar 17, 2022

@swift-nio-bot test perf please

@swift-server-bot

!!! Couldn't read commit file !!!

@Lukasa
Contributor

Lukasa commented Mar 17, 2022

Yeah that implies an issue. I'll try to reproduce it.

@Lukasa
Contributor

Lukasa commented Mar 17, 2022

Hmm, I don't reproduce this locally running the perf tests on 5.6. I'll try 5.4 when I start my work day and move to my Intel machine.

@simonjbeaumont
Contributor Author

@swift-nio-bot test perf please

@swift-server-bot

!!! Couldn't read commit file !!!

@simonjbeaumont
Contributor Author

simonjbeaumont commented Mar 24, 2022

Summary of findings so far:

Change                                                                                  CI Logs
Changing the number of iterations and the labels                                        https://ci.swiftserver.group/job/swift-nio2-performance-prb/115/console
Reverting everything (so, just main)                                                    https://ci.swiftserver.group/job/swift-nio2-performance-prb/116/console
Revert the revert (back to proposed PR) but disabling one test where CI stops printing  https://ci.swiftserver.group/job/swift-nio2-performance-prb/117/console
Re-enable all tests and update just the labels to match those on main                   https://ci.swiftserver.group/job/swift-nio2-performance-prb/118/console
Update labels again, but disable the test after the one that was printing when CI died  https://ci.swiftserver.group/job/swift-nio2-performance-prb/119/console
Binary chop, disable tests 32...62                                                      https://ci.swiftserver.group/job/swift-nio2-performance-prb/120/console
Binary chop, disable tests 48...62                                                      https://ci.swiftserver.group/job/swift-nio2-performance-prb/121/console
Binary chop, disable tests 56...62                                                      https://ci.swiftserver.group/job/swift-nio2-performance-prb/122/console
Binary chop, disable tests 59...62                                                      https://ci.swiftserver.group/job/swift-nio2-performance-prb/123/console

So the culprit is one of tests 56...58.
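
The binary chop above can be written down as a bisection over the test indices. This is only a sketch: `run_passes_with_first` is a hypothetical stand-in for "push a commit with every test after the k-th disabled and see whether the CI job completes", and it assumes exactly one test is responsible.

```python
from typing import Callable


def find_first_failing(num_tests: int, run_passes_with_first: Callable[[int], bool]) -> int:
    """Return the 1-based index of the test whose inclusion makes the run fail.

    run_passes_with_first(k) reports whether a run with only tests 1..k enabled
    completes. Assumes a run with no tests passes, a run with all tests fails,
    and a single test is to blame.
    """
    lo, hi = 0, num_tests  # invariant: the first lo tests pass, the first hi tests fail
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if run_passes_with_first(mid):
            lo = mid
        else:
            hi = mid
    return hi


# Simulated CI for illustration: pretend one particular test kills the job.
simulated_culprit = 57
print(find_first_failing(62, lambda k: k < simulated_culprit))
```

With 62 tests this needs at most six CI runs, which matters when each attempt is a full round trip through the perf bot.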

@simonjbeaumont
Contributor Author

@swift-nio-bot test perf please

@swift-server-bot

!!! Couldn't read commit file !!!

@simonjbeaumont
Contributor Author

@swift-nio-bot test perf please

@swift-server-bot

!!! Couldn't read commit file !!!

@simonjbeaumont
Contributor Author

@swift-nio-bot test perf please

@swift-server-bot

performance report

build id: 126

timestamp: Thu Mar 24 10:53:02 UTC 2022

results

name                   min          max          mean                 std
lock_8_threads_1M_ops  0.080995249  0.089466475  0.08518604020000001  0.0026077801603845954

comparison

name current previous winner diff
lock_8_threads_1M_ops 0.080995249 n/a n/a n/a%

Signed-off-by: Si Beaumont <beaumont@apple.com>
@simonjbeaumont
Contributor Author

@swift-nio-bot test perf please

@swift-server-bot

!!! Couldn't read commit file !!!

@simonjbeaumont
Contributor Author

simonjbeaumont commented Mar 24, 2022

We have a winner... test 57: schedule_1m_tasks. Here is the CI run for just this test enabled: https://ci.swiftserver.group/job/swift-nio2-performance-prb/127/

Looking at the change between this PR and main, it looks like it was 10k and now it's 1m. I guess we blow through some resource limit in the Jenkins Docker container that we don't hit locally?

$ git show origin/main:Sources/NIOPerformanceTester/main.swift | grep -o "desc: .*" | sed 's|.*\"\(.*\)\".*|\1|g' | grep schedule
schedule_10000_tasks
schedule_and_run_10000_tasks

@Lukasa looking at the report, it looks like this took O(1ms) on main. The thing to note here is that the existing label isn't correct. schedule_10000_tasks actually called into a helper type which had a hard-coded 100000 (so 100k, not 10k). This PR made that a parameter to the constructor and then also bumped it to 1m (just one order of magnitude). I guess this is too many tasks to have lying around scheduled but never run.

name                                                               current_s    O(current_s)    O(desired_s)  scale_by
...
schedule_10000_tasks                                               0.005108103  1E-3            1E-2          1E1

Here is ScheduleBenchmark for reference:

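            // Each task is scheduled an hour out, so within the benchmark it is only
            // queued, never run; all numTasks of them stay pending at once.
            for _ in 0..<self.numTasks {
                self.loop.scheduleTask(in: .hours(1)) {
                    counter &+= 1
                }
            }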

@Lukasa
Contributor

Lukasa commented Mar 24, 2022

Definitely seems like it's busting a limit, though I'm a bit surprised to see that. I suspect it's going to be memory. Want to drop the task count back down to 100k?

Signed-off-by: Si Beaumont <beaumont@apple.com>
@simonjbeaumont
Contributor Author

@swift-nio-bot test perf please

@swift-server-bot

performance report

build id: 128

timestamp: Thu Mar 24 14:16:08 UTC 2022

results

name min max mean std
write_http_headers 0.041745234 0.042039137 0.041842891199999996 8.855224061761453e-05
http_headers_canonical_form 0.087005866 0.087605933 0.08723090089999999 0.0002552183074156317
http_headers_canonical_form_trimming_whitespace 0.016677892 0.017203268 0.0167525769 0.00015895641292886306
http_headers_canonical_form_trimming_whitespace_from_short_string 0.015255643 0.015783383 0.0153131593 0.00016531229863507208
http_headers_canonical_form_trimming_whitespace_from_long_string 0.023606377 0.024123394 0.023675499699999998 0.00015840318310567832
bytebuffer_write_12MB_short_string_literals 0.092980217 0.098861777 0.09390248729999999 0.0017913698287326278
bytebuffer_write_12MB_short_calculated_strings 0.063091556 0.063650439 0.0632849672 0.00021201030784154503
bytebuffer_write_12MB_medium_string_literals 0.599583337 0.612481709 0.6039419099 0.004658535176707978
bytebuffer_write_12MB_medium_calculated_strings 0.080329622 0.082413528 0.0811817408 0.0007271972215368009
bytebuffer_write_12MB_large_calculated_strings 0.142810461 0.143793308 0.1433129236 0.00029668302315778987
bytebuffer_lots_of_rw 0.042762346 0.043332873 0.0429970955 0.0002581998649402568
bytebuffer_write_http_response_ascii_only_as_string 0.039854021 0.04042547 0.0400403196 0.00021665154340563665
bytebuffer_write_http_response_ascii_only_as_staticstring 0.031400039 0.031955898 0.0315446714 0.00021418305308839724
bytebuffer_write_http_response_some_nonascii_as_string 0.03994209 0.040613909 0.040127891900000004 0.00025937037050709693
bytebuffer_write_http_response_some_nonascii_as_staticstring 0.031381504 0.032015003 0.031493262499999994 0.0001866785770023203
no-net_http1_1k_reqs_1_conn 0.01062603 0.010768935 0.0107127077 4.278940388058201e-05
http1_1k_reqs_1_conn 0.060817924 0.062090177 0.061217355200000004 0.00039857613512374085
http1_1k_reqs_100_conns 0.08865786 0.089659995 0.08898618280000001 0.00035705781481180134
future_whenallsucceed_100k_immediately_succeeded_off_loop 0.073131634 0.07397148 0.0734776492 0.0002524014578681261
future_whenallsucceed_100k_immediately_succeeded_on_loop 0.07422346 0.081301693 0.0753476136 0.0021131154420498113
future_whenallsucceed_10k_deferred_off_loop 0.029456798 0.029913547 0.0296353663 0.000156454617944736
future_whenallsucceed_10k_deferred_on_loop 0.012483152 0.012941087 0.0125658354 0.00013467757683049294
future_whenallcomplete_100k_immediately_succeeded_off_loop 0.031372871 0.031911476 0.0315591445 0.0002083461548733269
future_whenallcomplete_100k_immediately_succeeded_on_loop 0.031529132 0.032074252 0.031787526499999996 0.00017320872803467172
future_whenallcomplete_10k_deferred_off_loop 0.022117573 0.02271911 0.022266883100000003 0.0002090366475952909
future_whenallcomplete_100k_deferred_on_loop 0.065622477 0.067780216 0.0662131356 0.0006424761269906736
future_reduce_10k_futures 0.037678581 0.038466282 0.0380574215 0.00027577570529981553
future_reduce_into_10k_futures 0.036206727 0.036821312 0.0364869019 0.00019804755616683489
channel_pipeline_1m_events 0.097171067 0.097649212 0.09728125909999999 0.0001400198219963081
websocket_encode_50b_space_at_front_100k_frames_cow 0.049856036 0.050301141 0.0499917963 0.00017410260824330686
websocket_encode_50b_space_at_front_1m_frames_cow_masking 0.673284879 0.680615323 0.6744029078 0.002199224521845175
websocket_encode_1kb_space_at_front_1m_frames_cow 0.527848584 0.5290743 0.5284040271 0.00030418220760861737
websocket_encode_50b_no_space_at_front_100k_frames_cow 0.049495636 0.04998959 0.0496108235 0.00019027713290078912
websocket_encode_1kb_no_space_at_front_100k_frames_cow 0.052841754 0.053290318 0.0529483217 0.00017670958961845323
websocket_encode_50b_space_at_front_100k_frames 0.068058419 0.06852406 0.06821071819999999 0.00020589913384729507
websocket_encode_50b_space_at_front_10k_frames_masking 0.00856399 0.009089216 0.0086343702 0.00016017833011768153
websocket_encode_1kb_space_at_front_10k_frames 0.013216157 0.013245655 0.0132245848 8.609553605669164e-06
websocket_encode_50b_no_space_at_front_100k_frames 0.06799402 0.068494592 0.0681974617 0.00020646785427117686
websocket_encode_1kb_no_space_at_front_10k_frames 0.012592638 0.013025949 0.0126438516 0.00013459675290965328
websocket_decode_125b_10k_frames 0.011285869 0.011463937 0.011398288400000001 5.119622906729649e-05
websocket_decode_125b_with_a_masking_key_10k_frames 0.011597051 0.012116748 0.011735956 0.00016173854499160065
websocket_decode_64kb_10k_frames 0.011745145 0.012228933 0.0118485524 0.00014296304842332142
websocket_decode_64kb_with_a_masking_key_10k_frames 0.011963872 0.012131804 0.0120366483 5.932539682884514e-05
websocket_decode_64kb_+1_10k_frames 0.011689181 0.012227598 0.011828746800000001 0.00014893028169344485
websocket_decode_64kb_+1_with_a_masking_key_10k_frames 0.011813875 0.012487327 0.012025444 0.00018952811332828098
circular_buffer_into_byte_buffer_1kb 0.041278388 0.041731537 0.041376412200000004 0.000185249724507098
circular_buffer_into_byte_buffer_1mb 0.082283792 0.082746566 0.0825256969 0.00022641760773032437
byte_buffer_view_iterator_1mb 0.020486981 0.020930844 0.020543099500000002 0.00013664342003530373
byte_buffer_view_contains_12mb 0.052866455 0.053448 0.05307531320000001 0.00023026167666316855
byte_to_message_decoder_decode_many_small 0.034975305 0.035490973 0.035095001800000004 0.0002056988989371494
generate_10k_random_request_keys 0.090162299 0.090488192 0.0902962467 0.00011040781287576918
bytebuffer_rw_10_uint32s 0.028070422 0.039560411 0.030636961400000003 0.0032087761888174807
bytebuffer_multi_rw_10_uint32s 0.056512284 0.057893943 0.0570053979 0.0003795496553298256
lock_1_thread_1M_ops 0.015953284 0.016050557 0.0159750413 2.751573437568654e-05
lock_2_threads_1M_ops 0.063251877 0.097395233 0.0858307654 0.010911102621343751
lock_4_threads_1M_ops 0.087359015 0.100329256 0.0938935453 0.004518064647022096
lock_8_threads_1M_ops 0.091603643 0.101952349 0.0974102218 0.003327647989783734
schedule_100k_tasks 0.071288435 0.110307442 0.078933878 0.012159959937450648
schedule_and_run_100k_tasks 0.417008028 0.43110721 0.4247691649 0.004228343042565229
execute_100k_tasks 0.212708943 0.214618922 0.21329781360000002 0.0007071605601373823
bytebufferview_copy_to_array_100k_times_1kb 0.012637182 0.012675838 0.0126466622 1.1173166066766976e-05
circularbuffer_copy_to_array_10k_times_1kb 0.0210899 0.02152909 0.021139799600000002 0.0001372085256180049

comparison

name current previous winner diff
write_http_headers 0.041745234 0.004156324 previous 904%
http_headers_canonical_form 0.087005866 0.087176781 current 0%
http_headers_canonical_form_trimming_whitespace 0.016677892 0.167224441 current -90%
http_headers_canonical_form_trimming_whitespace_from_short_string 0.015255643 0.152840232 current -90%
http_headers_canonical_form_trimming_whitespace_from_long_string 0.023606377 0.236946146 current -90%
bytebuffer_write_12MB_short_string_literals 0.092980217 0.15626777 current -40%
bytebuffer_write_12MB_short_calculated_strings 0.063091556 0.306472959 current -79%
bytebuffer_write_12MB_medium_string_literals 0.599583337 0.060320819 previous 893%
bytebuffer_write_12MB_medium_calculated_strings 0.080329622 0.160524061 current -49%
bytebuffer_write_12MB_large_calculated_strings 0.142810461 0.145666986 current -1%
bytebuffer_lots_of_rw 0.042762346 0.450269309 current -90%
bytebuffer_write_http_response_ascii_only_as_string 0.039854021 0.039765483 previous 0%
bytebuffer_write_http_response_ascii_only_as_staticstring 0.031400039 0.031767899 current -1%
bytebuffer_write_http_response_some_nonascii_as_string 0.03994209 0.039992365 current 0%
bytebuffer_write_http_response_some_nonascii_as_staticstring 0.031381504 0.031762613 current -1%
no-net_http1_1k_reqs_1_conn 0.01062603 n/a n/a n/a%
http1_1k_reqs_1_conn 0.060817924 n/a n/a n/a%
http1_1k_reqs_100_conns 0.08865786 n/a n/a n/a%
future_whenallsucceed_100k_immediately_succeeded_off_loop 0.073131634 0.078185913 current -6%
future_whenallsucceed_100k_immediately_succeeded_on_loop 0.07422346 0.079444721 current -6%
future_whenallsucceed_10k_deferred_off_loop 0.029456798 n/a n/a n/a%
future_whenallsucceed_10k_deferred_on_loop 0.012483152 n/a n/a n/a%
future_whenallcomplete_100k_immediately_succeeded_off_loop 0.031372871 0.031299125 previous 0%
future_whenallcomplete_100k_immediately_succeeded_on_loop 0.031529132 0.031184463 previous 1%
future_whenallcomplete_10k_deferred_off_loop 0.022117573 n/a n/a n/a%
future_whenallcomplete_100k_deferred_on_loop 0.065622477 0.064373813 previous 1%
future_reduce_10k_futures 0.037678581 0.038389384 current -1%
future_reduce_into_10k_futures 0.036206727 0.036942603 current -1%
channel_pipeline_1m_events 0.097171067 0.097150271 previous 0%
websocket_encode_50b_space_at_front_100k_frames_cow 0.049856036 n/a n/a n/a%
websocket_encode_50b_space_at_front_1m_frames_cow_masking 0.673284879 0.067069988 previous 903%
websocket_encode_1kb_space_at_front_1m_frames_cow 0.527848584 n/a n/a n/a%
websocket_encode_50b_no_space_at_front_100k_frames_cow 0.049495636 n/a n/a n/a%
websocket_encode_1kb_no_space_at_front_100k_frames_cow 0.052841754 0.052726584 previous 0%
websocket_encode_50b_space_at_front_100k_frames 0.068058419 n/a n/a n/a%
websocket_encode_50b_space_at_front_10k_frames_masking 0.00856399 0.08473578 current -89%
websocket_encode_1kb_space_at_front_10k_frames 0.013216157 n/a n/a n/a%
websocket_encode_50b_no_space_at_front_100k_frames 0.06799402 n/a n/a n/a%
websocket_encode_1kb_no_space_at_front_10k_frames 0.012592638 n/a n/a n/a%
websocket_decode_125b_10k_frames 0.011285869 n/a n/a n/a%
websocket_decode_125b_with_a_masking_key_10k_frames 0.011597051 n/a n/a n/a%
websocket_decode_64kb_10k_frames 0.011745145 n/a n/a n/a%
websocket_decode_64kb_with_a_masking_key_10k_frames 0.011963872 n/a n/a n/a%
websocket_decode_64kb_+1_10k_frames 0.011689181 n/a n/a n/a%
websocket_decode_64kb_+1_with_a_masking_key_10k_frames 0.011813875 n/a n/a n/a%
circular_buffer_into_byte_buffer_1kb 0.041278388 0.04126336 previous 0%
circular_buffer_into_byte_buffer_1mb 0.082283792 0.082262485 previous 0%
byte_buffer_view_iterator_1mb 0.020486981 0.020483592 previous 0%
byte_buffer_view_contains_12mb 0.052866455 0.211968198 current -75%
byte_to_message_decoder_decode_many_small 0.034975305 0.174910624 current -80%
generate_10k_random_request_keys 0.090162299 0.090187434 current 0%
bytebuffer_rw_10_uint32s 0.028070422 0.296285854 current -90%
bytebuffer_multi_rw_10_uint32s 0.056512284 0.055609926 previous 1%
lock_1_thread_1M_ops 0.015953284 n/a n/a n/a%
lock_2_threads_1M_ops 0.063251877 n/a n/a n/a%
lock_4_threads_1M_ops 0.087359015 n/a n/a n/a%
lock_8_threads_1M_ops 0.091603643 n/a n/a n/a%
schedule_100k_tasks 0.071288435 n/a n/a n/a%
schedule_and_run_100k_tasks 0.417008028 n/a n/a n/a%
execute_100k_tasks 0.212708943 n/a n/a n/a%
bytebufferview_copy_to_array_100k_times_1kb 0.012637182 n/a n/a n/a%
circularbuffer_copy_to_array_10k_times_1kb 0.0210899 n/a n/a n/a%

significant differences found
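
For reading the comparison table: the `diff` column looks consistent with the relative change of `current` against `previous`, truncated toward zero, with the smaller time declared the winner. That is an inference from the numbers above rather than the harness's actual code, so treat the following as a sketch under that assumption (`compare` is a hypothetical name):

```python
from typing import Tuple


def compare(current_s: float, previous_s: float) -> Tuple[str, int]:
    """Return (winner, percentage change of current relative to previous)."""
    winner = "current" if current_s < previous_s else "previous"
    # int() truncates toward zero, which appears to match the report's percentages.
    diff_pct = int((current_s - previous_s) / previous_s * 100)
    return winner, diff_pct


# (current, previous) pairs copied from the comparison table above.
rows = {
    "write_http_headers": (0.041745234, 0.004156324),
    "http_headers_canonical_form": (0.087005866, 0.087176781),
    "bytebuffer_write_12MB_medium_calculated_strings": (0.080329622, 0.160524061),
    "bytebuffer_lots_of_rw": (0.042762346, 0.450269309),
}
for name, (current_s, previous_s) in rows.items():
    winner, diff_pct = compare(current_s, previous_s)
    print(f"{name} {winner} {diff_pct}%")
```

Since this PR deliberately changed iteration counts, swings of roughly +900% or -90% on tests that kept their names reflect the rescaling rather than a regression or speed-up in the code under test.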

@simonjbeaumont
Contributor Author

@Lukasa well that was fun...

@Lukasa
Contributor

Lukasa commented Mar 24, 2022

Great work though!

@Lukasa Lukasa enabled auto-merge (squash) March 24, 2022 14:23
@Lukasa Lukasa merged commit 1af9615 into apple:main Mar 24, 2022
@tomerd
Member

tomerd commented Mar 24, 2022

@simonjbeaumont w.r.t. buffering, how is the performance test writing its output? Maybe it should flush more aggressively? The harness script is not doing anything to get in the way.

@simonjbeaumont
Contributor Author

@simonjbeaumont w.r.t. buffering, how is the performance test writing its output? Maybe it should flush more aggressively? The harness script is not doing anything to get in the way.

Yep. Addressed in #2072.

@hassila
Contributor

hassila commented Apr 14, 2022

the build timed out after 10 minutes.

Well FWICT each test is run 10 times and an average is taken:

A bit late to the party, but just want to mention that in my experience it might be useful to also drop e.g. the two worst runs before taking the average if you want better test stability.

It would still capture any systemic regressions and/or improvements introduced in the remaining samples while removing most ad-hoc outliers caused by other noise on the test machine.
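
A minimal sketch of that suggestion, using the ten samples quoted earlier in the thread for `websocket_decode_64kb_with_a_masking_key_100k_frames`. The function name and the choice to read "worst" as "slowest" are assumptions for illustration; this is not what the harness currently does.

```python
from statistics import mean
from typing import List


def mean_without_worst(samples: List[float], drop: int = 2) -> float:
    """Average the samples after discarding the `drop` slowest (largest) ones."""
    if drop < 0 or drop >= len(samples):
        raise ValueError("drop must be non-negative and leave at least one sample")
    kept = sorted(samples)[: len(samples) - drop]
    return mean(kept)


samples = [
    0.125121872, 0.124363388, 0.124045883, 0.12475775, 0.124550551,
    0.123625381, 0.124706225, 0.124888407, 0.124184521, 0.124370663,
]
print(f"plain mean:   {mean(samples):.9f}")
print(f"trimmed mean: {mean_without_worst(samples):.9f}")
```

Dropping only the slowest runs biases the result slightly low, which is usually acceptable for regression detection as long as the same rule is applied to both sides of a comparison.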
