Increase runtime of performance tests to O(10 ms) to increase SNR #2063

simonjbeaumont · 2022-03-09T12:08:31Z

Motivation:

The current running time of some of the performance benchmarks is O(1 ms) which can lead to low signal in the results. It would be better if these tests ran for O(10 ms).

Modifications:

Increase the iteration counts in all the tests within NIOPerformanceTester to aim for O(10 ms) measurements with the current code.
Exception: execute_100k_tasks, which is still too quick, but the CI cannot handle 1M tasks.

Result:

All benchmark bodies should now take O(10 ms).

Notes:

The scaling factors were determined by looking at the current column from a recent report on #2059:

$ python3 scaling_factors.py perf-output
Loading file: perf-output...
Loaded 62 results.
Scaling tests so that runtime is O(10 ms)

name                                                               current_s    O(current_s)    O(desired_s)  scale_by
write_http_headers                                                 0.004221364  1E-3            1E-2          1E1
http_headers_canonical_form                                        0.085858433  1E-2            1E-2          1E0
http_headers_canonical_form_trimming_whitespace                    0.162971487  1E-1            1E-2          1E-1
http_headers_canonical_form_trimming_whitespace_from_short_string  0.148830606  1E-1            1E-2          1E-1
http_headers_canonical_form_trimming_whitespace_from_long_string   0.229662477  1E-1            1E-2          1E-1
bytebuffer_write_12MB_short_string_literals                        0.151619987  1E-1            1E-2          1E-1
bytebuffer_write_12MB_short_calculated_strings                     0.297283243  1E-1            1E-2          1E-1
bytebuffer_write_12MB_medium_string_literals                       0.058404425  1E-2            1E-2          1E0
bytebuffer_write_12MB_medium_calculated_strings                    0.15610171   1E-1            1E-2          1E-1
bytebuffer_write_12MB_large_calculated_strings                     0.142773805  1E-1            1E-2          1E-1
bytebuffer_lots_of_rw                                              0.433748776  1E-1            1E-2          1E-1
bytebuffer_write_http_response_ascii_only_as_string                0.039572042  1E-2            1E-2          1E0
bytebuffer_write_http_response_ascii_only_as_staticstring          0.030672688  1E-2            1E-2          1E0
bytebuffer_write_http_response_some_nonascii_as_string             0.039276667  1E-2            1E-2          1E0
bytebuffer_write_http_response_some_nonascii_as_staticstring       0.030416912  1E-2            1E-2          1E0
no-net_http1_10k_reqs_1_conn                                       0.105070526  1E-1            1E-2          1E-1
http1_10k_reqs_1_conn                                              0.589343003  1E-1            1E-2          1E-1
http1_10k_reqs_100_conns                                           0.581899912  1E-1            1E-2          1E-1
future_whenallsucceed_100k_immediately_succeeded_off_loop          0.070154794  1E-2            1E-2          1E0
future_whenallsucceed_100k_immediately_succeeded_on_loop           0.071029944  1E-2            1E-2          1E0
future_whenallsucceed_100k_deferred_off_loop                       0.331273236  1E-1            1E-2          1E-1
future_whenallsucceed_100k_deferred_on_loop                        0.121564379  1E-1            1E-2          1E-1
future_whenallcomplete_100k_immediately_succeeded_off_loop         0.030060621  1E-2            1E-2          1E0
future_whenallcomplete_100k_immediately_succeeded_on_loop          0.030060624  1E-2            1E-2          1E0
future_whenallcomplete_100k_deferred_off_loop                      0.258080927  1E-1            1E-2          1E-1
future_whenallcomplete_100k_deferred_on_loop                       0.061702521  1E-2            1E-2          1E0
future_reduce_10k_futures                                          0.03736344   1E-2            1E-2          1E0
future_reduce_into_10k_futures                                     0.035719515  1E-2            1E-2          1E0
channel_pipeline_1m_events                                         0.095089666  1E-2            1E-2          1E0
websocket_encode_50b_space_at_front_1m_frames_cow                  0.491598504  1E-1            1E-2          1E-1
websocket_encode_50b_space_at_front_1m_frames_cow_masking          0.065660544  1E-2            1E-2          1E0
websocket_encode_1kb_space_at_front_100k_frames_cow                0.05179348   1E-2            1E-2          1E0
websocket_encode_50b_no_space_at_front_1m_frames_cow               0.492513012  1E-1            1E-2          1E-1
websocket_encode_1kb_no_space_at_front_100k_frames_cow             0.051848542  1E-2            1E-2          1E0
websocket_encode_50b_space_at_front_10k_frames                     0.006663117  1E-3            1E-2          1E1
websocket_encode_50b_space_at_front_10k_frames_masking             0.083932953  1E-2            1E-2          1E0
websocket_encode_1kb_space_at_front_1k_frames                      0.001293064  1E-3            1E-2          1E1
websocket_encode_50b_no_space_at_front_10k_frames                  0.006696183  1E-3            1E-2          1E1
websocket_encode_1kb_no_space_at_front_1k_frames                   0.001222385  1E-3            1E-2          1E1
websocket_decode_125b_100k_frames                                  0.115220387  1E-1            1E-2          1E-1
websocket_decode_125b_with_a_masking_key_100k_frames               0.118480726  1E-1            1E-2          1E-1
websocket_decode_64kb_100k_frames                                  0.118017645  1E-1            1E-2          1E-1
websocket_decode_64kb_with_a_masking_key_100k_frames               0.120845408  1E-1            1E-2          1E-1
websocket_decode_64kb_+1_100k_frames                               0.117700529  1E-1            1E-2          1E-1
websocket_decode_64kb_+1_with_a_masking_key_100k_frames            0.12086373   1E-1            1E-2          1E-1
circular_buffer_into_byte_buffer_1kb                               0.036630641  1E-2            1E-2          1E0
circular_buffer_into_byte_buffer_1mb                               0.07287621   1E-2            1E-2          1E0
byte_buffer_view_iterator_1mb                                      0.020055745  1E-2            1E-2          1E0
byte_buffer_view_contains_12mb                                     0.207476717  1E-1            1E-2          1E-1
byte_to_message_decoder_decode_many_small                          0.170375659  1E-1            1E-2          1E-1
generate_10k_random_request_keys                                   0.089314066  1E-2            1E-2          1E0
bytebuffer_rw_10_uint32s                                           0.28914708   1E-1            1E-2          1E-1
bytebuffer_multi_rw_10_uint32s                                     0.05438756   1E-2            1E-2          1E0
lock_1_thread_10M_ops                                              0.155880855  1E-1            1E-2          1E-1
lock_2_threads_10M_ops                                             0.872071354  1E-1            1E-2          1E-1
lock_4_threads_10M_ops                                             0.944893321  1E-1            1E-2          1E-1
lock_8_threads_10M_ops                                             0.905209719  1E-1            1E-2          1E-1
schedule_10000_tasks                                               0.005108103  1E-3            1E-2          1E1
schedule_and_run_10000_tasks                                       0.031540699  1E-2            1E-2          1E0
execute_10000                                                      0.016732096  1E-2            1E-2          1E0
bytebufferview_copy_to_array_1000_times_1kb                        0.000123618  1E-4            1E-2          1E2
circularbuffer_copy_to_array_1000_times_1kb                        0.001885264  1E-3            1E-2          1E1
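The `scaling_factors.py` script itself is not shown in this thread. A minimal sketch of the order-of-magnitude calculation that would reproduce the `O(current_s)` and `scale_by` columns above, assuming the script simply floors the base-10 logarithm of each timing, might look like this (the function names are hypothetical):

```python
import math

# Hypothetical reconstruction: the real scaling_factors.py is not shown in the thread.
DESIRED_EXPONENT = -2  # target runtime of O(10 ms), i.e. 1E-2 seconds


def order_of_magnitude(seconds: float) -> int:
    """Return the power-of-ten exponent of a positive duration in seconds."""
    return math.floor(math.log10(seconds))


def scale_by_exponent(current_s: float, desired_exponent: int = DESIRED_EXPONENT) -> int:
    """Return the exponent e such that scaling iterations by 10**e hits the target order."""
    return desired_exponent - order_of_magnitude(current_s)


def format_row(name: str, current_s: float) -> str:
    """Format one row in the same 1E<n> style as the table above."""
    return (
        f"{name}  {current_s}  1E{order_of_magnitude(current_s)}  "
        f"1E{DESIRED_EXPONENT}  1E{scale_by_exponent(current_s)}"
    )
```

For example, `scale_by_exponent(0.004221364)` gives `1` (scale `write_http_headers` up by 1E1), while `scale_by_exponent(0.162971487)` gives `-1` (scale down by 1E-1), matching the rows above.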

Lukasa · 2022-03-09T12:26:04Z

@swift-nio-bot test perf please

swift-server-bot · 2022-03-09T12:37:51Z

!!! Couldn't read commit file !!!

Lukasa · 2022-03-09T12:39:04Z

Huh, I wonder if we overshot: the build timed out after 10 minutes.

Lukasa · 2022-03-09T12:39:47Z

Ah, I see why: the output numbers in the chart aren't ns, they're fractional seconds. These scaling factors are all about 10^9 too high.

simonjbeaumont · 2022-03-09T13:36:24Z

> Ah, I see why: the output numbers in the chart aren't ns, they're fractional seconds. These scaling factors are all about 10^9 too high.

🤦 Should have traced my way through to the outputs, I just read this which deals in nanoseconds.

simonjbeaumont · 2022-03-09T14:45:19Z

@swift-nio-bot test perf please

swift-server-bot · 2022-03-09T14:52:54Z

!!! Couldn't read commit file !!!

simonjbeaumont · 2022-03-09T14:56:27Z

@Lukasa I have reworked the script to work out a scaling factor with appropriate units now and updated the PR description with its output. I've also adjusted all the iterations to be in line with what we're looking for: an O(100 ms) hot loop.

> the build timed out after 10 minutes.

Well FWICT each test is run 10 times and an average is taken:

14:51:57 measuring: websocket_decode_64kb_with_a_masking_key_100k_frames: 0.125121872, 0.124363388, 0.124045883, 0.12475775, 0.124550551, 0.123625381, 0.124706225, 0.124888407, 0.124184521, 0.124370663,

There are 62 tests, and each run of a test takes O(100 ms), so a single run is upper-bounded by 1 second. With 10 runs per test, that is up to 62 × 10 × 1 s = 620 s, so we could slightly overshoot 10 mins in the worst case just with the hot loops. We'd need some headroom to allow for warmups and compile time.

The current run got this far...

14:51:57 measuring: websocket_decode_64kb_+1_100k_frames: 0.12155783, 0.121102395, 0.121057444, 0.120217503, 0.121161953, 0.12120Build step 'Execute shell' marked build as failure

...which was test 44/62.

Looking at the output, it seems that we have managed to get O(100 ms) for all the bodies though, so if we want this, I guess we should increase the CI timeout?

Lukasa · 2022-03-09T15:37:56Z

Hmm, does that math check out? If each test run is ~100ms, then 10 test runs is 1 second, and 62 tests adds up to 62 seconds, a.k.a 1 minute. That shouldn't be a problem, should it?

Lukasa · 2022-03-09T15:40:14Z

Ah, but many of these are closer to 1s per run. Can we drop an order of magnitude off them all, so none of them go above 100ms? That should make this fly by.

simonjbeaumont · 2022-03-09T15:54:19Z

> Hmm, does that math check out? If each test run is ~100ms, then 10 test runs is 1 second, and 62 tests adds up to 62 seconds, a.k.a 1 minute. That shouldn't be a problem, should it?

The maths checks out with O(100ms) rather than ~100ms. There's an order of magnitude gap in the worst case. I was thinking we were aiming for "hundreds of ms".
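As a quick sanity check of the two estimates, here is a small sketch of the arithmetic. The test count (62), runs per test (10), and the 10-minute CI timeout come from the thread; the constant and function names are made up for illustration, and warmup and compile time are ignored.

```python
NUM_TESTS = 62            # number of benchmarks reported in the thread
RUNS_PER_TEST = 10        # each benchmark is measured 10 times
CI_TIMEOUT_S = 10 * 60    # the build timed out after 10 minutes


def total_hot_loop_ms(per_run_ms: int) -> int:
    """Total time spent in measured hot loops, in milliseconds."""
    return NUM_TESTS * RUNS_PER_TEST * per_run_ms


# "~100 ms" per run: the optimistic reading.
typical_s = total_hot_loop_ms(100) / 1000

# "O(100 ms)" per run: anything below 1 s still counts, so 1 s is the upper bound.
worst_case_s = total_hot_loop_ms(1000) / 1000

print(f"~100 ms per run:   {typical_s:.0f} s")
print(f"up to 1 s per run: {worst_case_s:.0f} s (timeout is {CI_TIMEOUT_S} s)")
```

The first figure is about a minute; the second exceeds the 600 s timeout, which is the order-of-magnitude gap described above.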

> Ah, but many of these are closer to 1s per run. Can we drop an order of magnitude off them all, so none of them go above 100ms? That should make this fly by.

OK... do you want to include reducing existing ones that currently take many hundreds of ms? E.g. the locking ones?

If so, I can rinse and repeat this exercise to shoot for "tens of ms".

Lukasa · 2022-03-10T10:03:06Z

Yeah, I think we should probably shoot for tens of ms.

simonjbeaumont · 2022-03-16T21:00:30Z

@swift-nio-bot test perf please

swift-server-bot · 2022-03-16T21:03:39Z

!!! Couldn't read commit file !!!

simonjbeaumont · 2022-03-16T21:06:17Z

@swift-nio-bot test perf please

swift-server-bot · 2022-03-16T21:09:24Z

!!! Couldn't read commit file !!!

simonjbeaumont · 2022-03-16T21:11:19Z

> !!! Couldn't read commit file !!!

@Lukasa what's the story here? I thought maybe this was because it was out of date with main but I merged in main and still get the same error.

Lukasa · 2022-03-17T07:23:59Z

The run seems to have silently failed. Let's ask it again, but if it happens again we'll need to investigate why the perf runner failed to complete.

Lukasa · 2022-03-17T07:24:14Z

@swift-nio-bot test perf please

swift-server-bot · 2022-03-17T07:27:19Z

!!! Couldn't read commit file !!!

Lukasa · 2022-03-17T07:35:33Z

Yeah that implies an issue. I'll try to reproduce it.

Lukasa · 2022-03-17T07:50:45Z

Hmm, I don't reproduce this locally running the perf tests on 5.6. I'll try 5.4 when I start my work day and move to my Intel machine.

simonjbeaumont · 2022-03-24T10:11:23Z

@swift-nio-bot test perf please

swift-server-bot · 2022-03-24T10:14:30Z

!!! Couldn't read commit file !!!

simonjbeaumont · 2022-03-24T10:16:26Z

Summary of findings so far:

Change	CI	Logs
Changing the number of iterations and the labels	❌	https://ci.swiftserver.group/job/swift-nio2-performance-prb/115/console
Reverting everything (so, just `main`)	✅	https://ci.swiftserver.group/job/swift-nio2-performance-prb/116/console
Revert the revert (back to proposed PR) but disabling one test where CI stops printing	❌	https://ci.swiftserver.group/job/swift-nio2-performance-prb/117/console
Re-enable all tests and update just the labels to match those on `main`	❌	https://ci.swiftserver.group/job/swift-nio2-performance-prb/118/console
Update labels again, but disable the test after the one that was printing when CI died	❌	https://ci.swiftserver.group/job/swift-nio2-performance-prb/119/console
Binary chop, disable tests 32...62	✅	https://ci.swiftserver.group/job/swift-nio2-performance-prb/120/console
Binary chop, disable tests 48...62	✅	https://ci.swiftserver.group/job/swift-nio2-performance-prb/121/console
Binary chop, disable tests 56...62	✅	https://ci.swiftserver.group/job/swift-nio2-performance-prb/122/console
Binary chop, disable tests 59...62	❌	https://ci.swiftserver.group/job/swift-nio2-performance-prb/123/console

So the culprit is a test between 56...58: CI passes with tests 56...62 disabled but fails with only 59...62 disabled.
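The binary chop above can be sketched as a generic bisection. This is only an illustration, assuming exactly one test breaks the run and that the failure is deterministic; `ci_passes` is a hypothetical stand-in for "push a commit with only these tests enabled and see whether the perf job completes".

```python
from typing import Callable


def find_culprit(num_tests: int, ci_passes: Callable[[range], bool]) -> int:
    """Return the 1-based index of the single test that makes CI fail.

    ci_passes(enabled) must return True when the run succeeds with only
    the tests in `enabled` switched on.
    """
    lo, hi = 1, num_tests  # invariant: the culprit is in lo...hi
    while lo < hi:
        mid = (lo + hi) // 2
        # Enable tests 1...mid, i.e. disable tests mid+1...num_tests.
        if ci_passes(range(1, mid + 1)):
            lo = mid + 1  # run passed, so the culprit was among the disabled tests
        else:
            hi = mid      # run failed, so the culprit is among the enabled tests
    return lo


# Simulated CI in which test 57 is the one that breaks the run.
culprit = find_culprit(62, lambda enabled: 57 not in enabled)
```

With 62 tests this needs at most six CI runs, which is roughly what the table above shows.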

simonjbeaumont · 2022-03-24T10:23:17Z

@swift-nio-bot test perf please

swift-server-bot · 2022-03-24T10:25:38Z

!!! Couldn't read commit file !!!

simonjbeaumont · 2022-03-24T10:48:52Z

@swift-nio-bot test perf please

swift-server-bot · 2022-03-24T10:51:11Z

!!! Couldn't read commit file !!!

simonjbeaumont · 2022-03-24T10:51:12Z

@swift-nio-bot test perf please

swift-server-bot · 2022-03-24T10:53:06Z

performance report

build id: 126

timestamp: Thu Mar 24 10:53:02 UTC 2022

results

name	min	max	mean	std
lock_8_threads_1M_ops	0.080995249	0.089466475	0.08518604020000001	0.0026077801603845954

comparison

name	current	previous	winner	diff
lock_8_threads_1M_ops	0.080995249	n/a	n/a	n/a%

simonjbeaumont · 2022-03-24T11:02:17Z

@swift-nio-bot test perf please

swift-server-bot · 2022-03-24T11:04:34Z

!!! Couldn't read commit file !!!

simonjbeaumont · 2022-03-24T11:08:25Z

We have a winner... test 57: schedule_1m_tasks. Here is the CI run for just this test enabled: https://ci.swiftserver.group/job/swift-nio2-performance-prb/127/

Looking at the change between this PR and main it looks like it was 10k and now it's 1m. I guess we blow through some resource limits on the Jenkins Docker that we don't locally?

$ git show origin/main:Sources/NIOPerformanceTester/main.swift | grep -o "desc: .*" | sed 's|.*\"\(.*\)\".*|\1|g' | grep schedule
schedule_10000_tasks
schedule_and_run_10000_tasks

@Lukasa looking at the report, it looks like this took O(1 ms) on main. The thing to note here is that the existing label isn't correct: schedule_10000_tasks actually called into a helper type which had a hard-coded 100000 (so 100k, not 10k). This PR made that a parameter to the constructor and then also bumped it to 1m (just one order of magnitude). I guess this is too many tasks to have lying around only scheduled.

name                                                               current_s    O(current_s)    O(desired_s)  scale_by
...
schedule_10000_tasks                                               0.005108103  1E-3            1E-2          1E1

Here is ScheduleBenchmark for reference:

            for _ in 0..<self.numTasks {
                self.loop.scheduleTask(in: .hours(1)) {
                    counter &+= 1
                }
            }
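One way to avoid the label and the hard-coded count drifting apart, as happened with schedule_10000_tasks, is to derive the label from the count. The sketch below is illustrative only and is written in Python rather than Swift; the helper names are hypothetical and are not part of NIOPerformanceTester.

```python
def count_label(count: int) -> str:
    """Render an iteration count in the style used by the benchmark names (10k, 100k, 1m)."""
    if count >= 1_000_000 and count % 1_000_000 == 0:
        return f"{count // 1_000_000}m"
    if count >= 1_000 and count % 1_000 == 0:
        return f"{count // 1_000}k"
    return str(count)


def benchmark_name(prefix: str, count: int, suffix: str) -> str:
    """Build a benchmark label from the same count that is passed to the benchmark."""
    return f"{prefix}_{count_label(count)}_{suffix}"


num_tasks = 100_000
name = benchmark_name("schedule", num_tasks, "tasks")  # "schedule_100k_tasks"
```

Because the label is computed from `num_tasks`, changing the task count changes the reported name at the same time.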

Lukasa · 2022-03-24T11:46:58Z

Definitely seems like it's busting a limit, though I'm a bit surprised to see that. I suspect it's going to be memory. Want to drop the task count back down to 100k?

simonjbeaumont · 2022-03-24T14:13:18Z

@swift-nio-bot test perf please

swift-server-bot · 2022-03-24T14:16:13Z

performance report

build id: 128

timestamp: Thu Mar 24 14:16:08 UTC 2022

results

name	min	max	mean	std
write_http_headers	0.041745234	0.042039137	0.041842891199999996	8.855224061761453e-05
http_headers_canonical_form	0.087005866	0.087605933	0.08723090089999999	0.0002552183074156317
http_headers_canonical_form_trimming_whitespace	0.016677892	0.017203268	0.0167525769	0.00015895641292886306
http_headers_canonical_form_trimming_whitespace_from_short_string	0.015255643	0.015783383	0.0153131593	0.00016531229863507208
http_headers_canonical_form_trimming_whitespace_from_long_string	0.023606377	0.024123394	0.023675499699999998	0.00015840318310567832
bytebuffer_write_12MB_short_string_literals	0.092980217	0.098861777	0.09390248729999999	0.0017913698287326278
bytebuffer_write_12MB_short_calculated_strings	0.063091556	0.063650439	0.0632849672	0.00021201030784154503
bytebuffer_write_12MB_medium_string_literals	0.599583337	0.612481709	0.6039419099	0.004658535176707978
bytebuffer_write_12MB_medium_calculated_strings	0.080329622	0.082413528	0.0811817408	0.0007271972215368009
bytebuffer_write_12MB_large_calculated_strings	0.142810461	0.143793308	0.1433129236	0.00029668302315778987
bytebuffer_lots_of_rw	0.042762346	0.043332873	0.0429970955	0.0002581998649402568
bytebuffer_write_http_response_ascii_only_as_string	0.039854021	0.04042547	0.0400403196	0.00021665154340563665
bytebuffer_write_http_response_ascii_only_as_staticstring	0.031400039	0.031955898	0.0315446714	0.00021418305308839724
bytebuffer_write_http_response_some_nonascii_as_string	0.03994209	0.040613909	0.040127891900000004	0.00025937037050709693
bytebuffer_write_http_response_some_nonascii_as_staticstring	0.031381504	0.032015003	0.031493262499999994	0.0001866785770023203
no-net_http1_1k_reqs_1_conn	0.01062603	0.010768935	0.0107127077	4.278940388058201e-05
http1_1k_reqs_1_conn	0.060817924	0.062090177	0.061217355200000004	0.00039857613512374085
http1_1k_reqs_100_conns	0.08865786	0.089659995	0.08898618280000001	0.00035705781481180134
future_whenallsucceed_100k_immediately_succeeded_off_loop	0.073131634	0.07397148	0.0734776492	0.0002524014578681261
future_whenallsucceed_100k_immediately_succeeded_on_loop	0.07422346	0.081301693	0.0753476136	0.0021131154420498113
future_whenallsucceed_10k_deferred_off_loop	0.029456798	0.029913547	0.0296353663	0.000156454617944736
future_whenallsucceed_10k_deferred_on_loop	0.012483152	0.012941087	0.0125658354	0.00013467757683049294
future_whenallcomplete_100k_immediately_succeeded_off_loop	0.031372871	0.031911476	0.0315591445	0.0002083461548733269
future_whenallcomplete_100k_immediately_succeeded_on_loop	0.031529132	0.032074252	0.031787526499999996	0.00017320872803467172
future_whenallcomplete_10k_deferred_off_loop	0.022117573	0.02271911	0.022266883100000003	0.0002090366475952909
future_whenallcomplete_100k_deferred_on_loop	0.065622477	0.067780216	0.0662131356	0.0006424761269906736
future_reduce_10k_futures	0.037678581	0.038466282	0.0380574215	0.00027577570529981553
future_reduce_into_10k_futures	0.036206727	0.036821312	0.0364869019	0.00019804755616683489
channel_pipeline_1m_events	0.097171067	0.097649212	0.09728125909999999	0.0001400198219963081
websocket_encode_50b_space_at_front_100k_frames_cow	0.049856036	0.050301141	0.0499917963	0.00017410260824330686
websocket_encode_50b_space_at_front_1m_frames_cow_masking	0.673284879	0.680615323	0.6744029078	0.002199224521845175
websocket_encode_1kb_space_at_front_1m_frames_cow	0.527848584	0.5290743	0.5284040271	0.00030418220760861737
websocket_encode_50b_no_space_at_front_100k_frames_cow	0.049495636	0.04998959	0.0496108235	0.00019027713290078912
websocket_encode_1kb_no_space_at_front_100k_frames_cow	0.052841754	0.053290318	0.0529483217	0.00017670958961845323
websocket_encode_50b_space_at_front_100k_frames	0.068058419	0.06852406	0.06821071819999999	0.00020589913384729507
websocket_encode_50b_space_at_front_10k_frames_masking	0.00856399	0.009089216	0.0086343702	0.00016017833011768153
websocket_encode_1kb_space_at_front_10k_frames	0.013216157	0.013245655	0.0132245848	8.609553605669164e-06
websocket_encode_50b_no_space_at_front_100k_frames	0.06799402	0.068494592	0.0681974617	0.00020646785427117686
websocket_encode_1kb_no_space_at_front_10k_frames	0.012592638	0.013025949	0.0126438516	0.00013459675290965328
websocket_decode_125b_10k_frames	0.011285869	0.011463937	0.011398288400000001	5.119622906729649e-05
websocket_decode_125b_with_a_masking_key_10k_frames	0.011597051	0.012116748	0.011735956	0.00016173854499160065
websocket_decode_64kb_10k_frames	0.011745145	0.012228933	0.0118485524	0.00014296304842332142
websocket_decode_64kb_with_a_masking_key_10k_frames	0.011963872	0.012131804	0.0120366483	5.932539682884514e-05
websocket_decode_64kb_+1_10k_frames	0.011689181	0.012227598	0.011828746800000001	0.00014893028169344485
websocket_decode_64kb_+1_with_a_masking_key_10k_frames	0.011813875	0.012487327	0.012025444	0.00018952811332828098
circular_buffer_into_byte_buffer_1kb	0.041278388	0.041731537	0.041376412200000004	0.000185249724507098
circular_buffer_into_byte_buffer_1mb	0.082283792	0.082746566	0.0825256969	0.00022641760773032437
byte_buffer_view_iterator_1mb	0.020486981	0.020930844	0.020543099500000002	0.00013664342003530373
byte_buffer_view_contains_12mb	0.052866455	0.053448	0.05307531320000001	0.00023026167666316855
byte_to_message_decoder_decode_many_small	0.034975305	0.035490973	0.035095001800000004	0.0002056988989371494
generate_10k_random_request_keys	0.090162299	0.090488192	0.0902962467	0.00011040781287576918
bytebuffer_rw_10_uint32s	0.028070422	0.039560411	0.030636961400000003	0.0032087761888174807
bytebuffer_multi_rw_10_uint32s	0.056512284	0.057893943	0.0570053979	0.0003795496553298256
lock_1_thread_1M_ops	0.015953284	0.016050557	0.0159750413	2.751573437568654e-05
lock_2_threads_1M_ops	0.063251877	0.097395233	0.0858307654	0.010911102621343751
lock_4_threads_1M_ops	0.087359015	0.100329256	0.0938935453	0.004518064647022096
lock_8_threads_1M_ops	0.091603643	0.101952349	0.0974102218	0.003327647989783734
schedule_100k_tasks	0.071288435	0.110307442	0.078933878	0.012159959937450648
schedule_and_run_100k_tasks	0.417008028	0.43110721	0.4247691649	0.004228343042565229
execute_100k_tasks	0.212708943	0.214618922	0.21329781360000002	0.0007071605601373823
bytebufferview_copy_to_array_100k_times_1kb	0.012637182	0.012675838	0.0126466622	1.1173166066766976e-05
circularbuffer_copy_to_array_10k_times_1kb	0.0210899	0.02152909	0.021139799600000002	0.0001372085256180049

comparison

name	current	previous	winner	diff
write_http_headers	0.041745234	0.004156324	previous	904%
http_headers_canonical_form	0.087005866	0.087176781	current	0%
http_headers_canonical_form_trimming_whitespace	0.016677892	0.167224441	current	-90%
http_headers_canonical_form_trimming_whitespace_from_short_string	0.015255643	0.152840232	current	-90%
http_headers_canonical_form_trimming_whitespace_from_long_string	0.023606377	0.236946146	current	-90%
bytebuffer_write_12MB_short_string_literals	0.092980217	0.15626777	current	-40%
bytebuffer_write_12MB_short_calculated_strings	0.063091556	0.306472959	current	-79%
bytebuffer_write_12MB_medium_string_literals	0.599583337	0.060320819	previous	893%
bytebuffer_write_12MB_medium_calculated_strings	0.080329622	0.160524061	current	-49%
bytebuffer_write_12MB_large_calculated_strings	0.142810461	0.145666986	current	-1%
bytebuffer_lots_of_rw	0.042762346	0.450269309	current	-90%
bytebuffer_write_http_response_ascii_only_as_string	0.039854021	0.039765483	previous	0%
bytebuffer_write_http_response_ascii_only_as_staticstring	0.031400039	0.031767899	current	-1%
bytebuffer_write_http_response_some_nonascii_as_string	0.03994209	0.039992365	current	0%
bytebuffer_write_http_response_some_nonascii_as_staticstring	0.031381504	0.031762613	current	-1%
no-net_http1_1k_reqs_1_conn	0.01062603	n/a	n/a	n/a%
http1_1k_reqs_1_conn	0.060817924	n/a	n/a	n/a%
http1_1k_reqs_100_conns	0.08865786	n/a	n/a	n/a%
future_whenallsucceed_100k_immediately_succeeded_off_loop	0.073131634	0.078185913	current	-6%
future_whenallsucceed_100k_immediately_succeeded_on_loop	0.07422346	0.079444721	current	-6%
future_whenallsucceed_10k_deferred_off_loop	0.029456798	n/a	n/a	n/a%
future_whenallsucceed_10k_deferred_on_loop	0.012483152	n/a	n/a	n/a%
future_whenallcomplete_100k_immediately_succeeded_off_loop	0.031372871	0.031299125	previous	0%
future_whenallcomplete_100k_immediately_succeeded_on_loop	0.031529132	0.031184463	previous	1%
future_whenallcomplete_10k_deferred_off_loop	0.022117573	n/a	n/a	n/a%
future_whenallcomplete_100k_deferred_on_loop	0.065622477	0.064373813	previous	1%
future_reduce_10k_futures	0.037678581	0.038389384	current	-1%
future_reduce_into_10k_futures	0.036206727	0.036942603	current	-1%
channel_pipeline_1m_events	0.097171067	0.097150271	previous	0%
websocket_encode_50b_space_at_front_100k_frames_cow	0.049856036	n/a	n/a	n/a%
websocket_encode_50b_space_at_front_1m_frames_cow_masking	0.673284879	0.067069988	previous	903%
websocket_encode_1kb_space_at_front_1m_frames_cow	0.527848584	n/a	n/a	n/a%
websocket_encode_50b_no_space_at_front_100k_frames_cow	0.049495636	n/a	n/a	n/a%
websocket_encode_1kb_no_space_at_front_100k_frames_cow	0.052841754	0.052726584	previous	0%
websocket_encode_50b_space_at_front_100k_frames	0.068058419	n/a	n/a	n/a%
websocket_encode_50b_space_at_front_10k_frames_masking	0.00856399	0.08473578	current	-89%
websocket_encode_1kb_space_at_front_10k_frames	0.013216157	n/a	n/a	n/a%
websocket_encode_50b_no_space_at_front_100k_frames	0.06799402	n/a	n/a	n/a%
websocket_encode_1kb_no_space_at_front_10k_frames	0.012592638	n/a	n/a	n/a%
websocket_decode_125b_10k_frames	0.011285869	n/a	n/a	n/a%
websocket_decode_125b_with_a_masking_key_10k_frames	0.011597051	n/a	n/a	n/a%
websocket_decode_64kb_10k_frames	0.011745145	n/a	n/a	n/a%
websocket_decode_64kb_with_a_masking_key_10k_frames	0.011963872	n/a	n/a	n/a%
websocket_decode_64kb_+1_10k_frames	0.011689181	n/a	n/a	n/a%
websocket_decode_64kb_+1_with_a_masking_key_10k_frames	0.011813875	n/a	n/a	n/a%
circular_buffer_into_byte_buffer_1kb	0.041278388	0.04126336	previous	0%
circular_buffer_into_byte_buffer_1mb	0.082283792	0.082262485	previous	0%
byte_buffer_view_iterator_1mb	0.020486981	0.020483592	previous	0%
byte_buffer_view_contains_12mb	0.052866455	0.211968198	current	-75%
byte_to_message_decoder_decode_many_small	0.034975305	0.174910624	current	-80%
generate_10k_random_request_keys	0.090162299	0.090187434	current	0%
bytebuffer_rw_10_uint32s	0.028070422	0.296285854	current	-90%
bytebuffer_multi_rw_10_uint32s	0.056512284	0.055609926	previous	1%
lock_1_thread_1M_ops	0.015953284	n/a	n/a	n/a%
lock_2_threads_1M_ops	0.063251877	n/a	n/a	n/a%
lock_4_threads_1M_ops	0.087359015	n/a	n/a	n/a%
lock_8_threads_1M_ops	0.091603643	n/a	n/a	n/a%
schedule_100k_tasks	0.071288435	n/a	n/a	n/a%
schedule_and_run_100k_tasks	0.417008028	n/a	n/a	n/a%
execute_100k_tasks	0.212708943	n/a	n/a	n/a%
bytebufferview_copy_to_array_100k_times_1kb	0.012637182	n/a	n/a	n/a%
circularbuffer_copy_to_array_10k_times_1kb	0.0210899	n/a	n/a	n/a%

significant differences found

simonjbeaumont · 2022-03-24T14:17:35Z

@Lukasa well that was fun...

Lukasa · 2022-03-24T14:21:26Z

Great work though!

tomerd · 2022-03-24T16:48:09Z

@simonjbeaumont wrt to buffering, how is the performance test outputting? maybe it should flush more aggressively? the harness script is not doing anything to get in the way

simonjbeaumont · 2022-03-25T12:33:24Z

> @simonjbeaumont wrt to buffering, how is the performance test outputting? maybe it should flush more aggressively? the harness script is not doing anything to get in the way

Yep. Addressed in #2072.
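To illustrate the idea of flushing aggressively (this is not the change made in #2072, which concerns the Swift tester), here is a small Python sketch that flushes after every measurement, so that whatever was measured before a process is killed still reaches the log. The function name and sample values are made up.

```python
import io
import sys
from typing import Iterable, TextIO


def report_measurements(name: str, samples: Iterable[float], out: TextIO = sys.stdout) -> None:
    """Write each measurement as soon as it is available and flush immediately."""
    out.write(f"measuring: {name}: ")
    out.flush()
    for sample in samples:
        out.write(f"{sample}, ")
        out.flush()  # do not let results sit in a userspace buffer
    out.write("\n")
    out.flush()


# Demonstration against an in-memory stream instead of the real stdout.
buffer = io.StringIO()
report_measurements("example_benchmark", [0.1, 0.2], out=buffer)
```

In a real harness the default `out=sys.stdout` would be used; the in-memory buffer here only makes the behaviour easy to inspect.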

hassila · 2022-04-14T17:02:53Z

> > the build timed out after 10 minutes.
>
> Well FWICT each test is run 10 times and an average is taken:

A bit late to the party, but just want to mention that in my experience it might be useful to also drop e.g. the two worst runs before taking the average if you want better test stability.

It would still capture any systemic regressions and/or improvements introduced in the remaining samples while removing most ad-hoc outliers caused by other noise on the test machine.
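A minimal sketch of that suggestion, assuming "worst" means the slowest runs and using the ten measurements quoted earlier in the thread for websocket_decode_64kb_with_a_masking_key_100k_frames (the function name is hypothetical):

```python
from statistics import fmean
from typing import Sequence


def mean_dropping_worst(samples: Sequence[float], drop: int = 2) -> float:
    """Average the samples after discarding the `drop` slowest (largest) ones."""
    if drop < 0 or drop >= len(samples):
        raise ValueError("drop must be non-negative and smaller than the sample count")
    kept = sorted(samples)[: len(samples) - drop]
    return fmean(kept)


# The ten runs reported by CI for one benchmark, in seconds.
runs = [
    0.125121872, 0.124363388, 0.124045883, 0.12475775, 0.124550551,
    0.123625381, 0.124706225, 0.124888407, 0.124184521, 0.124370663,
]

plain_mean = fmean(runs)
trimmed_mean = mean_dropping_worst(runs)  # ignores the two slowest runs
```

Dropping only the slowest runs biases the result slightly low, so the same rule would have to be applied to both sides of any comparison.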

simonjbeaumont force-pushed the run-longer-benchmarks-to-increase-snr branch 2 times, most recently from 1260844 to a75b7c9 Compare March 9, 2022 12:21

Lukasa added the needs-no-version-bump For PRs that when merged do not need a bump in version number. label Mar 9, 2022

simonjbeaumont force-pushed the run-longer-benchmarks-to-increase-snr branch from a75b7c9 to 9d3eb26 Compare March 9, 2022 14:44

Increase runtime of performance tests to O(10 ms) to increase SNR

079cc84

simonjbeaumont force-pushed the run-longer-benchmarks-to-increase-snr branch from 9d3eb26 to 079cc84 Compare March 16, 2022 20:59

Merge branch 'main' into run-longer-benchmarks-to-increase-snr

8e763dd

simonjbeaumont changed the title ~~Increase runtime of performance tests to O(100 ms) to increase SNR~~ Increase runtime of performance tests to O(10 ms) to increase SNR Mar 16, 2022

ENABLE ONLY tests 56...58

add1061

ENABLE ONLY test 56

a6fa2f3

ENABLE ONLY test 57

128ad3c

Signed-off-by: Si Beaumont <beaumont@apple.com>

Enable all tests but drop tasks from 1M to 100k

a8b4207

Signed-off-by: Si Beaumont <beaumont@apple.com>

Lukasa approved these changes Mar 24, 2022

Lukasa enabled auto-merge (squash) March 24, 2022 14:23

Lukasa merged commit 1af9615 into apple:main Mar 24, 2022

simonjbeaumont mentioned this pull request Mar 24, 2022

Improve the performance of copying CircularBuffer #2059

Merged

simonjbeaumont mentioned this pull request Mar 25, 2022

Use unbuffered IO for stdout in NIOPerformanceTester #2072

Merged

simonjbeaumont mentioned this pull request May 17, 2022

empty: Checking out the noise on the perf results #2120

Closed
