Call the cancellationTask of Scheduled directly #2011

Merged

Conversation

FranzBusch
Member

Motivation:

In my previous PR #2010, I decreased the allocations for both scheduleTask and execute by one. There are no more allocations left to remove from execute; however, scheduleTask still makes a couple of allocations that we can try to get rid of.

Modifications:

This PR removes two allocations inside Scheduled, where we were using the passed-in EventLoopPromise to call the cancellationTask once the EventLoopFuture of the promise fails. This requires two allocations, inside whenFailure and inside _whenComplete. However, since we are passing the cancellationTask to Scheduled anyhow, and Scheduled is also the one failing the promise from the cancel() method, we can store the cancellationTask inside Scheduled and call it from cancel() directly instead of going through the future.

Importantly, the cancellationTask must not retain the ScheduledTask.task; otherwise we would change the semantics and retain the ScheduledTask.task longer than necessary. My previous PR #2010 already removed the retain from the cancellationTask closure, so we are good to go ahead and store the cancellationTask inside Scheduled now.
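To illustrate the change described above, here is a minimal sketch in plain Swift. It does not use SwiftNIO; `MiniPromise`, `ScheduledViaFuture`, `ScheduledDirect` and `CancelledError` are hypothetical stand-ins invented for this example, and the callback counter is only a rough proxy for the allocations the real `whenFailure` path incurs.

```swift
// Hypothetical, simplified stand-ins; these are NOT the real SwiftNIO types.
struct CancelledError: Error {}

/// A tiny promise that counts how many callbacks were registered on it.
final class MiniPromise<Value> {
    private(set) var result: Result<Value, Error>?
    private(set) var registeredCallbacks = 0
    private var callbacks: [(Result<Value, Error>) -> Void] = []

    func whenFailure(_ body: @escaping (Error) -> Void) {
        registeredCallbacks += 1
        callbacks.append { outcome in
            if case .failure(let error) = outcome { body(error) }
        }
    }

    func fail(_ error: Error) {
        guard result == nil else { return }  // complete at most once
        let failure: Result<Value, Error> = .failure(error)
        result = failure
        let pending = callbacks
        callbacks.removeAll()
        for callback in pending { callback(failure) }
    }
}

/// Before: the cancellation task is reached through a callback on the future.
struct ScheduledViaFuture<Value> {
    let promise: MiniPromise<Value>

    init(promise: MiniPromise<Value>, cancellationTask: @escaping () -> Void) {
        self.promise = promise
        promise.whenFailure { _ in cancellationTask() }
    }

    func cancel() { promise.fail(CancelledError()) }
}

/// After: the cancellation task is stored and called from cancel() directly.
struct ScheduledDirect<Value> {
    let promise: MiniPromise<Value>
    let cancellationTask: () -> Void

    func cancel() {
        promise.fail(CancelledError())
        cancellationTask()
    }
}
```

In this sketch both variants run the cancellation task when `cancel()` is called, but only the first one has to register a callback on the promise to do so.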

Result:

scheduleTask requires two fewer allocations

@FranzBusch FranzBusch added the patch-version-bump-only (for PRs that, when merged, will only cause a bump of the patch version, i.e. 1.0.x -> 1.0.(x+1)) and performance labels Dec 14, 2021
@FranzBusch FranzBusch force-pushed the feature/scheduling-task-allocations-part-2 branch from 9d369f7 to 7a185d4 Compare December 15, 2021 08:31
@FranzBusch
Member Author

@swift-nio-bot test perf please

@apple apple deleted a comment from swift-server-bot Dec 15, 2021
@swift-server-bot

performance report

build id: 91

timestamp: Wed Dec 15 08:39:45 UTC 2021

results

name min max mean std
write_http_headers 0.004165186 0.004316102 0.004205813599999999 4.972365768301604e-05
http_headers_canonical_form 0.087457721 0.088071196 0.0877237837 0.0002705434812188437
http_headers_canonical_form_trimming_whitespace 0.166121134 0.166680982 0.1665388149 0.00015767080862385155
http_headers_canonical_form_trimming_whitespace_from_short_string 0.151979483 0.154099416 0.1526977324 0.0006185987199773342
http_headers_canonical_form_trimming_whitespace_from_long_string 0.236236808 0.236768887 0.23639276429999997 0.00019594956756554649
bytebuffer_write_12MB_short_string_literals 0.540474513 0.545257935 0.5418411197 0.0013320743162926992
bytebuffer_write_12MB_short_calculated_strings 0.537279177 0.539644905 0.5386034690000001 0.0008426689587247195
bytebuffer_write_12MB_medium_string_literals 0.178940646 0.18346667 0.18105625679999998 0.0016036284286611994
bytebuffer_write_12MB_medium_calculated_strings 0.226212733 0.226825106 0.2264438351 0.00021107866236939996
bytebuffer_write_12MB_large_calculated_strings 0.146095844 0.149898543 0.1471552027 0.0010634316578047407
bytebuffer_lots_of_rw 0.448634338 0.46087472 0.4515590671 0.004139316098232692
bytebuffer_write_http_response_ascii_only_as_string 0.041193806 0.041741625 0.0413561668 0.00020375569562907639
bytebuffer_write_http_response_ascii_only_as_staticstring 0.032163749 0.03297106 0.0323794628 0.0003120143473146357
bytebuffer_write_http_response_some_nonascii_as_string 0.041158026 0.041828929 0.041341088 0.0002486148754372249
bytebuffer_write_http_response_some_nonascii_as_staticstring 0.03263154 0.033141886 0.0327959268 0.00016925316739383188
no-net_http1_10k_reqs_1_conn 0.107124683 0.108500865 0.10786284589999999 0.0005065118318728698
http1_10k_reqs_1_conn 0.6044588 0.608852191 0.6070550274000001 0.0013232878087152847
http1_10k_reqs_100_conns 0.594144737 0.60021081 0.5971973462 0.001937806164482147
future_whenallsucceed_100k_immediately_succeeded_off_loop 0.073947586 0.075233846 0.0745224867 0.0004166419022686152
future_whenallsucceed_100k_immediately_succeeded_on_loop 0.074909507 0.082296651 0.0763059716 0.0021486378566815136
future_whenallsucceed_100k_deferred_off_loop 0.23397515 0.236144893 0.2350038666 0.0007390062952409997
future_whenallsucceed_100k_deferred_on_loop 0.12700891 0.129447187 0.1282036733 0.0006280644127766304
future_whenallcomplete_100k_immediately_succeeded_off_loop 0.031679472 0.032316968 0.0319783031 0.0002137628023084629
future_whenallcomplete_100k_immediately_succeeded_on_loop 0.031827899 0.032655857 0.032158638 0.0002816563098581132
future_whenallcomplete_100k_deferred_off_loop 0.161233211 0.167030165 0.1634776125 0.0018672386151479463
future_whenallcomplete_100k_deferred_on_loop 0.06422727 0.068721744 0.06501174550000001 0.001384893059563505
future_reduce_10k_futures 0.03847232 0.039021021 0.038685281599999996 0.0001974767537183968
future_reduce_into_10k_futures 0.038142113 0.039088333 0.0385365426 0.0002798000542626437
channel_pipeline_1m_events 0.097132599 0.097257362 0.0971869474 5.200581291483917e-05
websocket_encode_50b_space_at_front_1m_frames_cow 0.502184734 0.503026445 0.5025347447999999 0.0002860062319013371
websocket_encode_50b_space_at_front_1m_frames_cow_masking 0.067009732 0.067543528 0.06722382660000001 0.0002131810134511146
websocket_encode_1kb_space_at_front_100k_frames_cow 0.053147541 0.053606139 0.053292238199999994 0.00019440423195085096
websocket_encode_50b_no_space_at_front_1m_frames_cow 0.504419647 0.504936578 0.5047184459999999 0.00020898652979400102
websocket_encode_1kb_no_space_at_front_100k_frames_cow 0.053156219 0.053634055 0.053312212000000005 0.00021803388041515142
websocket_encode_50b_space_at_front_10k_frames 0.006881674 0.006910102 0.0068890805 8.684369346385765e-06
websocket_encode_50b_space_at_front_10k_frames_masking 0.082570846 0.085146473 0.0835459514 0.000806860764146826
websocket_encode_1kb_space_at_front_1k_frames 0.000777961 0.000788685 0.0007832590999999999 3.730062807037517e-06
websocket_encode_50b_no_space_at_front_10k_frames 0.006641237 0.006671262 0.0066527808999999995 9.03910632812287e-06
websocket_encode_1kb_no_space_at_front_1k_frames 0.000716782 0.001075653 0.0007560656 0.00011232035006692043
websocket_decode_125b_100k_frames 0.113528988 0.114493326 0.1139923714 0.00028101412228894
websocket_decode_125b_with_a_masking_key_100k_frames 0.11629334 0.11691935 0.1166196217 0.000254701726260369
websocket_decode_64kb_100k_frames 0.116269648 0.116926056 0.1166045953 0.0002624512423242972
websocket_decode_64kb_with_a_masking_key_100k_frames 0.119078699 0.120179468 0.1195042052 0.0003365496534331228
websocket_decode_64kb_+1_100k_frames 0.116510436 0.117177447 0.11686523310000001 0.0002761496140430045
websocket_decode_64kb_+1_with_a_masking_key_100k_frames 0.119406705 0.120069331 0.1197327202 0.0002580271076645525
circular_buffer_into_byte_buffer_1kb 0.041228017 0.041656281 0.0413319135 0.00017031898820562804
circular_buffer_into_byte_buffer_1mb 0.082253037 0.08272205 0.0824585387 0.0002103572135181564
byte_buffer_view_iterator_1mb 0.020486884 0.02092305 0.0205405957 0.0001346983936529225
byte_to_message_decoder_decode_many_small 0.17520385 0.175813306 0.1756484204 0.0001662575242228511
generate_10k_random_request_keys 0.089818578 0.090119172 0.08997608189999999 0.00011420149925309998
bytebuffer_rw_10_uint32s 0.306457095 0.308425737 0.3074215192 0.0006255376830751793
bytebuffer_multi_rw_10_uint32s 0.056714977 0.05811964 0.0573843836 0.0005183912892006914
lock_1_thread_10M_ops 0.159265153 0.159628207 0.1593786681 9.506586619046953e-05
lock_2_threads_10M_ops 0.928384425 0.973057969 0.9502458329 0.01445297346992824
lock_4_threads_10M_ops 0.967872602 1.011488876 0.9908614733000001 0.015211928775695657
lock_8_threads_10M_ops 0.963611936 0.997541467 0.9856889333 0.008884626853777415
schedule_10000_tasks 0.004537941 0.006792258 0.0048422611 0.0006893927346105974
schedule_and_run_10000_tasks 0.018120923 0.018679179 0.0183508429 0.00015459378957444837
execute_10000 0.008791002 0.00920704 0.0088471362 0.00012718085654846005

comparison

name current previous winner diff
write_http_headers 0.004165186 0.004325808 current -3%
http_headers_canonical_form 0.087457721 0.087792958 current 0%
http_headers_canonical_form_trimming_whitespace 0.166121134 0.165687643 previous 0%
http_headers_canonical_form_trimming_whitespace_from_short_string 0.151979483 0.151502188 previous 0%
http_headers_canonical_form_trimming_whitespace_from_long_string 0.236236808 0.234946483 previous 0%
bytebuffer_write_12MB_short_string_literals 0.540474513 0.53222344 previous 1%
bytebuffer_write_12MB_short_calculated_strings 0.537279177 0.53121642 previous 1%
bytebuffer_write_12MB_medium_string_literals 0.178940646 0.178244762 previous 0%
bytebuffer_write_12MB_medium_calculated_strings 0.226212733 0.227810106 current 0%
bytebuffer_write_12MB_large_calculated_strings 0.146095844 0.144872252 previous 0%
bytebuffer_lots_of_rw 0.448634338 0.440783434 previous 1%
bytebuffer_write_http_response_ascii_only_as_string 0.041193806 0.041037679 previous 0%
bytebuffer_write_http_response_ascii_only_as_staticstring 0.032163749 0.03172142 previous 1%
bytebuffer_write_http_response_some_nonascii_as_string 0.041158026 0.041282768 current 0%
bytebuffer_write_http_response_some_nonascii_as_staticstring 0.03263154 0.031931522 previous 2%
no-net_http1_10k_reqs_1_conn 0.107124683 0.109685748 current -2%
http1_10k_reqs_1_conn 0.6044588 0.599968129 previous 0%
http1_10k_reqs_100_conns 0.594144737 0.594821994 current 0%
future_whenallsucceed_100k_immediately_succeeded_off_loop 0.073947586 0.073978034 current 0%
future_whenallsucceed_100k_immediately_succeeded_on_loop 0.074909507 0.075369018 current 0%
future_whenallsucceed_100k_deferred_off_loop 0.23397515 0.233529562 previous 0%
future_whenallsucceed_100k_deferred_on_loop 0.12700891 0.127089255 current 0%
future_whenallcomplete_100k_immediately_succeeded_off_loop 0.031679472 0.031054764 previous 2%
future_whenallcomplete_100k_immediately_succeeded_on_loop 0.031827899 0.030816301 previous 3%
future_whenallcomplete_100k_deferred_off_loop 0.161233211 0.160567872 previous 0%
future_whenallcomplete_100k_deferred_on_loop 0.06422727 0.063192659 previous 1%
future_reduce_10k_futures 0.03847232 0.037521456 previous 2%
future_reduce_into_10k_futures 0.038142113 0.036470829 previous 4%
channel_pipeline_1m_events 0.097132599 0.097135423 current 0%
websocket_encode_50b_space_at_front_1m_frames_cow 0.502184734 0.492372453 previous 1%
websocket_encode_50b_space_at_front_1m_frames_cow_masking 0.067009732 0.066346687 previous 0%
websocket_encode_1kb_space_at_front_100k_frames_cow 0.053147541 0.052234785 previous 1%
websocket_encode_50b_no_space_at_front_1m_frames_cow 0.504419647 0.493987666 previous 2%
websocket_encode_1kb_no_space_at_front_100k_frames_cow 0.053156219 0.052404521 previous 1%
websocket_encode_50b_space_at_front_10k_frames 0.006881674 0.006571163 previous 4%
websocket_encode_50b_space_at_front_10k_frames_masking 0.082570846 0.081477419 previous 1%
websocket_encode_1kb_space_at_front_1k_frames 0.000777961 0.000760866 previous 2%
websocket_encode_50b_no_space_at_front_10k_frames 0.006641237 0.006559155 previous 1%
websocket_encode_1kb_no_space_at_front_1k_frames 0.000716782 0.000706397 previous 1%
websocket_decode_125b_100k_frames 0.113528988 0.113479355 previous 0%
websocket_decode_125b_with_a_masking_key_100k_frames 0.11629334 0.117016262 current 0%
websocket_decode_64kb_100k_frames 0.116269648 0.116602352 current 0%
websocket_decode_64kb_with_a_masking_key_100k_frames 0.119078699 0.119026453 previous 0%
websocket_decode_64kb_+1_100k_frames 0.116510436 0.117202803 current 0%
websocket_decode_64kb_+1_with_a_masking_key_100k_frames 0.119406705 0.119256634 previous 0%
circular_buffer_into_byte_buffer_1kb 0.041228017 0.041253553 current 0%
circular_buffer_into_byte_buffer_1mb 0.082253037 0.082247206 previous 0%
byte_buffer_view_iterator_1mb 0.020486884 0.020482017 previous 0%
byte_to_message_decoder_decode_many_small 0.17520385 0.175206024 current 0%
generate_10k_random_request_keys 0.089818578 0.090759003 current -1%
bytebuffer_rw_10_uint32s 0.306457095 0.302710001 previous 1%
bytebuffer_multi_rw_10_uint32s 0.056714977 0.056212724 previous 0%
lock_1_thread_10M_ops 0.159265153 0.159206369 previous 0%
lock_2_threads_10M_ops 0.928384425 0.901443469 previous 2%
lock_4_threads_10M_ops 0.967872602 0.963074809 previous 0%
lock_8_threads_10M_ops 0.963611936 0.984081425 current -2%
schedule_10000_tasks 0.004537941 0.00735269 current -38%
schedule_and_run_10000_tasks 0.018120923 0.023618556 current -23%
execute_10000 0.008791002 0.008810439 current 0%

significant differences found

@FranzBusch FranzBusch force-pushed the feature/scheduling-task-allocations-part-2 branch from 7a185d4 to 7a9b590 Compare December 15, 2021 08:43
Contributor

@glbrntt glbrntt left a comment

Looks great modulo a couple of nits.

@@ -21,18 +21,13 @@ import Dispatch
/// will be notified once the execution is complete.
public struct Scheduled<T> {
/* private but usableFromInline */ @usableFromInline let _promise: EventLoopPromise<T>

@usableFromInline let cancellationTask: (() -> Void)
Contributor

Presumably this would be private if it wasn't @usableFromInline. By convention in these cases we prefix the name with a _ so this should be _cancellationTask.
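As a small illustration of the naming convention mentioned here, the following sketch shows a stored property that would otherwise be `private` but is marked `@usableFromInline` so that an `@inlinable` method can read it. The `Wrapper` type and its members are hypothetical and unrelated to the actual `Scheduled` code.

```swift
// Hypothetical example type; not part of SwiftNIO.
public struct Wrapper {
    // Would be `private` if inlinable code did not need to see it,
    // hence the leading underscore in the name.
    @usableFromInline let _value: Int

    public init(value: Int) {
        self._value = value
    }

    // Inlinable code may only reference public or @usableFromInline declarations.
    @inlinable
    public func doubled() -> Int {
        return self._value * 2
    }
}
```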

Comment on lines 192 to 193
while self.scheduledTasks.peek() != nil {
self.scheduledTasks.pop()?.fail(EventLoopError.shutdown)
Contributor

Why not while let task = self.scheduledTasks.pop() { ... }?

@FranzBusch FranzBusch force-pushed the feature/scheduling-task-allocations-part-2 branch from 7a9b590 to 7883758 Compare December 15, 2021 09:16
while self.scheduledTasks.pop() != nil {}
while let task = self.scheduledTasks.pop() {
task.fail(EventLoopError.shutdown)
}
Contributor

I suspect this is a behavioural change. Can you write a test that encodes this?

Member Author

@Lukasa There is a test that exercises this, testFinishWithRecursivelyScheduledTasks, and without this change the test actually crashes because the future is leaking. I am still not 100% sure why it worked before. Should I write an additional test that checks that the error is actually EventLoopError.shutdown?

Contributor

Yes please!

Member Author

I found a different test that was already testing the EmbeddedEventLoop specifically when draining the remaining tasks. I just added an assertion that makes sure the last task is failed with the proper error. Is that fine with you?
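The drain loop discussed in this thread can be sketched in plain Swift as follows. `MiniTask`, `MiniQueue`, `LoopError` and `drain` are hypothetical names made up for this example; the real code uses NIO's own priority queue and `EventLoopError.shutdown`.

```swift
// Hypothetical stand-ins; NOT the real SwiftNIO types.
enum LoopError: Error, Equatable {
    case shutdown
}

final class MiniTask {
    let deadline: Int
    private(set) var failure: Error?

    init(deadline: Int) { self.deadline = deadline }

    func fail(_ error: Error) { failure = error }
}

struct MiniQueue {
    private var tasks: [MiniTask] = []

    init() {}

    mutating func push(_ task: MiniTask) {
        tasks.append(task)
        // Keep the earliest deadline at the end so that pop() returns it first.
        tasks.sort { $0.deadline > $1.deadline }
    }

    mutating func pop() -> MiniTask? {
        return tasks.popLast()
    }
}

/// Pops every remaining task and fails it, instead of silently dropping it.
func drain(_ queue: inout MiniQueue) {
    while let task = queue.pop() {
        task.fail(LoopError.shutdown)
    }
}
```

Compared with a loop that only pops, every drained task here observes an error, which is the kind of behaviour an assertion on the failure can pin down.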

@FranzBusch FranzBusch force-pushed the feature/scheduling-task-allocations-part-2 branch from 7883758 to 6a935b8 Compare December 15, 2021 10:13
@FranzBusch FranzBusch force-pushed the feature/scheduling-task-allocations-part-2 branch 2 times, most recently from f04f2ab to fbd9eae Compare December 15, 2021 11:36
@FranzBusch FranzBusch force-pushed the feature/scheduling-task-allocations-part-2 branch from fbd9eae to 7fcc0bb Compare December 15, 2021 11:37
@FranzBusch FranzBusch merged commit 3c3e5fc into apple:main Dec 15, 2021
@FranzBusch FranzBusch deleted the feature/scheduling-task-allocations-part-2 branch December 15, 2021 12:46
glbrntt added a commit to glbrntt/swift-nio-transport-services that referenced this pull request Jan 12, 2022
Motivation:

When a task is scheduled using a `DispatchTimerSource` on a `NIOTSEventLoop`
the source is retained via a callback on the promise associated with the
scheduled task. This stops the source from being deinit'd before the task is
run. However, the callback being added is an implementation detail of
`Scheduled` and the retain cycle is incidental.

If that detail changed (such as in
apple/swift-nio#2011) then tasks may not run and
the promise could be leaked.

Modifications:

- Explicitly add a retain cycle between the future and the timer source
  which is broken by the promise being completed and callbacks run.

Result:

The timer source is kept alive until the event fires.
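A minimal sketch of the idea in this commit message, written in plain Swift without Dispatch or NIO. `MiniFuture`, `MiniTimerSource` and `schedule(on:onSourceDeinit:)` are hypothetical names invented here; the sketch only assumes that a completed future drops its callbacks, which is what breaks the cycle.

```swift
// Hypothetical stand-ins; NOT the real Dispatch or SwiftNIO types.
final class MiniFuture {
    private var callbacks: [() -> Void] = []

    func whenComplete(_ body: @escaping () -> Void) {
        callbacks.append(body)
    }

    func complete() {
        let pending = callbacks
        callbacks.removeAll()  // dropping the callbacks breaks the cycle
        for callback in pending { callback() }
    }
}

final class MiniTimerSource {
    var handler: (() -> Void)?
    private let onDeinit: () -> Void

    init(onDeinit: @escaping () -> Void) { self.onDeinit = onDeinit }

    func fire() { handler?() }

    deinit { onDeinit() }
}

/// Returns a closure that fires the timer; it holds the source only weakly,
/// so the explicit cycle is the only thing keeping the source alive.
func schedule(on future: MiniFuture, onSourceDeinit: @escaping () -> Void) -> () -> Void {
    let source = MiniTimerSource(onDeinit: onSourceDeinit)
    // source -> handler -> future
    source.handler = { future.complete() }
    // future -> callback -> source (the explicit retain)
    future.whenComplete { _ = source }
    return { [weak source] in source?.fire() }
}
```

In this sketch the source outlives `schedule` because the future's callback retains it, and it is released once the timer fires and the future completes.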
Lukasa pushed a commit to apple/swift-nio-transport-services that referenced this pull request Jan 12, 2022