Add a dedicated method for disconnecting TLS connections#10293

Merged
yhabteab merged 17 commits into support/2.14 from
graceful-tls-disconnect-214
Jan 14, 2025

Conversation

Calling `AsioTlsStream::async_shutdown()` performs a TLS shutdown, which
exchanges messages (that's why it takes a `yield_context`) and thus has the
potential to block the coroutine. Therefore, it should be protected with a
timeout. As `async_shutdown()` doesn't accept a timeout itself, this has to be
implemented using a timer. So far, these timers are scattered throughout the
codebase, with some places missing them entirely. This commit adds helper
functions to properly shut down a TLS connection with a single function call.
These new helper functions allow deduplicating the timeout handling for
`async_shutdown()`.
This new helper function has proper timeout handling, which was missing here.
The reason for introducing `AsioTlsStream::GracefulDisconnect()` was to handle
the TLS shutdown properly with a timeout, since the shutdown exchanges messages
and can block. However, the implementation of this timeout involves spawning
coroutines, which are redundant in some cases. This commit adds comments to the
remaining calls of `async_shutdown()` stating why calling it is safe in these places.
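
The timeout-guarded shutdown described in these commit messages can be sketched with the standard library alone. This is only an analogy under stated assumptions, not the code from this PR: `FakeTlsStream`, its methods, and the free function `GracefulDisconnect()` shown here are hypothetical stand-ins, and a thread plays the role that a timer and coroutine play in the real Boost.Asio code. The point it illustrates is that the shutdown itself takes no timeout, so a separate timer cancels it once the deadline passes.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a TLS stream; NOT Icinga 2's AsioTlsStream.
class FakeTlsStream
{
public:
    // Blocks until the peer answers the shutdown or the operation is cancelled.
    // Returns true only if the peer answered.
    bool Shutdown()
    {
        std::unique_lock<std::mutex> lock(m_Mutex);
        m_Cv.wait(lock, [this] { return m_PeerReplied || m_Cancelled; });
        return m_PeerReplied;
    }

    void PeerReplied() { Set(m_PeerReplied); }
    void Cancel() { Set(m_Cancelled); }
    void ForceClose() { m_Closed = true; }
    bool IsClosed() const { return m_Closed; }

private:
    void Set(bool& flag)
    {
        {
            std::lock_guard<std::mutex> lock(m_Mutex);
            flag = true;
        }
        m_Cv.notify_all();
    }

    std::mutex m_Mutex;
    std::condition_variable m_Cv;
    bool m_PeerReplied = false;
    bool m_Cancelled = false;
    bool m_Closed = false;
};

// Hypothetical helper: one call that shuts down with a timeout and always
// closes the stream afterwards. Returns true if the shutdown completed cleanly.
bool GracefulDisconnect(FakeTlsStream& stream, std::chrono::milliseconds timeout)
{
    assert(timeout.count() > 0);

    std::mutex timerMutex;
    std::condition_variable timerCv;
    bool done = false;

    // The "timer": cancels the pending shutdown unless it is disarmed in time.
    std::thread timer([&] {
        std::unique_lock<std::mutex> lock(timerMutex);
        if (!timerCv.wait_for(lock, timeout, [&] { return done; })) {
            stream.Cancel();
        }
    });

    bool clean = stream.Shutdown();

    // Disarm the timer and wait for it, so it cannot fire after we return.
    {
        std::lock_guard<std::mutex> lock(timerMutex);
        done = true;
    }
    timerCv.notify_one();
    timer.join();

    stream.ForceClose();
    return clean;
}
```

With a peer that never answers, `GracefulDisconnect(stream, std::chrono::milliseconds(50))` returns `false` after the timeout instead of blocking forever, and the stream is closed either way.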
@yhabteab yhabteab added this to the 2.14.4 milestone Jan 13, 2025
@yhabteab yhabteab requested a review from julianbrost January 13, 2025 09:34
@cla-bot cla-bot bot added the cla/signed label Jan 13, 2025
yhabteab and others added 13 commits January 13, 2025 10:36
PR #7445 incorrectly assumed that a peer that had already disconnected
and never reconnected was due to the endpoint client being dropped after
a successful socket shutdown. However, the issue at that time was that
there was not a single timeout guard that could cancel the `async_shutdown`
call, potentially blocking indefinitely. Although removing the client from
the cache early might have allowed the endpoint to reconnect, it did not
resolve the underlying problem. Now that we have a proper cancellation
timeout, we can wait until the currently used socket is fully closed
before dropping the client from our cache. When our socket termination
works reliably, the `ApiListener` reconnect timer should attempt to
reconnect this endpoint after the next tick. Additionally, we now have
logs both for before and after socket termination, which may help
identify if it is hanging somewhere in between.
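
The ordering described in this commit message (log, terminate the socket, log again, and only then drop the client from the cache) can be sketched as follows. Everything here is hypothetical and for illustration only: `Endpoint`, `DisconnectClient()`, the integer client IDs, and the log texts are not taken from the Icinga 2 codebase, and the sketch assumes the shutdown callback is already timeout-guarded, so it is guaranteed to return.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>
#include <vector>

// Hypothetical endpoint holding a cache of connected client IDs.
struct Endpoint
{
    std::set<int> clients;
};

// Hypothetical helper: terminates the socket first and removes the client
// from the cache only afterwards. Returns the log lines it produced.
std::vector<std::string> DisconnectClient(
    Endpoint& endpoint, int clientId, const std::function<void()>& shutdownSocket)
{
    assert(shutdownSocket != nullptr);

    std::vector<std::string> log;

    log.push_back("Disconnecting API client"); // before socket termination
    shutdownSocket();                          // assumed to be timeout-guarded
    log.push_back("API client disconnected");  // after socket termination

    // Only now is the client dropped, so a reconnect attempt cannot race
    // with a socket that is still being shut down.
    endpoint.clients.erase(clientId);

    return log;
}
```

If only the first log line ever appears, the shutdown is the step that hangs, which is the diagnostic benefit the commit message mentions.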
It's not used. Also, the callback shall run completely at once. This ensures that it won't (continue to) run once another coroutine on the strand calls Timeout#Cancel().
…&&), #operator=(const Timeout&), #operator=(Timeout&&)
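
The guarantee stated above for `Timeout#Cancel()` (the callback runs in one go, and never once another handler on the same strand has cancelled it) can be illustrated with a simplified model. This is a sketch under loud assumptions, not Icinga 2's `Timeout` class: `Strand` here is a hypothetical single-threaded handler queue, there is no real timer, and "expiry" is modelled as a handler that is already queued on the strand.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical single-threaded "strand": runs queued handlers one after
// another, never concurrently and never interleaved.
class Strand
{
public:
    void Post(std::function<void()> handler) { m_Queue.push_back(std::move(handler)); }

    void Run()
    {
        while (!m_Queue.empty()) {
            auto handler = std::move(m_Queue.front());
            m_Queue.pop_front();
            handler();
        }
    }

private:
    std::deque<std::function<void()>> m_Queue;
};

// Simplified model of a cancellable timeout (no real timer involved).
class Timeout
{
public:
    Timeout(Strand& strand, std::function<void()> onTimeout)
        : m_Cancelled(std::make_shared<bool>(false))
    {
        assert(onTimeout != nullptr);

        auto cancelled = m_Cancelled;

        // The expiry handler checks the flag once and then runs the callback
        // to completion without yielding, so no other handler on the strand
        // can observe it half-finished.
        strand.Post([cancelled, callback = std::move(onTimeout)] {
            if (!*cancelled) {
                callback();
            }
        });
    }

    // Must be called from a handler on the same strand (or before Run()).
    void Cancel() { *m_Cancelled = true; }

private:
    std::shared_ptr<bool> m_Cancelled;
};
```

Because handlers on the strand never interleave, a `Cancel()` issued by an earlier handler is always seen by the expiry handler before it decides whether to run the callback.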
@yhabteab yhabteab requested review from Al2Klimov and oxzi and removed request for julianbrost January 13, 2025 10:14
@oxzi oxzi left a comment

Compared the effective diff between this PR and the three linked ones. Seems to be the same delta.

@yhabteab yhabteab removed the request for review from Al2Klimov January 14, 2025 09:03
@yhabteab yhabteab merged commit 2c0925c into support/2.14 Jan 14, 2025
@yhabteab yhabteab deleted the graceful-tls-disconnect-214 branch January 14, 2025 09:03
@Al2Klimov Al2Klimov mentioned this pull request Jan 22, 2025
4 participants