graceful connection shutdown #13976

@icing (Contributor) commented Jun 19, 2024

When libcurl discards a connection there are two phases this may go through: "shutdown" and "closing". If a connection is aborted, the shutdown phase is skipped and it is closed right away.

The connection filters attached to the connection implement the phases in their do_shutdown() and do_close() callbacks. Filters now carry a shutdown flag next to connected to keep track of the shutdown operation.

Filters are shut down from top to bottom. If a filter is not connected, its shutdown is skipped. Notable filters that do something during shutdown are HTTP/2 and TLS. HTTP/2 sends the GOAWAY frame. TLS sends its close notify and expects to receive a close notify from the server.
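
The top-to-bottom walk can be sketched roughly as below. This is a simplified illustration, not libcurl's actual filter code: `struct filter`, `chain_shutdown()`, the `attempts` counter and the two sample callbacks are hypothetical names invented for this example. Only the `connected`/`shutdown` flags and the `do_shutdown()` callback name come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified filter; NOT libcurl's real struct. */
struct filter {
  struct filter *next; /* next filter below, NULL at the bottom */
  bool connected;      /* filter completed its connect */
  bool shutdown;       /* filter completed its shutdown */
  int attempts;        /* illustration only: number of do_shutdown calls */
  /* returns true when done, false when it has to be retried (EAGAIN) */
  bool (*do_shutdown)(struct filter *f);
};

/* Walk the chain from top to bottom. Returns true once every connected
   filter is shut down, false if one needs another attempt later. */
static bool chain_shutdown(struct filter *top)
{
  struct filter *f;
  for(f = top; f; f = f->next) {
    if(!f->connected || f->shutdown)
      continue; /* never connected or already done: skip */
    assert(f->do_shutdown);
    if(!f->do_shutdown(f))
      return false; /* would block, retry from this filter later */
    f->shutdown = true;
  }
  return true;
}

/* Sample callback: finishes right away. */
static bool shutdown_now(struct filter *f)
{
  f->attempts++;
  return true;
}

/* Sample callback: needs a second attempt, like a TLS filter that
   waits for the peer's close notify. */
static bool shutdown_on_retry(struct filter *f)
{
  return ++f->attempts > 1;
}
```

A caller would invoke `chain_shutdown()` again whenever the socket becomes ready. Filters already flagged as `shutdown` are skipped on the repeat call, so each filter's work is done only once.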

As sends and receives may EAGAIN on the network, a shutdown is often not successful right away and needs to poll the connection's socket(s). To facilitate this, such connections are placed on a new shutdown list inside the connection cache.

Since managing this list requires the cooperation of a multi handle, only the connection cache belonging to a multi handle is used. If a connection was in another cache when being discarded, it is removed from there and added to the multi's cache shutdown list. If no multi handle is available at that time, the connection is shut down and closed in a one-time, best-effort attempt.

When a multi handle is destroyed, all connections still on the shutdown list are discarded with a final shutdown attempt and close. In curl debug builds, the environment variable CURL_GRACEFUL_SHUTDOWN can be set to make this graceful with a timeout in milliseconds given by the variable.

The shutdown list is limited to the maximum number of connections configured for a multi cache, set via CURLMOPT_MAX_TOTAL_CONNECTIONS. When the limit is reached, the oldest connection on the shutdown list is discarded.
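
The "discard the oldest when full" rule might look like the sketch below. It is a minimal illustration under stated assumptions: `struct shutdown_list`, `shutdown_list_add()` and the fixed `SHUTDOWN_LIST_MAX` are hypothetical stand-ins, connections are represented by plain integer ids, and the real limit comes from the multi's configuration rather than a compile-time constant.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stand-in for the limit configured on the multi handle. */
#define SHUTDOWN_LIST_MAX 4

/* Hypothetical shutdown list; connections are just ids here. */
struct shutdown_list {
  int conn_ids[SHUTDOWN_LIST_MAX]; /* oldest entry first */
  size_t count;
};

/* Append conn_id. If the list is full, the oldest entry is dropped
   first. Returns the id of the discarded connection, or -1 if no
   connection had to be discarded. */
static int shutdown_list_add(struct shutdown_list *l, int conn_id)
{
  int discarded = -1;
  size_t i;
  assert(l && conn_id >= 0);
  if(l->count == SHUTDOWN_LIST_MAX) {
    discarded = l->conn_ids[0];
    for(i = 1; i < l->count; i++)
      l->conn_ids[i - 1] = l->conn_ids[i];
    l->count--;
  }
  l->conn_ids[l->count++] = conn_id;
  return discarded;
}
```

In this sketch the caller is expected to give the returned connection its final shutdown attempt and close it, matching the "discarded" wording above.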

Operation

  • In multi_wait() and multi_waitfds(), iterate over all shutdown connections in the multi's cache. Each such connection contributes sockets and POLLIN/OUT events.
  • In multi_perform(), call perform on the multi's cache to process the shutdown list.
  • For event-based multis (multi->socket_cb set), add the sockets and their poll events via the callback. When multi_socket() is invoked for a socket not known by an active transfer, forward this to the multi's cache for processing. On closing a connection, remove its socket(s) via the callback.
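
How shutdown connections could contribute sockets and POLLIN/OUT events is sketched below. `struct shutdown_conn` and `collect_pollfds()` are hypothetical names for this illustration and do not mirror libcurl's internals. The sketch assumes a POSIX system with `<poll.h>`.

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of a connection that is in its shutdown phase. */
struct shutdown_conn {
  int sockfd;
  bool want_recv; /* e.g. waiting for the peer's TLS close notify */
  bool want_send; /* e.g. GOAWAY or close notify still unsent */
};

/* Fill pfds with one entry per connection that waits for an event.
   Returns the number of entries written, at most npfds. */
static size_t collect_pollfds(const struct shutdown_conn *conns,
                              size_t nconns,
                              struct pollfd *pfds, size_t npfds)
{
  size_t i, n = 0;
  assert(conns && pfds);
  for(i = 0; i < nconns && n < npfds; i++) {
    short events = 0;
    if(conns[i].want_recv)
      events |= POLLIN;
    if(conns[i].want_send)
      events |= POLLOUT;
    if(!events)
      continue; /* nothing to wait for on this connection */
    pfds[n].fd = conns[i].sockfd;
    pfds[n].events = events;
    pfds[n].revents = 0;
    n++;
  }
  return n;
}
```

The resulting array can be handed to `poll()` together with the sockets of active transfers. For the event-based case, the same read/write interest would be reported through the socket callback instead.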

TLS

TLS connection filters MUST NOT send close notify messages in their do_close() implementation. The reason is that a TLS close notify signals a success. When a connection is aborted and skips its shutdown phase, the server needs to see a missing close notify to detect something has gone wrong.
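
The split between the two callbacks can be illustrated with a toy model. Everything here is hypothetical (`struct tls_filter`, `tls_do_shutdown()`, `tls_do_close()`), and setting a flag stands in for actually writing the alert through a TLS library. The point is only that the close notify belongs to the shutdown path and never to close.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of a TLS filter's state. */
struct tls_filter {
  bool connected;
  bool close_notify_sent;
  bool socket_open;
};

/* Graceful path: the only place where a close notify is sent. */
static void tls_do_shutdown(struct tls_filter *f)
{
  assert(f);
  if(f->connected && !f->close_notify_sent)
    f->close_notify_sent = true; /* stand-in for writing the alert */
}

/* Close path: releases resources, deliberately sends nothing, so an
   aborted connection shows up at the peer without a close notify. */
static void tls_do_close(struct tls_filter *f)
{
  assert(f);
  f->connected = false;
  f->socket_open = false;
}
```

An aborted connection calls only `tls_do_close()` and leaves `close_notify_sent` false. A graceful one calls `tls_do_shutdown()` first.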

FTP

A graceful shutdown of FTP's data connection is performed implicitly before regarding the upload/download as complete and continuing on the control connection. For FTP without TLS, there is just the socket close happening. But with TLS, the sent/received close notify signals that the transfer is complete and healthy. Servers like vsftpd verify that and reject uploads without a TLS close notify.

Tests

  • added test_19_* for shutdown related tests
  • test_19_01 and test_19_02 test for TCP RST packets which happen without a graceful shutdown and should no longer appear otherwise.
  • add test_19_03 for handling shutdowns by the server
  • add test_19_04 for handling shutdowns by curl
  • add test_19_05 for event-based shutdowns by the server
  • add test_30_06/07 and test_31_06/07 for shutdown checks on FTP up- and downloads.

TODO

This PR handles graceful shutdown during multi operations without changes visible in the API. Enabling a graceful shutdown when a cache is destroyed is available to debug builds via environment variables only.

The maximum number of connections in shutdown is the maximum number configured for a connection cache. This is only available for caches owned by a multi handle, where CURLMOPT_MAX_TOTAL_CONNECTIONS can be used.

There is a data->set.shutdowntimeout, but there is no CURLOPT_SHUTDOWN_TIMEOUT_MS to set it. Instead, the internal default DEFAULT_SHUTDOWN_TIMEOUT_MS of 2 seconds is used.
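
A timeout check along these lines could look like the sketch below. The constant's name and its 2 second value come from the text above. The helper `shutdown_timeleft_ms()` is hypothetical, and so is the assumption that an unset timeout is represented as 0.

```c
/* 2 seconds, as stated in the description. */
#define DEFAULT_SHUTDOWN_TIMEOUT_MS 2000

/* Hypothetical helper: milliseconds left for a shutdown that started
   at started_ms, given the current time now_ms.
   ASSUMPTION: configured_ms <= 0 means "not set, use the default". */
static long shutdown_timeleft_ms(long configured_ms,
                                 long started_ms, long now_ms)
{
  long timeout_ms = (configured_ms > 0) ?
                    configured_ms : DEFAULT_SHUTDOWN_TIMEOUT_MS;
  long left = timeout_ms - (now_ms - started_ms);
  return (left > 0) ? left : 0; /* 0: give up and close */
}
```

Once the helper returns 0, a connection on the shutdown list would be closed without waiting any longer for the peer.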

Non-Features

For share caches, the discarded connections move into the multi's cache (if there is a multi, otherwise they are shut down and closed right away). If the multi is an "easy multi", the easy handle cleans up the multi when done, and that in turn shuts down and closes all pending shutdowns. So with serial easy handles, as used in curl, there is no graceful shutdown going on in the background. I presume we can live with that.

When libcurl discards a connection there are two phases
this may go through: "shutdown" and "closing". If a connection
is aborted, the shutdown phase is skipped and it is closed
right away.

The connection filters attached to the connection implement
the phases in their `do_shutdown()` and `do_close()` callbacks.
Filters now carry a `shutdown` flag next to `connected` to
keep track of the shutdown operation.

Filters are shut down from top to bottom. If a filter is not
connected, its shutdown is skipped. Notable filters that *do*
something during shutdown are HTTP/2 and TLS. HTTP/2 sends
the GOAWAY frame. TLS sends its close notify and expects to
receive a close notify from the server.

As sends and receives may EAGAIN on the network, a shutdown
is often not successful right away and needs to poll the
connection's socket(s). To facilitate this, such connections
are placed on a new shutdown list inside the connection cache.

Since managing this list requires the cooperation of a multi
handle, only the connection cache belonging to a multi handle
is used. If a connection was in another cache when being discarded,
it is removed from there and added to the multi's cache. If no
multi handle is available at that time, the connection is
shut down and closed in a one-time, best-effort attempt.

When a multi handle is destroyed, all connections still on
the shutdown list are discarded with a final shutdown attempt
and close. In curl debug builds, the environment variable
`CURL_GRACEFUL_SHUTDOWN` can be set to make this graceful with
a timeout in milliseconds given by the variable.

The shutdown list is limited to the maximum number of connections
configured for a multi cache, set via `CURLMOPT_MAX_TOTAL_CONNECTIONS`.
When the limit is reached, the oldest connection on the
shutdown list is discarded.

- In multi_wait() and multi_waitfds(), collect all
  connection caches involved (each transfer might carry
  its own) into a temporary list. Let each connection
  cache on the list contribute sockets and POLLIN/OUT
  events its connections are waiting for.
- in multi_perform() collect the connection caches the
  same way and let them perform their maintenance. This
  will make another non-blocking attempt to shutdown
  all connections on its shutdown list.
- for event based multis (multi->socket_cb set), add the
  sockets and their poll events via the callback. When
  `multi_socket()` is invoked for a socket not known by
  an active transfer, forward this to the multi's cache
  for processing. On closing a connection, remove its
  socket(s) via the callback.

TLS connection filters MUST NOT send close notify messages
in their `do_close()` implementation. The reason is that
a TLS close notify signals a success. When a connection
is aborted and skips its shutdown phase, the server needs
to see a missing close notify to detect something has gone
wrong.

A graceful shutdown of FTP's data connection is performed
implicitly before regarding the upload/download as complete
and continuing on the control connection. For FTP without
TLS, there is just the socket close happening. But with TLS,
the sent/received close notify signals that the transfer
is complete and healthy. Servers like `vsftpd` verify that
and reject uploads without a TLS close notify.

- added test_19_* for shutdown related tests
- test_19_01 and test_19_02 test for TCP RST packets
  which happen without a graceful shutdown and should
  no longer appear otherwise.
- add test_19_03 for handling shutdowns by the server
- add test_19_04 for handling shutdowns by curl
- add test_19_05 for event-based shutdowns by the server
- add test_30_06/07 and test_31_06/07 for shutdown checks
  on FTP up- and downloads.
@icing icing requested a review from bagder June 19, 2024 12:53
@bagder bagder closed this in c9b95c0 Jun 26, 2024
@bagder (Member) commented Jun 26, 2024

Thanks!

Development

Successfully merging this pull request may close these issues.

CURLOPT_FORBID_REUSE=1 + Graceful TCP shutdown with FIN-ACK FIN-ACK and SSL
2 participants