Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Under certain conditions meta requests are retried after receiving non-recoverable S3 error responses. #264

Closed
alexsmith1612 opened this issue Mar 9, 2023 · 6 comments

Comments

@alexsmith1612
Copy link

Hello,

I have noticed that when using the AWS CRT to upload S3 objects with auto-ranged
PUT requests, occasionally when the server sends an error response, instead of
the meta request immediately terminating and invoking struct aws_s3_meta_request_options.finish_callback, the meta request will be retried
until the maximum number of retry attempts has been met at which point the
finish callback will be invoked with either AWS_IO_SOCKET_CLOSED or
AWS_ERROR_HTTP_CONNECTION_CLOSED.

After some investigation, I believe this happens if and only if the request body
is still in the process of being transmitted when the error response (with a
Connection: close header) has been received, followed by a subsequent TCP FIN
or TLS close_notify alert.

When the Connection: close header is received,
aws-c-http/source/h1_connection.c:s_decoder_on_header() sets
incoming_stream->is_final_stream = true. Then in
aws-c-http/source/h1_connection.c:s_decoder_on_done(), after seeing that
is_final_stream is set,

        s_stop(
            connection, true /*stop_reading*/, false /*stop_writing*/, false /*schedule_shutdown*/, AWS_ERROR_SUCCESS);

is called, which sets connection->thread_data.is_reading_stopped = true, but
does not indicate at all to its leftward slot or the channel as a whole to stop
reading.

Later in aws-c-http/source/h1_connection.c:s_decoder_on_done(), if
incoming_stream->is_outgoing_message_done is set, then s_stream_complete()
is called, and (since is_final_stream is set), s_connection_close() is
called and

    s_stop(connection, false /*stop_reading*/, false /*stop_writing*/, true /*schedule_shutdown*/, AWS_ERROR_SUCCESS);

finally leads to aws_channel_shutdown() being called.

But if is_outgoing_message_done is not set, then the leftward slot will
continue its read loop and immediately encounter the TCP FIN (when cleartext
HTTP is being used) or TLS close_notify alert (when HTTPS is being used).

In the HTTP case, the aws_socket_read() call in
aws-c-io/socket_channel_handler.c:s_do_read() will get a zero-length read and
raise AWS_IO_SOCKET_CLOSED.

In the HTTPS case (at least in s2n builds), the s2n_recv() call in
aws-c-io/source/s2n/s2n_tls_channel_handler.c:s_s2n_handler_process_read_message()
will get a zero-length read, the close_notify alert will be logged, and return
with AWS_OP_SUCCESS. I haven't quite tracked down the exact code path yet, but
this ends up with aws-c-http/source/h1_connection.c:s_stream_complete() being
invoked with AWS_ERROR_HTTP_CONNECTION_CLOSED.

This AWS_IO_SOCKET_CLOSED or AWS_ERROR_HTTP_CONNECTION_CLOSED value
eventually propagates to
aws-c-s3/source/s3_meta_request.c:s_s3_meta_request_send_request_finish_helper(). Since
this error code does not match AWS_ERROR_S3_INVALID_RESPONSE_STATUS or
AWS_ERROR_S3_NON_RECOVERABLE_ASYNC_ERROR, finish_code is set to
AWS_S3_CONNECTION_FINISH_CODE_RETRY, and the meta request is tried again
despite the error response from the S3 server potentially being non-recoverable.

I am not sure if the proper solution to this lies in aws-c-s3 (e.g. changing
the logic for when a meta request should be retried), aws-c-http (e.g. signal
its leftward slot to stop reading once a complete Connection: close response
has been processed), aws-c-io (e.g. change the socket and TLS handler logic to
be able to stop reading immediately when signaled by its rightward slot), or
some combination of the three, but I figured that raising the issue in the
top-level repo would make the most sense. Let me know if this issue would be
more appropriate in one of the other CRT repos instead.

It may also make sense to consider:

  • Logic to immediately stop transmitting a request when a response that
    indicates failure has been received (perhaps if the HTTP response status >=
    300) to avoid potentially triggering a RST from the server

  • Ensuring that a response is gracefully handled even if a RST, TLS alert, or
    write error is received after having received a complete response. I have seen
    instances where instead of the zero read in
    aws-c-io/source/posix/socket.c:aws_socket_read(), an ECONNRESET was
    received--looking at a pcap of the stream a RST had been received immediately
    after the FIN.

    I have also seen AWS_IO_TLS_ERROR_WRITE_FAILURE raised after the
    close_notify TLS alert was received, but this case also ends up with
    AWS_ERROR_HTTP_CONNECTION_CLOSED being propagated to
    aws-c-s3/source/s3_meta_request.c:s_s3_meta_request_send_request_finish_helper().

Here are some log snippets of such failures that I was able to reproduce using
both HTTP and HTTPS and using the AWS S3 servers and a third-party, nominally
S3-compatible server that I spun up locally for testing (I have seen this error
behavior using both HTTP and HTTPS with both the AWS S3 servers and the
third-party local server). Note that the AWS_LL_TRACE messages that were
logged are listed with level DEBUG due to my logger implementation.

Log snippet using HTTP with FIN/zero read
[DEBUG]: AWS [00007f1b114fc700] [http-stream] - id=0x6160000be580: Reading from body stream.
[DEBUG]: AWS [00007f1b114fc700] [http-stream] - id=0x6160000be580: Sending 16384 bytes of body, progress: 703902/804335
[DEBUG]: AWS [00007f1b114fc700] [http-stream] - id=0x6160000be580: Incoming response status: 403 (Forbidden).
[DEBUG]: AWS [00007f1b114fc700] [http-stream] - id=0x6160000be580: Incoming header: x-amz-request-id: DCT2S3BF391M7NX6
[DEBUG]: AWS [00007f1b114fc700] [http-stream] - id=0x6160000be580: Incoming header: x-amz-id-2: tm8C9/WzAZn6vxs1/g9q36SVl4NvfTW1fgcwuY8nf6AkEeA2ZJsE7UCUIiUevjXX2tXomJ8LR30=
[DEBUG]: AWS [00007f1b114fc700] [http-stream] - id=0x6160000be580: Incoming header: Content-Type: application/xml
[DEBUG]: AWS [00007f1b114fc700] [http-stream] - id=0x6160000be580: Incoming header: Transfer-Encoding: chunked
[DEBUG]: AWS [00007f1b114fc700] [http-stream] - id=0x6160000be580: Incoming header: Date: Thu, 02 Mar 2023 21:09:54 GMT
[DEBUG]: AWS [00007f1b114fc700] [http-stream] - id=0x6160000be580: Incoming header: Server: AmazonS3
[DEBUG]: AWS [00007f1b114fc700] [http-stream] - id=0x6160000be580: Incoming header: Connection: close
[DEBUG]: AWS [00007f1b114fc700] [http-stream] - id=0x6160000be580: Received 'Connection: close' header. This will be the final stream on this connection.
[DEBUG]: AWS [00007f1b114fc700] [http-stream] - id=0x6160000be580: Main header block done.
[DEBUG]: AWS [00007f1b114fc700] [http-stream] - id=0x6160000be580: Incoming body: 344 bytes received.
[DEBUG]: AWS [00007f1b114fc700] [S3MetaRequest] - id=0x61600006ea80 Incoming body for request 0x611000059f00. Response status: 403. Data Size: 344. connection: 0x60400006dfd0.
[DEBUG]: AWS [00007f1b114fc700] [S3MetaRequest] - response body:
[DEBUG]: <?xml version="1.0" encoding="UTF-8"?>
[DEBUG]: <Error><Code>InvalidAccessKeyId</Code><Message>The AWS Access Key Id you provided does not exist in our records.</Message><AWSAccessKeyId>minioadmin</AWSAccessKeyId><RequestId>DCT2S3BF391M7NX6</RequestId><HostId>tm8C9/WzAZn6vxs1/g9q36SVl4NvfTW1fgcwuY8nf6AkEeA2ZJsE7UCUIiUevjXX2tXomJ8LR30=</HostId></Error>
[INFO]: AWS [00007f1b114fc700] [socket] - id=0x613000053b00 fd=19: zero read, socket is closed
[DEBUG]: AWS [00007f1b114fc700] [task-scheduler] - id=0x61800007f680: Scheduling channel_shutdown task for immediate execution
[DEBUG]: AWS [00007f1b114fc700] [channel] - id=0x61800007f480: Channel shutdown is already pending, not scheduling another.
[DEBUG]: AWS [00007f1b114fc700] [task-scheduler] - id=0x61800007f680: Running channel_shutdown task with <Running> status
[DEBUG]: AWS [00007f1b114fc700] [channel] - id=0x61800007f480: beginning shutdown process
[DEBUG]: AWS [00007f1b114fc700] [channel] - id=0x61800007f480: handler 0x61200009b2c0 shutdown in read dir completed.
[DEBUG]: AWS [00007f1b114fc700] [channel] - id=0x61800007f480: handler 0x61700007eb08 shutdown in read dir completed.
[DEBUG]: AWS [00007f1b114fc700] [task-scheduler] - id=0x61800007f4a8: Scheduling (null) task for immediate execution
[DEBUG]: AWS [00007f1b114fc700] [task-scheduler] - id=0x61800007f4a8: Running (null) task with <Running> status
[DEBUG]: AWS [00007f1b114fc700] [http-stream] - id=0x6160000be580: Stream completed with error code 1051 (AWS_IO_SOCKET_CLOSED).
[INFO]: AWS [00007f1b114fc700] [http-connection] - id=0x61700007eb00: Shutting down connection with error code 0 (AWS_ERROR_SUCCESS).
[DEBUG]: AWS [00007f1b114fc700] [channel] - id=0x61800007f480: Channel shutdown is already pending, not scheduling another.
[DEBUG]: AWS [00007f1b114fc700] [S3MetaRequest] - id=0x61600006ea80: Request 0x611000059f00 finished with error code 1051 (aws-c-io: AWS_IO_SOCKET_CLOSED, socket is closed.) and response status 403
[ERROR]: AWS [00007f1b114fc700] [S3MetaRequest] - id=0x61600006ea80 Meta request failed from error 1051 (socket is closed.). (request=0x611000059f00, response status=403). Try to setup a retry.
Log snippet using HTTP with ECONNRESET
[DEBUG]: AWS [00007f1b10cfb700] [http-stream] - id=0x6160000ae680: Reading from body stream.
[DEBUG]: AWS [00007f1b10cfb700] [http-stream] - id=0x6160000ae680: Sending 16384 bytes of body, progress: 720286/804335
[DEBUG]: AWS [00007f1b10cfb700] [http-stream] - id=0x6160000ae680: Incoming response status: 403 (Forbidden).
[DEBUG]: AWS [00007f1b10cfb700] [http-stream] - id=0x6160000ae680: Incoming header: x-amz-request-id: B5VDFTXHSRW9HVK3
[DEBUG]: AWS [00007f1b10cfb700] [http-stream] - id=0x6160000ae680: Incoming header: x-amz-id-2: vJaT/N9Q9JXQkGu91IP3VH4ak//uge8vt9hDXolY8PJp06bsvNo0SkMoLPy9fpAjXRkDTeKh84Y=
[DEBUG]: AWS [00007f1b10cfb700] [http-stream] - id=0x6160000ae680: Incoming header: Content-Type: application/xml
[DEBUG]: AWS [00007f1b10cfb700] [http-stream] - id=0x6160000ae680: Incoming header: Transfer-Encoding: chunked
[DEBUG]: AWS [00007f1b10cfb700] [http-stream] - id=0x6160000ae680: Incoming header: Date: Thu, 02 Mar 2023 21:09:54 GMT
[DEBUG]: AWS [00007f1b10cfb700] [http-stream] - id=0x6160000ae680: Incoming header: Server: AmazonS3
[DEBUG]: AWS [00007f1b10cfb700] [http-stream] - id=0x6160000ae680: Incoming header: Connection: close
[DEBUG]: AWS [00007f1b10cfb700] [http-stream] - id=0x6160000ae680: Received 'Connection: close' header. This will be the final stream on this connection.
[DEBUG]: AWS [00007f1b10cfb700] [http-stream] - id=0x6160000ae680: Main header block done.
[DEBUG]: AWS [00007f1b10cfb700] [http-stream] - id=0x6160000ae680: Incoming body: 344 bytes received.
[DEBUG]: AWS [00007f1b10cfb700] [S3MetaRequest] - id=0x61600006ea80 Incoming body for request 0x611000059f00. Response status: 403. Data Size: 344. connection: 0x60400006dfd0.
[DEBUG]: AWS [00007f1b10cfb700] [S3MetaRequest] - response body:
[DEBUG]: <?xml version="1.0" encoding="UTF-8"?>
[DEBUG]: <Error><Code>InvalidAccessKeyId</Code><Message>The AWS Access Key Id you provided does not exist in our records.</Message><AWSAccessKeyId>minioadmin</AWSAccessKeyId><RequestId>B5VDFTXHSRW9HVK3</RequestId><HostId>vJaT/N9Q9JXQkGu91IP3VH4ak//uge8vt9hDXolY8PJp06bsvNo0SkMoLPy9fpAjXRkDTeKh84Y=</HostId></Error>
[INFO]: AWS [00007f1b10cfb700] [socket] - id=0x613000045b00 fd=19: socket is closed.
[DEBUG]: AWS [00007f1b10cfb700] [task-scheduler] - id=0x61800006f680: Scheduling channel_shutdown task for immediate execution
[DEBUG]: AWS [00007f1b10cfb700] [channel] - id=0x61800006f480: Channel shutdown is already pending, not scheduling another.
[DEBUG]: AWS [00007f1b10cfb700] [task-scheduler] - id=0x61800006f680: Running channel_shutdown task with <Running> status
[DEBUG]: AWS [00007f1b10cfb700] [channel] - id=0x61800006f480: beginning shutdown process
[DEBUG]: AWS [00007f1b10cfb700] [channel] - id=0x61800006f480: handler 0x6120000832c0 shutdown in read dir completed.
[DEBUG]: AWS [00007f1b10cfb700] [channel] - id=0x61800006f480: handler 0x61700005ec08 shutdown in read dir completed.
[DEBUG]: AWS [00007f1b10cfb700] [task-scheduler] - id=0x61800006f4a8: Scheduling (null) task for immediate execution
[DEBUG]: AWS [00007f1b10cfb700] [task-scheduler] - id=0x61800006f4a8: Running (null) task with <Running> status
[DEBUG]: AWS [00007f1b10cfb700] [http-stream] - id=0x6160000ae680: Stream completed with error code 1051 (AWS_IO_SOCKET_CLOSED).
[INFO]: AWS [00007f1b10cfb700] [http-connection] - id=0x61700005ec00: Shutting down connection with error code 0 (AWS_ERROR_SUCCESS).
[DEBUG]: AWS [00007f1b10cfb700] [channel] - id=0x61800006f480: Channel shutdown is already pending, not scheduling another.
[DEBUG]: AWS [00007f1b10cfb700] [S3MetaRequest] - id=0x61600006ea80: Request 0x611000059f00 finished with error code 1051 (aws-c-io: AWS_IO_SOCKET_CLOSED, socket is closed.) and response status 403
[ERROR]: AWS [00007f1b10cfb700] [S3MetaRequest] - id=0x61600006ea80 Meta request failed from error 1051 (socket is closed.). (request=0x611000059f00, response status=403). Try to setup a retry.
Log snippet using HTTPS
[DEBUG]: AWS [00007fe36ccfb700] [http-stream] - id=0x61600007bf80: Reading from body stream.
[DEBUG]: AWS [00007fe36ccfb700] [http-stream] - id=0x61600007bf80: Sending 16331 bytes of body, progress: 244362/804335
[DEBUG]: AWS [00007fe36ccfb700] [http-stream] - id=0x61600007bf80: Incoming response status: 403 (Forbidden).
[DEBUG]: AWS [00007fe36ccfb700] [http-stream] - id=0x61600007bf80: Incoming header: Accept-Ranges: bytes
[DEBUG]: AWS [00007fe36ccfb700] [http-stream] - id=0x61600007bf80: Incoming header: Content-Length: 314
[DEBUG]: AWS [00007fe36ccfb700] [http-stream] - id=0x61600007bf80: Incoming header: Content-Type: application/xml
[DEBUG]: AWS [00007fe36ccfb700] [http-stream] - id=0x61600007bf80: Incoming header: Server: MinIO
[DEBUG]: AWS [00007fe36ccfb700] [http-stream] - id=0x61600007bf80: Incoming header: Vary: Origin
[DEBUG]: AWS [00007fe36ccfb700] [http-stream] - id=0x61600007bf80: Incoming header: Date: Thu, 02 Mar 2023 19:09:19 GMT
[DEBUG]: AWS [00007fe36ccfb700] [http-stream] - id=0x61600007bf80: Incoming header: Connection: close
[DEBUG]: AWS [00007fe36ccfb700] [http-stream] - id=0x61600007bf80: Received 'Connection: close' header. This will be the final stream on this connection.
[DEBUG]: AWS [00007fe36ccfb700] [http-stream] - id=0x61600007bf80: Main header block done.
[DEBUG]: AWS [00007fe36ccfb700] [http-stream] - id=0x61600007bf80: Incoming body: 314 bytes received.
[DEBUG]: AWS [00007fe36ccfb700] [S3MetaRequest] - id=0x61600003ed80 Incoming body for request 0x611000059f00. Response status: 403. Data Size: 314. connection: 0x604000027fd0.
[DEBUG]: AWS [00007fe36ccfb700] [S3MetaRequest] - response body:
[DEBUG]: <?xml version="1.0" encoding="UTF-8"?>
[DEBUG]: <Error><Code>RequestTimeTooSkewed</Code><Message>The difference between the request time and the server&#39;s time is too large.</Message><Resource>/test0/twocities.txt</Resource><RequestId></RequestId><HostId>64e90a68-95ff-4541-bf12-7e354f1dc058</HostId></Error>
[DEBUG]: AWS [00007fe36ccfb700] [tls-handler] - id=0x61600007cb80: Alert code 0
[DEBUG]: AWS [00007fe36ccfb700] [task-scheduler] - id=0x61800003f680: Scheduling channel_shutdown task for immediate execution
[DEBUG]: AWS [00007fe36ccfb700] [task-scheduler] - id=0x61800003f680: Running channel_shutdown task with <Running> status
[DEBUG]: AWS [00007fe36ccfb700] [channel] - id=0x61800003f480: beginning shutdown process
[DEBUG]: AWS [00007fe36ccfb700] [channel] - id=0x61800003f480: handler 0x61200005b6c0 shutdown in read dir completed.
[DEBUG]: AWS [00007fe36ccfb700] [tls-handler] - id=0x61600007cb80: Shutting down read direction with error code 0
[DEBUG]: AWS [00007fe36ccfb700] [channel] - id=0x61800003f480: handler 0x61600007cb80 shutdown in read dir completed.
[DEBUG]: AWS [00007fe36ccfb700] [channel] - id=0x61800003f480: handler 0x61700005b788 shutdown in read dir completed.
[DEBUG]: AWS [00007fe36ccfb700] [task-scheduler] - id=0x61800003f4a8: Scheduling (null) task for immediate execution
[DEBUG]: AWS [00007fe36ccfb700] [task-scheduler] - id=0x61800003f4a8: Running (null) task with <Running> status
[DEBUG]: AWS [00007fe36ccfb700] [http-stream] - id=0x61600007bf80: Stream completed with error code 2058 (AWS_ERROR_HTTP_CONNECTION_CLOSED).
[INFO]: AWS [00007fe36ccfb700] [http-connection] - id=0x61700005b780: Shutting down connection with error code 0 (AWS_ERROR_SUCCESS).
[DEBUG]: AWS [00007fe36ccfb700] [channel] - id=0x61800003f480: Channel shutdown is already pending, not scheduling another.
[DEBUG]: AWS [00007fe36ccfb700] [S3MetaRequest] - id=0x61600003ed80: Request 0x611000059f00 finished with error code 2058 (aws-c-http: AWS_ERROR_HTTP_CONNECTION_CLOSED, The connection has closed or is closing.) and response status 403
[ERROR]: AWS [00007fe36ccfb700] [S3MetaRequest] - id=0x61600003ed80 Meta request failed from error 2058 (The connection has closed or is closing.). (request=0x611000059f00, response status=403). Try to setup a retry.
Log snippet using HTTPS with `AWS_IO_TLS_ERROR_WRITE_FAILURE` mentioned
[DEBUG]: AWS [00007fe36d4fc700] [http-stream] - id=0x61600008c480: Reading from body stream.
[DEBUG]: AWS [00007fe36d4fc700] [http-stream] - id=0x61600008c480: Sending 16331 bytes of body, progress: 228031/804335
[DEBUG]: AWS [00007fe36d4fc700] [http-stream] - id=0x61600008c480: Incoming response status: 403 (Forbidden).
[DEBUG]: AWS [00007fe36d4fc700] [http-stream] - id=0x61600008c480: Incoming header: Accept-Ranges: bytes
[DEBUG]: AWS [00007fe36d4fc700] [http-stream] - id=0x61600008c480: Incoming header: Content-Length: 314
[DEBUG]: AWS [00007fe36d4fc700] [http-stream] - id=0x61600008c480: Incoming header: Content-Type: application/xml
[DEBUG]: AWS [00007fe36d4fc700] [http-stream] - id=0x61600008c480: Incoming header: Server: MinIO
[DEBUG]: AWS [00007fe36d4fc700] [http-stream] - id=0x61600008c480: Incoming header: Vary: Origin
[DEBUG]: AWS [00007fe36d4fc700] [http-stream] - id=0x61600008c480: Incoming header: Date: Thu, 02 Mar 2023 19:09:19 GMT
[DEBUG]: AWS [00007fe36d4fc700] [http-stream] - id=0x61600008c480: Incoming header: Connection: close
[DEBUG]: AWS [00007fe36d4fc700] [http-stream] - id=0x61600008c480: Received 'Connection: close' header. This will be the final stream on this connection.
[DEBUG]: AWS [00007fe36d4fc700] [http-stream] - id=0x61600008c480: Main header block done.
[DEBUG]: AWS [00007fe36d4fc700] [http-stream] - id=0x61600008c480: Incoming body: 314 bytes received.
[DEBUG]: AWS [00007fe36d4fc700] [S3MetaRequest] - id=0x61600003ed80 Incoming body for request 0x611000059f00. Response status: 403. Data Size: 314. connection: 0x604000027fd0.
[DEBUG]: AWS [00007fe36d4fc700] [S3MetaRequest] - response body:
[DEBUG]: <?xml version="1.0" encoding="UTF-8"?>
[DEBUG]: <Error><Code>RequestTimeTooSkewed</Code><Message>The difference between the request time and the server&#39;s time is too large.</Message><Resource>/test0/twocities.txt</Resource><RequestId></RequestId><HostId>64e90a68-95ff-4541-bf12-7e354f1dc058</HostId></Error>
[DEBUG]: AWS [00007fe36d4fc700] [task-scheduler] - id=0x61100006a840: Scheduling socket_written_task task for immediate execution
[DEBUG]: AWS [00007fe36d4fc700] [task-scheduler] - id=0x61100006a840: Running socket_written_task task with <Running> status
[DEBUG]: AWS [00007fe36d4fc700] [task-scheduler] - id=0x61700007b710: Scheduling http1_connection_outgoing_stream task for immediate execution
[DEBUG]: AWS [00007fe36d4fc700] [task-scheduler] - id=0x61700007b710: Running http1_connection_outgoing_stream task with <Running> status
[DEBUG]: AWS [00007fe36d4fc700] [http-stream] - id=0x61600008c480: Reading from body stream.
[DEBUG]: AWS [00007fe36d4fc700] [http-stream] - id=0x61600008c480: Sending 16331 bytes of body, progress: 244362/804335
[DEBUG]: AWS [00007fe36d4fc700] [task-scheduler] - id=0x61100006a840: Scheduling socket_written_task task for immediate execution
[DEBUG]: AWS [00007fe36d4fc700] [tls-handler] - id=0x61600008d080: Alert code 0
[DEBUG]: AWS [00007fe36d4fc700] [task-scheduler] - id=0x61800004f680: Scheduling channel_shutdown task for immediate execution
[DEBUG]: AWS [00007fe36d4fc700] [task-scheduler] - id=0x61100006a840: Running socket_written_task task with <Running> status
[DEBUG]: AWS [00007fe36d4fc700] [task-scheduler] - id=0x61700007b710: Scheduling http1_connection_outgoing_stream task for immediate execution
[DEBUG]: AWS [00007fe36d4fc700] [task-scheduler] - id=0x61800004f680: Running channel_shutdown task with <Running> status
[DEBUG]: AWS [00007fe36d4fc700] [channel] - id=0x61800004f480: beginning shutdown process
[DEBUG]: AWS [00007fe36d4fc700] [channel] - id=0x61800004f480: handler 0x6120000748c0 shutdown in read dir completed.
[DEBUG]: AWS [00007fe36d4fc700] [tls-handler] - id=0x61600008d080: Shutting down read direction with error code 0
[DEBUG]: AWS [00007fe36d4fc700] [channel] - id=0x61800004f480: handler 0x61600008d080 shutdown in read dir completed.
[DEBUG]: AWS [00007fe36d4fc700] [channel] - id=0x61800004f480: handler 0x61700007b688 shutdown in read dir completed.
[DEBUG]: AWS [00007fe36d4fc700] [task-scheduler] - id=0x61800004f4a8: Scheduling (null) task for immediate execution
[DEBUG]: AWS [00007fe36d4fc700] [task-scheduler] - id=0x61700007b710: Running http1_connection_outgoing_stream task with <Running> status
[DEBUG]: AWS [00007fe36d4fc700] [http-stream] - id=0x61600008c480: Reading from body stream.
[DEBUG]: AWS [00007fe36d4fc700] [http-stream] - id=0x61600008c480: Sending 16331 bytes of body, progress: 260693/804335
[ERROR]: AWS [00007fe36d4fc700] [http-connection] - id=0x61700007b680: Failed to send message in write direction, error 1031 (AWS_IO_TLS_ERROR_WRITE_FAILURE). Closing connection.
[INFO]: AWS [00007fe36d4fc700] [http-connection] - id=0x61700007b680: Shutting down connection with error code 1031 (AWS_IO_TLS_ERROR_WRITE_FAILURE).
[DEBUG]: AWS [00007fe36d4fc700] [channel] - id=0x61800004f480: Channel shutdown is already pending, not scheduling another.
[DEBUG]: AWS [00007fe36d4fc700] [task-scheduler] - id=0x61800004f4a8: Running (null) task with <Running> status
[DEBUG]: AWS [00007fe36d4fc700] [http-stream] - id=0x61600008c480: Stream completed with error code 2058 (AWS_ERROR_HTTP_CONNECTION_CLOSED).
[INFO]: AWS [00007fe36d4fc700] [http-connection] - id=0x61700007b680: Shutting down connection with error code 0 (AWS_ERROR_SUCCESS).
[DEBUG]: AWS [00007fe36d4fc700] [channel] - id=0x61800004f480: Channel shutdown is already pending, not scheduling another.
[DEBUG]: AWS [00007fe36d4fc700] [S3MetaRequest] - id=0x61600003ed80: Request 0x611000059f00 finished with error code 2058 (aws-c-http: AWS_ERROR_HTTP_CONNECTION_CLOSED, The connection has closed or is closing.) and response status 403
[ERROR]: AWS [00007fe36d4fc700] [S3MetaRequest] - id=0x61600003ed80 Meta request failed from error 2058 (The connection has closed or is closing.). (request=0x611000059f00, response status=403). Try to setup a retry.

The CRT library versions that I'm currently using are

aws-c-auth: v0.6.25
aws-c-cal: v0.5.21
aws-c-common: v0.8.12
aws-c-compression: v0.2.16
aws-checksums: v0.1.14
aws-c-http: v0.7.5
aws-c-io: v0.13.18
aws-c-s3: v0.2.5
aws-c-sdkutils: v0.1.7
s2n-tls: v1.3.38

but I had seen this behavior for many months across multiple earlier versions
before finding the time to try to track down the error to a reportable issue.

Let me know if there is any additional information that you would like me to
provide.

Thank you,
Alex

@TingDaoK
Copy link
Contributor

TingDaoK commented Mar 21, 2023

Thank you for reporting the issue and the detail investigation!
We just merged the fix from aws-c-http, it should fix the issue here.

@graebm
Copy link
Contributor

graebm commented Mar 22, 2023

Which language CRT are you using to get at aws-c-s3? aws-crt-java? aws-crt-cpp?

btw thank you for the AMAZING deep dive writeup. It really helped us dive right into a fix

graebm added a commit to awslabs/aws-crt-java that referenced this issue Mar 22, 2023
*Issue #:* awslabs/aws-c-s3#264

*Description of changes:*
Fix issue where an HTTP connection closing after the response was received, but before the request was completely sent, was treated as an error. An HTTP exchange is now considered successful as long as the complete response was received. This fix was made in aws-c-http: awslabs/aws-c-http#431
@alexsmith1612
Copy link
Author

Thank you for reporting the issue and the detail investigation!
We just merged the fix from aws-c-http, it should fix the issue here.

Awesome! I'll give it a test soon to ensure that my issue has been fixed.

Which language CRT are you using to get at aws-c-s3? aws-crt-java? aws-crt-cpp?

I'm using the C CRT libraries directly.

btw thank you for the AMAZING deep dive writeup. It really helped us dive right into a fix

Of course! I try to provide the level of detail in problem descriptions and debug information that I'd like to receive myself. Thank you two for the quick turnaround.

@alexsmith1612
Copy link
Author

The changes in awslabs/aws-c-http@0600662 appear to fix the issue, thanks!

FWIW, I haven't encountered this as an issue, but it does seem to me that it might make sense to finish transmitting the (current) request even after receiving a complete Connection: close response if the response status is < 300, although I'm not sure how often that situation arises in practice. If such a change did get implemented, I believe that an additional change would be needed to ensure that reads (on leftward slots) do truly stop after the client finishes processing the response while transmission continues. But I digress.

@TingDaoK
Copy link
Contributor

TingDaoK commented Mar 22, 2023

We thought about it. part of conversation can be found here

The reasons that we just end transmitting after connection: close no matter what response status:

  • The spec said MUST cease, it's a bit ambiguous

A client that receives a "close" connection option MUST cease sending requests on that connection and close the connection after reading the response message containing the "close" connection option

  • I also tested with curl against a local host. curl also stops sending with 200 and connection: close, some log from curl
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 
< Server: mock_s3_server
< ETag: b54357faf0632cce46e942fa68356b38
< Connection: close
< Content-Length: 0
< 
* we are done reading and this is set to close, stop send
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 
< Server: mock_s3_server
< ETag: b54357faf0632cce46e942fa68356b38
< Connection: keep-alive
< Content-Length: 0
< 
* We are completely uploaded and fine

@alexsmith1612
Copy link
Author

I saw that conversation, and my reading of the RFC is that no additional requests following the current in-progress request may be sent after receiving Connection: close. The RFC 9112 Section 9.5 paragraph 5 that Michael quoted in particular seems to heavily imply to me that the in-progress body does not need to be immediately terminated when a non-error response with Connection: close is received.

Looking at curl's commit history, their behavior of terminating of in-progress body transmissions upon receipt of a 200 response with Connection: close (or the absence of Connection: keep-alive for HTTP/1.0) started with curl/curl@c67a99f. Following the bug report in that commit message, it appears that this was implemented due to curl continuing to send data after receiving error responses with Connection: close, with no discussion about non-error responses. Based on that, it seems to me that the curl behavior on a 200 response with Connection: close may have been less of an explicit design choice and more of an unhandled corner case.

That being said, since this curl behavior has existed since 2008, that probably means that this is not a situation that is encountered often, so it may not matter what the client's behavior is in practice.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants