Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

asio::ssl::stream::async_read_once null_buffers causes busy-loop #1015

Open
adamncasey opened this issue Mar 7, 2022 · 2 comments
Open

asio::ssl::stream::async_read_once null_buffers causes busy-loop #1015

adamncasey opened this issue Mar 7, 2022 · 2 comments

Comments

@adamncasey
Copy link

I've found an issue where calling async_read_some with the null_buffers technique (idiom?) causes an application to busy loop. Worth noting there is no async_wait equivalent available here either. Below describes the steps we go through from invoking async_read_some to nearly immediately having the handler invoked when there's no data.

A small application along the lines of this highlights the issue (sorry it doesn't compile - on my todo list):

boost::asio::ssl::stream<boost::asio::ip::tcp::socket> socket;
// get a socket from a listening TLS context.
// Our application listens for an incoming TLS connection - unsure if this also happens with outbound connections.
// Set socket non-blocking

sock.async_read_some(boost::asio::null_buffers(), [](error_code ec, std::size_t) {
    if (ec) {
        // handle error
    }

    size_t readAmount = socket.read_some(/*actual buffer*/, ec);
    if (ec == boost::asio::error::would_block) {
        // re-schedule async_read_some
    }
});

I believe the issue is that stream::async_read_some has no null_buffers specialisation, and therefore we eventually end up down in engine::read with an empty buffer.

engine::want engine::read(const asio::mutable_buffer& data,
asio::error_code& ec, std::size_t& bytes_transferred)
{
if (data.size() == 0)
{
ec = asio::error_code();
return engine::want_nothing;
}

This returns want_nothing, instead of what I think should be something closer to want_input_and_retry.

want_nothing causes io_op::operator() to get here:

// The SSL operation is done and we can invoke the handler, but we
// have to keep in mind that this function might be being called from
// the async operation's initiating function. In this case we're not
// allowed to call the handler directly. Instead, issue a zero-sized
// read so the handler runs "as-if" posted using io_context::post().
if (start)
{
ASIO_HANDLER_LOCATION((
__FILE__, __LINE__, Operation::tracking_name()));
next_layer_.async_read_some(
asio::buffer(core_.input_buffer_, 0),
ASIO_MOVE_CAST(io_op)(*this));
// Yield control until asynchronous operation completes. Control
// resumes at the "default:" label below.
return;

Which triggers an empty read event such that io_op gets re-executed in the next(ish?) executor loop iteration.

When we come back round to io_op start_ is set to 0

void operator()(asio::error_code ec,
std::size_t bytes_transferred = ~std::size_t(0), int start = 0)
{
switch (start_ = start)

and then we jump into a default: statement nested inside of the do { } while(); loop (which it's fair to say, surprised me):

default:
if (bytes_transferred == ~std::size_t(0))
bytes_transferred = 0; // Timer cancellation, no data transferred.
else if (!ec_)
ec_ = ec;
switch (want_)
{
case engine::want_input_and_retry:

want_ hasn't been re-written since the last time we were here, so in my case it's still want_nothing

.. and finally we end up invoking the user-provided handler (via read_op) just inside the switch(want_) with zero bytes read and no error code. Our application (as would others) then invoke async_read_some again again (and there's the busy loop).

default:
// Pass the result to the handler.
op_.call_handler(handler_,
core_.engine_.map_error_code(ec_),
ec_ ? 0 : bytes_transferred_);
// Our work here is done.
return;

I found at least one other person on stackoverflow has hit this issue before, a long time ago: https://stackoverflow.com/questions/40163626/boost-asio-ssl-not-working-as-expected-when-used-with-null-buffers

I think that there is something wrong here, unfortunately I just don't quite see where the best place to fix this issue is.

Perhaps some kind of async_wait / async_read_some(boost::asio::null_buffers) specialisation could drop us straight into this logic (or something like it) ?

// If the input buffer already has data in it we can pass it to the
// engine and then retry the operation immediately.
if (core_.input_.size() != 0)
{
core_.input_ = core_.engine_.put_input(core_.input_);
continue;
}
// The engine wants more data to be read from input. However, we
// cannot allow more than one read operation at a time on the
// underlying transport. The pending_read_ timer's expiry is set to
// pos_infin if a read is in progress, and neg_infin otherwise.
if (core_.expiry(core_.pending_read_) == core_.neg_infin())
{
// Prevent other read operations from being started.
core_.pending_read_.expires_at(core_.pos_infin());
ASIO_HANDLER_LOCATION((
__FILE__, __LINE__, Operation::tracking_name()));
// Start reading some data from the underlying transport.
next_layer_.async_read_some(
asio::buffer(core_.input_buffer_),
ASIO_MOVE_CAST(io_op)(*this));
}
else
{
ASIO_HANDLER_LOCATION((
__FILE__, __LINE__, Operation::tracking_name()));
// Wait until the current read operation completes.
core_.pending_read_.async_wait(ASIO_MOVE_CAST(io_op)(*this));
}

@adamncasey
Copy link
Author

Looking at this a bit deeper - I think this is hard to solve because for these reasons:

  1. We want to call to asio::ssl::stream::async_read_some with no buffer to store data (basically so we can create a perfectly sized buffer when data is available)
  2. OpenSSL's SSL_read method only consumes data into a user buffer. There doesn't seem to be a function which lets you just move socket data into openssl's internal buffer
  3. The async_read_some handler is passed the number of bytes available, and this would logically be the number of application-bytes you could immediately get if you called ssl::stream::read_some.

Without a solution to 2) I don't see how anything can call the handler function in 3) with the correct number of available bytes without a buffer to do at least some of the work.

I think I see two possible solutions here:

  1. Perhaps we can call SSL_read with a zero byte buffer. I see some code in there which is supposed to guarantee forward progress over non-data records even when called with zero bytes. There is some specific code in asio::ssl which avoids calling SSL_read with a zero byte buffer though so I'm not sure if this can work. The handler should always be called with the result of SSL_pending() so users can allocate a buffer for all of the available data.

  2. Null_buffers could be emulated by always passing openssl a very small buffer (even one byte might be enough). A stream::available() method could return the result of SSL_pending() plus this extra byte, when available. All stream read methods would need to be adjusted to read this byte when it is available.

Implementing 2) on the application side might make the most sense, it's just a shame this won't help applications out there using null_buffers with asio::ssl at the moment.

adamncasey added a commit to adamncasey/amqpprox that referenced this issue Mar 25, 2022
chriskohlhoff/asio#1015

`asio::ssl` has a bug where calling async_read_some with `null_buffers`
isn't correctly handled, and more or less immediately invokes the
provided handler with no data. This causes `amqpprox` to busy-loop
whenever a TLS socket has been accepted or opened.

There were two options for how to fix this:

1) Always read with a fixed size buffer, such as 32kb. This would
   simplify the code slightly, at the expense of needing multiple reads
   to handle larger than 32kb frames in non-TLS mode, even when the full
   frame is available to `amqpprox` in one go.
2) Ask asio::ssl to read with a very small buffer, then ask the openssl
   library how many bytes are available. This technique aligns with how
   `amqpprox`'s read loop works today. That is what is implemented here.

In theory something similar could be upstreamed into `asio::ssl`. It's a
little tricky though and this exact code couldn't handle the generic
`MutableBufferSequence` interface - we can take some shortcuts in our
code.

I've done some benchmarking to check this change isn't going to regress
performance noticeably. Data throughput tests indicate that this fix improves
performance for TLS connections over the existing code. Still running
connection throughput tests.
adamncasey added a commit to adamncasey/amqpprox that referenced this issue Mar 28, 2022
chriskohlhoff/asio#1015

`asio::ssl` has a bug where calling async_read_some with `null_buffers`
isn't correctly handled, and more or less immediately invokes the
provided handler with no data. This causes `amqpprox` to busy-loop
whenever a TLS socket has been accepted or opened.

There were two options for how to fix this:

1) Always read with a fixed size buffer, such as 32kb. This would
   simplify the code slightly, at the expense of needing multiple reads
   to handle larger than 32kb frames in non-TLS mode, even when the full
   frame is available to `amqpprox` in one go.
2) Ask asio::ssl to read with a very small buffer, then ask the openssl
   library how many bytes are available. This technique aligns with how
   `amqpprox`'s read loop works today. That is what is implemented here.

In theory something similar could be upstreamed into `asio::ssl`. It's a
little tricky though and this exact code couldn't handle the generic
`MutableBufferSequence` interface - we can take some shortcuts in our
code.

I've done some benchmarking to check this change isn't going to regress
performance noticeably. Data throughput tests indicate that this fix improves
performance for TLS connections over the existing code. Still running
connection throughput tests.
adamncasey added a commit to adamncasey/amqpprox that referenced this issue Apr 1, 2022
chriskohlhoff/asio#1015

`asio::ssl` has a bug where calling async_read_some with `null_buffers`
isn't correctly handled, and more or less immediately invokes the
provided handler with no data. This causes `amqpprox` to busy-loop
whenever a TLS socket has been accepted or opened.

There were two options for how to fix this:

1) Always read with a fixed size buffer, such as 32kb. This would
   simplify the code slightly, at the expense of needing multiple reads
   to handle larger than 32kb frames in non-TLS mode, even when the full
   frame is available to `amqpprox` in one go.
2) Ask asio::ssl to read with a very small buffer, then ask the openssl
   library how many bytes are available. This technique aligns with how
   `amqpprox`'s read loop works today. That is what is implemented here.

In theory something similar could be upstreamed into `asio::ssl`. It's a
little tricky though and this exact code couldn't handle the generic
`MutableBufferSequence` interface - we can take some shortcuts in our
code.

I've done some benchmarking to check this change isn't going to regress
performance noticeably. Data throughput tests indicate that this fix improves
performance for TLS connections over the existing code. Still running
connection throughput tests.
adamncasey added a commit to adamncasey/amqpprox that referenced this issue Apr 12, 2022
chriskohlhoff/asio#1015

`asio::ssl` has a bug where calling async_read_some with `null_buffers`
isn't correctly handled, and more or less immediately invokes the
provided handler with no data. This causes `amqpprox` to busy-loop
whenever a TLS socket has been accepted or opened.

There were two options for how to fix this:

1) Always read with a fixed size buffer, such as 32kb. This would
   simplify the code slightly, at the expense of needing multiple reads
   to handle larger than 32kb frames in non-TLS mode, even when the full
   frame is available to `amqpprox` in one go.
2) Ask asio::ssl to read with a very small buffer, then ask the openssl
   library how many bytes are available. This technique aligns with how
   `amqpprox`'s read loop works today. That is what is implemented here.

In theory something similar could be upstreamed into `asio::ssl`. It's a
little tricky though and this exact code couldn't handle the generic
`MutableBufferSequence` interface - we can take some shortcuts in our
code.

I've done some benchmarking to check this change isn't going to regress
performance noticeably. Data throughput tests indicate that this fix improves
performance for TLS connections over the existing code. Still running
connection throughput tests.
adamncasey added a commit to bloomberg/amqpprox that referenced this issue Apr 12, 2022
chriskohlhoff/asio#1015

`asio::ssl` has a bug where calling async_read_some with `null_buffers`
isn't correctly handled, and more or less immediately invokes the
provided handler with no data. This causes `amqpprox` to busy-loop
whenever a TLS socket has been accepted or opened.

There were two options for how to fix this:

1) Always read with a fixed size buffer, such as 32kb. This would
   simplify the code slightly, at the expense of needing multiple reads
   to handle larger than 32kb frames in non-TLS mode, even when the full
   frame is available to `amqpprox` in one go.
2) Ask asio::ssl to read with a very small buffer, then ask the openssl
   library how many bytes are available. This technique aligns with how
   `amqpprox`'s read loop works today. That is what is implemented here.

In theory something similar could be upstreamed into `asio::ssl`. It's a
little tricky though and this exact code couldn't handle the generic
`MutableBufferSequence` interface - we can take some shortcuts in our
code.

I've done some benchmarking to check this change isn't going to regress
performance noticeably. Data throughput tests indicate that this fix improves
performance for TLS connections over the existing code. Still running
connection throughput tests.
@ppadevski-vmw
Copy link

I'm also seeing this issue, though because of a different reason.

Looking at the example, the way it was written was to basically do async_read_some(null_buffers(), ...) ... but on the next layer which was in most cases tcp and ended up in epoll. It was wrong but used to work because up to TLS 1.2 the SSL handshake had to complete before application data was sent and in between requests for HTTP/1.1 epoll seemed to work okayish (next request received X time after response and epoll notices the bytes).

With TLS 1.3, however, this is no longer the case and the application data and the client certificate ended up being buffered in OpenSSL but not yet being processed. Hence epoll now gets stuck as there's no data in the socket.

The reasonable solution was to use sslsocket.async_read_some(null_buffers(), ...) and now I'm seeing this busy loop.

I'm going to use the same workaround that reads 1 byte but that looks really inefficient. There's an optimization involving SSL_pending but that would require me to add a ton of extra unit tests to make sure I have covered all of the corner cases (including e.g. read 0).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants