New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add zero byte read to SslStream #87563
Conversation
Tagging subscribers to this area: @dotnet/ncl, @bartonjs, @vcsjones Issue Detailsfixes #76029 This change aims to postpone buffer allocations and avoid allocations of GCHandles. The current worst case is Kestrel with idle connections. With current code We would allocate large buffer as well as GCHandle as the Socket read fails to finish synchronously. That creates pressure and it can fragment memory with block that is immovable. To fix this I added zero bytes read to EnsureFullTlsFrameAsync and I moved all EnsureAvailableSpace calls behind it. There is some overhead but from what I can tell it is not measurable
The proposed change does zero byte read only before each TLS frame. I wrote test that writes random data and adds some delay. I saw ~ 5% reads would till allocate GCHandle as the TLS frame may not be read in single read. That of course depends on specific pattern but it still looks like improvement. When we receive beginning of the frame, it is likely that rest will arrive as well e.g. the receive buffer will be needed. We could add the zero byte ready into the loop just before current I had to touch the conformance tests as they would no see zero byte reads that were not excepted as well as zero byte reads during handshake. e.g. before the test body even starts. cc: @davidfowl @Tratcher in case this has some unexpected impact on Kestrel
|
src/libraries/System.Net.Security/src/System/Net/Security/SslStream.IO.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Net.Security/src/System/Net/Security/SslStream.IO.cs
Outdated
Show resolved
Hide resolved
if (frameSize == UnknownTlsFrameLength) | ||
{ | ||
// make sure we have space for the whole frame | ||
_buffer.EnsureAvailableSpace(frameSize - _buffer.EncryptedLength); | ||
// We do not have enough data to determine frame size. Use provided estimate e.g. | ||
// full TLS frame for read, and somewhat shorter frame for handshake or renegotiation | ||
_buffer.EnsureAvailableSpace(estimatedSize); | ||
} | ||
else | ||
{ | ||
// move existing data to the beginning of the buffer (they will | ||
// be couple of bytes only, otherwise we would have entire | ||
// header and know exact size) | ||
_buffer.EnsureAvailableSpace(_buffer.Capacity - _buffer.EncryptedLength); | ||
// make sure we have space for the whole frame | ||
_buffer.EnsureAvailableSpace(frameSize - _buffer.EncryptedLength); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider instead:
// If we don't have enough data to determine the frame size, use the provided estimate
// (e.g. a full TLS frame for reads, and a somewhat shorter frame for handshake / renegotiation).
// If we do know the frame size, ensure we have space for the whole frame.
_buffer.EnsureAvailableSpace(frameSize == UnknownTlsFrameLength ?
estimatedSize :
frameSize - _buffer.EncryptedLength);
src/libraries/System.Net.Http/tests/FunctionalTests/ResponseStreamZeroByteReadTests.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
Co-authored-by: Stephen Toub <stoub@microsoft.com>
cc @amcasey |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM modulo existing comments
Eager to try this change out. |
I want to clarify one thing, with all of the existing code doing zero byte reads, do we still end up doing multiple layers of zero byte reads or does this really only affect code the handshake (which the caller doesn't own the buffer)? i.e. await sslStream.ReadAsync(Memory<byte>.Empty);
await sslStream.ReadAsync(buffer); How many reads happen on the underlying stream? |
I'd expect three. I don't believe there's any memory here for whether the last read issued was zero-byte (nor am I convinced that would always be sound, in particular if |
we actually do not break out from |
@wfurt lets wait for the performance numbers to come back from the various https benchmarks. In Kestrel's case, the stream isn't a NetworkStream, it's a custom stream implementation that reads from a PipeReader. |
Improvements on arm64: improvements on x64: |
fixes #76029
This change aims to postpone buffer allocations and avoid allocations of GCHandles. The current worst case is Kestrel with idle connections. With current code We would allocate large buffer as well as GCHandle as the Socket read fails to finish synchronously. That creates pressure and it can fragment memory with block that is immovable.
To fix this I added zero bytes read to EnsureFullTlsFrameAsync and I moved all EnsureAvailableSpace calls behind it.
If the underlying Stream supports blocking calls this will wait some data are actually available. Then we allocate the buffer and socket read will finish synchronously using the cheap
fixed
block.If the underlying stream does not support blocking zero byte streams we just process everything as we did before.
There is some overhead but from what I can tell it is not measurable
The proposed change does zero byte read only before each TLS frame. I wrote test that writes random data and adds some delay. I saw ~ 5% reads would till allocate GCHandle as the TLS frame may not be read in single read. That of course depends on specific pattern but it still looks like improvement. When we receive beginning of the frame, it is likely that rest will arrive as well e.g. the receive buffer will be needed. We could add the zero byte ready into the loop just before current
TIOAdapter.ReadAsync
to get 100% of synchronous receive. Or we can do it only for Windows, async path or cases when underlying stream is NetworkStream (may not work for Kestrel) to minimize impact for other cases when extra call has overhead with no benefits.I had to touch the conformance tests as they would no see zero byte reads that were not excepted as well as zero byte reads during handshake. e.g. before the test body even starts.
cc: @davidfowl @Tratcher in case this has some unexpected impact on Kestrel