New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce allocations in SslStream reading/writing by ~30% #13274

Merged
merged 3 commits into from Nov 3, 2016

Conversation

Projects
None yet
6 participants
@stephentoub
Member

stephentoub commented Nov 2, 2016

#12935 reduced allocations in SslStream reading/writing by ~70%. This PR reduces it further by another ~30% from where it was after that PR.

Before (10K reads / 10K writes):
image

After (10K reads / 10K writes):
image

Main changes:

  • Every read/write operation was allocating a new AsyncProtocolRequest. Without changing any of the plumbing through SslStream, we can take advantage of the fact that SslStream allows only a single read and a single write operation at a time, which means we can store and reuse an AsyncProtocolRequest instance for reading and one for writing, reusing the instances for all read/write operations on the stream.
  • Every read operation was boxing an Int32 result. This changes the BufferAsyncResult instance used for reads to reuse one of its existing fields to store that result, rather than storing it in the base LazyAsyncResult's object field.
  • Also took the opportunity to remove an unused field from BufferAsyncResult, though it doesn't change the size on 64-bit given the other fields on the base class.

cc: @geoffkizer, @ericeil, @davidsh, @CIPop, @benaadams, @davidfowl
Contributes to #11826

stephentoub added some commits Nov 2, 2016

Reuse AsyncProtocolRequest instances across read/write operations
Saves an allocation per read and write, ammortizing the costs across the whole stream's lifetime.

@stephentoub stephentoub changed the title from Ssl allocs to Reduce allocations in SslStream reading/writing by 30% Nov 2, 2016

@stephentoub stephentoub changed the title from Reduce allocations in SslStream reading/writing by 30% to Reduce allocations in SslStream reading/writing by ~30% Nov 2, 2016

internal void CompleteUser(object userResult)
internal void CompleteUser(int userResult)
{

This comment has been minimized.

@geoffkizer

geoffkizer Nov 2, 2016

Member

nit: capitalization

@geoffkizer

geoffkizer Nov 2, 2016

Member

nit: capitalization

This comment has been minimized.

@stephentoub

stephentoub Nov 2, 2016

Member

nit: capitalization

Of what?

@stephentoub

stephentoub Nov 2, 2016

Member

nit: capitalization

Of what?

This comment has been minimized.

@stephentoub

stephentoub Nov 2, 2016

Member

Oh, in the debug message below? GitHub wasn't showing that on the comment view. Will lower-case the R.

@stephentoub

stephentoub Nov 2, 2016

Member

Oh, in the debug message below? GitHub wasn't showing that on the comment view. Will lower-case the R.

This comment has been minimized.

@geoffkizer

geoffkizer Nov 2, 2016

Member

Yes, sorry for the confusion, I meant the R in the comment below.

@geoffkizer

geoffkizer Nov 2, 2016

Member

Yes, sorry for the confusion, I meant the R in the comment below.

@geoffkizer

This comment has been minimized.

Show comment
Hide comment
@geoffkizer

geoffkizer Nov 2, 2016

Member

LGTM. Nice work.

I do think that longer-term we should revisit the async implementation pattern here and elsewhere. There are likely more wins to be had here. Even with your improvements, async objects account for ~80% of allocated bytes.

Member

geoffkizer commented Nov 2, 2016

LGTM. Nice work.

I do think that longer-term we should revisit the async implementation pattern here and elsewhere. There are likely more wins to be had here. Even with your improvements, async objects account for ~80% of allocated bytes.

@stephentoub

This comment has been minimized.

Show comment
Hide comment
@stephentoub

stephentoub Nov 2, 2016

Member

I do think that longer-term we should revisit the async implementation pattern here and elsewhere. There are likely more wins to be had here. Even with your improvements, async objects account for ~80% of allocated bytes.

Absolutely. As you suggest, this is trying to make the best of the current design. Note that the SecBufferDesc allocation per operation can also be removed with a bit of work (see my comments in #12935), at which point all allocations per operation will be due to asynchrony. Once that's removed, we'll essentially have two allocations per read/write operation, one for the IAsyncResult and one for the Task wrapping it; it'll take some work, but we should be able to get that down to one, at which point I think that's generally the best we can do with the current API shape (though for some operations, we should be able to eliminate all allocations if they complete synchronously).

Member

stephentoub commented Nov 2, 2016

I do think that longer-term we should revisit the async implementation pattern here and elsewhere. There are likely more wins to be had here. Even with your improvements, async objects account for ~80% of allocated bytes.

Absolutely. As you suggest, this is trying to make the best of the current design. Note that the SecBufferDesc allocation per operation can also be removed with a bit of work (see my comments in #12935), at which point all allocations per operation will be due to asynchrony. Once that's removed, we'll essentially have two allocations per read/write operation, one for the IAsyncResult and one for the Task wrapping it; it'll take some work, but we should be able to get that down to one, at which point I think that's generally the best we can do with the current API shape (though for some operations, we should be able to eliminate all allocations if they complete synchronously).

@benaadams

This comment has been minimized.

Show comment
Hide comment
@benaadams

benaadams Nov 2, 2016

Collaborator

LGTM - Excellent work! 👍

Collaborator

benaadams commented Nov 2, 2016

LGTM - Excellent work! 👍

@davidsh

davidsh approved these changes Nov 2, 2016

LGTM. Thanks @stephentoub !

Avoid boxing Int32 result per operation
- Change BufferAsyncResult to allow storing int result, without adding another field
- Use that in both SslStream and NegotiateStream to avoid boxing an int per read and write
- Also rename AsyncProtocolRequest.CompleteWithError to CompleteUserWithError, to avoid confusion and keep it consistent with the other CompleteUser methods.
@stephentoub

This comment has been minimized.

Show comment
Hide comment
@stephentoub

stephentoub Nov 2, 2016

Member

@dotnet-bot Test Outerloop Windows_NT Debug please
@dotnet-bot Test Outerloop Ubuntu14.04 Debug please

Member

stephentoub commented Nov 2, 2016

@dotnet-bot Test Outerloop Windows_NT Debug please
@dotnet-bot Test Outerloop Ubuntu14.04 Debug please

@stephentoub

This comment has been minimized.

Show comment
Hide comment
@stephentoub

stephentoub Nov 2, 2016

Member

Thanks for the reviews, all.

Member

stephentoub commented Nov 2, 2016

Thanks for the reviews, all.

@stephentoub

This comment has been minimized.

Show comment
Hide comment
@stephentoub

stephentoub Nov 2, 2016

Member

@dotnet-bot Test OuterLoop Ubuntu14.04 Debug please ("arm32 emulator vms build and test on the main drive which is small. This is tracked with issue dotnet/coreclr#6676")

Member

stephentoub commented Nov 2, 2016

@dotnet-bot Test OuterLoop Ubuntu14.04 Debug please ("arm32 emulator vms build and test on the main drive which is small. This is tracked with issue dotnet/coreclr#6676")

@stephentoub

This comment has been minimized.

Show comment
Hide comment
@stephentoub

stephentoub Nov 2, 2016

Member

@dotnet-bot Test OuterLoop Windows_NT Debug please (#13281)

Member

stephentoub commented Nov 2, 2016

@dotnet-bot Test OuterLoop Windows_NT Debug please (#13281)

@stephentoub

This comment has been minimized.

Show comment
Hide comment
@stephentoub

stephentoub Nov 2, 2016

Member

@dotnet-bot Test OuterLoop Windows_NT Debug please (#13284)

Member

stephentoub commented Nov 2, 2016

@dotnet-bot Test OuterLoop Windows_NT Debug please (#13284)

@stephentoub

This comment has been minimized.

Show comment
Hide comment
@stephentoub

stephentoub Nov 3, 2016

Member

@dotnet-bot Test OuterLoop Windows_NT Debug please (now that #13284 is fixed)

Member

stephentoub commented Nov 3, 2016

@dotnet-bot Test OuterLoop Windows_NT Debug please (now that #13284 is fixed)

@stephentoub stephentoub merged commit 8b5760b into dotnet:master Nov 3, 2016

12 checks passed

Innerloop CentOS7.1 Debug Build and Test Build finished.
Details
Innerloop CentOS7.1 Release Build and Test Build finished.
Details
Innerloop Linux ARM Emulator SoftFP Debug Cross Build Build finished.
Details
Innerloop Linux ARM Emulator SoftFP Release Cross Build Build finished.
Details
Innerloop OSX Debug Build and Test Build finished.
Details
Innerloop OSX Release Build and Test Build finished.
Details
Innerloop Ubuntu14.04 Debug Build and Test Build finished.
Details
Innerloop Ubuntu14.04 Release Build and Test Build finished.
Details
Innerloop Windows_NT Debug Build and Test Build finished.
Details
Innerloop Windows_NT Release Build and Test Build finished.
Details
OuterLoop Ubuntu14.04 Debug Build finished.
Details
OuterLoop Windows_NT Debug Build finished.
Details

@stephentoub stephentoub deleted the stephentoub:ssl_allocs branch Nov 3, 2016

@karelz karelz modified the milestone: 1.2.0 Dec 3, 2016

@stephentoub stephentoub referenced this pull request Jun 23, 2017

Open

Faster SslStream #21371

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment