-
Notifications
You must be signed in to change notification settings - Fork 4.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor symmetric one-shots for improved performance #58270
Conversation
This refactors the symmetric one shots to do less work. This removes the calls to Reset for the one shots since the cipher will not be reused, and in the Unix cases, removes keeping a copy of the IV around.
Tagging subscribers to this area: @bartonjs, @vcsjones, @krwq, @GrabYourPitchforks Issue DetailsThis is a revival of #55090 which was unable to make it in the 6.0 time frame. This reduces allocations and overhead in for the symmetric one shots.
Starting off a draft because I think I can fix up Windows a bit more, but benchmarks look good. Allocations that are "overhead" are cut about half. ECB and CBC show small throughput improvements. CFB shows larger improvements on the Apple platform because the overhead of doing a Reset with CFB is much higher. Work remaining:
Closes #55601. Benchmarks on Apple M1CBC
CFB
ECB
|
Note: this PR omits some one-shot improvements for the *Cng one-shot implementations. I'll do that in a follow up PR, as this one has gotten a bit big (though most of the changes are just shuffling / decoupling existing code). |
Linux, macOS, and Windows benchmarks look positive to me, so I'll mark it ready for review. |
Performance of Benchmarks
|
Are these scenarios that ought to be added to dotnet/performance to protect this change? I don't think I see them there. |
I don't know, and I don't know how volatile it would be. Crypto is largely dependent on platform implementations, and the majority of the time spent is in native calls. Benchmarks could be added but they would largely be timing these native APIs that we have no control over. |
@bartonjs just a ping to see if this one flew under your radar (I think you were out when I opened this PR). No worries if you cannot get to this one any time soon. The diff looks scary but it's mostly just moving code around, not re-writes. |
} | ||
} | ||
|
||
public static int PadBlock(ReadOnlySpan<byte> block, Span<byte> destination, int paddingSizeInBytes, PaddingMode paddingMode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I (personally) strongly prefer that methods inside internal classes are marked internal, unless they are conforming to an interface.
Every time I see "public" I have an API Review reaction (e.g. something like "is this the right shape for the rest of time?", "did we review this?", or "I don't remember seeing the ref.cs update").
I wouldn't spin the change at this point just for this... but if it has to be spun.... 😄.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. I suspect in two years you will have changed my mind on this, much like with var
when I set out to start contributing here a few years ago 😄.
Well... I knew it was here. But that it was long. And I was "so close" to finishing up a thing I was working on. Now it's two weeks later and I'm still "so close", so it was obviously time to bump this higher 😄. Sorry. |
This is a revival of #55090 which was unable to make it in the 6.0 time frame.
This reduces allocations and overhead in for the symmetric one shots.
Reset
is skipped in the case of one shots since we won't use the cipher after the finalize. It's finalized, then disposed.Starting off a draft because I think I can fix up Windows a bit more, but benchmarks look good.
Allocations that are "overhead" are cut about half. ECB and CBC show small throughput improvements. CFB shows larger improvements on the Apple platform because the overhead of doing a Reset with CFB is much higher.
Work remaining:
TransformBlock
andTransformFinalBlock
to ensure we didn't regress the non-one-shots.Contributes to #55601.
Benchmarks on Apple M1
CBC
CFB
ECB
Benchmarks on Windows 10
CBC
CFB
ECB
Benchmarks on Linux
CBC
CFB
ECB