Why not stream based AES-GCM? #27348
Comments
The biggest problem is streaming decryption. There's a tendency to want to read the data before the tag verifies, and that's dangerous. That said, the model looks fair and sound. But it's a protocol, not direct exposure to the algorithm. The API we added allows access to the algorithm for anything up to int.MaxValue bytes, and the protocol you've described can be written in terms of it (provided you have 2x2GB of addressable memory to buffer with). If that protocol were a standard, we would likely implement it as the standard, but we aren't currently interested in custom protocols or schemes.
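For readers following along, here is a minimal sketch of the one-shot usage described above, assuming .NET 8 or later (where the `AesGcm(byte[], int)` constructor is available; earlier versions only have the single-argument constructor). The class and method names are hypothetical and only illustrate the point that plaintext is handed back only after the tag verifies.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper, not part of any library.
public static class OneShotGcmSketch
{
    // Encrypts and then decrypts `plaintext` with a fresh random nonce.
    // When `tamper` is true, one ciphertext bit is flipped to show that
    // Decrypt refuses to return data whose tag does not verify.
    public static bool TryRoundTrip(byte[] key, byte[] plaintext, bool tamper, out byte[] recovered)
    {
        byte[] nonce = new byte[12];           // 96-bit nonce
        RandomNumberGenerator.Fill(nonce);
        byte[] ciphertext = new byte[plaintext.Length];
        byte[] tag = new byte[16];             // 128-bit tag

        using var aes = new AesGcm(key, 16);
        aes.Encrypt(nonce, plaintext, ciphertext, tag);

        if (tamper && ciphertext.Length > 0)
        {
            ciphertext[0] ^= 0x01;
        }

        recovered = new byte[ciphertext.Length];
        try
        {
            // Throws if the tag does not match.
            aes.Decrypt(nonce, ciphertext, tag, recovered);
            return true;
        }
        catch (CryptographicException)
        {
            recovered = Array.Empty<byte>();
            return false;
        }
    }
}
```

Calling `TryRoundTrip(key, data, tamper: false, out var result)` with a 16, 24, or 32 byte key should return the original bytes; with `tamper: true` it should return `false`.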
There is another issue with the lack of stream-based AES-GCM, which arises when using it in conjunction with System.IO.Pipelines. Pipelines gives you a ReadOnlySequence that consists of multiple buffers internally, so the data can't be deciphered in one go. And no, even reducing it into smaller packets won't help, as the amount of data you receive per socket recv call is never guaranteed. I'd appreciate it if this issue were reopened.
@AronParker can you read the data into a buffer first instead? AES-GCM by design has a limit on how much can be encrypted per (key, nonce), which should fit in memory. Also, in order to authenticate correctly, the library is not supposed to reveal any data to the user until the whole decryption is done and authenticated.
Yes, I can read the data into a buffer, but that defeats the whole purpose of the scalability of System.IO.Pipelines; the whole point of the package is avoiding these kinds of memory allocations. That limit won't be hit: the amount of data to be decrypted by AES-GCM is 2^16 at most. Yet due to the unpredictable nature of the sizes received by Socket.Receive calls, it's not possible to rely on this; in practice the received data will be segmented into multiple buffers. I agree on making the API design safe by default because people do forget to verify the tag. However, the option for advanced uses should not be taken away, as it is necessary in some high-performance cases, in particular in conjunction with System.IO.Pipelines.
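To make the buffering workaround discussed in these comments concrete, here is a rough sketch, assuming .NET Core 3.0 or later for `ReadOnlySequence<byte>.FirstSpan`. `SequenceGcmSketch` and `Segment` are hypothetical names; `Segment` exists only to build a multi-segment sequence for demonstration. The copy into a pooled array is exactly the cost being objected to; the sketch only shows that it can at least avoid a fresh allocation per message.

```csharp
using System;
using System.Buffers;
using System.Security.Cryptography;

// Hypothetical helper, not part of System.IO.Pipelines or the crypto API.
public static class SequenceGcmSketch
{
    // Decrypts a possibly segmented ciphertext. `plaintext` must have the
    // same length as `ciphertext`. Throws if the tag does not verify.
    public static void Decrypt(AesGcm aes, ReadOnlySpan<byte> nonce,
        ReadOnlySequence<byte> ciphertext, ReadOnlySpan<byte> tag, Span<byte> plaintext)
    {
        if (ciphertext.IsSingleSegment)
        {
            // Fast path: already contiguous, no copy needed.
            aes.Decrypt(nonce, ciphertext.FirstSpan, tag, plaintext);
            return;
        }

        int length = checked((int)ciphertext.Length);
        byte[] rented = ArrayPool<byte>.Shared.Rent(length);
        try
        {
            // Flatten the segments into one contiguous span.
            Span<byte> contiguous = rented.AsSpan(0, length);
            ciphertext.CopyTo(contiguous);
            aes.Decrypt(nonce, contiguous, tag, plaintext);
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(rented);
        }
    }
}

// Demonstration-only segment type for building a multi-buffer sequence.
public sealed class Segment : ReadOnlySequenceSegment<byte>
{
    public Segment(ReadOnlyMemory<byte> memory) => Memory = memory;

    public Segment Append(ReadOnlyMemory<byte> memory)
    {
        var next = new Segment(memory) { RunningIndex = RunningIndex + Memory.Length };
        Next = next;
        return next;
    }
}
```

A two-segment sequence can be built with `var first = new Segment(a); var last = first.Append(b);` followed by `new ReadOnlySequence<byte>(first, 0, last, last.Memory.Length)`.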
I've been wondering why it had been chosen not to have a stream-based API for interacting with AES-GCM. I would like to describe a theoretical model built around the API added in this pull request: dotnet/corefx#31389. I understand that the reason behind this decision is security. However, in such a case please explain why the following approach would not meet your security standards.
Motivation
The maximum array length of 2147483591 imposes a restriction on the maximum size of a byte sequence that can be encrypted — roughly 2 GB. It is likely that a customer will try to encrypt a larger piece of data.
Theoretical approach
Imagine a huge piece of data encoded in the following way:
`Math.Min(a, b)` chunks, each having:
If `a != b`, then one type of data (either ciphertext or additional data) requires more chunks. In such a case all of them are laid out with the following configuration:
Some thoughts
@krwq @vcsjones @bartonjs @morganbr @Drawaes @blowdart