Is there a support for decrypting AES-GCM stream? #125

Closed
tomekit opened this issue Dec 21, 2022 · 12 comments

Comments

@tomekit

tomekit commented Dec 21, 2022

I am trying to decrypt an AES-GCM stream.
This is for purposes of previewing the big video file. The other use case is decrypting big files (e.g. 10GB) without necessity of allocating that much RAM.

I would still like to perform the authentication, once I get to the end of stream (last 16 bytes contain MAC).
I understand the security implication of e.g. playing "unauthenticated" video preview to the user.

It seems it's not something that's supported out of the box, as GCM cipher is only supported in block mode currently.

I've managed to decrypt AES-GCM stream using Pointycastle AES-CTR streaming cipher, but... I can't find an easy way of authenticating such stream. Is it feasible to calculate GMAC on top of streamed AES-CTR (e.g. using DartGcm) to get the GCM auth properties?

Streamed AES-GCM with authentication is possible with e.g. Python. Please find this example:

import Crypto.Cipher.AES
import binascii

key = bytes.fromhex('88fe0ff8c4eaf468d4cd9d9a9831662488fe0ff8c4eaf468d4cd9d9a98316624')
nonce = bytes.fromhex('3c95422167063c9542216706')

# streamed encryption then signing
cipher2 = Crypto.Cipher.AES.new(key, Crypto.Cipher.AES.MODE_GCM, nonce=nonce)
print(cipher2.encrypt(b'verylong').hex())                     # e7c5a84a16a30f43
print(cipher2.encrypt(b'video').hex())                     # 64700e4aed
output, mac = cipher2.encrypt_and_digest(b'files')
print(output.hex()) # 128085a073
print(mac.hex()) # e1d9a6971f30d5a1db4f5d36903ceb71

# streamed decryption and verification
cipher3 = Crypto.Cipher.AES.new(key, Crypto.Cipher.AES.MODE_GCM, nonce=nonce)

text = cipher3.decrypt(binascii.unhexlify('e7c5a84a16a30f43'))
print(text) # b'verylong'

text = cipher3.decrypt(binascii.unhexlify('64700e4aed'))
print(text) # b'video'

text = cipher3.decrypt_and_verify(binascii.unhexlify('128085a073'), mac)
print(text) # b'files'

You can play with it here: https://trinket.io/python3/445dfc8d86 (please note that sometimes you need to click Run button multiple times before the ModuleNotFoundError disappears)

Example decryption code for AES-GCM. How to implement streaming?

final key = hexToUint8List('88fe0ff8c4eaf468d4cd9d9a9831662488fe0ff8c4eaf468d4cd9d9a98316624');
final nonce = hexToUint8List('3c95422167063c9542216706');

final ciphertext = hexToUint8List('e7c5a84a16a30f4364700e4aed128085a073');
final mac = hexToUint8List('e1d9a6971f30d5a1db4f5d36903ceb71');

final algorithm = AesGcm.with256bits();
final secretBox = SecretBox(ciphertext, nonce: nonce, mac: Mac(mac));

final cleartext = await algorithm.decrypt(
  secretBox,
  secretKey: SecretKey(key),
  // keyStreamIndex: keyStreamIndex,
);

print(cleartext);
@tomekit changed the title from "How to use AES-GCM keyStreamIndex for streams? Mac validation fails, authentication tag is at the end of ciphertext." to "Is there a support for decrypting AES-GCM stream?" on Dec 22, 2022
@tomekit
Author

tomekit commented Dec 22, 2022

I've satisfied this requirement using Pointycastle. It would be nice to know if it's possible to achieve this using cryptography package for flutter. Don't really want to clutter the issues, so I am closing that one.

@tomekit closed this as completed on Dec 22, 2022
@elliotsayes

Hi @tomekit, could you provide some info on how you implemented this in pointycastle? Cheers

@tomekit
Author

tomekit commented Jan 3, 2023

Hi @elliotsayes,
No problem. In principle you need to call processBytes multiple times, and for the final sign/verify step you call doFinal.
If your stream chunk size is divisible by 16 (BLOCK_SIZE) without remainder, you can call processBlock directly, skipping some of the logic which I found buggy (bcgit/pc-dart#182) and fixed here: bcgit/pc-dart@master...tomekit:pc-dart:master

@elliotsayes

elliotsayes commented Jan 3, 2023

Thanks for your reply @tomekit. For encrypting, I've implemented something along the lines of what you suggested, using processBlock (good thing by the sounds of it!). I'm generating the blockStream using ChunkedStreamReader from package:async. Testing shows identical results as package:cryptography.

final encryptedStream = blockStream.transform<Uint8List>(
  StreamTransformer.fromHandlers(
    handleData: (data, sink) {
      final encryptedBlockBuffer = Uint8List(encrypter.blockSize);
      final processedBytes = encrypter.processBlock(data, 0, encryptedBlockBuffer, 0);
      sink.add(encryptedBlockBuffer.sublist(0, processedBytes));
    },
    handleDone: (sink) {
      var macBuffer = Uint8List(encrypter.macSize);
      encrypter.doFinal(macBuffer, 0);
      sink.add(macBuffer);
      sink.close();
    },
  ),
);

@tomekit
Author

tomekit commented Jan 27, 2023

Nice approach @elliotsayes. I couldn't really use that approach, since my stream isn't continuous; that is, I receive e.g. 5000 bytes in the first iteration, then 1000 in the next. That means that I often have an ending block which isn't a full 16 bytes. If I used processBlock on fewer than 16 bytes and then truncated the output (based on processedBytes), the ciphertext would be correct; however, it would mess up the authentication tag state, as the incomplete block would be internally padded with zeros (and GHASH'd).
In other words, I needed the buffering, so instead of rolling my own, I've fixed the one used from the processBlock.
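
The buffering point above can be illustrated with a small Python sketch of GHASH, the hash inside GCM's authentication tag. This is only an illustration under stated assumptions, not the Pointycastle code: `StreamingGhash` and `gf_mul` are hypothetical names, there is no associated data, the hash subkey `h` is an arbitrary stand-in (in real GCM it is the AES encryption of an all-zero block), and the final XOR with the encrypted initial counter block is omitted. It carries an incomplete 16-byte block over to the next call and zero-pads only at the very end.

```python
R = 0xE1 << 120  # GCM reduction constant for GF(2^128)


def gf_mul(x: int, y: int) -> int:
    """Multiply two field elements; bit 127 of the int is GCM's bit 0."""
    z, v = 0, y
    for i in range(127, -1, -1):
        if (x >> i) & 1:
            z ^= v
        v = (v >> 1) ^ R if v & 1 else v >> 1
    return z


class StreamingGhash:
    """GHASH over ciphertext that arrives in arbitrarily sized chunks."""

    def __init__(self, h: bytes):
        self._h = int.from_bytes(h, "big")  # hash subkey (stand-in value in this demo)
        self._y = 0
        self._buf = b""
        self._length = 0

    def update(self, chunk: bytes) -> None:
        self._length += len(chunk)
        data = self._buf + chunk
        full = len(data) - len(data) % 16
        for off in range(0, full, 16):
            block = int.from_bytes(data[off:off + 16], "big")
            self._y = gf_mul(self._y ^ block, self._h)
        self._buf = data[full:]  # carry the incomplete block to the next call

    def digest(self) -> bytes:
        y = self._y
        if self._buf:  # zero-pad only the very last block of the stream
            block = int.from_bytes(self._buf.ljust(16, b"\x00"), "big")
            y = gf_mul(y ^ block, self._h)
        # Length block: 64-bit AAD bit length (0 here), then 64-bit ciphertext bit length.
        lengths = bytes(8) + (self._length * 8).to_bytes(8, "big")
        y = gf_mul(y ^ int.from_bytes(lengths, "big"), self._h)
        return y.to_bytes(16, "big")


h = bytes(range(1, 17))                        # arbitrary demo subkey
data = bytes(i % 251 for i in range(6000))     # 5000 + 1000 bytes, as in the example above

whole = StreamingGhash(h)
whole.update(data)

split = StreamingGhash(h)
split.update(data[:5000])                      # 5000 is not a multiple of 16
split.update(data[5000:])

padded = StreamingGhash(h)
padded.update(data[:5000] + bytes(8))          # zero-padding in the middle of the stream
padded.update(data[5000:])

print(whole.digest() == split.digest())
print(whole.digest() == padded.digest())
```

With buffering, the chunk boundaries do not affect the result; padding an incomplete block in the middle of the stream produces a different value, which is why the tag would fail to verify.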

I am wondering if you had any issues with this approach? ... but given that you seem to have continuous stream I guess it all works alright.

@elliotsayes

elliotsayes commented Feb 7, 2023

Thanks for the comment @tomekit.

> I couldn't really use that approach, since my stream isn't continuous; that is, I receive e.g. 5000 bytes in the first iteration, then 1000 in the next.

My scenario is actually very similar, I was just preprocessing the stream to split it into chunks of 16 bytes using ChunkedStreamReader. However it turns out that reading in such small chunks causes a lot of overhead, so I had to change this approach to reading larger chunks and process them piecewise using the offsets in processBlock, i.e.:

encrypter.processBlock(sourceBuffer, bufferOffset, sinkBuffer, bufferOffset);

> That means that I often have an ending block which isn't a full 16 bytes. If I used processBlock on fewer than 16 bytes and then truncated the output (based on processedBytes), the ciphertext would be correct; however, it would mess up the authentication tag state, as the incomplete block would be internally padded with zeros (and GHASH'd). In other words, I needed the buffering, so instead of rolling my own, I've fixed the one used from the processBlock.

This is interesting. AFAICT I have had no such problems with the authentication tag (this is what determines the MAC at the end of the ciphertext?). Bear in mind that I have thus far only played around with encryption and not decryption, but based on my testing the output of my pointycastle implementation (including MAC concatenation) is identical to the output from package:cryptography's standard Uint8Array methods. This includes scenarios where the final block length is less than 16.

> I am wondering if you had any issues with this approach? ... but given that you seem to have continuous stream I guess it all works alright.

Although the output (for encryption) appears to be correct, the performance is lacking to say the least. In a dart test debug environment I'm only getting ~250KiB/s (macos m1). Perf doesn't change much when switching to processBytes with larger buffers, or running on flutter web release mode. Also note this PR when running on web: bcgit/pc-dart#181

This is in contrast to e.g. package:cryptography/package:webcrypto which uses the browser's WebCryptoAPI on web and a C-based implementation on mobile. (Note that WebCryptoAPI-based solutions are unsuitable as the standard currently lacks stream support).

I think my only remaining option to improve performance is something like https://pub.dev/packages/flutter_rust_bridge ? Curious on your experience/thoughts related to performance

@tomekit
Author

tomekit commented Feb 7, 2023

> AFAICT I have had no such problems with the authentication tag (this is what determines the MAC at the end of the ciphertext?).

It's interesting. Yes, the authentication tag and the MAC are the same thing in this context.
Since this library doesn't seem to be of much use to either of us due to performance issues, we can ignore these problems.

> Also note this PR when running on web: bcgit/pc-dart#181

Yep, that's something similar to what I did in my fixes: bcgit/pc-dart@master...tomekit:pc-dart:master. Unfortunately, performance on Web is a pure tragedy.

> I think my only remaining option to improve performance is something like https://pub.dev/packages/flutter_rust_bridge ? Curious on your experience/thoughts related to performance

I don't have experience with Rust and bindings to Rust, however I've used FFI in context of libsodium and performance was great: https://pub.dev/packages/sodium_libs
The issue is that libsodium doesn't support AES-GCM streaming either, and AES-GCM is also missing from the iOS implementation: https://github.com/jedisct1/swift-sodium/
The library itself supports concept of streaming: https://doc.libsodium.org/advanced/stream_ciphers for many ciphers, but that's unauthenticated.
We've currently put our cryptography improvements on hold, but since in our case we can switch ciphers, we will probably start with libsodium and modify it, so it would support XChaCha20 streaming with Poly1305 authentication.
Since XChaCha20 isn't yet IETF-approved, we might actually start with ChaCha20 first and see how the situation evolves.
We'll simply wait a bit and if no-one by this time invented some sort of magic that we could use, we will have to get our hands dirty with C.

What Rust library do you have in mind when used with the Flutter bridge?

@elliotsayes

elliotsayes commented Feb 9, 2023

I wrote a comment earlier but GitHub ate it... oops. Here goes again

> I don't have experience with Rust and bindings to Rust

I barely do either. But from my investigation, the rust-in-flutter story is a bit messy right now, especially when you bring web into the mix. Some issues below:

No support for plugins with web:
  https://github.com/Desdaemon/flutter_rust_bridge_template/pull/30#issuecomment-1421983581
  https://github.com/fzyzcjy/flutter_rust_bridge/issues/857#issuecomment-1344894284
No support for Dart -> Rust streams:
  https://github.com/fzyzcjy/flutter_rust_bridge/issues/1032#issuecomment-1422016558
WASM issues:
  https://cjycode.com/flutter_rust_bridge/wasm_limitations.html
Web debugging/hot restart issues:
  https://github.com/fzyzcjy/flutter_rust_bridge/issues/1002

> What Rust library do you have in mind when used with the Flutter bridge?

I've been playing with RustCrypto, which is audited for both AES-GCM and ChaCha20Poly1305. There is also tink-rust which wraps RustCrypto, but has a more complex key system.

> I've used FFI in context of libsodium and performance was great

What kind of numbers were you getting for AES256-GCM? FYR my testing with RustCrypto encrypts at ~10MiB/s (macos m1, flutter web release mode, 1MiB chunks)

> The library itself supports concept of streaming: https://doc.libsodium.org/advanced/stream_ciphers ...

Based on my research, the term "stream cipher" (i.e. as opposed to "block cipher") means something different to whether the API lets you encrypt chunks incrementally or not. Read more: https://en.wikipedia.org/wiki/Stream_cipher

> we will probably start with libsodium and modify it...

Does this mean writing JS for web and native code for each additional platform? If so that is a serious undertaking.

> ... so it would support XChaCha20 streaming with Poly1305 authentication.

Just bear in mind that although XChaCha20 is faster in software, AES has optimised instructions on modern processors (AES-NI), though these are not supported in WASM yet.

@tomekit
Author

tomekit commented Feb 9, 2023

> What kind of numbers were you getting for AES256-GCM? FYR my testing with RustCrypto encrypts at ~10MiB/s (macos m1, flutter web release mode, 1MiB chunks)

This is by no means scientific (please find the test code at the bottom of this post), but on a 100MiB file I get:

AMD Ryzen 7 5800U on Ubuntu 22.04

XChaCha20:
Encrypted after 171ms
Decrypted after 163ms

AES-GCM
Encrypted after 152ms
Decrypted after 156ms

Mac Mini M1

XChaCha20
Encrypted after 332ms
Decrypted after 345ms

AES-GCM - FFI fails;
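
To put these timings in the same units as the throughput figures quoted elsewhere in the thread (~250KiB/s and ~10MiB/s), a quick conversion can help. This is just arithmetic on the numbers above; the function name and dictionary labels are made up for the example.

```python
FILE_SIZE_MIB = 100  # size of the test file used for the timings above


def throughput_mib_per_s(size_mib: float, elapsed_ms: float) -> float:
    """Convert a wall-clock timing in milliseconds to MiB per second."""
    return size_mib / (elapsed_ms / 1000.0)


# Encryption timings in milliseconds, copied from the results above.
encrypt_ms = {
    "Ryzen 7 5800U, XChaCha20": 171,
    "Ryzen 7 5800U, AES-GCM": 152,
    "Mac Mini M1, XChaCha20": 332,
}

for label, ms in encrypt_ms.items():
    print(f"{label}: {throughput_mib_per_s(FILE_SIZE_MIB, ms):.0f} MiB/s")
```

All three native FFI runs come out at hundreds of MiB/s.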

> Based on my research, the term "stream cipher" (i.e. as opposed to "block cipher") means something different to whether the API lets you encrypt chunks incrementally or not. Read more: https://en.wikipedia.org/wiki/Stream_cipher

You're right. However, in this case the API (https://doc.libsodium.org/advanced/stream_ciphers) gives you low-level building blocks (e.g. crypto_stream_xchacha20_xor and crypto_stream_xchacha20_xor_ic) that you should be able to use for streaming.

Conversely if an API advertises streaming support: https://doc.libsodium.org/secret-key_cryptography/secretstream it doesn't necessarily mean that it's actually conventional streaming that you would expect it to be (More about that: Skycoder42/libsodium_dart_bindings#26 (comment)).

> Does this mean writing JS for web and native code for each additional platform? If so that is a serious undertaking.

My assumption is, that most of the stuff (streaming and authentication - Poly1305) is already there. The C library will have to be modified and I hope it won't be any low-level stuff, but mostly using existing functions and connecting them really. Then it will have to be compiled to WASM/JS - https://github.com/jedisct1/libsodium.js/
I am not sure why the C implementation wasn't / can't be used for iOS/macOS, and there is a Swift one instead: https://github.com/jedisct1/swift-sodium/ but since the Swift library supports XChaCha20 and Poly1305 already, such a change would require a similar sort of work as with the C library; it's just that the language is subjectively nicer to use.
I have time for more research before the decision is made.

Test / benchmark code:

test('libsodiumXChaCha20Bench', () async {
      final libsodium = DynamicLibrary.open('/usr/lib/x86_64-linux-gnu/libsodium.so'); // dpkg -L libsodium-dev | grep "\.so"
      // final libsodium = DynamicLibrary.open('/opt/homebrew/Cellar/libsodium/1.0.18_1/lib/libsodium.23.dylib'); // brew info libsodium
      final sodium = await SodiumInit.init(libsodium);

      final base64MasterKey = "+Hv/rT8HPG+Qmk3zhV2NDA==";
      final encryptedKey = "8IK5l6NGSudK/b57goLjZ6ePvfHj+w29D7rle8ShLCLdl0Yy5irmtw==";
      final iv = "T4jMtxyX/+s60T3rT4jMtxyX/+s60T3r"; // For AES use half of that, e.g.: `T4jMtxyX/+s60T3r`
      final nonce = base64Decode(iv);

      // final plaintext = Uint8List(1024*1024*100);
      final plaintext = await File('/home/tomek/Downloads/rand100MiB.bin').readAsBytes(); // dd if=/dev/random of=/home/tomek/Downloads/rand100MiB.bin bs=1024 count=102400
      final unwrappedKey = AesKwRfc3394.unwrap(encryptedKey, base64MasterKey);

      final secureKey = SecureKey.fromList(sodium, Uint8List.fromList(unwrappedKey));

      final timer = Stopwatch()..start();
      final encryptedOutputBinary = sodium.crypto.aead.encrypt(message: plaintext, nonce: nonce, key: secureKey); // Change from `aead` to `aeadAes256Gcm` for AES
      timer.stop();
      print('Encrypted after ${timer.elapsed.inMilliseconds}ms');

      final timer2 = Stopwatch()..start();
      final decryptedOutputBinary = sodium.crypto.aead.decrypt(cipherText: encryptedOutputBinary, nonce: nonce, key: secureKey);  // Change from `aead` to `aeadAes256Gcm` for AES
      timer2.stop();
      print('Decrypted after ${timer2.elapsed.inMilliseconds}ms');
    });

@elliotsayes

Regarding your benchmarks, it looks like you are running those on the linux dart runtime which explains the speed. Did you take any measurements on the web runtime?

Also note that if AES-GCM is faster than XChaCha20 then it probably means your CPU has the optimised AES-NI instructions, otherwise it would be 2-4x slower than XChaCha20.

> [libsodium] will have to be compiled to WASM

Ah that makes sense, I didn't realise the JS impl is just compiled from C to WASM. That should simplify things.

> if an API advertises streaming support [...] it doesn't actually mean that it's actually conventional streaming that you would expect it to be

Yes, I have noticed this too, and I believe it is because we are dealing with Authenticated Encryption (AE). It seems you can't just expect to take a buffer based AE implementation and turn it into a stream based one without modifying the protocol itself. E.g. RustCrypto uses a protocol called STREAM on top of AEADs like AES/ChaCha-Poly to make it suitable for this, and libsodium seems to do something similar.
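
As a rough illustration of what "modifying the protocol" means here, the sketch below derives one nonce per segment in the style of the STREAM construction. The layout is an assumption based on my understanding of RustCrypto's 32-bit big-endian flavour (a 7-byte prefix, a 4-byte counter and a 1-byte "last segment" flag, filling a 12-byte AEAD nonce); the function names are made up, and the actual per-segment AEAD calls are left out.

```python
import struct

NONCE_LEN = 12               # nonce size of AES-GCM and ChaCha20-Poly1305
PREFIX_LEN = NONCE_LEN - 5   # 4-byte counter + 1-byte flag leave 7 bytes


def stream_nonce(prefix: bytes, counter: int, last: bool) -> bytes:
    """Build the nonce for one segment: prefix || counter (big-endian) || last flag."""
    if len(prefix) != PREFIX_LEN:
        raise ValueError(f"prefix must be {PREFIX_LEN} bytes")
    # struct.pack raises struct.error if the counter does not fit in 32 bits.
    return prefix + struct.pack(">IB", counter, 1 if last else 0)


def segment_nonces(prefix: bytes, segment_count: int) -> list:
    """Nonces for a message split into `segment_count` independently sealed segments."""
    return [
        stream_nonce(prefix, index, index == segment_count - 1)
        for index in range(segment_count)
    ]


for nonce in segment_nonces(bytes(7), 3):
    print(nonce.hex())
```

Because every segment is sealed with its own nonce and tag, each one can be authenticated before its plaintext is released; the counter prevents reordering and the flag prevents silent truncation. The price is that the ciphertext is no longer a plain AES-GCM or ChaCha20-Poly1305 message.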

If you want to keep compatibility with the standard protocol by skipping the MAC after each chunk, then it goes against the security model of AE (exposing unauthenticated data to the application), and to some extent defeats the purpose of using AE (over unauthenticated encryption) at all.

So IMO the answer is either:

  • Use a custom Streaming AEAD protocol (like STREAM / whatever libsodium uses), or
  • Drop the AE requirement and use an unauthenticated encryption like AES-CTR or plain (X)ChaCha20

I think this is preferable to hacking an AEAD implementation to defer Authentication to the end of the message, unless you really need that and are uncomfortable with committing to a custom protocol. In my case I believe that it is possible to do Authentication at a higher level in the protocol (i.e. the AES-GCM MAC is essentially redundant), so I'll see if I can get by with the latter option.

@tomekit
Author

tomekit commented Feb 9, 2023

> Did you take any measurements on the web?

Nope. I've only assumed it will be "enough", having had a chance to see the XChaCha20 results in general.
There is also a Web proposal for streaming protocols: https://github.com/wintercg/proposal-webcrypto-streams/blob/main/explainer.md but it is possible that, as you summarized in your last post, streamed AEAD might never actually end up there.

> It seems you can't just expect to take a buffer based AE implementation and turn it into a stream based one without modifying the protocol itself.

If I didn't have a requirement of providing a preview of encrypted video files, then I would probably keep files encrypted in e.g. 5MB independently signed chunks. That proves to be difficult if you want your decrypted plaintext to match the ciphertext 1:1 (position-wise): jedisct1/libsodium#1255 (comment)
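
The position mismatch can be seen with a little offset arithmetic. The sketch below assumes a hypothetical layout in which every 5 MiB plaintext chunk is stored as its ciphertext followed by a 16-byte tag; the constants and function names are illustrative and not taken from any library.

```python
CHUNK_SIZE = 5 * 1024 * 1024   # plaintext bytes per chunk (assumed)
TAG_SIZE = 16                  # authentication tag appended to every chunk (assumed)
SEALED_SIZE = CHUNK_SIZE + TAG_SIZE


def locate(plain_offset: int):
    """Map a plaintext offset to (chunk index, offset inside chunk, ciphertext offset of that chunk)."""
    index, within = divmod(plain_offset, CHUNK_SIZE)
    return index, within, index * SEALED_SIZE


def ciphertext_range(plain_start: int, plain_end: int):
    """Ciphertext byte range [start, end) needed to decrypt plaintext [plain_start, plain_end).

    Whole chunks are returned because a chunk can only be verified in full.
    The end may overshoot a shorter final chunk, so clamp it to the object size.
    """
    first = plain_start // CHUNK_SIZE
    last = (plain_end - 1) // CHUNK_SIZE
    return first * SEALED_SIZE, (last + 1) * SEALED_SIZE


print(locate(CHUNK_SIZE))                      # first byte of the second chunk
print(ciphertext_range(CHUNK_SIZE - 1, CHUNK_SIZE + 1))
```

Every chunk shifts the following ciphertext by another 16 bytes, so a video player seeking to a plaintext position needs this kind of translation, and it has to fetch and verify whole chunks rather than arbitrary byte ranges.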

> If you want to keep compatibility with the standard protocol by skipping the MAC after each chunk, then it goes against the security model of AE (exposing unauthenticated data to the application), and to some extent defeats the purpose of using AE (over unauthenticated encryption) at all.

You're entirely right. It's just I've assumed that "delayed verification" will almost always be better than none.
I believe that the security model of the application that I am building doesn't inherently fail if unauthenticated data is exposed.
An example is user downloading an encrypted file from the S3 bucket, locally on their desktop. File gets streamed to temporary location and is only moved to the ~/Downloads location if verification is successful.
In other environments where arrangement of actors is different, exposing unauthenticated cipher might allow some additional attacks. Perhaps the "delayed verification" is too risky in some assessments (e.g. what if code that is supposed to discard the malformed plaintext doesn't work?)
I am also considering using mixed approach, that is streaming AEAD protocol for all but video files.
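
The "stream to a temporary location, move only after verification" flow described above could look roughly like the following. This is a sketch under loud assumptions: HMAC-SHA256 from the Python standard library stands in for the GCM tag, decryption is omitted entirely (the chunks are written as received), and `download_verified` is a made-up name.

```python
import hashlib
import hmac
import os
import tempfile


def download_verified(chunks, expected_tag: bytes, mac_key: bytes, dest_path: str) -> None:
    """Write chunks to a temp file; move it to dest_path only if the MAC verifies."""
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    mac = hmac.new(mac_key, digestmod=hashlib.sha256)  # stand-in for the GCM tag
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir)      # same directory, so the move stays atomic
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:
                mac.update(chunk)
                tmp.write(chunk)  # a real client would write the decrypted bytes here
        if not hmac.compare_digest(mac.digest(), expected_tag):
            raise ValueError("authentication failed; output discarded")
        os.replace(tmp_path, dest_path)
    except BaseException:
        # Never leave unauthenticated output behind.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

The cleanup branch is exactly the code that must not fail: if it does, unauthenticated plaintext stays on disk, which is the risk raised above.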

Finally, I've wanted my solution to be cross-compatible and I've followed exactly the same implementation that AWS initially proposed: https://docs.aws.amazon.com/general/latest/gr/aws_sdk_cryptography.html#crypto_features and https://aws.amazon.com/blogs/developer/amazon-s3-client-side-authenticated-encryption/
With AES-GCM it supports up to 64GB files, but without streaming the limit would really be defined by device's RAM size.
It's also one of the reasons I am pushing that hard trying to abuse streamed AEAD.
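
For reference, the roughly 64GB figure follows from GCM's 32-bit block counter. The arithmetic below uses the plaintext limit of 2^39 - 256 bits from NIST SP 800-38D as I understand it; the variable names are only for illustration.

```python
BLOCK_BYTES = 16       # AES block size
COUNTER_BITS = 32      # width of the per-block counter in GCM

max_blocks = 2**COUNTER_BITS - 2          # block limit per (key, nonce) pair
max_bytes = max_blocks * BLOCK_BYTES

# Same limit expressed in bits, as stated in the specification.
assert max_bytes * 8 == 2**39 - 256

print(max_bytes)                # maximum plaintext size in bytes
print(max_bytes / 2**30)        # the same limit in GiB
```

So a single AES-GCM message tops out just under 64 GiB; larger objects need to be split across several messages regardless of whether the implementation streams.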

Thank you for having this great discussion. It's really useful to reiterate and validate this all again.

@elliotsayes

> If I didn't have a requirement of providing a preview of encrypted video files, then I would probably keep files encrypted in e.g. 5MB independently signed chunks. That proves to be difficult if you want your decrypted plaintext to match the ciphertext 1:1 (position-wise): jedisct1/libsodium#1255 (comment)

Thinking about this requirement as I may want to implement it myself at some point... intuitively it would be possible to do this with a gateway service that proxies a given HTTP request, mapping it to the correct Range offset and decrypting. But granted that a gateway isn't an option, I wonder if there is a way to implement this same sort of proxy but within the client browser? I believe this does something along those lines, by intercepting and transforming fetch calls: https://github.com/myelnet/js-dcdn/blob/dev/packages/service-worker/src/controller.ts

> File gets streamed to temporary location and is only moved to the ~/Downloads location if verification is successful. In other environments where arrangement of actors is different, exposing unauthenticated cipher might allow some additional attacks. Perhaps the "delayed verification" is too risky in some assessments (e.g. what if code that is supposed to discard the malformed plaintext doesn't work?)

Yes, this is essentially leaking the security abstraction by offloading some of the security burden to your application code. I'd put it in the "probably a bad idea" category, but you could get away with it if you are careful.

> Finally, I've wanted my solution to be cross-compatible [...]

Same here, as I'm working on an open standard (ArFS), and don't really want to include a custom protocol (e.g. libsodium's secretstream) into the spec. The problem is though, even if you are following a standard, it doesn't help much if no audited implementation has out-of-the-box support of your desired combination of parameters and target hardware (plaintext size > available RAM). So essentially it's a choice between unsupported standards and supported non-standards 😂

Personally I'm going to wait for confirmation of whether we can remove Authentication entirely from this layer of the protocol, and if so go with AES256-CTR or ChaCha20. Otherwise I'll probably bite the bullet and go with (AES256-GCM or (X)ChaCha20-Poly1305) over (STREAM or secretstream) for a worry-free Authentication abstraction.

> Thank you for having this great discussion. It's really useful to reiterate and validate this all again.

Likewise, appreciate the exchange :)
