
Bug 27755 - Using the Subtle Crypto Interface with Streams #73

Open

mwatson2 opened this issue May 24, 2016 · 74 comments

@mwatson2
Collaborator

Bug 27755:

Though the Streams API is referenced in the Informative References, the functions under window.crypto.subtle are specified with only one-shot data inputs.

Use cases: data may not be available all at once, and data may be too large to keep in memory.

For encrypt()/decrypt(), it would make sense to produce a streaming readable output when the input is a readable stream.

@jimsch
Collaborator

jimsch commented May 24, 2016

After listening to Ryan rage about the use of BER encoding for ASN.1 objects, I have a feeling that this should be closed as won't-fix because it presents a security issue. When one looks at the encrypt/decrypt APIs for authenticated encryption, the entire stream is required to be observed on the decrypt side, and it could be argued that it needs to be observed on the encrypt side as well before the processed stream is emitted. This is because if the decryption process does not validate, no output is to be produced for consumption. Allowing this to be done in a streaming fashion means that the browser potentially needs an unbounded buffer to hold the intermediate result to be returned to the client.
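For concreteness, here is a sketch (in TypeScript, using Node's streaming AEAD as a stand-in) of the hazard being described: plaintext becomes available before the tag is verified, so a streaming consumer may act on corrupted data.

import { createDecipheriv } from 'node:crypto';

function streamingDecryptHazard(key: Buffer, iv: Buffer, ciphertext: Buffer, tag: Buffer): Buffer {
  const d = createDecipheriv('aes-256-gcm', key, iv);
  d.setAuthTag(tag);
  const early = d.update(ciphertext); // plaintext is emitted here, *before* authentication
  d.final();                          // only this call throws if the tag does not match
  return early;
}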

Similar issues hold for processing signature values for the new Ed448 EdDSA algorithm, where the message M is hashed twice. Allowing for an indefinite-length input means that there are potential unbounded-buffering problems.

@feross

feross commented May 24, 2016

Node.js has a streaming crypto API without any security issues:

const crypto = require('crypto');
const hash = crypto.createHash('sha256');

hash.update('some data to hash');
hash.update('more data');
hash.update('even more data');
console.log(hash.digest('hex'));

Why can't the web platform?

@indutny

indutny commented May 24, 2016

I absolutely agree with @feross on this. Most (if not all) of these APIs can work in a streaming mode without any security issues. In fact, this is how they are exposed in OpenSSL, so they already work in a streaming mode under the hood, regardless of what the high-level API looks like.

@jimsch
Collaborator

jimsch commented May 24, 2016

All of the current hash functions that I am familiar with allow for streaming APIs because they are built on a Merkle–Damgård construction and are therefore processed on a block-by-block basis. However, there are algorithms for which this is not doable. For example, the EdDSA algorithm that I mentioned above computes:

R = fn( SHAKE256(dom(F, C) || prefix || M, 114) )
and then
k = SHAKE256(dom(F, C) || R || A || M, 114)

As you can see, you need all of the message M to compute R before you can start the computation of k. This means the entire message needs to be buffered, unlike in the hash example you gave above.

Note also the comment I made on authenticated decryption, where the entire message needs to be kept before the validation step at the end can run.
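To make the data dependency concrete, here is a minimal sketch in TypeScript (not real Ed448 signing; the group operations are stubbed as placeholders) showing why the signer must either buffer M or read it twice:

import { createHash } from 'node:crypto';

// Stand-ins for the real Ed448 group operations (placeholders, not real math):
const scalarFromHash = (h: Buffer): Buffer => h.subarray(0, 57); // really: reduce mod L
const scalarMultBase = (r: Buffer): Buffer => r;                 // really: R = [r]B

function ed448LikeSign(dom: Buffer, prefix: Buffer, A: Buffer, M: Buffer): Buffer {
  // Pass 1 over M: the nonce hash, and hence R, depends on the whole message.
  const r = createHash('shake256', { outputLength: 114 })
    .update(dom).update(prefix).update(M).digest();
  const R = scalarMultBase(scalarFromHash(r));

  // Pass 2 over M: k hashes R *before* M, so pass 1 must complete
  // (all of M must have been seen) before this hash can even start.
  return createHash('shake256', { outputLength: 114 })
    .update(dom).update(R).update(A).update(M).digest();
}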

@indutny

indutny commented May 24, 2016

@jimsch in your description, SHAKE256 appears to be just a hashing function, and most hashing functions support streaming input. Nothing prevents one from creating two streaming SHAKE256 hashes and using their digests at the end of the stream to compute R and k.

Authenticated decryption should work as well, as far as I can tell... though the fact that integrity is checked only at the end of the decryption process means the API would be somewhat awkward. I don't think there are many pros to using streams for authenticated decryption.

@jimsch
Collaborator

jimsch commented May 24, 2016

@indutny please re-read my previous post and note the requirement to finish computing R before M can be used to compute k.

@indutny

indutny commented May 24, 2016

@jimsch oh, I see it now. Sorry about that! Yeah, streaming indeed won't work for schemes like this.

Still, many hashes and ciphers work just fine with streams.

@tanx

tanx commented May 24, 2016

A native streaming API would indeed be great. Our use case would be large-file encryption in OpenPGP.js.

@mwatson2
Collaborator Author

If we address this, I think it will not be in this version since it requires substantial work.

@mwatson2 mwatson2 added this to the VNext milestone May 24, 2016
@hhalpin

hhalpin commented Jun 20, 2016

I imagine we can close this as won't-fix, but when streaming stabilizes we can revisit it as part of maintenance of the spec, since, as @jimsch correctly points out, it won't work for quite a few algorithms. We could also test whether anyone supports streaming - any ideas?

@hhalpin

hhalpin commented Jun 20, 2016

v.Next.

@evilaliv3

Is there any update on this topic?

@roccomuso

+1

@ericmackrodt

If streaming/progressive encryption isn't implemented, it will hugely limit the scope of the API. I really need that kind of functionality for the software I work on.

@neckaros

neckaros commented May 9, 2017

+1!

@neckaros

neckaros commented May 9, 2017

Privacy is a growing concern. Being able to decrypt locally without consuming too much memory is a must, I think.
For example: encrypting a huge file locally as you send it to a server, so the server never sees the decrypted data.
This works well in Node.js.

@alanwaketan

I think digest may be a good place to start.

@thiccar

thiccar commented Jul 14, 2017

+1000

@daviddias

daviddias commented Aug 28, 2017

Hi all, bringing this issue back up. Any updates or recent discussion on it?

I believe the security considerations do not hold, and that what this promotes is users finding other ways to encrypt their files as the use of browsers to share large documents grows - possibly by shimming their own streaming encryption API, which will be considerably slower than a native one through WebCrypto.

@JulianKlug

+1

@johnozbay

100% agreed with @ericmackrodt & @neckaros & @diasdavid. With GDPR on the horizon, this would make things a lot easier for European establishments.

@dead-claudia

@jimsch By any chance, could a streaming API be provided for those encryption schemes that can be streamed? Just because it's not possible for some doesn't make it impossible for all (and there are different tradeoffs for each). One good example is client-side decryption of large files on mobile: only high-end phones/tablets have the RAM to reliably decrypt a 750MB video download in memory.

@jimsch
Collaborator

jimsch commented Mar 27, 2018

It could; on the other hand, there may be other things that could be done as well. For example, one could do chunked encryption of large objects such as video, which is designed to be streamed, so that each chunk can be independently decrypted and streamed. The world is moving towards only using authenticated encryption algorithms, and doing streaming such as you suggest means that you are willing to consume a decrypted stream that may have been corrupted without being able to detect this.
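For illustration, a sketch of that chunked approach in TypeScript on top of today's one-shot API, as a TransformStream; the counter-derived IVs and the lack of chunk framing are simplifying assumptions:

function chunkedEncrypt(key: CryptoKey): TransformStream<Uint8Array, Uint8Array> {
  let seq = 0;
  return new TransformStream({
    async transform(chunk, controller) {
      // Unique 96-bit IV per chunk from a counter (GCM IVs must never repeat under one key).
      const iv = new Uint8Array(12);
      new DataView(iv.buffer).setUint32(8, seq++);
      const sealed = await crypto.subtle.encrypt({ name: 'AES-GCM', iv }, key, chunk);
      controller.enqueue(new Uint8Array(sealed)); // each chunk carries its own auth tag
    },
  });
}

A real scheme would additionally need to frame chunk boundaries and bind the sequence number and a final-chunk marker into each chunk (e.g. via additionalData), so that reordering and truncation are detected.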

Additionally, one would need to get a group of people together at the W3C who are interested in updating the document, and then decide which algorithms could/should be streamable and which should not.

@lucacasonato
Contributor

Special casing ReadableStream seems totally reasonable to me.

@twiss
Member

twiss commented Jan 4, 2022

@lucacasonato Fair enough. Part of the reason I liked the idea of having a separate object/type is that detecting support for streaming hashing (and creating polyfills) is slightly easier. But OK, I suppose having a try/catch is not that bad either, and I agree it's a more straightforward solution.

@jasnell Would you be happy with that solution as well?

@isiahmeadows I agree that implementations should be able to optimize handling ReadableStreams, but I'm not sure the spec necessarily needs to account for that, as the result should be indistinguishable? I also don't see what you mean re. the fetch spec; I see it accepts ReadableStreams, but it doesn't accept [Async]Iterator<BufferSource>, so it doesn't seem like the same scenario?
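To make the try/catch detection concrete, a hypothetical sketch in TypeScript for the streaming digest() under discussion (today's digest() only accepts a BufferSource, so this is not a real API yet):

async function supportsStreamingDigest(): Promise<boolean> {
  const empty = new ReadableStream<Uint8Array>({ start: (c) => c.close() });
  try {
    // Hypothetical: a future spec change would accept a ReadableStream here.
    await crypto.subtle.digest('SHA-256', empty as unknown as BufferSource);
    return true;
  } catch {
    return false; // no stream support: fall back to buffering or a polyfill
  }
}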

@lucacasonato
Contributor

> but I'm not sure the spec necessarily needs to account for that, as the result should be indistinguishable?

We unfortunately do, because the iteration protocol is observable to the user (they could add a custom iteration function to an existing ReadableStream). By using a brand check to explicitly treat ReadableStream differently, we could sidestep this observability somewhat.

TBH, I really don't think this makes any sort of performance difference though, as the async iteration would introduce at most one more promise per turn (it might actually even be the same).
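A small TypeScript sketch of the observability in question; a page can replace the iterator on an existing stream, and a spec written in terms of async iteration would have to run that user code:

const rs = new ReadableStream<Uint8Array>();
// Overriding the iteration protocol on an existing stream:
(rs as any)[Symbol.asyncIterator] = async function* () {
  yield new Uint8Array([1, 2, 3]); // user-controlled chunk, visible to the consuming algorithm
};

A brand check on ReadableStream would let the spec read chunks through the internal reader instead, bypassing the override.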

@twiss
Member

twiss commented Jan 4, 2022

Right. If it does make a difference for performance, the implementation could check whether [Symbol.asyncIterator] is the built-in one. Deoptimizing the case where it isn't seems fine, as it's similar to passing a custom AsyncIterator in the first place.
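A sketch of that check in TypeScript, assuming the built-in async iterator is reachable on the prototype:

const builtinIterator = (ReadableStream.prototype as any)[Symbol.asyncIterator];

function hasUntouchedIterator(rs: ReadableStream<Uint8Array>): boolean {
  // Fast path only when the stream still uses the built-in iterator.
  return (rs as any)[Symbol.asyncIterator] === builtinIterator;
}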

@jfbrennan

jfbrennan commented Jan 4, 2022

Came here and was pleasantly surprised to see @feross's 5-year-old suggestion above (exactly what I need in the browser) and that it has so many positive reactions: #73 (comment)

Is that no longer in scope?

Many years of discussion here... perhaps what I'm seeing is the deeper conversation about how to support such an API. If so, yay! But apparently still far from getting shipped.

If not, this seems like a really basic use case. For example, before I can upload to my storage service, I need to generate a hex hash from the digest, but the digest can't be created in one go because the file data comes in chunks (our users have gigabyte-sized files to upload, so the files get chunked to avoid hitting in-memory limits).

@tniessen
Contributor

FYI, there is ongoing work in https://github.com/wintercg/proposal-webcrypto-streams, which is managed by @wintercg.

A first explainer exists and we welcome feedback through GitHub issues in that repository.

@simonhaenisch

simonhaenisch commented Jun 24, 2022

I came here because I needed to generate a hash (for an etag header) via streaming inside a Cloudflare Worker, and thanks to the mention of the non-standard crypto.DigestStream I got it working... I couldn't find any docs yet, so here's a Cloudflare Worker example that streams a result into KV while generating an etag along the way:

interface Env {
  KV: KVNamespace;
}

const onFetch: ExportedHandlerFetchHandler<Env> = async (request, env): Promise<Response> => {
  const encoder = new TextEncoder();

  const etagStream = new crypto.DigestStream('SHA-1');
  const { readable, writable } = new TransformStream();

  const etagWriter = etagStream.getWriter();
  const resultWriter = writable.getWriter();

  // not awaiting this, because we'll start streaming into
  // the writable side of the transform stream afterwards
  env.KV.put('result', readable);

  // imagine some utility function that reads a ReadableStream,
  // decoding each chunk and passing the data to the callback
  await processStream(request.body, (data: string) => {
    // ... do whatever with the data here

    const encoded = encoder.encode(data);

    etagWriter.write(encoded);
    resultWriter.write(encoded);
  });

  await Promise.all([etagWriter.close(), resultWriter.close()]);

  const etagBuffer = await etagStream.digest;

  const etag = Array.from(new Uint8Array(etagBuffer))
    .map((byte) => byte.toString(16).padStart(2, '0'))
    .join('');

  await env.KV.put('etag', etag);

  return new Response('done');
}

const worker: ExportedHandler<Env> = {
  fetch: onFetch,
};

export default worker;

simonhaenisch added a commit to simonhaenisch/cloudflare-docs that referenced this issue Jun 24, 2022
After dropping this as a comment here (w3c/webcrypto#73 (comment)), I thought it might be worth turning it into an example for the docs as well?

Feel free to update this.
@fabiospampinato

fabiospampinato commented Mar 9, 2023

This seems like a huge limitation. Computing hashes should go at the speed of light; instead, the WebCrypto API seems almost designed to force you to write slow code (no sync method for tiny inputs, no streams support for huge inputs).

@Zectbumo

Since this API hasn't been completed yet, I would like to point out that saving state would be useful. The web is transient in nature and would benefit from being able to save a hash's state. For example, the Python hashlib API did not provide a way to save state, which resulted in people reverting to C:
https://stackoverflow.com/questions/2130892/persisting-hashlib-state
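To make the request concrete, a hypothetical TypeScript shape for a resumable hash (no such WebCrypto API exists; all names here are invented):

interface ResumableHash {
  update(chunk: BufferSource): void;
  exportState(): ArrayBuffer;   // serialize the internal block state
  digest(): Promise<ArrayBuffer>;
}

// Hypothetical: rehydrate a hash on a later page load and keep hashing.
declare function importHashState(state: ArrayBuffer): ResumableHash;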

@devgs

This comment was marked as off-topic.

@leoselig

What a shame. Almost a decade without any progress on such a no-brainer feature.

A quick search revealed that there is indeed an active draft exploring this: https://webcrypto-streams.proposal.wintercg.org/#encryptionstream

I do not support the WhatWG in any meaningful way, but since you are rather rudely criticizing their lack of progress in their non-funded pursuit, I assume you know more here. Would love to get some insights on why you think this is simpler to do than it looks!

achingbrain added a commit to libp2p/js-libp2p that referenced this issue Jan 12, 2024
TL;DR: the bundle size has been reduced by about a third

- parsing/creating PEM/pkix/pkcs1 files is now done by asn1.js
- Streaming AES-CTR ciphers are now in [@libp2p/aes-ctr](https://github.com/libp2p/js-libp2p-aes-ctr)
- RSA encryption/decryption and PEM import/export are now in [@libp2p/rsa](https://github.com/libp2p/js-libp2p-rsa)

## AES-CTR

WebCrypto [doesn't support streaming ciphers](w3c/webcrypto#73).

We have a node-forge-backed shim that allows using streaming AES-CTR in browsers, but we don't use it anywhere, so this has been split out into its own module as `@libp2p/aes-ctr`.

## RSA encrypt/decrypt

This was added to `@libp2p/crypto` to [support webrtc-stardust](libp2p/js-libp2p-crypto#125 (comment)), but that effort didn't go anywhere and we don't use these methods anywhere else in the stack.

For reasons lost to the mists of time, we chose to use a [padding algorithm](https://github.com/libp2p/js-libp2p-crypto/blob/3d0fd234deb73984ddf0f7c9959bbca92194926a/src/keys/rsa.ts#L59) that WebCrypto doesn't support, so node-forge (or some other userland implementation) will always be necessary in browsers; these ops have been pulled out into `@libp2p/rsa`, which people can use if they need it.

This is now done by manipulating the asn1 structures directly.

## PEM/pkix/pkcs1

The previous PEM import/export has also been ported to `@libp2p/rsa` because it seems to handle more weird edge cases introduced by OpenSSL.

These could be handled in `@libp2p/crypto` eventually, but for now it at least supports round-tripping its own PEM files.

Fixes #2086

BREAKING CHANGE: Legacy RSA operations are now in @libp2p/rsa, streaming AES-CTR ciphers are in @libp2p/aes-ctr
@vszakd

vszakd commented Mar 8, 2024

+1
