Compression Streams #207

Closed
ricea opened this issue Oct 8, 2019 · 17 comments · Fixed by #276
Labels
position: positive venue: W3C CG Specifications in W3C Community Groups (e.g., WICG, Privacy CG)

Comments

@ricea

ricea commented Oct 8, 2019

Request for Mozilla Position on an Emerging Web Specification

  • Specification Title: Compression Streams
  • Specification or proposal URL: https://ricea.github.io/compression/
  • Caniuse.com URL (optional):
  • Bugzilla URL (optional):
  • Mozillians who can provide input (optional):

Other information

See also the explainer at https://github.com/ricea/compressstream-explainer/blob/master/README.md
I have requested WICG incubation for this specification: https://discourse.wicg.io/t/proposal-compression-streams-standard/3920

@ekr
Contributor

ekr commented Oct 8, 2019

I'd like to hear from @annevk about this, but from my perspective, this seems like it's kind of platform bloat. What's the reason for baking a specific set of compression algorithms into the platform rather than just having a generic system (which may not need any new standardization) that allows one to write your compressor in WASM?

The choice of algorithms here is suggestive:

Gzip and Deflate are ubiquitous and already shipping in every browser. This means the incremental cost of exposing them is very low. They are used so extensively in the web platform that there is almost zero chance of them ever being removed, so committing to supporting them long-term is safe.

However, gzip and deflate are also old (which is why they are ubiquitous) and hence not very advanced, but that means that more or less by definition this API will always be behind. Note that we've already seen this problem with WebCrypto, so it's not hypothetical, but at least WebCrypto had the excuse that you really needed the algorithms implemented in the browser. That doesn't seem to apply here.

@annevk
Contributor

annevk commented Oct 9, 2019

I have seen some requests for this from web developers, but yeah, it would be interesting to know to what extent implementing this in JavaScript or Wasm (or importing a module that has done it) is prohibitive.

@ricea
Author

ricea commented Oct 10, 2019

Thank you for your feedback. I'd like to address your points one-by-one.

I'd like to hear from @annevk about this, but from my perspective, this seems like it's kind of platform bloat.

  • The Web Platform is unusual in not having compression APIs built in. Android, iOS, .NET, etc. all include them.
  • Browsers already ship the algorithms, so the API is just wrappers. In Chrome the API implementation only costs 1.5KB.
  • There's an ergonomic benefit in having a standard API for something, even if the functionality can be provided by developers. TextEncoder and TextDecoder are other examples of APIs which in principle can be supplied by developers, but in practice there's considerable benefit in having them built into the platform.

What's the reason for baking a specific set of compression algorithms into the platform rather than just having a generic system (which may not need any new standardization) that allows one to write your compressor in WASM?

People can and do already include compressors in WASM or JavaScript in their apps. The benefits of supplying compression algorithms in the platform are:

  • Download size. Developers don't want to pay the cost of including a compressor in their bundle. This is particularly relevant for analytics.
  • CPU usage. Facebook found that the native implementation was 20x faster than JavaScript and 10% faster than WASM.

However, gzip and deflate are also old (which is why they are ubiquitous) and hence not very advanced

more or less by definition this API will always be behind.

  • I consider this a feature rather than a bug. If we were constantly trying to ship the latest-and-greatest compression algorithms, then we would end up stuck shipping a bunch of algorithms that had low usage because they were no longer fashionable. Being "behind" gives us the advantage of picking algorithms we are confident are going to stick around.
  • For many use cases, state-of-the-art compression algorithms aren't needed. See for example, this email from the blink-dev mailing list, where the poster says the benefits of using a browser-supplied API would outweigh the losses from not using their custom algorithm.
  • Using widely-deployed algorithms balances the needs of users, developers and browser vendors best. Users don't want to pay for stuff that isn't used. Developers want to adopt things that will work with their current and future server frameworks. Browser vendors don't want to have to maintain mostly-unused code.

In conclusion, for a low investment from browser vendors we can provide a substantial benefit to developers in terms of functionality, and to users in terms of reduced data usage.

@Krinkle

Krinkle commented Oct 24, 2019

In case it helps - This standard would benefit Wikipedia users. Our VisualEditor software has to upload a sizeable HTML document to the server. In order to reduce upload times we currently use a JavaScript implementation of Deflate which cuts down the payload from e.g. 1.4MB to 205K, which generally makes saving changes a lot faster.

The only downside is that it often takes between 1 and 2 seconds to compress with a JavaScript implementation (ThinkPad running Ubuntu). We're working on tuning the compression levels and considering competing JS implementations, but it's hard to beat native of course. A native API would also avoid needing to download the JavaScript library itself.
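
For comparison, here is a minimal sketch of what the compress-before-upload step might look like with the proposed API. It assumes a runtime that exposes the globals `CompressionStream`, `DecompressionStream`, `Blob` and `Response`. The function names `deflateText` and `inflateText` are hypothetical and are not taken from VisualEditor. Note that `"deflate"` in this API means the zlib-wrapped format, which may not match what an existing server endpoint expects.

```typescript
// Sketch only: assumes the globals CompressionStream, DecompressionStream,
// Blob and Response exist (recent browsers, Node.js 18+).

// Compress a string with the "deflate" (zlib-wrapped) format and collect
// the compressed bytes into a single ArrayBuffer.
async function deflateText(text: string): Promise<ArrayBuffer> {
  const compressed = new Blob([text])
    .stream()
    .pipeThrough(new CompressionStream("deflate"));
  return await new Response(compressed).arrayBuffer();
}

// Reverse step, included so the round trip can be checked locally.
async function inflateText(buffer: ArrayBuffer): Promise<string> {
  const decompressed = new Blob([buffer])
    .stream()
    .pipeThrough(new DecompressionStream("deflate"));
  return await new Response(decompressed).text();
}
```

The returned buffer could then be used as a request body. How the server is told that the payload is compressed is application-specific and not covered by this sketch.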

@annevk
Contributor

annevk commented Oct 25, 2019

@Krinkle how big is the JavaScript library? Did you look into Wasm? Point taken though that it would be simpler and faster (definitely compared to JavaScript) if built in.

@Yoric

Yoric commented Oct 25, 2019

Cloudflare has tested compiling brotli into wasm for server-side use. While I don't have exact numbers, I remember that the binary was several megabytes in size (possibly owing to the fact that brotli contains a pretty large built-in dictionary), which made it essentially unusable for their use case. I suspect that the same applies here.

@dbaron
Contributor

dbaron commented Nov 15, 2019

I think this is a pretty basic gap in the platform (and one where something the platform already has just isn't exposed), and I'd rather not force less advanced developers to drop into WASM to have to fill it.

@adamroach adamroach added the venue: W3C CG Specifications in W3C Community Groups (e.g., WICG, Privacy CG) label Nov 16, 2019
@mconca

mconca commented Feb 19, 2020

For better or worse, Google shipped this in Chrome 80. Reaction from devs has been generally positive, and it's encouraging that several have asked for Mozilla's position on this from a standards point-of-view.

I think I'm generally supportive of this because it reduces the need for devs to ship compress/decompress code (JS or WASM), and leveraging the native browser implementation is going to be faster, regardless. Smaller page sizes and faster execution are a performance win for web users.

@yutakahirano

We're adding a "deflate-raw" compression format, to allow web developers to access the raw deflate stream. Please see whatwg/compression#25. Chrome is working to ship the feature.

@miketaylr

@mconca @annevk should we open a new issue for "deflate-raw", or do you consider that covered by the existing Compression Streams entry (maybe not, it's new and all)?

@mconca

mconca commented May 9, 2022

This seems like a logical extension of "deflate" so I don't think a new entry is needed. Thanks.

@jimmywarting

I'm thinking of doing some p2p file transfer and would like to use the de/compression streams. I can't really figure out whether the other peer supports compression streams, so I need to signal that in advance (during SDP signaling).
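
One way each peer might work out what to advertise is a local feature check, sketched below. The helper names `supportsCompressionFormat` and `localCompressionFormats` are hypothetical, and the `scope` parameter exists only so the check can be exercised against stand-in objects. The sketch relies on the constructor throwing for formats it does not support. How the resulting list reaches the other peer (SDP, a data channel message, or something else) is left open.

```typescript
// Hypothetical helper: reports whether `scope` exposes CompressionStream
// and whether its constructor accepts the given format name.
// `scope` defaults to the global object.
function supportsCompressionFormat(format: string, scope: any = globalThis): boolean {
  const Ctor = scope.CompressionStream;
  if (typeof Ctor !== "function") return false;
  try {
    new Ctor(format); // expected to throw for an unsupported format
    return true;
  } catch {
    return false;
  }
}

// Formats this peer could advertise to the other side before a transfer.
function localCompressionFormats(scope: any = globalThis): string[] {
  return ["gzip", "deflate", "deflate-raw"].filter((format) =>
    supportsCompressionFormat(format, scope)
  );
}
```

Both peers would then pick a format present in both lists, or fall back to sending uncompressed data when the intersection is empty.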

@BenjaminAster

Apple will ship CompressionStreams in Safari 16.4, so Firefox will be the only browser not supporting it. Implementing this is now a question of web compatibility.

Personally, I am particularly excited for compression streams because they let you natively read and write .zip, .docx, .xlsx, etc. files in the browser without any libraries.

@ekr
Contributor

ekr commented Feb 19, 2023

Apple will ship CompressionStreams in Safari 16.4, so Firefox will be the only browser not supporting it. Implementing this is now a question of web compatibility.

Reminder: this is not the place to discuss Firefox's product plans.

@cavac

cavac commented Mar 17, 2023

I must say, with all the faffing about by the Firefox team, as a web developer I have resigned myself to relegating Firefox to a "backup platform" instead of being a main target for our software. It regularly falls behind other browsers in terms of implemented standard APIs because the team "doesn't see the point in implementing those".

It's a rather frustrating experience. And frankly, I'm not sure how much longer I can even support the platform at all, since it time and again needs special handling in my web systems.

@olfek

olfek commented Mar 19, 2023

😢😢😢

@tantek
Member

tantek commented Mar 20, 2023

@olfek @cavac neither of your comments provided any new technical information on this specification and are out of scope for this repo. If you want to vent, please use your own blog to do so.

Please re-read the CONTRIBUTING guide https://github.com/mozilla/standards-positions/blob/main/CONTRIBUTING.md before making any further comments on any issue in the repo.

All of the comments made on this issue in this year were out of scope, and none of them provided any new technical information about the specification so I am locking this issue accordingly.

(Originally published at: https://tantek.com/2023/079/t1/)

@mozilla mozilla locked as off-topic and limited conversation to collaborators Mar 20, 2023