Python wheels are not compressed by the CDN #3655

Closed
rth opened this issue Mar 14, 2023 · 6 comments · Fixed by #3667

Comments

@rth
Member

rth commented Mar 14, 2023

It looks like the JsDelivr CDN does not compress Python wheels.

For instance, requesting the numpy wheel

https://cdn.jsdelivr.net/pyodide/v0.22.1/full/numpy-1.23.5-cp310-cp310-emscripten_3_1_27_wasm32.whl

## Request Headers
Accept-Encoding: gzip, deflate, br

## Response Headers
content-length: 3126204
content-type: binary/octet-stream

returns no Content-Encoding header, so the browser downloads the full 3.1 MB wheel. That is not terrible, since the wheel is already zip-compressed, but Brotli should achieve better compression.
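The header check above can also be reproduced outside the browser dev tools. The following is a minimal sketch using only the Python standard library; the helper names `fetch_headers` and `summarize_transfer` are made up for illustration, and the network helper is defined but not called here, so the summary is computed from the header values quoted in this issue.

```python
import urllib.request


def fetch_headers(url, accept_encoding="gzip, deflate, br"):
    """Send a HEAD request and return the response headers with lower-cased names."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"Accept-Encoding": accept_encoding}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return {name.lower(): value for name, value in response.headers.items()}


def summarize_transfer(headers):
    """Report whether a response was compressed in transit.

    A missing Content-Encoding header is treated as "identity" (no encoding).
    """
    encoding = headers.get("content-encoding", "identity")
    length = headers.get("content-length")
    return {
        "content_type": headers.get("content-type"),
        "encoding": encoding,
        "compressed_in_transit": encoding != "identity",
        "bytes": int(length) if length is not None else None,
    }


# Header values as reported in this issue (not fetched live).
reported = {"content-length": "3126204", "content-type": "binary/octet-stream"}
summary = summarize_transfer(reported)
print(summary)
```

To check a live response, pass the wheel URL to `fetch_headers` and feed the result to `summarize_transfer`; what the CDN returns today may differ from the values quoted here.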

However, if one takes the wheel (a zip file) and applies Brotli on top, the size is still around 3.0 MB. So re-compressing an already compressed wheel doesn't help much (and it's currently disabled).

By contrast, if we take the same files, put them in a tar (or a zip with 0 compression) and Brotli-compress the result, the size is 1.9 MB in this example. But then, technically, it's no longer a standard wheel file.
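The effect described here can be reproduced on synthetic data. The sketch below is not the measurement from this issue: it uses `lzma` from the standard library as a stand-in for Brotli (which is not in the standard library), and the files are generated text that share lines across files, so the absolute numbers are not comparable to the numpy wheel. The helper names `make_files` and `build_zip` are hypothetical.

```python
import io
import lzma
import random
import zipfile


def make_files(n_files=40, pool_size=200, lines_per_file=100, seed=0):
    """Generate files that share lines with each other but not within themselves."""
    rng = random.Random(seed)
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    pool = ["".join(rng.choices(alphabet, k=60)) + "\n" for _ in range(pool_size)]
    return {
        f"pkg/module_{i}.py": "".join(rng.sample(pool, lines_per_file)).encode()
        for i in range(n_files)
    }


def build_zip(files, compression):
    """Build an in-memory zip archive using the given per-file compression."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=compression) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()


files = make_files()
deflated = build_zip(files, zipfile.ZIP_DEFLATED)  # like a regular wheel
stored = build_zip(files, zipfile.ZIP_STORED)      # zip with 0 compression

# lzma stands in for the outer (CDN-side) Brotli compression.
sizes = {
    "deflated zip": len(deflated),
    "deflated zip + outer": len(lzma.compress(deflated)),
    "stored zip": len(stored),
    "stored zip + outer": len(lzma.compress(stored)),
}
for label, size in sizes.items():
    print(f"{label:>22}: {size:>8} bytes")
```

Per-file deflate can only exploit redundancy inside each file, and its output looks close to random to the outer compressor. With a stored zip, the outer compressor sees the raw bytes of all files at once and can also remove redundancy across files.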

Still, maybe we should switch to producing wheels with no compression, so that the CDN re-compresses them (if possible) and the browser does the decompression.
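Producing such a wheel could look roughly like the sketch below, which rewrites every entry of an existing zip archive with `ZIP_STORED`. This is an illustration only and not the change made in #3667; `repack_stored` is a hypothetical helper, and the demo archive is a tiny stand-in for a real wheel. As far as I know, the hashes in a wheel's RECORD file are computed over the uncompressed file contents, so they should be unaffected by repacking.

```python
import io
import zipfile


def repack_stored(wheel_bytes):
    """Return a copy of a zip/wheel archive with all entries stored uncompressed."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(wheel_bytes)) as src, zipfile.ZipFile(out, "w") as dst:
        for info in src.infolist():
            # Build a fresh entry so no stale size/offset metadata is carried over.
            fresh = zipfile.ZipInfo(info.filename, date_time=info.date_time)
            fresh.external_attr = info.external_attr  # keep file permissions
            fresh.compress_type = zipfile.ZIP_STORED
            dst.writestr(fresh, src.read(info))
    return out.getvalue()


# Tiny stand-in for a wheel, written with the usual deflate compression.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("pkg/__init__.py", b"x = 1\n" * 100)
    zf.writestr("pkg-0.0.dist-info/WHEEL", b"Wheel-Version: 1.0\n")

repacked = repack_stored(demo.getvalue())
with zipfile.ZipFile(io.BytesIO(repacked)) as zf:
    compress_types = {info.filename: info.compress_type for info in zf.infolist()}
    payload = zf.read("pkg/__init__.py")

print(compress_types)
print(len(demo.getvalue()), "->", len(repacked), "bytes")
```

The repacked archive is larger on disk, which is the intended trade-off: the size reduction is then left to the transport-level compression.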

In any case, this is also something to keep in mind when repacking more optimized bundles.

@ryanking13
Member

ryanking13 commented Mar 15, 2023

> While if we were to take these files, put them in a tar (or a zip with 0 compression) and brotli compress it, the size would be 1.9MB in this example. But then technically it's no longer a standard wheel file.

That's very interesting. We should check the size of all wheel files + python_stdlib.zip. We can probably also benefit from faster decompression if we use a zip with 0 compression.

@rth
Member Author

rth commented Mar 15, 2023

I need to check with the JsDelivr folks whether it's possible to compress binary files. It may also be that compression is decided by MIME type, in which case we should change the MIME type from binary/octet-stream to application/wasm.

@MartinKolarik

Yes, compression is based on MIME type, and binaries are not compressed. Changing it to application/wasm would enable compression, but as you already found, double compression isn't particularly effective.
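To make the MIME-type rule concrete, here is a minimal sketch of such a decision. It is purely illustrative: the actual jsDelivr configuration is not shown in this thread, so `ASSUMED_COMPRESSIBLE_TYPES` is an assumption containing only the one type named above, and `media_type` and `would_compress` are hypothetical helpers.

```python
# Assumption: stands in for the CDN's real allow-list, which is not shown in
# this thread. It contains only the type mentioned in the comment above.
ASSUMED_COMPRESSIBLE_TYPES = frozenset({"application/wasm"})


def media_type(content_type):
    """Strip parameters such as '; charset=utf-8' and normalize the case."""
    return content_type.split(";", 1)[0].strip().lower()


def would_compress(content_type, compressible=ASSUMED_COMPRESSIBLE_TYPES):
    """Decide on compression from the Content-Type alone, ignoring the payload."""
    return media_type(content_type) in compressible


for content_type in ("binary/octet-stream", "application/wasm"):
    print(f"{content_type}: compress={would_compress(content_type)}")
```

Under a rule like this, the same wheel bytes would be compressed or not depending only on the Content-Type they are served with.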

@rth
Member Author

rth commented Mar 15, 2023

Thanks for the confirmation, @MartinKolarik!

@bollwyvl
Contributor

A big win for deployments that don't use the official CDN would be compressing the .data file, if possible: application/octet-stream doesn't get compressed by default on a number of hosts (e.g. RTD), though it does on jsdelivr.

@rth
Member Author

rth commented Mar 17, 2023

Yes, this should be resolved in #3667. We were doing this previously for .data files but apparently forgot to adapt the mechanism after switching to the .whl format.

@rth rth closed this as completed in #3667 Mar 21, 2023