Use bz2 or lzma for python >= 3.3 ? #247

Open
NotSqrt opened this issue Aug 9, 2018 · 13 comments

@NotSqrt

NotSqrt commented Aug 9, 2018

Hi,

  1. Just an idea: now that many Python projects are dropping Python 2.7 and mostly support only Python >= 3.4, a wheel built only for Python 3.x could be made smaller by compressing it with bz2 or lzma.

Typical example: numpy-1.15.0-cp37-cp37m-manylinux1_x86_64.whl (size: 13845063 bytes), which is only for Python 3.7, would shrink by about 50% with lzma (size: 6899142 bytes).

  2. A simpler idea, if keeping zlib: starting with Python 3.7, zipfile exposes the compresslevel parameter.
    When wheels are generated from Python 3.7, it could also help to set compresslevel to a value higher than 6.
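
To make the two ideas concrete, here is a minimal sketch that rewrites an existing zip archive with a different compression method or level so the sizes can be compared. The helper name `repack` is hypothetical and not part of wheel or bdist_wheel; it only uses the standard zipfile module, and it drops per-file metadata such as timestamps and permissions, so the output is for size measurements rather than a drop-in wheel.

```python
import io
import sys
import zipfile


def repack(src_bytes, compression, compresslevel=None):
    """Rewrite every member of a zip archive with another compression setting.

    Hypothetical helper for size experiments only: timestamps and
    permissions of the original members are not preserved.
    """
    kwargs = {}
    # zipfile only accepts compresslevel from Python 3.7 onwards.
    if compresslevel is not None and sys.version_info >= (3, 7):
        kwargs["compresslevel"] = compresslevel
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(src_bytes)) as src, \
            zipfile.ZipFile(out, "w", compression=compression, **kwargs) as dst:
        for info in src.infolist():
            # Passing a plain name makes writestr use dst's compression settings.
            dst.writestr(info.filename, src.read(info))
    return out.getvalue()


if __name__ == "__main__":
    # Build a small demo archive in memory instead of downloading a real wheel.
    demo = io.BytesIO()
    with zipfile.ZipFile(demo, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("pkg/data.bin", b"some repetitive payload " * 20000)
    original = demo.getvalue()
    print("original (deflate):", len(original))
    print("deflate, level 9  :", len(repack(original, zipfile.ZIP_DEFLATED, 9)))
    print("bzip2             :", len(repack(original, zipfile.ZIP_BZIP2)))
    print("lzma              :", len(repack(original, zipfile.ZIP_LZMA)))
```

To try it on a real wheel, read the .whl file as bytes and pass it to `repack`; the numbers will depend heavily on the package contents.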

Thanks !

@agronholm
Contributor

Certainly not by default on any Python, and PyPI may consider refusing such wheels.

@agronholm
Contributor

The second idea sounds a bit safer. Do you have numbers for this?

@NotSqrt
Author

NotSqrt commented Oct 1, 2018

Hi @agronholm

Raising compresslevel with zlib gives a much smaller gain.
For numpy-1.15.2-cp27-cp27mu-manylinux1_x86_64.whl:

  • uncompressed size: 51928 kibibytes
  • wheel available on PyPI: 13510.78 kibibytes
  • rebuilt wheel with compresslevel=9: 13380.84 kibibytes (0.96% smaller)

I combined the compresslevel parameter with a per-file choice: any file whose deflated form would be larger than the original is written as Stored rather than Deflated.
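
The per-file choice between Stored and Deflated could look roughly like the sketch below. This is an assumption about how such a rebuild might be done, not the script actually used for the numbers above; `write_smallest` is a hypothetical name, and passing `compresslevel` to `writestr` needs Python 3.7 or later.

```python
import io
import os
import zipfile
import zlib


def write_smallest(zf, arcname, data, level=9):
    """Add data to zf, storing it uncompressed if deflate would not shrink it.

    Hypothetical helper: the data is deflated once just to measure it and
    then again by zipfile, which keeps the sketch simple at the cost of speed.
    """
    # wbits=-15 produces a raw deflate stream, which is what zip members hold.
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    deflated = compressor.compress(data) + compressor.flush()
    if len(deflated) < len(data):
        method = zipfile.ZIP_DEFLATED
    else:
        method = zipfile.ZIP_STORED
    zf.writestr(arcname, data, compress_type=method, compresslevel=level)
    return method


if __name__ == "__main__":
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # Text-like data shrinks, so it should be deflated.
        write_smallest(zf, "pkg/module.py", b"x = 1\n" * 5000)
        # Random bytes do not shrink, so they should be stored as-is.
        write_smallest(zf, "pkg/blob.bin", os.urandom(4096))
        for info in zf.infolist():
            print(info.filename, info.compress_type, info.compress_size)
```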

@dholth
Member

dholth commented Oct 2, 2018

Wheels are zipfiles, which are compressed per file, and the zip metadata (filenames) is not compressed. If you were to create the zip file with no compression ("store") and then lzma the whole thing, you would see better results. Compression algorithms work better on large inputs.
Convincing others to accept those wheels would be the tricky part.
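
A rough way to check the per-file versus whole-archive point is the sketch below. The function names and the synthetic file set are made up for illustration. Note that the second variant produces an .xz stream wrapping a zip, which is no longer a zip file itself, so existing installers would not read it without an extra decompression step.

```python
import io
import lzma
import zipfile


def build_zip(files, compression):
    """Return the bytes of a zip archive holding files (a name -> bytes dict)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def per_file_lzma(files):
    """Ordinary zip layout: every member is LZMA-compressed on its own."""
    return build_zip(files, zipfile.ZIP_LZMA)


def stored_then_lzma(files):
    """Uncompressed ("stored") zip, then LZMA over the whole archive."""
    return lzma.compress(build_zip(files, zipfile.ZIP_STORED))


if __name__ == "__main__":
    # Many small, similar files: the case where per-file compression loses most.
    files = {
        "pkg/mod_{}.py".format(i): "def f{}():\n    return {}\n".format(i, i).encode()
        for i in range(200)
    }
    print("per-file lzma   :", len(per_file_lzma(files)))
    print("stored + lzma   :", len(stored_then_lzma(files)))
```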

@agronholm
Contributor

Yeah, there are bound to be practical difficulties with compression other than zlib.

@dholth
Member

dholth commented Oct 3, 2018

I like the idea of improving compression. Does it make sense to use only zlib forever? But you'd have to update more tools than just bdist_wheel to pull it off.
Another zipfile compression trick is to put one "stored" zipfile inside another one.
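
The nested-zipfile trick can be sketched as follows. The member name `inner.zip` and both function names are assumptions made for this example; nothing here reflects an agreed wheel layout. The outer archive stays a valid zip, and because its single member is the whole inner archive, the compressor sees all files and filenames as one large input.

```python
import io
import zipfile

INNER_NAME = "inner.zip"  # hypothetical member name, chosen for this sketch


def nested_zip(files, outer_compression=zipfile.ZIP_DEFLATED):
    """Put a "stored" zip of files inside a compressed outer zip."""
    inner = io.BytesIO()
    with zipfile.ZipFile(inner, "w", zipfile.ZIP_STORED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    outer = io.BytesIO()
    with zipfile.ZipFile(outer, "w", outer_compression) as zf:
        # The whole inner archive is compressed as a single stream.
        zf.writestr(INNER_NAME, inner.getvalue())
    return outer.getvalue()


def read_nested(blob, name):
    """Read one file back by unpacking the outer zip, then the inner one."""
    with zipfile.ZipFile(io.BytesIO(blob)) as outer:
        inner_bytes = outer.read(INNER_NAME)
    with zipfile.ZipFile(io.BytesIO(inner_bytes)) as inner:
        return inner.read(name)


if __name__ == "__main__":
    files = {"pkg/a.py": b"print('a')\n" * 100, "pkg/b.py": b"print('b')\n" * 100}
    blob = nested_zip(files)
    print(len(blob), read_nested(blob, "pkg/a.py")[:11])
```

Passing `outer_compression=zipfile.ZIP_LZMA` gives an LZMA-compressed outer layer, at the cost of needing Python 3.3 or later to read it.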

@agronholm
Contributor

I've done some groundwork for this in PR #316. Once I am sure that PyPI will reject bzip2/lzma based wheels, I can add support for other compression algorithms as well.

@dholth
Member

dholth commented Oct 27, 2019 via email

@nbcsm

nbcsm commented Aug 18, 2020

I've done some groundwork for this in PR #316. Once I am sure that PyPI will reject bzip2/lzma based wheels, I can add support for other compression algorithms as well.

@agronholm, will PyPI reject bzip2/lzma? May I know the reason?
Thanks.

I am trying to reduce the binary size of our wheel, and LZMA shows good potential.
It would be great if wheel and PyPI could support it.

@agronholm
Contributor

agronholm commented Aug 18, 2020

Those are not supported on Python 2, and thus historically wheels did not support them. Going forward, the plan seems to be to have two layers in the zip where the actual content is xz compressed, making it even more efficient.

@nbcsm

nbcsm commented Aug 18, 2020

Thanks for the quick response.

But if my package only targets Python 3, will there be any blocking issue with uploading my wheel to PyPI and letting users install my package via pip3?

@agronholm
Contributor

I honestly don't know. I know it's not supported or recommended.

@nbcsm

nbcsm commented Aug 18, 2020

Got it, thanks.
