RFE: Option to hardlink duplicate bytecache files of different opt. levels #16

Closed
hroncok opened this issue Jan 14, 2020 · 1 comment · Fixed by #19
hroncok commented Jan 14, 2020

I'd like to have an option in compileall that deduplicates identical .pyc files of the same module when it is compiled for different optimization levels.

Example:

$ python -m compileall -o0 -o1 -o2 --hardlink-dupes ...

This would hardlink module.cpython-3?.pyc, module.cpython-3?.opt-1.pyc and module.cpython-3?.opt-2.pyc to each other if they are identical, on operating systems supporting hardlinks.
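For reference, the three cache file names for a module can be derived with `importlib.util.cache_from_source`. This is only a small sketch: the helper name `pyc_paths` and the path `module.py` are made up for illustration, and the exact `cpython-3?` tag depends on the running interpreter.

```python
import importlib.util


def pyc_paths(source_path):
    """Return the bytecode cache paths for optimization levels 0, 1 and 2.

    An empty string means "no optimization tag" (the plain .pyc file).
    """
    return [
        importlib.util.cache_from_source(source_path, optimization=opt)
        for opt in ("", 1, 2)
    ]


if __name__ == "__main__":
    for path in pyc_paths("module.py"):
        print(path)
```

These are the three files that the proposed option would compare and, when identical, hardlink.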

Given the nature of the bytecode caches, the non-optimized, optimized level 1 and optimized level 2 .pyc files may or may not be identical.

Consider the following Python module:

1

All three bytecode cache files would be identical.

While with:

assert 1

Only the two optimized cache files would be identical with each other.

And this:

"""Dummy module docstring"""
1

Would produce two identical bytecode cache files (non-optimized and opt-1), but the opt-2 file would differ.

Only modules like this would produce 3 different files:

"""Dummy module docstring"""
assert 1
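The four cases above can be explored without writing any files. The sketch below is an approximation, not what compileall does: it compiles a source string with the built-in `compile()` at each level and groups the levels whose code objects compare equal, instead of comparing .pyc files byte by byte. The helper name `identical_levels` is made up for illustration, and the grouping for a constant `assert 1` may depend on the interpreter version.

```python
def identical_levels(source, levels=(0, 1, 2)):
    """Group optimization levels that produce equal code objects.

    Returns a list of lists, e.g. [[0, 1], [2]] when levels 0 and 1
    compile to the same code and level 2 differs.
    """
    groups = []  # list of (code_object, [levels that produced it])
    for level in levels:
        code = compile(source, "<example>", "exec", optimize=level)
        for known, members in groups:
            if known == code:
                members.append(level)
                break
        else:
            groups.append((code, [level]))
    return [members for _, members in groups]


if __name__ == "__main__":
    samples = [
        "1\n",
        "assert 1\n",
        '"""Dummy module docstring"""\n1\n',
        '"""Dummy module docstring"""\nassert 1\n',
    ]
    for source in samples:
        print(repr(source), identical_levels(source))
```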

Hardlinking identical .pyc files can cause significant storage savings. As a single data point: On my workstation I have 360 MiB of various Python 3.7 bytecode files in /usr and I can save 108 MiB.
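A rough idea of how the deduplication step could work, given a list of already compiled cache files. This is a minimal sketch under stated assumptions, not the eventual implementation: the function name `hardlink_duplicates` and the temporary `.tmp-link` suffix are hypothetical, and it assumes all files live on one filesystem that supports hardlinks.

```python
import filecmp
import os


def hardlink_duplicates(paths):
    """Replace byte-identical files in `paths` with hardlinks to the first copy.

    Returns the number of files that were replaced by a hardlink.
    """
    kept = []    # one representative path per distinct content
    linked = 0
    for path in paths:
        for original in kept:
            if os.path.samefile(original, path):
                break  # already hardlinked, nothing to do
            if filecmp.cmp(original, path, shallow=False):
                # Link under a temporary name first, then swap it in,
                # so `path` never goes missing if linking fails.
                tmp = path + ".tmp-link"
                os.link(original, tmp)
                os.replace(tmp, path)
                linked += 1
                break
        else:
            kept.append(path)
    return linked
```

Calling it with the three cache paths of one module would leave one, two or three distinct inodes on disk, depending on how many of the files differ.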

jollaitbot pushed a commit to sailfishos-mirror/python-rpm-macros that referenced this issue May 14, 2020
Adds the optional --hardlink-dupes flag for compileall2 for pyc deduplication

This is explained in https://discuss.python.org/t/3014
                 and fedora-python/compileall2#16

This option is not yet used anywhere. That allows us to backport this to all
Fedoras but only use --hardlink-dupes on rawhide first.
hroncok added a commit to hroncok/python3.8 that referenced this issue May 26, 2020
…of BRP script

This is explained in https://discuss.python.org/t/3014
                 and fedora-python/compileall2#16

The %__brp_python_hardlink script already does this by Shell, this should be slightly faster.

Also, this is more explicit.
jollaitbot pushed a commit to sailfishos-mirror/python-rpm-macros that referenced this issue Aug 12, 2020
Adds the optional --hardlink-dupes flag for compileall2 for pyc deduplication

This is explained in https://discuss.python.org/t/3014
                 and fedora-python/compileall2#16

This option is not yet used anywhere. That allows us to backport this to all
Fedoras but only use --hardlink-dupes on rawhide first.