PoC - Hardlinking of duplicated pyc files #19
Feel free to comment if you think that something can be done better.
Generally, this looks OK. I would have done it in a similar way. I'm OK if 3.4 warns and ignores the option.
See the latest update. The Python 3.4 compatibility is kinda painful but we can keep it for now.
In the last commit I've improved the implementation for Python 3.4 when only (1, 2) optimization levels are used.
One test that I think we need to have:
Alternate version of the test: Replace (3) with importing the module from Python.
It seems that it works. I'll create a test. I'd check that only the files compiled in step 3 have a different inode and checksum, and that the file compiled in step 1 has the same inode/checksum as before. WDYT?
Sounds good. I'd also test this:
$ echo "a = 0" > module.py
$ python -m compileall2 -o 0 -o 1 -o 2 --hardlink-dupes ./
Listing './'...
Compiling './module.py'...
$ ls -i __pycache__/
8792583 module.cpython-37.opt-1.pyc 8792583 module.cpython-37.pyc
8792583 module.cpython-37.opt-2.pyc
$ md5sum __pycache__/*
3d62806ca2e42ea0b8e3a96dd7a1cbb8 __pycache__/module.cpython-37.opt-1.pyc
3d62806ca2e42ea0b8e3a96dd7a1cbb8 __pycache__/module.cpython-37.opt-2.pyc
3d62806ca2e42ea0b8e3a96dd7a1cbb8 __pycache__/module.cpython-37.pyc
$ echo "b = 1" > module.py
$ python -c 'import module'
$ ls -i __pycache__/
8792583 module.cpython-37.opt-1.pyc 8792590 module.cpython-37.pyc
8792583 module.cpython-37.opt-2.pyc
$ md5sum __pycache__/*
3d62806ca2e42ea0b8e3a96dd7a1cbb8 __pycache__/module.cpython-37.opt-1.pyc
3d62806ca2e42ea0b8e3a96dd7a1cbb8 __pycache__/module.cpython-37.opt-2.pyc
dc87c896d4f87fba2c30da386fe701f3 __pycache__/module.cpython-37.pyc
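For an automated version of the check above, something like the following could collect the same data that `ls -i` and `md5sum` print. This is only a sketch: the helper names `inode_and_md5` and `snapshot` are hypothetical and are not taken from the PR's test suite.

```python
import hashlib
import os


def inode_and_md5(path):
    """Return (inode number, MD5 hex digest) for one file."""
    with open(path, "rb") as f:
        digest = hashlib.md5(f.read()).hexdigest()
    return os.stat(path).st_ino, digest


def snapshot(directory):
    """Map each file name in `directory` to its (inode, md5) pair."""
    return {
        name: inode_and_md5(os.path.join(directory, name))
        for name in sorted(os.listdir(directory))
    }
```

Taking one `snapshot("__pycache__")` after the `compileall2` run and another after the re-import, a test could then assert that the `opt-1` and `opt-2` entries are unchanged while the plain `.pyc` entry differs in both inode and checksum.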
Even though I think we are testing this too much, the tests are implemented in the two latest commits.
There's no such thing as testing too much 😄 This is a response to some of the things mentioned in the discussion.
So, technically, all the tests that assert equal inode numbers will fail on Windows? This is not a big deal in Fedora, but will need some tweaks when proposing upstream. I suppose some of the following:
The first option seems like the easiest one.
Anyway, for now, I think we can have this and test it in Fedora without worrying about Windows. WDYT?
Could you rebase this, so we see the test results on Windows?
That's my plan. Rebase, clean, squash, skip tests on Windows…
Force-pushed from 93be8b2 to 825ddd7.
The tests simply passed?
It seems so :D I don't know how inodes work on Windows.
Let's merge and release this, so we can backport it into Fedora from a released version? Or would you prefer to test it in copr first?
It's more or less up to you. What do you prefer from minimization perspective? I'm ok with merge, release and use in Fedora directly.
Ok, let's get a merge and release please.
It is possible to save some disk space on POSIX systems by using hard links for identical pyc files produced for different optimization levels.
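A minimal sketch of the idea, assuming the files are given in optimization-level order and that each one is compared with its predecessor. The function name `link_identical_neighbours` is hypothetical, and this is not necessarily how the PR implements it.

```python
import filecmp
import os


def link_identical_neighbours(paths):
    """Hard-link each file to its predecessor when both have the same content.

    `paths` is an ordered list of existing files, e.g. the pyc files for
    optimization levels 0, 1 and 2. Returns the paths that were replaced.
    """
    replaced = []
    for previous, current in zip(paths, paths[1:]):
        # shallow=False forces a byte-by-byte comparison of the contents.
        if filecmp.cmp(current, previous, shallow=False):
            os.unlink(current)
            os.link(previous, current)
            replaced.append(current)
    return replaced
```

Comparing only neighbouring levels keeps the work linear in the number of files. When all three levels produce the same bytecode, as with the `a = 0` module in the discussion, all three names end up sharing one inode.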
Force-pushed from 825ddd7 to 1d4d994.
Proof of concept of hardlinking of duplicated pyc files.
There is one weakness I am aware of - compatibility with Python 3.4. Otherwise, it looks good and it also seems to work.
Fixes #16