-
-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Python: determinism of the interpreters and bytecode (pyc) files #22570
Comments
I think using faketime is probably a bad idea, e.g. I suspect some unit tests in random packages can start behaving like crazy under it. |
With Python 2.7 I have the following issue left:
|
This is actually a performance issue; as I described in #22569 (coincidentally filed minutes before this), the wrong timestamps causes python to ignore the cached bytecode and re-compile every library on every invocation. Notably irritating to me, this causes nvr (neovim-remote) to take 0.3 seconds instead of 0.1 seconds to run. It seems to my relatively nix-inexperienced intuition that byte-wise patching all .pyc files generated in a python interpreter or library build to have timestamp 0 in postBuild or preInstall or some such should be a straightforward fix; am I missing something? |
@edanaher you couldn't tell any difference with Python 2? On Python 2 we use this patch and we need to port this to Python 3. Unfortunately, the touched files are quite different in 3. |
Sadly, @FRidh, your patch doesn't seem to solve my problem; the nix-store appears to use timestamp 1, not 0, for its files:
Though tweaking your patch to set the timestamp to 1 doesn't appear to work either; I'll keep digging today. edit: Actually, setting the timestamp to 1 helps; however, some files appear to still have the actual timestamps embedded. In fact, nvr now loses about half as much time to recompiling as it did before. I wonder what's wrong with the remaining files... |
#22585 is incomplete; there appears to be another codepath that can write mtimes to the compiled files. Adding the following substituteInPlace fixes it:
Unfortunately, this is not based on DETERMINISTIC_BUILD; this change is in a bootstrapping file, so getting at the environment is non-trivial. But it does fix nearly the rest of the nondeterminism; running with an additional convenience patch to check the timestamps:
I get the following:
So for some reason site is still getting the timestamp wrong and breaking determinism; I'm not sure what's up with the attempted write to sysconfigdate. |
Thanks @edanaher. I thought that path wasn't relevant for the writing but apparently it is then. Looking at the diffoscope result all that is left I think are sets. I've discussing this issue on |
Today I got a very useful reply from a PyPy dev:
|
Does debian ship any |
I've looked at Debian and their patches, but they're not this far yet with reproducible Python builds it seems. Also interesting.
http://bugs.python.org/issue29514#msg287560 This came up when 3.5.3 changed the magic part of the bytecode which typically shouldn't happen in minor releases. While they don't ship the bytecode, they still compile it during installation time. Since we don't distinguish between the two we have to choose whether we include the bytecode or not. Not including bytecode has a negative impact on the performance. |
Note that I updated #22585. Builds of 2.7 and 3.5 are now deterministic. |
Closing because most interpreters are now built deterministic. |
❤️ @FRidh |
The 2.7.16 interpreter seems to be reproducible again. Python packages however not. |
This comment has been minimized.
This comment has been minimized.
Python packages are mostly reproducible again on master. An exception is pytest, which generates its own bytecode. The interpreters are not yet. |
Closing as done. |
Slowdown because unoptimized bytecode is no longer generated because it is not reproducible #118810. |
Issue description
Bytecode is created during builds of the interpreter and packages. The bytecode records a timestamp, which we cannot set. We could patch the interpreter, but we do want it to use the actual timestamp when used outside of Nix.
A fix for this issue is available in #2281 where a
useFakeTime
option is added to the generic builder.The text was updated successfully, but these errors were encountered: