Grammar caching causes EOFError and race condition when used as pre-commit hook #1164
Comments
The problem seems to be in
Something like this:
def dump(self, filename):
    """Dump the grammar tables to a pickle file."""
    with tempfile.NamedTemporaryFile(mode='wb') as f:
        pickle.dump(self.__dict__, f, pickle.HIGHEST_PROTOCOL)
        os.link(f.name, filename)
Agreed, I think that would work on Unix systems. To my knowledge, there is no documented, guaranteed atomic write operation on Windows. The
There's an alternative method that runs a small risk of leaving a temporary file lying around, but might work better on Windows [this code is untested, and it probably should have a try/except to deal with cleanup on error]:
with tempfile.NamedTemporaryFile(mode='wb', delete=False, delete_on_close=False) as f:
    tmp_file_name = f.name
    pickle.dump(self.__dict__, f, pickle.HIGHEST_PROTOCOL)
os.rename(tmp_file_name, filename)
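The cleanup on error mentioned above could be sketched roughly as follows. This is an illustrative, standalone helper rather than yapf's actual code: the name `atomic_pickle_dump` is hypothetical, and it swaps in `os.replace` for `os.rename` because `os.replace` also overwrites an existing destination on Windows. As noted earlier in the thread, atomicity on Windows is not something this sketch can guarantee.

```python
import os
import pickle
import tempfile


def atomic_pickle_dump(obj, filename):
    """Write obj to filename so that readers do not see a half-written file.

    Hypothetical helper for illustration; not part of yapf.
    """
    # Create the temporary file next to the destination so that the final
    # rename stays on a single filesystem.
    directory = os.path.dirname(os.path.abspath(filename))
    fd, tmp_name = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(obj, f, pickle.HIGHEST_PROTOCOL)
        # The file is closed at this point, which Windows needs before
        # it allows the rename.
        os.replace(tmp_name, filename)
    except BaseException:
        # Remove the temporary file if dumping or renaming failed.
        try:
            os.unlink(tmp_name)
        except OSError:
            pass
        raise
```

Inside a `dump` method this would be called as `atomic_pickle_dump(self.__dict__, filename)`. A concurrent reader then sees either no cache file or a complete one, at least on POSIX systems.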
Same problem, how to solve it?
* GitHub workflow for new runners
* Ignore first yapf failure. See google/yapf#1164.
* run test_line_info.py separately
* triton-runner-base:0.0.2
* Include cmake/llvm-hash.txt to a cache key for packages
* Run pre-commit checks in parallel
* Use pip cache for pre-commit checks
* Redirect first failing yapf to /dev/null
* Use jobs.<job_id>.defaults.run to initialize oneapi
We have to run yapf twice because, in a clean environment, the first run will most likely fail due to the race condition in yapf (see google/yapf#1164). If, however, the first run succeeds and yapf modifies files, we need to reset those changes so that the second run can detect them again. Note that this whole block (ignoring the first run and resetting the tree) will become unnecessary and will be removed once google/yapf#1164 is fixed.
Similar issue, and a way to fix it: #1204
Thanks to pre-commit#851, hooks can now be executed in parallel for each file. This feature is enabled by default.
When a large number of files need to be formatted at once, this concurrency introduces a race condition in wherein one process can conclude that the pickled grammar file does not exist and start to create it. Meanwhile, another process sees the newly created pickle file before it is done being written, concludes that it is newer than the raw unpickled grammar, and then attempts to load it.
The result is the following stack trace:
Since the process writing the file is not impacted by this error, the pickled grammar will get cached. Thus, subsequent runs will succeed. However, if one is using yapf in a continuous integration context, this would cause a failed pipeline with potentially high probability (depending on the number of files) with no obvious recourse.
A workaround is to add require_serial to the yapf hook in your project's .pre-commit-config.yaml, but this comes at the cost of losing the advantages of pre-commit#851. I believe the logic in _load_grammar could be refactored to avoid the race condition with some more careful checks.
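One possible shape for such "more careful checks" is a loader that treats an unreadable cache the same as a missing one. The sketch below is an assumption about how that might look, not the actual `_load_grammar` implementation: `load_cached` and its `build` parameter are hypothetical names, with `build` standing in for whatever regenerates the grammar from its source file.

```python
import os
import pickle


def load_cached(cache_path, source_path, build):
    """Return cached data, rebuilding when the cache is stale or unreadable.

    `build` is a hypothetical zero-argument callable that regenerates the
    data from `source_path`.
    """
    try:
        fresh = os.path.getmtime(cache_path) >= os.path.getmtime(source_path)
    except OSError:
        fresh = False  # Cache (or source) is missing: treat as stale.
    if fresh:
        try:
            with open(cache_path, "rb") as f:
                return pickle.load(f)
        except (EOFError, pickle.UnpicklingError):
            # Another process is probably still writing the cache.
            # Fall through and rebuild in memory instead of failing.
            pass
    return build()
```

With this approach a process that loses the race pays the cost of regenerating the grammar once, but it no longer crashes on a partially written pickle.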