New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Random buglet fixes in lektor.builder #1146
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This is a longstand bug — there was a race condition. For efficiency, Lektor's build system caches filesystem os.stat() results and (sometimes) checksums of file content. (For this is uses PathCache to cache FileInfos which, in turn, cache stat and checksums for individual files.) The FileInfo classes compute defer computation of stat results and checksums until they are first accessed. `FileInfo.unchanged` is a method that is used to compare a historical FileInfo which has been retrieved from the build database with "current" state of a file. The logic in `FileInfo.unchanged` was subtly flawed. It first compared the `size` and `mtime` of the file. For regular (i.e. non-directory) files it returned `True` early if those matched. If they differed it went on to compare checksums, and returned a result based (only) as the checksum comparison. The logic is flawed: if the size of the file has changed, we can be assured the file has indeed changed, and should return early right then — no need to check the checksum. The real bug arises due to the fact that values of size/mtime and the checksum that are stored in the build database may have be computed and cached at different times. It's possible, in the case that a file is modified during a build cycle, for the size/mtime stored in the build database to correspond to it's pre-modification state, while the checksum corresponds to its *post*-modification state. In this case, due to to above mentioned logic flaw, a file will be detected as unchanged, even though its size has changed. There is a comment in the code that indicates the intent was to skip computing and checking checksums for regular files if their size and mtime match. Here, we fix the logic to do just that. Another option would be to check checksums as well (but only in the case that the size matches.) Going forward, we probably should take care to ensure that whatever we record about source file state in FileInfo and the build database represents the file state at a single point in time.
dairiki
added a commit
to dairiki/lektor
that referenced
this pull request
Sep 11, 2023
* fix(builder): logic in FileInfo.unchanged was incorrect This is a longstand bug — there was a race condition. For efficiency, Lektor's build system caches filesystem os.stat() results and (sometimes) checksums of file content. (For this is uses PathCache to cache FileInfos which, in turn, cache stat and checksums for individual files.) The FileInfo classes compute defer computation of stat results and checksums until they are first accessed. `FileInfo.unchanged` is a method that is used to compare a historical FileInfo which has been retrieved from the build database with "current" state of a file. The logic in `FileInfo.unchanged` was subtly flawed. It first compared the `size` and `mtime` of the file. For regular (i.e. non-directory) files it returned `True` early if those matched. If they differed it went on to compare checksums, and returned a result based (only) as the checksum comparison. The logic is flawed: if the size of the file has changed, we can be assured the file has indeed changed, and should return early right then — no need to check the checksum. The real bug arises due to the fact that values of size/mtime and the checksum that are stored in the build database may have be computed and cached at different times. It's possible, in the case that a file is modified during a build cycle, for the size/mtime stored in the build database to correspond to it's pre-modification state, while the checksum corresponds to its *post*-modification state. In this case, due to to above mentioned logic flaw, a file will be detected as unchanged, even though its size has changed. There is a comment in the code that indicates the intent was to skip computing and checking checksums for regular files if their size and mtime match. Here, we fix the logic to do just that. Another option would be to check checksums as well (but only in the case that the size matches.) Going forward, we probably should take care to ensure that whatever we record about source file state in FileInfo and the build database represents the file state at a single point in time. * fix(builder): fix Builder.touch_site_config * fix(builder): don't ignore `encoding` when opening temp file * fix(builder): fix docstring
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Issue(s) Resolved
FileInfo.unchanged
caused a source file to be considered unchanged in certain edge-cases even when its size (and therefore its contents) have changed.Builder.touch_site_config
wasn't actually touching the site config.encoding
parameter toArtifact.open
was being ignored in certain circumstances. (Sinceencoding
defaults to utf-8 — which is normally what is wanted — this was mostly harmless.)Related Issues / Links
Description of Changes