Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Random buglet fixes in lektor.builder #1146

Merged
merged 4 commits into from Jun 4, 2023
Merged

Conversation

dairiki
Copy link
Contributor

@dairiki dairiki commented May 24, 2023

Issue(s) Resolved

  • Logic flaw in FileInfo.unchanged caused a source file to be considered unchanged in certain edge-cases even when its size (and therefore its contents) have changed.
  • Builder.touch_site_config wasn't actually touching the site config.
  • The encoding parameter to Artifact.open was being ignored in certain circumstances. (Since encoding defaults to utf-8 — which is normally what is wanted — this was mostly harmless.)
  • Updated a stale comment.

Related Issues / Links

Description of Changes

  • Added unit test(s) covering the changes (if testable)

This is a longstand bug — there was a race condition.

For efficiency, Lektor's build system caches filesystem os.stat() results and
(sometimes) checksums of file content.  (For this is uses PathCache to
cache FileInfos which, in turn, cache stat and checksums for
individual files.)

The FileInfo classes compute defer computation of stat results and
checksums until they are first accessed.

`FileInfo.unchanged` is a method that is used to compare a historical
FileInfo which has been retrieved from the build database with
"current" state of a file.  The logic in `FileInfo.unchanged` was
subtly flawed.

It first compared the `size` and `mtime` of the file. For
regular (i.e. non-directory) files it returned `True` early if those
matched. If they differed it went on to compare checksums, and
returned a result based (only) as the checksum comparison.

The logic is flawed: if the size of the file has changed, we can be
assured the file has indeed changed, and should return early right
then — no need to check the checksum.

The real bug arises due to the fact that values of size/mtime and the
checksum that are stored in the build database may have be computed
and cached at different times. It's possible, in the case that a file
is modified during a build cycle, for the size/mtime stored in the
build database to correspond to it's pre-modification state, while the
checksum corresponds to its *post*-modification state.  In this case,
due to to above mentioned logic flaw, a file will be detected as
unchanged, even though its size has changed.

There is a comment in the code that indicates the intent was to skip
computing and checking checksums for regular files if their size and
mtime match.  Here, we fix the logic to do just that.  Another option
would be to check checksums as well (but only in the case that the
size matches.)

Going forward, we probably should take care to ensure that whatever
we record about source file state in FileInfo and the build database
represents the file state at a single point in time.
@dairiki dairiki merged commit ba86d1c into lektor:master Jun 4, 2023
15 checks passed
@dairiki dairiki deleted the fix.builder branch June 4, 2023 20:24
dairiki added a commit to dairiki/lektor that referenced this pull request Sep 11, 2023
* fix(builder): logic in FileInfo.unchanged was incorrect

This is a longstand bug — there was a race condition.

For efficiency, Lektor's build system caches filesystem os.stat() results and
(sometimes) checksums of file content.  (For this is uses PathCache to
cache FileInfos which, in turn, cache stat and checksums for
individual files.)

The FileInfo classes compute defer computation of stat results and
checksums until they are first accessed.

`FileInfo.unchanged` is a method that is used to compare a historical
FileInfo which has been retrieved from the build database with
"current" state of a file.  The logic in `FileInfo.unchanged` was
subtly flawed.

It first compared the `size` and `mtime` of the file. For
regular (i.e. non-directory) files it returned `True` early if those
matched. If they differed it went on to compare checksums, and
returned a result based (only) as the checksum comparison.

The logic is flawed: if the size of the file has changed, we can be
assured the file has indeed changed, and should return early right
then — no need to check the checksum.

The real bug arises due to the fact that values of size/mtime and the
checksum that are stored in the build database may have be computed
and cached at different times. It's possible, in the case that a file
is modified during a build cycle, for the size/mtime stored in the
build database to correspond to it's pre-modification state, while the
checksum corresponds to its *post*-modification state.  In this case,
due to to above mentioned logic flaw, a file will be detected as
unchanged, even though its size has changed.

There is a comment in the code that indicates the intent was to skip
computing and checking checksums for regular files if their size and
mtime match.  Here, we fix the logic to do just that.  Another option
would be to check checksums as well (but only in the case that the
size matches.)

Going forward, we probably should take care to ensure that whatever
we record about source file state in FileInfo and the build database
represents the file state at a single point in time.

* fix(builder): fix Builder.touch_site_config

* fix(builder): don't ignore `encoding` when opening temp file

* fix(builder): fix docstring
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant