-
Notifications
You must be signed in to change notification settings - Fork 156
fsmonitor: fix watchman integration #444
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
When running Git commands quickly -- such as in a shell script or the test suite -- the Git commands frequently complete and start again during the same second. The example fsmonitor hooks to integrate with Watchman truncate the nanosecond times to seconds. In principle, this is fine, as Watchman claims to use inclusive comparisons [1]. The result should only be an over-representation of the changed paths since the last Git command. However, Watchman's own documentation claims "Using a timestamp is prone to race conditions in understanding the complete state of the file tree" [2]. All of their documented examples use a "clockspec" that looks like 'c:123:234'. Git should eventually learn how to store this type of string to provide a stronger integration, but that will be a more invasive change. When using GIT_TEST_FSMONITOR="$(pwd)/t7519/fsmonitor-watchman", scripts such as t7519-wtstatus.sh fail due to these race conditions. In fact, running any test script with GIT_TEST_FSMONITOR pointing at t/t7519/fsmonitor-wathcman will cause failures in the test_commit function. The 'git add "$indir$file"' command fails due to not enough time between the creation of '$file' and the 'git add' command. For now, subtract one second from the timestamp we pass to Watchman. This will make our window large enough to avoid these race conditions. Increasing the window causes tests like t7519-wtstatus.sh to pass. When the integration was introduced in def4376 (fsmonitor: add a sample integration script for Watchman, 2018-09-22), the query included an expression that would ignore files created and deleted in that window. The performance reason for this change was to ignore temporary files created by a build between Git commands. However, this causes failures in script scenarios where Git is creating or deleting files quickly. When using GIT_TEST_FSMONITOR as before, t2203-add-intent.sh fails due to this add-and-delete race condition. By removing the "expression" from the Watchman query, we remove this race condition. It will lead to some performance degradation in the case of users creating and deleting temporary files inside their working directory between Git commands. However, that is a cost we need to pay to be correct. [1] https://github.com/facebook/watchman/blob/master/query/since.cpp#L35-L39 [2] https://facebook.github.io/watchman/docs/clockspec.html Helped-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
Welcome to GitGitGadgetHi @kewillford, and welcome to GitGitGadget, the GitHub App to send patch series to the Git mailing list from GitHub Pull Requests. Please make sure that this Pull Request has a good description, as it will be used as cover letter. Also, it is a good idea to review the commit messages one last time, as the Git project expects them in a quite specific form:
It is in general a good idea to await the automated test ("Checks") in this Pull Request before contributing the patches, e.g. to avoid trivial issues such as unportable code. Contributing the patchesBefore you can contribute the patches, your GitHub username needs to be added to the list of permitted users. Any already-permitted user can do that, by adding a comment to your PR of the form Both the person who commented An alternative is the channel
Once on the list of permitted usernames, you can contribute the patches to the Git mailing list by adding a PR comment After you submit, GitGitGadget will respond with another comment that contains the link to the cover letter mail in the Git mailing list archive. Please make sure to monitor the discussion in that thread and to address comments and suggestions. If you want to see what email(s) would be sent for a submit request, add a PR comment If you do not want to subscribe to the Git mailing list just to be able to respond to a mail, you can download the mbox ("raw") file corresponding to the mail you want to reply to from the Git mailing list. If you use GMail, you can upload that raw mbox file via: curl -g --user "<EMailAddress>:<Password>" --url "imaps://imap.gmail.com/INBOX" -T /path/to/raw.txt |
/submit |
Error: User kewillford is not permitted to use GitGitGadget |
/allow |
User kewillford is now allowed to use GitGitGadget. |
/submit |
Submitted as pull.444.git.1572889841.gitgitgadget@gmail.com |
On the Git mailing list, Junio C Hamano wrote (reply to this):
|
This branch is now known as |
This patch series was integrated into pu via git@f7b61b3. |
On the Git mailing list, Kevin Willford wrote (reply to this):
|
On the Git mailing list, Derrick Stolee wrote (reply to this):
|
This patch series was integrated into pu via git@e262bc1. |
This patch series was integrated into pu via git@7650ce9. |
This patch series was integrated into pu via git@dabb67f. |
This patch series was integrated into pu via git@c1b9dab. |
This patch series was integrated into pu via git@8fcd128. |
#print STDERR "$0 $version $time\n"; | ||
|
||
# Check the hook interface version | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, SZEDER Gábor wrote (reply to this):
On Mon, Nov 04, 2019 at 05:50:41PM +0000, Kevin Willford via GitGitGadget wrote:
> When running Git commands quickly -- such as in a shell script or the
> test suite -- the Git commands frequently complete and start again
> during the same second. The example fsmonitor hooks to integrate with
> Watchman truncate the nanosecond times to seconds. In principle, this is
> fine, as Watchman claims to use inclusive comparisons [1]. The result
> should only be an over-representation of the changed paths since the
> last Git command.
> [1] https://github.com/facebook/watchman/blob/master/query/since.cpp#L35-L39
Nit: please refer to the specific blob in this link. The file's
content in 'master' might change in time, and then the highlight will
point to different lines. Worse, the file might be renamed, and then
we'll get a broken link.
This patch series was integrated into pu via git@3716ffd. |
This patch series was integrated into next via git@ee786a5. |
This patch series was integrated into pu via git@8adaa2a. |
This patch series was integrated into pu via git@7444f5e. |
This patch series was integrated into pu via git@0db093d. |
This patch series was integrated into pu via git@567945b. |
This patch series was integrated into pu via git@de08b0f. |
This patch series was integrated into pu via git@fc7b26c. |
This patch series was integrated into next via git@fc7b26c. |
This patch series was integrated into master via git@fc7b26c. |
Closed via fc7b26c. |
When running Git commands quickly -- such as in a shell script or the
test suite -- the Git commands frequently complete and start again
during the same second. The example fsmonitor hooks to integrate with
Watchman truncate the nanosecond times to seconds. In principle, this is
fine, as Watchman claims to use inclusive comparisons [1]. The result
should only be an over-representation of the changed paths since the
last Git command.
However, Watchman's own documentation claims "Using a timestamp is prone
to race conditions in understanding the complete state of the file tree"
[2]. All of their documented examples use a "clockspec" that looks like
'c:123:234'. Git should eventually learn how to store this type of
string to provide a stronger integration, but that will be a more
invasive change.
When using GIT_TEST_FSMONITOR="$(pwd)/t7519/fsmonitor-watchman", scripts
such as t7519-wtstatus.sh fail due to these race conditions. In fact,
running any test script with GIT_TEST_FSMONITOR pointing at
t/t7519/fsmonitor-wathcman will cause failures in the test_commit
function. The 'git add "$indir$file"' command fails due to not enough
time between the creation of '$file' and the 'git add' command.
For now, subtract one second from the timestamp we pass to Watchman.
This will make our window large enough to avoid these race conditions.
Increasing the window causes tests like t7519-wtstatus.sh to pass.
When the integration was introduced in def4376 (fsmonitor: add a
sample integration script for Watchman, 2018-09-22), the query included
an expression that would ignore files created and deleted in that
window. The performance reason for this change was to ignore temporary
files created by a build between Git commands. However, this causes
failures in script scenarios where Git is creating or deleting files
quickly.
When using GIT_TEST_FSMONITOR as before, t2203-add-intent.sh fails
due to this add-and-delete race condition.
By removing the "expression" from the Watchman query, we remove this
race condition. It will lead to some performance degradation in the case
of users creating and deleting temporary files inside their working
directory between Git commands. However, that is a cost we need to pay
to be correct.
[1] https://github.com/facebook/watchman/blob/master/query/since.cpp#L35-L39
[2] https://facebook.github.io/watchman/docs/clockspec.html