Skip to content

Conversation

kewillford
Copy link

When running Git commands quickly -- such as in a shell script or the
test suite -- the Git commands frequently complete and start again
during the same second. The example fsmonitor hooks to integrate with
Watchman truncate the nanosecond times to seconds. In principle, this is
fine, as Watchman claims to use inclusive comparisons [1]. The result
should only be an over-representation of the changed paths since the
last Git command.

However, Watchman's own documentation claims "Using a timestamp is prone
to race conditions in understanding the complete state of the file tree"
[2]. All of their documented examples use a "clockspec" that looks like
'c:123:234'. Git should eventually learn how to store this type of
string to provide a stronger integration, but that will be a more
invasive change.

When using GIT_TEST_FSMONITOR="$(pwd)/t7519/fsmonitor-watchman", scripts
such as t7519-wtstatus.sh fail due to these race conditions. In fact,
running any test script with GIT_TEST_FSMONITOR pointing at
t/t7519/fsmonitor-wathcman will cause failures in the test_commit
function. The 'git add "$indir$file"' command fails due to not enough
time between the creation of '$file' and the 'git add' command.

For now, subtract one second from the timestamp we pass to Watchman.
This will make our window large enough to avoid these race conditions.
Increasing the window causes tests like t7519-wtstatus.sh to pass.

When the integration was introduced in def4376 (fsmonitor: add a
sample integration script for Watchman, 2018-09-22), the query included
an expression that would ignore files created and deleted in that
window. The performance reason for this change was to ignore temporary
files created by a build between Git commands. However, this causes
failures in script scenarios where Git is creating or deleting files
quickly.

When using GIT_TEST_FSMONITOR as before, t2203-add-intent.sh fails
due to this add-and-delete race condition.

By removing the "expression" from the Watchman query, we remove this
race condition. It will lead to some performance degradation in the case
of users creating and deleting temporary files inside their working
directory between Git commands. However, that is a cost we need to pay
to be correct.

[1] https://github.com/facebook/watchman/blob/master/query/since.cpp#L35-L39
[2] https://facebook.github.io/watchman/docs/clockspec.html

When running Git commands quickly -- such as in a shell script or the
test suite -- the Git commands frequently complete and start again
during the same second. The example fsmonitor hooks to integrate with
Watchman truncate the nanosecond times to seconds. In principle, this is
fine, as Watchman claims to use inclusive comparisons [1]. The result
should only be an over-representation of the changed paths since the
last Git command.

However, Watchman's own documentation claims "Using a timestamp is prone
to race conditions in understanding the complete state of the file tree"
[2]. All of their documented examples use a "clockspec" that looks like
'c:123:234'. Git should eventually learn how to store this type of
string to provide a stronger integration, but that will be a more
invasive change.

When using GIT_TEST_FSMONITOR="$(pwd)/t7519/fsmonitor-watchman", scripts
such as t7519-wtstatus.sh fail due to these race conditions. In fact,
running any test script with GIT_TEST_FSMONITOR pointing at
t/t7519/fsmonitor-wathcman will cause failures in the test_commit
function. The 'git add "$indir$file"' command fails due to not enough
time between the creation of '$file' and the 'git add' command.

For now, subtract one second from the timestamp we pass to Watchman.
This will make our window large enough to avoid these race conditions.
Increasing the window causes tests like t7519-wtstatus.sh to pass.

When the integration was introduced in def4376 (fsmonitor: add a
sample integration script for Watchman, 2018-09-22), the query included
an expression that would ignore files created and deleted in that
window. The performance reason for this change was to ignore temporary
files created by a build between Git commands. However, this causes
failures in script scenarios where Git is creating or deleting files
quickly.

When using GIT_TEST_FSMONITOR as before, t2203-add-intent.sh fails
due to this add-and-delete race condition.

By removing the "expression" from the Watchman query, we remove this
race condition. It will lead to some performance degradation in the case
of users creating and deleting temporary files inside their working
directory between Git commands. However, that is a cost we need to pay
to be correct.

[1] https://github.com/facebook/watchman/blob/master/query/since.cpp#L35-L39
[2] https://facebook.github.io/watchman/docs/clockspec.html

Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
@gitgitgadget
Copy link

gitgitgadget bot commented Nov 4, 2019

Welcome to GitGitGadget

Hi @kewillford, and welcome to GitGitGadget, the GitHub App to send patch series to the Git mailing list from GitHub Pull Requests.

Please make sure that this Pull Request has a good description, as it will be used as cover letter.

Also, it is a good idea to review the commit messages one last time, as the Git project expects them in a quite specific form:

  • the lines should not exceed 76 columns,
  • the first line should be like a header and typically start with a prefix like "tests:" or "commit:", and
  • the commit messages' body should be describing the "why?" of the change.
  • Finally, the commit messages should end in a Signed-off-by: line matching the commits' author.

It is in general a good idea to await the automated test ("Checks") in this Pull Request before contributing the patches, e.g. to avoid trivial issues such as unportable code.

Contributing the patches

Before you can contribute the patches, your GitHub username needs to be added to the list of permitted users. Any already-permitted user can do that, by adding a comment to your PR of the form /allow. A good way to find other contributors is to locate recent pull requests where someone has been /allowed:

Both the person who commented /allow and the PR author are able to /allow you.

An alternative is the channel #git-devel on the FreeNode IRC network:

<newcontributor> I've just created my first PR, could someone please /allow me? https://github.com/gitgitgadget/git/pull/12345
<veteran> newcontributor: it is done
<newcontributor> thanks!

Once on the list of permitted usernames, you can contribute the patches to the Git mailing list by adding a PR comment /submit.

After you submit, GitGitGadget will respond with another comment that contains the link to the cover letter mail in the Git mailing list archive. Please make sure to monitor the discussion in that thread and to address comments and suggestions.

If you want to see what email(s) would be sent for a submit request, add a PR comment /preview to have the email(s) sent to you. You must have a public GitHub email address for this.

If you do not want to subscribe to the Git mailing list just to be able to respond to a mail, you can download the mbox ("raw") file corresponding to the mail you want to reply to from the Git mailing list. If you use GMail, you can upload that raw mbox file via:

curl -g --user "<EMailAddress>:<Password>" --url "imaps://imap.gmail.com/INBOX" -T /path/to/raw.txt

@kewillford
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 4, 2019

Error: User kewillford is not permitted to use GitGitGadget

@wilbaker
Copy link

wilbaker commented Nov 4, 2019

/allow

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 4, 2019

User kewillford is now allowed to use GitGitGadget.

@kewillford
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 4, 2019

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 6, 2019

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Kevin Willford via GitGitGadget" <gitgitgadget@gmail.com> writes:

> When running Git commands quickly -- such as in a shell script or the test
> suite -- the Git commands frequently complete and start again during the
> same second. The example fsmonitor hooks to integrate with Watchman truncate
> the nanosecond times to seconds. In principle, this is fine, as Watchman
> claims to use inclusive comparisons [1]. The result should only be an
> over-representation of the changed paths since the last Git command.
> ...

So, it doesn't seem to use "inclusive" and we need a workaround?

> Kevin Willford (1):
>   fsmonitor: fix watchman integration
>
>  t/t7519/fsmonitor-watchman                 | 13 ++++---------
>  templates/hooks--fsmonitor-watchman.sample | 13 ++++---------
>  2 files changed, 8 insertions(+), 18 deletions(-)

Thanks, will queue.

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 6, 2019

This branch is now known as kw/fsmonitor-watchman-fix.

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 6, 2019

This patch series was integrated into pu via git@f7b61b3.

@gitgitgadget gitgitgadget bot added the pu label Nov 6, 2019
@gitgitgadget
Copy link

gitgitgadget bot commented Nov 6, 2019

On the Git mailing list, Kevin Willford wrote (reply to this):

> From: Junio C Hamano <gitster@pobox.com>
> Sent: Tuesday, November 5, 2019 8:30 PM
>=20
> "Kevin Willford via GitGitGadget" <gitgitgadget@gmail.com> writes:
>=20
> > When running Git commands quickly -- such as in a shell script or the
> > test suite -- the Git commands frequently complete and start again
> > during the same second. The example fsmonitor hooks to integrate with
> > Watchman truncate the nanosecond times to seconds. In principle, this
> > is fine, as Watchman claims to use inclusive comparisons [1]. The
> > result should only be an over-representation of the changed paths since
> the last Git command.
> > ...
>=20
> So, it doesn't seem to use "inclusive" and we need a workaround?

That is what is seems like.  I would like to dig into the watchman code
to understand what is really going on.  They also document that "Using
a timestamp is prone to race conditions in understanding the complete
state of the file tree." Which could be the cause since the tests are
running things in quick succession, i.e. change a file, run a git command.

Long term we should switch to using watchman's clock id which the
documentation says does not have the race conditions. But the clock
id is a string and would take more invasive changes to integrate that
into the index where we are now simply using a uint64_t.

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 6, 2019

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 11/6/2019 10:32 AM, Kevin Willford wrote:
>> From: Junio C Hamano <gitster@pobox.com>
>> Sent: Tuesday, November 5, 2019 8:30 PM
>>
>> "Kevin Willford via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>
>>> When running Git commands quickly -- such as in a shell script or the
>>> test suite -- the Git commands frequently complete and start again
>>> during the same second. The example fsmonitor hooks to integrate with
>>> Watchman truncate the nanosecond times to seconds. In principle, this
>>> is fine, as Watchman claims to use inclusive comparisons [1]. The
>>> result should only be an over-representation of the changed paths since
>> the last Git command.
>>> ...
>>
>> So, it doesn't seem to use "inclusive" and we need a workaround?
> 
> That is what is seems like.  I would like to dig into the watchman code
> to understand what is really going on.  They also document that "Using
> a timestamp is prone to race conditions in understanding the complete
> state of the file tree." Which could be the cause since the tests are
> running things in quick succession, i.e. change a file, run a git command.
> 
> Long term we should switch to using watchman's clock id which the
> documentation says does not have the race conditions. But the clock
> id is a string and would take more invasive changes to integrate that
> into the index where we are now simply using a uint64_t.

I should mention that I'm working on a patch series that will allow us
to use GIT_TEST_FSMONITOR pointing at Watchman in our CI builds. There
are several places where our fsmonitor integration does not work well
(such as when we delete submodules), but most of the remaining fixes
are small.

This timing issue fixes MOST of the problems we see when running the
test suite.

Thanks,
-Stolee

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 7, 2019

This patch series was integrated into pu via git@e262bc1.

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 8, 2019

This patch series was integrated into pu via git@7650ce9.

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 10, 2019

This patch series was integrated into pu via git@dabb67f.

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 11, 2019

This patch series was integrated into pu via git@c1b9dab.

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 12, 2019

This patch series was integrated into pu via git@8fcd128.

#print STDERR "$0 $version $time\n";

# Check the hook interface version

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, SZEDER Gábor wrote (reply to this):

On Mon, Nov 04, 2019 at 05:50:41PM +0000, Kevin Willford via GitGitGadget wrote:
> When running Git commands quickly -- such as in a shell script or the
> test suite -- the Git commands frequently complete and start again
> during the same second. The example fsmonitor hooks to integrate with
> Watchman truncate the nanosecond times to seconds. In principle, this is
> fine, as Watchman claims to use inclusive comparisons [1]. The result
> should only be an over-representation of the changed paths since the
> last Git command.

> [1] https://github.com/facebook/watchman/blob/master/query/since.cpp#L35-L39

Nit: please refer to the specific blob in this link.  The file's
content in 'master' might change in time, and then the highlight will
point to different lines.  Worse, the file might be renamed, and then
we'll get a broken link.

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 19, 2019

This patch series was integrated into pu via git@3716ffd.

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 19, 2019

This patch series was integrated into next via git@ee786a5.

@gitgitgadget gitgitgadget bot added the next label Nov 19, 2019
@gitgitgadget
Copy link

gitgitgadget bot commented Nov 20, 2019

This patch series was integrated into pu via git@8adaa2a.

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 21, 2019

This patch series was integrated into pu via git@7444f5e.

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 22, 2019

This patch series was integrated into pu via git@0db093d.

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 25, 2019

This patch series was integrated into pu via git@567945b.

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 27, 2019

This patch series was integrated into pu via git@de08b0f.

@gitgitgadget
Copy link

gitgitgadget bot commented Dec 2, 2019

This patch series was integrated into pu via git@fc7b26c.

@gitgitgadget
Copy link

gitgitgadget bot commented Dec 2, 2019

This patch series was integrated into next via git@fc7b26c.

@gitgitgadget
Copy link

gitgitgadget bot commented Dec 2, 2019

This patch series was integrated into master via git@fc7b26c.

@gitgitgadget gitgitgadget bot added the master label Dec 2, 2019
@gitgitgadget gitgitgadget bot closed this Dec 2, 2019
@gitgitgadget
Copy link

gitgitgadget bot commented Dec 2, 2019

Closed via fc7b26c.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants