Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FSMonitor Preliminary Commits #860

Closed

Conversation

jeffhostetler
Copy link

@jeffhostetler jeffhostetler commented Feb 1, 2021

Here is version 2 of this series.

In version 1, I replaced the non-portable "xargs -d" with "xargs -0", but
it turns out that that too is not universally available. In this version I
replace the need for either one by filtering out the problematic paths
(such as ones with LFs) and quoting paths to handle whitespace. The
resulting paths can be passed to xargs without any arguments.

Also, I updated the test to use test-tool chmtime rather than touch
to ensure that the files actually look dirty on low-resolution file systems.

cc: Taylor Blau me@ttaylorr.com
cc: Jeff Hostetler git@jeffhostetler.com

@jeffhostetler
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 1, 2021

Submitted as pull.860.git.1612216941.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git pr-860/jeffhostetler/fsmonitor-prework-v1

To fetch this version to local tag pr-860/jeffhostetler/fsmonitor-prework-v1:

git fetch --no-tags https://github.com/gitgitgadget/git tag pr-860/jeffhostetler/fsmonitor-prework-v1

@@ -165,7 +165,7 @@ test_fsmonitor_suite() {
'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Jeff Hostetler via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> The Mac version of xargs does not support the "-d" option.  Convert the test
> setup to pipe the data set thru `lf_to_nul | xargs -0` instead.

"xargs -0" is not all that portable, either, and neither is "touch -h".

But since the t/perf stuff already depends on having GNU toolchain
anyway, I can be persuaded to believe that it is OK.

Do we know that this part runs much later than the staged files are
last touched, so that these uses of "touch" actually are effective
to make the paths stat-dirty?  Otherwise, we may be just "touch"ing
them with the timestamp they already have after all.

Thanks.

> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> ---
>  t/perf/p7519-fsmonitor.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
> index 9b43342806b..7bb37e9a6c1 100755
> --- a/t/perf/p7519-fsmonitor.sh
> +++ b/t/perf/p7519-fsmonitor.sh
> @@ -165,7 +165,7 @@ test_fsmonitor_suite() {
>  	'
>  
>  	test_perf_w_drop_caches "status (dirty) ($DESC)" '
> -		git ls-files | head -100000 | xargs -d "\n" touch -h &&
> +		git ls-files | head -100000 | lf_to_nul | xargs -0 touch -h &&
>  		git status
>  	'

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Jeff Hostetler wrote (reply to this):



On 2/1/21 6:25 PM, Junio C Hamano wrote:
> "Jeff Hostetler via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>
>> The Mac version of xargs does not support the "-d" option.  Convert the test
>> setup to pipe the data set thru `lf_to_nul | xargs -0` instead.
> 
> "xargs -0" is not all that portable, either, and neither is "touch -h".
> 
> But since the t/perf stuff already depends on having GNU toolchain
> anyway, I can be persuaded to believe that it is OK.
> 
> Do we know that this part runs much later than the staged files are
> last touched, so that these uses of "touch" actually are effective
> to make the paths stat-dirty?  Otherwise, we may be just "touch"ing
> them with the timestamp they already have after all.

I'm not sure now that you mention it.  I suppose on modern filesystems
that have mtimes with nanosecond fields we could (are) assuming that
"touch" is actually doing something.  On older filesystems (such as
FAT32), you're right it is probably not doing anything at the speed
that the test runs.

TBH I'm not sure that the test needs the "-h".  Symlinks are not that
common and it shouldn't affect the timings that much if there are a few.

I'm not sure what to do about "-0".  Not even "--null" is portable.

Let me do a little digging here.

> 
> Thanks.
> 
>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
>> ---
>>   t/perf/p7519-fsmonitor.sh | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
>> index 9b43342806b..7bb37e9a6c1 100755
>> --- a/t/perf/p7519-fsmonitor.sh
>> +++ b/t/perf/p7519-fsmonitor.sh
>> @@ -165,7 +165,7 @@ test_fsmonitor_suite() {
>>   	'
>>   
>>   	test_perf_w_drop_caches "status (dirty) ($DESC)" '
>> -		git ls-files | head -100000 | xargs -d "\n" touch -h &&
>> +		git ls-files | head -100000 | lf_to_nul | xargs -0 touch -h &&
>>   		git status
>>   	'

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Jeff Hostetler <git@jeffhostetler.com> writes:

> I'm not sure now that you mention it.  I suppose on modern filesystems
> that have mtimes with nanosecond fields we could (are) assuming that
> "touch" is actually doing something.  On older filesystems (such as
> FAT32), you're right it is probably not doing anything at the speed
> that the test runs.

That one is probably the most relevant nit among the ones I raised.
I do not actually mind if we used test-chmtime to force our own
timestamp (e.g. "5 seconds before the filesystem time"), and added
the helper the "--stdin" option to read paths to work around the
"xargs" issue.

> TBH I'm not sure that the test needs the "-h".  Symlinks are not that
> common and it shouldn't affect the timings that much if there are a few.

I agree.

> I'm not sure what to do about "-0".  Not even "--null" is portable.

Correct.  I do not think it is worth "digging", though.  I do not
mind "ls-files -z | test-tool chmtime -600 --stdin -z" to lose
xargs, but we already depend on GNU time to run t/perf, and it is
not too far a stretch to require GNU xargs that knows "-0" or "-d".

Thanks.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 2, 2021

User Jeff Hostetler <git@jeffhostetler.com> has been added to the cc: list.

jeffhostetler and others added 11 commits February 2, 2021 18:17
Convert the test to use a more portable method to update the mtime on a
large number of files under version control.

The Mac version of xargs does not support the "-d" option.
Likewise, the "-0" and "--null" options are not portable.

Furthermore, use `test-tool chmtime` rather than `touch` to update the
mtime to ensure that it is actually updated (especially on file systems
with only whole second resolution).

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Only use the final portion of the test trash directory file name
when verifying that Watchman was started.

On Windows and under the SDK, $GIT_WORKTREE is a cygwin-style
path with forward slashes and a "/c/" drive name.  However
`watchman watch-list` reports a proper Windows-style pathname
with drive letters and backslashes.  This causes the grep to
fail.  Since we don't really care about the full pathname (and
we really don't want to bother with normalizaing them), just see
if the test-name portion of the path is found.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Shutdown Watchman after the Watchman-based tests and before the block of
"no fsmonitor" tests.

This helps ensure that Watchman cannot affect the test results for the
other.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Add optional trace logging to allow us to better compare performance of
various fsmonitor providers and compare results with non-fsmonitor runs.

Currently, this includes Trace2 logging, but may be extended to include
other trace targets, such as GIT_TRACE_FSMONITOR if desired.

Using this logging helped me explain an odd behavior on MacOS where the
kernel was dropping events and causing the hook to Watchman to timeout.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Report the total number of calls made to lstat() inside preload_index().

FSMonitor improves the performance of commands like `git status` by
avoiding scanning the disk for changed files.  This can be seen in
`preload_index()`.  Let's measure this.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Report the total number of calls made to lstat() inside of refresh_index().

FSMonitor improves the performance of commands like `git status` by
avoiding scanning the disk for changed files.  This can be seen in
`refresh_index()`.  Let's measure this.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Report the number of files in the working directory that were read and
their hashes verified in `refresh_index()`.

FSMonitor improves the performance of commands like `git status` by
avoiding scanning the disk for changed files.  Let's measure this.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Let's measure the time taken to request and receive FSMonitor data
via the hook API and the size of the response.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Allow fsmonitor to report directory changes by reporting paths with a
trailing slash.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Isolate and document initialization of `istate->fsmonitor_last_update`.
This field should contain a fsmonitor-specific opaque token, but we
need to initialize it before we can actually talk to a fsmonitor process,
so we create a generic default value.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
@jeffhostetler
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 3, 2021

Submitted as pull.860.v2.git.1612366490.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git pr-860/jeffhostetler/fsmonitor-prework-v2

To fetch this version to local tag pr-860/jeffhostetler/fsmonitor-prework-v2:

git fetch --no-tags https://github.com/gitgitgadget/git tag pr-860/jeffhostetler/fsmonitor-prework-v2

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 3, 2021

On the Git mailing list, Taylor Blau wrote (reply to this):

Hi Jeff,

On Mon, Feb 01, 2021 at 10:02:09PM +0000, Jeff Hostetler via GitGitGadget wrote:
> This patch series fixes runtime errors in t/perf/p7519 on Windows and MacOS.
> It adds Trace2 logging to p7519 to make it easier to compare results when
> Watchman is enabled and disabled. And finally, it adds some Trace2 regions
> and data events around our usage of Watchman and the existing FSMonitor
> framework.

I read v2 of this series and it all looked very sane to me. I didn't
have much in the way of comments or concerns along the way, so that
version of the series has my reviewed-by.

Thanks,
Taylor

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 3, 2021

User Taylor Blau <me@ttaylorr.com> has been added to the cc: list.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 16, 2021

On the Git mailing list, Jeff Hostetler wrote (reply to this):



On 2/3/21 10:34 AM, Jeff Hostetler via GitGitGadget wrote:
> Here is version 2 of this series.

I didn't see this series in the "what's cooking" emails and
was wondering if there was something that I still needed to
attend to.

Thanks
Jeff


> 
> In version 1, I replaced the non-portable "xargs -d" with "xargs -0", but it
> turns out that that too is not universally available. In this version I
> replace the need for either one by filtering out the problematic paths (such
> as ones with LFs) and quoting paths to handle whitespace. The resulting
> paths can be passed to xargs without any arguments.
> 
> Also, I updated the test to use test-tool chmtime rather than touch to
> ensure that the files actually look dirty on low-resolution file systems.
> 
> Jeff Hostetler (10):
>    p7519: do not rely on "xargs -d" in test
>    p7519: fix watchman watch-list test on Windows
>    p7519: move watchman cleanup earlier in the test
>    p7519: add trace logging during perf test
>    preload-index: log the number of lstat calls to trace2
>    read-cache: log the number of lstat calls to trace2
>    read-cache: log the number of scanned files to trace2
>    fsmonitor: log invocation of FSMonitor hook to trace2
>    fsmonitor: log FSMN token when reading and writing the index
>    fsmonitor: refactor initialization of fsmonitor_last_update token
> 
> Kevin Willford (1):
>    fsmonitor: allow all entries for a folder to be invalidated
> 
>   fsmonitor.c               | 107 ++++++++++++++++++++++++++++++++++----
>   fsmonitor.h               |   5 ++
>   preload-index.c           |  10 ++++
>   read-cache.c              |  24 +++++++--
>   t/perf/.gitignore         |   1 +
>   t/perf/Makefile           |   4 +-
>   t/perf/p7519-fsmonitor.sh |  71 +++++++++++++++++++++----
>   7 files changed, 196 insertions(+), 26 deletions(-)
> 
> 
> base-commit: 71ca53e8125e36efbda17293c50027d31681a41f
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-860%2Fjeffhostetler%2Ffsmonitor-prework-v2
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-860/jeffhostetler/fsmonitor-prework-v2
> Pull-Request: https://github.com/gitgitgadget/git/pull/860
> 
> Range-diff vs v1:
> 
>    1:  cf252e24b8c !  1:  e570f7316cc p7519: use xargs -0 rather than -d in test
>       @@ Metadata
>        Author: Jeff Hostetler <jeffhost@microsoft.com>
>        
>         ## Commit message ##
>       -    p7519: use xargs -0 rather than -d in test
>       +    p7519: do not rely on "xargs -d" in test
>        
>       -    The Mac version of xargs does not support the "-d" option.  Convert the test
>       -    setup to pipe the data set thru `lf_to_nul | xargs -0` instead.
>       +    Convert the test to use a more portable method to update the mtime on a
>       +    large number of files under version control.
>       +
>       +    The Mac version of xargs does not support the "-d" option.
>       +    Likewise, the "-0" and "--null" options are not portable.
>       +
>       +    Furthermore, use `test-tool chmtime` rather than `touch` to update the
>       +    mtime to ensure that it is actually updated (especially on file systems
>       +    with only whole second resolution).
>        
>            Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
>        
>         ## t/perf/p7519-fsmonitor.sh ##
>        @@ t/perf/p7519-fsmonitor.sh: test_fsmonitor_suite() {
>       + 		git status -uall
>         	'
>         
>       ++	# Update the mtimes on upto 100k files to make status think
>       ++	# that they are dirty.  For simplicity, omit any files with
>       ++	# LFs (i.e. anything that ls-files thinks it needs to dquote).
>       ++	# Then fully backslash-quote the paths to capture any
>       ++	# whitespace so that they pass thru xargs properly.
>       ++	#
>         	test_perf_w_drop_caches "status (dirty) ($DESC)" '
>        -		git ls-files | head -100000 | xargs -d "\n" touch -h &&
>       -+		git ls-files | head -100000 | lf_to_nul | xargs -0 touch -h &&
>       ++		git ls-files | \
>       ++			head -100000 | \
>       ++			grep -v \" | \
>       ++			sed '\''s/\(.\)/\\\1/g'\'' | \
>       ++			xargs test-tool chmtime -300 &&
>         		git status
>         	'
>         
>    2:  a641f9e357c =  2:  3042fc92fe6 p7519: fix watchman watch-list test on Windows
>    3:  2af6858716f =  3:  9ceba5e6942 p7519: move watchman cleanup earlier in the test
>    4:  8de9985a706 =  4:  f6ea0a51f50 p7519: add trace logging during perf test
>    5:  cdd49f1fdb1 =  5:  3c5035e4649 preload-index: log the number of lstat calls to trace2
>    6:  65488f7a1bf =  6:  d150a2d4576 read-cache: log the number of lstat calls to trace2
>    7:  c84531f6244 =  7:  33cc0b838fa read-cache: log the number of scanned files to trace2
>    8:  ef64b60c7a0 =  8:  c043bccc8af fsmonitor: log invocation of FSMonitor hook to trace2
>    9:  edb88ffe39e =  9:  6ec4a4468f6 fsmonitor: log FSMN token when reading and writing the index
>   10:  384d2eff863 = 10:  2ac66f07a59 fsmonitor: allow all entries for a folder to be invalidated
>   11:  4686196bbc6 = 11:  5410d3ab61d fsmonitor: refactor initialization of fsmonitor_last_update token
> 

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 16, 2021

User Jeff Hostetler <git@jeffhostetler.com> has been added to the cc: list.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 17, 2021

This branch is now known as jh/fsmonitor-prework.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 17, 2021

This patch series was integrated into seen via git@3919a38.

@gitgitgadget gitgitgadget bot added the seen label Feb 17, 2021
@gitgitgadget
Copy link

gitgitgadget bot commented Feb 17, 2021

On the Git mailing list, Junio C Hamano wrote (reply to this):

Jeff Hostetler <git@jeffhostetler.com> writes:

> On 2/3/21 10:34 AM, Jeff Hostetler via GitGitGadget wrote:
>> Here is version 2 of this series.
>
> I didn't see this series in the "what's cooking" emails and
> was wondering if there was something that I still needed to
> attend to.

I think it fell through the cracks.  It seems that Taylor gave a
reviewed-by for the whole set in v2 but as a reply to the cover
letter of the original series, and no messages in v2 saw any
comments.

I just picked them up and queued.  Thanks for pinging.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 17, 2021

This patch series was integrated into seen via git@034918a.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 18, 2021

This patch series was integrated into seen via git@3aadfd4.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 19, 2021

This patch series was integrated into seen via git@a6a7d5f.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 19, 2021

This patch series was integrated into next via git@1943efb.

@gitgitgadget gitgitgadget bot added the next label Feb 19, 2021
@gitgitgadget
Copy link

gitgitgadget bot commented Feb 23, 2021

This patch series was integrated into seen via git@2561cba.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 25, 2021

This patch series was integrated into seen via git@324a60d.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 26, 2021

This patch series was integrated into seen via git@2845b05.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 26, 2021

This patch series was integrated into seen via git@d1b26b0.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 27, 2021

This patch series was integrated into seen via git@a1aaf75.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 1, 2021

This patch series was integrated into seen via git@700696b.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 1, 2021

This patch series was integrated into next via git@700696b.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 1, 2021

This patch series was integrated into master via git@700696b.

@gitgitgadget gitgitgadget bot added the master label Mar 1, 2021
@gitgitgadget
Copy link

gitgitgadget bot commented Mar 1, 2021

Closed via 700696b.

@gitgitgadget gitgitgadget bot closed this Mar 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants