Merge pull request #3082 from dscho/fsmonitor-gfw
Add an experimental built-in FSMonitor
jeffhostetler committed Mar 8, 2021
2 parents 4321276 + 60deac2 commit dc703dd
Showing 46 changed files with 7,165 additions and 79 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -71,6 +71,7 @@
/git-format-patch
/git-fsck
/git-fsck-objects
/git-fsmonitor--daemon
/git-gc
/git-get-tar-commit-id
/git-grep
47 changes: 37 additions & 10 deletions Documentation/config/core.txt
@@ -66,18 +66,45 @@ core.fsmonitor::
will identify all files that may have changed since the
requested date/time. This information is used to speed up git by
avoiding unnecessary processing of files that have not changed.
See the "fsmonitor-watchman" section of linkgit:githooks[5].
+
See the "fsmonitor-watchman" section of linkgit:githooks[5].
+
Note: FSMonitor hooks (and this config setting) are ignored if the
(experimental) built-in FSMonitor is enabled (see
`core.useBuiltinFSMonitor`).

core.fsmonitorHookVersion::
Sets the version of hook that is to be used when calling fsmonitor.
There are currently versions 1 and 2. When this is not set,
version 2 will be tried first and if it fails then version 1
will be tried. Version 1 uses a timestamp as input to determine
which files have changes since that time but some monitors
like watchman have race conditions when used with a timestamp.
Version 2 uses an opaque string so that the monitor can return
something that can be used to determine what files have changed
without race conditions.
Sets the version of hook that is to be used when calling the
FSMonitor hook (as configured via `core.fsmonitor`).
+
There are currently versions 1 and 2. When this is not set,
version 2 will be tried first and if it fails then version 1
will be tried. Version 1 uses a timestamp as input to determine
which files have changes since that time but some monitors
like watchman have race conditions when used with a timestamp.
Version 2 uses an opaque string so that the monitor can return
something that can be used to determine what files have changed
without race conditions.
+
Note: FSMonitor hooks (and this config setting) are ignored if the
built-in FSMonitor is enabled (see `core.useBuiltinFSMonitor`).

core.useBuiltinFSMonitor::
(EXPERIMENTAL) If set to true, enable the built-in filesystem
event watcher (for technical details, see
linkgit:git-fsmonitor--daemon[1]).
+
Like external (hook-based) FSMonitors, the built-in FSMonitor can speed up
Git commands that need to refresh the Git index (e.g. `git status`) in a
worktree with many files. The built-in FSMonitor facility eliminates the
need to install and maintain an external third-party monitoring tool.
+
The built-in FSMonitor is currently available only on a limited set of
supported platforms.
+
Note: if this config setting is set to `true`, any FSMonitor hook
configured via `core.fsmonitor` (and possibly `core.fsmonitorHookVersion`)
is ignored.

core.trustctime::
If false, the ctime differences between the index and the
107 changes: 107 additions & 0 deletions Documentation/git-fsmonitor--daemon.txt
@@ -0,0 +1,107 @@
git-fsmonitor--daemon(1)
========================

NAME
----
git-fsmonitor--daemon - (EXPERIMENTAL) Builtin file system monitor daemon

SYNOPSIS
--------
[verse]
'git fsmonitor--daemon' --start
'git fsmonitor--daemon' --run
'git fsmonitor--daemon' --stop
'git fsmonitor--daemon' --is-running
'git fsmonitor--daemon' --is-supported
'git fsmonitor--daemon' --query <token>
'git fsmonitor--daemon' --query-index
'git fsmonitor--daemon' --flush

DESCRIPTION
-----------

NOTE! This command is still only an experiment, subject to change dramatically
(or even to be abandoned).

Monitors files and directories in the working directory for changes using
platform-specific file system notification facilities.

It communicates directly with commands like `git status` using the
link:technical/api-simple-ipc.html[simple IPC] interface instead of
the slower linkgit:githooks[5] interface.

OPTIONS
-------

--start::
Starts the fsmonitor daemon in the background.

--run::
Runs the fsmonitor daemon in the foreground.

--stop::
Stops the fsmonitor daemon running for the current working
directory, if present.

--is-running::
Exits with zero status if the fsmonitor daemon is watching the
current working directory.

--is-supported::
Exits with zero status if the fsmonitor daemon feature is supported
on this platform.

--query <token>::
Connects to the fsmonitor daemon (starting it if necessary) and
requests the list of changed files and directories since the
given token.
This is intended for testing purposes.

--query-index::
Read the current `<token>` from the File System Monitor index
extension (if present) and use it to query the fsmonitor daemon.
This is intended for testing purposes.

--flush::
Force the fsmonitor daemon to flush its in-memory cache and
re-sync with the file system.
This is intended for testing purposes.
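
The `--is-running` and `--is-supported` options report their answer only through the exit status. As a rough sketch of how a wrapper program might consume that, the C fragment below runs a command through `system()` and treats a zero exit status as "yes". The helper names `exits_zero()` and `builtin_fsmonitor_supported()` are hypothetical and not part of Git; the only Git interface assumed is the `--is-supported` option described above.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Return 1 if `cmd` ran and exited with status zero, 0 otherwise. */
static int exits_zero(const char *cmd)
{
	int status = system(cmd);

	return status != -1 && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

/*
 * Hypothetical helper: ask the installed Git whether the built-in
 * FSMonitor is supported on this platform. Also returns 0 if Git is
 * missing or predates the fsmonitor--daemon command.
 */
static int builtin_fsmonitor_supported(void)
{
	return exits_zero("git fsmonitor--daemon --is-supported >/dev/null 2>&1");
}
```

A caller could check `builtin_fsmonitor_supported()` before setting `core.useBuiltinFSMonitor`, and fall back to a hook-based monitor when it returns 0.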

REMARKS
-------
The fsmonitor daemon is a long running process that will watch a single
working directory. Commands, such as `git status`, should automatically
start it (if necessary) when `core.useBuiltinFSMonitor` is set to `true`
(see linkgit:git-config[1]).

Configure the built-in FSMonitor via `core.useBuiltinFSMonitor` in each
working directory separately, or globally via `git config --global
core.useBuiltinFSMonitor true`.

Tokens are opaque strings. They are used by the fsmonitor daemon to
mark a point in time and the associated internal state. Callers should
make no assumptions about the content of the token. In particular,
they should not assume that it is a timestamp.

Query commands send a request-token to the daemon and it responds with
a summary of the changes that have occurred since that token was
created. The daemon also returns a response-token that the client can
use in a future query.

For more information see the "File System Monitor" section in
linkgit:git-update-index[1].

CAVEATS
-------

The fsmonitor daemon does not currently know about submodules and does
not know to filter out file system events that happen within a
submodule. If the fsmonitor daemon is watching a super repo and a file is
modified within the working directory of a submodule, it will report
the change (as happening against the super repo). However, the client
should properly ignore these extra events, so performance may be affected
but it should not cause an incorrect result.

GIT
---
Part of the linkgit:git[1] suite
4 changes: 3 additions & 1 deletion Documentation/git-update-index.txt
@@ -498,7 +498,9 @@ FILE SYSTEM MONITOR
This feature is intended to speed up git operations for repos that have
large working directories.

It enables git to work together with a file system monitor (see the
It enables git to work together with a file system monitor (see
linkgit:git-fsmonitor--daemon[1]
and the
"fsmonitor-watchman" section of linkgit:githooks[5]) that can
inform it as to what files have been modified. This enables git to avoid
having to lstat() every file to find modified files.
3 changes: 2 additions & 1 deletion Documentation/githooks.txt
@@ -584,7 +584,8 @@ fsmonitor-watchman

This hook is invoked when the configuration option `core.fsmonitor` is
set to `.git/hooks/fsmonitor-watchman` or `.git/hooks/fsmonitor-watchmanv2`
depending on the version of the hook to use.
depending on the version of the hook to use, unless overridden via
`core.useBuiltinFSMonitor` (see linkgit:git-config[1]).

Version 1 takes two arguments, a version (1) and the time in elapsed
nanoseconds since midnight, January 1, 1970.
105 changes: 105 additions & 0 deletions Documentation/technical/api-simple-ipc.txt
@@ -0,0 +1,105 @@
Simple-IPC API
==============

The Simple-IPC API is a collection of `ipc_` prefixed library routines
and a basic communication protocol that allow an IPC-client process to
send an application-specific IPC-request message to an IPC-server
process and receive an application-specific IPC-response message.

Communication occurs over a named pipe on Windows and a Unix domain
socket on other platforms. IPC-clients and IPC-servers rendezvous at
a previously agreed-to application-specific pathname (which is outside
the scope of this design) that is local to the computer system.
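
To make the rendezvous idea concrete on the Unix side, here is a minimal sketch using plain POSIX sockets. It is not the `ipc_` API added by this commit: the helper names `rendezvous_listen()` and `rendezvous_connect()` are hypothetical, and removing a stale socket file before binding is a simplification chosen for this example.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill in a sockaddr_un for `path`; fail if the path is too long. */
static int fill_addr(struct sockaddr_un *sa, const char *path)
{
	if (strlen(path) >= sizeof(sa->sun_path))
		return -1;
	memset(sa, 0, sizeof(*sa));
	sa->sun_family = AF_UNIX;
	strcpy(sa->sun_path, path); /* length was checked above */
	return 0;
}

/* Server side: create a listening socket at the agreed-to pathname. */
static int rendezvous_listen(const char *path)
{
	struct sockaddr_un sa;
	int fd;

	if (fill_addr(&sa, path) < 0)
		return -1;
	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;
	unlink(path); /* sketch only: drop any stale socket file */
	if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0 ||
	    listen(fd, 5) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Client side: connect to the same pathname; -1 if nobody listens. */
static int rendezvous_connect(const char *path)
{
	struct sockaddr_un sa;
	int fd;

	if (fill_addr(&sa, path) < 0)
		return -1;
	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;
	if (connect(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

A failed `rendezvous_connect()` is how a client in this sketch would learn that no server is running at the pathname, which is the point where an application could decide to start one on demand.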

The IPC-server routines within the server application process create a
thread pool to listen for connections and receive request messages
from multiple concurrent IPC-clients. When received, these messages
are dispatched up to the server application callbacks for handling.
IPC-server routines then incrementally relay responses back to the
IPC-client.

The IPC-client routines within a client application process connect
to the IPC-server and send a request message and wait for a response.
When received, the response is returned to the caller.

For example, the `fsmonitor--daemon` feature will be built as a server
application on top of the IPC-server library routines. It will have
threads watching for file system events and a thread pool waiting for
client connections. Clients, such as `git status`, will request a list
of file system events since a point in time and the server will
respond with a list of changed files and directories. The formats of
the request and response are application-specific; the IPC-client and
IPC-server routines treat them as opaque byte streams.


Comparison with sub-process model
---------------------------------

The Simple-IPC mechanism differs from the existing `sub-process.c`
model (Documentation/technical/long-running-process-protocol.txt) that
is used by applications like Git-LFS. In the LFS-style sub-process model
the helper is started by the foreground process, communication happens
via a pair of file descriptors bound to the stdin/stdout of the
sub-process, the sub-process only serves the current foreground
process, and the sub-process exits when the foreground process
terminates.

In the Simple-IPC model the server is a very long-running service. It
can service many clients at the same time and has a private socket or
named pipe connection to each active client. It might be started
(on-demand) by the current client process or it might have been
started by a previous client or by the OS at boot time. The server
process is not associated with a terminal and it persists after
clients terminate. Clients do not have access to the stdin/stdout of
the server process and therefore must communicate over sockets or
named pipes.


Server startup and shutdown
---------------------------

How an application server based upon IPC-server is started is also
outside the scope of the Simple-IPC design and is a property of the
application using it. For example, the server might be started or
restarted during routine maintenance operations, or it might be
started as a system service during the system boot-up sequence, or it
might be started on-demand by a foreground Git command when needed.

Similarly, server shutdown is a property of the application using
the simple-ipc routines. For example, the server might decide to
shut down when idle or only upon explicit request.


Simple-IPC protocol
-------------------

The Simple-IPC protocol consists of a single request message from the
client and an optional response message from the server. Both the
client and server messages are unlimited in length and are terminated
with a flush packet.

The pkt-line routines (Documentation/technical/protocol-common.txt)
are used to simplify buffer management during message generation,
transmission, and reception. A flush packet is used to mark the end
of the message. This allows the sender to incrementally generate and
transmit the message. It allows the receiver to incrementally receive
the message in chunks and to know when they have received the entire
message.
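
The framing described above can be sketched in a few lines of C. This is an illustration of the pkt-line wire format only, not Git's own pkt-line routines: each packet is a 4-digit hexadecimal length (which counts the 4 header bytes) followed by the payload, and the flush packet is the literal `0000`. The function names and the in-memory buffers are assumptions made for this example; a real implementation reads from and writes to the socket or named pipe incrementally.

```c
#include <stdio.h>
#include <string.h>

#define PKT_HEADER_LEN 4
#define PKT_DATA_MAX 65516 /* largest payload of a single pkt-line */

/* Write one pkt-line: 4 hex digits (payload length + 4), then payload. */
static size_t pkt_write(char *out, size_t cap, const char *data, size_t len)
{
	char hdr[PKT_HEADER_LEN + 1];

	if (!len || len > PKT_DATA_MAX || cap < len + PKT_HEADER_LEN)
		return 0;
	snprintf(hdr, sizeof(hdr), "%04x", (unsigned int)(len + PKT_HEADER_LEN));
	memcpy(out, hdr, PKT_HEADER_LEN);
	memcpy(out + PKT_HEADER_LEN, data, len);
	return len + PKT_HEADER_LEN;
}

/* Frame `msg` in chunks of at most `chunk` bytes, then add a flush packet. */
static size_t frame_message(char *out, size_t cap,
			    const char *msg, size_t len, size_t chunk)
{
	size_t used = 0;

	if (!chunk)
		return 0;
	while (len) {
		size_t n = len < chunk ? len : chunk;
		size_t w = pkt_write(out + used, cap - used, msg, n);

		if (!w)
			return 0;
		used += w;
		msg += n;
		len -= n;
	}
	if (cap - used < PKT_HEADER_LEN)
		return 0;
	memcpy(out + used, "0000", PKT_HEADER_LEN); /* flush packet */
	return used + PKT_HEADER_LEN;
}

/* Parse the 4 hex digits of a packet header; -1 if malformed. */
static int parse_len(const char *p)
{
	int v = 0;

	for (int i = 0; i < PKT_HEADER_LEN; i++) {
		char c = p[i];

		if (c >= '0' && c <= '9')
			v = v * 16 + (c - '0');
		else if (c >= 'a' && c <= 'f')
			v = v * 16 + (c - 'a' + 10);
		else if (c >= 'A' && c <= 'F')
			v = v * 16 + (c - 'A' + 10);
		else
			return -1;
	}
	return v;
}

/*
 * Reassemble the payloads up to the flush packet into `msg`.
 * Returns the message length, or -1 if the input is malformed,
 * does not fit, or ends before a flush packet is seen.
 */
static long unframe_message(const char *in, size_t in_len,
			    char *msg, size_t cap)
{
	size_t pos = 0, used = 0;

	while (in_len - pos >= PKT_HEADER_LEN) {
		int n = parse_len(in + pos);
		size_t payload;

		if (n == 0)
			return (long)used; /* flush: message is complete */
		if (n <= PKT_HEADER_LEN)
			return -1; /* malformed, empty, or special packet */
		payload = (size_t)n - PKT_HEADER_LEN;
		if (payload > in_len - pos - PKT_HEADER_LEN ||
		    payload > cap - used)
			return -1;
		memcpy(msg + used, in + pos + PKT_HEADER_LEN, payload);
		used += payload;
		pos += (size_t)n;
	}
	return -1;
}
```

With a chunk size of 5, the 11-byte message `hello world` is framed as `0009hello0009 worl0005d0000`. The receiver does not need to know the total length in advance; it simply appends payloads until it sees the flush packet.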

The actual byte format of the client request and server response
messages is application-specific. The IPC layer transmits and
receives them as opaque byte buffers without any concern for the
content within. It is the job of the calling application layer to
understand the contents of the request and response messages.


Summary
-------

Conceptually, the Simple-IPC protocol is similar to an HTTP REST
request. Clients connect, make an application-specific and
stateless request, receive an application-specific
response, and disconnect. It is a one round trip facility for
querying the server. The Simple-IPC routines hide the socket,
named pipe, and thread pool details and allow the application
layer to focus on the application at hand.
24 changes: 24 additions & 0 deletions Makefile
@@ -464,6 +464,11 @@ all::
# directory, and the JSON compilation database 'compile_commands.json' will be
# created at the root of the repository.
#
# If your platform supports a built-in fsmonitor backend, set
# FSMONITOR_DAEMON_BACKEND to the name of the corresponding
# `compat/fsmonitor/fsmonitor-fs-listen-<name>.c` that implements the
# `fsmonitor_fs_listen__*()` routines.
#
# Define DEVELOPER to enable more compiler warnings. Compiler version
# and family are auto detected, but could be overridden by defining
# COMPILER_FEATURES (see config.mak.dev). You can still set
@@ -736,6 +741,7 @@ TEST_BUILTINS_OBJS += test-serve-v2.o
TEST_BUILTINS_OBJS += test-sha1.o
TEST_BUILTINS_OBJS += test-sha256.o
TEST_BUILTINS_OBJS += test-sigchain.o
TEST_BUILTINS_OBJS += test-simple-ipc.o
TEST_BUILTINS_OBJS += test-strcmp-offset.o
TEST_BUILTINS_OBJS += test-string-list.o
TEST_BUILTINS_OBJS += test-submodule-config.o
@@ -882,6 +888,7 @@ LIB_OBJS += fetch-pack.o
LIB_OBJS += fmt-merge-msg.o
LIB_OBJS += fsck.o
LIB_OBJS += fsmonitor.o
LIB_OBJS += fsmonitor-ipc.o
LIB_OBJS += gettext.o
LIB_OBJS += gpg-interface.o
LIB_OBJS += graph.o
@@ -1081,6 +1088,7 @@ BUILTIN_OBJS += builtin/fmt-merge-msg.o
BUILTIN_OBJS += builtin/for-each-ref.o
BUILTIN_OBJS += builtin/for-each-repo.o
BUILTIN_OBJS += builtin/fsck.o
BUILTIN_OBJS += builtin/fsmonitor--daemon.o
BUILTIN_OBJS += builtin/gc.o
BUILTIN_OBJS += builtin/get-tar-commit-id.o
BUILTIN_OBJS += builtin/grep.o
@@ -1667,6 +1675,14 @@ ifdef NO_UNIX_SOCKETS
BASIC_CFLAGS += -DNO_UNIX_SOCKETS
else
LIB_OBJS += unix-socket.o
LIB_OBJS += unix-stream-server.o
LIB_OBJS += compat/simple-ipc/ipc-shared.o
LIB_OBJS += compat/simple-ipc/ipc-unix-socket.o
endif

ifdef USE_WIN32_IPC
LIB_OBJS += compat/simple-ipc/ipc-shared.o
LIB_OBJS += compat/simple-ipc/ipc-win32.o
endif

ifdef NO_ICONV
@@ -1881,6 +1897,11 @@ ifdef NEED_ACCESS_ROOT_HANDLER
COMPAT_OBJS += compat/access.o
endif

ifdef FSMONITOR_DAEMON_BACKEND
COMPAT_CFLAGS += -DHAVE_FSMONITOR_DAEMON_BACKEND
COMPAT_OBJS += compat/fsmonitor/fsmonitor-fs-listen-$(FSMONITOR_DAEMON_BACKEND).o
endif

ifeq ($(TCLTK_PATH),)
NO_TCLTK = NoThanks
endif
@@ -2731,6 +2752,9 @@ GIT-BUILD-OPTIONS: FORCE
@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
@echo DC_SHA1=\''$(subst ','\'',$(subst ','\'',$(DC_SHA1)))'\' >>$@+
@echo X=\'$(X)\' >>$@+
ifdef FSMONITOR_DAEMON_BACKEND
@echo FSMONITOR_DAEMON_BACKEND=\''$(subst ','\'',$(subst ','\'',$(FSMONITOR_DAEMON_BACKEND)))'\' >>$@+
endif
ifdef TEST_OUTPUT_DIRECTORY
@echo TEST_OUTPUT_DIRECTORY=\''$(subst ','\'',$(subst ','\'',$(TEST_OUTPUT_DIRECTORY)))'\' >>$@+
endif
1 change: 1 addition & 0 deletions builtin.h
@@ -158,6 +158,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix);
int cmd_for_each_repo(int argc, const char **argv, const char *prefix);
int cmd_format_patch(int argc, const char **argv, const char *prefix);
int cmd_fsck(int argc, const char **argv, const char *prefix);
int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix);
int cmd_gc(int argc, const char **argv, const char *prefix);
int cmd_get_tar_commit_id(int argc, const char **argv, const char *prefix);
int cmd_grep(int argc, const char **argv, const char *prefix);
3 changes: 2 additions & 1 deletion builtin/credential-cache--daemon.c
@@ -203,9 +203,10 @@ static int serve_cache_loop(int fd)

static void serve_cache(const char *socket_path, int debug)
{
struct unix_stream_listen_opts opts = UNIX_STREAM_LISTEN_OPTS_INIT;
int fd;

fd = unix_stream_listen(socket_path);
fd = unix_stream_listen(socket_path, &opts);
if (fd < 0)
die_errno("unable to bind to '%s'", socket_path);
