Skip to content

Implement a simple IPC mechanism in preparation for the internal FSMonitor #277

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 11 commits into from

Conversation

dscho
Copy link
Member

@dscho dscho commented Jun 4, 2020

This topic branch introduces support for a super simply command server, backed by Unix sockets (and named pipes on Windows). All it does is to listen for short, textual commands, and respond with a reply, then close the connection.

The intention is to support an upcoming implementation of a built-in, Git-aware FSMonitor.

jeffhostetler and others added 4 commits June 4, 2020 16:27
Teach packet_write_gently() to use a stack buffer rather than a static
buffer when composing the packet line message.  This helps get us ready
for threaded operations.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
So far, the (possibly indirect) callers of `get_packet_data()` can ask
that function to return an error instead of `die()`ing upon end-of-file.

However, when we call this function in a long-running daemon, we
absolutely want the daemon to live, still, even if there was a read
error when one random read failed.

So let's introduce an explicit option to tell the packet reader
machinery to please be nice and only return an error.

This change prepares pkt-line for the internal fsmonitor (which is
precisely such a daemon).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
…uf()

This function currently has only one caller: `apply_multi_file_filter()`
in `convert.c`. That caller wants a flush packet to be written after
writing the payload.

However, we are about to introduce a user that wants to write many
packets before a final flush packet, so let's extend this function to
prepare for that scenario.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The `read_packetized_to_strbuf()` function reads packets into a strbuf
until a flush packet has been received. So far, it has only one caller:
`apply_multi_file_filter()` in `convert.c`. This caller really only
needs the `PACKET_READ_GENTLE_ON_EOF` option to be passed to
`packet_read()` (which makes sense in the scenario where packets should
be read until a flush packet is received).

We are about to introduce a caller that wants to pass other options
through to `packet_read()`, so let's extend the function signature
accordingly.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho dscho mentioned this pull request Jun 4, 2020
@dscho
Copy link
Member Author

dscho commented Jun 4, 2020

@jeffhostetler do you want to work on the task where simple IPC learns to potentially service multiple clients in parallel? If so, do you think it should be an add-on, or should it be squashed into this PR?

@derrickstolee
Copy link

@jeffhostetler do you want to work on the task where simple IPC learns to potentially service multiple clients in parallel? If so, do you think it should be an add-on, or should it be squashed into this PR?

Unless there is a good reason why something done here is blocking that feature, I think it would be good to get this reviewed and merged before he updates the protocol with that part.

Copy link

@derrickstolee derrickstolee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend some pretty significant shifts in the IPC code organization, so please give them some thought. Most everything else is a nit.

test_expect_success 'start simple command server' '
{ test-tool simple-ipc daemon & } &&
SIMPLE_IPC_PID=$! &&
test_atexit stop_simple_IPC_server

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -182,4 +182,143 @@ int ipc_send_command(const char *path, const char *message, struct strbuf *answe
close(fd);
return ret < 0 ? -1 : 0;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this commit could maybe use more text in the message body. Perhaps:

We previously implemented simple Inter-Process Communication (IPC) on
Windows using named pipes. Create an equivalent implementation for
Unix-based platforms using sockets.

(insert any subtleties about how this differs from the Windows implementation)

The tests in t0052-simple-ipc.sh will automatically start running on Unix
platforms with this change due to the response from "test-tool simple-ipc
SUPPORTS_SIMPLE_IPC" changing automatically with the change here in
simple-ipc.h.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this:

simple-ipc: implement a backend based on Unix-sockets

We previously implemented simple Inter-Process Communication (IPC) on
Windows using named pipes. Create an equivalent implementation for
Unix-based platforms using Unix sockets.

Note that there are a variety of differences between Win32 Named Pipes
and Unix sockets, for example:

- a Win32 Named Pipe vanishes once the listener handle is closed. Unix
  sockets do not just disappear, even if the process that created them
  exited.

- Win32 Named Pipes exist in a name space parallel to the actual file
  system. In contrast, Unix sockets live in the same name space, i.e. a
  Unix socket cannot have the same path as an existing file and vice
  versa.

- Win32 Named Pipes have two communication modes: "message" and "byte".
  The former allows senders to define the size of messages to be sent;
  the receiving end will not receive partial messages. Unix sockets only
  operate in the mode equivalent to "byte".

- When a listener accepts a client via a Win32 Named Pipe, subsequent
  communication will use the same handle to the same pipe, i.e. it can
  become a bit tricky to keep the communication to several concurrent
  clients apart from one another. When a listener on a Unix socket
  accepts a new client, it gets handed a new file descriptor specific to
  that client.

The tests in `t0052-simple-ipc.sh` will automatically start running on
Unix platforms with this change due to the response from `test-tool
simple-ipc SUPPORTS_SIMPLE_IPC` changing automatically with the change
here in `simple-ipc.h`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WRT:

    - When a listener accepts a client via a Win32 Named Pipe, subsequent
       communication will use the same handle to the same pipe, i.e. it can
       become a bit tricky to keep the communication to several concurrent
       clients apart from one another. When a listener on a Unix socket
       accepts a new client, it gets handed a new file descriptor specific to
       that client.

Windows Named Pipes actually behave a little differently than that.
The server creates the initial server-pipe (which reserves the pathname
in the Named Pipe File System) and allows it to receive the first client
connection. That part is true. But when either side closes that connection,
the pipe is dead. The pathname is removed from the NPFS when the other
side closes their handle.

For the server to respond to multiple clients (either in series or currently),
it needs to open more than one server pipe handles on the same pathname.
(The server should set the ACL on the initial server-pipe handle, because the
ACL can't be changed once there are multiple handles.)

So yes, the server needs to juggle a list/queue of open server handles. The next
incoming client connection will connect (randomly to) one of the open
server handles. The server needs to pull that from the "ready list/queue" and
dispatch it to a worker thread for the duration of the conversation/protocol.
The server needs to open a new server handle on the existing pathname
to replace the consumed one in the queue.

Most of the example code on the net show a server opening a single server
handle, waiting for a connection, dispatching it and then opening a new
single server handle to replace it (and before the worker thread can close
the connected handle). There are 2 races in that model: (1) the worker hangs
up too quickly and (2) the named pipe is non-responsive during that the interval
(after the connection is accepted and until the new server handle is opened).

(Creating a named-pipe client handle does secretly have a timeout in it, but
it is still possible for the client to timeout and fail to get a client handle (when
you have a high arrival rate (ask me how I know...)))

By letting the server pre-create multiple server handles (mostly) prevents
clients from starving, and implicitly prevents the other race. Conceptually,
this server handle list/queue acts like the listen(n) in the socket code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jeffhostetler can I ask you to implement this? You are the domain expert about this, and I don't trust myself to get it right...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me take a look at restructuring things and see what I can come up with. I think it would be easier to slip this in at the IPC layer in isolation before worrying about mingling it with FSMonitor details.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay.

Comment on lines 47 to 58
/*
* The test script will start the daemon in the background,
* meaning that it might not be fully set up by the time we are
* verifying that the IPC is active. So let's keep trying for
* up to 2.5 seconds.
*/
for (i = 0; i < 50; i++)
if (ipc_is_active(path))
return 0;
else
sleep_millisec(50);
return 1;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to generalize this retry loop so we don't need to reimplement it each time?

@derrickstolee derrickstolee self-requested a review June 9, 2020 12:57
@derrickstolee
Copy link

@dscho do you plan to rebase this topic to squash in the fixup! commits? or are you waiting on some more feedback?

Copy link

@jeffhostetler jeffhostetler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to put a hold on this for a while. Reasons forthcoming.

simple-ipc.c Outdated
{
int fd = *(int *)reply_data;

return write_packetized_from_buf(response, len, fd, 0);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commit message for this commit (031698d)
says it doesn't use pkt-line routines.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I proposed a new commit message in 4e01e87 (as a squash! because we do not have a reword! yet):

    Add support for very simple Inter-Process Communication

    For the upcoming `fsmonitor--daemon` command, we need some way to send a
    command from some Git process to that daemon, and to receive an answer.
    In other words, we need to support Inter-Process Communication.

    By nature, Inter-Process Communication requires OS-specific
    implementations.

    Instead of using full-blown TCP/UDP socket communication (which comes
    with all kinds of challenges such as: restricting access permissions,
    collisions when multiple instances might want to listen on the same
    port, etc), we use named pipes on Windows, and plan on implementing a
    backend based on Unix sockets for Linux/Unix later.

    For now, we use a super-simple protocol: identified by a path into
    .git/, the daemon listens for clients, reads one command, then sends the
    answer and closes the connection.

    To avoid redundant coding patterns, we use the `pkt-line` protocol that
    we just libified to be able to use here: we really, really do not want
    our simple IPC client to take down the entire Git process with it when
    it could not talk to `fsmonitor--daemon`, but would rather instead fall
    back to a full scan of all tracked files.

(I adjusted the second-to-last paragraph minimally, and the last paragraph rather a lot.)

@dscho
Copy link
Member Author

dscho commented Jun 10, 2020

@dscho do you plan to rebase this topic to squash in the fixup! commits? or are you waiting on some more feedback?

I'm just majorly side-tracked with git-for-windows#2674, is all ;-)

@dscho dscho mentioned this pull request Jun 15, 2020
dscho and others added 7 commits June 29, 2020 21:58
For the upcoming `fsmonitor--daemon` command, we need some way to send a
command from some Git process to that daemon, and to receive an answer.
In other words, we need to support Inter-Process Communication.

By nature, Inter-Process Communication requires OS-specific
implementations.

Instead of using full-blown TCP/UDP socket communication (which comes
with all kinds of challenges such as: restricting access permissions,
collisions when multiple instances might want to listen on the same
port, etc), we use named pipes on Windows, and plan on implementing a
backend based on Unix sockets for Linux/Unix later.

For now, we use a super-simple protocol: identified by a path into
.git/, the daemon listens for clients, reads one command, then sends the
answer and closes the connection.

To avoid redundant coding patterns, we use the `pkt-line` protocol that
we just libified to be able to use here: we really, really do not want
our simple IPC client to take down the entire Git process with it when
it could not talk to `fsmonitor--daemon`, but would rather instead fall
back to a full scan of all tracked files.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We previously implemented simple Inter-Process Communication (IPC) on
Windows using named pipes. Create an equivalent implementation for
Unix-based platforms using Unix sockets.

Note that there are a variety of differences between Win32 Named Pipes
and Unix sockets, for example:

- a Win32 Named Pipe vanishes once the listener handle is closed. Unix
  sockets do not just disappear, even if the process that created them
  exited.

- Win32 Named Pipes exist in a name space parallel to the actual file
  system. In contrast, Unix sockets live in the same name space, i.e. a
  Unix socket cannot have the same path as an existing file and vice
  versa.

- Win32 Named Pipes have two communication modes: "message" and "byte".
  The former allows senders to define the size of messages to be sent;
  the receiving end will not receive partial messages. Unix sockets only
  operate in the mode equivalent to "byte".

- When a listener accepts a client via a Win32 Named Pipe, subsequent
  communication will use the same handle to the same pipe, i.e. it can
  become a bit tricky to keep the communication to several concurrent
  clients apart from one another. When a listener on a Unix socket
  accepts a new client, it gets handed a new file descriptor specific to
  that client.

The tests in `t0052-simple-ipc.sh` will automatically start running on
Unix platforms with this change due to the response from `test-tool
simple-ipc SUPPORTS_SIMPLE_IPC` changing automatically with the change
here in `simple-ipc.h`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This adds a very rudimentary test, just to make sure that the simple
Inter-Process Communication (where supported) works as intended (and
maybe also serving as an example how the command listener is supposed to
be implemented).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
prevent multiple listeners associated with the same path

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
TODO details here

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Just _starting_ to fix up the Unix Sockets part to match the new
internal simple IPC API.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@jeffhostetler
Copy link

I'd like to close this PR in favor of #281

@dscho dscho closed this Jul 13, 2020
@dscho dscho deleted the simple-ipc branch August 24, 2020 19:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants