Feature: make process substitution work when calling Git in Bash (e.g. git diff --no-index <(echo 1) <(echo 2)) #4266

Open
dscho opened this issue Feb 3, 2023 · 0 comments
Labels: enhancement; Help Wanted (help is requested from collaborators, please!); Potential Project (This issue can be used as a development Project for those looking for a nice challenge)

dscho commented Feb 3, 2023

NOTE: I leave this ticket in its current state for now, as I had started looking into fixing the issue, found out a couple of things, but ultimately could not finish the task in a reasonable amount of time.

Background

In Bash, there is a neat feature called "Process Substitution" that lets you feed dynamically generated content to programs that expect path parameters. For example, diff takes two parameters that refer to paths. By using the process substitution construct <(...), we can let diff compare dynamically generated text:

$ diff -u <(seq 5) <(seq 10)
--- /dev/fd/63  2023-02-03 11:49:44.000000000 +0100
+++ /dev/fd/62  2023-02-03 11:49:44.000000000 +0100
@@ -3,3 +3,8 @@
 3
 4
 5
+6
+7
+8
+9
+10

Internally, Bash replaces the parameter with something like /dev/fd/63, which is typically a symlink pointing to something like /proc/22962/fd/63. This works if the called program is an MSYS program, i.e. implicitly aware of the MSYS2 runtime internals. But a MINGW program (like git.exe) does not know what to do with that Unix path:

$ git -P diff --no-index <(seq 5) <(seq 10)
error: Could not access '/proc/22962/fd/63'
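
To make the mechanism concrete, here is a minimal sketch in C of what the called program sees on a Unix-like system: it receives a path such as /dev/fd/N and opens it like any other file, while the data actually comes from a pipe. The helper name read_via_dev_fd is hypothetical, the sketch assumes /dev/fd is available (as on Linux), and it writes the payload from the same process instead of spawning a producer, so it is an illustration rather than how Bash itself is implemented:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: put `text` into a pipe, then read it back by
 * opening "/dev/fd/<read end>" by path, the way a program called with
 * <(...) would. Returns the number of bytes read, or -1 on error.
 */
ssize_t read_via_dev_fd(const char *text, char *out, size_t cap)
{
	int fds[2];
	char path[32];
	ssize_t total = 0, n;
	int fd;

	if (cap == 0 || pipe(fds) < 0)
		return -1;

	/* "Producer" side: a small payload fits into the pipe buffer. */
	if (write(fds[1], text, strlen(text)) < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	close(fds[1]);

	/* This is the kind of path that ends up on the command line. */
	snprintf(path, sizeof(path), "/dev/fd/%d", fds[0]);

	/* "Consumer" side: open the path as if it were a regular file. */
	fd = open(path, O_RDONLY);
	if (fd < 0) {
		close(fds[0]);
		return -1;
	}
	while ((size_t)total < cap - 1 &&
	       (n = read(fd, out + total, cap - 1 - total)) > 0)
		total += n;
	out[total] = '\0';

	close(fd);
	close(fds[0]);
	return total;
}
```

A MINGW program has no /dev/fd to resolve, which is why the open() step in the consumer half is the part that fails for git.exe.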

I've looked at this from a couple of angles, trying to find a way to fix it.

MSYS2 runtime

The first idea I had was to somehow substitute the parameter with some different path, an NT namespace one (maybe \\.\pipe\*), a path that could be opened even by a regular Win32 program. This section describes my findings toward that goal.

When opening these paths in an MSYS program and calling _get_osfhandle() together with the NtQueryObject() trick to read a pipe's name, it turns out that they are named pipes with a path like \Device\NamedPipe\29148b3eb257a5c5-45908-pipe-nt-0x141. This path is generated in the MSYS2 runtime at this location.
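
For a Win32 program to open such a pipe by name, the NT object path first has to be translated into the \\.\pipe\ form. A minimal sketch of just that string translation, in portable C so it can be tried anywhere (the helper name nt_pipe_name_to_win32 is hypothetical, and nothing here talks to the MSYS2 runtime):

```c
#include <stddef.h>
#include <wchar.h>

/*
 * Hypothetical helper: rewrite "\Device\NamedPipe\<name>" into
 * "\\.\pipe\<name>". Returns 0 on success, -1 if the input is not a
 * named pipe path or the output buffer (cap wide chars) is too small.
 */
int nt_pipe_name_to_win32(const wchar_t *nt_name, wchar_t *out, size_t cap)
{
	static const wchar_t nt_prefix[] = L"\\Device\\NamedPipe\\";
	static const wchar_t win32_prefix[] = L"\\\\.\\pipe\\";
	size_t nt_len = wcslen(nt_prefix);
	size_t win32_len = wcslen(win32_prefix);
	size_t rest_len;

	if (wcsncmp(nt_name, nt_prefix, nt_len))
		return -1;

	rest_len = wcslen(nt_name + nt_len);
	if (win32_len + rest_len + 1 > cap)
		return -1;

	wmemcpy(out, win32_prefix, win32_len);
	/* Copy the pipe's own name, including the terminating NUL. */
	wmemcpy(out + win32_len, nt_name + nt_len, rest_len + 1);
	return 0;
}
```

With the name shown above, this would yield \\.\pipe\29148b3eb257a5c5-45908-pipe-nt-0x141.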

Now, if it is a named pipe, can't we open it like we do in simple-ipc?

Apparently not. My attempts failed with ERROR_PIPE_BUSY, which indicates that either nobody is listening or that the pipe is opened with incompatible parameters (although I would not understand how the latter could be the case, as the MSYS2 runtime itself performs the same call and it works).

Here is some totally undocumented, horribly written quick 'n dirty proof-of-concept that needs to be compiled using /usr/bin/gcc -g -o a1.exe -Wall a1.c -lntdll && ./a1.exe <(echo hello):

a1.c
#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <wchar.h>
#include <windows.h>
#include <io.h>
#include <winternl.h>

#ifdef __MSYS__
#define DWORD_F "%d"
#else
#define DWORD_F "%ld"
#endif

int main(int argc, char **argv)
{
	if (getenv("DDD")) {
		int x = 1;
		while (x) {
			fprintf(stderr, "gdb a1.exe %d\n", getpid());
			sleep(5);
		}
	}

	if (argc < 2)
		fprintf(stderr, "Need an argument\n");
	else {
		int fd = !strcmp("-", argv[1]) ? 0 : open(argv[1], O_RDONLY);
		HANDLE h = (HANDLE)_get_osfhandle(fd < 0 ? 3 : fd);
		DWORD type = GetFileType(h);
		WCHAR path[1024] = { 0 };
		DWORD ret = GetFinalPathNameByHandleW(h, path, 1024, 0);
		char buffer[1024], buffer2[1024];
		POBJECT_NAME_INFORMATION nameinfo = (POBJECT_NAME_INFORMATION) buffer2;
		DWORD result;

		/* get pipe name */
		if (!NT_SUCCESS(NtQueryObject(h, ObjectNameInformation,
					      buffer2, sizeof(buffer2) - 2, &result)))
			fprintf(stderr, "Could not get object info\n");
		else if (result < sizeof(*nameinfo) || !nameinfo->Name.Buffer || !nameinfo->Name.Length)
			fprintf(stderr, "result: " DWORD_F ", buffer: %p, length: " DWORD_F "\n",
				result, nameinfo->Name.Buffer, nameinfo->Name.Length);
		else {
			PWSTR name = nameinfo->Name.Buffer;
			HANDLE h2;
			char buffer3[1024];
			DWORD count;
			WCHAR *prefix1 = L"\\Device\\NamedPipe\\";
			WCHAR *prefix2 = L"\\\\.\\pipe\\";
			DWORD len1 = wcslen(prefix1);
			DWORD len2 = wcslen(prefix2);

			name[nameinfo->Name.Length / sizeof(*name)] = 0;
			printf("object name: '%ls'\n", name);

			if (!wcsncmp(name, prefix1, len1)) {
				memcpy(name + len1 - len2, prefix2, len2 * sizeof(WCHAR));
				name += len1 - len2;
				fprintf(stderr, "Adjusted name to '%ls'\n", name);
			}
			close(fd);
			h2 = CreateFileW(name, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
			if (getenv("WAIT_FOR_PIPE_IF_BUSY") && h2 == INVALID_HANDLE_VALUE && GetLastError() == ERROR_PIPE_BUSY) {
				if (!WaitNamedPipeW(name, 5000))
					fprintf(stderr, "Failure waiting for '%ls': " DWORD_F "\n", name, GetLastError());
				else
					h2 = CreateFileW(name, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
			}

			fprintf(stderr, "h2: %p (last error: " DWORD_F ")\n", h2, GetLastError());
			while (ReadFile(h2, buffer3, sizeof(buffer3), &count, NULL)) {
				fprintf(stderr, "Got " DWORD_F " bytes\n", count);
				write(2, buffer3, count);
			}
			CloseHandle(h2);
			fprintf(stderr, "Done reading h2\n");
		}

		fprintf(stderr, "ret: " DWORD_F "\n", ret);
		printf("fd: %d, h: %p\n", fd, h);
		printf("type: " DWORD_F ", pipe: " DWORD_F "\n", type, FILE_TYPE_PIPE);
		printf("path: '%ls', ret: " DWORD_F "\n", path, ret);
		fflush(stdout);

		for (;;) {
			ssize_t sz = read(fd, buffer, 1024);

			if (sz < 0) {
				fprintf(stderr, "read error: %d (%s)\n", errno, strerror(errno));
				exit(1);
			}
			if (!sz)
				break;
			write(1, buffer, sz);
		}
		close(fd);
		fprintf(stderr, "Done\n");
	}

	return 0;
}

Even calling WaitNamedPipe() does not help; it just times out. My best guess is that the MSYS2 runtime somehow "kicks" some other part (e.g. by sending a signal) to start producing the input of said pipe. Or maybe Bash is waiting for a signal before spawning the process?

In any case, I've hit what seems like a dead end here, and did not continue to research this angle.

Bash

The source code of Bash does have some code to allow for process substitution to work even without any /dev/fd/ support: here begins a section of functions that work with Unix-style FIFOs (which are somewhat comparable to Windows Named Pipes).
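
To illustrate the FIFO-based approach in general terms (this is not Bash's code), here is a minimal POSIX sketch: a child process writes into a FIFO created under a temporary directory, and the parent opens that FIFO by path as if it were a regular file. The helper name read_via_fifo and the /tmp/procsubst-XXXXXX template are hypothetical:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: emulate <(...) with a FIFO instead of /dev/fd.
 * Returns the number of bytes read from the FIFO, or -1 on error.
 */
ssize_t read_via_fifo(const char *text, char *out, size_t cap)
{
	char dir[] = "/tmp/procsubst-XXXXXX";
	char path[64];
	ssize_t total = 0, n;
	pid_t pid;
	int fd;

	if (cap == 0 || !mkdtemp(dir))
		return -1;
	snprintf(path, sizeof(path), "%s/fifo", dir);
	if (mkfifo(path, 0600) < 0) {
		rmdir(dir);
		return -1;
	}

	pid = fork();
	if (pid < 0) {
		unlink(path);
		rmdir(dir);
		return -1;
	}
	if (pid == 0) {
		/* Producer: open() blocks until a reader opens the FIFO. */
		int wfd = open(path, O_WRONLY);

		if (wfd < 0 || write(wfd, text, strlen(text)) < 0)
			_exit(1);
		close(wfd);
		_exit(0);
	}

	/* Consumer: this path is what would be passed as the argument. */
	fd = open(path, O_RDONLY);
	if (fd < 0) {
		kill(pid, SIGKILL); /* do not leave the producer blocked */
		total = -1;
	} else {
		while ((size_t)total < cap - 1 &&
		       (n = read(fd, out + total, cap - 1 - total)) > 0)
			total += n;
		out[total] = '\0';
		close(fd);
	}

	waitpid(pid, NULL, 0);
	unlink(path);
	rmdir(dir);
	return total;
}
```

The relevant property is that the consumer only ever needs a path it can open, with no knowledge of /dev/fd.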

Now, it should be relatively straightforward to implement another set of functions that use straight-up Win32 named pipes instead of Unix FIFOs, guarded by a certain #ifdef ... #endif, and use that in Git Bash.

This sounds doable at a first cursory glance, but I ran out of time in this spike.
