Architecture of Git for Windows

Git for Windows is a complex project.

What is Git for Windows?

A fork of git/git

First and foremost, it is a friendly fork of git/git, aiming to improve Git's Windows support. The git-for-windows/git repository contains dozens of topics on top of git/git: some are waiting to be "upstreamed" (i.e. contributed to git/git), some are still being stabilized, and a few are specific to the Git for Windows project and are not intended to be integrated into git/git at all.

Enhancing and maintaining Git's support for Windows

On the source code side, Git's Windows support is a bit trickier than strictly necessary because Git does not have any platform abstraction layer (unlike other version control systems, such as Subversion). It relies on the presence of POSIX features such as the hstrerror() function, and on platforms lacking that functionality, Git provides shims. That leads to some challenges, e.g. with the stat() function, which is very slow on Windows because it has to collect much more metadata than what e.g. the very quick GetFileAttributesExW() Win32 API function provides. This cost is paid even when Git calls stat() merely to test for the presence of a file (for which all that gathered metadata is totally irrelevant).
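To make the cost difference concrete, here is a minimal sketch of an existence check that avoids a full stat() on Windows. It is written in Python with ctypes purely for illustration; Git's actual shims are written in C and are not shown here. The function name path_exists is hypothetical, and the non-Windows branch simply falls back to os.stat().

```python
import os
import sys


def path_exists(path: str) -> bool:
    """Return True if `path` exists (hypothetical helper, illustration only)."""
    if sys.platform == "win32":
        import ctypes

        class FILETIME(ctypes.Structure):
            _fields_ = [("dwLowDateTime", ctypes.c_uint32),
                        ("dwHighDateTime", ctypes.c_uint32)]

        class WIN32_FILE_ATTRIBUTE_DATA(ctypes.Structure):
            _fields_ = [("dwFileAttributes", ctypes.c_uint32),
                        ("ftCreationTime", FILETIME),
                        ("ftLastAccessTime", FILETIME),
                        ("ftLastWriteTime", FILETIME),
                        ("nFileSizeHigh", ctypes.c_uint32),
                        ("nFileSizeLow", ctypes.c_uint32)]

        get_attrs = ctypes.windll.kernel32.GetFileAttributesExW
        get_attrs.argtypes = [ctypes.c_wchar_p, ctypes.c_int,
                              ctypes.POINTER(WIN32_FILE_ATTRIBUTE_DATA)]
        get_attrs.restype = ctypes.c_int  # Win32 BOOL

        data = WIN32_FILE_ATTRIBUTE_DATA()
        get_file_ex_info_standard = 0  # GET_FILEEX_INFO_LEVELS value
        # A single call that only fills in basic attributes, sizes and times.
        return bool(get_attrs(path, get_file_ex_info_standard, ctypes.byref(data)))

    # On other platforms stat() is cheap, so use it directly.
    try:
        os.stat(path)
    except OSError:
        return False
    return True
```

The point of the sketch is only that a presence test needs none of the extra metadata a POSIX-style stat() emulation has to gather.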

Providing more than just source code

In contrast to the Git project, Git for Windows not only publishes tagged source code versions, but also full builds of Git. In fact, Git for Windows' primary purpose, as far as most users are concerned, is to provide a convenient installer that end-users can run to have Git on their computer, without ever having to check out git-for-windows/git, let alone build it. In essence, Git for Windows has to maintain a separate project altogether in addition to the fork of git/git, just to build these release artifacts: git-for-windows/build-extra. This repository also contains the definitions for a couple of other release artifacts published by Git for Windows, e.g. the "portable" edition of Git for Windows, which is a self-extracting 7-Zip archive that does not need to be installed.

A software distribution, really

Another aspect that contributes to the complexity of Git for Windows is that it does not just build git.exe and distribute that. Due to its heritage within the Linux project, Git takes certain things for granted, such as the presence of a Unix shell, or for that matter, a package management system from which dependencies can be fetched and updated independently of Git itself. These things are distinctly not present in most Windows setups. To accommodate that, Git for Windows originally relied on the MSys project, a minimal fork of Cygwin providing a Unix shell ("Bash"), a Perl interpreter and similar Unix-like tools, and on the MinGW project, a project to build libraries and executables using a GNU C Compiler that relies only on Win32 API functions. As of Git for Windows v2.x, the project has switched away from MSys/MinGW (due to less-than-active maintenance) to the MSYS2 project. That switch brought along the benefit of a robust package management system based on Pacman (hailing from Arch Linux). To support Windows users, who are in general unfamiliar with Linux-like package management and the need to update installed packages frequently, Git for Windows bundles a subset of its own fork of MSYS2. To put things in perspective: Git for Windows bundles files from ~170 packages, one of which contains Git, and another one contains Git's help files. In that respect, Git for Windows acts more like a distribution than like a mere single software application.

Most of MSYS2's packages that are bundled in Git for Windows are consumed directly from MSYS2. Others need forks that are maintained by the Git for Windows project, to support Git for Windows better. These forks live in the git-for-windows/MSYS2-packages and git-for-windows/MINGW-packages repositories. There are several reasons justifying these forks. For example, Git for Windows' flavor of the MSYS2 runtime behaves the way Git's test suite expects, while MSYS2's flavor does not. Another example: the Bash executable bundled in Git for Windows is code-signed with the same certificate as git.exe to help anti-malware programs get out of the users' way. That is why Git for Windows maintains its own bash Pacman package. And since MSYS2 dropped 32-bit support already, Git for Windows has to update the 32-bit Pacman packages itself, which is done in the git-for-windows/MSYS2-packages repository. (Side note: the 32-bit issue is a bit more complicated, actually. MSYS2 still builds MINGW packages targeting i686 processors, but no longer any MSYS packages for said processor architecture. Git for Windows does not keep all of the 32-bit MSYS packages up to date but instead judiciously decides which packages are vital enough as far as Git is concerned to justify the maintenance cost.)

Supporting third-party applications that use Git's functionality

Since the infrastructure required by Git is non-trivial, the installer (or for that matter, the Portable Git) is not exactly light-weight: as of January 2023, both artifacts are over fifty megabytes. This is a problem for third-party applications wishing to bundle a version of Git for Windows, which is often advisable: applications may depend on features that have been introduced only in recent Git versions, so relying on whatever Git for Windows happens to be installed could break things. To help with that, the Git for Windows project also provides MinGit as a release artifact, a zip file that is much smaller than the full installer and that contains only the parts of Git for Windows relevant for third-party applications. It lacks Git GUI, for example, as well as the terminal program MinTTY, or for that matter, the documentation.

Supporting git/git's GitHub workflows

The Git for Windows project is also responsible for keeping the Windows part of git/git's automated builds up and running. On Windows, there is no canonical and easy way to get the build environment necessary to build Git and run its test suite, so this is a non-trivial task that comes with its own maintenance cost. Git for Windows provides two GitHub Actions to help with that. The first, git-for-windows/setup-git-for-windows-sdk, sets up a tiny subset of Git for Windows' full SDK (the full SDK would require about 500MB to be cloned, as opposed to the ~75MB of that subset). The second, git-for-windows/get-azure-pipelines-artifact, downloads regularly pre-built artifacts (for example, when git/git's automated tests ran on an Ubuntu version that did not provide an up-to-date Coccinelle package, this GitHub Action was used to download a pre-built version of that Debian package).

Maintaining Git for Windows' components

Git for Windows uses a GitHub App called GitForWindowsHelper (to listen for so-called slash commands) in combination with workflows in the git-for-windows-automation repository (for computationally heavy tasks) to handle the project's repetitive tasks.

This heavy automation serves two purposes:

  1. Document the knowledge about "how things are done" in the Git for Windows project.
  2. Make Git for Windows' maintenance less tedious by off-loading as many tasks onto machines as possible.

One neat trick of some git-for-windows-automation workflows is that they "mirror back" check runs to the targeted PRs in another repository. This essentially allows versioning the source code independently of the workflow definition.

Here is a diagram showing how the bits and pieces fit together.

graph LR
  A[`monitor-components`] --> |opens| B
  B{issues labeled<br />`component-update`} --> |/open pr| C
  C((GitForWindowsHelper)) --> |triggers| D
  D[`open-pr`] --> |opens| E
  E{PR in<br />MINGW-packages<br />MSYS2-packages<br />build-extra} --> |closes| B
  E --> |/deploy| F
  F((GitForWindowsHelper)) --> |triggers| G
  G[`build-and-deploy`] --> |deploys to| H
  H{Pacman repository}
  C --> |backed by| I
  F --> |backed by| I
  I[[Azure Function]]
  D --> |running in| J
  G --> |running in| J
  J[[git-for-windows-automation]]
  K[[git-sdk-32<br />git-sdk-64<br />git-sdk-arm64]] --> |syncing from| H
  B --> |/add release note| L
  L[`add-release-note`]

For the curious mind, here are detailed instructions on how the Azure Function backing the GitForWindowsHelper GitHub App was set up.

The monitor-components workflow

When new versions of components that Git for Windows builds become available, new Pacman packages have to be built. To this end, the monitor-components workflow monitors a couple of RSS feeds and opens new tickets labeled component-update for such new versions.
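The following is a minimal sketch of the idea behind such a monitor, not the actual monitor-components workflow: it parses an RSS 2.0 feed and reports entries that have not been seen before, each as a candidate ticket carrying the component-update label. The function name, the ticket dictionary layout and the sample feed are all made up for illustration.

```python
import xml.etree.ElementTree as ET


def new_component_tickets(rss_xml: str, already_seen: set) -> list:
    """Return one ticket description per feed item whose title is new."""
    root = ET.fromstring(rss_xml)
    tickets = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        if title and title not in already_seen:
            tickets.append({"title": title, "labels": ["component-update"]})
    return tickets


# Made-up feed content, for illustration only.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel>
<title>example-component releases</title>
<item><title>example-component 1.2.0</title></item>
<item><title>example-component 1.1.0</title></item>
</channel></rss>"""

tickets = new_component_tickets(SAMPLE_FEED, {"example-component 1.1.0"})
```

In this sketch, the set of already-seen titles stands in for whatever state a real monitor would consult, such as the list of existing tickets.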

Opening Pull Requests to update Git for Windows' components

After determining that such a ticket indeed indicates the need for a new Pacman package build, a Git for Windows maintainer issues the /open pr command via an issue comment (example), which gets picked up by the GitForWindowsHelper GitHub App, which in turn triggers the open-pr workflow in the git-for-windows-automation repository.

Deploying the Pacman packages

The open-pr workflow opens a Pull Request in one of Git for Windows' repositories. Once the PR build passes, a Git for Windows maintainer issues the /deploy command (example), which gets picked up by the GitForWindowsHelper GitHub App, which in turn triggers the build-and-deploy workflow.

Adding release notes

Finally, once the packages have been built and deployed to the Pacman repository (which is hosted in Azure Blob Storage), a Git for Windows maintainer merges the PR(s), which in turn closes the ticket. The maintainer then issues an /add release note command (example). This again gets picked up by the GitForWindowsHelper GitHub App, which triggers the add-release-note workflow; that workflow creates and pushes a new commit updating the ReleaseNotes.md file in build-extra (example).
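The mapping from the three slash commands above to the workflows they trigger can be sketched as a small dispatch table. This is an illustration only: the function name and the matching rules (first line of the comment, optional trailing arguments) are assumptions, not the GitForWindowsHelper's actual implementation.

```python
# Slash command -> workflow in git-for-windows-automation, as described above.
SLASH_COMMANDS = {
    "/open pr": "open-pr",
    "/deploy": "build-and-deploy",
    "/add release note": "add-release-note",
}


def workflow_for_comment(comment_body: str):
    """Return the workflow to trigger for an issue/PR comment, or None."""
    lines = comment_body.strip().splitlines()
    first_line = lines[0].strip() if lines else ""
    for command, workflow in SLASH_COMMANDS.items():
        # Accept the bare command or the command followed by arguments
        # (argument handling is a hypothetical extension).
        if first_line == command or first_line.startswith(command + " "):
            return workflow
    return None
```

For example, workflow_for_comment("/deploy") yields "build-and-deploy", while an ordinary comment yields None and is ignored.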

Releasing official Git for Windows versions

A relatively infrequent part of Git for Windows' maintainers' duties, if the most rewarding part, is the task of releasing new versions of Git for Windows.

Most commonly, this is done in response to the "upstream" Git project releasing a new version. When that happens, a Git for Windows maintainer runs the helper script to perform a "merging rebase" (i.e. a rebase that starts with a fake-merge of the previous tip commit, to maintain both a clean set of commits as well as a fast-forwarding commit history).
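To see why the fake merge keeps the history fast-forwarding, consider a toy model of the commit graph (this models only parent links; it is not the helper script, and all commit names are made up). With a plain rebase, the previous tip is no longer reachable from the new tip; with a merging rebase, the fake merge records the previous tip as a parent, so it stays reachable.

```python
def is_ancestor(parents: dict, ancestor: str, commit: str) -> bool:
    """Return True if `ancestor` is reachable from `commit` via parent links."""
    stack, seen = [commit], set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return False


# commit -> list of parent commits (hypothetical names)
parents = {
    "v1": [],                      # old upstream version
    "v2": ["v1"],                  # new upstream version
    "a": ["v1"], "b": ["a"],       # downstream patches; "b" is the previous tip
    # Plain rebase onto v2: the old commits are simply left behind.
    "a-rebased": ["v2"], "b-rebased": ["a-rebased"],
    # Merging rebase: start with a fake merge of the previous tip, then replay.
    "fake-merge": ["v2", "b"],
    "a'": ["fake-merge"], "b'": ["a'"],
}

plain_is_fast_forward = is_ancestor(parents, "b", "b-rebased")
merging_is_fast_forward = is_ancestor(parents, "b", "b'")
```

Here plain_is_fast_forward is False and merging_is_fast_forward is True, while the commits after the fake merge still form a clean series on top of the new upstream version.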

Once that is done, the maintainer will open a Pull Request to benefit from the automated builds and tests (example) as well as from reviews of the range-diff relative to the current main branch.

Once everything looks good, the maintainer will issue the /git-artifacts command (example). This will trigger an automated workflow that builds all of the release artifacts: installers, Portable Git, MinGit, a .tar.xz archive and a NuGet package. Apart from the NuGet package, two sets of artifacts are built: one targeting 32-bit ("x86") and one targeting 64-bit ("amd64").

Once these artifacts are built, the maintainer will download the installer and run the "pre-flight checklist".

If everything looks good, the maintainer issues the /release command, which triggers yet another workflow. That workflow downloads the just-built-and-verified release artifacts, publishes them as a new GitHub release, publishes the NuGet packages, deploys the Pacman packages to the Pacman repository, sends out an announcement mail, and updates the respective repositories, including Git for Windows' website.

As mentioned before, the /git-artifacts and /release commands are picked up by the GitForWindowsHelper GitHub App which subsequently triggers the respective workflows in the git-for-windows-automation repository. Here is a diagram:

graph LR
  A{Pull Request<br />updating to<br />new Git version} --> |/git-artifacts| B
  B((GitForWindowsHelper)) --> |triggers| C
  C[`tag-git`] --> |upon successful build<br />triggers| D
  D((GitForWindowsHelper)) --> |triggers| E
  E[`git-artifacts`]
  E --> |maintainer verifies artifacts| E
  A --> |upon verified `git-artifacts`<br />/release| F
  F[`release-git`]
  C --> |running in| J
  E --> |running in| J
  F --> |running in| J
  J[[git-for-windows-automation]]

Managing Windows/ARM64 builds

The GitForWindowsHelper comes in really handy for Git for Windows' Pacman packages for the aarch64 architecture, i.e. for Windows/ARM64. These packages cannot be built in regular hosted GitHub Actions runners because there are none of that architecture. To help with that, the respective workflows in git-for-windows-automation specify runs-on: ["Windows", "ARM64"] to indicate that they need a self-hosted Windows/ARM64 runner.

It would not be cost-effective to have a VM running permanently to host such a self-hosted runner: Git for Windows does not build such packages often enough to justify it (once or twice per week is the norm).

Therefore, VMs providing self-hosted GitHub Actions runners are spun up and torn down as needed. This job is done by the GitForWindowsHelper:

  • When a job is queued requesting the above-mentioned labels, the create-self-hosted-runner workflow is started. This deploys an Azure Resource Manager template that creates an ephemeral self-hosted runner (i.e. a runner that will pick up one job and then is immediately unregistered).

  • When a job with the above-mentioned labels has finished, the GitForWindowsHelper triggers the delete-self-hosted-runner workflow that tears down the now-unused VM.

The GitForWindowsHelper GitHub App will also detect when a job is queued for a PR from a forked repository. This is considered unauthorized use, and the job will be canceled immediately by the GitHub App instead of spinning up a self-hosted runner for it.
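The runner lifecycle described above boils down to a small decision table over job events. The sketch below is an illustration under stated assumptions, not the GitForWindowsHelper's code: the action values "queued" and "completed" match GitHub's workflow_job webhook event, while the function name, the from_fork parameter and the returned strings "cancel-job" and "ignore" are hypothetical.

```python
# Labels that mark a job as needing a self-hosted Windows/ARM64 runner.
REQUIRED_LABELS = {"Windows", "ARM64"}


def decide(action: str, labels: list, from_fork: bool) -> str:
    """Map a job event to what the helper should do in this sketch."""
    if not REQUIRED_LABELS.issubset(labels):
        return "ignore"  # job does not ask for a Windows/ARM64 runner
    if action == "queued":
        # Jobs for PRs from forks are canceled instead of getting a VM.
        return "cancel-job" if from_fork else "create-self-hosted-runner"
    if action == "completed":
        # No VM was created for canceled fork jobs, so nothing to tear down.
        return "ignore" if from_fork else "delete-self-hosted-runner"
    return "ignore"
```

Because each runner is ephemeral, every queued job maps to exactly one create and one delete, which keeps the VM cost proportional to the number of builds.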