Make reloadHistory tip hiding much faster #2707
Open
Summary: Reloading the history in repos with many commits and many branches can be problematically slow. This commit changes the way commits are hidden from libgit2's revwalk, making the process vastly faster: in the Swift repo, a reload goes from roughly 3.2s to just 70ms.
Intro
Whenever the repository's history changes (either externally, or after manipulating commits within the app itself), GitUpKit's GCLiveRepository updates its internal cache of the repo's history. Since this process occurs very often, including when reacting to user input, it needs to be extremely fast.
To a large degree, it already is! But on repos with many commits and many branches (e.g. the Swift repo), a history reload could take a very long time (e.g. around three seconds on an M1 Pro MBP).
Context
To update its history cache, GitUpKit asks libgit2 to walk over the new commits.
When using libgit2's revision walker, you provide it with a list of tips to explore (“I want callbacks for all parents of that tip”) and a list of tips to ignore (“do not tell me about the parents of that tip, if they ever come up while exploring”). Intuitively, the more you add to that ignore list, the faster the iteration. In practice, the opposite was true!
When first starting a walk, libgit2 iterates over every parent of every ignored tip, to mark these commits as ignored: this way, if they come up later, while exploring the parents of the interesting tips, it'll know not to return them.
What this means is that at best, adding ignored tips has no effect on performance (their parents would have been walked anyway), and at worst, it gives libgit2 a lot more work to do. This is what was happening here.
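To make the cost concrete, here is a toy model of that behavior. It is only a sketch of the idea described above, not libgit2's actual implementation: the integer commit IDs, `linear_history` and `walk_with_hidden_tips` are all made up for illustration.

```python
from collections import deque


def linear_history(n):
    """Toy repo: commit i has the single parent i - 1; commit 0 is the root."""
    return {i: ([i - 1] if i > 0 else []) for i in range(n)}


def walk_with_hidden_tips(parents, push_tips, hide_tips):
    """Return (new_commits, commits_touched) for a walk that pre-marks hidden ancestors."""
    touched = 0
    hidden = set()

    # Phase 1: before yielding anything, visit every ancestor of every hidden tip.
    queue = deque(hide_tips)
    while queue:
        commit = queue.popleft()
        if commit in hidden:
            continue
        hidden.add(commit)
        touched += 1
        queue.extend(parents[commit])

    # Phase 2: explore from the interesting tips, stopping at hidden commits.
    result, seen = [], set()
    queue = deque(push_tips)
    while queue:
        commit = queue.popleft()
        if commit in seen or commit in hidden:
            continue
        seen.add(commit)
        touched += 1
        result.append(commit)
        queue.extend(parents[commit])

    return result, touched


history = linear_history(10_000)
new_commits, touched = walk_with_hidden_tips(history, push_tips=[9_999], hide_tips=[9_997])
print(new_commits, touched)  # [9999, 9998] 10000
```

Only two commits are new, yet the walk touches all 10,000: nearly all of the work is spent marking the ancestors of the ignored tip.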
Solution
Thankfully there's an incredibly, almost suspiciously simple solution: libgit2 revision walkers can be provided with an ignore callback (libgit2 calls it a hide callback, registered with `git_revwalk_add_hide_cb`). The walker invokes this callback for every parent it explores, to ask whether that commit should be ignored. It's a second way of ignoring commits, except this one causes no extra iteration: the burden is on us, the callback provider, to respond quickly.
As it turns out, GitUpKit already has the perfect data structure for this, as it keeps a map of all currently known commits, indexed by OID. The callback only has to check whether the commit is in that map. This allows us to very quickly skip entire parent hierarchies, as libgit2 does not explore the parents of ignored commits, either.
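The same kind of toy model shows why the callback approach is cheap. Again, this is a sketch under assumptions rather than GitUpKit's or libgit2's real code: `walk_with_hide_callback`, `known_commits` and the integer commit IDs are hypothetical stand-ins for the walker, the OID-indexed map and the OIDs.

```python
from collections import deque


def linear_history(n):
    """Toy repo: commit i has the single parent i - 1; commit 0 is the root."""
    return {i: ([i - 1] if i > 0 else []) for i in range(n)}


def walk_with_hide_callback(parents, push_tips, should_hide):
    """Return (new_commits, commits_touched) for a walk that asks a callback per commit."""
    touched = 0
    result, seen = [], set()
    queue = deque(push_tips)
    while queue:
        commit = queue.popleft()
        if commit in seen:
            continue
        seen.add(commit)
        touched += 1
        if should_hide(commit):
            # Hidden commit: skip it and never enqueue its parents.
            continue
        result.append(commit)
        queue.extend(parents[commit])
    return result, touched


history = linear_history(10_000)

# Hypothetical stand-in for the map of already-known commits (0 through 9997).
known_commits = {oid: f"commit-{oid}" for oid in range(9_998)}

new_commits, touched = walk_with_hide_callback(
    history,
    push_tips=[9_999],
    should_hide=lambda oid: oid in known_commits,  # constant-time lookup
)
print(new_commits, touched)  # [9999, 9998] 3
```

The walk returns the same two new commits but touches only three: the two new ones, plus the first known commit, where the callback cuts off the whole parent hierarchy.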
In informal tests on the Swift repo, this change causes the history reload to go from roughly 3.2 seconds to just 70 milliseconds. That's a speedup of about 45×!