Skip to content

pr-2149/spkrka/side-exhaust-pr-v1

tagged this 20 Jun 10:37
Hi,

This follows up on my RFC [1] with a concrete proposal. I expect the design
to still be scrutinized, but that may be easier with actual code to look at.

I tried to make this easier to review by splitting into atomic patches. The
first two patches are the meatiest parts, though they are pure refactoring.
The behavior change is in patch 3 and is in itself quite small. The last
patch adds technical documentation to support future development.

----------------------------------------------------------------------------

Optimize paint_down_to_common() for merge-base queries that hit large
one-sided histories.

When the walk from one side reaches a commit with a very low generation
number that the other side never paints, the walk is forced to drain most of
the graph. A common trigger is a repository import that grafts a separate
history with its own root, but any merge that introduces a low-generation
commit never painted by the other side has the same effect.

A new merge-base candidate can only be discovered when exclusive PARENT1 and
PARENT2 paint meet. This series teaches paint_down_to_common() to stop as
soon as one side has no exclusive commits left in the queue; once one side
is exhausted, no further candidates can appear.

  origin/HEAD  o   o  PR HEAD
               |   |
     (import)  o   :
              / \ /
             |   o  merge-base
             |   |
             :   :  (~2.5M commits)
             |   |
  import root   main root

In the RFC thread [1], Derrick Stolee provided a criss-cross counterexample
that sharpened the halt condition, and Elijah Newren independently
discovered the same optimization and shared an implementation in PR #2150
[2]. Patch 4 incorporates test cases from Elijah's branch.

This series implements the optimization only after the walk enters the
finite-generation region, where generation ordering guarantees that paint on
visited commits is final.

Patch layout:

1/6 commit-reach: decouple ahead_behind from nonstale_queue 2/6
commit-reach: introduce paint_queue and per-side counters 3/6 commit-reach:
stop the walk when one side is exhausted 4/6 t6600: add side-exhaustion
edge-case tests 5/6 t6099, t6600: add side-exhaustion regression tests 6/6
Documentation/technical: document paint_down_to_common()

Benchmarks

Measured on a 2.6M-commit monorepo with commit-graph (baseline v2.55-rc1):

merge-base --all  (across import)       4.293s ->    8ms  (537x)
merge-tree        (across import)       5.345s ->   13ms  (411x)
merge-base --all  (1000 commits apart)  5.404s ->    7ms  (772x)

No regression on linux.git (1.4M commits, commit-graph):

merge-base HEAD HEAD~1000                 38ms ->   40ms
merge-base --all HEAD HEAD~1000           87ms ->   36ms
merge-base --is-ancestor HEAD~1000 HEAD   11ms ->   11ms
merge-base --all HEAD HEAD~10000         626ms ->  428ms

[1]
https://lore.kernel.org/git/CAL71e4Ps-2_0+uuZu43N9pFnXBemoAohPs_eyRJf8taXHJPAXQ@mail.gmail.com/T/#u
[2] https://github.com/gitgitgadget/git/pull/2150

Elijah Newren (1):
  t6600: add test cases for side-exhaustion edge cases

Kristofer Karlsson (5):
  commit-reach: decouple ahead_behind from nonstale_queue
  commit-reach: introduce struct paint_queue with per-side counters
  commit-reach: terminate merge-base walk when one paint side is
    exhausted
  t6099, t6600: add side-exhaustion regression tests
  Documentation/technical: add paint-down-to-common doc

 Documentation/Makefile                        |   1 +
 Documentation/technical/meson.build           |   1 +
 .../technical/paint-down-to-common.adoc       | 130 ++++++++++++++
 commit-reach.c                                | 159 +++++++++++-------
 t/meson.build                                 |   1 +
 t/t6099-merge-base-side-exhaustion.sh         |  82 +++++++++
 t/t6600-test-reach.sh                         | 136 +++++++++++++++
 7 files changed, 451 insertions(+), 59 deletions(-)
 create mode 100644 Documentation/technical/paint-down-to-common.adoc
 create mode 100755 t/t6099-merge-base-side-exhaustion.sh

base-commit: 8d96f09e9245ddf80c1981476fcbac8c4bb4125f

Submitted-As: https://lore.kernel.org/git/pull.2149.git.1781951820.gitgitgadget@gmail.com
Assets 2
Loading