Optimization batch 13: partial clone optimizations for merge-ort #969
Conversation
/submit
Submitted as pull.969.git.1622856485.gitgitgadget@gmail.com
Force-pushed from 01352fc to 6de569e.
Signed-off-by: Elijah Newren <newren@gmail.com>
Force-pushed from 3b66c8c to 317bcc7.
/submit
Submitted as pull.969.v2.git.1623796907.gitgitgadget@gmail.com
This branch is now known as
This patch series was integrated into seen via git@5dc9176.
On the Git mailing list, Derrick Stolee wrote (reply to this):
This patch series was integrated into seen via git@ef2357d.
There was a status update in the "New Topics" section about the branch: Performance tweaks of "git merge -s ort" around lazy fetching of objects.
@@ -0,0 +1,439 @@
#!/bin/sh
On the Git mailing list, Junio C Hamano wrote (reply to this):
"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> + for i in `test_seq 1 88`; do
> + echo content $i >dir/unchanged/file_$i
> + done &&
for i in $(...)
do
... || return 1
done &&
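Spelled out, that suggested shape applied to the quoted loop would look something like this (a sketch; the enclosing test's &&-chain is assumed):

	for i in $(test_seq 1 88)
	do
		echo content $i >dir/unchanged/file_$i || return 1
	done &&

The "|| return 1" matters because a failure inside the loop body would otherwise be masked: the loop's exit status only reflects its final iteration.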
> +test_expect_merge_algorithm failure failure 'Objects downloaded for single relevant rename' '
> + test_setup_repo &&
> + git clone --sparse --filter=blob:none "file://$(pwd)/server" objects-single &&
> + (
> + cd objects-single &&
> +
> + git rev-list --objects --all --missing=print |
> + grep '\?' >missing-objects-before &&
The closing and reopening of single quote here is somewhat
misleading. Wouldn't
grep "?" >... &&
work just as well?
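That is, inside the single-quoted test body the double-quoted pattern avoids closing and reopening the quote; a sketch of the suggested form:

	git rev-list --objects --all --missing=print |
		grep "?" >missing-objects-before &&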
> + git rev-list --objects --all --missing=print |
> + grep ^? >missing-objects-after &&
> + test_cmp missing-objects-before missing-objects-after |
> + grep "^[-+]?" >found-and-new-objects &&
I do not think we specify what test_cmp's output looks like; we only
guarantee that it is the right tool to use for expecting two paths
have the same contents, and it may give some human readable output.
Do not assume you can grep in the output; use "diff -u" if you mean
you expect things that exist on the left side and missing from the
right hand side are prefixed with '-' and vice versa for '+'.
Or sort "missing-objects-after" and "missing-objects-before" and run
"comm" on them, which might be more stable. Depending on the order
two invocations of rev-list spew out the "missing" objects, you may
even see the same object mentioned with "-" and "+" in the diff output
if lines appear to be moved around.
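A sketch of that sort-and-comm approach, reusing the file names from the quoted test (the specific comm flags are my reading of the suggestion, not from the thread):

	sort missing-objects-before >before.sorted &&
	sort missing-objects-after >after.sorted &&
	# lines only in "before": objects no longer missing, i.e. fetched
	comm -23 before.sorted after.sorted >fetched-objects &&
	# lines only in "after" would be NEW missing objects; expect none
	comm -13 before.sorted after.sorted >new-missing &&
	test_must_be_empty new-missing &&
	test_line_count = 3 fetched-objects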
> + # We should not have any NEW missing objects
> + ! grep ^+ found-and-new-objects &&
> + # Fetched 2+1=3 objects, so should have 3 fewer missing objects
> + test_line_count = 3 found-and-new-objects
> + )
> +'
I think similar comments apply to the tests in the remainder of the
patch.
Thanks.
@@ -29,6 +29,7 @@
#include "entry.h"
On the Git mailing list, Junio C Hamano wrote (reply to this):
"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> + /* Ignore clean entries */
> + if (ci->merged.clean)
> + continue;
> +
> + /* Ignore entries that don't need a content merge */
> + if (ci->match_mask || ci->filemask < 6 ||
> + !S_ISREG(ci->stages[1].mode) ||
> + !S_ISREG(ci->stages[2].mode) ||
> + oideq(&ci->stages[1].oid, &ci->stages[2].oid))
> + continue;
> +
> + /* Also don't need content merge if base matches either side */
> + if (ci->filemask == 7 &&
> + S_ISREG(ci->stages[0].mode) &&
> + (oideq(&ci->stages[0].oid, &ci->stages[1].oid) ||
> + oideq(&ci->stages[0].oid, &ci->stages[2].oid)))
> + continue;
Even though this is unlikely to change, it is unsatisfactory that we
reproduce the knowledge on the situations when a merge will
trivially resolve and when it will need to go content level.
One obvious way to solve it would be to fold this logic into the
main code that actually merges a list of "ci"s by making it a two
pass process (the first pass does essentially the same as this new
function, the second pass does the tree-level merge where the above
says "continue", fills mmfiles with the loop below, and calls into
ll_merge() after the loop to merge), but the logic duplication is
not too big and it may not be worth such a code churn.
> + for (i = 0; i < 3; i++) {
> + unsigned side_mask = (1 << i);
> + struct version_info *vi = &ci->stages[i];
> +
> + if ((ci->filemask & side_mask) &&
> + S_ISREG(vi->mode) &&
> + oid_object_info_extended(opt->repo, &vi->oid, NULL,
> + OBJECT_INFO_FOR_PREFETCH))
> + oid_array_append(&to_fetch, &vi->oid);
> + }
> + }
> +
> + promisor_remote_get_direct(opt->repo, to_fetch.oid, to_fetch.nr);
> + oid_array_clear(&to_fetch);
> +}
> +
On the Git mailing list, Elijah Newren wrote (reply to this):
On Wed, Jun 16, 2021 at 10:04 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > + /* Ignore clean entries */
> > + if (ci->merged.clean)
> > + continue;
> > +
> > + /* Ignore entries that don't need a content merge */
> > + if (ci->match_mask || ci->filemask < 6 ||
> > + !S_ISREG(ci->stages[1].mode) ||
> > + !S_ISREG(ci->stages[2].mode) ||
> > + oideq(&ci->stages[1].oid, &ci->stages[2].oid))
> > + continue;
> > +
> > + /* Also don't need content merge if base matches either side */
> > + if (ci->filemask == 7 &&
> > + S_ISREG(ci->stages[0].mode) &&
> > + (oideq(&ci->stages[0].oid, &ci->stages[1].oid) ||
> > + oideq(&ci->stages[0].oid, &ci->stages[2].oid)))
> > + continue;
>
> Even though this is unlikely to change, it is unsatisfactory that we
> reproduce the knowledge on the situations when a merge will
> trivially resolve and when it will need to go content level.
I agree, it's not the nicest.
> One obvious way to solve it would be to fold this logic into the
> main code that actually merges a list of "ci"s by making it a two
> pass process (the first pass does essentially the same as this new
> function, the second pass does the tree-level merge where the above
> says "continue", fills mmfiles with the loop below, and calls into
> ll_merge() after the loop to merge), but the logic duplication is
> not too big and it may not be worth such a code churn.
I'm worried even more about the resulting complexity than the code
churn. The two-pass model, which I considered, would require special
casing so many of the branches of process_entry() that it feels like
it'd be increasing code complexity more than introducing a function
with a few duplicated checks. process_entry() was already a function
that Stolee reported as coming across as pretty complex to him in
earlier rounds of review, but that seems to just be intrinsic based on
the number of special cases: handling anything from entries with D/F
conflicts, to different file types, to match_mask being precomputed,
to recursive vs. normal cases, to modify/delete, to normalization, to
added on one side, to deleted on both sides, to three-way content
merges. The three-way content merges are just one of 9-ish different
branches, and are the only one that we're prefetching for. It just
seems easier and cleaner overall to add these three checks to pick off
the cases that will end up going through the three-way content merges.
I've looked at it again a couple times over the past few days based on
your comment, but I still can't see a way to restructure it that feels
cleaner than what I've currently got.
Also, it may be worth noting here that if these checks fell out of
date with process_entry() in some manner, it still would not affect
the correctness of the code. At worst, it'd only affect whether
enough or too many objects are prefetched. If too many, then some
extra objects would be downloaded, and if too few, then we'd end up
fetching additional objects 1-by-1 on demand later.
So I'm going to agree with the not-worth-it portion of your final
sentence and leave this out of the next roll.
On the Git mailing list, Junio C Hamano wrote (reply to this):
This patch series was integrated into seen via git@16bdd87.
This patch series was integrated into seen via git@488ea58.
Signed-off-by: Elijah Newren <newren@gmail.com>
estimate_similarity() was setting up a diff_populate_filespec_options every time it was called, requiring the caller of estimate_similarity() to pass in some data needed to set up this option. Currently the needed data consisted of a single variable (skip_unmodified), but we want to also have the different estimate_similarity() callsites start using different missing_object_cb functions as well. Rather than also passing that data in, just have the caller pass in the whole diff_populate_filespec_options, and reduce the number of times we need to set it up.

As a side note, this also drops the number of calls to has_promisor_remote() dramatically. If L is the number of basename paths to compare, M is the number of inexact sources, and N is the number of inexact destinations, then the number of calls to has_promisor_remote() drops from L+M*N down to at most 2 -- one for each of the sites that calls estimate_similarity(). has_promisor_remote() is a very fast function so this almost certainly has no measurable performance impact, but it seems cleaner to avoid calling that function so many times.

Signed-off-by: Elijah Newren <newren@gmail.com>
merge-ort was designed to minimize the amount of data needed and used, and several changes were made to diffcore-rename to take advantage of extra metadata to enable this data minimization (particularly the relevant_sources variable for skipping "irrelevant" renames). This effort obviously succeeded in drastically reducing computation times, but should also theoretically allow partial clones to download much less information. Previously, though, the "prefetch" command used in diffcore-rename had never been modified and downloaded many blobs that were unnecessary for merge-ort. This commit corrects that.

When doing basename comparisons, we want to fetch only the objects that will be used for basename comparisons. If after basename fetching this leaves us with no more relevant sources (or no more destinations), then we won't need to do the full inexact rename detection and can skip downloading additional source and destination files. Even if we have to do that later full inexact rename detection, irrelevant sources are culled after basename matching and before the full inexact rename detection, so we can still avoid downloading the blobs for irrelevant sources.

Rename prefetch() to inexact_prefetch(), and introduce a new basename_prefetch() to take advantage of this.

If we modify the testcase from commit 557ac03 ("merge-ort: begin performance work; instrument with trace2_region_* calls", 2021-01-23) to pass --sparse --filter=blob:none to the clone command, and use the new trace2 "fetch_count" output from a few commits ago to track both the number of fetch subcommands invoked and the number of objects fetched across all those fetches, then for the mega-renames testcase we observe the following:

BEFORE this commit, rebasing 35 patches:

	strategy     # of fetches    total # of objects fetched
	---------    ------------    --------------------------
	recursive    62              11423
	ort          30              11391

AFTER this commit, rebasing the same 35 patches:

	strategy     # of fetches    total # of objects fetched
	---------    ------------    --------------------------
	ort          32              63

This means that the new code only needs to download less than 2 blobs per patch being rebased. That is especially interesting given that the repository at the start only had approximately half a dozen TOTAL blobs downloaded to start with (because the default sparse-checkout of just the toplevel directory was in use). So, for this particular linux kernel testcase that involved ~26,000 renames on the upstream side (drivers/ -> pilots/) across which 35 patches were being rebased, this change reduces the number of blobs that need to be downloaded by a factor of ~180.

Signed-off-by: Elijah Newren <newren@gmail.com>
Commit 7fbbcb2 ("diff: batch fetching of missing blobs", 2019-04-05) introduced batching of fetching missing blobs, so that the diff machinery would have one fetch subprocess grab N blobs instead of N processes each grabbing 1.

However, the diff machinery is not the only thing in a merge that needs to work on blobs. The 3-way content merges need them as well. Rather than download all the blobs 1 at a time, prefetch all the blobs needed for regular content merges.

This does not cover all possible paths in merge-ort that might need to download blobs. Others include:

  - The blob_unchanged() calls to avoid modify/delete conflicts (when blob renormalization results in an "unchanged" file)

  - Preliminary content merges needed for rename/add and rename/rename(2to1) style conflicts. (Both of these types of conflicts can result in nested conflict markers from the need to do two levels of content merging; the first happens before our new prefetch_for_content_merges() function.)

The first of these wouldn't be an extreme amount of work to support, and even the second could be theoretically supported in batching, but all of these cases seem unusual to me, and this is a minor performance optimization anyway; in the worst case we only get some of the fetches batched and have a few additional one-off fetches. So for now, just handle the regular 3-way content merges in our prefetching.

For the testcase from the previous commit, the number of downloaded objects remains at 63, but this drops the number of fetches needed from 32 down to 20, a sizeable reduction.

Signed-off-by: Elijah Newren <newren@gmail.com>
Force-pushed from 317bcc7 to 69011cf.
/submit
Submitted as pull.969.v3.git.1624349082.gitgitgadget@gmail.com
On the Git mailing list, Derrick Stolee wrote (reply to this):
On the Git mailing list, Elijah Newren wrote (reply to this):
On the Git mailing list, Derrick Stolee wrote (reply to this):
On the Git mailing list, Elijah Newren wrote (reply to this):
On the Git mailing list, Elijah Newren wrote (reply to this):
This patch series was integrated into seen via git@2ddf193.
This patch series was integrated into seen via git@a530a1f.
This patch series was integrated into seen via git@0ded682.
There was a status update in the "Cooking" section about the branch: Performance tweaks of "git merge -s ort" around lazy fetching of objects.
This patch series was integrated into seen via git@5aad55f.
This patch series was integrated into seen via git@e050eef.
This patch series was integrated into seen via git@2601db3.
This patch series was integrated into seen via git@e865939.
There was a status update in the "Cooking" section about the branch: Performance tweaks of "git merge -s ort" around lazy fetching of objects. Will merge to 'next'.
This patch series was integrated into seen via git@a1c4214.
This patch series was integrated into seen via git@adbc70a.
This patch series was integrated into next via git@39aad12.
There was a status update in the "Cooking" section about the branch: Performance tweaks of "git merge -s ort" around lazy fetching of objects. Will merge to 'master'.
This patch series was integrated into seen via git@f9d5d2c.
There was a status update in the "Cooking" section about the branch: Performance tweaks of "git merge -s ort" around lazy fetching of objects. Will merge to 'master'.
This patch series was integrated into seen via git@fdbcdfc.
This patch series was integrated into next via git@fdbcdfc.
This patch series was integrated into master via git@fdbcdfc.
Closed via fdbcdfc.
This series optimizes blob downloading in merges for partial clones.
It applies on master and is independent of ort-perf-batch-12.
Changes since v2:
Changes since v1:
=== High level summary ===
inexact renames for a while.
prefetch() provides.
In the worst case, the above means:
However, in practice, since rename detection can usually quit after
find_basename_matches() (usually due to the irrelevance check that
cannot be performed until after find_basename_matches()), far fewer
objects need to be fetched than before.
Adding some prefetching to merge-ort.c allows us to also drop the
number of downloads overall.
=== Modified performance measurement method ===
The testcases I've been using so far to measure performance were not
run in a partial clone, so they aren't directly usable for comparison.
Further, partial clone performance depends on network speed which can
be highly variable. So I want to modify one of the existing testcases
slightly and focus on two different but more stable metrics:
  * the number of git fetch operations performed during the rebase
  * the number of objects fetched during the rebase

The first of these should already be decent due to Jonathan Tan's work
to batch fetching of missing blobs during rename detection (see commit
7fbbcb2 ("diff: batch fetching of missing blobs", 2019-04-05)), so
we are mostly looking to optimize the second but would like to also
decrease the first if possible.
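For what it's worth, one way to gather both numbers is trace2's event stream (a sketch; the series' "fetch_count" instrumentation is not shown in this thread, so the grep pattern below is an assumption and "$upstream" is a placeholder):

	# run the rebase with trace2 event output captured to a file
	GIT_TRACE2_EVENT="$PWD/rebase-trace.json" git rebase "$upstream" &&
	# count the fetch subprocesses that were spawned
	grep -c '"child_start".*"fetch"' rebase-trace.json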
The testcase we will look at will be a modification of the
mega-renames testcase from commit 557ac03 ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28).
In particular, we change
to
(The change in clone URL is just to get a server that supports the filter predicate.)
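Concretely, the shape of the change is something like the following; the exact commands and URLs are elided above, so this is only an illustration with "$url" as a placeholder (per the commit message quoted earlier, the new invocation adds --sparse --filter=blob:none):

	# before: an ordinary full clone of the kernel repo
	git clone "$url/linux.git" &&
	# after: a blob-less, sparse partial clone from a server that
	# supports the blob:none filter
	git clone --sparse --filter=blob:none "$url/linux.git"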
We otherwise keep the test the same (in particular, we do not add any
calls to "git sparse-checkout {set,add}", which means that the resulting
repository will only have 7 total blobs from files in the toplevel
directory before starting the rebase).
=== Results ===
For the mega-renames testcase noted above (which rebases 35 commits
across an upstream with ~26K renames in a partial clone), I found the
following results for our metrics of interest:

	strategy                    # of git fetch ops    # of objects fetched
	-------------------------   ------------------    --------------------
	merge-recursive             62                    11423
	merge-ort (before series)   30                    11391
	merge-ort (after series)    20                    63
So, we have a significant reduction (factor of ~3 relative to
merge-recursive) in the number of git fetch operations that have to
be performed in a partial clone to complete the rebase, and a dramatic
reduction (factor of ~180) in the number of objects that need to be
fetched.
=== Summary ===
It's worth pointing out that merge-ort after the series needs only
~1.8 blobs per commit being transplanted to complete this particular
rebase. Essentially, this reinforces the fact that the optimization work
so far has taken rename detection from often being an overwhelmingly
costly portion of a merge (leading many to just capitulate on it), to
what I have observed in my experience so far as being just a minor
cost for merges.
cc: Jonathan Tan jonathantanmy@google.com
cc: Derrick Stolee dstolee@gmail.com
cc: Taylor Blau me@ttaylorr.com
cc: Derrick Stolee stolee@gmail.com
cc: Elijah Newren newren@gmail.com