Final optimization batch (#15): use memory pools #990

newren · 2021-07-01T15:23:13Z

This series textually depends on en/ort-perf-batch-14, but the ideas are
orthogonal to it and orthogonal to previous series. It can be reviewed
independently.

Changes since v1, addressing Eric's feedback:

Fixed a comment that became out-of-date in patch 1
Swapped commits 2 and 3 so that one can better motivate the other.

Changes since v2, addressing Peff's feedback:

Rebased on en/ort-perf-batch-14 (resolving a trivial conflict with the new string_list_init_nodup() usage)
Added a new preliminary patch renaming str*_func() to str*_clear_func()
Added a new final patch that hardcodes that we'll just use memory pools

Changes since v3, as per Peff's feedback:

Don't only remove the extra complexity from the USE_MEMORY_POOL #define; also remove the original bookkeeping complexity needed to track individual frees when not using a memory pool.

Changes since v4:

Add Acked-bys from both Stolee and Peff

=== Basic Optimization idea ===

In this series, I make use of memory pools to get faster allocations and
deallocations for many data structures that tend to all be deallocated
at the same time anyway.

=== Results ===

For the testcases mentioned in commit 557ac03 ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
the changes in just this series improves the performance as follows:

                     Before Series           After Series
no-renames:      204.2  ms ±  3.0  ms    198.3 ms ±  2.9 ms
mega-renames:      1.076 s ±  0.015 s    661.8 ms ±  5.9 ms
just-one-mega:   364.1  ms ±  7.0  ms    264.6 ms ±  2.5 ms

As a reminder, before any merge-ort/diffcore-rename performance work,
the performance results we started with were:

no-renames-am:      6.940 s ±  0.485 s
no-renames:        18.912 s ±  0.174 s
mega-renames:    5964.031 s ± 10.459 s
just-one-mega:    149.583 s ±  0.751 s

=== Overall Results across all optimization work ===

This is my final prepared optimization series. It might be worth
reviewing how my optimizations fared overall, comparing the original
merge-recursive timings with three things: how much merge-recursive
improved (as a side-effect of optimizing merge-ort), how much
improvement we would have gotten from a hypothetical infinite
parallelization of rename detection, and what I achieved at the end with
merge-ort:

                               Timings

                                          Infinite
                 merge-       merge-     Parallelism
                recursive    recursive    of rename    merge-ort
                 v2.30.0      current     detection     current
                ----------   ---------   -----------   ---------
no-renames:       18.912 s    18.030 s     11.699 s     198.3 ms
mega-renames:   5964.031 s   361.281 s    203.886 s     661.8 ms
just-one-mega:   149.583 s    11.009 s      7.553 s     264.6 ms

                           Speedup factors

                                          Infinite
                 merge-       merge-     Parallelism
                recursive    recursive    of rename
                 v2.30.0      current     detection    merge-ort
                ----------   ---------   -----------   ---------
no-renames:         1           1.05         1.6           95
mega-renames:       1          16.5         29           9012
just-one-mega:      1          13.6         20            565

And, for partial clone users:

             Factor reduction in number of objects needed

                                          Infinite
                 merge-       merge-     Parallelism
                recursive    recursive    of rename
                 v2.30.0      current     detection    merge-ort
                ----------   ---------   -----------   ---------
mega-renames:       1            1            1          181.3

=== Caveat ===

It may be worth noting, though, that my optimization numbers above for
merge-ort use test-tool fast-rebase. git rebase -s ort on the
three testcases above is 5-20 times slower (taking 3.835s, 6.798s, and
1.235s, respectively). At this point, any further optimization work
should go into making a faster full-featured rebase by copying the
ideas from fast-rebase: avoid unnecessary process forking, avoid
updating the index and working copy until either the rebase is
finished or you hit a conflict (and don't write rebase metadata to
disk until that point either), get rid of the glacially slow revision
walking of the upstream side of history (nuke can_fast_forward(), make
--reapply-cherry-picks the default) or at least don't revision walk so
many times (multiple calls to get_merge_bases in can_fast_forward()
plus a is_linear_history() walk, checking for upstream cherry-picks,
probably more), turn off per-commit hooks that probably should have
never been on anyway, etc.

cc: Jeff King peff@peff.net
cc: Eric Sunshine sunshine@sunshineco.com
cc: Elijah Newren newren@gmail.com
cc: Derrick Stolee stolee@gmail.com
cc: Ævar Arnfjörð Bjarmason avarab@gmail.com

newren · 2021-07-23T12:53:50Z

/submit

gitgitgadget · 2021-07-23T12:55:12Z

Submitted as pull.990.git.1627044897.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git pr-990/newren/ort-perf-batch-15-v1

To fetch this version to local tag pr-990/newren/ort-perf-batch-15-v1:

git fetch --no-tags https://github.com/gitgitgadget/git tag pr-990/newren/ort-perf-batch-15-v1

diffcore-rename.c

gitgitgadget · 2021-07-23T22:01:38Z

User Eric Sunshine <sunshine@sunshineco.com> has been added to the cc: list.

gitgitgadget · 2021-07-23T22:04:36Z

User Elijah Newren <newren@gmail.com> has been added to the cc: list.

merge-ort.c

gitgitgadget · 2021-07-23T23:22:51Z

This branch is now known as en/ort-perf-batch-15.

gitgitgadget · 2021-07-23T23:22:52Z

This patch series was integrated into seen via git@0626b7e.

merge-ort.c

gitgitgadget · 2021-07-26T14:38:00Z

User Derrick Stolee <stolee@gmail.com> has been added to the cc: list.

gitgitgadget · 2021-07-26T14:47:03Z

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 7/23/2021 8:54 AM, Elijah Newren via GitGitGadget wrote:
...
> === Basic Optimization idea ===
> 
> In this series, I make use of memory pools to get faster allocations and
> deallocations for many data structures that tend to all be deallocated at
> the same time anyway.

Makes sense. This is appropriate for a final optimization, since the gains
tend to be quite small.

> === Results ===
> 
> For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
> performance work; instrument with trace2_region_* calls", 2020-10-28), the
> changes in just this series improves the performance as follows:
> 
>                      Before Series           After Series
> no-renames:      204.2  ms ±  3.0  ms    198.3 ms ±  2.9 ms
> mega-renames:      1.076 s ±  0.015 s    661.8 ms ±  5.9 ms
> just-one-mega:   364.1  ms ±  7.0  ms    264.6 ms ±  2.5 ms

But these are larger than I anticipated! Amazing.

> === Overall Results across all optimization work ===

I enjoyed reading this section. I'm excited to make ORT the default in
the microsoft/git fork and see how this improves the lives of our users.

> === Caveat ===
> 
> It may be worth noting, though, that my optimization numbers above for
> merge-ort use test-tool fast-rebase. git rebase -s ort on the three
> testcases above is 5-20 times slower (taking 3.835s, 6.798s, and 1.235s,
> respectively).

The performance and behavior changes recommended here should definitely
be considered. However, the benefits still apply and at the moment users
do not expect immediate responses from 'git rebase' so we have some time
to approach with caution.

I only had one small question that is not even important to the
correctness of this series, so feel free to ignore it. The patches tell
a convincing story.

Just to be careful, have you taken the time to run the merge-ORT tests
with --valgrind?

Thanks,
-Stolee

gitgitgadget · 2021-07-26T21:50:51Z

This patch series was integrated into seen via git@9e55ef9.

gitgitgadget · 2021-07-26T23:10:03Z

This patch series was integrated into seen via git@5ff447e.

gitgitgadget · 2021-07-27T00:13:52Z

This patch series was integrated into seen via git@0f1c736.

gitgitgadget · 2021-07-27T00:42:39Z

This patch series was integrated into seen via git@8ae575a.

gitgitgadget · 2021-07-27T04:59:55Z

There was a status update in the "New Topics" section about the branch en/ort-perf-batch-15 on the Git mailing list:

Final batch for "merge -sort" optimization.

gitgitgadget · 2021-07-27T23:07:54Z

This patch series was integrated into seen via git@2b17e87.

gitgitgadget · 2021-07-28T00:02:55Z

This patch series was integrated into seen via git@c010d20.

gitgitgadget · 2021-07-28T04:53:00Z

This patch series was integrated into seen via git@8f61a4f.

gitgitgadget · 2021-07-28T22:18:35Z

This patch series was integrated into seen via git@3feae66.

gitgitgadget · 2021-07-28T22:55:51Z

On the Git mailing list, Elijah Newren wrote (reply to this):

On Mon, Jul 26, 2021 at 8:44 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 7/23/2021 8:54 AM, Elijah Newren via GitGitGadget wrote:
> ...
> > === Basic Optimization idea ===
> >
> > In this series, I make use of memory pools to get faster allocations and
> > deallocations for many data structures that tend to all be deallocated at
> > the same time anyway.
>
> Makes sense. This is appropriate for a final optimization, since the gains
> tend to be quite small.
>
> > === Results ===
> >
> > For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
> > performance work; instrument with trace2_region_* calls", 2020-10-28), the
> > changes in just this series improves the performance as follows:
> >
> >                      Before Series           After Series
> > no-renames:      204.2  ms ±  3.0  ms    198.3 ms ±  2.9 ms
> > mega-renames:      1.076 s ±  0.015 s    661.8 ms ±  5.9 ms
> > just-one-mega:   364.1  ms ±  7.0  ms    264.6 ms ±  2.5 ms
>
>
> But these are larger than I anticipated! Amazing.
>
> > === Overall Results across all optimization work ===
>
> I enjoyed reading this section. I'm excited to make ORT the default in
> the microsoft/git fork and see how this improves the lives of our users.
>
> > === Caveat ===
> >
> > It may be worth noting, though, that my optimization numbers above for
> > merge-ort use test-tool fast-rebase. git rebase -s ort on the three
> > testcases above is 5-20 times slower (taking 3.835s, 6.798s, and 1.235s,
> > respectively).
>
> The performance and behavior changes recommended here should definitely
> be considered. However, the benefits still apply and at the moment users
> do not expect immediate responses from 'git rebase' so we have some time
> to approach with caution.
>
> I only had one small question that is not even important to the
> correctness of this series, so feel free to ignore it. The patches tell
> a convincing story.
>
> Just to be careful, have you taken the time to run the merge-ORT tests
> with --valgrind?

Yes.  In addition to the testsuite, I also ran the testcases above
under valgrind (especially mega-renames) -- and with those testcases I
had the leak checker turned on.  It was somewhat surprising how much
slowdown I saw when I introduced some accidental memory leaks while
optimizing.  But it all runs clean with the patches I submitted.

gitgitgadget · 2021-08-04T22:21:08Z

This patch series was integrated into seen via git@900f67b.

gitgitgadget · 2021-08-04T22:21:09Z

This patch series was integrated into next via git@87fc290.

gitgitgadget · 2021-08-05T23:52:13Z

This patch series was integrated into seen via git@4a664fc.

gitgitgadget · 2021-08-06T22:22:18Z

This patch series was integrated into seen via git@ee26625.

gitgitgadget · 2021-08-06T23:21:22Z

This patch series was integrated into seen via git@93f90db.

gitgitgadget · 2021-08-07T00:21:12Z

There was a status update in the "Cooking" section about the branch en/ort-perf-batch-15 on the Git mailing list:

Final batch for "merge -sort" optimization.

Will cook in 'next'.

gitgitgadget · 2021-08-09T20:15:07Z

This patch series was integrated into seen via git@29239a3.

gitgitgadget · 2021-08-09T22:06:52Z

This patch series was integrated into seen via git@d4bf464.

gitgitgadget · 2021-08-11T01:15:42Z

There was a status update in the "Cooking" section about the branch en/ort-perf-batch-15 on the Git mailing list:

Final batch for "merge -sort" optimization.

Will cook in 'next'.

gitgitgadget · 2021-08-11T18:58:23Z

This patch series was integrated into seen via git@c74fc01.

gitgitgadget · 2021-08-11T20:30:57Z

This patch series was integrated into seen via git@6b0687d.

gitgitgadget · 2021-08-12T22:58:28Z

There was a status update in the "Cooking" section about the branch en/ort-perf-batch-15 on the Git mailing list:

Final batch for "merge -sort" optimization.

Will cook in 'next'.

gitgitgadget · 2021-08-15T21:20:56Z

This patch series was integrated into seen via git@87fc290.

gitgitgadget · 2021-08-15T22:01:06Z

This patch series was integrated into seen via git@1c090c6.

gitgitgadget · 2021-08-16T21:12:22Z

This patch series was integrated into seen via git@5bf9003.

gitgitgadget · 2021-08-16T23:09:21Z

There was a status update in the "Cooking" section about the branch en/ort-perf-batch-15 on the Git mailing list:

Final batch for "merge -sort" optimization.

Will cook in 'next'.

gitgitgadget · 2021-08-19T21:20:15Z

This patch series was integrated into seen via git@b6659ff.

gitgitgadget · 2021-08-20T00:13:53Z

This patch series was integrated into seen via git@51fcba1.

gitgitgadget · 2021-08-20T17:49:22Z

There was a status update in the "Cooking" section about the branch en/ort-perf-batch-15 on the Git mailing list:

Final batch for "merge -sort" optimization.

Will cook in 'next'.

gitgitgadget · 2021-08-23T21:48:08Z

This patch series was integrated into seen via git@256d56e.

gitgitgadget · 2021-08-23T21:48:09Z

This patch series was integrated into next via git@d22a850.

gitgitgadget · 2021-08-23T21:59:05Z

There was a status update in the "Cooking" section about the branch en/ort-perf-batch-15 on the Git mailing list:

Originally merged to 'next' on 2021-08-04

Final batch for "merge -sort" optimization.

Will merge to 'master'.

gitgitgadget · 2021-08-24T23:32:35Z

This patch series was integrated into seen via git@08ac213.

gitgitgadget · 2021-08-24T23:32:36Z

This patch series was integrated into next via git@08ac213.

gitgitgadget · 2021-08-24T23:32:37Z

This patch series was integrated into master via git@08ac213.

gitgitgadget · 2021-08-24T23:32:39Z

Closed via 08ac213.

newren changed the title ~~Optimization batch 15: use memory pools~~ Final optimization batch (batch 15): use memory pools Jul 10, 2021

newren changed the title ~~Final optimization batch (batch 15): use memory pools~~ Final optimization batch (#15): use memory pools Jul 10, 2021

newren force-pushed the temporary/ort-perf-batch-14 branch from 76bc732 to 7133f0e Compare July 13, 2021 18:18

newren force-pushed the ort-perf-batch-15 branch 2 times, most recently from 16f33dc to be483b7 Compare July 16, 2021 03:08

newren force-pushed the temporary/ort-perf-batch-14 branch from a9cbc1d to c9ada83 Compare July 22, 2021 15:50

newren force-pushed the ort-perf-batch-15 branch from be483b7 to 17aa0a7 Compare July 22, 2021 15:50

gitgitgadget bot reviewed Jul 23, 2021

View reviewed changes

diffcore-rename.c Show resolved Hide resolved

gitgitgadget bot reviewed Jul 23, 2021

View reviewed changes

merge-ort.c Outdated Show resolved Hide resolved

gitgitgadget bot added the seen label Jul 23, 2021

gitgitgadget bot reviewed Jul 26, 2021

View reviewed changes

merge-ort.c Outdated Show resolved Hide resolved

newren force-pushed the ort-perf-batch-15 branch from 17aa0a7 to fda885d Compare July 29, 2021 00:07

gitgitgadget bot added the next label Aug 4, 2021

gitgitgadget bot added the master label Aug 24, 2021

gitgitgadget bot closed this Aug 24, 2021

newren deleted the ort-perf-batch-15 branch August 25, 2021 01:29

Final optimization batch (#15): use memory pools #990

Final optimization batch (#15): use memory pools #990

Conversation

newren commented Jul 1, 2021 • edited Loading

newren commented Jul 23, 2021

gitgitgadget bot commented Jul 23, 2021

gitgitgadget bot commented Jul 23, 2021

gitgitgadget bot commented Jul 23, 2021

gitgitgadget bot commented Jul 23, 2021

gitgitgadget bot commented Jul 23, 2021

gitgitgadget bot commented Jul 26, 2021

gitgitgadget bot commented Jul 26, 2021

gitgitgadget bot commented Jul 26, 2021

gitgitgadget bot commented Jul 26, 2021

gitgitgadget bot commented Jul 27, 2021

gitgitgadget bot commented Jul 27, 2021

gitgitgadget bot commented Jul 27, 2021

gitgitgadget bot commented Jul 27, 2021

gitgitgadget bot commented Jul 28, 2021

gitgitgadget bot commented Jul 28, 2021

gitgitgadget bot commented Jul 28, 2021

gitgitgadget bot commented Jul 28, 2021

gitgitgadget bot commented Aug 4, 2021

gitgitgadget bot commented Aug 4, 2021

gitgitgadget bot commented Aug 5, 2021

gitgitgadget bot commented Aug 6, 2021

gitgitgadget bot commented Aug 6, 2021

gitgitgadget bot commented Aug 7, 2021

gitgitgadget bot commented Aug 9, 2021

gitgitgadget bot commented Aug 9, 2021

gitgitgadget bot commented Aug 11, 2021

gitgitgadget bot commented Aug 11, 2021

gitgitgadget bot commented Aug 11, 2021

gitgitgadget bot commented Aug 12, 2021

gitgitgadget bot commented Aug 15, 2021

gitgitgadget bot commented Aug 15, 2021

gitgitgadget bot commented Aug 16, 2021

gitgitgadget bot commented Aug 16, 2021

gitgitgadget bot commented Aug 19, 2021

gitgitgadget bot commented Aug 20, 2021

gitgitgadget bot commented Aug 20, 2021

gitgitgadget bot commented Aug 23, 2021

gitgitgadget bot commented Aug 23, 2021

gitgitgadget bot commented Aug 23, 2021

gitgitgadget bot commented Aug 24, 2021

gitgitgadget bot commented Aug 24, 2021

gitgitgadget bot commented Aug 24, 2021

gitgitgadget bot commented Aug 24, 2021

newren commented Jul 1, 2021 •

edited

Loading