Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Final optimization batch (#15): use memory pools #990

Conversation

newren
Copy link

@newren newren commented Jul 1, 2021

This series textually depends on en/ort-perf-batch-14, but the ideas are
orthogonal to it and orthogonal to previous series. It can be reviewed
independently.

Changes since v1, addressing Eric's feedback:

  • Fixed a comment that became out-of-date in patch 1
  • Swapped commits 2 and 3 so that one can better motivate the other.

Changes since v2, addressing Peff's feedback:

  • Rebased on en/ort-perf-batch-14 (resolving a trivial conflict with the new string_list_init_nodup() usage)
  • Added a new preliminary patch renaming str*_func() to str*_clear_func()
  • Added a new final patch that hardcodes that we'll just use memory pools

Changes since v3, as per Peff's feedback:

  • Don't only remove the extra complexity from the USE_MEMORY_POOL #define; also remove the original bookkeeping complexity needed to track individual frees when not using a memory pool.

Changes since v4:

  • Add Acked-bys from both Stolee and Peff

=== Basic Optimization idea ===

In this series, I make use of memory pools to get faster allocations and
deallocations for many data structures that tend to all be deallocated
at the same time anyway.

=== Results ===

For the testcases mentioned in commit 557ac03 ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
the changes in just this series improves the performance as follows:

                     Before Series           After Series
no-renames:      204.2  ms ±  3.0  ms    198.3 ms ±  2.9 ms
mega-renames:      1.076 s ±  0.015 s    661.8 ms ±  5.9 ms
just-one-mega:   364.1  ms ±  7.0  ms    264.6 ms ±  2.5 ms

As a reminder, before any merge-ort/diffcore-rename performance work,
the performance results we started with were:

no-renames-am:      6.940 s ±  0.485 s
no-renames:        18.912 s ±  0.174 s
mega-renames:    5964.031 s ± 10.459 s
just-one-mega:    149.583 s ±  0.751 s

=== Overall Results across all optimization work ===

This is my final prepared optimization series. It might be worth
reviewing how my optimizations fared overall, comparing the original
merge-recursive timings with three things: how much merge-recursive
improved (as a side-effect of optimizing merge-ort), how much
improvement we would have gotten from a hypothetical infinite
parallelization of rename detection, and what I achieved at the end with
merge-ort:

                               Timings

                                          Infinite
                 merge-       merge-     Parallelism
                recursive    recursive    of rename    merge-ort
                 v2.30.0      current     detection     current
                ----------   ---------   -----------   ---------
no-renames:       18.912 s    18.030 s     11.699 s     198.3 ms
mega-renames:   5964.031 s   361.281 s    203.886 s     661.8 ms
just-one-mega:   149.583 s    11.009 s      7.553 s     264.6 ms

                           Speedup factors

                                          Infinite
                 merge-       merge-     Parallelism
                recursive    recursive    of rename
                 v2.30.0      current     detection    merge-ort
                ----------   ---------   -----------   ---------
no-renames:         1           1.05         1.6           95
mega-renames:       1          16.5         29           9012
just-one-mega:      1          13.6         20            565

And, for partial clone users:

             Factor reduction in number of objects needed

                                          Infinite
                 merge-       merge-     Parallelism
                recursive    recursive    of rename
                 v2.30.0      current     detection    merge-ort
                ----------   ---------   -----------   ---------
mega-renames:       1            1            1          181.3

=== Caveat ===

It may be worth noting, though, that my optimization numbers above for
merge-ort use test-tool fast-rebase. git rebase -s ort on the
three testcases above is 5-20 times slower (taking 3.835s, 6.798s, and
1.235s, respectively). At this point, any further optimization work
should go into making a faster full-featured rebase by copying the
ideas from fast-rebase: avoid unnecessary process forking, avoid
updating the index and working copy until either the rebase is
finished or you hit a conflict (and don't write rebase metadata to
disk until that point either), get rid of the glacially slow revision
walking of the upstream side of history (nuke can_fast_forward(), make
--reapply-cherry-picks the default) or at least don't revision walk so
many times (multiple calls to get_merge_bases in can_fast_forward()
plus a is_linear_history() walk, checking for upstream cherry-picks,
probably more), turn off per-commit hooks that probably should have
never been on anyway, etc.

cc: Jeff King peff@peff.net
cc: Eric Sunshine sunshine@sunshineco.com
cc: Elijah Newren newren@gmail.com
cc: Derrick Stolee stolee@gmail.com
cc: Ævar Arnfjörð Bjarmason avarab@gmail.com

@newren newren changed the title Optimization batch 15: use memory pools Final optimization batch (batch 15): use memory pools Jul 10, 2021
@newren newren changed the title Final optimization batch (batch 15): use memory pools Final optimization batch (#15): use memory pools Jul 10, 2021
@newren newren force-pushed the temporary/ort-perf-batch-14 branch from 76bc732 to 7133f0e Compare July 13, 2021 18:18
@newren newren force-pushed the ort-perf-batch-15 branch 2 times, most recently from 16f33dc to be483b7 Compare July 16, 2021 03:08
@newren newren force-pushed the temporary/ort-perf-batch-14 branch from a9cbc1d to c9ada83 Compare July 22, 2021 15:50
@newren
Copy link
Author

newren commented Jul 23, 2021

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 23, 2021

Submitted as pull.990.git.1627044897.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git pr-990/newren/ort-perf-batch-15-v1

To fetch this version to local tag pr-990/newren/ort-perf-batch-15-v1:

git fetch --no-tags https://github.com/gitgitgadget/git tag pr-990/newren/ort-perf-batch-15-v1

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 23, 2021

User Eric Sunshine <sunshine@sunshineco.com> has been added to the cc: list.

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 23, 2021

User Elijah Newren <newren@gmail.com> has been added to the cc: list.

merge-ort.c Outdated Show resolved Hide resolved
@gitgitgadget
Copy link

gitgitgadget bot commented Jul 23, 2021

This branch is now known as en/ort-perf-batch-15.

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 23, 2021

This patch series was integrated into seen via git@0626b7e.

@gitgitgadget gitgitgadget bot added the seen label Jul 23, 2021
merge-ort.c Outdated Show resolved Hide resolved
@gitgitgadget
Copy link

gitgitgadget bot commented Jul 26, 2021

User Derrick Stolee <stolee@gmail.com> has been added to the cc: list.

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 26, 2021

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 7/23/2021 8:54 AM, Elijah Newren via GitGitGadget wrote:
...
> === Basic Optimization idea ===
> 
> In this series, I make use of memory pools to get faster allocations and
> deallocations for many data structures that tend to all be deallocated at
> the same time anyway.

Makes sense. This is appropriate for a final optimization, since the gains
tend to be quite small.

> === Results ===
> 
> For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
> performance work; instrument with trace2_region_* calls", 2020-10-28), the
> changes in just this series improves the performance as follows:
> 
>                      Before Series           After Series
> no-renames:      204.2  ms ±  3.0  ms    198.3 ms ±  2.9 ms
> mega-renames:      1.076 s ±  0.015 s    661.8 ms ±  5.9 ms
> just-one-mega:   364.1  ms ±  7.0  ms    264.6 ms ±  2.5 ms


But these are larger than I anticipated! Amazing.

> === Overall Results across all optimization work ===

I enjoyed reading this section. I'm excited to make ORT the default in
the microsoft/git fork and see how this improves the lives of our users.

> === Caveat ===
> 
> It may be worth noting, though, that my optimization numbers above for
> merge-ort use test-tool fast-rebase. git rebase -s ort on the three
> testcases above is 5-20 times slower (taking 3.835s, 6.798s, and 1.235s,
> respectively).

The performance and behavior changes recommended here should definitely
be considered. However, the benefits still apply and at the moment users
do not expect immediate responses from 'git rebase' so we have some time
to approach with caution.

I only had one small question that is not even important to the
correctness of this series, so feel free to ignore it. The patches tell
a convincing story.

Just to be careful, have you taken the time to run the merge-ORT tests
with --valgrind?

Thanks,
-Stolee

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 26, 2021

This patch series was integrated into seen via git@9e55ef9.

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 26, 2021

This patch series was integrated into seen via git@5ff447e.

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 27, 2021

This patch series was integrated into seen via git@0f1c736.

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 27, 2021

This patch series was integrated into seen via git@8ae575a.

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 27, 2021

There was a status update in the "New Topics" section about the branch en/ort-perf-batch-15 on the Git mailing list:

Final batch for "merge -sort" optimization.

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 27, 2021

This patch series was integrated into seen via git@2b17e87.

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 28, 2021

This patch series was integrated into seen via git@c010d20.

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 28, 2021

This patch series was integrated into seen via git@8f61a4f.

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 28, 2021

This patch series was integrated into seen via git@3feae66.

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 28, 2021

On the Git mailing list, Elijah Newren wrote (reply to this):

On Mon, Jul 26, 2021 at 8:44 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 7/23/2021 8:54 AM, Elijah Newren via GitGitGadget wrote:
> ...
> > === Basic Optimization idea ===
> >
> > In this series, I make use of memory pools to get faster allocations and
> > deallocations for many data structures that tend to all be deallocated at
> > the same time anyway.
>
> Makes sense. This is appropriate for a final optimization, since the gains
> tend to be quite small.
>
> > === Results ===
> >
> > For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
> > performance work; instrument with trace2_region_* calls", 2020-10-28), the
> > changes in just this series improves the performance as follows:
> >
> >                      Before Series           After Series
> > no-renames:      204.2  ms ±  3.0  ms    198.3 ms ±  2.9 ms
> > mega-renames:      1.076 s ±  0.015 s    661.8 ms ±  5.9 ms
> > just-one-mega:   364.1  ms ±  7.0  ms    264.6 ms ±  2.5 ms
>
>
> But these are larger than I anticipated! Amazing.
>
> > === Overall Results across all optimization work ===
>
> I enjoyed reading this section. I'm excited to make ORT the default in
> the microsoft/git fork and see how this improves the lives of our users.
>
> > === Caveat ===
> >
> > It may be worth noting, though, that my optimization numbers above for
> > merge-ort use test-tool fast-rebase. git rebase -s ort on the three
> > testcases above is 5-20 times slower (taking 3.835s, 6.798s, and 1.235s,
> > respectively).
>
> The performance and behavior changes recommended here should definitely
> be considered. However, the benefits still apply and at the moment users
> do not expect immediate responses from 'git rebase' so we have some time
> to approach with caution.
>
> I only had one small question that is not even important to the
> correctness of this series, so feel free to ignore it. The patches tell
> a convincing story.
>
> Just to be careful, have you taken the time to run the merge-ORT tests
> with --valgrind?

Yes.  In addition to the testsuite, I also ran the testcases above
under valgrind (especially mega-renames) -- and with those testcases I
had the leak checker turned on.  It was somewhat surprising how much
slowdown I saw when I introduced some accidental memory leaks while
optimizing.  But it all runs clean with the patches I submitted.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 4, 2021

This patch series was integrated into seen via git@900f67b.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 4, 2021

This patch series was integrated into next via git@87fc290.

@gitgitgadget gitgitgadget bot added the next label Aug 4, 2021
@gitgitgadget
Copy link

gitgitgadget bot commented Aug 5, 2021

This patch series was integrated into seen via git@4a664fc.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 6, 2021

This patch series was integrated into seen via git@ee26625.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 6, 2021

This patch series was integrated into seen via git@93f90db.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 7, 2021

There was a status update in the "Cooking" section about the branch en/ort-perf-batch-15 on the Git mailing list:

Final batch for "merge -sort" optimization.

Will cook in 'next'.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 9, 2021

This patch series was integrated into seen via git@29239a3.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 9, 2021

This patch series was integrated into seen via git@d4bf464.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 11, 2021

There was a status update in the "Cooking" section about the branch en/ort-perf-batch-15 on the Git mailing list:

Final batch for "merge -sort" optimization.

Will cook in 'next'.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 11, 2021

This patch series was integrated into seen via git@c74fc01.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 11, 2021

This patch series was integrated into seen via git@6b0687d.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 12, 2021

There was a status update in the "Cooking" section about the branch en/ort-perf-batch-15 on the Git mailing list:

Final batch for "merge -sort" optimization.

Will cook in 'next'.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 15, 2021

This patch series was integrated into seen via git@87fc290.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 15, 2021

This patch series was integrated into seen via git@1c090c6.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 16, 2021

This patch series was integrated into seen via git@5bf9003.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 16, 2021

There was a status update in the "Cooking" section about the branch en/ort-perf-batch-15 on the Git mailing list:

Final batch for "merge -sort" optimization.

Will cook in 'next'.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 19, 2021

This patch series was integrated into seen via git@b6659ff.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 20, 2021

This patch series was integrated into seen via git@51fcba1.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 20, 2021

There was a status update in the "Cooking" section about the branch en/ort-perf-batch-15 on the Git mailing list:

Final batch for "merge -sort" optimization.

Will cook in 'next'.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 23, 2021

This patch series was integrated into seen via git@256d56e.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 23, 2021

This patch series was integrated into next via git@d22a850.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 23, 2021

There was a status update in the "Cooking" section about the branch en/ort-perf-batch-15 on the Git mailing list:

Originally merged to 'next' on 2021-08-04

Final batch for "merge -sort" optimization.

Will merge to 'master'.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 24, 2021

This patch series was integrated into seen via git@08ac213.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 24, 2021

This patch series was integrated into next via git@08ac213.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 24, 2021

This patch series was integrated into master via git@08ac213.

@gitgitgadget gitgitgadget bot added the master label Aug 24, 2021
@gitgitgadget
Copy link

gitgitgadget bot commented Aug 24, 2021

Closed via 08ac213.

@gitgitgadget gitgitgadget bot closed this Aug 24, 2021
@newren newren deleted the ort-perf-batch-15 branch August 25, 2021 01:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant