GH-39700: [C++] Feature: use inplace_merge to replace merge. #39701

Light-City · 2024-01-19T03:47:23Z

Rationale for this change

We can use std::inplace_merge instead of std::merge.

Since our indices already live in one contiguous block of memory, std::inplace_merge lets us drop the temp_indices allocation and the std::copy calls that follow each merge, so I think it is the better fit here.
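To make the proposal concrete, here is a minimal sketch of the two approaches being compared. The function names, the `values` vector, and the index layout are illustrative assumptions and are not taken from vector_sort.cc. Note that the standard library is allowed to allocate a temporary buffer internally in std::inplace_merge, so variant B only removes the explicit allocation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative names only; these are not the functions in vector_sort.cc.
// `indices` holds two adjacent sorted runs, [0, mid) and [mid, size),
// ordered by the values they point at.

// Variant A: std::merge into a caller-provided scratch buffer, then copy back.
void MergeWithScratch(std::vector<std::uint64_t>& indices, std::size_t mid,
                      const std::vector<int>& values,
                      std::vector<std::uint64_t>& scratch) {
  assert(mid <= indices.size());
  auto less = [&values](std::uint64_t l, std::uint64_t r) {
    return values[l] < values[r];
  };
  const auto middle = indices.begin() + static_cast<std::ptrdiff_t>(mid);
  scratch.resize(indices.size());
  std::merge(indices.begin(), middle, middle, indices.end(), scratch.begin(),
             less);
  std::copy(scratch.begin(), scratch.end(), indices.begin());
}

// Variant B: std::inplace_merge, with no explicit scratch buffer or copy.
void MergeInPlace(std::vector<std::uint64_t>& indices, std::size_t mid,
                  const std::vector<int>& values) {
  assert(mid <= indices.size());
  auto less = [&values](std::uint64_t l, std::uint64_t r) {
    return values[l] < values[r];
  };
  const auto middle = indices.begin() + static_cast<std::ptrdiff_t>(mid);
  std::inplace_merge(indices.begin(), middle, indices.end(), less);
}
```

Both variants are stable and produce the same ordering, so any difference between them is purely one of allocation and copying cost, which has to be measured.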

What changes are included in this PR?

The sort operator in vector_sort.cc.

Are these changes tested?

Yes, by running vector_sort_test.cc.

Are there any user-facing changes?

No.

Closes: [C++] Feature: use inplace_merge to replace merge. #39700

github-actions · 2024-01-19T03:47:50Z

⚠️ GitHub issue #39700 has been automatically assigned in GitHub to PR creator.

mapleFU · 2024-01-19T05:14:43Z

I didn't review it carefully, but generally this looks OK to me. I think the previous code doesn't use inplace_merge because it was introduced in C++17.
Would you mind fixing the lint first?

Light-City · 2024-01-19T06:24:23Z

The formatting problem has been solved, thank you

pitrou · 2024-01-19T10:11:11Z

Did you run some benchmarks? You can't claim something improves performance without measuring it.

std::inplace_merge does not avoid a memory allocation, it just allocates the memory implicitly. Quoting the documentation:

This function attempts to allocate a temporary buffer. If the allocation fails, the less efficient algorithm is chosen.

pitrou · 2024-01-19T10:12:49Z

@ursabot please benchmark lang=C++

ursabot · 2024-01-19T10:12:54Z

Benchmark runs are scheduled for commit 393e429. Watch https://buildkite.com/apache-arrow and https://conbench.ursa.dev for updates. A comment will be posted here when the runs are complete.

conbench-apache-arrow · 2024-01-19T12:14:14Z

Thanks for your patience. Conbench analyzed the 3 benchmarking runs that have been run so far on PR commit 393e429.

There were 7 benchmark results indicating a performance regression:

Pull Request Run on arm64-m6g-linux-compute at 2024-01-19 11:37:50Z
- ReadMmapCachedFile (C++) with params=num_cols:1/is_partial:0/real_time, source=cpp-micro, suite=arrow-ipc-read-write-benchmark
- ReadMmapCachedFile (C++) with params=num_cols:8/is_partial:0/real_time, source=cpp-micro, suite=arrow-ipc-read-write-benchmark
and 5 more (see the report linked below)

The full Conbench report has more details.

Light-City · 2024-01-19T12:48:44Z

https://conbench.ursa.dev
It seems that the regression results of the benchmark have nothing to do with my changes.

pitrou · 2024-01-19T14:45:28Z

It seems that the regression results of the benchmark have nothing to do with my changes.

That's true, but there doesn't seem to be any improvement either.

I've run the benchmarks locally and I don't see any improvement either.

We can try to reason on the code changes here:

- The original code allocates the temporary buffer once at the beginning of the sort operation, and reuses it for all std::merge calls.
- The changed code lets std::inplace_merge allocate a new temporary buffer at each invocation.

So, at least theoretically, the code in this PR is less efficient. Unless you can exhibit benchmark improvements on some configuration, I would recommend rejecting this.
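The allocate-once pattern described above can be sketched roughly as follows. This is a generic illustration under assumed names (`MergeSortedRuns`, `run_ends`), not the actual Arrow code: several adjacent sorted runs are merged pairwise, and every std::merge call writes into the same scratch buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: `run_ends[i]` is the end offset of the i-th sorted run
// inside `indices`; the runs are adjacent and cover the whole vector.
template <typename Less>
void MergeSortedRuns(std::vector<std::uint64_t>& indices,
                     std::vector<std::size_t> run_ends, Less less) {
  assert(run_ends.empty() || run_ends.back() == indices.size());
  // The scratch buffer is allocated once and reused by every merge below.
  std::vector<std::uint64_t> scratch(indices.size());
  while (run_ends.size() > 1) {
    std::vector<std::size_t> next_ends;  // small bookkeeping, one per pass
    std::size_t begin = 0;
    for (std::size_t i = 0; i < run_ends.size(); i += 2) {
      if (i + 1 == run_ends.size()) {
        // Odd run out: carry it unchanged into the next pass.
        next_ends.push_back(run_ends[i]);
        break;
      }
      const auto first = indices.begin() + static_cast<std::ptrdiff_t>(begin);
      const auto middle =
          indices.begin() + static_cast<std::ptrdiff_t>(run_ends[i]);
      const auto last =
          indices.begin() + static_cast<std::ptrdiff_t>(run_ends[i + 1]);
      const auto out_end =
          std::merge(first, middle, middle, last, scratch.begin(), less);
      std::copy(scratch.begin(), out_end, first);
      next_ends.push_back(run_ends[i + 1]);
      begin = run_ends[i + 1];
    }
    run_ends = std::move(next_ends);
  }
}
```

Replacing the std::merge and std::copy pair with std::inplace_merge would remove the explicit `scratch` vector, but the library may then acquire its own temporary buffer on each call inside the loop.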

pitrou · 2024-01-19T14:48:46Z

Also, if you filter for chunked array sorts on https://conbench.ursa.dev/compare/runs/6961d70de8424138aaf0b77dc6cba908...d3ea371166c146d4845ac4625d70d2ad/ and https://conbench.ursa.dev/compare/runs/d0e8a5b5cde24106b3a2c60699933ea1...37da92dc9fca4e159568f2563f562a1d/, you'll see that most benchmarks show a slight performance decrease (between 0 and 10%).

Light-City · 2024-01-22T02:11:36Z

Also, if you filter for chunked array sorts on https://conbench.ursa.dev/compare/runs/6961d70de8424138aaf0b77dc6cba908...d3ea371166c146d4845ac4625d70d2ad/ and https://conbench.ursa.dev/compare/runs/d0e8a5b5cde24106b3a2c60699933ea1...37da92dc9fca4e159568f2563f562a1d/, you'll see that most benchmarks show a slight performance decrease (between 0 and 10%).

Indeed... my local testing also shows a slight slowdown.

pitrou · 2024-01-22T10:08:33Z

A possible experiment would be to use three-way merging instead of two-way merging. This might increase performance as indexing a chunked array is not trivial.
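For reference, a three-way merge could look something like the sketch below. It is a generic iterator-based version with an assumed name (`MergeThreeWay`) and does not deal with chunked-array index resolution; whether it helps in Arrow would have to be benchmarked.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical sketch: stable merge of three sorted ranges into `out`.
// On ties, elements from earlier ranges are emitted first.
template <typename It, typename Out, typename Less>
Out MergeThreeWay(It a, It a_end, It b, It b_end, It c, It c_end, Out out,
                  Less less) {
  while (a != a_end && b != b_end && c != c_end) {
    if (less(*b, *a)) {
      // b is strictly smaller than a: the minimum is b or c.
      if (less(*c, *b)) {
        *out++ = *c++;
      } else {
        *out++ = *b++;
      }
    } else {
      // a is not greater than b: the minimum is a or c.
      if (less(*c, *a)) {
        *out++ = *c++;
      } else {
        *out++ = *a++;
      }
    }
  }
  // One range is exhausted; finish with an ordinary two-way merge.
  if (a == a_end) return std::merge(b, b_end, c, c_end, out, less);
  if (b == b_end) return std::merge(a, a_end, c, c_end, out, less);
  return std::merge(a, a_end, b, b_end, out, less);
}
```

Compared with two rounds of two-way merging, each element is written once instead of twice, at the cost of up to two comparisons per output element in the main loop.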

bkietz

std::inplace_merge can allocate a temporary buffer, and the conditions for falling back to the non-allocating N log(N) algorithm are not specified (IIUC, libcxx unconditionally allocates a buffer). Therefore, I think this PR should reimplement the non-allocating merge algorithm rather than use std::inplace_merge. That will give us a more reliable performance comparison between the two approaches, and might suggest our own explicit heuristic for choosing between them. See libstdc++ for an example.
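As a rough illustration of what a non-allocating merge might look like, here is a rotation-based divide-and-conquer sketch. It follows the general technique used for buffer-free merging (binary search for a cut point, then std::rotate), is not copied from libstdc++, and the name `MergeWithoutBuffer` is an assumption for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical sketch: stable merge of [first, middle) and [middle, last)
// using only rotations, with no temporary buffer.
template <typename It, typename Less>
void MergeWithoutBuffer(It first, It middle, It last, Less less) {
  const auto len1 = std::distance(first, middle);
  const auto len2 = std::distance(middle, last);
  if (len1 == 0 || len2 == 0) return;
  if (len1 + len2 == 2) {
    if (less(*middle, *first)) std::iter_swap(first, middle);
    return;
  }
  It first_cut, second_cut;
  if (len1 > len2) {
    // Split the longer left run in half, find where its pivot lands on the right.
    first_cut = std::next(first, len1 / 2);
    second_cut = std::lower_bound(middle, last, *first_cut, less);
  } else {
    // Split the right run in half, find where its pivot lands on the left.
    second_cut = std::next(middle, len2 / 2);
    first_cut = std::upper_bound(first, middle, *second_cut, less);
  }
  // Swap the two inner blocks so everything left of new_middle is <= the rest.
  It new_middle = std::rotate(first_cut, middle, second_cut);
  MergeWithoutBuffer(first, first_cut, new_middle, less);
  MergeWithoutBuffer(new_middle, second_cut, last, less);
}
```

Having this as an explicit function makes it possible to benchmark the buffered and unbuffered paths separately, instead of relying on whatever the standard library chooses internally.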

github-actions bot added the Component: C++ and awaiting review labels Jan 19, 2024

mapleFU requested a review from pitrou January 19, 2024 03:49

Feature: use inplace_merge to replace merge.

393e429

Light-City force-pushed the GH-39700 branch from 42daa7c to 393e429 Compare January 19, 2024 06:06

mapleFU requested a review from bkietz January 19, 2024 06:08

bkietz requested changes Feb 2, 2024
