Use arrow row format in SortPreservingMerge ~50-70% faster #3386

Merged
merged 1 commit into apache:master on Sep 27, 2022

Conversation


@tustvold tustvold commented Sep 7, 2022

Which issue does this PR close?

Part of #416

Rationale for this change

merge i64               time:   [18.361 ms 18.383 ms 18.406 ms]                      
                        change: [-53.779% -53.520% -53.283%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  1 (1.00%) high severe

merge f64               time:   [18.271 ms 18.289 ms 18.307 ms]                      
                        change: [-53.881% -53.730% -53.598%] (p = 0.00 < 0.05)
                        Performance has improved.

merge utf8 low cardinality                                                                            
                        time:   [17.168 ms 17.185 ms 17.203 ms]
                        change: [-62.941% -62.831% -62.731%] (p = 0.00 < 0.05)
                        Performance has improved.

merge utf8 high cardinality                                                                            
                        time:   [19.513 ms 19.539 ms 19.566 ms]
                        change: [-54.113% -54.022% -53.932%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

merge utf8 tuple        time:   [27.579 ms 27.608 ms 27.639 ms]                             
                        change: [-56.213% -56.134% -56.054%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

merge utf8 dictionary   time:   [16.251 ms 16.265 ms 16.280 ms]                                  
                        change: [-65.210% -65.063% -64.945%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

merge utf8 dictionary tuple                                                                            
                        time:   [19.057 ms 19.081 ms 19.111 ms]
                        change: [-70.756% -70.581% -70.418%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) low mild
  1 (1.00%) high mild
  1 (1.00%) high severe

merge mixed utf8 dictionary tuple                                                                            
                        time:   [24.586 ms 24.634 ms 24.684 ms]
                        change: [-63.732% -63.583% -63.439%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

merge mixed tuple       time:   [27.034 ms 27.075 ms 27.119 ms]                              
                        change: [-47.178% -46.969% -46.762%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

It is also worth highlighting that these benchmarks are in many ways the worst case, as the rows are distributed randomly across streams instead of in large contiguous slices, which increases the cost of reassembly, i.e. the non-comparison portion of the operator.

What changes are included in this PR?

Are there any user-facing changes?

@tustvold tustvold changed the title Use arrow row format in SortPreservingMerge Use arrow row format in SortPreservingMerge ~50-70% faster Sep 7, 2022
@@ -321,10 +318,13 @@ pub(crate) struct SortPreservingMergeStream {
     next_batch_id: usize,

     /// min heap for record comparison
-    min_heap: BinaryHeap<SortKeyCursor>,
+    max_heap: BinaryHeap<Reverse<SortKeyCursor>>,
Contributor Author

This was a somewhat amusing surprise: BinaryHeap is a max-heap, not a min-heap; the comparator for SortKeyCursor was just backwards.

Contributor

But w/ Reverse, it's a "min heap" again, so I think the variable name should read min_heap.

Contributor Author

It's a max heap of reversed elements, no?

Contributor

Well, we get into philosophical discussions here, but IMHO the variable should describe the entire construct (BinaryHeap<Reverse<SortKeyCursor>>), not just the outer shell (BinaryHeap<...>).

Should you decide to keep the name, then at least adjust the docstring, which still reads "min heap".
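
For reference, a minimal standalone sketch (not code from this PR) of the standard-library behavior under discussion: BinaryHeap pops the greatest element first, and wrapping elements in std::cmp::Reverse flips the ordering so the same container behaves as a min-heap.

use std::cmp::Reverse;
use std::collections::BinaryHeap;

fn main() {
    // BinaryHeap is a max-heap: pop() yields the greatest element first.
    let mut max_heap = BinaryHeap::from(vec![3, 1, 2]);
    assert_eq!(max_heap.pop(), Some(3));

    // Wrapping every element in Reverse inverts the comparison, so the
    // same container behaves as a min-heap: pop() yields the smallest.
    let mut min_heap: BinaryHeap<Reverse<i32>> =
        vec![3, 1, 2].into_iter().map(Reverse).collect();
    assert_eq!(min_heap.pop(), Some(Reverse(1)));
}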

@github-actions github-actions bot added the core Core datafusion crate label Sep 7, 2022
@alamb alamb left a comment

😍 where this is headed

// their batch_idx.
batch_comparators: RwLock<HashMap<usize, Vec<DynComparator>>>,
sort_options: Arc<Vec<SortOptions>>,
rows: Rows,
Contributor

that certainly looks nicer
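
As background for the rows: Rows field above, here is a small self-contained sketch of the arrow row format this PR adopts in place of per-batch DynComparators. It is written against the arrow::row API in current arrow-rs (RowConverter, SortField, convert_columns); the exact method names may differ slightly from the arrow version this PR targets, whose converter is called via convert.

use std::sync::Arc;

use arrow::array::{ArrayRef, Int64Array, StringArray};
use arrow::datatypes::DataType;
use arrow::row::{RowConverter, SortField};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // One SortField per sort-key column.
    // (`mut` on the converter is required by older arrow releases.)
    let mut converter = RowConverter::new(vec![
        SortField::new(DataType::Int64),
        SortField::new(DataType::Utf8),
    ])?;

    let cols: Vec<ArrayRef> = vec![
        Arc::new(Int64Array::from(vec![1, 1, 2])),
        Arc::new(StringArray::from(vec!["b", "a", "a"])),
    ];

    // Encode the sort keys of every row into a single byte string that can
    // be compared with memcmp, with no per-type dynamic dispatch.
    let rows = converter.convert_columns(&cols)?;

    // Row 1 is (1, "a") and row 0 is (1, "b"), so row 1 sorts first.
    assert!(rows.row(1) < rows.row(0));
    Ok(())
}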

@yjshen yjshen left a comment

The speed-up is fantastic, love it!


let rows = self.row_converter.convert(&cols);

let cursor = SortKeyCursor::new(
Member

We need to track the total memory used by all cursors, since each cursor now holds Rows. We could do this as a follow-up, but noting it here as it came to me.

Contributor

I agree that memory usage is a potential concern (as we are effectively copying data into the Rows format).

A follow-on PR would be good I think. I filed #3609
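
To make the memory concern concrete, a purely hypothetical sketch of the kind of accounting the follow-up issue describes: a cursor that owns its row-encoded keys and reports their approximate size so a memory pool could reserve it. The names RowCursor and memory_used are illustration only and do not come from DataFusion.

/// Hypothetical cursor owning the row-encoded sort keys for one batch.
struct RowCursor {
    /// One memcmp-comparable byte string per row, as produced by a
    /// row-format conversion of the sort columns.
    rows: Vec<Vec<u8>>,
}

impl RowCursor {
    /// Approximate heap bytes retained by the encoded rows; this is the
    /// quantity a memory pool would need to reserve while the cursor lives.
    fn memory_used(&self) -> usize {
        self.rows.iter().map(|row| row.capacity()).sum::<usize>()
            + self.rows.capacity() * std::mem::size_of::<Vec<u8>>()
    }
}

fn main() {
    let cursor = RowCursor {
        rows: vec![b"\x01a".to_vec(), b"\x01b".to_vec()],
    };
    // A merge operator would grow its reservation by this amount when the
    // cursor is created and shrink it again when the cursor is dropped.
    println!("cursor holds ~{} bytes of row data", cursor.memory_used());
}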

@alamb alamb mentioned this pull request Sep 19, 2022
8 tasks

let elapsed_compute = self.tracking_metrics.elapsed_compute().clone();
// NB timer records time taken on drop, so there are no
// calls to `timer.done()` below.
let _timer = elapsed_compute.timer();
Contributor

this simply reduces the overhead of timing, right?

Contributor Author

Yes, which turned out to be a major bottleneck, as Instant::now is a syscall
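
A standalone sketch of the drop-based timing pattern described above (the real code uses the metrics framework's elapsed_compute().timer(); the ScopedTimer type here is a stand-in): the clock is read once when the guard is created and once when it is dropped, rather than once per row inside the hot loop.

use std::time::{Duration, Instant};

/// Guard that samples the clock on creation and adds the elapsed time to a
/// counter when dropped.
struct ScopedTimer<'a> {
    start: Instant,
    total: &'a mut Duration,
}

impl<'a> ScopedTimer<'a> {
    fn new(total: &'a mut Duration) -> Self {
        Self { start: Instant::now(), total }
    }
}

impl Drop for ScopedTimer<'_> {
    fn drop(&mut self) {
        // Recording on drop means a single pair of clock reads per scope,
        // not one per element processed inside the scope.
        *self.total += self.start.elapsed();
    }
}

fn main() {
    let mut elapsed_compute = Duration::ZERO;
    {
        let _timer = ScopedTimer::new(&mut elapsed_compute);
        // Hot loop: no clock reads in here.
        let _sum: u64 = (0..1_000_000u64).sum();
    }
    println!("compute time: {:?}", elapsed_compute);
}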

@alamb alamb mentioned this pull request Sep 25, 2022
2 tasks
@alamb alamb merged commit 451e441 into apache:master Sep 27, 2022
ursabot commented Sep 27, 2022

Benchmark runs are scheduled for baseline = 15c19c3 and contender = 451e441. 451e441 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@Dandandan
Contributor

Real nice 🎉
