Cleanup sort #4613

Merged 4 commits into apache:master · Aug 4, 2023
Conversation

@tustvold (Contributor) commented on Aug 1, 2023

Which issue does this PR close?

Part of #4545

Rationale for this change

Cleans up the sort kernel to make the code easier to maintain.

Also changes list sorting to use the same rank-based approach used for dictionaries, which should be substantially faster, although I don't have benchmarks to quantify this.
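
For context, a minimal plain-Rust sketch of the rank-based idea (names such as `child_ranks` are illustrative, not the arrow API): the child values are ranked once up front, and each list row is then compared as a slice of integer ranks rather than through a dynamically dispatched comparator for every element.

```rust
use std::cmp::Ordering;

/// Dense-rank the child values once (equal values get equal ranks).
/// Illustrative only; the real kernel ranks arbitrary Arrow arrays.
fn child_ranks(values: &[i64]) -> Vec<u32> {
    let mut order: Vec<usize> = (0..values.len()).collect();
    order.sort_unstable_by_key(|&i| values[i]);
    let mut ranks = vec![0u32; values.len()];
    let mut rank = 0u32;
    for w in 0..order.len() {
        if w > 0 && values[order[w]] != values[order[w - 1]] {
            rank += 1;
        }
        ranks[order[w]] = rank;
    }
    ranks
}

fn main() {
    // Two "list" rows defined by offsets into a flat child array.
    let child = vec![30i64, 10, 20, 10];
    let offsets = [0usize, 2, 4]; // row 0 = child[0..2], row 1 = child[2..4]

    let ranks = child_ranks(&child);

    // Comparing rows is now a cheap lexicographic comparison of rank slices,
    // with no per-comparison dispatch or allocation.
    let row = |i: usize| &ranks[offsets[i]..offsets[i + 1]];
    assert_eq!(row(0).cmp(row(1)), Ordering::Greater); // [30, 10] sorts after [20, 10]
}
```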

The benchmarks are quite noisy, but this appears to yield a slight performance improvement.

sort i64 2^10           time:   [9.1265 µs 9.1303 µs 9.1344 µs]
                        change: [-0.1108% -0.0301% +0.0483%] (p = 0.47 > 0.05)
                        No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

sort i64 2^12           time:   [30.098 µs 30.180 µs 30.264 µs]
                        change: [+0.0343% +0.3620% +0.6738%] (p = 0.02 < 0.05)
                        Change within noise threshold.
Found 10 outliers among 100 measurements (10.00%)
  8 (8.00%) high mild
  2 (2.00%) high severe

sort i64 nulls 2^10     time:   [80.763 µs 80.801 µs 80.844 µs]
                        change: [-0.5475% -0.3889% -0.1707%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe

sort i64 nulls 2^12     time:   [30.081 µs 30.134 µs 30.195 µs]
                        change: [-3.5573% -2.4943% -1.5011%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  3 (3.00%) high mild
  7 (7.00%) high severe

sort f32 to indices 2^12
                        time:   [129.59 µs 129.64 µs 129.70 µs]
                        change: [-5.1583% -5.0872% -5.0257%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe

sort f32 nulls to indices 2^12
                        time:   [93.133 µs 93.219 µs 93.321 µs]
                        change: [-4.6517% -4.5602% -4.4676%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe

sort string[10] to indices 2^12
                        time:   [371.99 µs 372.25 µs 372.56 µs]
                        change: [+0.3681% +0.4194% +0.4676%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe

sort string[10] nulls to indices 2^12
                        time:   [208.87 µs 208.96 µs 209.05 µs]
                        change: [-3.9053% -3.8462% -3.7828%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  4 (4.00%) high mild
  4 (4.00%) high severe

sort string[10] dict to indices 2^12
                        time:   [501.59 µs 501.75 µs 501.92 µs]
                        change: [+1.2853% +1.3713% +1.4949%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe

sort string[10] dict nulls to indices 2^12
                        time:   [271.88 µs 271.96 µs 272.06 µs]
                        change: [-2.3687% -2.3245% -2.2848%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  1 (1.00%) low mild
  5 (5.00%) high mild
  4 (4.00%) high severe

What changes are included in this PR?

Are there any user-facing changes?

@github-actions bot added the arrow (Changes to the arrow crate) label on Aug 1, 2023
@@ -799,6 +799,15 @@ pub trait AsArray: private::Sealed {
self.as_list_opt().expect("list array")
}

/// Downcast this to a [`FixedSizeBinaryArray`] returning `None` if not possible
fn as_fixed_size_binary_opt(&self) -> Option<&FixedSizeBinaryArray>;

tustvold (author):
This trait is sealed so this isn't a breaking change


/// Compare two `Array`s based on the ordering defined in [build_compare]
fn cmp_array(a: &dyn Array, b: &dyn Array) -> Ordering {
let cmp_op = build_compare(a, b).unwrap();

tustvold (author):
This was getting called per array comparison, doing all this dispatch and allocation logic every time! The performance would have been catastrophic
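
As a plain-Rust illustration of the cost pattern (the `build_comparator` function below is a stand-in for `build_compare`, not the actual arrow code), the fix is simply to construct the comparator once, outside the sort, instead of once per comparison:

```rust
use std::cmp::Ordering;

// Stand-in for `build_compare`: imagine the dispatch and allocation it performs.
fn build_comparator(values: &[i64]) -> impl Fn(usize, usize) -> Ordering + '_ {
    let _scratch = values.to_vec(); // placeholder for the setup cost
    move |a, b| values[a].cmp(&values[b])
}

fn main() {
    let values = vec![3i64, 1, 2];
    let mut indices: Vec<usize> = (0..values.len()).collect();

    // Before: the comparator was rebuilt for every single comparison,
    // i.e. O(n log n) dispatch/allocation instead of O(1).
    indices.sort_by(|&a, &b| build_comparator(&values)(a, b));

    // After: build it once, outside the sort loop, and reuse it.
    let cmp = build_comparator(&values);
    indices.sort_by(|&a, &b| cmp(a, b));

    assert_eq!(indices, [1, 2, 0]);
}
```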

Contributor:
Looks like it was called as part of sorting ListArray -- so that might become substantially faster

}
},
DataType::Dictionary(_, _) => {
let value_null_first = if options.descending {

tustvold (author):
This logic is moved into a child_rank function
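
Roughly, the shared helper ranks the child values once, with null placement derived from the sort options. A plain-Rust sketch (the shape of `child_rank` and the specific null-placement rule below are assumptions for illustration, not the kernel's exact logic):

```rust
use std::cmp::Ordering;

#[derive(Clone, Copy)]
struct SortOptions {
    descending: bool,
    nulls_first: bool,
}

/// Rank child values once; where nulls land among the ranks is derived from
/// the outer sort options. The inversion rule below is an assumption made for
/// the sake of the example.
fn child_rank(values: &[Option<i64>], options: SortOptions) -> Vec<u32> {
    // Ranking itself is ascending, so a descending outer sort flips where
    // nulls should end up among the ranks.
    let nulls_low = options.nulls_first != options.descending;

    let mut order: Vec<usize> = (0..values.len()).collect();
    order.sort_by(|&a, &b| match (&values[a], &values[b]) {
        (None, None) => Ordering::Equal,
        (None, Some(_)) if nulls_low => Ordering::Less,
        (None, Some(_)) => Ordering::Greater,
        (Some(_), None) if nulls_low => Ordering::Greater,
        (Some(_), None) => Ordering::Less,
        (Some(a), Some(b)) => a.cmp(b),
    });

    let mut ranks = vec![0u32; values.len()];
    for (rank, idx) in order.into_iter().enumerate() {
        ranks[idx] = rank as u32;
    }
    ranks
}

fn main() {
    let values = [Some(2i64), None, Some(1)];
    let opts = SortOptions { descending: false, nulls_first: true };
    assert_eq!(child_rank(&values, opts), vec![2, 0, 1]); // the null ranks lowest
}
```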

) -> Result<UInt32Array, ArrowError> {
let rank = child_rank(array.values().as_ref(), options)?;
let offsets = array.value_offsets();
let mut valids = value_indices

tustvold (author):
The duplication in extracting these valids arrays is somewhat unfortunate. I have some ideas for how to fix it, but wanted to keep the PR size down.
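
The repeated step being referred to looks roughly like the following plain-Rust sketch (the real code pairs Arrow row indices with per-type sort keys; here the rank slices from the earlier sketch stand in for them):

```rust
fn main() {
    // Per-child ranks plus list offsets, as in the earlier rank sketch.
    let ranks = [2u32, 0, 1, 0];
    let offsets = [0usize, 2, 4];
    let value_indices: Vec<u32> = vec![0, 1]; // indices of the non-null list rows

    // The step each sort path currently repeats: pair every valid row index
    // with the key it will be sorted by (here, its slice of ranks).
    let mut valids: Vec<(u32, &[u32])> = value_indices
        .into_iter()
        .map(|index| {
            let i = index as usize;
            (index, &ranks[offsets[i]..offsets[i + 1]])
        })
        .collect();

    valids.sort_by(|a, b| a.1.cmp(b.1));
    assert_eq!(valids[0].0, 1); // row 1 (ranks [1, 0]) sorts before row 0 ([2, 0])
}
```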

@@ -1980,7 +1526,7 @@ mod tests {
nulls_first: false,
}),
None,
vec![2, 3, 1, 4, 5, 0],
vec![2, 3, 1, 4, 0, 5],

tustvold (author):
This applies #4603 to boolean arrays, which I had missed by accident; yet another benefit of consolidating the code 😄

Contributor:
0 and 5 here are the indexes of nulls. Makes sense
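
A speculative plain-Rust reading of what the updated expectation encodes (the input values and sort options below are assumed for illustration, not copied from the test): with `nulls_first: false` the null positions 0 and 5 now follow the valid rows in their original order.

```rust
fn main() {
    // Assumed input shaped like the test: booleans with nulls at positions 0 and 5.
    let values: Vec<Option<bool>> = vec![
        None, Some(false), Some(true), Some(true), Some(false), None,
    ];

    // Partition row indices into valid and null, preserving original order.
    let (valids, nulls): (Vec<u32>, Vec<u32>) =
        (0..values.len() as u32).partition(|&i| values[i as usize].is_some());

    // Descending, stable sort of the valid rows: trues (2, 3) before falses (1, 4).
    let mut sorted = valids;
    sorted.sort_by_key(|&i| std::cmp::Reverse(values[i as usize]));

    // nulls_first: false, so the nulls go last, in their original order (0, 5);
    // that ordering is what the updated expected vector encodes.
    sorted.extend(nulls);

    assert_eq!(sorted, vec![2, 3, 1, 4, 0, 5]);
}
```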

@alamb (Contributor) left a comment

This is a really nice consolidation @tustvold -- 👍

I found the GitHub diff hard to review, but in an editor it is quite clear that the implementations were consolidated into a single type-specialized function (sort_impl) rather than manually copied and pasted.
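
A plain-Rust sketch of the shape of that consolidation (the signature is an assumption modelled on the call sites visible in this diff, not the exact one in the kernel): one generic function handles direction, null placement, and the limit, and each type-specific path only supplies `(index, key)` pairs and a comparator.

```rust
use std::cmp::Ordering;

#[derive(Clone, Copy)]
struct SortOptions {
    descending: bool,
    nulls_first: bool,
}

/// Generic core shared by every type-specific sort path (assumed shape,
/// modelled on the call sites visible in this diff).
fn sort_impl<T: Copy>(
    options: SortOptions,
    valids: &mut [(u32, T)],
    nulls: &[u32],
    limit: Option<usize>,
    cmp: impl Fn(T, T) -> Ordering,
) -> Vec<u32> {
    // Sort only the valid rows; the sort direction is handled once, here.
    valids.sort_by(|a, b| {
        let ord = cmp(a.1, b.1);
        if options.descending { ord.reverse() } else { ord }
    });

    // Null placement and the limit are also handled once, here.
    let valid_indices = valids.iter().map(|(i, _)| *i);
    let null_indices = nulls.iter().copied();
    let mut out: Vec<u32> = if options.nulls_first {
        null_indices.chain(valid_indices).collect()
    } else {
        valid_indices.chain(null_indices).collect()
    };
    out.truncate(limit.unwrap_or(out.len()));
    out
}

fn main() {
    // A type-specific caller only prepares (index, key) pairs and a comparator.
    let mut valids = vec![(1u32, 5i64), (2, 3), (4, 9)];
    let nulls = vec![0u32, 3];
    let options = SortOptions { descending: true, nulls_first: false };
    let indices = sort_impl(options, &mut valids, &nulls, None, |a, b| a.cmp(&b));
    assert_eq!(indices, vec![4, 1, 2, 0, 3]);
}
```

Keeping those concerns in one place is also why behavioural fixes now apply to every type at once, as noted above for boolean arrays.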

}
DataType::Duration(TimeUnit::Second) => {
sort_primitive::<DurationSecondType, _>(values, v, n, cmp, &options, limit)
let (v, n) = partition_validity(array);

Contributor:
since partition_validity uses u32 for indices, doesn't that mean this code cannot sort arrays with more than 4B values? Maybe for things like LargeList this could be a problem 🤔

However, it seems like this was the case prior to this PR as well, so it probably isn't an issue

tustvold (author):
This is a good point, but I think it is somewhat inherent to the current design. FWIW acero also only makes use of 32-bit indices.
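
For reference, a plain-Rust sketch of the pattern under discussion, over `Option` values rather than an Arrow array (the name `partition_validity` mirrors the kernel; the body is illustrative): row indices are materialised as `u32`, which is what caps a single sortable array at `u32::MAX` rows.

```rust
/// Split row indices into (valid, null); both sides are u32, hence the cap.
fn partition_validity(values: &[Option<i64>]) -> (Vec<u32>, Vec<u32>) {
    (0..values.len() as u32).partition(|&i| values[i as usize].is_some())
}

fn main() {
    let values = [Some(1i64), None, Some(3)];
    let (valid, nulls) = partition_validity(&values);
    assert_eq!(valid, vec![0, 2]);
    assert_eq!(nulls, vec![1]);
}
```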

.into_iter()
.map(|index| (index, values.value(index as usize)))
.collect::<Vec<(u32, bool)>>();
sort_impl(options, &mut valids, &null_indices, limit, |a, b| a.cmp(&b)).into()

Contributor:
👌 sort_impl is very nice


@tustvold merged commit a81da6c into apache:master on Aug 4, 2023
25 checks passed