
List lexicographic comparator #11129
Merged (91 commits), Sep 12, 2022
Conversation

@devavret (Contributor) commented Jun 21, 2022

Contributes to #10186

This PR enables lexicographic comparisons between list columns. The comparison is robust to arbitrary levels of nesting, but does not yet support lists of (lists of...) structs. The comparison is based on the Dremel encoding already in use in the Parquet file format. To assist reviewers, here is a reasonably complete list of the changes in this PR:

  1. A helper function to get per-column Dremel data (for list columns) when constructing a preprocessed table, which now owns the Dremel data.
  2. Updating the set of lexicographically compatible columns to now include list columns as long as they do not have any nested structs within.
  3. An implementation of lexicographic::device_row_comparator::operator() for lists. This function is the heart of the change to enable comparisons between list columns.
  4. A new benchmark for sorting that uses list data.
  5. An update to a preexisting rolling collect set test that previously failed (because it requires list comparison) but now works.
  6. New tests for list comparison.

@ttnghia (Contributor) commented Sep 8, 2022

> The struct performance slowing down is fine IMO.

I'm starting to review this PR, and the statement above makes me a bit worried. In Spark there are many cases that compare non-nested structs (structs of basic types) using the flattening approach plus the basic lexicographic comparator. Switching those to the new lexicographic comparator could cause significant performance regressions, and the Spark team may scream in panic if they see their benchmarks slow down because of it.

@ttnghia (Contributor) commented Sep 8, 2022

As per my argument above, if this PR changes the performance of existing APIs (such as binary search and sorting), then I would suggest not merging it until it has been tested against the spark-rapids performance benchmarks and we have confirmation that it is safe to merge.

Comment on lines +77 to +113
  auto const d_comparator = comparator.less<true>(nullate::DYNAMIC{has_nulls});
  if (find_first) {
    thrust::lower_bound(rmm::exec_policy(stream),
                        haystack_it,
                        haystack_it + haystack.num_rows(),
                        needles_it,
                        needles_it + needles.num_rows(),
                        out_it,
                        d_comparator);
  } else {
    thrust::upper_bound(rmm::exec_policy(stream),
                        haystack_it,
                        haystack_it + haystack.num_rows(),
                        needles_it,
                        needles_it + needles.num_rows(),
                        out_it,
                        d_comparator);
  }
} else {
  auto const d_comparator = comparator.less<false>(nullate::DYNAMIC{has_nulls});
  if (find_first) {
    thrust::lower_bound(rmm::exec_policy(stream),
                        haystack_it,
                        haystack_it + haystack.num_rows(),
                        needles_it,
                        needles_it + needles.num_rows(),
                        out_it,
                        d_comparator);
  } else {
    thrust::upper_bound(rmm::exec_policy(stream),
                        haystack_it,
                        haystack_it + haystack.num_rows(),
                        needles_it,
                        needles_it + needles.num_rows(),
                        out_it,
                        d_comparator);
  }
}

Previously I tried to use a lambda to simplify the code:

  auto const do_search = [find_first](auto&&... args) {
    if (find_first) {
      thrust::lower_bound(std::forward<decltype(args)>(args)...);
    } else {
      thrust::upper_bound(std::forward<decltype(args)>(args)...);
    }
  };
  do_search(rmm::exec_policy(stream),
            count_it,
            count_it + haystack.num_rows(),
            count_it,
            count_it + needles.num_rows(),
            out_it,
            comp);

Unfortunately, using such a variadic template causes false compiler warnings. I'm bringing it back up here in case you have a new idea for improving this.

@ttnghia (Contributor) Sep 8, 2022

Even if not, we can still cut the code in half by writing a lambda for the lower_bound/upper_bound block and calling it twice.


See #11667 for discussion on how we can improve this. The plan is to do that in a follow-up so that we don't hold up this feature any longer for refactoring-like tasks.

@vyasr (Contributor) commented Sep 8, 2022

> As per my argument above, if this PR changes the performance of the existing APIs (like binary search and sorting) then I would suggest not to merge this until it has been tested with the performance benchmark in spark-rapids and got confirmation of a safe-merge.

This makes me sad, but it is a good point. I agree that we probably want to get this tested by spark-rapids. Can you follow up with the appropriate person (perhaps @abellina?) to get those benchmarks run?

The old comparator will eventually be phased out entirely in favor of the new one. If there are cases where the new comparator's performance measures up poorly against the old one, then we'll have to consider ways to optimize it.

I take your point that the new comparator may slow down the case of structs that don't contain other structs, but I'm honestly not sure how far we want to go here. Truly providing peak performance for every possible combination would probably require many different template overloads of this type, each with different operator overloads enabled or disabled at compile time (lists without structs, structs without lists, no lists or structs, different types of mixed structs/lists), and perhaps even based on whether or not nulls are present. Going down this route will, I suspect, very quickly start to incur a much higher maintenance burden than what we currently have, since these performance characteristics are also likely to change with each compiler version.

In my view it is expected that enabling more features for comparing nested types could have performance costs, and to some extent that is acceptable.

CC @GregoryKimball

@ttnghia ttnghia added 5 - DO NOT MERGE Hold off on merging; see PR for details and removed 5 - DO NOT MERGE Hold off on merging; see PR for details labels Sep 9, 2022
@vyasr (Contributor) commented Sep 12, 2022

@gpucibot merge

Labels: CMake (CMake build issue), feature request (New feature or request), libcudf (Affects libcudf (C++/CUDA) code), non-breaking (Non-breaking change)