-
Notifications
You must be signed in to change notification settings - Fork 908
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for scatter() on lists-of-struct columns #6817
Conversation
Please update the changelog in order to start CI tests. View the gpuCI docs here. |
The for-loop at scatter.cuh#L662 couldn't be boiled down to a 2-sequence |
I'll leave this as a draft PR until its dependency (#6768) is committed. |
Codecov Report
@@ Coverage Diff @@
## branch-0.17 #6817 +/- ##
============================================
Coverage 81.94% 81.94%
============================================
Files 96 96
Lines 16164 16164
============================================
Hits 13246 13246
Misses 2918 2918 Continue to review full report at Codecov.
|
5e77710
to
2072a40
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. A few small comments.
1. Review: Switch loop to std::transform() 2. Review: Remove superfluous comments 3. Review: Correct the use of cuda_stream_view
1. Const all that can be const
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good. For the unit tests, I find them pretty hard to read the way they are written, with all of the varying "parameter list indentation" and vertical lists for initializer lists, i.e.:
auto expected_strings =
strings_column_wrapper{{"eight",
"eight",
"eight",
"one",
"one",
"nine",
"nine",
"nine",
"nine",
"three",
"three",
"four",
"four",
"five",
"five"},
make_counting_transform_iterator(0, [](auto i) { return i != 1; })};
It's hard to give comments though on changes to make. Ideally though, I aim for unit tests that end up looking like this:
TYPED_TEST(FixedPointTypes, FixedPointSimple)
{
using namespace numeric;
using decimalXX = TypeParam;
using RepType = cudf::device_storage_type_t<decimalXX>;
using fp_wrapper = cudf::test::fixed_point_column_wrapper<RepType>;
auto const mask = cudf::test::fixed_width_column_wrapper<bool>{0, 1, 1, 1, 0, 0};
auto const a = fp_wrapper{{110, 220, 330, 440, 550, 660}, scale_type{-2}};
auto const b = fp_wrapper{{0, 0, 0, 0, 0, 0}, scale_type{-2}};
auto const expected = fp_wrapper{{0, 220, 330, 440, 0, 0}, scale_type{-2}};
auto const result = cudf::copy_if_else(a, b, mask);
CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view());
}
I might set up a follow up PR / give a better code talk at some point to share my thoughts / see what people think.
@codereport, I agree about the indentation in the tests. These did not start out looking this way.I can try disabling the code formatter around the column definitions. |
Yea, my recommendation would be to try and write the code in a way that plays well with |
I'm not sure to what degree this can be done, given the complexity of P.S. I wasn't a fan of how |
Better formatting for tests.
Done. Resolved merge conflicts, and added (arguably) better formatting for literal columns used in |
I don't like turning clang-format off for more than a few lines. If someone comes in and changes some code and relies on clang-format to clean it up, we will end up with an accretion of terrible formatting. If it's too verbose to create a |
Limit disabling clang-format to the literals' definition.
I've now restricted where |
Thanks, that is much nicer. |
Addendum to #6768, to support scatter on columns of type
list<struct>
.Depends on #6768.
The hard part was the tests.
(This branch is based off of that in #6768, and shows all commits from that PR. Only the last 4 commits are material tostruct_view
support.)