Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add column constructor from device_uvector&& #11356

Conversation

SrikarVanavasam
Copy link
Contributor

Closes #11115

This PR adds a column constructor to be constructible from a device_uvector&& using move semantics.

@SrikarVanavasam SrikarVanavasam requested a review from a team as a code owner July 26, 2022 18:03
@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Jul 26, 2022
@SrikarVanavasam SrikarVanavasam added improvement Improvement / enhancement to an existing function 3 - Ready for Review Ready for review by team non-breaking Non-breaking change labels Jul 26, 2022
@SrikarVanavasam SrikarVanavasam changed the title Add 'column constructor' from 'device_uvector&&' Add column constructor from device_uvector&& Jul 26, 2022
@davidwendt
Copy link
Contributor

This seems unnecessary since we have a constructor that accepts moving a device_buffer and that is easily obtained from device_uvector. Is there an issue with just using the current device_buffer constructor?
Also, I'd rather not add yet another include to column.hpp.

@ttnghia
Copy link
Contributor

ttnghia commented Jul 26, 2022

Oh yeah, the existing constructor seems to be enough, right:

... = column(type, size, buff.release()); 

@nvdbaranec
Copy link
Contributor

@bdice

@bdice
Copy link
Contributor

bdice commented Jul 26, 2022

I think this would be a helpful feature for simplifying code in libcudf. In a lot of places, it would reduce the number of explicitly passed constructor arguments from three (data_type, size, and device_buffer) to one (the device_uvector&& knows its type and size and releases its data buffer).

This move constructor benefits correctness: it prevents users from providing a data_type that doesn't match the device_uvector<T>::value_type or specifying a size that doesn't match the input data. From some code I've been reading lately, I suspect this kind of problem is happening in places where size_type and int32_t are used interchangeably.

While it's true that the snippet from @ttnghia quoted below is equivalent, I think there is a significant benefit in brevity if we can directly construct the column from a moved device_uvector<T>&&.

... = column(type, size, buff.release()); 

We've previously settled in favor of adding these types of conversions for non-owning types (#10302, #9656). Previous input from @jrhemstad and others have also indicated support for this: #9656 (comment)

I think this is a straightforward extension of work we've already done for ensuring semantically safe, accurate, and short code in libcudf.

Could we measure the build time changes to address @davidwendt's concern about includes? I don't expect a significant change because device_uvector.hpp is not that large and mostly includes other headers that are already being included, but it would be good to know the numbers.

@davidwendt
Copy link
Contributor

Perhaps we should remove the device-buffer constructor then?
This convenience wrapper to avoid a single call to release() and an extra size parameter seems unnecessary to me in our library.

@davidwendt
Copy link
Contributor

davidwendt commented Jul 26, 2022

My concern for the extra include goes beyond compile time. More include files increases the burden on building applications using libcudf functions.
Perhaps this could be a helper function in some other include file instead of adding yet another constructor?
Like a factory function: https://docs.rapids.ai/api/libcudf/stable/group__column__factories.html

Comment on lines 86 to 90
template <typename T>
column(rmm::device_uvector<T>&& other) noexcept
: _type{cudf::type_to_id<T>()},
_size{static_cast<cudf::size_type>(other.size())},
_data{other.release()}
Copy link
Contributor

@jrhemstad jrhemstad Jul 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be constrained in some way to prevent construction from a device_uvector<string_view> or device_uvector<list_view>.

It should probably be:

template <typename T, CUDF_ENABLE_IF(is_fixed_width<T>())>

Double check if fixed-point types are considered fixed-width, because I don't think you want this for a device_uvector<decimal32/64> either.

Copy link
Contributor

@bdice bdice Jul 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed point types are fixed width, and we don't want those. We probably want the same condition as what we use for the non-owning column_view from device_span construction: cudf::is_numeric<T>() or cudf::is_chrono<T>()

template <typename T, CUDF_ENABLE_IF(cudf::is_numeric<T>() or cudf::is_chrono<T>())>

Copy link
Contributor

@davidwendt davidwendt Jul 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another reason this should probably just be a make_fixed_width_column factory function instead of a constructor.
This factory: https://docs.rapids.ai/api/libcudf/stable/group__column__factories.html#ga595aed5d1e6f3d4b3d8da9eeffed916d
takes a buffer for a null mask. It seems reasonable to add one that accepts a device-uvector.
Here is one we use for strings that takes 2 uvectors with move semantics: https://docs.rapids.ai/api/libcudf/stable/group__column__factories.html#ga86f7623f0d230c96491ef88d665385cc

@bdice
Copy link
Contributor

bdice commented Jul 27, 2022

Another argument for adopting this as a constructor: it's easy to write a significant bug with the current constructor from data_type, size_type, device_buffer.

Consider the following:

rmm::device_uvector<int> v{...};
column c{type_to_id<int>(), v.size(), v.release()};

This contains a bug: there is no guaranteed ordering in which v.size() and v.release() are called. We certainly don't want to release ownership of the data and then fetch the size.

Order of evaluation of any part of any expression, including order of evaluation of function arguments is unspecified (with some exceptions listed below).
-- https://en.cppreference.com/w/cpp/language/eval_order

A constructor from device_uvector<T>&& solves this through list initialization sequencing:

  1. In list-initialization, every value computation and side effect of a given initializer clause is sequenced before every value computation and side effect associated with any initializer clause that follows it in the brace-enclosed comma-separated list of initalizers.

On top of that, there are a couple smaller issues:

  • The type int must be repeated in device_uvector<int> and type_to_id<int>, as previously mentioned.
  • It's not as obvious that v is not safe to use after this snippet, when compared to the alternative using std::move (which clearly conveys that ownership is transferred and v is left in an unspecified state).

Credit to @jrhemstad for mentioning this to me.

@davidwendt
Copy link
Contributor

This contains a bug: there is no guaranteed ordering in which v.size() and v.release() are called. We certainly don't want to release ownership of the data and then fetch the size.

This is a very good point. I would still like to consider a make_fixed_width_column factory function instead of a constructor.

@codecov
Copy link

codecov bot commented Jul 29, 2022

Codecov Report

❗ No coverage uploaded for pull request base (branch-22.10@df24808). Click here to learn what that means.
The diff coverage is n/a.

❗ Current head 5477423 differs from pull request most recent head 5183cff. Consider uploading reports for the commit 5183cff to get more accurate results

@@               Coverage Diff               @@
##             branch-22.10   #11356   +/-   ##
===============================================
  Coverage                ?   86.47%           
===============================================
  Files                   ?      144           
  Lines                   ?    22856           
  Branches                ?        0           
===============================================
  Hits                    ?    19764           
  Misses                  ?     3092           
  Partials                ?        0           

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@ttnghia
Copy link
Contributor

ttnghia commented Jul 29, 2022

I would still like to consider a make_fixed_width_column factory function instead of a constructor.

I totally support this.
My next PR (#10656) will directly use it. Currently, I have to do this:

  auto result_v = contains(table_view{{haystack}},
                           table_view{{needles}},
                           null_equality::EQUAL,
                           nan_equality::ALL_EQUAL,
                           stream,
                           mr);

  // todo: https://github.com/rapidsai/cudf/pull/11356
  auto result =
    std::make_unique<column>(data_type{type_to_id<bool>()}, needles.size(), result_v.release());
  result->set_null_mask(copy_bitmask(needles, stream, mr), needles.null_count());

  return result;

If using the current implementation of this PR, I will have to do this:

auto result =
    std::make_unique<column>(std::move(result_v);
  result->set_null_mask(copy_bitmask(needles, stream, mr), needles.null_count());
return result;

If we implement a make_fixed_width_column factory function, I can just do this:

return make_fixed_width_column(std::move(result_v), 
  needles.null_count(), copy_bitmask(needles, stream, mr));

@ttnghia
Copy link
Contributor

ttnghia commented Jul 29, 2022

This contains a bug: there is no guaranteed ordering in which v.size() and v.release() are called.

This is a very easy-to-step-on bug. I just got it myself, when implementing the code above ^^.

@jrhemstad
Copy link
Contributor

jrhemstad commented Jul 29, 2022

If using the current implementation of this PR, I will have to do this:

auto result =
std::make_unique(std::move(result_v);
result->set_null_mask(copy_bitmask(needles, stream, mr), needles.null_count());
return result;

If we implement a make_fixed_width_column factory function, I can just do this:

return make_fixed_width_column(std::move(result_v),
needles.null_count(), copy_bitmask(needles, stream, mr));

The ctor in this PR just needs to be updated to optionally accept a null mask and null count.

  template <typename T, CUDF_ENABLE_IF(cudf::is_numeric<T>() or cudf::is_chrono<T>())>
  column(rmm::device_uvector<T>&& other, device_buffer&& null_mask = {}, size_type null_count = UNKNOWN_NULL_COUNT)

@bdice
Copy link
Contributor

bdice commented Aug 1, 2022

I spoke with @jrhemstad and @davidwendt over the last few days and I have a few final notes from those discussions that I'll share here.

My understanding of the historical evolution of constructors-vs-factories (see #2207, #3252, #3461 (comment)) is that all factories except strings factories construct columns with uninitialized buffers. Strings factories (#2811, #7397) are the primary exception, by accepting initialized data buffers of characters and offsets.

The proposal in this issue is to make a constructor. I think this is the right path because the other existing column constructors accept data, while the current factories for fixed-width data return uninitialized columns.

In some cases, the addition of this constructor will mean that developers will follow a pattern like "create device_uvector, fill device_uvector with data, construct column from moved device_uvector." An alternative pattern that negates the need for a constructor (or factory function) is "create column from factory function, use mutable_column_view to fill with data, return column." Both are reasonable patterns, but in some cases (like dealing with indices/offsets), I have a weak preference for using a mutable, typed device_uvector to fill data and avoiding an untyped mutable_column_view (@davidwendt felt the opposite way in our discussion). For some types, like fixed point types, the device_uvector isn't a good option because it lacks the same level of metadata that is contained in a column.

The possibility of a "slippery slope" is that we could make column constructors for any type that is a GPU-resident owning buffer. However, I don't anticipate this being an issue. The most obvious such class is rmm::device_vector, but that is just a typedef for thrust::device_vector, and is not backed by the rmm::device_buffer that we need to construct the column. I don't know of other competing objects that would fall into the same category.

Co-authored-by: Nghia Truong <nghiatruong.vn@gmail.com>
Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SrikarVanavasam and others added 2 commits August 2, 2022 22:08
Co-authored-by: Nghia Truong <nghiatruong.vn@gmail.com>
cpp/include/cudf/column/column.hpp Outdated Show resolved Hide resolved
@ttnghia ttnghia requested a review from jrhemstad August 3, 2022 17:51
@nvdbaranec
Copy link
Contributor

@gpucibot merge

@rapids-bot rapids-bot bot merged commit d86bb39 into rapidsai:branch-22.10 Aug 3, 2022
rapids-bot bot pushed a commit that referenced this pull request Aug 17, 2022
This extends the `cudf::contains` API to support nested types (lists + structs) with arbitrarily nested levels. As such, `cudf::contains` will work with literally any type of input data.

In addition, this fixes null handling of `cudf::contains` with structs column + struct scalar input when the structs column contains null rows at the top level while the scalar key is valid but all nulls at children levels.

Closes: #8965
Depends on:
 * #10730
 * #10883
 * #10802
 * #10997
 * NVIDIA/cuCollections#172
 * NVIDIA/cuCollections#173
 * #11037
 * #11356

Authors:
  - Nghia Truong (https://github.com/ttnghia)
  - Devavret Makkar (https://github.com/devavret)
  - Bradley Dice (https://github.com/bdice)
  - Karthikeyan (https://github.com/karthikeyann)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Bradley Dice (https://github.com/bdice)
  - Yunsong Wang (https://github.com/PointKernel)

URL: #10656
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] Add column constructor from device_uvector&&.
6 participants