Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor rmm::device_scalar in terms of rmm::device_uvector #789

Merged

Conversation

harrism
Copy link
Member

@harrism harrism commented Jun 1, 2021

This PR refactors device_scalar to use a single-element device_uvector for its storage. This simplifies the implementation of device_scalar. Also changes the API of device_scalar so that its asynchronous / stream-ordered methods use the same API style (with explicit stream parameter) as device_uvector and device_buffer.

Closes #570

This is a breaking change. When it is merged, PRs are likely to need to be merged immediately in other libraries to account for the API changes.

@harrism harrism added breaking Breaking change improvement Improvement / enhancement to an existing function labels Jun 1, 2021
@harrism harrism requested a review from jrhemstad June 1, 2021 00:25
@harrism harrism self-assigned this Jun 1, 2021
@harrism harrism requested a review from a team as a code owner June 1, 2021 00:25
@harrism harrism added this to PR-WIP in v21.08 Release via automation Jun 1, 2021
@harrism harrism requested a review from rongou June 1, 2021 00:25
@github-actions github-actions bot added the cpp Pertains to C++ code label Jun 1, 2021
@harrism harrism changed the title Fea device scalar refactor Refactor rmm::device_scalar in terms of rmm::device_uvector Jun 1, 2021
T const &initial_value,
cuda_stream_view stream = cuda_stream_view{},
value_type const &initial_value,
cuda_stream_view stream,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove the default? It seems that it would be a common use case to use the default stream, especially with PTDS enabled.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we want to make all stream-ordered APIs explicit. See e.g. #418. Note that device_uvector and now device_buffer have no default stream parameters.

RMM_CUDA_TRY(
cudaMemcpyAsync(element_ptr(element_index), &v, sizeof(v), cudaMemcpyDefault, s.value()));
s.synchronize_no_throw();
if constexpr (std::is_fundamental<value_type>::value) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this can be done using SFINAE?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It certainly can, but why would I want to go back to SFINAE? I replaced SFINAE with if constexpr, which allows having a single implementation of this function rather than three!

Comment on lines 40 to 46
using value_type = T;
using reference = value_type &;
using const_reference = value_type const &;
using pointer = value_type *;
using const_pointer = value_type const *;
using iterator = pointer;
using const_iterator = const_pointer;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would define these in terms of device_uvector. Also, I don't think the iterator typedefs are needed for device_scalar?

Suggested change
using value_type = T;
using reference = value_type &;
using const_reference = value_type const &;
using pointer = value_type *;
using const_pointer = value_type const *;
using iterator = pointer;
using const_iterator = const_pointer;
using value_type = typename device_uvector<T>::value_type;
using reference = typename device_uvector<T>::reference;
using const_reference = typename device_uvector<T>::const_reference;
using pointer = typename device_uvector<T>::pointer;
using const_pointer = typename device_uvector<T>::const_pointer;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had been thinking about whether it would be useful for device_scalar to provide begin() and end()...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks.

@harrism harrism requested review from rongou and jrhemstad June 2, 2021 23:29
v21.08 Release automation moved this from PR-WIP to PR-Reviewer approved Jun 3, 2021
@harrism
Copy link
Member Author

harrism commented Jun 8, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit aa2a2f3 into rapidsai:branch-21.08 Jun 8, 2021
v21.08 Release automation moved this from PR-Reviewer approved to Done Jun 8, 2021
rapids-bot bot pushed a commit to rapidsai/cudf that referenced this pull request Jun 8, 2021
rapidsai/rmm#789 refactors `rmm::device_scalar`, which all of `cudf::scalar` depends on. Notably, it renames some methods, makes stream parameters explicit, and deletes streamless constructors. As a result, the present PR deletes the default and non-stream copy constructors of all the `cudf::*_scalar` classes.

This should be merged immediately after rapidsai/rmm#789 because that PR will break the build.

Authors:
  - Mark Harris (https://github.com/harrism)

Approvers:
  - https://github.com/brandon-b-miller
  - Vukasin Milovanovic (https://github.com/vuule)
  - Robert (Bobby) Evans (https://github.com/revans2)
  - Robert Maynard (https://github.com/robertmaynard)

URL: #8411
rapids-bot bot pushed a commit to rapidsai/raft that referenced this pull request Jun 8, 2021
In comms/test.hpp `device_scalar::value` was not being passed an explicit stream, which means that the default stream was being synced. rapidsai/rmm#789 will remove the default from this parameter, and would have therefore broken the RAFT build. So this PR fixes the oversynchronization and ensures RAFT will build after the RMM PR is merged.

Note this PR includes the cmake changes from #258 (just so I could build locally). Once #258 is merged this PR's changes will be simplified.

Authors:
  - Mark Harris (https://github.com/harrism)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)
  - Chuck Hastings (https://github.com/ChuckHastings)

URL: #259
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Breaking change cpp Pertains to C++ code improvement Improvement / enhancement to an existing function
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

[FEA] Refactor device_scalar in terms of device_uvector and fix asynchrony
3 participants