Add device_uvector::reserve and device_buffer::reserve #1079

upsj · 2022-08-03T07:00:11Z

I am building a parser that outputs variable-sized blocks of data. To collect them, I would like to use pre-allocated device_uvectors, using size() to keep track of how much memory is already in use. Setting the capacity and size manually works at the moment by calling vec.resize(capacity, stream); vec.resize(size, stream); on an empty vector, but this seems unnecessarily complicated. Since device_uvector otherwise already closely matches the std::vector interface, I want to propose adding reserve to the interface.
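For reference, the difference between the two-step `resize` workaround and `reserve` can be sketched with `std::vector`, which the description names as the interface being matched. This is only an analogy under stated assumptions: `std::vector` takes no stream argument and value-initializes new elements, and the two helper functions below are hypothetical names, not part of RMM.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Workaround from the description, expressed with std::vector as a stand-in:
// grow to the wanted capacity, then shrink the size back down.
inline void preallocate_with_resize(std::vector<int>& vec, std::size_t capacity, std::size_t size)
{
  vec.resize(capacity);  // allocates storage and value-initializes `capacity` elements
  vec.resize(size);      // reduces the size; the existing allocation is kept
}

// Proposed shape: a single call that only sets the capacity and leaves the size alone.
inline void preallocate_with_reserve(std::vector<int>& vec, std::size_t capacity)
{
  vec.reserve(capacity);
}
```

Both helpers leave an empty vector with room for at least `capacity` elements, but `reserve` states the intent in one call and does not touch the elements.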

TODO:

Add to Python interface

harrism

Seems like a reasonable addition. One comment / question about making it a little cleaner.

include/rmm/device_buffer.hpp

jrhemstad · 2022-08-03T15:06:37Z

include/rmm/device_buffer.hpp

+      auto tmp = device_buffer{new_size, stream, _mr};
      RMM_CUDA_TRY(
-        cudaMemcpyAsync(new_data, data(), size(), cudaMemcpyDefault, this->stream().value()));
-      deallocate_async();
-      _data     = new_data;
-      _size     = new_size;
-      _capacity = new_size;
+        cudaMemcpyAsync(tmp.data(), data(), size(), cudaMemcpyDefault, this->stream().value()));
+      std::swap(tmp, *this);


I could have sworn I had a reason for not using CAS here originally, but now I'm struggling to remember what it could have been.

Oh well ¯_(ツ)_/¯

Does this work correctly without adding/specializing a swap function for device_buffer?

I also have this question. From what I can tell std::swap should be safe, but I may be missing something. Also is this "CAS"? There's no comparison, just a swap, right?

Also is this "CAS"?

Sorry, overloaded acronym. "copy and swap"

Does this work correctly without adding/specializing a swap function for device_buffer?

Yeah, the default implementation of swap will just use the move ctor: https://stackoverflow.com/a/25286610/11341974

So I read the link about the behavior of swap, and that all seems fine. But could we just move-assign the new buffer to this instance instead of swapping?

Should be fine as well. No need to move around the old buffer ;) Changed it to std::move
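The claim that the unspecialized `std::swap` only needs the move constructor and move assignment can be checked with a small host-only sketch. `move_counter` and `swap_exchanges_values` are hypothetical names used for illustration; nothing here is RMM code.

```cpp
#include <cassert>
#include <utility>

// Counters recording how often the move operations below run.
int move_ctor_calls   = 0;
int move_assign_calls = 0;

// Hypothetical move-only type. Declaring the move operations suppresses the
// implicit copy operations, so std::swap has to go through the moves.
struct move_counter {
  int value = 0;

  explicit move_counter(int v) : value(v) {}

  move_counter(move_counter&& other) noexcept : value(other.value)
  {
    ++move_ctor_calls;
    other.value = 0;
  }

  move_counter& operator=(move_counter&& other) noexcept
  {
    if (&other != this) {
      ++move_assign_calls;
      value       = other.value;
      other.value = 0;
    }
    return *this;
  }
};

// Swaps two objects with the generic std::swap and reports whether the values were exchanged.
inline bool swap_exchanges_values()
{
  move_counter a{1};
  move_counter b{2};
  std::swap(a, b);
  return a.value == 2 && b.value == 1;
}
```

Since the swap is built from one move construction and move assignments, move-assigning the temporary directly does the same job with fewer steps, which matches the change to `std::move`.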

include/rmm/device_uvector.hpp

bdice · 2022-08-03T23:14:41Z

I also have this question. From what I can tell std::swap should be safe, but I may be missing something. Also is this "CAS"? There's no comparison, just a swap, right?

harrism · 2022-08-04T12:36:02Z

include/rmm/device_buffer.hpp

+      auto tmp            = device_buffer{new_capacity, stream, _mr};
+      auto const old_size = size();
+      RMM_CUDA_TRY(cudaMemcpyAsync(tmp.data(), data(), size(), cudaMemcpyDefault, stream.value()));
+      *this = std::move(tmp);


If you just move tmp over *this, is the old memory of this properly deallocated? I want to ensure there is no memory leak. With swap, it's obvious to me there is no leak because tmp is on the stack and when it goes out of scope it will be destroyed, taking the old memory of this with it.

is the old memory of this properly deallocated?

Yep, it will invoke the move assignment operator of device_buffer which will deallocate the original memory.

rmm/include/rmm/device_buffer.hpp

Lines 194 to 223 in be9b9a9

/**
 * @brief Move assignment operator moves the contents from `other`.
 *
 * This `device_buffer`'s current device memory allocation will be deallocated
 * on `stream()`.
 *
 * If a different stream is required, call `set_stream()` on
 * the instance before assignment. After assignment, this instance's stream is
 * replaced by the `other.stream()`.
 *
 * @param other The `device_buffer` whose contents will be moved.
 */
device_buffer& operator=(device_buffer&& other) noexcept
{
  if (&other != this) {
    deallocate_async();

    _data     = other._data;
    _size     = other._size;
    _capacity = other._capacity;
    set_stream(other.stream());
    _mr = other._mr;

    other._data     = nullptr;
    other._size     = 0;
    other._capacity = 0;
    other.set_stream(cuda_stream_view{});
  }
  return *this;
}
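The no-leak argument can be sketched on the host. `host_buffer` below is a hypothetical stand-in for `device_buffer` that uses `new[]`/`memcpy` instead of a memory resource and `cudaMemcpyAsync`, and it has no streams. Restoring `_size` after the move is an assumption inferred from the `old_size` variable in the diff above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Allocations currently alive, so a leak shows up as a non-zero count.
int live_allocations = 0;

// Hypothetical host-memory stand-in for rmm::device_buffer.
class host_buffer {
 public:
  host_buffer() = default;
  explicit host_buffer(std::size_t size) : _data{allocate(size)}, _size{size}, _capacity{size} {}
  host_buffer(host_buffer const&)            = delete;
  host_buffer& operator=(host_buffer const&) = delete;

  host_buffer(host_buffer&& other) noexcept
    : _data{std::exchange(other._data, nullptr)},
      _size{std::exchange(other._size, 0)},
      _capacity{std::exchange(other._capacity, 0)}
  {
  }

  host_buffer& operator=(host_buffer&& other) noexcept
  {
    if (&other != this) {
      deallocate();  // frees the allocation held before the assignment
      _data     = std::exchange(other._data, nullptr);
      _size     = std::exchange(other._size, 0);
      _capacity = std::exchange(other._capacity, 0);
    }
    return *this;
  }

  ~host_buffer() { deallocate(); }

  // Grows the capacity via a temporary plus move assignment; the size is preserved.
  void reserve(std::size_t new_capacity)
  {
    if (new_capacity <= _capacity) { return; }
    auto tmp            = host_buffer{new_capacity};
    auto const old_size = _size;
    if (old_size > 0) { std::memcpy(tmp._data, _data, old_size); }
    *this = std::move(tmp);  // old allocation is released inside operator=
    _size = old_size;        // assumption: size is restored after the move
  }

  char* data() { return _data; }
  std::size_t size() const { return _size; }
  std::size_t capacity() const { return _capacity; }

 private:
  static char* allocate(std::size_t bytes)
  {
    if (bytes == 0) { return nullptr; }
    ++live_allocations;
    return new char[bytes];
  }

  void deallocate()
  {
    if (_data != nullptr) {
      delete[] _data;
      --live_allocations;
    }
    _data     = nullptr;
    _size     = 0;
    _capacity = 0;
  }

  char* _data{nullptr};
  std::size_t _size{0};
  std::size_t _capacity{0};
};
```

After `reserve`, exactly one allocation is alive: the old one was released by the move assignment, and the moved-from temporary owns nothing when it goes out of scope.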

I agree the move is safe. However I have a question about streams. Should the old buffer be destroyed on the same stream as the copy (to ensure the copy is complete), or the stream the old buffer was constructed with (current behavior)? Should the copy always occur on the stream used to construct the original buffer to ensure that the reserve sequences after the constructor’s allocation? (I don’t remember seeing an explicit sync but need to look again.) Is this solved by the call to set_stream?

Is this solved by the call to set_stream?

Yes. Everything here happens on the provided stream argument.

It would be an error to construct a device_buffer on s1 and reserve on s2 without first doing a synchronization between s1 and s2.

Fantastic. I wanted to be sure I understood that correctly. Approving now.

Thanks all. Sounds right.

bdice

No reservations from me! Thanks @upsj!

harrism

Thanks @upsj !

harrism · 2022-08-06T02:07:22Z

@gpucibot merge

add device_uvector::reserve

7e19497

upsj requested a review from a team as a code owner August 3, 2022 07:00

upsj requested review from harrism and jrhemstad August 3, 2022 07:00

github-actions bot added the cpp Pertains to C++ code label Aug 3, 2022

add device_uvector::reserve to rmm-python

02aca70

harrism reviewed Aug 3, 2022

implement device_buffer::reserve via swap

31a193e

upsj requested a review from a team as a code owner August 3, 2022 11:36

github-actions bot added the Python Related to RMM Python API label Aug 3, 2022

jrhemstad reviewed Aug 3, 2022

bdice reviewed Aug 3, 2022

bdice added non-breaking Non-breaking change improvement Improvement / enhancement to an existing function labels Aug 4, 2022

upsj added 3 commits August 4, 2022 06:51

move instead of swap

ab7a095

documentation fixes

57e5d1f

formatting

e01232a

harrism reviewed Aug 4, 2022

jrhemstad approved these changes Aug 4, 2022

bdice approved these changes Aug 4, 2022

harrism approved these changes Aug 6, 2022

harrism added this to PR-WIP in v22.10 Release via automation Aug 6, 2022

harrism changed the title ~~Add device_uvector::reserve~~ Add device_uvector::reserve and device_buffer::reserve Aug 6, 2022

rapids-bot bot merged commit d3b1dfb into rapidsai:branch-22.10 Aug 6, 2022

v22.10 Release automation moved this from PR-WIP to Done Aug 6, 2022
