Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix race condition in limiting resource adapter #869

Merged

Conversation

rongou
Copy link
Contributor

@rongou rongou commented Sep 13, 2021

Fixes #868

Also fixed some clang-tidy warnings.

@rongou rongou added bug Something isn't working 3 - Ready for review Ready for review by team non-breaking Non-breaking change cpp Pertains to C++ code labels Sep 13, 2021
@rongou rongou self-assigned this Sep 13, 2021
@rongou rongou requested a review from a team as a code owner September 13, 2021 18:13
@@ -130,11 +132,12 @@ class limiting_resource_adaptor final : public device_memory_resource {
void* p = nullptr;

std::size_t proposed_size = rmm::detail::align_up(bytes, allocation_alignment_);
if (proposed_size + allocated_bytes_ <= allocation_limit_) {
allocated_bytes_ += proposed_size;
if (allocated_bytes_ <= allocation_limit_) {
p = upstream_->allocate(bytes, stream);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if allocate throws? should we try/catch?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, good point. We should probably catch the exception and decrement allocated_bytes_ and then rethrow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 135 to 136
allocated_bytes_ += proposed_size;
if (allocated_bytes_ <= allocation_limit_) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from the fact that this is doing a redundant atomic load, I don't believe this is the semantics of what we want. This allows an intervening thread to impact the value of allocated_bytes_ between the fetch_add and the load.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Member

@harrism harrism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question.

try {
return upstream_->allocate(bytes, stream);
} catch (...) {
allocated_bytes_ -= proposed_size;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why doesn't this need to be .fetch_sub?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

operator-=(x) is equivalent to fetch_sub(x). fetch_add was used above to retrieve the old value. The old value isn't needed here, thus the less verbose operator-= could be used.

@harrism
Copy link
Member

harrism commented Sep 14, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit fe53a72 into rapidsai:branch-21.10 Sep 14, 2021
@rongou rongou deleted the fix-limiting-resource-adapter branch October 8, 2021 16:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for review Ready for review by team bug Something isn't working cpp Pertains to C++ code non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] limiting_resource_adaptor isn't actually thread safe
5 participants