-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[LLVM][ADT] Explicitly convert size_t values to SmallVector's size type #77939
[LLVM][ADT] Explicitly convert size_t values to SmallVector's size type #77939
Conversation
@llvm/pr-subscribers-llvm-support @llvm/pr-subscribers-llvm-adt Author: Andrei Golubev (andrey-golubev) ChangesMultiple places rely on implicit conversion when assigning 'size_t' values to the member fields (size or capacity) of SmallVector. Depending on the platform / compiler configuration, this may result in narrowing conversion warnings (especially given that the size type of SmallVector's member fields is determined based on type T - in SmallVector<T>). To avoid the problem altogether, make the conversions explicit. Full diff: https://github.com/llvm/llvm-project/pull/77939.diff 1 Files Affected:
diff --git a/llvm/include/llvm/ADT/SmallVector.h b/llvm/include/llvm/ADT/SmallVector.h
index 2e6d2dc6ce90a2..65e862911522a8 100644
--- a/llvm/include/llvm/ADT/SmallVector.h
+++ b/llvm/include/llvm/ADT/SmallVector.h
@@ -61,7 +61,7 @@ template <class Size_T> class SmallVectorBase {
SmallVectorBase() = delete;
SmallVectorBase(void *FirstEl, size_t TotalCapacity)
- : BeginX(FirstEl), Capacity(TotalCapacity) {}
+ : BeginX(FirstEl), Capacity(static_cast<Size_T>(TotalCapacity)) {}
/// This is a helper for \a grow() that's out of line to reduce code
/// duplication. This function will report a fatal error if it can't grow at
@@ -99,8 +99,9 @@ template <class Size_T> class SmallVectorBase {
///
/// This does not construct or destroy any elements in the vector.
void set_size(size_t N) {
- assert(N <= capacity());
- Size = N;
+ auto n = static_cast<Size_T>(N);
+ assert(n <= capacity());
+ Size = n;
}
};
@@ -468,7 +469,7 @@ void SmallVectorTemplateBase<T, TriviallyCopyable>::takeAllocationForGrow(
free(this->begin());
this->BeginX = NewElts;
- this->Capacity = NewCapacity;
+ this->Capacity = static_cast<SmallVectorSizeType<T>>(NewCapacity);
}
/// SmallVectorTemplateBase<TriviallyCopyable = true> - This is where we put
|
CC @dexonsmith @alinas @nikic @browneee you seem to be working on / around SmallVector! feel free to add other reviewers as you see fit. |
llvm/include/llvm/ADT/SmallVector.h
Outdated
assert(N <= capacity()); | ||
Size = N; | ||
auto n = static_cast<Size_T>(N); | ||
assert(n <= capacity()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it make sense to also assert that n == N
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
capacity()
also returns size_t
, so probably better to keep the assertion first. e.g.
assert(N <= capacity());
Size = static_cast<Size_T>(N);
// As Felipe suggests, you could add:
assert(static_cast<size_t>(Size) == N);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, @browneee 's suggestion here is more clear / easier to follow. n
vs. N
is pretty subtle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking a little more: N <= capacity()
implies static_cast<size_t>(Size) == N
because capacity()
is from a Size_T variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
capacity() also returns size_t, so probably better to keep the assertion first.
good point. I'm not sure why our downstream had it differently. "reverted" this part.
overall, rewrote this part as per @browneee 's advice. I think it makes sense to keep the assert(static_cast<size_t>(Size) == N);
at the end as well. if anything, it's a function's postcondition validation so should be no harm (also, wondering if it might actually catch any data-losing size_t -> uint32_t
narrowings).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense to keep the assert(static_cast<size_t>(Size) == N); at the end as well.
This assertion is correct, but not necessary because it is implied by assert(N <= capacity());
.
I would avoid an unnecessary assertion, because this is performance sensitive code.
If you want to keep it for clarity then I'd suggest only doing it in debug:
LLVM_DEBUG(assert(static_cast<size_t>(Size) == N));
My vote would be to omit it, because it is covered by the earlier assertion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought assert() is by definition debug-only (exception being if one explicitly sets LLVM_ENABLE_ASSERTIONS
). do you think people use assert()
via LLVM_ENABLE_ASSERTIONS in production?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it's covered by the earlier assertion, seems like all we need is a comment explaining that. Anyone sufficiently interested can read the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added the comment.
llvm/include/llvm/ADT/SmallVector.h
Outdated
@@ -468,7 +469,7 @@ void SmallVectorTemplateBase<T, TriviallyCopyable>::takeAllocationForGrow( | |||
free(this->begin()); | |||
|
|||
this->BeginX = NewElts; | |||
this->Capacity = NewCapacity; | |||
this->Capacity = static_cast<SmallVectorSizeType<T>>(NewCapacity); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a set_capacity(size_t)
method to SmallVectorBase
, to make this cast a bit neater?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added. please also take a look at the new function (i'm not a native English speaker so maybe there are errors in the docs. albeit I pretty much copy-pasted the relevant set_size
wording).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
After seeing this change, I think it would probably make sense to include this line in the new function:
this->BeginX = NewElts;
... setting the allocation pointer and the capacity together makes sense.
This would also change the name of the new function:
/// Set the allocation to \p A and the capacity to \p M.
///
/// This does not construct or destroy any elements in the vector.
/// This does not clean up any existing allocation.
void set_allocation_and_capacity(void *A, size_t M) {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm actually hesitant a bit: the set_capacity()
has a much broader usage scope than the new function. i guess there are also cases when we only want to set the capacity?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i guess there are also cases when we only want to set the capacity?
Such as?
I think setting a capacity is intrinsically linked to setting a new storage pointer, which is why it makes sense to do both at once..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I agree with @browneee; these are intrinsically linked; the point of capacity is to understand the storage pointer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i guess there are also cases when we only want to set the capacity?
Such as?
Was thinking of realloc() but we'd anyway reset the pointer (since same memory block being "reused" is an implementation detail).
I think setting a capacity is intrinsically linked to setting a new storage pointer, which is why it makes sense to do both at once..
yeah, fair point. I was wondering whether data start != allocation start
but these seem to always be equal (I guess the case when they are not is the custom alignment requirements but I haven't found anything like this).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one thing is whether set_allocation_and_capacity
is a good name. to me, an "allocation" would mean both the start and the capacity (or end). so named the function set_allocation_range
(as a tribute to "counted range"). There's however already replaceAllocation()
so maybe set_allocation
is enough? anyways, guessing it's for you guys to decide what makes sense here since you seem to interact with this code more often than me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer set_allocation_and_capacity
, but I think any of these options are fine.
llvm/include/llvm/ADT/SmallVector.h
Outdated
/// | ||
/// This does not construct or destroy any elements in the vector. | ||
void set_capacity(size_t M) { | ||
assert(size() <= M); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this assertion is not useful, and may be misleading.
An assertion that would be clear and useful here would be:
assert(M <= SizeTypeMax());
...before assigning the elements and capacity.
You also wouldn't need assert(capacity() == M);
as this assertion implies it will be successful. So remove that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, the assertion you propose would indeed be useful!
size() <= M
is to just guarantee that we don't end up in a weird container state where Size > Capacity
. but then using set_size() together with set_capacity() in the current sense would be problematic:
// capacity = 50, size = 40
// grow:
// set_size(70); // asserts: 70 <= capacity() fails
set_capacity(100); // OK
set_size(70); // OK
// shrink:
// set_capacity(10); // asserts: size() <= 10 fails
set_size(2); // OK
set_capacity(10); // OK
I guess due to this "swap" in the order, it might make it harder to write generic code? but then I again think of Size > Capacity
problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert(M <= SizeTypeMax());
btw, I guess this could be added to set_size()
as well. then equality check (size() == N
) at the end also gets superseded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, if the existing assert(N <= capacity());
passes then I think assert(M <= SizeTypeMax());
must be true.
I think:
set_size
should only checkassert(N <= capacity());
set_capacity
should only checkassert(M <= SizeTypeMax());
Using set_capacity to shrink (regardless of the size) doesn't make sense, because set_capacity
would not actually handle shrinking (only a small piece of shrinking - switching to a smaller allocation).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Multiple places rely on implicit conversion when assigning 'size_t' values to the member fields (size or capacity) of SmallVector. Depending on the platform / compiler configuration, this may result in narrowing conversion warnings (especially given that the size type of SmallVector's member fields is determined based on type T - in SmallVector<T>). To avoid the problem altogether, make the conversions explicit. Co-authored-by: Orest Chura <orest.chura@intel.com>
9eebfbc
to
5e2626b
Compare
@dexonsmith added the comment but also squashed the changes for further merge. could you click the merge (if you think the changes are correct)? I lack permissions for this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM too.
Looks like the |
@dexonsmith ping. The build finished over the weekends. |
Multiple places rely on implicit conversion when assigning 'size_t' values to the member fields (size or capacity) of SmallVector.
Depending on the platform / compiler configuration, this may result in narrowing conversion warnings (especially given that the size type of SmallVector's member fields is determined based on type T - in SmallVector). To avoid the problem altogether, make the conversions explicit.