Skip to content

Commit b8d08e9

Browse files
browneeedexonsmith
authored andcommitted
ADT: SmallVector size/capacity use word-size integers when elements are small
SmallVector currently uses 32bit integers for size and capacity to reduce sizeof(SmallVector). This limits the number of elements to UINT32_MAX. For a SmallVector<char>, this limits the SmallVector size to only 4GB. Buffering bitcode output uses SmallVector<char>, but needs >4GB output. This changes SmallVector size and capacity to conditionally use word-size integers if the element type is small (<4 bytes). For larger elements types, the vector size can reach ~16GB with 32bit size. Making this conditional on the element type provides both the smaller sizeof(SmallVector) for larger types which are unlikely to grow so large, and supports larger capacities for smaller element types. This change also includes a fix for the bug where a SmallVector with 32bit size has reached UINT32_MAX elements, and cannot provide guaranteed growth. Context: // Double the size of the allocated memory, guaranteeing space for at // least one more element or MinSize if specified. void grow(size_t MinSize = 0) { this->grow_pod(MinSize, sizeof(T)); } void push_back(const T &Elt) { if (LLVM_UNLIKELY(this->size() >= this->capacity())) this->grow(); memcpy(reinterpret_cast<void *>(this->end()), &Elt, sizeof(T)); this->set_size(this->size() + 1); } When grow is called in push_back() without a MinSize specified, this is relying on the guarantee of space for at least one more element. There is an edge case bug where the SmallVector is already at its maximum size and push_back() calls grow() with default MinSize of zero. Grow is unable to provide space for one more element, but push_back() assumes the additional element it will be available. This can result in silent memory corruption, as this->end() will be an invalid pointer and the program may continue executing. An alternative to this fix would be to remove the default argument from grow(), which would mean several changing grow() to grow(this->size()+1) in several places. No test case added because it would require allocating a large ammount. Differential Revision: https://reviews.llvm.org/D77621
1 parent 73b7dd1 commit b8d08e9

File tree

2 files changed

+85
-43
lines changed

2 files changed

+85
-43
lines changed

llvm/include/llvm/ADT/SmallVector.h

Lines changed: 82 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -16,37 +16,82 @@
1616
#include "llvm/ADT/iterator_range.h"
1717
#include "llvm/Support/AlignOf.h"
1818
#include "llvm/Support/Compiler.h"
19+
#include "llvm/Support/ErrorHandling.h"
1920
#include "llvm/Support/MathExtras.h"
2021
#include "llvm/Support/MemAlloc.h"
2122
#include "llvm/Support/type_traits.h"
22-
#include "llvm/Support/ErrorHandling.h"
2323
#include <algorithm>
2424
#include <cassert>
2525
#include <cstddef>
2626
#include <cstdlib>
2727
#include <cstring>
2828
#include <initializer_list>
2929
#include <iterator>
30+
#include <limits>
3031
#include <memory>
3132
#include <new>
3233
#include <type_traits>
3334
#include <utility>
3435

3536
namespace llvm {
3637

37-
/// This is all the non-templated stuff common to all SmallVectors.
38-
class SmallVectorBase {
38+
/// This is all the stuff common to all SmallVectors.
39+
///
40+
/// The template parameter specifies the type which should be used to hold the
41+
/// Size and Capacity of the SmallVector, so it can be adjusted.
42+
/// Using 32 bit size is desirable to shink the size of the SmallVector.
43+
/// Using 64 bit size is desirable for cases like SmallVector<char>, where a
44+
/// 32 bit size would limit the vector to ~4GB. SmallVectors are used for
45+
/// buffering bitcode output - which can exceed 4GB.
46+
template <class Size_T> class SmallVectorBase {
3947
protected:
4048
void *BeginX;
41-
unsigned Size = 0, Capacity;
49+
Size_T Size = 0, Capacity;
50+
51+
/// The maximum value of the Size_T used.
52+
static constexpr size_t SizeTypeMax() {
53+
return std::numeric_limits<Size_T>::max();
54+
}
4255

4356
SmallVectorBase() = delete;
4457
SmallVectorBase(void *FirstEl, size_t TotalCapacity)
4558
: BeginX(FirstEl), Capacity(TotalCapacity) {}
4659

4760
/// This is an implementation of the grow() method which only works
4861
/// on POD-like data types and is out of line to reduce code duplication.
49-
void grow_pod(void *FirstEl, size_t MinCapacity, size_t TSize);
62+
/// This function will report a fatal error if it cannot increase capacity.
63+
void grow_pod(void *FirstEl, size_t MinCapacity, size_t TSize) {
64+
// Ensure we can fit the new capacity.
65+
// This is only going to be applicable if the when the capacity is 32 bit.
66+
if (MinCapacity > SizeTypeMax())
67+
report_bad_alloc_error("SmallVector capacity overflow during allocation");
68+
69+
// Ensure we can meet the guarantee of space for at least one more element.
70+
// The above check alone will not catch the case where grow is called with a
71+
// default MinCapacity of 0, but the current capacity cannot be increased.
72+
// This is only going to be applicable if the when the capacity is 32 bit.
73+
if (capacity() == SizeTypeMax())
74+
report_bad_alloc_error("SmallVector capacity unable to grow");
75+
76+
// In theory 2*capacity can overflow if the capacity is 64 bit, but the
77+
// original capacity would never be large enough for this to be a problem.
78+
size_t NewCapacity = 2 * capacity() + 1; // Always grow.
79+
NewCapacity = std::min(std::max(NewCapacity, MinCapacity), SizeTypeMax());
80+
81+
void *NewElts;
82+
if (BeginX == FirstEl) {
83+
NewElts = safe_malloc(NewCapacity * TSize);
84+
85+
// Copy the elements over. No need to run dtors on PODs.
86+
memcpy(NewElts, this->BeginX, size() * TSize);
87+
} else {
88+
// If this wasn't grown from the inline copy, grow the allocated space.
89+
NewElts = safe_realloc(this->BeginX, NewCapacity * TSize);
90+
}
91+
92+
this->BeginX = NewElts;
93+
this->Capacity = NewCapacity;
94+
}
5095

5196
public:
5297
size_t size() const { return Size; }
@@ -69,17 +114,24 @@ class SmallVectorBase {
69114
}
70115
};
71116

117+
template <class T>
118+
using SmallVectorSizeType =
119+
typename std::conditional<sizeof(T) < 4, uintptr_t, uint32_t>::type;
120+
72121
/// Figure out the offset of the first element.
73122
template <class T, typename = void> struct SmallVectorAlignmentAndSize {
74-
AlignedCharArrayUnion<SmallVectorBase> Base;
123+
AlignedCharArrayUnion<SmallVectorBase<SmallVectorSizeType<T>>> Base;
75124
AlignedCharArrayUnion<T> FirstEl;
76125
};
77126

78127
/// This is the part of SmallVectorTemplateBase which does not depend on whether
79128
/// the type T is a POD. The extra dummy template argument is used by ArrayRef
80129
/// to avoid unnecessarily requiring T to be complete.
81130
template <typename T, typename = void>
82-
class SmallVectorTemplateCommon : public SmallVectorBase {
131+
class SmallVectorTemplateCommon
132+
: public SmallVectorBase<SmallVectorSizeType<T>> {
133+
using Base = SmallVectorBase<SmallVectorSizeType<T>>;
134+
83135
/// Find the address of the first element. For this pointer math to be valid
84136
/// with small-size of 0 for T with lots of alignment, it's important that
85137
/// SmallVectorStorage is properly-aligned even for small-size of 0.
@@ -91,21 +143,20 @@ class SmallVectorTemplateCommon : public SmallVectorBase {
91143
// Space after 'FirstEl' is clobbered, do not add any instance vars after it.
92144

93145
protected:
94-
SmallVectorTemplateCommon(size_t Size)
95-
: SmallVectorBase(getFirstEl(), Size) {}
146+
SmallVectorTemplateCommon(size_t Size) : Base(getFirstEl(), Size) {}
96147

97148
void grow_pod(size_t MinCapacity, size_t TSize) {
98-
SmallVectorBase::grow_pod(getFirstEl(), MinCapacity, TSize);
149+
Base::grow_pod(getFirstEl(), MinCapacity, TSize);
99150
}
100151

101152
/// Return true if this is a smallvector which has not had dynamic
102153
/// memory allocated for it.
103-
bool isSmall() const { return BeginX == getFirstEl(); }
154+
bool isSmall() const { return this->BeginX == getFirstEl(); }
104155

105156
/// Put this vector in a state of being small.
106157
void resetToSmall() {
107-
BeginX = getFirstEl();
108-
Size = Capacity = 0; // FIXME: Setting Capacity to 0 is suspect.
158+
this->BeginX = getFirstEl();
159+
this->Size = this->Capacity = 0; // FIXME: Setting Capacity to 0 is suspect.
109160
}
110161

111162
public:
@@ -123,6 +174,10 @@ class SmallVectorTemplateCommon : public SmallVectorBase {
123174
using pointer = T *;
124175
using const_pointer = const T *;
125176

177+
using Base::capacity;
178+
using Base::empty;
179+
using Base::size;
180+
126181
// forward iterator creation methods.
127182
iterator begin() { return (iterator)this->BeginX; }
128183
const_iterator begin() const { return (const_iterator)this->BeginX; }
@@ -136,7 +191,9 @@ class SmallVectorTemplateCommon : public SmallVectorBase {
136191
const_reverse_iterator rend() const { return const_reverse_iterator(begin());}
137192

138193
size_type size_in_bytes() const { return size() * sizeof(T); }
139-
size_type max_size() const { return size_type(-1) / sizeof(T); }
194+
size_type max_size() const {
195+
return std::min(this->SizeTypeMax(), size_type(-1) / sizeof(T));
196+
}
140197

141198
size_t capacity_in_bytes() const { return capacity() * sizeof(T); }
142199

@@ -231,12 +288,21 @@ class SmallVectorTemplateBase : public SmallVectorTemplateCommon<T> {
231288
// Define this out-of-line to dissuade the C++ compiler from inlining it.
232289
template <typename T, bool TriviallyCopyable>
233290
void SmallVectorTemplateBase<T, TriviallyCopyable>::grow(size_t MinSize) {
234-
if (MinSize > UINT32_MAX)
291+
// Ensure we can fit the new capacity.
292+
// This is only going to be applicable if the when the capacity is 32 bit.
293+
if (MinSize > this->SizeTypeMax())
235294
report_bad_alloc_error("SmallVector capacity overflow during allocation");
236295

296+
// Ensure we can meet the guarantee of space for at least one more element.
297+
// The above check alone will not catch the case where grow is called with a
298+
// default MinCapacity of 0, but the current capacity cannot be increased.
299+
// This is only going to be applicable if the when the capacity is 32 bit.
300+
if (this->capacity() == this->SizeTypeMax())
301+
report_bad_alloc_error("SmallVector capacity unable to grow");
302+
237303
// Always grow, even from zero.
238304
size_t NewCapacity = size_t(NextPowerOf2(this->capacity() + 2));
239-
NewCapacity = std::min(std::max(NewCapacity, MinSize), size_t(UINT32_MAX));
305+
NewCapacity = std::min(std::max(NewCapacity, MinSize), this->SizeTypeMax());
240306
T *NewElts = static_cast<T*>(llvm::safe_malloc(NewCapacity*sizeof(T)));
241307

242308
// Move the elements over.

llvm/lib/Support/SmallVector.cpp

Lines changed: 3 additions & 27 deletions
Original file line numberDiff line numberDiff line change
@@ -36,30 +36,6 @@ static_assert(sizeof(SmallVector<Struct32B, 0>) >= alignof(Struct32B),
3636
static_assert(sizeof(SmallVector<void *, 1>) ==
3737
sizeof(unsigned) * 2 + sizeof(void *) * 2,
3838
"wasted space in SmallVector size 1");
39-
40-
/// grow_pod - This is an implementation of the grow() method which only works
41-
/// on POD-like datatypes and is out of line to reduce code duplication.
42-
void SmallVectorBase::grow_pod(void *FirstEl, size_t MinCapacity,
43-
size_t TSize) {
44-
// Ensure we can fit the new capacity in 32 bits.
45-
if (MinCapacity > UINT32_MAX)
46-
report_bad_alloc_error("SmallVector capacity overflow during allocation");
47-
48-
size_t NewCapacity = 2 * capacity() + 1; // Always grow.
49-
NewCapacity =
50-
std::min(std::max(NewCapacity, MinCapacity), size_t(UINT32_MAX));
51-
52-
void *NewElts;
53-
if (BeginX == FirstEl) {
54-
NewElts = safe_malloc(NewCapacity * TSize);
55-
56-
// Copy the elements over. No need to run dtors on PODs.
57-
memcpy(NewElts, this->BeginX, size() * TSize);
58-
} else {
59-
// If this wasn't grown from the inline copy, grow the allocated space.
60-
NewElts = safe_realloc(this->BeginX, NewCapacity * TSize);
61-
}
62-
63-
this->BeginX = NewElts;
64-
this->Capacity = NewCapacity;
65-
}
39+
static_assert(sizeof(SmallVector<char, 0>) ==
40+
sizeof(void *) * 2 + sizeof(void *),
41+
"1 byte elements have word-sized type for size and capacity");

0 commit comments

Comments
 (0)