-
-
Notifications
You must be signed in to change notification settings - Fork 31.8k
List overallocation strategy #82554
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Currently list uses the following formula for allocating an array for items (if size != 0): allocated = size + (size >> 3) + (3 if size < 9 else 6) If add items by 1 the growth pattern is: 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ... I think this strategy can be improved.
So it may be better to allocate a multiple of two or four items. Few first sizes are multiple of four, so this is good multiplier. I suggest to use the following expression: allocated = (size + (size >> 3) + 6) & ~3 It adds bits AND, but removes conditional value.
>>> import sys
>>> sys.getsizeof([1, 2, 3])
80
>>> sys.getsizeof([*(1, 2, 3)])
104 List display allocates a space for exactly 3 items. But a list created from a tuple allocates a space for 6 items. Other example. Let extend a list by 10 items at a time. size allocated We need to reallocate an array after first four steps. 17 is too small for 20, 28 is too small for 30, 39 is too small for 40. So overallocating does not help, it just spends a space. My idea, that if we adds several items at a time and need to reallocate an array, we check if the overallocated size is enough for adding the same amount of items next time. If it is not enough, we do not overallocate. The final algorithm is: allocated = (size + (size >> 3) + 6) & ~3
if size - old_size > allocated - size:
allocated = (size + 3) & ~3 It will save a space if extend a list by many items few times. |
We should definitely revisit the over-allocation strategy. I last worked on the existing strategy back in 2004. Since then, the weighting of the speed/space trade-off considerations have changed. We need to keep the amortized O(1) append() behavior, but possibly we would benefit from more over-allocation and fewer resizes. Also, we should look at the interaction with the small object allocator to see if its design is causing realloc() to frequently have to move data rather than extending in place. |
Here is some data. step is the number of items added at a time, the first row is the size, the second row is the currently allocated size, the third row is the proposed allocated size. for step in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10):
sizes0 = range(step, step*31, step)
sizes1 = []
sizes2 = []
old_size = allocated1 = allocated2 = 0
for size in sizes0:
if size > allocated1:
allocated1 = size + (size >> 3) + (3 if size < 9 else 6)
sizes1.append(allocated1)
if size > allocated2:
allocated2 = (size + (size >> 3) + 6) & ~3
if size - old_size > allocated2 - size:
allocated2 = (size + 3) & ~3
sizes2.append(allocated2)
old_size = size
print('\nstep =', step)
print(' '.join(f'{x:3}' for x in sizes0))
print(' '.join(f'{x:3}' for x in sizes1))
print(' '.join(f'{x:3}' for x in sizes2))
step = 1
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
4 4 4 4 8 8 8 8 16 16 16 16 16 16 16 16 25 25 25 25 25 25 25 25 25 35 35 35 35 35
4 4 4 4 8 8 8 8 16 16 16 16 16 16 16 16 24 24 24 24 24 24 24 24 32 32 32 32 32 32
step = 2
2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58 60
5 5 9 9 17 17 17 17 26 26 26 26 26 37 37 37 37 37 48 48 48 48 48 48 62 62 62 62 62 62
8 8 8 8 16 16 16 16 24 24 24 24 32 32 32 32 44 44 44 44 44 44 56 56 56 56 56 56 68 68
step = 3
3 6 9 12 15 18 21 24 27 30 33 36 39 42 45 48 51 54 57 60 63 66 69 72 75 78 81 84 87 90
6 6 16 16 16 26 26 26 36 36 36 36 49 49 49 49 63 63 63 63 63 80 80 80 80 80 97 97 97 97
8 8 16 16 16 24 24 24 36 36 36 36 48 48 48 48 60 60 60 60 76 76 76 76 76 92 92 92 92 92
step = 4
4 8 12 16 20 24 28 32 36 40 44 48 52 56 60 64 68 72 76 80 84 88 92 96 100 104 108 112 116 120
7 12 12 24 24 24 37 37 37 51 51 51 64 64 64 64 82 82 82 82 100 100 100 100 100 123 123 123 123 123
8 8 16 16 28 28 28 40 40 40 52 52 52 68 68 68 68 84 84 84 84 104 104 104 104 104 124 124 124 124
step = 5
5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100 105 110 115 120 125 130 135 140 145 150
8 17 17 28 28 39 39 51 51 51 67 67 67 84 84 84 101 101 101 101 124 124 124 124 146 146 146 146 146 174
8 16 16 28 28 36 36 48 48 60 60 60 76 76 76 96 96 96 96 116 116 116 116 140 140 140 140 140 168 168
step = 6
6 12 18 24 30 36 42 48 54 60 66 72 78 84 90 96 102 108 114 120 126 132 138 144 150 156 162 168 174 180
9 19 19 33 33 46 46 60 60 60 80 80 80 100 100 100 120 120 120 120 147 147 147 147 174 174 174 174 174 208
12 12 24 24 36 36 52 52 64 64 80 80 80 100 100 100 120 120 120 120 144 144 144 144 172 172 172 172 200 200
step = 7
7 14 21 28 35 42 49 56 63 70 77 84 91 98 105 112 119 126 133 140 147 154 161 168 175 182 189 196 203 210
10 21 21 37 37 53 53 69 69 84 84 84 108 108 108 132 132 132 155 155 155 155 187 187 187 187 218 218 218 218
8 16 28 28 44 44 60 60 76 76 92 92 92 116 116 116 136 136 136 160 160 160 184 184 184 184 216 216 216 216
step = 8
8 16 24 32 40 48 56 64 72 80 88 96 104 112 120 128 136 144 152 160 168 176 184 192 200 208 216 224 232 240
12 24 24 42 42 60 60 78 78 96 96 96 123 123 123 150 150 150 177 177 177 177 213 213 213 213 249 249 249 249
8 24 24 40 40 60 60 76 76 96 96 96 120 120 120 148 148 148 176 176 176 176 212 212 212 212 248 248 248 248
step = 9
9 18 27 36 45 54 63 72 81 90 99 108 117 126 135 144 153 162 171 180 189 198 207 216 225 234 243 252 261 270
16 26 36 36 56 56 76 76 97 97 117 117 117 147 147 147 178 178 178 208 208 208 208 249 249 249 249 289 289 289
12 20 36 36 56 56 76 76 96 96 116 116 136 136 136 168 168 168 196 196 196 228 228 228 228 268 268 268 268 308
step = 10
10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180 190 200 210 220 230 240 250 260 270 280 290 300
17 28 39 51 51 73 73 96 96 118 118 141 141 141 174 174 174 208 208 208 242 242 242 242 287 287 287 287 332 332
12 20 32 40 60 60 84 84 104 104 128 128 152 152 152 184 184 184 216 216 216 252 252 252 252 296 296 296 296 340 |
WRT pymalloc, it will always copy on growing resize in this context. A pymalloc pool is dedicated to blocks of the same size class, so if the size class increases (they're 16 bytes apart now), the data must be copied to a different pool (dedicated to blocks of the larger size class). |
That sounds like an excellent reason NOT to use pymalloc ;-) |
Don't know. Define "the problem" ;-) As soon as the allocation is over 512 bytes (64 pointers), it's punted to the system malloc family. Before then, do a relative handful of relatively small memcpy's really matter? pymalloc is faster than system mallocs, and probably uses space more efficiently. There are no pure wins here :-) |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: