slices: Insert/Replace sizing can produce immense GC churn #60134
The code in question:

```go
// Use append rather than make so that we bump the size of
// the slice up to the next storage class.
// This is what Grow does but we don't call Grow because
// that might copy the values twice.
s2 := append(S(nil), make(S, tot)...)
```

So when you say that, can you clarify what you mean? Because it seems to me that it does. Thanks.
s1: len 516 cap 848

Changing 511 to 1023:

(Arguably, perhaps the algorithm should really use the larger of the cap of the original slice and the len of the new slice to figure out which size to grow to, but in practice I think that almost never matters.)

When you append to nil, your size is determined entirely by rounding the size of the appended thing up to a size class. When you append to an existing slice that isn't big enough, your size is determined by the append-growth algorithm based on the current size. That's usually significantly more growth than rounding up by size class.

Yes, the code (for Insert, not Replace) is using append on a nil slice.
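The two growth modes can be probed directly. This is my own toy illustration, not code from the thread; exact capacities depend on size classes and the Go version, so only the shape of the difference is asserted:

```go
package main

import "fmt"

// capFromNilAppend mimics what append(S(nil), make(S, tot)...) produces:
// the result's capacity is just tot rounded up to the next size class.
func capFromNilAppend(n int) int {
	return cap(append([]byte(nil), make([]byte, n)...))
}

// capFromGrowth appends one element to an exactly-full slice, so the
// runtime's multiplicative append-growth algorithm picks the new capacity.
func capFromGrowth(n int) int {
	b := make([]byte, n)
	b = append(b, 0)
	return cap(b)
}

func main() {
	// Growing 512 -> 513: rounding to a size class gives a capacity just
	// past 513, while append growth overshoots considerably more.
	fmt.Println(capFromNilAppend(513), capFromGrowth(512))
}
```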
This gives me 12 and 51 respectively.
I do note the test case there is sort of dumb because it's doing allocations unconditionally.
This one shows 12 allocs per run through appending 1024 items using the existing slice logic, and 98 per run using the naive logic. Honestly not sure why 98 rather than, say, 51. It also reports that the second one takes >4x as long to run, which I'm slightly surprised by.
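A rough reproduction of that kind of measurement (my own sketch, not the thread's benchmark; the "naive" variant here reallocates to the exact size on every step, and the precise counts vary by Go version):

```go
package main

import (
	"fmt"
	"testing"
)

// buildAppend builds a 1024-element slice letting append choose capacities,
// so allocations only happen when the capacity is outgrown.
func buildAppend() []int {
	var s []int
	for i := 0; i < 1024; i++ {
		s = append(s, i)
	}
	return s
}

// buildExact reallocates to exactly the needed size on every insertion,
// the worst-case behavior the thread is describing.
func buildExact() []int {
	var s []int
	for i := 0; i < 1024; i++ {
		t := make([]int, len(s)+1)
		copy(t, s)
		t[len(s)] = i
		s = t
	}
	return s
}

func main() {
	smart := testing.AllocsPerRun(10, func() { _ = buildAppend() })
	naive := testing.AllocsPerRun(10, func() { _ = buildExact() })
	fmt.Println(smart, naive)
}
```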
Thanks. I guess the next question is: what behavior do people expect? People should not normally be building large slices using individual calls to these functions.
FWIW, I would have expected:

```go
func Insert[S ~[]E, E any](s S, i int, v ...E) S {
	if i < 0 || i > len(s) {
		panic("gnoes")
	}
	s = append(s, make(S, len(v))...)
	copy(s[i+len(v):], s[i:])
	copy(s[i:], v)
	return s
}
```

I was kind of surprised that's not happening when reading the Slack discussion prompting this issue. And I think it would be reasonable to assume that it would. I agree.
I kind of disagree with that, though. Thanks.
So, I don't have a suggestion for a better algorithm overall; I just think that it should have the same growth characteristics as append.

Hmm. I think... you can't really make an operation like this cheap in general. Actually, that could be a useful way to think about it: the expected/amortized cost of each operation. So I think this would get better behavior, in general:

The rationale is, this gets us the append-style growth. This doesn't make it sane to build million-item slices by inserts at random offsets, but it does reduce the overhead quite a bit.

If we really want to solve a more general problem... honestly, I think at that point you want a different data structure entirely. Should be a simple matter of programming to design and implement that. But I think, in the meantime, "make it grow like append" is a good improvement.
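The amortized-cost argument can be made concrete with a toy copy-count model (my own illustration, not code from the thread): reallocating to the exact needed size on every insertion copies O(n^2) elements in total, while multiplicative (doubling) growth copies only O(n):

```go
package main

import "fmt"

// copiesExact counts elements copied when every growth reallocates to
// exactly the needed size: one reallocation per appended element.
func copiesExact(n int) int {
	total := 0
	for size := 1; size <= n; size++ {
		total += size - 1 // the old contents are copied every time
	}
	return total
}

// copiesDoubling counts elements copied when capacity doubles on demand,
// so copies only happen at the (logarithmically many) growth points.
func copiesDoubling(n int) int {
	total, c := 0, 1
	for size := 1; size <= n; size++ {
		if size > c {
			total += size - 1
			c *= 2
		}
	}
	return total
}

func main() {
	fmt.Println(copiesExact(1024), copiesDoubling(1024)) // 523776 1023
}
```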
Adding my $0.02 to this discussion: My problem requires millions of sorted slices, ingesting a 150GB file to provide an in-memory database.
I expected the slice to grow similarly to append. Instead, I found that this loop was a significant source of memory allocation and garbage collection, particularly once the slices got over 10MB. My hack solution was to reimplement the insertion myself.

Before making this change,
I'm agreeing with @seebs here that the right (or at least right-now) fix is to append to the original slice rather than to nil.

When inserting near the end of large slices, this gets you approximately the same behavior as append. When inserting near the start of large slices, this doesn't help so much. But if you're inserting near the start of large slices, you probably have bigger problems. GC, although expensive, is not O(n^2) expensive.
Change https://go.dev/cl/495296 mentions this issue: |
At least when we're inserting/replacing near the end of a slice, when we have to grow it, use the same multiplicative growth factor that the runtime uses for append. Before this CL, we would grow the slice one page (8192 bytes) at a time for large slices. This would cause O(n^2) work when appending near the end should only take O(n) work. This doesn't fix the problem if you insert/replace near the start of the array, but maybe that doesn't need fixing because it is O(n^2) anyway.

Fixes golang#60134

Change-Id: If05376bc512ab839769180e5ce4cb929f47363b1
Reviewed-on: https://go-review.googlesource.com/c/go/+/495296
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
What version of Go are you using (`go version`)?

N/A (I don't think this is released yet?)

Does this issue reproduce with the latest release?

N/A

What operating system and processor architecture are you using (`go env`)?

N/A
What did you do?
Read the source code for slices.go.
What did you expect to see?
That's a really good question. I think I expected slices.Insert() to have allocation behavior more similar to append().
What did you see instead?
Insert computes the needed size, then uses `append(nil, make(..., tot))` to get that size, rounded up to the next size class. Replace computes the needed size, then uses `make(..., tot)` to get exactly that size.

Neither of them uses the usual append slice growth strategy. Which is surprising, because Insert is using append. But it's using append on a nil slice, not on the original slice, so it's not picking up the usual behavior.

I think that, as a user, I'd have been surprised by this; I'd have expected Insert to have the same behavior as append. And it's true that this doesn't actually change algorithmic complexity -- we're always going to have copying costs linear in the size of the original slice, because the offset `i` is assumed to be randomly distributed somewhere in that slice. But it does mean that, if you do a lot of single-item inserts, you will pay a lot more allocation cost than you would if you were using append.

One possibility might be something like:
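The code block here didn't survive extraction. Judging from the rest of the thread, it was presumably along these lines (a hypothetical reconstruction: grow via append on the original slice so its current length drives the growth factor; `insertGrow` is an illustrative name):

```go
package main

import "fmt"

// insertGrow is a hypothetical reconstruction of the suggestion: appending
// to s (not nil) picks up the runtime's multiplicative growth factor.
func insertGrow[S ~[]E, E any](s S, i int, v ...E) S {
	s = append(s, make(S, len(v))...) // growth based on len(s), not size class
	copy(s[i+len(v):], s[i:])         // shift tail right; copy handles overlap
	copy(s[i:], v)
	return s
}

func main() {
	// Growing a full 1000-element slice leaves spare capacity behind,
	// exactly the append-like behavior being asked for.
	s := insertGrow(make([]int, 1000), 500, -1)
	fmt.Println(len(s), cap(s) > len(s))
}
```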
This would pick up the expected append growth patterns from the existing slice's size, rather than just rounding up by size class. I think this would be a nice improvement; if you actually want to coerce a slice to not grow, there's ways, but Insert seems like it should be as efficient as possible for large-scale insertions, and having it generate a lot of garbage seems like it'll hurt that.
Similarly, for Replace:
(same thing, except using j rather than i as the right-side offset).
This would produce the desirable trait that Replace(s, i, i, v...) behaves precisely like Insert(s, i, v...), rather than producing a result slice which is not rounded up to the next size class (in cases where reslicing is in fact needed).
Alternatively, if it's too late to make changes to semantics: Maybe have a second pair of functions, differently named, that have the expected behavior. And if this can't be changed, at least the documentation should be explicit that, for Replace, there is no room for new items if the slice is resliced, and that for Insert, the growth algorithm results in a lot more reslices and allocations than you'd get with similar/corresponding append operations.