-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime: consider adding 24-byte size class #8885
Comments
See useful discussion in dup #17633 too. |
Should we try this? |
Just a reference point: The package syntax AST nodes are all bigger than 24bytes (on 64bit platforms), except for one node (EmptyStmt), which virtually never occurs. All nodes have a 16byte minimal cost (position information), and most nodes have two fields in addition, or perhaps a single interface or string field which bumps up the size to at least 32bytes. For this package I think the sweet spot is actually higher, perhaps around 40 or 56 bytes. I think @dvyukov's tree node example is misleading: It's unlikely in the real world for the payload to be just an int. I could see a string, but that would bump that example up to 32bytes. I think many light-weight (but large) data structures are also not so homogenous in terms of node types, and the way to define large graphs of non-homogenous node (types) in Go is to use interfaces. In other words, I suspect a node is more likely to look like I'd make the experiment and add a 40byte size class. In the "best" case that could still reduce waste by ~20% (40bytes instead of 48bytes). |
Didn't mean to close this. |
A slice is 24 bytes, though. I have to imagine that they're somewhat common. Actual numbers from running all.bash or something would be interesting, though. For reference here are the current size classes: https://golang.org/src/runtime/sizeclasses.go |
A slice rarely comes alone. Also, we don't usually have pointers to slices. |
New mallocgc instrumentation: if size == 8 {
if typ != nil {
println("mallocgc", size, typ.string())
} else {
println("mallocgc", size, "??")
}
} Top 10 for size == 8:
In my experience, allocation of [1]T types are the result of constructing a single-element slice to pass to a variadic function. Top 10 for size == 24:
Of these, only ssa.use and ld.chain are linked lists. |
Not easily. But I tried a different experiment that hopefully will shed some light. I changed mksizeclasses.go to explicitly prevent the 32 byte size class: diff --git a/src/runtime/mksizeclasses.go b/src/runtime/mksizeclasses.go
index b146dbcd6c..37d6b49be1 100644
--- a/src/runtime/mksizeclasses.go
+++ b/src/runtime/mksizeclasses.go
@@ -136,6 +136,9 @@ func makeClasses() []class {
classes[len(classes)-1].size = size
continue
}
+ if size == 32 {
+ continue
+ }
classes = append(classes, class{size: size, npages: npages})
}
@@ -157,7 +160,7 @@ func makeClasses() []class {
}
}
- if len(classes) != 67 {
+ if len(classes) != 66 {
panic("number of size classes has changed")
} and then regenerated sizeclasses.go and compared compiler performance. Results:
So not having a 32 byte size classes slows things down, but not by very much. Predictably, it increases the overall amount of allocation. Surprisingly, the number of allocs go down. I believe that this is because append rounds up to the nearest size class, so append-based growth happens more aggressively, thereby avoiding some allocations. I did the same experiment eliminating the 48 byte size class (and restoring the 32 byte size class). Results:
Similar, but less impactful. I'm not sure exactly what to make of these numbers. Is a 0.5% speed improvement significant enough? Not sure. But at least this gives us a rough feel. |
Thanks @josharian, this is good information. Clearly, my intuition was wrong here. Regarding the 8byte allocations that might come from single-element slices for variadic functions: fmt.Printf comes to mind... (fmt.go for creating type name symbols?). |
We'll consider doing this for 1.15 |
This issue is currently labeled as early-in-cycle for Go 1.15. That time is now, so friendly ping. If it no longer needs to be done early in cycle, that label can be removed. |
Still working on the heapbitmap code. Likely ready in march, just not the first week the tree opens. |
Unfortunately due to shifts in time commitments I wont be able to work on this anymore for the go1.15 cycle. |
Update: I have a ./all.bash passing CL that adds the new size class. Was a bit more complicated then I initially thought due to being the only size class (on amd64) above 8 bytes that isnt 16byte aligned, some outdated or to me misreadable documentation about checkmarks and I initally forgot that the size class also allows slices of 3 pointers so needs unrolling of the ptrmask into the heap bits.. Currently doing documentation and benchmarks. My current plan is to be finished before the tree opens for go1.16 so this can get in early. |
Neat! FYI, I already have a CL (not yet mailed) to move checkmarks out of the bitmap and was planning to send it for 1.16, precisely because they make everything with the bitmap annoying. :) I could send that now and you could rebase on top of it if that would be helpful? |
Depending on if the remove checkmarks from heapbits CL gets more complex if the 24byte size class is merged first I think removing the checkmark from heapbits can go in first. It at least would make this CL easier by not needing to change the set and check checkmark functions. Some masks may need to be adjusted for heapBitsSetType but that should be trivial. My complication was more around understanding when to set them or not from the existing code. The documentation and comments in heapBitsSetType seem to not align. code not concerned if checkmark set or not set: Line 963 in 3a43226
seems checkmark bit actually set to 0 usually in heapBitsSetType. the actual setting doesnt matter since it seems to be cleared when initializing checkmarking: Line 61 in 3a43226
|
Change https://golang.org/cl/242258 mentions this issue: |
The text was updated successfully, but these errors were encountered: