runtime.heapBitsSlice constructs a slice of heap ptr/scalar bits as a notInHeapSlice.
Since these bits live beyond mspan.limit, the GC must not encounter any pointers to them. If it does, findObject will throw due to an out-of-bounds pointer.
notInHeapSlice will hide the pointer from the GC [1]. Unfortunately, heapBitsSlice immediately converts the notInHeapSlice to []uintptr, a standard slice type. The data field of this slice will be marked as a pointer and is thus visible to the GC.
Thus if this slice is live when the GC runs, findObject will throw when it sees the invalid pointer.
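
To make the hazard concrete, here is a minimal user-space sketch of the same header reinterpretation. It is not the runtime code: `scalarSlice`, `newScalarSlice`, `asStdSlice`, and `backing` are hypothetical names, and the data word is a plain `uintptr` standing in for `*notInHeap`. The backing array is a package-level variable, so the resulting pointer is harmless here. The point is only that after the conversion the data word is typed as a pointer.

```go
package main

import (
	"fmt"
	"unsafe"
)

// scalarSlice is a hypothetical analogue of the runtime's notInHeapSlice.
// It has the same three-word layout as a slice header, but the data word
// is a uintptr, so the GC treats it as a scalar and does not scan it.
type scalarSlice struct {
	data uintptr
	len  int
	cap  int
}

// backing is a package-level array. Its address lies outside the GC heap,
// which keeps this demo safe (unlike the heap bits case in the report).
var backing [4]uintptr

// newScalarSlice builds a header over backing without any pointer-typed field.
func newScalarSlice() scalarSlice {
	return scalarSlice{
		data: uintptr(unsafe.Pointer(&backing[0])),
		len:  len(backing),
		cap:  len(backing),
	}
}

// asStdSlice reinterprets the header as []uintptr. From this point on the
// data word is typed as a pointer, so the GC will look at it whenever the
// returned slice is live.
func asStdSlice(h *scalarSlice) []uintptr {
	return *(*[]uintptr)(unsafe.Pointer(h))
}

func main() {
	h := newScalarSlice()
	s := asStdSlice(&h)
	s[2] = 42
	fmt.Println(len(s), cap(s), backing[2]) // prints "4 4 42"
}
```

In the runtime case the data word points past mspan.limit rather than at a global, which is what turns this otherwise benign conversion into a crash.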
This slice is ultimately returned from runtime.(*mspan).heapBits. The callers of heapBits all keep the slice live for only a short period. So previously, I believe we got lucky because (a) we don't async preempt in the runtime, and (b) there were no explicit preemption points while the slice was live.
Unfortunately, https://go.dev/cl/750480 adds a preemption point at the clear in initHeapBits. Within Google, we've found cases of applications crashing due to a preemption at this point allowing the GC to discover the bad slice.
My current plan is to revert https://go.dev/cl/750480 (and follow-up CLs) to get the tree back into a stable state. After the revert, we should fix this improper use of notInHeapSlice.
Note that most other uses of notInHeapSlice also convert to a standard slice, but all other uses are pointers to objects from persistentalloc or direct mmap. findObject ignores such pointers, so their visibility to the GC is not problematic.
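
The distinction can be sketched with a simplified model. This is not the real findObject, which also checks span state and other conditions. The `span` type, the `classify` function, and the addresses below are all hypothetical, chosen only to show the three outcomes described in this report.

```go
package main

import "fmt"

// span is a hypothetical, simplified stand-in for mspan.
// Objects occupy [base, limit); [limit, end) holds trailing metadata
// such as the heap ptr/scalar bits.
type span struct {
	base, limit, end uintptr
}

type verdict string

const (
	ignored    verdict = "ignored"     // not in any heap span (e.g. persistentalloc, mmap)
	object     verdict = "object"      // points into the object area of a span
	badPointer verdict = "bad pointer" // inside a span but at or beyond limit
)

// classify models only the outcomes discussed in this report.
func classify(spans []span, p uintptr) verdict {
	for _, s := range spans {
		if p >= s.base && p < s.end {
			if p < s.limit {
				return object
			}
			return badPointer
		}
	}
	return ignored
}

func main() {
	spans := []span{{base: 0x1000, limit: 0x1f00, end: 0x2000}}
	fmt.Println(classify(spans, 0x1800)) // object
	fmt.Println(classify(spans, 0x1f80)) // bad pointer
	fmt.Println(classify(spans, 0x9000)) // ignored
}
```

In this model the heapBits slice falls into the second case, while the persistentalloc and mmap uses fall into the third.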
[1] I think https://cs.opensource.google/go/go/+/master:src/cmd/compile/internal/types/size.go;drc=7c42da1bc5369d7ba174328908ecc9172578f6d2;bpv=1;bpt=1;l=714 records *notInHeap as a scalar, though I'm actually not completely sure.
cc @randall77 @golang/runtime