cmd/compile: broken write barrier #71228

@randall77

I've found a case where we can get a pointer write without a corresponding write barrier.
(This is distilled from a Google-internal failure.)

Reproducer:

package main

import "math/rand/v2"

type S struct {
	a, b string
	ptr  *T
}

type T struct {
	x [64]*byte
}

// f is a fancy way of doing dst.ptr = ptr, without a write barrier.
//go:noinline
func f(dst *S, ptr *T) {
	_ = *dst // early nil check

	var s S
	sp := &s
	g = nil      // simple write barrier
	sp.ptr = ptr // put target pointer in s
	*dst = *sp   // move write barrier
}

var g *byte

const W = 4

func main() {
	for i := 1; i < W; i++ {
		go worker(i)
	}
	worker(0)
}

const N = 100000

// Keep all the S's in the heap reachable.
var workbufs [W][N]*S

func worker(w int) {
	workbuf := &workbufs[w]
	for i := range N {
		workbuf[i] = new(S)
	}
	const A = 100
	var allocbuf [A]*T
	for i := 0; i < 10000000; i++ {
		s := workbuf[rand.IntN(N)]
		j := rand.IntN(A)
		ptr := allocbuf[j]
		allocbuf[j] = nil
		f(s, ptr)
		allocbuf[j] = new(T)
	}
}

Run with GOMAXPROCS=2. It fails roughly 5% of the time (if anyone has ideas for raising that rate, please share them). When it fails, we get a zombie object report like this:

runtime: marked free object in span 0x10ab655c8, elemsize=512 freeindex=0 (bad use of unsafe.Pointer? try -d=checkptr)
0x14024644000 alloc unmarked
0x14024644200 free  unmarked
0x14024644400 alloc unmarked
0x14024644600 free  unmarked
0x14024644800 alloc unmarked
0x14024644a00 free  marked   zombie
0x0000014024644a00:  0x0000000000000000  0x0000000000000000 
0x0000014024644a10:  0x0000000000000000  0x0000000000000000 
0x0000014024644a20:  0x0000000000000000  0x0000000000000000 
0x0000014024644a30:  0x0000000000000000  0x0000000000000000 
0x0000014024644a40:  0x0000000000000000  0x0000000000000000 
0x0000014024644a50:  0x0000000000000000  0x0000000000000000 
0x0000014024644a60:  0x0000000000000000  0x0000000000000000 
0x0000014024644a70:  0x0000000000000000  0x0000000000000000 
0x0000014024644a80:  0x0000000000000000  0x0000000000000000 
0x0000014024644a90:  0x0000000000000000  0x0000000000000000 
0x0000014024644aa0:  0x0000000000000000  0x0000000000000000 
0x0000014024644ab0:  0x0000000000000000  0x0000000000000000 
0x0000014024644ac0:  0x0000000000000000  0x0000000000000000 
0x0000014024644ad0:  0x0000000000000000  0x0000000000000000 
0x0000014024644ae0:  0x0000000000000000  0x0000000000000000 
0x0000014024644af0:  0x0000000000000000  0x0000000000000000 
0x0000014024644b00:  0x0000000000000000  0x0000000000000000 
0x0000014024644b10:  0x0000000000000000  0x0000000000000000 
0x0000014024644b20:  0x0000000000000000  0x0000000000000000 
0x0000014024644b30:  0x0000000000000000  0x0000000000000000 
0x0000014024644b40:  0x0000000000000000  0x0000000000000000 
0x0000014024644b50:  0x0000000000000000  0x0000000000000000 
0x0000014024644b60:  0x0000000000000000  0x0000000000000000 
0x0000014024644b70:  0x0000000000000000  0x0000000000000000 
0x0000014024644b80:  0x0000000000000000  0x0000000000000000 
0x0000014024644b90:  0x0000000000000000  0x0000000000000000 
0x0000014024644ba0:  0x0000000000000000  0x0000000000000000 
0x0000014024644bb0:  0x0000000000000000  0x0000000000000000 
0x0000014024644bc0:  0x0000000000000000  0x0000000000000000 
0x0000014024644bd0:  0x0000000000000000  0x0000000000000000 
0x0000014024644be0:  0x0000000000000000  0x0000000000000000 
0x0000014024644bf0:  0x0000000000000000  0x0000000000000000 
0x14024644c00 alloc unmarked
0x14024644e00 alloc unmarked
0x14024645000 alloc unmarked
0x14024645200 alloc marked  
0x14024645400 alloc unmarked
0x14024645600 alloc marked  
0x14024645800 alloc unmarked
0x14024645a00 free  unmarked
0x14024645c00 free  unmarked
fatal error: found pointer to free object
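
A side note on reading the dump: elemsize=512 and the 0x200 spacing between slots are consistent with the zombie being a T, since T holds 64 pointers. A minimal sketch that checks the arithmetic, assuming a 64-bit platform (ptrSize and sizeOfT are names introduced here for illustration, not part of the reproducer):

```go
package main

import (
	"fmt"
	"unsafe"
)

// T mirrors the type from the reproducer.
type T struct {
	x [64]*byte
}

// ptrSize is the size of one pointer on the current platform (8 on 64-bit).
const ptrSize = unsafe.Sizeof(uintptr(0))

// sizeOfT reports the in-memory size of T in bytes.
func sizeOfT() uintptr {
	return unsafe.Sizeof(T{})
}

func main() {
	// On a 64-bit platform both values are 512, matching elemsize in the dump.
	fmt.Println(sizeOfT(), 64*ptrSize)
}
```

This only shows that the sizes line up; it does not by itself prove which object in the span is the T that f stored.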

This happens because, in f, the store of ptr into dst.ptr is erroneously compiled without a write barrier, which ends up hiding a pointer to a white object inside a black object.
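
To make the white-in-black hazard concrete, here is a toy tri-color model. It is a sketch under stated assumptions, not the runtime's implementation: the heap, object, and simulate names are hypothetical, and the barrier is a simplified insertion-style barrier that only shades the stored pointer (the real Go runtime uses a hybrid barrier with more cases).

```go
package main

import "fmt"

type color int

const (
	white color = iota // not yet reached by the marker
	grey               // reached, fields not yet scanned
	black              // reached and fully scanned
)

type object struct {
	color color
	ptr   *object
}

// heap is a toy collector state: a grey work list plus a barrier switch.
type heap struct {
	work    []*object
	barrier bool
}

// shade turns a white object grey and queues it for scanning.
func (h *heap) shade(o *object) {
	if o != nil && o.color == white {
		o.color = grey
		h.work = append(h.work, o)
	}
}

// write performs dst.ptr = val, shading val first if the barrier is on.
func (h *heap) write(dst, val *object) {
	if h.barrier {
		h.shade(val)
	}
	dst.ptr = val
}

// drain scans grey objects until the work list is empty.
func (h *heap) drain() {
	for len(h.work) > 0 {
		o := h.work[len(h.work)-1]
		h.work = h.work[:len(h.work)-1]
		h.shade(o.ptr)
		o.color = black
	}
}

// simulate returns the final color of T after marking completes.
func simulate(barrier bool) color {
	h := &heap{barrier: barrier}
	s := &object{color: black} // S was already scanned by the marker
	t := &object{}             // T is white, held only by an unscanned stack slot
	stackSlot := t

	h.write(s, stackSlot) // mutator: dst.ptr = ptr
	stackSlot = nil       // mutator drops its own reference
	h.shade(stackSlot)    // the stack is scanned only now and finds nothing
	h.drain()
	return t.color
}

func main() {
	fmt.Println("with barrier, T marked:", simulate(true) == black)
	fmt.Println("without barrier, T marked:", simulate(false) == black)
}
```

Without the barrier, T ends the cycle white even though the black S still points to it, so a sweeper would free an object that is still reachable.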

This fails at tip, 1.23.4, and 1.22.6.
I suspect it may have started with CL 447780.

Metadata

Labels: BugReport, Critical, compiler/runtime, release-blocker
