Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: TestGCTestIsReachable failures #67204

Open
gopherbot opened this issue May 6, 2024 · 5 comments
Open

runtime: TestGCTestIsReachable failures #67204

gopherbot opened this issue May 6, 2024 · 5 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@gopherbot
Copy link

#!watchflakes
default <- pkg == "runtime" && test == "TestGCTestIsReachable"

Issue created automatically to collect these failures.

Example (log):

--- FAIL: TestGCTestIsReachable (0.03s)
    gc_test.go:282: did not get expected reachable set; want 101010101010101, got 1101010101010101

watchflakes

@gopherbot gopherbot added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 6, 2024
@gopherbot
Copy link
Author

Found new dashboard test flakes for:

#!watchflakes
default <- pkg == "runtime" && test == "TestGCTestIsReachable"
2024-05-06 11:46 android-amd64-emu go@ff0bc466 runtime.TestGCTestIsReachable (log)
--- FAIL: TestGCTestIsReachable (0.03s)
    gc_test.go:282: did not get expected reachable set; want 101010101010101, got 1101010101010101

watchflakes

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label May 6, 2024
@dmitshur dmitshur added this to the Backlog milestone May 15, 2024
@randall77
Copy link
Contributor

It looks like of the 16 test objects, the 15th one is somehow still reachable even though it shouldn't be.

It is possible that this test is flaky because of conservative scanning of top stack frames. A value on a random stack frame somewhere (probably a dead pointer) that happens to point to the object in question would keep it alive longer than this test expects.

Not sure why it is the 15th object in both cases. Maybe the last mallocgc leaves that pointer somewhere deep on the stack and the subsequent call to gctestIsReachable "adopts" that slot uninitialized in its frame, which then gets scanned. I'm not sure how that failure mode wouldn't be deterministic, though.

In any case, I think this test is kind of fundamentally unsound when conservative scanning exists. Not sure what to do about it.

@aclements who wrote the test, @mknyszek who is deep in this kind of stuff.

@randall77
Copy link
Contributor

Here's a test that demonstrates the "problem", that GCTestIsReachable can report false positives based on dead pointers in stacks of other conservatively scanned goroutines.

//go:noinline
func writeToStack(p unsafe.Pointer) unsafe.Pointer {
	var x [100]unsafe.Pointer
	for i := range x {
		x[i] = p
	}
	return x[7]
}

//go:noinline
func conservativelyScanned(c chan bool) int {
	// Tell parent that the vulnerable frame is running.
	c <- true

	// Run in a state where GC has to interrupt us to proceed, and thus
	// has to scan us conservatively.
	var s int
	for i := 0; i < 1<<30; i++ {
		s += i * i
	}

	// Tell parent we're done.
	c <- true

	// Allocate a stack frame region that is uninitialized until here.
	// This region overlaps writeToStack.x and thus contains dead copies
	// of the pointer in question during the loop above.
	var x [100]int
	return x[s%len(x)]
}

func TestBadReachable(t *testing.T) {
	s := make([]unsafe.Pointer, 1)
	s[0] = unsafe.Pointer(new(*int))
	c := make(chan bool, 1)
	go func(p unsafe.Pointer) {
		writeToStack(p)
		conservativelyScanned(c)
	}(s[0])

	// Make sure child goroutine is in a state where conservative scanning would find s[0].
	<-c

	got := runtime.GCTestIsReachable(s...)
	if got != 0 {
		t.Fatalf("s[0] is reachable, but shouldn't be")
	}

	// Wait for goroutine to finish
	<-c
}

(Add this to runtime/gc_test.go.)

@mknyszek
Copy link
Contributor

Wow, nice reproducer. Maybe we run the test as a testprog (or have the test binary reinvoke itself) and explicitly disable asynchronous preemption via GODEBUG?

@randall77
Copy link
Contributor

Yes, a reinvoke might work.
I do worry that if we enshrine GODEBUG=asyncpreemptoff=1 in order to make this test work, then we can never get rid of that debug option. Maybe that's not the end of the world.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
Status: In Progress
Status: Active
Development

No branches or pull requests

4 participants