Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fatal error: found pointer to free object #2

Closed
danderson opened this issue Dec 23, 2020 · 10 comments · Fixed by #4
Closed

fatal error: found pointer to free object #2

danderson opened this issue Dec 23, 2020 · 10 comments · Fixed by #4

Comments

@danderson
Copy link
Collaborator

This package has a bug of some kind that causes crashes in the Go runtime under heavy use.

To reproduce, add the following to intern_test.go:

func TestTorture(t *testing.T) {
	// Get the same Value over and over again, while the GC runs in
	// parallel. The objective is to trigger any kind of latent issues
	// with unsafe pointer handling, by forcing the GC and our code to
	// run concurrently.
 	iters := 100000000
	for i := 0; i < iters; i++ {
		_ = Get("foo")
	}
}

Under go test (after a variable amount of time/attempts), this will crash with something like:

runtime: marked free object in span 0x7f69d22642a0, elemsize=32 freeindex=0 (bad use of unsafe.Pointer? try -d=checkptr)
0xc00000c000 alloc marked  
0xc00000c020 free  marked   zombie
000000c00000c020:  000000000051b920  000000000056c380 
000000c00000c030:  0000000005884278  0000000000000000 
0xc00000c040 alloc marked  
0xc00000c060 alloc marked  
0xc00000c080 free  unmarked
0xc00000c0a0 free  unmarked
0xc00000c0c0 free  unmarked
0xc00000c0e0 free  unmarked
0xc00000c100 free  unmarked
0xc00000c120 free  unmarked
0xc00000c140 free  unmarked
0xc00000c160 free  unmarked
0xc00000c180 free  unmarked
0xc00000c1a0 free  unmarked
0xc00000c1c0 free  unmarked
0xc00000c1e0 free  unmarked
0xc00000c200 free  unmarked
0xc00000c220 free  unmarked
0xc00000c240 free  unmarked
0xc00000c260 free  unmarked
0xc00000c280 free  unmarked
0xc00000c2a0 free  unmarked
0xc00000c2c0 free  unmarked
0xc00000c2e0 free  unmarked
0xc00000c300 free  unmarked
0xc00000c320 free  unmarked
0xc00000c340 free  unmarked
0xc00000c360 free  unmarked
0xc00000c380 free  unmarked
0xc00000c3a0 free  unmarked
0xc00000c3c0 free  unmarked
0xc00000c3e0 free  unmarked
0xc00000c400 free  unmarked
0xc00000c420 free  unmarked
0xc00000c440 free  unmarked
0xc00000c460 free  unmarked
0xc00000c480 free  unmarked
0xc00000c4a0 free  unmarked
0xc00000c4c0 free  unmarked
0xc00000c4e0 free  unmarked
0xc00000c500 free  unmarked
0xc00000c520 free  unmarked
0xc00000c540 free  unmarked
0xc00000c560 free  unmarked
0xc00000c580 free  unmarked
0xc00000c5a0 free  unmarked
0xc00000c5c0 free  unmarked
0xc00000c5e0 free  unmarked
0xc00000c600 free  unmarked
0xc00000c620 free  unmarked
0xc00000c640 free  unmarked
0xc00000c660 free  unmarked
0xc00000c680 free  unmarked
0xc00000c6a0 free  unmarked
0xc00000c6c0 free  unmarked
0xc00000c6e0 free  unmarked
0xc00000c700 free  unmarked
0xc00000c720 free  unmarked
0xc00000c740 free  unmarked
0xc00000c760 free  unmarked
0xc00000c780 free  unmarked
0xc00000c7a0 free  unmarked
0xc00000c7c0 free  unmarked
0xc00000c7e0 free  unmarked
0xc00000c800 free  unmarked
0xc00000c820 free  unmarked
0xc00000c840 free  unmarked
0xc00000c860 free  unmarked
0xc00000c880 free  unmarked
0xc00000c8a0 free  unmarked
0xc00000c8c0 free  unmarked
0xc00000c8e0 free  unmarked
0xc00000c900 free  unmarked
0xc00000c920 free  unmarked
0xc00000c940 free  unmarked
0xc00000c960 free  unmarked
0xc00000c980 free  unmarked
0xc00000c9a0 free  unmarked
0xc00000c9c0 free  unmarked
0xc00000c9e0 free  unmarked
0xc00000ca00 free  unmarked
0xc00000ca20 free  unmarked
0xc00000ca40 free  unmarked
0xc00000ca60 free  unmarked
0xc00000ca80 free  unmarked
0xc00000caa0 free  unmarked
0xc00000cac0 free  unmarked
0xc00000cae0 free  unmarked
0xc00000cb00 free  unmarked
0xc00000cb20 free  unmarked
0xc00000cb40 free  unmarked
0xc00000cb60 free  unmarked
0xc00000cb80 free  unmarked
0xc00000cba0 free  unmarked
0xc00000cbc0 free  unmarked
0xc00000cbe0 free  unmarked
0xc00000cc00 free  unmarked
0xc00000cc20 free  unmarked
0xc00000cc40 free  unmarked
0xc00000cc60 free  unmarked
0xc00000cc80 free  unmarked
0xc00000cca0 free  unmarked
0xc00000ccc0 free  unmarked
0xc00000cce0 free  unmarked
0xc00000cd00 free  unmarked
0xc00000cd20 free  unmarked
0xc00000cd40 free  unmarked
0xc00000cd60 free  unmarked
0xc00000cd80 free  unmarked
0xc00000cda0 free  unmarked
0xc00000cdc0 free  unmarked
0xc00000cde0 free  unmarked
0xc00000ce00 free  unmarked
0xc00000ce20 free  unmarked
0xc00000ce40 free  unmarked
0xc00000ce60 free  unmarked
0xc00000ce80 free  unmarked
0xc00000cea0 free  unmarked
0xc00000cec0 free  unmarked
0xc00000cee0 free  unmarked
0xc00000cf00 free  unmarked
0xc00000cf20 free  unmarked
0xc00000cf40 free  unmarked
0xc00000cf60 free  unmarked
0xc00000cf80 free  unmarked
0xc00000cfa0 free  unmarked
0xc00000cfc0 free  unmarked
0xc00000cfe0 free  unmarked
0xc00000d000 free  unmarked
0xc00000d020 free  unmarked
0xc00000d040 free  unmarked
0xc00000d060 free  unmarked
0xc00000d080 free  unmarked
0xc00000d0a0 free  unmarked
0xc00000d0c0 free  unmarked
0xc00000d0e0 free  unmarked
0xc00000d100 free  unmarked
0xc00000d120 free  unmarked
0xc00000d140 free  unmarked
0xc00000d160 free  unmarked
0xc00000d180 free  unmarked
0xc00000d1a0 free  unmarked
0xc00000d1c0 free  unmarked
0xc00000d1e0 free  unmarked
0xc00000d200 free  unmarked
0xc00000d220 free  unmarked
0xc00000d240 free  unmarked
0xc00000d260 free  unmarked
0xc00000d280 free  unmarked
0xc00000d2a0 free  unmarked
0xc00000d2c0 free  unmarked
0xc00000d2e0 free  unmarked
0xc00000d300 free  unmarked
0xc00000d320 free  unmarked
0xc00000d340 free  unmarked
0xc00000d360 free  unmarked
0xc00000d380 free  unmarked
0xc00000d3a0 free  unmarked
0xc00000d3c0 free  unmarked
0xc00000d3e0 free  unmarked
0xc00000d400 free  unmarked
0xc00000d420 free  unmarked
0xc00000d440 free  unmarked
0xc00000d460 free  unmarked
0xc00000d480 free  unmarked
0xc00000d4a0 free  unmarked
0xc00000d4c0 free  unmarked
0xc00000d4e0 free  unmarked
0xc00000d500 free  unmarked
0xc00000d520 free  unmarked
0xc00000d540 free  unmarked
0xc00000d560 free  unmarked
0xc00000d580 free  unmarked
0xc00000d5a0 free  unmarked
0xc00000d5c0 free  unmarked
0xc00000d5e0 free  unmarked
0xc00000d600 free  unmarked
0xc00000d620 free  unmarked
0xc00000d640 free  unmarked
0xc00000d660 free  unmarked
0xc00000d680 free  unmarked
0xc00000d6a0 free  unmarked
0xc00000d6c0 free  unmarked
0xc00000d6e0 free  unmarked
0xc00000d700 free  unmarked
0xc00000d720 free  unmarked
0xc00000d740 free  unmarked
0xc00000d760 free  unmarked
0xc00000d780 free  unmarked
0xc00000d7a0 free  unmarked
0xc00000d7c0 free  unmarked
0xc00000d7e0 free  unmarked
0xc00000d800 free  unmarked
0xc00000d820 free  unmarked
0xc00000d840 free  unmarked
0xc00000d860 free  unmarked
0xc00000d880 free  unmarked
0xc00000d8a0 free  unmarked
0xc00000d8c0 free  unmarked
0xc00000d8e0 free  unmarked
0xc00000d900 free  unmarked
0xc00000d920 free  unmarked
0xc00000d940 free  unmarked
0xc00000d960 free  unmarked
0xc00000d980 free  unmarked
0xc00000d9a0 free  unmarked
0xc00000d9c0 free  unmarked
0xc00000d9e0 free  unmarked
0xc00000da00 free  unmarked
0xc00000da20 free  unmarked
0xc00000da40 free  unmarked
0xc00000da60 free  unmarked
0xc00000da80 free  unmarked
0xc00000daa0 free  unmarked
0xc00000dac0 free  unmarked
0xc00000dae0 free  unmarked
0xc00000db00 free  unmarked
0xc00000db20 free  unmarked
0xc00000db40 free  unmarked
0xc00000db60 free  unmarked
0xc00000db80 free  unmarked
0xc00000dba0 free  unmarked
0xc00000dbc0 free  unmarked
0xc00000dbe0 free  unmarked
0xc00000dc00 free  unmarked
0xc00000dc20 free  unmarked
0xc00000dc40 free  unmarked
0xc00000dc60 free  unmarked
0xc00000dc80 free  unmarked
0xc00000dca0 free  unmarked
0xc00000dcc0 free  unmarked
0xc00000dce0 free  unmarked
0xc00000dd00 free  unmarked
0xc00000dd20 free  unmarked
0xc00000dd40 free  unmarked
0xc00000dd60 free  unmarked
0xc00000dd80 free  unmarked
0xc00000dda0 free  unmarked
0xc00000ddc0 free  unmarked
0xc00000dde0 free  unmarked
0xc00000de00 free  unmarked
0xc00000de20 free  unmarked
0xc00000de40 free  unmarked
0xc00000de60 free  unmarked
0xc00000de80 free  unmarked
0xc00000dea0 free  unmarked
0xc00000dec0 free  unmarked
0xc00000dee0 free  unmarked
0xc00000df00 free  unmarked
0xc00000df20 free  unmarked
0xc00000df40 free  unmarked
0xc00000df60 free  unmarked
0xc00000df80 free  unmarked
0xc00000dfa0 free  unmarked
0xc00000dfc0 free  unmarked
0xc00000dfe0 free  unmarked
fatal error: found pointer to free object

runtime stack:
runtime.SetFinalizer.func1()
	/nix/store/whf767pjnhyllfk1jkma1rg4np2dizcc-go-1.15.5/share/go/src/runtime/mfinal.go:369 +0x2e

goroutine 34 [running]:
runtime.systemstack_switch()
	/nix/store/whf767pjnhyllfk1jkma1rg4np2dizcc-go-1.15.5/share/go/src/runtime/asm_amd64.s:330 fp=0xc00010bdf0 sp=0xc00010bde8 pc=0x469100
runtime.SetFinalizer(0x5202c0, 0xc00000c020, 0x0, 0x0)
	/nix/store/whf767pjnhyllfk1jkma1rg4np2dizcc-go-1.15.5/share/go/src/runtime/mfinal.go:368 +0x465 fp=0xc00010bed8 sp=0xc00010bdf0 pc=0x418d85
go4.org/intern.Get(0x51b920, 0x56c380, 0x0)
	/home/dave/hack/intern/intern.go:71 +0x185 fp=0xc00010bf50 sp=0xc00010bed8 pc=0x509d85
go4.org/intern.TestTorture(0xc000182180)
	/home/dave/hack/intern/intern_test.go:64 +0x45 fp=0xc00010bf80 sp=0xc00010bf50 pc=0x50a4e5
testing.tRunner(0xc000182180, 0x54dfc0)
	/nix/store/whf767pjnhyllfk1jkma1rg4np2dizcc-go-1.15.5/share/go/src/testing/testing.go:1123 +0xef fp=0xc00010bfd0 sp=0xc00010bf80 pc=0x4cd10f
runtime.goexit()
	/nix/store/whf767pjnhyllfk1jkma1rg4np2dizcc-go-1.15.5/share/go/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc00010bfd8 sp=0xc00010bfd0 pc=0x46ad41
created by testing.(*T).Run
	/nix/store/whf767pjnhyllfk1jkma1rg4np2dizcc-go-1.15.5/share/go/src/testing/testing.go:1168 +0x2b3

goroutine 1 [chan receive]:
testing.(*T).Run(0xc000182180, 0x5453c8, 0xb, 0x54dfc0, 0x47e201)
	/nix/store/whf767pjnhyllfk1jkma1rg4np2dizcc-go-1.15.5/share/go/src/testing/testing.go:1169 +0x2da
testing.runTests.func1(0xc000001200)
	/nix/store/whf767pjnhyllfk1jkma1rg4np2dizcc-go-1.15.5/share/go/src/testing/testing.go:1439 +0x78
testing.tRunner(0xc000001200, 0xc00010cde0)
	/nix/store/whf767pjnhyllfk1jkma1rg4np2dizcc-go-1.15.5/share/go/src/testing/testing.go:1123 +0xef
testing.runTests(0xc00000c060, 0x61dc80, 0x2, 0x2, 0xbff0c99eb7a2f577, 0x8bb2ce4af7, 0x625600, 0x40d350)
	/nix/store/whf767pjnhyllfk1jkma1rg4np2dizcc-go-1.15.5/share/go/src/testing/testing.go:1437 +0x2fe
testing.(*M).Run(0xc000128000, 0x0)
	/nix/store/whf767pjnhyllfk1jkma1rg4np2dizcc-go-1.15.5/share/go/src/testing/testing.go:1345 +0x1eb
main.main()
	_testmain.go:45 +0x138
@danderson
Copy link
Collaborator Author

(moving discussion here @bradfitz @josharian)

Josh, I'm not sure I follow the sequence of events in your hypothesis that lead to a dangling pointer. Get and the related finalizer are interlocked by mu. If the finalizer runs first, Get will miss in the cache table and therefore create a new pointer. If Get runs first, the finalizer will have the wrong gen version and turn into a no-op.

In both cases, it seems to me that we cannot reach a point in execution where we materialize a pointer from a uintptr that has already ceased to exist - the process of making the pointer cease to exist should be destroying the uintptr. What sequence of operations am I missing that the runtime is allowed to do?

@danderson
Copy link
Collaborator Author

One piece of knowledge I have holds the above together, and I realize I may be wrong about that: is a finalizer allowed to create new references to the object being finalized? That is, does the runtime have to correctly account for "I was about to delete this object, but the finalizer made it exist again" ? Or is it allowed to assume that, once the finalizer returns, the object is collectable?

@bradfitz
Copy link
Contributor

Just saw this issue too. I'd added this test locally too, like yours above, but the allocs make it trigger quicker:

func TestStress(t *testing.T) {
        iters := 1000000
        if testing.Short() {
                iters = 1000
        }
        var sink []byte
        for i := 0; i < iters; i++ {
                _ = Get("foo")
                sink = make([]byte, 1<<20)
        }
        _ = sink
}

bradfitz added a commit that referenced this issue Dec 23, 2020
bradfitz added a commit that referenced this issue Dec 23, 2020
@bradfitz
Copy link
Contributor

Sent #4 that no longer changes a finalizer. It now only choose to re-add a finalizer in the finalizer.

https://golang.org/pkg/runtime/#SetFinalizer says:

When the garbage collector finds an unreachable block with an associated finalizer, it clears the association and runs finalizer(obj) in a separate goroutine. This makes obj reachable again, but now without an associated finalizer. Assuming that SetFinalizer is not called again, the next time the garbage collector sees that obj is unreachable, it will free obj.

(which makes it very clear that it's okay to re-add a finalizer in the finalizer)

@danderson
Copy link
Collaborator Author

I have questions, because I'm still not understanding where the problem was in the first place. The mutex should have made everything completely consistent, no?

@danderson
Copy link
Collaborator Author

Specifically: I'm worried that I don't understand why the previous code was incorrect, and as a result don't know if this is fixed merely by lowering the probability of triggering in this specific test case.

bradfitz added a commit that referenced this issue Dec 23, 2020
@bradfitz
Copy link
Contributor

It seems like our "changing the finalizer" by clearing it & re-setting it was bogus somehow.

Perhaps the intermediate runtime.SetFinalizer(v, nil) gave the GC a small window to collect it. Not sure why it wasn't kept live by the return v later, though.

In any case, PR #4 follows the documented rules super explicitly rather than wandering off into any unknown areas where we have to guess how the runtime might be implemented.

bradfitz added a commit that referenced this issue Dec 23, 2020
bradfitz added a commit that referenced this issue Dec 23, 2020
@danderson
Copy link
Collaborator Author

Oooh, of course, clearing the finalizer lets a concurrent GC do bad things.

So, for sequence of events, something like:

  • GC runs mark phase. Value is dangling, so gets earmarked for collection (w/ finalizer run).
  • Get materializes a new pointer to Value. GC won't notice this until the next mark phase.
  • Get clears finalizer.
  • Concurrent GC gets around to processing that earmarked Value, finds it has no finalizer. Great! free().
  • Get continues to operate on its materialized pointer, which now points to freed memory.

IOW, clearing a finalizer from an object gives the runtime immediate leave to delete that object without further ceremony. Normally, the reference that was being held to pass to SetFinalizer would prevent that, but the reference we're using was freshly conjured out of thin air and it may be one more GC cycle until the runtime is aware of its existence, for keeping-alive purposes.

Sound right? The gap in my knowledge is where precisely finalizers get considered and scheduled in the GC cycle. I'm assuming there's a gap between "okay, this object needs collecting" and "time to collect, or is there a finalizer attached?"

@bradfitz
Copy link
Contributor

Yup, that all sounds right.

As for when they're scheduled, the runtime docs also say:

A single goroutine runs all finalizers for a program, sequentially. If a finalizer must run for a long time, it should do so by starting a new goroutine.

@danderson
Copy link
Collaborator Author

That piece of runtime docs doesn't exactly answer my question, which is: how far apart in time are the mark phases and the check to see if a finalizer exists? That's the danger window in which we hold a pointer that the GC is unaware of, and so the finalizer is the only thing preventing the object from being GC'd.

Empirically the answer seems to be "whatever, it's fine", because the new code unconditionally runs the finalizer at least once, ensuring that the Value will survive until the next GC cycle - at which point the newly materialized Value pointer will be seen in mark. It's notionally equivalent to the finalizer creating a new permanent reference to the object (e.g. in a global map) and thus preventing its collection, which I would hope the runtime handles gracefully.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants