Skip to content

runtime: go 1.3.3 (32-bits only) panics in gc involving cgo, C heap pointers and finalizers (afaict) #9866

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
pedronis opened this issue Feb 13, 2015 · 9 comments

Comments

@pedronis
Copy link

What version of Go are you using (go version)?
1.3.3

What operating system and processor architecture are you using?
on Ubuntu, issue happens on both ARM and 386 (32 bits), but not amd64 (64 bits)

What did you do?
run repeatedly go test on the code in the package at

https://github.com/pedronis/go133explode.git

like with

for i in $(seq 100) ; do echo $i ; go test || break ; done

apart the obvious deps, it needs:

apt-get install libclick-0.4-dev

This code is derived from a larger test suite, didn't manage to reduce further without making failure probability much lower.

On some tries on the listed 32 bit archs the code explodes with panics from inside the gc,

  • seen sometimes: fatal error: freelist empty

  • or signals are typical like:

    3
    RUN
    fatal error: unexpected signal during runtime execution
    [signal 0xb code=0x1 addr=0x1 pc=0x806968c]
    
    goroutine 22 [running]:
    runtime.throw(0x82a8ac5)
            /usr/lib/go/src/pkg/runtime/panic.c:520 +0x71 fp=0xc5d5d990 sp=0xc5d5d984
    runtime.sigpanic()
            /usr/lib/go/src/pkg/runtime/os_linux.c:222 +0x46 fp=0xc5d5d99c sp=0xc5d5d990
    runtime.mallocgc(0x10, 0x81c59c0, 0x0)
            /usr/lib/go/src/pkg/runtime/malloc.goc:154 +0x2ec fp=0xc5d5d9d0 sp=0xc5d5d99c
    runtime.new(0x81c59c0, 0x0)
            /usr/lib/go/src/pkg/runtime/malloc.goc:826 +0x35 fp=0xc5d5d9e0 sp=0xc5d5d9d0
    reflect.Value.MapIndex(0x8177f80, 0xd70a4ec0, 0x0, 0x150, 0x8178c00, 0xd71075a8, 0x0, 0x182, 0x81774c0, 0xd71076d8, ...)
            /usr/lib/go/src/pkg/reflect/value.go:1184 +0x2e fp=0xc5d5da54 sp=0xc5d5d9e0
    encoding/json.(*mapEncoder).encode(0xd708e378, 0xd70fa780, 0x8177f80, 0xd70a4ec0, 0x0, 0x150, 0x100)
            /usr/lib/go/src/pkg/encoding/json/encode.go:617 +0x1f8 fp=0xc5d5dafc sp=0xc5d5da54
    encoding/json.*mapEncoder.(encoding/json.encode)·fm(0xd70fa780, 0x8177f80, 0xd70a4ec0, 0x0, 0x150, 0x100)
            /usr/lib/go/src/pkg/encoding/json/encode.go:627 +0x4c fp=0xc5d5db1c sp=0xc5d5dafc
    encoding/json.(*encodeState).reflectValue(0xd70fa780, 0x8177f80, 0xd70a4ec0, 0x0, 0x150)
            /usr/lib/go/src/pkg/encoding/json/encode.go:297 +0x58 fp=0xc5d5db38 sp=0xc5d5db1c
    encoding/json.(*encodeState).marshal(0xd70fa780, 0x8177f80, 0xd70a4ec0, 0x0, 0x0)
            /usr/lib/go/src/pkg/encoding/json/encode.go:268 +0xa9 fp=0xc5d5db74 sp=0xc5d5db38
    encoding/json.Marshal(0x8177f80, 0xd70a4ec0, 0x0, 0x0, 0x0, 0x0, 0x0)
            /usr/lib/go/src/pkg/encoding/json/encode.go:133 +0x88 fp=0xc5d5dbac sp=0xc5d5db74
    explode.(*clientSuite).writeTestConfig(0xd70b8170)
            /home/pedronis/canonical/go-ws/src/explode/explode_test.go:43 +0x94b fp=0xc5d5dc9c sp=0xc5d5dbac
    explode.(*clientSuite).witness(0xd70b8170)
            /home/pedronis/canonical/go-ws/src/explode/explode_test.go:52 +0x98 fp=0xc5d5dcdc sp=0xc5d5dc9c
    explode.(*clientSuite).SetUpSuite(0xd70b8170, 0xd70842a0)
            /home/pedronis/canonical/go-ws/src/explode/explode_test.go:99 +0xd5 fp=0xc5d5dd28 sp=0xc5d5dcdc
    runtime.call16(0x81b828c, 0xd708e2b8, 0x8, 0x8)
            /usr/lib/go/src/pkg/runtime/asm_386.s:383 +0x3a fp=0xc5d5dd3c sp=0xc5d5dd28
    reflect.Value.call(0x81b8240, 0xd70b8170, 0x0, 0x138, 0x81c8a98, 0x4, 0xd70b81d0, 0x1, 0x1, 0x0, ...)
            /usr/lib/go/src/pkg/reflect/value.go:563 +0xdaf fp=0xc5d5df2c sp=0xc5d5dd3c
    reflect.Value.Call(0x81b8240, 0xd70b8170, 0x0, 0x138, 0xd70b81d0, 0x1, 0x1, 0x0, 0x0, 0x0)
            /usr/lib/go/src/pkg/reflect/value.go:411 +0x91 fp=0xc5d5df60 sp=0xc5d5df2c
    launchpad.net/gocheck.func·005(0xd70842a0)
            /home/pedronis/canonical/go-ws/src/launchpad.net/gocheck/gocheck.go:683 +0xec fp=0xc5d5dfb8 sp=0xc5d5df60
    launchpad.net/gocheck.func·004()
            /home/pedronis/canonical/go-ws/src/launchpad.net/gocheck/gocheck.go:628 +0x7c fp=0xc5d5dfd0 sp=0xc5d5dfb8
    runtime.goexit()
            /usr/lib/go/src/pkg/runtime/proc.c:1445 fp=0xc5d5dfd4 sp=0xc5d5dfd0
    created by launchpad.net/gocheck.(*suiteRunner).forkCall
            /home/pedronis/canonical/go-ws/src/launchpad.net/gocheck/gocheck.go:629 +0x1ba
    
    goroutine 16 [chan receive]:
    testing.RunTests(0x821b2c4, 0x82a5630, 0x1, 0x1, 0x1)
            /usr/lib/go/src/pkg/testing/testing.go:505 +0x75c
    testing.Main(0x821b2c4, 0x82a5630, 0x1, 0x1, 0x82bb1e0, 0x0, 0x0, 0x82bb1e0, 0x0, 0x0)
            /usr/lib/go/src/pkg/testing/testing.go:435 +0x6e
    main.main()
            explode/_test/_testmain.go:47 +0x86
    
    goroutine 19 [finalizer wait]:
    runtime.park(0x805a040, 0x82b926c, 0x82ab3e9)
            /usr/lib/go/src/pkg/runtime/proc.c:1369 +0x94
    runtime.parkunlock(0x82b926c, 0x82ab3e9)
            /usr/lib/go/src/pkg/runtime/proc.c:1385 +0x3f
    runfinq()
            /usr/lib/go/src/pkg/runtime/mgc0.c:2644 +0xc5
    runtime.goexit()
            /usr/lib/go/src/pkg/runtime/proc.c:1445
    
    goroutine 20 [chan receive]:
    launchpad.net/gocheck.(*suiteRunner).runFunc(0xd70ae140, 0xd70ae180, 0x0, 0x0, 0x821b2bc, 0x0)
            /home/pedronis/canonical/go-ws/src/launchpad.net/gocheck/gocheck.go:636 +0x71
    launchpad.net/gocheck.(*suiteRunner).runFixture(0xd70ae140, 0xd70ae180, 0x0, 0xd70b8170)
            /home/pedronis/canonical/go-ws/src/launchpad.net/gocheck/gocheck.go:684 +0x55
    launchpad.net/gocheck.(*suiteRunner).run(0xd70ae140, 0xd70b8170)
            /home/pedronis/canonical/go-ws/src/launchpad.net/gocheck/gocheck.go:581 +0x7f
    launchpad.net/gocheck.Run(0x81b8240, 0xd70b8170, 0xd70a46a0, 0x0)
            /home/pedronis/canonical/go-ws/src/launchpad.net/gocheck/run.go:76 +0x4a
    launchpad.net/gocheck.RunAll(0xd70a46a0, 0xd70a46a0)
            /home/pedronis/canonical/go-ws/src/launchpad.net/gocheck/run.go:68 +0xb8
    launchpad.net/gocheck.TestingT(0xd70841e0)
            /home/pedronis/canonical/go-ws/src/launchpad.net/gocheck/run.go:56 +0x29c
    explode.TestClient(0xd70841e0)
            /home/pedronis/canonical/go-ws/src/explode/explode_test.go:88 +0x2e
    testing.tRunner(0xd70841e0, 0x82a5630)
            /usr/lib/go/src/pkg/testing/testing.go:422 +0x87
    created by testing.RunTests
            /usr/lib/go/src/pkg/testing/testing.go:504 +0x720
    
    goroutine 21 [select]:
    launchpad.net/gocheck.(*resultTracker)._loopRoutine(0xd70ae100)
            /home/pedronis/canonical/go-ws/src/launchpad.net/gocheck/gocheck.go:436 +0x28f
    created by launchpad.net/gocheck.(*resultTracker).start
            /home/pedronis/canonical/go-ws/src/launchpad.net/gocheck/gocheck.go:416 +0x35
    
    goroutine 17 [syscall]:
    runtime.goexit()
            /usr/lib/go/src/pkg/runtime/proc.c:1445
    exit status 2
    FAIL    explode 0.030s
    

they all seem to point to a corrupted freelist.

This change seems to workaround the issue, pointing in the direction of C-heap pointers and finalizers instead of an issue with the code itself. AFAIU go gc should be able to ignore C-heap pointers so this wouldn't be needed, indeed the code works fine with no panics without this change with 1.2, 1.4.1 and 1.3.3 on 64 bits:

diff --git a/cclick.go b/cclick.go
index 976c00a..cc56c24 100644
--- a/cclick.go
+++ b/cclick.go
@@ -26,6 +26,7 @@ func (ccu *CClickUser) CInit(holder interface{}) error {
        }
        ccu.cref = cref
        runtime.SetFinalizer(holder, func(interface{}) {
+               ccu.cref = nil
                C.g_object_unref((C.gpointer)(cref))
        })
        return nil
@pedronis pedronis changed the title runtime: go 1.3.3 panics in gc on 32-bits only involving cgo, C heap pointers and finalizers (afaict) runtime: go 1.3.3 32-bits only panics in gc involving cgo, C heap pointers and finalizers (afaict) Feb 13, 2015
@pedronis pedronis changed the title runtime: go 1.3.3 32-bits only panics in gc involving cgo, C heap pointers and finalizers (afaict) runtime: go 1.3.3 (32-bits only) panics in gc involving cgo, C heap pointers and finalizers (afaict) Feb 13, 2015
@bradfitz
Copy link
Contributor

No more work is being done on Go 1.3. Please re-open if the problem happens with Go 1.4 or tip.

@chipaca
Copy link
Contributor

chipaca commented Feb 13, 2015

Less than six months support window. Noted.

@bradfitz
Copy link
Contributor

@chipaca, your sarcasm is unnecessary. The whole reason we have the Go 1 compatibility promise (https://golang.org/doc/go1compat) is so there's no reason for people to not upgrade from Go 1.n to Go 1.n+1. If something worked in Go 1.3 but doesn't work in Go 1.4, that's a bug and we'll fix it in Go 1.4 (in a point release) and at tip as well. That promise takes much more time than "six months". But what we're not going to do is go cherry-pick all fixes into all old releases of Go.

@chipaca
Copy link
Contributor

chipaca commented Feb 13, 2015

Sorry about the sarcasm; I am rather upset about this. Code that works with 1.2 has stopped working in 1.3, leading to a FTBFS in Ubuntu's Vivid (15.04) (which is the first one with 1.3 and not out the door yet) for code that worked up to 14.10. So we're left with deciding whether to ship unsupported golang packages, and work around the issue and hope it is a valid workaround and not merely decreasing the probability of it in testing, or moving Vivid to 1.4 (which is not a call I can make, and given that 1.4 isn't yet out of Debian experimental -- let alone unstable -- it is rather unlikely).

@bradfitz
Copy link
Contributor

/cc @dvyukov for any thoughts.

I agree that the release cycle mismatch between Go and Ubuntu/Debian is unfortunate, but it's hard to line up anything with Debian.

@ianlancetaylor
Copy link
Contributor

The Go team is not going to make fixes in old Go releases. There is no way that our small team can maintain past releases indefinitely. People using our releases are expected to upgrade.

However, you are not using our releases. You are using Debian's. There is nothing prevent Debian from patching an old release.

That said, I have not looked at your code. Liveness analysis is getting increasingly more precise. Errors involving finalizers being run unexpectedly are more common in later releases. In particular, the fact that some variable is live at the beginning of a function does not mean that it is live at the end. This means that a finalizer may be running when you do not expect it. Take a close look at your finalizers and make sure that when they call into C there is no way that they can be touching any C memory that is referenced by any other Go value.

@chipaca
Copy link
Contributor

chipaca commented Feb 13, 2015

Maybe I should've added that the issue is not present in 1.4?

@dvyukov
Copy link
Member

dvyukov commented Feb 13, 2015

The way finalizers work for C resource finalization is very subtle.

If you have code like:

holder.cref = C.create()
runtime.SetFinalizer(&holder, func(h *Holder) {
C.destroy(h.cref)
})
...
C.use(holder.cref)

By the time C.use starts executing, holder becomes dead and so can be finalized while C.use executes.

Looking at your patch, it seems that you can be affected by this subtlity.

@pedronis
Copy link
Author

my bad, the effect of the workaround is simply to stopping the collection
of holder completely :/

anyway the is no further use of ccu.cref in the code through ccu itself,
cref comes out of the C side and is never passed back to be used here,
except the value closed over to be for the unref but that doesn't go
through ccu or any other go object afaict,

On Fri, Feb 13, 2015 at 6:04 PM, Dmitry Vyukov notifications@github.com
wrote:

The way finalizers work for C resource finalization is very subtle.

If you have code like:

holder.cref = C.create()
runtime.SetFinalizer(&holder, func(h *Holder) {
C.destroy(h.cref)
})
...
C.use(holder.cref)

By the time C.use starts executing, holder becomes dead and so can be
finalized while C.use executes.

Looking at your patch, it seems that you can be affected by this subtlity.


Reply to this email directly or view it on GitHub
#9866 (comment).

@golang golang locked and limited conversation to collaborators Jun 25, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

6 participants