reflect: Call through autogenerated method corrupts memory on ppc64le #10628

Closed
davecheney opened this Issue Apr 30, 2015 · 10 comments

Comments

Projects
None yet
4 participants
Contributor

davecheney commented Apr 30, 2015

I'm very sorry about the quality of this issue report, I've yet to be able to resolve it to a smaller reproduction.

The background is we have an API server which instruments types and exposes their methods over an RPC interface. This stack trace is from the sever side, attempting to dispatch to agent.(*AgentAPIV1).StateServingInfo

unexpected fault address 0xf8010260f8010260
fatal error: fault
[signal 0xb code=0x1 addr=0xf8010260f8010260 pc=0x1e56c]

goroutine 229 [running]:
runtime.throw(0xf45d00, 0x5)
        /home/ubuntu/go/src/runtime/panic.go:543 +0x8c fp=0xc2083ff688 sp=0xc2083ff670
runtime.sigpanic()
        /home/ubuntu/go/src/runtime/sigpanic_unix.go:27 +0x2f0 fp=0xc2083ff6d8 sp=0xc2083ff688
runtime.convI2E(0xf8010260f8010258, 0x7c030378e8410070, 0x0, 0x0)
        /home/ubuntu/go/src/runtime/iface.go:256 +0x74 fp=0xc2083ff708 sp=0xc2083ff6e0
github.com/juju/juju/apiserver/agent.(*AgentAPIV0).StateServingInfo(0x31f5c8, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /home/ubuntu/src/github.com/juju/juju/apiserver/agent/agent_v0.go:93 +0xa0 fp=0xc2083ff830 sp=0xc2083ff708
github.com/juju/juju/apiserver/agent.(*AgentAPIV1).StateServingInfo(0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        <autogenerated>:22 +0x40 fp=0xc2083ff838 sp=0xc2083ff830
runtime.call128(0xc208205bd0, 0xe85098, 0xc208398380)
        /home/ubuntu/go/src/runtime/asm_ppc64x.s:413 +0x88 fp=0xc2083ff8c0 sp=0xc2083ff838
reflect.Value.call(0xe84f40, 0xc20802c5f0, 0x913, 0xf30060, 0x4, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /home/ubuntu/go/src/reflect/value.go:432 +0xe30 fp=0xc2083ffc00 sp=0xc2083ff8c0
reflect.Value.Call(0xe84f40, 0xc20802c5f0, 0x913, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
        /home/ubuntu/go/src/reflect/value.go:300 +0x94 fp=0xc2083ffc60 sp=0xc2083ffc00
github.com/juju/juju/rpc/rpcreflect.newMethod.func6(0xe84f40, 0xc20802c5f0, 0x16, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /home/ubuntu/src/github.com/juju/juju/rpc/rpcreflect/type.go:326 +0x1a0 fp=0xc2083ffd60 sp=0xc2083ffc60
github.com/juju/juju/apiserver.(*srvCaller).Call(0xc208581500, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /home/ubuntu/src/github.com/juju/juju/apiserver/root.go:131 +0xf0 fp=0xc2083ffde8 sp=0xc2083ffd60
github.com/juju/juju/rpc.(*Conn).runRequest(0xc208472be0, 0x3fff7fc0ee38, 0xc208581500, 0x1286728, 0x3, 0xc2080a8460, 0x5, 0x1, 0x0, 0x0, ...)
        /home/ubuntu/src/github.com/juju/juju/rpc/server.go:552 +0x114 fp=0xc2083fff28 sp=0xc2083ffde8
runtime.goexit()
        /home/ubuntu/go/src/runtime/asm_ppc64x.s:1132 +0x4 fp=0xc2083fff28 sp=0xc2083fff28
created by github.com/juju/juju/rpc.(*Conn).handleRequest
        /home/ubuntu/src/github.com/juju/juju/rpc/server.go:481 +0x5f4

however AgentAPIV1 is defined as

// AgentAPIV1 implements the version 1 of the API provided to an agent.
type AgentAPIV1 struct {
        *AgentAPIV0
}

So the compiler has autogenerated AgentAPIV1.StateServingInfo and forwarded it to AgentAPIV0.StateServingInfo

github.com/juju/juju/apiserver/agent.(*AgentAPIV0).StateServingInfo(0x31f5c8, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /home/ubuntu/src/github.com/juju/juju/apiserver/agent/agent_v0.go:93 +0xa0 fp=0xc2083ff830 sp=0xc2083ff708
github.com/juju/juju/apiserver/agent.(*AgentAPIV1).StateServingInfo(0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        <autogenerated>:22 +0x40 fp=0xc2083ff838 sp=0xc2083ff830

However, looking at the receiver value for agent.(*AgentAPIV1).StateServingInfo it is 0x00, but in the next frame the receiver is 0x31f5c8, this may be related to the problem.

The faulting line, /home/ubuntu/src/github.com/juju/juju/apiserver/agent/agent_v0.go:93, is

func (api *AgentAPIV0) StateServingInfo() (result state.StateServingInfo, err error) {
        auth := api.auth
        fmt.Printf("%#v\n", auth) // line 93

and at this point the value of auth is garbage, its interface type and value are random junk, runtime.convI2E(0xf8010260f8010258, 0x7c030378e8410070, 0x0, 0x0)

So far I have been able to determine that manually adding the forwarding method

func (api *AgentAPIV1) StateServingInfo() (result state.StateServingInfo, err error) {
        return api.AgentAPIV0.StateServingInfo()
}

Stops the panic, but I have not yet been able to construct a stand alone reproduction.

Sadly this only happens on our ppc64le system, which is not accessible to others.

/cc @randall77 @minux @aclements

@davecheney davecheney added the OS-Linux label Apr 30, 2015

@davecheney davecheney added this to the Go1.5 milestone Apr 30, 2015

Contributor

mwhudson commented Apr 30, 2015

Stupid question, but something about 0xf8010260f8010260 certainly catches the eye. Is 0xf8010260 a valid pointer? (It doesn't disassemble to anything very remarkable, ori r2,r0,504 or std r0,8288(r1) depending on endianess)

Contributor

davecheney commented Apr 30, 2015

Oh, I didn't spot that the top and bottom words are duplicated.

On 30 Apr 2015, at 17:44, Michael Hudson-Doyle notifications@github.com wrote:

Stupid question, but something about 0xf8010260f8010260 certainly catches the eye. Is 0xf8010260 a valid pointer? (It doesn't disassemble to anything very remarkable, ori r2,r0,504 or std r0,8288(r1) depending on endianess)


Reply to this email directly or view it on GitHub.

Contributor

rsc commented May 19, 2015

Same as #10844 ?

Contributor

davecheney commented May 19, 2015

@rsc I'd hope so, but I cannot confirm at this time.

I spent all yesterday bisecting trying to find a revision that this problem didn't exist in. The annoying part of that is that while I can replicate this problem easily at tip, as I go back in time the problem is still there, just less frequent so confirming a revision is good involves testing multiple times at each revision.

I hope to have the bisecting done today.

Contributor

rsc commented May 20, 2015

I don't believe this is the same as 10844 anymore.

Contributor

rsc commented Jun 8, 2015

There were some bugs in reflect.Call's type bitmaps that I fixed in May.
Is this still happening?
If so, Dave, is it possible for me to reproduce this somehow?

@rsc rsc changed the title from cmd/gc: reflect.Call through autogenerated method corrupts memory on ppc64le to reflect: Call through autogenerated method corrupts memory on ppc64le Jun 8, 2015

Contributor

davecheney commented Jun 8, 2015

I'll test again. I may be able to get you access to this system, but that
is bit guaranteed. Sadly this is another "needs magic MongoDB" repo.

On Mon, 8 Jun 2015 16:07 Russ Cox notifications@github.com wrote:

There were some bugs in reflect.Call's type bitmaps that I fixed in May.
Is this still happening?
If so, Dave, is it possible for me to reproduce this somehow?


Reply to this email directly or view it on GitHub
#10628 (comment).

Contributor

davecheney commented Jun 9, 2015

@rsc it looks like the problem is still there. It takes more load on the machine to trigger it, but it still happens.

unexpected fault address 0xf8010270f8010290
fatal error: fault
[signal 0xb code=0x1 addr=0xf8010270f8010290 pc=0x1d313c]

goroutine 206 [running]:
runtime.throw(0xfcf840, 0x5)
    /home/ubuntu/go/src/runtime/panic.go:527 +0x8c fp=0xc20882d6e0 sp=0xc20882d6c8
runtime.sigpanic()
    /home/ubuntu/go/src/runtime/sigpanic_unix.go:27 +0x2f0 fp=0xc20882d730 sp=0xc20882d6e0
github.com/juju/juju/apiserver/agent.(*AgentAPIV0).StateServingInfo(0x32c2f4, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    /home/ubuntu/src/github.com/juju/juju/apiserver/agent/agent_v0.go:87 +0x6c fp=0xc20882d828 sp=0xc20882d738
github.com/juju/juju/apiserver/agent.(*AgentAPIV1).StateServingInfo(0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    <autogenerated>:22 +0x40 fp=0xc20882d830 sp=0xc20882d828
runtime.call128(0xc208835220, 0xee8338, 0xc2080a3100)
    /home/ubuntu/go/src/runtime/asm_ppc64x.s:431 +0x88 fp=0xc20882d8b8 sp=0xc20882d830
reflect.Value.call(0xee81e0, 0xc20802c320, 0x913, 0xfb9b00, 0x4, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    /home/ubuntu/go/src/reflect/value.go:432 +0xe54 fp=0xc20882dc08 sp=0xc20882d8b8
reflect.Value.Call(0xee81e0, 0xc20802c320, 0x913, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
    /home/ubuntu/go/src/reflect/value.go:300 +0x94 fp=0xc20882dc68 sp=0xc20882dc08
github.com/juju/juju/rpc/rpcreflect.newMethod.func6(0xee81e0, 0xc20802c320, 0x16, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    /home/ubuntu/src/github.com/juju/juju/rpc/rpcreflect/type.go:323 +0xf8 fp=0xc20882dd28 sp=0xc20882dc68
github.com/juju/juju/apiserver.(*srvCaller).Call(0xc208864d40, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    /home/ubuntu/src/github.com/juju/juju/apiserver/root.go:131 +0xf0 fp=0xc20882ddb0 sp=0xc20882dd28
github.com/juju/juju/rpc.(*Conn).runRequest(0xc20843d720, 0x3fff84c7d818, 0xc208864d40, 0x12e4800, 0x3, 0xc208a09010, 0x5, 0x1, 0x0, 0x0, ...)
    /home/ubuntu/src/github.com/juju/juju/rpc/server.go:552 +0xe8 fp=0xc20882dee8 sp=0xc20882ddb0
runtime.goexit()
    /home/ubuntu/go/src/runtime/asm_ppc64x.s:1166 +0x4 fp=0xc20882dee8 sp=0xc20882dee8
created by github.com/juju/juju/rpc.(*Conn).handleRequest
    /home/ubuntu/src/github.com/juju/juju/rpc/server.go:481 +0x5ec

Because of $BUREAUCRACY I probably cannot get you access to the machine. I'll try to come up with a reproducer.

Contributor

rsc commented Jul 21, 2015

We're going to push ppc64 work off to Go 1.6 at this point.

@rsc rsc modified the milestones: Go1.6, Go1.5 Jul 21, 2015

Contributor

davecheney commented Aug 10, 2015

I turned of gocheck's pretty printing of the panic backtrace and found something interesting

winton-01(~/src/github.com/juju/juju/featuretests) % env GOTRACEBACK=crash ./featuretests.test 
panic: runtime error: makeslice: cap out of range

goroutine 11 [running]:
runtime.gopanic(0x14761e0, 0xc82027e020)
        /home/ubuntu/go/src/runtime/panic.go:461 +0x39c fp=0xc820483e20 sp=0xc820483da0
runtime.makeslice(0x10a0280, 0x0, 0xf46182595c618084, 0x0, 0x0, 0x0)
        /home/ubuntu/go/src/runtime/slice.go:30 +0x1b8 fp=0xc820483e70 sp=0xc820483e20
strconv.quoteWith(0xf8410188e8410050, 0xf8410190e8410058, 0xc820020022, 0x0, 0x0)
        /home/ubuntu/go/src/strconv/quote.go:17 +0x80 fp=0xc820483f30 sp=0xc820483e70
strconv.Quote(0xf8410188e8410050, 0xf8410190e8410058, 0x0, 0x0)
        /home/ubuntu/go/src/strconv/quote.go:93 +0x50 fp=0xc820483f60 sp=0xc820483f30
fmt.(*fmt).fmt_q(0xc8201dc398, 0xf8410188e8410050, 0xf8410190e8410058)
        /home/ubuntu/go/src/fmt/format.go:378 +0x150 fp=0xc820483fe0 sp=0xc820483f60
fmt.(*pp).fmtString(0xc8201dc340, 0xf8410188e8410050, 0xf8410190e8410058, 0xc800000071)
        /home/ubuntu/go/src/fmt/print.go:527 +0xfc fp=0xc820484010 sp=0xc820483fe0
fmt.(*pp).printArg(0xc8201dc340, 0x10cc860, 0xc82027e000, 0x71, 0x0, 0x0)
        /home/ubuntu/go/src/fmt/print.go:797 +0xed8 fp=0xc8204841c8 sp=0xc820484010
fmt.(*pp).doPrintf(0xc8201dc340, 0xc820484610, 0x18, 0xc82027e010, 0x1, 0x1)
        /home/ubuntu/go/src/fmt/print.go:1219 +0x1a20 fp=0xc820484580 sp=0xc8204841c8
fmt.Sprintf(0xc820484610, 0x18, 0xc82027e010, 0x1, 0x1, 0x0, 0x0)
        /home/ubuntu/go/src/fmt/print.go:203 +0x68 fp=0xc8204845d0 sp=0xc820484580
github.com/juju/errors.wrap(0x0, 0x0, 0x171b7c0, 0xe, 0x16fd2b0, 0xa, 0xc82027e010, 0x1, 0x1, 0x0, ...)
        /home/ubuntu/src/github.com/juju/errors/errortypes.go:13 +0xa8 fp=0xc820484688 sp=0xc8204845d0
github.com/juju/errors.NotFoundf(0x171b7c0, 0xe, 0xc820484890, 0x1, 0x1, 0x0, 0x0)
        /home/ubuntu/src/github.com/juju/errors/errortypes.go:29 +0xfc fp=0xc8204847a8 sp=0xc820484688
github.com/juju/juju/environs/configstore.CacheFile.readInfo(0xc8202ac720, 0xc8202ac750, 0xc8202ac780, 0xf8410188e8410050, 0xf8410190e8410058, 0x0, 0x0, 0x0)
        /home/ubuntu/src/github.com/juju/juju/environs/configstore/cachefile.go:101 +0x770 fp=0xc820484a38 sp=0xc8204847a8
github.com/juju/juju/environs/configstore.(*diskStore).readCacheFile(0xc820269f70, 0xf8410188e8410050, 0xf8410190e8410058, 0x0, 0x0, 0x0)
        /home/ubuntu/src/github.com/juju/juju/environs/configstore/disk.go:217 +0x144 fp=0xc820484ae0 sp=0xc820484a38
github.com/juju/juju/environs/configstore.(*diskStore).ReadInfo(0xc820269f70, 0xf8410188e8410050, 0xf8410190e8410058, 0x0, 0x0, 0x0, 0x0)
        /home/ubuntu/src/github.com/juju/juju/environs/configstore/disk.go:195 +0x190 fp=0xc820484b90 sp=0xc820484ae0
github.com/juju/juju/cmd/envcmd.ConnectionInfoForName(0xf8410188e8410050, 0xf8410190e8410058, 0x0, 0x0, 0x0, 0x0)
        /home/ubuntu/src/github.com/juju/juju/cmd/envcmd/environmentcommand.go:198 +0x104 fp=0xc820484c20 sp=0xc820484b90
github.com/juju/juju/cmd/envcmd.(*SysCommandBase).ConnectionInfo(0x478420, 0x0, 0x0, 0x0, 0x0)
        /home/ubuntu/src/github.com/juju/juju/cmd/envcmd/systemcommand.go:130 +0xec fp=0xc820484c98 sp=0xc820484c20
github.com/juju/juju/cmd/envcmd.(*SysCommandBase).ConnectionEndpoint(0x478420, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /home/ubuntu/src/github.com/juju/juju/cmd/envcmd/systemcommand.go:116 +0x5c fp=0xc820484df8 sp=0xc820484c98
github.com/juju/juju/cmd/juju/user.(*AddCommand).ConnectionEndpoint(0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        <autogenerated>:17 +0x30 fp=0xc820484e00 sp=0xc820484df8
github.com/juju/juju/cmd/juju/user.writeServerFile(0x3fff846cae58, 0xc82040ac80, 0xc82040b590, 0x16fb618, 0x4, 0xc8202b7900, 0x18, 0xc8202ac5a0, 0x2c, 0x0, ...)
        /home/ubuntu/src/github.com/juju/juju/cmd/juju/user/common.go:29 +0x80 fp=0xc820485090 sp=0xc820484e00
github.com/juju/juju/cmd/juju/user.(*AddCommand).Run(0xc82040ac80, 0xc82040b590, 0x0, 0x0)
        /home/ubuntu/src/github.com/juju/juju/cmd/juju/user/add.go:109 +0x790 fp=0xc8204852b8 sp=0xc820485090
github.com/juju/juju/cmd/envcmd.(*sysCommandWrapper).Run(0xc82023e580, 0xc82040b590, 0x0, 0x0)
        <autogenerated>:61 +0x9c fp=0xc820485300 sp=0xc8204852b8
github.com/juju/cmd.(*SuperCommand).Run(0xc8204de000, 0xc82040b590, 0x0, 0x0)
        /home/ubuntu/src/github.com/juju/cmd/supercommand.go:427 +0x720 fp=0xc820485530 sp=0xc820485300
github.com/juju/juju/testing.RunCommand(0xc8204de0f0, 0x3fff846ca8f8, 0xc8204de000, 0xc82023e560, 0x2, 0x2, 0x0, 0x0, 0x0)
        /home/ubuntu/src/github.com/juju/juju/testing/cmd.go:79 +0xe4 fp=0xc820485598 sp=0xc820485530
github.com/juju/juju/featuretests.(*UserSuite).RunUserCommand(0xc8204cd6c0, 0xc8204de0f0, 0xc82023e560, 0x2, 0x2, 0x0, 0x0, 0x0)
        /home/ubuntu/src/github.com/juju/juju/featuretests/cmd_juju_user_test.go:38 +0x7c fp=0xc820485610 sp=0xc820485598
github.com/juju/juju/featuretests.(*UserSuite).TestUserAdd(0xc8204cd6c0, 0xc8204de0f0)
        /home/ubuntu/src/github.com/juju/juju/featuretests/cmd_juju_user_test.go:42 +0xcc fp=0xc8204857a0 sp=0xc820485610
runtime.call32(0xc82040a0a0, 0x1674618, 0xc820478670, 0x1000000010, 0x12)
        /home/ubuntu/go/src/runtime/asm_ppc64x.s:429 +0x84 fp=0xc8204857d0 sp=0xc8204857a0
reflect.Value.call(0x1673fe0, 0xc8204cd6c0, 0x3d13, 0x16f0870, 0x4, 0xc820485ee8, 0x1, 0x1, 0x0, 0x0, ...)
        /home/ubuntu/go/src/reflect/value.go:432 +0xe54 fp=0xc820485b20 sp=0xc8204857d0
reflect.Value.Call(0x1673fe0, 0xc8204cd6c0, 0x3d13, 0xc820485ee8, 0x1, 0x1, 0x0, 0x0, 0x0)
        /home/ubuntu/go/src/reflect/value.go:300 +0x94 fp=0xc820485b80 sp=0xc820485b20
gopkg.in/check%2ev1.(*suiteRunner).forkTest.func1(0xc8204de0f0)
        /home/ubuntu/src/gopkg.in/check.v1/check.go:772 +0x5ac fp=0xc820485f48 sp=0xc820485b80
gopkg.in/check%2ev1.(*suiteRunner).forkCall.func1(0xc82008e000, 0xc8204de0f0, 0xc8204caf00)
        /home/ubuntu/src/gopkg.in/check.v1/check.go:665 +0x50 fp=0xc820485f60 sp=0xc820485f48
runtime.goexit()
        /home/ubuntu/go/src/runtime/asm_ppc64x.s:1157 +0x4 fp=0xc820485f60 sp=0xc820485f60

It looks like the autogenerated method is not directly responsible (although adding an explicit forwarding method rather than letting the kernel generate the wrapper always fixes the crash)

github.com/juju/juju/cmd/envcmd.ConnectionInfoForName(0xf8410188e8410050, 0xf8410190e8410058, 0x0, 0x0, 0x0, 0x0)
        /home/ubuntu/src/github.com/juju/juju/cmd/envcmd/environmentcommand.go:198 +0x104 fp=0xc820484c20 sp=0xc820484b90
github.com/juju/juju/cmd/envcmd.(*SysCommandBase).ConnectionInfo(0x478420, 0x0, 0x0, 0x0, 0x0)
        /home/ubuntu/src/github.com/juju/juju/cmd/envcmd/systemcommand.go:130 +0xec fp=0xc820484c98 sp=0xc820484c20
github.com/juju/juju/cmd/envcmd.(*SysCommandBase).ConnectionEndpoint(0x478420, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /home/ubuntu/src/github.com/juju/juju/cmd/envcmd/systemcommand.go:116 +0x5c fp=0xc820484df8 sp=0xc820484c98
github.com/juju/juju/cmd/juju/user.(*AddCommand).ConnectionEndpoint(0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        <autogenerated>:17 +0x30 fp=0xc820484e00 sp=0xc820484df8

github.com/juju/juju/cmd/envcmd.ConnectionInfoForName is passed a string, and returns two interface values. The string passed in here is garbage.

@davecheney davecheney closed this Aug 11, 2015

davecheney pushed a commit to davecheney/juju that referenced this issue Aug 12, 2015

jujubot added a commit to juju/juju that referenced this issue Aug 12, 2015

Merge pull request #2968 from davecheney/undo-issue-10628-workaround
Undo workaround for golang/go#10628

Undo the workaround for golang/go#10628 now that the fix is available upstream.

(Review request: http://reviews.vapour.ws/r/2348/)

@gopherbot gopherbot locked and limited conversation to collaborators Aug 10, 2016

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.