go/parser: SIGSEGV in (*resolver).resolveList #52046

Open
bcmills opened this issue Mar 30, 2022 · 16 comments

@bcmills
Member

@bcmills bcmills commented Mar 30, 2022

unexpected fault address 0x833a6e1c00000000
fatal error: fault
[signal SIGSEGV: segmentation violation code=0x3 addr=0x833a6e1c00000000 pc=0x19818]

goroutine 203 [running]:
runtime.throw({0x2db0c9?, 0x62778?})
	/workdir/go/src/runtime/panic.go:992 +0x58 fp=0xc000be9070 sp=0xc000be9030 pc=0x45aa8
runtime.sigpanic()
	/workdir/go/src/runtime/signal_unix.go:825 +0x1b8 fp=0xc000be90b0 sp=0xc000be9070 pc=0x5dac8
runtime.convI2I(0xc000f00100?, 0x0?)
	/workdir/go/src/runtime/iface.go:415 +0x38 fp=0xc000be9108 sp=0xc000be90d0 pc=0x19818
go/parser.(*resolver).resolveList(0xc000088000?, 0x109e3c?)
	/workdir/go/src/go/parser/resolver.go:525 +0xb0 fp=0xc000be9170 sp=0xc000be9108 pc=0x120190
go/parser.(*resolver).Visit(0xc000269c80, {0x33fdc8?, 0xc000efa7e0?})
	/workdir/go/src/go/parser/resolver.go:493 +0x1a4 fp=0xc000be99f0 sp=0xc000be9170 pc=0x11d614
go/ast.Walk({0x33f018?, 0xc000269c80?}, {0x33fdc8?, 0xc000efa7e0?})
	/workdir/go/src/go/ast/walk.go:52 +0x74 fp=0xc000be9b08 sp=0xc000be99f0 pc=0x105934
go/parser.resolveFile(0xc000eb3100, 0xc0002296e0, 0x0)
	/workdir/go/src/go/parser/resolver.go:32 +0x404 fp=0xc000be9bb0 sp=0xc000be9b08 pc=0x11c3e4
go/parser.(*parser).parseFile(0xc0001cf400)
	/workdir/go/src/go/parser/parser.go:2897 +0x3a4 fp=0xc000be9cf0 sp=0xc000be9bb0 pc=0x11bd74
go/parser.ParseFile(0xc000116000, {0xc0001c4270, 0x21}, {0x28a080?, 0xc000af5140?}, 0x0?)
	/workdir/go/src/go/parser/interface.go:122 +0x154 fp=0xc000be9dc0 sp=0xc000be9cf0 pc=0x109454
golang.org/x/tools/go/packages.newLoader.func2(0xc0001c4270?, {0xc0001c4270, 0x21}, {0xc000096480?, 0x0?, 0x0?})
	/workdir/gopath/src/golang.org/x/tools/go/packages/packages.go:582 +0x90 fp=0xc000be9e18 sp=0xc000be9dc0 pc=0x2514b0
golang.org/x/tools/go/packages.(*loader).parseFile(0xc000112000, {0xc0001c4270, 0x21})
	/workdir/gopath/src/golang.org/x/tools/go/packages/packages.go:1042 +0x3dc fp=0xc000be9f38 sp=0xc000be9e18 pc=0x2548dc
golang.org/x/tools/go/packages.(*loader).parseFiles.func1(0x69, {0xc0001c4270?, 0x0?})
	/workdir/gopath/src/golang.org/x/tools/go/packages/packages.go:1070 +0x64 fp=0xc000be9f88 sp=0xc000be9f38 pc=0x254f54
golang.org/x/tools/go/packages.(*loader).parseFiles.func2()
	/workdir/gopath/src/golang.org/x/tools/go/packages/packages.go:1072 +0x60 fp=0xc000be9fc0 sp=0xc000be9f88 pc=0x254ee0
runtime.goexit()
	/workdir/go/src/runtime/asm_ppc64x.s:905 +0x4 fp=0xc000be9fc0 sp=0xc000be9fc0 pc=0x78c44
created by golang.org/x/tools/go/packages.(*loader).parseFiles
	/workdir/gopath/src/golang.org/x/tools/go/packages/packages.go:1069 +0x4ec

greplogs --dashboard -md -l -e 'goroutine \d+ \[running\]:\n(.+\n\t.+\n)*go/parser\.\(\*resolver\)\.resolveList' --omit=-n2d

2022-03-21T13:26:21-86b02b3-7eaad60/linux-ppc64le-buildlet
2021-04-28T19:13:50-16b25d2-ad989c7/linux-arm-scaleway

(CC @griesemer)

@bcmills bcmills added the NeedsInvestigation label Mar 30, 2022
@bcmills bcmills added this to the Backlog milestone Mar 30, 2022
@griesemer
Contributor

@griesemer griesemer commented Mar 30, 2022

cc @findleyr

@findleyr
Contributor

@findleyr findleyr commented Mar 30, 2022

Is this likely related to go/parser, or the builders themselves? See also #51487.

I'd guess the common denominator is that the resolver does a lot of allocation via map writes? For example, the scaleway failure is actually the following:
fatal error: runtime: out of memory
https://build.golang.org/log/48bbb71a6cb675fefd008946e5bdce61e970b2b6

CC @golang/runtime

@bcmills
Member Author

@bcmills bcmills commented Apr 1, 2022

One more. Not clear to me whether this is related to go/parser or the builder itself.

greplogs --dashboard -md -l -e 'goroutine \d+ \[running\]:\n(.+\n\t.+\n)*go/parser\.\(\*resolver\)\.resolveList' --since=2022-03-30

2022-03-31T21:21:20-b9a4807-ff6b6c6/linux-ppc64le-buildlet

@findleyr
Contributor

@findleyr findleyr commented Apr 1, 2022

Given that the scaleway failure was unambiguously an OOM, and the other two failures are on linux-ppc64le-buildlet, I am inclined to think that this may be a builder problem (or a strange manifestation of OOM on ppc). But I have no idea how this could occur.

@bcmills
Member Author

@bcmills bcmills commented Apr 21, 2022

Indeed, this seems to somehow be specific to linux-ppc64le-buildlet. The OOM failure mode suggests a likely connection to #49957.

greplogs --dashboard -md -l -e 'goroutine \d+ \[running\]:\n(.+\n\t.+\n)*go/parser\.\(\*resolver\)\.resolveList' --since=2022-04-01

2022-04-20T20:19:07-5d7ca8a-24fcbb9/linux-ppc64-buildlet
2022-04-04T15:12:26-153e30b-f86f9a3/linux-ppc64le-buildlet

@bcmills
Member Author

@bcmills bcmills commented Apr 21, 2022

@pmur, what kind of overcommit policy does linux-ppc64-buildlet have? Typically I expect an OOM-on-overcommit to result in SIGKILL — do we have a theory as to why it would manifest as SIGSEGV here?

@pmur
Contributor

@pmur pmur commented Apr 21, 2022

I can check (via osuol). The ppc64 builders don't enforce container RAM limits like the ppc64le builders. I am not sure if the Docker issues mentioned in the systemd unit file have been resolved (or whether they are tracked anywhere). I am guessing these use the default kernel overcommit options (mode 0 and 50% overcommit).

I requested that osuol bump the memory allocated to these VMs from 20 GB to 30 GB. I think that change became effective in the last week.
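
For anyone with shell access to a builder, a small sketch like the following could confirm the overcommit settings instead of guessing. It reads the standard Linux files `/proc/sys/vm/overcommit_memory` and `/proc/sys/vm/overcommit_ratio`; the helper names are made up for this example, and whether the builders expose these files inside their containers is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseSetting converts the text of a /proc/sys/vm file (for example "0\n")
// to an integer.
func parseSetting(raw string) (int, error) {
	return strconv.Atoi(strings.TrimSpace(raw))
}

// readSetting reads one integer-valued sysctl file.
func readSetting(path string) (int, error) {
	b, err := os.ReadFile(path)
	if err != nil {
		return 0, err
	}
	return parseSetting(string(b))
}

// describeMode names the three values the kernel accepts for
// vm.overcommit_memory.
func describeMode(mode int) string {
	switch mode {
	case 0:
		return "heuristic overcommit"
	case 1:
		return "always overcommit"
	case 2:
		return "strict accounting"
	}
	return "unknown"
}

func main() {
	mode, err := readSetting("/proc/sys/vm/overcommit_memory")
	if err != nil {
		fmt.Println("cannot read overcommit_memory:", err)
		return
	}
	ratio, err := readSetting("/proc/sys/vm/overcommit_ratio")
	if err != nil {
		fmt.Println("cannot read overcommit_ratio:", err)
		return
	}
	fmt.Printf("overcommit_memory=%d (%s), overcommit_ratio=%d%%\n",
		mode, describeMode(mode), ratio)
}
```

Note that the kernel only applies `overcommit_ratio` in mode 2, so under the default mode 0 the 50% figure does not act as a hard limit.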

@pmur
Contributor

@pmur pmur commented Apr 21, 2022

The segfault address on ppc64 looks suspicious. It looks like an endian-swapped address, based on how I've seen Go grow its heap on ppc64le.

@cherrymui
Member

@cherrymui cherrymui commented Apr 21, 2022

What address is suspicious with the wrong endianness?

[signal SIGSEGV: segmentation violation code=0x1 addr=0x1 pc=0x19818]

This address looks reasonable.

@pmur
Contributor

@pmur pmur commented Apr 21, 2022

The value in the first comment: 0x833a6e1c00000000.

@pmur
Contributor

@pmur pmur commented Apr 21, 2022

Regarding 2022-04-20T20:19:07-5d7ca8a-24fcbb9/linux-ppc64-buildlet: is it possible that a failed mmap wasn't handled somewhere?

@cherrymui
Member

@cherrymui cherrymui commented Apr 21, 2022

0x833a6e1c00000000

That address is indeed weird. But it doesn't look like an endianness-swapped address, which would be 0x000000001c6e3a83. The heap address is typically 0x000000c0xxxxxxxx.

Perhaps it is off by 32 bits, i.e. 833a6e1c should be the low 32 bits of something, not the high bits?
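
The two readings of the fault address can be checked mechanically. This is a small sketch using `math/bits`; the constant is the address from the first traceback, and the helper names are invented for the example.

```go
package main

import (
	"fmt"
	"math/bits"
)

// faultAddr is the fault address reported in the first traceback.
const faultAddr uint64 = 0x833a6e1c00000000

// byteSwapped returns the address with its eight bytes in reverse order,
// which is what a pure endianness mix-up would produce.
func byteSwapped(a uint64) uint64 { return bits.ReverseBytes64(a) }

// high32 returns the upper 32 bits, i.e. the value that would result if the
// address were off by a 32-bit shift.
func high32(a uint64) uint64 { return a >> 32 }

func main() {
	fmt.Printf("fault address: 0x%016x\n", faultAddr)
	fmt.Printf("byte-swapped:  0x%016x\n", byteSwapped(faultAddr)) // 0x000000001c6e3a83
	fmt.Printf("high 32 bits:  0x%08x\n", high32(faultAddr))       // 0x833a6e1c
	fmt.Printf("low 32 bits:   0x%08x\n", faultAddr&0xffffffff)    // 0x00000000
}
```

Neither result has the 0x000000c0xxxxxxxx shape of a typical heap address, which is consistent with the comment above.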

@pmur
Contributor

@pmur pmur commented Apr 21, 2022

Looking at other logs, it looks more like a bad pointer (2022-03-31T21:21:20-b9a4807-ff6b6c6/linux-ppc64le-buildlet).

@laboger
Contributor

@laboger laboger commented Apr 21, 2022

Pointers are usually printed as 8-byte values, but in this case it is printed as 16 bytes, which is why it looks weird. I've seen addresses printed like this before.

I don't think these are OOMs on Power -- if they hit OOM it would SIGKILL and not execute the rest of the tests.
SIGSEGV happens when trying to execute an instruction using an invalid address.
The fact that these failures are not all on the same test and don't fail the same way would indicate there is something random happening, like branching or returning to a wrong instruction address where it expects the registers to contain something other than they do. So it is not really accessing a valid address.

Interesting that it doesn't happen on power9. But the fact that it only happens in tests with golang.org/x/tools would make it seem like it is related to that package. Unfortunately it is hard to debug if we can't reproduce it. But maybe we can get an idea of when it started happening?

@bcmills
Member Author

@bcmills bcmills commented Apr 25, 2022

https://build.golang.org/log/0017fe5991c697b5529548a93953bc686e2ac01d looks like the same failure mode too, although because of the indentation I'm having trouble shoehorning it into a good greplogs regexp. 🤷‍♂️

    multichecker_test.go:80: [-findcall.name=nosuchfunc io]: out=<<unexpected fault address 0x63a0865300000000
        fatal error: fault
        [signal SIGSEGV: segmentation violation code=0x3 addr=0x63a0865300000000 pc=0x19cd8]
        
        goroutine 628 [running]:
        runtime.throw({0x33c9a7?, 0x62cb8?})
        	/workdir/go/src/runtime/panic.go:1000 +0x58 fp=0xc000ef3070 sp=0xc000ef3030 pc=0x45ff8
        runtime.sigpanic()
        	/workdir/go/src/runtime/signal_unix.go:830 +0x1b8 fp=0xc000ef30b0 sp=0xc000ef3070 pc=0x5dff8
        runtime.convI2I(0xc000b3be50?, 0xc000653860?)
        	/workdir/go/src/runtime/iface.go:415 +0x38 fp=0xc000ef3108 sp=0xc000ef30d0 pc=0x19cd8
        go/parser.(*resolver).resolveList(0xc000088000?, 0x10ba8c?)
        	/workdir/go/src/go/parser/resolver.go:524 +0xb0 fp=0xc000ef3170 sp=0xc000ef3108 pc=0x121de0
        go/parser.(*resolver).Visit(0xc000873aa0, {0x3af630?, 0xc0019305d0?})
        	/workdir/go/src/go/parser/resolver.go:492 +0x1a4 fp=0xc000ef39f0 sp=0xc000ef3170 pc=0x11f264
        go/ast.Walk({0x3ae678?, 0xc000873aa0?}, {0x3af630?, 0xc0019305d0?})
        	/workdir/go/src/go/ast/walk.go:51 +0x74 fp=0xc000ef3b08 sp=0xc000ef39f0 pc=0x107584
        go/parser.resolveFile(0xc0017eb200, 0xc00061a9c0, 0x0)
        	/workdir/go/src/go/parser/resolver.go:32 +0x404 fp=0xc000ef3bb0 sp=0xc000ef3b08 pc=0x11e034
        go/parser.(*parser).parseFile(0xc000ec5680)
        	/workdir/go/src/go/parser/parser.go:2890 +0x3a4 fp=0xc000ef3cf0 sp=0xc000ef3bb0 pc=0x11d9c4
        go/parser.ParseFile(0xc0000ec400, {0xc000231fb0, 0x21}, {0x2de6c0?, 0xc000e97458?}, 0x0?)
        	/workdir/go/src/go/parser/interface.go:119 +0x154 fp=0xc000ef3dc0 sp=0xc000ef3cf0 pc=0x10b0a4
        golang.org/x/tools/go/packages.newLoader.func2(0xc000231fb0?, {0xc000231fb0, 0x21}, {0xc000096480?, 0x0?, 0x0?})
        	/workdir/gopath/src/golang.org/x/tools/go/packages/packages.go:611 +0x90 fp=0xc000ef3e18 sp=0xc000ef3dc0 pc=0x274600
        golang.org/x/tools/go/packages.(*loader).parseFile(0xc0000ba1c0, {0xc000231fb0, 0x21})
        	/workdir/gopath/src/golang.org/x/tools/go/packages/packages.go:1077 +0x3d4 fp=0xc000ef3f38 sp=0xc000ef3e18 pc=0x277ac4
        golang.org/x/tools/go/packages.(*loader).parseFiles.func1(0x6a, {0xc000231fb0?, 0x0?})
        	/workdir/gopath/src/golang.org/x/tools/go/packages/packages.go:1104 +0x64 fp=0xc000ef3f88 sp=0xc000ef3f38 pc=0x278134
        golang.org/x/tools/go/packages.(*loader).parseFiles.func2()
        	/workdir/gopath/src/golang.org/x/tools/go/packages/packages.go:1106 +0x60 fp=0xc000ef3fc0 sp=0xc000ef3f88 pc=0x2780c0
        runtime.goexit()
        	/workdir/go/src/runtime/asm_ppc64x.s:892 +0x4 fp=0xc000ef3fc0 sp=0xc000ef3fc0 pc=0x790c4
        created by golang.org/x/tools/go/packages.(*loader).parseFiles
        	/workdir/gopath/src/golang.org/x/tools/go/packages/packages.go:1103 +0x4ec

@laboger
Contributor

@laboger laboger commented Jun 10, 2022

Paul tried to reproduce this problem but was unable to.
