cmd/link: darwin_amd64: running dsymutil failed: signal: segmentation fault #26237

Closed
pcman312 opened this Issue Jul 5, 2018 · 19 comments

pcman312 commented Jul 5, 2018

What version of Go are you using (go version)?

$ go version
go version go1.10.3 darwin/amd64

$ sw_vers
ProductName:	Mac OS X
ProductVersion:	10.12.6
BuildVersion:	16G1114

$ clang --version
Apple LLVM version 9.0.0 (clang-900.0.39.2)
Target: x86_64-apple-darwin16.7.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

Does this issue reproduce with the latest release?

Yes. I was originally on Go 1.10.2 but upgraded to 1.10.3 to see if a fix already existed, and the issue persisted.

What operating system and processor architecture are you using (go env)?

$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE=[snip]
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH=[snip]
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/1n/x7xttf9n7ylbl8vn50830dsh6qmzqb/T/go-build029428003=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do? / What did you expect to see? / What did you see instead?

If possible, provide a recipe for reproducing the error.
A complete runnable program is good.
A link on play.golang.org is best.

We have a REST service that accesses an Elasticsearch database. Its main package imports a sub-package of the service that accesses ES (.../db/elastic), and that sub-package imports ES "gopkg.in/olivere/elastic.v3". While refactoring, I pulled that library into main in addition to the sub-package.

We format our imports like this:

[stdlib]

[not-our code - this is where the olivere library is]

[our code - both this service as well as any internal libraries we are using]

When I pulled in the olivere library and compiled it, I got this error:

/usr/local/go/pkg/tool/darwin_amd64/link: /usr/local/go/pkg/tool/darwin_amd64/link: running dsymutil failed: signal: segmentation fault

I've been experimenting to find the cause, and so far my suspicion has narrowed to the order of imports.

This import successfully builds:

[other external imports]

ES "gopkg.in/olivere/elastic.v3"
".../db/elastic"
[other service/internal imports]

This import segfaults as above:

[other external imports]
ES "gopkg.in/olivere/elastic.v3"

".../db/elastic"
[other service/internal imports]

If I put the olivere import in a separate block (newlines around it), it seems to have to be strictly after the service import .../db/elastic:

Succeeds:

[other imports]

".../db/elastic"

ES "gopkg.in/olivere/elastic.v3"

[other imports]

Also succeeds:

[other imports]

// The order doesn't matter if they don't have newlines between them
ES "gopkg.in/olivere/elastic.v3"
".../db/elastic"

[other imports]

Fails:

[other imports]

ES "gopkg.in/olivere/elastic.v3"

".../db/elastic"

[other imports]

I've been trying to build a standalone program that I can reproduce this in, but so far I have been unsuccessful. When I pull out the code in question to a sandbox program, I am unable to reproduce it. It's possible that some other import is playing some sort of role, but I haven't been able to narrow it down because it does not seem to matter where I put the two imports as long as they are in the same block (without newlines between them) or are in a specific order.

Also worth noting: when I do a cross compile to linux, this succeeds regardless of how I organize the imports.

The behavior seems very similar to #23374 but I don't know if it's related beyond using the same compiler.

@ianlancetaylor ianlancetaylor changed the title from Another cmd/link: darwin_amd64: running dsymutil failed: signal: segmentation fault to cmd/link: darwin_amd64: running dsymutil failed: signal: segmentation fault Jul 6, 2018

@ianlancetaylor ianlancetaylor added this to the Go1.11 milestone Jul 6, 2018

Contributor

ianlancetaylor commented Jul 6, 2018

CC @thanm

Can you check whether this is fixed in the 1.11beta1 release? Thanks.

@thanm thanm self-assigned this Jul 6, 2018

Member

thanm commented Jul 6, 2018

Hi @pcman312 -- I'd like to take a closer look at this, but I'm not sure there is much I can do without a stand-alone reproducer. Perhaps you could share an example with the same imports even if it doesn't cause the crash?

Also worth mentioning: folks have seen DWARF problems with older versions of Xcode, for example in https://github.com/golang/go/issues/25392. For your system, could you please share the output of:

$ xcodebuild -version

Thanks.

pcman312 commented Jul 6, 2018

@thanm

$ xcodebuild
xcode-select: error: tool 'xcodebuild' requires Xcode, but active developer directory '/Library/Developer/CommandLineTools' is a command line tools instance

I'm not sure how much of this I can share since it's proprietary code dealing with our database unfortunately. I should be able to put something together that has the same dependency tree, just no functional contents if that works.

Member

thanm commented Jul 6, 2018

"same dependency tree, just no functional contents"

This is very unlikely to help, sorry. DWARF problems of this sort can be tricky to reproduce.

pcman312 commented Jul 6, 2018

No kidding. I was just looking through it to try to narrow it down to a smaller program that I could share, and discovered a couple of other dependencies that, when removed, make the error disappear. Do you have any recommendations for things I can do to try to reproduce this for you?

pcman312 commented Jul 6, 2018

I have a version of this that I can share that reproduces the error. I haven't been able to get this to reduce any further:
https://github.com/pcman312/darwin_test

I confirmed this by having a coworker try to compile it and he got the same error.

When I mess around with the imports in main.go or with any of the handling after the Do() call (https://github.com/pcman312/darwin_test/blob/master/es/esclient.go#L21-L26) it successfully compiles.

pcman312 commented Jul 6, 2018

Interesting further investigation: I had another engineer who did not have Go set up on his computer try building that test and his succeeded. That suggests there is some difference between my local version of either of the two libraries in question. I will upload copies of my local versions to the darwin_test package. They will not be in a vendor directory, just in appropriately named folders.

Member

thanm commented Jul 6, 2018

Thanks for creating the reproducer. So far I can't reproduce the dsymutil crash on my mac with either 1.10.3 or the 1.11 beta candidate... I'll keep poking around at it to see what I can figure out.

pcman312 commented Jul 6, 2018

I've uploaded copies of those two libraries to the darwin_test project. I'll see if that engineer I mentioned previously can reproduce it with those versions.

Member

thanm commented Jul 6, 2018

SGTM.

pcman312 commented Jul 6, 2018

I uploaded my local copy of golang.org/x/sys/unix as well. We are still unable to reproduce the error on his system with those 3 libraries copied from my system to his. This suggests it may be some other versioning problem. I checked his system and he is running a different version of OSX:

$ go version
go version go1.10.3 darwin/amd64

$ sw_vers
ProductName:    Mac OS X
ProductVersion:    10.13.3
BuildVersion:    17D47

$ clang --version
Apple LLVM version 9.1.0 (clang-902.0.39.2)
Target: x86_64-apple-darwin17.4.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

The first engineer (who can reproduce this error) is on the same version of OSX and clang. Their Xcode versions, as reported by xcodebuild, are different: can reproduce, Xcode 9.2 (Build version 9C40b); cannot reproduce, Xcode 9.4 (Build version 9F1027a); mine shows an error.

Member

thanm commented Jul 6, 2018

I'm running Xcode 9.4 Build version 9F1027a, so in theory my machine should have the same setup as yours. Hmm.

One other shot in the dark: generated DWARF is sensitive to the length of pathnames. If you could please send me the number of characters in your GOPATH and/or GOROOT settings, I can try to replicate them in my setup. E.g.

$ go env GOPATH
/foo/bar/mygopath

here length = 17 chars.

pcman312 commented Jul 6, 2018

My xcode is giving me an error when I ask for a version, but the engineer who can reproduce the error has version 9.2, not 9.4.

The length of my path to the darwin_test project is 62 characters:

$ pwd
/Users/[15 characters]/go/src/github.com/pcman312/darwin_test

$ pwd | wc -c
      62

The path to the top of the service in question is 58 characters. From there it goes to .../db/elastic where it's at 69 characters. The folder with the longest path to it in the project (excluding vendoring) is 83 characters. None of these numbers include the filename at the end, only the folders. Including the files, the longest is 106 (unsurprisingly in the same folder as the longest folder path).

Obligatory relevant XKCD: https://m.xkcd.com/688/

Edit: GOROOT is /usr/local/go

Member

thanm commented Jul 9, 2018

I recreated my environment using GOPATH/GOROOT directories with string lengths that match your setup -- still can't reproduce the issue on my system (no dsymutil errors). I am mostly out of ideas at this point.

pcman312 commented Jul 9, 2018

Is it possible for you to downgrade clang to 9.0.0? I'm going to try to find some other computers with 9.0.0 installed and see if it reproduces on those machines.

Member

thanm commented Jul 9, 2018

OK, now I think we're finally getting somewhere. I downgraded Xcode on my machine to 9.0.1, and I am finally able to trigger the dsymutil crash.

I ran the link with go build -x -work -ldflags="-v -tmpdir=/tmp/xxx", then picked apart the intermediate objects. It looks like the DWARF in /tmp/xxx/go.o is bad... running dwarf-check on it I get:

loading DWARF for /tmp/xxx/go.o
examining DWARF for /tmp/xxx/go.o
unresolved abstract origin ref from DIE 51249 at offset 0x136ca5 to bad offset 0x136bb6

0x136ca5: FormalParameter
at=AbstractOrigin val=0x136bb6
at=Location val=0x91987f

Parent:
0x136c8b: InlinedSubroutine
at=AbstractOrigin val=0x136b7b
at=Lowpc val=0x2481e7
at=Highpc val=0x2481f0
at=CallFile val=0x1
at=CallLine val=0x18
leaving main

I'll have to look more closely, but at this point it certainly looks like a compiler bug, probably something similar to #23374. More to come later.

pcman312 commented Jul 9, 2018

That's excellent news! I've been trying to gather as much data as I can on my side regarding versions, and so far I've gotten 6 people on 9.1.0 and 3 people on 9.0.0. All of the 9.1.0 machines built successfully, while the 9.0.0 machines failed.

Edit: Also someone with 8.1.0 got the segfault as well

Member

thanm commented Jul 10, 2018

I have a fix for this that I will be sending shortly. The construct that is triggering this bug is a package import whose terminal directory contains a "." character. Within the compiler such package paths are canonicalized to mangle/hide the dot (this mangling is not supposed to be exposed to the user, it's simply something to make symbol handling easier); the bug was that part of the DWARF generation code wasn't canonicalizing an import path where it was supposed to, which later on triggered an inconsistency in the DWARF.
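
To make the mangling a bit more concrete, here is a simplified sketch of escaping dots in the last element of an import path. The function `escapeLastElemDots` is hypothetical and written only for illustration; the compiler's own canonicalization lives in an internal package that cannot be imported, handles more characters than just the dot, and the exact escaped form shown here (a percent sign followed by the hex byte value) is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// escapeLastElemDots (hypothetical helper) replaces each '.' that appears
// after the final '/' with a percent-escaped hex form. Dots in earlier
// path elements, such as the one in "gopkg.in", are left alone.
func escapeLastElemDots(path string) string {
	slash := strings.LastIndex(path, "/") // -1 when the path has no slash
	var b strings.Builder
	for i := 0; i < len(path); i++ {
		if c := path[i]; c == '.' && i > slash {
			fmt.Fprintf(&b, "%%%02x", c)
		} else {
			b.WriteByte(c)
		}
	}
	return b.String()
}

func main() {
	// The import path from this issue has a dot in its last element.
	fmt.Println(escapeLastElemDots("gopkg.in/olivere/elastic.v3"))
}
```

If one part of the toolchain names a symbol using the escaped path and another part uses the raw path, the two names no longer match, which is the kind of inconsistency described above.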

gopherbot commented Jul 10, 2018

Change https://golang.org/cl/123036 mentions this issue: cmd/compile: call objabi.PathToPrefix when emitting abstract fn

@gopherbot gopherbot closed this in ec88f78 Jul 10, 2018
