New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: macOS 10.10 builder is red with TSAN memory alloc failures #26475

Open
bradfitz opened this Issue Jul 19, 2018 · 18 comments

Comments

Projects
None yet
7 participants
@bradfitz
Member

bradfitz commented Jul 19, 2018

The release notes for Go 1.11 say that our minimum macOS supported version is "macOS 10.10 Yosemite or later", but our macOS 10.10 column is currently red.

The first failure I see is: (rev 7951d90, which I happened to approve perhaps mistakenly thinking it was trivially correct and only affected debug mode.. maybe wrong?)

https://build.golang.org/log/17de002c5f0eb3ea6136185d5f943d1f38eb5a63


##### Testing race detector
ok  	runtime/race	33.822s
ok  	flag	1.079s
ok  	net	1.241s
ok  	os	1.124s
ok  	os/exec	3.102s
ok  	encoding/gob	1.132s
==59101==ERROR: ThreadSanitizer failed to allocate 0x10000000 (268435456) bytes at address a30000000000 (errno: 12)
runtime: newstack sp=0xb0103e20 stack=[0x20c000032000, 0x20c000032800]
	morebuf={pc:0x41bcba9 sp:0xb0103e30 lr:0x0}
	sched={pc:0x4044789 sp:0xb0103e28 lr:0x0 ctxt:0x0}
runtime: gp=0x20c000076300, goid=18, gp->status=0x2
 runtime: split stack overflow: 0xb0103e20 < 0x20c000032000
fatal error: runtime: split stack overflow

runtime stack:
runtime.throw(0x427c0fd, 0x1d)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/panic.go:608 +0x6a fp=0xb0103cf0 sp=0xb0103cc0 pc=0x402d95a
runtime.newstack()
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/stack.go:959 +0x89f fp=0xb0103e88 sp=0xb0103cf0 pc=0x404826f
runtime.morestack()
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/asm_amd64.s:429 +0x8f fp=0xb0103e90 sp=0xb0103e88 pc=0x405ce5f

goroutine 18 [running]:
runtime: unexpected return pc for runtime.sigpanic called from 0x41bcba9
stack: frame={sp:0xb0103e28, fp:0xb0103e30} stack=[0x20c000032000,0x20c000032800)

runtime.sigpanic()
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/signal_unix.go:371 +0x2f9 fp=0xb0103e30 sp=0xb0103e28 pc=0x4044789
created by runtime.gcenable
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/mgc.go:216 +0x58

goroutine 1 [chan receive, locked to thread]:
runtime.gopark(0x4283908, 0x20c000082058, 0x20c00002170d, 0x3)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/proc.go:302 +0xeb fp=0x20c000036678 sp=0x20c000036658 pc=0x402f6cb
runtime.goparkunlock(0x20c000082058, 0x170d, 0x3)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/proc.go:308 +0x53 fp=0x20c0000366a8 sp=0x20c000036678 pc=0x402f773
runtime.chanrecv(0x20c000082000, 0x0, 0x20c000000301, 0x401a6b8)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/chan.go:520 +0x20b fp=0x20c000036738 sp=0x20c0000366a8 pc=0x40085db
runtime.chanrecv1(0x20c000082000, 0x0)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/chan.go:402 +0x2b fp=0x20c000036768 sp=0x20c000036738 pc=0x40083bb
runtime.gcenable()
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/mgc.go:217 +0x6f fp=0x20c000036798 sp=0x20c000036768 pc=0x401a6cf
runtime.main()
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/proc.go:164 +0x119 fp=0x20c0000367e0 sp=0x20c000036798 pc=0x402f1c9
runtime.goexit()
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/asm_amd64.s:1333 +0x1 fp=0x20c0000367e8 sp=0x20c0000367e0 pc=0x405edd1

goroutine 2 [force gc (idle)]:
runtime.gopark(0x4283908, 0x441b290, 0x1410, 0x1)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/proc.go:302 +0xeb fp=0x20c000036f80 sp=0x20c000036f60 pc=0x402f6cb
runtime.goparkunlock(0x441b290, 0x1410, 0x1)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/proc.go:308 +0x53 fp=0x20c000036fb0 sp=0x20c000036f80 pc=0x402f773
runtime.forcegchelper()
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/proc.go:251 +0xb3 fp=0x20c000036fe0 sp=0x20c000036fb0 pc=0x402f543
runtime.goexit()
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/asm_amd64.s:1333 +0x1 fp=0x20c000036fe8 sp=0x20c000036fe0 pc=0x405edd1
created by runtime.init.4
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/proc.go:240 +0x35
exit status 2
FAIL	_/private/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/misc/cgo/test	0.016s
2018/07/09 15:59:30 Failed: exit status 1
ok  	flag	1.080s
ok  	os/exec	3.121s

/cc @randall77 @aclements @dvyukov

@aclements

This comment has been minimized.

Member

aclements commented Jul 19, 2018

@randall77 debugged this a bit when I blamed the parent commit of 7951d90:

I'm not really sure. I can see where you are coming from, but I don't understand how this CL would have broken the race detector.
Gomoteing a 1.10 builder, all.bash passes on tip. But that doesn't really match the builder behavior, which looks like it deterministically fails (unless it nondeterministically fails at an earlier test).
Maybe there's something weird about the specific builder used by the dashboard?

@bradfitz

This comment has been minimized.

Member

bradfitz commented Jul 19, 2018

Since Dec 2017 (in https://go-review.googlesource.com/82355), the Mac builders are always fresh VMs booted per build. They should be identical between build.golang.org and gomote and trybots. The only difference is the build system doesn't use all.bash. It runs make.bash and then runs the go tool dist test --list units itself with go tool dist test $TEST_UNIT.

@aclements

This comment has been minimized.

Member

aclements commented Jul 19, 2018

I was able to reproduce with VM=$(gomote create darwin-amd64-10_10) && gomote push $VM && gomote run $VM go/src/make.bash && gomote run $VM go/bin/go tool dist test race

@aclements

This comment has been minimized.

Member

aclements commented Jul 19, 2018

I'm able to reproduce at 94076fe (@randall77's commit, 7951d90~), but not at 7951d90~~, so this does appear to be from commit 94076fe.

@bradfitz

This comment has been minimized.

Member

bradfitz commented Jul 19, 2018

@aclements

This comment has been minimized.

Member

aclements commented Jul 19, 2018

@aclements

This comment has been minimized.

Member

aclements commented Jul 19, 2018

Ah, interesting, my commit to use TSAN-compatible hints in race mode (2d4c1ff) changed the nature of the failure: https://build.golang.org/log/6d2ecc136e9649755ba2a8c0f38ce9e656a854eb

fatal error: too many address space collisions for -race mode

runtime stack:
runtime.throw(0x42805c5, 0x30)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/panic.go:608 +0x72 fp=0x7fff5fbff1a8 sp=0x7fff5fbff178 pc=0x402d742
runtime.(*mheap).sysAlloc(0x4466820, 0x4000000, 0x0, 0x0)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/malloc.go:575 +0x5b3 fp=0x7fff5fbff228 sp=0x7fff5fbff1a8 pc=0x400dcd3
runtime.(*mheap).grow(0x4466820, 0x1, 0x0)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/mheap.go:920 +0x42 fp=0x7fff5fbff280 sp=0x7fff5fbff228 pc=0x40253b2
runtime.(*mheap).allocSpanLocked(0x4466820, 0x1, 0x441f568, 0x0)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/mheap.go:848 +0x309 fp=0x7fff5fbff2c0 sp=0x7fff5fbff280 pc=0x4025239
runtime.(*mheap).alloc_m(0x4466820, 0x1, 0x2a, 0x0)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/mheap.go:692 +0x119 fp=0x7fff5fbff300 sp=0x7fff5fbff2c0 pc=0x4024a79
runtime.(*mheap).alloc.func1()
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/mheap.go:759 +0x4c fp=0x7fff5fbff338 sp=0x7fff5fbff300 pc=0x405ae0c
runtime.(*mheap).alloc(0x4466820, 0x1, 0x1002a, 0x0)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/mheap.go:758 +0x8a fp=0x7fff5fbff388 sp=0x7fff5fbff338 pc=0x4024d1a
runtime.(*mcentral).grow(0x44685d8, 0x0)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/mcentral.go:232 +0x94 fp=0x7fff5fbff3d0 sp=0x7fff5fbff388 pc=0x4018cc4
runtime.(*mcentral).cacheSpan(0x44685d8, 0x5805000)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/mcentral.go:106 +0x2f8 fp=0x7fff5fbff418 sp=0x7fff5fbff3d0 pc=0x4018818
runtime.(*mcache).refill(0x5805000, 0xffffff2a)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/mcache.go:122 +0x95 fp=0x7fff5fbff448 sp=0x7fff5fbff418 pc=0x40183d5
runtime.(*mcache).nextFree.func1()
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/malloc.go:749 +0x32 fp=0x7fff5fbff468 sp=0x7fff5fbff448 pc=0x405a222
runtime.(*mcache).nextFree(0x5805000, 0x402a, 0x5805000, 0x7fff5fbff520, 0x400f0fb)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/malloc.go:748 +0xb6 fp=0x7fff5fbff4c0 sp=0x7fff5fbff468 pc=0x400dd96
runtime.mallocgc(0x180, 0x4275f60, 0x5809001, 0x405a2b0)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/malloc.go:903 +0x7ae fp=0x7fff5fbff560 sp=0x7fff5fbff4c0 pc=0x400e6fe
runtime.newobject(0x4275f60, 0x7fff5fbff570)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/malloc.go:1032 +0x38 fp=0x7fff5fbff590 sp=0x7fff5fbff560 pc=0x400eae8
runtime.malg(0x8000, 0x5805000)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/proc.go:3285 +0x31 fp=0x7fff5fbff5d0 sp=0x7fff5fbff590 pc=0x4036d11
runtime.mpreinit(0x44624e0)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/os_darwin.go:225 +0x29 fp=0x7fff5fbff5f0 sp=0x7fff5fbff5d0 pc=0x402b979
runtime.mcommoninit(0x44624e0)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/proc.go:624 +0xc1 fp=0x7fff5fbff628 sp=0x7fff5fbff5f0 pc=0x40305c1
runtime.schedinit()
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/proc.go:546 +0xaa fp=0x7fff5fbff690 sp=0x7fff5fbff628 pc=0x403028a
runtime.rt0_go(0x7fff5fbff6d8, 0x4, 0x7fff5fbff6d8, 0x0, 0x7fff99efb5c9, 0x7fff99efb5c9, 0x0, 0x4, 0x7fff5fbff8f8, 0x7fff5fbff94a, ...)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/asm_amd64.s:195 +0x11c fp=0x7fff5fbff698 sp=0x7fff5fbff690 pc=0x405cc5c
exit status 2
FAIL	_/private/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/misc/cgo/test	0.013s

This is still really weird, though. Before we've even run user code, it's claiming that it couldn't get a single successful mmap in the entire TSAN heap range. Maybe this is some conflict with CoreFoundation global initializers, given that 94076fe started linking in CoreFoundation?

@aclements

This comment has been minimized.

Member

aclements commented Jul 20, 2018

It is in fact blowing through all of the arena hints. The mmap calls are succeeding, but returning a different address than requested:

reserve failed for 0xc000000000 n=0x4000000 got 0x5845000
reserve failed for 0xc100000000 n=0x4000000 got 0x5845000
...

I made it spin after blowing through all of the hints and used sudo vmmap PID to get its memory map:

Process:         test.test [5864]
Path:            /private/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/go-build444054229/b001/test.test
Load Address:    0x4000000
Identifier:      test.test
Version:         0
Code Type:       X86-64
Parent Process:  go [4939]

Date/Time:       2018-07-20 07:32:21.298 -0700
OS Version:      Mac OS X 10.10.5 (14F1912)
Report Version:  7
Analysis Tool:   /usr/bin/vmmap
----

Virtual Memory Map of process 5864 (test.test)
Output report format:  2.3  -- 64-bit process

==== Non-writable regions for process 5864
REGION TYPE                      START - END             [ VSIZE] PRT/MAX SHRMOD  REGION DETAIL
__TEXT                 0000000004000000-00000000043fc000 [ 4080K] r-x/rwx SM=COW  ...001/test.test
__LINKEDIT             0000000004e31000-0000000004ebe000 [  564K] r--/rwx SM=COW  ...001/test.test
MALLOC (admin)         0000000004ebf000-0000000004ec0000 [    4K] r--/rwx SM=ZER  
MALLOC (admin)         0000000004ec1000-0000000004ec2000 [    4K] ---/rwx SM=NUL  
MALLOC (admin)         0000000004ed7000-0000000004ed9000 [    8K] ---/rwx SM=NUL  
MALLOC (admin)         0000000004eee000-0000000004eef000 [    4K] ---/rwx SM=NUL  
MALLOC (admin)         0000000004eef000-0000000004ef0000 [    4K] r--/rwx SM=PRV  
STACK GUARD            00007fff5bc00000-00007fff5f400000 [ 56.0M] ---/rwx SM=NUL  ... for thread 0
__TEXT                 00007fff62ea2000-00007fff62ed9000 [  220K] r-x/rwx SM=COW  /usr/lib/dyld
__LINKEDIT             00007fff62f12000-00007fff62f26000 [   80K] r--/rwx SM=COW  /usr/lib/dyld
__TEXT                 00007fff8aec7000-00007fff8aef0000 [  164K] r-x/r-x SM=COW  .../libxpc.dylib
__TEXT                 00007fff8aef0000-00007fff8aef9000 [   36K] r-x/r-x SM=COW  ...m_dnssd.dylib
__TEXT                 00007fff8b647000-00007fff8b64a000 [   12K] r-x/r-x SM=COW  ...sandbox.dylib
__TEXT                 00007fff8b976000-00007fff8bd0f000 [ 3684K] r-x/r-x SM=COW  ...oreFoundation
__TEXT                 00007fff8be1d000-00007fff8be46000 [  164K] r-x/r-x SM=COW  ...em_info.dylib
__TEXT                 00007fff8bf5c000-00007fff8bf65000 [   36K] r-x/r-x SM=COW  ...latform.dylib
__TEXT                 00007fff8d83e000-00007fff8d855000 [   92K] r-x/r-x SM=COW  ...tem_asl.dylib
__TEXT                 00007fff8dc13000-00007fff8dc15000 [    8K] r-x/r-x SM=COW  ..._blocks.dylib
__TEXT                 00007fff8e1fd000-00007fff8e28a000 [  564K] r-x/r-x SM=COW  ...ystem_c.dylib
__TEXT                 00007fff8eebc000-00007fff8eec8000 [   48K] r-x/r-x SM=COW  ...nCrypto.dylib
__TEXT                 00007fff8eed2000-00007fff8eed4000 [    8K] r-x/r-x SM=COW  ...sClient.dylib
__TEXT                 00007fff8efaf000-00007fff8efcd000 [  120K] r-x/r-x SM=COW  ..._kernel.dylib
__TEXT                 00007fff8f3ed000-00007fff8f3ff000 [   72K] r-x/r-x SM=COW  ...coretls.dylib
__TEXT                 00007fff8f401000-00007fff8f406000 [   20K] r-x/r-x SM=COW  ...ibcache.dylib
__TEXT                 00007fff8f415000-00007fff8f488000 [  460K] r-x/r-x SM=COW  ...sions/A/IOKit
__TEXT                 00007fff8f488000-00007fff8f48f000 [   28K] r-x/r-x SM=COW  ...m_trace.dylib
__TEXT                 00007fff8f661000-00007fff8f662000 [    4K] r-x/r-x SM=COW  ...bkeymgr.dylib
__TEXT                 00007fff8f820000-00007fff8f822000 [    8K] r-x/r-x SM=COW  ...secinit.dylib
__TEXT                 00007fff8faf4000-00007fff8faf6000 [    8K] r-x/r-x SM=COW  ...ystem.B.dylib
__TEXT                 00007fff8fb86000-00007fff8fb87000 [    4K] r-x/r-x SM=COW  .../libunc.dylib
__TEXT                 00007fff90286000-00007fff902bf000 [  228K] r-x/r-x SM=COW  ...network.dylib
__TEXT                 00007fff90fa5000-00007fff90fac000 [   28K] r-x/r-x SM=COW  ...tension.dylib
__TEXT                 00007fff91d29000-00007fff91d2a000 [    4K] r-x/r-x SM=COW  ...blaunch.dylib
__TEXT                 00007fff923ed000-00007fff923f0000 [   12K] r-x/r-x SM=COW  ...rantine.dylib
__TEXT                 00007fff93aea000-00007fff93b1b000 [  196K] r-x/r-x SM=COW  ...ystem_m.dylib
__TEXT                 00007fff93ee4000-00007fff93eed000 [   36K] r-x/r-x SM=COW  ...opyfile.dylib
__TEXT                 00007fff947c4000-00007fff94a3b000 [ 2524K] r-x/r-x SM=COW  ...ns/A/Security
__TEXT                 00007fff95171000-00007fff95179000 [   32K] r-x/r-x SM=COW  ...iler_rt.dylib
__TEXT                 00007fff951a5000-00007fff951c2000 [  116K] r-x/r-x SM=COW  ..._malloc.dylib
__TEXT                 00007fff9533b000-00007fff9534c000 [   68K] r-x/r-x SM=COW  ...ibbsm.0.dylib
__TEXT                 00007fff95358000-00007fff9535b000 [   12K] r-x/r-x SM=COW  ...uration.dylib
__TEXT                 00007fff95aae000-00007fff95ab1000 [   12K] r-x/r-x SM=COW  ...ervices.dylib
__TEXT                 00007fff95bac000-00007fff95d92000 [ 1944K] r-x/r-x SM=COW  ...ucore.A.dylib
__TEXT                 00007fff95e16000-00007fff95e17000 [    4K] r-x/r-x SM=COW  ...ingUtil.dylib
__TEXT                 00007fff969ab000-00007fff96a9e000 [  972K] r-x/r-x SM=COW  ...bxml2.2.dylib
__TEXT                 00007fff9771d000-00007fff9771f000 [    8K] r-x/r-x SM=COW  ...ovefile.dylib
__TEXT                 00007fff97ef2000-00007fff97efc000 [   40K] r-x/r-x SM=COW  ...pthread.dylib
__TEXT                 00007fff98392000-00007fff983a0000 [   56K] r-x/r-x SM=COW  ...ibxar.1.dylib
__TEXT                 00007fff983a0000-00007fff983a6000 [   24K] r-x/r-x SM=COW  ...bunwind.dylib
__TEXT                 00007fff983a6000-00007fff983b0000 [   40K] r-x/r-x SM=COW  ..._notify.dylib
__TEXT                 00007fff983c2000-00007fff98417000 [  340K] r-x/r-x SM=COW  ...ibc++.1.dylib
__TEXT                 00007fff984ff000-00007fff986fa000 [ 2028K] r-x/r-x SM=COW  ...bobjc.A.dylib
__TEXT                 00007fff98906000-00007fff9890c000 [   24K] r-x/r-x SM=COW  ...ibmacho.dylib
__TEXT                 00007fff98a40000-00007fff98a6b000 [  172K] r-x/r-x SM=COW  ...ispatch.dylib
__TEXT                 00007fff98c8e000-00007fff98cba000 [  176K] r-x/r-x SM=COW  ...bc++abi.dylib
__TEXT                 00007fff98d28000-00007fff98d2e000 [   24K] r-x/r-x SM=COW  ...m_stats.dylib
__TEXT                 00007fff98ff5000-00007fff98ff9000 [   16K] r-x/r-x SM=COW  ...libdyld.dylib
__TEXT                 00007fff99800000-00007fff99805000 [   20K] r-x/r-x SM=COW  ...ibpam.2.dylib
__TEXT                 00007fff99805000-00007fff99813000 [   56K] r-x/r-x SM=COW  ...bz2.1.0.dylib
__TEXT                 00007fff9a2e0000-00007fff9a35a000 [  488K] r-x/r-x SM=COW  ...ecrypto.dylib
__TEXT                 00007fff9a3a3000-00007fff9a3af000 [   48K] r-x/r-x SM=COW  ...libkxld.dylib
__TEXT                 00007fff9a516000-00007fff9a528000 [   72K] r-x/r-x SM=COW  .../libz.1.dylib
__TEXT                 00007fff9a528000-00007fff9a66c000 [ 1296K] r-x/r-x SM=COW  ...sqlite3.dylib
__TEXT                 00007fff9a68d000-00007fff9a6d4000 [  284K] r-x/r-x SM=COW  ...libauto.dylib
__UNICODE              00007fff9ac57000-00007fff9ace1000 [  552K] r--/r-- SM=COW  ...oreFoundation
__LINKEDIT             00007fff9ad65000-00007fff9f440000 [ 70.9M] r--/r-- SM=COW  ...ed __LINKEDIT
VM_ALLOCATE            00007fffffe00000-00007fffffe01000 [    4K] r--/r-- SM=SHM  
shared memory          00007fffffeab000-00007fffffeac000 [    4K] r-x/r-x SM=SHM  

==== Writable regions for process 5864
REGION TYPE                      START - END             [ VSIZE] PRT/MAX SHRMOD  REGION DETAIL
__DATA                 00000000043fc000-000000000441d000 [  132K] rw-/rwx SM=COW  ...001/test.test
__DATA                 000000000441d000-0000000004e31000 [ 10.1M] rw-/rwx SM=PRV  ...001/test.test
Kernel Alloc Once      0000000004ebe000-0000000004ebf000 [    4K] rw-/rwx SM=PRV  
MALLOC                 0000000004ec0000-0000000004ec1000 [    4K] rw-/rwx SM=ZER  
MALLOC                 0000000004ec2000-0000000004ed7000 [   84K] rw-/rwx SM=PRV  
MALLOC                 0000000004ed9000-0000000004eee000 [   84K] rw-/rwx SM=PRV  
Performance tool data  0000000004ef0000-0000000004f00000 [   64K] rw-/rwx SM=PRV  
MALLOC_TINY            0000000004f00000-0000000005000000 [ 1024K] rw-/rwx SM=PRV  ...one_0x4ebf000
MALLOC_SMALL           0000000005000000-0000000005800000 [ 8192K] rw-/rwx SM=PRV  ...one_0x4ebf000
Performance tool data  0000000005800000-0000000005805000 [   20K] rw-/rwx SM=PRV  
VM_ALLOCATE            0000000005805000-0000000005846000 [  260K] rw-/rwx SM=PRV  
Performance tool data  0000200010ff1000-00002000111fa000 [ 2084K] rw-/rwx SM=NUL  ...(unallocated)
Performance tool data  00003000021f0000-0000300002240000 [  320K] rw-/rwx SM=NUL  ...(unallocated)
Performance tool data  0000600000000000-0000600000040000 [  256K] rw-/rwx SM=PRV  
Performance tool data  0000600001000000-0000600001001000 [    4K] rw-/rwx SM=PRV  
Stack                  00007fff5f400000-00007fff5fbff000 [ 8188K] rw-/rwx SM=ZER  thread 0
Stack                  00007fff5fbff000-00007fff5fc00000 [    4K] rw-/rwx SM=COW  thread 0
__DATA                 00007fff62ed9000-00007fff62edc000 [   12K] rw-/rwx SM=PRV  /usr/lib/dyld
__DATA                 00007fff62edc000-00007fff62f12000 [  216K] rw-/rwx SM=COW  /usr/lib/dyld
__DATA                 00007fff7ad39000-00007fff7ad3e000 [   20K] rw-/rwx SM=COW  .../libxpc.dylib
__DATA                 00007fff7ad3e000-00007fff7ad3f000 [    4K] rw-/rwx SM=COW  ...m_dnssd.dylib
__DATA                 00007fff7af9b000-00007fff7af9c000 [    4K] rw-/rw- SM=COW  ...sandbox.dylib
__DATA                 00007fff7affc000-00007fff7b000000 [   16K] rw-/rw- SM=COW  ...oreFoundation
__DATA                 00007fff7b000000-00007fff7b052000 [  328K] rw-/rwx SM=COW  ...oreFoundation
__DATA                 00007fff7b09d000-00007fff7b0a0000 [   12K] rw-/rwx SM=COW  ...em_info.dylib
__DATA                 00007fff7b0cb000-00007fff7b0cc000 [    4K] rw-/rwx SM=COW  ...latform.dylib
__DATA                 00007fff7b5c9000-00007fff7b5cb000 [    8K] rw-/rw- SM=COW  ...tem_asl.dylib
__DATA                 00007fff7b621000-00007fff7b622000 [    4K] rw-/rwx SM=COW  ..._blocks.dylib
__DATA                 00007fff7b783000-00007fff7b78d000 [   40K] rw-/rwx SM=COW  ...ystem_c.dylib
__DATA                 00007fff7b95f000-00007fff7b961000 [    8K] rw-/rwx SM=COW  ...nCrypto.dylib
__DATA                 00007fff7b964000-00007fff7b965000 [    4K] rw-/rwx SM=COW  ...sClient.dylib
__DATA                 00007fff7b97c000-00007fff7b97f000 [   12K] rw-/rwx SM=COW  ..._kernel.dylib
__DATA                 00007fff7ba5d000-00007fff7ba5e000 [    4K] rw-/rw- SM=COW  ...coretls.dylib
__DATA                 00007fff7ba5f000-00007fff7ba60000 [    4K] rw-/rw- SM=COW  ...ibcache.dylib
__DATA                 00007fff7ba68000-00007fff7ba71000 [   36K] rw-/rw- SM=COW  ...sions/A/IOKit
__DATA                 00007fff7ba71000-00007fff7ba72000 [    4K] rw-/rw- SM=COW  ...m_trace.dylib
__DATA                 00007fff7baf4000-00007fff7baf5000 [    4K] rw-/rw- SM=COW  ...bkeymgr.dylib
__DATA                 00007fff7bb6c000-00007fff7bb6d000 [    4K] rw-/rw- SM=COW  ...secinit.dylib
__DATA                 00007fff7bc58000-00007fff7bc59000 [    4K] rw-/rw- SM=COW  ...ystem.B.dylib
__DATA                 00007fff7bd31000-00007fff7bd37000 [   24K] rw-/rw- SM=COW  ...network.dylib
__DATA                 00007fff7bf30000-00007fff7bf31000 [    4K] rw-/rw- SM=COW  ...tension.dylib
__DATA                 00007fff7c208000-00007fff7c209000 [    4K] rw-/rw- SM=COW  ...rantine.dylib
__DATA                 00007fff7c521000-00007fff7c522000 [    4K] rw-/rw- SM=COW  ...ystem_m.dylib
__DATA                 00007fff7c61c000-00007fff7c61d000 [    4K] rw-/rw- SM=COW  ...opyfile.dylib
__DATA                 00007fff7c727000-00007fff7c759000 [  200K] rw-/rw- SM=COW  ...ns/A/Security
__DATA                 00007fff7c81a000-00007fff7c81c000 [    8K] rw-/rwx SM=COW  ...iler_rt.dylib
__DATA                 00007fff7c81d000-00007fff7c81e000 [    4K] rw-/rwx SM=COW  ..._malloc.dylib
__DATA                 00007fff7c874000-00007fff7c876000 [    8K] rw-/rwx SM=COW  ...ibbsm.0.dylib
__DATA                 00007fff7c87a000-00007fff7c87b000 [    4K] rw-/rwx SM=COW  ...uration.dylib
__DATA                 00007fff7c8ea000-00007fff7c8eb000 [    4K] rw-/rwx SM=COW  ...ervices.dylib
__DATA                 00007fff7c937000-00007fff7c958000 [  132K] rw-/rwx SM=COW  ...ucore.A.dylib
__DATA                 00007fff7c983000-00007fff7c984000 [    4K] rw-/rwx SM=COW  ...ingUtil.dylib
__DATA                 00007fff7ca41000-00007fff7ca4c000 [   44K] rw-/rw- SM=COW  ...bxml2.2.dylib
__DATA                 00007fff7cbc1000-00007fff7cbc2000 [    4K] rw-/rw- SM=COW  ...ovefile.dylib
__DATA                 00007fff7cd2f000-00007fff7cd33000 [   16K] rw-/rwx SM=COW  ...pthread.dylib
__DATA                 00007fff7ce96000-00007fff7ce97000 [    4K] rw-/rwx SM=COW  ...ibxar.1.dylib
__DATA                 00007fff7ce97000-00007fff7ce98000 [    4K] rw-/rwx SM=COW  ...bunwind.dylib
__DATA                 00007fff7ce98000-00007fff7ce99000 [    4K] rw-/rwx SM=COW  ..._notify.dylib
__DATA                 00007fff7ce9b000-00007fff7cea3000 [   32K] rw-/rwx SM=COW  ...ibc++.1.dylib
__DATA                 00007fff7cee7000-00007fff7cef6000 [   60K] rw-/rwx SM=COW  ...bobjc.A.dylib
__DATA                 00007fff7cf34000-00007fff7cf35000 [    4K] rw-/rwx SM=COW  ...ibmacho.dylib
__DATA                 00007fff7cf7b000-00007fff7cf8c000 [   68K] rw-/rwx SM=COW  ...ispatch.dylib
__DATA                 00007fff7cfed000-00007fff7cfef000 [    8K] rw-/rwx SM=COW  ...bc++abi.dylib
__DATA                 00007fff7d001000-00007fff7d002000 [    4K] rw-/rwx SM=COW  ...m_stats.dylib
__DATA                 00007fff7d01a000-00007fff7d01b000 [    4K] rw-/rwx SM=COW  ...libdyld.dylib
__DATA                 00007fff7d1ab000-00007fff7d1ac000 [    4K] rw-/rwx SM=COW  ...ibpam.2.dylib
__DATA                 00007fff7d1ac000-00007fff7d1ad000 [    4K] rw-/rwx SM=COW  ...bz2.1.0.dylib
__DATA                 00007fff7d26e000-00007fff7d276000 [   32K] rw-/rw- SM=COW  ...ecrypto.dylib
__DATA                 00007fff7d27e000-00007fff7d27f000 [    4K] rw-/rw- SM=COW  ...libkxld.dylib
__DATA                 00007fff7d2a0000-00007fff7d2a1000 [    4K] rw-/rw- SM=COW  .../libz.1.dylib
__DATA                 00007fff7d2a1000-00007fff7d2a6000 [   20K] rw-/rw- SM=COW  ...sqlite3.dylib
__DATA                 00007fff7d2a8000-00007fff7d2aa000 [    8K] rw-/rw- SM=COW  ...libauto.dylib

==== Legend
SM=sharing mode:  
	COW=copy_on_write PRV=private NUL=empty ALI=aliased 
	SHM=shared ZER=zero_filled S/A=shared_alias

==== Summary for process 5864
ReadOnly portion of Libraries: Total=92.2M resident=43.3M(47%) swapped_out_or_unallocated=48.9M(53%)
Writable regions: Total=30.4M written=1812K(6%) resident=1872K(6%) swapped_out=0K(0%) unallocated=28.6M(94%)

REGION TYPE                        VIRTUAL
===========                        =======
Kernel Alloc Once                       4K
MALLOC                               9388K        see MALLOC ZONE table below
MALLOC (admin)                         24K
Performance tool data                 344K        not counted in TOTAL below
Performance tool data (reserved)     2404K        reserved VM address space (unallocated)
STACK GUARD                          56.0M
Stack                                8192K
VM_ALLOCATE                           264K
__DATA                               11.7M
__LINKEDIT                           71.5M
__TEXT                               20.7M
__UNICODE                             552K
shared memory                           4K
===========                        =======
TOTAL                               177.9M
TOTAL, minus reserved VM space      175.5M


                               VIRTUAL ALLOCATION      BYTES
MALLOC ZONE                       SIZE      COUNT  ALLOCATED  % FULL
===========                    =======  =========  =========  ======
DefaultMallocZone_0x4ebf000      9216K        622       118K      1%

Notably, there's nothing in [0xc000000000, 0xe000000000) at all.

Maybe CoreFoundation in macOS 10.10 is intercepting mmap calls and ignoring the hint address?

@aclements

This comment has been minimized.

Member

aclements commented Jul 20, 2018

Now I'm extra confused. I tried the following C program:

#include <sys/mman.h>
#include <stdio.h>
#include <errno.h>

int main() 
{
        void *x = mmap((void*)0xc000000000ull, 0x4000000, PROT_NONE, MAP_PRIVATE|MAP_ANON, -1, 0);
        printf("%p %d\n", x, errno);
        return 0;
}

Even when compiled with just gcc x.c (no -framework CoreFoundation), on macOS 1.10, mmap is ignoring the hint address. But based on debug prints I added, it clearly does honor the hint address for Go programs, except in misc/cgo/test since the test added in 94076fe.

@dvyukov

This comment has been minimized.

Member

dvyukov commented Jul 20, 2018

MAP_FIXED?

@aclements

This comment has been minimized.

Member

aclements commented Jul 20, 2018

It does honor the address with MAP_FIXED, but that's playing with fire. I'd much rather just disable this test under TSAN on Darwin.

@aclements

This comment has been minimized.

Member

aclements commented Jul 20, 2018

This keeps getting weirder.

misc/cgo/test already linked against CoreFoundation prior to 94076fe because of issue21897.go. Going back to 94076fe and removing the tests it added (but keeping the cgo tool changes) is enough to get it working again, suggesting that the cgo tool changes aren't involved, and that the problem isn't simply linking against CoreFoundation, but something more specific to the tests that were added. However, as of commit 5419e7a, it's no longer enough to remove the tests from 94076fe. As of 5419e7a, removing either test26213.go (which was added in this commit) or issue21897.go (which already existed and didn't change in this commit) is sufficient to get the test to pass.

@gopherbot

This comment has been minimized.

gopherbot commented Jul 20, 2018

Change https://golang.org/cl/125304 mentions this issue: misc/cgo/test: disable some Darwin tests in cgo mode

gopherbot pushed a commit that referenced this issue Jul 23, 2018

misc/cgo/test: disable some Darwin tests in cgo mode
For unknown reasons, linking against CoreFoundation on macOS 10.10
sometimes causes mmap to ignore the hint address, which makes the Go
allocator incompatible with TSAN. Currently, the effect of this is to
run the allocator out of arena hints on the very first allocation,
causing a "too many address space collisions for -race mode" panic.

This CL skips the cgo tests that link against CoreFoundation in race
mode.

Updates #26475.
Updates #26513.

Change-Id: I52ec638c99acf5d4966e68ff0054f7679680dac6
Reviewed-on: https://go-review.googlesource.com/125304
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
@aclements

This comment has been minimized.

Member

aclements commented Jul 23, 2018

macOS 10.10 is passing now. I'm going to close this even though we don't completely understand this bug. #26513 tracks reverting this when we drop macOS 10.10 support.

@aclements aclements closed this Jul 23, 2018

@katiehockman katiehockman reopened this Oct 4, 2018

@katiehockman

This comment has been minimized.

Contributor

katiehockman commented Oct 4, 2018

This appears to be popping up again. It has failed all of the builds for macOS 10.10 over the last day.

The latest failure: https://build.golang.org/log/cccb90e1ae132b4037e0ffbdcc58fea77aa72b39

fatal error: too many address space collisions for -race mode

runtime stack:
runtime.throw(0x427e6c9, 0x30)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/panic.go:608 +0x72 fp=0x7fff5fbff248 sp=0x7fff5fbff218 pc=0x402d732
runtime.(*mheap).sysAlloc(0x44272c0, 0x4000000, 0x0, 0x0)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/malloc.go:573 +0x5a6 fp=0x7fff5fbff2c8 sp=0x7fff5fbff248 pc=0x400dcd6
runtime.(*mheap).grow(0x44272c0, 0x1, 0x0)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/mheap.go:922 +0x42 fp=0x7fff5fbff320 sp=0x7fff5fbff2c8 pc=0x4025582
runtime.(*mheap).allocSpanLocked(0x44272c0, 0x1, 0x4df41c8, 0xffffffff)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/mheap.go:850 +0x2f5 fp=0x7fff5fbff360 sp=0x7fff5fbff320 pc=0x4025415
runtime.(*mheap).alloc_m(0x44272c0, 0x1, 0x442002a, 0x0)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/mheap.go:694 +0x105 fp=0x7fff5fbff3a0 sp=0x7fff5fbff360 pc=0x4024c75
runtime.(*mheap).alloc.func1()
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/mheap.go:761 +0x4c fp=0x7fff5fbff3d8 sp=0x7fff5fbff3a0 pc=0x405a55c
runtime.(*mheap).alloc(0x44272c0, 0x1, 0x7fff5f01002a, 0xff80000005811000)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/mheap.go:760 +0x8a fp=0x7fff5fbff428 sp=0x7fff5fbff3d8 pc=0x4024f0a
runtime.(*mcentral).grow(0x4429078, 0x0)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/mcentral.go:257 +0x93 fp=0x7fff5fbff470 sp=0x7fff5fbff428 pc=0x4018fe3
runtime.(*mcentral).cacheSpan(0x4429078, 0x401914b)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/mcentral.go:106 +0x2cf fp=0x7fff5fbff4b8 sp=0x7fff5fbff470 pc=0x4018aaf
runtime.(*mcache).refill(0x5811000, 0x2a)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/mcache.go:135 +0x86 fp=0x7fff5fbff4d8 sp=0x7fff5fbff4b8 pc=0x4018576
runtime.(*mcache).nextFree(0x5811000, 0x402a, 0x5811000, 0x7fff5fbff570, 0x400f0a9)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/malloc.go:749 +0x88 fp=0x7fff5fbff510 sp=0x7fff5fbff4d8 pc=0x400dd78
runtime.mallocgc(0x180, 0x4273f20, 0x5815001, 0x4059a40)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/malloc.go:902 +0x77a fp=0x7fff5fbff5b0 sp=0x7fff5fbff510 pc=0x400e6aa
runtime.newobject(0x4273f20, 0x7fff5fbff5c0)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/malloc.go:1031 +0x38 fp=0x7fff5fbff5e0 sp=0x7fff5fbff5b0 pc=0x400ea98
runtime.malg(0x8000, 0x5811000)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/proc.go:3223 +0x31 fp=0x7fff5fbff620 sp=0x7fff5fbff5e0 pc=0x4036861
runtime.mpreinit(0x4422f80)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/os_darwin.go:234 +0x29 fp=0x7fff5fbff640 sp=0x7fff5fbff620 pc=0x402b9a9
runtime.mcommoninit(0x4422f80)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/proc.go:618 +0xc1 fp=0x7fff5fbff678 sp=0x7fff5fbff640 pc=0x4030591
runtime.schedinit()
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/proc.go:540 +0xaa fp=0x7fff5fbff6e0 sp=0x7fff5fbff678 pc=0x403025a
runtime.rt0_go(0x7fff5fbff720, 0x4, 0x7fff5fbff720, 0x0, 0x7fff95e365c9, 0x0, 0x4, 0x7fff5fbff930, 0x7fff5fbff982, 0x7fff5fbff993, ...)
	/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/src/runtime/asm_amd64.s:195 +0x11c fp=0x7fff5fbff6e8 sp=0x7fff5fbff6e0 pc=0x405c36c
exit status 2
FAIL	_/private/var/folders/6g/8k13h11n2zb8hxx99wk6f6y40000gn/T/workdir/go/misc/cgo/test	0.010s
2018/10/04 09:43:23 Failed: exit status 1

@agnivade agnivade modified the milestones: Go1.11, Go1.12 Oct 4, 2018

@katiehockman

This comment has been minimized.

Contributor

katiehockman commented Oct 30, 2018

This is still failing for all darwin-amd64-10_10 builds.

/cc owners: @aclements @rsc @RLH @randall77

@mknyszek

This comment has been minimized.

Contributor

mknyszek commented Dec 11, 2018

The fact that it completely ignores the hint is very weird, but the kernel is well within its power to give back whatever address it wants.

So given that, and also given that the race detector expects the heap within a certain address range, maybe it actually does make sense to map new memory as MAP_FIXED, if it helps the problem. This would only be true in race mode, of course.

We already fail if we can't maintain the invariant that the heap only lives in a certain region, so if we can't map it with MAP_FIXED and mmap fails, then race mode wouldn't work anyway. Based on @aclements experiments above (and my efforts to reproduce them) MAP_FIXED does seem to help. The race detector also assumes that the heap is bounded within a certain range, so if we cannot map memory in that range we can't even do race detection properly. The race detector also loosely expects the heap to be contiguous (see #24133) which makes MAP_FIXED seem like a better idea. Note though that if #24133 gets fixed, then MAP_FIXED is maybe no longer a great idea, but in this particular case we're burning through all our hints.

Anyway, I'll give MAP_FIXED a try for race mode and see if that helps with builder stability. We should probably remove it when we stop supporting darwin-10_10.

@gopherbot

This comment has been minimized.

gopherbot commented Dec 12, 2018

Change https://golang.org/cl/153897 mentions this issue: runtime: MAP_FIXED on BSDs in race mode

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment