runtime: address space conflict at startup using buildmode=c-shared #16936

Closed
sh4m1l65 opened this issue Aug 31, 2016 · 4 comments

@sh4m1l65

commented Aug 31, 2016

The failure occurred in a shared library (go build -buildmode=c-shared) that is loaded as a ulogd plugin, so Go is clearly not in complete control of its runtime environment. Still, I believe this stack trace represents an issue with the embedded Go runtime and not the host application.

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version go1.6 linux/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH=""
GORACE=""
GOROOT="/usr/lib/go-1.6"
GOTOOLDIR="/usr/lib/go-1.6/pkg/tool/linux_amd64"
GO15VENDOREXPERIMENT="1"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0"
CXX="g++"
CGO_ENABLED="1"

What did you do?

If possible, provide a recipe for reproducing the error.

I may be able to obtain clearance from my employer to share binary code and/or source for the application in which this panic appeared. I don't yet have such clearance.

At a high level, this is a shared library that is compiled against Linux ulogd sources (http://www.netfilter.org/projects/ulogd/) to produce a plugin for the ulogd host app. The plugin receives callbacks from the host application and composes log messages sent through dropsonde (https://github.com/cloudfoundry/dropsonde) to be collected via Cloud Foundry's Loggregator services.

Because the panic stack trace does not refer to any code outside of the Go runtime, it is difficult for me to see which allocation provoked the allocator failure.

A complete runnable program is good.
A link on play.golang.org is best.

What did you expect to see?

Normally (across thousands of restarts of the host application that loads the Go shared library) the app runs fine. The stack trace comes from the only failure of this kind ever witnessed.

What did you see instead?

stack trace follows:

------------ STARTING connection-logger_ctl at Tue Aug 30 15:38:29 UTC 2016 --------------
[... elided irrelevant log prefix ...]
runtime: address space conflict: map(0xc820000000) = 0x7f4db163c000
fatal error: runtime: address space conflict
2016/08/30 15:38:31 init() for diego

runtime stack:
runtime.throw(0x7f4daf389da0, 0x1f)
        /usr/lib/go-1.6/src/runtime/panic.go:530 +0x92 fp=0x7f4dae86d9e0 sp=0x7f4dae86d9c8
runtime.sysMap(0xc820000000, 0x100000, 0x6600, 0x7f4daf57f778)
        /usr/lib/go-1.6/src/runtime/mem_linux.go:210 +0x13a fp=0x7f4dae86da20 sp=0x7f4dae86d9e0
runtime.(*mheap).sysAlloc(0x7f4daf565a60, 0x100000, 0x0)
        /usr/lib/go-1.6/src/runtime/malloc.go:429 +0x193 fp=0x7f4dae86daa8 sp=0x7f4dae86da20
runtime.(*mheap).grow(0x7f4daf565a60, 0x8, 0x0)
        /usr/lib/go-1.6/src/runtime/mheap.go:651 +0x65 fp=0x7f4dae86db00 sp=0x7f4dae86daa8
runtime.(*mheap).allocSpanLocked(0x7f4daf565a60, 0x1, 0x0)
        /usr/lib/go-1.6/src/runtime/mheap.go:553 +0x4f8 fp=0x7f4dae86db58 sp=0x7f4dae86db00
runtime.(*mheap).alloc_m(0x7f4daf565a60, 0x1, 0x15, 0x7f4db174c000)
        /usr/lib/go-1.6/src/runtime/mheap.go:437 +0x11d fp=0x7f4dae86db88 sp=0x7f4dae86db58
runtime.(*mheap).alloc.func1()
        /usr/lib/go-1.6/src/runtime/mheap.go:502 +0x43 fp=0x7f4dae86dbb8 sp=0x7f4dae86db88
runtime.systemstack(0x7f4dae86dbd8)
        /usr/lib/go-1.6/src/runtime/asm_amd64.s:307 +0xa1 fp=0x7f4dae86dbc0 sp=0x7f4dae86dbb8
runtime.(*mheap).alloc(0x7f4daf565a60, 0x1, 0x10000000015, 0x7f4daebdcc38)
        /usr/lib/go-1.6/src/runtime/mheap.go:503 +0x65 fp=0x7f4dae86dc08 sp=0x7f4dae86dbc0
runtime.(*mcentral).grow(0x7f4daf567660, 0x0)
        /usr/lib/go-1.6/src/runtime/mcentral.go:209 +0x95 fp=0x7f4dae86dc70 sp=0x7f4dae86dc08
runtime.(*mcentral).cacheSpan(0x7f4daf567660, 0x7f4daf5604c8)
        /usr/lib/go-1.6/src/runtime/mcentral.go:89 +0x47f fp=0x7f4dae86dcb0 sp=0x7f4dae86dc70
runtime.(*mcache).refill(0x7f4db174c000, 0x7f4d00000015, 0x7f4dae86dd18)
        /usr/lib/go-1.6/src/runtime/mcache.go:119 +0xd0 fp=0x7f4dae86dce8 sp=0x7f4dae86dcb0
runtime.mallocgc.func2()
        /usr/lib/go-1.6/src/runtime/malloc.go:642 +0x2d fp=0x7f4dae86dd08 sp=0x7f4dae86dce8
runtime.systemstack(0x7f4dae86dda8)
        /usr/lib/go-1.6/src/runtime/asm_amd64.s:307 +0xa1 fp=0x7f4dae86dd10 sp=0x7f4dae86dd08
runtime.mallocgc(0x180, 0x7f4daf310da0, 0x0, 0x800000000)
        /usr/lib/go-1.6/src/runtime/malloc.go:643 +0x87c fp=0x7f4dae86dde8 sp=0x7f4dae86dd10
runtime.newobject(0x7f4daf310da0, 0x7f4daf560990)
        /usr/lib/go-1.6/src/runtime/malloc.go:781 +0x44 fp=0x7f4dae86de10 sp=0x7f4dae86dde8
runtime.malg(0x7f4d00008000, 0x7f4daf560d40)
        /usr/lib/go-1.6/src/runtime/proc.go:2634 +0x29 fp=0x7f4dae86de48 sp=0x7f4dae86de10
runtime.mpreinit(0x7f4daf5612e0)
        /usr/lib/go-1.6/src/runtime/os1_linux.go:205 +0x21 fp=0x7f4dae86de60 sp=0x7f4dae86de48
runtime.mcommoninit(0x7f4daf5612e0)
        /usr/lib/go-1.6/src/runtime/proc.go:494 +0x109 fp=0x7f4dae86dea8 sp=0x7f4dae86de60
runtime.schedinit()
        /usr/lib/go-1.6/src/runtime/proc.go:434 +0x7d fp=0x7f4dae86def0 sp=0x7f4dae86dea8
runtime.rt0_go(0x7fff1f836a58, 0x7, 0x7fff1f836a58, 0x7f4dae86e700, 0x7f4db13a1184, 0x0, 0x7f4dae86e700, 0x7f4dae86e700, 0xa48c0a2111c02ce0, 0x0, ...)
        /usr/lib/go-1.6/src/runtime/asm_amd64.s:138 +0x134 fp=0x7f4dae86def8 sp=0x7f4dae86def0
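
For readers unfamiliar with the message: the first line of the trace says the runtime asked for memory at 0xc820000000 and the kernel handed back 0x7f4db163c000 instead. The following is a minimal sketch of that behaviour, not the runtime's own code. It assumes Linux on amd64 (or another port where syscall.SYS_MMAP exists), and the helper name mmapHint is hypothetical. It maps a range, then asks for the same range again without MAP_FIXED, so the kernel has to place the second mapping somewhere else.

```go
// Sketch only: assumes Linux on amd64, where syscall.SYS_MMAP is defined.
package main

import (
	"fmt"
	"os"
	"syscall"
)

// mmapHint (hypothetical helper) requests an anonymous, private,
// inaccessible mapping of n bytes. hint is only a preferred address
// because MAP_FIXED is not set.
func mmapHint(hint, n uintptr) (uintptr, error) {
	p, _, errno := syscall.Syscall6(
		syscall.SYS_MMAP,
		hint,
		n,
		syscall.PROT_NONE,
		syscall.MAP_ANON|syscall.MAP_PRIVATE,
		^uintptr(0), // fd = -1 for anonymous memory
		0,
	)
	if errno != 0 {
		return 0, errno
	}
	return p, nil
}

func main() {
	const size = 1 << 20

	// Let the kernel pick an address, so that range is known to be occupied.
	first, err := mmapHint(0, size)
	if err != nil {
		fmt.Fprintln(os.Stderr, "mmap:", err)
		return
	}

	// Ask for the occupied range again. The hint cannot be honoured.
	second, err := mmapHint(first, size)
	if err != nil {
		fmt.Fprintln(os.Stderr, "mmap:", err)
		return
	}
	if second != first {
		fmt.Printf("conflict: map(%#x) = %#x\n", first, second)
	}
}
```

In a c-shared build the host process and its other libraries create mappings before and alongside the Go runtime, so a range the runtime expects to be free may plausibly already be taken.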

@josharian changed the title from "panic in Go 1.6 runtime (allocator)" to "runtime: panic in Go 1.6 runtime (allocator)" on Aug 31, 2016

@quentinmit added this to the Go1.8Maybe milestone on Sep 6, 2016

@rsc

Contributor

commented Oct 21, 2016

@sh4m1l65, if you are interested in debugging this, the thing to do would be to try to make this failure dump /proc/self/maps as it dies.

You'd do that by adding something like this to runtime/mem_linux.go and rebuilding the runtime and your binary:

diff --git a/src/runtime/mem_linux.go b/src/runtime/mem_linux.go
index 094658d..2805050 100644
--- a/src/runtime/mem_linux.go
+++ b/src/runtime/mem_linux.go
@@ -206,6 +206,8 @@ func sysReserve(v unsafe.Pointer, n uintptr, reserved *bool) unsafe.Pointer {
    return p
 }

+var mapbuf [10 * 1024]byte
+
 func sysMap(v unsafe.Pointer, n uintptr, reserved bool, sysStat *uint64) {
    mSysStatInc(sysStat, n)

@@ -217,6 +219,11 @@ func sysMap(v unsafe.Pointer, n uintptr, reserved bool, sysStat *uint64) {
        }
        if p != v {
            print("runtime: address space conflict: map(", v, ") = ", p, "\n")
+           print("/proc/self/maps:\n")
+           fd := open("/proc/self/maps", 0, 0)
+           n := read(fd, unsafe.Pointer(&mapbuf[0]), int32(len(mapbuf)))
+           closefd(fd)
+           write(2, unsafe.Pointer(&mapbuf[0]), int32(n))
            throw("runtime: address space conflict")
        }
        return

Thanks.
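
Once such a dump exists, the question is which mapping occupies the range the runtime wanted. The sketch below is one way to check that offline, under stated assumptions: the type and function names (mapping, parseMaps, overlapping) are hypothetical, the parser relies on the standard /proc/<pid>/maps line layout, and main inspects the sketch's own process only as a demonstration. For a real investigation the text captured from the failing process would be passed to parseMaps instead.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// mapping is one line of a maps file: an address range, its permissions,
// and the backing path (empty for anonymous memory).
type mapping struct {
	start, end uint64
	perms      string
	path       string
}

// parseMaps parses the text of a /proc/<pid>/maps dump.
func parseMaps(text string) ([]mapping, error) {
	var out []mapping
	for _, line := range strings.Split(text, "\n") {
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue // blank or truncated line
		}
		bounds := strings.SplitN(fields[0], "-", 2)
		if len(bounds) != 2 {
			return nil, fmt.Errorf("bad address range %q", fields[0])
		}
		start, err := strconv.ParseUint(bounds[0], 16, 64)
		if err != nil {
			return nil, err
		}
		end, err := strconv.ParseUint(bounds[1], 16, 64)
		if err != nil {
			return nil, err
		}
		m := mapping{start: start, end: end, perms: fields[1]}
		if len(fields) >= 6 {
			m.path = strings.Join(fields[5:], " ")
		}
		out = append(out, m)
	}
	return out, nil
}

// overlapping returns the mappings that intersect [addr, addr+size).
func overlapping(maps []mapping, addr, size uint64) []mapping {
	var hits []mapping
	for _, m := range maps {
		if m.start < addr+size && addr < m.end {
			hits = append(hits, m)
		}
	}
	return hits
}

func main() {
	data, err := os.ReadFile("/proc/self/maps")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read maps:", err)
		return
	}
	maps, err := parseMaps(string(data))
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot parse maps:", err)
		return
	}
	// Address and size taken from the sysMap frame in the first trace.
	for _, m := range overlapping(maps, 0xc820000000, 0x100000) {
		fmt.Printf("%x-%x %s %s\n", m.start, m.end, m.perms, m.path)
	}
}
```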

@rsc changed the title from "runtime: panic in Go 1.6 runtime (allocator)" to "runtime: address space conflict at startup using buildmode=c-shared" on Oct 21, 2016

@rsc added the WaitingForInfo label on Nov 2, 2016

@rsc modified the milestones: Go1.9, Go1.8Maybe on Nov 11, 2016

@lipixun

commented Nov 17, 2016

I hit exactly the same issue.

What version of Go are you using (go version)?

go version go1.7.3 linux/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/lipixun"
GORACE=""
GOROOT="/usr/local/go"
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build198174358=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"

What did you do?

I'm writing a Python module using cgo. The code is quite simple. Since the stack backtrace doesn't point to any function in my code (neither the C code nor the Go code), I believe it's an issue in the Go runtime.

One of the crash stack traces:

runtime: address space conflict: map(0xc000000000) = 0x7fa5d2aa3000
fatal error: runtime: address space conflict

runtime stack:
runtime.throw(0x7fa5a78abfa0, 0x1f)
    /usr/local/go/src/runtime/panic.go:547 +0x92 fp=0x7fa5a74549b0 sp=0x7fa5a7454998
runtime.sysMap(0xc000000000, 0x1000, 0xc820000000, 0x7fa5a7941080)
    /usr/local/go/src/runtime/mem_linux.go:210 +0x13a fp=0x7fa5a74549f0 sp=0x7fa5a74549b0
runtime.(*mheap).mapSpans(0x7fa5a79287e0, 0xc820100000)
    /usr/local/go/src/runtime/mheap.go:330 +0xb9 fp=0x7fa5a7454a20 sp=0x7fa5a74549f0
runtime.(*mheap).sysAlloc(0x7fa5a79287e0, 0x100000, 0x0)
    /usr/local/go/src/runtime/malloc.go:431 +0x1df fp=0x7fa5a7454aa8 sp=0x7fa5a7454a20
runtime.(*mheap).grow(0x7fa5a79287e0, 0x8, 0x0)
    /usr/local/go/src/runtime/mheap.go:651 +0x65 fp=0x7fa5a7454b00 sp=0x7fa5a7454aa8
runtime.(*mheap).allocSpanLocked(0x7fa5a79287e0, 0x1, 0x0)
    /usr/local/go/src/runtime/mheap.go:553 +0x4f8 fp=0x7fa5a7454b58 sp=0x7fa5a7454b00
runtime.(*mheap).alloc_m(0x7fa5a79287e0, 0x1, 0x7f0000000015, 0x7fa5a7454ba8)
    /usr/local/go/src/runtime/mheap.go:437 +0x11d fp=0x7fa5a7454b88 sp=0x7fa5a7454b58
runtime.(*mheap).alloc.func1()
    /usr/local/go/src/runtime/mheap.go:502 +0x43 fp=0x7fa5a7454bb8 sp=0x7fa5a7454b88
runtime.systemstack(0x7fa5a7454bd8)
    /usr/local/go/src/runtime/asm_amd64.s:307 +0xa1 fp=0x7fa5a7454bc0 sp=0x7fa5a7454bb8
runtime.(*mheap).alloc(0x7fa5a79287e0, 0x1, 0x10000000015, 0x7fa5a754ed98)
    /usr/local/go/src/runtime/mheap.go:503 +0x65 fp=0x7fa5a7454c08 sp=0x7fa5a7454bc0
runtime.(*mcentral).grow(0x7fa5a792a3e0, 0x0)
    /usr/local/go/src/runtime/mcentral.go:209 +0x95 fp=0x7fa5a7454c70 sp=0x7fa5a7454c08
runtime.(*mcentral).cacheSpan(0x7fa5a792a3e0, 0x7fa5a7925428)
    /usr/local/go/src/runtime/mcentral.go:89 +0x47f fp=0x7fa5a7454cb0 sp=0x7fa5a7454c70
runtime.(*mcache).refill(0x7fa5a5fea000, 0x7fa500000015, 0x7fa5a7454d18)
    /usr/local/go/src/runtime/mcache.go:119 +0xd0 fp=0x7fa5a7454ce8 sp=0x7fa5a7454cb0
runtime.mallocgc.func2()
    /usr/local/go/src/runtime/malloc.go:642 +0x2d fp=0x7fa5a7454d08 sp=0x7fa5a7454ce8
runtime.systemstack(0x7fa5a7454da8)
    /usr/local/go/src/runtime/asm_amd64.s:307 +0xa1 fp=0x7fa5a7454d10 sp=0x7fa5a7454d08
runtime.mallocgc(0x180, 0x7fa5a7884ee0, 0x0, 0x800000000)
    /usr/local/go/src/runtime/malloc.go:643 +0x87c fp=0x7fa5a7454de8 sp=0x7fa5a7454d10
runtime.newobject(0x7fa5a7884ee0, 0x7fa5a79256b0)
    /usr/local/go/src/runtime/malloc.go:781 +0x44 fp=0x7fa5a7454e10 sp=0x7fa5a7454de8
runtime.malg(0x7fa500008000, 0x7fa5a7925920)
    /usr/local/go/src/runtime/proc.go:2637 +0x29 fp=0x7fa5a7454e48 sp=0x7fa5a7454e10
runtime.mpreinit(0x7fa5a7925c60)
    /usr/local/go/src/runtime/os1_linux.go:205 +0x21 fp=0x7fa5a7454e60 sp=0x7fa5a7454e48
runtime.mcommoninit(0x7fa5a7925c60)
    /usr/local/go/src/runtime/proc.go:497 +0x109 fp=0x7fa5a7454ea8 sp=0x7fa5a7454e60
runtime.schedinit()
    /usr/local/go/src/runtime/proc.go:434 +0x7d fp=0x7fa5a7454ef0 sp=0x7fa5a7454ea8
runtime.rt0_go(0x7ffd456f3e38, 0x7, 0x7ffd456f3e38, 0x7fa5a7455700, 0x7fa5d1d4d184, 0x0, 0x7fa5a7455700, 0x7fa5a7455700, 0x73831ff1887067f2, 0x0, ...)
    /usr/local/go/src/runtime/asm_amd64.s:138 +0x134 fp=0x7fa5a7454ef8 sp=0x7fa5a7454ef0

I also have a couple of global variables declared:

var (
	gSegmentersLock   sync.Locker                = &sync.Mutex{}
	gSegmenterCounter int                        = 0
	gSegmenters       map[int]*segment.Segmenter = make(map[int]*segment.Segmenter)
	// The strings used by c
	CStringSegmenterObjectReleasedErrorMessage = C.CString("Segmenter object has already released")
)

The Python class is defined as usual: the __init__ method points to a C function, and that function invokes an exported Go function.

This error doesn't happen every time. In one of my tests, I started a Hadoop job with 6000 map tasks on 9 nodes. Each task imports the Python module and creates objects (so the Go function is called at least once per task), and at most 19 map tasks run on one node at the same time. In the end, 136 tasks failed because of this error. The error seems to appear at the end of the task, which suggests it may happen while the process is finalizing.

Each node has 1 * 12 * 2core CPU and 128G of memory, and no memory limit is configured. The Hadoop dashboard shows that resources are more than sufficient. The OS is Linux s1 3.19.0-37-generic #42~14.04.1-Ubuntu SMP

The build command is go build -buildmode=c-shared -o xxx.so package

I can reproduce this error reliably.

@gopherbot

commented Mar 21, 2017

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please reopen if this is a mistake or you have the requested information.)

@gopherbot closed this on Mar 21, 2017

@gopherbot

commented Jan 2, 2018

Change https://golang.org/cl/85887 mentions this issue: runtime: use sparse mappings for the heap

gopherbot pushed a commit that referenced this issue Feb 15, 2018

runtime: use sparse mappings for the heap
This replaces the contiguous heap arena mapping with a potentially
sparse mapping that can support heap mappings anywhere in the address
space.

This has several advantages over the current approach:

* There is no longer any limit on the size of the Go heap. (Currently
  it's limited to 512GB.) Hence, this fixes #10460.

* It eliminates many failure modes of heap initialization and
  growing. In particular it eliminates any possibility of panicking
  with an address space conflict. This can happen for many reasons and
  even causes a low but steady rate of TSAN test failures because of
  conflicts with the TSAN runtime. See #16936 and #11993.

* It eliminates the notion of "non-reserved" heap, which was added
  because creating huge address space reservations (particularly on
  64-bit) led to huge process VSIZE. This was at best confusing and at
  worst conflicted badly with ulimit -v. However, the non-reserved
  heap logic is complicated, can race with other mappings in non-pure
  Go binaries (e.g., #18976), and requires that the entire heap be
  either reserved or non-reserved. We currently maintain the latter
  property, but it's quite difficult to convince yourself of that, and
  hence difficult to keep correct. This logic is still present, but
  will be removed in the next CL.

* It fixes problems on 32-bit where skipping over parts of the address
  space leads to mapping huge (and never-to-be-used) metadata
  structures. See #19831.

This also completely rewrites and significantly simplifies
mheap.sysAlloc, which has been a source of many bugs. E.g., #21044,
#20259, #18651, and #13143 (and maybe #23222).
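
To illustrate why a sparse mapping removes the address space conflict, here is a toy sketch, not the runtime's data structure. It assumes a fixed arena size (arenaBytes, chosen arbitrarily) and uses a Go map as the index; sparseHeap, addMapping, and contains are hypothetical names. The point is only that heap memory is looked up by arena index, so it does not matter where in the address space the kernel placed a mapping.

```go
package main

import "fmt"

// arenaBytes is an assumed arena size, chosen for illustration only.
const arenaBytes uint64 = 64 << 20

// sparseHeap records which arena-sized chunks of the address space hold
// heap memory. A Go map stands in for the runtime's real index.
type sparseHeap struct {
	arenas map[uint64]bool
}

func newSparseHeap() *sparseHeap {
	return &sparseHeap{arenas: make(map[uint64]bool)}
}

// addMapping records every arena touched by [base, base+size).
func (h *sparseHeap) addMapping(base, size uint64) {
	if size == 0 {
		return
	}
	first := base / arenaBytes
	last := (base + size - 1) / arenaBytes
	for i := first; i <= last; i++ {
		h.arenas[i] = true
	}
}

// contains reports whether addr falls in an arena that has been recorded.
func (h *sparseHeap) contains(addr uint64) bool {
	return h.arenas[addr/arenaBytes]
}

func main() {
	h := newSparseHeap()
	// The address the Go 1.6 runtime wanted and the one the kernel
	// returned in the first trace of this issue. With a sparse index,
	// either location can back the heap.
	h.addMapping(0xc820000000, 0x100000)
	h.addMapping(0x7f4db163c000, 0x100000)

	fmt.Println(h.contains(0xc820000800))
	fmt.Println(h.contains(0x7f4db163c010))
	fmt.Println(h.contains(0x1000))
}
```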

This change also makes it possible to allocate individual objects
larger than 512GB. As a result, a few tests that expected huge
allocations to fail needed to be changed to make even larger
allocations. However, at the moment attempting to allocate a humongous
object may cause the program to freeze for several minutes on Linux as
we fall back to probing every page with addrspace_free. That logic
(and this failure mode) will be removed in the next CL.

Fixes #10460.
Fixes #22204 (since it rewrites the code involved).

This slightly slows down compilebench and the x/benchmarks garbage
benchmark.

name       old time/op     new time/op     delta
Template       184ms ± 1%      185ms ± 1%    ~     (p=0.065 n=10+9)
Unicode       86.9ms ± 3%     86.3ms ± 1%    ~     (p=0.631 n=10+10)
GoTypes        599ms ± 0%      602ms ± 0%  +0.56%  (p=0.000 n=10+9)
Compiler       2.87s ± 1%      2.89s ± 1%  +0.51%  (p=0.002 n=9+10)
SSA            7.29s ± 1%      7.25s ± 1%    ~     (p=0.182 n=10+9)
Flate          118ms ± 2%      118ms ± 1%    ~     (p=0.113 n=9+9)
GoParser       147ms ± 1%      148ms ± 1%  +1.07%  (p=0.003 n=9+10)
Reflect        401ms ± 1%      404ms ± 1%  +0.71%  (p=0.003 n=10+9)
Tar            175ms ± 1%      175ms ± 1%    ~     (p=0.604 n=9+10)
XML            209ms ± 1%      210ms ± 1%    ~     (p=0.052 n=10+10)

(https://perf.golang.org/search?q=upload:20171231.4)

name                       old time/op  new time/op  delta
Garbage/benchmem-MB=64-12  2.23ms ± 1%  2.25ms ± 1%  +0.84%  (p=0.000 n=19+19)

(https://perf.golang.org/search?q=upload:20171231.3)

Relative to the start of the sparse heap changes (starting at and
including "runtime: fix various contiguous bitmap assumptions"),
overall slowdown is roughly 1% on GC-intensive benchmarks:

name        old time/op     new time/op     delta
Template        183ms ± 1%      185ms ± 1%  +1.32%  (p=0.000 n=9+9)
Unicode        84.9ms ± 2%     86.3ms ± 1%  +1.65%  (p=0.000 n=9+10)
GoTypes         595ms ± 1%      602ms ± 0%  +1.19%  (p=0.000 n=9+9)
Compiler        2.86s ± 0%      2.89s ± 1%  +0.91%  (p=0.000 n=9+10)
SSA             7.19s ± 0%      7.25s ± 1%  +0.75%  (p=0.000 n=8+9)
Flate           117ms ± 1%      118ms ± 1%  +1.10%  (p=0.000 n=10+9)
GoParser        146ms ± 2%      148ms ± 1%  +1.48%  (p=0.002 n=10+10)
Reflect         398ms ± 1%      404ms ± 1%  +1.51%  (p=0.000 n=10+9)
Tar             173ms ± 1%      175ms ± 1%  +1.17%  (p=0.000 n=10+10)
XML             208ms ± 1%      210ms ± 1%  +0.62%  (p=0.011 n=10+10)
[Geo mean]      369ms           373ms       +1.17%

(https://perf.golang.org/search?q=upload:20180101.2)
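
The [Geo mean] row can be reproduced approximately from the rounded per-benchmark times in the table above. The sketch below assumes the geometric mean is taken over the ten listed benchmarks with every time converted to milliseconds. Because the table values are rounded, the computed delta may differ slightly from the reported +1.17%.

```go
package main

import (
	"fmt"
	"math"
)

// Times in milliseconds, copied from the rounded table values, in the
// order Template, Unicode, GoTypes, Compiler, SSA, Flate, GoParser,
// Reflect, Tar, XML.
var (
	oldMs = []float64{183, 84.9, 595, 2860, 7190, 117, 146, 398, 173, 208}
	newMs = []float64{185, 86.3, 602, 2890, 7250, 118, 148, 404, 175, 210}
)

// geoMean returns the geometric mean of xs, computed via logarithms to
// avoid overflow in the running product.
func geoMean(xs []float64) float64 {
	sum := 0.0
	for _, x := range xs {
		sum += math.Log(x)
	}
	return math.Exp(sum / float64(len(xs)))
}

func main() {
	o, n := geoMean(oldMs), geoMean(newMs)
	fmt.Printf("old %.0fms  new %.0fms  delta %+.2f%%\n", o, n, (n/o-1)*100)
}
```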

name                       old time/op  new time/op  delta
Garbage/benchmem-MB=64-12  2.22ms ± 1%  2.25ms ± 1%  +1.51%  (p=0.000 n=20+19)

(https://perf.golang.org/search?q=upload:20180101.3)

Change-Id: I5daf4cfec24b252e5a57001f0a6c03f22479d0f0
Reviewed-on: https://go-review.googlesource.com/85887
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>

@golang locked and limited conversation to collaborators on Jan 2, 2019
