cmd/compile: out of memory compiling cmd/compile/internal/ssa with 1GB RAM #27739

Open
philhofer opened this Issue Sep 18, 2018 · 20 comments

@philhofer
Contributor

philhofer commented Sep 18, 2018

Building tip on Ubuntu 18.04 on a Digital Ocean VM with 1GB of RAM, at commit 83dfc3b:

phil@spare0:~/go/src$ uname -a
Linux spare0 4.15.0-30-generic #32-Ubuntu SMP Thu Jul 26 17:42:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

I can reproduce this out-of-memory condition 100% of the time (in the prove pass in SSA):

Building Go cmd/dist using /home/phil/go-bootstrap.
Building Go toolchain1 using /home/phil/go-bootstrap.
Building Go bootstrap cmd/go (go_bootstrap) using Go toolchain1.
Building Go toolchain2 using go_bootstrap and Go toolchain1.
# cmd/compile/internal/ssa
fatal error: runtime: out of memory

runtime stack:
runtime.throw(0x9f727f, 0x16)
        /home/phil/go/src/runtime/panic.go:608 +0x72
runtime.sysMap(0xc02c000000, 0x4000000, 0xe7d258)
        /home/phil/go/src/runtime/mem_linux.go:156 +0xc7
runtime.(*mheap).sysAlloc(0xe5a580, 0x4000000, 0xe5a598, 0x7f0d87065308)
        /home/phil/go/src/runtime/malloc.go:619 +0x1c7
runtime.(*mheap).grow(0xe5a580, 0x3, 0x0)
        /home/phil/go/src/runtime/mheap.go:920 +0x42
runtime.(*mheap).allocSpanLocked(0xe5a580, 0x3, 0xe7d268, 0x400)
        /home/phil/go/src/runtime/mheap.go:848 +0x337
runtime.(*mheap).alloc_m(0xe5a580, 0x3, 0x50, 0x7f0d87351fff)
        /home/phil/go/src/runtime/mheap.go:692 +0x119
runtime.(*mheap).alloc.func1()
        /home/phil/go/src/runtime/mheap.go:759 +0x4c
runtime.(*mheap).alloc(0xe5a580, 0x3, 0x7f0d87010050, 0x7f0d87065270)
        /home/phil/go/src/runtime/mheap.go:758 +0x8a
runtime.(*mcentral).grow(0xe5ccb8, 0x0)
        /home/phil/go/src/runtime/mcentral.go:232 +0x94
runtime.(*mcentral).cacheSpan(0xe5ccb8, 0x1fe)
        /home/phil/go/src/runtime/mcentral.go:106 +0x2f8
runtime.(*mcache).refill(0x7f0d8b82a000, 0xc000020050)
        /home/phil/go/src/runtime/mcache.go:122 +0x95
runtime.(*mcache).nextFree.func1()
        /home/phil/go/src/runtime/malloc.go:749 +0x32
runtime.systemstack(0x455ad9)
        /home/phil/go/src/runtime/asm_amd64.s:351 +0x66
runtime.mstart()
        /home/phil/go/src/runtime/proc.go:1229

1GB of memory has been more than enough to build the toolchain in the past.

Barring any clever ideas about how to debug this, I'll try to bisect and hope that there was only one commit that reliably introduced this regression.

@agnivade

Member

agnivade commented Sep 18, 2018

Likely a duplicate of #26523. Can you try the patch suggested in #26523 (comment)?

Sorry, the CL is already merged. I didn't see that.

@philhofer

Contributor

philhofer commented Sep 18, 2018

Yes, I'll do that.

@philhofer

Contributor

philhofer commented Sep 18, 2018

That patch is part of the go1.11 release, and building go1.11 still fails.

@philhofer

Contributor

philhofer commented Sep 18, 2018

I'm going to try to bisect 74b5602...e0faedb and see if that produces interesting results.
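A range bisect like that can be automated with `git bisect run`, which marks a commit bad whenever the supplied command exits nonzero (as make.bash does on this OOM); a minimal sketch, where the checkout and bootstrap paths are assumptions:

```shell
# Hypothetical automation of the bisect described above.
cd ~/go/src
export GOROOT_BOOTSTRAP=$HOME/go-bootstrap  # assumed bootstrap toolchain path
git bisect start e0faedb 74b5602            # bad endpoint, then good endpoint
git bisect run ./make.bash                  # nonzero exit => commit marked bad
git bisect reset                            # return to the original HEAD
```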

@cherrymui

Contributor

cherrymui commented Sep 18, 2018

What version of Go is the bootstrap compiler? If I understand correctly, it is toolchain1 OOM'd, and toolchain1 is built with the bootstrap compiler with the bootstrap runtime.

@philhofer

Contributor

philhofer commented Sep 18, 2018

Ah, fair point. I was using go1.11 as my bootstrap toolchain. (Interestingly, though, building go1.10.4 with go1.11 as the bootstrap toolchain works fine...)

@philhofer

Contributor

philhofer commented Sep 18, 2018

My first round of bisecting blames commit e913729, but I'm going to bisect again with 1.10.4 as my bootstrap toolchain and see if that produces different results. (That commit doesn't make much sense as the source of the regression.)

@cherrymui

Contributor

cherrymui commented Sep 18, 2018

The Go 1.11 compiler does more work, and it also contains more code. So using the Go 1.11 compiler to compile the Go 1.11 compiler compounds two factors: it may use more memory (even compiling the same code), and it has more code to compile.

Maybe a workaround is to use an older (or newer) bootstrap compiler?
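That workaround amounts to pointing make.bash at a different toolchain via GOROOT_BOOTSTRAP; a sketch, assuming Go 1.10.4 is unpacked at the path shown (the variable otherwise defaults to $HOME/go1.4):

```shell
# Hypothetical: bootstrap tip with Go 1.10.4 instead of Go 1.11.
export GOROOT_BOOTSTRAP=$HOME/go1.10.4  # assumed install location
cd ~/go/src
./make.bash
```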

@philhofer

Contributor

philhofer commented Sep 18, 2018

make.bash on go1.11 using go1.10.4 as a bootstrap still fails. I'm bisecting the same commit range.

@philhofer

Contributor

philhofer commented Sep 18, 2018

When bootstrapping with 1.10.4, bisect blames cc09212.

# bad: [e0faedbb5344eb6f8f704005fe88961cdc6cf5f8] cmd/go: add missing newlines in printf formats
# good: [74b56022a1f834b3edce5c3eca0570323ac90cd7] doc: note that x509 cert parsing rejects some more certs now
git bisect start 'e0faedbb' '74b56022a'
# good: [62adf6fc2d70d9270b4213218e622c15504966be] cmd/internal/obj: convert unicode C to ASCII C
git bisect good 62adf6fc2d70d9270b4213218e622c15504966be
# bad: [4eb1c84752b8d3171be930abf4281080d639f634] cmd/link: fix name section of WebAssembly binary
git bisect bad 4eb1c84752b8d3171be930abf4281080d639f634
# bad: [31ef3846a792012b0588d92251f3976596c0b1b1] cmd/compile: add rulegen diagnostic
git bisect bad 31ef3846a792012b0588d92251f3976596c0b1b1
# good: [cc0aaff40e02192356ccb65d8acf571d12f74a95] cmd/compile: fix Wasm rule file name
git bisect good cc0aaff40e02192356ccb65d8acf571d12f74a95
# good: [3080b7d0af65858400b13134c1c471e2cb35e647] runtime: unify fetching of locals and arguments maps
git bisect good 3080b7d0af65858400b13134c1c471e2cb35e647
# good: [8a16c71067ca2cfd09281a82ee150a408095f0bc] cmd/vet: -composites only checks imported types
git bisect good 8a16c71067ca2cfd09281a82ee150a408095f0bc
# bad: [7d61ad25f8b10c0a656ef709fb30c08f5974594b] crypto/x509: check EKUs like 1.9.
git bisect bad 7d61ad25f8b10c0a656ef709fb30c08f5974594b
# bad: [f2cde55cd60993e948dada9187d25211ec150a5e] runtime: use Go function signatures for memclr and memmove comments
git bisect bad f2cde55cd60993e948dada9187d25211ec150a5e
# good: [e9137299bf74e1bcac358b569f86aef73c7c2ea6] debug/pe: parse the import directory correctly
git bisect good e9137299bf74e1bcac358b569f86aef73c7c2ea6
# bad: [cc09212f59ee215cae5345dc1ffcd1ed81664e1b] runtime: use libc for nanotime on Darwin
git bisect bad cc09212f59ee215cae5345dc1ffcd1ed81664e1b
# good: [e86c26789dbc11c50c4c49bee55ea015847a97b7] runtime: fix darwin 386/amd64 stack switches
git bisect good e86c26789dbc11c50c4c49bee55ea015847a97b7
# first bad commit: [cc09212f59ee215cae5345dc1ffcd1ed81664e1b] runtime: use libc for nanotime on Darwin

That commit doesn't make much sense as the culprit either, but both bisects point to a regression introduced somewhere in or around May of this year.

@davecheney

Contributor

davecheney commented Sep 18, 2018

@philhofer

Contributor

philhofer commented Sep 18, 2018

It's the cheapest Digital Ocean VM. 1 vCPU, 1GB memory, no swap:

phil@spare0:~$ cat /proc/swaps
Filename                                Type            Size    Used    Priority

It's less important to me how many resources one needs to build the toolchain, and more important that things are moving in the wrong direction. Shouldn't peak resource consumption by the compiler be determined by either the largest package (in the compiler front-end) or the largest function (in the back-end)? The out-of-memory condition doesn't occur in the linker, where I would expect resource consumption to grow in concert with the repo growing ~10% more code.
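One way to test that expectation is to measure peak memory while compiling only the largest package; a sketch using GNU time, assuming the tree has already been built once so the `go` binary exists (`-v` prints "Maximum resident set size" in kilobytes):

```shell
# Hypothetical measurement of peak RSS for a single-package build.
cd ~/go/src
/usr/bin/time -v ../bin/go build -a cmd/compile/internal/ssa 2>&1 |
    grep 'Maximum resident set size'
```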

@philhofer

Contributor

philhofer commented Sep 18, 2018

Additional anecdotal evidence of a memory use regression, though only for vmsize, not rss, which would make sense.

@philhofer philhofer changed the title from make.bash: out of memory (tip) to make.bash: out of memory Sep 18, 2018

@mvdan

Member

mvdan commented Sep 19, 2018

I think we should have a builder that has a relatively small amount of memory. Somewhat related, in #26867 I reported how go test net OOM'd with a few gigabytes of available memory.

We can assume that Linux is fairly common on machines with less memory (small computers, routers, VMs, etc), and that a 64-bit architecture like amd64 should stress test the memory more than a 32-bit architecture would.

We already have special builders like linux-amd64-noopt, so I propose adding a linux-amd64-small. It could start at a limit of 2GB of memory, but we could lower that to 1GB or even lower once it's in place. I presume that we could also add extra limits to it, such as:

  • lowering the maximum number of open file descriptors
  • lowering the size of /tmp
  • lowering the maximum number of processes created by the user
  • giving the machine CPU power comparable to embedded devices (e.g. dual-core 1GHz)
  • lowering the size of the disk
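Several of those limits could be prototyped on a stock Linux machine with ulimit before a dedicated builder exists; a sketch in which every number is an assumption:

```shell
# Hypothetical approximation of a "small" builder in the current shell.
ulimit -v $((2 * 1024 * 1024))  # cap virtual memory at 2GB (ulimit -v is in KB)
ulimit -n 256                   # lower the open file descriptor limit
ulimit -u 512                   # lower the max number of user processes
cd ~/go/src && ./make.bash
```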

/cc @dmitshur

@ctriple

Contributor

ctriple commented Sep 19, 2018

I suggest also covering swap on/off as a builder dimension @mvdan

@bradfitz

Member

bradfitz commented Oct 24, 2018

@mvdan, let's not combine two bugs into one. It's hard to label & track that way.

Could you file a separate builder bug about a small config? (but perhaps we could just make an existing builder (cgo, noopt?) be the small one... you could float that in the bug, or I could reply there later)

I'm going to remove the "Builders" label from this bug.

@mvdan

Member

mvdan commented Oct 25, 2018

@bradfitz you're right - see the issue above.

@ianlancetaylor ianlancetaylor changed the title from make.bash: out of memory to cmd/compile: out of memory compiling cmd/compile/internal/ssa with 1GB RAM Dec 13, 2018

@ianlancetaylor

Contributor

ianlancetaylor commented Dec 13, 2018

I just tried building cmd/compile/internal/ssa with tip and it took just over 1G.

@josharian

Contributor

josharian commented Dec 13, 2018

#20104 is one big fix here

@philhofer

Contributor

philhofer commented Dec 14, 2018

@josharian I agree that #20104 would reduce memory pressure when compiling cmd/compile/internal/ssa. However, you'll notice the compiler fails in the second bootstrap phase rather than the first, which means compiling the current code with the old compiler succeeds, but compiling the same code with the current compiler fails. In other words, the regression is in the compiler's memory use, rather than the size of the code to be compiled. The regression in the compiler's performance seems higher-priority to me, since it impacts more than just folks working on the Go compiler itself.
