runtime: garbage collection ineffective on 32-bit #909
rsc
commented
Apr 29, 2011
peterGo
commented
Jun 7, 2011
Should issue #1925, rather than issue #1920, have been merged into this issue?
MaVo159
commented
Jul 7, 2011
package main

import "fmt"

func main() {
	for {
		vec := make([]float64, 1000000)
		fmt.Println(vec[0])
	}
}
This crashes on 386 Linux with 8g from the last weekly on a machine with 1.6GB free memory:
runtime: memory allocated by OS not in usable range
runtime: out of memory: cannot allocate 8060928-byte block (533069824 in use)
throw: out of memory
Is this a duplicate of 909 or something else? It looks like 909 to me, but the test case is a
bit different, so I'm not sure. If it is, can we expect it to be fixed anytime soon? It
renders Go on 386 unusable for lots of very common use cases (at least for me).
MaVo159
commented
Jul 8, 2011
@adg: You make it sound like a corner case... it's not. Allocating a vector with around 1,000,000 floats is not that uncommon if you process scientific data. One vector in this example should be about 8 MB, and the program crashed at 530 MB; that's only about 70 vectors. I am quite certain that "much" is a bit more than 70 vectors in some very common cases. I really don't want to complain. I like Go and the compilers, and they are great on 64-bit, but this is a serious bug. I'm a bit surprised that you think it isn't. I'm still not convinced that it is the same bug, though, since my problem is not only the GC but also that memory allocation fails when more than 1 GB of free memory is available.
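The figures in this comment can be checked with a little arithmetic. The following is a minimal sketch, assuming only that a float64 occupies 8 bytes and using the "in use" number from the crash message; the function names are made up for illustration, and allocator rounding is ignored.

```go
package main

import "fmt"

// bytesPerVector returns the payload size of a []float64 with n elements.
// It ignores allocator rounding and the slice header.
func bytesPerVector(n int) int {
	const float64Size = 8 // a float64 is always 8 bytes in Go
	return n * float64Size
}

// vectorsThatFit reports how many whole vectors of n elements fit in inUse bytes.
func vectorsThatFit(inUse, n int) int {
	return inUse / bytesPerVector(n)
}

func main() {
	const elems = 1000000
	const inUse = 533069824 // "in use" figure from the crash message
	fmt.Println("bytes per vector:", bytesPerVector(elems))             // 8000000
	fmt.Println("vectors held at crash:", vectorsThatFit(inUse, elems)) // 66
}
```

So roughly 66 uncollected vectors were enough to exhaust the usable heap, in line with the "about 70" estimate.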
rsc
commented
Feb 19, 2012
Nice test program from issue #1925:
package main

import (
	"fmt"
	"net/http"
	_ "net/http/pprof"
	"runtime"
)

var st runtime.MemStats

func main() {
	runtime.MemProfileRate = 1
	for i := 0; i < 10; i++ {
		a := make([]byte, 5000000)
		if a == nil {
		}
		a = nil
		runtime.GC()
		runtime.ReadMemStats(&st)
		fmt.Println(i, st.Alloc, st.Sys, st.HeapObjects)
	}
	fmt.Println()
	for i := 0; i < 10; i++ {
		a := make([]byte, 20000000)
		if a == nil {
		}
		a = nil
		runtime.GC()
		runtime.ReadMemStats(&st)
		fmt.Println(i, st.Alloc, st.Sys, st.HeapObjects)
	}
	http.ListenAndServe(":8000", nil)
}
I have a few changes to ameliorate the effect of large static tables for Go 1, but the general problem remains.
rsc
commented
Feb 19, 2012
This CL should fix the 'static tables cause confusion' problem, but once you remove the static tables you quickly find that dynamic data can cause confusion too, sadly.
changeset: 12574:24411588e821
user: Russ Cox <rsc@golang.org>
date: Sun Feb 19 03:19:52 2012 -0500
summary: gc, ld: tag data as no-pointers and allocate in separate section
gopherbot
commented
Feb 29, 2012
I stumbled upon this bug because a Go program of mine runs out of memory on my 386 machine :-(. I tried the latest version of Go (go version says "go version weekly.2012-02-22 +ca9790d6a51a"), but the problem is still present. The output of the test program from comment #19 is:
./test909
0 5482240 9018492 684
1 10521560 14695548 705
2 15559640 20372604 707
3 20597720 26049660 709
4 25635800 31726716 711
5 30673880 37403772 713
6 35711960 43080828 715
7 40750040 48757884 717
8 45788120 54434940 719
9 50826200 60111996 721

0 65862696 82672764 723
1 85900592 105233532 731
2 105938224 127794300 733
3 125975856 150355068 735
4 146013488 172915836 737
5 166051120 195476604 739
6 186088752 218037372 741
7 206126384 240598140 743
8 226164016 263158908 745
9 246201648 285719676 747
Am I misunderstanding your message / the test program or does your fix not actually work? Thanks.
gopherbot
commented
Feb 29, 2012
I hate to ask this, but do you have a rough time frame for when this bug will be addressed? Order of weeks, months, two-digit months? I’m just asking because I really need a solution for this. I could live without it for a few more weeks, but if it’s going to take longer than that, I’ll have to reimplement my program in a different language (not meant as pressure, just a fact).
gopherbot
commented
Feb 29, 2012
We're having the same problem as in comment #10...We've had to develop several workarounds (including restarting a long-running app when approaching the 512MB allocation limit, which is far from ideal). I'd love it if this was fixed as moving all our servers to 64 bit isn't an affordable option at the moment.
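The restart workaround mentioned above could be driven by a small memory check like the one below. This is only a sketch under assumptions: the 512 MB limit comes from the comment, while the 90% threshold and the helper name overBudget are hypothetical, and the actual restart is left to whatever supervises the process.

```go
package main

import (
	"fmt"
	"runtime"
)

// overBudget reports whether sysBytes has reached the given fraction of limitBytes.
func overBudget(sysBytes, limitBytes uint64, fraction float64) bool {
	return float64(sysBytes) >= fraction*float64(limitBytes)
}

func main() {
	const limit = 512 << 20 // allocation limit quoted in the comment above

	var st runtime.MemStats
	runtime.ReadMemStats(&st) // st.Sys is the total memory obtained from the OS

	if overBudget(st.Sys, limit, 0.9) {
		fmt.Println("approaching the limit; the process should be restarted")
	} else {
		fmt.Printf("ok: %d of %d bytes obtained from the OS\n", st.Sys, uint64(limit))
	}
}
```

A long-running server would call the check periodically, for example from a ticker goroutine, and exit cleanly so that its supervisor can start a fresh process.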
gopherbot
commented
Mar 3, 2012
FYI: I was able to solve this problem (in my case) by using "coffer": I put up a fixed version at http://github.com/mstap/coffer, forked from http://github.com/mcgoo/coffer. I just swapped all my buffers which used buffer := bytes.NewBuffer(make([]byte, 0, readBufferBytes)) before with buffer, _ := coffer.NewMemCoffer(readBufferBytes+1). My old buffer.Reset() becomes buffer.Seek(0, 0), but apart from that, coffer was a drop-in replacement in my case. Note that buffer.Close() free()s the memory and *needs* to be called. This is not a beautiful solution, but it gets the job done until this problem is properly fixed.
gopherbot
commented
Mar 12, 2012
I spent some time digging into this issue and found the following:
1. GC is ineffectual because some memory blocks in Go's heap are mistakenly marked as referenced when in fact they are not.
2. The "fake" references come from static variables defined in various Go packages. These variables are not pointers. However, as the GC scans the data section for potential references to the heap, they are treated as "pointers", and the entire heap blocks that these "pointers" happen to "reference" can never be reclaimed, even when they should be.
3. The attached test program, mem.go, illustrates how these fake pointers prevent the GC from freeing used memory on both tip (12661:426b1101b166) and r60.3 on 32-bit Linux. To run the test program, unzip the attachment. To run it on Go 1, go to "go0" and run "go run mem.go". Make sure your tip hash is 12661:426b1101b166, the most recent as of now. To run it on the Go release, go to "go0" and run "make" and "./mem". In the unicode package, there are many static variables that end up in the DATA section. As the GC scans the data section at runtime/mgc0.c:648, it treats the variables as pointers, and some happen to "point" to memory blocks in the heap. If we comment out the unicode package, GC works and the program runs fine.
4. The issue is more likely to crash applications that allocate memory in large chunks, because one "fake" pointer can hold a large piece of memory and it does not take many fake pointers to make the app run out of RAM. With small allocations the problem still persists, though it is much less severe and often goes unnoticed.
5. I suspect the issue potentially exists on 64-bit as well.
Attachments:
- issue_909.zip (1645 bytes)
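The mechanism described in points 1 and 2 can be illustrated with a toy model. This is not the runtime's collector: the block type, the addresses, and the function conservativeMark are all hypothetical, and the sketch only shows how a plain integer whose value falls inside a heap block keeps that block alive under conservative scanning.

```go
package main

import "fmt"

// block is a hypothetical heap allocation covering [start, start+size).
type block struct {
	start, size uint32
}

// contains reports whether addr falls anywhere inside the block.
func (b block) contains(addr uint32) bool {
	return addr >= b.start && addr-b.start < b.size
}

// conservativeMark treats every word in data as a possible pointer and returns
// the indices of the heap blocks that at least one word appears to reference.
func conservativeMark(data []uint32, heap []block) []int {
	var pinned []int
	for i, b := range heap {
		for _, w := range data {
			if b.contains(w) {
				pinned = append(pinned, i)
				break
			}
		}
	}
	return pinned
}

func main() {
	// Two 64 MB blocks that no real pointer references any more.
	heap := []block{
		{start: 0x10000000, size: 64 << 20},
		{start: 0x20000000, size: 64 << 20},
	}
	// Plain integers from a static table; none of them is a pointer.
	data := []uint32{0x00000041, 0x10ab0000, 0x0001d7ff}

	// The value 0x10ab0000 lands inside block 0, so that block cannot be freed.
	fmt.Println("blocks pinned by fake pointers:", conservativeMark(data, heap)) // [0]
}
```

A single matching word is enough to pin an entire 64 MB block, which is why large allocations suffer most.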
gopherbot
commented
Mar 12, 2012
To make it more convenient, I am pasting the code here:
package main

import (
	"runtime"
	// comment out the following line and the program runs fine.
	"unicode"
)

func fs() []byte {
	// allocate 64 MB chunks
	r := make([]byte, 64*1024*1024)
	return r
}

func main() {
	// comment out the following line and the program runs fine.
	println("addr:", &unicode.Scripts)
	var s []byte
	for i := 0; i < 100; i++ {
		println("")
		println(i, "---------------")
		s = fs()
		runtime.GC()
		var m runtime.MemStats
		runtime.ReadMemStats(&m)
		println(i, "MemStats.Alloc:", m.Alloc)
	}
	_ = s
}
gopherbot
commented
Mar 12, 2012
Comment 31 by matthewrsiegel@comcast.net:
excellent investigation! if there were a hack to prevent gc from bothering with static data...
gopherbot
commented
Mar 13, 2012
I was able to prove that the bug exists on 64-bit Linux as well with r60.3. The two fake
pointers prevent the GC from reclaiming the two memory blocks allocated in the
first two iterations of the loop. However, it may not work for you because setting two
fake pointers is tricky.
My argument is that this is essentially a bug that can cause huge memory leaks on both
64-bit and 32-bit servers.
Moderators, should I create a separate issue?
package main

import (
	"runtime"
)

// Integers chosen so that their values fall inside heap blocks.
var fake_pointer1 uint64 = 0xf84402d000
var fake_pointer2 uint64 = 0xf84000d000

func fs() []byte {
	// allocate 64 MB chunks
	r := make([]byte, 64*1024*1024)
	return r
}

func main() {
	println("addr:", &fake_pointer1, &fake_pointer2)
	var s []byte
	for i := 0; i < 10; i++ {
		println("")
		println(i, "---------------")
		s = fs()
		runtime.GC()
		// r60.3 API: runtime.MemStats is a package-level variable here.
		// Go 1 replaced it with runtime.ReadMemStats.
		m := runtime.MemStats
		println(i, "MemStats.Alloc:", m.Alloc)
	}
	_ = s
}
gopherbot
commented
Apr 6, 2012
Comment 34 by ezyang@mit.edu:
Here's what I don't understand: Go appears to keep track of some simple type information, namely, whether or not a region may have pointers, or doesn't have pointers. Why can't this be added to static data?
rsc
commented
Apr 6, 2012
We do track that for static data too, and pure-data (no pointer) blocks do not get scanned by the garbage collector. That's what I did in the CL described by comment #20. But that's not enough. All it takes is one unfortunate collision to pin a pointer in memory that should not be, and that might be the root of some large tree of allocations. The chance of collision increases the more memory you allocate, since more and more 32-bit patterns point into allocated data. The collector is structured so that it can keep per-word information about what is a pointer and what is not, but we have not hooked that up to the C compiler and the Go compiler yet, so we can't take advantage of that. It is a known, unfortunate issue. The best workaround I can suggest for the short term is to use a 64-bit machine. Sorry. Russ
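The claim that collisions become more likely as the heap grows can be put into rough numbers. The sketch below assumes, purely for illustration, that non-pointer words are uniformly random 32-bit values; real program data is not uniform, so treat the results as an order-of-magnitude guide. Both function names are hypothetical.

```go
package main

import "fmt"

// collisionChance estimates the probability that one uniformly random 32-bit
// word lands inside heapBytes of allocated memory.
func collisionChance(heapBytes uint64) float64 {
	const addressSpace = 1 << 32 // number of distinct 32-bit patterns
	return float64(heapBytes) / addressSpace
}

// expectedFalsePointers estimates how many of `words` random non-pointer
// words would be mistaken for pointers into the heap.
func expectedFalsePointers(words int, heapBytes uint64) float64 {
	return float64(words) * collisionChance(heapBytes)
}

func main() {
	for _, mb := range []uint64{16, 128, 512} {
		heap := mb << 20
		fmt.Printf("heap %4d MB: chance per word %.4f, expected hits in 10000 words %.0f\n",
			mb, collisionChance(heap), expectedFalsePointers(10000, heap))
	}
}
```

Under this assumption, with 512 MB allocated one random word in eight looks like a pointer into the heap. On a 64-bit machine the same heap is a vanishing fraction of the address space, which is why moving to 64-bit works around the problem.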
gopherbot
commented
Apr 7, 2012
Is it actually feasible for Go to have a non conservative garbage collector? I.e. are there any language design choices which make a practical realisation impossible? I really like the Go language, and I understand the reasons for the current choice of GC, but realistically, due to design choices and restrictions (Cgo etc..), are we going to be stuck with this situation for the foreseeable future? I really hope not, but currently this issue is barring me from promoting the use of Go in my workplace... Regards,
gopherbot
commented
Dec 15, 2012
I can't use Go until this issue is fixed, and there are lots of others like me. Getting this done would significantly boost adoption of the language. I get the impression from "Owner: ----", "Priority-Later", and "Size-XL" that this isn't slated to be fixed anytime soon... How can we change that? Who is in a position to get this done?
Work is in progress on this (see golang-dev emails from 0xe2.0x9a.0x9b, for example), and it remains a high priority for 32-bit systems (notably ARM devices). No number of stars or comments on this bug will increase its priority, though, as the number of people qualified to work on this is relatively low and they're mostly already working on it.
gopherbot
commented
May 8, 2013
I don't see it mentioned here: "[This issue] should be mostly addressed by improvements to the garbage collector in 1.1. If you've experienced the issues, please try Go 1.1 and see if the problem persists, and update the issue with your results." -- Andrew Gerrand, https://groups.google.com/d/msg/golang-nuts/PoQGGq-V2l8/Hzw-OC6xY8IJ
gopherbot
commented
Jun 13, 2013
Comment 58 by trent@wireover.com:
Is this problem fixed? It's unclear: someone said "[This issue] should be mostly addressed by improvements to the garbage collector in 1.1", but the issue is still open and scheduled for "Go1.2Maybe". This single issue is blocking us from using Go.
ianlancetaylor
commented
Jun 13, 2013
For a problem of this generality, "fixed" is a difficult word. The core problem is that the garbage collector is not fully precise. This can cause a value that does not contain a pointer to appear to be a pointer, which prevents the garbage collector from collecting the memory block to which the non-pointer appears to point. In practice this happens most often with floating point values; integers rarely appear to be pointers, but floating point values sometimes do. So much for the background.
In Go 1.1 the GC is far more precise than the GC in Go 1.0. That means that these sorts of problems are vastly less likely to occur when using Go 1.1 than they are with Go 1.0. In particular in Go 1.1 the heap is almost entirely precisely typed, and floating point values in the heap will never be mistaken for pointers.
However, the problem is not entirely fixed because even in Go 1.1 the GC is not entirely precise. In particular in Go 1.1 stack frames are not precisely typed. So if, for example, you have a bunch of floating point local variables, those variables can appear to be pointers while the function is live and cause the GC to fail to collect various memory blocks. This is not likely to be a problem for most programs, but it is certainly possible.
In Go 1.2 the GC will be even more precise than it is in Go 1.1. On tip the GC is already precise for function arguments and results. Go 1.2 may be precise for local variables on the stack. It may even be fully precise. But it's too early to tell whether this work, which is ongoing, will be complete for Go 1.2.
So the current status is that most programs that did not work with Go 1.0 should now work fine with Go 1.1. However, it is possible to construct test cases that will not work well with Go 1.1. And there is a very small chance that your real program will accidentally happen to be one of those test cases. But it's a very small chance.
Also, if you are so unfortunate as to encounter that chance, you do have options in Go 1.1 that you did not have in Go 1.0: you can force your non-pointers that appear to be pointers off the stack into the heap. That will give you a bit more heap allocation, but in return the GC will collect more data. Hope this helps.
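To see why floating point values are the usual culprits, it helps to look at their bit patterns. The sketch below splits a float64 into the two 32-bit words a conservative scanner on a 32-bit machine would see and checks them against a heap range. The heap range and the helper names are assumptions made up for the example; real heap addresses depend on the OS and the runtime.

```go
package main

import (
	"fmt"
	"math"
)

// words32 splits a float64 into the two 32-bit words that a conservative
// scanner on a 32-bit machine would examine.
func words32(f float64) (hi, lo uint32) {
	bits := math.Float64bits(f)
	return uint32(bits >> 32), uint32(bits)
}

// looksLikePointer reports whether w falls inside the hypothetical heap
// range [heapStart, heapEnd).
func looksLikePointer(w, heapStart, heapEnd uint32) bool {
	return w >= heapStart && w < heapEnd
}

func main() {
	// Hypothetical heap range for illustration: 512 MB starting at 0x30000000.
	const heapStart, heapEnd = 0x30000000, 0x50000000

	for _, f := range []float64{1.0, 0.5} {
		hi, lo := words32(f)
		fake := looksLikePointer(hi, heapStart, heapEnd) || looksLikePointer(lo, heapStart, heapEnd)
		fmt.Printf("%g -> hi=%#08x lo=%#08x looks like a pointer: %v\n", f, hi, lo, fake)
	}

	// A small integer is far below any plausible heap address.
	fmt.Println("42 looks like a pointer:", looksLikePointer(42, heapStart, heapEnd))
}
```

The high word of 1.0 is 0x3ff00000, an address just under 1 GB, which is exactly the kind of value that can fall inside a large 32-bit heap. Small integers such as counters and indices sit far below that range.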
gopherbot
commented
Jun 19, 2013
Comment 60 by trent@wireover.com:
Thanks for that thorough explanation. I understand this is a rather complex issue, and I appreciate that you're working to fix it. To paraphrase, what I'm hearing is:
1. "You can write syntactically correct Go code, and it will most likely work, but there's a small chance it will leak memory."
2. "If your code is going to leak memory, you won't know ahead of time; you'll have to monitor the code while it runs. It will be a major pain to detect and figure out where it's leaking, and the only fix is an awkward work-around."
Are those fair? I need Go to run unattended for a long time in a high-reliability production deployment. This issue has turned me off using Go. I'm very excited to give Go a go when the GC is completely precise.
alberts
commented
Jun 19, 2013
@60: For what it's worth, I think you might be denying yourself the benefits you will reap from using Go because of a minor issue. If the software needs to run in a high-reliability production deployment, you will need to build a proper system/stress/load test for it. That test will tell you if you have this issue within a couple of seconds to minutes. You need this test regardless of this issue. You might write memory-leaking code all on your own in another language, despite the minor shortcomings of the Go runtime. You should also have something (systemd, upstart, whatever) supervising your Go process. Depending on what you choose, you could limit the maximum memory usage (or if you're running on 32-bit, it won't go over 2 GB anyway). Again, you should have this for any service written in any language anyway so that rare crashes don't bring down your high-reliability production system. Also, you can further guard against this issue by running 64-bit code instead of 32-bit code.
gopherbot
commented
Jun 22, 2013
Comment 63 by trent@wireover.com:
These are good points. I had thought that the nature of the memory leak was that my process would gradually use more and more memory, and eventually blow up because leaked memory would never be reclaimed by the GC. But one of you posted recently saying that a leaky app would just use a bit more memory than it should. If this is true then I'm a lot less concerned. Would you explain how this can be the case? I don't have a good understanding of how GCs or memory management work. If my process has a lot of data-structure churn, won't the leaked memory gradually accumulate? If memory is leaked, how would it ever get reclaimed by the GC? What prevents the accumulation of leaked memory?
ianlancetaylor
commented
Jun 22, 2013
The current garbage collector takes a snapshot of memory at a particular moment. Anything accessible from global variables and goroutine stacks is live. Everything else is freed. A memory leak is when some memory block that is inaccessible is not freed. To put it another way, a memory leak occurs when some value that is not a pointer appears to be a pointer and appears to point to a memory block that would otherwise be freed.
Go 1.1 is precise on the heap, which means that there are no such invalid pointers on the heap. The only source of invalid pointers is the goroutine stacks. Since the stack is not precise, at any given snapshot of time, it is possible that the stack will contain values that are not pointers but look like pointers, and that may cause some blocks to not be freed.
However, goroutine stacks are finite and (unless there is some bug in your program) the average size of the goroutine stacks as your program runs will remain fixed. So on average you can't get an ever increasing number of invalid pointers on the goroutine stacks. Instead you have a fixed set of invalid pointers, and depending on your program that fixed set will either change or not.
If the fixed set of invalid pointers doesn't change, then you have some fixed amount of extra memory that is not freed. Data structure churn doesn't affect that. The invalid pointers will point to blocks that should be freed, but there are (by definition) no real pointers to point to those blocks, so they won't change, and the amount of memory they tie up won't change.
Alternatively, if the fixed set of invalid pointers does change over time, then over time some blocks will have no invalid pointers pointing to them and they will be freed, while other, different, blocks will be retained incorrectly. The amount of memory incorrectly retained will fluctuate over time, but on average there is no reason to expect it to either increase or decrease. The only way that the incorrectly retained memory would increase over time would be if the overall size of your live data is increasing over time--in which case you have a problem anyhow.
So either way you have a leak in the sense that your program is using more memory than it should, but the amount of extra memory should be, on average, fixed based on the behaviour of your program, and it should not increase over time.
It may be possible to write a Go 1.1 program that does have a memory leak in the worst sense, in that it steadily uses more memory over time even though the live memory does not actually increase. But I'm having a hard time thinking of a way to do it. And I'm confident that any such program would be highly specialized for this purpose; the problem could not arise by accident.
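The argument above can be checked against a toy simulation. It is a deliberately simplified model and not how the runtime works: the heap is a row of equal-sized slots, every slot is filled with garbage each round, and a fixed number of stale stack words each happen to point at one slot, with the values drifting from round to round. The function simulate and its parameters are hypothetical.

```go
package main

import "fmt"

// simulate runs `cycles` allocate-and-collect rounds over a heap of `slots`
// equal-sized blocks. Each round the program fills every slot with garbage,
// then the collector frees every slot that no stale stack word points at.
// It returns the largest number of blocks wrongly retained after any round.
func simulate(slots, cycles int, stale []int) int {
	inUse := make([]bool, slots)
	maxRetained := 0
	for c := 0; c < cycles; c++ {
		// Churn: allocate into every slot; nothing keeps a real reference.
		for i := range inUse {
			inUse[i] = true
		}
		// Stale words drift each round, modelling changing stack contents.
		pinned := make([]bool, slots)
		for _, s := range stale {
			pinned[(s+c)%slots] = true
		}
		// Collect: only blocks hit by a stale word survive.
		retained := 0
		for i := range inUse {
			inUse[i] = pinned[i]
			if pinned[i] {
				retained++
			}
		}
		if retained > maxRetained {
			maxRetained = retained
		}
	}
	return maxRetained
}

func main() {
	stale := []int{3, 250, 251, 900} // four non-pointer values that look like pointers
	fmt.Println("max blocks retained:", simulate(1000, 10000, stale)) // 4
}
```

However long the simulation runs, the number of wrongly retained blocks never exceeds the number of stale words, which is the bounded-overhead behaviour described above.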
gopherbot
commented
Jun 22, 2013
Comment 65 by trent@wireover.com:
That allays my concerns about this issue. Thank you for the detailed explanation. I recall another issue about Go memory management - something about Go requiring a large contiguous chunk of RAM when a program launches, which on 32-bit systems can fail in certain circumstances. I can't seem to find updated info on this issue - is it still a problem with Go 1.1? Can you characterize this issue? I'd like to determine if Go can be stable WRT memory running as a 32-bit Windows desktop app (understanding that Go is intended as a 64-bit server language). Are there still any memory-management concerns with this?
ianlancetaylor
commented
Jun 22, 2013
In both Go 1 and Go 1.1, in 32-bit mode, the memory allocator starts with a chunk of 768MB and then asks for additional chunks of 256MB as they are needed. The Go 1.1 allocator is slightly cleverer but not really noticeably so. Unfortunately I know very little about Go running in the Windows environment; I don't know if there are any memory issues that arise on Windows but not elsewhere.
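As a rough illustration of those numbers, the sketch below computes how much address space would be requested for a given heap size if one takes the description literally: one 768 MB chunk up front, then 256 MB at a time. This is a simplified reading of the comment, not the allocator's actual logic, and reservedMB is a made-up name.

```go
package main

import "fmt"

// reservedMB returns the address space, in MB, requested for a heap of heapMB
// under the simplified model: 768 MB initially, then 256 MB increments.
func reservedMB(heapMB int) int {
	const initial, step = 768, 256
	if heapMB <= initial {
		return initial
	}
	extra := heapMB - initial
	chunks := (extra + step - 1) / step // round up to whole chunks
	return initial + chunks*step
}

func main() {
	for _, heap := range []int{100, 768, 769, 1025} {
		fmt.Printf("heap %4d MB -> reserved %4d MB\n", heap, reservedMB(heap))
	}
}
```

Under this model even a small program asks for 768 MB of contiguous address space at startup, which is why a fragmented 32-bit address space can cause trouble.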
alexbrainman
commented
Jun 24, 2013
trent@wireover.com,
> I recall another issue about Go memory management - something about Go requiring a large contiguous chunk of RAM when a program launches, which on 32-bit systems can fail in certain circumstances. I can't seem to find updated info on this issue - is it still a problem with Go 1.1? Can you characterize this issue?
I think you are talking about https://golang.org/issue/2323. Looking at the issue again, Go used to reserve one large block of about 800MB of memory. Checking a recently built Go program (with vmmap), I can see that it now reserves 2 large blocks of about 250MB and 500MB instead. So that should be an improvement here.
> ... Are there still any memory-management concerns with this?
There is also the issue of Go not returning committed memory back to Windows: https://golang.org/issue/4960 and https://golang.org/issue/5584. It is a big issue, as far as I am concerned, if your program occasionally requires "a lot" of memory, because once it has been allocated, it won't be released back to the OS until your process exits. So your whole system will suffer. If your process is short lived, or you don't require excessive amounts of memory, you should be OK.
Alex
gopherbot
commented
Jun 25, 2013
Comment 68 by trent@wireover.com:
That's very useful info guys, thank you. Keep up the good work - Go has a bright future.
gopherbot
commented
Jul 12, 2013
The make function also holds a static pointer to the last thing it allocated, giving weird
results in memory statistics:
package main

import (
	"runtime"
)

func fs(sz int) []byte {
	// allocate sz MB chunks
	r := make([]byte, sz*1024*1024)
	return r
}

func printMemStats(i int) {
	var m runtime.MemStats
	runtime.GC()
	runtime.ReadMemStats(&m)
	println(i, "MemStats.Alloc:", m.Alloc/(1024*1024))
}

func clearMake() {
	r := make([]int, 1)
	_ = r
}

func main() {
	var s []byte
	var t []byte
	s = fs(64)
	printMemStats(1)
	s = nil
	printMemStats(2)
	clearMake()
	printMemStats(3)
	t = fs(32)
	printMemStats(4)
	s = fs(64)
	printMemStats(5)
	_ = s
	_ = t
}
Output:
1 MemStats.Alloc: 64
2 MemStats.Alloc: 64
3 MemStats.Alloc: 0
4 MemStats.Alloc: 32
5 MemStats.Alloc: 96
Expected:
1 MemStats.Alloc: 64
2 MemStats.Alloc: 0
3 MemStats.Alloc: 0
4 MemStats.Alloc: 32
5 MemStats.Alloc: 96
davecheney
commented
Jul 13, 2013
@Cevian, could you please open a new issue (issue #909 is too long) with your code sample.
randall77
commented
Mar 5, 2014
Status update. We've done a lot to improve the preciseness of the garbage collector for 1.3. The major change is that scanning of Go stack frames is now completely precise. There are lots of minor changes, including scanning interfaces precisely, correct context pointer scanning, and modifying reflect.Value to do its magic in a precise way. We're not completely precise yet, but we're 99% there. The major remaining piece is the scanning of C stack frames. A few minor pieces also remain, including scanning of some runtime internal data structures.
rsc
commented
Sep 18, 2014
We've done a lot over the last few months. C stack frames are gone, and the internal data structures are described correctly now. The only piece I am aware of that is left is a few C-declared data structures that the linker instructs the garbage collector to scan conservatively. I think we can eliminate those for 1.4 and finally close this bug. There is one other piece that I am not counting: if you use SWIG to allocate Go memory from C++, that Go memory is scanned conservatively. That's a different problem (issue 6461) and not a concern for most Go programmers (since most don't use SWIG).
Labels changed: added release-go1.4, removed release-none.
This was referenced Dec 8, 2014
Closed
This issue was closed.
GOARCH=386 Garbage Collection Is Ineffectual: mmap: errno=0xc
package main

import (
	"fmt"
	"runtime"
)

func fs() []float64 {
	r := make([]float64, 923521)
	return r
}

func main() {
	s := fs()
	for i := 0; i < 1000; i++ {
		s = fs()
		m := runtime.MemStats
		fmt.Printf("i %d; Alloc %d; TotalAlloc %d\n", i, m.Alloc, m.TotalAlloc)
	}
	m := runtime.MemStats
	fmt.Printf("end; Alloc %d; TotalAlloc %d\n", m.Alloc, m.TotalAlloc)
	_ = s
}
Expected results obtained with 6g, GOARCH=amd64, and GOOS=linux, for hg id 5af6f6656531 tip, using 6.0GB real memory:
i 0; Alloc 15086544; TotalAlloc 15141352
i 1; Alloc 15156656; TotalAlloc 23020488
i 2; Alloc 22545888; TotalAlloc 30798840
i 3; Alloc 30004752; TotalAlloc 38646824
i 4; Alloc 15295920; TotalAlloc 46520408
i 5; Alloc 22685152; TotalAlloc 54298760
i 6; Alloc 30074384; TotalAlloc 62077112
. .
i 996; Alloc 30074384; TotalAlloc 7771093592
i 997; Alloc 15295920; TotalAlloc 7778897544
i 998; Alloc 22685152; TotalAlloc 7786675896
i 999; Alloc 30074384; TotalAlloc 7794454248
end; Alloc 30074432; TotalAlloc 7794838296
Actual results obtained with 8g, GOARCH=386, and GOOS=linux, for hg id 5af6f6656531 tip, using 0.5GB real memory:
i 0; Alloc 14975824; TotalAlloc 16730200
i 1; Alloc 22402248; TotalAlloc 24448472
i 2; Alloc 29828344; TotalAlloc 32166408
i 3; Alloc 37254440; TotalAlloc 39884344
i 4; Alloc 44680536; TotalAlloc 47602280
i 5; Alloc 44717256; TotalAlloc 55320216
i 6; Alloc 52106488; TotalAlloc 63001288
. . .
i 427; Alloc 2814607576; TotalAlloc 3310449848
i 428; Alloc 2822033672; TotalAlloc 3318167784
i 429; Alloc 2829459768; TotalAlloc 3325885720
i 430; Alloc 2836885864; TotalAlloc 3333603656
mmap: errno=0xc
For a complete description of the problem, q.v. "conjugate gradient method out of memory":
http://groups.google.com/group/golang-nuts/browse_thread/thread/6fb3e3b7ae04d42a