runtime: pageAlloc.searchAddr may point to unmapped memory in discontiguous heaps, violating its invariant #40191
It's been brought to my attention by the Fuchsia team that in cases where the Go heap is very discontiguous, the documented invariant for
Consider the following case.
In a contiguous heap we're unlikely to run into this case because the first page of
Most operating systems, even when not following the address hints provided to
This is a real problem, even for existing systems. Note that the above scenario need not happen for summary level
Fixing this problem is not straight-forward without some kind of performance consequences (though it's unclear to me how big those consequences will be).
Fundamentally, this issue arises because we try to update
Another option is to actually find the first piece of mapped memory in that region, which would be possible by iterating over
A third option is to remove the invariant on
My vote is for option 2. It's not terribly complicated, and the change is local to just the
This issue should not be considered a blocking issue for Go 1.15 because the bug was technically introduced in Go 1.14, but it should be fixed and probably backported too, since it can cause failures with no workaround.
The text was updated successfully, but these errors were encountered:
Why then is there a free page indicated in unmapped memory? How is page i not mapped?
@randall77 The higher-level summaries describe large portions of the address space, some of which may not be mapped. IIRC each level-0 summary on linux/amd64 represents 128 GiB of address space. If your heap is small (like one arena in size) then one level-0 summary will indicate that pages are free in that 128 GiB region are free, but not all of that 128 GiB need be mapped. You're right that unmapped pages necessarily have
Also, d'oh, I forgot to CC you. Sorry about that.
…geAlloc.find Currently pageAlloc.find attempts to find a better estimate for the first free page in the heap, even if the space its looking for isn't necessarily going to be the first free page in the heap (e.g. if npages >= 2). However, in doing so it has the potential to return a searchAddr candidate that doesn't actually correspond to mapped memory, but this candidate might still be adopted. As a result, pageAlloc.alloc's fast path may look at unmapped summary memory and segfault. This case is rare on most operating systems since the heap is kept fairly contiguous, so the chance that the candidate searchAddr discovered is unmapped is fairly low. Even so, this is totally possible and outside the user's control when it happens (in fact, it's likely to happen consistently for a given user on a given system). Fix this problem by ensuring that our candidate always points to mapped memory. We do this by looking at mheap's arenas structure first. If it turns out our candidate doesn't correspond to mapped memory, then we look at inUse to round up the searchAddr to the next mapped address. While we're here, clean up some documentation related to searchAddr. For #40191. Fixes #40192. Change-Id: I759efec78987e4a8fde466ae45aabbaa3d9d4214 Reviewed-on: https://go-review.googlesource.com/c/go/+/242680 Run-TryBot: Michael Knyszek <firstname.lastname@example.org> Reviewed-by: Austin Clements <email@example.com> Reviewed-by: Michael Pratt <firstname.lastname@example.org> TryBot-Result: Gobot Gobot <email@example.com> (cherry picked from commit b56791c) Reviewed-on: https://go-review.googlesource.com/c/go/+/246197 Run-TryBot: Dmitri Shuralyov <firstname.lastname@example.org>
Currently the AddrRange used for testing is defined separately from addrRange in the runtime, making it difficult to test it as well as addrRanges. Redefine AddrRange in terms of addrRange instead. For #40191. Change-Id: I3aa5b8df3e4c9a3c494b46ab802dd574b2488141 Reviewed-on: https://go-review.googlesource.com/c/go/+/242677 Trust: Michael Knyszek <email@example.com> Run-TryBot: Michael Knyszek <firstname.lastname@example.org> TryBot-Result: Go Bot <email@example.com> Reviewed-by: Michael Pratt <firstname.lastname@example.org> Reviewed-by: Austin Clements <email@example.com>
This change adds a test suite for addrRanges.findSucc so we can change the implementation more safely. For #40191. Change-Id: I14a834b6d54836cbc676eb0edb292ba6176705cc Reviewed-on: https://go-review.googlesource.com/c/go/+/242678 Trust: Michael Knyszek <firstname.lastname@example.org> Run-TryBot: Michael Knyszek <email@example.com> TryBot-Result: Go Bot <firstname.lastname@example.org> Reviewed-by: Austin Clements <email@example.com> Reviewed-by: Michael Pratt <firstname.lastname@example.org>
This change modifies addrRanges.findSucc to more efficiently find the successor range in an addrRanges by using a binary search to narrow down large addrRanges and iterate over no more than 8 addrRanges. This change makes the runtime more robust against systems that may aggressively randomize the address space mappings it gives the runtime (e.g. Fuchsia). For #40191. Change-Id: If529df2abd2edb1b1496d8690ddd284ecd7138c2 Reviewed-on: https://go-review.googlesource.com/c/go/+/242679 Trust: Michael Knyszek <email@example.com> Reviewed-by: Austin Clements <firstname.lastname@example.org> Reviewed-by: Michael Pratt <email@example.com>