runtime: mallocs cause "base outside usable address space" panic when running on iOS 14 #46860

Open
rayvbr opened this issue Jun 21, 2021 · 32 comments


@rayvbr rayvbr commented Jun 21, 2021

What version of Go are you using (go version)?

$ go version
go version go1.16.5 darwin/amd64 (also reproducible on 1.15 and 1.14)

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

ios/arm64

What did you do?

Create a simple Go package that performs a few medium-sized allocations (creating a few byte slices of several tens of MB will do, as long as it forces the Go runtime to ask the OS for more memory), compile it with gomobile into an iOS .framework file, use the resulting framework in an iOS project, and compile and run for iOS 14.
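For reference, a minimal sketch of the kind of allocation described above. The function name `allocChunks` and the 4096-byte touch stride are illustrative assumptions, not code from the affected project, and for gomobile the function would live in a non-main package and be exported.

```go
package main

import "fmt"

// allocChunks allocates n byte slices of chunkSize bytes each and writes one
// byte every 4096 bytes so the memory is actually backed, not just reserved.
// The name and the stride are illustrative assumptions.
func allocChunks(n, chunkSize int) [][]byte {
	chunks := make([][]byte, 0, n)
	for i := 0; i < n; i++ {
		b := make([]byte, chunkSize)
		for off := 0; off < len(b); off += 4096 {
			b[off] = 1
		}
		chunks = append(chunks, b)
	}
	return chunks
}

func main() {
	// Four 32 MiB slices: large enough that the runtime has to grow the heap.
	chunks := allocChunks(4, 32<<20)
	total := 0
	for _, c := range chunks {
		total += len(c)
	}
	fmt.Printf("allocated %d MiB in %d chunks\n", total>>20, len(chunks))
}
```

Keeping the returned slices alive matters here: if they are dropped, the garbage collector can reuse the memory and the heap may never grow far enough to hit the problem.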

If the iOS project doesn't do a lot of memory allocs itself, everything works fine.
If the iOS project reserves a lot of memory (either for CPU or GPU) before initialising the Go framework, the Go runtime crashes with the following panic:

runtime: memory allocated by OS [0x2ac000000, 0x2b0000000) not in usable address space: base outside usable address space
fatal error: memory reservation exceeds address space limit

Note that the exact amount of memory needed seems to be device-dependent, although in all cases the panic happens well before the typical OOM point for the device in question. The issue was reproduced on an iPhone 12 (allocating 1.1GB of RAM in the iOS project is sufficient to make it crash), an iPhone 12 Pro and an iPhone SE 2020. Note that everything works fine when compiling with pre-iOS 14 versions of Xcode.

Although I'm way out of my league here, the addresses indicated in the panic seem to indicate that iOS no longer has a 33-bit memory address limit, as assumed by the Go runtime. Perhaps related to iPhone 12 Pro having 6GB of RAM, which would mean the 4GB limit assumed by the Go runtime would no longer be sufficient?

@rayvbr rayvbr changed the title Jun 21, 2021
from: runtime: mallocs fail with base outside usable address space when running on iOS 14
to: runtime: mallocs cause "base outside usable address space" panic when running on iOS 14
@toothrot toothrot added this to the Backlog milestone Jun 21, 2021
Contributor

@toothrot toothrot commented Jun 21, 2021

Contributor

@mknyszek mknyszek commented Jun 22, 2021

> Although I'm way out of my league here, the addresses indicated in the panic seem to indicate that iOS no longer has a 33-bit memory address limit, as assumed by the Go runtime. Perhaps related to iPhone 12 Pro having 6GB of RAM, which would mean the 4GB limit assumed by the Go runtime would no longer be sufficient?

This might be the case. To be honest, I couldn't find any documentation about this. This was the best I could infer from experiments and the experiments of others I found. Basically, the assumption I was operating under is that each process has 4 GiB of address space, starting from the 4 GiB offset (the bottom 4 GiB are reserved).
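To make that assumption concrete, here is a small sketch that models the usable window as [4 GiB, 8 GiB) and tests the region from the panic against it. The helper names are hypothetical and this is only a model of the stated assumption, not the runtime's actual check.

```go
package main

import (
	"fmt"
	"math/bits"
)

const gib = uint64(1) << 30

// addrBits returns how many bits are needed to represent addr.
func addrBits(addr uint64) int {
	return bits.Len64(addr)
}

// inWindow reports whether [base, base+size) lies entirely inside [lo, hi).
func inWindow(base, size, lo, hi uint64) bool {
	return base >= lo && base+size <= hi
}

func main() {
	// Assumed usable window: 4 GiB of address space starting at the 4 GiB offset.
	lo, hi := 4*gib, 8*gib
	fmt.Printf("top of assumed window needs %d address bits\n", addrBits(hi-1))

	// Region reported in the panic: [0x2ac000000, 0x2b0000000).
	base, end := uint64(0x2ac000000), uint64(0x2b0000000)
	fmt.Printf("reported region needs %d address bits\n", addrBits(end-1))
	fmt.Println("inside assumed window:", inWindow(base, end-base, lo, hi))
}
```

Under this model the window tops out at 33 bits while the reported region needs 34, which matches the "base outside usable address space" message.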

The amount of RAM on the phone theoretically shouldn't matter, because I think (but cannot say for sure) iOS keeps each process's reserved address space somewhat limited, IIUC. The aforementioned experiments revealed you could barely make a 2 GiB mapping (PROT_NONE!) without iOS saying we were out of memory. But, I think you're right that something changed in iOS 14.

One solution here is to just run new experiments and increase the size of the address space that we assume. I'll try to think of something better, but until there's actually any documentation on this, I think the address space structure we have is maybe not a great fit for a platform that doesn't document the size of its available address space. Sigh.

Author

@rayvbr rayvbr commented Jun 22, 2021

Thanks @mknyszek.

I did some searching on changes in iOS 14, and it seems Apple introduced something called Extended Virtual Address Space. Details are scarce, but it sounds very related to our problem above.

Author

@rayvbr rayvbr commented Jun 22, 2021

Update: I can reproduce with an empty iOS project, and having the Go library allocate several GB of data on its own. Instead of the alloc succeeding, or throwing an OOM error, it throws:

runtime: memory allocated by OS [0x2ac000000, 0x2b0000000) not in usable address space: base outside usable address space
fatal error: memory reservation exceeds address space limit

Author

@rayvbr rayvbr commented Jun 25, 2021

@mknyszek Given that Go 1.13 doesn't suffer from the problem, I was able to trace the issue back to this single-line commit 198f045

I've confirmed that when I change the 33 into 39, everything works fine.

That said, I don't understand this code well enough to determine what side-effects that might have. What are the potential risks of me doing so? Is the base outside usable address space error just an extra safeguard against something that should never happen in practice because the OS would never assign a memory address outside of addressable space? Or is me changing it to 39 a serious risk that could result in runtime panics?

Contributor

@mknyszek mknyszek commented Jun 25, 2021

Increasing that number is exactly what I was talking about earlier. The risk with increasing that number naively, though, is that it

  1. causes a larger virtual mapping to be created for a data structure in the runtime, and
  2. slows down access to that data structure slightly (not enough to matter, I think).

This may or may not cause errors in older iOS versions, but I think the mapping for 39 bits should still be relatively small. I should calculate that. The reason why you don't see an issue with Go 1.13 is that I added that in Go 1.14.

My question, however, is: is 39 the right number? What are the actual virtual address space limits when you have the Extended Virtual Addressing entitlement? I can't actually find any documentation on this.

Author

@rayvbr rayvbr commented Jun 25, 2021

I double checked the extended virtual addressing entitlement, but we actually have it turned off. So my theory is that something changed in the way iOS does memory addressing to allow for that feature to be possible, but those changes are also active when the entitlement is not used.

I'll do some more searching to find the 'right' number. The highest memory address I've seen so far is 0x2b8000000.

But I'm afraid changing it to 39 has serious issues. I've had the following crash on the Swift side at least once since testing with 39:

2021-06-25 16:14:36.961698+0200 flat[43222:11907111] Uncaught exception: NSInvalidArgumentException: *** NSAllocateMemoryPages(18446744071865635812) failed
(
    0   CoreFoundation                      0x000000019bbfa5c8 58500388-BF36-397C-84CF-17315A3445B6 + 1217992
    1   libobjc.A.dylib                     0x00000001b06797a8 objc_exception_throw + 60
    2   Foundation                          0x000000019cf81224 NSZoneMalloc + 0
    3   Foundation                          0x000000019ce53f14 63D26DEE-A1FB-34B0-9ADA-CC52E7F1D60C + 57108
    4   Foundation                          0x000000019ce569b8 63D26DEE-A1FB-34B0-9ADA-CC52E7F1D60C + 68024

Contributor

@mknyszek mknyszek commented Jun 25, 2021

That failure suggests to me (but I'm not 100% sure) that Swift's malloc is having trouble getting virtual address space. If that's the case, I think 39 is too high.

Thanks for looking into this! Does 35 work for you? I noticed your original reply (which got sent to my email) had 35 instead of 39.

Author

@rayvbr rayvbr commented Jun 25, 2021

Yes, indeed. I set it to 35 before, but then I noticed it was not high enough; the issue still appeared, albeit at higher memory levels. I'll try a few more values until I find the 'right' one. The main issue is that it is difficult to say for sure when a value is correct. The Swift error above for example only seems to appear when the circumstances are just right and is not easy to reproduce.

Author

@rayvbr rayvbr commented Jun 25, 2021

Author

@rayvbr rayvbr commented Jun 25, 2021

Documenting results of various experiments here:

  • 33-bit addresses: Go panic: runtime: memory allocated by OS [0x2ac000000, 0x2b0000000) not in usable address space: base outside usable address space. fatal error: memory reservation exceeds address space limit
  • 35-bit addresses: Go panic: runtime: memory allocated by OS [0x2b4000000, 0x2b8000000) not in usable address space: base outside usable address space. fatal error: memory reservation exceeds address space limit
  • 37-bit addresses: Go panic: fatal error: out of memory allocating heap arena metadata
  • 38-bit addresses: Go panic: runtime: memory allocated by OS [0x2bc000000, 0x2c0000000) not in usable address space: base outside usable address space. fatal error: memory reservation exceeds address space limit
  • 39-bit addresses: Swift exception (trace below)

Uncaught exception: NSInvalidArgumentException: *** NSAllocateMemoryPages(18446744071865635812) failed
(
    0   CoreFoundation                      0x000000019bbfa5c8 58500388-BF36-397C-84CF-17315A3445B6 + 1217992
    1   libobjc.A.dylib                     0x00000001b06797a8 objc_exception_throw + 60
    2   Foundation                          0x000000019cf81224 NSZoneMalloc + 0
    3   Foundation                          0x000000019ce53f14 63D26DEE-A1FB-34B0-9ADA-CC52E7F1D60C + 57108
    4   Foundation                          0x000000019ce569b8 63D26DEE-A1FB-34B0-9ADA-CC52E7F1D60C + 68024

Author

@rayvbr rayvbr commented Jun 27, 2021

@mknyszek see above. I'm afraid there is no 'good' value. Setting it to 38 instead of 33 makes things better, in the sense that the crash comes later (i.e. higher memory address) than with 33, but we're still not at the real OS memory limit (still several hundred MB away from reaching the OS OOM limit). At 39 though, we get the Swift exception. Not sure how to continue...

Contributor

@mknyszek mknyszek commented Jun 28, 2021

@rayvbr It occurs to me that you might want to switch to treating iOS as "64-bit" instead of "32-bit." Just changing the constant results in the aforementioned runtime data structure getting mapped in entirely as read/write. In the "64-bit" case, we make a reservation, and only map what we need. This may be needed to make 39+ work. At that point, I think we should just eliminate all the iOS workarounds in treating it as "32-bit" and assume a 48-bit address space. 2^39 is already so large that I expect address space to be plentiful.

We could do iOS version detection and do things the old way for iOS <14, and remove the workarounds for >=14 (and then later remove the workarounds altogether when iOS <14 is no longer supported), but what's tricky here is the fact that the data structure is configured by compile-time constants. I'd rather not add a compile-time flag for this.

Another alternative is to switch iOS over to the "64-bit" structure, then detect the version. If the version is <14, the runtime can limit the amount of memory actually mapped and shorten the slices over that mapped memory, so that if we ever try to access beyond the address space that iOS <14 actually supports (say, due to a bug), we won't access random memory and will instead panic (which is preferable).
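The slice-shortening idea can be illustrated in ordinary Go. This is only a sketch of the general technique using a regular heap slice; `limitView` and `safeRead` are hypothetical names, and the runtime's own structures are not built this way.

```go
package main

import "fmt"

// limitView returns a view of backing whose length and capacity are both
// clamped to n, so it cannot be re-extended past n by reslicing.
func limitView(backing []uint64, n int) []uint64 {
	return backing[:n:n]
}

// safeRead reads view[i] and turns an out-of-range panic into an error,
// purely so the example can report both outcomes.
func safeRead(view []uint64, i int) (v uint64, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("access rejected: %v", r)
		}
	}()
	return view[i], nil
}

func main() {
	backing := make([]uint64, 1024) // stands in for the full reserved structure
	view := limitView(backing, 16)  // only the part assumed to be supported

	_, err := safeRead(view, 8)
	fmt.Println("index 8:", err)
	_, err = safeRead(view, 512)
	fmt.Println("index 512:", err)
}
```

The point of the design is that a stray index becomes a bounds-check panic instead of a silent read of memory that was never meant to be touched.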

It's a bit of a hack, but this might be a way forward. It also gives us a clear way out for iOS 15 and beyond.

I really wish we were just allowed to know what the actual address space limits are. That would make this sort of planning a whole lot simpler.

Author

@rayvbr rayvbr commented Jun 28, 2021

Thanks for the explanation @mknyszek, we'll continue our investigation. Two short follow-up questions:

  • You mention "you might want to switch to treating iOS as '64-bit' instead of '32-bit'." How do I enable 64-bit mode for iOS if I want to test that?
  • What surprises me in the results above is that when we increased the address size from 33-bit to 38-bit, the memory address signalled in the panic increased from 0x2ac000000 to 0x2bc000000, which seemingly indicates that we only increased the allowed maximum memory address from ~11.48GB to ~11.74GB (which corresponds to what we see in Xcode Profiler). Shouldn't an increase of 5 additional bits normally result in way more additional addressable memory than that? Doesn't this indicate that something else is going on? Perhaps Apple changing the internal structure of the memory addresses? Or them using a different base offset?
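The mismatch in the second question can be put in numbers. The sketch below assumes an address space that starts at zero and ends at 2^bits, which is a simplification and not necessarily how the runtime computes its limit; `limitFor` is a hypothetical helper.

```go
package main

import "fmt"

const mib = uint64(1) << 20

// limitFor returns the exclusive upper bound of an address space with the
// given number of address bits, assuming it starts at zero.
func limitFor(addrBits uint) uint64 {
	return uint64(1) << addrBits
}

func main() {
	// Base addresses taken from the 33-bit and 38-bit panics.
	first, last := uint64(0x2ac000000), uint64(0x2bc000000)

	fmt.Printf("panic base moved by %d MiB\n", (last-first)/mib)
	fmt.Printf("nominal limit grew by %d MiB\n", (limitFor(38)-limitFor(33))/mib)
	fmt.Println("38-bit panic base is below 2^38:", last < limitFor(38))
}
```

The failing base moved by only 256 MiB while the nominal limit grew by hundreds of GiB, and the rejected address is still far below 2^38, which supports the suspicion that some other bound is involved.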

Contributor

@mknyszek mknyszek commented Jun 29, 2021

@rayvbr RE: your second point... that's a very good point. I should've looked more closely at your earlier message. I think that might mean there's something else that needs to change. Let me double check and see if I can prepare a patch for you if there's something else that has to change.

RE: your first point, I'll just explain how this works in more detail.

The Go runtime manages a data structure that allocates one bit per 8 KiB of address space (this happens lazily, so is generally not a problem), and 8 bytes per 4 MiB of address space (this part is mapped up-front). On systems with small address spaces (like 32-bit systems), this data structure is relatively small, just a few KiB. So, we map the whole thing into memory (see mpagealloc_32bit.go). On systems with large address spaces (like amd64, which has a 48-bit address space in practice), this data structure is much larger, using ~6 GiB of address space if memory serves. Of course, we don't want every Go process to consume 6 GiB of memory by default, so we only make a reservation, and then map in new pieces of the data structure as read/write whenever the heap grows (see mpagealloc_64bit.go).
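As a rough back-of-the-envelope check, the two figures quoted above (one bit per 8 KiB, 8 bytes per 4 MiB) can be turned into sizes for a given number of address bits. This sketch uses only those two figures and hypothetical helper names; it ignores anything else the real structure contains, so it is not expected to reproduce the ~6 GiB total exactly.

```go
package main

import "fmt"

const (
	pageBytes  = 8 << 10 // one bitmap bit per 8 KiB of address space
	chunkBytes = 4 << 20 // one entry per 4 MiB of address space
	entryBytes = 8       // size of each per-chunk entry
)

// upfrontBytes is the size of the 8-bytes-per-4-MiB part, which is the part
// described as being mapped up front.
func upfrontBytes(addrBits uint) uint64 {
	return (uint64(1) << addrBits) / chunkBytes * entryBytes
}

// bitmapBytes is the size of the one-bit-per-8-KiB part, which is described
// as being populated lazily.
func bitmapBytes(addrBits uint) uint64 {
	return (uint64(1) << addrBits) / pageBytes / 8
}

func main() {
	for _, b := range []uint{33, 39, 48} {
		fmt.Printf("%d bits: up-front %d KiB, bitmap %d KiB\n",
			b, upfrontBytes(b)>>10, bitmapBytes(b)>>10)
	}
}
```

With these assumptions the up-front part is 16 KiB at 33 bits and 1 MiB at 39 bits, so raising the constant stays small in absolute terms, while at 48 bits the bitmap alone reaches 4 GiB.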

Currently iOS uses mpagealloc_32bit.go's implementation, because the 6 GiB mapping on iOS was causing all Go processes to error out immediately (not enough address space!). Prior to that exception, it was treated like every other arm64 system.

My suggestion, then, was to make iOS use mpagealloc_64bit.go. The way to do this is to just change the go:build lines in those files and remove the exception for iOS in malloc.go (though I think for other reasons we may still want to keep 4 MiB arenas, so this is going to be a little subtle, but still very little code).

However, using mpagealloc_64bit.go on iOS means that older versions of iOS are going to break. So, I was proposing that we use mpagealloc_64bit.go but add some logic to make the mappings smaller only on iOS. It's not ideal, but it's something, and it should be easy to remove that logic once iOS <14 is no longer supported.
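For readers unfamiliar with the reserve-then-map pattern mentioned above, here is a user-space sketch of it using the standard `syscall` package on a Unix-like system. It assumes `MAP_ANON` mappings are available and is only an illustration of the pattern; the helper names are hypothetical and this is not the runtime's own code.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// reserve sets aside size bytes of address space with no access permissions,
// so nothing can be read or written until part of it is committed.
func reserve(size int) ([]byte, error) {
	return syscall.Mmap(-1, 0, size, syscall.PROT_NONE, syscall.MAP_ANON|syscall.MAP_PRIVATE)
}

// commit makes the first n bytes of a reservation readable and writable.
// n is assumed to be a multiple of the page size.
func commit(region []byte, n int) error {
	return syscall.Mprotect(region[:n], syscall.PROT_READ|syscall.PROT_WRITE)
}

// release returns the whole reservation to the OS.
func release(region []byte) error {
	return syscall.Munmap(region)
}

func main() {
	region, err := reserve(64 << 20) // 64 MiB of address space, nothing usable yet
	if err != nil {
		fmt.Println("reserve failed:", err)
		return
	}
	defer release(region)

	page := os.Getpagesize()
	if err := commit(region, page); err != nil {
		fmt.Println("commit failed:", err)
		return
	}
	region[0] = 42 // safe: inside the committed first page
	fmt.Printf("reserved %d MiB, committed %d bytes\n", len(region)>>20, page)
}
```

Touching any byte beyond the committed prefix would fault, which is why a large reservation costs address space but not usable memory.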

Author

@rayvbr rayvbr commented Jul 9, 2021

@mknyszek, it seems the mpagealloc_64bit.go trick works.

I removed the ios-specific build tags from mpagealloc_64bit.go and mpagealloc_32bit.go, removed the 33-bit exception in malloc.go, and the result is that I can finally use the full available memory on an iPhone 12. When I do cross the app memory limit, I now get a proper fatal error: out of memory error, instead of the memory reservation exceeds address space limit panic. On the Swift side everything looks fine as well, and I don't have any runtime exceptions like when I changed 33 to 39.

Note that I did not touch the following two lines:

case GOARCH == "arm64" && GOOS == "ios":
     p = uintptr(i)<<40 | uintptrMask&(0x0013<<28)
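To see what addresses that expression produces, here is a small sketch that evaluates it for the first few values of i. It assumes 64-bit pointers, so that uintptrMask is all ones, and the function name is hypothetical.

```go
package main

import "fmt"

// iosArenaHint evaluates the expression from the quoted case for index i,
// assuming 64-bit pointers so that uintptrMask has every bit set.
func iosArenaHint(i uint64) uint64 {
	const uintptrMask = ^uint64(0)
	return i<<40 | uintptrMask&(0x0013<<28)
}

func main() {
	for i := uint64(0); i < 3; i++ {
		fmt.Printf("i=%d -> %#x\n", i, iosArenaHint(i))
	}
}
```

For i = 0 the result is 0x130000000, a little under 5 GiB, and each increment of i adds 2^40 to it.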

Unfortunately, I don't have any iOS 13 (or earlier) device I can test on. And downgrading seems to be far from trivial. Do you happen to have one?

Author

@rayvbr rayvbr commented Aug 13, 2021

Hey @mknyszek, any suggestions on the best way forward? While we've had the above running in production for some time now, I'd love to at some point go back to using an official Go release, instead of relying on homemade patches, especially where it concerns low-level code like this...


@requilence requilence commented Aug 23, 2021

@mknyszek We are facing the same issue on iOS 14. Do you have plans to patch this in 1.16.x/1.17.x?
The idea of having a DIY-patched Go binary in the CI feels uncomfortable.
Anything we can help with?

Contributor

@mknyszek mknyszek commented Aug 23, 2021

I think what this has to look like is switching to the 64-bit implementation on iOS (for all versions) but adding a case to limit the address space used for iOS < 14. I can't think of anything better, and this workaround is pretty unfortunate. The workaround would then disappear once we no longer supported iOS < 14.

I'll ask around this week and see if anyone else has any ideas. I'll also see if it's possible to make a small, safe change so this is fixable in a minor release. For a major release, I'm going to start considering eliminating the separate 32-bit implementation and just have every platform do the dynamic mapping. That should make this fix cleaner going forward.

Contributor

@mknyszek mknyszek commented Aug 23, 2021

Actually, I just thought of something better: a combination of what @rayvbr tried before. Something like: make iOS use the 64-bit implementation, but limit it to a 40-bit address space. This should prevent that Swift exception seen earlier, but hopefully is large enough to accommodate the addresses that we saw iOS trying to return earlier. This will likely use 2 MiB or so more address space in iOS <14, but luckily that's not that much and shouldn't impact most applications. This is a small change that I believe could be backported. @rayvbr @requilence Would y'all be willing to try this out?

Eventually, once old iOS versions are pronounced obsolete, we could promote iOS to a full 48-bit address space like every other arm64-based platform.

Contributor

@mknyszek mknyszek commented Aug 23, 2021

Err... I guess one snag is we don't actually know where that Swift exception at 39-bit addresses came from. The amount of additional address space used should only be O(MiB), so I don't fully understand why Swift would have trouble allocating pages. Given that it's the Go runtime that runs out of memory sometimes, perhaps that's independent and it's just chance that Swift does instead?

Anyway, let's try this patch and see how it goes.


@gopherbot gopherbot commented Aug 23, 2021

Change https://golang.org/cl/344401 mentions this issue: runtime: set iOS addr space to 40 bits with incremental pagealloc

Author

@rayvbr rayvbr commented Aug 23, 2021

Thanks @mknyszek, that sounds great!

> @rayvbr @requilence Would y'all be willing to try this out?

Of course!

Author

@rayvbr rayvbr commented Aug 31, 2021

Hey @mknyszek, we did a thorough set of tests on various devices, and I can report the following:

  • We can no longer reproduce the not in usable address space: base outside usable address space fatal error: memory reservation exceeds address space limit error when using https://golang.org/cl/344401. We can allocate up until the device limit without issues, either in Go or in Swift.
  • When having Go allocate memory that crosses the physical limit (e.g. 2.1GB on a non-Pro iPhone 12), by repeatedly allocating chunks of 10MB until it crashes, the panic thrown is out of memory allocating heap arena metadata.

I'm not sure if the panic thrown is the expected one or whether you would expect a regular out of memory panic? Either way, it definitely fixes the issues we've been having, and that is great news!

Unfortunately, we were not able to obtain an iOS 13 or earlier device to test on. Perhaps @requilence has one?

Contributor

@mknyszek mknyszek commented Aug 31, 2021

Huh. That's not quite the out of memory error I was expecting, but I guess since the arena size is 4 MiB that's the luck of the draw. Or, maybe it's actually more likely than I think, because I'm pretty sure that structure gets zeroed whereas each 10 MiB chunk isn't paged in yet (and we don't need to zero it because we assume the OS does for us).

It might be hard to hit the "real" out-of-memory error on any Darwin-based (or any UNIX-y) system, since the actual out of memory condition depends on what's paged in. In any case, I think this is a good sign.

RE: iOS 13, I'm pretty sure our builders are iOS 13 or below. There's currently a failure with my patch, but it's purely test related (a test I forgot to update). As far as I can tell, it works.

Author

@rayvbr rayvbr commented Sep 1, 2021

Sounds great. I'm not sure what the criteria are for adding something to a minor release vs. waiting for the next major release. Would you say this qualifies for a subsequent 1.17 minor release?

Contributor

@mknyszek mknyszek commented Sep 1, 2021

@gopherbot Please open backport issues for 1.16 and 1.17.

I think so. This failure doesn't seem to have a reasonable workaround (use less memory?) and otherwise means iOS >=14 Go apps can just break.


@gopherbot gopherbot commented Sep 1, 2021

Backport issue(s) opened: #48115 (for 1.16), #48116 (for 1.17).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.


@requilence requilence commented Sep 6, 2021

@mknyszek, this patch works for us. BTW, I have only iOS 14 devices right now

Contributor

@mknyszek mknyszek commented Oct 4, 2021

Sorry for the delay here. There's some problem with the patch on our builders, and I'm having trouble accessing the builders to identify the issue. Filed #48772.

Author

@rayvbr rayvbr commented Oct 15, 2021

Thanks @mknyszek. What are the chances of this making it into the next minor release? Or do you expect it will have to wait for 1.18?
