-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
internal/bytealg: valgrind reports invalid reads by C.GoString #27610
Comments
Thanks for filing an issue. cc @TocarIP |
There is code in The place where the out-of-bounds read is generated is src/runtime/string.go:findnull. We do this:
We're essentially looking for a null one page at a time. We're passing a range to IndexByte that can range outside a small allocation. It's safe with respect to the virtual memory because we don't stray off the page, but I can certainly see how it would confuse Valgrind. |
Indeed; it's safe inasmuch as it won't run off a page, but it is reading uninitialised memory (in that that part of the page has not been 'allocated' to userspace by a That said, the aforementioned pthread "possibly lost" allocations are also Valgrind noise. Maybe we could document suppressions/include a sample suppressions file? I suspect there are many more I've yet to hit, though. |
Yes, it would be nice if we had a clean Valgrind run, or had a way to induce one with a config file or something. We do this over-reading in a few places in the Go runtime. Normally the over-reading happens in the Go heap, which Valgrind presumably doesn't care about. I think How does C handle this? I would think C library functions like |
Yeah, that makes a lot of sense. I haven't looked into it yet, but I do assume the C library does this as well, and that Valgrind has its own |
I think all of the cases in the Go stdlib would be handled by the Valgrind option |
I was on 3.12.0 (Oct 2016); have just tried on the latest stable, 3.13.0, and the result is the same: $ valgrind --partial-loads-ok=yes ./sscce
==15735== Memcheck, a memory error detector
==15735== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==15735== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==15735== Command: ./sscce
==15735==
==15735== Warning: ignored attempt to set SIGRT32 handler in sigaction();
==15735== the SIGRT32 signal is used internally by Valgrind
==15735== Warning: ignored attempt to set SIGRT32 handler in sigaction();
==15735== the SIGRT32 signal is used internally by Valgrind
==15735== Warning: client switching stacks? SP change: 0x1fff0001f0 --> 0xc0000347d8
==15735== to suppress, use: --max-stackframe=687211759080 or greater
==15735== Warning: client switching stacks? SP change: 0xc000034790 --> 0x1fff0002a0
==15735== to suppress, use: --max-stackframe=687211758832 or greater
==15735== Warning: client switching stacks? SP change: 0x1fff0002a0 --> 0xc000034790
==15735== to suppress, use: --max-stackframe=687211758832 or greater
==15735== further instances of this message will not be shown.
==15735== Conditional jump or move depends on uninitialised value(s)
==15735== at 0x40265B: indexbytebody (/usr/local/go/src/internal/bytealg/indexbyte_amd64.s:151)
==15735==
==15735==
==15735== HEAP SUMMARY:
==15735== in use at exit: 1,176 bytes in 5 blocks
==15735== total heap usage: 10 allocs, 5 frees, 1,316 bytes allocated
==15735==
==15735== LEAK SUMMARY:
==15735== definitely lost: 0 bytes in 0 blocks
==15735== indirectly lost: 0 bytes in 0 blocks
==15735== possibly lost: 1,152 bytes in 4 blocks
==15735== still reachable: 24 bytes in 1 blocks
==15735== suppressed: 0 bytes in 0 blocks
==15735== Rerun with --leak-check=full to see details of leaked memory
==15735==
==15735== For counts of detected and suppressed errors, rerun with: -v
==15735== Use --track-origins=yes to see where uninitialised values come from
==15735== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0) |
It doesn't seem like we should change how the Go library behaves only to make valgrind happy. We need some way to tell valgrind that this particular read of uninitialized memory is OK. Is there any way to do that? |
This impact of this issue isn't limited to Valgrind warnings. It is currently causing production app crashes for Android users with MTE enabled. mullvad/mullvadvpn-app#6349 and mullvad/mullvadvpn-app#6727 is one example of this. I don't think the Go runtime should be making these kinds of reads by default. The runtime is making the assumption that the target's allocator will permit reading beyond the allocation boundary, which is undefined behavior even if the read stays within the page. While some allocators permit this behavior, others do not, so that assumption is incorrect in both theory and practice. And as noted above, it's causing real-world issues. |
Reading outside the bounds of objects from other languages is much more than an uninitialized read. It's a memory safety violation which will trigger detection by memory tagging. Memory protections are not only enforced with page size granularity. Memory tagging provides 16 byte granularity memory protection and is not a break in compatibility. It was always undefined and unsafe to read outside the bounds of C objects. This breaks Go code using these methods when memory tagging is being used. GrapheneOS uses heap memory tagging by default for the base OS and will be adopting stack allocation memory tagging too. It's currently opt-in for user installed apps due to compatibility issues in apps with memory safety violations occurring during regular use exactly like this one. Some of the memory safety violations we've found with it have turned out to be high severity remotely exploitable bugs including a couple exposed via Android's Bluetooth implementation while others like this one may seem benign but still break with finer grained memory safety violation detection. Android will likely deploy memory tagging by default in asynchronous mode for a future target API level, which we're strongly advocating for doing. The main barrier is not performance but rather compatibility due to apps with memory safety violations. It's a shame that it hasn't happened already but it should happen. Go fixing this will take a very long time to propagate to apps since they're bundling executables compiled with Go that are often not compiled with the latest Go release but it should still happen and should be prioritized. Android is moving ahead with advancing memory tagging in Android 15 and GrapheneOS will be enabling it by default for user installed apps. WireGuard is usually used via a userspace implementation written on Go for Android including as part of the official WireGuard app and other apps like Mullvad. This is likely not the only memory safety issue uncovered by MTE with the Go WireGuard implementation but it's the first one Mullvad narrowed down and they've worked around it. |
I don't think we ever got a definitive answer to these questions:
So, how does valgrind handle C's strlen and friends? Does valgrind have a special case for them? Do they replace the implementations entirely? Or something else? What parts of these solutions could Go use? |
There's no exception from hardware memory tagging for the C standard library and it can't do this either. Reading outside of 16 byte aligned blocks on 64-bit ARM is incompatible with any platform using MTE including Android on Pixels where MTE is available even with the stock OS as an opt-in developer feature. Valgrind does special case the C standard library and since that's part of the platform it can do things that are incorrect elsewhere but it can't do what Go is doing here or it'd be broken in the same way with MTE. Reading in 16 byte aligned blocks wouldn't be broken but reading past any 16 byte alignment boundary before or after the allocation would be incompatible with MTE. It doesn't particularly matter what Valgrind catches or doesn't catch, but it definitely does matter that this breaks with memory tagging because memory tagging is enforcement of the OS memory safety model and it's always correct with no false positives. Go is wrong to violate MTE boundaries on Android. The platform sets the rules and this isn't allowed regardless of what you consider to be an exception from undefined behavior rules. If Go was part of the Android platform, this would have been fixed already by Android because MTE is tested and officially supported just not enforced by default right now or available outside developer options outside GrapheneOS. |
Really, it's always technically wrong for anything side from the platform's own compiler and libc to do stuff like this. People can just get away with it until they don't. Go stopped getting away with it with memory tagging on ARMv8.5+ / ARMv9 where there's 16 byte granularity memory protection enforcement via tagging. That's the risk of relying on undefined behavior, and Go isn't part of the platform on Android so it's not up to Go. Android has official memory tagging support upstream that's usable on the stock Pixel OS on 8th and 9th generation Pixels. It's clearly intended to eventually be available enabled by default in production in the future so apps can at least opt-in to it.. Reading a whole page is odd. Reading a typical 64 byte cacheline size would be less odd, but normally the code would only read past the end in the vector size it's using which would be the word size for normal Linux kernel code, even the core string functions, and at most 16 bytes for typical ARM NEON code. SVE lacks mainstream usage. It'd still break reading 64 aligned bytes since MTE is 16 byte chunks, not 64 bytes like SPARC ADI. MTE is not at all Android exclusive, Pixels are just the first platform deploying it in a usable form and GrapheneOS on Pixels is the first platform using it in production. It's a standard extension on ARMv8.5+ and ARMv9. Typical usage is tagging all heap allocations on the standard heap with random tags and certain exclusions for deterministic protection. Clang supports it for tagging stack allocations too. That means 16 byte granularity memory protection needs to be expected on 64-bit ARM so regardless of whether you care about undefined behavior, it will break to read objects from other languages outside their 16 byte naturally chunks on platforms using MTE. It can also be used within objects to whatever extent is desired but that's not typical usage for C and would break some correct code unlike the normal usage with no false positives. |
@thestinger please re-read @randall77’s comment. He’s not arguing with you. He’s looking for a solution. Asking about strlen is a natural place to start, because this function is strlen. |
Bionic has special versions of strlen (and others) for MTE: https://android.googlesource.com/platform/bionic.git/+/900d07d6a1f3e1eca8cdbb3b1db1ceeec0acc9e2 |
It's convoluted, but at least at the time of that initial commit, it looks like bionic just did strlen one byte at a time when MTE was enabled.
|
The tip code in bionic is very difficult to follow, but I think it ultimately uses https://github.com/ARM-software/optimized-routines/blob/master/string/aarch64/strlen-mte.S, which is unsurprisingly written to read in 16 byte chunks. |
What version of Go are you using (
go version
)?go version go1.11 linux/amd64
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (
go env
)?What did you do?
(Ignore the "possibly lost" blocks; they're pthreads started by the Go runtime.)
What did you expect to see?
No conditional jump/move depending on uninitialised values.
What did you see instead?
A conditional jump/move depending on uninitialised values.
The nature of the issue becomes more apparent if you run Valgrind with
--partial-loads-ok=no
:I understand some work has been done to ensure
IndexByte
doesn't run over a page boundary (#24206), and so that this is unlikely to have a negative effect. Should I just add a suppression and call it a day?The text was updated successfully, but these errors were encountered: