Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

internal/bytealg: valgrind reports invalid reads by C.GoString #27610

Open
kivikakk opened this issue Sep 11, 2018 · 17 comments
Open

internal/bytealg: valgrind reports invalid reads by C.GoString #27610

kivikakk opened this issue Sep 11, 2018 · 17 comments
Labels
help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@kivikakk
Copy link

What version of Go are you using (go version)?

go version go1.11 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOCACHE="/home/kivikakk/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/kivikakk/go"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build515123667=/tmp/go-build -gno-record-gcc-switches"

What did you do?

package main

/*
#include <string.h>
#include <stdlib.h>
char* s() {
        return strdup("hello");
}
*/
import "C"
import "unsafe"

func main() {
        s := C.s()
        C.GoString(s)
        C.free(unsafe.Pointer(s))
}
$ go build
$ valgrind ./sscce
==11241== Memcheck, a memory error detector
==11241== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==11241== Using Valgrind-3.12.0.SVN and LibVEX; rerun with -h for copyright info
==11241== Command: ./sscce
==11241==
==11241== Warning: ignored attempt to set SIGRT32 handler in sigaction();
==11241==          the SIGRT32 signal is used internally by Valgrind
==11241== Warning: ignored attempt to set SIGRT32 handler in sigaction();
==11241==          the SIGRT32 signal is used internally by Valgrind
==11241== Warning: client switching stacks?  SP change: 0xfff0001b0 --> 0xc0000367d8
==11241==          to suppress, use: --max-stackframe=755931244072 or greater
==11241== Warning: client switching stacks?  SP change: 0xc000036790 --> 0xfff000260
==11241==          to suppress, use: --max-stackframe=755931243824 or greater
==11241== Warning: client switching stacks?  SP change: 0xfff000260 --> 0xc000036790
==11241==          to suppress, use: --max-stackframe=755931243824 or greater
==11241==          further instances of this message will not be shown.
==11241== Conditional jump or move depends on uninitialised value(s)
==11241==    at 0x40265B: indexbytebody (/usr/local/go/src/internal/bytealg/indexbyte_amd64.s:151)
==11241==
==11241==
==11241== HEAP SUMMARY:
==11241==     in use at exit: 1,200 bytes in 6 blocks
==11241==   total heap usage: 10 allocs, 4 frees, 1,310 bytes allocated
==11241==
==11241== LEAK SUMMARY:
==11241==    definitely lost: 0 bytes in 0 blocks
==11241==    indirectly lost: 0 bytes in 0 blocks
==11241==      possibly lost: 1,152 bytes in 4 blocks
==11241==    still reachable: 48 bytes in 2 blocks
==11241==         suppressed: 0 bytes in 0 blocks
==11241== Rerun with --leak-check=full to see details of leaked memory
==11241==
==11241== For counts of detected and suppressed errors, rerun with: -v
==11241== Use --track-origins=yes to see where uninitialised values come from
==11241== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

(Ignore the "possibly lost" blocks; they're pthreads started by the Go runtime.)

What did you expect to see?

No conditional jump/move depending on uninitialised values.

What did you see instead?

A conditional jump/move depending on uninitialised values.


The nature of the issue becomes more apparent if you run Valgrind with --partial-loads-ok=no:

$ valgrind --partial-loads-ok=no ./sscce
==11376== Memcheck, a memory error detector
==11376== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==11376== Using Valgrind-3.12.0.SVN and LibVEX; rerun with -h for copyright info
==11376== Command: ./sscce
==11376==
==11376== Warning: ignored attempt to set SIGRT32 handler in sigaction();
==11376==          the SIGRT32 signal is used internally by Valgrind
==11376== Warning: ignored attempt to set SIGRT32 handler in sigaction();
==11376==          the SIGRT32 signal is used internally by Valgrind
==11376== Warning: client switching stacks?  SP change: 0xfff0001b0 --> 0xc0000367d8
==11376==          to suppress, use: --max-stackframe=755931244072 or greater
==11376== Warning: client switching stacks?  SP change: 0xc000036790 --> 0xfff000260
==11376==          to suppress, use: --max-stackframe=755931243824 or greater
==11376== Warning: client switching stacks?  SP change: 0xfff000260 --> 0xc000036790
==11376==          to suppress, use: --max-stackframe=755931243824 or greater
==11376==          further instances of this message will not be shown.
==11376== Invalid read of size 32
==11376==    at 0x40264E: indexbytebody (/usr/local/go/src/internal/bytealg/indexbyte_amd64.s:148)
==11376==  Address 0x53f47c0 is 0 bytes inside a block of size 12 alloc'd
==11376==    at 0x4C2BBAF: malloc (vg_replace_malloc.c:299)
==11376==    by 0x45165D: s (main.go:7)
==11376==    by 0x4516A5: _cgo_a004886745c9_Cfunc_s (cgo-gcc-prolog:54)
==11376==    by 0x44A0DF: runtime.asmcgocall (/usr/local/go/src/runtime/asm_amd64.s:637)
==11376==    by 0x7: ???
==11376==    by 0x6C287F: ??? (in /home/kivikakk/sscce/sscce)
==11376==    by 0xFFF00024F: ???
==11376==    by 0x4462B1: runtime.(*mcache).nextFree.func1 (/usr/local/go/src/runtime/malloc.go:749)
==11376==    by 0x448905: runtime.systemstack (/usr/local/go/src/runtime/asm_amd64.s:351)
==11376==    by 0x4283BF: ??? (/usr/local/go/src/runtime/proc.go:1146)
==11376==    by 0x448798: runtime.rt0_go (/usr/local/go/src/runtime/asm_amd64.s:201)
==11376==    by 0x451DEF: ??? (in /home/kivikakk/sscce/sscce)
==11376==
==11376==
==11376== HEAP SUMMARY:
==11376==     in use at exit: 1,200 bytes in 6 blocks
==11376==   total heap usage: 10 allocs, 4 frees, 1,316 bytes allocated
==11376==
==11376== LEAK SUMMARY:
==11376==    definitely lost: 0 bytes in 0 blocks
==11376==    indirectly lost: 0 bytes in 0 blocks
==11376==      possibly lost: 1,152 bytes in 4 blocks
==11376==    still reachable: 48 bytes in 2 blocks
==11376==         suppressed: 0 bytes in 0 blocks
==11376== Rerun with --leak-check=full to see details of leaked memory
==11376==
==11376== For counts of detected and suppressed errors, rerun with: -v
==11376== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

I understand some work has been done to ensure IndexByte doesn't run over a page boundary (#24206), and so that this is unlikely to have a negative effect. Should I just add a suppression and call it a day?

{
   indexbytebody_loves_to_read
   Memcheck:Addr32
   fun:indexbytebody
}
@josharian
Copy link
Contributor

Thanks for filing an issue.

cc @TocarIP

@josharian josharian changed the title C.GoString performs invalid reads internal/bytealg: C.GoString performs invalid reads Sep 11, 2018
@josharian josharian added this to the Go1.12 milestone Sep 11, 2018
@josharian josharian added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 11, 2018
@randall77
Copy link
Contributor

There is code in IndexByte to avoid loading across a page boundary, and it might trip up runtime detectors like valgrind. But the error you're seeing is not one of the assembly lines at which that could happen. It's occurring in the 32+ byte section, where we never read outside the passed in range.

The place where the out-of-bounds read is generated is src/runtime/string.go:findnull. We do this:

	// pageSize is the unit we scan at a time looking for NULL.
	// It must be the minimum page size for any architecture Go
	// runs on. It's okay (just a minor performance loss) if the
	// actual system page size is larger than this value.
	const pageSize = 4096

	offset := 0
	ptr := unsafe.Pointer(s)
	// IndexByteString uses wide reads, so we need to be careful
	// with page boundaries. Call IndexByteString on
	// [ptr, endOfPage) interval.
	safeLen := int(pageSize - uintptr(ptr)%pageSize)

	for {
		t := *(*string)(unsafe.Pointer(&stringStruct{ptr, safeLen}))
		// Check one page at a time.
		if i := bytealg.IndexByteString(t, 0); i != -1 {
			return offset + i
		}
		// Move to next page
		ptr = unsafe.Pointer(uintptr(ptr) + uintptr(safeLen))
		offset += safeLen
		safeLen = pageSize
	}

We're essentially looking for a null one page at a time. We're passing a range to IndexByte that can range outside a small allocation. It's safe with respect to the virtual memory because we don't stray off the page, but I can certainly see how it would confuse Valgrind.

@kivikakk
Copy link
Author

Indeed; it's safe inasmuch as it won't run off a page, but it is reading uninitialised memory (in that that part of the page has not been 'allocated' to userspace by a malloc or similar), which Valgrind is catching. I understand that it's safe and that we don't have a problem here as such; it just seems a pity(?) that this kind of thing will crop up and dirty Valgrind results.

That said, the aforementioned pthread "possibly lost" allocations are also Valgrind noise. Maybe we could document suppressions/include a sample suppressions file? I suspect there are many more I've yet to hit, though.

@randall77
Copy link
Contributor

Yes, it would be nice if we had a clean Valgrind run, or had a way to induce one with a config file or something. We do this over-reading in a few places in the Go runtime. Normally the over-reading happens in the Go heap, which Valgrind presumably doesn't care about. I think findnull is probably the most common operation that would happen on C-allocated memory. But if someone makes a string or a slice pointing to C-allocated memory, it can happen on operations involving those strings/slices as well.

How does C handle this? I would think C library functions like strlen do the same thing we are doing.

@kivikakk
Copy link
Author

Yeah, that makes a lot of sense. I haven't looked into it yet, but I do assume the C library does this as well, and that Valgrind has its own strlen which doesn't trip itself up like this. (This looks like a report from before Valgrind knew about these kinds of optimisations: https://stackoverflow.com/a/3246124/499609)

@randall77
Copy link
Contributor

I think all of the cases in the Go stdlib would be handled by the Valgrind option --partial-loads-ok=yes, and I think that's been the default since 2015.
Could you check and see if that makes a difference? Your Valgrind looks like it is 2015 vintage.

@kivikakk
Copy link
Author

I was on 3.12.0 (Oct 2016); have just tried on the latest stable, 3.13.0, and the result is the same:

$ valgrind --partial-loads-ok=yes ./sscce
==15735== Memcheck, a memory error detector
==15735== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==15735== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==15735== Command: ./sscce
==15735==
==15735== Warning: ignored attempt to set SIGRT32 handler in sigaction();
==15735==          the SIGRT32 signal is used internally by Valgrind
==15735== Warning: ignored attempt to set SIGRT32 handler in sigaction();
==15735==          the SIGRT32 signal is used internally by Valgrind
==15735== Warning: client switching stacks?  SP change: 0x1fff0001f0 --> 0xc0000347d8
==15735==          to suppress, use: --max-stackframe=687211759080 or greater
==15735== Warning: client switching stacks?  SP change: 0xc000034790 --> 0x1fff0002a0
==15735==          to suppress, use: --max-stackframe=687211758832 or greater
==15735== Warning: client switching stacks?  SP change: 0x1fff0002a0 --> 0xc000034790
==15735==          to suppress, use: --max-stackframe=687211758832 or greater
==15735==          further instances of this message will not be shown.
==15735== Conditional jump or move depends on uninitialised value(s)
==15735==    at 0x40265B: indexbytebody (/usr/local/go/src/internal/bytealg/indexbyte_amd64.s:151)
==15735==
==15735==
==15735== HEAP SUMMARY:
==15735==     in use at exit: 1,176 bytes in 5 blocks
==15735==   total heap usage: 10 allocs, 5 frees, 1,316 bytes allocated
==15735==
==15735== LEAK SUMMARY:
==15735==    definitely lost: 0 bytes in 0 blocks
==15735==    indirectly lost: 0 bytes in 0 blocks
==15735==      possibly lost: 1,152 bytes in 4 blocks
==15735==    still reachable: 24 bytes in 1 blocks
==15735==         suppressed: 0 bytes in 0 blocks
==15735== Rerun with --leak-check=full to see details of leaked memory
==15735==
==15735== For counts of detected and suppressed errors, rerun with: -v
==15735== Use --track-origins=yes to see where uninitialised values come from
==15735== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

@ianlancetaylor ianlancetaylor changed the title internal/bytealg: C.GoString performs invalid reads internal/bytealg: valgrind reports invalid reads by C.GoString Dec 13, 2018
@ianlancetaylor
Copy link
Member

It doesn't seem like we should change how the Go library behaves only to make valgrind happy. We need some way to tell valgrind that this particular read of uninitialized memory is OK. Is there any way to do that?

@lberrymage
Copy link

This impact of this issue isn't limited to Valgrind warnings. It is currently causing production app crashes for Android users with MTE enabled. mullvad/mullvadvpn-app#6349 and mullvad/mullvadvpn-app#6727 is one example of this.

I don't think the Go runtime should be making these kinds of reads by default. The runtime is making the assumption that the target's allocator will permit reading beyond the allocation boundary, which is undefined behavior even if the read stays within the page. While some allocators permit this behavior, others do not, so that assumption is incorrect in both theory and practice. And as noted above, it's causing real-world issues.

@thestinger
Copy link

Reading outside the bounds of objects from other languages is much more than an uninitialized read. It's a memory safety violation which will trigger detection by memory tagging. Memory protections are not only enforced with page size granularity. Memory tagging provides 16 byte granularity memory protection and is not a break in compatibility. It was always undefined and unsafe to read outside the bounds of C objects.

This breaks Go code using these methods when memory tagging is being used. GrapheneOS uses heap memory tagging by default for the base OS and will be adopting stack allocation memory tagging too. It's currently opt-in for user installed apps due to compatibility issues in apps with memory safety violations occurring during regular use exactly like this one. Some of the memory safety violations we've found with it have turned out to be high severity remotely exploitable bugs including a couple exposed via Android's Bluetooth implementation while others like this one may seem benign but still break with finer grained memory safety violation detection.

Android will likely deploy memory tagging by default in asynchronous mode for a future target API level, which we're strongly advocating for doing. The main barrier is not performance but rather compatibility due to apps with memory safety violations. It's a shame that it hasn't happened already but it should happen.

Go fixing this will take a very long time to propagate to apps since they're bundling executables compiled with Go that are often not compiled with the latest Go release but it should still happen and should be prioritized. Android is moving ahead with advancing memory tagging in Android 15 and GrapheneOS will be enabling it by default for user installed apps. WireGuard is usually used via a userspace implementation written on Go for Android including as part of the official WireGuard app and other apps like Mullvad. This is likely not the only memory safety issue uncovered by MTE with the Go WireGuard implementation but it's the first one Mullvad narrowed down and they've worked around it.

@randall77
Copy link
Contributor

I don't think we ever got a definitive answer to these questions:

How does C handle this? I would think C library functions like strlen do the same thing we are doing.

I do assume the C library does this as well, and that Valgrind has its own strlen which doesn't trip itself up like this.

So, how does valgrind handle C's strlen and friends? Does valgrind have a special case for them? Do they replace the implementations entirely? Or something else? What parts of these solutions could Go use?

@thestinger
Copy link

thestinger commented Sep 30, 2024

There's no exception from hardware memory tagging for the C standard library and it can't do this either. Reading outside of 16 byte aligned blocks on 64-bit ARM is incompatible with any platform using MTE including Android on Pixels where MTE is available even with the stock OS as an opt-in developer feature.

Valgrind does special case the C standard library and since that's part of the platform it can do things that are incorrect elsewhere but it can't do what Go is doing here or it'd be broken in the same way with MTE. Reading in 16 byte aligned blocks wouldn't be broken but reading past any 16 byte alignment boundary before or after the allocation would be incompatible with MTE.

It doesn't particularly matter what Valgrind catches or doesn't catch, but it definitely does matter that this breaks with memory tagging because memory tagging is enforcement of the OS memory safety model and it's always correct with no false positives. Go is wrong to violate MTE boundaries on Android. The platform sets the rules and this isn't allowed regardless of what you consider to be an exception from undefined behavior rules. If Go was part of the Android platform, this would have been fixed already by Android because MTE is tested and officially supported just not enforced by default right now or available outside developer options outside GrapheneOS.

@thestinger
Copy link

Really, it's always technically wrong for anything side from the platform's own compiler and libc to do stuff like this. People can just get away with it until they don't. Go stopped getting away with it with memory tagging on ARMv8.5+ / ARMv9 where there's 16 byte granularity memory protection enforcement via tagging. That's the risk of relying on undefined behavior, and Go isn't part of the platform on Android so it's not up to Go. Android has official memory tagging support upstream that's usable on the stock Pixel OS on 8th and 9th generation Pixels. It's clearly intended to eventually be available enabled by default in production in the future so apps can at least opt-in to it..

Reading a whole page is odd. Reading a typical 64 byte cacheline size would be less odd, but normally the code would only read past the end in the vector size it's using which would be the word size for normal Linux kernel code, even the core string functions, and at most 16 bytes for typical ARM NEON code. SVE lacks mainstream usage. It'd still break reading 64 aligned bytes since MTE is 16 byte chunks, not 64 bytes like SPARC ADI.

MTE is not at all Android exclusive, Pixels are just the first platform deploying it in a usable form and GrapheneOS on Pixels is the first platform using it in production. It's a standard extension on ARMv8.5+ and ARMv9. Typical usage is tagging all heap allocations on the standard heap with random tags and certain exclusions for deterministic protection. Clang supports it for tagging stack allocations too. That means 16 byte granularity memory protection needs to be expected on 64-bit ARM so regardless of whether you care about undefined behavior, it will break to read objects from other languages outside their 16 byte naturally chunks on platforms using MTE. It can also be used within objects to whatever extent is desired but that's not typical usage for C and would break some correct code unlike the normal usage with no false positives.

@josharian
Copy link
Contributor

@thestinger please re-read @randall77’s comment. He’s not arguing with you. He’s looking for a solution. Asking about strlen is a natural place to start, because this function is strlen.

@prattmic
Copy link
Member

As far as we know, fixes to core Bionic libc and Linux kernel code weren't required for MTE since neither did something like that in the first place.

Bionic has special versions of strlen (and others) for MTE:

https://android.googlesource.com/platform/bionic.git/+/refs/heads/main/libc/arch-arm64/dynamic_function_dispatch.cpp#144

https://android.googlesource.com/platform/bionic.git/+/900d07d6a1f3e1eca8cdbb3b1db1ceeec0acc9e2

@prattmic
Copy link
Member

It's convoluted, but at least at the time of that initial commit, it looks like bionic just did strlen one byte at a time when MTE was enabled.

#define strlen strlen_mte
#include <upstream-openbsd/lib/libc/string/strlen.c>

https://android.googlesource.com/platform/bionic.git/+/900d07d6a1f3e1eca8cdbb3b1db1ceeec0acc9e2/libc/upstream-openbsd/lib/libc/string/strlen.c

@prattmic
Copy link
Member

prattmic commented Sep 30, 2024

The tip code in bionic is very difficult to follow, but I think it ultimately uses https://github.com/ARM-software/optimized-routines/blob/master/string/aarch64/strlen-mte.S, which is unsurprisingly written to read in 16 byte chunks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

7 participants