New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: msan cannot handle structs with padding #26167

Open
benesch opened this Issue Jul 1, 2018 · 17 comments

Comments

Projects
None yet
4 participants
@benesch
Contributor

benesch commented Jul 1, 2018

What version of Go are you using (go version)?

go version devel +0dc814c Sat Jun 30 01:04:30 2018 +0000 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

linux/amd64

What did you do?

Attempted to dereference a C struct with padding from Go with the memory sanitizer enabled:

// msan.go

package main

/*
#include <stdlib.h>

struct s {
 	int i;
	char c;
};

struct s* mks(void) {
	struct s* s = malloc(sizeof(struct s));
	s->i = 0xdeadbeef;
	s->c = 'n';
	return s;
}
*/
import "C"
import "fmt"

func main() {
	s := *C.mks()
	fmt.Println(s.c)
}

I compiled with:

CC=clang-6.0 CXX=clang++-6.0 go build -msan -o msan msan.go

Upon execution, msan crashes spuriously:

~/go1.10/misc/cgo/testsanitizers/src/foo$ ./msan
Uninitialized bytes in __msan_check_mem_is_initialized at offset 5 inside [0x701000000000, 8)
==21637==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x4e8b08  (/home/benesch/go1.10/misc/cgo/testsanitizers/src/foo/msan+0x4e8b08)

SUMMARY: MemorySanitizer: use-of-uninitialized-value (/home/benesch/go1.10/misc/cgo/testsanitizers/src/foo/msan+0x4e8b08) 
Exiting

The problem appears to be that the Go instrumentation is coarser than the C instrumentation. Only bytes 0-5 are marked as initialized by C (bytes 6-8 are padding), but Go asks msan to verify that all 8 bytes are initialized when it stores s.

In this particular example, there are two easy ways to soothe msan. The first is to remove the padding from the struct (e.g., struct s { int i; int c }). The second is to access fields within the struct without storing it into a temporary (e.g., fmt.Println(C.mks().c)). Neither of these "workarounds" are viable for real programs.

@benesch

This comment has been minimized.

Contributor

benesch commented Jul 1, 2018

Well, this patch makes the test pass but causes a slew of problems on real programs:

--- a/src/cmd/compile/internal/gc/ssa.go
+++ b/src/cmd/compile/internal/gc/ssa.go
@@ -600,7 +600,13 @@ func (s *state) instrument(t *types.Type, addr *ssa.Value, wr bool) {
        var fn *obj.LSym
        needWidth := false
 
-       if flag_msan {
+       if flag_msan && t.Etype == TSTRUCT {
+               for _, f := range t.FieldSlice() {
+                       addr = s.newValue1I(ssa.OpOffPtr, types.NewPtr(t), f.Offset, addr)
+                       s.instrument(f.Type, addr, wr)
+               }
+               return
+       } else if flag_msan {
                fn = msanread
                if wr {
                        fn = msanwrite

Hmm.

@ALTree ALTree added this to the Go1.12 milestone Jul 1, 2018

@ALTree

This comment has been minimized.

Member

ALTree commented Jul 1, 2018

I assume this also happens with go1.10 (so it's not a recent regression)?

@benesch

This comment has been minimized.

Contributor

benesch commented Jul 1, 2018

Yep. Just verified that it happens with go1.10.3. Given that this case isn't tested by any of the sample programs in misc/cgo/testsanitizers I wouldn't be surprised if it's been broken since msan support was introduced.

@benesch

This comment has been minimized.

Contributor

benesch commented Jul 1, 2018

Oh boy. This doesn't seem particularly easy to fix. LLVM's msan transformation (https://github.com/llvm-mirror/llvm/blob/6fed3d8070003bd65e0af3b70aaa04f7ca12260e/lib/Transforms/Instrumentation/MemorySanitizer.cpp#L1431) is more nuanced than gc's. Specifically, LLVM doesn't consider a load or a store of uninitialized memory to be problematic. It's what you do afterwards that counts. Your program won't blow up until e.g. you perform arithmetic using uninitialized memory or index into an array using uninitialized memory.

@ianlancetaylor

This comment has been minimized.

Contributor

ianlancetaylor commented Jul 1, 2018

Perhaps we could modify gc to only call msanread/msanwrite for non-padding bytes when copying struct types.

@ianlancetaylor ianlancetaylor changed the title from runtime/msan: cannot handle structs with padding to cmd/compile: msan cannot handle structs with padding Jul 1, 2018

@benesch

This comment has been minimized.

Contributor

benesch commented Jul 1, 2018

@ianlancetaylor

This comment has been minimized.

Contributor

ianlancetaylor commented Jul 1, 2018

Sorry, I didn't even know the compiler added padding fields at all. In widstruct in cmd/compile/internal/gc/align.go it just sets offsets based on alignment, implicitly adding padding.

@benesch

This comment has been minimized.

Contributor

benesch commented Jul 1, 2018

Oh! Perhaps it's cmd/cgo that's to blame. Thanks. I'll keep digging.

@ianlancetaylor

This comment has been minimized.

Contributor

ianlancetaylor commented Jul 1, 2018

Ah, yes, cmd/cgo does add padding fields to the Go version of C structs in (*typeconv).pad in cmd/cgo/gcc.go.

@benesch

This comment has been minimized.

Contributor

benesch commented Jul 1, 2018

Ok, great. Excluding the blank identifiers that cmd/cgo writes from msan instrumentation lets Cockroach get even further in boot.

In other news, I've stumbled across an edge case that I'm not sure what to make of. Consider a program that uses an ostensibly-initialized C struct as a Go map key:

// msan.go

package main

/*
#include <stdlib.h>

struct s {
 	int i;
	char c;
};

struct s* mks(void) {
	struct s* s = malloc(sizeof(struct s));
	s->i = 0xdeadbeef;
	s->c = 'n';
	return s;
}
*/
import "C"

func main() {
	m := map[C.struct_s]struct{}{}
	m[*C.mks()] = struct{}{}
}

Go will read the uninitialized padding bytes to hash the struct for the map key. Yikes! Though perhaps a valid answer here is "don't do that."

@josharian

This comment has been minimized.

Contributor

josharian commented Jul 1, 2018

I’m AFK but I believe Syms have an IsBlank method.

@josharian

This comment has been minimized.

Contributor

josharian commented Jul 1, 2018

Go will read the uninitialized padding bytes to hash the struct for the map key.

The autogenerated hash and eq functions should ignore blank fields in structs. If they don’t, it’s a bug.

@benesch

This comment has been minimized.

Contributor

benesch commented Jul 2, 2018

The autogenerated hash and eq functions should ignore blank fields in structs. If they don’t, it’s a bug.

Ah. Then perhaps mapassign and friends are simply reporting too coarsely to msan. Maybe here:

msanread(key, t.key.size)

@benesch

This comment has been minimized.

Contributor

benesch commented Jul 2, 2018

I’m AFK but I believe Syms have an IsBlank method.

Yep, thanks. I've included the patch I've been experimenting with below. (I'm sure there's somewhere better to put this code.)

diff --git a/src/cmd/compile/internal/gc/ssa.go b/src/cmd/compile/internal/gc/ssa.go
index ff2b93d..234954d 100644
--- a/src/cmd/compile/internal/gc/ssa.go
+++ b/src/cmd/compile/internal/gc/ssa.go
@@ -600,6 +600,16 @@ func (s *state) instrument(t *types.Type, addr *ssa.Value, wr bool) {
        var fn *obj.LSym
        needWidth := false
 
+       if flag_msan && t.IsStruct() {
+               for _, f := range t.FieldSlice() {
+                       if !f.Sym.IsBlank() {
+                               faddr := s.newValue1I(ssa.OpOffPtr, f.Type.PtrTo(), f.Offset, addr)
+                               s.instrument(f.Type, faddr, wr)
+                       }
+               }
+               return
+       }
+
        if flag_msan {
                fn = msanread
                if wr {
@josharian

This comment has been minimized.

Contributor

josharian commented Jul 2, 2018

Then perhaps mapassign and friends are simply reporting too coarsely to msan

Looks like it. Perhaps there should be an msanreadtype that takes in type information and uses it to avoid reading blank fields.

Also, not that it matters for this conversation, but it appears to me (on my phone) that the msanread call there could be moved inside the “if map is nil” return path. Might improve perf.

The ideal way to share a patch is gerrit. Or a PR (which gets imported to gerrit).

@benesch

This comment has been minimized.

Contributor

benesch commented Jul 2, 2018

Looks like it. Perhaps there should be an msanreadtype that takes in type information and uses it to avoid reading blank fields.

Yep, definitely an option. Or the calls to msanread could be moved closer to the actual memory reads in the hash functions.

@josharian

This comment has been minimized.

Contributor

josharian commented Jul 2, 2018

I believe the msanread call is there only to handle the case in which we don’t call the hash function, which should have its own instrumentation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment