Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: Invalid garbage collection of structure pointers #36569

Closed
abarisani opened this issue Jan 15, 2020 · 26 comments
Closed

runtime: Invalid garbage collection of structure pointers #36569

abarisani opened this issue Jan 15, 2020 · 26 comments

Comments

@abarisani
Copy link

@abarisani abarisani commented Jan 15, 2020

What version of Go are you using (go version)?

$ go version 1.13.6

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env

GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/lcars/.cache/go-build"
GOENV="/home/lcars/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/lcars/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build760597893=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Run the example at https://play.golang.org/p/RIMIZDWEcZT

What did you expect to see?

I expect the code not to panic.

What did you see instead?

The code panic because GC scrapes struct instances which should be preserved.

Description

The code at https://play.golang.org/p/RIMIZDWEcZT uses an AlignmentBuffer structure to allow selection of an aligned offset over a []byte array for casting a structure. (This is used for bare metal code operation within TamaGo, however the identified issue is not specific to bare metal and/or TamaGo code as it is reproducible with standard Go).

The dTD struct holds two pointers to AlignmentBuffer instances, what is witnessed in the main function is that allocating an array of []*dTD results, without losing a reference to it, for the *AlignmentBuffer pointer contents to be scraped out by GC.

This is unexpected behaviour.

In order to workaround this we had to keep the two *AlignmentBuffer pointers outside the dTD struct, as this doesn't trigger the issue.

@reusee

This comment was marked as outdated.

Copy link

@reusee reusee commented Jan 15, 2020

uintptr is not considered as pointer so converting uintptr field to unsafe.Pointer is not allowed in vet tools.
Objects allocated on the stack may be re-allocated in stack expansion, and uintptrs will not be updated to new addresses.
Check unsafe.Pointer's doc for more details: https://golang.org/pkg/unsafe/#Pointer

@cuonglm

This comment was marked as outdated.

Copy link
Contributor

@cuonglm cuonglm commented Jan 15, 2020

With tip and upcoming go1.14, you can try:

go build -gcflags=all=-d=checkptr main.go

Then run:

./main
building
fatal error: checkptr: unsafe pointer arithmetic

goroutine 1 [running]:
runtime.throw(0x10e9146, 0x23)
	/Users/cuonglm/sources/go/src/runtime/panic.go:1112 +0x72 fp=0xc00009a9d0 sp=0xc00009a9a0 pc=0x102e922
runtime.checkptrArithmetic(0xc0000b8000, 0x0, 0x0, 0x0)
	/Users/cuonglm/sources/go/src/runtime/checkptr.go:41 +0xb5 fp=0xc00009aa00 sp=0xc00009a9d0 pc=0x1005ae5
main.buildDTD(0x0, 0x1, 0x10e4600, 0xc00009acd8, 0x200, 0x200, 0x0, 0x9, 0x0)
	/Users/cuonglm/main.go:89 +0x74 fp=0xc00009aa40 sp=0xc00009aa00 pc=0x10ac464
main.main()
	/Users/cuonglm/main.go:115 +0xdc fp=0xc00009af88 sp=0xc00009aa40 pc=0x10ac6cc
runtime.main()
	/Users/cuonglm/sources/go/src/runtime/proc.go:203 +0x212 fp=0xc00009afe0 sp=0xc00009af88 pc=0x1030f72
runtime.goexit()
	/Users/cuonglm/sources/go/src/runtime/asm_amd64.s:1375 +0x1 fp=0xc00009afe8 sp=0xc00009afe0 pc=0x105b651

to see invalid unsafe.Pointer usage in your program.

@ALTree

This comment was marked as outdated.

Copy link
Member

@ALTree ALTree commented Jan 15, 2020

As pointed out, the crash is caused by incorrect usage of unsafe. Closing here.

@ALTree ALTree closed this Jan 15, 2020
@abarisani

This comment has been minimized.

Copy link
Author

@abarisani abarisani commented Jan 15, 2020

One thing has nothing to do with the other.

The panic which I trigger in the test code is to show that the GC is collecting something that it shouldn't (imho), the "unsafe pointer arithmetic" fatal error in go1.14, triggered by "checkptr" merely prevents the test case for an entirely different reason.

I don't think the unsafe pointer use is, or should be related, with the GC behaviour. Or in other words why does the unsafe.Pointer use flags the entire NewAlignmentBuffer for clearing by the GC, while it is kept around? Is this expected at all?

Thanks

@empijei

This comment has been minimized.

Copy link
Contributor

@empijei empijei commented Jan 15, 2020

If a uintptr cannot be converted to an unsafe.Pointer, why does the documentation specifically states that it is allowed? Is this a doc error?

image

Also I don't think this is a stack growing problem here.

@abarisani

This comment was marked as outdated.

Copy link
Author

@abarisani abarisani commented Jan 15, 2020

uintptr is not considered as pointer so converting uintptr field to unsafe.Pointer is not allowed in vet tools.
Objects allocated on the stack may be re-allocated in stack expansion, and uintptrs will not be updated to new addresses.
Check unsafe.Pointer's doc for more details: https://golang.org/pkg/unsafe/#Pointer

But how come this affects the contents of the allocated Buf []byte within NewAlignmentBuffer ? I cast over that byte array specifically to do so on an object which is not a uintptr, not unsafe and that it's explicitly kept around. This is what confuses me.

@ALTree

This comment has been minimized.

Copy link
Member

@ALTree ALTree commented Jan 15, 2020

@abarisani In general, a program that violates the unsafe.Pointer rules described here is not guaranteed to behave correctly, or work, at all; there is also no guarantee that the program will crash at a specific point.

If you can provide a crasher that does not violates the unsafe.Pointer rules, then it'll be investigated.

Re-opening in waiting for info in the meantime.

@ALTree ALTree reopened this Jan 15, 2020
@abarisani

This comment was marked as outdated.

Copy link
Author

@abarisani abarisani commented Jan 15, 2020

@abarisani In general, a program that violates the unsafe.Pointer rules described here is not guaranteed to behave correctly, or work, at all; there is also no guarantee that the program will crash at a specific point.

If you can provide a crasher that does not violates the unsafe.Pointer rules, then it'll be investigated.

Re-opening in waiting for info in the meantime.

Which rule specifically is violated?

@cuonglm

This comment was marked as outdated.

Copy link
Contributor

@cuonglm cuonglm commented Jan 15, 2020

uintptr is not considered as pointer so converting uintptr field to unsafe.Pointer is not allowed in vet tools.
Objects allocated on the stack may be re-allocated in stack expansion, and uintptrs will not be updated to new addresses.
Check unsafe.Pointer's doc for more details: https://golang.org/pkg/unsafe/#Pointer

But how come this affects the contents of the allocated Buf []byte within NewAlignmentBuffer ? I cast over that byte array specifically to do so on an object which is not a uintptr, not unsafe and that it's explicitly kept around. This is what confuses me.

Because you did invalid unsafe.Pointer arithmetic, the result can lead to invalid memory the pointer points to.

@ALTree

This comment has been minimized.

Copy link
Member

@ALTree ALTree commented Jan 15, 2020

@empijei

The doc says:

There are four special operations available for type Pointer

that means: "these 4 operation are theoretically allowed", in the sense that the program will compile. Then it goes on with an important caveat:

The following patterns involving Pointer are valid. Code not using these patterns is likely to be invalid today or to become invalid in the future.

This means that even if in general a uintptr -> Pointer conversion will compile, if it doesn't follow one of the allowed patterns, then it's not valid.

@abarisani

At line 89:

dtd = (*dTD)(unsafe.Pointer(dtdBuf.Addr))

The rule listing the allowed patterns says (pattern n. 2):

(2) Conversion of a Pointer to a uintptr (but not back to Pointer).
[...]
Conversion of a uintptr back to Pointer is not valid in general.

You are doing this on line 89, since dtdBuf.Addr is an uintptr created from an unsafe.Pointer in NewAlignmentBuffer.

It goes on:

Conversion of a uintptr back to Pointer is not valid in general.
[...]
The remaining patterns enumerate the only valid conversions from uintptr to Pointer.

I don't think your usage is aligned to any of the listed patterns:

  • (3) Conversion of a Pointer to a uintptr and back, with arithmetic. (Note that both conversions must appear in the same expression)
  • (4) Conversion of a Pointer to a uintptr when calling syscall.Syscall.
  • (5) Conversion of the result of reflect.Value.Pointer or reflect.Value.UnsafeAddr from uintptr to Pointer.
  • (6) Conversion of a reflect.SliceHeader or reflect.StringHeader Data field to or from Pointer.
@abarisani

This comment has been minimized.

Copy link
Author

@abarisani abarisani commented Jan 15, 2020

By the way, if I change the code so that the two *AlignmentBuffer pointer are returned by buildDTD, rather than being embedded in the dTD structure, and I keep them around in main then the GC does not clean them up.

@abarisani

This comment has been minimized.

Copy link
Author

@abarisani abarisani commented Jan 15, 2020

This example address the warning but exhibits the same behaviour: https://play.golang.org/p/QrNrPHkk6Un

@ALTree ALTree changed the title Invalid garbage collection of structure pointers. runtime: Invalid garbage collection of structure pointers Jan 15, 2020
@ALTree

This comment has been minimized.

Copy link
Member

@ALTree ALTree commented Jan 15, 2020

Thanks for fixing the reproducer.

I can reproduce the crash both on 1.13 and tip. I also note that when ran with GOGC=off, the panic at the end of the program does not fire.

cc @aclements @mknyszek

@abarisani

This comment has been minimized.

Copy link
Author

@abarisani abarisani commented Jan 15, 2020

Thanks! Just for reference you can see here how we worked around the issue for now:

https://github.com/f-secure-foundry/tamago/blob/master/imx6/usb/endpoint.go#L183

If we remove the indirection and keep those pointers around without embedding them in the structure (i.e. return structure + 2 points rather than structure with those 2 pointers embedded within) then everything works just fine and the GC doesn't give us troubles.

Let me know if there is anything more that I can do to help.

@ALTree

This comment has been minimized.

Copy link
Member

@ALTree ALTree commented Jan 15, 2020

Rule (1) says

(1) Conversion of a *T1 to Pointer to *T2.

Provided that T2 is no larger than T1 and that the two share an equivalent memory layout, this conversion allows reinterpreting data of one type as data of another type.

This means that on the

dtd = (*dTD)(unsafe.Pointer(&dtdBuf.Buf[dtdBuf.Offset]))

line, dTD and &dtdBuf.Buf[dtdBuf.Offset] need to "share an equivalent memory layout" for the conversion to be valid. Is this the case? The two types are:

type dTD struct {
	next   *dTD
	token  uint32
	buffer [5]uintptr

	buf   *AlignmentBuffer
	pages *AlignmentBuffer
}
type AlignmentBuffer.Buf []byte

In particular, dTD has pointer fields, while Buf is just a []byte.

Maybe cc @ianlancetaylor too.

@abarisani

This comment has been minimized.

Copy link
Author

@abarisani abarisani commented Jan 15, 2020

It depends on what "equivalent memory layout" means. the size of Buf is initialized so that it can always hold dTD, is that sufficient to satisfy this condition?

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Jan 15, 2020

I agree with @ALTree. In the example from #36569 (comment), this line is invalid:

	dtd = (*dTD)(unsafe.Pointer(&dtdBuf.Buf[dtdBuf.Offset]))

dTD and [...]byte do not share an equivalent memory layout.

The code mentioned in #36569 (comment) also appears to be invalid at first glance.

Go, unlike C, is a memory managed language. You can't in general treat memory allocated as one type as being memory of a different type. In particular, if you allocate memory with a pointer type, then you must never store a non-pointer there, and if you allocate memory with a non-pointer type, then you must never store a pointer there.

I'm going to close this again, but please comment if you disagree.

@ALTree

This comment has been minimized.

Copy link
Member

@ALTree ALTree commented Jan 15, 2020

For the record, mdempsky also proposed to better clarify what "equivalent memory layout" means, in issue #16807, but the proposal was rejected. Discussion there and in the linked issue/golang-dev thread may also be of interest.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Jan 15, 2020

Sorry, I missed the last couple of comments. "Equivalent memory layout" essentially means that the shared prefix of memory is exactly the same type. If you are converting from one struct type to another, the shared fields need to have the same type.

@abarisani

This comment has been minimized.

Copy link
Author

@abarisani abarisani commented Jan 15, 2020

So I guess this line:

dst := (*Dirent)(unsafe.Pointer(&b[0]))

from https://golang.org/src/syscall/fs_nacl.go which casts a Dirent (https://golang.org/pkg/syscall/#Dirent) over a byte array, is also incorrect?

Or is this an issue only when we place pointers within the byte array?

In other word is casting a structure over a byte array fine as long as there are no Go pointers/references inside of it? Because I see casting structs over byte arrays in several occasions (I can find other examples if you like).

I'd also like to point out that our code works just fine with the workaround I mentioned which doesn't trigger the GC, this has been tested with real hardware and we are actually implementing drivers in Go by using similar techniques and it all works fine.

I understand that Go does not guarantee this to continue to work in the future, but I just want to ensure that this GC issue we are witnessing is not a symptom of another issue which is being hidden by instead pointing the finger on how we cast structs over byte arrays.

It is important for our project to understand if GC has issues in general and, if not, what are the rules that we must follow when casting structures over byte arrays (which we must do), as from this issue we stumbled upon it's not quite clear (at least to me).

Thanks for your patience and help!

@abarisani

This comment has been minimized.

Copy link
Author

@abarisani abarisani commented Jan 15, 2020

Another example for reference:

rsa := (*RawSockaddr)(unsafe.Pointer(&b[0]))

@beoran

This comment has been minimized.

Copy link

@beoran beoran commented Jan 15, 2020

@ianlancetaylor

Currently, unsafe pointers can also be used to cast different structs which simply have the same size, this is used in the go standard library as well (e.g., math). See an example here:

https://play.golang.org/p/Vn6PlzmMgHj
EDIT:
https://play.golang.org/p/1FWFs0cx08S is better.

I think the problem only arises when you have pointers inside the struct, the GC doesn't know about those and may not be able to see that you are keeping references by using them.

@abarisani

This comment has been minimized.

Copy link
Author

@abarisani abarisani commented Jan 15, 2020

The GC should know about them as we are keeping a reference to the cast structure around. Or at least that would be my expectation. The cast structure is kept in scope if you check the playground example.

The only difference between the non working and working example is that in the 1st case only the dTD structure is kept in scope, and that's the one that holds the two references to the other 2 structures. In the 2nd case dTD and the other two pointers are kept at the same level.

If the code would be invalid I'd expect none of the examples to work, and not just the second one.

I'd also expect GC to disregard anything which is dereferenced and "hidden" in a byte array that was once upon a time cast to a (now lost) structure. However this is not the case here, all byte arrays which we are casting on as well as all the casted structures are kept referenced.

Hence my confusion and willingness to understand whether this is a true GC bug, or my incorrect interpretation on what should and shouldn't work. In any case it would be nice to have a clear rule to follow so that I can adjust my API accordingly.

@randall77

This comment has been minimized.

Copy link
Contributor

@randall77 randall77 commented Jan 15, 2020

So I guess this line:

dst := (*Dirent)(unsafe.Pointer(&b[0]))
from https://golang.org/src/syscall/fs_nacl.go which casts a Dirent (https://golang.org/pkg/syscall/#Dirent) over a byte array, is also incorrect?

Yes, that code violates the rule. I wouldn't call it a major violation, as both Dirent and the byte slice backing store have no pointers in them. But doing casts like this can have Go version and platform dependent behavior (especially in this case, layout and alignment issues). Fortunately in package syscall in the stdlib, we can control for both platform and Go version.

In other word is casting a structure over a byte array fine as long as there are no Go pointers/references inside of it? Because I see casting structs over byte arrays in several occasions (I can find other examples if you like).

It's not fine according to the rules. You might run into issues doing this. But you're right that the damage should be limited. I do not expect you would run into GC issues because no pointers are involved. If you're prepared to accept possible Go version and platform-specific behavior on operations involving just that structure, go ahead.

Really though, you should write encoders and decoders for your data structure to and from []byte. That will ensure you avoid this issue altogether. For instance, in fs_nacl.go we should really do:

import "encoding/binary"
binary.LittleEndian.PutUint64(b[0:8], uint64(src.inode.Ino))
binary.LittleEndian.PutUint64(b[8:16], offset)

and so on. It will even generate the exact same code on architectures that support unaligned writes.

I understand that Go does not guarantee this to continue to work in the future, but I just want to ensure that this GC issue we are witnessing is not a symptom of another issue which is being hidden by instead pointing the finger on how we cast structs over byte arrays.

If you are casting pointer-containing structures onto a byte array, then I can definitely see that causing GC issues. If you have a program that just casts between pointer-free types and still has garbage collection problems, I'd like to see it.

@abarisani

This comment has been minimized.

Copy link
Author

@abarisani abarisani commented Jan 15, 2020

Understood, thanks for now!

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Jan 15, 2020

You must not allocate memory as a non-pointer type and then store a pointer value there.

You must not allocate memory as a pointer type and then store a non-pointer value there.

Other fiddling with memory is likely to work with the current implementation but is not guaranteed for the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
8 participants
You can’t perform that action at this time.