proposal: runtime.Pin() to protect against hypothetical moving GC in the future #40431

Open
seebs opened this issue Jul 27, 2020 · 6 comments

Comments

@seebs
Contributor

@seebs seebs commented Jul 27, 2020

What version of Go are you using (go version)?

N/A, but mostly 1.14

Does this issue reproduce with the latest release?

No. It is only expected to reproduce some day in the future. This is sort of a future-proofing thing, where there's been a lot of thought and effort going into writing correct code against a hypothetical future moving implementation of GC, but since we don't have one, we don't know whether we're succeeding...

So, in general, right now, the Go GC does not move things, and we know it won't move things. However, it could, so we try to write code that's robust against this. I was previously thinking about this in a now-declined proposal, #32115, which I think was approaching it in the wrong way.

On further study of some of the use cases for this, I've concluded that there is an actual general need for a way to express "this memory has to not-move because something somewhere will be relying on this address in a way the runtime can't do anything about". One example is pointers in parameters passed to ioctl calls. Some ioctl calls take data structures which may contain pointers, and the Go runtime can't really be expected to know about all such potential things. Worse, by the time the ioctl call starts and Go knows it should be pinning things, the things indirectly referenced by such a data structure could already have been moved.

But there's another potential case, which is alignment. So far as I can tell, while Go presumably preserves its own alignment requirements on data, that doesn't mean that it preserves anything else's alignment requirements for data. Consider a []byte which is being used to store a buffer of raw data which can be type-punned and interpreted as some other data structure. Go has no way of detecting that the alignment requirements are significant.

pkg/unsafe says:

It is valid both to add and to subtract offsets from a pointer in this way. It is also valid to use &^ to round pointers, usually for alignment. In all cases, the result must continue to point into the original allocated object.

So imagine that I'm doing something which interacts, possibly via cgo, with hardware. I might actually care about alignment larger than anything Go cares about. For instance, say I want a 16KB block that is 16KB-aligned. I can do that in Go:

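// Over-allocate so that a 16KB-aligned 16KB window must exist inside b.
b := make([]byte, 32768)
// Take the last address at which such a window could start, then round down.
p := unsafe.Pointer(&b[16383])
p = unsafe.Pointer(uintptr(p)&^16383)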

So far as I know, this gives me a valid unsafe.Pointer which definitely refers to 16KB of 16KB-aligned memory... Except that a hypothetical moving implementation might move b. And I assume it's smart enough that it can detect that p is a pointer into that same object, and fix it, but it's less obvious to me that the new location will definitely be aligned the same way.

There's only a handful of cases where this applies, and in the absence of an actual moving implementation, it's obviously trivial to "implement" the desired functionality that things don't move while pinned. On the other hand, some of these things are so far as I can tell impossible to write safely without something like pin. (You can get some of the desired behavior by using mmap to request anonymous regions that runtime doesn't manage, but then you have to be sure you're never storing anything in those regions that contains pointers into runtime-managed memory that could be the only pointers to those things...)

An approximate sketch of an API:

unpin := runtime.Pin(addr)
defer unpin()
// code that uses addr and can assume it won't move

Alternatively:
runtime.Pin(ctx, addr)

where ctx is an arbitrary context, and the pin lasts until the context is cancelled.

It should probably be permissible for a pin to never get cancelled, with the caveat that if there's a lot of uncancelled pins, it could be bad.

@gopherbot gopherbot added this to the Proposal milestone Jul 27, 2020
@ianlancetaylor ianlancetaylor added this to Incoming in Proposals Jul 27, 2020
@ianlancetaylor
Contributor

@ianlancetaylor ianlancetaylor commented Jul 27, 2020

Pinning memory indefinitely is a footgun with a moving garbage collector. Is there some way we can use temporary pins of objects that do not contain pointers, as is done for cgo?

(For that matter these ioctls can be safely called using cgo.)

@hodgesds

@hodgesds hodgesds commented Jul 27, 2020

Pinning memory indefinitely is a footgun with a moving garbage collector. Is there some way we can use temporary pins of objects that do not contain pointers, as is done for cgo?

That would be super helpful for some of what I'm trying to do with io_uring. I need a way to ensure that addresses aren't moved because the kernel is using the unsafe addresses. After the request is complete they could freely be moved, but something like this would be super helpful.

@seebs
Contributor Author

@seebs seebs commented Jul 28, 2020

One of the boundary cases is probably "objects that do contain pointers", in the case where something passed to ioctl itself contains pointers to other data structures.

I would agree that indefinite pinning seems pretty insane, but I can't think of a way to admit pinning multiple things when necessary (say, a data structure and its children) that doesn't have some kind of theoretical potential for being indefinite.

I suppose another alternative could be an annotation similar to //go:notinheap, but my intuition is that most things are only transiently pinned. And you could in theory make the pinning always have a clearly defined end, something like runtime.WithPin(addr, func ()), but then someone's going to do

go runtime.WithPin(addr, func() {
    var ch chan struct{}
    <-ch
})

Although at least then they have to be more clear about the intent, I guess?

My understanding is that one of the original examples that brought this up was a specific example of an ioctl that could not be safely called using cgo, because while the data structure passed to it was effectively pinned, the data structures that data structure referred to weren't, and there's no way to express that. My first thoughts were "obviously this should recurse", but really it very much shouldn't; if you need to pin multiple things, you should be explicitly pinning those multiple things.
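As an illustration of pinning a structure and its child explicitly rather than recursively, here is a sketch using runtime.Pinner, which was added to the standard library in Go 1.21, well after this discussion, and is not the API proposed above. The request type and the submit function are assumptions made up for the example, and the callback stands in for the real ioctl.

```go
package main

import (
	"fmt"
	"runtime"
)

// request is a hypothetical structure of the kind handed to an ioctl:
// it holds a pointer to a second Go object.
type request struct {
	data *[4096]byte
}

// submit pins the request and the buffer it points to, one Pin call each,
// runs call, and releases both pins when it returns.
func submit(req *request, call func(*request) int) int {
	var pinner runtime.Pinner
	defer pinner.Unpin()
	pinner.Pin(req)
	pinner.Pin(req.data)
	return call(req)
}

func main() {
	req := &request{data: new([4096]byte)}
	n := submit(req, func(r *request) int {
		r.data[0] = 1 // stand-in for the kernel writing into the buffer
		return len(r.data)
	})
	fmt.Println(n, req.data[0])
}
```

Nothing is pinned implicitly here: forgetting the second Pin call would leave the buffer unpinned even though the request that points to it is pinned.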

@ianlancetaylor
Contributor

@ianlancetaylor ianlancetaylor commented Jul 28, 2020

You aren't allowed to pass the address of a structure containing Go pointers to cgo: https://golang.org/cmd/cgo/#hdr-Passing_pointers. To build such a data structure using cgo you have to use C.malloc.

With pinned pointers I think we would want to implement a similar restriction: a pinned pointer can point to a data structure that may not contain any pointers other than pinned pointers.

With respect to lifetime of pinned pointers, I agree that intent matters. It's also possible to pass a pointer to cgo and have the cgo function never return.

@ianlancetaylor
Contributor

@ianlancetaylor ianlancetaylor commented Jul 28, 2020

@hodgesds Note that the current garbage collector is not a moving garbage collector. In the current implementation the only time Go pointers move is when they point to a value on the stack and the stack is moved.

@GSPP

@GSPP GSPP commented Jan 12, 2021

Maybe it would be helpful to study the pinning design that .NET is using. There are now 20 years of experience with it and, in my opinion, it has stood the test of time.

.NET has a precise, moving GC. Interop calls are frequent and quite often, pointers are left in unmanaged space after the call is complete. For example, this happens with asynchronous IO and callbacks from native to managed (in .NET, the callback is an object).

Pinning is achieved in several ways:

  1. Local variables can be marked as fixed. C# compiler, JIT and GC then treat any pointer stored in them as pinned.
  2. The runtime exposes a concept called a GC handle (https://docs.microsoft.com/en-us/dotnet/api/system.runtime.interopservices.gchandle?view=net-5.0). You can use it to pin or to create a weak reference that is cleared when the object dies (https://docs.microsoft.com/en-us/dotnet/api/system.runtime.interopservices.gchandletype?view=net-5.0). A related facility is ConditionalWeakTable, which is built on another kind of handle (a dependent handle). You can use it to associate a secondary object with a primary one, giving the secondary exactly the same lifetime as the primary. This behaves exactly as if you had added another field, so you can effectively extend arbitrary objects with arbitrary data.
  3. There is now a pinned object heap. Objects allocated there are never moved. Otherwise, they are normal objects. In particular, they are eligible to be collected. It is not manual memory management.
  4. Data on the stack is inherently pinned.

The GC has various strategies for dealing with pinned objects. A major problem is that they fragment the GC segments by creating immovable holes. I recommend talks and presentations from Maoni Stephens, who is a major GC architect.

5 participants