runtime: provide Pinner API for object pinning #46787

Open
ansiwen opened this issue Jun 16, 2021 · 100 comments

@ansiwen ansiwen commented Jun 16, 2021

Update, 2021-10-20: the latest proposal is the API in #46787 (comment).


Problem Statement

The pointer passing rules state:

Go code may pass a Go pointer to C provided the Go memory to which it points does not contain any Go pointers.

and

Go code may not store a Go pointer in C memory.

There are C APIs, most notably the iovec-based ones for vectored I/O, which expect an array of structs that describe the buffers to read into or write from. The naive approach would be to allocate both the array and the buffers with C.malloc() and then either work on the C buffers directly or copy the content to Go buffers. In the case of Go bindings for a C API, which is presumably the most common use case for Cgo, the users of the bindings shouldn't have to deal with C types, which means that all data has to be copied into Go-allocated buffers. This of course impairs performance, especially for larger buffers. It would therefore be desirable to have a safe way to let the C API write directly into the Go buffers. This, however, is not possible because

  • either the buffer array is allocated in C memory, but then the pointers of the Go buffers can't be stored in it. (Storing Go pointers in C memory is forbidden.)
  • or the buffer array is allocated in Go memory and the Go buffer pointers are stored in it. But then the pointer to that buffer array can't be passed to a C function. (Passing a Go pointer that points to memory containing other Go pointers to a C function is forbidden.)

Obviously, what is missing is a safe way to pin an arbitrary number of Go pointers in order to store them in C memory or in passed-to-C Go memory for the duration of a C call.

Workarounds

Break the rules and store the Go pointer in C memory

with something like

IovecCPtr.iov_base = unsafe.Pointer(myGoPtr)

but GODEBUG=cgocheck=2 would catch that.

However, you can circumvent cgocheck=2 with this casting trick:

*(*uintptr)(unsafe.Pointer(&IovecCPtr.iov_base)) = uintptr(myGoPtr)

This might work as long as the GC does not move objects, which happens to be true today but is not guaranteed.

Break the rules and hide the Go pointer in Go memory

with something like

type iovecT struct {
  iov_base uintptr
  iov_len  C.size_t
}
iovec := make([]iovecT, numberOfBuffers)
for i := range iovec {
  bufferPtr := unsafe.Pointer(&bufferArray[i][0])
  iovec[i].iov_base = uintptr(bufferPtr)
  iovec[i].iov_len = C.size_t(len(bufferArray[i]))
}
n := C.my_iovec_read((*C.struct_iovec)(unsafe.Pointer(&iovec[0])), C.int(numberOfBuffers))

Again: this might work as long as the GC does not move objects. GODEBUG=cgocheck=2 wouldn't complain about this.

Break the rules and temporarily disable cgocheck

If hiding the Go pointer as a uintptr as in the previous workaround is not possible, passing Go memory that contains Go pointers usually fails because of the default cgocheck=1 setting. It is possible to temporarily disable cgocheck during a C call, which can be especially useful when the pointers have been "pinned" with one of the later workarounds. For example, the _cgoCheckPointer() function that is used in the generated Cgo code can be shadowed in the local scope, which disables the check for the following C calls in that scope:

func ... {
  _cgoCheckPointer := func(interface{}, interface{}) {}
  C.my_c_function(x, y)
}

Maybe slightly more robust is to access the runtime.dbgvars list via go:linkname:

type dbgVar struct {
	name  string
	value *int32
}

//go:linkname dbgvars runtime.dbgvars
var dbgvars []dbgVar

var cgocheck = func() *int32 {
	for i := range dbgvars {
		if dbgvars[i].name == "cgocheck" {
			return dbgvars[i].value
		}
	}
	panic("Couldn't find cgocheck debug variable")
}()

func ... {
	before := *cgocheck
	*cgocheck = 0
	C.my_c_function(x, y)
	*cgocheck = before
}

Use a C function to store the Go pointer in C memory

The rules allow a C function to store a Go pointer in C memory for the duration of the call. So, for each Go pointer, a C function can be called in a goroutine; it stores the Go pointer in C memory and then calls a Go callback that waits for a release signal. After the release signal is received, the Go callback returns to the C function, the C function clears the Go pointer from C memory and returns as well, ending the goroutine.

This approach fully complies with the rules, but it is quite expensive, because each goroutine that is blocked in a C call occupies its own thread, which means one thread per stored Go pointer.

Use the //go:uintptrescapes compiler directive

//go:uintptrescapes is a compiler directive that

specifies that the function's uintptr arguments may be pointer values that have been converted to uintptr and must be treated as such by the garbage collector.

So, similar to the previous workaround, a Go function with this directive can be called in a goroutine, where it simply waits for a release signal. When the signal is received, the function returns and the pointer is released.

This already seems almost like a proper solution, so I implemented a package with this approach that allows you to Pin() a Go pointer and Poke() it into C memory: PtrGuard

But there are still caveats. The compiler and the runtime (cgocheck=2) don't seem to know which pointers are protected by the directive, because they still don't allow passing Go memory containing these Go pointers to a C function, or storing the pointers in C memory. Therefore the first two workarounds are additionally necessary. There is also the small overhead of the goroutine and the release signalling.
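
The goroutine-plus-directive pattern described in this workaround can be sketched roughly as follows. This is a minimal illustration and not PtrGuard's actual code: holdUntilReleased and pin are hypothetical names, and the sketch only keeps the pointer held; it does nothing to make cgocheck accept the pointer.

```go
package main

import (
	"fmt"
	"unsafe"
)

// holdUntilReleased is a hypothetical helper. Because of the directive, the
// object behind p is treated as live (and as escaping) for the whole call.
//
//go:uintptrescapes
func holdUntilReleased(p uintptr, held chan<- struct{}, release <-chan struct{}) {
	close(held) // tell the caller that the call is now in progress
	<-release   // keep the call (and thereby the protection) active
}

// pin holds the backing array of the non-empty slice b until the returned
// function is called. The returned function must be called exactly once.
func pin(b []byte) (unpin func()) {
	ptr := unsafe.Pointer(&b[0])
	held := make(chan struct{})
	release := make(chan struct{})
	done := make(chan struct{})
	go func() {
		defer close(done)
		// The pointer-to-uintptr conversion has to appear directly in the
		// argument list for the directive to apply.
		holdUntilReleased(uintptr(ptr), held, release)
	}()
	<-held
	return func() {
		close(release)
		<-done
	}
}

func main() {
	buf := make([]byte, 8)
	unpin := pin(buf)
	buf[0] = 42 // the buffer remains usable from Go while it is held
	unpin()
	fmt.Println(buf[0])
}
```

Every held pointer costs one parked goroutine plus three channels here, which is the overhead the caveat above refers to.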

Proposal

Cgo would be a lot more usable for C APIs with more complex pointer handling, like iovec, if there were a programmatic way to provide what //go:uintptrescapes already provides through the backdoor. There should be a way to pin an arbitrary number of Go pointers in the current scope, so that they may be stored in C memory or be contained in Go memory that is passed to a C function within this scope, for example with a runtime.PtrEscapes() function. It is cumbersome to have to abuse goroutines, channels and casting tricks in order to provide bindings to such C APIs. As long as the Go GC does not move objects, the implementation could be trivial, but it would encapsulate this knowledge and give users a guarantee.

I know from the other issues and discussions around this topic that being able to pin an arbitrary number of pointers is seen as dangerous. But

  1. it is possible to call an arbitrary number of C or //go:uintptrescapes functions, therefore it is already possible to pin an arbitrary number of Go pointers.
  2. it is necessary for some C APIs

Related issues: #32115, #40431

/cc @ianlancetaylor @rsc @seebs

edit: the first workaround had an incorrect statement.
edit 2: add workarounds for disabling cgocheck

@gopherbot gopherbot added this to the Proposal milestone Jun 16, 2021

@DeedleFake DeedleFake commented Jun 16, 2021

From what I can tell from the documentation for the new cgo.Handle, it's intended only for a situation where a pointer needs to be passed from Go to C and then back to Go without the C code doing anything with what it points to. As it passes a handle ID, not a real pointer, the C code can't actually get access to the actual data. Maybe a function could be provided on the C side that takes a handle ID and returns the original pointer, thus allowing the C code to access the data? Would that solve this issue?

Edit: Wait, that doesn't make sense. Could you just use Handle to make sure that it's held onto? Could the definition of Handle be extended to mean that the pointer itself is valid for the duration of the Handle's existence? In other words, this would be defined to be valid:

package main

// void doSomethingWithAPointer(int *a);
import "C"

import "runtime/cgo"

func main() {
  v := C.int(3)
  h := cgo.NewHandle(&v)
  C.doSomethingWithAPointer(&v) // Safe because the handle exists for that pointer.
  h.Delete()
}

Alternatively, if that's not feasible, what about a method on Handle that returns a valid pointer for the given value?

// Pointer returns a C pointer that points to the underlying value of the handle
// and is valid for the life of the handle.
func (h Handle) Pointer() C.uintptr_t

Disclaimer: I'm not familiar enough with the internals of either the Go garbage collector or Cgo to know if either of these even make sense.

Author

@ansiwen ansiwen commented Jun 16, 2021

@DeedleFake As you pointed out yourself, cgo.Handle has a very different purpose. It is just a registry that maps a C-compatible arbitrary ID (uintptr) to an arbitrary Go value. Its purpose is to refer to a Go value from the C world, not to access it from there. It doesn't affect the behavior of the garbage collector, which could still freely move the values in the Handle map around, and would never free them, since they are referenced by the map.
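
To make the distinction concrete, here is a simplified model of such a registry. It is not the actual runtime/cgo implementation; handle, newHandle, value and delete are illustrative names chosen for this sketch.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// handle is an opaque integer ID. It is not an address, so C code that
// receives it cannot dereference it to reach the Go value.
type handle uintptr

var (
	registry sync.Map       // maps handle to the registered Go value
	nextID   atomic.Uintptr // source of fresh, non-zero IDs
)

// newHandle registers v and returns the ID that refers to it.
func newHandle(v any) handle {
	h := handle(nextID.Add(1))
	registry.Store(h, v)
	return h
}

// value looks up the Go value that h refers to.
func (h handle) value() any {
	v, ok := registry.Load(h)
	if !ok {
		panic("invalid handle")
	}
	return v
}

// delete removes h from the registry so the value can be collected.
func (h handle) delete() {
	if _, ok := registry.LoadAndDelete(h); !ok {
		panic("invalid handle")
	}
}

func main() {
	x := 3
	h := newHandle(&x)
	p := h.value().(*int)
	fmt.Println(*p, p == &x)
	h.delete()
}
```

The registry keeps the value reachable, but nothing in it tells a collector that the value's address has to stay fixed.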

@ianlancetaylor ianlancetaylor added this to Incoming in Proposals Jun 16, 2021
Contributor

@ianlancetaylor ianlancetaylor commented Jun 16, 2021

A big advantage of the current cgo mechanisms, including go:uintptrescapes, is that the pointers are automatically unpinned when the cgo function returns. As far as I can see you didn't propose any particular mechanism for pinning pointers, but it would be very desirable to somehow ensure that the pointers are unpinned. Otherwise code could easily get into scenarios in which pointers remain pinned forever, which if Go ever implements a full moving garbage collector will cause the garbage collector to silently behave quite poorly. In other words, some APIs that could solve this problem will be footguns: code that can easily cause a program to silently behave badly in ways that will be very hard to detect.

It's hard to say more without a specific API to discuss. If you suggested one, my apologies for missing it.

Author

@ansiwen ansiwen commented Jun 17, 2021

@ianlancetaylor thanks for taking the time to answer.

A big advantage of the current cgo mechanisms, including go:uintptrescapes, is that the pointers are automatically unpinned when the cgo function returns.

I agree, that is an advantage. However, with goroutines it's trivial to fire-and-forget thousands of such function calls that never return.

As far as I can see you didn't propose any particular mechanism for pinning pointers, but it would be very desirable to somehow ensure that the pointers are unpinned. Otherwise code could easily get into scenarios in which pointers remain pinned forever, which if Go ever implements a full moving garbage collector will cause the garbage collector to silently behave quite poorly. In other words, some APIs that could solve this problem will be footguns: code that can easily cause a program to silently behave badly in ways that will be very hard to detect.

I didn't describe a specific API, that's true. I hoped that this could be developed here together once we agreed on the requirements. One of the requirements that I mentioned was, that the pinning happens only for the current scope. That implies automatic unpinning when the scope is left. Sorry that I didn't make that clear enough. So, to rephrase more compactly, the requirements would be:

  • possibility to pin pointers in the current scope (exactly as if they would be the argument of a C function call)
  • automatic unpinning when the current scope is left (the current function returns)
  • cgocheck knows about the pinning and does not complain

It's hard to say more without a specific API to discuss. If you suggested one, my apologies for missing it.

As stated above, I didn't want to suggest a specific API, but characteristics of it. In the end it could be a function like runtime.PtrEscapes(unsafe.Pointer). The usage could look like this:

func ReadFileIntoBufferArray(f *os.File, bufferArray [][]byte) int {
  numberOfBuffers := len(bufferArray)

  iovec := make([]C.struct_iovec, numberOfBuffers)

  for i := range iovec {
    bufferPtr := unsafe.Pointer(&bufferArray[i][0])
    runtime.PtrEscapes(bufferPtr) // <- pins the pointer and makes it known to escape to C
    iovec[i].iov_base = bufferPtr
    iovec[i].iov_len = C.size_t(len(bufferArray[i]))
  }

  n := C.readv(C.int(f.Fd()), &iovec[0], C.int(numberOfBuffers))
  // ^^^ cgocheck doesn't complain, because Go pointers in iovec are pinned
  return int(n) // <- all pinned pointers in iovec are unpinned
}

As long as the GC is not moving, runtime.PtrEscapes() is almost a no-op; it would basically only tell cgocheck not to bail out for these pointers. But users would have a guarantee that, if the GC becomes moving later, this function will take care of it.

Regarding footguns I'm pretty sure, that the workarounds, that have to be used at the moment to solve these problems, will cause more "programs to silently behave badly" than the potential abuse of a proper pinning method.

Member

@bcmills bcmills commented Jun 17, 2021

it would be very desirable to somehow ensure that the pointers are unpinned

Drawing from runtime.KeepAlive, one possibility might be something like:

package runtime

// Pin prevents the object to which p points from being relocated until
// the returned PointerPin either is unpinned or becomes unreachable.
func Pin[T any](p *T) PointerPin

type PointerPin struct {…}
func (p PointerPin) Unpin() {}

Then the example might look like:

func ReadFileIntoBufferArray(f *os.File, bufferArray [][]byte) int {
	numberOfBuffers := len(bufferArray)

	iovec := make([]C.struct_iovec, numberOfBuffers)

	for i := range iovec {
		bufferPtr := &bufferArray[i][0] // typed pointer, as required by Pin[T any](p *T)
		defer runtime.Pin(bufferPtr).Unpin()
		iovec[i].iov_base = unsafe.Pointer(bufferPtr)
		iovec[i].iov_len = C.size_t(len(bufferArray[i]))
	}

	n := C.readv(C.int(f.Fd()), &iovec[0], C.int(numberOfBuffers))
	return int(n)
}

A vet warning could verify that the result of runtime.Pin is used, to ensure that it is not accidentally released too early (see also #20803).
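
One detail of the `defer runtime.Pin(...).Unpin()` idiom is worth spelling out: Go evaluates the receiver expression when the defer statement executes, so the pin happens immediately and only the Unpin call is postponed until the function returns. The sketch below demonstrates that ordering with stand-ins that merely record events; pointerPin, pin, unpin and callWithPins are hypothetical and do not pin anything.

```go
package main

import "fmt"

// events records the order in which the stand-in operations run.
var events []string

// pointerPin is a stand-in for the proposed PointerPin type.
type pointerPin struct{ name string }

// pin records the pin and returns a value whose unpin can be deferred.
func pin(name string) pointerPin {
	events = append(events, "pin "+name)
	return pointerPin{name}
}

// unpin records the release of the pin.
func (p pointerPin) unpin() {
	events = append(events, "unpin "+p.name)
}

// callWithPins mimics the loop in the example above.
func callWithPins(names []string) {
	for _, n := range names {
		// pin(n) runs right here; only unpin is deferred to function exit.
		defer pin(n).unpin()
	}
	events = append(events, "C call")
}

func main() {
	callWithPins([]string{"a", "b"})
	fmt.Println(events) // [pin a pin b C call unpin b unpin a]
}
```

All buffers are therefore pinned before the C call and released in reverse order afterwards, which is the behavior the example relies on.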


@phlogistonjohn phlogistonjohn commented Jun 17, 2021

@ansiwen when you write "automatic unpinning when the current scope is left (the current function returns)" the current scope you refer to is the scope of the Go function correct? In your example that would be ReadFileIntoBufferArray.
I'm trying to double check what the behavior would be regarding if we needed to make multiple calls into C using the same pointer.

@bcmills version also looks very natural flowing to me, and in that version it's clear that the pointer would be pinned until the defer at the end of ReadFileIntoBufferArray.

Author

@ansiwen ansiwen commented Jun 17, 2021

@ansiwen when you write "automatic unpinning when the current scope is left (the current function returns)" the current scope you refer to is the scope of the Go function correct? In your example that would be ReadFileIntoBufferArray.

@phlogistonjohn Yes, exactly.

@bcmills version also looks very natural flowing to me, and in that version it's clear that the pointer would be pinned until the defer at the end of ReadFileIntoBufferArray.

Yes, I also would prefer @bcmills version from a user's perspective, because it's more explicit and it's basically the same API that we use with PtrGuard.

I just don't know enough about the implications on the implementation side and the effects on the Go internals, so I don't know which API would be more feasible. My proposal is about providing an official way to solve the described problem. I really don't care so much about the "form", that is, what exactly the API looks like. Whatever works best with the current Go and Cgo implementation. 😊

Author

@ansiwen ansiwen commented Jun 18, 2021

@bcmills I guess, an argument @ianlancetaylor might bring up against your API proposal is, that it would allow to store the PointerPin value in a variable and keep them pinned for an unlimited time, so it would not "ensure that the pointers are unpinned". If the unpinning is implicit, it is more comparable to //go:uintptrescapes.

Author

@ansiwen ansiwen commented Jun 24, 2021

@ianlancetaylor

it would be very desirable to somehow ensure that the pointers are unpinned.

So, if you want to enforce the unpinning, the only strict RAII pattern in Go that I could come up with is using a scoped constructor like this API:

package runtime

// Pinner is the context for pinning pointers with Pin()
// can't be copied or constructed outside a Pinner scope
type Pinner struct {…}

// Pin prevents the object to which p points from being relocated until
// Pinner becomes invalid.
func (Pinner) Pin(p unsafe.Pointer) {...}

func WithPinner(func(Pinner)) {...}

which would be used like this:

func ReadFileIntoBufferArray(f *os.File, bufferArray [][]byte) int {
    numberOfBuffers := len(bufferArray)
    
    iovec := make([]C.struct_iovec, numberOfBuffers)

    var n C.ssize_t
    runtime.WithPinner(func (pinner runtime.Pinner) {
        for i := range iovec {
            bufferPtr := unsafe.Pointer(&bufferArray[i][0])
            pinner.Pin(bufferPtr)
            iovec[i].iov_base = bufferPtr
            iovec[i].iov_len = C.size_t(len(bufferArray[i]))
        }
        
        n = C.readv(C.int(f.Fd()), &iovec[0], C.int(numberOfBuffers))
    }) // <- All pinned pointers are released here and pinner is invalidated (in case it's copied out of scope).
    return int(n)
}

I personally would prefer a thinner API, where either it must be explicitly unpinned, like in the proposal of @bcmills, or - even better - the pinning implicitly creates a defer for the scope in which the pinning function has been called from. Given, that this will be implemented in the runtime package, I guess there are tricks and magic that can be used there.


@Merovius Merovius commented Jun 30, 2021

@ansiwen Even with the func API you suggest, a user might store the argument in a closed-over variable, to have it survive the function. In general, as long as the pin is represented by some value, we can't prevent that value from being kept around. So I don't think your version has significant safety benefits compared to @bcmills's, while being more unwieldy and also potentially heavier in runtime cost (the closure might make it easier for things to escape).

Personally, as long as the PointerPin has to be intentionally kept around, I think that's fine. I think the suggestion to unpin when the PointerPin becomes unreachable already makes it sufficiently hard to shoot yourself in the foot to tolerate the risk. And we might be able to use go vet for additional safety (like warning if the result of Pin is assigned to a global var or something).

Author

@ansiwen ansiwen commented Jun 30, 2021

@Merovius

@ansiwen Even with the func API you suggest, a user might store the argument in a closed-over variable, to have it survive the function. In general, as long as the pin is represented by some value, we can't prevent that value from being kept around. So I don't think your version has significant safety benefits compared to @bcmills's, while being more unwieldy and also potentially heavier in runtime cost (the closure might make it easier for things to escape).

The "keeping-around" can easily be prevented by one pointer indirection that gets invalidated when the scope is left. You can have a look at my implementation of PtrGuard, which even has a test case for exactly this situation of a variable escaping the scope.
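
A rough sketch of that pointer indirection is shown below. It is not PtrGuard's code: ScopedPinner, scope and leakAndReuse are hypothetical names, and the actual pinning is delegated to runtime.Pinner on the assumption of a Go 1.21 or later toolchain. The sketch is also not safe for concurrent use.

```go
package main

import (
	"fmt"
	"runtime"
)

// scope is the shared state behind every copy of a ScopedPinner.
type scope struct {
	pinner runtime.Pinner
	valid  bool
}

// ScopedPinner is a small value that can be copied freely; all copies
// point at the same scope, so invalidating the scope invalidates them all.
type ScopedPinner struct{ s *scope }

// Pin pins p until the enclosing WithPinner call returns.
func (sp ScopedPinner) Pin(p any) {
	if !sp.s.valid {
		panic("ScopedPinner used outside its WithPinner scope")
	}
	sp.s.pinner.Pin(p)
}

// WithPinner runs f with a pinner that is unpinned and invalidated on exit.
func WithPinner(f func(ScopedPinner)) {
	s := &scope{valid: true}
	defer func() {
		s.valid = false
		s.pinner.Unpin()
	}()
	f(ScopedPinner{s})
}

// leakAndReuse lets the pinner escape its scope and reports whether using
// it afterwards is rejected.
func leakAndReuse() (panicked bool) {
	var leaked ScopedPinner
	buf := make([]byte, 16)
	WithPinner(func(sp ScopedPinner) {
		sp.Pin(&buf[0])
		leaked = sp // the value escapes, but the scope behind it will die
	})
	defer func() { panicked = recover() != nil }()
	leaked.Pin(&buf[0]) // panics: the scope has already been invalidated
	return false
}

func main() {
	fmt.Println(leakAndReuse())
}
```

The escaped value can still be stored anywhere, but it can no longer extend the lifetime of a pin, which is the property the bounded-pinning argument is about.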

Personally, as long as the PointerPin has to be intentionally kept around, I think that's fine. I think the suggestion to unpin when the PointerPin becomes unreachable already makes it sufficiently hard to shoot yourself in the foot to tolerate the risk. And we might be able to use go vet for additional safety (like warning if the result of Pin is assigned to a global var or something).

Yeah, I agree, as I wrote before, I'm totally fine with both. It's just something I came up with to address @ianlancetaylor's concerns. I also think that the risks are "manageable", there are all kinds of other risks when dealing with runtime and/or unsafe packages after all.

@rsc rsc moved this from Incoming to Active in Proposals Aug 4, 2021

@beoran beoran commented Aug 4, 2021

I think that the API proposed by @bcmills is the most useful one. Although there is a risk of forgetting to unpin a pointer, once Go gets a moving garbage collector, for certain low-level uses, certain blocks of memory will have to stay pinned for the duration of the program. That is certainly the case for some system calls on Linux, such as those dealing with frame buffers. In other words, Pin and Unpin are also useful without cgo.


@hnes hnes commented Aug 17, 2021

Hi @rsc, any updates on this issue recently? I noticed it has been several days after the 2021-08-04's review meeting minutes.

Contributor

@rsc rsc commented Aug 18, 2021

The compiler/runtime team has been talking a bit about this but doesn't have any clear suggestions yet.

The big problem with pinning is that if we ever want a moving garbage collector in the future, pins will make it much more complex. That's why we've avoided it so far.

/cc @aclements

Author

@ansiwen ansiwen commented Aug 18, 2021

The big problem with pinning is that if we ever want a moving garbage collector in the future, pins will make it much more complex. That's why we've avoided it so far.

@rsc But my point in the description was, that we have pinning already when C functions are called with Go pointers or when the //go:uintptrescapes directive is used. So the situation is complex already, isn't it?


@beoran beoran commented Aug 18, 2021

@rsc I would say the converse is also true. If you are going to implement a moving garbage collector without support for pinning, that will make it much more complex to use Go for certain direct operating system calls without cgo, e.g. on Linux.
In other words, as @ansiwen says, there's really no way to avoid that complexity. And therefore I think it would be better if Go supported it explicitly than through workarounds.

Contributor

@ianlancetaylor ianlancetaylor commented Aug 19, 2021

Unbounded pinning has the potential to be significantly worse than bounded pinning. If people accidentally or intentionally leave many pointers pinned, that can fragment the spaces that the GC uses, and make it very hard for a moving GC to make any progress at all. This can in principle happen with cgo today, but it is unlikely that many programs pass a bunch of pointers to a cgo function that never returns. When programmers control the pinning themselves, bugs are more likely. If the bug is in some imported third party library, the effect will be strange garbage collection behavior for the overall program. This will be hard to understand and hard to diagnose, and it will be hard to find the root cause. (One likely effect will be a set of tools similar to the memory profiler that track pinned pointers.)
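
The fragmentation concern can be illustrated with a deliberately tiny model. This is not how Go's collector works (it does not move objects today, as noted in this thread); it is only a toy heap of twelve slots in which a compaction pass may move "live" objects but has to leave "pinned" ones where they are. All names are made up for the sketch.

```go
package main

import "fmt"

type slot int

const (
	free   slot = iota // unused space
	live               // reachable object the collector may move
	pinned             // reachable object that must stay where it is
)

// newHeap returns a 12-slot toy heap with an object of the given kind in
// every third slot and free space in between.
func newHeap(kind slot) []slot {
	heap := make([]slot, 12)
	for i := 0; i < len(heap); i += 3 {
		heap[i] = kind
	}
	return heap
}

// compact models one compaction pass: pinned objects keep their slot, and
// every movable object slides to the lowest slot that is still free.
func compact(heap []slot) []slot {
	out := make([]slot, len(heap))
	for i, s := range heap {
		if s == pinned {
			out[i] = pinned
		}
	}
	next := 0
	for _, s := range heap {
		if s != live {
			continue
		}
		for out[next] != free {
			next++
		}
		out[next] = live
	}
	return out
}

// largestFreeRun returns the length of the longest run of free slots,
// i.e. the biggest object that could still be allocated contiguously.
func largestFreeRun(heap []slot) int {
	best, run := 0, 0
	for _, s := range heap {
		if s != free {
			run = 0
			continue
		}
		run++
		if run > best {
			best = run
		}
	}
	return best
}

func main() {
	fmt.Println(largestFreeRun(compact(newHeap(live))))   // 8
	fmt.Println(largestFreeRun(compact(newHeap(pinned)))) // 2
}
```

Both heaps hold four objects and eight free slots, yet with every object pinned the largest allocatable run stays at two slots no matter how often the compaction pass runs.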

It's also worth noting that we don't have a moving garbage collector today, so any problems that pinned pointers may introduce for a moving garbage collector will not be seen today. So if we ever do introduce a moving garbage collector, we will have a flag day of hard-to-diagnose garbage collection problems. This will make it that much harder to ever change the garbage collector in practice.

So I do not think the current situation is nearly as complex as the situation would be if we add unbounded pinning. This doesn't mean that we shouldn't add unbounded pinning. But I think that it does mean that the argument for it has to be something other than "we can already pin pointers today."


@beoran beoran commented Aug 19, 2021

@ianlancetaylor That is fair enough. But then it seems to me the best way ahead is to put this issue on hold until we can implement a prototype moving garbage collector.

There is always a workaround if there is no pinning available and that is to manually allocate memory directly from the OS so the GC doesn't know about it. It is not ideal but it can work.

Contributor

@egonelbre egonelbre commented Aug 19, 2021

Yeah, one workaround that is missing from the discussion is hiding the C api allocation concerns, e.g. iovec could be implemented like:

package iovec

type Buffers struct {
	Data [][]byte

	data *C.uint8_t
	list *C.iovecT
}

func NewBuffers(sizes []int) *Buffers {
	...
	// C.malloc everything
	// cast from *C.uint8_t to []byte
}

func (buffers *Buffers) ReadFrom(f *os.File) error { ...

Or in other words, from the problem statement, it's unclear why it's required to use bufferArray [][]byte as the argument.

Author

@ansiwen ansiwen commented Aug 20, 2021

@ianlancetaylor

So I do not think the current situation is nearly as complex as the situation would be if we add unbounded pinning. This doesn't mean that we shouldn't add unbounded pinning. But I think that it does mean that the argument for it has to be something other than "we can already pin pointers today."

Let's separate the two questions "pinning yes/no" and "pinning bounded/unbounded".

pinning yes/no

I also proposed

  1. an API that allows bounded pinning (runtime.WithPinner()).
  2. the potential possibility of a runtime.Pin() with no return value and an implicit defer that automatically gets unpinned when the current function returns.

Both provide a similar behaviour as the //go:uintptrescapes directive, if that is what you mean with "bounded". What do you think of these options?

pinning bounded/unbounded

  1. when we have a moving GC, there will always also be a way to pin pointers or pause the moving, so this needs to be implemented in any case. Is this correct?
  2. when people leave pointers pinned, the GC will behave like a non-moving GC, so there is no regression beyond our current status-quo, right? So, what exactly do you mean with "hard-to-diagnose garbage collection problems"?
  3. would the risk of many unpinned pointers not be similar to that of memory leaks, like with global dynamic data structures, that are possible now? I know, memory fragmentation is potentially worse than just allocating memory, but the effect would be similar: OOM errors.

For me personally the first question is more important. Bounded or unbounded, I think the existing and required ways of pinning should be made less hacky in their usage.

@egonelbre

Or in other words, from the problem statement, it's unclear why it's required to use bufferArray [][]byte as the argument.

The bufferArray [][]byte is just a placeholder for an arbitrary "native Go data structure". As the problem statement mentions, the goal is to avoid copying of the data. Especially vectored I/O is used for big amounts of data, so depending on the use case, you can't choose the target data structure by yourself, but it is provided by another library that you intend to use (let's say video processing for example). That would mean, that in all these cases you have to copy the data from your own C allocated data structure to the Go-allocated target data structure of your library, for no good reason.

Contributor

@ianlancetaylor ianlancetaylor commented Aug 20, 2021

when we have a moving GC, there will always also be a way to pin pointers or pause the moving, so this needs to be implemented in any case. Is this correct?

In some manner, yes.

when people leave pointers pinned, the GC will behave like a non-moving GC, so there is no regression beyond our current status-quo, right? So, what exactly do you mean with "hard-to-diagnose garbage collection problems"?

A GC that is based on moving pointers is not the same as a GC that does not move pointers. A GC based on moving pointers may be completely blocked by a pinned pointer, whereas for a non-moving GC a pinned pointer is just the same as a live pointer.

would the risk of many unpinned pointers not be similar to that of memory leaks, like with global dynamic data structures, that are possible now? I know, memory fragmentation is potentially worse than just allocating memory, but the effect would be similar: OOM errors.

Same answer.

Again, all I am saying is that arguments based on "we already support pinned pointers, so it's OK to add more" are not good arguments. We need different arguments.


@hnes hnes commented Aug 21, 2021

How would we deal with the iovec struct during vectored I/O syscall if we have a GC that is based on moving pointers? Maybe the same solution could also be applied to the pointer pinning we are discussing?

A GC based on moving pointers may be completely blocked by a pinned pointer.

I'm afraid that would badly impact the GC latency or something else if it is true. Please consider the disk i/o syscall that may block a very long time.

Author

@ansiwen ansiwen commented Aug 21, 2021

@ianlancetaylor

when we have a moving GC, there will always also be a way to pin pointers or pause the moving, so this needs to be implemented in any case. Is this correct?

In some manner, yes.

when people leave pointers pinned, the GC will behave like a non-moving GC, so there is no regression beyond our current status-quo, right? So, what exactly do you mean with "hard-to-diagnose garbage collection problems"?

A GC that is based on moving pointers is not the same as a GC that does not move pointers. A GC based on moving pointers may be completely blocked by a pinned pointer, whereas for a non-moving GC a pinned pointer is just the same as a live pointer.

Since you agreed that the pinning is required in the answer before, I don't understand how such an implementation could be used in Go.

Again, all I am saying is that arguments based on "we already support pinned pointers, so it's OK to add more" are not good arguments. We need different arguments.

I don't think "add more" is the right wording. It's more about exposing the pinning in a better way. And these are not arguments for doing it, but arguments against the supposed risks of doing it.

The argument for doing it should be clear by now: give people a zero-copy way to use APIs like iovec with Go data structures in a future proof way. At the moment, that's not possible.

In your answers you skipped the first part about the bounded pinning. If you have the time to comment on these too, I would be very interested. 😊

Contributor

@ianlancetaylor ianlancetaylor commented Aug 21, 2021

Since you agreed that the pinning is required in the answer before, I don't understand how such an implementation could be used in Go.

The current system for pinning pointers doesn't permit pointers to be pinned indefinitely, if we discount the unusual case of a C function that does not return.

I agree that other systems that somehow ensure that pointers can't be pinned indefinitely are better. (I don't think that an implicit defer is a good approach for Go, though.)

Contributor

@ianlancetaylor ianlancetaylor commented Oct 27, 2021

@dot-asm

With both a read and a write barrier, there is no need for a mutex that blocks GC. A read barrier is certainly a performance cost. But it's not a mutex and it doesn't prevent parallel execution by the program and the GC.

I'm not sure I follow, but maybe we are talking past each other. Would you agree that when an object is being moved, the code referencing it can't actually execute? And once it resumes the execution all local copies of the relevant pointers would have to be externally adjusted by the GC? And for this last part to happen GC would have to either synchronize with target code one way or another, or be able to fix up registers in suspended thread's processor context?

This should probably be discussed somewhere else, such as golang-nuts, not on this issue. It's not related to this proposal.

@ansiwen
Author

@ansiwen ansiwen commented Oct 31, 2021

package runtime

type Pinner struct { ... }
func (p *Pinner) Pin(object interface{})
func (p *Pinner) Unpin()

@rsc, @aclements, while I was implementing this interface as a PoC, a further question came up: you are suggesting an interface{} as the argument. I get that this is more comfortable to use, because it doesn't require any type punning. However, isn't that more expensive, because all the dynamic type structures need to be created for the call? I would naively have used unsafe.Pointer as the argument, since I assume that in most cases where one would use pinning, the unsafe package is imported anyway. It probably doesn't make a real difference in the end, but I thought we should at least touch on that point briefly before wrapping up.

@ianlancetaylor
Contributor

@ianlancetaylor ianlancetaylor commented Nov 2, 2021

I'm not sure I agree that people using a Pinner will routinely import "unsafe". The additional type information is constructed statically by the compiler, so the extra cost of interface{} should be minimal. Of course we'll want to make sure that the interface value does not escape, but that seems feasible.
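To make the ergonomic side of this concrete, here is a minimal sketch of a toy pinner with the proposed method set. The type `toyPinner` and its `refs` field are hypothetical: it only keeps the objects reachable and does not touch the runtime, so it says nothing about the real implementation or its cost.

```go
package main

import "fmt"

// toyPinner is a hypothetical stand-in for the proposed runtime.Pinner.
// It only records the objects it was given, which keeps them reachable;
// it does not pin anything in the garbage collector.
type toyPinner struct {
	refs []interface{}
}

// Pin accepts any value, so callers need no unsafe.Pointer conversion.
func (p *toyPinner) Pin(object interface{}) {
	p.refs = append(p.refs, object)
}

// Unpin drops every reference recorded so far.
func (p *toyPinner) Unpin() {
	p.refs = nil
}

func main() {
	buf := make([]byte, 16)
	hdr := &struct{ n int }{n: len(buf)}

	var p toyPinner
	p.Pin(&buf[0]) // a *byte, passed as-is
	p.Pin(hdr)     // a pointer to a struct, passed as-is
	fmt.Println("pinned objects:", len(p.refs))

	p.Unpin()
	fmt.Println("pinned objects after Unpin:", len(p.refs))
}
```

With an unsafe.Pointer parameter, each call site above would need an explicit conversion and the calling package would have to import "unsafe".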

@rsc rsc moved this from Likely Accept to Accepted in Proposals Nov 3, 2021
@rsc
Contributor

@rsc rsc commented Nov 3, 2021

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@rsc rsc changed the title proposal: runtime: provide Pinner API for object pinning runtime: provide Pinner API for object pinning Nov 3, 2021
@rsc rsc removed this from the Proposal milestone Nov 3, 2021
@rsc rsc added this to the Backlog milestone Nov 3, 2021
@ansiwen
Author

@ansiwen ansiwen commented Nov 4, 2021

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

Awesome, thanks! 🥳

Is it on me, as the one who filed the issue, to implement it? I'm happy to do that, I just might need some guidance on how to disable cgocheck for pinned pointers.

Or would it be better that someone from the runtime/GC team implements this?

@ianlancetaylor
Contributor

@ianlancetaylor ianlancetaylor commented Nov 4, 2021

It is not on you to implement this. If you want to implement it, though, that would be great. But this is not a trivial change. We'll need to efficiently track pinned pointers in some way.

@ansiwen
Author

@ansiwen ansiwen commented Nov 5, 2021

It is not on you to implement this. If you want to implement it, though, that would be great. But this is not a trivial change. We'll need to efficiently track pinned pointers in some way.

Cool. As I said, I'm happy to try it, if I can get some direction on what "some way" might be. Like some similar functionality in existing code I can look into, or general ideas and concepts. Maybe it would be good to outline the pinning process in pseudocode, so I don't forget an important step?

Or should I just push a stub implementation, and we iterate over it in the code review?

@ianlancetaylor
Contributor

@ianlancetaylor ianlancetaylor commented Nov 5, 2021

I don't know of any similar functionality. It's complicated. Pushing a stub implementation won't be helpful.

I think the key step here is that cgoCheckPointer and the functions that it calls must not complain about pinned pointers.

@ansiwen
Author

@ansiwen ansiwen commented Nov 6, 2021

I also looked at the code yesterday for a bit. My current approach would be to change cgoIsGoPointer to cgoIsUnpinnedGoPointer, because it's exclusively called from cgocheck code. About how to mark a pointer as pinned I saw two options so far:

  • like cgo.Handle we could use a global sync.Map for registering all pinned pointers and keeping them alive as suggested before (or uintptr, if we don't want to keep them alive). I guess as long as the map is empty the performance impact for the default cgocheck=1 would be negligible. However, I have no idea how expensive it is if it's not empty, and whether that would be acceptable.
  • similar to runtime.mspan.special we could add a list of pinned objects in the span. That might be the appropriate place anyway, when the GC needs to know about it later. I guess getting the span of a pointer and iterating through a list of a few pinned objects is cheaper than a sync.Map lookup, but I'm not sure, especially if - as I assume - we need to serialize the list access.

Thoughts?

@ianlancetaylor
Contributor

@ianlancetaylor ianlancetaylor commented Nov 7, 2021

I don't think the sync.Map approach will work well, because 1) I think that if a runtime.Pinner is garbage collected, we should explicitly unpin the pointers; 2) I don't think the performance hit of changing cgocheck to look up pointers in a sync.Map will be acceptable.

I don't know about the mspan.special approach, that might work.

@ansiwen
Author

@ansiwen ansiwen commented Nov 7, 2021

ad 1) Why is a sync.Map registry and unpinning in a finalizer a contradiction? We could even let it panic if a pinned Pinner is collected, as @bcmills suggested here, in order to educate people not to forget the unpinning. Note that the map would keep the pinned pointer alive, not the pinner itself. And actually that's not even necessary, because the references in the pinner itself would keep it alive, so the map could be a uintptr -> nil map.
ad 2) Yeah, that's what I thought too, and I guess a normal map with a sync.RWMutex wouldn't perform any better? However, you also don't seem completely certain. Maybe someone here has a stronger feeling about that. Otherwise, maybe some benchmarking could help give us a better understanding?

@ansiwen
Author

@ansiwen ansiwen commented Nov 8, 2021

I want to share some benchmarks that I wrote in order to get a rough idea about the performance impact that we are talking about:

% /Users/svanders/sdk/go1.17.3/bin/go test -run=^$ -bench . pinnerbenchmark -benchtime 1s -c
goos: darwin
goarch: amd64
pkg: pinnerbenchmark
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkCCall0NoCgoCheck       19683577    55.13  ns/op
BenchmarkCCall0                 21328222    55.03  ns/op
BenchmarkCCall1NoCgoCheck       18208592    60.48  ns/op |
BenchmarkCCall1                 16470044    74.33  ns/op |
BenchmarkCCall4NoCgoCheck       16455018    63.66  ns/op |--> ~14.3ns / ptr
BenchmarkCCall4                  9597632   121.3   ns/op |
BenchmarkSyncMap                92286345    13.34  ns/op     +93% / ptr
BenchmarkMap                   220817404     5.263 ns/op     +36% / ptr
BenchmarkMutexMap               80391705    14.63  ns/op    +102% / ptr
BenchmarkRWMutexMap             81198786    14.75  ns/op    +103% / ptr
BenchmarkSpecials               51773179    23.25  ns/op    +163% / ptr
BenchmarkSpecialsWithoutLocks  275029039     4.281 ns/op     +30% / ptr

The "CCall" benchmarks are measuring the base costs of the call of a very simple C function and the costs of cgocheck=1 depending on number of pointer arguments. They show that cgocheck=1 adds about 14.3ns per pointer argument to the call costs on my computer.

The other benchmarks measure the cost of different lookups of non-existing keys/flags. The maps are not empty (10 entries, but that's irrelevant). BenchmarkSpecials basically calls removespecial from runtime/mheap.go for a non-existing special on an empty specials list, in which case it's basically the same as a lookup. This function synchronizes with the GC and uses a lock, therefore it's quite slow. If we can avoid a lock for reading a list of pinned pointers, and only lock on write access and change the list with atomic operations, then it might be as cheap as BenchmarkSpecialsWithoutLocks, which basically is just the cost of getting the span for the pointer with spanOfHeap() and checking that the list pointer is nil.

The percentages are the increase of cost per pointer relative to the base cost per pointer. In general the values are all roughly in the same ballpark, but of course cgocheck=1 would get more expensive. However, with a real "payload" in the C call that cost might get negligible quite fast? With a simple snprintf() the C call already takes 140ns, making the 13ns from a sync.Map lookup not look very dramatic. So, what additional cost would be acceptable?
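For anyone who wants to re-derive the numbers, the sketch below redoes the arithmetic from the table. It assumes that the percentage column is simply the lookup cost divided by the ~14.3 ns base cost per pointer; the helper names are made up for this example. The two CCall pairs individually give slightly different per-pointer costs, so treat ~14.3 ns as a rounded figure.

```go
package main

import "fmt"

// basePerPtr is the per-pointer cost of cgocheck=1 quoted above, in ns.
const basePerPtr = 14.3

// perPointerCost derives the cgocheck cost per pointer argument from one
// pair of results (with and without cgocheck) for a call with nPtrs pointers.
func perPointerCost(withCheck, withoutCheck float64, nPtrs int) float64 {
	return (withCheck - withoutCheck) / float64(nPtrs)
}

// relativeCost expresses a lookup cost as a percentage of basePerPtr.
func relativeCost(lookupNs float64) float64 {
	return lookupNs / basePerPtr * 100
}

func main() {
	fmt.Printf("CCall1: %.2f ns/ptr\n", perPointerCost(74.33, 60.48, 1))
	fmt.Printf("CCall4: %.2f ns/ptr\n", perPointerCost(121.3, 63.66, 4))

	lookups := []struct {
		name string
		ns   float64
	}{
		{"BenchmarkSyncMap", 13.34},
		{"BenchmarkMap", 5.263},
		{"BenchmarkMutexMap", 14.63},
		{"BenchmarkRWMutexMap", 14.75},
		{"BenchmarkSpecials", 23.25},
		{"BenchmarkSpecialsWithoutLocks", 4.281},
	}
	for _, l := range lookups {
		fmt.Printf("%-30s +%.1f%% / ptr\n", l.name, relativeCost(l.ns))
	}
}
```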

The linked list in the span structure would obviously be the cheapest (if we can avoid the locks), but it would increase the size of the runtime.mspan struct. Not sure if that is acceptable either.

Is it maybe also an option to move the checks to cgocheck=2, like the "store Go pointer in C memory" test?

@randall77
Contributor

@randall77 randall77 commented Nov 8, 2021

If we think that the common case is that the checked pointers are not pinned, and there are no other special objects in its span, then the fast path could be as simple as finding the span and checking that the specials list is empty with an atomic.Loadp. Only if there are >0 specials in the span do we need to do more work (grab a lock, ...).
There's even a pageSpecials bitmap we could use.
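A minimal sketch of that fast path, using simplified stand-ins rather than the runtime's real structures: the `span` and `special` types and both methods below are hypothetical, and `atomic.Pointer` needs Go 1.19 or later. The point is only the shape of the check: one atomic load in the common case, a lock only when the list is non-empty.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// special is a simplified, hypothetical list node keyed by object offset.
type special struct {
	offset uintptr
	next   *special
}

// span is a simplified, hypothetical stand-in for a runtime span.
type span struct {
	mu       sync.Mutex
	specials atomic.Pointer[special] // list head, nil while the list is empty
}

// addSpecial prepends a record; writers always take the lock.
func (s *span) addSpecial(offset uintptr) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.specials.Store(&special{offset: offset, next: s.specials.Load()})
}

// hasSpecial reports whether offset has a record in this span.
func (s *span) hasSpecial(offset uintptr) bool {
	// Fast path: a single atomic load when the span has no specials at all.
	if s.specials.Load() == nil {
		return false
	}
	// Slow path: take the lock and walk the list.
	s.mu.Lock()
	defer s.mu.Unlock()
	for sp := s.specials.Load(); sp != nil; sp = sp.next {
		if sp.offset == offset {
			return true
		}
	}
	return false
}

func main() {
	var s span
	fmt.Println("empty span:", s.hasSpecial(8))
	s.addSpecial(8)
	fmt.Println("offset 8:", s.hasSpecial(8))
	fmt.Println("offset 16:", s.hasSpecial(16))
}
```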

@ansiwen
Author

@ansiwen ansiwen commented Nov 8, 2021

@randall77

If we think that the common case is that the checked pointers are not pinned, and there are no other special objects in its span, then the fast path could be as simple as finding the span and checking that the specials list is empty with an atomic.Loadp. Only if there are >0 specials in the span do we need to do more work (grab a lock, ...). There's even a pageSpecials bitmap we could use.

Unfortunately, after thinking about it, I believe it's exactly the other way around: We would only check pinning status for nested Go pointers, and if they are not pinned, it panics as it does now already. So, in working code the pinning must always be set, when we check for it. But the good news is, only if there are nested Go pointers present we will have the extra cost. So you could say that's the extra cost, that comes with the possibility to pass nested Go pointers, and if you don't like it, set cgocheck=0. But we need to optimize for the pinned case. (At least for the cgocheck code.) Or did I forget something?

@randall77
Contributor

@randall77 randall77 commented Nov 8, 2021

I guess you're right, if we don't need to check top-level pointers then the thing we're checking already being pinned is the common case.

@ansiwen
Author

@ansiwen ansiwen commented Nov 8, 2021

The problem with reusing the specials list is that the GC seems to modify it without a lock while sweeping, so you have to synchronize with the GC and acquire the specials lock. If we used a separate list for pinned pointers, I guess we could implement lock-free iteration.

Speaking of bitmaps: a pinnedBits *gcBits would certainly be the most performant, if we can afford the memory.

@ansiwen
Author

@ansiwen ansiwen commented Nov 10, 2021

I wrote an implementation that uses a pinnedBits *pBits bitmap (originally pinnedBits *gcBits) in the mspan struct for marking pinned objects. The bitmap is only allocated when the first object of a span is pinned. (I hope it's legal to allocate bitmaps after the span init, but at least it works.)

Costs if pinning is not used:

  • Memory cost is one pointer per span.
  • Runtime cost is zero, because the pinning is only checked in cases, where it would otherwise panic anyway.

Costs if pinning is used:

  • The first pin in a span allocates a bitmap with (nelems / 8) bytes.
  • Each nested pointer in a C call argument additionally costs about 3-4 ns on my computer, that's about a third of the cost of a top level pointer argument.
  • The only lock that is used is taken when the bitmap is allocated during the first pin in a span. I reused the specialslock for that, because it happens only once per span with at least one pinned object. So I guess it's fine to serialize specialslock users and bitmap creation, and thereby save the memory for an additional lock per span.
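The scheme can be sketched roughly as follows. This is a simplified model and not the code from my change: `toySpan`, `pinBitmap` and the method names are made up, it uses 32-bit words (so the size is rounded up compared to nelems / 8 bytes), and it relies on `atomic.Pointer` and `atomic.Uint32` from Go 1.19 or later.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// pinBitmap holds one bit per object in a span.
type pinBitmap struct {
	words []atomic.Uint32
}

// toySpan is a hypothetical model of a span holding nelems objects.
type toySpan struct {
	nelems     uintptr
	allocLock  sync.Mutex                // only taken to allocate the bitmap
	pinnedBits atomic.Pointer[pinBitmap] // nil until the first pin
}

// bitmap returns the span's bitmap, allocating it on demand if create is set.
func (s *toySpan) bitmap(create bool) *pinBitmap {
	if b := s.pinnedBits.Load(); b != nil || !create {
		return b
	}
	s.allocLock.Lock()
	defer s.allocLock.Unlock()
	if b := s.pinnedBits.Load(); b != nil {
		return b // another goroutine allocated it first
	}
	b := &pinBitmap{words: make([]atomic.Uint32, (s.nelems+31)/32)}
	s.pinnedBits.Store(b)
	return b
}

// setPinned sets or clears the bit for object idx without taking a lock.
func (s *toySpan) setPinned(idx uintptr, pinned bool) {
	b := s.bitmap(pinned)
	if b == nil {
		return // nothing was ever pinned in this span
	}
	w, mask := &b.words[idx/32], uint32(1)<<(idx%32)
	for {
		old := w.Load()
		next := old &^ mask
		if pinned {
			next = old | mask
		}
		if w.CompareAndSwap(old, next) {
			return
		}
	}
}

// isPinned is the lock-free check used for nested pointers.
func (s *toySpan) isPinned(idx uintptr) bool {
	b := s.pinnedBits.Load()
	return b != nil && b.words[idx/32].Load()&(uint32(1)<<(idx%32)) != 0
}

func main() {
	s := &toySpan{nelems: 100}
	fmt.Println("bitmap allocated:", s.pinnedBits.Load() != nil)
	s.setPinned(42, true)
	fmt.Println("bitmap allocated:", s.pinnedBits.Load() != nil)
	fmt.Println("object 42 pinned:", s.isPinned(42))
	s.setPinned(42, false)
	fmt.Println("object 42 pinned:", s.isPinned(42))
}
```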

While implementing, a couple of behavioural decisions came up that I want to raise here:

  • Global objects and zero-size objects: should an attempt to pin them panic or be ignored? Are nested pointers to these allowed in C call arguments? I guess both are implicitly pinned anyway, correct?
  • Double pin: What should happen, when the object is already pinned? Panic or ignore? This might also happen when several pointers to the same object are pinned, like different fields of the same struct or different elements of the same array.
  • Pinner leaks: What should happen, when the GC collects an unreachable Pinner that still holds pinned pointers? Silently unpin them or panic? (At the moment my implementation panics.)

In general I would default to panics, because they give feedback to the author about possible issues. But with the double pin I'm not so sure if it might become annoying.

As always: feedback is highly appreciated. Thanks! 😊

@gopherbot

@gopherbot gopherbot commented Nov 28, 2021

Change https://golang.org/cl/367296 mentions this issue: runtime: implement Pinner API for object pinning

Projects
Status: In Progress
Proposals
Accepted
Development

No branches or pull requests