New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime: provide Pinner API for object pinning #46787
Comments
From what I can tell from the documentation for the new Edit: Wait, that doesn't make sense. Could you just use // void doSomethingWithAPointer(int *a);
import "C"
func main() {
v := C.int(3)
h := cgo.NewHandle(&v)
doSomethingWithAPointer(&v) // Safe because the handle exists for that pointer.
h.Delete()
} Alternatively, if that's not feasible, what about a method on // Pointer returns a C pointer that points to the underlying value of the handle
// and is valid for the life of the handle.
func (h Handle) Pointer() C.uintptr_t Disclaimer: I'm not familiar enough with the internals of either the Go garbage collector or Cgo to know if either of these even make sense. |
@DeedleFake As you pointed out yourself, the |
An big advantage of the current cgo mechanisms, including It's hard to say more without a specific API to discuss. If you suggested one, my apologies for missing it. |
@ianlancetaylor thanks for taking the time to answer.
I agree, that is an advantage. However, with go routines it's trivial to fire-and-forget thousands of such function calls, that never return.
I didn't describe a specific API, that's true. I hoped that this could be developed here together once we agreed on the requirements. One of the requirements that I mentioned was, that the pinning happens only for the current scope. That implies automatic unpinning when the scope is left. Sorry that I didn't make that clear enough. So, to rephrase more compactly, the requirements would be:
As stated above, I didn't want to suggest a specific API, but characteristics of it. In the end it could be a function like func ReadFileIntoBufferArray(f *os.File, bufferArray [][]byte) int {
numberOfBuffers := len(bufferArray)
iovec := make([]C.struct_iovec, numberOfBuffers)
for i := range iovec {
bufferPtr := unsafe.Pointer(&bufferArray[i][0])
runtime.PtrEscapes(bufferPtr) // <- pins the pointer and makes it known to escape to C
iovec[i].iov_base = bufferPtr
iovec[i].iov_len = C.size_t(len(bufferArray[i]))
}
n := C.readv(C.int(f.Fd()), &iovec[0], C.int(numberOfBuffers))
// ^^^ cgocheck doesn't complain, because Go pointers in iovec are pinned
return int(n) // <- all pinned pointers in iovec are unpinned
} As long as the GC is not moving, Regarding footguns I'm pretty sure, that the workarounds, that have to be used at the moment to solve these problems, will cause more "programs to silently behave badly" than the potential abuse of a proper pinning method. |
Drawing from package runtime
// Pin prevents the object to which p points from being relocated until
// the returned PointerPin either is unpinned or becomes unreachable.
func Pin[T any](p *T) PointerPin
type PointerPin struct {…}
func (p PointerPin) Unpin() {} Then the example might look like: func ReadFileIntoBufferArray(f *os.File, bufferArray [][]byte) int {
numberOfBuffers := len(bufferArray)
iovec := make([]C.struct_iovec, numberOfBuffers)
for i := range iovec {
bufferPtr := unsafe.Pointer(&bufferArray[i][0])
defer runtime.Pin(bufferPtr).Unpin()
iovec[i].iov_base = bufferPtr
iovec[i].iov_len = C.size_t(len(bufferArray[i]))
}
n := C.readv(C.int(f.Fd()), &iovec[0], C.int(numberOfBuffers))
return int(n)
} A |
@ansiwen when you write "automatic unpinning when the current scope is left (the current function returns)" the current scope you refer to is the scope of the Go function correct? In your example that would be @bcmills version also looks very natural flowing to me, and in that version it's clear that the pointer would be pinned until the defer at the end of |
@phlogistonjohn Yes, exactly.
Yes, I also would prefer @bcmills version from a user's perspective, because it's more explicit and it's basically the same API that we use with PtrGuard. I just don't know enough about the implications on the implementation side and effects on the Go internals, so I don't know what API would be more feasible. My proposal is about providing an official way to solve the described problem. I really don't care so much about the "form", that is how exactly the API looks like. Whatever works best with the current Go and Cgo implementation. |
@bcmills I guess, an argument @ianlancetaylor might bring up against your API proposal is, that it would allow to store the |
So, if you want to enforce the unpinning, the only strict RAII pattern in Go that I could come up with is using a scoped constructor like this API: package runtime
// Pinner is the context for pinning pointers with Pin()
// can't be copied or constructed outside a Pinner scope
type Pinner struct {…}
// Pin prevents the object to which p points from being relocated until
// Pinner becomes invalid.
func (Pinner) Pin(p unsafe.Pointer) {...}
func WithPinner(func(Pinner)) {...} which would be used like this: func ReadFileIntoBufferArray(f *os.File, bufferArray [][]byte) int {
numberOfBuffers := len(bufferArray)
iovec := make([]C.struct_iovec, numberOfBuffers)
var n C.ssize_t
runtime.WithPinner(func (pinner runtime.Pinner) {
for i := range iovec {
bufferPtr := unsafe.Pointer(&bufferArray[i][0])
pinner.Pin(bufferPtr)
iovec[i].iov_base = bufferPtr
iovec[i].iov_len = C.size_t(len(bufferArray[i]))
}
n = C.readv(C.int(f.Fd()), &iovec[0], C.int(numberOfBuffers))
}) // <- All pinned pointers are released here and pinner is invalidated (in case it's copied out of scope).
return int(n)
} I personally would prefer a thinner API, where either it must be explicitly unpinned, like in the proposal of @bcmills, or - even better - the pinning implicitly creates a defer for the scope in which the pinning function has been called from. Given, that this will be implemented in the runtime package, I guess there are tricks and magic that can be used there. |
@ansiwen Even with the Personally, as long as the |
The "keeping-around" can easily be prevented by one pointer indirection that get's invalidated when the scope is left. You can have a look at my implementation of PtrGuard that even has test case for exactly the case of a scope escaping variable.
Yeah, I agree, as I wrote before, I'm totally fine with both. It's just something I came up with to address @ianlancetaylor's concerns. I also think that the risks are "manageable", there are all kinds of other risks when dealing with runtime and/or unsafe packages after all. |
I think that the API proposed by @bcmills is the most useful one. Although there is a risk of forgetting to unpin a pointer, once Go gets a moving garby collector, for certain low level uses, certain blocks of memory will have to stay pinned for the duration of the program. Certainly for system calls in Linux, such as for the frame buffers. In other words, Pin and Unpin are also useful without cgo. |
Hi @rsc, any updates on this issue recently? I noticed it has been several days after the 2021-08-04's review meeting minutes. |
The compiler/runtime team has been talking a bit about this but don't have any clear suggestions yet. The big problem with pinning is that if we ever want a moving garbage collector in the future, pins will make it much more complex. That's why we've avoided it so far. /cc @aclements |
@rsc But my point in the description was, that we have pinning already when C functions are called with Go pointers or when the |
@rsc I would say the converse is also true. If you are going to implement a moving garbage collector without support for pinning, that will make it much more complex to use Go for certain direct operating calls without cgo, e.g. on Linux. |
Unbounded pinning has the potential to be significantly worse than bounded pinning. If people accidentally or intentionally leave many pointers pinned, that can fragment the spaces that the GC uses, and make it very hard for a moving GC to make any progress at all. This can in principle happen with cgo today, but it is unlikely that many programs pass a bunch of pointers to a cgo function that never returns. When programmers control the pinning themselves, bugs are more likely. If the bug is in some imported third party library, the effect will be strange garbage collection behavior for the overall program. This will be hard to understand and hard to diagnose, and it will be hard to find the root cause. (One likely effect will be a set of tools similar to the memory profiler that track pinned pointers.) It's also worth noting that we don't have a moving garbage collector today, so any problems that pinned pointers may introduce for a moving garbage collector will not be seen today. So if we ever do introduce a moving garbage collector, we will have a flag day of hard-to-diagnose garbage collection problems. This will make it that much harder to ever change the garbage collector in practice. So I do not think the current situation is nearly as complex as the situation would be if we add unbounded pinning. This doesn't mean that we shouldn't add unbounded pinning. But I think that it does mean that the argument for it has to be something other than "we can already pin pointers today." |
@ianlancetaylor That is fair enough. But then it seems to me the best way ahead is to put this issue on hold until we can implement a prototype moving garbage collector. There is always a workaround if there is no pinning available and that is to manually allocate memory directly from the OS so the GC doesn't know about it. It is not ideal but it can work. |
Yeah, one workaround that is missing from the discussion is hiding the C api allocation concerns, e.g. iovec could be implemented like:
Or in other words, from the problem statement, it's unclear why it's required to use |
Let's separate the two questions "pinning yes/no" and "pinning bounded/unbounded". pinning yes/noI also proposed
Both provide a similar behaviour as the pinning bounded/unbounded
For me personally the first question is more important. Bounded or unbounded, I think the existing and required ways of pinning should be made less hacky in their usage.
The |
In some manner, yes.
A GC that is based on moving pointers is not the same as a GC that does not move pointers. A GC based on moving pointers may be completely blocked by a pinned pointer, whereas for a non-moving GC a pinned pointer is just the same as a live pointer.
Same answer. Again, all I am saying is that arguments based on "we already support pinned pointers, so it's OK to add more" are not good arguments. We need different arguments. |
How would we deal with the
I'm afraid that would badly impact the GC latency or something else if it is true. Please consider the disk i/o syscall that may block a very long time. |
Since you agreed that the pinning is required in the answer before, I don't understand how such an implementation could be used in Go.
I don't think "add more" is the right wording. It's more about exposing the pinning in a better way. And these are not arguments for doing it, but arguments against the supposed risks of doing it. The argument for doing it should be clear by now: give people a zero-copy way to use APIs like In your answers you skipped the first part about the bounded pinning. If you have the time to comment on these too, I would be very interested. |
The current system for pinning pointers doesn't permit pointers to be pinned indefinitely, if we discount the unusual case of a C function that does not return. I agree that other systems that somehow ensure that pointers can't be pinned indefinitely are better. (I don't think that an implicit |
I guess you're right, if we don't need to check top-level pointers then the thing we're checking already being pinned is the common case. |
The problem with reusing the specials list is, that the GC seems to modify it without lock while sweeping, so you have to synchronize with the GC and acquire the specials lock. If we would use a separate list for pinned pointers, I guess we could realize lock-free iteration. Speaking of bitmaps: a |
I wrote an implementation that uses a Costs if pinning is not used:
Costs if pinning is used:
While implementing, a couple of behavioural decisions came up, that I want to raise here:
In general I would default to panics, because they give feedback to the author about possible issues. But with the double pin I'm not so sure if it might become annoying. As always: feedback is highly appreciated. Thanks! |
Change https://golang.org/cl/367296 mentions this issue: |
Sorry for the delay. As agreed with @ianlancetaylor in the CL I will outline possible approaches here, so we can agree on them, and then proceed to a detailed reviewing of the implementation. TL;DR: we need decisions for
DetailsThere are two code paths where it is necessary to check if Go memory is pinned or not:
These checks should be very fast and must be @aclements requested in the CL, that the API should support multi-pinning, so that the same address could be pinned several times, and needs to be unpinned the same number of times for the memory to become movable again. This requires counters and a dynamic data structure (or otherwise it would require a lot of allocated memory). One suggestion was to use the "specials" list in the span struct. However, that seems to add quite some complexity, which would also be less efficient and might not be necessary, because the two checks above only need to know, if an address is pinned or not, and not how many times. Therefore I want to suggest this approach: All the logic of pinning/unpinning and counting is implemented with normal dynamic Go data structures, and could even live outside of the runtime package. This pinner logic updates accordingly a bool in the span struct that reflects if the address block that the pointer is pointing to is pinned or not. The "address block" can have the granularity that we decide is required or useful:
I assume we would go for the finest granularity (per span object), but I wanted to mention it anyway, because it should probably also match to whatever would be a "movable block" later, if the GC will be moving memory around. The next question would be, which data structure we use for the pinning logic. There are several options, and I even have some implementations for these options from my earlier experiments:
Because this data structure is only used when pinning or unpinning an address, but not for any of the hot-path checks from above, maybe the first and most simple approach is already sufficient? |
Note that cgocheck=2 is changing to a build-time instead of run-time switch, see https://go-review.googlesource.com/c/go/+/447778 . Also, speed for cgocheck=2 is ~irrelevant, slow is not a deal breaker. (This would be for 1.21.) |
Thanks @randall77 for the heads up! Any comments/opinions to my questions from #46787 (comment)? |
I think we do want to track pinning on every object somehow. Just having a bit per span is not sufficient, as we want to accurately report missing pins. (I'm presuming here by "bit per span", you intend that if a bit is set on a span, all objects in that span are considered pinned? That would lead to false negatives from cgocheck.) I don't have a strong opinion on what data structure to use. Something sync.Pool-y maybe? It would be nice to have a small buffer of pinned things per P that would be easy to put something in, take something out, and check presence. Only if that buffer overflows would we need to fall back to something more general. That said, it adds complication. |
I agree with @randall77 that the bit should definitely be per object, but thanks for considering that dimension.
This might just be because I'm used to working with span specials, but your other proposed solutions seem more complicated to me, or at least significantly less efficient. Note that in my proposed approach the pin checks are still a simple bitmap check. Only Pin/Unpin need to consult the specials. Here's a simpler version of my proposal that wouldn't be quite as efficient, but sticks to a single-bit bitmap:
I do like how simple this is, but I worry about the performance of a single global lock on the map. Perhaps we start with this and then optimize in a follow-up?
I have implemented these. They are extremely hard to get right and don't usually have the performance properties you want them to have, especially if the tree is usually small. |
Change https://go.dev/cl/465676 mentions this issue: |
@aclements thanks for your answer. Maybe there is a misunderstanding? My suggestion would have identical efficiency, IIUC, because both our proposals would use the bitmap for the "isPinned()" question. The difference would only be, if the pin counters are stored in the specials list, or in another data structure like a map, which is only accessed by Pin/Unpin functions. So efficiency regarding the gocheck code would be the same for bitmap+specials and bitmap+map. Right?
My proposal would be exactly the same, only that it stores the counter in a map, not in the specials list. If there is a good reason, I can do the specials-list approach, but I though a map-based solution would be less error prone.
That sounds good.
Right.
Yeah, I agree. So the last open question is then: bitmap+(counter in mutex+map) or bitmap+(counter in specials-list). As I said before, I think mutex+map is less error prone and the performance difference for Pin/Unpin should not matter. But I'm happy to be convinced otherwise and go for the specials-list. Thanks |
So, I went ahead and implemented multi-pinning with both approaches, a version with a global mutexed map, and then a version based on specials. Both seem to work fine, so I pushed the specials-based version, because that apeared to be @aclements preference. I still have several questions about some code which I added as comments in the latest patch. PTAL: https://go-review.googlesource.com/c/go/+/367296 |
You mentioned being unable to use |
No, a |
Here are some benchmarks. Certainly they are not representative for real-world-code, but there are some differences. As expected the specials implementation has a higher baseline, but becomes more efficient, when there are multi-pinnings, especially parallel ones. Overall the difference is not big, and for the "common" use cases the map solution is even a bit better. However, the specials solution could be further tuned for the common cases. specials:
map:
|
Updated version beats the maps version in almost all scenarios: specials (improved):
|
Update, 2021-10-20: the latest proposal is the API in #46787 (comment).
Problem Statement
The pointer passing rules state:
and
There are C APIs, most notably the
iovec
based ones for vectored I/O which expect an array of structs that describe buffers to read to or write from. The naive approach would be to allocate both the array and the buffers withC.malloc()
and then either work on the C buffers directly or copy the content to Go buffers. In the case of Go bindings for a C API, which is assumably the most common use case for Cgo, the users of the bindings shouldn't have to deal with C types, which means that all data has to be copied into Go allocated buffers. This of course impairs the performance, especially for larger buffers. Therefore it would be desirable to have a safe possibility to let the C API write directly into the Go buffers. This, however, is not possible becauseObviously, what is missing is a safe way to pin an arbitrary number of Go pointers in order to store them in C memory or in passed-to-C Go memory for the duration of a C call.
Workarounds
Break the rules and store the Go pointer in C memory
(click)
with something like
but
GODEBUG=cgocheck=2
would catch that.However, you can circumvent cgocheck=2 with this casting trick:
This might work, as long as the GC is not moving the pointers, which might be a fact as of now, but is not guaranteed.
Break the rules and hide the Go pointer in Go memory
(click)
with something like
Again: This might work, as long as the GC is not moving the pointers.
GODEBUG=cgocheck=2
wouldn't complain about this.Break the rules and temporarily disable
cgocheck
(click)
If hiding the Go pointer as a uintptr like in the last workaround is not possible, passing Go memory that contains Go pointers usually bails out because of the default
cgocheck=1
setting. It is possible to disable temporarilycgocheck
during a C call, which can especially useful, when the pointer have been "pinned" with one of the later workarounds. For example the_cgoCheckPtr()
function, that is used in the generated Cgo code, can be shadowed in the local scope, which disables the check for the following C calls in the scope:Maybe slightly more robust, is to export the runtime.dbgvars list:
Use a C function to store the Go pointer in C memory
(click)
The rules allow that a C function stores a Go pointer in C memory for the duration of the call. So, for each Go pointer a C function can be called in a Go routine, that stores the Go pointer in C memory and then calls a Go function callback that waits for a release signal. After the release signal is received, the Go callback returns to the C function, the C function clears the C memory from the Go pointer, and returns as well, finishing the Go routine.
This approach fully complies with the rules, but is quite expensive, because each Go routine that calls a C function creates a new thread, that means one thread per stored Go pointer.
Use the
//go:uintptrescapes
compiler directive(click)
//go:uintptrescapes
is a compiler directive thatSo, similar to the workaround before, a Go function with this directive can be called in a Go routine, which simply waits for a release signal. When the signal is received, the function returns and sets the pointer free.
This seems already almost like a proper solution, so that I implemented a package with this approach, that allows to
Pin()
a Go pointer andPoke()
it into C memory: PtrGuardBut there are still caveats. The compiler and the runtime (cgocheck=2) don't seem to know about which pointers are protected by the directive, because they still don't allow to pass Go memory containing these Go pointers to a C function, or to store the pointers in C memory. Therefore the two first workarounds are additionally necessary. Also there is the small overhead for the Go routine and the release signalling.
Proposal
It would make Cgo a lot more usable for C APIs with more complex pointer handling like
iovec
, if there would be a programmatic way to provide what//go:uintptrescapes
provides already through the backdoor. There should be a possibility to pin an arbitrary amount of Go pointers in the current scope, so that they are allowed to be stored in C memory or be contained in Go memory that is passed to a C function within this scope, for example with aruntime.PtrEscapes()
function. It's cumbersome, that it's required to abuse Go routines, channels and casting tricks in order provide bindings to such C APIs. As long as the Go GC is not moving pointers, it could be a trivial implementation, but it would encapsulate this knowledge and would give users a guarantee.I know from the other issues and discussions around this topic that it's seen as dangerous if it is possible to pin an arbitrary amount of pointers. But
//go:uintptrescapes
functions, therefore it is also possible to pin arbitrary amount of Go pointers already.Related issues: #32115, #40431
/cc @ianlancetaylor @rsc @seebs
edit: the first workaround had an incorrect statement.
edit 2: add workarounds for disabling cgocheck
The text was updated successfully, but these errors were encountered: