proposal: unsafe: clarify unsafe.Pointer rules for package syscall #34684

Open
mdempsky opened this issue Oct 3, 2019 · 23 comments

Member

@mdempsky mdempsky commented Oct 3, 2019

unsafe.Pointer's rule 4 says:

Conversion of a Pointer to a uintptr when calling syscall.Syscall.

The Syscall functions in package syscall pass their uintptr arguments directly to the operating system, which then may, depending on the details of the call, reinterpret some of them as pointers. That is, the system call implementation is implicitly converting certain arguments back from uintptr to pointer.

It talks about "the Syscall functions", but on Windows, there's also Proc.Call and LazyProc.Call.

Q1: Are these two functions "Syscall functions"? Strictly speaking, I would interpret "Syscall functions" to mean Syscall{,6,9,12,15,18}. Perhaps the docs should be clarified.

Q2: Do functions have to be called directly, or are indirect calls allowed? The rule explicitly warns that conversions have to be performed directly in the argument list, but it doesn't say anything about indirect calls.

Q3: For Proc.Call and LazyProc.Call, are direct calls via method expressions allowed? E.g., if x.Call(uintptr(p), 0, 0) is valid, then is (*Proc).Call(x, uintptr(p), 0, 0) also valid?
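To make the call forms in Q2 and Q3 concrete, here is a minimal sketch. The `fakeProc` type and `caller` interface are hypothetical stand-ins (the real `syscall.Proc` exists only on Windows), and the fake `Call` never turns its arguments back into pointers, so the sketch only shows the syntax being asked about, not what the compiler guarantees for each form.

```go
package main

import (
	"fmt"
	"unsafe"
)

// fakeProc is a hypothetical stand-in for syscall.Proc. Its Call method
// never converts its arguments back to pointers; it only counts how many
// arguments are non-zero.
type fakeProc struct{}

func (*fakeProc) Call(a ...uintptr) uintptr {
	var n uintptr
	for _, v := range a {
		if v != 0 {
			n++
		}
	}
	return n
}

// caller is a hypothetical interface used to show an indirect call.
type caller interface {
	Call(a ...uintptr) uintptr
}

// callForms invokes Call on the same receiver in four syntactic forms and
// returns the result of each.
func callForms(x *fakeProc, p *int) [4]uintptr {
	// Direct method call.
	direct := x.Call(uintptr(unsafe.Pointer(p)), 0, 0)

	// Direct call through a method expression (Q3).
	expr := (*fakeProc).Call(x, uintptr(unsafe.Pointer(p)), 0, 0)

	// Indirect call through a method value (Q2).
	f := x.Call
	value := f(uintptr(unsafe.Pointer(p)), 0, 0)

	// Indirect call through an interface (Q2).
	var c caller = x
	iface := c.Call(uintptr(unsafe.Pointer(p)), 0, 0)

	return [4]uintptr{direct, expr, value, iface}
}

func main() {
	v := 42
	fmt.Println(callForms(&fakeProc{}, &v))
}
```

All four forms reach the same method; the open question is only which of them the compiler recognizes when deciding to keep `p`'s referent alive.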

Clarifying this is relevant to determining how far we need to go in fixing #34474.

Incidentally, Q2 and Q3 also apply to rule 5. The status quo there is that cmd/compile safely handles unsafe.Pointer(f(...)) for all f(...), so we do allow indirect and method expression calls to reflect.Value.Pointer and reflect.Value.UnsafeAddr.
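As an illustration of the rule 5 status quo described here, the following sketch converts the result of `reflect.Value.Pointer` back to a pointer, once with an ordinary method call and once with a method expression. The helper names are hypothetical, and static checkers may be stricter about the second form than the compiler is.

```go
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

// viaMethodCall uses the form shown in the unsafe documentation: the call to
// Pointer appears directly inside the conversion to unsafe.Pointer.
func viaMethodCall(p *int) *int {
	return (*int)(unsafe.Pointer(reflect.ValueOf(p).Pointer()))
}

// viaMethodExpr performs the same conversion, but calls Pointer through a
// method expression instead of a method call.
func viaMethodExpr(p *int) *int {
	v := reflect.ValueOf(p)
	return (*int)(unsafe.Pointer(reflect.Value.Pointer(v)))
}

func main() {
	x := 42
	fmt.Println(*viaMethodCall(&x), *viaMethodExpr(&x))
}
```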

/cc @rsc @ianlancetaylor

Contributor

@ianlancetaylor ianlancetaylor commented Oct 3, 2019

The part you quoted is the rationale. The real rule is in the part you didn't quote:

The compiler handles a Pointer converted to a uintptr in the argument list of a call to a function implemented in assembly by arranging that the referenced allocated object, if any, is retained and not moved until the call completes, even though from the types alone it would appear that the object is no longer needed during the call.

For the compiler to recognize this pattern, the conversion must appear in the argument list:

My answers, such as they are, to your questions.

Q1: Are these two functions "Syscall functions"? Strictly speaking, I would interpret "Syscall functions" to mean Syscall{,6,9,12,15,18}. Perhaps the docs should be clarified.

Per above, the rule applies to any function implemented in assembly. syscall.Proc.Call and syscall.LazyProc.Call are not implemented in assembly, and, as such, this rule does not apply to them.

Q2: Do functions have to be called directly, or are indirect calls allowed? The rule explicitly warns that conversions have to be performed directly in the argument list, but it doesn't say anything about indirect calls.

I think the functions have to be called directly, and we should update the unsafe package docs to say that.

Q3: For Proc.Call and LazyProc.Call, are direct calls via method expressions allowed? E.g., if x.Call(uintptr(p), 0, 0) is valid, then is (*Proc).Call(x, uintptr(p), 0, 0) also valid?

I suppose that first we have to come up with a rule for Proc.Call and LazyProc.Call.

When I introduced go:uintptrescapes for #16035, my thinking was basically that there was no general rule that could make that code work, but since it already existed and people were using it we should at least try to keep those programs working. It wasn't intended to introduce a new general unsafe.Pointer exception, just one for those two methods.

Member Author

@mdempsky mdempsky commented Oct 3, 2019

The part you quoted is the rationale. The real rule is in the part you didn't quote:

Hm, I've always interpreted the first line as the rule, and everything below as the rationale, restrictions, and details. That is, rule 4 says you can convert unsafe.Pointer to uintptr when calling syscall.Syscall*, and the part you describe as the real rule is just an explanation of how the compilers implement this today (i.e., syscall.Syscall* are implemented as assembly functions, and the compilers generically handle calls to assembly functions this way).

If rule 4 is in fact that all assembly functions should be handled that way (with syscall.Syscall as just a special case thereof), I think we should reword the first sentence to better emphasize that.

Contributor

@ianlancetaylor ianlancetaylor commented Oct 3, 2019

Yes, my interpretation is that this applies to all functions written in assembly, since the rule clearly applies to more than the single function syscall.Syscall. But perhaps I am mistaken.


@beoran beoran commented Oct 4, 2019

I think making //go:uintptrescapes work correctly for these functions is probably the best way to go, since there really isn't a general rule; it depends on the semantics of the OS call.

Member Author

@mdempsky mdempsky commented Oct 4, 2019

@beoran The issue here is we've never defined exactly what it means for //go:uintptrescapes to "work correctly". As far as I can tell, it's not documented anywhere (e.g., neither cmd/compile's godocs, nor src/runtime/HACKING.md).

I've generally been under the assumption that (1) rule 4 is meant to apply specifically to package syscall's functions like Syscall, RawSyscall, Proc.Call, and LazyProc.Call, and (2) //go:uintptrescapes and cmd/compile's semantics for calling assembly functions are implementation details about how we guarantee rule 4 today.

For example, I'm under the impression that while today users can write their own assembly functions that pass pointers as uintptr or //go:uintptrescapes-annotated Go functions that do the same, we do not guarantee this to work in the future and reserve the right at any point to break this (as long as package syscall's Syscall functions continue to work correctly).

However, it seems at least @ianlancetaylor had a different interpretation.

Contributor

@rsc rsc commented Oct 9, 2019

My thoughts, not definitive:

Q1: It seems like the answer must be yes or else those functions are impossible to use safely at all. That's the same reason we answer yes to syscall.Syscall.

Q2: It would be nice if indirect calls worked, but it seems like they probably can't be made to work without significant overhead, and it also seems like they don't work today. If both those are true, then probably we should say that indirect calls don't apply.

Q3: In general there's almost zero difference between x.M() and T.M(x), so I don't see why we'd introduce a difference here.

So I guess I'm suggesting yes/no/yes.

Member Author

@mdempsky mdempsky commented Oct 9, 2019

Thanks for sharing your thoughts, @rsc.

Q2: It would be nice if indirect calls worked, but it seems like they probably can't be made to work without significant overhead, and it also seems like they don't work today. If both those are true, then probably we should say that indirect calls don't apply.

It should only introduce overhead for indirect calls involving unsafe.Pointer->uintptr conversions as arguments, and I don't think those are very common. So I don't expect the overhead would be significant, but I'll measure this.

Q3: In general there's almost zero difference between x.M() and T.M(x), so I don't see why we'd introduce a difference here.

That's my tentative position as well. The two counterarguments here, though, are (1) historically it hasn't worked; and (2) package unsafe imposes its own set of rules, and maybe we think T.M(x) is uncommon enough to not allow it.

@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
Member Author

@mdempsky mdempsky commented Oct 9, 2019

Across the standard library and Kubernetes, the only indirect call that involves an unsafe.Pointer->uintptr conversion is here:

	// Unmap the memory and update m.
	if errno := m.munmap(uintptr(unsafe.Pointer(&b[0])), uintptr(len(b))); errno != nil {
		return errno
	}

(And the same code appears in x/sys/unix.)

For this particular call, the overhead would amount to an extra stack slot and an instruction to populate it.
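For readers who want to see what keeping the argument alive across such an indirect call amounts to, here is a hand-written analog using `runtime.KeepAlive`. It is a sketch only: the `mapper` type and its `munmap` field are hypothetical, modeled loosely on the snippet above, and this is not a description of what the compiler change actually emits.

```go
package main

import (
	"fmt"
	"runtime"
	"unsafe"
)

// mapper is a hypothetical type whose munmap field is a function value, so
// every call through it is an indirect call.
type mapper struct {
	munmap func(addr, length uintptr) error
}

// unmap passes the address of b's backing array through the function value.
func (m *mapper) unmap(b []byte) error {
	if len(b) == 0 {
		return nil
	}
	// Indirect call with an unsafe.Pointer->uintptr conversion in the
	// argument list.
	err := m.munmap(uintptr(unsafe.Pointer(&b[0])), uintptr(len(b)))

	// Explicitly keep b's backing array reachable until the call has
	// returned, without relying on how the compiler treats indirect calls.
	runtime.KeepAlive(b)
	return err
}

func main() {
	var seen uintptr
	m := &mapper{munmap: func(addr, length uintptr) error {
		seen = length // the fake records the length and never dereferences addr
		return nil
	}}
	fmt.Println(m.unmap(make([]byte, 4096)), seen)
}
```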


@gopherbot gopherbot commented Oct 9, 2019

Change https://golang.org/cl/200137 mentions this issue: cmd/compile: warn about indirect calls with unsafe.Pointer->uintptr conversions


@beoran beoran commented Oct 9, 2019

I think //go:uintptrescapes would be nice to have to make indirect calls as in Q2 possible. It makes it easier to keep Go code that interacts with the operating system well factored and easier to read.

Contributor

@ianlancetaylor ianlancetaylor commented Oct 9, 2019

How do you plan to fix the indirect calls?

Member Author

@mdempsky mdempsky commented Oct 9, 2019

How do you plan to fix the indirect calls?

Conservatively assume any indirect function calls might be to an assembly function or a function annotated with //go:uintptrescapes, and handle unsafe.Pointer->uintptr conversions passed to their uintptr-typed arguments appropriately. See CL 200137 for details.

Member Author

@mdempsky mdempsky commented Oct 9, 2019

I think //go:uintptrescapes would be nice to have to make indirect calls as in Q2 possible. It makes it easier to keep Go code that interacts with the operating system well factored and easier to read.

Sorry, I don't understand what you're suggesting here.

In general, users shouldn't have to worry about //go:uintptrescapes. It's an implementation detail about how we make sure package syscall works.

Go doesn't support users writing their own functions that accept or return pointers using uintptr-typed parameters.


@beoran beoran commented Oct 9, 2019

Member Author

@mdempsky mdempsky commented Oct 9, 2019

Notice the KeepAlive? It would be great if it wasn't needed and I could instruct the compiler that everything escapes, maybe with the mentioned pragma.

I agree it would be nice if users didn't have to manually insert runtime.KeepAlive calls when using os.File.Fd. Or if there was at least vet/lint tooling to suggest when they probably need it.

However, file descriptors are not pointers, so the unsafe.Pointer safety rules do not apply to os.File.Fd. You're suggesting entirely new functionality, whereas this issue is about clarifying corner cases of existing functionality.

I recommend filing a new feature request / proposal issue if you'd like to pursue that idea.

Contributor

@rsc rsc commented Oct 30, 2019

@mdempsky, it sounds like you are suggesting that the
"Conversion of a Pointer to a uintptr when calling syscall.Syscall."
section effectively be retitled to
"Conversion of a Pointer to a uintptr when calling a function."
and that after such a function call (any call at all) there would be a keepalive of the pointer inserted immediately after the return from the call. (And whatever pinning is needed during the call.)

Do I have that right?

(You didn't specifically say "any call", but it seems like if you are going to do a fixed set of specific calls as well as all indirect calls, you might as well just complete the set and do all calls. Otherwise direct calls are somehow disadvantaged compared to indirect calls.)

Member Author

@mdempsky mdempsky commented Oct 30, 2019

@rsc I think that's close, yes.

I would say my suggestion was specifically that the conversion is safe when "calling (directly or indirectly) to a fixed set of specific functions".

The particular implementation detail today would be that we handle all indirect calls as though they might be to one of those functions (which has very few false positives), but compiler optimizations might later let us rule some of those out. E.g., calls to non-exported interface methods (outside of package syscall) can never be one of those specific functions; moving escape analysis to SSA might allow us to see that the set of possible called functions at a call-site is disjoint from the set of specific functions; or a Go JIT that does dynamic monomorphic call optimization might know the target function isn't one of those specific functions.

I think further simplifying to "when calling a function" is reasonable and has merits, but that's not specifically my suggestion. My only concern would be this might introduce more unnecessary KeepAlive calls in current code; I'll measure this.


@beoran beoran commented Oct 30, 2019

I am definitely in favour of simplifying this to "when calling a function", as this makes low level programming that much easier. As for unnecessary KeepAlive calls, in this case, I thought KeepAlive is always needed, only now it sometimes accidentally seems to work without the call.

Member Author

@mdempsky mdempsky commented Oct 30, 2019

I am definitely in favour of simplifying this to "when calling a function", as this makes low level programming that much easier.

Note that even if we generalize to "when calling a function", to allow users to write their own functions that accept pointers as uintptr-typed parameters we'll still need to specify how those uintptr parameters can actually be used by the function.

Currently we sidestep that because only the standard library contains this code, and we can ensure it's kept in sync with the compiler's own conventions (e.g., using undocumented compiler directives).


@gopherbot gopherbot commented Nov 1, 2019

Change https://golang.org/cl/198043 mentions this issue: cmd/compile: fix //go:uintptrescapes for basic method calls

gopherbot pushed a commit that referenced this issue Nov 5, 2019
The logic for keeping arguments alive for calls to //go:uintptrescapes
functions was only applying to direct function calls. This CL changes
it to also apply to direct method calls, which should address most
uses of Proc.Call and LazyProc.Call.

It's still an open question (#34684) whether other call forms (e.g.,
method expressions, or indirect calls via function values, method
values, or interfaces) should be handled the same way.

Fixes #34474.

Change-Id: I874f97145972b0e237a4c9e8926156298f4d6ce0
Reviewed-on: https://go-review.googlesource.com/c/go/+/198043
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>

@gopherbot gopherbot commented Nov 5, 2019

Change https://golang.org/cl/205244 mentions this issue: [release-branch.go1.13] cmd/compile: fix //go:uintptrescapes for basic method calls

Contributor

@rsc rsc commented Nov 6, 2019

It sounds like the "any function" rules may be fine but we are waiting on @mdempsky to confirm that there's not unexpected overhead. I'm going to tag this Proposal since it is a real (if small) change to the effective language.

@rsc rsc added the Proposal label Nov 6, 2019
@rsc rsc changed the title unsafe: clarify unsafe.Pointer rules for package syscall proposal: unsafe: clarify unsafe.Pointer rules for package syscall Nov 6, 2019
@gopherbot gopherbot removed the NeedsDecision label Nov 6, 2019
@rsc rsc modified the milestones: Backlog, Proposal Nov 6, 2019
@rsc rsc added this to Active in Proposals Nov 27, 2019
Contributor

@rsc rsc commented Nov 27, 2019

Putting this proposal on hold until @mdempsky has had a chance to check the overheads.
Matthew, please feel free to remove the label when you are ready. No hurry. Thanks.

@rsc rsc added the Proposal-Hold label Nov 27, 2019
@rsc rsc moved this from Active to Hold in Proposals Dec 4, 2019
5 participants