Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unsafe: add Add function #40481

Open
rsc opened this issue Jul 29, 2020 · 27 comments
Open

unsafe: add Add function #40481

rsc opened this issue Jul 29, 2020 · 27 comments

Comments

@rsc
Copy link
Contributor

@rsc rsc commented Jul 29, 2020

@mdempsky has a proposal from 2014 to add unsafe.Add, analogous to the existing (unexported) runtime function

func add(p unsafe.Pointer, x uintptr) unsafe.Pointer {
       return unsafe.Pointer(uintptr(p) + x)
}

#19367 is about adding unsafe.Slice, and #395 is about slice to array conversion. Doing unsafe.Add at the same time would be a good bundle so that people who want to clean up uses of unsafe can do it all at one time.

This issue is to propose unsafe.Add, probably as a function, but maybe as a method on Pointer? Matthew's doc discusses the tradeoffs there.

@rsc rsc added the Proposal label Jul 29, 2020
@rsc rsc added this to Active in Proposals Jul 29, 2020
@rsc rsc added this to the Proposal milestone Jul 29, 2020
@seebs
Copy link
Contributor

@seebs seebs commented Jul 29, 2020

func (base Pointer) Add(offset uintptr) (derived Pointer) { ... }
?

... oh you actually already suggested that. I guess consider this a vote for a method, analogous to time.Add(). And this implies, perhaps, func (derived Pointer) Sub(base Pointer) (offset uintptr) { ... }

@mdempsky
Copy link
Member

@mdempsky mdempsky commented Jul 29, 2020

@seebs The proposal mentions Sub at the bottom too.

@mdempsky
Copy link
Member

@mdempsky mdempsky commented Jul 29, 2020

In addition to unsafe.Add (or unsafe.Pointer.Add), I think it would also be worth considering adding some methods to package reflect like:

func (h *SliceHeader) DataPointer() unsafe.Pointer { return unsafe.Pointer(h.Data) }
func (h *SliceHeader) SetDataPointer(p unsafe.Pointer) { h.Data = uintptr(p) }

func (h *StringHeader) DataPointer() unsafe.Pointer { return unsafe.Pointer(h.Data) }
func (h *StringHeader) SetDataPointer(p unsafe.Pointer) { h.Data = uintptr(p) }

func (v Value) UnsafePointer() unsafe.Pointer { return unsafe.Pointer(v.Pointer()) }

If these methods were in place, we could issue vet warnings against using accessing Data directly or calling Value.Pointer or Value.Addr. unsafe.Pointer(v.UnsafeAddr()) can be rewritten to v.Addr().UnsafePointer(), but we could also add variant of UnsafeAddr that returns unsafe.Pointer directly too.

Together with unsafe.Add, I think this would eliminate any need for uintptr -> unsafe.Pointer conversions. For backwards compatibility, we'd still need to support them; but we'd at least be able to guide users towards less-error-prone solutions.

Historically, package reflect couldn't return unsafe.Pointer values directly, because of safe mode, which restricted use of unsafe.Pointer. But we removed safe mode in 2018, as we have better ways to sandbox untrusted Go programs nowadays and restricting use of unsafe.Pointer is intrusive upon the Go ecosystem.

@seebs
Copy link
Contributor

@seebs seebs commented Jul 29, 2020

Hmm. I sort of like that, but a concern would be that it's potentially wrong to set the data pointer without first setting len/cap appropriately, so if you have a SetDataPointer, but not the other parts, I feel like this will be a temptation to do unsafe things.

(And just a wild loose-end idea: Distinguish between runtime.Pointer and unsafe.Pointer, with runtime.Pointer only permitting/admitting known-safe operations, while unsafe.Pointer admits risky ones? Probably not useful, though.)

@mdempsky
Copy link
Member

@mdempsky mdempsky commented Jul 29, 2020

it's potentially wrong to set the data pointer without first setting len/cap appropriately

The order that you set data/len/cap don't matter as long as you set all of them before using the underlying string/slice again.

However, it's a fair point that maybe:

func (h *SliceHeader) Get() (data unsafe.Pointer, len, cap int)
func (h *SliceHeader) Set(data unsafe.Pointer, len, cap int)

func (h *StringHeader) Get() (data unsafe.Pointer, len int)
func (h *StringHeader) Set(data unsafe.Pointer, len int)

would be a useful abstraction instead or as well. In fact, I think abstracting away the fields into methods like this would address the long-term compatibility issues that "It cannot be used safely or portably and its representation may change in a later release." are meant to caveat. No matter the underlying representation, these methods could still work.

Though this also starts trending towards overlap with the earlier suggestions of #19367.

And just a wild loose-end idea: Distinguish between runtime.Pointer and unsafe.Pointer, with runtime.Pointer only permitting/admitting known-safe operations, while unsafe.Pointer admits risky ones?

Do you have any known-safe operations in mind that aren't already supported by interface{} or reflect.Value?

@seebs
Copy link
Contributor

@seebs seebs commented Jul 29, 2020

I saw a thing, somewhere, which said in passing that it was important to set len/cap to either 0 or the smaller of the values before setting data, or else you'd have an inconsistent state, but on consideration I can't see why it would matter unless something else was accessing the slice concurrently, which it presumably shouldn't be. I'm now trying to remember where I saw this, and what the rationale was.

As to known-safe: My vague idea was roughly that a runtime.Pointer might be a richer datatype, which also had information about known bounds, so it could reject or fail an Add() or a Sub() with invalid parameters, while unsafe.Pointer would assume you know what you're doing. This is probably a large and separate idea, and probably not a great idea, but I am a bit stream-of-consciousness some days. So, runtime.Pointer wouldn't imply genuine and complete knowledge of a pointer's real bounds, it would just be working from provenance; if you have a var a [7]int, and you get a runtime.Pointer to a[0], it's the address plus the knowledge that the bounds are [0,7*sizeof(int)], so it can detect/diagnose modifications.

I'm admittedly also coming up light on circumstances where you have enough information that this would be useful, but you also have any reason to use pointers and pointer arithmetic.

@mdempsky
Copy link
Member

@mdempsky mdempsky commented Jul 29, 2020

it was important to set len/cap to either 0 or the smaller of the values before setting data, or else you'd have an inconsistent state

Yeah, this at least isn't an issue under any existing Go implementation (to my knowledge). The Go runtime's GC only pays attention to the Data field; it doesn't care about the Len or Cap fields. And when it comes to concurrent access, you'll need to synchronize access to the entire slice/string value anyway, so the order you update individual fields doesn't matter. (Unless you're doing sequenced atomic stores of each field, but then you need matched atomic loads at access sites; at which point you're talking about an application-specific synchronization protocol, not a general Go coding pattern.)

My vague idea was roughly that a runtime.Pointer might be a richer datatype, which also had information about known bounds, so it could reject or fail an Add() or a Sub() with invalid parameters

I think you're describing reflect.Value. :)

@beoran
Copy link

@beoran beoran commented Aug 5, 2020

I like the original proposal, so I'd also like to get the func unsafe.Sub(p, q Pointer) uintptr, func unsafe.Less(p, q Pointer) uintptr, and func unsafe.IsSameVariable(p, q Pointer) bool as well, please :) . With all of them in place, pointer arithmetic will become a lot easier to do right in Go.

@rsc
Copy link
Contributor Author

@rsc rsc commented Aug 5, 2020

@beoran, while you might use unsafe.Add with the offsets you find in reflect information (particularly in StructField), I can't think of a corresponding time when you should ever use unsafe.Sub. Pointer arithmetic like in C is not safe, because you may construct an invalid pointer (C doesn't care, but Go does), nor is it a goal. Am I missing some typical use of subtracting pointers in Go? I've written plenty of unsafe code and can't remember ever doing that.

@mdempsky, we should probably start a separate discussion for the SliceHeader/StringHeader changes. It's unclear to me how much is left after unsafe.Slice is added. It's also unclear to me whether we should do new methods or just define better new types (with an unsafe.Pointer-typed field).

@rsc
Copy link
Contributor Author

@rsc rsc commented Aug 5, 2020

Other topics aside, the reception to adding unsafe.Add seems overwhelmingly positive on the reactions, with no objections in the text here.

This seems like a likely accept.
(Not trying to cut off the discussion of other changes - we can certainly discuss them separately.)

@rsc rsc moved this from Active to Likely Accept in Proposals Aug 5, 2020
@seebs
Copy link
Contributor

@seebs seebs commented Aug 5, 2020

I can think of a case where I would want a thing that I would call subtraction. I don't think it's quite applicable, but I'll post it in case it reminds someone of a thing that would be relevant:

The one case I've seen for pointer subtraction in C is not actually probably defined behavior there, but "everyone knows it has to work", and that is that if you know that an object is in fact a field of a struct, you can take a pointer to the object and go back to a pointer to the containing structure, using its offset. (And in C, the pointer to the inner object is allowed to "know" that it is a pointer to that specific object, rather than a pointer to the containing object, and thus have narrower bounds. I think. If you got three or four committee members together and gave them drinks they could probably give you at least two credible answers.)

The use case for this is, for instance, to have one or more linked-list data structures threaded through structures, and you can iterate over the lists using code that only knows about the list structure, and then go back to the containing parent structure, so you can have generic list-management code working on lists that are actually part of larger objects, and this is sometimes convenient.

But... while that's "subtracting" in arithmetic terms, it's not subtracting in type terms, it's adding a negative offset.

I can't actually think of a use case for pointer.Sub(pointer) -> offset that I can't do better some other way. I thought of it for the same reason that Time.Add(Duration) -> Time sort of implies Time.Sub(Time) -> Duration, but while subtracting times is common, I can't actually think of much of a use for subtracting pointers, in general. I guess as a way to compute OffsetOf, except that I think any time I'd want that, I would be able to do it with Reflect or with OffsetOf or ... basically, yeah, I can't think of anything.

@rsc rsc changed the title proposal: add unsafe.Add unsafe: add Add Aug 5, 2020
@beoran
Copy link

@beoran beoran commented Aug 5, 2020

Considering it more, with what @seebs is saying, I think that it's true that substracting pointers as such is useless. However, a func unsafe.Containing(p Pointer, offset uintptr) Pointer for use in conjunction with unsafe.Offsetof to go from a pointer to a struct member to the containing struct would be quite useful, for C interoperability, and low level internal linked lists. Shall we split this off in a separate issue or keep it here?

@mdempsky
Copy link
Member

@mdempsky mdempsky commented Aug 5, 2020

@beoran You can use unsafe.Add(p, -n) to subtract an offset from a pointer.

Caveat: If n is a positive constant, you'll hit issues with not allowing negative values of type uintptr. This can be circumvented either by first assigning it to a local variable, or using unsafe.Add(p, ^n + 1).

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Aug 5, 2020

@beoran I think unsafe.Contains is more controversial than unsafe.Add, so, yes, I think that that should be a separate proposal. (I'll note that I don't understand how it could be implemented, so I must misunderstand what it means.)

@mdempsky
Copy link
Member

@mdempsky mdempsky commented Aug 5, 2020

I think @beoran's proposed unsafe.Containing(p, x) is essentially just unsafe.Add(p, -x). I understand the name as "return a pointer to the containing structure where p is at offset x" (e.g., unsafe.Containing(&p.x, unsafe.Offsetof(p.x)) == &p). But I'll defer to them if I've misunderstood.

@beoran
Copy link

@beoran beoran commented Aug 5, 2020

Yes, it is as @mdempsky says. If the implementation of unsafe.Add will allow unsafe.Add(p, -x) with a negative offset, then that is sufficient for my purposes, and no unsafe.Containing will be needed.

@rsc
Copy link
Contributor Author

@rsc rsc commented Aug 12, 2020

Still a likely accept. Waiting on #395.

@kortschak
Copy link
Contributor

@kortschak kortschak commented Aug 14, 2020

Arguably in the same family of operations is the ability to obtain a memory offset for indexed items in a slice or array (#12445). Could this be considered in package of additions (or nearby)?

@rsc
Copy link
Contributor Author

@rsc rsc commented Aug 26, 2020

@kortschak I created #41049 for that suggestion. Thanks.

@rsc
Copy link
Contributor Author

@rsc rsc commented Sep 2, 2020

We started looking at this because of the unsafe conversion issues, but this is not a conversion and is mostly unrelated to those. The outcome for those seems unclear still, but this one can probably be separated from them.

This was already a likely accept, and there's been no change in consensus. Accepted.

Update: To clarify, what is accepted is adding func Add(p Pointer, n uintptr) Pointer to package unsafe. (Not a method on Pointer.)

@rsc rsc moved this from Likely Accept to Accepted in Proposals Sep 2, 2020
@rsc rsc modified the milestones: Proposal, Backlog Sep 2, 2020
@muhlemmer
Copy link
Contributor

@muhlemmer muhlemmer commented Sep 10, 2020

@rsc; May I volunteer to send a CL? I would like to gain some more experience in committing to the Go project in my spare time.

To check if I understand correctly:

  1. This issue (and subsequent CL) is only about the Add method on the unsafe.Pointer type. Will other changes that where discussed here be done separately?
  2. The method will return a new intsance of unsafe.Pointer, the instance of pointer on which the method Add is called remains unchanged.
@rsc rsc changed the title unsafe: add Add unsafe: add Add function Sep 10, 2020
@rsc
Copy link
Contributor Author

@rsc rsc commented Sep 10, 2020

@muhlemmer Package unsafe is not a normal Go package written in Go source files - it is a special concept implemented entirely in the compiler. While contributions are welcome, if you don't have any experience in the compiler, it is likely to take a while to get up to speed, and you might not want to spend that time unless you expect to work more in the compiler in the future.

I've also clarified above that the accepted proposal is for a top-level function unsafe.Add, not a method on Pointer.

@muhlemmer
Copy link
Contributor

@muhlemmer muhlemmer commented Sep 16, 2020

Thanks for the warning. I've been looking around, but I'm afraid this is yet out my reach. Or at least, it might take quite some time until I get my head around this. It is probably best if someone with experience does this.

@komuw
Copy link
Contributor

@komuw komuw commented Oct 3, 2020

@muhlemmer
Mathew worked on this recently and livestreamed as he did so.
https://www.twitch.tv/videos/758956772?t=0h41m26s (Starts at around the 41st minute mark).

apologies for spamming the thread; but I thought you might be interested having expressed interest in working on this. I found the session quite helpful for someone who has never contributed to the compiler.

@bradfitz
Copy link
Contributor

@bradfitz bradfitz commented Oct 21, 2020

@mdempsky, I didn't watch that whole video, but are you targeting Go 1.16 for this?

@mdempsky
Copy link
Member

@mdempsky mdempsky commented Oct 21, 2020

@bradfitz I had 1.16 in mind, but Russ milestoned #19367 for Go 1.17 last week, so I assumed this was too and haven't prioritized working on either of them further for 1.16. I think they're both still feasible to get in for 1.16 though.

@rsc
Copy link
Contributor Author

@rsc rsc commented Oct 21, 2020

I hadn't remembered this but yeah they probably should all go out together as part of a coherent update to unsafe. Thanks for implementing, and let's save for Go 1.17.

@rsc rsc modified the milestones: Backlog, Go1.17 Oct 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Proposals
Accepted
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
9 participants
You can’t perform that action at this time.