Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: unsafe: add Slice and String types, conversions to replace reflect.{Slice,String}Header #19367

Open
mdempsky opened this issue Mar 2, 2017 · 42 comments

Comments

@mdempsky
Copy link
Member

@mdempsky mdempsky commented Mar 2, 2017

reflect.SliceHeader and reflect.StringHeader are clumsy to use because their Data fields have type uintptr instead of unsafe.Pointer.

This proposal is to add types unsafe.Slice and unsafe.String as replacements. They would be declared just like their package reflect analogs, except with unsafe.Pointer-typed Data fields:

type Slice struct {
    Data Pointer
    Len int
    Cap int
}

type String struct {
    Data Pointer
    Len int
}

Additionally, I suggest that for the purposes of type conversion, we treat that string and unsafe.String have the same underlying type, and also []T and unsafe.Slice. For example, these would be valid:

func makestring(p *byte, n int) string {
    // Direct conversion of unsafe.String to string.
    return string(unsafe.String{unsafe.Pointer(p), n})
}

func memslice(p *byte, n int) (res []byte) {
    // Direct conversion of *[]byte to *unsafe.Slice, without using unsafe.Pointer.
    s := (*unsafe.Slice)(&res)
    s.Data = unsafe.Pointer(p)
    s.Len = n
    s.Cap = n
    return
}

While the same results can be achieved using unsafe.Pointer conversions, by using direct conversions the compiler can provide a little extra type safety.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Mar 2, 2017

If we do this, we should figure out a way to exempt these new types from the Go 1 compatibility guarantee, so that we can change the representation of strings and slices in the future. I'm not sure how best to do that.

@cespare

This comment has been minimized.

Copy link
Contributor

@cespare cespare commented Mar 2, 2017

@ianlancetaylor reflect.SliceHeader and reflect.StringHeader already try:

It cannot be used safely or portably and its representation may change in a later release.

but the compat doc itself gives a strong exemption for all of unsafe:

Packages that import unsafe may depend on internal properties of the Go implementation. We reserve the right to make changes to the implementation that may break such programs.

ISTM that unsafe.{Slice,String} would already be exempted sufficiently.

@rsc

This comment has been minimized.

Copy link
Contributor

@rsc rsc commented Mar 6, 2017

Go 2 seems like the time to think about this (and reflect.SliceHeader etc).

-rsc for @golang/proposal-review

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Mar 23, 2017

This proposal seems a bit redundant with #13656.

How much of the use-case is "create a string or slice aliasing C memory" vs. "manipulate existing strings and slices by tweaking header fields unsafely"?

@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Oct 18, 2019

I'd like to suggest renewing consideration of this proposal for Go 1.14. I think it will be useful for users trying to address issues flagged by -d=checkptr.

I'll also offer a counter-proposal that I think better addresses most end user needs in a more ergonomic manner:

package unsafe

func Slice(ptr *ArbitraryType, len, cap int) []ArbitraryType
func String(ptr *byte, len int) string

[Edit: As discussed below, I'm now in favor of combining Slice's len/cap parameter into a single parameter.]

This is a little less versatile than exposing the Header types, but I think it will minimize typing for most users, while also providing better type safety.

We could also do both this proposal and my original one, if we want to still offer the full flexibility of the Header types. In that case, I would suggest renaming the types to SliceHeader and StringHeader, and reserve the shorter Slice and String identifiers for the constructor functions.

@bradfitz

This comment has been minimized.

Copy link
Member

@bradfitz bradfitz commented Oct 18, 2019

I like that counter proposal API.

@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Oct 18, 2019

A few additional thoughts to add to my counter proposal:

  1. We should decide what happens when len < 0 or cap < len. I'm leaning towards panic, but maybe we should just leave it unspecified/undefined.

    Edit: ptr == nil && len > 0 is another case to consider.

    Edit 2: Also, len > MAXWIDTH / unsafe.Sizeof(*ptr).

  2. The functions would be builtins; in particular, users can't write f := unsafe.String; f(...).

  3. The cap argument to unsafe.Slice can be optional; if it's omitted, the len argument is used. (Just like make([]T, n) is shorthand for make([]T, n, n).)

  4. Perhaps the int parameters should actually follow the same goofy semantics that make([]T, n, m) follows. (I.e., make([]T, uint64(10), int8(20)) is valid, even though uint64 and int8 aren't normally assignable to int.)

  5. Since unsafe.String would be a builtin, it could evaluate to an untyped string.

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Oct 18, 2019

That API is closer to what I had suggested in https://golang.org/issue/13656#issuecomment-303216308, and we've been using that variant within Google for a couple of years now without complaints.

If the type desired for the slice does not match the pointer that the user has (for example, if one is a cgo-generated type and the other is a native Go type), I'm assuming that the caller could do something like:

	var s = unsafe.Slice((*someGoType)(unsafe.Pointer(cPtr)), len, cap)

to set the element type?

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Oct 18, 2019

We should decide what happens when len < 0 or cap < len. I'm leaning towards panic, but maybe we should just leave it unspecified/undefined.

I would leave it unspecified, but panic is a fine implementation of “unspecified”.

Perhaps the int parameters should actually follow the same goofy semantics that make([]T, n, m) follows.

That would certainly smooth out the call site in the (overwhelmingly common) case that len and/or cap is a C.size_t.

@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Oct 18, 2019

If the type desired for the slice does not match the pointer that the user has (for example, if one is a cgo-generated type and the other is a native Go type), I'm assuming that the caller could do something like:

	var s = unsafe.Slice((*someGoType)(unsafe.Pointer(cPtr)), len, cap)

to set the element type?

Yeah, that's my thought. If a user wants to convert *T into []U, then I think it's reasonable to require an explicit conversion there.

@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Oct 18, 2019

I would leave it unspecified, but panic is a fine implementation of “unspecified”.

Ack, though my concern is if we panic by default, then users might come to rely on it panicking and not write their own checking.

It would be easy to put the panic behind -d=checkptr though.

@bradfitz

This comment has been minimized.

Copy link
Member

@bradfitz bradfitz commented Oct 18, 2019

func Slice(ptr *ArbitraryType, len, cap int) []ArbitraryType

Can we instead do:

func Slice(ptr *ArbitraryType, len int[, cap int]) []ArbitraryType

... with an optional cap. Where omitting cap means cap == len?

@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Oct 18, 2019

@bradfitz Yeah, that's my additional thought #3 above. :)

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Oct 18, 2019

if we panic by default, then users might come to rely on it panicking and not write their own checking.

Hmm, good point. We could make it a throw! 😉

Or we could make it a panic in ordinary code but a throw under -race or -d=checkptr. (The important thing, I think, is to vary it just enough that it causes tests to fail in some reasonably-common configuration.)

@jimmyfrasche

This comment has been minimized.

Copy link
Member

@jimmyfrasche jimmyfrasche commented Oct 18, 2019

Do you need to be able to specify len < cap? After it's constructed, it's a slice.

@bradfitz

This comment has been minimized.

Copy link
Member

@bradfitz bradfitz commented Oct 18, 2019

@jimmyfrasche, good point. Requiring only cap would simplify the API and remove a number of behavior questions.

@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Oct 18, 2019

Yeah, I like simplifying it to just:

func Slice(ptr *ArbitraryType, cap int) []ArbitraryType

It's not much more typing to write unsafe.Slice(ptr, cap)[:len] instead of unsafe.Slice(ptr, len, cap), and like @bradfitz points out it's a little simpler to specify and implement.

@randall77

This comment has been minimized.

Copy link
Contributor

@randall77 randall77 commented Oct 18, 2019

It's not much more typing to write unsafe.Slice(ptr, cap)[:len] instead of unsafe.Slice(ptr, len, cap), and like @bradfitz points out it's a little simpler to specify and implement.

This was my argument against the 3-arg slice op: a[x:y:z] = a[x::z][:y-x], so we would only need the 2-arg [x:y] and [x::y] operators. I lost that argument :(

@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Oct 18, 2019

https://go-review.googlesource.com/c/go/+/202080 contains a prototype implementation of the counter-proposal for cmd/compile.

Edit: And CL 202082 demonstrates some usage of it within the Go runtime.

@gopherbot

This comment has been minimized.

Copy link

@gopherbot gopherbot commented Oct 18, 2019

Change https://golang.org/cl/202080 mentions this issue: cmd/compile: implement unsafe.Slice and unsafe.String

@gopherbot

This comment has been minimized.

Copy link

@gopherbot gopherbot commented Oct 18, 2019

Change https://golang.org/cl/202082 mentions this issue: runtime: make use of unsafe.Slice

@gopherbot

This comment has been minimized.

Copy link

@gopherbot gopherbot commented Oct 21, 2019

Change https://golang.org/cl/201839 mentions this issue: cmd/compile: recognize (*[Big]T)(ptr)[:n:m] pattern for -d=checkptr

gopherbot pushed a commit that referenced this issue Oct 21, 2019
A common idiom for turning an unsafe.Pointer into a slice is to write:

    s := (*[Big]T)(ptr)[:n:m]

This technically violates Go's unsafe pointer rules (rule #1 says T2
can't be bigger than T1), but it's fairly common and not too difficult
to recognize, so might as well allow it for now so we can make
progress on #34972.

This should be revisited if #19367 is accepted.

Updates #22218.
Updates #34972.

Change-Id: Id824e2461904e770910b6e728b4234041d2cc8bc
Reviewed-on: https://go-review.googlesource.com/c/go/+/201839
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Dec 11, 2019

@rsc suggests adding a new, optional, parameter to make of a slice type:

    make([]byte, p, l, c)

This example would make a new slice of type []byte with the underlying array set to p, the length set to l, and the capacity set to c. This could be used with any slice type. The new p parameter would be required to have type unsafe.Pointer (which would permit distinguishing this new case from existing ones, in which the second parameter must have integer type or be an untyped constant). The capacity parameter c would be optional as it is today.

@josharian

This comment has been minimized.

Copy link
Contributor

@josharian josharian commented Dec 11, 2019

I don't see how further overloading make buys that much in terms of parsimony. And it makes it less obvious at call sites what is going on; the types of the parameters can be effectively invisible to a human reader of code, but unsafe.Slice never will be.

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Dec 11, 2019

I don't think make is actually a great fit. For all other types, make allocates some kind of backing store (on either the stack or the heap) and returns header that refers to that backing store.

In contrast, punning an unsafe.Pointer to a slice does not allocate any kind of backing store — it only returns a header that refers to the existing data. That feels like a different operation to me — more like a conversion than a make.

@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Dec 11, 2019

My initial reaction is that having make construct a []T from unsafe.Pointer is less type-safe than unsafe.Slice constructing []T from *T, but after re-reviewing the likely use sites, it seems less of a problem than I initially suspected. In most of the cases in CL 202082, the *T was explicitly converted from unsafe.Pointer anyway.

I think like @josharian says, it might be tricky reading the code to understand what's going on, but I think technically it works and addresses the issue.

@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Dec 11, 2019

@bcmills I think there's some analogy to be made to C++'s placement new operator. Normally new allocates new memory, but placement new uses existing memory.

@bradfitz

This comment has been minimized.

Copy link
Member

@bradfitz bradfitz commented Dec 11, 2019

So we going to call it "placement make"?

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Dec 11, 2019

The ergonomics of the “placement make” form seem worse compared to unsafe.Slice. To draw an example from https://golang.org/cl/202082:

Under @mdempsky's proposal, we have:

	scases := unsafe.Slice(cas0, ncases)

with the element type propagated from element type of the existing cas0 variable.

In contrast, under @rsc's alternative we would have:

	scases := make([]scase, unsafe.Pointer(cas0), ncases)
@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Dec 11, 2019

Riffing on the theme of “feels more like a conversion”, maybe we could allow arrays of non-constant type in conversions, provided that they are immediately sliced:

	scases := (*[ncase]scase)(unsafe.Pointer(cas0))[:]
@randall77

This comment has been minimized.

Copy link
Contributor

@randall77 randall77 commented Dec 11, 2019

If we wait for generics unsafe.Slice is no longer magical:

package unsafe
func Slice<T>(p *T, cap int) []T

We could do unsafe.String now, no generics required.

The reverse operation for Slice is easy, just &s[0] with a special condition for len==0. Maybe we also want the reverse operation for strings?

// Caller must not modify the memory pointed to by the return value
func StringPtr(s string) *byte // or unsafe.Pointer?

or maybe

// Caller must not modify the memory pointed to by the return value
func StringAsSlice(s string) []byte
@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Dec 11, 2019

@bcmills I think that's what I was referring to about unsafe.Pointer vs *T. While that function is a bit less ergonomic under @rsc's proposal, other ones like:

xmhdr := unsafe.Slice((*method)(add(unsafe.Pointer(x), uintptr(x.moff))), nt)
methods := unsafe.Slice((*unsafe.Pointer)(unsafe.Pointer(&m.fun[0])), ni)

would become:

xmhdr := make([]method, add(unsafe.Pointer(x), uintptr(x.moff)), nt)
methods := make([]unsafe.Pointer, unsafe.Pointer(&m.fun[0]), ni)

which are shorter.

I think select.go isn't a great example either. That function is called by code generated by the Go compiler, which has its own separate type signatures for the function anyway, so the type safety there is superficial. We could just change the cas0 parameter's type to unsafe.Pointer.

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Dec 11, 2019

@randall77, note that I proposed essentially the same thing using reflect in #13656 (comment).

@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Dec 11, 2019

Riffing on the theme of “feels more like a conversion”, maybe we could allow arrays of non-constant type in conversions, provided that they are immediately sliced:

I think this would work, but we'd probably need to decide how go/types should handle it. In particular, what go/types.Type would one of these non-constant array type literals have?

@josharian

This comment has been minimized.

Copy link
Contributor

@josharian josharian commented Dec 11, 2019

You could spell this:

scases := (*[ncase]scase)(unsafe.Pointer(cas0))[:]

like this instead:

scases := (*[...]scase)(unsafe.Pointer(cas0))[:ncase]

and then, following the discussion in #13656, resolve ... in this case to be the largest possible for the type.

(Although I can't say I really like either way of spelling it.)

@bradfitz bradfitz added the dotdotdot label Dec 11, 2019
@bradfitz

This comment has been minimized.

Copy link
Member

@bradfitz bradfitz commented Dec 11, 2019

We could do unsafe.String now, no generics required.

I don't think we want to do unsafe.String at all. It's too unsafe. (It's too easy to create a string that then survives forever with a broken promise and changing memory underneath).

If people want a string, they can string(byteSlice) and get a safe copy.

@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Dec 12, 2019

I can see the appeal of "placement make" from a Go language perspective (i.e., it's making a []T value, and exposing more user control over the component element values). And I can understand why CL 202082 would make it seems like that's a better solution than unsafe.Slice.

But I think that's because CL 202082 is focused solely on the Go runtime, where we end up having to do a lot of manual pointer arithmetic anyway.

I expect the more common case where users will have a (ptr, len) pair and want to construct a slice will be users working with C APIs. In that case, cgo provides a type-safe *T pointer value for them, and it seems unfortunate to require them to write make([]T, unsafe.Pointer(ptr), len) instead of unsafe.Slice(ptr, len).

So I think I'm still leaning towards unsafe.Slice(ptr, len) as the least error-prone solution for users.

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Dec 12, 2019

@mdempsky

In particular, what go/types.Type would one of these non-constant array type literals have?

Fortunately, the documentation for go/types already gives us a good answer. Per https://golang.org/pkg/go/types/#NewArray:

A negative length indicates an unknown length.

The go/types.Type for the result of the conversion would be *Pointer. Call that type p. Then we have:

  • p.Elem() has concrete type *types.Array.
  • p.Elem().Elem() is the element type indicated in the conversion expression.
  • p.Elem().Len() is negative (presumably -1), since the value of the length expression is only known an run-time.

If such an expression is bound to a variable, I would decay the type of that variable to *[0]T, analogous to how an untyped integer constant decays to int.

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Dec 12, 2019

@josharian
The problem with spelling it

	scases := (*[...]scase)(unsafe.Pointer(cas0))[:ncase]

is that the slice expression normally only sets the length. It seems inconsistent to have that also set the capacity in this case.

In contrast, I want the resulting slice to always have both a well-defined length and capacity.

We could instead require the three-argument slice form for arrays of indeterminate length:

	scases := (*[...]scase)(unsafe.Pointer(cas0))[:ncase:ncase]

but that seems like unnecessary duplication for the vast majority of real use-cases.

@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Dec 12, 2019

@bcmills

Fortunately, the documentation for go/types already gives us a good answer.

Good point. That does seem like a good solution.

If such an expression is bound to a variable, I would decay the type of that variable to *[0]T, analogous to how an untyped integer constant decays to int.

I'd suggest leaving *[nonConst]T an error outside of a slice expression, in which case it's fine to leave the variable type as *[-1]T instead of decaying it to something else.

(That said, I think (*[n]T)(p)[:] has the same issue as make([]T, p, n) that it requires T to be explicitly typed by the user, and thus loses some type safety for common uses.)

@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Dec 12, 2019

@randall77

If we wait for generics unsafe.Slice is no longer magical:

Package unsafe is inherently magical, so I think waiting for generics to make it non-magical only makes sense if we're wanting to put unsafe.Slice in a different package. But since it's an operation that the compiler can't guarantee is safe, package unsafe seems like the right place to me to make it stand out during code review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
10 participants
You can’t perform that action at this time.