Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: unsafe: add Slice and String types, conversions to replace reflect.{Slice,String}Header #19367

Open
mdempsky opened this issue Mar 2, 2017 · 52 comments

Comments

@mdempsky
Copy link
Member

@mdempsky mdempsky commented Mar 2, 2017

reflect.SliceHeader and reflect.StringHeader are clumsy to use because their Data fields have type uintptr instead of unsafe.Pointer.

This proposal is to add types unsafe.Slice and unsafe.String as replacements. They would be declared just like their package reflect analogs, except with unsafe.Pointer-typed Data fields:

type Slice struct {
    Data Pointer
    Len int
    Cap int
}

type String struct {
    Data Pointer
    Len int
}

Additionally, I suggest that for the purposes of type conversion, we treat that string and unsafe.String have the same underlying type, and also []T and unsafe.Slice. For example, these would be valid:

func makestring(p *byte, n int) string {
    // Direct conversion of unsafe.String to string.
    return string(unsafe.String{unsafe.Pointer(p), n})
}

func memslice(p *byte, n int) (res []byte) {
    // Direct conversion of *[]byte to *unsafe.Slice, without using unsafe.Pointer.
    s := (*unsafe.Slice)(&res)
    s.Data = unsafe.Pointer(p)
    s.Len = n
    s.Cap = n
    return
}

While the same results can be achieved using unsafe.Pointer conversions, by using direct conversions the compiler can provide a little extra type safety.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Mar 2, 2017

If we do this, we should figure out a way to exempt these new types from the Go 1 compatibility guarantee, so that we can change the representation of strings and slices in the future. I'm not sure how best to do that.

@cespare

This comment has been minimized.

Copy link
Contributor

@cespare cespare commented Mar 2, 2017

@ianlancetaylor reflect.SliceHeader and reflect.StringHeader already try:

It cannot be used safely or portably and its representation may change in a later release.

but the compat doc itself gives a strong exemption for all of unsafe:

Packages that import unsafe may depend on internal properties of the Go implementation. We reserve the right to make changes to the implementation that may break such programs.

ISTM that unsafe.{Slice,String} would already be exempted sufficiently.

@rsc

This comment has been minimized.

Copy link
Contributor

@rsc rsc commented Mar 6, 2017

Go 2 seems like the time to think about this (and reflect.SliceHeader etc).

-rsc for @golang/proposal-review

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Mar 23, 2017

This proposal seems a bit redundant with #13656.

How much of the use-case is "create a string or slice aliasing C memory" vs. "manipulate existing strings and slices by tweaking header fields unsafely"?

@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Oct 18, 2019

I'd like to suggest renewing consideration of this proposal for Go 1.14. I think it will be useful for users trying to address issues flagged by -d=checkptr.

I'll also offer a counter-proposal that I think better addresses most end user needs in a more ergonomic manner:

package unsafe

func Slice(ptr *ArbitraryType, len, cap int) []ArbitraryType
func String(ptr *byte, len int) string

[Edit: As discussed below, I'm now in favor of combining Slice's len/cap parameter into a single parameter.]

This is a little less versatile than exposing the Header types, but I think it will minimize typing for most users, while also providing better type safety.

We could also do both this proposal and my original one, if we want to still offer the full flexibility of the Header types. In that case, I would suggest renaming the types to SliceHeader and StringHeader, and reserve the shorter Slice and String identifiers for the constructor functions.

@bradfitz

This comment has been minimized.

Copy link
Contributor

@bradfitz bradfitz commented Oct 18, 2019

I like that counter proposal API.

@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Oct 18, 2019

A few additional thoughts to add to my counter proposal:

  1. We should decide what happens when len < 0 or cap < len. I'm leaning towards panic, but maybe we should just leave it unspecified/undefined.

    Edit: ptr == nil && len > 0 is another case to consider.

    Edit 2: Also, len > MAXWIDTH / unsafe.Sizeof(*ptr).

  2. The functions would be builtins; in particular, users can't write f := unsafe.String; f(...).

  3. The cap argument to unsafe.Slice can be optional; if it's omitted, the len argument is used. (Just like make([]T, n) is shorthand for make([]T, n, n).)

  4. Perhaps the int parameters should actually follow the same goofy semantics that make([]T, n, m) follows. (I.e., make([]T, uint64(10), int8(20)) is valid, even though uint64 and int8 aren't normally assignable to int.)

  5. Since unsafe.String would be a builtin, it could evaluate to an untyped string.

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Oct 18, 2019

That API is closer to what I had suggested in https://golang.org/issue/13656#issuecomment-303216308, and we've been using that variant within Google for a couple of years now without complaints.

If the type desired for the slice does not match the pointer that the user has (for example, if one is a cgo-generated type and the other is a native Go type), I'm assuming that the caller could do something like:

	var s = unsafe.Slice((*someGoType)(unsafe.Pointer(cPtr)), len, cap)

to set the element type?

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Oct 18, 2019

We should decide what happens when len < 0 or cap < len. I'm leaning towards panic, but maybe we should just leave it unspecified/undefined.

I would leave it unspecified, but panic is a fine implementation of “unspecified”.

Perhaps the int parameters should actually follow the same goofy semantics that make([]T, n, m) follows.

That would certainly smooth out the call site in the (overwhelmingly common) case that len and/or cap is a C.size_t.

@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Oct 18, 2019

If the type desired for the slice does not match the pointer that the user has (for example, if one is a cgo-generated type and the other is a native Go type), I'm assuming that the caller could do something like:

	var s = unsafe.Slice((*someGoType)(unsafe.Pointer(cPtr)), len, cap)

to set the element type?

Yeah, that's my thought. If a user wants to convert *T into []U, then I think it's reasonable to require an explicit conversion there.

@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Oct 18, 2019

I would leave it unspecified, but panic is a fine implementation of “unspecified”.

Ack, though my concern is if we panic by default, then users might come to rely on it panicking and not write their own checking.

It would be easy to put the panic behind -d=checkptr though.

@bradfitz

This comment has been minimized.

Copy link
Contributor

@bradfitz bradfitz commented Oct 18, 2019

func Slice(ptr *ArbitraryType, len, cap int) []ArbitraryType

Can we instead do:

func Slice(ptr *ArbitraryType, len int[, cap int]) []ArbitraryType

... with an optional cap. Where omitting cap means cap == len?

@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Oct 18, 2019

@bradfitz Yeah, that's my additional thought #3 above. :)

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Oct 18, 2019

if we panic by default, then users might come to rely on it panicking and not write their own checking.

Hmm, good point. We could make it a throw! 😉

Or we could make it a panic in ordinary code but a throw under -race or -d=checkptr. (The important thing, I think, is to vary it just enough that it causes tests to fail in some reasonably-common configuration.)

@bradfitz

This comment has been minimized.

Copy link
Contributor

@bradfitz bradfitz commented Dec 11, 2019

So we going to call it "placement make"?

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Dec 11, 2019

The ergonomics of the “placement make” form seem worse compared to unsafe.Slice. To draw an example from https://golang.org/cl/202082:

Under @mdempsky's proposal, we have:

	scases := unsafe.Slice(cas0, ncases)

with the element type propagated from element type of the existing cas0 variable.

In contrast, under @rsc's alternative we would have:

	scases := make([]scase, unsafe.Pointer(cas0), ncases)
@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Dec 11, 2019

Riffing on the theme of “feels more like a conversion”, maybe we could allow arrays of non-constant type in conversions, provided that they are immediately sliced:

	scases := (*[ncase]scase)(unsafe.Pointer(cas0))[:]
@randall77

This comment has been minimized.

Copy link
Contributor

@randall77 randall77 commented Dec 11, 2019

If we wait for generics unsafe.Slice is no longer magical:

package unsafe
func Slice<T>(p *T, cap int) []T

We could do unsafe.String now, no generics required.

The reverse operation for Slice is easy, just &s[0] with a special condition for len==0. Maybe we also want the reverse operation for strings?

// Caller must not modify the memory pointed to by the return value
func StringPtr(s string) *byte // or unsafe.Pointer?

or maybe

// Caller must not modify the memory pointed to by the return value
func StringAsSlice(s string) []byte
@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Dec 11, 2019

@bcmills I think that's what I was referring to about unsafe.Pointer vs *T. While that function is a bit less ergonomic under @rsc's proposal, other ones like:

xmhdr := unsafe.Slice((*method)(add(unsafe.Pointer(x), uintptr(x.moff))), nt)
methods := unsafe.Slice((*unsafe.Pointer)(unsafe.Pointer(&m.fun[0])), ni)

would become:

xmhdr := make([]method, add(unsafe.Pointer(x), uintptr(x.moff)), nt)
methods := make([]unsafe.Pointer, unsafe.Pointer(&m.fun[0]), ni)

which are shorter.

I think select.go isn't a great example either. That function is called by code generated by the Go compiler, which has its own separate type signatures for the function anyway, so the type safety there is superficial. We could just change the cas0 parameter's type to unsafe.Pointer.

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Dec 11, 2019

@randall77, note that I proposed essentially the same thing using reflect in #13656 (comment).

@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Dec 11, 2019

Riffing on the theme of “feels more like a conversion”, maybe we could allow arrays of non-constant type in conversions, provided that they are immediately sliced:

I think this would work, but we'd probably need to decide how go/types should handle it. In particular, what go/types.Type would one of these non-constant array type literals have?

@josharian

This comment has been minimized.

Copy link
Contributor

@josharian josharian commented Dec 11, 2019

You could spell this:

scases := (*[ncase]scase)(unsafe.Pointer(cas0))[:]

like this instead:

scases := (*[...]scase)(unsafe.Pointer(cas0))[:ncase]

and then, following the discussion in #13656, resolve ... in this case to be the largest possible for the type.

(Although I can't say I really like either way of spelling it.)

@bradfitz bradfitz added the dotdotdot label Dec 11, 2019
@bradfitz

This comment has been minimized.

Copy link
Contributor

@bradfitz bradfitz commented Dec 11, 2019

We could do unsafe.String now, no generics required.

I don't think we want to do unsafe.String at all. It's too unsafe. (It's too easy to create a string that then survives forever with a broken promise and changing memory underneath).

If people want a string, they can string(byteSlice) and get a safe copy.

@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Dec 12, 2019

I can see the appeal of "placement make" from a Go language perspective (i.e., it's making a []T value, and exposing more user control over the component element values). And I can understand why CL 202082 would make it seems like that's a better solution than unsafe.Slice.

But I think that's because CL 202082 is focused solely on the Go runtime, where we end up having to do a lot of manual pointer arithmetic anyway.

I expect the more common case where users will have a (ptr, len) pair and want to construct a slice will be users working with C APIs. In that case, cgo provides a type-safe *T pointer value for them, and it seems unfortunate to require them to write make([]T, unsafe.Pointer(ptr), len) instead of unsafe.Slice(ptr, len).

So I think I'm still leaning towards unsafe.Slice(ptr, len) as the least error-prone solution for users.

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Dec 12, 2019

@mdempsky

In particular, what go/types.Type would one of these non-constant array type literals have?

Fortunately, the documentation for go/types already gives us a good answer. Per https://golang.org/pkg/go/types/#NewArray:

A negative length indicates an unknown length.

The go/types.Type for the result of the conversion would be *Pointer. Call that type p. Then we have:

  • p.Elem() has concrete type *types.Array.
  • p.Elem().Elem() is the element type indicated in the conversion expression.
  • p.Elem().Len() is negative (presumably -1), since the value of the length expression is only known an run-time.

If such an expression is bound to a variable, I would decay the type of that variable to *[0]T, analogous to how an untyped integer constant decays to int.

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Dec 12, 2019

@josharian
The problem with spelling it

	scases := (*[...]scase)(unsafe.Pointer(cas0))[:ncase]

is that the slice expression normally only sets the length. It seems inconsistent to have that also set the capacity in this case.

In contrast, I want the resulting slice to always have both a well-defined length and capacity.

We could instead require the three-argument slice form for arrays of indeterminate length:

	scases := (*[...]scase)(unsafe.Pointer(cas0))[:ncase:ncase]

but that seems like unnecessary duplication for the vast majority of real use-cases.

@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Dec 12, 2019

@bcmills

Fortunately, the documentation for go/types already gives us a good answer.

Good point. That does seem like a good solution.

If such an expression is bound to a variable, I would decay the type of that variable to *[0]T, analogous to how an untyped integer constant decays to int.

I'd suggest leaving *[nonConst]T an error outside of a slice expression, in which case it's fine to leave the variable type as *[-1]T instead of decaying it to something else.

(That said, I think (*[n]T)(p)[:] has the same issue as make([]T, p, n) that it requires T to be explicitly typed by the user, and thus loses some type safety for common uses.)

@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Dec 12, 2019

@randall77

If we wait for generics unsafe.Slice is no longer magical:

Package unsafe is inherently magical, so I think waiting for generics to make it non-magical only makes sense if we're wanting to put unsafe.Slice in a different package. But since it's an operation that the compiler can't guarantee is safe, package unsafe seems like the right place to me to make it stand out during code review.

@josharian

This comment has been minimized.

Copy link
Contributor

@josharian josharian commented Dec 13, 2019

Just to throw another idea out there: If the main use for this is cgo, we could push some magic into cgo. C.SliceOfT(ptr *T, cap int) could generate code that does the conversion for type T.

@bradfitz

This comment has been minimized.

Copy link
Contributor

@bradfitz bradfitz commented Dec 13, 2019

It's not just cgo, though. syscall packages are full of this sort of stuff.

@bradfitz

This comment has been minimized.

Copy link
Contributor

@bradfitz bradfitz commented Jan 7, 2020

Regarding the objection of adding magic functions to the unsafe package, the unsafe package is already pretty magic:

https://golang.org/pkg/unsafe/#ArbitraryType

(And all its uses)

Adding a polymorphic unsafe function (as opposed to extending the polymorphic built-in make) wouldn't make the unsafe package much more weird. There's already precedent for how to godoc it.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Feb 21, 2020

A case that this would help in the standard library: #37350.

@jimmyfrasche

This comment has been minimized.

Copy link
Member

@jimmyfrasche jimmyfrasche commented Mar 1, 2020

Of all the proposals, in this thread and the many in the past, unsafe.Slice(p, n) looks like it would be the easiest to write and it would be the easiest to read by human and machine alike.

There are very nice properties to have when doing something unsafe and this is a fundamentally unsafe thing to do.

I've personally only needed something like this when messing with cgo, but I don't recall any code I've written with cgo that wouldn't be simpler and more correct with this. Every case I can think of could be rewritten as just unsafe.Slice(v, int(n)).

@smasher164

This comment has been minimized.

Copy link
Member

@smasher164 smasher164 commented Mar 11, 2020

A place where the slice header type would have been useful for me: https://github.com/smasher164/mem/blob/11c40568d91b031ff1d4049628ff83b46bbdc4f2/map_unix.go#L22.

Retrieving the underlying address for a segment gotten from unix.Mmap requires a conversion to a slice header. While the above munmap implementation could make use of unsafe.Slice(p, n), the mmap implementation cannot. In my view, the flexibility that a guaranteed struct layout provides is far outweighed by the ergonomics of using a polymorphic function.

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Mar 11, 2020

@smasher164, assuming that the size is nonzero, you don't need a header type to obtain the address of an mmap'd slice. &(b[0]) should suffice.

@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Mar 11, 2020

A place where the slice header type would have been useful for me: https://github.com/smasher164/mem/blob/11c40568d91b031ff1d4049628ff83b46bbdc4f2/map_unix.go#L22.

Note that the safe coding pattern here is:

sl := (*reflect.SliceHeader)(unsafe.Pointer(&b))
return unsafe.Pointer(sl.Data), nil

It's not guaranteed that you can convert a pointer to a slice to a pointer to any other struct type except reflect.SliceHeader.

@smasher164

This comment has been minimized.

Copy link
Member

@smasher164 smasher164 commented Mar 11, 2020

@bcmills Thank you for pointing that out.

It's not guaranteed that you can convert a pointer to a slice to a pointer to any other struct type except reflect.SliceHeader.

@mdempsky I see. I assumed that the conversion would be correct based on the usage in x/sys/unix/syscall_unix.go#L117.

@mdempsky

This comment has been minimized.

Copy link
Member Author

@mdempsky mdempsky commented Mar 11, 2020

I assumed that the conversion would be correct based on the usage in x/sys/unix/syscall_unix.go#L117.

Thanks. We should fix that. That file's not following best practice regarding unsafe.Pointer. (This is a separate issue though.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
You can’t perform that action at this time.