unsafe: add StringData, String, SliceData #53003

Closed
twmb opened this issue May 19, 2022 · 40 comments
Comments

@twmb
Contributor

twmb commented May 19, 2022

Now that reflect.StringHeader and reflect.SliceHeader are officially deprecated, I think it's time to revisit adding functions that satisfy the reasons people have used these types. AFAICT, the reason for deprecation is that reflect.SliceHeader and reflect.StringHeader are commonly misused. The types have also always been documented as unstable and not to be relied upon.

We can see in GitHub code search that usage of these types is ubiquitous. The most common use cases I've seen are:

  • converting []byte to string
  • converting string to []byte
  • grabbing the Data pointer field for ffi or some other niche use
  • converting a slice of one type to a slice of another type

The first use case is also commonly written as *(*string)(unsafe.Pointer(&mySlice)), which is not officially documented anywhere as something that can be relied upon. Under the hood, a string header is smaller than a slice header, so this seems valid per unsafe rule (1), but it relies entirely on undocumented behavior. The second use case is commonly written as *(*[]byte)(unsafe.Pointer(&string)), which is broken by default because the Cap field can be past the end of a page boundary (example here, in widely used code); this violates unsafe rule (1).

Regardless of the thought that people should never rely upon these types, people do, and they do so all over. People also rely on invalid conversions because Go has never made this easy. Part of the discussion on #19367 was about all the ways that people misuse these types today. These conversions are small tricks that can alleviate memory pressure and improve latencies and CPU usage in real programs. The use cases are real, and Go provides just enough unsafe and buggy ways of working around these problems such that now there is a large ecosystem of technically invalid code that just so happens to work.

Rather than consistently saying "don't use this", having go vet flag some of the misuse, and then ducking all responsibility for buggy programs, Go should provide actual safe(ish) APIs that people can rely on in perpetuity. New functions can live in unsafe and have well-documented rules around their use cases, and then Go can finally document what to do when people want this common escape hatch.

Concrete proposal

The following APIs in the unsafe package:

// StringToBytes returns s as a byte slice by performing a non-copying type conversion.
// Slices returned from this function cannot be modified.
func StringToBytes(s string) []byte

// BytesToString returns b as a string by performing a non-copying type conversion.
// The input bytes to this function cannot be modified while any string returned from
// this function is alive.
func BytesToString(b []byte) string

(Struck from the proposal: func DataPointer[T ~string|~[]E, E any](t T) unsafe.Pointer. Eliminated because realistically a person can just do &slice[0], although a corresponding analogue does not exist for strings.)

I think unsafe.Slice covers the use case for converting between slices of different types, although I'm not 100% sure what the use case of unsafe.Slice is.

@twmb twmb added the Proposal label May 19, 2022
@gopherbot gopherbot added this to the Proposal milestone May 19, 2022
@ianlancetaylor
Contributor

ianlancetaylor commented May 19, 2022

Thanks, but I don't see an actual proposal here. If you want to discuss ideas, please use golang-nuts. The proposal process should be for a proposal. That is, suggest some specific functions that we should introduce. Thanks.

@ianlancetaylor ianlancetaylor added this to Incoming in Proposals (old) May 19, 2022
@twmb
Contributor Author

twmb commented May 19, 2022

Sure, good point. I've edited my comment with an actual proposal of two new functions. I think unsafe.Slice covers the fourth common use case I mentioned above, but I'm not 100% sure.

@ianlancetaylor
Contributor

ianlancetaylor commented May 20, 2022

One of the main use cases of unsafe.Slice is to create a slice whose backing array is a memory buffer returned from C code or from a call such as syscall.MMap. I agree that it can be used to (unsafely) convert from a slice of one type to a slice of a different type.
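
As a rough illustration of the second point, the sketch below reinterprets a []uint32 as a []byte view over the same memory using unsafe.Slice (available since Go 1.17). The helper name asBytes is hypothetical and is not part of any proposal in this thread. Going from a larger element type to byte sidesteps alignment concerns, which a conversion in the other direction would have to handle.

```go
package main

import (
	"fmt"
	"unsafe"
)

// asBytes is a hypothetical helper: it returns a []byte that aliases the
// memory of words, without copying.
func asBytes(words []uint32) []byte {
	if len(words) == 0 {
		return nil // &words[0] would panic on an empty slice
	}
	n := len(words) * int(unsafe.Sizeof(words[0]))
	return unsafe.Slice((*byte)(unsafe.Pointer(&words[0])), n)
}

func main() {
	words := []uint32{1, 2}
	b := asBytes(words)

	// Overwrite every byte of the first word; the write is visible through
	// the original slice because both views share one backing array.
	for i := 0; i < 4; i++ {
		b[i] = 0xff
	}
	fmt.Println(len(b), words[0] == 0xffffffff) // prints "8 true"
}
```

The byte view stays valid only as long as the original slice's backing array is alive and is not reallocated by an append.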

@ianlancetaylor
Contributor

ianlancetaylor commented May 20, 2022

CC @mdempsky

@rsc
Contributor

rsc commented May 25, 2022

Filed #53079. Perhaps we should undeprecate them for Go 1.19, since we don't have a replacement for all valid use cases yet. Thanks.

@rsc
Contributor

rsc commented May 25, 2022

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc moved this from Incoming to Active in Proposals (old) May 25, 2022
@bcmills
Member

bcmills commented May 25, 2022

FWIW, the proposed functions closely parallel the OfString and AsString functions in my unsafeslice package.

That package also provides best-effort mutation detection, since Go string variables are supposed to be immutable and Go programs may generally assume that variables passed as type string are never mutated until they are garbage-collected.

@twmb
Contributor Author

twmb commented May 25, 2022

I've seen that package, and the one thing I'd not do is the mutation detection (which you gate behind the unsafe build tag). I'd expect mutation detection to be automatically done when testing / when built with -race, but not to automatically be applied in releases. I'm using the unsafe package, after all :).

I was thinking to link the package in my original writeup, since it is further evidence of the use case.

@bcmills
Member

bcmills commented May 25, 2022

At the very least, I think it's important for the documentation for the proposed functions to explicitly call out the lifetime issues, especially for BytesToString — a string used as a map key can cause arbitrary memory corruption if it is mutated while the map is still live.

@ianlancetaylor
Contributor

ianlancetaylor commented May 28, 2022

To fully replace reflect.StringHeader and reflect.SliceHeader, let's consider what they permit us to do. They can be used to read and/or to modify the contents of a string or a slice.

For reading, we can already extract all the elements of a slice, via &s[0], len(s), and cap(s). We can get the length of a string via len(s), but we can't get the pointer to the data.

For writing, we can create a new slice setting all the elements, via unsafe.Slice followed by a slice expression. But we can't create a new string.

So to me that suggests, in package unsafe,

// StringData returns a pointer to the bytes of a string.
// The bytes must not be modified; doing so can cause
// the program to crash or behave unpredictably.
func StringData(string) *byte

// String constructs a string value from a pointer and a length.
// The bytes passed to String must not be modified;
// doing so can cause the program to crash or behave unpredictably.
func String(*byte, int) string

These functions should remain meaningful even if we somehow change the representation of slices or strings in the future.
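
To make the symmetry concrete, here is a minimal sketch that takes a slice and a string apart and puts them back together. It assumes Go 1.20 or later, where unsafe.String and unsafe.StringData are available; rebuildSlice and rebuildString are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// rebuildSlice splits a slice into pointer, length, and capacity, then
// reassembles it with unsafe.Slice followed by a slice expression.
func rebuildSlice(s []int) []int {
	if cap(s) == 0 {
		return s // no backing array to point at
	}
	p := &s[:1][0] // valid whenever cap(s) > 0, even if len(s) == 0
	return unsafe.Slice(p, cap(s))[:len(s)]
}

// rebuildString does the same for a string using the two functions
// suggested above. The bytes behind the pointer must not be modified.
func rebuildString(s string) string {
	return unsafe.String(unsafe.StringData(s), len(s))
}

func main() {
	s := make([]int, 2, 5)
	r := rebuildSlice(s)
	fmt.Println(len(r), cap(r), &r[0] == &s[0])

	fmt.Println(rebuildString("hello"))
}
```

Neither helper copies data: the rebuilt values share memory with the originals.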

The restrictions on changing the bytes are unfortunate, but the fact is that people are doing these kinds of transformations today. Omitting these functions from the unsafe package doesn't mean that Go programs won't do them; it just means that they will do them in ways that are sometimes even less safe.

It should be feasible to add a dynamic detector for any modifications to these bytes. This could perhaps be enabled when using the race detector. This would not be perfect but would detect egregious misuses.

The functions suggested above would then be written as

func StringToBytes(s string) []byte {
    return unsafe.Slice(unsafe.StringData(s), len(s))
}

func BytesToString(b []byte) string {
    return unsafe.String(&b[0], len(b))
}

(To be clear, the reverse is also possible: we can write String and StringData in terms of StringToBytes and BytesToString.)

@rsc
Contributor

rsc commented Jun 1, 2022

It does seem like unsafe.String and unsafe.StringData match unsafe.Slice a bit better and are more fundamental operations than providing StringToBytes and BytesToString as the primitives.
I wonder if we should add unsafe.SliceData as well (code that uses &s[0] often has to work around len 0).
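
The len 0 workaround can be sketched as follows, assuming Go 1.20 or later, where unsafe.SliceData is available. &b[0] panics on an empty slice, so code has to branch on the length first, while unsafe.SliceData accepts nil and empty slices directly. Both helper names are hypothetical.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToStringGuarded uses &b[0] and therefore needs an explicit
// empty-slice check to avoid an index-out-of-range panic.
func bytesToStringGuarded(b []byte) string {
	if len(b) == 0 {
		return ""
	}
	return unsafe.String(&b[0], len(b))
}

// bytesToString uses unsafe.SliceData, which is defined for nil and empty
// slices, so no guard is needed. The caller must not modify b while the
// returned string is in use.
func bytesToString(b []byte) string {
	return unsafe.String(unsafe.SliceData(b), len(b))
}

func main() {
	fmt.Printf("%q %q\n", bytesToStringGuarded(nil), bytesToString(nil))
	fmt.Printf("%q %q\n", bytesToString([]byte{}), bytesToString([]byte("go")))
}
```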

@mdempsky
Member

mdempsky commented Jun 2, 2022

I'd like to suggest we reconsider the original proposal from #19367: to add new unsafe.StringHeader and unsafe.SliceHeader types, to handle the remaining advanced use cases that are supported by reflect.{String,Slice}Header but aren't covered by unsafe.Slice.

Concretely, I propose adding:

package unsafe

type StringHeader struct {
    Data *byte
    Len int
}

type SliceHeader[Elem any] struct {
    Data *Elem
    Len, Cap int
}

and allowing conversions between string and unsafe.StringHeader, and also between []Elem and unsafe.SliceHeader[Elem].

Converting an invalid unsafe.{String,Slice}Header (e.g., Len > Cap, or Data==nil and Len>0) into a normal string or slice type should fail, at least in -d=checkptr mode. I'm leaning towards making it a run-time panic (like unsafe.Slice) because the failure conditions are easy to specify/detect, but simply leaving it undefined (like unsafe.Add) seems not unreasonable too.
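
For reference, the run-time panic behavior of unsafe.Slice mentioned here can be observed with a small sketch. Per the Go spec, unsafe.Slice panics at run time when the length is negative, or when the pointer is nil and the length is nonzero. tryNilSlice is a hypothetical helper written only for this demonstration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// tryNilSlice calls unsafe.Slice with a nil pointer and length n, and
// reports whether the call panicked.
func tryNilSlice(n int) (s []byte, panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	var p *byte // nil pointer
	s = unsafe.Slice(p, n)
	return s, false
}

func main() {
	for _, n := range []int{0, 1, -1} {
		s, panicked := tryNilSlice(n)
		fmt.Println(n, s == nil, panicked)
	}
}
```

A nil pointer with length zero is the one combination that succeeds, and it yields a nil slice.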

N.B., my original #19367 proposal also allowed conversions between *string and *unsafe.StringHeader, etc. I think we could still allow that (e.g., particularly to help users transition away from reflect.{String,Slice}Header), but I think it's less error-prone (and marginally better for escape analysis) if we encourage users to construct an unsafe.StringHeader, value convert it to string, and then store that into memory; rather than creating a *unsafe.StringHeader and individually assigning fields in memory.

@ianlancetaylor
Contributor

ianlancetaylor commented Jun 3, 2022

My concern about StringHeader and SliceHeader is that it locks all possible implementations into either using those exact headers or doing strange contortions in the compiler.

What can we do with StringHeader and SliceHeader that we can't do with Slice, SliceData, String, and StringData?

@mdempsky
Member

mdempsky commented Jun 3, 2022

After thinking about it more, I'm inclined to agree that {Slice,String}{,Data} builtin functions are the way to go. As you say, they support the same functionality. I think that will be easier on tools authors than extending conversion semantics too.

@rsc
Contributor

rsc commented Jun 8, 2022

It sounds like we've converged on considering unsafe.StringData, unsafe.String, and unsafe.SliceData.
Does anyone object to those?

@rsc rsc changed the title proposal: unsafe: add functions to replace reflect.StringHeader and reflect.SliceHeader proposal: unsafe: add StringData, String, SliceData Jun 8, 2022
@rsc
Contributor

rsc commented Jun 15, 2022

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

@rsc rsc moved this from Active to Likely Accept in Proposals (old) Jun 15, 2022
@rsc rsc moved this from Likely Accept to Accepted in Proposals (old) Jun 22, 2022
@rsc
Contributor

rsc commented Jun 22, 2022

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@rsc rsc changed the title proposal: unsafe: add StringData, String, SliceData unsafe: add StringData, String, SliceData Jun 22, 2022
cuiweixie added a commit to cuiweixie/go that referenced this issue Aug 18, 2022
For golang#53003

Change-Id: I13a761daca8b433b271a1feb711c103d9820772d
cuiweixie added a commit to cuiweixie/go that referenced this issue Aug 18, 2022
For  golang#53003

Change-Id: I1f1b3ce3ede48d1bfc9981bceea4317c6b66b62d
cuiweixie added a commit to cuiweixie/go that referenced this issue Aug 18, 2022
…StringData,SliceData}

For  golang#53003
Change-Id: Id3125268523fed855ffac20cde6128010e3513f0
gopherbot pushed a commit that referenced this issue Aug 24, 2022
For #53003
Change-Id: Id3125268523fed855ffac20cde6128010e3513f0
Reviewed-on: https://go-review.googlesource.com/c/go/+/423754
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
@gopherbot

gopherbot commented Aug 25, 2022

Change https://go.dev/cl/425454 mentions this issue: go/types, types2: add more tests for unsafe.Slice/SliceData/String/StringData

gopherbot pushed a commit that referenced this issue Aug 25, 2022
…ringData

Also:
- fine-tune the implementation for some of the new builtin functions
- make sure the go/types code is an exact as possible copy of the
  types2 code
- fix the description and examples for errorcodes.go

Follow-up on CL 423754.

For #53003.

Change-Id: I5c70b74e90c724cf6c842cedc6f8ace26fde372b
Reviewed-on: https://go-review.googlesource.com/c/go/+/425454
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
gopherbot pushed a commit that referenced this issue Aug 31, 2022
For #53003

Change-Id: I13a761daca8b433b271a1feb711c103d9820772d
Reviewed-on: https://go-review.googlesource.com/c/go/+/423774
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: hopehook <hopehook@golangcn.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
@gopherbot

gopherbot commented Aug 31, 2022

Change https://go.dev/cl/427095 mentions this issue: unsafe: add docs for SliceData, String, and StringData

rajbarik pushed a commit to rajbarik/go that referenced this issue Sep 1, 2022
For golang#53003
Change-Id: Id3125268523fed855ffac20cde6128010e3513f0
Reviewed-on: https://go-review.googlesource.com/c/go/+/423754
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
rajbarik pushed a commit to rajbarik/go that referenced this issue Sep 1, 2022
…ringData

Also:
- fine-tune the implementation for some of the new builtin functions
- make sure the go/types code is an exact as possible copy of the
  types2 code
- fix the description and examples for errorcodes.go

Follow-up on CL 423754.

For golang#53003.

Change-Id: I5c70b74e90c724cf6c842cedc6f8ace26fde372b
Reviewed-on: https://go-review.googlesource.com/c/go/+/425454
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
@gopherbot

gopherbot commented Sep 7, 2022

Change https://go.dev/cl/428757 mentions this issue: reflect: deprecate (Slice|String)Header

@gopherbot

gopherbot commented Sep 9, 2022

Change https://go.dev/cl/429755 mentions this issue: all: transfer reflect.{SliceHeader, StringHeader} to unsafeheader.{Slice, String}

gopherbot pushed a commit that referenced this issue Sep 9, 2022
As discussed in CL 401434 there are substantial misuses of these in the
wild, and they are a potential source of unsafety even for code that
does not use them directly.

Since proposal #53003 has already been implemented, now is the right
time to deprecate reflect.{SliceHeader, StringHeader}.

For #53003.

Change-Id: I724cf46d4b22d2ed3cbf2b948e6aac5ee4bf0f6e
Reviewed-on: https://go-review.googlesource.com/c/go/+/428757
Run-TryBot: hopehook <hopehook@golangcn.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
gopherbot pushed a commit that referenced this issue Sep 9, 2022
…ice, String}

After we deprecated reflect.{SliceHeader, StringHeader}, it is recommended
to use unsafe.{Slice, String} to replace its work. However, the compiler
and linker cannot be migrated for the time being.

As a temporary strategy, using the "internal/unsafeheader" package like
other code is the most suitable choice at present.

For #53003.

Change-Id: I69d0ef72e2d95caabd0706bbb247a719d225c758
Reviewed-on: https://go-review.googlesource.com/c/go/+/429755
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: hopehook <hopehook@golangcn.org>
@cespare
Contributor

cespare commented Nov 6, 2022

@mdempsky is this going to go in for 1.20? I've tried to follow the large number of associated PRs but I can't really tell what the current state is.

@ianlancetaylor
Contributor

ianlancetaylor commented Nov 6, 2022

@cespare As far as I know, they are in. What's missing is the docs. I just pinged that CL.

If there is anything else missing, please let me know. Thanks.

@dmitshur dmitshur modified the milestones: Backlog, Go1.20 Nov 7, 2022
gopherbot pushed a commit that referenced this issue Nov 9, 2022
Updates #53003.

Change-Id: I076d1eb4bd0580002ad8008f3ca213c5edc951ee
Reviewed-on: https://go-review.googlesource.com/c/go/+/427095
Run-TryBot: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
@gopherbot

gopherbot commented Nov 10, 2022

Change https://go.dev/cl/449537 mentions this issue: spec: document the new unsafe functions SliceData, String, and StringData

gopherbot pushed a commit that referenced this issue Nov 14, 2022
…Data

For #53003.

Change-Id: If5d76c7b8dfcbcab919cad9c333c0225fc155859
Reviewed-on: https://go-review.googlesource.com/c/go/+/449537
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
@griesemer
Contributor

griesemer commented Nov 15, 2022

Both the spec and the unsafe package documentation are now done. Closing.

Projects
Status: Accepted