Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spec: clarify that conversions to slices don't guarantee slice capacity? #24163

Open
mattbostock opened this issue Feb 27, 2018 · 15 comments

Comments

@mattbostock
Copy link
Contributor

commented Feb 27, 2018

What did you do?

Converted a string to a slice of runes then back to a string, and referenced non-existent elements on the slice of runes:

https://play.golang.org/p/MmJNLu6h6FT

What did you expect to see?

A runtime panic with a 'slice bounds out of range' error.

What did you see instead?

"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"

System details

go version go1.10 darwin/amd64
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/matt/Library/Caches/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/matt/go"
GORACE=""
GOROOT="/opt/boxen/homebrew/Cellar/go/1.10/libexec"
GOTMPDIR=""
GOTOOLDIR="/opt/boxen/homebrew/Cellar/go/1.10/libexec/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/k4/_18vs9g119b0q2_p950t0jl00000gn/T/go-build038036646=/tmp/go-build -gno-record-gcc-switches -fno-common"
GOROOT/bin/go version: go version go1.10 darwin/amd64
GOROOT/bin/go tool compile -V: compile version go1.10
uname -v: Darwin Kernel Version 17.4.0: Sun Dec 17 09:19:54 PST 2017; root:xnu-4570.41.2~1/RELEASE_X86_64
ProductName:	Mac OS X
ProductVersion:	10.13.3
BuildVersion:	17D47
lldb --version: lldb-900.0.64
  Swift-4.0
@bradfitz

This comment has been minimized.

Copy link
Member

commented Feb 27, 2018

Interesting, thanks!

Here we can see that the cap is 32:
https://play.golang.org/p/PDAI5olA4YG

But uncomment the following line and the cap changes to 0:
https://play.golang.org/p/BA2ivc0sTY6

Definitely something messed up in the compiler.

/cc @randall77 @mdempsky

@bradfitz bradfitz changed the title Inconsistent panics on out-of-bounds slice index cmd/compile: inconsistent panics on out-of-bounds slice index Feb 27, 2018

@bradfitz bradfitz added the NeedsFix label Feb 27, 2018

@bradfitz bradfitz added this to the Go1.11 milestone Feb 27, 2018

@bradfitz

This comment has been minimized.

Copy link
Member

commented Feb 27, 2018

Actually, does the language guarantee the size of the cap on string->[]rune conversions?

https://golang.org/ref/spec#Conversions_to_and_from_a_string_type says only:

Converting a value of a string type to a slice of runes type yields a slice containing the individual Unicode code points of the string

But doesn't say anything about the slice's capacity.

/cc @griesemer

@mattbostock

This comment has been minimized.

Copy link
Contributor Author

commented Feb 27, 2018

Here's another example using len() that might help clarify how confusing this could be:
https://play.golang.org/p/YLwjpbY7_nX

@mdempsky

This comment has been minimized.

Copy link
Member

commented Feb 27, 2018

@mattbostock Thanks for the examples. I think this example summarizes/demonstrates the root issue you're noticing: https://play.golang.org/p/Er5s4t9rO7R

(Also, note that the upper bound for slice operations is cap(s), not len(s); i.e., it's possible to grow a slice, as long as there's capacity.)

Like @bradfitz points out, I think the real question is whether string->[]byte or string->[]rune conversions guarantee a particular slice capacity. It seems the spec doesn't guarantee that.

@bradfitz

This comment has been minimized.

Copy link
Member

commented Feb 27, 2018

So seems everything is working fine, but the language spec could possibly be more specific here.

Moving to NeedsDecision to see whether the language gets changed.

/cc @ianlancetaylor too.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Feb 28, 2018

I think that for these conversions the result should be a slice with cap == len. I see no benefit to leaving it unspecified.

@mdempsky

This comment has been minimized.

Copy link
Member

commented Feb 28, 2018

I see no benefit to leaving it unspecified.

I think there's a small performance benefit. Suppose something like:

var s string = f()
return append([]byte(s), "suffix"...)

For the conversion, the runtime will need to allocate memory. Also, it will pad the allocation up to the next allocation bucket size. If we expose this capacity to the user via the resulting slice's capacity, then the append might be able to fit into the existing space without requiring a second allocation.

@griesemer

This comment has been minimized.

Copy link
Contributor

commented Feb 28, 2018

As bad as it sounds, there may be existing code that relies on this feature...

Perhaps the spec just needs to say that it doesn't say anything about the capacity in these cases.

@dominikh

This comment has been minimized.

Copy link
Member

commented Feb 28, 2018

Related: #14232 and #14235

@bradfitz

This comment has been minimized.

Copy link
Member

commented Feb 28, 2018

As bad as it sounds, there may be existing code that relies on this feature...

Considering that escape analysis changes somewhat regularly, I kinda doubt anybody's relying on this. We've never heard anything about it.

@randall77

This comment has been minimized.

Copy link
Contributor

commented Mar 1, 2018

For conversions from string to both []byte and []rune, we allocate extra capacity. This happens for both stack-allocated (round up to the 32-byte stack buffer) and heap-allocated (round up to the next size class) results:

func main() {
	foo := "foo"
	bar := []rune(foo)
	fmt.Printf("%d %d\n", len(bar), cap(bar))

	baz := []rune(foo)
	sink = baz
	fmt.Printf("%d %d\n", len(baz), cap(baz))
}

var sink []rune

Prints:

3 32
3 4

So escape analysis doesn't really figure into this discussion, it just affects the constants.

IMO when Go does an allocation and the user didn't specify a capacity, the runtime should be free to allocate capacity somewhat beyond the requested size. Calling make gives you exactly the capacity requested*. Calling append or doing a cast need not.

*Although you could argue that only the 3-arg version of make requests an explicit capacity, the 2-arg version could allow cap>len.

@mdempsky

This comment has been minimized.

Copy link
Member

commented Mar 1, 2018

So escape analysis doesn't really figure into this discussion, it just affects the constants.

I think @bradfitz's point was that because escape analysis changes, the constants particular user code sees sometimes changes too. Thus it's arguably safe if we want to intentionally change the constants.

*Although you could argue that only the 3-arg version of make requests an explicit capacity, the 2-arg version could allow cap>len.

I'd argue even for the 3-arg version, that the explicit cap argument should just be a lower bound on what the runtime returns. But unfortunately these seems out of scope for Go 1.

@randall77

This comment has been minimized.

Copy link
Contributor

commented Mar 1, 2018

@mdempsky Wow, you're even more of a rebel than I am.

@josharian

This comment has been minimized.

Copy link
Contributor

commented Mar 1, 2018

Although you could argue that only the 3-arg version of make requests an explicit capacity, the 2-arg version could allow cap>len.

To be clear, the spec does currently appear to forbid this.

I'd argue even for the 3-arg version, that the explicit cap argument should just be a lower bound on what the runtime returns. But unfortunately these seems out of scope for Go 1.

There's an analogy with making maps, where the len arg is just a hint. On the other hand, we definitely wouldn't want the len arg to buffered channels to be just a hint, since they are used to strictly control concurrency.

Another option is to add a "round up" function to package runtime that accepts a size (obtained presumably by unsafe.Sizeof) and a quantity and returns a recommended quantity. Then in cases in which it really matters, people would have the tools to do it--at the cost of using both package unsafe and package runtime. Combine with generics and you could even build append as a library function. :)

(Note that if one wanted to go really nuts and build this yourself without package runtime support, you could, by writing an init function that makes byte slices of various sizes and calls append on them and records the cap you get back, and inferring malloc slab sizes from them.)

@mdempsky

This comment has been minimized.

Copy link
Member

commented Mar 1, 2018

I think the discussions about make are interesting, but increasingly off topic. I've opened #24204 to discuss them further. :)

@bradfitz bradfitz changed the title cmd/compile: inconsistent panics on out-of-bounds slice index spec: clarify that conversions to slices don't guarantee slice capacity? Apr 21, 2018

@ianlancetaylor ianlancetaylor modified the milestones: Go1.11, Unplanned Jun 29, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
9 participants
You can’t perform that action at this time.