Skip to content

cmd/compile: Don't copy when converting non-escaping []byte to string on return #27713

@Merovius

Description

@Merovius

What version of Go are you using (go version)?

go version go1.11 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOCACHE="/home/mero/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/mero"
GOPROXY=""
GORACE=""
GOROOT="/home/mero/go"
GOTMPDIR=""
GOTOOLDIR="/home/mero/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build970468447=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Running this benchmark, with -benchmem:

package foo

import (
	"testing"
	"unsafe"
)

func Safe(n int) string {
	b := make([]byte, n)
	for i := 0; i < n; i++ {
		b[i] = byte(i)
	}
	return string(b)
}

func Unsafe(n int) string {
	b := make([]byte, n)
	for i := 0; i < n; i++ {
		b[i] = byte(i)
	}
	return *(*string)(unsafe.Pointer(&b))
}

func BenchmarkSafe(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = Safe(1337)
	}
}

func BenchmarkUnsafe(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = Unsafe(1337)
	}
}

I get

goos: linux                 
goarch: amd64
BenchmarkSafe-4     	 1000000	      1535 ns/op	    2816 B/op	       2 allocs/op
BenchmarkUnsafe-4   	 1000000	      1200 ns/op	    1408 B/op	       1 allocs/op
PASS
ok  	_/home/mero/tmp/opt	2.771s

What did you expect to see?

The compiler recognizing that b in Safe is not escaping and re-using the allocated heap-memory for the returned string. i.e. Safe and Unsafe behaving equivalently.

What did you see instead?

A new chunk was heap-allocated and the data copied over.


I have a package that implements interval tree clocks. For that, I am encoding a binary tree into a []byte. The encoding is done recursively, I allocate a suitably sized []byte at the top and pass it down, filling it in the process. I can't use a strings.Builder, because on the way returning "up the stack" I need to see what the lowers calls have written and may need to modify it - the algorithms are far easier to express when they an use the whole []byte as a scratch space. The API offered to the user, though, I want to be string-based, so that the values are immutable.

This, however, allocates more often than needed. Above code exemplifies what is happening: I create the byte-slice and pass it down, then convert it to string on return. The compiler treats the make([]byte, n) as escaping but then, when returning the string, allocates a new chunk of memory and copies the filled slice over. It could, however, detect that b is not escaping and decide to re-use the allocated space to point the returned string at it (akin to what strings.Builder or Unsafe above does).

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions