Skip to content

proposal: spec: byte view: type that can represent a []byte or string #5376

@bradfitz

Description

@bradfitz
Go has no built-in type which can represent a read-only view of possibly-changing bytes.

We have two current types which internally are a byte pointer and a length:  (the first
also has a capacity, irrelevant to this issue) 


[]byte

   Allowance: "you can mess with this memory"
   Restriction: "this memory isn't necessarily forever; anybody else might be changing it now (race) or later (iterator becoming invalidated after this call, etc)"

string

   Allowance: "if you have a reference to this, it's good forever, and nobody will ever mess with it".
   Restriction: "you can NOT change"


And because of their conflicting allowances and restrictions, any conversion from
string->[]byte or []byte->string necessarily has to make a copy and often
generates garbage. 

Both are great, but there's a missing piece:

Many places in Go's API want a type with *neither* of those allowances, and are happy
with both of those restrictions:

-- much of pkg strings
-- much of pkg bytes
-- all the strconv.Parse* functions (issue #2632)
-- most of the string arguments to system calls
-- the io.Writer interface's argument (no need then for io.WriteString)
-- leveldb's Key/Value accessors.

Look at leveldb's Iterator:

http://godoc.org/code.google.com/p/leveldb-go/leveldb/db#Iterator

It has to go out of its way to say "please do not corrupt my memory".  If
somebody uses memdb directly
(http://godoc.org/code.google.com/p/leveldb-go/leveldb/memdb) and misuses the Iterator
type, all the sudden the memory corruption and stack traces implicate the memdb package,
even though it's a caller of memdb who violated the db.Iterator contract.

All leveldb really wants is to give callers a view of the bytes.

That is, in addition to "string" with its promise A (handle for life) and
"[]byte" with its promise B (you can mutate), we need a a common type to both
of those with neither promise, what is constant-time assignable from a string or a
[]byte.

s := "string"
b := []byte(s)
var v byteview = s // constant time assignment
var v byteview = b // constant time assignment

A byteview would be represented in memory just like a string (*byte, int), but the
compiler would prevent mutations or addressing (like string), and callers would always
treat its backing data as ephemeral, like a []byte, unless negotiated otherwise.

Obviously this isn't a Go 1.n item, considering the impact it would have on the standard
library.

A good name for byteview would be "bytes", but we have a package
"bytes" already. Maybe we can get rid of that package.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions