Getting Started
This repository includes two things: a tool that generates code, and a Go library that the generated code uses to serialize and deserialize MessagePack data. You need both.
The primary difference between msgp and most other serialization libraries for Go is that msgp doesn't perform runtime reflection. Instead, you first use the code generator to generate the appropriate methods for the types you want to serialize or deserialize, and at run time the provided library knows exactly how to handle those types.
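To make the distinction concrete, here is a minimal sketch that contrasts discovering a struct's fields through reflect at run time with a method whose field list is fixed at compile time, which is roughly the kind of code a generator writes for you. The point type and both functions are hypothetical illustrations and are not part of msgp.

```go
package main

import (
	"fmt"
	"reflect"
)

// point is a hypothetical example type.
type point struct {
	X int
	Y int
}

// keysByReflection inspects the type at run time on every call.
func keysByReflection(v interface{}) []string {
	t := reflect.TypeOf(v)
	keys := make([]string, 0, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		keys = append(keys, t.Field(i).Name)
	}
	return keys
}

// keysGenerated stands in for generated code: the field names are
// written out ahead of time, so no type inspection happens at run time.
func (point) keysGenerated() []string {
	return []string{"X", "Y"}
}

func main() {
	p := point{X: 1, Y: 2}
	fmt.Println(keysByReflection(p)) // [X Y]
	fmt.Println(p.keysGenerated())   // [X Y]
}
```

Both calls return the same result; the difference is that the second one does no work to find out what the type looks like.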
Download and install the code generator and library:
$ go get -u -t github.com/dchenk/msgp
First, create a new directory in your GOPATH with a main.go file inside.
$ mkdir -p $GOPATH/src/msgp-example
$ cd $GOPATH/src/msgp-example
$ touch main.go
Then open main.go in your editor and add the following:
package main

import "fmt"

//go:generate msgp

func main() {
	fmt.Println("Nothing to do yet...")
}

type Foo struct {
	Bar string  `msgp:"bar"`
	Baz float64 `msgp:"baz"`
}
(You can verify that this builds and runs with go build && ./msgp-example.)
Now let’s add some methods to Foo by running go generate:
$ go generate
======= MessagePack Code Generating =======
Input: main.go
Writing file: main_gen.go
Writing file: main_gen_test.go
Two files were generated: one gives the type Foo additional methods, and the other contains automatically generated tests for the generated file.
You can run the tests along with the benchmarks:
$ go test -v -bench .
=== RUN TestMarshalUnmarshalFoo
--- PASS: TestMarshalUnmarshalFoo (0.00s)
=== RUN TestEncodeDecodeFoo
--- PASS: TestEncodeDecodeFoo (0.00s)
goos: darwin
goarch: amd64
pkg: msgp-example
BenchmarkMarshalMsgFoo-8 50000000 32.7 ns/op 32 B/op 1 allocs/op
BenchmarkAppendMsgFoo-8 100000000 14.0 ns/op 1360.15 MB/s 0 B/op 0 allocs/op
BenchmarkUnmarshalFoo-8 30000000 47.3 ns/op 401.42 MB/s 0 B/op 0 allocs/op
BenchmarkEncodeFoo-8 50000000 29.2 ns/op 649.59 MB/s 0 B/op 0 allocs/op
BenchmarkDecodeFoo-8 20000000 69.5 ns/op 273.50 MB/s 0 B/op 0 allocs/op
PASS
ok msgp-example 10.534s
Let’s break down what happened here:
- go generate scanned each file in msgp-example for a go:generate directive.
- //go:generate msgp was found in main.go, which caused $GOFILE to be set to main.go for the code generator.
- msgp was invoked by go generate. The file named by $GOFILE was parsed, and type declarations were extracted.
- msgp created main_gen.go, which contains all of the generated methods, and main_gen_test.go, which has tests and benchmarks for each generated method.
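The steps above can be sketched from the generator's side. go generate exports the GOFILE environment variable to each command it runs; the sketch below reads it and derives the two output names seen in the log. The outputNames helper is a hypothetical illustration under that assumption, not the actual msgp source.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// outputNames derives the generated file names from a source file name,
// mirroring the main.go -> main_gen.go / main_gen_test.go pattern shown
// above. Hypothetical helper; msgp's real logic may differ.
func outputNames(src string) (gen, test string) {
	base := strings.TrimSuffix(src, ".go")
	return base + "_gen.go", base + "_gen_test.go"
}

func main() {
	// go generate sets GOFILE; fall back to main.go when run directly.
	src := os.Getenv("GOFILE")
	if src == "" {
		src = "main.go"
	}
	gen, test := outputNames(src)
	fmt.Println("Input:", src)
	fmt.Println("Writing file:", gen)
	fmt.Println("Writing file:", test)
}
```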
The key takeaway here is that msgp works on a per-file, not a per-package, basis. You can invoke the code generator on an entire directory at once by passing a directory path using the -src flag.
There are a couple of reasons why we designed msgp to operate on files rather than on Go packages:
- Integration with build tools like make is simple.
- Reading one file is much faster than reading a whole directory.
You may find it useful to put types requiring code generation in their own file, and put //go:generate msgp at the top of that file.
Let’s look at the generated code in main_gen.go:
(Note: the interfaces that the code generator implements are stable, but the code that it generates in order to implement those interfaces has changed over time in order to provide performance and stability improvements. Don’t be alarmed if you see output that’s different from what is listed below.)
package main

// THIS FILE WAS PRODUCED BY THE MSGP CODE GENERATION TOOL (github.com/dchenk/msgp).
// DO NOT EDIT.

import (
	"github.com/dchenk/msgp/msgp"
)

// DecodeMsg implements msgp.Decoder
func (z *Foo) DecodeMsg(dc *msgp.Reader) (err error) {
	var field []byte
	var zb0001 uint32
	zb0001, err = dc.ReadMapHeader()
	if err != nil {
		return
	}
	for zb0001 > 0 {
		zb0001--
		field, err = dc.ReadMapKeyPtr()
		if err != nil {
			return
		}
		switch string(field) {
		case "bar":
			z.Bar, err = dc.ReadString()
			if err != nil {
				return
			}
		case "baz":
			z.Baz, err = dc.ReadFloat64()
			if err != nil {
				return
			}
		default:
			err = dc.Skip()
			if err != nil {
				return
			}
		}
	}
	return
}

// EncodeMsg implements msgp.Encoder
func (z Foo) EncodeMsg(en *msgp.Writer) (err error) {
	// map header, size 2
	// write "bar"
	err = en.Append(0x82, 0xa3, 0x62, 0x61, 0x72)
	if err != nil {
		return
	}
	err = en.WriteString(z.Bar)
	if err != nil {
		return
	}
	// write "baz"
	err = en.Append(0xa3, 0x62, 0x61, 0x7a)
	if err != nil {
		return
	}
	err = en.WriteFloat64(z.Baz)
	if err != nil {
		return
	}
	return
}

// MarshalMsg implements msgp.Marshaler
func (z Foo) MarshalMsg(b []byte) (o []byte, err error) {
	o = msgp.Require(b, z.Msgsize())
	// map header, size 2
	// string "bar"
	o = append(o, 0x82, 0xa3, 0x62, 0x61, 0x72)
	o = msgp.AppendString(o, z.Bar)
	// string "baz"
	o = append(o, 0xa3, 0x62, 0x61, 0x7a)
	o = msgp.AppendFloat64(o, z.Baz)
	return
}

// UnmarshalMsg implements msgp.Unmarshaler
func (z *Foo) UnmarshalMsg(bts []byte) (o []byte, err error) {
	var field []byte
	var zb0001 uint32
	zb0001, bts, err = msgp.ReadMapHeaderBytes(bts)
	if err != nil {
		return
	}
	for zb0001 > 0 {
		zb0001--
		field, bts, err = msgp.ReadMapKeyZC(bts)
		if err != nil {
			return
		}
		switch string(field) {
		case "bar":
			z.Bar, bts, err = msgp.ReadStringBytes(bts)
			if err != nil {
				return
			}
		case "baz":
			z.Baz, bts, err = msgp.ReadFloat64Bytes(bts)
			if err != nil {
				return
			}
		default:
			bts, err = msgp.Skip(bts)
			if err != nil {
				return
			}
		}
	}
	o = bts
	return
}

// Msgsize returns an upper bound estimate of the number of bytes occupied by the serialized message
func (z Foo) Msgsize() (s int) {
	s = 1 + 4 + msgp.StringPrefixSize + len(z.Bar) + 4 + msgp.Float64Size
	return
}
As we just saw, by default there are 5 methods implemented by the code generator:
- MarshalMsg([]byte) ([]byte, error) implements msgp.Marshaler
- UnmarshalMsg([]byte) ([]byte, error) implements msgp.Unmarshaler
- EncodeMsg(*msgp.Writer) error implements msgp.Encoder
- DecodeMsg(*msgp.Reader) error implements msgp.Decoder
- Msgsize() int implements msgp.Sizer
Each of those methods implements an interface defined in the msgp library. In effect, the library at github.com/dchenk/msgp/msgp contains everything we need to encode and decode MessagePack, and the code generator exists simply to write boilerplate code using that library. We could, of course, implement all of these interfaces ourselves, but that would be unnecessarily laborious and error-prone. (Plus, the code generator can perform optimizations such as pre-encoding static strings, as in the example above. This would be especially cumbersome to write by hand!)
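The pre-encoded bytes in the generated code above can be reproduced by hand, which shows what the generator is saving us from. This is a minimal sketch based on the MessagePack format, where a map of up to 15 entries has the one-byte header 0x80|n and a string of up to 31 bytes has the one-byte header 0xa0|len. The helper names are hypothetical and not part of msgp.

```go
package main

import "fmt"

// fixMapHeader returns the one-byte header for a map with n entries.
// Hypothetical helper; only valid for n <= 15 (the "fixmap" range).
func fixMapHeader(n int) byte { return 0x80 | byte(n) }

// fixStr encodes a string of at most 31 bytes as a "fixstr":
// one header byte followed by the raw string bytes.
func fixStr(s string) []byte {
	return append([]byte{0xa0 | byte(len(s))}, s...)
}

// preEncode builds the static prefix written before the first field
// value: the map header followed by the first key.
func preEncode(fields int, firstKey string) []byte {
	return append([]byte{fixMapHeader(fields)}, fixStr(firstKey)...)
}

func main() {
	fmt.Printf("% x\n", preEncode(2, "bar")) // 82 a3 62 61 72
	fmt.Printf("% x\n", fixStr("baz"))       // a3 62 61 7a
}
```

The generator computes these bytes once, at generation time, and emits them as literals, so the key strings are never re-encoded at run time.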
The "memory interfaces" are interfaces through which chunks of memory ([]byte, in this case) are written or read as MessagePack.
Go veterans will notice that msgp.Marshaler differs slightly from the conventional Marshaler interfaces in the standard library (such as json.Marshaler) in that it takes a []byte as its only argument. The semantics of msgp.Marshaler dictate that it return a slice that is the concatenation of the input slice and the encoded object, and that it may use the memory between len and cap of the input slice where possible. In practice, this allows for zero-allocation marshaling. If you don’t happen to have a slice lying around that you can use, you can always pass a nil slice so that a new slice is allocated for you. There is a similar set of zero-allocation APIs in the standard library’s strconv package.
foo1 := Foo{ /* ... */ }
foo2 := Foo{ /* ... */ }
// data contains the body of foo1
data, _ := foo1.MarshalMsg(nil)
fmt.Printf("foo1 is encoded as %x\n", data)
// data is overwritten with the body of foo2. If it fits within
// the old slice, no new memory is allocated.
data, _ = foo2.MarshalMsg(data[:0])
fmt.Printf("foo2 is encoded as %x\n", data)
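The strconv functions mentioned above follow the same convention, so the pattern can be tried without msgp at all. In this sketch, appendRecord is a hypothetical function that appends to the caller's buffer and returns the extended slice, just as MarshalMsg does.

```go
package main

import (
	"fmt"
	"strconv"
)

// appendRecord appends "key=value;" to dst and returns the extended
// slice. It only allocates if dst has too little spare capacity.
func appendRecord(dst []byte, key string, value int64) []byte {
	dst = append(dst, key...)
	dst = append(dst, '=')
	dst = strconv.AppendInt(dst, value, 10)
	return append(dst, ';')
}

func main() {
	buf := make([]byte, 0, 64)

	// Appending twice concatenates the two records.
	buf = appendRecord(buf, "a", 1)
	buf = appendRecord(buf, "b", 22)
	fmt.Println(string(buf)) // a=1;b=22;

	// Passing buf[:0] reuses the same memory and overwrites the contents.
	buf = appendRecord(buf[:0], "c", 333)
	fmt.Println(string(buf), cap(buf)) // c=333; 64
}
```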
As you may have already guessed, the msgp.Unmarshaler interface is simply the inverse of the msgp.Marshaler interface. The returned []byte should be a sub-slice of the argument slice pointing to the memory not yet consumed.
For example, here’s a convoluted way to switch the values contained in two structs:
foo1 := Foo{ /* ... */ }
foo2 := Foo{ /* ... */ }
fmt.Printf("foo1: %v\n", foo1)
fmt.Printf("foo2: %v\n", foo2)
// Append two messages to the same slice.
data, _ := foo1.MarshalMsg(nil)
data, _ = foo2.MarshalMsg(data)
// Now just decode them in reverse:
data, _ = foo2.UnmarshalMsg(data)
data, _ = foo1.UnmarshalMsg(data)
// At this point, len(data) should be 0.
fmt.Println("len(data) =", len(data))
fmt.Printf("foo1: %v\n", foo1)
fmt.Printf("foo2: %v\n", foo2)
Because MessagePack is self-describing, we can interleave it with other pieces of data without framing and still re-construct the original input. (Notably, the same cannot be said of a number of other popular protocols, including Protocol Buffers.)
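To see why no framing is needed, consider a hand-rolled reader for a single MessagePack type. The sketch below handles only short strings (the "fixstr" form, at most 31 bytes); both helpers are hypothetical and far simpler than what msgp provides. Each value carries its own length in its header, so the reader knows where one message ends and can hand back the unread remainder, in the same style as UnmarshalMsg.

```go
package main

import (
	"errors"
	"fmt"
)

// appendFixStr appends s as a MessagePack fixstr. It assumes
// len(s) <= 31; longer strings need a different header.
func appendFixStr(dst []byte, s string) []byte {
	dst = append(dst, 0xa0|byte(len(s)))
	return append(dst, s...)
}

// readFixStr decodes one fixstr from the front of b and returns the
// bytes that were not consumed.
func readFixStr(b []byte) (s string, rest []byte, err error) {
	if len(b) == 0 || b[0]&0xe0 != 0xa0 {
		return "", b, errors.New("not a fixstr")
	}
	n := int(b[0] & 0x1f) // low five bits hold the length
	if len(b) < 1+n {
		return "", b, errors.New("short buffer")
	}
	return string(b[1 : 1+n]), b[1+n:], nil
}

func main() {
	// Two messages back to back, with no length prefix or delimiter.
	data := appendFixStr(nil, "hello")
	data = appendFixStr(data, "msgp")

	first, data, _ := readFixStr(data)
	second, data, _ := readFixStr(data)
	fmt.Println(first, second, len(data)) // hello msgp 0
}
```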
Streaming interfaces are interfaces through which MessagePack can be written to an io.Writer or read from an io.Reader.
msgp handles streaming a little differently than the Go standard library. The msgp.Writer and msgp.Reader types are MessagePack-aware versions of bufio.Writer and bufio.Reader, respectively.
The implementation of msgp.Encoder writes the object to the msgp.Writer. Since the buffered writer maintains its own buffer, no memory allocation is performed.
foo := Foo{ /* ... */ }
w := msgp.NewWriter(os.Stdout)
foo.EncodeMsg(w)
w.Flush()
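Since msgp.Writer is described above as a MessagePack-aware bufio.Writer, the role of Flush can be illustrated with the standard library alone. The sketch below uses bufio.Writer as a stand-in, on the assumption that msgp.Writer buffers in the same way: nothing reaches the destination until Flush is called.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
)

// pendingAndFlushed writes p through a bufio.Writer and reports how
// many bytes the destination holds before and after Flush.
func pendingAndFlushed(p []byte) (before, after int) {
	var dst bytes.Buffer
	w := bufio.NewWriter(&dst)
	w.Write(p) // a small write stays in the writer's internal buffer
	before = dst.Len()
	w.Flush()
	after = dst.Len()
	return before, after
}

func main() {
	before, after := pendingAndFlushed([]byte{0x82, 0xa3, 0x62, 0x61, 0x72})
	fmt.Println(before, after) // 0 5
}
```

Forgetting the final Flush is an easy mistake to make: the encode call succeeds, but the reader on the other end never sees the data.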
msgp.Decoder, as you may have already guessed, is the converse of msgp.Encoder. It is the interface through which objects read themselves out of a msgp.Reader.
pr, pw := io.Pipe()
go func() {
	w := msgp.NewWriter(pw)
	fooIn := Foo{ /* ... */ }
	fmt.Printf("fooIn is %v\n", fooIn)
	fooIn.EncodeMsg(w)
	w.Flush()
}()
var fooOut Foo
fooOut.DecodeMsg(msgp.NewReader(pr))
fmt.Printf("fooOut is %v\n", fooOut)
msgp.Sizer is a helper interface used in a couple of places inside the msgp library, as well as in the implementation of msgp.Marshaler. Its purpose is to estimate how much memory must be allocated to hold a particular object once it is encoded. (In practice, it systematically over-estimates the encoded size of the object.)
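The over-estimate can be worked through for Foo using the Msgsize expression from the generated code. This sketch assumes that msgp.StringPrefixSize is 5 (the largest string header) and msgp.Float64Size is 9; those values are assumptions, so check the library source before relying on them. The exact size uses the MessagePack string header lengths of 1, 2, 3, or 5 bytes, depending on the string length.

```go
package main

import "fmt"

// Assumed values of msgp.StringPrefixSize and msgp.Float64Size.
const (
	stringPrefixSize = 5 // worst-case string header
	float64Size      = 9 // one type byte plus eight data bytes
)

// strHeaderLen returns the header size MessagePack needs for a string
// of n bytes: fixstr, str8, str16, or str32.
func strHeaderLen(n int) int {
	switch {
	case n <= 31:
		return 1
	case n <= 255:
		return 2
	case n <= 65535:
		return 3
	default:
		return 5
	}
}

// sizeBound mirrors the generated Msgsize expression for Foo:
// map header + "bar" key + string value + "baz" key + float64 value.
func sizeBound(bar string) int {
	return 1 + 4 + stringPrefixSize + len(bar) + 4 + float64Size
}

// exactSize is the encoded size of Foo when Baz is written as a
// full float64.
func exactSize(bar string) int {
	return 1 + 4 + strHeaderLen(len(bar)) + len(bar) + 4 + float64Size
}

func main() {
	fmt.Println(sizeBound("hi"), exactSize("hi")) // 25 21
}
```

Under these assumptions the bound is never too small, and it is computed with a few additions rather than by inspecting each value's encoding.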