This repository has been archived by the owner on May 8, 2019. It is now read-only.

Getting Started

Dmitriy edited this page May 9, 2018 · 5 revisions

This repository includes two things: a code generation tool and a Go library that the generated code uses to serialize and deserialize MessagePack data. You need both.

The primary difference between msgp and most other serialization libraries for Go is that msgp doesn't perform runtime reflection. Instead, you first run the code generator to produce the appropriate methods for the types you want to serialize or deserialize. At runtime, those generated methods call into the library, so your types never have to be inspected.

Download and install the code generator and library:

$ go get -u -t github.com/dchenk/msgp

Hello World Example

First, create a new directory in your GOPATH with a main.go file inside.

$ mkdir -p $GOPATH/src/msgp-example
$ cd $GOPATH/src/msgp-example
$ touch main.go

Then open main.go in your editor and add the following:

package main

import "fmt"

//go:generate msgp

func main() {
	fmt.Println("Nothing to do yet...")
}

type Foo struct {
	Bar string  `msgp:"bar"`
	Baz float64 `msgp:"baz"`
}

(You can verify that this builds and runs with $ go build && ./msgp-example.)

Now let’s add some methods to Foo by running go generate:

$ go generate
======= MessagePack Code Generating =======
   Input: main.go
   Writing file: main_gen.go
   Writing file: main_gen_test.go

Two files were generated, one giving the type Foo additional methods and the other containing automatically generated tests for the generated file.

You can run the tests along with the benchmarks:

$ go test -v -bench .
=== RUN   TestMarshalUnmarshalFoo
--- PASS: TestMarshalUnmarshalFoo (0.00s)
=== RUN   TestEncodeDecodeFoo
--- PASS: TestEncodeDecodeFoo (0.00s)
goos: darwin
goarch: amd64
pkg: msgp-example
BenchmarkMarshalMsgFoo-8   	50000000	        32.7 ns/op	      32 B/op	       1 allocs/op
BenchmarkAppendMsgFoo-8    	100000000	        14.0 ns/op	1360.15 MB/s	       0 B/op	       0 allocs/op
BenchmarkUnmarshalFoo-8    	30000000	        47.3 ns/op	 401.42 MB/s	       0 B/op	       0 allocs/op
BenchmarkEncodeFoo-8       	50000000	        29.2 ns/op	 649.59 MB/s	       0 B/op	       0 allocs/op
BenchmarkDecodeFoo-8       	20000000	        69.5 ns/op	 273.50 MB/s	       0 B/op	       0 allocs/op
PASS
ok  	msgp-example	10.534s

Let’s break down what happened here:

  • go generate scanned each file in msgp-example for a go:generate directive.
  • //go:generate msgp was found in main.go, which caused $GOFILE to be set to main.go for the code generator.
  • msgp was invoked by go generate. The $GOFILE was parsed, and type declarations were extracted.
  • msgp created main_gen.go, which contains all of the generated methods, and main_gen_test.go, which has tests and benchmarks for each generated method.
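
The parsing step described above can be sketched with the standard library's go/parser and go/ast packages. This is only an illustration of how a generator might find type declarations in the file named by $GOFILE; it is not msgp's actual implementation, and typeNames and the embedded sample source are hypothetical.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"os"
)

// sample stands in for a real source file so the sketch runs on its own.
const sample = `package main

type Foo struct {
	Bar string
	Baz float64
}

func main() {}
`

// typeNames parses one Go source file and returns the names of its
// top-level type declarations, in source order. If src is nil, the
// parser reads the file called filename from disk.
// (Hypothetical helper for illustration; not part of msgp.)
func typeNames(filename string, src interface{}) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, filename, src, 0)
	if err != nil {
		return nil, err
	}
	var names []string
	for _, decl := range file.Decls {
		gen, ok := decl.(*ast.GenDecl)
		if !ok || gen.Tok != token.TYPE {
			continue // skip funcs, imports, vars, and consts
		}
		for _, spec := range gen.Specs {
			if ts, ok := spec.(*ast.TypeSpec); ok {
				names = append(names, ts.Name.Name)
			}
		}
	}
	return names, nil
}

func main() {
	// Default to the embedded sample; use $GOFILE when run via go generate.
	name, src := "sample.go", interface{}(sample)
	if f := os.Getenv("GOFILE"); f != "" {
		name, src = f, nil
	}
	names, err := typeNames(name, src)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("Input:", name)
	for _, n := range names {
		fmt.Println("type:", n)
	}
}
```

Run directly, it reports the single type Foo from the embedded sample. Run through a go:generate directive, it would inspect whichever file contains that directive.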

The key takeaway here is that msgp works on a per-file, not a per-package, basis. You can invoke the code generator on an entire directory at once by passing a directory path using the -src flag.

There are a couple of reasons why we designed msgp to operate on files rather than on Go packages:

  • Integration with build tools like make is simple.
  • Reading one file is much faster than reading a whole directory.

You may find it useful to put types requiring code generation in their own file, and put //go:generate msgp at the top of that file.

Let’s look at the generated code in main_gen.go:

(Note: the interfaces that the code generator implements are stable, but the code that it generates in order to implement those interfaces has changed over time in order to provide performance and stability improvements. Don’t be alarmed if you see output that’s different from what is listed below.)

package main

// THIS FILE WAS PRODUCED BY THE MSGP CODE GENERATION TOOL (github.com/dchenk/msgp).
// DO NOT EDIT.

import (
	"github.com/dchenk/msgp/msgp"
)

// DecodeMsg implements msgp.Decoder
func (z *Foo) DecodeMsg(dc *msgp.Reader) (err error) {
	var field []byte
	var zb0001 uint32
	zb0001, err = dc.ReadMapHeader()
	if err != nil {
		return
	}
	for zb0001 > 0 {
		zb0001--
		field, err = dc.ReadMapKeyPtr()
		if err != nil {
			return
		}
		switch string(field) {
		case "bar":
			z.Bar, err = dc.ReadString()
			if err != nil {
				return
			}
		case "baz":
			z.Baz, err = dc.ReadFloat64()
			if err != nil {
				return
			}
		default:
			err = dc.Skip()
			if err != nil {
				return
			}
		}
	}
	return
}

// EncodeMsg implements msgp.Encoder
func (z Foo) EncodeMsg(en *msgp.Writer) (err error) {
	// map header, size 2
	// write "bar"
	err = en.Append(0x82, 0xa3, 0x62, 0x61, 0x72)
	if err != nil {
		return
	}
	err = en.WriteString(z.Bar)
	if err != nil {
		return
	}
	// write "baz"
	err = en.Append(0xa3, 0x62, 0x61, 0x7a)
	if err != nil {
		return
	}
	err = en.WriteFloat64(z.Baz)
	if err != nil {
		return
	}
	return
}

// MarshalMsg implements msgp.Marshaler
func (z Foo) MarshalMsg(b []byte) (o []byte, err error) {
	o = msgp.Require(b, z.Msgsize())
	// map header, size 2
	// string "bar"
	o = append(o, 0x82, 0xa3, 0x62, 0x61, 0x72)
	o = msgp.AppendString(o, z.Bar)
	// string "baz"
	o = append(o, 0xa3, 0x62, 0x61, 0x7a)
	o = msgp.AppendFloat64(o, z.Baz)
	return
}

// UnmarshalMsg implements msgp.Unmarshaler
func (z *Foo) UnmarshalMsg(bts []byte) (o []byte, err error) {
	var field []byte
	var zb0001 uint32
	zb0001, bts, err = msgp.ReadMapHeaderBytes(bts)
	if err != nil {
		return
	}
	for zb0001 > 0 {
		zb0001--
		field, bts, err = msgp.ReadMapKeyZC(bts)
		if err != nil {
			return
		}
		switch string(field) {
		case "bar":
			z.Bar, bts, err = msgp.ReadStringBytes(bts)
			if err != nil {
				return
			}
		case "baz":
			z.Baz, bts, err = msgp.ReadFloat64Bytes(bts)
			if err != nil {
				return
			}
		default:
			bts, err = msgp.Skip(bts)
			if err != nil {
				return
			}
		}
	}
	o = bts
	return
}

// Msgsize returns an upper bound estimate of the number of bytes occupied by the serialized message
func (z Foo) Msgsize() (s int) {
	s = 1 + 4 + msgp.StringPrefixSize + len(z.Bar) + 4 + msgp.Float64Size
	return
}

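As we just saw, the code generator implements five methods by default:

  • DecodeMsg, which implements msgp.Decoder
  • EncodeMsg, which implements msgp.Encoder
  • MarshalMsg, which implements msgp.Marshaler
  • UnmarshalMsg, which implements msgp.Unmarshaler
  • Msgsize, which implements msgp.Sizer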

Each of those methods is an implementation of an interface defined in the msgp library. In effect, the library at github.com/dchenk/msgp/msgp contains everything we need to encode and decode MessagePack, and the code generator exists simply to write boilerplate code using that library. We could, of course, implement all of these interfaces ourselves, but that would be unnecessarily laborious and error-prone. (Plus, the code generator can perform optimizations such as pre-encoding static strings, as it does with the field names in the example above. This would be especially cumbersome to write by hand!)
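
To see where the pre-encoded bytes in the generated EncodeMsg and MarshalMsg come from, here is a minimal sketch that derives them by hand from the MessagePack format: a small map header is 0x80 ORed with the entry count, and a short string header is 0xa0 ORed with the length. The helpers fixMapHeader and appendFixStr are hypothetical names for illustration, not part of the msgp library.

```go
package main

import "fmt"

// fixMapHeader returns the one-byte MessagePack header for a map with
// n entries. The fixmap form is only valid for n <= 15.
func fixMapHeader(n int) byte {
	return 0x80 | byte(n)
}

// appendFixStr appends s in MessagePack fixstr form: one header byte
// followed by the raw bytes. It is only valid for len(s) <= 31.
func appendFixStr(b []byte, s string) []byte {
	b = append(b, 0xa0|byte(len(s)))
	return append(b, s...)
}

func main() {
	// Map header for two fields, followed by the key "bar".
	prefix := appendFixStr([]byte{fixMapHeader(2)}, "bar")
	fmt.Printf("% x\n", prefix) // 82 a3 62 61 72

	// The key "baz" on its own.
	fmt.Printf("% x\n", appendFixStr(nil, "baz")) // a3 62 61 7a
}
```

Because the field names are known at generation time, these bytes never change, so the generated code can emit them as constants instead of encoding each key on every call.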

Memory Interfaces

The "memory interfaces" are interfaces through which chunks of memory ([]byte, in this case) are written or read as MessagePack.

Go veterans will notice that msgp.Marshaler differs slightly from the conventional Marshaler interfaces in the standard library (such as json.Marshaler) in that it takes a []byte as its first and only argument. The semantics of msgp.Marshaler dictate that it return a slice that is the concatenation of the input slice and the body of the object itself, and that it is allowed to use the memory between len and cap if at all possible. In practice, this allows for zero-allocation marshaling. If you don’t happen to have a slice lying around that you can use, you can always pass a nil slice so that a new slice is allocated for you. There is a similar set of zero-allocation APIs in the standard library’s strconv package.

foo1 := Foo{ /* ... */ }
foo2 := Foo{ /* ... */ }

// data contains the body of foo1
data, _ := foo1.MarshalMsg(nil)

fmt.Printf("foo1 is encoded as %x\n", data)

// data is overwritten with the body of foo2. If it fits within
// the old slice, no new memory is allocated.
data, _ = foo2.MarshalMsg(data[:0])

fmt.Printf("foo2 is encoded as %x\n", data)
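
For comparison, the append-style functions in strconv follow the same convention: they take a destination slice, append to it, and return the extended slice. The sketch below uses only the standard library; appendRecord is a hypothetical helper for illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// appendRecord appends a decimal id, a space, and a quoted name to buf
// and returns the extended slice. If buf has enough spare capacity, no
// new memory is allocated.
func appendRecord(buf []byte, id int64, name string) []byte {
	buf = strconv.AppendInt(buf, id, 10)
	buf = append(buf, ' ')
	return strconv.AppendQuote(buf, name)
}

func main() {
	buf := make([]byte, 0, 64)

	buf = appendRecord(buf, 42, "bar")
	fmt.Printf("%s\n", buf) // 42 "bar"

	// Truncating to buf[:0] keeps the backing array, so the second
	// record overwrites the first in place.
	buf = appendRecord(buf[:0], 7, "baz")
	fmt.Printf("%s\n", buf) // 7 "baz"
}
```

Passing data[:0] to MarshalMsg works the same way as passing buf[:0] here: the length is reset while the capacity stays available for reuse.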

As you may have already guessed, the msgp.Unmarshaler interface is simply the inverse of the msgp.Marshaler interface. The returned []byte should be a sub-slice of the argument slice pointing to the memory not yet consumed.

For example, here’s a convoluted way to switch the values contained in two structs:

foo1 := Foo{ /* ... */ }
foo2 := Foo{ /* ... */ }

fmt.Printf("foo1: %v\n", foo1)
fmt.Printf("foo2: %v\n", foo2)

// Append two messages to the same slice.
data, _ := foo1.MarshalMsg(nil)
data, _ = foo2.MarshalMsg(data)

// Now decode each message into the opposite struct:
data, _ = foo2.UnmarshalMsg(data)
data, _ = foo1.UnmarshalMsg(data)

// At this point, len(data) should be 0.
fmt.Println("len(data) =", len(data))

fmt.Printf("foo1: %v\n", foo1)
fmt.Printf("foo2: %v\n", foo2)

Because MessagePack is self-describing (every value carries its own type and length), we can interleave it with other pieces of data without framing and still reconstruct the original input. (Notably, the same cannot be said of a number of other popular protocols, including Protocol Buffers.)
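
The "return the unconsumed remainder" convention can be shown without the library. The sketch below hand-decodes back-to-back MessagePack short strings; readFixStr is a hypothetical, deliberately limited stand-in (it handles only the fixstr form) and is not how msgp itself is implemented.

```go
package main

import (
	"errors"
	"fmt"
)

// readFixStr decodes one MessagePack fixstr from the front of b and
// returns the string together with the unconsumed remainder of b.
// On error, b is returned unchanged.
func readFixStr(b []byte) (string, []byte, error) {
	// A fixstr header has the bit pattern 101xxxxx (0xa0 through 0xbf).
	if len(b) == 0 || b[0]&0xe0 != 0xa0 {
		return "", b, errors.New("not a fixstr")
	}
	n := int(b[0] & 0x1f) // the low five bits hold the length
	if len(b) < 1+n {
		return "", b, errors.New("short buffer")
	}
	return string(b[1 : 1+n]), b[1+n:], nil
}

func main() {
	// Two messages written back to back, with no framing between them.
	data := []byte{0xa3, 'b', 'a', 'r', 0xa3, 'b', 'a', 'z'}

	for len(data) > 0 {
		var (
			s   string
			err error
		)
		s, data, err = readFixStr(data)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("%q, %d bytes left\n", s, len(data))
	}
}
```

Each call learns the length of its message from the header byte, which is why no separator or length prefix is needed between messages.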

Streaming Interfaces

Streaming interfaces are interfaces through which MessagePack can be written to an io.Writer or read from an io.Reader.

msgp handles streaming a little differently than the Go standard library. The msgp.Writer and msgp.Reader types are MessagePack-aware versions of bufio.Writer and bufio.Reader, respectively.

The implementation of msgp.Encoder writes the object to the msgp.Writer. Since the buffered writer maintains its own buffer, no memory allocation is performed.

foo := Foo{ /* ... */ }

w := msgp.NewWriter(os.Stdout)
foo.EncodeMsg(w)
w.Flush()
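
The call to Flush matters for the same reason it does with bufio.Writer: written bytes sit in the buffer until it fills up or is flushed. Here is a minimal standard-library sketch of that behavior, using bufio.Writer as a stand-in for msgp.Writer; flushDemo is a hypothetical helper.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
)

// flushDemo writes a short string through a bufio.Writer and reports
// how many bytes had reached the underlying buffer before and after
// the call to Flush.
func flushDemo() (before, after int, err error) {
	var sink bytes.Buffer
	w := bufio.NewWriter(&sink)

	if _, err = w.WriteString("hello"); err != nil {
		return
	}
	before = sink.Len() // still held in the writer's internal buffer

	if err = w.Flush(); err != nil {
		return
	}
	after = sink.Len()
	return
}

func main() {
	before, after, err := flushDemo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("before Flush: %d bytes, after Flush: %d bytes\n", before, after)
}
```

Forgetting to flush is an easy mistake with any buffered writer, and the symptom is output that is truncated or missing entirely.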

msgp.Decoder, as you may have already guessed, is the converse of msgp.Encoder. It is the interface through which objects read themselves out of a msgp.Reader.

pr, pw := io.Pipe()

go func() {
    w := msgp.NewWriter(pw)
    fooIn := Foo{ /* ... */ }
    fmt.Printf("fooIn is %v\n", fooIn)
    fooIn.EncodeMsg(w)
    w.Flush()
}()

var fooOut Foo
fooOut.DecodeMsg(msgp.NewReader(pr))

fmt.Printf("fooOut is %v\n", fooOut)

Helper Methods

msgp.Sizer is a helper interface used in a couple of places inside the msgp library, as well as in the implementation of msgp.Marshaler. Its purpose is to estimate how much memory to allocate for the encoded form of an object. (In practice, it systematically over-estimates the encoded size of the object.)
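
To make the over-estimate concrete, the sketch below compares the arithmetic from the generated Msgsize method with the length of a hand-encoded message. It assumes that msgp.StringPrefixSize is 5 and msgp.Float64Size is 9, the sizes of the largest MessagePack string header and of an encoded float64. Those values, the local Foo type, and the encode helper are assumptions for illustration, so check them against the library before relying on them.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

const (
	stringPrefixSize = 5 // assumed: 1 type byte + 4 length bytes (str32)
	float64Size      = 9 // assumed: 1 type byte + 8 payload bytes
)

// Foo mirrors the struct from the Hello World example.
type Foo struct {
	Bar string
	Baz float64
}

// upperBound repeats the arithmetic of the generated Msgsize method.
func (z Foo) upperBound() int {
	return 1 + 4 + stringPrefixSize + len(z.Bar) + 4 + float64Size
}

// encode hand-encodes z following the MessagePack format. It supports
// only len(z.Bar) <= 31, the range of the one-byte fixstr header.
func (z Foo) encode() []byte {
	b := []byte{0x82, 0xa3, 'b', 'a', 'r'} // map of 2, key "bar"
	b = append(b, 0xa0|byte(len(z.Bar)))   // fixstr header
	b = append(b, z.Bar...)
	b = append(b, 0xa3, 'b', 'a', 'z', 0xcb) // key "baz", float64 marker

	var bits [8]byte
	binary.BigEndian.PutUint64(bits[:], math.Float64bits(z.Baz))
	return append(b, bits[:]...)
}

func main() {
	foo := Foo{Bar: "hi", Baz: 1.5}
	fmt.Println("upper bound: ", foo.upperBound())  // 25
	fmt.Println("encoded size:", len(foo.encode())) // 21
}
```

The gap comes from the string header: the estimate reserves room for the largest header, while a short string needs only one byte. Reserving a few extra bytes is cheaper than growing the slice in the middle of encoding.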