This repository was archived by the owner on Sep 9, 2024. It is now read-only.

Conversation

@suhailpatel
Member

@suhailpatel suhailpatel commented Jun 22, 2019

This PR replaces the use of the mapstructure library with custom reflection optimised for gocassa internals. Doing so leads to much more performant decoding.

⚠️ Given the extensive changes in the interface, this PR bumps the version of gocassa to v2.0.0

Motivation

The motivation for this PR came from profiling our services. It turns out that quite a lot of services which use Cassandra at scale were spending 50-60%+ of their CPU within gocassa's decodeResult step (with a sizeable ~10% additionally spent in MapScan). decodeResult is a direct invocation of mapstructure.

Given I was going to tweak the interfaces, I took the liberty of cleaning up and adding a Statement object instead of ferrying around a query string and list of values. This can also encapsulate field name information under the hood which is used in our scanner.
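As a rough illustration, a Statement of the kind described might look like the sketch below. The field and method names here are assumptions for illustration, not the actual gocassa v2 API:

```go
package main

import "fmt"

// Statement is a minimal sketch of the object described above: it
// bundles the query string, its bound values, and the selected field
// names, instead of ferrying a string and a list of values around
// separately. The real gocassa v2 type differs in detail.
type Statement struct {
	query      string
	values     []interface{}
	fieldNames []string
}

func (s Statement) Query() string         { return s.query }
func (s Statement) Values() []interface{} { return s.values }
func (s Statement) FieldNames() []string  { return s.fieldNames }

func main() {
	stmt := Statement{
		query:      "SELECT id, name FROM users WHERE id = ?",
		values:     []interface{}{42},
		fieldNames: []string{"id", "name"},
	}
	fmt.Println(stmt.Query(), stmt.FieldNames())
}
```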

Change

This patch essentially hooks into the gocql Scanner directly rather than going through a map for SELECT queries.

Previously we had two levels of reflection when reading via gocassa:

  • A conversion from the Cassandra type into a map[string]interface{} within gocql. This used the MapScan function in gocql and thus results in a map per result allocated and populated
  • A conversion using decodeResult in gocassa to convert each map into the target struct

Now we have a direct reflection into struct by leveraging the ordering of fields which gocassa also knows about since gocassa is what constructs the SELECT queries in the first place!
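The shape of that single reflection pass can be sketched as follows. This toy version assumes struct fields are declared in column order and skips the name-to-index matching a real implementation would do:

```go
package main

import (
	"fmt"
	"reflect"
)

// scanInto is a toy sketch of the approach described above: because
// gocassa constructed the SELECT, it knows the column order, so it can
// set each struct field directly from the row in that same order,
// skipping the intermediate map[string]interface{} entirely.
func scanInto(dest interface{}, row []interface{}) {
	v := reflect.ValueOf(dest).Elem()
	for i := 0; i < v.NumField(); i++ {
		// Assumes row[i] has exactly the field's type; a real scanner
		// would hand pointers to the iterator's Scan instead.
		v.Field(i).Set(reflect.ValueOf(row[i]))
	}
}

type blog struct {
	ID   string
	Body []byte
}

func main() {
	var b blog
	scanInto(&b, []interface{}{"post-1", []byte("hello")})
	fmt.Println(b.ID, string(b.Body))
}
```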

Benchmarks

gocassa on master (the benchmark test is at 96f213d):

goos: darwin
goarch: amd64
pkg: github.com/monzo/gocassa
BenchmarkDecodeBlogSliceNoBody-4       	    3000	    480115 ns/op	   73486 B/op	    1687 allocs/op
BenchmarkDecodeBlogStruct-4            	     100	  12456151 ns/op	  375150 B/op	   30385 allocs/op
BenchmarkDecodeBlogStructEmptyBody-4   	  100000	     22711 ns/op	    3489 B/op	      68 allocs/op
BenchmarkDecodeAlphaSlice-4            	    1000	   1472692 ns/op	  306253 B/op	    5007 allocs/op
BenchmarkDecodeAlphaStruct-4           	   20000	     68542 ns/op	   14407 B/op	     174 allocs/op
PASS
ok  	github.com/monzo/gocassa	9.605s

These values are very high, and they don't even account for the map allocation/population! I did some profiling, and it turns out the DecodeHook functionality we use (to convert from gocql's various integer types and to support UnmarshalCQL) adds a very large overhead! I briefly removed the hooks and ran another test.

gocassa on master without DecodeHook functions (the benchmark test is at 96f213d):

goos: darwin
goarch: amd64
pkg: github.com/monzo/gocassa
BenchmarkDecodeBlogSliceNoBody-4       	   10000	    181272 ns/op	   73410 B/op	    1685 allocs/op
BenchmarkDecodeBlogStruct-4            	    1000	   1836223 ns/op	  374693 B/op	   30383 allocs/op
BenchmarkDecodeBlogStructEmptyBody-4   	  200000	      7485 ns/op	    3425 B/op	      66 allocs/op
BenchmarkDecodeAlphaSlice-4            	    2000	    720789 ns/op	  306197 B/op	    5005 allocs/op
BenchmarkDecodeAlphaStruct-4           	   50000	     35752 ns/op	   14339 B/op	     172 allocs/op
PASS
ok  	github.com/monzo/gocassa	10.730s

Still quite high, but there is a noticeable improvement, especially in BenchmarkDecodeBlogStruct, where decoding a single struct with a populated []byte was making the per-operation latency 9x slower. Note that we can't simply remove these functions in real use, because our integer types would then be wrong. You don't even need to be using any of this functionality; merely running the conversion through these decode hooks adds a large amount of latency.

gocassa with this PR on decoding into structs using the iterator directly:

goos: darwin
goarch: amd64
pkg: github.com/monzo/gocassa
BenchmarkDecodeBlogSliceNoBody-4       	   50000	     34113 ns/op	   21280 B/op	     235 allocs/op
BenchmarkDecodeBlogStruct-4            	  500000	      3403 ns/op	    2784 B/op	      32 allocs/op
BenchmarkDecodeBlogStructEmptyBody-4   	  500000	      3281 ns/op	    2784 B/op	      32 allocs/op
BenchmarkDecodeAlphaSlice-4            	   10000	    103263 ns/op	   43836 B/op	     810 allocs/op
BenchmarkDecodeAlphaStruct-4           	  200000	      9768 ns/op	    9019 B/op	      92 allocs/op
PASS
ok  	github.com/monzo/gocassa	9.222s

This is a 5-10x+ improvement in latency per operation, with significantly reduced memory allocations, achieved by hooking into gocql directly. If you are using byte slices or similar data structures within your target unmarshal struct, the improvement will be massive (upwards of hundreds of times faster).

benchmark                                old ns/op     new ns/op     delta
BenchmarkDecodeBlogSliceNoBody-4         480115        34113         -92.89%
BenchmarkDecodeBlogStruct-4              12456151      3403          -99.97%
BenchmarkDecodeBlogStructEmptyBody-4     22711         3281          -85.55%
BenchmarkDecodeAlphaSlice-4              1472692       103263        -92.99%
BenchmarkDecodeAlphaStruct-4             68542         9768          -85.75%

benchmark                                old allocs     new allocs     delta
BenchmarkDecodeBlogSliceNoBody-4         1687           235            -86.07%
BenchmarkDecodeBlogStruct-4              30385          32             -99.89%
BenchmarkDecodeBlogStructEmptyBody-4     68             32             -52.94%
BenchmarkDecodeAlphaSlice-4              5007           810            -83.82%
BenchmarkDecodeAlphaStruct-4             174            92             -47.13%

benchmark                                old bytes     new bytes     delta
BenchmarkDecodeBlogSliceNoBody-4         73486         21280         -71.04%
BenchmarkDecodeBlogStruct-4              375150        2784          -99.26%
BenchmarkDecodeBlogStructEmptyBody-4     3489          2784          -20.21%
BenchmarkDecodeAlphaSlice-4              306253        43836         -85.69%
BenchmarkDecodeAlphaStruct-4             14407         9019          -37.40%

This PR isn't adding any hidden work on the gocql side, because this iteration cost was already being paid by MapScan, which does the same work (in fact, as of writing, MapScan uses gocql.Iter.Scan() under the hood).

This replaces quite a bit of the gocassa internals to return an object
for the query and values rather than a string and a variadic interface.

This refactoring sets us up to also return the column names in the
Statement, and potentially more data in the future if needed. To
achieve this, we've had to tweak the public interface significantly,
hence the bump to 2.0.

This PR adds type information to the fields we are selecting when
making a SELECT query. This information can be used by consumers to do
reflection and allocate the relevant pointers for scanning using the
gocql iter.Scan method.

To be frank, it's quite a leaky abstraction, but it is highly optimal
because you don't end up doing multiple reflection steps and can
support things like custom CQL Unmarshal methods without a lot of work.
Initially I wanted to separate the concerns of iteration out to the
backend so it had more control. However, this would mean each backend
would need to do its own reflection work, which is hard to test and
hard to assert correctness for. It would also be a very leaky
abstraction, because gocassa would need to expose the field types
externally. This is messy and prone to error.

Instead, this change passes in a scanner which takes an iterator. We can
hook in a mock iterator (as long as it satisfies the same interface) or
a gocql iterator. That way, gocassa stays in control of what happens
with the iterator when scanning rows.
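A minimal sketch of that arrangement, with an assumed iterator interface and a mock satisfying it (the interface and method shapes here are illustrative, chosen to mirror gocql.Iter.Scan's pointer-filling style, not gocassa's exact definitions):

```go
package main

import "fmt"

// Scannable sketches the iterator boundary described above: both the
// real gocql iterator and a test double can satisfy it, so the scanner
// never needs to know which one it has.
type Scannable interface {
	// Scan fills the given pointers with the next row, returning false
	// when the iterator is exhausted.
	Scan(dest ...interface{}) bool
}

// mockIter serves canned string rows, mimicking a test double.
type mockIter struct {
	rows [][]string
	pos  int
}

func (m *mockIter) Scan(dest ...interface{}) bool {
	if m.pos >= len(m.rows) {
		return false
	}
	for i, val := range m.rows[m.pos] {
		*(dest[i].(*string)) = val
	}
	m.pos++
	return true
}

func main() {
	var it Scannable = &mockIter{rows: [][]string{{"a"}, {"b"}}}
	var s string
	for it.Scan(&s) {
		fmt.Println(s)
	}
}
```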

In order for the Scanner to work, we've added a bit more reflection code
which will use the cache to ensure that for a given struct type, we only
need to do the hardcore work of figuring out fields once.
Similar to the previous commit, we don't want a leaky abstraction around
field types when parsing a statement. It also turns out that a lot of
programs pass in completely different struct types during unmarshalling
(sometimes with fields removed or with different types) than the ones
passed in during table initialisation. We could have forbidden this
with a type check as an optimisation (and that option is still there),
but for now that seems quite heavy-handed.

Because of this, we can get rid of some cruft we added around logging
the struct on table creation and generating field types.
This is simply a semantic change: doing a scan and parsing a raw map
isn't how we do things nowadays. Types and structs for days!
This removes the use of the mapstructure library. Now the decode step is
embedded as part of the scanner which handles all sorts of pointer magic
and putting the pointers into the right places in the struct.

Also cleans up a stray use of generateFieldTypes which we got rid of in
a prior commit.
The bulk of the change is tweaking the tests which tested the
mapstructure decoder to use the scanner instead. Since the scanner
depends on an iterator, we've added a mockIterator which looks at the
fields and coerces the values from the results map in the order
specified.
Adds some sanity checks: given this is a publicly exposed interface and
the implementation is passed across this interface boundary, it's worth
validating that the implementation is correct ahead of future changes.
In an ideal world, we would rip out all of this code and replace it with
a call to UnmarshalCQL with some crafted params. This would give us the
benefit of all the checks that gocql has built in.

Either way, this PR locks down the current behaviour so when we do the
migration in the future, we can assert for major changes.
Part of this work involved some new functions on the reflection side to
take a struct and turn it into a map. We've also changed the semantics
of Field (especially now that it's public) so it no longer
automatically follows pointers. If a struct field's type is *int, it
shouldn't be represented as int.
This commit is the cleanup commit. We've all been there: you've been
prototyping, and in the middle of getting things done you write a bunch
of hacky code all over the place. For this refactoring, that was the
scanner. There was a bunch of duplication between reading into a slice
of structs and reading into a single struct. All of this has been
cleaned up and refactored into logical functions so you can see what's
happening.

As a result, we've been able to add a bunch of tests for all the edge
cases encountered so far. This boosts the test coverage of the scanner
which is great because it's one of the more fragile components with all
the reflection it needs to do.

I've also taken this opportunity to rename ScanAll to ScanIter. My
rationale is that ScanAll sounds like it will scan everything in the
iterator, which might not always be the case, especially if you pass in
a single struct when multiple results have been returned.
I've developed a benchmark which allows for similar comparisons between
the old method of decoding using mapstructure and the new decoder.
There are some differences because we use a mock iterator, which skews
the results (in favour of mapstructure, since it skips all the MapScan
business it would usually have to do), but the results show it's no
contest anyway (our optimisations win by a country mile).

Using mapstructure:

    goos: darwin
    goarch: amd64
    pkg: github.com/monzo/gocassa
    BenchmarkDecodeBlogSliceNoBody-4       	    3000	    480115 ns/op	   73486 B/op	    1687 allocs/op
    BenchmarkDecodeBlogStruct-4            	     100	  12456151 ns/op	  375150 B/op	   30385 allocs/op
    BenchmarkDecodeBlogStructEmptyBody-4   	  100000	     22711 ns/op	    3489 B/op	      68 allocs/op
    BenchmarkDecodeAlphaSlice-4            	    1000	   1472692 ns/op	  306253 B/op	    5007 allocs/op
    BenchmarkDecodeAlphaStruct-4           	   20000	     68542 ns/op	   14407 B/op	     174 allocs/op
    PASS
    ok  	github.com/monzo/gocassa	9.605s

Doing our own reflection:

    goos: darwin
    goarch: amd64
    pkg: github.com/monzo/gocassa
    BenchmarkDecodeBlogSliceNoBody-4       	   30000	     36761 ns/op	   21336 B/op	     238 allocs/op
    BenchmarkDecodeBlogStruct-4            	  300000	      3730 ns/op	    2840 B/op	      35 allocs/op
    BenchmarkDecodeBlogStructEmptyBody-4   	  500000	      4269 ns/op	    2840 B/op	      35 allocs/op
    BenchmarkDecodeAlphaSlice-4            	   10000	    109677 ns/op	   44128 B/op	     815 allocs/op
    BenchmarkDecodeAlphaStruct-4           	  200000	     11790 ns/op	    9315 B/op	      97 allocs/op
    PASS
    ok  	github.com/monzo/gocassa	9.017s
@suhailpatel suhailpatel requested a review from a team June 22, 2019 22:47
@suhailpatel suhailpatel self-assigned this Jun 22, 2019
Map types are a bit unique in the mock because, in order to dynamically
do diff-based upserts, the map is treated like an interface map value
type (map[<KeyType>]interface{}) in certain cases. To support this, we
do some type assertions in the map case.

Also fixes a test in mock_test.go which we changed previously to
accommodate the refactor. This means we haven't had to change any of
the mock behaviour with this refactoring, and all existing use cases
work!
If a value hasn't been set previously and we try to select it, it might
be the zero reflect.Value, and calling Type() or similar on a zero
Value panics. This commit adds a validity check to ensure we don't have
the zero Value (which is something we can just continue over).
To preserve old behaviour, if we are selecting a field and that field
happens to be nil in our store, we're going to get a nil value back from
gocassa. In this case, we are responsible for allocating the underlying
value and setting it to a non-nil type.
We need to follow pointers for embedded types, otherwise we won't
correctly extract all the fields.

Also, if we have embedded pointers which are nil, we leave them alone
rather than trying to set them.

@sjwhitworth sjwhitworth left a comment


On the whole, it looks like a great improvement. The size of the PR makes it quite hard to review, but it's often hard to make these sorts of fundamental changes in small chunks. I left a few comments which would aid maintainability.

Some things that would make me more confident we hadn't changed behaviour:

  • Get a corpus of query strings before and after your change: we shouldn't have changed any representation on the wire.
  • Pick some extremely esoteric/weird behaviour of gocassa that others invariably rely on and hand-test it. You may have already done this, but I don't think I've got that level of context.


@milesbxf milesbxf left a comment


A peripheral comment while I look at this in more detail, but have we considered using Go code generation to do this in future?

This would avoid needing any reflection at all, and be more type-safe


@milesbxf milesbxf left a comment


I haven't had time to have the deep look this PR deserves, but I've added a few surface-level comments

if timeline, ok := m["Timeline"]; ok {
	if timeline.Name() != "Timeline" {
		t.Errorf("Timeline should have name 'Timeline' but got %s", timeline.Name())
	}
}


I wonder if it's worth introducing an assertion library to make these less verbose (e.g. testify/assert) but appreciate we might not necessarily want to add a new dependency/way of testing things


In fact I've seen that we're using testify/assert below in scanner_test.go, so we should almost definitely update these tests to be consistent

Member Author


I didn't want to drastically change this file, so I've decided to stick with the existing convention (especially because this is a different package within gocassa). I don't think this needs to be a blocker.

@suhailpatel
Member Author

suhailpatel commented Jun 24, 2019

@milesbxf I did consider code generation very heavily, but it got very complicated with the refactor. This change doesn't rule it out though, so it could be a great follow-up if needed!

@sjwhitworth FWIW, this change hasn't touched any code paths around query generation; it's purely around query execution and reflection. I've added a heap of tests based on the mock, plus actual querying tests within our Monzo usage. The fact that those tests pass, together with the tests added here, gives the confidence needed.

@suhailpatel suhailpatel requested a review from sjwhitworth June 24, 2019 10:31

@sjwhitworth sjwhitworth left a comment


Your changes look good. Would you mind if I took more time to take a final pass through? I think it would be good to get multiple eyes on this as well.


@sjwhitworth sjwhitworth left a comment


Approved, with the caveat that I'd like another approver

@suhailpatel suhailpatel requested a review from milesbxf June 25, 2019 09:29
// setPtrs takes a list of fields and the associated pointers and sets them
// in order to the targetStruct
func setPtrs(structFields []*r.Field, ptrs []interface{}, targetStruct reflect.Value) {
	for index, field := range structFields {
Member Author


Potentially add a check to ensure structFields and ptrs are the same length.

var basePtr reflect.Value
switch baseType.Kind() {
case reflect.Array, reflect.Chan, reflect.Func, reflect.Interface, reflect.Ptr, reflect.UnsafePointer:
	return fmt.Errorf("type of kind %v is not supported", baseType.Kind())


Minor suggestion, might be worth explaining why in a comment (these aren't settable).


@mattrco mattrco left a comment


Approving this on the basis that we've walked through the code and there are sufficient tests 👍

@suhailpatel suhailpatel merged commit 2d23b1d into master Jun 25, 2019
@suhailpatel suhailpatel deleted the query-with-slices branch June 25, 2019 13:52
