-
Notifications
You must be signed in to change notification settings - Fork 116
Conversation
Note: I need to update some tests to pass. However, I'm not prioritizing this until I confirm on a staging server that this runs much more efficiently. |
1622e41
to
36caf67
Compare
Pull Request Test Coverage Report for Build 1017
💛 - Coveralls |
fields/sql.go
Outdated
// Get a value of the pointer of our type. The Scanner and Unmarshalers should | ||
// only be implemented as dereference methods, since they would do nothing otherwise. Therefore | ||
// we can safely assume that we should check for these interfaces on the pointer value. | ||
s.value = reflect.New(s.Type) | ||
// s.value = s.va |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
le comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
le grazie
36c7a9d
to
e53078b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Looks pretty decent, a couple questions about some edge cases, but I don't think they'll be issues.
fields/descriptor.go
Outdated
@@ -39,7 +39,12 @@ func (d *Descriptor) Valuer(val reflect.Value) Valuer { | |||
} | |||
|
|||
// Scanner creates a sql.Scanner from the descriptor. | |||
func (d *Descriptor) Scanner() *Scanner { return &Scanner{Descriptor: d} } | |||
func (d *Descriptor) Scanner(value reflect.Value) *Scanner { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
value doesn't convey a lot of meaning, maybe dest
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is gone now
fields/descriptor.go
Outdated
@@ -39,7 +39,12 @@ func (d *Descriptor) Valuer(val reflect.Value) Valuer { | |||
} | |||
|
|||
// Scanner creates a sql.Scanner from the descriptor. | |||
func (d *Descriptor) Scanner() *Scanner { return &Scanner{Descriptor: d} } | |||
func (d *Descriptor) Scanner(value reflect.Value) *Scanner { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add some more context on this commit as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK I removed this - but definitely
livesql/binlog.go
Outdated
scanner := table.Columns[i].Descriptor.Scanner() | ||
column := table.Columns[i] | ||
field := elem.FieldByIndex(column.Index) | ||
if field.Kind() != reflect.Ptr && field.CanAddr() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this this alright when it's not a pointer and we can't Addr it? Or is that not physically possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should not be possible. However, you're right - if it's not a pointer and we can't addr it, we should return an error here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, actually, maybe I should remove field.CanAddr()
and just have it panic if it can't. This should always be able to Addr - the struct is even defined in this method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems reasonable
if !isValid { | ||
to.Set(reflect.Zero(to.Type())) | ||
return | ||
// We need to hold onto this pointer-pointer in order to make the value addressable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reflect law 3 https://blog.golang.org/laws-of-reflection 👍
fields/sql.go
Outdated
// CopyTo copies the scanner value to another reflect.Value. This is used for setting structs. | ||
func (s *Scanner) CopyTo(to reflect.Value) { | ||
s.copy(s.value, to, s.isValid) | ||
func (s *Scanner) To(value reflect.Value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if To
name makes sense. Maybe Reset
makes more sense?
One example I see often is having a sync.Pool of compression writers, like https://github.com/ungerik/go-pool/blob/master/gzip.go#L11:31. They almost always have a Reset method to replace underlying writer/destination.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm open to renaming this. Maybe it should be Target
? Also, now that I think about it, maybe I should clear the value
at the end of Scan
.
Benchmark before/after: Original
New Types
New Types Optimized
Improvements across the board 😄 |
By having an indirect Descriptor, we were copying it into Scanners and Valuers. This reduces the allocations by using the same pointer to a Descriptor everywhere. Benchmark: ``` goos: darwin goarch: amd64 pkg: github.com/samsarahq/thunder/sqlgen Benchmark/Read-8 5000 258781 ns/op 1994 B/op 52 allocs/op Benchmark/Read_Where-8 3000 522702 ns/op 2707 B/op 67 allocs/op Benchmark/Create-8 3000 527204 ns/op 1376 B/op 35 allocs/op Benchmark/Update-8 3000 549079 ns/op 1784 B/op 45 allocs/op Benchmark/Delete-8 2000 544365 ns/op 808 B/op 24 allocs/op ```
Remove the in-between allocation that we are making for every value on every struct (causing additional heap allocations) and instead scan directly into the struct. Benchmark: ``` goos: darwin goarch: amd64 pkg: github.com/samsarahq/thunder/sqlgen Benchmark/Read-8 3000 337823 ns/op 1908 B/op 47 allocs/op Benchmark/Read_Where-8 2000 728402 ns/op 2631 B/op 62 allocs/op Benchmark/Create-8 2000 585395 ns/op 1376 B/op 35 allocs/op Benchmark/Update-8 2000 710547 ns/op 1784 B/op 45 allocs/op Benchmark/Delete-8 2000 639660 ns/op 808 B/op 24 allocs/op ```
Re-use scanners across different runs in order to prevent more allocation than we need. There's a trade-off here: On one hand, you have a mutex lock that can slow down code. On the other hand, we have allocations that can increase memory. For our use-case (based on simulated traffic), there is no perceptible slowdown and a big memory win. Benchmark: ``` goos: darwin goarch: amd64 pkg: github.com/samsarahq/thunder/sqlgen Benchmark/Read-8 5000 267699 ns/op 1686 B/op 43 allocs/op Benchmark/Read_Where-8 3000 506274 ns/op 2394 B/op 58 allocs/op Benchmark/Create-8 3000 520147 ns/op 1376 B/op 35 allocs/op Benchmark/Update-8 3000 493141 ns/op 1784 B/op 45 allocs/op Benchmark/Delete-8 3000 491342 ns/op 808 B/op 24 allocs/op ```
e53078b
to
fb09f60
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One comment about a commit message. Also the benchmarks you posted aren't 1:1, should we have the read_where bench in the first one as well?
@@ -51,16 +51,20 @@ const ( | |||
) | |||
|
|||
// UnbuildStruct extracts SQL values from a struct | |||
func UnbuildStruct(table *Table, strct interface{}) []interface{} { | |||
func UnbuildStruct(table *Table, strct interface{}) ([]interface{}, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from the commit message:
Don't hold onto valuers, instead only use them to get values back. By
doing this, we're allocating the simpler (and more likely to be used)
value onto the heap, while only allocating Valuer onto the heap.
do we mean:
Don't hold onto valuers, instead only use them to get values back. By
doing this, we're allocating the simpler (and more likely to be used)
value onto the heap, while only allocating Valuer onto the **stack**.
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If not maybe we should reword this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cheers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected
Don't hold onto valuers, instead only use them to get values back. By doing this, we're allocating the simpler (and more likely to be used) value onto the heap, while only allocating Valuer onto the stack. Benchmark: ``` goos: darwin goarch: amd64 pkg: github.com/samsarahq/thunder/sqlgen Benchmark/Read-8 5000 304845 ns/op 1683 B/op 43 allocs/op Benchmark/Read_Where-8 2000 589746 ns/op 2346 B/op 56 allocs/op Benchmark/Create-8 2000 654167 ns/op 1120 B/op 26 allocs/op Benchmark/Update-8 2000 711194 ns/op 1496 B/op 32 allocs/op Benchmark/Delete-8 2000 705401 ns/op 808 B/op 25 allocs/op ```
This is internal API and should not be relied on by anyone
fb09f60
to
dd7fa16
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
originally, we thought that the benchmark didn't capture the large production difference we saw. what was the realization that changed our minds there?
also, is there some back of the envelope math for how the benchmark translates to real world memory usage? (i.e. accounts for the 3GB we saw in production)
fields/descriptor.go
Outdated
@@ -54,7 +54,7 @@ func (d Descriptor) ValidateSQLType() error { | |||
return d.Scanner().Scan(val) | |||
} | |||
|
|||
func (d Descriptor) copy(from, to reflect.Value, isValid bool) { | |||
func (d *Descriptor) copy(from, to reflect.Value, isValid bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the interesting take away for me here is that pointer usage wrt allocation/heap/stack usage is tricky.
in this case, not copying the struct/using a pointer is cheaper (because... it's already on the heap?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My takeaway here is: everything that's passed into a function, including pointers, is pass by value in go. If we have Descriptor
, we are copying a value (all struct fields) and then pointing to it. If we have *Descriptor
, we are copying a pointer (just one thing pointing to the value). The second is lighter because we don't have to re-allocate the struct and its children, we just have to allocate the pointer.
elem := reflect.ValueOf(strct).Elem() | ||
values := make([]interface{}, len(table.Columns)) | ||
|
||
for i, column := range table.Columns { | ||
val := elem.FieldByIndex(column.Index) | ||
values[i] = column.Descriptor.Valuer(val) | ||
var err error | ||
values[i], err = column.Descriptor.Valuer(val).Value() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my takeaway: escaping a value is much cheaper than escaping a big struct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both that and sql
was going to call Value
anyway before it sent the value to the sql driver, which means that the other value was going to be escaped either way - might as well do it as soon as possible so we don't allocate two different values
Add BuildStruct method back into sqlgen so we can parse binlog row event the same way we scan from MySQL driver. This was removed in #166.
Add BuildStruct method back into sqlgen so we can parse binlog row event the same way we scan from MySQL driver. This was removed in #166.
Add BuildStruct method back into sqlgen so we can parse binlog row event the same way we scan from MySQL driver. This was removed in #166.
Add BuildStruct method back into sqlgen so we can parse binlog row event the same way we scan from MySQL driver. This was removed in #166.
Two commits here.
First one reduces allocations by 4 because we're no longer copying references for new scanners/valuers.
Second one reduces allocations by not copying our value, but rather allocating straight to a struct.