perf: vttablet/mysql optimizations #7800

vmg · 2021-04-07T10:30:43Z

Description

Hi! Here's a small list of optimizations for vttablet and its connection to the upstream MySQL server.

Brief explanation for each follows:

sqlparser: do not lookup keywords if they're too long: this applies to all SQL parsing in Vitess, including vtgate. It's most noticeable on synthetic queries with large keywords, like the ones we were using for benchmarking the MySQL client. The idea is this: when building the perfect map for SQL keywords in the parser, we keep track of the lengths of the longest and the shortest keywords. That way, when looking up keywords afterwards, we don't have to bother hashing a potential keyword if it's shorter than the shortest keyword we know or longer than the longest.
mysql: do not copy bytes that are immediately cast to string: the MySQL client code was calling readBytesCopy more often than it needed to. You don't have to copy a slice of bytes before you cast it to string: the conversion already makes a copy for you (otherwise the cast would be unsafe!), so we were copying twice.
bucketpool: do not use floating point math: this was interestingly hot when profiling the MySQL wire protocol server interface. We can simplify the math required to find the bucket for a given size in our pool by replacing floating point math with bit-level math. In this particular case, we can replace FP Log2 with a reversed count-leading-zeroes (which is essentially what Len64 does in the bits package). It's not magical but it's as fast as it'll get:

name        old time/op  new time/op  delta
Pool-16      281ns ± 1%   244ns ± 2%  -13.15%  (p=0.000 n=10+10)
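To make the bucketpool change concrete, here is a minimal sketch of the idea, not the actual Vitess code. The function names `bucketFloat` and `bucketBits` are hypothetical, and the sketch assumes buckets double in size starting from `minSize`, so the bucket index is the ceiling of log2(size/minSize):

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
)

// bucketFloat is a hypothetical floating point version: the index of the
// smallest bucket (minSize << idx) that can hold size bytes.
// It assumes size >= 1.
func bucketFloat(size, minSize int) int {
	idx := int(math.Ceil(math.Log2(float64(size) / float64(minSize))))
	if idx < 0 {
		idx = 0
	}
	return idx
}

// bucketBits is a hypothetical integer-only version of the same lookup.
func bucketBits(size, minSize int) int {
	if size <= minSize {
		return 0
	}
	// q = ceil(size / minSize), computed without floating point.
	q := (uint64(size) + uint64(minSize) - 1) / uint64(minSize)
	// For q >= 1, ceil(log2(q)) equals the bit length of q-1.
	return bits.Len64(q - 1)
}

func main() {
	const minSize = 1024
	for _, size := range []int{1, 1024, 1025, 2048, 2049, 4096, 65536} {
		fmt.Println(size, bucketFloat(size, minSize), bucketBits(size, minSize))
	}
}
```

`bits.Len64` compiles down to a count-leading-zeroes style instruction on common platforms, which is where the savings over `math.Log2` would come from.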

mysql: do not copy streaming packets: this is the 💰 prize here. The streaming code is already allocating full packets on each read (as opposed to using ephemeral ones like the normal client), so there is no need to copy them when parsing the rows. We can have all the row structs share the same underlying byte slice from the packet. The benefits are massive, because the vast majority of time spent in a streaming query is just parsing rows:

name            old time/op    new time/op    delta
StreamQuery-16    2.82ms ± 1%    2.28ms ± 2%  -19.12%  (p=0.000 n=10+10)

name            old alloc/op   new alloc/op   delta
StreamQuery-16    4.71MB ± 0%    2.59MB ± 0%  -45.00%  (p=0.000 n=10+9)

name            old allocs/op  new allocs/op  delta
StreamQuery-16     5.68k ± 0%     3.51k ± 0%  -38.19%  (p=0.000 n=10+9)

That's a 19% reduction in time per streaming query (roughly a 24% increase in throughput) and a 45% decrease in bytes allocated. This will be very noticeable in large Vitess deployments, particularly ones running OLAP workloads.
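The difference between copying and sharing packet bytes can be sketched as follows. This is an illustration only and not the Vitess row parser: `parseRow` is a hypothetical helper, and it assumes a simplified format where every value is prefixed by a single length byte (the real MySQL protocol uses length-encoded integers).

```go
package main

import "fmt"

// parseRow splits a packet into column values. When copyValues is false,
// the returned slices alias the packet buffer instead of being copied.
func parseRow(packet []byte, copyValues bool) ([][]byte, error) {
	var row [][]byte
	pos := 0
	for pos < len(packet) {
		n := int(packet[pos]) // simplified one-byte length prefix
		pos++
		if pos+n > len(packet) {
			return nil, fmt.Errorf("truncated value at offset %d", pos)
		}
		// Limit the capacity so an append on one value cannot
		// overwrite the bytes of the next one.
		val := packet[pos : pos+n : pos+n]
		if copyValues {
			val = append([]byte(nil), val...)
		}
		row = append(row, val)
		pos += n
	}
	return row, nil
}

func main() {
	packet := []byte{3, 'f', 'o', 'o', 2, 'h', 'i'}
	shared, _ := parseRow(packet, false)
	copied, _ := parseRow(packet, true)

	// Simulate the buffer being reused after parsing.
	packet[1] = 'F'
	fmt.Printf("%s %s\n", shared[0], copied[0]) // prints "Foo foo"
}
```

The last lines show why sharing is only safe when the packet is never reused: the shared value sees the later write, the copied one does not. That matches the condition stated above, that the streaming code allocates a full packet on each read.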

Related Issue(s)

Performance Improvements #7674

Checklist

Should this PR be backported?
Tests were added or are not required
Documentation was added or is not required

Deployment Notes

Impacted Areas in Vitess

Components that this PR will affect:

vmg added 5 commits April 7, 2021 12:07

sqlparser: do not lookup keywords if they're too long

6695761

Signed-off-by: Vicent Marti <vmg@strn.cat>

mysql: do not copy bytes that are immediately cast to string

43d10f4

Casting from `[]byte` to `string` already creates a copy; there's no need to copy from the original buffer. Signed-off-by: Vicent Marti <vmg@strn.cat>

bucketpool: do not use floating point math

fb74a47

name        old time/op  new time/op  delta
Pool-16      281ns ± 1%   244ns ± 2%  -13.15%  (p=0.000 n=10+10)

Signed-off-by: Vicent Marti <vmg@strn.cat>

endtoend: add streaming benchmark

8e83d42

Signed-off-by: Vicent Marti <vmg@strn.cat>

mysql: do not copy streaming packets

91d9627

The packets from a MySQL conn stream are fully owned by the caller so they don't need to be copied when parsed into rows. Signed-off-by: Vicent Marti <vmg@strn.cat>

vmg requested review from GuptaManan100, harshit-gangal and systay as code owners April 7, 2021 10:30

vmg changed the title ~~Vmg/perf mysql~~ perf: vttablet/mysql optimizations Apr 7, 2021

vmg mentioned this pull request Apr 7, 2021

Performance Improvements #7674

Open

harshit-gangal approved these changes Apr 7, 2021

harshit-gangal merged commit 25762d8 into vitessio:master Apr 7, 2021

vmg mentioned this pull request Apr 8, 2021

Streaming Query Optimizations tinyspeck/vitess#205

Merged

askdba added Component: Query Serving Severity 4 labels Apr 13, 2021

askdba added this to the v11.0 milestone Apr 13, 2021
