perf: vttablet/mysql optimizations #7800
Merged
Description
Hi! Here's a small list of optimizations for `vttablet` and its connection to the upstream MySQL server. A brief explanation of each follows:

- **sqlparser: do not lookup keywords if they're too long**: this applies to all SQL parsing in Vitess, including in `vtgate`. It's most noticeable on synthetic queries with large keywords, like the ones we were using for benchmarking the MySQL client. The idea is this: when building the perfect map for SQL keywords in the parser, we keep track of the lengths of the longest and the shortest keywords. That way, when looking up keywords afterwards, we don't have to bother hashing the potential keyword if it's either shorter than the shortest keyword we know or longer than the longest.
- **mysql: do not copy bytes that are immediately cast to string**: the MySQL client code was calling `readBytesCopy` more often than it should. You don't have to copy a slice of bytes before you cast it to `string`: the compiler does that for you automatically (otherwise the cast would be unsafe!), so we were copying twice.
- **bucketpool: do not use floating point math**: this was interestingly hot when profiling the MySQL wire protocol server interface. We can simplify the math required to find the bucket for a given size in our pool by replacing floating point math with bit-level math. In this particular case, we can replace the floating-point `Log2` with a reversed count-leading-zeroes (which is essentially what `Len64` does in the `bits` package). It's not magical, but it's as fast as it'll get.
- **mysql: do not copy streaming packets**: this is the 💰 prize here. The streaming code is already allocating full packets on each read (as opposed to using ephemeral ones like the normal client), so there is no need to copy them when parsing the rows. We can have all the row structs share the same underlying byte slice from the packet. The benefits are massive, because the vast majority of time spent in a streaming query is just parsing rows. That's a 20% increase in throughput with streaming queries and a 45% decrease in total memory allocations. This will be very noticeable in large Vitess deployments, particularly ones using OLAP workloads.
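To illustrate the keyword length check from the sqlparser change, here is a minimal sketch. It is not the Vitess code: `keywordTable`, `newKeywordTable`, and `lookup` are hypothetical names, and a plain Go map stands in for the parser's perfect map. The point is only that the length comparison happens before any hashing.

```go
package main

import "fmt"

// keywordTable is a hypothetical stand-in for the parser's keyword map.
// It records the shortest and longest keyword lengths at build time.
type keywordTable struct {
	ids    map[string]int
	minLen int
	maxLen int
}

// newKeywordTable builds the table and tracks the length bounds.
func newKeywordTable(keywords map[string]int) *keywordTable {
	kt := &keywordTable{ids: keywords}
	first := true
	for kw := range keywords {
		n := len(kw)
		if first || n < kt.minLen {
			kt.minLen = n
		}
		if first || n > kt.maxLen {
			kt.maxLen = n
		}
		first = false
	}
	return kt
}

// lookup rejects candidates by length before touching the map, so words
// that are too short or too long are never hashed.
func (kt *keywordTable) lookup(word string) (int, bool) {
	if len(word) < kt.minLen || len(word) > kt.maxLen {
		return 0, false
	}
	id, ok := kt.ids[word]
	return id, ok
}

func main() {
	kt := newKeywordTable(map[string]int{"as": 1, "select": 2, "straight_join": 3})
	fmt.Println(kt.lookup("select"))
	fmt.Println(kt.lookup("a_very_long_identifier_that_cannot_be_a_keyword"))
}
```

The early return costs two integer comparisons, which is why the gain shows up mostly on inputs with many long tokens.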
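The double copy described in the `readBytesCopy` item can be seen in a small sketch. The helper names `readString` and `readStringCopyFirst` are made up for this example and are not the functions in the MySQL client code. A `[]byte` to `string` conversion already produces an independent copy, so copying the bytes first is redundant.

```go
package main

import "fmt"

// readString converts a sub-slice of the packet directly. The conversion
// itself allocates and copies, so the result does not alias the packet.
func readString(packet []byte, start, end int) string {
	return string(packet[start:end])
}

// readStringCopyFirst shows the redundant pattern: an explicit copy
// followed by the copy the string conversion performs anyway.
func readStringCopyFirst(packet []byte, start, end int) string {
	tmp := make([]byte, end-start)
	copy(tmp, packet[start:end])
	return string(tmp)
}

func main() {
	packet := []byte("hello world")
	s := readString(packet, 0, 5)
	packet[0] = 'j' // overwrite the buffer, as a reused packet would be
	fmt.Println(s, string(packet[:5])) // prints "hello jello"
}
```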
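For the bucketpool change, the following sketch compares a floating-point bucket computation with a `bits.Len64` one. It assumes buckets of size `minSize`, `2*minSize`, `4*minSize`, and so on, with `minSize > 0`; the function names are hypothetical and the actual Vitess implementation may be structured differently.

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
)

// bucketIndexFloat finds the bucket with floating-point math:
// ceil(log2(size / minSize)), clamped to 0 for small sizes.
func bucketIndexFloat(size, minSize int) int {
	if size <= minSize {
		return 0
	}
	return int(math.Ceil(math.Log2(float64(size) / float64(minSize))))
}

// bucketIndexBits computes the same index with integer math only.
// For an integer q >= 1, ceil(log2(q)) equals bits.Len64(q - 1).
func bucketIndexBits(size, minSize int) int {
	if size <= minSize {
		return 0
	}
	// q is ceil(size / minSize); it is at least 2 on this path.
	q := (uint64(size) + uint64(minSize) - 1) / uint64(minSize)
	return bits.Len64(q - 1)
}

func main() {
	for _, size := range []int{1, 1024, 1025, 2048, 2049, 4096, 5000} {
		fmt.Println(size, bucketIndexFloat(size, 1024), bucketIndexBits(size, 1024))
	}
}
```

`bits.Len64` returns the number of bits needed to represent its argument, which is 64 minus the count of leading zeroes, hence the "reversed count-leading-zeroes" description.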
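The idea behind the streaming change, rows sharing the packet's underlying byte slice, can be sketched as follows. The wire format here is a deliberate simplification (one length byte per field) rather than the MySQL protocol's length-encoded values, and `parseRowShared` is a hypothetical name. Sharing is only safe when the packet buffer is not reused after parsing, which is the situation the description gives for the streaming path.

```go
package main

import "fmt"

// parseRowShared splits a packet into fields without copying any bytes.
// Simplified, hypothetical format: each field is one length byte followed
// by that many bytes of data.
func parseRowShared(packet []byte) ([][]byte, error) {
	var fields [][]byte
	for pos := 0; pos < len(packet); {
		n := int(packet[pos])
		pos++
		if pos+n > len(packet) {
			return nil, fmt.Errorf("field of length %d overruns packet", n)
		}
		// The three-index slice caps the capacity, so appending to one
		// field cannot overwrite the bytes of the next field.
		fields = append(fields, packet[pos:pos+n:pos+n])
		pos += n
	}
	return fields, nil
}

func main() {
	packet := []byte{3, 'f', 'o', 'o', 2, 'h', 'i'}
	fields, err := parseRowShared(packet)
	if err != nil {
		panic(err)
	}
	for _, f := range fields {
		fmt.Printf("%s\n", f)
	}
}
```

Each field costs only a slice header, so a row is parsed without allocating or copying any field data.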
Related Issue(s)
Checklist
Deployment Notes
Impacted Areas in Vitess
Components that this PR will affect: