perf: cache reserved bind variables in queries #7698
Merged
Description
This week we're starting our performance mission earlier with a long-standing TODO from @sougou in the SQL parser: GetBindvars is not as cheap as it could be. It is, in fact, quite expensive! Walking the AST is hard! If we look at a profile for a Normalize benchmark, we find a surprising result: GetBindvars accounts for almost 50% of the time spent when normalizing a query! And we're doing it twice in the normal flow of a request: once when normalizing the incoming query, and again in the planbuilder when rewriting the query into its final plan. @sougou is quite right that "ideally, this should be done only once"... But we can do even better: ideally this shouldn't be done at all!

This PR removes all the usages of GetBindvars from the codebase. Instead, it updates our SQL grammar so it keeps track of the bind variables as it finds them while parsing. Then, we propagate these reserved variable names everywhere they are needed. A bit verbose, but it makes normalization twice as fast (as one would expect from looking at the flame graph).

Obviously, this is in a synthetic benchmark. In a production environment, the speedup will be even more significant because we're no longer calling GetBindvars redundantly from the query planner.
Related Issue(s)
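To illustrate the idea in the description (recording bind variables while scanning the query, instead of walking the AST afterwards), here is a minimal sketch. It is not the Vitess parser: the `scanner` type, `scanBindVars`, and `reserve` are hypothetical names made up for this example, and the toy tokenizer only understands `:name` placeholders and single-quoted strings.

```go
package main

import "fmt"

// BindVars is the set of bind variable names already used by a query.
type BindVars map[string]struct{}

// scanner is a hypothetical toy tokenizer, not the Vitess one.
type scanner struct {
	src   string
	pos   int
	known BindVars
}

func isIdentChar(c byte) bool {
	return c == '_' || (c >= '0' && c <= '9') ||
		(c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
}

// scanBindVars makes a single pass over the query and records every
// bind variable the moment it is tokenized, so no later AST walk is needed.
func scanBindVars(query string) BindVars {
	s := &scanner{src: query, known: BindVars{}}
	for s.pos < len(s.src) {
		switch s.src[s.pos] {
		case '\'':
			s.skipString()
		case ':':
			s.scanBindVar()
		default:
			s.pos++
		}
	}
	return s.known
}

// skipString jumps over a single-quoted literal so that a ':' inside
// it is not mistaken for a bind variable. Escapes are not handled.
func (s *scanner) skipString() {
	s.pos++ // opening quote
	for s.pos < len(s.src) && s.src[s.pos] != '\'' {
		s.pos++
	}
	s.pos++ // closing quote, or one past the end if unterminated
}

// scanBindVar consumes ':' followed by an identifier and records the name.
func (s *scanner) scanBindVar() {
	s.pos++ // skip ':'
	start := s.pos
	for s.pos < len(s.src) && isIdentChar(s.src[s.pos]) {
		s.pos++
	}
	if s.pos > start {
		s.known[s.src[start:s.pos]] = struct{}{}
	}
}

// reserve returns the first name of the form prefix+N that is not yet
// taken and marks it as used, so later callers never collide with it.
func (b BindVars) reserve(prefix string) string {
	for i := 1; ; i++ {
		name := fmt.Sprintf("%s%d", prefix, i)
		if _, taken := b[name]; !taken {
			b[name] = struct{}{}
			return name
		}
	}
}

func main() {
	known := scanBindVars("select * from t where a = :bv1 and b = 'x:y' and c = 42")
	// The set computed once at scan time is handed to whoever needs a
	// fresh name, for example when replacing the literal 42.
	fmt.Println(known.reserve("bv")) // prints "bv2"
}
```

The point of the design is that the set is built once, as a side effect of work the parser already does, and then passed along to normalization and planning rather than being recomputed by each of them.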
Checklist
Deployment Notes
Impacted Areas in Vitess
Components that this PR will affect: