parser, session: normalize binding digests across redundant parentheses#68005
parser, session: normalize binding digests across redundant parentheses#68005terry1purcell wants to merge 1 commit intopingcap:masterfrom
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Hi @terry1purcell. Thanks for your PR. PRs from untrusted users cannot be marked as trusted with I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
📝 WalkthroughWalkthroughAdds a binding-only post-processing pass to normalization that removes certain redundant wrapper parentheses during binding normalization, updates bootstrap to version 259 with a migration rewriting Changes
Sequence Diagram(s)sequenceDiagram
participant Upgrader as UpgradeToVer259
participant Store as KV Store / Bootstrap
participant Parser as Parser.NormalizeDigestForBinding
participant BindInfo as mysql.bind_info
Upgrader->>Store: iterate non-builtin rows of `mysql.bind_info`
Store->>BindInfo: read row (`_tidb_rowid`, bind_sql, plan_digest, ...)
Upgrader->>Parser: NormalizeDigestForBinding(bind_sql)
Parser-->>Upgrader: original_sql', sql_digest'
Upgrader->>BindInfo: compare with persisted values
alt values differ
Upgrader->>BindInfo: update original_sql, sql_digest using _tidb_rowid
BindInfo-->>Upgrader: OK
alt unique-index collision on (plan_digest, sql_digest)
Upgrader->>BindInfo: retry update with sql_digest = NULL
BindInfo-->>Upgrader: OK or error
end
end
Upgrader-->>Store: continue next row
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
pkg/parser/digester.go (1)
411-419: Compactd.tokensin-place to keep the reducer allocation-free on changed queries.The pass reuses scratch buffers, but this final compaction still allocates a new
tokenDequewhenever parentheses are removed. Reusing the existing token backing array keeps the optimization consistent.♻️ Proposed in-place compaction
- normalized := make(tokenDeque, 0, n) + normalized := d.tokens[:0] // Compact the token stream after the removal decisions are finalized. for i, tok := range d.tokens { if removed[i] { continue } normalized = append(normalized, tok) } d.tokens = normalized🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/parser/digester.go` around lines 411 - 419, The current compaction creates a new tokenDeque (normalized) and assigns it to d.tokens, causing allocation; instead perform in-place compaction on d.tokens by using a write index: iterate over d.tokens with index i, if removed[i] copy d.tokens[writeIdx] = d.tokens[i] and increment writeIdx, then after the loop set d.tokens = d.tokens[:writeIdx] to truncate the slice; update any use of normalized to the writeIdx approach and keep references to tokenDeque, d.tokens, removed and the surrounding function so no new backing array is allocated.pkg/session/bootstrap_test.go (1)
1776-1856: Move this new upgrade test out ofbootstrap_test.go.This file explicitly asks not to add new tests here and points new cases to
bootstrap_update_test.go; moving the test keeps the bootstrap test cleanup direction intact.As per coding guidelines, follow existing package-local conventions first and keep style consistent with nearby files.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/session/bootstrap_test.go` around lines 1776 - 1856, Move the new test TestTiDBUpgradeToVer259 out of pkg/session/bootstrap_test.go and into the dedicated update tests file (create or append pkg/session/bootstrap_update_test.go) so it follows the repo guideline that new upgrade tests live in bootstrap_update_test.go; copy the entire TestTiDBUpgradeToVer259 function (keeping its name and helper calls like CreateStoreAndBootstrap, CreateSessionAndSetID, MustExec, MustExecToRecodeSet, RevertVersionAndVariables, GetBootstrapVersion, BootstrapSession) into that file, ensure the package declaration and imports match nearby tests in bootstrap_update_test.go (add context, require, parser imports if missing), remove the original function from bootstrap_test.go, run `go test` for the package to verify no missing symbols or import issues, and keep test semantics unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/parser/digester.go`:
- Around line 411-419: The current compaction creates a new tokenDeque
(normalized) and assigns it to d.tokens, causing allocation; instead perform
in-place compaction on d.tokens by using a write index: iterate over d.tokens
with index i, if removed[i] copy d.tokens[writeIdx] = d.tokens[i] and increment
writeIdx, then after the loop set d.tokens = d.tokens[:writeIdx] to truncate the
slice; update any use of normalized to the writeIdx approach and keep references
to tokenDeque, d.tokens, removed and the surrounding function so no new backing
array is allocated.
In `@pkg/session/bootstrap_test.go`:
- Around line 1776-1856: Move the new test TestTiDBUpgradeToVer259 out of
pkg/session/bootstrap_test.go and into the dedicated update tests file (create
or append pkg/session/bootstrap_update_test.go) so it follows the repo guideline
that new upgrade tests live in bootstrap_update_test.go; copy the entire
TestTiDBUpgradeToVer259 function (keeping its name and helper calls like
CreateStoreAndBootstrap, CreateSessionAndSetID, MustExec, MustExecToRecodeSet,
RevertVersionAndVariables, GetBootstrapVersion, BootstrapSession) into that
file, ensure the package declaration and imports match nearby tests in
bootstrap_update_test.go (add context, require, parser imports if missing),
remove the original function from bootstrap_test.go, run `go test` for the
package to verify no missing symbols or import issues, and keep test semantics
unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: fa65b8ba-51ab-4151-b677-d5fad88d39b6
📒 Files selected for processing (4)
pkg/parser/digester.gopkg/parser/digester_test.gopkg/session/bootstrap_test.gopkg/session/upgrade_def.go
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #68005 +/- ##
================================================
- Coverage 77.8512% 77.1907% -0.6605%
================================================
Files 1988 1970 -18
Lines 550883 551321 +438
================================================
- Hits 428869 425569 -3300
- Misses 121094 125750 +4656
+ Partials 920 2 -918
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
Binding matching was too strict: syntactically different but semantically
identical WHERE clauses (e.g. `a=1 and b=1` vs `(a=1) and b=1`) produced
different digests, so an existing binding would not match the incoming query.
Core change (pkg/parser/digester.go):
- Add reduceRedundantParenthesesForBinding(), a precedence-aware token-stream
pass that strips redundant parentheses from the binding normalization path.
The pass is conservative: it removes a pair only when the lowest-precedence
operator inside is at least as strong as both adjacent outer operators, so
semantically significant grouping like `a AND (b OR c)` is always preserved.
- Optimization: fast-path pre-scan returns before any heap allocation for the
common case (no '(' adjacent to a boundary/logical token). Struct-level
scratch buffers (parenMatches, parenRemoved, parenStack) are reused across
calls via the existing sync.Pool, eliminating per-query allocation.
- Fix dead code: remove unreachable "straight_join" case in
isBindingParenBoundaryAfter (reduceOptimizerHint rewrites it to "join"
before this pass runs).
Bootstrap migration (pkg/session/upgrade_def.go):
- Add upgradeToVer259 which rewrites original_sql and sql_digest for all
non-builtin bindings after the normalization change. Uses _tidb_rowid for
safe in-transaction row identification. On (plan_digest, sql_digest) unique-
index collision, falls back to sql_digest=NULL rather than failing bootstrap.
Logs only safe metadata on errors (never bind_sql literals).
Tests (pkg/parser/digester_test.go, pkg/session/bootstrap_test.go):
- Add 16-variant parenthesization test in TestNormalize confirming all forms
produce the same normalized string and digest.
- Add BenchmarkNormalizeDigestForBinding covering NoParen (fast-path),
WithSignificantParen (full pass, no removal), and WithRedundantParen
(full pass with compaction).
- Add TestTiDBUpgradeToVer259 verifying the bootstrap migration rewrites
stale digests correctly on upgrade from v258.
Issue Number: ref pingcap#67363
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
34b1f8b to
1b7f77c
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (2)
pkg/session/upgrade_def.go (1)
2152-2240: Overall implementation looks good; one minor nit on an unused field.Pessimistic-transaction wrapping, idempotent skip when
(oldOrigSQL, oldDigest)already matches the recomputed pair,ErrKeyExistsfallback tosql_digest = NULL, and deliberate exclusion ofbind_sqlfrom log fields are all handled well. The collect-into-slice-then-update pattern correctly avoids the "session busy with an active recordset" issue called out inupgradeToVer175.Nit:
oldOriginalSQLis written intobindRow(line 2154, set at 2194) but never read — it is not used in any update SQL and not included in either the warn-log at 2221-2228 or the fatal-log at 2234-2238. Either drop the field, or add it to the warn-log alongsideold_sql_digestfor operator visibility when a collision forcessql_digest = NULL.♻️ Proposed cleanup (drop the unused field)
type bindRow struct { rowID int64 - oldOriginalSQL string oldSQLDigest string // Safe fields for logging — deliberately exclude bind_sql. defaultDB string @@ updates = append(updates, bindRow{ rowID: rowID, - oldOriginalSQL: oldOrigSQL, oldSQLDigest: oldDigest, defaultDB: defaultDB,🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/session/upgrade_def.go` around lines 2152 - 2240, The struct field oldOriginalSQL in bindRow is never read; remove it to avoid dead state: delete the oldOriginalSQL field from the bindRow type and remove the oldOriginalSQL: oldOrigSQL assignment in the updates = append(updates, bindRow{...}) call (keep the local variable oldOrigSQL used earlier for the comparison), then run tests/lint to ensure no remaining references to oldOriginalSQL remain.pkg/session/bootstrap_test.go (1)
1776-1856: Test covers the happy path well; consider adding the collision-fallback case.The scaffolding matches the existing
TestTiDBUpgradeToVerXXXtests and theNormalizeDigestvsNormalizeDigestForBindingsanity check (lines 1810-1811) is a nice guard against the test becoming a no-op if normalization behavior changes.Optional: add a second scenario with two pre-259 bindings whose
bind_sqldiffer only in redundant parentheses (so they normalize to the same(plan_digest, sql_digest)after 259) and assert that exactly one row ends up withsql_digestrewritten and the other ends up withsql_digest IS NULL. That would cover thekv.ErrKeyExistsfallback branch inupgradeToVer259, which is currently only exercised in production.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/session/bootstrap_test.go` around lines 1776 - 1856, Add a second collision scenario to TestTiDBUpgradeToVer259: create two manual bindings whose bind_sql differ only by redundant parentheses (so NormalizeDigest produces identical (plan_digest, sql_digest) after 259), then overwrite both rows with the same stale values using NormalizeDigest (simulating pre-259), run the bootstrap/upgrade path (BootstrapSession -> upgradeToVer259), and assert that exactly one row has original_sql/sql_digest rewritten to the expected NormalizeDigestForBinding values and the other row has sql_digest IS NULL (the kv.ErrKeyExists collision-fallback behavior). Use the same helpers (CreateSessionAndSetID, MustExec, MustExecToRecodeSet, GetBootstrapVersion) and reference NormalizeDigest/NormalizeDigestForBinding, bind_sql and mysql.bind_info to locate where to insert the extra setup and the final SELECT assertions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/session/bootstrap_test.go`:
- Around line 1776-1856: Add a second collision scenario to
TestTiDBUpgradeToVer259: create two manual bindings whose bind_sql differ only
by redundant parentheses (so NormalizeDigest produces identical (plan_digest,
sql_digest) after 259), then overwrite both rows with the same stale values
using NormalizeDigest (simulating pre-259), run the bootstrap/upgrade path
(BootstrapSession -> upgradeToVer259), and assert that exactly one row has
original_sql/sql_digest rewritten to the expected NormalizeDigestForBinding
values and the other row has sql_digest IS NULL (the kv.ErrKeyExists
collision-fallback behavior). Use the same helpers (CreateSessionAndSetID,
MustExec, MustExecToRecodeSet, GetBootstrapVersion) and reference
NormalizeDigest/NormalizeDigestForBinding, bind_sql and mysql.bind_info to
locate where to insert the extra setup and the final SELECT assertions.
In `@pkg/session/upgrade_def.go`:
- Around line 2152-2240: The struct field oldOriginalSQL in bindRow is never
read; remove it to avoid dead state: delete the oldOriginalSQL field from the
bindRow type and remove the oldOriginalSQL: oldOrigSQL assignment in the updates
= append(updates, bindRow{...}) call (keep the local variable oldOrigSQL used
earlier for the comparison), then run tests/lint to ensure no remaining
references to oldOriginalSQL remain.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 85edd617-f7dc-4772-b956-bb3567443cdf
📒 Files selected for processing (4)
pkg/parser/digester.gopkg/parser/digester_test.gopkg/session/bootstrap_test.gopkg/session/upgrade_def.go
🚧 Files skipped from review as they are similar to previous changes (2)
- pkg/parser/digester_test.go
- pkg/parser/digester.go
|
@terry1purcell: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
What problem does this PR solve?
Issue Number: close #67363
Problem Summary:
What changed and how does it work?
SQL binding matching was too strict: syntactically different but semantically identical WHERE clauses produced different
normalized digests, so an existing binding would not match the incoming query. A binding created for WHERE a=1 AND b=1 would not match WHERE (a=1) AND b=1.
What changed and how does it work?
Core normalization (pkg/parser/digester.go)
Adds reduceRedundantParenthesesForBinding(), a precedence-aware token-stream pass that strips redundant parentheses during binding normalization. Conservative by design: removes a pair only when the lowest-precedence operator inside binds at least as tightly as both adjacent outer operators, so a AND (b OR c) is always preserved.
Two performance optimizations prevent per-query regression: (1) a fast-path pre-scan exits before any allocation for queries with no candidate parentheses (typical OLTP); (2) scratch buffers (parenMatches, parenRemoved, parenStack) are struct field on the pooled sqlDigester, so they are reused across calls rather than heap-allocated per query.
Benchmark (BenchmarkNormalizeDigestForBinding): NoParen ~1234 ns/0 B overhead; WithSignificantParen ~1473 ns/0 B extra alloc; WithRedundantParen ~2021 ns (unavoidable compaction).
Also removes a dead "straight_join" case in isBindingParenBoundaryAfter (unreachable — reduceOptimizerHint rewrites it to "join" before this pass runs).
Bootstrap migration (pkg/session/upgrade_def.go)
Adds upgradeToVer259 which rewrites original_sql and sql_digest for all non-builtin bindings. Uses _tidb_rowid for row
identification. On (plan_digest, sql_digest) unique-index collision, falls back to sql_digest = NULL rather than failing
bootstrap. Never logs bind_sql literals.
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.
Summary by CodeRabbit
New Features
Performance
Tests
Chores