util, executor: optimize RUv2 point-get hot path#68119
util, executor: optimize RUv2 point-get hot path#68119disksing wants to merge 3 commits intopingcap:masterfrom
Conversation
Signed-off-by: disksing <i@disksing.com>
Signed-off-by: disksing <i@disksing.com>
|
@disksing I've received your pull request and will start the review. I'll conduct a thorough review covering code quality, potential issues, and implementation details. ⏳ This process typically takes 10-30 minutes depending on the complexity of the changes. ℹ️ Learn more details on Pantheon AI. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Hi @disksing. Thanks for your PR. PRs from untrusted users cannot be marked as trusted with I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
📝 WalkthroughWalkthroughReworks RUv2 metric storage and hot-path recording: moves per-statement RUv2 into Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Executor
participant StmtExecDetails
participant RUV2Metrics
participant Prometheus
Client->>Executor: call Next()
Executor->>StmtExecDetails: RUV2 := RUV2MetricsFromContext(ctx)
alt RUV2 tracking enabled
Executor->>RUV2Metrics: compute delta (rows|cells)
RUV2Metrics->>RUV2Metrics: update structured counters / extra
RUV2Metrics->>Prometheus: use cached counters or WithLabelValues to increment
Prometheus-->>RUV2Metrics: ack
else tracking disabled
Executor-->>StmtExecDetails: skip RU updates
end
Executor-->>Client: return row/result
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 7/8 reviews remaining, refill in 7 minutes and 30 seconds.Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
pkg/util/execdetails/execdetails_test.go (1)
430-437: Please cover the new context-storage path too.This subtest locks down the L1 fixed-label allocation win, but the risky part of the refactor is the move from
RUV2MetricsCtxKeytoStmtExecDetails-backed storage. A small table-driven test aroundContextWithInitializedExecDetails,ContextWithRUV2Metrics, andRUV2MetricsFromContextwould catch silent propagation regressions that this alloc check won't.Based on learnings, "
**/*_test.go: Keep test changes minimal and deterministic; avoid broad golden/testdata churn unless required."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/util/execdetails/execdetails_test.go` around lines 430 - 437, Add a small table-driven subtest that verifies the context-backed storage path (ContextWithInitializedExecDetails, ContextWithRUV2Metrics, and RUV2MetricsFromContext) behaves like the previous key-based path: for each case (direct NewRUV2Metrics use, ContextWithInitializedExecDetails then AddExecutorMetric via RUV2MetricsFromContext, and ContextWithRUV2Metrics helper) create a fresh context, call the metric-add sequence (e.g., AddExecutorMetric/PointGetExecutor) inside testing.AllocsPerRun and assert allocations are <= 1.0; ensure you exercise both initialization helpers (ContextWithInitializedExecDetails and ContextWithRUV2Metrics) and use RUV2MetricsFromContext to retrieve the metric so the table covers silent propagation regressions introduced by moving storage to StmtExecDetails.pkg/metrics/ru_v2.go (1)
241-345: Consider centralizing the known-label inventory.The fast-path labels now have to stay aligned across this cache initializer/accessor,
pkg/executor/internal/exec/executor.go, andpkg/util/execdetails/ruv2_metrics.go. Missing one entry won't fail loudly; it just falls back to the cold path and quietly erodes the optimization. A shared table or a small sync test would make this much safer to maintain.As per coding guidelines, "Code SHOULD remain maintainable for future readers with basic TiDB familiarity, including readers who are not experts in the specific subsystem/feature."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/metrics/ru_v2.go` around lines 241 - 345, The cached-labels list is duplicated across initRUV2CachedLabelCounters, RUV2ExecutorCounter, RUV2TiKVCoprocessorWorkTotalCounter and other files (executor.go, ruv2_metrics.go); extract the known labels into a single shared constant map/slice (e.g. KnownRUV2ExecutorLabels and KnownRUV2TiKVWorkLabels) in a common package (pkg/metrics or pkg/util/execdetails), have initRUV2CachedLabelCounters and the accessor functions (RUV2ExecutorCounter, RUV2TiKVCoprocessorWorkTotalCounter) use that shared table to register/cache and perform fast-path lookups, update executor.go and ruv2_metrics.go to reference the same symbols, and add a small unit or sync test that verifies every label used by executor.go has a cached entry so missing entries fail loudly during CI.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/metrics/ru_v2.go`:
- Around line 241-345: The cached-labels list is duplicated across
initRUV2CachedLabelCounters, RUV2ExecutorCounter,
RUV2TiKVCoprocessorWorkTotalCounter and other files (executor.go,
ruv2_metrics.go); extract the known labels into a single shared constant
map/slice (e.g. KnownRUV2ExecutorLabels and KnownRUV2TiKVWorkLabels) in a common
package (pkg/metrics or pkg/util/execdetails), have initRUV2CachedLabelCounters
and the accessor functions (RUV2ExecutorCounter,
RUV2TiKVCoprocessorWorkTotalCounter) use that shared table to register/cache and
perform fast-path lookups, update executor.go and ruv2_metrics.go to reference
the same symbols, and add a small unit or sync test that verifies every label
used by executor.go has a cached entry so missing entries fail loudly during CI.
In `@pkg/util/execdetails/execdetails_test.go`:
- Around line 430-437: Add a small table-driven subtest that verifies the
context-backed storage path (ContextWithInitializedExecDetails,
ContextWithRUV2Metrics, and RUV2MetricsFromContext) behaves like the previous
key-based path: for each case (direct NewRUV2Metrics use,
ContextWithInitializedExecDetails then AddExecutorMetric via
RUV2MetricsFromContext, and ContextWithRUV2Metrics helper) create a fresh
context, call the metric-add sequence (e.g., AddExecutorMetric/PointGetExecutor)
inside testing.AllocsPerRun and assert allocations are <= 1.0; ensure you
exercise both initialization helpers (ContextWithInitializedExecDetails and
ContextWithRUV2Metrics) and use RUV2MetricsFromContext to retrieve the metric so
the table covers silent propagation regressions introduced by moving storage to
StmtExecDetails.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 53bd4be1-5f2d-4637-9a5f-4c25935a2f2c
📒 Files selected for processing (8)
pkg/executor/internal/exec/executor.gopkg/executor/staticrecordset/recordset.gopkg/metrics/ru_v2.gopkg/sessionctx/variable/slow_log.gopkg/util/execdetails/execdetails.gopkg/util/execdetails/execdetails_test.gopkg/util/execdetails/ruv2_metrics.gopkg/util/execdetails/util.go
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #68119 +/- ##
================================================
- Coverage 77.7579% 77.1052% -0.6528%
================================================
Files 1990 1972 -18
Lines 551594 553021 +1427
================================================
- Hits 428908 426408 -2500
- Misses 121766 126510 +4744
+ Partials 920 103 -817
🚀 New features to boost your workflow:
|
Signed-off-by: disksing <i@disksing.com>
|
@disksing: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pkg/util/execdetails/util.go (1)
49-62: Minor code flow observation (non-blocking).When
stmtDetailsis newly created (lines 49-55) and no inherited metrics exist, the subsequent check at lines 57-59 will redundantly queryRUV2MetricsCtxKeyagain. This is correct but slightly non-obvious. Consider restructuring to avoid the repeated context lookup:♻️ Optional: Clearer control flow
if stmtDetails == nil { stmtDetails = &StmtExecDetails{} - if inheritedRUV2Metrics, _ := ctx.Value(RUV2MetricsCtxKey).(*RUV2Metrics); inheritedRUV2Metrics != nil { - stmtDetails.setRUV2Metrics(inheritedRUV2Metrics) - } ctx = context.WithValue(ctx, StmtExecDetailKey, stmtDetails) } if stmtDetails.getRUV2Metrics() == nil { if inheritedRUV2Metrics, _ := ctx.Value(RUV2MetricsCtxKey).(*RUV2Metrics); inheritedRUV2Metrics != nil { stmtDetails.setRUV2Metrics(inheritedRUV2Metrics) } else { stmtDetails.ensureRUV2Metrics() } }This merges the inheritance check into the second block which already handles both "new stmtDetails" and "existing stmtDetails without metrics" cases.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/util/execdetails/util.go` around lines 49 - 62, The current flow creates stmtDetails and then performs two separate ctx.Value(RUV2MetricsCtxKey) lookups; to eliminate the redundant lookup, read inheritedRUV2Metrics once into a local variable (using ctx.Value(RUV2MetricsCtxKey).(*RUV2Metrics)) before or during the stmtDetails nil branch, set stmtDetails via stmtDetails.setRUV2Metrics(inheritedRUV2Metrics) if non-nil when creating it (and store stmtDetails into ctx with StmtExecDetailKey), and then in the subsequent check use stmtDetails.getRUV2Metrics() and the previously captured inheritedRUV2Metrics to either setRUV2Metrics(inheritedRUV2Metrics) or call stmtDetails.ensureRUV2Metrics() as needed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/util/execdetails/util.go`:
- Around line 49-62: The current flow creates stmtDetails and then performs two
separate ctx.Value(RUV2MetricsCtxKey) lookups; to eliminate the redundant
lookup, read inheritedRUV2Metrics once into a local variable (using
ctx.Value(RUV2MetricsCtxKey).(*RUV2Metrics)) before or during the stmtDetails
nil branch, set stmtDetails via stmtDetails.setRUV2Metrics(inheritedRUV2Metrics)
if non-nil when creating it (and store stmtDetails into ctx with
StmtExecDetailKey), and then in the subsequent check use
stmtDetails.getRUV2Metrics() and the previously captured inheritedRUV2Metrics to
either setRUV2Metrics(inheritedRUV2Metrics) or call
stmtDetails.ensureRUV2Metrics() as needed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: e4e75946-57f0-42db-9ab1-fad5cfc3d566
📒 Files selected for processing (4)
pkg/executor/adapter.gopkg/util/execdetails/util.gopkg/util/stmtsummary/statement_summary.gopkg/util/stmtsummary/v2/record.go
What problem does this PR solve?
Issue Number: close #68118
Problem Summary:
RUv2 statement accounting adds visible overhead to point-get-like hot paths. Local point-get and prepared-point-get benchmarks showed extra allocations from statement RUv2 context/metrics initialization, fixed executor label accounting through per-statement maps, and repeated Prometheus
CounterVec.WithLabelValues(...)lookups.What changed and how does it work?
This PR reduces RUv2 accounting overhead on the PointGet hot path:
StmtExecDetailsinstead of adding an extra context value on the normal initialized context path.PointGetExecutor, and keep map-based fallback only for non-fixed labels.StmtExecDetailsnow that it carries RUv2 storage.exec.Next.Local server-context prepared point-get proxy after this PR:
c391a01f15f8:-2.45%mean ns/op,-482 B/op,-6 allocs/op0f44b08dc66a: allocation count is back to baseline, with remaining about+265 B/opCheck List
Tests
Unit tests / checks run locally:
./tools/check/failpoint-go-test.sh pkg/executor/internal/exec -run 'TestNextIOAccAddInputCountsRowsWithZeroCols'./tools/check/failpoint-go-test.sh pkg/executor -run 'TestExplainAnalyzeInvokeNextAndClose'./tools/check/failpoint-go-test.sh pkg/session -run 'TestRUV2SessionParserTotalDoesNotLeakAcrossStandaloneParse|TestRUV2MetricsIsolatedPerStatementInExplicitTxn'./tools/check/failpoint-go-test.sh pkg/sessionctx/variable -run '^$'./tools/check/failpoint-go-test.sh pkg/sessionctx/variable/tests -run 'TestSlowLogFormat'Manual benchmark proxy:
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.
Summary by CodeRabbit
Refactor
Tests