util, executor: optimize RUv2 point-get hot path #68119

Open
disksing wants to merge 3 commits into pingcap:master from oh-my-tidb:optimize-ruv2-pointget-hotpath

disksing commented Apr 29, 2026

What problem does this PR solve?

Issue Number: close #68118

Problem Summary:

RUv2 statement accounting adds visible overhead to point-get-like hot paths. Local point-get and prepared-point-get benchmarks showed extra allocations from statement RUv2 context/metrics initialization, fixed executor label accounting through per-statement maps, and repeated Prometheus CounterVec.WithLabelValues(...) lookups.

What changed and how does it work?

This PR reduces RUv2 accounting overhead on the PointGet hot path:

  • Cache Prometheus RUv2 counters for known fixed executor and TiKV coprocessor labels.
  • Store statement RUv2 metrics in StmtExecDetails instead of adding an extra context value on the normal initialized context path.
  • Use fixed fields for L1 executor counters, including PointGetExecutor, and keep map-based fallback only for non-fixed labels.
  • Lazily allocate cold RUv2 metric groups/maps so point-get statements only pay for the hot fields they use.
  • Avoid snapshot map construction when calculating RU values or checking whether metrics are zero.
  • Avoid copying the whole StmtExecDetails now that it carries RUv2 storage.
  • Reduce repeated RUv2 context lookups in exec.Next.
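
The "fixed fields with map-based fallback" idea from the list above can be sketched as follows. This is a minimal illustration, not the PR's code: the `execCounters` type, its methods, and the `TableScanExecutor` label are hypothetical; only `PointGetExecutor` is named in this description.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// execCounters is a hypothetical stand-in for per-statement executor counters.
// Known hot labels get fixed fields; anything else goes to a lazily made map.
type execCounters struct {
	pointGet  atomic.Int64 // fixed field for the hot label named in the PR
	tableScan atomic.Int64 // fixed field for an assumed second label

	mu       sync.Mutex
	fallback map[string]int64 // allocated only when a non-fixed label shows up
}

// Add routes known labels to fixed fields and everything else to the map.
func (c *execCounters) Add(label string, delta int64) {
	switch label {
	case "PointGetExecutor":
		c.pointGet.Add(delta)
	case "TableScanExecutor":
		c.tableScan.Add(delta)
	default:
		c.mu.Lock()
		if c.fallback == nil {
			c.fallback = make(map[string]int64)
		}
		c.fallback[label] += delta
		c.mu.Unlock()
	}
}

// Get reads one counter directly, without building a snapshot map.
func (c *execCounters) Get(label string) int64 {
	switch label {
	case "PointGetExecutor":
		return c.pointGet.Load()
	case "TableScanExecutor":
		return c.tableScan.Load()
	default:
		c.mu.Lock()
		defer c.mu.Unlock()
		return c.fallback[label] // reading a nil map yields 0
	}
}

func main() {
	var c execCounters
	c.Add("PointGetExecutor", 1)
	c.Add("PointGetExecutor", 2)
	c.Add("SomeRareExecutor", 5)
	fmt.Println(c.Get("PointGetExecutor"), c.Get("SomeRareExecutor")) // 3 5
}
```

In this sketch a point-get statement touches only the `pointGet` field, so the map is never allocated on the hot path.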

Local server-context prepared point-get proxy after this PR:

  • vs pre-optimization baseline c391a01f15f8: -2.45% mean ns/op, -482 B/op, -6 allocs/op
  • vs old baseline 0f44b08dc66a: allocation count is back to baseline; about +265 B/op remains

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Unit tests / checks run locally:

go test -run 'TestRUV2|TestFormatRUV2|TestRURuntimeStats' -tags=intest,deadlock ./pkg/util/execdetails
./tools/check/failpoint-go-test.sh pkg/executor/internal/exec -run 'TestNextIOAccAddInputCountsRowsWithZeroCols'
./tools/check/failpoint-go-test.sh pkg/executor -run 'TestExplainAnalyzeInvokeNextAndClose'
go test -run '^$' -tags=intest,deadlock ./pkg/metrics
./tools/check/failpoint-go-test.sh pkg/session -run 'TestRUV2SessionParserTotalDoesNotLeakAcrossStandaloneParse|TestRUV2MetricsIsolatedPerStatementInExplicitTxn'
./tools/check/failpoint-go-test.sh pkg/sessionctx/variable -run '^$'
./tools/check/failpoint-go-test.sh pkg/sessionctx/variable/tests -run 'TestSlowLogFormat'
make lint

Manual benchmark proxy:

go test -run='^$' -bench='BenchmarkPreparedPointGetServerCtx$' \
  -benchtime=3s -count=5 -benchmem -tags=intest,deadlock ./pkg/session

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

Reduce RUv2 accounting overhead for point-get workloads.

Summary by CodeRabbit

  • Refactor

    • Optimized RUv2 resource-usage tracking with lazy initialization and cached counters, consolidating metric computations to reduce redundant work and memory allocations.
    • Streamlined propagation of RUv2 metrics through execution context and simplified how write-response duration is stored in statement summaries.
  • Tests

    • Added an allocation-regression test to guard against increased heap allocations during metric updates.

Signed-off-by: disksing <i@disksing.com>
Signed-off-by: disksing <i@disksing.com>
@ti-chi-bot ti-chi-bot Bot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Apr 29, 2026

pantheon-ai Bot commented Apr 29, 2026

@disksing I've received your pull request and will start the review. I'll conduct a thorough review covering code quality, potential issues, and implementation details.

⏳ This process typically takes 10-30 minutes depending on the complexity of the changes.

ℹ️ Learn more details on Pantheon AI.

@ti-chi-bot ti-chi-bot Bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Apr 29, 2026

ti-chi-bot Bot commented Apr 29, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign zimulala for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


tiprow Bot commented Apr 29, 2026

Hi @disksing. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.


coderabbitai Bot commented Apr 29, 2026

Walkthrough

Reworks RUv2 metric storage and hot-path recording: moves per-statement RUv2 into StmtExecDetails, lazily initializes extra counters, pre-caches common Prometheus counters, consolidates executor metric updates to a single delta, and updates context helpers and call sites to propagate the new statement-level metrics.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Executor hot-path (pkg/executor/internal/exec/executor.go, pkg/executor/staticrecordset/recordset.go) | Retrieve RUV2 once per Next call, early-disable when Bypass() is set, compute a single delta (rows or cells) and update metrics; use ContextWithRUV2Metrics for propagation. |
| RUV2 metrics core & structure (pkg/util/execdetails/ruv2_metrics.go, pkg/util/execdetails/execdetails.go) | Move statement RUv2 into StmtExecDetails; add lazy extra via atomic.Pointer, replace some sync.Map use with structured counters, add setters/getters (ensureRUV2Metrics, getRUV2Metrics, setRUV2Metrics), and update clone/merge/read/write paths. |
| Context & helpers (pkg/util/execdetails/util.go, pkg/util/execdetails/execdetails.go) | Stop writing RUv2 under a global context key; provide ContextWithRUV2Metrics and adjust initialization/inheritance helpers to attach RUv2 to StmtExecDetails instead. |
| Prometheus caching & helpers (pkg/metrics/ru_v2.go) | Pre-cache prometheus.Counter for fixed executor labels and coprocessor batch labels; add RUV2ExecutorCounter and RUV2TiKVCoprocessorWorkTotalCounter helpers with dynamic fallback. |
| Statement summary adjustments (pkg/executor/adapter.go, pkg/util/stmtsummary/statement_summary.go, pkg/util/stmtsummary/v2/record.go) | Stop embedding full StmtExecDetails in StmtExecInfo; add WriteSQLRespDuration field and update summary accumulation to use it. |
| Tests & minor touch-ups (pkg/util/execdetails/execdetails_test.go, pkg/sessionctx/variable/slow_log.go) | Add allocation regression test for RUv2 executor metric operations; minor slow-log setter cleanup. |
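
The "lazy extra via atomic.Pointer" row can be illustrated with a small sketch. The type and field names below (`stmtMetrics`, `coldMetrics`, `ensureExtra`) are hypothetical and only show the general pattern; the PR's actual layout may differ.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// coldMetrics groups counters that most point-get statements never touch.
// The field is an assumption made for illustration.
type coldMetrics struct {
	coprocessorBatches atomic.Int64
}

// stmtMetrics keeps the hot counter inline and the cold group behind a
// pointer that stays nil until something needs it.
type stmtMetrics struct {
	pointGetRows atomic.Int64
	extra        atomic.Pointer[coldMetrics]
}

// ensureExtra returns the cold group, allocating it on first use.
// If two goroutines race, exactly one allocation is published.
func (m *stmtMetrics) ensureExtra() *coldMetrics {
	if e := m.extra.Load(); e != nil {
		return e
	}
	fresh := &coldMetrics{}
	if m.extra.CompareAndSwap(nil, fresh) {
		return fresh
	}
	return m.extra.Load() // another goroutine published first
}

func main() {
	var m stmtMetrics
	m.pointGetRows.Add(1)
	fmt.Println(m.extra.Load() == nil) // true: the hot path allocated nothing extra

	m.ensureExtra().coprocessorBatches.Add(4)
	fmt.Println(m.ensureExtra().coprocessorBatches.Load()) // 4
}
```

A reader that only needs to know whether the metrics are zero can treat a nil `extra` as "all cold counters are zero" without allocating.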

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Executor
    participant StmtExecDetails
    participant RUV2Metrics
    participant Prometheus

    Client->>Executor: call Next()
    Executor->>StmtExecDetails: RUV2 := RUV2MetricsFromContext(ctx)
    alt RUV2 tracking enabled
        Executor->>RUV2Metrics: compute delta (rows|cells)
        RUV2Metrics->>RUV2Metrics: update structured counters / extra
        RUV2Metrics->>Prometheus: use cached counters or WithLabelValues to increment
        Prometheus-->>RUV2Metrics: ack
    else tracking disabled
        Executor-->>StmtExecDetails: skip RU updates
    end
    Executor-->>Client: return row/result
Loading
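
The flow in the diagram (one metrics lookup per Next call, an early exit when tracking is bypassed, and a single delta) might look roughly like the sketch below. All names are hypothetical. The rule used for the delta (row count when the batch has zero columns, rows times columns otherwise) is an assumption suggested by the test name TestNextIOAccAddInputCountsRowsWithZeroCols, not something this page confirms.

```go
package main

import "fmt"

// ruMetrics is a hypothetical stand-in for the statement RUv2 metrics.
type ruMetrics struct {
	bypass bool
	rows   int64
	cells  int64
}

// Bypass reports whether RU tracking is disabled for this statement.
func (m *ruMetrics) Bypass() bool { return m.bypass }

// batch describes the shape of one result batch returned by Next.
type batch struct{ rows, cols int }

// recordNext does the accounting for one Next call: the caller looks the
// metrics up once and passes them in; this function performs one bypass
// check and applies one delta to one counter.
func recordNext(m *ruMetrics, b batch) {
	if m == nil || m.Bypass() {
		return // tracking disabled: skip all RU updates
	}
	if b.cols == 0 {
		m.rows += int64(b.rows) // assumed rule: count rows when there are no columns
		return
	}
	m.cells += int64(b.rows) * int64(b.cols)
}

func main() {
	m := &ruMetrics{}
	recordNext(m, batch{rows: 3, cols: 2})
	recordNext(m, batch{rows: 4, cols: 0})
	recordNext(&ruMetrics{bypass: true}, batch{rows: 9, cols: 9})
	fmt.Println(m.rows, m.cells) // 4 6
}
```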

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Suggested labels

type/refactor, approved, lgtm

Suggested reviewers

  • XuHuaiyu
  • wjhuang2016
  • AilinKid

Poem

🐰 I hopped through code with nimble paws,

Caching counters, trimming claws,
Per-stmt needles tucked away,
Point-gets sprint without delay,
A little hop, a lighter cause.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main optimization: reducing RUv2 accounting overhead in the point-get hot path across util and executor packages. |
| Linked Issues check | ✅ Passed | Code changes comprehensively address issue #68118 objectives: caching Prometheus counters, storing RUv2 metrics in StmtExecDetails, using fixed L1 fields with map fallback, lazy allocation, and reducing context lookups. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to RUv2 optimization: executor counter caching, metrics storage restructuring, statement details refactoring, and related context/summary updates; no unrelated modifications detected. |

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
pkg/util/execdetails/execdetails_test.go (1)

430-437: Please cover the new context-storage path too.

This subtest locks down the L1 fixed-label allocation win, but the risky part of the refactor is the move from RUV2MetricsCtxKey to StmtExecDetails-backed storage. A small table-driven test around ContextWithInitializedExecDetails, ContextWithRUV2Metrics, and RUV2MetricsFromContext would catch silent propagation regressions that this alloc check won't.

Based on learnings, "**/*_test.go: Keep test changes minimal and deterministic; avoid broad golden/testdata churn unless required."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/util/execdetails/execdetails_test.go` around lines 430 - 437, Add a small
table-driven subtest that verifies the context-backed storage path
(ContextWithInitializedExecDetails, ContextWithRUV2Metrics, and
RUV2MetricsFromContext) behaves like the previous key-based path: for each case
(direct NewRUV2Metrics use, ContextWithInitializedExecDetails then
AddExecutorMetric via RUV2MetricsFromContext, and ContextWithRUV2Metrics helper)
create a fresh context, call the metric-add sequence (e.g.,
AddExecutorMetric/PointGetExecutor) inside testing.AllocsPerRun and assert
allocations are <= 1.0; ensure you exercise both initialization helpers
(ContextWithInitializedExecDetails and ContextWithRUV2Metrics) and use
RUV2MetricsFromContext to retrieve the metric so the table covers silent
propagation regressions introduced by moving storage to StmtExecDetails.

pkg/metrics/ru_v2.go (1)

241-345: Consider centralizing the known-label inventory.

The fast-path labels now have to stay aligned across this cache initializer/accessor, pkg/executor/internal/exec/executor.go, and pkg/util/execdetails/ruv2_metrics.go. Missing one entry won't fail loudly; it just falls back to the cold path and quietly erodes the optimization. A shared table or a small sync test would make this much safer to maintain.

As per coding guidelines, "Code SHOULD remain maintainable for future readers with basic TiDB familiarity, including readers who are not experts in the specific subsystem/feature."
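
One way to make a missing cache entry fail loudly is to build the cache from a single label table and check callers against it. The sketch below is illustrative only: `knownExecutorLabels`, `cachedCounters`, and `missingLabels` are hypothetical names, plain `int64` values stand in for Prometheus counters, and `BatchPointGetExecutor` is an assumed label.

```go
package main

import "fmt"

// knownExecutorLabels is the single inventory that both the cache and the
// callers would reference. Only PointGetExecutor is named in this PR.
var knownExecutorLabels = []string{"PointGetExecutor", "BatchPointGetExecutor"}

// cachedCounters simulates pre-resolved counters, built from the shared table
// so the two cannot drift apart.
var cachedCounters = func() map[string]*int64 {
	m := make(map[string]*int64, len(knownExecutorLabels))
	for _, label := range knownExecutorLabels {
		m[label] = new(int64)
	}
	return m
}()

// missingLabels reports labels used on the hot path that have no cached
// entry and would therefore fall back to the slow lookup.
func missingLabels(used []string) []string {
	var missing []string
	for _, label := range used {
		if _, ok := cachedCounters[label]; !ok {
			missing = append(missing, label)
		}
	}
	return missing
}

func main() {
	fmt.Println(missingLabels([]string{"PointGetExecutor"}))                // []
	fmt.Println(missingLabels([]string{"PointGetExecutor", "NewExecutor"})) // [NewExecutor]
}
```

A sync test would then assert that `missingLabels` returns an empty slice for every label the executor package uses.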

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/metrics/ru_v2.go` around lines 241 - 345, The cached-labels list is
duplicated across initRUV2CachedLabelCounters, RUV2ExecutorCounter,
RUV2TiKVCoprocessorWorkTotalCounter and other files (executor.go,
ruv2_metrics.go); extract the known labels into a single shared constant
map/slice (e.g. KnownRUV2ExecutorLabels and KnownRUV2TiKVWorkLabels) in a common
package (pkg/metrics or pkg/util/execdetails), have initRUV2CachedLabelCounters
and the accessor functions (RUV2ExecutorCounter,
RUV2TiKVCoprocessorWorkTotalCounter) use that shared table to register/cache and
perform fast-path lookups, update executor.go and ruv2_metrics.go to reference
the same symbols, and add a small unit or sync test that verifies every label
used by executor.go has a cached entry so missing entries fail loudly during CI.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 53bd4be1-5f2d-4637-9a5f-4c25935a2f2c

📥 Commits

Reviewing files that changed from the base of the PR and between 42118f3 and 040aa46.

📒 Files selected for processing (8)
  • pkg/executor/internal/exec/executor.go
  • pkg/executor/staticrecordset/recordset.go
  • pkg/metrics/ru_v2.go
  • pkg/sessionctx/variable/slow_log.go
  • pkg/util/execdetails/execdetails.go
  • pkg/util/execdetails/execdetails_test.go
  • pkg/util/execdetails/ruv2_metrics.go
  • pkg/util/execdetails/util.go


codecov Bot commented Apr 29, 2026

Codecov Report

❌ Patch coverage is 67.17325% with 108 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.1052%. Comparing base (c391a01) to head (d6fefa3).
⚠️ Report is 3 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #68119        +/-   ##
================================================
- Coverage   77.7579%   77.1052%   -0.6528%     
================================================
  Files          1990       1972        -18     
  Lines        551594     553021      +1427     
================================================
- Hits         428908     426408      -2500     
- Misses       121766     126510      +4744     
+ Partials        920        103       -817     

| Components | Coverage Δ |
| --- | --- |
| dumpling | 60.4888% <ø> (ø) |
| parser | ∅ <ø> (∅) |
| br | 50.0597% <ø> (-13.0338%) ⬇️ |

Signed-off-by: disksing <i@disksing.com>

ti-chi-bot Bot commented Apr 29, 2026

@disksing: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| idc-jenkins-ci-tidb/check_dev | d6fefa3 | link | true | /test check-dev |

Full PR test history. Your PR dashboard.

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
pkg/util/execdetails/util.go (1)

49-62: Minor code flow observation (non-blocking).

When stmtDetails is newly created (lines 49-55) and no inherited metrics exist, the subsequent check at lines 57-59 will redundantly query RUV2MetricsCtxKey again. This is correct but slightly non-obvious. Consider restructuring to avoid the repeated context lookup:

♻️ Optional: Clearer control flow
 if stmtDetails == nil {
   stmtDetails = &StmtExecDetails{}
-  if inheritedRUV2Metrics, _ := ctx.Value(RUV2MetricsCtxKey).(*RUV2Metrics); inheritedRUV2Metrics != nil {
-    stmtDetails.setRUV2Metrics(inheritedRUV2Metrics)
-  }
   ctx = context.WithValue(ctx, StmtExecDetailKey, stmtDetails)
 }
 if stmtDetails.getRUV2Metrics() == nil {
   if inheritedRUV2Metrics, _ := ctx.Value(RUV2MetricsCtxKey).(*RUV2Metrics); inheritedRUV2Metrics != nil {
     stmtDetails.setRUV2Metrics(inheritedRUV2Metrics)
   } else {
     stmtDetails.ensureRUV2Metrics()
   }
 }

This merges the inheritance check into the second block which already handles both "new stmtDetails" and "existing stmtDetails without metrics" cases.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/util/execdetails/util.go` around lines 49 - 62, The current flow creates
stmtDetails and then performs two separate ctx.Value(RUV2MetricsCtxKey) lookups;
to eliminate the redundant lookup, read inheritedRUV2Metrics once into a local
variable (using ctx.Value(RUV2MetricsCtxKey).(*RUV2Metrics)) before or during
the stmtDetails nil branch, set stmtDetails via
stmtDetails.setRUV2Metrics(inheritedRUV2Metrics) if non-nil when creating it
(and store stmtDetails into ctx with StmtExecDetailKey), and then in the
subsequent check use stmtDetails.getRUV2Metrics() and the previously captured
inheritedRUV2Metrics to either setRUV2Metrics(inheritedRUV2Metrics) or call
stmtDetails.ensureRUV2Metrics() as needed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: e4e75946-57f0-42db-9ab1-fad5cfc3d566

📥 Commits

Reviewing files that changed from the base of the PR and between 040aa46 and d6fefa3.

📒 Files selected for processing (4)
  • pkg/executor/adapter.go
  • pkg/util/execdetails/util.go
  • pkg/util/stmtsummary/statement_summary.go
  • pkg/util/stmtsummary/v2/record.go


Development

Successfully merging this pull request may close these issues.

RUv2 accounting adds point-get hot-path overhead

1 participant