topsql/reporter: add TopRU RU window aggregation and reporting pipeline #67089

ti-chi-bot[bot] merged 10 commits into master from
Conversation
Review complete. Findings: 0 issues.

Hi @zimulala. Thanks for your PR. PRs from untrusted users cannot be marked as trusted. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings, and use the following commands to manage reviews.
📝 Walkthrough

Adds TopRU collection and reporting: in-memory RU data model, a 15s-bucket sliding RU window aggregator with 60s reporting windows, reporter wiring (non-blocking RU queue, worker, API), RU included in ReportData, many unit tests/benchmarks, and BUILD/test dependency updates.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Reporter as RemoteTopSQLReporter
    participant Worker as collectRUWorker
    participant Aggregator as ruWindowAggregator
    participant Sink as DataSink

    Client->>Reporter: CollectRUIncrements(data)
    Reporter->>Reporter: enqueue to collectRUIncrementsChan (drop metric if full)
    activate Worker
    Worker->>Reporter: dequeue increments
    Worker->>Aggregator: addBatchToBucket(ruIncrements)
    deactivate Worker
    Reporter->>Aggregator: takeReportRecords(nowTs, itemInterval)
    Aggregator->>Aggregator: align windows, merge 15s buckets, apply top-N caps
    Aggregator-->>Reporter: []TopRURecord
    Reporter->>Reporter: attach RURecords to collectedData
    Reporter->>Sink: send ReportData (SQLMeta + RURecords)
    Sink-->>Sink: consume report
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
/ok-to-test

/retest
🧹 Nitpick comments (2)

pkg/util/topsql/topsql.go (1)

65-67: Capture the registered RU collector instance to keep lifecycle symmetric.

Line 65 registers the collector from the current `globalTopProfilingReport`, while line 86 unregisters from whatever instance `globalTopProfilingReport` points to at close time. If tests replace the global via `SetupTopProfilingForTest` between setup and close, the originally registered collector can remain registered.

♻️ Proposed refactor

```diff
 var (
 	globalTopProfilingReport reporter.TopSQLReporter
 	singleTargetDataSink     *reporter.SingleTargetDataSink
+	registeredRUCollector    stmtstats.RUCollector
 )
@@
 	stmtstats.RegisterCollector(globalTopProfilingReport)
 	if ruCollector, ok := globalTopProfilingReport.(stmtstats.RUCollector); ok {
 		stmtstats.RegisterRUCollector(ruCollector)
+		registeredRUCollector = ruCollector
 	}
 	stmtstats.SetupAggregator()
 }
@@
-	if ruCollector, ok := globalTopProfilingReport.(stmtstats.RUCollector); ok {
-		stmtstats.UnregisterRUCollector(ruCollector)
+	if registeredRUCollector != nil {
+		stmtstats.UnregisterRUCollector(registeredRUCollector)
+		registeredRUCollector = nil
 	}
```

Also applies to: 86-88
pkg/util/topsql/reporter/ru_datamodel.go (1)

123-142: Consider the O(n) timestamp lookup in high-throughput scenarios.

The `add` method uses a linear scan to find matching timestamps. For the current 15s-bucket design (max ~4 timestamps per 60s window), this is fine. If the bucket granularity ever changes to support many more timestamps per record, consider using a map-based lookup.
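The map-based index suggested above could be sketched as follows. The type names (`ruRecord`, `ruItem`, `itemsIndex`) mirror the review text but are assumptions for illustration, not TiDB's actual definitions:

```go
package main

import "fmt"

// ruItem accumulates RU for one aligned timestamp (assumed shape).
type ruItem struct {
	timestampSec uint64
	ru           float64
}

// ruRecord keeps a slice for ordered output plus a map index for O(1) lookup.
type ruRecord struct {
	items      []ruItem
	itemsIndex map[uint64]int // timestamp -> index into items
	totalRU    float64
}

// add accumulates RU for a timestamp via the map index instead of
// scanning items linearly; totalRU bookkeeping stays the same.
func (r *ruRecord) add(tsSec uint64, ru float64) {
	if r.itemsIndex == nil {
		r.itemsIndex = make(map[uint64]int)
	}
	if i, ok := r.itemsIndex[tsSec]; ok {
		r.items[i].ru += ru
	} else {
		r.itemsIndex[tsSec] = len(r.items)
		r.items = append(r.items, ruItem{timestampSec: tsSec, ru: ru})
	}
	r.totalRU += ru
}

func main() {
	var r ruRecord
	r.add(15, 1.5)
	r.add(15, 2.5) // merges into the existing 15s item
	r.add(30, 1.0)
	fmt.Println(len(r.items), r.totalRU) // 2 5
}
```

As the review notes, any method that removes, resets, or reorders `items` would have to keep `itemsIndex` consistent, which is why the linear scan is a reasonable default at the current bucket granularity.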
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 56de3fb6-e361-4709-8ae5-d05ac16af549
📒 Files selected for processing (11)
- pkg/util/topsql/reporter/BUILD.bazel
- pkg/util/topsql/reporter/reporter.go
- pkg/util/topsql/reporter/reporter_test.go
- pkg/util/topsql/reporter/ru_datamodel.go
- pkg/util/topsql/reporter/ru_datamodel_test.go
- pkg/util/topsql/reporter/ru_window_aggregator.go
- pkg/util/topsql/reporter/ru_window_aggregator_test.go
- pkg/util/topsql/reporter/topru_case_runner_test.go
- pkg/util/topsql/reporter/topru_generated_cases_test.go
- pkg/util/topsql/stmtstats/aggregator_bench_test.go
- pkg/util/topsql/topsql.go
XuHuaiyu
left a comment
PR review: TopRU aggregation and reporting pipeline. Two findings below.
Actionable comments posted: 1
♻️ Duplicate comments (1)
pkg/util/topsql/reporter/ru_window_aggregator.go (1)
178-199: ⚠️ Potential issue | 🟠 Major

Apply the final 100x100 cap after merging sub-intervals.

`intervalCompacted` limits each 15s/30s slice, but `mergedOutput.mergeFrom(...)` can still accumulate up to 4× the configured user/SQL count across the 60s window. Returning `mergedOutput.toTopRURecords(...)` directly therefore breaks the file's own 100x100 contract for `itemInterval=15` and `30`.

♻️ Proposed fix

```diff
-	// Convert to proto at output.
-	return mergedOutput.toTopRURecords(keyspaceName)
+	finalOutput := mergedOutput.compactWithLimits(ruReportTopNUsers, ruReportTopNSQLsPerUser)
+	if finalOutput == nil {
+		return nil
+	}
+	return finalOutput.toTopRURecords(keyspaceName)
```

Please also add a regression for the 15s/30s over-cap cases, not just the 60s path.
Inline comment (pkg/util/topsql/reporter/reporter.go, lines 172-185):

`CollectRUIncrements` currently enqueues un-timestamped RU batches onto `collectRUIncrementsChan`, and batches are only timestamped later in `collectRUWorker`, which can reorder attribution around the reporting tick. To fix, include the producer timestamp in the channel payload (e.g. wrap `stmtstats.RUIncrementMap` in a struct with a `time.Time` field), have `CollectRUIncrements` stamp the batch before sending, and update `collectRUWorker` to drain all pending entries and process them in timestamp order at each tick so RU is attributed to the correct window. Apply the same timestamped-envelope and drain-before-report pattern to the other enqueue points around `collectRUIncrementsChan`.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 645ec5ff-bab3-4bda-a805-13ac7ef3fe47
📒 Files selected for processing (3)
- pkg/util/topsql/reporter/reporter.go
- pkg/util/topsql/reporter/ru_datamodel.go
- pkg/util/topsql/reporter/ru_window_aggregator.go
Force-pushed from a6d1e3c to e760e1e (compare)
🧹 Nitpick comments (1)
pkg/util/topsql/reporter/ru_datamodel.go (1)
123-142: Consider: linear scan in `ruRecord.add` may become a bottleneck.

The `add` method performs a linear scan over `items` to find an existing timestamp. For typical 15s buckets with 1-4 timestamps this is acceptable, but if items grow larger (e.g., during merges), this could become O(n²). A map-based lookup could improve performance if profiling shows this as a hotspot.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 0ff2db95-dc25-4843-87f7-d766f4c1c89e
📒 Files selected for processing (5)
- pkg/util/topsql/reporter/BUILD.bazel
- pkg/util/topsql/reporter/reporter.go
- pkg/util/topsql/reporter/ru_datamodel.go
- pkg/util/topsql/reporter/ru_window_aggregator.go
- pkg/util/topsql/reporter/ru_window_aggregator_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/util/topsql/reporter/ru_window_aggregator_test.go
Codecov Report

❌ Patch coverage is

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             master    #67089      +/-   ##
================================================
- Coverage   77.7079%  77.3741%   -0.3339%
================================================
  Files          2013      1940        -73
  Lines        551161    549623      -1538
================================================
- Hits         428296    425266      -3030
- Misses       121134    124304      +3170
+ Partials       1731        53      -1678
```
Flags with carried forward coverage won't be shown. Click here to find out more.
If https://github.com/pingcap/tidb/pull/67089/changes#r2945836514 is resolved, I'll approve.
Re: ru_datamodel.go `ruRecord.add` linear scan (CodeRabbit nitpick): for the current 15s-bucket design we have at most a few timestamps per record per 60s window, so the linear scan is intentional and acceptable. If we change granularity or profiling shows this as a hotspot, we can add a map-based index (e.g. `itemsMap map[uint64]*ruItem`) in a follow-up.
Actionable comments posted: 2
🧹 Nitpick comments (1)
pkg/util/topsql/reporter/reporter_test.go (1)
1211-1236: Benchmarks should avoid unbounded sink accumulation.

`initializeCache` registers a sink that retains every `ReportData`; in benchmark loops this can dominate memory/time and blur reporter-path measurements. Prefer a no-op or bounded/drained sink for benchmark scenarios.
Inline comments:

In pkg/util/topsql/reporter/reporter_test.go (lines 247-285): record the original TopSQL enabled state at the start (e.g. `origTopSQLEnabled := topsqlstate.TopSQLEnabled()`) before calling `topsqlstate.DisableTopSQL()`, and in the existing `t.Cleanup` restore it via `EnableTopSQL()`/`DisableTopSQL()` based on that value, so `TestEffectiveReportIntervalSecondsTopSQLIndependentFromTopRU` returns the global TopSQL flag to its prior state and avoids cross-test leakage.

In pkg/util/topsql/reporter/ru_window_aggregator_test.go (lines 69-80): the helper `fillAggregatorSteadyState60sAt10kKeys` currently sets `numUsers=200` and `numSQLsPerUser=200`, producing 40,000 keys; change the constants so `numUsers * numSQLsPerUser == 10_000` (e.g., 100 × 100) and rebuild the batch via `makeRUBatch(numUsers, numSQLsPerUser)` so the function matches its name and the benchmark targets 10k keys.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 551cab77-6549-4904-be6b-280a0a58ebc3
📒 Files selected for processing (5)
pkg/util/topsql/reporter/reporter_test.gopkg/util/topsql/reporter/ru_datamodel_test.gopkg/util/topsql/reporter/ru_window_aggregator.gopkg/util/topsql/reporter/ru_window_aggregator_test.gopkg/util/topsql/reporter/topru_generated_cases_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/util/topsql/reporter/ru_window_aggregator.go
```go
if len(batch.data) == 0 {
	continue
}
tsr.ruAggregator.addBatchToBucket(batch.timestamp, batch.data)
```
High priority
Moving the timestamp to enqueue time is necessary, but this still leaves a cross-tick race because report flushing and RU draining are on separate goroutines.

Impact

- Suppose a batch is enqueued at t = 59, but `collectRUWorker` does not consume it until after `takeReportRecords(60)` has advanced `lastReportedEndTs` to 60.
- This line will then call `addBatchToBucket(59, ...)`, which aligns to bucket 45.
- `ruWindowAggregator.addBatchToBucket(...)` will treat that as late data (45 < 60) and drop it entirely.

So a batch produced before the report tick can still disappear from the closed window; this is not only a best-effort shift to the next window.
Test gap
TestTopRUBestEffortBoundaryShift covers a batch collected at t = 61 (already after the tick), but it does not cover the remaining problematic case: collected before the tick, drained after the tick.
Suggested direction
Before closing/reporting a window, drain pending RU batches into the aggregator, or serialize RU ingestion and report flushing on the same goroutine/event loop.
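The drain-before-report direction can be sketched as follows; the names and the simplified 15s bucket alignment are illustrative, not the actual reporter code:

```go
package main

import "fmt"

type batch struct {
	tsSec uint64
	ru    float64
}

type aggregator struct{ ruByBucket map[uint64]float64 }

// addBatch aligns the batch timestamp down to its 15s bucket.
func (a *aggregator) addBatch(b batch) {
	if a.ruByBucket == nil {
		a.ruByBucket = make(map[uint64]float64)
	}
	a.ruByBucket[b.tsSec-b.tsSec%15] += b.ru
}

// drainThenReport first drains every already-enqueued batch into the
// aggregator, then closes the window; pre-tick batches therefore cannot
// be dropped as late data by the report/drain race.
func drainThenReport(pending chan batch, agg *aggregator, reportEndTs uint64) float64 {
	for {
		select {
		case b := <-pending:
			agg.addBatch(b)
		default:
			total := 0.0
			for ts, ru := range agg.ruByBucket {
				if ts < reportEndTs {
					total += ru
					delete(agg.ruByBucket, ts)
				}
			}
			return total
		}
	}
}

func main() {
	pending := make(chan batch, 8)
	pending <- batch{tsSec: 59, ru: 2.0} // enqueued just before the t=60 tick
	agg := &aggregator{}
	fmt.Println(drainThenReport(pending, agg, 60)) // 2
}
```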
Force-pushed from 89b8852 to 0374ede (compare)
XuHuaiyu
left a comment
Re-reviewed the latest update. Serializing RU batch ingestion onto collectWorker and shifting late batches to the earliest still-open window addresses my previous concern about pre-tick batches being dropped by the report/drain race. The updated aggregator/reporter tests also look good from my side.
[APPROVALNOTIFIER] This PR is APPROVED.

This pull-request has been approved by: XuHuaiyu, yibin87. The full list of commands accepted by this bot can be found here. The pull request process is described here.

Details: needs approval from an approver in each of these files. Approvers can indicate their approval by writing `/approve` in a comment.
/retest

/retest

/retest

/retest

/retest

/retest
What problem does this PR solve?
Issue Number: close #67065
Problem Summary:
Add reporter-side TopRU aggregation/output path, while keeping PR2/PR3 responsibilities split and reviewable.
What changed and how does it work?
- New: ru_datamodel.go, ru_window_aggregator.go
- Reporter attaches RURecords on report tick: reporter.go, topsql.go
- Tests: ru_datamodel_test.go, ru_window_aggregator_test.go, topru_case_runner_test.go, topru_generated_cases_test.go, reporter_test.go
- Removed topru_structured_test.go from reporter test srcs (file no longer exists in current source branch).

Dependency note:
This PR depends on PR2 (RU delta collection in stmtstats/executor). RU collection is completed in PR2; this PR only handles reporter aggregation/output.
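The 15s-bucket / 60s-window alignment used by the aggregator boils down to simple modular arithmetic. The helper names below are assumptions; the constants match the bucket and window sizes described above:

```go
package main

import "fmt"

const (
	bucketIntervalSec = 15 // aggregation bucket size
	windowIntervalSec = 60 // reporting window size
)

// alignBucket rounds a timestamp down to its 15s bucket start.
func alignBucket(tsSec uint64) uint64 { return tsSec - tsSec%bucketIntervalSec }

// alignWindow rounds a timestamp down to its 60s reporting window start.
func alignWindow(tsSec uint64) uint64 { return tsSec - tsSec%windowIntervalSec }

func main() {
	// A sample at t=67s lands in the 60s..75s bucket of the 60s..120s window.
	fmt.Println(alignBucket(67), alignWindow(67)) // 60 60
	// A sample at t=59s lands in the 45s..60s bucket of the 0s..60s window.
	fmt.Println(alignBucket(59), alignWindow(59)) // 45 0
}
```

This is why a batch stamped at t=59 but drained after the t=60 tick aligns to bucket 45 and, without the drain-before-report handling, would be treated as late data for the already-closed window.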
Check List
Tests
Side effects
Documentation
Release note
Summary by CodeRabbit
New Features
Chores
Tests