Skip to content

planner, copr: estimate TiCI fulltext row count | tidb-test=13ccf8de48e8db2290ff884598444d0508606bbf tiflash=feature-fts#68056

Merged
ti-chi-bot[bot] merged 12 commits into
pingcap:feature/ftsfrom
wshwsh12:wsh/tici-estimate-count
Apr 29, 2026
Merged

planner, copr: estimate TiCI fulltext row count | tidb-test=13ccf8de48e8db2290ff884598444d0508606bbf tiflash=feature-fts#68056
ti-chi-bot[bot] merged 12 commits into
pingcap:feature/ftsfrom
wshwsh12:wsh/tici-estimate-count

Conversation

@wshwsh12
Copy link
Copy Markdown
Contributor

@wshwsh12 wshwsh12 commented Apr 27, 2026

What problem does this PR solve?

Issue Number: None

Problem Summary:

TiDB currently does not ask TiCI/TiFlash for fulltext predicate cardinality, so the optimizer cannot use TiCI-side sampled estimate count for fulltext access paths.

What changed and how does it work?

  • Add a TiCI estimate count path from planner statistics collection to the TiKV client layer.
  • Convert fulltext read ranges into TiCI shard/range estimate requests and call GetEstimateTiCICount through client-go.
  • Use the returned estimate as the fulltext access row count when the estimate request succeeds.
  • Log successful TiCI estimate count calls to make e2e verification observable.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Manual test:

  • TiUP playground tag: tici-estimate-skew-20260426-2240.
  • Query: EXPLAIN SELECT COUNT(*) FROM estimate_skew WHERE MATCH(content) AGAINST ('heavyhit' IN BOOLEAN MODE);.
  • Result: TiDB used TiCI estimate 4321; exact match count was 4000; total table row count was 9400.
  • Logs confirmed TiCI estimate count succeeded, GetEstimateTiCICount done, and TiCI EstimateCountResult { estimated_total_count: 4321, ... }.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

Summary by CodeRabbit

  • New Features

    • Added TiCI-based row count estimation for full-text search paths.
  • Chores

    • Bumped multiple Go module dependency versions.
  • Tests

    • Added unit tests covering TiCI estimation logic, shard grouping, and shard-selection behavior.

@ti-chi-bot ti-chi-bot Bot added release-note-none Denotes a PR that doesn't merit a release note. sig/planner SIG: Planner size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Apr 27, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 27, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds end-to-end TiCI (TiFlash Compute Index) count-estimation: new KV request/interface, planner hook to call estimator, copr store implementation with shard grouping/sampling and RPC, driver forwarding, unit tests, and dependency version bumps in go.mod.

Changes

Cohort / File(s) Summary
Dependency Management
go.mod
Bumps multiple dependencies (github.com/pingcap/kvproto, github.com/tikv/client-go/v2, various golang.org/x/* modules); only require versions updated.
KV Layer Interface
pkg/kv/tici_estimate.go, pkg/kv/BUILD.bazel
Adds TiCIEstimateCountRequest type and TiCIEstimateCountProvider interface; registers new source in BUILD.
Planner Integration
pkg/planner/core/stats.go, pkg/planner/core/BUILD.bazel
Planner attempts TiCI-based estimation for FTS search paths, builds KV ranges, retrieves txn read ts, calls estimator with 50ms timeout, sets CountAfterAccess on success, falls back to prior heuristic on failure; adds timeutil dep.
Copr Store Implementation
pkg/store/copr/tici_estimate_count.go, pkg/store/copr/BUILD.bazel
Adds Store.EstimateTiCICount: validates request, uses TiCI shard cache, groups shards by local cache addr, picks best sampled group, marshals FTS query, sends CmdGetEstimateTiCICount RPC, handles responses/errors, scales estimate by sampling ratio; BUILD updated and tipb dep added.
Driver Forwarding
pkg/store/driver/tikv_driver.go
Adds tikvStore.EstimateTiCICount that forwards to the copr store implementation.
Tests
pkg/store/copr/tici_estimate_count_test.go
Unit tests for shard grouping, shard-group selection, and protobuf request construction for TiCI estimate flows.

Sequence Diagram

sequenceDiagram
    participant Planner as Planner
    participant Driver as TiKV Driver
    participant Store as Copr Store
    participant Cache as TiCI Shard Cache
    participant TiFlash as TiFlash Endpoint

    Planner->>Planner: validate TiCI prerequisites (FTS, index, ranges)
    Planner->>Driver: EstimateTiCICount(ctx, req, 50ms)
    Driver->>Store: EstimateTiCICount(ctx, req, 50ms)
    Store->>Cache: Get shards for table/index
    Cache-->>Store: shard list with local cache addrs
    Store->>Store: group shards by local addr & dedupe IDs
    Store->>Store: choose best shard group
    Store->>Store: marshal FTS query into protobuf request
    Store->>TiFlash: CmdGetEstimateTiCICount (with timeout)
    TiFlash-->>Store: estimate response or error
    Store->>Store: scale estimate by totalUnique/sampledUnique (ceil)
    Store-->>Driver: scaled estimate
    Driver-->>Planner: scaled estimate (planner caps by realtime count)
    alt TiCI unavailable or error
        Planner->>Planner: fallback to heuristic min(realtime/10,1000)
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

  • winoros
  • wjhuang2016
  • OliverS929

Poem

🐰 I hopped through shards with a curious count,
Packed queries in packets and scaled by amount,
I sampled the caches, then sent them along,
If TiFlash is quiet, my fallback stays strong,
Thump-thump — little estimates sing like a song.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 10.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description check ✅ Passed The description covers the problem statement, technical implementation details, and includes manual test results with verification logs, though unit tests are not included.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Title check ✅ Passed The title accurately summarizes the main change: adding TiCI-based fulltext row count estimation for the planner and coprocessor components, which is the core objective of this PR.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@tiprow
Copy link
Copy Markdown

tiprow Bot commented Apr 27, 2026

Hi @wshwsh12. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (4)
pkg/planner/core/stats.go (1)

252-260: Also gate on path.Index.IsTiCIIndex() to avoid wasted 50ms RPCs on non-TiCI FTS paths.

The current guard accepts any path with FtsQueryInfo != nil, but TiCIIndexRangesToKVRanges and the downstream CmdGetEstimateTiCICount are TiCI-shard-specific. If a non-TiCI FTS path ever reaches here (legacy TiFlash FTS, future FTS index types, or an unfilled TiCI shard layout), the planner will pay up to the 50ms timeout per such path before falling back to the heuristic, and getTiCIShardType may return a misleading shard type for the KV-range conversion.

Adding the index-kind check makes the contract with getTiCIShardType explicit and short-circuits cleanly for non-TiCI indexes:

♻️ Suggested change
 func deriveTiCISearchPathStats(ds *logicalop.DataSource, path *util.AccessPath) (float64, bool) {
 	sctx, ok := ds.SCtx().(sessionctx.Context)
 	if !ok || path == nil || path.Index == nil || path.FtsQueryInfo == nil || len(path.Ranges) == 0 {
 		return 0, false
 	}
+	if !path.Index.IsTiCIIndex() {
+		return 0, false
+	}
 	provider, ok := sctx.GetStore().(kv.TiCIEstimateCountProvider)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/planner/core/stats.go` around lines 252 - 260, In
deriveTiCISearchPathStats add an early guard that checks
path.Index.IsTiCIIndex() and return (0, false) when it is false so we only call
TiCI-specific logic for TiCI indexes; specifically, before obtaining the TiCI
estimate provider or invoking TiCIIndexRangesToKVRanges /
CmdGetEstimateTiCICount (and before calling getTiCIShardType), ensure path !=
nil && path.Index != nil && path.Index.IsTiCIIndex() to short-circuit non-TiCI
FTS paths and avoid the 50ms RPC overhead and misleading shard-type conversions.
pkg/store/copr/tici_estimate_count.go (2)

97-106: Consider populating kvrpcpb.Context with at least request-source/priority for observability and QoS.

The request is sent with Context: &kvrpcpb.Context{}, which means the TiFlash side won't see a request source tag, SQL digest, priority, or resource-control context for this RPC. Other coprocessor calls in this package set these fields so that estimation traffic can be observed, throttled, and attributed correctly. Since TiCI estimate calls run on the planner hot path, having them invisible to TiFlash-side metrics/throttling is a missed signal.

If there is a session/statement context reachable here (or one that can be threaded via req), populate at minimum RequestSource and Priority; otherwise consider letting the TiFlash request-source SDK attach them via the existing wrappers.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/store/copr/tici_estimate_count.go` around lines 97 - 106, The
TiCIEstimateCountRequest is built with an empty kvrpcpb.Context, which omits
RequestSource/Priority and prevents TiFlash from observing/throttling this RPC;
update the request construction in the function that returns
&coprocessor.TiCIEstimateCountRequest{} so that the Context is populated
(kvrpcpb.Context) with at least RequestSource and Priority taken from the
available session/statement context or from fields on req (or by calling the
same helper/wrapper used by other coprocessor calls in this package), ensuring
the request carries the same observability/QoS metadata as other coprocessor
RPCs.

85-89: Minor: dead branch in scaling check.

sampledUniqueShards == 0 cannot trigger here — group.shardInfos and group.uniqueShardID are populated in lockstep in buildTiCIEstimateShardGroups, and the earlier guard len(group.shardInfos) == 0 already errors out before this point. The expression simplifies to sampledUniqueShards >= totalUniqueShards. Dropping the == 0 clause makes the invariant clearer and removes the suggestion that a zero-divisor case needs handling.

♻️ Suggested simplification
-	sampledUniqueShards := uint64(len(group.uniqueShardID))
-	if sampledUniqueShards == 0 || sampledUniqueShards >= totalUniqueShards {
+	sampledUniqueShards := uint64(len(group.uniqueShardID))
+	if sampledUniqueShards >= totalUniqueShards {
 		return pbResp.EstCount, nil
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/store/copr/tici_estimate_count.go` around lines 85 - 89, The guard that
checks sampledUniqueShards currently includes a dead branch `sampledUniqueShards
== 0`; since `group.shardInfos` and `group.uniqueShardID` are populated together
in buildTiCIEstimateShardGroups and an earlier check already rejects empty
groups, simplify the condition in the scaling check to only `sampledUniqueShards
>= totalUniqueShards` (i.e., replace `if sampledUniqueShards == 0 ||
sampledUniqueShards >= totalUniqueShards` with `if sampledUniqueShards >=
totalUniqueShards`) and update any nearby comment to reflect the invariant that
sampledUniqueShards cannot be zero.
pkg/store/copr/tici_estimate_count_test.go (1)

78-92: Optional: extend coverage to the validation/short-circuit branches of EstimateTiCICount.

The helper functions are well covered, but the public method (*Store).EstimateTiCICount itself has untested guard branches (nil request, nil FTSQueryInfo, nil/empty KeyRanges, missing TiCI shard cache). These can be exercised by passing minimally constructed inputs without needing a real RPC client and would lock in the documented short-circuit contract:

  • nil request → error "TiCI estimate count request is nil"
  • nil FTSQueryInfo → error "TiCI estimate count request missing FTS query info"
  • nil/empty KeyRanges → returns (0, nil) (no RPC)

Not required for this PR, but cheap to add and protects the early-return semantics that callers in pkg/planner/core/stats.go rely on.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/store/copr/tici_estimate_count_test.go` around lines 78 - 92, Add unit
tests for the public method (*Store).EstimateTiCICount to cover its early-return
validation branches: create tests that (1) pass a nil request and assert the
error message equals "TiCI estimate count request is nil", (2) pass a request
with nil FTSQueryInfo and assert the error message equals "TiCI estimate count
request missing FTS query info", and (3) pass a request with nil or empty
KeyRanges and assert it returns (0, nil) without attempting RPC. Put these tests
alongside TestBuildTiCIEstimatePBRequest in
pkg/store/copr/tici_estimate_count_test.go, instantiate a Store without a real
RPC client (nil) and with minimal TiCI shard cache state so tests exercise only
the short-circuit logic in EstimateTiCICount.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@go.mod`:
- Around line 381-384: The go.mod currently contains temporary replace
directives pointing to a personal fork (github.com/wshwsh12/kvproto and
github.com/wshwsh12/client-go/v2); remove these replaces by upstreaming the
protobuf and client changes into pingcap/kvproto and tikv/client-go/v2, then
update the corresponding require versions (the current require entries around
the former Lines 104 and 127) to the new upstream tags/versions and delete the
two replace blocks, and finally run make bazel_prepare and include the generated
Bazel metadata changes in the PR so the workspace stays in sync.

In `@pkg/planner/core/stats.go`:
- Around line 276-284: The call to provider.EstimateTiCICount currently uses
context.Background(), which detaches cancellation/tracing from the planner flow;
instead propagate a parent context (e.g., the caller's ctx derived from sctx)
into provider.EstimateTiCICount so upstream cancellations and tracing are
honoured. Update deriveSearchPathStats (or its callers) to accept and pass a
context parameter down to the EstimateTiCICount invocation (replacing
context.Background()), so provider.EstimateTiCICount(...) receives the
propagated ctx while keeping the 50ms timeout/deadline applied.
- Around line 290-299: The Info-level log inside deriveTiCISearchPathStats (the
logutil.BgLogger().Info(...) call that logs "TiCI estimate count succeeded" with
fields like tableID, indexName, indexID, readTS, estimatedCount,
plannerCountAfterAccess, realtimeCount, rangeCount) should be demoted to Debug
to avoid flooding production logs on the planner hot path; locate that call in
pkg/planner/core/stats.go and replace the Info invocation with Debug (or gate it
behind a feature/log-level check) while keeping the same structured fields and
the existing return (plannerCount, true).

---

Nitpick comments:
In `@pkg/planner/core/stats.go`:
- Around line 252-260: In deriveTiCISearchPathStats add an early guard that
checks path.Index.IsTiCIIndex() and return (0, false) when it is false so we
only call TiCI-specific logic for TiCI indexes; specifically, before obtaining
the TiCI estimate provider or invoking TiCIIndexRangesToKVRanges /
CmdGetEstimateTiCICount (and before calling getTiCIShardType), ensure path !=
nil && path.Index != nil && path.Index.IsTiCIIndex() to short-circuit non-TiCI
FTS paths and avoid the 50ms RPC overhead and misleading shard-type conversions.

In `@pkg/store/copr/tici_estimate_count_test.go`:
- Around line 78-92: Add unit tests for the public method
(*Store).EstimateTiCICount to cover its early-return validation branches: create
tests that (1) pass a nil request and assert the error message equals "TiCI
estimate count request is nil", (2) pass a request with nil FTSQueryInfo and
assert the error message equals "TiCI estimate count request missing FTS query
info", and (3) pass a request with nil or empty KeyRanges and assert it returns
(0, nil) without attempting RPC. Put these tests alongside
TestBuildTiCIEstimatePBRequest in pkg/store/copr/tici_estimate_count_test.go,
instantiate a Store without a real RPC client (nil) and with minimal TiCI shard
cache state so tests exercise only the short-circuit logic in EstimateTiCICount.

In `@pkg/store/copr/tici_estimate_count.go`:
- Around line 97-106: The TiCIEstimateCountRequest is built with an empty
kvrpcpb.Context, which omits RequestSource/Priority and prevents TiFlash from
observing/throttling this RPC; update the request construction in the function
that returns &coprocessor.TiCIEstimateCountRequest{} so that the Context is
populated (kvrpcpb.Context) with at least RequestSource and Priority taken from
the available session/statement context or from fields on req (or by calling the
same helper/wrapper used by other coprocessor calls in this package), ensuring
the request carries the same observability/QoS metadata as other coprocessor
RPCs.
- Around line 85-89: The guard that checks sampledUniqueShards currently
includes a dead branch `sampledUniqueShards == 0`; since `group.shardInfos` and
`group.uniqueShardID` are populated together in buildTiCIEstimateShardGroups and
an earlier check already rejects empty groups, simplify the condition in the
scaling check to only `sampledUniqueShards >= totalUniqueShards` (i.e., replace
`if sampledUniqueShards == 0 || sampledUniqueShards >= totalUniqueShards` with
`if sampledUniqueShards >= totalUniqueShards`) and update any nearby comment to
reflect the invariant that sampledUniqueShards cannot be zero.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: c3434051-b885-4ffb-a722-1395afc8eb4c

📥 Commits

Reviewing files that changed from the base of the PR and between d0253d6 and 58c2d28.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (6)
  • go.mod
  • pkg/kv/tici_estimate.go
  • pkg/planner/core/stats.go
  • pkg/store/copr/tici_estimate_count.go
  • pkg/store/copr/tici_estimate_count_test.go
  • pkg/store/driver/tikv_driver.go

Comment thread go.mod Outdated
Comment thread pkg/planner/core/stats.go
Comment thread pkg/planner/core/stats.go Outdated
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 27, 2026

Codecov Report

❌ Patch coverage is 0% with 149 lines in your changes missing coverage. Please review.
✅ Project coverage is 39.4385%. Comparing base (bd3b320) to head (0b41ccb).
⚠️ Report is 34 commits behind head on feature/fts.

Additional details and impacted files
@@                 Coverage Diff                  @@
##           feature/fts     #68056         +/-   ##
====================================================
- Coverage      76.8610%   39.4385%   -37.4226%     
====================================================
  Files             1960       1716        -244     
  Lines           555677     475548      -80129     
====================================================
- Hits            427099     187549     -239550     
- Misses          127116     271099     +143983     
- Partials          1462      16900      +15438     
Flag Coverage Δ
integration 39.4385% <0.0000%> (-6.1709%) ⬇️
tiprow_ft ?
unit ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling ∅ <ø> (∅)
parser ∅ <ø> (∅)
br 0.0951% <ø> (-66.1495%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@ti-chi-bot ti-chi-bot Bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Apr 27, 2026
@wshwsh12 wshwsh12 force-pushed the wsh/tici-estimate-count branch from cb36c32 to ae87825 Compare April 27, 2026 09:24
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
go.mod (2)

104-382: ⚠️ Potential issue | 🟠 Major

Please include Bazel metadata sync for this go.mod update

This PR changes go.mod, but no Bazel metadata updates are shown in the provided diff. Please run make bazel_prepare and include generated changes if any.

#!/bin/bash
set -euo pipefail

# 1) Confirm go module files changed
git diff --name-only | rg -n '^(go\.mod|go\.sum)$' || true

# 2) Check whether Bazel metadata files are included in this diff
git diff --name-only | rg -n '(^DEPS\.bzl$|(^|/)BUILD\.bazel$|\.bzl$)' || true

Based on learnings: "Applies to /*.bazel|/BUILD.bazel|**/*.bzl : MUST run make bazel_prepare and include resulting Bazel metadata changes in the PR when ... changing go.mod/go.sum."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go.mod` around lines 104 - 382, The PR updated go.mod but omitted Bazel
metadata; run the workspace preparation step (make bazel_prepare) to regenerate
Bazel metadata, then stage and commit the resulting Bazel files (e.g., DEPS.bzl,
any changed BUILD.bazel or *.bzl files) alongside the go.mod/go.sum updates;
verify with git diff that go.mod/go.sum changes and any generated Bazel files
are included before pushing.

382-382: ⚠️ Potential issue | 🔴 Critical

Blocker: personal-fork client-go replace must not ship

Line 382 still redirects github.com/tikv/client-go/v2 to a personal fork. This is a supply-chain and availability risk for release artifacts. Please switch back to upstream github.com/tikv/client-go/v2 after the required TiCI RPC changes are available there, then remove this replace.

#!/bin/bash
set -euo pipefail

# Verify whether any personal-fork replace directives remain.
rg -n '^\s*replace\s+github\.com/tikv/client-go/v2\s*=>\s*github\.com/wshwsh12/client-go/v2' go.mod
rg -n '^\s*replace\s+github\.com/pingcap/kvproto\s*=>\s*github\.com/wshwsh12/kvproto' go.mod || true

# Show current upstream requirement pinned in go.mod.
rg -n '^\s*github\.com/tikv/client-go/v2\s+v' go.mod
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go.mod` at line 382, The go.mod currently contains a replace directive
redirecting github.com/tikv/client-go/v2 to a personal fork (the replace line
shown) which must be removed; edit go.mod to delete the replace entry for
github.com/tikv/client-go/v2 => github.com/wshwsh12/client-go/v2 and ensure the
module requires the upstream github.com/tikv/client-go/v2 (restore or pin the
intended upstream version if needed), then remove any corresponding replace for
github.com/pingcap/kvproto if present and re-run `go mod tidy` to verify there
are no remaining personal-fork replaces.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@go.mod`:
- Around line 104-382: The PR updated go.mod but omitted Bazel metadata; run the
workspace preparation step (make bazel_prepare) to regenerate Bazel metadata,
then stage and commit the resulting Bazel files (e.g., DEPS.bzl, any changed
BUILD.bazel or *.bzl files) alongside the go.mod/go.sum updates; verify with git
diff that go.mod/go.sum changes and any generated Bazel files are included
before pushing.
- Line 382: The go.mod currently contains a replace directive redirecting
github.com/tikv/client-go/v2 to a personal fork (the replace line shown) which
must be removed; edit go.mod to delete the replace entry for
github.com/tikv/client-go/v2 => github.com/wshwsh12/client-go/v2 and ensure the
module requires the upstream github.com/tikv/client-go/v2 (restore or pin the
intended upstream version if needed), then remove any corresponding replace for
github.com/pingcap/kvproto if present and re-run `go mod tidy` to verify there
are no remaining personal-fork replaces.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 317dfe08-8a24-475e-862d-d5b1ad886860

📥 Commits

Reviewing files that changed from the base of the PR and between cb36c32 and ae87825.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (1)
  • go.mod

Signed-off-by: wshwsh12 <793703860@qq.com>
@wshwsh12 wshwsh12 force-pushed the wsh/tici-estimate-count branch from ae87825 to 3b00f60 Compare April 27, 2026 12:13
Signed-off-by: wshwsh12 <793703860@qq.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
pkg/planner/core/stats.go (2)

252-256: Type assertion silently disables TiCI estimation in non-sessionctx.Context cases.

ds.SCtx() returns base.PlanContext; the assertion to sessionctx.Context will succeed in production but may quietly fail in tests/mock contexts that supply only the planner-facing subset, in which case TiCI estimation is skipped without any log. If that is intentional (TiCI estimation is best-effort and the heuristic fallback is acceptable), consider adding a one-line comment documenting the intent so future readers don’t mistake the early return 0, false for an oversight.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/planner/core/stats.go` around lines 252 - 256, deriveTiCISearchPathStats
currently type-asserts ds.SCtx() to sessionctx.Context in
deriveTiCISearchPathStats which causes TiCI estimation to be silently skipped in
test/mocked PlanContexts; add a one-line clarifying comment above the assertion
in deriveTiCISearchPathStats (referencing ds.SCtx() and sessionctx.Context) that
this is an intentional best-effort check and that non-sessionctx.PlanContext
cases (e.g., mocks) will skip TiCI estimation, so future readers won’t treat the
early return as a bug.

188-241: Deduplicate TiCI shard-type derivation.

getTiCIShardType reproduces the same precedence (HybridInfo/Sharding → CommonHandle → IntHandle) that fillIndexPath derives inline at lines 190-206. Having two copies invites drift if the rules ever change (e.g., a new shard variant is added in one place but not the other). Consider routing fillIndexPath through the new helper so there is a single source of truth.

♻️ Suggested refactor
 func fillIndexPath(ds *logicalop.DataSource, path *util.AccessPath, conds []expression.Expression, possiblePK *model.IndexInfo) error {
 	ticiType := distsql.NotTiCIIndex
 	if path.Index.IsTiCIIndex() {
-		if path.Index.HybridInfo != nil && path.Index.HybridInfo.Sharding != nil {
-			ticiType = distsql.TiCIShardExtraShardingKey
+		ticiType = getTiCIShardType(ds, path)
+		switch ticiType {
+		case distsql.TiCIShardExtraShardingKey:
 			path.Ranges = ranger.FullRange()
-		} else if ds.TableInfo.IsCommonHandle {
-			ticiType = distsql.TiCIShardCommonHandle
+		case distsql.TiCIShardCommonHandle:
 			path.Ranges = ranger.FullNotNullRange()
-		} else {
-			ticiType = distsql.TiCIShardIntHandle
+		default: // TiCIShardIntHandle
 			// Int Handle's range is a special one.
 			unsignedFlag := false
-			// We will not get the column for the _tidb_rowid case.
 			if intHandle := ds.TableInfo.GetPkColInfo(); intHandle != nil {
 				unsignedFlag = mysql.HasUnsignedFlag(intHandle.GetFlag())
 			}
 			path.Ranges = ranger.FullIntRange(unsignedFlag)
 		}

Also applies to: 306-314

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/planner/core/stats.go` around lines 188 - 241, The code in fillIndexPath
duplicates TiCI shard-type logic (HybridInfo/Sharding → CommonHandle →
IntHandle) that getTiCIShardType encapsulates; refactor fillIndexPath to call
getTiCIShardType(ds, path.Index, possiblePK) (or the appropriate signature of
the helper) to obtain ticiType and then handle only the side effects (setting
path.Ranges for each ticiType branch) and the subsequent column conversions
(expression.TiCIIndexInfo2ShardCols / IndexInfo2Cols), removing the inline
precedence checks; ensure the IntHandle unsignedFlag logic (using
ds.TableInfo.GetPkColInfo and mysql.HasUnsignedFlag) is preserved when applying
the IntHandle case and keep the existing updates to
path.IdxCols/IdxColLens/FullIdxCols/FullIdxColLens and the subsequent
detachCondAndBuildRangeForPath call so behavior is unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/planner/core/stats.go`:
- Around line 280-292: The synchronous call to provider.EstimateTiCICount in
stats.go can block planning up to the hard-coded 50ms timeout per FTS path; add
a failure short-lived negative cache or simple circuit-breaker around
EstimateTiCICount (cached failure state keyed by the TiCI provider or
table/index + backoff TTL and failure threshold) so consecutive failures quickly
skip the RPC and return the fallback, and make the timeout configurable (expose
it as a session variable or config value used instead of the hard-coded
50*time.Millisecond) so callers of EstimateTiCICount and the callsite in this
file can use the tunable deadline.

---

Nitpick comments:
In `@pkg/planner/core/stats.go`:
- Around line 252-256: deriveTiCISearchPathStats currently type-asserts
ds.SCtx() to sessionctx.Context in deriveTiCISearchPathStats which causes TiCI
estimation to be silently skipped in test/mocked PlanContexts; add a one-line
clarifying comment above the assertion in deriveTiCISearchPathStats (referencing
ds.SCtx() and sessionctx.Context) that this is an intentional best-effort check
and that non-sessionctx.PlanContext cases (e.g., mocks) will skip TiCI
estimation, so future readers won’t treat the early return as a bug.
- Around line 188-241: The code in fillIndexPath duplicates TiCI shard-type
logic (HybridInfo/Sharding → CommonHandle → IntHandle) that getTiCIShardType
encapsulates; refactor fillIndexPath to call getTiCIShardType(ds, path.Index,
possiblePK) (or the appropriate signature of the helper) to obtain ticiType and
then handle only the side effects (setting path.Ranges for each ticiType branch)
and the subsequent column conversions (expression.TiCIIndexInfo2ShardCols /
IndexInfo2Cols), removing the inline precedence checks; ensure the IntHandle
unsignedFlag logic (using ds.TableInfo.GetPkColInfo and mysql.HasUnsignedFlag)
is preserved when applying the IntHandle case and keep the existing updates to
path.IdxCols/IdxColLens/FullIdxCols/FullIdxColLens and the subsequent
detachCondAndBuildRangeForPath call so behavior is unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: dba5b70b-61bb-4439-b3ba-8a9d3c170cab

📥 Commits

Reviewing files that changed from the base of the PR and between ae87825 and fce9d02.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (2)
  • go.mod
  • pkg/planner/core/stats.go
✅ Files skipped from review due to trivial changes (1)
  • go.mod

Comment thread pkg/planner/core/stats.go Outdated
Comment on lines +280 to +292
count, err := provider.EstimateTiCICount(ctx, &kv.TiCIEstimateCountRequest{
TableID: tableID,
IndexID: path.Index.ID,
StartTS: readTS,
FTSQueryInfo: path.FtsQueryInfo,
KeyRanges: keyRanges,
TimeZoneName: tzName,
TimeZoneOffset: tzOffset,
}, 50*time.Millisecond)
if err != nil {
logutil.BgLogger().Warn("failed to estimate TiCI search path row count", zap.Error(err), zap.Int64("tableID", tableID), zap.Int64("indexID", path.Index.ID))
return 0, false
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Synchronous TiCI RPC adds up to 50 ms per FTS path during planning.

EstimateTiCICount is a blocking call on the planner thread guarded only by a 50 ms timeout. When TiFlash/TiCI is unhealthy (network partition, restart, slow node) every FTS query that reaches this code will eat the full deadline before falling back, and a query with multiple FTS access paths will pay this cost more than once. The fallback at the call site is correct, but consider:

  • A short-lived negative cache or circuit breaker on consecutive failures so a degraded TiCI does not regress every concurrent planner.
  • Surfacing the timeout as a tunable (session var or config) — 50 ms may be too tight for cross-AZ deployments and too loose under heavy load.

Not a blocker for this PR, but worth tracking before the feature is enabled by default.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/planner/core/stats.go` around lines 280 - 292, The synchronous call to
provider.EstimateTiCICount in stats.go can block planning up to the hard-coded
50ms timeout per FTS path; add a failure short-lived negative cache or simple
circuit-breaker around EstimateTiCICount (cached failure state keyed by the TiCI
provider or table/index + backoff TTL and failure threshold) so consecutive
failures quickly skip the RPC and return the fallback, and make the timeout
configurable (expose it as a session variable or config value used instead of
the hard-coded 50*time.Millisecond) so callers of EstimateTiCICount and the
callsite in this file can use the tunable deadline.

@ti-chi-bot ti-chi-bot Bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Apr 28, 2026
@wshwsh12 wshwsh12 force-pushed the wsh/tici-estimate-count branch from 9debcad to 289077e Compare April 28, 2026 08:01
Copy link
Copy Markdown
Member

@winoros winoros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

planner part lgtm.
Enough for the 1st executable version.

@ti-chi-bot ti-chi-bot Bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Apr 28, 2026
Signed-off-by: wshwsh12 <793703860@qq.com>
Signed-off-by: wshwsh12 <793703860@qq.com>
@wshwsh12 wshwsh12 changed the title planner, copr: estimate TiCI fulltext row count planner, copr: estimate TiCI fulltext row count | tidb-test=13ccf8de48e8db2290ff884598444d0508606bbf tiflash=feature-fts Apr 28, 2026
@wshwsh12
Copy link
Copy Markdown
Contributor Author

/retest

@tiprow
Copy link
Copy Markdown

tiprow Bot commented Apr 28, 2026

@wshwsh12: PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Signed-off-by: wshwsh12 <793703860@qq.com>
Signed-off-by: wshwsh12 <793703860@qq.com>
Comment thread go.mod
golang.org/x/text v0.33.0
golang.org/x/sys v0.41.0
golang.org/x/term v0.40.0
golang.org/x/text v0.34.0
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why there is some other changes?

Copy link
Copy Markdown
Contributor Author

@wshwsh12 wshwsh12 Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kvproto has updated these dependencies.... It was not caused by my change, but by dependency updates introduced by other commits on the kvproto master branch.

Comment on lines +71 to +72
clientReq := tikvrpc.NewRequest(tikvrpc.CmdGetEstimateTiCICount, pbReq)
clientReq.StoreTp = getEndPointType(kv.TiFlash)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these codes looks too low level to me, i think tidb already have a unified code to handle these?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't find a unified code to handle these.

Comment thread pkg/planner/core/stats.go Outdated
if !ok {
return 0, false
}
readTS, err := sessiontxn.GetTxnManager(sctx).GetStmtReadTS()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will this request a new start_ts from pd?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. But this would move the stage of obtaining read_ts from the execution phase to the optimizer phase, which is indeed a problem. Considering that getting the estimated row count from TiCI does not actually require read_ts, I decided to remove this.

Signed-off-by: wshwsh12 <793703860@qq.com>
@tiprow
Copy link
Copy Markdown

tiprow Bot commented Apr 29, 2026

@wshwsh12: PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@wshwsh12
Copy link
Copy Markdown
Contributor Author

/retest

@tiprow
Copy link
Copy Markdown

tiprow Bot commented Apr 29, 2026

@wshwsh12: PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Copy link
Copy Markdown
Contributor

@windtalker windtalker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot Bot commented Apr 29, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: windtalker, winoros
Once this PR has been reviewed and has the lgtm label, please assign cfzjywxk, qw4990 for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot Bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Apr 29, 2026
@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot Bot commented Apr 29, 2026

[LGTM Timeline notifier]

Timeline:

  • 2026-04-28 08:26:36.792942889 +0000 UTC m=+2672801.998302945: ☑️ agreed by winoros.
  • 2026-04-29 10:56:50.030662753 +0000 UTC m=+2768215.236022820: ☑️ agreed by windtalker.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved lgtm release-note-none Denotes a PR that doesn't merit a release note. sig/planner SIG: Planner size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants