*: bump tipb and refactor analyze sampling helpers#68414
Conversation
Match the name the row sampler already uses for the columns + column groups slot count, and drop the redundant later redeclaration that recomputed the same value from e.colsInfo/e.indexes.
|
@0xPoe I've received your pull request and will start the review. I'll conduct a thorough review covering code quality, potential issues, and implementation details. ⏳ This process typically takes 10-30 minutes depending on the complexity of the changes. ℹ️ Learn more details on Pantheon AI. |
📝 WalkthroughWalkthroughThis PR updates the tipb dependency revision and refactors statistics hashing and column sampling code. FMSketch hashing logic is extracted into reusable helper functions, and the column sampling pipeline is updated to compute and propagate a consistent totalLen parameter throughout. ChangesStatistics Hashing and Sampling Updates
Sequence Diagram(s)(Diagrams included in the hidden review stack artifact above.) Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #68414 +/- ##
================================================
- Coverage 77.7123% 76.0520% -1.6604%
================================================
Files 1991 2019 +28
Lines 552087 579691 +27604
================================================
+ Hits 429040 440867 +11827
- Misses 122127 136908 +14781
- Partials 920 1916 +996
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pkg/statistics/fmsketch.go (1)
142-178: 💤 Low valueConsider consistent error handling style between helpers.
Both
hashDatum(line 144) andhashRow(line 168) handle encoding errors, but they use slightly different patterns:
hashDatumcallssc.HandleError(err)directlyhashRowcallserrCtx.HandleError(err)aftererrCtx := sc.ErrCtx()While functionally equivalent, using a consistent style across both helpers would improve readability.
♻️ Align error handling style
Option 1: Both use
sc.HandleErrordirectly (simpler):func hashRow(sc *stmtctx.StatementContext, values []types.Datum) (uint64, error) { b := make([]byte, 0, 8) hashFunc := murmur3Pool.Get().(hash.Hash64) hashFunc.Reset() defer murmur3Pool.Put(hashFunc) - errCtx := sc.ErrCtx() for _, v := range values { b = b[:0] b, err := codec.EncodeValue(sc.TimeZone(), b, v) - err = errCtx.HandleError(err) + err = sc.HandleError(err) if err != nil { return 0, err }Option 2: Both use
sc.ErrCtx()(explicit):func hashDatum(sc *stmtctx.StatementContext, value types.Datum) (uint64, error) { + errCtx := sc.ErrCtx() bytes, err := codec.EncodeValue(sc.TimeZone(), nil, value) - err = sc.HandleError(err) + err = errCtx.HandleError(err) if err != nil { return 0, err }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/statistics/fmsketch.go` around lines 142 - 178, Align the error-handling style by using StatementContext.ErrCtx() consistently: in hashDatum replace the direct sc.HandleError(err) call with errCtx := sc.ErrCtx() and err = errCtx.HandleError(err) (mirroring hashRow), ensuring you preserve the same early-return behavior on error; leave hashRow as-is (it already uses errCtx.HandleError). This makes hashDatum and hashRow consistent while keeping their behavior unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@pkg/statistics/fmsketch.go`:
- Around line 142-178: Align the error-handling style by using
StatementContext.ErrCtx() consistently: in hashDatum replace the direct
sc.HandleError(err) call with errCtx := sc.ErrCtx() and err =
errCtx.HandleError(err) (mirroring hashRow), ensuring you preserve the same
early-return behavior on error; leave hashRow as-is (it already uses
errCtx.HandleError). This makes hashDatum and hashRow consistent while keeping
their behavior unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: cf7849dc-e1e7-4d62-8082-1e615f6965cd
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (5)
DEPS.bzlcmd/tidb-server/BUILD.bazelgo.modpkg/executor/analyze_col_sampling.gopkg/statistics/fmsketch.go
0xPoe
left a comment
There was a problem hiding this comment.
🔢 Self-check (PR reviewed by myself and ready for feedback)
-
Code compiles successfully
-
Unit tests added
-
No AI-generated elegant nonsense in PR.
-
Comments added where necessary
-
PR title and description updated
-
Documentation PR created (or confirmed not needed)
-
PR size is reasonable
|
/cc bb7133 |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: bb7133, time-and-fate The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[LGTM Timeline notifier]Timeline:
|
|
Thanks for your review! 💚 💙 💜 💛 ❤️ |
What problem does this PR solve?
Issue Number: ref #67449
Problem Summary:
Split from #68157
Splitting prep work off from the larger analyze NDV-rate change so review can focus on each piece. Three independent commits, no behavior change:
github.com/pingcap/tipbto pick up the merged singleton-sketch fields (proto: add NDV rate to analyze request tipb#410).l→totalLenin analyze sampling for consistency with the row sampler, dropping a redundant redeclaration.What changed and how does it work?
Pure dep bump + refactors.
Check List
Tests
Side effects
Documentation
Release note
Summary by CodeRabbit
Chores
Refactor
Tests