ruv2: compute TiKV RU v2 in client-go#1905
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds TiKV RU v2 config/defaults, a function to compute and accumulate TiKV RU v2 from ExecDetailsV2 into RUDetails via context, propagates request context through cop-stream responses, and extends RUDetails to store and expose TiKV RU v2. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Internal
participant TiKVRPC
participant Config
participant RU as RUDetails(ctx)
Client->>Internal: sendRequest(ctx)
Internal->>TiKVRPC: perform RPC / stream (attach ctx)
TiKVRPC-->>Internal: CopStreamResponse (ExecDetailsV2, Ctx)
Internal->>Config: UpdateTiKVRUV2FromExecDetailsV2(Ctx, ExecDetailsV2, writeRPCCount)
Config->>RU: compute weighted delta and call AddTiKVRUV2(delta)
Internal-->>Client: return response
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
go.mod (1)
1-67:⚠️ Potential issue | 🟡 MinorRun
go mod tidyto fix the pipeline failure.The integration test pipeline is failing because the go.mod file is out of sync. Run
go mod tidyto update the module dependencies.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@go.mod` around lines 1 - 67, The go.mod is out of sync causing CI failures; run `go mod tidy` in the repository root to update go.mod and generate the correct go.sum, then stage and commit the updated go.mod and go.sum so the pipeline passes; locate the module declaration "module github.com/tikv/client-go/v2" in go.mod to ensure you run tidy in the correct module and verify the require blocks (e.g., entries like "github.com/pingcap/errors" and "google.golang.org/grpc") are updated before pushing.
🧹 Nitpick comments (1)
config/client.go (1)
126-156: Consider documenting the source of the calibration weights.The
RUV2TiKVConfigstruct contains precise floating-point constants (e.g.,RUScale: 5697.054498). Adding a brief comment or link explaining how these weights were derived would help future maintainers understand their purpose.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@config/client.go` around lines 126 - 156, The DefaultRUV2TiKVConfig contains hard-coded calibration constants (e.g., RUScale in RUV2TiKVConfig and the various TiKV* weight fields) with no provenance; add a concise comment above the RUV2TiKVConfig type or above DefaultRUV2TiKVConfig that documents the source and method used to derive these constants (e.g., link to the calibration script/issue/benchmark, date, and author or brief description of the calibration procedure) so future maintainers can trace or re-run the calibration.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@config/client.go`:
- Around line 155-157: Remove the extraneous trailing closing brace '}' at the
end of the file that is not matching any open block (it follows the file's final
function/struct closure); delete that extra brace so the braces are balanced,
verify the containing function/struct (the final declaration in the file) closes
only once, and run the compiler/formatter to confirm the syntax error is
resolved.
---
Outside diff comments:
In `@go.mod`:
- Around line 1-67: The go.mod is out of sync causing CI failures; run `go mod
tidy` in the repository root to update go.mod and generate the correct go.sum,
then stage and commit the updated go.mod and go.sum so the pipeline passes;
locate the module declaration "module github.com/tikv/client-go/v2" in go.mod to
ensure you run tidy in the correct module and verify the require blocks (e.g.,
entries like "github.com/pingcap/errors" and "google.golang.org/grpc") are
updated before pushing.
---
Nitpick comments:
In `@config/client.go`:
- Around line 126-156: The DefaultRUV2TiKVConfig contains hard-coded calibration
constants (e.g., RUScale in RUV2TiKVConfig and the various TiKV* weight fields)
with no provenance; add a concise comment above the RUV2TiKVConfig type or above
DefaultRUV2TiKVConfig that documents the source and method used to derive these
constants (e.g., link to the calibration script/issue/benchmark, date, and
author or brief description of the calibration procedure) so future maintainers
can trace or re-run the calibration.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d26cca6b-c969-4181-823c-1c904006469a
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (7)
config/client.goconfig/ruv2.goconfig/ruv2_test.gogo.modinternal/client/client.gotikvrpc/tikvrpc.goutil/execdetails.go
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
config/client.go (1)
126-156: Validate the RU v2 coefficients inTiKVClient.Valid().These fields are now user-configurable, but nothing rejects negative or non-finite values. A bad config here can make TiKV RU v2 accounting go backwards or become nonsensical. Please add basic finite/non-negative checks alongside the existing client config validation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@config/client.go` around lines 126 - 156, TiKV RU v2 coefficients in RUV2TiKVConfig (fields RUScale, TiKVKVEngineCacheMiss, ResourceManagerWriteCntTiKV, ExecutorInputs, TiKVCoprocessorExecutorIterations, TiKVCoprocessorResponseBytes, TiKVRaftstoreStoreWriteTriggerWB, TiKVStorageProcessedKeysBatchGet, TiKVStorageProcessedKeysGet) are user-configurable but currently not validated; update TiKVClient.Valid() to reject any of these being non-finite (NaN or ±Inf) or negative by checking each field with math.IsNaN/math.IsInf and value < 0 and return a descriptive error when a check fails (mirror how other client validations return errors), ensuring DefaultRUV2TiKVConfig remains unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@config/ruv2.go`:
- Around line 35-41: The code overwrites the entire default RUV2 weights with
cfg.TiKVClient.RUV2, causing unspecified fields to become zero; instead, start
from DefaultRUV2TiKVConfig(), then overlay only the fields that are explicitly
set in cfg.TiKVClient.RUV2 (or use a nullable representation) so unset fields
keep their defaults; update the logic around weights (the variable named weights
and the retrieval via GetGlobalConfig()/TiKVClient.RUV2) to perform a
field-by-field merge (and keep the existing RUScale fallback) rather than
assigning the struct wholesale.
In `@internal/client/client.go`:
- Around line 344-349: The current logic for writeRPCCount skips requests where
resourcecontrol.MakeRequestInfo(req) marks them as bypassed, which causes
ResourceManagerWriteCntTiKV to undercount internal/high-priority TiKV writes;
change the branch in the client handler (the block using writeRPCCount, req,
StoreTp, tikvrpc.TiKV, and IsDebugReq()) to determine writeRPCCount directly
from the request's type (use the request-level write indicator instead of
reqInfo.Bypass()), i.e., set writeRPCCount = 1 when the TiKV request itself is a
write (based on the request type/IsWrite-equivalent) and not a debug request,
and remove the dependency on resourcecontrol.MakeRequestInfo()/reqInfo.Bypass()
for this counter so bypassed writes are still counted toward
ResourceManagerWriteCntTiKV and ExecDetailsV2.RuV2 aggregation.
---
Nitpick comments:
In `@config/client.go`:
- Around line 126-156: TiKV RU v2 coefficients in RUV2TiKVConfig (fields
RUScale, TiKVKVEngineCacheMiss, ResourceManagerWriteCntTiKV, ExecutorInputs,
TiKVCoprocessorExecutorIterations, TiKVCoprocessorResponseBytes,
TiKVRaftstoreStoreWriteTriggerWB, TiKVStorageProcessedKeysBatchGet,
TiKVStorageProcessedKeysGet) are user-configurable but currently not validated;
update TiKVClient.Valid() to reject any of these being non-finite (NaN or ±Inf)
or negative by checking each field with math.IsNaN/math.IsInf and value < 0 and
return a descriptive error when a check fails (mirror how other client
validations return errors), ensuring DefaultRUV2TiKVConfig remains unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c9516955-248c-4678-94c3-c2308c7aff10
📒 Files selected for processing (6)
config/client.goconfig/ruv2.goconfig/ruv2_test.gointernal/client/client.gotikvrpc/tikvrpc.goutil/execdetails.go
🚧 Files skipped from review as they are similar to previous changes (1)
- tikvrpc/tikvrpc.go
There was a problem hiding this comment.
🧹 Nitpick comments (1)
config/ruv2.go (1)
40-42: Minor: Redundant RUScale fallback after merge.The
mergeRUV2TiKVConfigfunction already preservesbase.RUScalewhenoverride.RUScaleis 0 (lines 75-77). This additional check only has an effect ifDefaultRUV2TiKVConfig()itself returnsRUScale == 0, which would indicate a bug in the defaults rather than something to silently fix here.Consider removing these lines for clarity, or keeping them as explicit documentation of the invariant.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@config/ruv2.go` around lines 40 - 42, The snippet redundantly resets weights.RUScale to defaultWeights.RUScale when zero; since mergeRUV2TiKVConfig already preserves base.RUScale when override.RUScale == 0 and DefaultRUV2TiKVConfig should supply a non‑zero default, remove the redundant conditional in config/ruv2.go that checks if weights.RUScale == 0 and assigns defaultWeights.RUScale (leave mergeRUV2TiKVConfig and DefaultRUV2TiKVConfig unchanged); if you prefer to keep an explicit invariant, replace the code with a short comment referencing mergeRUV2TiKVConfig instead of performing the assignment.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@config/ruv2.go`:
- Around line 40-42: The snippet redundantly resets weights.RUScale to
defaultWeights.RUScale when zero; since mergeRUV2TiKVConfig already preserves
base.RUScale when override.RUScale == 0 and DefaultRUV2TiKVConfig should supply
a non‑zero default, remove the redundant conditional in config/ruv2.go that
checks if weights.RUScale == 0 and assigns defaultWeights.RUScale (leave
mergeRUV2TiKVConfig and DefaultRUV2TiKVConfig unchanged); if you prefer to keep
an explicit invariant, replace the code with a short comment referencing
mergeRUV2TiKVConfig instead of performing the assignment.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 473c8156-e22e-474b-85f5-7610f465a1d5
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (4)
config/ruv2.goconfig/ruv2_test.gogo.modinternal/client/client.go
🚧 Files skipped from review as they are similar to previous changes (3)
- go.mod
- internal/client/client.go
- config/ruv2_test.go
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tikvrpc/tikvrpc.go (1)
819-825:⚠️ Potential issue | 🟠 MajorConsider keeping internal context management unexported.
Ctxis internal bookkeeping written byinternal/client/client.gofor timeout management. Exporting it as a public field exposes implementation detail to the API surface. This also creates inconsistency withBatchCopStreamResponseandMPPStreamResponse, which lack this field. Use an unexported field with a getter, or manage context lifecycle in the client layer only.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tikvrpc/tikvrpc.go` around lines 819 - 825, The CopStreamResponse struct currently exposes an exported Ctx field; make it unexported (rename to ctx) to keep internal timeout/context bookkeeping private (used by internal/client/client.go) and align with BatchCopStreamResponse and MPPStreamResponse; if external code needs read access, add an unexported field plus a small exported getter method (e.g., Context() context.Context) or keep context lifecycle entirely inside the client layer and remove any external usages of Ctx to avoid changing the public API surface.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@config/ruv2.go`:
- Around line 71-103: mergeRUV2TiKVConfig currently treats 0 as "unset" so
callers cannot explicitly set coefficients to zero; convert RUV2TiKVConfig's
numeric fields to pointer types (e.g. *float64) or add explicit "is set" boolean
flags for each field, then update mergeRUV2TiKVConfig to check for non-nil (or
the corresponding "set" flag) and assign the override value (dereferencing
pointers when present); also ensure the config decoder/JSON/YAML unmarshalling
preserves nil vs set semantics so intentional zero values are retained.
- Around line 26-33: The code reads *util.RUDetails from context in
UpdateTiKVRUV2FromExecDetailsV2 but nothing in production injects it
(NewRUDetails and context.WithValue are only used in tests), so RU v2 accounting
never runs; fix by ensuring a production code path populates the context with a
*util.RUDetails under util.RUDetailsCtxKey (for example create/instantiate
util.NewRUDetails() and attach it via context.WithValue) at the earliest entry
point for TiKV RPCs—modify the client interceptor in
internal/client/client_interceptor.go (or the RPC request wrapper used by
UpdateTiKVRUV2FromExecDetailsV2 callers) to wrap incoming/outgoing calls and set
ctx = context.WithValue(ctx, util.RUDetailsCtxKey, newRU) when absent so
UpdateTiKVRUV2FromExecDetailsV2 can read and update the RUDetails in production.
In `@internal/client/client.go`:
- Around line 343-350: The writeRPCCount calculation in client.go only
increments for Request.IsTxnWriteRequest() and Request.IsRawWriteRequest(),
which misses additional TiKV write commands (e.g., CmdDeleteRange,
CmdRawBatchDelete, CmdRawDeleteRange) and thus undercounts
ResourceManagerWriteCntTiKV / RU v2 deltas; update the logic that sets
writeRPCCount (referencing the writeRPCCount variable and the req checks) to
also treat those command types as writes — either by adding explicit checks for
req.GetCmdType() values (CmdDeleteRange, CmdRawBatchDelete, CmdRawDeleteRange)
or by using a unified helper (add or use a Request.IsWriteCommand() helper) so
all TiKV write RPCs are counted when computing ResourceManagerWriteCntTiKV and
RU v2 aggregation.
---
Outside diff comments:
In `@tikvrpc/tikvrpc.go`:
- Around line 819-825: The CopStreamResponse struct currently exposes an
exported Ctx field; make it unexported (rename to ctx) to keep internal
timeout/context bookkeeping private (used by internal/client/client.go) and
align with BatchCopStreamResponse and MPPStreamResponse; if external code needs
read access, add an unexported field plus a small exported getter method (e.g.,
Context() context.Context) or keep context lifecycle entirely inside the client
layer and remove any external usages of Ctx to avoid changing the public API
surface.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b1acf3f9-be0e-4fd2-a861-29e4ee824d32
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (7)
config/client.goconfig/ruv2.goconfig/ruv2_test.gogo.modinternal/client/client.gotikvrpc/tikvrpc.goutil/execdetails.go
🚧 Files skipped from review as they are similar to previous changes (3)
- go.mod
- config/client.go
- config/ruv2_test.go
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@config/ruv2_test.go`:
- Around line 27-32: The cleanup currently skips restoring the original global
config when GetGlobalConfig() returns nil, causing StoreGlobalConfig(&cfg) in
the test to leak state; change the t.Cleanup callback so it always restores the
original value by calling StoreGlobalConfig(original) (i.e., remove the nil
check), and if StoreGlobalConfig(nil) is not supported, add or call a
ResetGlobalConfig() helper instead; reference GetGlobalConfig, StoreGlobalConfig
and t.Cleanup to locate and update the cleanup behavior accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 466874d0-6711-48de-882f-1a9159301ee8
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (3)
config/ruv2.goconfig/ruv2_test.gogo.mod
🚧 Files skipped from review as they are similar to previous changes (2)
- go.mod
- config/ruv2.go
Signed-off-by: disksing <i@disksing.com>
Signed-off-by: disksing <i@disksing.com>
Signed-off-by: disksing <i@disksing.com>
Signed-off-by: disksing <i@disksing.com>
Signed-off-by: disksing <i@disksing.com>
Signed-off-by: disksing <i@disksing.com>
Signed-off-by: disksing <i@disksing.com>
Signed-off-by: disksing <i@disksing.com>
Signed-off-by: disksing <i@disksing.com>
| func DefaultRUV2TiKVConfig() RUV2TiKVConfig { | ||
| return RUV2TiKVConfig{ | ||
| RUScale: 5697.054498, | ||
|
|
||
| TiKVKVEngineCacheMiss: 0.45975389, | ||
| ResourceManagerWriteCntTiKV: 0.09642181, | ||
| ExecutorInputs: 0.00003150, | ||
| TiKVCoprocessorExecutorIterations: 0.05775369, | ||
| TiKVCoprocessorResponseBytes: 0.00000087, | ||
| TiKVRaftstoreStoreWriteTriggerWB: 0.00006100, | ||
| TiKVStorageProcessedKeysBatchGet: 0.00266791, | ||
| TiKVStorageProcessedKeysGet: 0.01416829, | ||
| } |
There was a problem hiding this comment.
Do we have any comments on how these values were derived?
There was a problem hiding this comment.
added some comment.
In tikvrpc/tikvrpc.go: The current change: go go
Are there any other request types that should contribute to this count (e.g., certain atomic operations)? Review by Gemini |
1, 2 : BatchCopStreamResponse is for tiflash, no need to handle it. |
Signed-off-by: disksing <i@disksing.com>
ekexium
left a comment
There was a problem hiding this comment.
Should also count RU in client_async
Signed-off-by: disksing <i@disksing.com>
|
@ekexium fixed client async |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ekexium, nolouch The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Add support for computing TiKV RU v2 values from ExecDetailsV2.RuV2.
What's Changed:
Signed-off-by: disksing i@disksing.com
Summary by CodeRabbit
New Features
Improvements
Tests
Chores