resource_control: add PredictedReadBytes hint for RC paging pre-charge#10599
resource_control: add PredictedReadBytes hint for RC paging pre-charge#10599YuhaoZhang00 wants to merge 6 commits intotikv:masterfrom
Conversation
Add PagingSizeBytes() to RequestInfo interface. When a read request carries a byte budget (from RC paging), BeforeKVRequest pre-charges the estimated read RU in Phase 1, and AfterKVRequest subtracts it in Phase 2 to maintain correct total cost. This makes concurrent workers throttle at Phase 1 instead of all hitting Phase 2 simultaneously. Signed-off-by: JmPotato <github@ipotato.me>
Test that BeforeKVRequest pre-charges pagingSizeBytes * ReadBytesCost, AfterKVRequest subtracts it, and the net total equals baseCost + actualCost. Signed-off-by: JmPotato <github@ipotato.me>
…ment When paging byte budget (pagingSizeBytes) is enabled, BeforeKVRequest pre-charges ReadBytesCost * pagingSizeBytes RRU into the token limiter. AfterKVRequest then computes the actual cost and subtracts the pre-charge. If the actual cost is less than the pre-charge (common case, since pagingSizeBytes is an upper bound), the settlement delta is negative. Previously, the negative delta was correctly recorded in the consumption counter but silently dropped by the token limiter due to a `v > 0` guard — causing permanent token leakage proportional to (pagingSizeBytes - actualReadBytes) per request. Fix: add Limiter.RefundTokens (inverse of RemoveTokens) and call it from both onResponseImpl and onResponseWaitImpl when the settlement delta is negative. This ensures the limiter's available token balance accurately reflects actual consumption after each request completes. RefundTokens design notes: - No burst cap applied (consistent with Reconfigure; getTokens handles lazy capping on the next limiter operation). - No maybeNotify call (refunding moves balance away from the low-token threshold, never toward it). Signed-off-by: JmPotato <github@ipotato.me>
Introduce an optional predictedReadBytesProvider interface on RequestInfo. When a caller (e.g. TiDB maintaining a per-logical-scan EMA across paging RPCs) supplies a non-zero PredictedReadBytes, BeforeKVRequest/AfterKVRequest use that value as the byte basis for the paging pre-charge instead of PagingSizeBytes. PagingSizeBytes remains the fallback and worst-case cap. This lets TiDB replace the current fixed 4 MiB pre-charge (which matches the paging byte budget but typically overshoots actual scanned bytes and stalls concurrent workers at Phase 1) with a learned estimate, without changing kvproto or TiKV behavior. The hint is added as an optional interface (not a method on RequestInfo) so existing RequestInfo implementations compile unchanged; they continue to fall back to PagingSizeBytes. Ref: per-logical-scan EMA pre-deduction design Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
|
Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Hi @YuhaoZhang00. Thanks for your PR. I'm waiting for a tikv member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #10599 +/- ##
==========================================
+ Coverage 78.88% 79.01% +0.13%
==========================================
Files 530 532 +2
Lines 71548 71992 +444
==========================================
+ Hits 56439 56888 +449
+ Misses 11092 11085 -7
- Partials 4017 4019 +2
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
Adds four Prometheus metrics under namespace resource_manager_client to
make the PredictedReadBytes-based paging pre-charge fix directly
observable end-to-end:
- paging_precharge_source_total{source=predicted|fallback}
- paging_precharge_bytes_total{source}
- paging_actual_bytes_total{source}
- paging_prediction_residual_bytes (histogram, predicted path only)
They are wired into onRequestWaitImpl / onResponseImpl so no change to
the ResourceCalculator interface is needed. Labels are cached per group
to keep the hot path allocation-free.
Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
|
Added 4 observability metrics on the PD client side for upcoming e2e validation of the paging pre-charge path:
No public interface change; instrumentation is at the |
The paging actual-bytes counter was only updated from onResponseImpl, but most responses under RC throttling flow through onResponseWaitImpl, so the counter stayed at 0 in practice and the over-charge ratio metric could not be computed. Mirror the observePagingActual call into the wait path. Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
|
[FORMAT CHECKER NOTIFICATION] Notice: To remove the 📖 For more info, you can check the "Linking issues" section in the CONTRIBUTING.md. |
Summary
Add an optional
predictedReadBytesProviderinterface onRequestInfo.When a caller (e.g. TiDB maintaining a per-logical-scan EMA across paging
cop RPCs) supplies a non-zero
PredictedReadBytes,BeforeKVRequestandAfterKVRequestuse that value as the byte basis for the paging pre-chargeinstead of
PagingSizeBytes.PagingSizeBytesremains the fallback andworst-case cap.
Existing
RequestInfoimplementations compile unchanged — the hint is anoptional interface, not a method on
RequestInfo.Why
The fixed 4 MiB paging pre-charge introduced in #10548 typically overshoots
actual scanned bytes and stalls concurrent workers at Phase 1 under tight
resource tiers. A learned per-scan estimate from the TiDB side eliminates
the over-estimation without changing kvproto or TiKV behavior.
Status
Draft. Part of a stacked change:
tikv/client-go: addstikvrpc.Request.PredictedReadBytes+RequestInfogetterpingcap/tidb: maintains per-logical-scan TAEMA oncopTask, feeds prediction on every paging RPCNot yet ready for review; e2e validation on the simulation cluster is pending.
Test plan
client/resource_group/controller/cover hint-present and hint-absent paths, plus an excess-pre-charge refund case