
doc: rfc for configurable kv timeout #45093

Merged
merged 7 commits into pingcap:master on Jul 10, 2023

Conversation

@cfzjywxk (Contributor) commented Jun 30, 2023

What problem does this PR solve?

Issue Number: ref #44771

Problem Summary:
Add the detailed design document for the configurable kv timeout proposal.

What is changed and how it works?

Check List

Tests

Side effects

Documentation

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

Signed-off-by: cfzjywxk <lsswxrxr@163.com>
@ti-chi-bot added the do-not-merge/needs-linked-issue, release-note-none, and size/L (denotes a PR that changes 100-499 lines, ignoring generated files) labels on Jun 30, 2023
tiprow bot commented Jun 30, 2023

Hi @cfzjywxk. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo, meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Comment on lines 102 to 119
- Change the `kvproto` to pass the read timeout value for the [`GetRequest`](https://github.com/pingcap/kvproto/blob/master/proto/kvrpcpb.proto#L26)
```protobuf
message GetRequest {
    Context context = 1;
    bytes key = 2;
    uint64 version = 3;
    uint32 read_timeout = 4; // Add this new field.
}
```
- Change the `kvproto` to pass the read timeout value for the [`BatchGetRequest`](https://github.com/pingcap/kvproto/blob/master/proto/kvrpcpb.proto#L414)
```protobuf
message BatchGetRequest {
    Context context = 1;
    repeated bytes keys = 2;
    uint64 version = 3;
    uint32 read_timeout = 4; // Add this new field.
}
```
Contributor

Currently `kvproto` already has a field `max_execution_duration_ms` in `Context` (https://github.com/pingcap/kvproto/blob/master/proto/kvrpcpb.proto#L820), so there is no need to add a new timeout field?

Contributor Author

Yes, we could reuse the field in the `Context` that is currently used by write requests. Thanks for the reminder.
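
To make the reuse concrete, here is a minimal sketch of a client building a `GetRequest` that carries its read timeout in the existing `Context` field instead of a new `read_timeout` field. It assumes kvproto's generated rust-protobuf accessors (`set_key`, `mut_context`, `set_max_execution_duration_ms`); the helper itself is hypothetical, not code from this PR:

```rust
use kvproto::kvrpcpb::GetRequest;

// Sketch only: reuse Context.max_execution_duration_ms to carry the read
// timeout for a point get, rather than adding a new GetRequest field.
fn build_get_request(key: Vec<u8>, version: u64, read_timeout_ms: u64) -> GetRequest {
    let mut req = GetRequest::default();
    req.set_key(key);
    req.set_version(version);
    // The field was introduced for write requests; reads reuse it here.
    req.mut_context().set_max_execution_duration_ms(read_timeout_ms);
    req
}
```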

Comment on lines 173 to 197
- Add the `kv_read_timeout` field in the [`coprocessor.Request`](https://github.com/pingcap/kvproto/blob/master/proto/coprocessor.proto#L24).
This change needs to be done in the `kvproto` repository.
- Use the `kv_read_timeout` value passed in to calculate the `deadline` result in `parse_and_handle_unary_request`:
```rust
fn parse_request_and_check_memory_locks(
    &self,
    mut req: coppb::Request,
    peer: Option<String>,
    is_streaming: bool,
) -> Result<(RequestHandlerBuilder<E::Snap>, ReqContext)> {
    ...
    req_ctx = ReqContext::new(
        tag,
        context,
        ranges,
        self.max_handle_duration, // Here use the specified timeout value.
        peer,
        Some(is_desc_scan),
        start_ts.into(),
        cache_match_version,
        self.perf_level,
    );
}
```
This change needs to be done in the `tikv` repository.
Contributor

ditto?

```SQL
set @@tidb_read_staleness=-5;
# The unit is milliseconds. The session variable usage.
set @@tidb_tikv_tidb_timeout=500;
```
Contributor

Suggested change
set @@tidb_tikv_tidb_timeout=500;
set @@tidb_kv_read_timeout=500;

Signed-off-by: cfzjywxk <lsswxrxr@163.com>
- Setting the variable `tidb_kv_read_timeout` may not be easy if it affects the timeout for all
TiKV read requests, such as Get, BatchGet, and Cop, in this session. A timeout of 1 second may be sufficient for Get requests,
but too small for Cop requests; some large Cop requests may keep timing out and never be processed properly.
- If the value of the variable `tidb_kv_read_timeout` is set too small, more retries will occur,
@ekexium (Contributor) commented Jul 4, 2023

We can solve this by letting this timeout setting take effect only once, because it is unlikely that multiple nodes are affected by jitter at the same time. Unlimited retries can also lead to congestion, increasing the latency of all requests and making the situation worse.

Contributor Author

Good idea. Alternatively, the maximum retry count could be set to the same value as the max-replicas configuration, which is the number of available replicas that can handle stale read or follower read requests.

Contributor Author

As described in the Timeout Retry part, the current strategy is to let the timeout retry take effect at most `available replicas - 1` times.
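
To illustrate that retry budget, here is a standalone sketch; the helper and its signature are hypothetical, not client-go or TiDB code. The short, user-configured timeout applies to at most `available replicas - 1` attempts, and the final attempt falls back to the default timeout so cluster-wide jitter cannot make a request fail indefinitely:

```rust
use std::time::Duration;

// Hypothetical sketch of the bounded timeout retry discussed above.
fn read_with_timeout_retry<R, E>(
    available_replicas: usize,
    short_timeout: Duration,
    default_timeout: Duration,
    mut send: impl FnMut(Duration) -> Result<R, E>, // one attempt per replica
) -> Result<R, E> {
    // The short timeout takes effect on at most `available_replicas - 1`
    // attempts; each failure falls through to the next replica.
    for _ in 0..available_replicas.saturating_sub(1) {
        if let Ok(resp) = send(short_timeout) {
            return Ok(resp);
        }
    }
    // Final attempt: fall back to the default (longer) timeout so the
    // request is not starved when every replica is slow at once.
    send(default_timeout)
}
```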

value of `ReadTimeoutShort` and `ReadTimeoutMedium`.
- Adding a statement-level hint like `SELECT /*+ tidb_kv_read_timeout(500ms) */ * FROM t where id = ?;` to
set the timeout value of the KV requests of this single query to a certain value.

Contributor

We can briefly describe the benefit of a configurable timeout at the end of the Motivation section.

Contributor Author

Added in the Motivation & Background section.

- Support timeout checking during request handling in TiKV. When new point-get and batch point-get
requests are created, the `kv_read_timeout` value should be read from the `GetRequest` and `BatchGetRequest`
and passed to the `pointGetter`. But since there is currently no canceling mechanism once a task is scheduled to the
read pool in TiKV, a simpler way is to check the deadline duration before the next processing step and try to return
Contributor

Shall we distinguish between different DeadlineExceeded errors? If it's caused by `kv_read_timeout`, it's supposed to be retried. What happens if it is caused by other deadline settings?

Contributor

There don't seem to be any other deadline settings in TiDB.

Contributor

TiKV coprocessor requests set their deadlines from the config item `end_point_request_max_handle_duration`.

Contributor Author

It would be clearer to distinguish them, as `kv_read_timeout` is supposed to affect only read requests.
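
A minimal standalone sketch of what distinguishing the two deadlines could look like; the types are illustrative, not TiKV's actual `Deadline` implementation. The check runs cooperatively between processing steps, since a task already scheduled to the read pool cannot be canceled in flight, and the error records which setting produced it so only `KvReadTimeout` triggers a retry:

```rust
use std::time::{Duration, Instant};

// Which deadline fired: the user-configured kv_read_timeout is retryable
// on another replica; the server-side max-handle-duration is not.
#[derive(Debug)]
enum DeadlineExceeded {
    KvReadTimeout,
    MaxHandleDuration,
}

struct Deadline {
    deadline: Instant,
    from_kv_read_timeout: bool,
}

impl Deadline {
    fn new(timeout: Duration, from_kv_read_timeout: bool) -> Self {
        Deadline {
            deadline: Instant::now() + timeout,
            from_kv_read_timeout,
        }
    }

    // Called between processing steps; a task already running in the read
    // pool cannot be interrupted, so this is a cooperative check.
    fn check(&self) -> Result<(), DeadlineExceeded> {
        if Instant::now() < self.deadline {
            Ok(())
        } else if self.from_kv_read_timeout {
            Err(DeadlineExceeded::KvReadTimeout)
        } else {
            Err(DeadlineExceeded::MaxHandleDuration)
        }
    }
}
```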

several reasons:
- When the read task is polled and executed, there is currently no timeout-checking mechanism in the task
scheduler.
- The read operations are synchronous, so if the read task is blocked by slow I/O, the task could not be
@ekexium (Contributor) commented Jul 4, 2023

How about letting client-go send a second request when the timeout is exceeded on the client side? It could cover more cases, such as network delay.

Contributor Author

Yes, that may be the first step in the initial implementation. It would be better if requests could be canceled promptly, top-down, but that is not easy in TiKV for the reasons listed here.

Requests with configurable timeout values take effect only on newer versions. In theory, these fields are not expected to take effect when downgrading the
cluster.

## More comprehensive solution
@ekexium (Contributor) commented Jul 4, 2023

When we support canceling a request from outside commands, we can implement a real hedge policy. For example (a rough sketch follows the list):

  • For requests taking more than a threshold of time, send a hedge request to another replica. Either response can be taken as the final result.
  • When either response is returned, cancel the other request.
  • TiKV prioritizes non-hedge requests so the hedge policy won't increase latencies.
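
A minimal tokio-based sketch of that hedge policy; the `read_replica` function is hypothetical, not client-go's or TiKV's actual API. The hedge request is sent only after a threshold elapses, the first response wins, and the losing future is dropped on the client side; true server-side cancellation would still need the out-of-band cancel command discussed above:

```rust
use std::time::Duration;

// Sketch: hedge the read after `hedge_after` and take whichever response
// arrives first. Dropping the losing future cancels it client-side only.
async fn hedged_read(hedge_after: Duration) -> Vec<u8> {
    let primary = read_replica(0);
    tokio::pin!(primary);

    tokio::select! {
        // Fast path: the primary replica answers within the threshold.
        resp = &mut primary => resp,
        _ = tokio::time::sleep(hedge_after) => {
            // Primary is slow: race it against a hedge request to another
            // replica; the first response is the final result.
            tokio::select! {
                resp = &mut primary => resp,
                resp = read_replica(1) => resp,
            }
        }
    }
}

// Stand-in for a real replica read; illustrative only.
async fn read_replica(id: usize) -> Vec<u8> {
    format!("replica-{id}").into_bytes()
}
```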

Contributor Author

This could be realized once real in-time canceling is supported, as described above.

@crazycs520 (Contributor) left a comment

REST LGTM

Co-authored-by: crazycs <crazycs520@gmail.com>
ti-chi-bot commented Jul 6, 2023

@cfzjywxk: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-br-integration-test | e522baa | link | true | `/test pull-br-integration-test` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.


@zhangjinpeng87 (Contributor) left a comment

Please organize the RFC as:

  1. Motivation/Background
  2. The usage of proposed solution, detailed design
  3. Other alternatives we have considered, list their pros and cons

Signed-off-by: cfzjywxk <lsswxrxr@163.com>
# Proposal: Configurable KV Timeout

* Authors: [cfzjywxk](https://github.com/cfzjywxk)
* Tracking issue: TODO
Contributor

We can file a tracking issue now.

@ti-chi-bot added the lgtm label on Jul 10, 2023
ti-chi-bot commented Jul 10, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: crazycs520, ekexium

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ti-chi-bot commented Jul 10, 2023

[LGTM Timeline notifier]

Timeline:

  • 2023-07-04 08:07:57.165512981 +0000 UTC m=+104909.099146404: ☑️ agreed by crazycs520.
  • 2023-07-10 02:26:48.12626193 +0000 UTC m=+298899.896600643: ☑️ agreed by ekexium.

@ti-chi-bot merged commit b040671 into pingcap:master on Jul 10, 2023
3 of 4 checks passed
Labels: approved, component/docs, lgtm, release-note-none, sig/transaction, size/L