sql/kv: bounded staleness reads #67562
Labels:
- A-kv-transactions: Relating to MVCC and the transactional model.
- A-multiregion: Related to multi-region.
- A-sql-execution: Relating to SQL execution.
- A-sql-optimizer: SQL logical planning and optimizations.
- C-enhancement: Solution expected to add code/behavior and preserve backward compatibility (pg compat issues are the exception).
- T-multiregion
- T-sql-queries: SQL Queries Team.
Comments
nvanbenschoten added the C-enhancement, A-sql-optimizer, A-kv-transactions, A-sql-execution, and A-multiregion labels on Jul 13, 2021.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue on Jul 19, 2021:
Closes cockroachdb#67549. Touches cockroachdb#67562.

This commit introduces a new QueryResolvedTimestampRequest type, which is the first step towards implementing bounded staleness reads. This new request type requests the resolved timestamp of the key span it is issued over. The resolved timestamp of a key span is defined as the minimum of all closed timestamps across the key span (there can be multiple if the key span touches multiple ranges) along with the timestamp immediately preceding each intent in the key span.

Because the closed timestamp increases monotonically on all ranges and also blocks the creation of new intents at timestamps below it, the resolved timestamp over a given key span also increases monotonically. However, within a given range, the closed timestamp and the set of intents are both properties of the specific replica consulted. This means that two replicas in the same range may report different resolved timestamps at the same point in time, depending on how far they have each caught up on their range's Raft log. As a result, the resolved timestamp is only guaranteed to increase monotonically if the same replica or set of replicas are consulted each time.

The expectation is that a CONSISTENT read at or below a key span's resolved timestamp will never block on replication or on conflicting transactions. For this to be guaranteed, the read must be issued to the same replica or set of replicas (for multi-range reads) that were consulted when computing the key span's resolved timestamp.

The resolved timestamp of a key span is a sibling concept to the resolved timestamp of a rangefeed, which is defined in pkg/kv/kvserver/rangefeed/resolved_timestamp.go. Whereas the resolved timestamp of a rangefeed refers to a timestamp below which no future updates will be published on the rangefeed, the resolved timestamp of a key span refers to a timestamp below which no future state modifications that could change the result of read requests will be made. Both concepts rely on some notion of immutability, but the former imparts this property on a stream of events while the latter imparts this property on materialized state.
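The per-replica computation described above (the minimum of the replica's closed timestamp and the timestamp immediately preceding each overlapping intent) can be sketched in Go. The `Timestamp`, `Prev`, and `resolvedTimestamp` names here are simplified illustrations, not CockroachDB's actual `hlc.Timestamp` API:

```go
package main

import "fmt"

// Timestamp is a simplified stand-in for an HLC timestamp.
type Timestamp struct {
	WallTime int64
	Logical  int32
}

func (t Timestamp) Less(o Timestamp) bool {
	return t.WallTime < o.WallTime || (t.WallTime == o.WallTime && t.Logical < o.Logical)
}

// Prev returns a timestamp immediately preceding t (simplified: decrement
// the logical component when possible, else step the wall time back).
func (t Timestamp) Prev() Timestamp {
	if t.Logical > 0 {
		return Timestamp{WallTime: t.WallTime, Logical: t.Logical - 1}
	}
	return Timestamp{WallTime: t.WallTime - 1}
}

// resolvedTimestamp computes the resolved timestamp of a key span on a single
// replica: the minimum of the replica's closed timestamp and the timestamp
// immediately preceding each intent that overlaps the span.
func resolvedTimestamp(closed Timestamp, intents []Timestamp) Timestamp {
	resolved := closed
	for _, it := range intents {
		if prev := it.Prev(); prev.Less(resolved) {
			resolved = prev
		}
	}
	return resolved
}

func main() {
	closed := Timestamp{WallTime: 100}
	intents := []Timestamp{{WallTime: 90}, {WallTime: 120}}
	// The intent at time 90 holds the resolved timestamp back below it;
	// the intent at 120 is already above the closed timestamp and has no effect.
	fmt.Println(resolvedTimestamp(closed, intents))
}
```

Note how an intent above the closed timestamp is irrelevant: the closed timestamp is already the binding minimum there.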
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue on Jul 23, 2021:
Closes cockroachdb#67549. Touches cockroachdb#67562.

This commit introduces a new QueryResolvedTimestampRequest type, which is the first step towards implementing bounded staleness reads. This new request type requests a resolved timestamp for the key span it is issued over. A resolved timestamp for a key span is a timestamp at or below which all future reads within the span are guaranteed to produce the same results, i.e. at which MVCC history has become immutable.

The most up-to-date such bound can be computed for a key span contained in a single range by taking the minimum of the leaseholder's closed timestamp and the timestamp preceding the earliest intent present on the range that overlaps with the key span of interest. This optimum timestamp is nondecreasing over time, since the closed timestamp will not regress and since it also prevents intents at lower timestamps from being created. Follower replicas can also provide a resolved timestamp, though it may not be the most recent one due to replication delay. However, a given follower replica will similarly produce a nondecreasing sequence of resolved timestamps.

QueryResolvedTimestampRequest returns a resolved timestamp for the input key span by returning the minimum across all replicas contacted in order to cover the key span. This means that repeated invocations of this operation are guaranteed to be nondecreasing only if routed to the same replicas.

A CONSISTENT read at or below a key span's resolved timestamp will never block on replication or on conflicting transactions. However, as can be inferred from the previous paragraph, for this to be guaranteed, the read must be issued to the same replica or set of replicas (for multi-range reads) that were consulted when computing the key span's resolved timestamp.

A resolved timestamp for a key span is a sibling concept to a resolved timestamp for a rangefeed, which is defined in pkg/kv/kvserver/rangefeed/resolved_timestamp.go. Whereas a resolved timestamp for a rangefeed refers to a timestamp below which no future updates will be published on the rangefeed, a resolved timestamp for a key span refers to a timestamp below which no future state modifications that could change the result of read requests will be made. Both concepts rely on some notion of immutability, but the former imparts this property on a stream of events while the latter imparts this property on materialized state.

This commit does not begin using the new QueryResolvedTimestampRequest. Its use will begin in a follow-up commit that implements the "Server-side negotiation fast-path". See the bounded staleness RFC for details.
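The span-wide result (the minimum over every replica contacted to cover the span) is a simple fold, sketched below with plain int64 nanosecond timestamps for illustration; the function name is hypothetical:

```go
package main

import "fmt"

// spanResolvedTimestamp combines the per-replica resolved timestamps gathered
// while covering a key span by taking their minimum, mirroring how a
// multi-range QueryResolvedTimestampRequest result is assembled.
func spanResolvedTimestamp(perReplica []int64) int64 {
	if len(perReplica) == 0 {
		panic("no replicas contacted")
	}
	min := perReplica[0]
	for _, ts := range perReplica[1:] {
		if ts < min {
			min = ts
		}
	}
	return min
}

func main() {
	// Three ranges cover the span; the lagging replica on the second range
	// drags the span-wide resolved timestamp down to its value.
	fmt.Println(spanResolvedTimestamp([]int64{100, 80, 95})) // 80
}
```

This min-fold is why repeated invocations are nondecreasing only when routed to the same replicas: swapping in a more lagging replica can lower one of the inputs.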
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue on Jul 26, 2021 (same commit message as above).
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue on Jul 26, 2021 (same commit message as above).
craig bot pushed a commit that referenced this issue on Jul 26, 2021:
66782: jobs: add support for SHOW CREATE SCHEDULE command r=annezhu98 a=annezhu98

Before this change, there was no command to show the SQL statements used to create scheduled jobs. This commit allows users to view create statements for scheduled jobs. There are currently two options for this command:

- `SHOW CREATE SCHEDULE <schedule_id>`: show the create statement for a scheduled job.
- `SHOW CREATE ALL SCHEDULES`: show the create statements for all scheduled jobs.

Example usage:

```
> SHOW CREATE SCHEDULE 123;
  schedule_id |                                                   create_statement
--------------+------------------------------------------------------------------------------------------------------------------------
          123 | CREATE SCHEDULE 'core_schedule_label' FOR BACKUP DATABASE defaultdb INTO 'nodelocal://1/my_backup' RECURRING '@daily'
(1 row)
```

Example execution results:
![image](https://user-images.githubusercontent.com/29808757/126368662-7fb6a5d8-a7c1-4906-be14-b4ab4679c689.png)
![image (1)](https://user-images.githubusercontent.com/29808757/126368672-f8bd1ee7-107b-4821-9f7f-2874a9ae0e55.png)
![image (2)](https://user-images.githubusercontent.com/29808757/126368676-7b0dc939-d9a4-4f41-a67e-7de2a2d7ea86.png)
![image](https://user-images.githubusercontent.com/29808757/126369040-df35ebaf-19c2-4f92-aff6-e5fd8f276986.png)

Resolves: #58372

Release note (sql change): added SHOW CREATE SCHEDULES command to view SQL statements used to create existing schedules

67725: kv: introduce QueryResolvedTimestamp request r=tbg,irfansharif a=nvanbenschoten

Closes #67549. Touches #67562. (Same QueryResolvedTimestampRequest commit message as above.)

68041: tree: OPERATOR pretty printing changes r=rafiss a=otan

See individual commits for details. Resolves #68035.

Co-authored-by: Anne Zhu <anne.zhu@cockroachlabs.com>
Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Co-authored-by: Oliver Tan <otan@cockroachlabs.com>
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue on Jul 28, 2021:
Touches cockroachdb#67562.

This commit introduces a new RoutingPolicy configuration that lives on a BatchRequest header. A request's routing policy specifies how the request should be routed to the replicas of its target range(s) by the DistSender. There are initially two routing policies:

```
enum RoutingPolicy {
  // LEASEHOLDER means that the DistSender should route the request to the
  // leaseholder replica(s) of its target range(s).
  LEASEHOLDER = 0;

  // NEAREST means that the DistSender should route the request to the
  // nearest replica(s) of its target range(s).
  NEAREST = 1;
}
```

The default policy is `LEASEHOLDER`. Routing policies allow us to stop overloading the use of the ReadConsistency enum to dictate both how the client should route a request to a server and which kinds of requests should be eligible to be served by a given replica.

Routing policies are a client-side-only concept. They do not dictate which replicas in a range are eligible to serve the request, only which replicas are considered as targets by the DistSender, and in which order. A request that is routed to an ineligible replica (a function of request type, timestamp, and read consistency) will be rejected by that replica, and the DistSender will target another replica in the range.

As discussed in cockroachdb#67725 (review), we will likely need to introduce a third routing policy called `SINGLE_REPLICA` to address cockroachdb#67554. This policy would be accompanied by a ReplicaDescriptor and would specify that a given request must be sent to that replica, and the DistSender should throw an error if the replica is not part of its cached range descriptor. This is important to ensure that a QueryResolvedTimestampRequest and its follow-up ScanRequest are both sent to the same replica.
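The target ordering a routing policy implies can be sketched in Go. This is an illustrative model, not the DistSender's actual implementation; the `Replica` fields and latency-based ordering stand in for CockroachDB's locality-aware replica sorting:

```go
package main

import (
	"fmt"
	"sort"
)

type RoutingPolicy int

const (
	LEASEHOLDER RoutingPolicy = iota
	NEAREST
)

type Replica struct {
	NodeID      int
	Latency     int // millis from the client; stand-in for locality-aware ordering
	Leaseholder bool
}

// orderReplicas returns the order in which a DistSender-like component would
// try replicas under the given routing policy. Note that ineligible replicas
// are not filtered here: a rejected attempt falls through to the next target.
func orderReplicas(replicas []Replica, policy RoutingPolicy) []Replica {
	out := append([]Replica(nil), replicas...)
	sort.SliceStable(out, func(i, j int) bool {
		if policy == LEASEHOLDER && out[i].Leaseholder != out[j].Leaseholder {
			return out[i].Leaseholder // leaseholder sorts first
		}
		return out[i].Latency < out[j].Latency // otherwise, nearest first
	})
	return out
}

func main() {
	rs := []Replica{{1, 50, false}, {2, 5, false}, {3, 20, true}}
	fmt.Println(orderReplicas(rs, LEASEHOLDER)[0].NodeID) // 3 (the leaseholder)
	fmt.Println(orderReplicas(rs, NEAREST)[0].NodeID)     // 2 (lowest latency)
}
```

The key design point from the commit message is visible here: the policy only reorders candidate targets; eligibility is decided server-side by the replica that receives the request.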
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue on Jul 28, 2021:
Touches cockroachdb#67562.

This commit adds support for a subset of non-transactional batch requests to perform follower reads. Specifically, it makes those that do not rely on their timestamp being set from the server's clock eligible. This condition is necessary because a follower with a lagging clock could otherwise assign the batch a timestamp low enough for it to qualify as a follower read, and the batch might then miss past writes served at higher timestamps on the leaseholder.
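The eligibility condition can be sketched as a predicate. The function and parameter names are hypothetical and the timestamps are plain int64s; the real check operates on batch headers and hlc timestamps:

```go
package main

import "fmt"

// canServeFollowerRead sketches the follower-read eligibility condition for a
// non-transactional batch, per the commit message above.
func canServeFollowerRead(needsServerClockTimestamp bool, batchTS, closedTS int64) bool {
	// Batches that need their timestamp assigned from the server's clock must
	// go to the leaseholder: a follower with a lagging clock could assign a
	// timestamp low enough to qualify, then miss writes already served at
	// higher timestamps on the leaseholder.
	if needsServerClockTimestamp {
		return false
	}
	// Otherwise the batch can be served if its timestamp is already closed.
	return batchTS <= closedTS
}

func main() {
	fmt.Println(canServeFollowerRead(true, 50, 100))   // false: needs server clock
	fmt.Println(canServeFollowerRead(false, 50, 100))  // true: below closed timestamp
	fmt.Println(canServeFollowerRead(false, 150, 100)) // false: not yet closed
}
```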
otan pushed a commit to otan-cockroach/cockroach that referenced this issue on Jul 29, 2021:
Fixes cockroachdb#67551. Fixes cockroachdb#67552. Fixes cockroachdb#67553. Touches cockroachdb#67562.

Bounded-staleness read orchestration consists of two phases: negotiation and execution. Negotiation determines the timestamp to run the query at in order to ensure that the read will not block on replication or on conflicting transactions. Execution then uses this timestamp to run the read request.

This commit implements the bounded staleness server-side negotiation fast-path. This fast-path allows a bounded staleness read request that lands on a single range to perform its negotiation phase and execution phase in a single RPC. The server-side negotiation fast-path provides two benefits:

1. It avoids two network hops in the common case where a bounded staleness read targets a single range. This is an important performance optimization for single-row point lookups.
2. It provides stronger guarantees around minimizing staleness during bounded staleness reads. Bounded staleness reads that hit the server-side fast-path use their target replica's most up-to-date resolved timestamp, so they are as fresh as possible. Bounded staleness reads that miss the fast-path and perform explicit negotiation consult a cache, so they may use an out-of-date, suboptimal resolved timestamp, as long as it is fresh enough to satisfy the staleness bound of the request.

The commit then uses this new functionality to implement the `(*Txn).NegotiateAndSend` method detailed in the bounded staleness RFC. `NegotiateAndSend` is a specialized version of `Send` that is capable of orchestrating a bounded-staleness read through a transaction, given a read-only BatchRequest with a `min_timestamp_bound` set in its Header. If the method returns successfully, the transaction will have been given a fixed timestamp equal to the timestamp that the read-only request was evaluated at.
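The core of the negotiation phase can be sketched as: evaluate at the replica's resolved timestamp, provided it satisfies the request's staleness bound. The `negotiate` function and its error are illustrative, not the real API; in particular, the real fast-path has more nuanced handling when the bound cannot be met:

```go
package main

import "fmt"

// negotiate sketches the server-side negotiation decision for a single-range
// bounded staleness read: pick the freshest immutable timestamp available on
// this replica, as long as it is at or above min_timestamp_bound.
// (Simplified int64 timestamps.)
func negotiate(minTimestampBound, resolved int64) (int64, error) {
	if resolved < minTimestampBound {
		return 0, fmt.Errorf("min timestamp bound %d unsatisfiable: resolved timestamp only %d",
			minTimestampBound, resolved)
	}
	// Use the replica's most up-to-date resolved timestamp, minimizing staleness.
	return resolved, nil
}

func main() {
	ts, err := negotiate(90, 100)
	fmt.Println(ts, err) // 100 <nil>: read runs at the resolved timestamp
	_, err = negotiate(110, 100)
	fmt.Println(err != nil) // true: the bound cannot be satisfied
}
```

Picking the resolved timestamp (rather than the minimum bound) is what gives fast-path reads the "as fresh as possible" property described in benefit 2 above.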
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue on Aug 2, 2021 (same commit message as the follower-reads commit above).
craig bot pushed a commit that referenced this issue on Aug 2, 2021:

68192: kv: permit some non-transactional batches to perform follower reads r=nvanbenschoten a=nvanbenschoten. Touches #67562. (Same commit message as the follower-reads commit above.) Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue on Aug 2, 2021 (same commit message as the RoutingPolicy commit above).
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue on Aug 2, 2021 (same commit message as the negotiation fast-path commit above).
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue on Aug 5, 2021 (same commit message as the RoutingPolicy commit above).
craig bot pushed a commit that referenced this issue on Aug 5, 2021:

68191: kv: introduce request RoutingPolicy configuration r=nvanbenschoten a=nvanbenschoten. Half of #67551. Touches #67562. (Same commit message as the RoutingPolicy commit above.) Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
sajjadrizvi pushed a commit to sajjadrizvi/cockroach that referenced this issue on Aug 10, 2021 (same commit message as the RoutingPolicy commit above).
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue on Aug 10, 2021 (same commit message as the negotiation fast-path commit above).
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue on Aug 16, 2021
craig bot pushed a commit that referenced this issue on Aug 17, 2021
68194: kv: implement server-side negotiation fast-path and Txn.NegotiateAndSend r=nvanbenschoten a=nvanbenschoten
Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue on Oct 5, 2021
roachtest: add bounded staleness `insufficient-quorum` variants to `follower-reads` suite

Related to cockroachdb#67562. This commit adds the following two roachtest variants:

- `follower-reads/survival=zone/locality=regional/reads=bounded-staleness/insufficient-quorum`
- `follower-reads/survival=region/locality=regional/reads=bounded-staleness/insufficient-quorum`

These two tests are similar to the other `follower-reads` variants in that they perform follower reads across 3 different regions on a table in a multi-region database. In this case, both variants perform bounded staleness reads on a REGIONAL table using the `with_max_staleness` option. What is new in these variants is that they kill the database's primary region and assert that bounded staleness reads remain available from outside of that region. This directly tests one of the major benefits of bounded staleness reads in a way that we had not yet covered in an end-to-end system test.

Release note: None
Release justification: testing only
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue on Oct 5, 2021
craig bot pushed a commit that referenced this issue on Oct 5, 2021
70716: sql: fix timezone formatting for GMT offsets r=otan a=RichardJCai

Release note (sql change): If the time zone is set as a GMT offset, for example +7 or -11, the time zone will be formatted as <+07>-07 and <-11>+11 respectively, instead of +7 and -11. This most notably shows up when running SHOW TIME ZONE.

71122: roachtest: add bounded staleness `insufficient-quorum` variants to `follower-reads` suite r=nvanbenschoten a=nvanbenschoten

71143: storage: remove unused SSTableInfo types r=nicktrav a=jbowens

Release note: None

Co-authored-by: richardjcai <caioftherichard@gmail.com>
Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Co-authored-by: Jackson Owens <jackson@cockroachlabs.com>
Tracking issue for bounded staleness reads, as outlined in #66020.
The tracking issue is split into three stages:

Stage 1:
- production `KVBatchSize` batches from being serviced by bounded staleness #69063

Stage 2:

Stage 3:
Epic CRDB-2527
Jira issue: CRDB-8608