[DocDB] YCQL scan by indexed column can take longer than client timeout and cause a log spew #19221

Closed · 1 task done
ttyusupov opened this issue Sep 20, 2023 · 2 comments
Assignees: ttyusupov
Labels: area/docdb (YugabyteDB core features), kind/enhancement (This is an enhancement of an existing feature), priority/high (High Priority)

Comments

ttyusupov commented Sep 20, 2023

Jira Link: DB-8015

Description

This can be reproduced with the following test on a GCP n1-standard-4 1-node local cluster, with a release build at commit 0cbfb04:

./bin/yb-ctl start

CREATE KEYSPACE ks;
USE ks;
CREATE TABLE test_table (key INT PRIMARY KEY, category INT, value INT) WITH transactions = { 'enabled' : true };
CREATE INDEX category_idx ON test_table (category) WITH TABLETS = 2;

python3 -c "for i in range(0, 1000000): print(i, ',', 0, ',', i);" | ./bin/ycqlsh -e "COPY ks.test_table(key, category, value) from STDIN";

./bin/yb-ctl restart --tserver_flags 'client_read_write_timeout_ms=4000,rpc_slow_query_threshold_ms=4000'

time ./bin/ycqlsh -e "PAGING OFF; select * from ks.test_table where category=0;"

Actual result:

time ./bin/ycqlsh -e "PAGING OFF; select * from ks.test_table where category=0;"
<stdin>:1:OperationTimedOut: errors={'127.0.0.1': 'Client request timeout. See Session.execute[_async](timeout)'}, last_host=127.0.0.1

real	0m10.605s
user	0m0.404s
sys	0m0.076s

W0920 18:59:02.334851 249960 cql_rpc.cc:335] CQL Call from 127.0.0.1:57058, stream id: 2 took 25781ms. Details:
W0920 18:59:02.334957 249960 cql_rpc.cc:338] cql_details {
  type: "QUERY"
  call_details {
    sql_string: "select * from ks.test_table where category=0;"
  }
}

There are also a lot of the following error messages in the logs:

I0920 18:58:42.527166 249960 client_master_rpc.cc:77] 0x00000000022a9398 -> GetTableLocations { table_name: ks.test_table, table_id: 6cc26701215d4dd49bff3681cb4a8892, partition_start_key: hash_code: NaN, partition_list_version: 0, request_no: 13, num_attempts: 1 }: Failed: Timed out (yb/client/meta_cache.cc:816): Timed out after deadline expired, passed: 0.000s
I0920 18:58:42.527400 249960 client_master_rpc.cc:77] 0x00000000022a9398 -> GetTableLocations { table_name: ks.test_table, table_id: 6cc26701215d4dd49bff3681cb4a8892, partition_start_key: hash_code: NaN, partition_list_version: 0, request_no: 14, num_attempts: 1 }: Failed: Timed out (yb/client/meta_cache.cc:816): Timed out after deadline expired, passed: 0.000s
...
I0920 18:59:00.661993 249960 client_master_rpc.cc:77] 0x00000000022a9e18 -> GetTableLocations { table_name: ks.test_table, table_id: 6cc26701215d4dd49bff3681cb4a8892, partition_start_key: hash_code: NaN, partition_list_version: 0, request_no: 1000011, num_attempts: 1 }: Failed: Timed out (yb/client/meta_cache.cc:816): Timed out after deadline expired, passed: 0.000s
I0920 18:59:00.662026 249960 client_master_rpc.cc:77] 0x00000000022a9e18 -> GetTableLocations { table_name: ks.test_table, table_id: 6cc26701215d4dd49bff3681cb4a8892, partition_start_key: hash_code: NaN, partition_list_version: 0, request_no: 1000012, num_attempts: 1 }: Failed: Timed out (yb/client/meta_cache.cc:816): Timed out after deadline expired, passed: 0.000s

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@ttyusupov ttyusupov added area/docdb YugabyteDB core features status/awaiting-triage Issue awaiting triage labels Sep 20, 2023
@ttyusupov ttyusupov self-assigned this Sep 20, 2023
@yugabyte-ci yugabyte-ci added kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue labels Sep 20, 2023
ttyusupov commented Sep 20, 2023

Here is what happens:

  1. select * from ks.test_table where category=0 is executed. It is important that there is an index on the category column, so an index scan is performed.
  2. YCQL sets the deadline to now + client_read_write_timeout_ms (4 seconds in the example above).
  3. The index tablet lookup is requested from MetaCache; the first GetTableLocations RPC is sent to the master, and the response is returned quickly.
  4. YCQL sends a Read RPC to the index tablet leader with the timeout calculated from the deadline determined in step 2.
  5. The index tablet leader starts the scan, but there are 1M records with category = 0, so the index tablet scan completes around or after the timeout, yet before the tserver detects that the deadline has been hit (there is some granularity in detecting the deadline).
  6. CQL receives the answer from the index tablet leader after the deadline, but before Connection::CleanupExpirationQueue is executed, so the index tablet scan is not timed out on the YCQLProxy/YBClient side.
    Connection::CleanupExpirationQueue is executed by a timer with FLAGS_rpc_connection_timeout_ms granularity (60 seconds by default) and also when an outgoing call is sent on the same connection.
  7. CQL prepares Read operations on the main table from the keys received and adds them to YBSession, which adds them to Batcher. Batcher loops over these 1M operations and asks MetaCache to determine the tablet for each one.
  8. Since the deadline has already been hit at this step, MetaCache gets the error Timed out after deadline expired even before sending a GetTableLocations RPC to the master.
  9. The tablet lookup for each of the 1M+ operations fails with the same Timed out after deadline expired, and every error is logged by client_master_rpc.cc.
  10. Due to the log spew from step 9, the CQL call takes even longer to execute.

An important factor here is that paging is off or the page size is big enough that the index table page scan takes around, or a little more than, the YCQL request timeout. A minimal sketch of the early deadline check involved is shown below.
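
For illustration only, here is a minimal, self-contained C++ sketch of the kind of early deadline check that avoids creating a lookup RPC (and logging an error) for each of the 1M operations once the deadline has already expired. This is not the actual YugabyteDB code; `Status`, `LookupTabletOrFail`, and `SendGetTableLocations` are hypothetical stand-ins.

```cpp
// Hypothetical sketch: fail a tablet lookup up front when the caller's
// deadline has already expired, instead of constructing an RPC that can
// only time out and log an error for each of 1M operations.
#include <chrono>
#include <iostream>
#include <string>

using Clock = std::chrono::steady_clock;

struct Status {
  bool ok;
  std::string message;
};

// Stand-in for issuing a GetTableLocations RPC to the master.
Status SendGetTableLocations(const std::string& key) {
  return {true, "located tablet for key " + key};
}

// Hypothetical lookup: check the deadline before doing any work.
Status LookupTabletOrFail(const std::string& key, Clock::time_point deadline) {
  if (Clock::now() >= deadline) {
    // Fail fast: no RPC is created, and the caller can aggregate this
    // error instead of logging it once per operation.
    return {false, "Timed out after deadline expired"};
  }
  return SendGetTableLocations(key);
}

int main() {
  const auto deadline = Clock::now() + std::chrono::milliseconds(4000);
  int failures = 0;
  for (int i = 0; i < 1000000; ++i) {
    Status s = LookupTabletOrFail("key-" + std::to_string(i), deadline);
    if (!s.ok) {
      ++failures;  // Count instead of logging each failure individually.
    }
  }
  std::cout << "lookups past deadline: " << failures << "\n";
  return 0;
}
```

With such a check, the 1M+ failed lookups collapse into a single counted result instead of 1M+ individually logged lookup attempts.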

@ttyusupov ttyusupov changed the title [DocDB] Non-paged scan by indexed column can take longer than client timeout and cause a log spew [DocDB] YCQL scan by indexed column can take longer than client timeout and cause a log spew Sep 22, 2023
@ttyusupov ttyusupov added this to Backlog in YBase features via automation Sep 22, 2023
ttyusupov commented:

Lower priority follow-up issue for YCQL: #19252

@yugabyte-ci yugabyte-ci removed the status/awaiting-triage Issue awaiting triage label Sep 26, 2023
ttyusupov added a commit that referenced this issue Sep 27, 2023
Summary:
A scan by an indexed column in YCQL can take longer than the client timeout and cause a log spew in the following scenario:

1. `select * from ks.test_table where category=0` is executed. It is important that there is an index on the `category` column, so an index scan is performed.
2. YCQL sets the deadline to `now + client_read_write_timeout_ms` (4 seconds in the example above).
3. The index tablet lookup is requested from MetaCache; the first GetTableLocations RPC is sent to the master, and the response is returned quickly.
4. YCQL sends a Read RPC to the index tablet leader with the `timeout` calculated from the deadline determined in step 2.
5. The index tablet leader starts the scan, but there are 1M records with category = 0, so the index tablet scan completes around or after the `timeout`, yet before the tserver times out the request (there is some granularity in detecting the deadline).
6. CQL receives the answer from the index tablet leader after the deadline, but before `Connection::CleanupExpirationQueue` is executed, so the index tablet scan is not timed out on the YCQLProxy/YBClient side.
`Connection::CleanupExpirationQueue` is executed by a timer with `FLAGS_rpc_connection_timeout_ms` granularity (60 seconds by default) and also when an outgoing call is sent on the same connection.
7. CQL prepares Read operations on the main table from the keys received and adds them to `YBSession`, which adds them to `Batcher`. `Batcher` loops over these 1M operations and asks `MetaCache` to determine the tablet for each one.
8. Since the deadline has already been hit at this step, `MetaCache` gets the error `Timed out after deadline expired` even before sending a `GetTableLocations` RPC to the master.
9. The tablet lookup for each of the 1M+ operations fails with the same `Timed out after deadline expired`, and every error is logged by `client_master_rpc.cc`.
10. Due to the log spew from step 9, the CQL call takes even longer to execute.

An important factor here is that paging is off or the page size is big enough that the index table page scan takes around, or a little more than, the YCQL request timeout.

Implemented the following changes to address that:
- Enhanced VLOGs
- ClientMasterRpcBase
  - Fixed the log spew when there are a lot of errors (see the throttling sketch after this commit message)
  - Avoid re-requesting the master leader when the RPC timed out simply because it was attempted after the deadline
- MetaCache::LookupTabletByKey: fail earlier if the deadline has already been hit, and avoid creating and sending a GetTableLocations RPC in this case
- Added CqlIndexTest.SlowIndexResponse
Jira: DB-8015

Test Plan: CqlIndexTest.SlowIndexResponse

Reviewers: arybochkin, amitanand

Reviewed By: amitanand

Subscribers: yql, bogdan, ybase

Differential Revision: https://phorge.dev.yugabyte.com/D28728
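
For illustration, here is a minimal sketch of counter-based log throttling, assuming a simple standalone class rather than the actual change in client_master_rpc.cc; `ThrottledLogger` and its parameters are hypothetical.

```cpp
// Hypothetical sketch of counter-based log throttling: report the first few
// occurrences of a repeated error and then only every Nth one, so 1M+
// identical failures do not flood the log or slow down the call.
#include <atomic>
#include <iostream>
#include <string>

class ThrottledLogger {
 public:
  explicit ThrottledLogger(int every_n) : every_n_(every_n) {}

  void Log(const std::string& message) {
    const long n = ++count_;
    // Log the first 3 occurrences, then only every Nth occurrence.
    if (n <= 3 || n % every_n_ == 0) {
      std::cerr << message << " (occurrence " << n << ")\n";
    }
  }

 private:
  const int every_n_;
  std::atomic<long> count_{0};
};

int main() {
  ThrottledLogger logger(100000);
  for (int i = 0; i < 1000000; ++i) {
    logger.Log("GetTableLocations failed: Timed out after deadline expired");
  }
  return 0;
}
```

Reporting the first few occurrences plus every Nth one keeps the signal (the error and its rough frequency) while bounding the log I/O that made the CQL call slower in the scenario above.
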
@ttyusupov ttyusupov moved this from Backlog to In progress in YBase features Sep 27, 2023
ttyusupov added a commit that referenced this issue Sep 29, 2023
…ex scan

Original commit: 0f09d15 / D28728

Differential Revision: https://phorge.dev.yugabyte.com/D28864
ttyusupov added a commit that referenced this issue Sep 29, 2023
…ex scan

Original commit: 0f09d15 / D28728

Differential Revision: https://phorge.dev.yugabyte.com/D28870
ttyusupov added a commit that referenced this issue Sep 29, 2023
…ex scan

Original commit: 0f09d15 / D28728

Differential Revision: https://phorge.dev.yugabyte.com/D28872
ttyusupov added a commit that referenced this issue Sep 29, 2023
…ex scan

Original commit: 0f09d15 / D28728

Differential Revision: https://phorge.dev.yugabyte.com/D28873
YBase features automation moved this from In progress to Done Oct 16, 2023
@yugabyte-ci yugabyte-ci added priority/high High Priority and removed priority/medium Medium priority issue labels Nov 13, 2023