[YSQL] Batch read operations to main table tablets for an Index scan #12553

Closed
pkj415 opened this issue May 18, 2022 · 1 comment
Assignees
Labels
area/ysql Yugabyte SQL (YSQL) kind/bug This issue is a bug kind/perf priority/high High Priority

Comments

pkj415 commented May 18, 2022

Jira Link: [DB-423](https://yugabyte.atlassian.net/browse/DB-423)

Description

CREATE TABLE T(h int, r int, v1 int, v2 int, PRIMARY KEY(h, r));
CREATE INDEX my_index ON T(v1);
INSERT INTO T VALUES (1, 1, 1, 1), (1, 2, 1, 2), (1, 3, 1, 3), (1, 4, 1, 4);
SELECT v2 FROM T WHERE v1 = 1;

In the above example, YSQL performs one read RPC against the index table to find the main table pks that have v1=1. It then issues separate read RPCs to the main table's tablet servers one after the other, one RPC per pk, even though all the pks belong to the same tablet.
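
To make the RPC count concrete, here is a toy model of the read pattern described above. It is only a sketch: `RpcCounter`, `index_lookup`, `fetch_row` and `select_v2_unbatched` are hypothetical names, not YugabyteDB functions, and the in-memory dict stands in for the table T from the example.

```python
# Toy model only: these names are illustrative, not YugabyteDB APIs.
# Rows of T from the example, keyed by pk (h, r) with value (v1, v2).
ROWS = {(1, 1): (1, 1), (1, 2): (1, 2), (1, 3): (1, 3), (1, 4): (1, 4)}


class RpcCounter:
    """Counts simulated read RPCs."""

    def __init__(self):
        self.calls = 0


def index_lookup(counter, v1):
    """One simulated RPC to the index table: return the pks whose v1 matches."""
    counter.calls += 1
    return sorted(pk for pk, (row_v1, _v2) in ROWS.items() if row_v1 == v1)


def fetch_row(counter, pk):
    """One simulated RPC to the main table for a single pk."""
    counter.calls += 1
    return ROWS[pk]


def select_v2_unbatched(v1):
    """Index lookup followed by one main-table read per pk."""
    counter = RpcCounter()
    pks = index_lookup(counter, v1)
    values = [fetch_row(counter, pk)[1] for pk in pks]
    return values, counter.calls


values, rpc_count = select_v2_unbatched(1)
print(values, rpc_count)
```

With N matching rows this pattern costs 1 + N round trips, which is the behaviour the issue assumes.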

In the general case, the main table pks to be queried can belong to different tablets. Even so, two improvements are possible:

  1. Reads for pks that hash to the same tablet server can be batched into a single RPC call.
  2. Read RPCs to different tablet servers don't have to block each other; they can all be issued in parallel.
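
The two improvements can be sketched roughly as follows. This is a simplified illustration, not the actual implementation: `tablet_for` is a made-up placement function (real tablet routing is hash based and more involved), `NUM_TABLETS` is arbitrary, and `read_batch` stands in for a single read RPC that carries many pks.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

NUM_TABLETS = 2  # hypothetical tablet count, for illustration only


def tablet_for(pk):
    """Hypothetical placement: route a pk by its hash column h."""
    h, _r = pk
    return h % NUM_TABLETS


def group_by_tablet(pks):
    """Improvement 1: collect the pks that land on the same tablet."""
    batches = defaultdict(list)
    for pk in pks:
        batches[tablet_for(pk)].append(pk)
    return dict(batches)


def read_batch(tablet, pks, table):
    """Stand-in for one read RPC to `tablet` that fetches many pks at once."""
    return [(pk, table[pk]) for pk in pks]


def batched_parallel_read(pks, table):
    """Improvement 2: issue the per-tablet batches without blocking each other."""
    batches = group_by_tablet(pks)
    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(batches))) as pool:
        futures = [pool.submit(read_batch, t, b, table) for t, b in batches.items()]
        for future in futures:
            results.update(future.result())
    return results, len(batches)


table = {(1, 1): 1, (1, 2): 2, (2, 1): 10, (1, 3): 3}  # pk -> v2
rows, rpcs = batched_parallel_read(list(table), table)
print(rows, rpcs)
```

In this model the number of read RPCs equals the number of distinct tablets touched rather than the number of pks.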

Code path for the two steps involved in an Index Scan:

  1. YSQL issues an RPC to the index table's tablet server that hosts v1=1 and fetches the pks of the main table rows that have v1=1. This is done as part of the function ybcingettuple().
  2. YSQL iterates through the pks one by one and fetches the value of v2 by sending an RPC to the main table's tablet server that hosts the pk. This is done via repeated calls to index_getnext(), which in turn calls index_fetch_heap().
@pkj415 pkj415 added area/ysql Yugabyte SQL (YSQL) status/awaiting-triage Issue awaiting triage labels May 18, 2022
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue labels May 18, 2022
@pkj415 pkj415 added kind/perf priority/high High Priority and removed kind/bug This issue is a bug status/awaiting-triage Issue awaiting triage priority/medium Medium priority issue labels May 18, 2022
@yugabyte-ci yugabyte-ci added the kind/bug This issue is a bug label Jul 9, 2022

pkj415 commented Jul 14, 2022

This is a non-issue: the read RPCs already fetch data from the main table in batches, using the primary keys. The piece missing from the issue summary is that ybc_getnext_heaptuple internally prefetches data for the pks found by the index scan.

Also, both optimizations mentioned above are already part of the database.
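
As a rough illustration of the prefetching idea in this comment, a caller can still consume one row at a time while the rows are fetched in batches underneath. The sketch below is not the real ybc_getnext_heaptuple logic; `prefetching_rows`, `fetch_batch` and the batch size are all hypothetical.

```python
from itertools import islice


def prefetching_rows(pks, fetch_batch, batch_size):
    """Yield rows one at a time, fetching them `batch_size` pks per call."""
    it = iter(pks)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        # One simulated batched read serves the next len(chunk) rows.
        for row in fetch_batch(chunk):
            yield row


calls = []  # records the size of each simulated batched read


def fetch_batch(chunk):
    """Stand-in for a batched main-table read keyed by pk."""
    calls.append(len(chunk))
    return [("row", pk) for pk in chunk]


rows = list(prefetching_rows(range(5), fetch_batch, batch_size=2))
print(rows, calls)
```

The consumer sees a tuple-at-a-time interface, yet only three batched reads are made for five pks in this example.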

Thanks to @tanujnay112 for also confirming this by logging the docdb requests using set yb_debug_log_docdb_requests=true.

@pkj415 pkj415 closed this as completed Jul 14, 2022