Skip to content

[YSQL] Improve range scan tablet pruning #27695

Open
@andrei-mart

Description

@andrei-mart

Jira Link: DB-17294

Description

In following setup:

yugabyte=# create table range_test(a int, b int, primary key(a asc)) split at values ((100), (200), (300), (400), (500), (600), (700), (800), (900));
CREATE TABLE
yugabyte=# insert into range_test select i, i from generate_series(1, 1000) i;
INSERT 0 1000

simple ordered range scan knows to scan only tablets that may have matches:

 yugabyte=# explain (analyze, dist) select * from range_test where a between 250 and 350 order by a;
                                                          QUERY PLAN                                                           
-------------------------------------------------------------------------------------------------------------------------------
 Index Scan using range_test_pkey on range_test  (cost=0.00..4.12 rows=1 width=8) (actual time=9.612..25.150 rows=101 loops=1)
   Index Cond: ((a >= 250) AND (a <= 350))
   Storage Table Read Requests: 2
   Storage Table Read Execution Time: 22.726 ms
   Storage Table Rows Scanned: 101
 Planning Time: 32.305 ms
 Execution Time: 25.469 ms
 Storage Read Requests: 2
 Storage Read Execution Time: 22.726 ms
 Storage Rows Scanned: 101
 Storage Write Requests: 0
 Catalog Read Requests: 9
 Catalog Read Execution Time: 138.372 ms
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 161.098 ms
 Peak Memory Usage: 24 kB
(17 rows)

yugabyte=# explain (analyze, dist) select * from range_test where a between 750 and 850 order by a;
                                                           QUERY PLAN                                                           
--------------------------------------------------------------------------------------------------------------------------------
 Index Scan using range_test_pkey on range_test  (cost=0.00..4.12 rows=1 width=8) (actual time=15.852..25.477 rows=101 loops=1)
   Index Cond: ((a >= 750) AND (a <= 850))
   Storage Table Read Requests: 2
   Storage Table Read Execution Time: 19.067 ms
   Storage Table Rows Scanned: 101
 Planning Time: 0.256 ms
 Execution Time: 25.711 ms
 Storage Read Requests: 2
 Storage Read Execution Time: 19.067 ms
 Storage Rows Scanned: 101
 Storage Write Requests: 0
 Catalog Read Requests: 0
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 19.067 ms
 Peak Memory Usage: 24 kB
(16 rows)

backward scan - not so much, it knows where to stop, but not where to start

yugabyte=# explain (analyze, dist) select * from range_test where a between 750 and 850 order by a desc;
                                                               QUERY PLAN                                                                
-----------------------------------------------------------------------------------------------------------------------------------------
 Index Scan Backward using range_test_pkey on range_test  (cost=0.00..4.12 rows=1 width=8) (actual time=19.270..31.327 rows=101 loops=1)
   Index Cond: ((a >= 750) AND (a <= 850))
   Storage Table Read Requests: 2
   Storage Table Read Execution Time: 28.111 ms
   Storage Table Rows Scanned: 101
 Planning Time: 0.259 ms
 Execution Time: 31.563 ms
 Storage Read Requests: 2
 Storage Read Execution Time: 28.111 ms
 Storage Rows Scanned: 101
 Storage Write Requests: 0
 Catalog Read Requests: 0
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 28.111 ms
 Peak Memory Usage: 24 kB
(16 rows)

yugabyte=# explain (analyze, dist) select * from range_test where a between 350 and 450 order by a desc;
                                                               QUERY PLAN                                                                
-----------------------------------------------------------------------------------------------------------------------------------------
 Index Scan Backward using range_test_pkey on range_test  (cost=0.00..4.12 rows=1 width=8) (actual time=20.872..26.385 rows=101 loops=1)
   Index Cond: ((a >= 350) AND (a <= 450))
   Storage Table Read Requests: 4
   Storage Table Read Execution Time: 21.806 ms
   Storage Table Rows Scanned: 101
 Planning Time: 0.239 ms
 Execution Time: 26.704 ms
 Storage Read Requests: 4
 Storage Read Execution Time: 21.806 ms
 Storage Rows Scanned: 101
 Storage Write Requests: 0
 Catalog Read Requests: 0
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 21.806 ms
 Peak Memory Usage: 24 kB
(16 rows)

yugabyte=# explain (analyze, dist) select * from range_test where a between 150 and 250 order by a desc;
                                                               QUERY PLAN                                                                
-----------------------------------------------------------------------------------------------------------------------------------------
 Index Scan Backward using range_test_pkey on range_test  (cost=0.00..4.12 rows=1 width=8) (actual time=63.939..85.010 rows=101 loops=1)
   Index Cond: ((a >= 150) AND (a <= 250))
   Storage Table Read Requests: 5
   Storage Table Read Execution Time: 71.402 ms
   Storage Table Rows Scanned: 101
 Planning Time: 0.358 ms
 Execution Time: 85.356 ms
 Storage Read Requests: 5
 Storage Read Execution Time: 71.402 ms
 Storage Rows Scanned: 101
 Storage Write Requests: 0
 Catalog Read Requests: 0
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 71.402 ms
 Peak Memory Usage: 24 kB
(16 rows)

Unordered scan runs requests in parallel, and Read Requests show 1, it actually sends requests to all 5 tablets.

yugabyte=# explain (analyze, dist) select * from range_test where a between 150 and 250;
                                                           QUERY PLAN                                                           
--------------------------------------------------------------------------------------------------------------------------------
 Index Scan using range_test_pkey on range_test  (cost=0.00..4.12 rows=1 width=8) (actual time=23.192..23.487 rows=101 loops=1)
   Index Cond: ((a >= 150) AND (a <= 250))
   Storage Table Read Requests: 1
   Storage Table Read Execution Time: 17.496 ms
   Storage Table Rows Scanned: 101
 Planning Time: 0.247 ms
 Execution Time: 23.752 ms
 Storage Read Requests: 1
 Storage Read Execution Time: 17.496 ms
 Storage Rows Scanned: 101
 Storage Write Requests: 0
 Catalog Read Requests: 0
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 17.496 ms
 Peak Memory Usage: 24 kB
(16 rows)

The PgGate has a function to calculate scan boundaries and can prune out of boundaries tablets, but this function is used inconsistently: in come cases it repeatedly evaluates same boundaries, in others it is not used at all.
We want tablet pruning to happen once per scan.

Issue Type

kind/bug

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions