[YSQL] BNL queries on hash sharded inner tables can result in big memory usage #17762

tanujnay112 · 2023-06-13T21:59:40Z

Jira Link: DB-6857

Description

Consider the following queries where g1 and g2 are hash sharded tables while g1range and g2range are range sharded.

create table g1(h int primary key);
create table g2(h int primary key);
insert into g1 select generate_series(1,100000);
insert into g2 select generate_series(1,100000)
create table g1range(r int, primary key(r asc));
create table g2range(r int, primary key(r asc));
insert into g1range select generate_series(1,100000);
insert into g2range select generate_series(1,100000);
analyze;
set yb_bnl_batch_size = 1024;
set yb_enable_optimizer_statistics=true;

We see that a BNL on the range sharded tables results in the following in a release build

/*+Set(yb_bnl_batch_size 1024) Leading((t1 t2)) NestLoop(t1 t2)**/explain analyze select count(*) from g1range t1, g2range t2 where t1.r = t2.r;
                                                               QUERY PLAN                                                                
-----------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=11909.46..11909.47 rows=1 width=8) (actual time=546.562..546.562 rows=1 loops=1)
   ->  YB Batched Nested Loop Join  (cost=8.00..11659.46 rows=100000 width=0) (actual time=5.035..533.499 rows=100000 loops=1)
         Join Filter: (t1.r = t2.r)
         ->  Seq Scan on g1range t1  (cost=4.00..10004.00 rows=100000 width=4) (actual time=1.398..23.949 rows=100000 loops=1)
         ->  Index Scan using g2range_pkey on g2range t2  (cost=4.00..4.11 rows=1 width=4) (actual time=3.126..3.542 rows=1020 loops=98)
               Index Cond: (r = ANY (ARRAY[t1.r, $1, $2, ..., $1023]))
 Planning Time: 9.215 ms
 Execution Time: 547.155 ms
 Peak Memory Usage: 1074 kB
(9 rows)

However, we do the same join on the hash sharded tables to result in being >15x slower and >100x more memory usage:

/*+Set(yb_bnl_batch_size 1024) Leading((t1 t2)) NestLoop(t1 t2)*/explain analyze select count(*) from g1 t1, g2 t2 where t1.h = t2.h;
                                                            QUERY PLAN                                                            
----------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=11909.46..11909.47 rows=1 width=8) (actual time=7761.028..7761.028 rows=1 loops=1)
   ->  YB Batched Nested Loop Join  (cost=8.00..11659.46 rows=100000 width=0) (actual time=145.027..7748.264 rows=100000 loops=1)
         Join Filter: (t1.h = t2.h)
         ->  Seq Scan on g1 t1  (cost=4.00..10004.00 rows=100000 width=4) (actual time=2.036..27.689 rows=100000 loops=1)
         ->  Index Scan using g2_pkey on g2 t2  (cost=4.00..4.11 rows=1 width=4) (actual time=76.028..76.442 rows=1020 loops=98)
               Index Cond: (h = ANY (ARRAY[t1.h, $1, $2, ..., $1023]))
 Planning Time: 6.510 ms
 Execution Time: 7761.942 ms
 Peak Memory Usage: 117611 kB
(9 rows)

It seems like the rescanned IN queries to the hash sharded tables take up way more memory and, in effect, time than the equivalent query with range sharded tables. This needs to be investigated.

Warning: Please confirm that this issue does not contain any sensitive information

I confirm this issue does not contain any sensitive information.

The text was updated successfully, but these errors were encountered:

…or IN conditions sent to hash sharded tables Summary: Before this change we would create `min(permutation_count, FLAGS_ysql_request_limit)` clones of the original read request that contains an IN condition to send over to hash sharded tables. Having these many clones was relevant before hash batching introduced in fc57665 where we would send over one request per IN element. However, when we are batching our IN elements, we only need one op per remote partition. Anything extra results in a lot of unnecessary memory and cpu usage during cloning. This diff fixes this inefficiency. Test Plan: Jenkins The scenario in the GHI has been tested. We get that the join on the hash sharded tables takes up 762ms and 1648kB while the equivalent one on range sharded tables takes up 800ms and 1074kB. Reviewers: amartsinchyk Reviewed By: amartsinchyk Subscribers: yql Differential Revision: https://phorge.dev.yugabyte.com/D26134

…of partitions for IN conditions sent to hash sharded tables Summary: Before this change we would create `min(permutation_count, FLAGS_ysql_request_limit)` clones of the original read request that contains an IN condition to send over to hash sharded tables. Having these many clones was relevant before hash batching introduced in fc57665 where we would send over one request per IN element. However, when we are batching our IN elements, we only need one op per remote partition. Anything extra results in a lot of unnecessary memory and cpu usage during cloning. This diff fixes this inefficiency. Original commit: d87a628 / D26134 Test Plan: Jenkins The scenario in the GHI has been tested. We get that the join on the hash sharded tables takes up 762ms and 1648kB while the equivalent one on range sharded tables takes up 800ms and 1074kB. Reviewers: amartsinchyk Reviewed By: amartsinchyk Subscribers: yql Differential Revision: https://phorge.dev.yugabyte.com/D26197

tanujnay112 added area/ysql Yugabyte SQL (YSQL) status/awaiting-triage Issue awaiting triage labels Jun 13, 2023

tanujnay112 self-assigned this Jun 13, 2023

yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue and removed status/awaiting-triage Issue awaiting triage labels Jun 13, 2023

yugabyte-ci closed this as completed Jun 17, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[YSQL] BNL queries on hash sharded inner tables can result in big memory usage #17762

[YSQL] BNL queries on hash sharded inner tables can result in big memory usage #17762

tanujnay112 commented Jun 13, 2023 •

edited

[YSQL] BNL queries on hash sharded inner tables can result in big memory usage #17762

[YSQL] BNL queries on hash sharded inner tables can result in big memory usage #17762

Comments

tanujnay112 commented Jun 13, 2023 • edited

Description

Warning: Please confirm that this issue does not contain any sensitive information

tanujnay112 commented Jun 13, 2023 •

edited