[YSQL] Enabling CBO without running ANALYZE may lead to suboptimal plans #16825

Open
qvad opened this issue Apr 12, 2023 · 8 comments

Labels
area/ysql Yugabyte SQL (YSQL) kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue

Comments
@qvad
Contributor

qvad commented Apr 12, 2023

Jira Link: DB-6175

Description

The full list of queries can be found here; the subqueries set also shows poor execution times.
https://github.com/qvad/taqo/blob/main/sql/complex/queries/distinct.sql

Below is a comparison between two runs: the default configuration, and CBO enabled together with ANALYZE on the tables.
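
Roughly, the two configurations being compared look like this (a sketch; the actual runs are driven by TAQO and use the query shown below):

-- Run 1: default heuristic cost model (no changes needed).

-- Run 2: CBO enabled after collecting table statistics.
ANALYZE t1000000;
ANALYZE t500000;
ANALYZE t50000;
SET yb_enable_optimizer_statistics = true;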

This is one example of a query that started performing badly with CBO/ANALYZE:

SELECT DISTINCT t500000.c_int,
                t50000.c_bool
FROM   t1000000
       RIGHT OUTER JOIN t500000
           ON t1000000.c_text = t500000.c_text
       RIGHT OUTER JOIN t50000
           ON t1000000.c_text = t50000.c_text
WHERE  t1000000.c_int < 474525
ORDER  BY t500000.c_int, t50000.c_bool DESC
LIMIT  1000 OFFSET 10;

Here is the execution plan difference:
[Screenshot: execution plan comparison, 2023-04-12 19:02]

@qvad qvad added area/ysql Yugabyte SQL (YSQL) status/awaiting-triage Issue awaiting triage labels Apr 12, 2023
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue labels Apr 12, 2023
@mtakahar
Contributor

The plans highlighted in red and green are the same, including the row counts and the cost estimates. The red one shows the runtime stats and the hash bucket adjustments that actually happened at runtime, but the green one doesn't.

It is possible that the initial number of hash buckets and the decision whether to do a "batched hash join" (a partitioned hash join, in more general terms) were different, but we can't tell without the runtime information from the green one.

Can you try it again with the ANALYZE option specified on the EXPLAIN command for the faster run and post the plan diff?
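
For example, something like this (a sketch, reusing the query from the description):

EXPLAIN (ANALYZE)
SELECT DISTINCT t500000.c_int,
                t50000.c_bool
FROM   t1000000
       RIGHT OUTER JOIN t500000
           ON t1000000.c_text = t500000.c_text
       RIGHT OUTER JOIN t50000
           ON t1000000.c_text = t50000.c_text
WHERE  t1000000.c_int < 474525
ORDER  BY t500000.c_int, t50000.c_bool DESC
LIMIT  1000 OFFSET 10;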

@tverona1 tverona1 changed the title [YSQL] Enabling CBO may lead to significat performance decreasing for some queries [YSQL] Enabling CBO may lead to significant performance decreasing for some queries May 2, 2023
@qvad
Contributor Author

qvad commented May 2, 2023

This is a compact reproducer for another case:

CREATE TABLE t1
    WITH (colocation = true) AS
SELECT k1,
       (k1 + 0.0001)::text    AS k2,
       k1                     AS v1,
       (k1 + 0.0002)::varchar AS v2
FROM   generate_series(1, 50000) k1;


CREATE TABLE t2
    WITH (colocation = true) AS
SELECT k1,
       (k1 + 0.0001)::text    AS k2,
       k1                     AS v1,
       (k1 + 0.0002)::varchar AS v2
FROM   generate_series(1, 100000) k1;

-- w/o CBO
SELECT t1.k1,
       t1.k2,
       t2.v1,
       t2.v2
FROM   t1
       JOIN t2 ON t1.k1 = t2.k1
WHERE  t1.k1 >= 2500
   AND t1.k1 < 25100
   AND t2.k1 >= 2500
   AND t2.k1 < 25100
GROUP BY t1.k1, t1.k2, t2.v1, t2.v2;

-- enable CBO
SET yb_enable_optimizer_statistics = true;
SELECT t1.k1,
       t1.k2,
       t2.v1,
       t2.v2
FROM   t1
       JOIN t2 ON t1.k1 = t2.k1
WHERE  t1.k1 >= 2500
   AND t1.k1 < 25100
   AND t2.k1 >= 2500
   AND t2.k1 < 25100
GROUP BY t1.k1, t1.k2, t2.v1, t2.v2;

@tverona1
Contributor

tverona1 commented May 2, 2023

In the compact repro above, we are not running ANALYZE. With CBO enabled and without ANALYZE, we do not have cardinality estimates for the tables. It looks like we assume a cardinality of 1, so we end up with a nested loop join that in reality generates ~500M rows:

With CBO (without running ANALYZE prior):
 HashAggregate  (cost=0.02..0.03 rows=1 width=72)
   Group Key: t1.k1, t1.k2, t2.v1, t2.v2
   ->  Nested Loop  (cost=0.00..0.01 rows=1 width=72)
         Join Filter: (t1.k1 = t2.k1)
         ->  Seq Scan on t1  (cost=0.00..0.00 rows=1 width=36)
               Remote Filter: ((k1 >= 2500) AND (k1 < 25100))
         ->  Seq Scan on t2  (cost=0.00..0.00 rows=1 width=40)
               Remote Filter: ((k1 >= 2500) AND (k1 < 25100))

Perhaps, if we are missing cardinality estimates, we should fall back to some hard-coded cardinality (e.g. 10k rows) like we do in the heuristic model.
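
One way to confirm the missing estimates in this repro is to check pg_class directly (a sketch; reltuples stays at 0 until ANALYZE runs):

-- Both repro tables have never been ANALYZEd, so reltuples is 0
-- and the planner clamps the scan estimates down to 1 row.
SELECT relname, reltuples
FROM   pg_class
WHERE  relname IN ('t1', 't2');

-- Collecting statistics gives CBO realistic row counts and avoids
-- the Nested Loop plan above.
ANALYZE t1;
ANALYZE t2;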

@tverona1
Contributor

tverona1 commented May 2, 2023

Note: in vanilla PG, without stats, the cardinality estimates are better. Perhaps it is estimating based on # of pages:

CREATE TABLE t1
    WITH (autovacuum_enabled = false) AS
SELECT k1,
       (k1 + 0.0001)::text    AS k2,
       k1                     AS v1,
       (k1 + 0.0002)::varchar AS v2
FROM   generate_series(1, 50000) k1;

CREATE TABLE t2
    WITH (autovacuum_enabled = false) AS
SELECT k1,
       (k1 + 0.0001)::text    AS k2,
       k1                     AS v1,
       (k1 + 0.0002)::varchar AS v2
FROM   generate_series(1, 100000) k1;

EXPLAIN
SELECT t1.k1,
       t1.k2,
       t2.v1,
       t2.v2
FROM   t1
       JOIN t2 ON t1.k1 = t2.k1
WHERE  t1.k1 >= 2500
   AND t1.k1 < 25100
   AND t2.k1 >= 2500
   AND t2.k1 < 25100
GROUP BY t1.k1, t1.k2, t2.v1, t2.v2;
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Group  (cost=2464.43..2467.20 rows=222 width=72)
   Group Key: t1.k1, t1.k2, t2.v1, t2.v2
   ->  Sort  (cost=2464.43..2464.98 rows=222 width=72)
         Sort Key: t1.k1, t1.k2, t2.v1, t2.v2
         ->  Hash Join  (cost=816.98..2455.78 rows=222 width=72)
               Hash Cond: (t2.k1 = t1.k1)
               ->  Seq Scan on t2  (cost=0.00..1630.24 rows=298 width=40)
                     Filter: ((k1 >= 2500) AND (k1 < 25100))
               ->  Hash  (cost=815.12..815.12 rows=149 width=36)
                     ->  Seq Scan on t1  (cost=0.00..815.12 rows=149 width=36)
                           Filter: ((k1 >= 2500) AND (k1 < 25100))

@mtakahar
Contributor

mtakahar commented May 2, 2023

In the compact repro above, we are not running ANALYZE.

I briefly tried it myself with TAQO on my MacBook Pro and couldn't reproduce it. Now I see that was because I had specified the option to run the analyze script from the command line.

Perhaps, if we are missing cardinality estimates, we should fall back to some hard-coded cardinality (e.g. 10k rows) like we do in the heuristic model.

I agree. I think we should either create a separate issue or repurpose this to track that.

Perhaps it is estimating based on # of pages:

I believe that's what it does. Postgres also uses 1000 rows as the default in some places though.

@mtakahar
Contributor

mtakahar commented May 9, 2023

#16097 is the most likely cause of the same hash join plan taking much longer when yb_enable_optimizer_statistics is turned on but the tables are unanalyzed.

@yugabyte-ci yugabyte-ci added kind/enhancement This is an enhancement of an existing feature and removed kind/bug This issue is a bug status/awaiting-triage Issue awaiting triage labels May 17, 2023
@tverona1 tverona1 changed the title [YSQL] Enabling CBO may lead to significant performance decreasing for some queries [YSQL] Enabling CBO without running ANALYZE may lead to suboptimal plans Oct 16, 2023
@gauravk-in
Contributor

When yb_enable_optimizer_statistics is disabled, YB uses a heuristic of 1000 rows per table. When it is enabled, we take the value from pg_class.reltuples. This is 0 when ANALYZE has not been run, and it gets “clamped” to 1.

The problem is that when the estimated number of output rows is 1, the CBO tends to choose a Nested Loop Join, which is very expensive here.
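
The clamping can be observed directly with a plain EXPLAIN (a sketch; demo is a hypothetical table used only for illustration):

CREATE TABLE demo AS SELECT g AS k1 FROM generate_series(1, 10000) g;

-- Heuristic model: an unanalyzed table is assumed to have 1000 rows.
SET yb_enable_optimizer_statistics = false;
EXPLAIN SELECT * FROM demo;   -- expect: Seq Scan on demo ... rows=1000

-- CBO without statistics: reltuples is 0 and the estimate is clamped to 1.
SET yb_enable_optimizer_statistics = true;
EXPLAIN SELECT * FROM demo;   -- expect: Seq Scan on demo ... rows=1

-- After ANALYZE the estimate follows the actual row count.
ANALYZE demo;
EXPLAIN SELECT * FROM demo;   -- expect: Seq Scan on demo ... rows=10000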

PG’s logic to estimate reltuples is in the estimate_rel_size function. It computes the width of a row and divides the total size of the table by that to estimate the number of tuples. When the table is empty, it assumes the table has 10 pages. This is why the estimate is never 0.

We can fix the issue in YB in the following ways:

  • To keep things simple (as @mtakahar has also suggested), if pg_class.reltuples == 0 we can return 1000.
  • Like PG, when reltuples is 0, we can get the size of the SST files from DocDB, compute the width of the tuple, and use that to estimate the number of tuples.

@rthallamko3 Can you please comment on how challenging it would be to extract the size of the SST files from DocDB? We would need to expose this information to the cost model and cache it somewhere too.

@gauravk-in gauravk-in self-assigned this Oct 25, 2023
@rthallamko3
Contributor

@gauravk-in, is the SST size the only unknown for CBO, or are there other aspects? I would recommend going down the simpler approach: if ANALYZE has not been run, assume 1000 rows. Once ANALYZE is run, all the statistics are populated and available, correct?

gauravk-in added a commit that referenced this issue Dec 7, 2023
…ot yet called

Summary:
Before this change, if `yb_enable_optimizer_statistics` is set to TRUE but
`ANALYZE` has not been called, all tables seem to have 0 rows. This causes CBO
to pick poor execution plans. For `yb_enable_optimizer_statistics=FALSE` this
case was handled by setting the tuple count to 1000 if statistics suggested that
the table had 0 rows.

In this change, the same logic is extended to
`yb_enable_optimizer_statistics=TRUE`. This should make CBO produce better query
plans.
Jira: DB-6175

Test Plan: ybd --java-test 'org.yb.pgsql.TestPgYbOptimizerStatistics'

Reviewers: tverona, tnayak, yguan

Reviewed By: tnayak

Subscribers: yql

Differential Revision: https://phorge.dev.yugabyte.com/D30194
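
With this change, an unanalyzed table should get the 1000-row fallback instead of a 1-row estimate when the GUC is on. A quick way to check (a sketch against a freshly created, unanalyzed t1 from the repro above):

SET yb_enable_optimizer_statistics = true;

-- Before the fix: Seq Scan on t1 ... rows=1 (clamped from reltuples = 0).
-- After the fix:  Seq Scan on t1 ... rows=1000 (same fallback as the
-- heuristic model), so joins are no longer costed as 1 x 1 Nested Loops.
EXPLAIN SELECT * FROM t1;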