Description
Describe the problem
The optimizer seems to be missing a join reordering opportunity that would avoid a full scan.
This issue was originally reported in Zendesk https://cockroachdb.zendesk.com/agent/tickets/24877
The Zendesk ticket includes statement bundles from the customer's cluster, and the un-redacted SQL.
This query below has a full scan on join_requests
returning 1,087,393 rows, then a lookup join on projects
which is an outer join. This lookup join is where all the time is being spent. Finally, the 820,399 rows from the lookup join are hash joined to teams
. However, the scan on teams
results in 1 row, and the hash join results in 0 rows.
sql> SELECT
-> sj0.id AS id
-> FROM
-> join_requests AS sj0
-> LEFT JOIN projects AS sp1 ON sp1.id = sj0.project_id
-> INNER JOIN teams AS st2 ON st2.id = sj0.team_id
-> WHERE
-> (st2.deleted_at IS NULL)
-> AND ((sp1.id IS NULL) OR (sp1.deleted_at IS NULL))
-> AND st2.account_id = '<redacted>':::UUID
-> ;
id
------
(0 rows)
Time: 5.721s total (execution 5.641s / network 0.080s)
If I simply switch the order that projects
and teams
are listed in the SQL, the subquery returns in 6ms. This scans teams
and results in 1 row, which is lookup joined to join_requests
resulting in 0 rows.
sql> SELECT
-> sj0.id AS id
-> FROM
-> join_requests AS sj0
-> INNER JOIN teams AS st2 ON st2.id = sj0.team_id
-> LEFT JOIN projects AS sp1 ON sp1.id = sj0.project_id
-> WHERE
-> (st2.deleted_at IS NULL)
-> AND ((sp1.id IS NULL) OR (sp1.deleted_at IS NULL))
-> AND st2.account_id = '<redacted>':::UUID;
id
------
(0 rows)
Time: 90ms total (execution 6ms / network 83ms)
To Reproduce
Statement bundles of both versions of the original query are included in the Zendesk ticket https://cockroachdb.zendesk.com/agent/tickets/24877
The query described above is just one subquery within a larger SQL. The statement bundles in the Zendesk ticket include the full original SQL.
I checked the reorder_joins_limit
which was set to the default. When I set reorder_joins_limit = 10
, there was no improvement in the query or plan using the original join order.
Expected behavior
I expected the optimizer to consider the join reorder without a SQL modiication.
Environment:
- CockroachDB version: v24.2.5
- Cockroach Cloud AWS Advanced cluster
- Single-region
us-east-1
with 12 nodes
- Single-region
Jira issue: CRDB-45710
gz#24877
Metadata
Metadata
Assignees
Labels
Type
Projects
Status