SQL Optimizer - not choosing the better join order

**Describe the problem**

The optimizer seems to be missing a join reordering opportunity that would avoid a full scan.

This issue was originally reported in Zendesk https://cockroachdb.zendesk.com/agent/tickets/24877
The Zendesk ticket includes statement bundles from the customer's cluster, and the un-redacted SQL.

This query below has  a full scan on `join_requests` returning 1,087,393 rows, then a lookup join on `projects` which is an outer join. This lookup join is where all the time is being spent. Finally, the 820,399 rows from the lookup join are hash joined to `teams`. However, the scan on `teams` results in 1 row, and the hash join results in 0 rows.

```
sql>       SELECT
  ->         sj0.id AS id
  ->       FROM
  ->         join_requests AS sj0
  ->         LEFT JOIN projects AS sp1 ON sp1.id = sj0.project_id
  ->         INNER JOIN teams AS st2 ON st2.id = sj0.team_id
  ->       WHERE
  ->         (st2.deleted_at IS NULL)
  ->         AND ((sp1.id IS NULL) OR (sp1.deleted_at IS NULL))
  ->         AND st2.account_id = '<redacted>':::UUID
  -> ;
  id
------
(0 rows)

Time: 5.721s total (execution 5.641s / network 0.080s)
```

If I simply switch the order that `projects` and `teams` are listed in the SQL, the subquery returns in 6ms. This scans `teams` and results in 1 row, which is lookup joined to `join_requests` resulting in 0 rows.

```
sql>       SELECT
  ->         sj0.id AS id
  ->       FROM
  ->         join_requests AS sj0
  ->         INNER JOIN teams AS st2 ON st2.id = sj0.team_id
  ->         LEFT JOIN projects AS sp1 ON sp1.id = sj0.project_id
  ->       WHERE
  ->         (st2.deleted_at IS NULL)
  ->         AND ((sp1.id IS NULL) OR (sp1.deleted_at IS NULL))
  ->         AND st2.account_id = '<redacted>':::UUID;
  id
------
(0 rows)

Time: 90ms total (execution 6ms / network 83ms)
```

**To Reproduce**

Statement bundles of both versions of the original query are included in the Zendesk ticket https://cockroachdb.zendesk.com/agent/tickets/24877

The query described above is just one subquery within a larger SQL. The statement bundles in the Zendesk ticket include the full original SQL.

I checked the `reorder_joins_limit` which was set to the default. When I set `reorder_joins_limit = 10`, there was no improvement in the query or plan using the original join order.

**Expected behavior**
I expected the optimizer to consider the join reorder without a SQL modiication.

**Environment:**
 - CockroachDB version: v24.2.5
 - Cockroach Cloud AWS Advanced cluster
   - Single-region `us-east-1` with 12 nodes



Jira issue: CRDB-45710

gz#24877

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

SQL Optimizer - not choosing the better join order #137701

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

SQL Optimizer - not choosing the better join order #137701

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions