Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[YSQL] pg_hint_plan for Batched Nested Loop Joins #19494

Open
1 task done
tanujnay112 opened this issue Oct 11, 2023 · 0 comments
Open
1 task done

[YSQL] pg_hint_plan for Batched Nested Loop Joins #19494

tanujnay112 opened this issue Oct 11, 2023 · 0 comments
Assignees
Labels
area/ysql Yugabyte SQL (YSQL) kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue

Comments

@tanujnay112
Copy link
Contributor

tanujnay112 commented Oct 11, 2023

Jira Link: DB-8289

Description

Currently, there is no easy mechanism to control the appearance of Batched NL joins as there is for other join types through pg_hint_plan. It would be nice to force certain joins to be BNL's like /*+BatchedNL(t1 (t2 t3))*/ until the CBO is GA'd.

Issue Type

kind/enhancement

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@tanujnay112 tanujnay112 added area/ysql Yugabyte SQL (YSQL) status/awaiting-triage Issue awaiting triage labels Oct 11, 2023
@tanujnay112 tanujnay112 self-assigned this Oct 11, 2023
@yugabyte-ci yugabyte-ci added kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue and removed status/awaiting-triage Issue awaiting triage labels Oct 11, 2023
@yugabyte-ci yugabyte-ci added priority/high High Priority priority/medium Medium priority issue and removed priority/medium Medium priority issue priority/high High Priority labels Oct 16, 2023
tanujnay112 added a commit that referenced this issue Oct 19, 2023
Summary:
This change adds syntax and functionality in pg_hint_plan to control the appearance of batched nested loop joins. One can supply the hint `YbBatchedNL(t1 t2)` to do the BNL analogue of `NestLoop(t1 t2)`. Similarly, this change also adds functionality for the hint `NoYbBatchedNL`. The functionality of `yb_prefer_bnl` should now be carried out through pg_hint_plan. The cost model adds `disable_cost` to BNL plans when `yb_enable_batchednl` is disabled. This is analogous to `enable_nestloop` and `enable_hashjoin`. This GUC is necessary for the functionality of pg_hint_plan in controlling how joins might get preferred, also much like how pg_hint_plan uses `enable_nestloop`, `enable_hashjoin`.

When `yb_bnl_batch_size = 1` (BNL disabled), the `(No)YbBatchedNL` hints have the same effect as a `(No)NestLoop` hint.

This change also removes BNL's previous dependency on `enable_nestloop`. That flag now only controls classical nested loop joins.

Note that `yb_bnl_batch_size` still needs to be set to > 1 which produces candidate BNL plans. `yb_enable_batchednl` permits the cost model to choose these candidate BNL's.

`yb_prefer_bnl`'s implementation has been changed in this diff. Before, `yb_prefer_bnl=true` meant we did not emit parameterized "unbatched" index scan paths if batched equivalents were available. This had to be changed for a couple reasons. First, this allowed us to run into situations when planning index access paths to `t1` where we emit unbatched index paths on `t1_pkey` and `t1_idx`. In this scenario, say we were only able to emit a batched index path of `t1_idx`. If the planner ended up pruning the unbatched version of `t1_idx` away in favor of a lower cost `t1_pkey`, the batched version of `t1_idx` would remain unpruned. In this scenario, it would be possible for a plan that involved an index scan on `t1_pkey` with `yb_bnl_batch_size = 1` to then involve a batched index scan on `t1_idx` instead when `yb_bnl_batch_size` is turned up. This goes against the promise of `yb_prefer_bnl` with CBO off, where we expected YB BNL nodes to cleanly replace formerly NL nodes when batching is turned on, nothing more or less.
The earlier implementation of `yb_prefer_bnl` also disallows us of a way to enforce a `NestLoop(t1 t2)` hint in this diff as unbatched index scans to `t1` or `t2` wouldn't have been available by then time we would get to planning the relation `(t1 t2)` if `yb_prefer_bnl` was on.

In the new implementation of `yb_prefer_bnl`, we always emit both unbatched and batched index paths. In `add_paths` of `pathnode.c`, we added code to ensure that batched NL joins are always prioritized over equivalent NL joins in the path queue in `yb_prefer_bnl` is on.
Jira: DB-8289

Test Plan: ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressJoin'

Reviewers: mtakahara, gkukreja

Reviewed By: gkukreja

Subscribers: yql

Differential Revision: https://phorge.dev.yugabyte.com/D29260
jasonyb pushed a commit that referenced this issue Mar 14, 2024
Summary:
This change adds syntax and functionality in pg_hint_plan to control the appearance of batched nested loop joins. One can supply the hint `YbBatchedNL(t1 t2)` to do the BNL analogue of `NestLoop(t1 t2)`. Similarly, this change also adds functionality for the hint `NoYbBatchedNL`. The functionality of `yb_prefer_bnl` should now be carried out through pg_hint_plan. The cost model adds `disable_cost` to BNL plans when `yb_enable_batchednl` is disabled. This is analogous to `enable_nestloop` and `enable_hashjoin`. This GUC is necessary for the functionality of pg_hint_plan in controlling how joins might get preferred, also much like how pg_hint_plan uses `enable_nestloop`, `enable_hashjoin`.

When `yb_bnl_batch_size = 1` (BNL disabled), the `(No)YbBatchedNL` hints have the same effect as a `(No)NestLoop` hint.

This change also removes BNL's previous dependency on `enable_nestloop`. That flag now only controls classical nested loop joins.

Note that `yb_bnl_batch_size` still needs to be set to > 1 which produces candidate BNL plans. `yb_enable_batchednl` permits the cost model to choose these candidate BNL's.

`yb_prefer_bnl`'s implementation has been changed in this diff. Before, `yb_prefer_bnl=true` meant we did not emit parameterized "unbatched" index scan paths if batched equivalents were available. This had to be changed for a couple reasons. First, this allowed us to run into situations when planning index access paths to `t1` where we emit unbatched index paths on `t1_pkey` and `t1_idx`. In this scenario, say we were only able to emit a batched index path of `t1_idx`. If the planner ended up pruning the unbatched version of `t1_idx` away in favor of a lower cost `t1_pkey`, the batched version of `t1_idx` would remain unpruned. In this scenario, it would be possible for a plan that involved an index scan on `t1_pkey` with `yb_bnl_batch_size = 1` to then involve a batched index scan on `t1_idx` instead when `yb_bnl_batch_size` is turned up. This goes against the promise of `yb_prefer_bnl` with CBO off, where we expected YB BNL nodes to cleanly replace formerly NL nodes when batching is turned on, nothing more or less.
The earlier implementation of `yb_prefer_bnl` also disallows us of a way to enforce a `NestLoop(t1 t2)` hint in this diff as unbatched index scans to `t1` or `t2` wouldn't have been available by then time we would get to planning the relation `(t1 t2)` if `yb_prefer_bnl` was on.

In the new implementation of `yb_prefer_bnl`, we always emit both unbatched and batched index paths. In `add_paths` of `pathnode.c`, we added code to ensure that batched NL joins are always prioritized over equivalent NL joins in the path queue in `yb_prefer_bnl` is on.
Jira: DB-8289

Test Plan: ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressJoin'

Reviewers: mtakahara, gkukreja

Reviewed By: gkukreja

Subscribers: yql

Differential Revision: https://phorge.dev.yugabyte.com/D29260
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ysql Yugabyte SQL (YSQL) kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue
Projects
None yet
Development

No branches or pull requests

2 participants