New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[YSQL] pg_hint_plan for Batched Nested Loop Joins #19494
Labels
area/ysql
Yugabyte SQL (YSQL)
kind/enhancement
This is an enhancement of an existing feature
priority/medium
Medium priority issue
Comments
tanujnay112
added
area/ysql
Yugabyte SQL (YSQL)
status/awaiting-triage
Issue awaiting triage
labels
Oct 11, 2023
yugabyte-ci
added
kind/enhancement
This is an enhancement of an existing feature
priority/medium
Medium priority issue
and removed
status/awaiting-triage
Issue awaiting triage
labels
Oct 11, 2023
yugabyte-ci
added
priority/high
High Priority
priority/medium
Medium priority issue
and removed
priority/medium
Medium priority issue
priority/high
High Priority
labels
Oct 16, 2023
tanujnay112
added a commit
that referenced
this issue
Oct 19, 2023
Summary: This change adds syntax and functionality in pg_hint_plan to control the appearance of batched nested loop joins. One can supply the hint `YbBatchedNL(t1 t2)` to do the BNL analogue of `NestLoop(t1 t2)`. Similarly, this change also adds functionality for the hint `NoYbBatchedNL`. The functionality of `yb_prefer_bnl` should now be carried out through pg_hint_plan. The cost model adds `disable_cost` to BNL plans when `yb_enable_batchednl` is disabled. This is analogous to `enable_nestloop` and `enable_hashjoin`. This GUC is necessary for the functionality of pg_hint_plan in controlling how joins might get preferred, also much like how pg_hint_plan uses `enable_nestloop`, `enable_hashjoin`. When `yb_bnl_batch_size = 1` (BNL disabled), the `(No)YbBatchedNL` hints have the same effect as a `(No)NestLoop` hint. This change also removes BNL's previous dependency on `enable_nestloop`. That flag now only controls classical nested loop joins. Note that `yb_bnl_batch_size` still needs to be set to > 1 which produces candidate BNL plans. `yb_enable_batchednl` permits the cost model to choose these candidate BNL's. `yb_prefer_bnl`'s implementation has been changed in this diff. Before, `yb_prefer_bnl=true` meant we did not emit parameterized "unbatched" index scan paths if batched equivalents were available. This had to be changed for a couple reasons. First, this allowed us to run into situations when planning index access paths to `t1` where we emit unbatched index paths on `t1_pkey` and `t1_idx`. In this scenario, say we were only able to emit a batched index path of `t1_idx`. If the planner ended up pruning the unbatched version of `t1_idx` away in favor of a lower cost `t1_pkey`, the batched version of `t1_idx` would remain unpruned. In this scenario, it would be possible for a plan that involved an index scan on `t1_pkey` with `yb_bnl_batch_size = 1` to then involve a batched index scan on `t1_idx` instead when `yb_bnl_batch_size` is turned up. This goes against the promise of `yb_prefer_bnl` with CBO off, where we expected YB BNL nodes to cleanly replace formerly NL nodes when batching is turned on, nothing more or less. The earlier implementation of `yb_prefer_bnl` also disallows us of a way to enforce a `NestLoop(t1 t2)` hint in this diff as unbatched index scans to `t1` or `t2` wouldn't have been available by then time we would get to planning the relation `(t1 t2)` if `yb_prefer_bnl` was on. In the new implementation of `yb_prefer_bnl`, we always emit both unbatched and batched index paths. In `add_paths` of `pathnode.c`, we added code to ensure that batched NL joins are always prioritized over equivalent NL joins in the path queue in `yb_prefer_bnl` is on. Jira: DB-8289 Test Plan: ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressJoin' Reviewers: mtakahara, gkukreja Reviewed By: gkukreja Subscribers: yql Differential Revision: https://phorge.dev.yugabyte.com/D29260
jasonyb
pushed a commit
that referenced
this issue
Mar 14, 2024
Summary: This change adds syntax and functionality in pg_hint_plan to control the appearance of batched nested loop joins. One can supply the hint `YbBatchedNL(t1 t2)` to do the BNL analogue of `NestLoop(t1 t2)`. Similarly, this change also adds functionality for the hint `NoYbBatchedNL`. The functionality of `yb_prefer_bnl` should now be carried out through pg_hint_plan. The cost model adds `disable_cost` to BNL plans when `yb_enable_batchednl` is disabled. This is analogous to `enable_nestloop` and `enable_hashjoin`. This GUC is necessary for the functionality of pg_hint_plan in controlling how joins might get preferred, also much like how pg_hint_plan uses `enable_nestloop`, `enable_hashjoin`. When `yb_bnl_batch_size = 1` (BNL disabled), the `(No)YbBatchedNL` hints have the same effect as a `(No)NestLoop` hint. This change also removes BNL's previous dependency on `enable_nestloop`. That flag now only controls classical nested loop joins. Note that `yb_bnl_batch_size` still needs to be set to > 1 which produces candidate BNL plans. `yb_enable_batchednl` permits the cost model to choose these candidate BNL's. `yb_prefer_bnl`'s implementation has been changed in this diff. Before, `yb_prefer_bnl=true` meant we did not emit parameterized "unbatched" index scan paths if batched equivalents were available. This had to be changed for a couple reasons. First, this allowed us to run into situations when planning index access paths to `t1` where we emit unbatched index paths on `t1_pkey` and `t1_idx`. In this scenario, say we were only able to emit a batched index path of `t1_idx`. If the planner ended up pruning the unbatched version of `t1_idx` away in favor of a lower cost `t1_pkey`, the batched version of `t1_idx` would remain unpruned. In this scenario, it would be possible for a plan that involved an index scan on `t1_pkey` with `yb_bnl_batch_size = 1` to then involve a batched index scan on `t1_idx` instead when `yb_bnl_batch_size` is turned up. This goes against the promise of `yb_prefer_bnl` with CBO off, where we expected YB BNL nodes to cleanly replace formerly NL nodes when batching is turned on, nothing more or less. The earlier implementation of `yb_prefer_bnl` also disallows us of a way to enforce a `NestLoop(t1 t2)` hint in this diff as unbatched index scans to `t1` or `t2` wouldn't have been available by then time we would get to planning the relation `(t1 t2)` if `yb_prefer_bnl` was on. In the new implementation of `yb_prefer_bnl`, we always emit both unbatched and batched index paths. In `add_paths` of `pathnode.c`, we added code to ensure that batched NL joins are always prioritized over equivalent NL joins in the path queue in `yb_prefer_bnl` is on. Jira: DB-8289 Test Plan: ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressJoin' Reviewers: mtakahara, gkukreja Reviewed By: gkukreja Subscribers: yql Differential Revision: https://phorge.dev.yugabyte.com/D29260
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
area/ysql
Yugabyte SQL (YSQL)
kind/enhancement
This is an enhancement of an existing feature
priority/medium
Medium priority issue
Jira Link: DB-8289
Description
Currently, there is no easy mechanism to control the appearance of Batched NL joins as there is for other join types through pg_hint_plan. It would be nice to force certain joins to be BNL's like
/*+BatchedNL(t1 (t2 t3))*/
until the CBO is GA'd.Issue Type
kind/enhancement
Warning: Please confirm that this issue does not contain any sensitive information
The text was updated successfully, but these errors were encountered: