Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql,opt: add hint for STRAIGHT join #116013

Merged
merged 1 commit into from Feb 5, 2024

Conversation

kevinmingtarja
Copy link
Contributor

Our current join hints (a INNER HASH JOIN b, a LEFT LOOKUP JOIN b, etc) fixes both the join order and the join algorithm. This commit adds the syntax and support for hinting the join order without hinting the join algorithm. This will be useful for the (few) cases where the optimizer processes the tables in a suboptimal order.

Resolves: #115308

Release note (sql change): It is now possible to hint to the optimizer that it should plan a straight join by using the syntax ... INNER STRAIGHT JOIN .... If the hint is provided, the optimizer will now fix the join order as given in the query, even if it estimates that a different plan using join reordering would have a lower cost.

@kevinmingtarja kevinmingtarja requested review from a team as code owners December 11, 2023 08:22
@kevinmingtarja kevinmingtarja requested review from rharding6373 and removed request for a team December 11, 2023 08:22
@cockroach-teamcity
Copy link
Member

cockroach-teamcity commented Dec 11, 2023

CLA assistant check
All committers have signed the CLA.

Copy link

blathers-crl bot commented Dec 11, 2023

It looks like your PR touches SQL parser code but doesn't add or edit parser tests. Please make sure you add or edit parser tests if you edit the parser.

Thank you for contributing to CockroachDB. Please ensure you have followed the guidelines for creating a PR.

My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@blathers-crl blathers-crl bot added the O-community Originated from the community label Dec 11, 2023
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link

blathers-crl bot commented Dec 11, 2023

Thank you for updating your pull request.

My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@kevinmingtarja kevinmingtarja force-pushed the straight-join branch 3 times, most recently from 5a134d4 to c918bdb Compare December 11, 2023 16:08
@michae2 michae2 self-requested a review December 11, 2023 16:55
Copy link
Collaborator

@rharding6373 rharding6373 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution, this looks pretty good! I have one test suggestion. I will defer the rest of the review to Michael, though, since he's thought about this feature more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @kevinmingtarja and @michae2)


pkg/sql/opt/optbuilder/join.go line 106 at r1 (raw file):

			// is inherently part of the join's logic.
			panic(pgerror.Newf(pgcode.Syntax,
				"%s can only be used with INNER joins", tree.AstStraight,

It would be great to have some logic tests that exercise this failure mode (and validate that the hint only works with INNER). Adding a subtest to pkg/sql/logictest/testdata/logic_test/inner-join seems like a good home for this kind of test, and it can use a lot of what you have already int the optbuilder test. Something like:

subtest straight_join

statement ok
CREATE TABLE t1 (x INT, PRIMARY KEY (x))

statement ok
CREATE TABLE t2 (x INT, y INT, PRIMARY KEY (x), INDEX idx_y (y))

statement ok
[INSERT some values into t1 and t2 here]

query I
SELECT * FROM t1 INNER STRAIGHT JOIN t2 ON t1.x = t2.y
----
[some output from the join here]

query I
SELECT * FROM t2 INNER STRAIGHT JOIN t1 ON t1.x = t2.y
----
[some output from the join here]

statement error pgcode: 42601 pq: STRAIGHT can only be used with INNER joins
SELECT * FROM t1 LEFT OUTER STRAIGHT JOIN t2 ON t1.x = t2.y

statement error pgcode: 42601 pq: STRAIGHT can only be used with INNER joins
SELECT * FROM t1 FULL STRAIGHT JOIN t2 ON t1.x = t2.y

subtest end

@mgartner
Copy link
Collaborator

pkg/sql/opt/optbuilder/join.go line 106 at r1 (raw file):

Previously, rharding6373 (Rachael Harding) wrote…

It would be great to have some logic tests that exercise this failure mode (and validate that the hint only works with INNER). Adding a subtest to pkg/sql/logictest/testdata/logic_test/inner-join seems like a good home for this kind of test, and it can use a lot of what you have already int the optbuilder test. Something like:

subtest straight_join

statement ok
CREATE TABLE t1 (x INT, PRIMARY KEY (x))

statement ok
CREATE TABLE t2 (x INT, y INT, PRIMARY KEY (x), INDEX idx_y (y))

statement ok
[INSERT some values into t1 and t2 here]

query I
SELECT * FROM t1 INNER STRAIGHT JOIN t2 ON t1.x = t2.y
----
[some output from the join here]

query I
SELECT * FROM t2 INNER STRAIGHT JOIN t1 ON t1.x = t2.y
----
[some output from the join here]

statement error pgcode: 42601 pq: STRAIGHT can only be used with INNER joins
SELECT * FROM t1 LEFT OUTER STRAIGHT JOIN t2 ON t1.x = t2.y

statement error pgcode: 42601 pq: STRAIGHT can only be used with INNER joins
SELECT * FROM t1 FULL STRAIGHT JOIN t2 ON t1.x = t2.y

subtest end

Outer joins can be reordered in some cases. For example, here's a test case for join ordering (apologies if the output isn't straightforward to read) that shows two different orderings considered for a query with two adjacent left joins:

# Left-asscom property for left joins.
reorderjoins format=hide-all
SELECT *
FROM bx
LEFT JOIN cy ON b = c
LEFT JOIN dz ON x = z
----
--------------------------------------------------------------------------------
Join Tree #1
--------------------------------------------------------------------------------
left-join (hash)
├── scan bx
├── scan cy
└── filters
└── b = c
Vertexes
A:
scan bx
B:
scan cy
Edges
b = c [left, ses=AB, tes=AB, rules=()]
Joining AB
A B [left, refs=AB]
Joins Considered: 1
--------------------------------------------------------------------------------
Join Tree #2
--------------------------------------------------------------------------------
left-join (hash)
├── left-join (hash)
│ ├── scan bx
│ ├── scan cy
│ └── filters
│ └── b = c
├── scan dz
└── filters
└── x = z
Vertexes
A:
scan bx
B:
scan cy
C:
scan dz
Edges
b = c [left, ses=AB, tes=AB, rules=()]
x = z [left, ses=AC, tes=AC, rules=()]
Joining AB
A B [left, refs=AB]
Joining AC
A C [left, refs=AC]
Joining ABC
AC B [left, refs=AB]
AB C [left, refs=AC]
Joins Considered: 4
================================================================================
Final Plan
================================================================================
left-join (hash)
├── left-join (merge)
│ ├── scan bx
│ ├── scan cy
│ └── filters (true)
├── scan dz
└── filters
└── x = z

I would expect l LEFT STRAIGHT JOIN r to lock in the ordering of joining r to l.

Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @kevinmingtarja, @michae2, and @rharding6373)


pkg/sql/opt/memo/expr.go line 527 at r1 (raw file):

	// DisallowHashJoinStoreRight, DisallowLookupJoinIntoRight,
	// DisallowInvertedJoinIntoRight, and DisallowMergeJoin.
	AllowAllJoinsIntoRight = disallowAll ^ DisallowHashJoinStoreRight ^ DisallowLookupJoinIntoRight ^ DisallowInvertedJoinIntoRight ^ DisallowMergeJoin

The current logic here:

if !flags.Empty() || jb.joinCount > int(jb.evalCtx.SessionData().ReorderJoinsLimit) {
// If the join has flags or the join limit has been reached, we can't
// reorder. Simply treat the join as a base relation.
jb.addBaseRelation(t)
break
}

Will prevent reordering if there is any join flag present. So I suggest creating a seperate join flag under PreferLookupJoinIntoRight called something like DisallowReordering and using just that.

Copy link
Contributor Author

@kevinmingtarja kevinmingtarja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @michae2, and @rharding6373)


pkg/sql/opt/memo/expr.go line 527 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

The current logic here:

if !flags.Empty() || jb.joinCount > int(jb.evalCtx.SessionData().ReorderJoinsLimit) {
// If the join has flags or the join limit has been reached, we can't
// reorder. Simply treat the join as a base relation.
jb.addBaseRelation(t)
break
}

Will prevent reordering if there is any join flag present. So I suggest creating a seperate join flag under PreferLookupJoinIntoRight called something like DisallowReordering and using just that.

Got it, that will be cleaner.


pkg/sql/opt/optbuilder/join.go line 106 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Outer joins can be reordered in some cases. For example, here's a test case for join ordering (apologies if the output isn't straightforward to read) that shows two different orderings considered for a query with two adjacent left joins:

# Left-asscom property for left joins.
reorderjoins format=hide-all
SELECT *
FROM bx
LEFT JOIN cy ON b = c
LEFT JOIN dz ON x = z
----
--------------------------------------------------------------------------------
Join Tree #1
--------------------------------------------------------------------------------
left-join (hash)
├── scan bx
├── scan cy
└── filters
└── b = c
Vertexes
A:
scan bx
B:
scan cy
Edges
b = c [left, ses=AB, tes=AB, rules=()]
Joining AB
A B [left, refs=AB]
Joins Considered: 1
--------------------------------------------------------------------------------
Join Tree #2
--------------------------------------------------------------------------------
left-join (hash)
├── left-join (hash)
│ ├── scan bx
│ ├── scan cy
│ └── filters
│ └── b = c
├── scan dz
└── filters
└── x = z
Vertexes
A:
scan bx
B:
scan cy
C:
scan dz
Edges
b = c [left, ses=AB, tes=AB, rules=()]
x = z [left, ses=AC, tes=AC, rules=()]
Joining AB
A B [left, refs=AB]
Joining AC
A C [left, refs=AC]
Joining ABC
AC B [left, refs=AB]
AB C [left, refs=AC]
Joins Considered: 4
================================================================================
Final Plan
================================================================================
left-join (hash)
├── left-join (merge)
│ ├── scan bx
│ ├── scan cy
│ └── filters (true)
├── scan dz
└── filters
└── x = z

I would expect l LEFT STRAIGHT JOIN r to lock in the ordering of joining r to l.

Ah I missed that, thanks for pointing it out @mgartner!

Copy link
Contributor Author

@kevinmingtarja kevinmingtarja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @michae2, and @rharding6373)


pkg/sql/opt/optbuilder/join.go line 106 at r1 (raw file):

Previously, kevinmingtarja (Kevin Mingtarja) wrote…

Ah I missed that, thanks for pointing it out @mgartner!

Hi, I have a follow up question. Does this mean we should disable the CommuteRightJoin and CommuteLeftJoin rule as well to prevent the inputs from being swapped?

Which means in CommuteJoinFlags(), we should probably add an OR on p.Flags.Has(memo.DisallowReordering) right (line 586)?

// CommuteJoinFlags returns a join private for the commuted join (where the left
// and right sides are swapped). It adjusts any join flags that are specific to
// one side.
func (c *CustomFuncs) CommuteJoinFlags(p *memo.JoinPrivate) *memo.JoinPrivate {
if p.Flags.Empty() {
return p
}

Copy link
Contributor Author

@kevinmingtarja kevinmingtarja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @michae2, and @rharding6373)


pkg/sql/opt/optbuilder/join.go line 106 at r1 (raw file):

Previously, kevinmingtarja (Kevin Mingtarja) wrote…

Hi, I have a follow up question. Does this mean we should disable the CommuteRightJoin and CommuteLeftJoin rule as well to prevent the inputs from being swapped?

Which means in CommuteJoinFlags(), we should probably add an OR on p.Flags.Has(memo.DisallowReordering) right (line 586)?

// CommuteJoinFlags returns a join private for the commuted join (where the left
// and right sides are swapped). It adjusts any join flags that are specific to
// one side.
func (c *CustomFuncs) CommuteJoinFlags(p *memo.JoinPrivate) *memo.JoinPrivate {
if p.Flags.Empty() {
return p
}

My understanding isn't the clearest yet, but I believe if we don't disable CommuteRightJoin and CommuteLeftJoin, it is still possible for the optimizer to convert a LEFT JOIN b to b RIGHT JOIN a (even if we set the DisallowReordering flag), and we don't want this right?

But if we set the join flag as disallowAll ^ DisallowHashJoinStoreRight ^ DisallowLookupJoinIntoRight ^ DisallowInvertedJoinIntoRight ^ DisallowMergeJoin (which is what I did originally), I think this won't happen as we are explicitly disallowing the joins that processes the right table first.

So even if in the memo, the reordering from commute left/right is "enumerated", it does not get explored as the optimizer will see from the join flags that those joins that processes the right table first are disallowed.

Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 11 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @kevinmingtarja, @michae2, and @rharding6373)


pkg/sql/opt/optbuilder/join.go line 106 at r1 (raw file):

Previously, kevinmingtarja (Kevin Mingtarja) wrote…

My understanding isn't the clearest yet, but I believe if we don't disable CommuteRightJoin and CommuteLeftJoin, it is still possible for the optimizer to convert a LEFT JOIN b to b RIGHT JOIN a (even if we set the DisallowReordering flag), and we don't want this right?

But if we set the join flag as disallowAll ^ DisallowHashJoinStoreRight ^ DisallowLookupJoinIntoRight ^ DisallowInvertedJoinIntoRight ^ DisallowMergeJoin (which is what I did originally), I think this won't happen as we are explicitly disallowing the joins that processes the right table first.

So even if in the memo, the reordering from commute left/right is "enumerated", it does not get explored as the optimizer will see from the join flags that those joins that processes the right table first are disallowed.

Great find! Thanks for bringing up CommuteRightJoin and CommuteLeftJoin. I think you're absolutely correct, your original implementation will force the optimizer to pick the non-commuted joins, even though those rules are still applied and the commuted joins are explored. Let's stick with your original.

It may be more efficient to not apply/explore those rules at all, but there could be some complications there, so let's not do that for now (specifically, CommuteRightJoin is a normalization rule that makes it simpler for other rules to pattern match on joins—they only need to match left joins, not right joins too. Disabling CommuteRightJoin could lead to unexpectedly poor plans when rules dependent on it aren't applied).


pkg/sql/opt/exec/execbuilder/testdata/straight_join line 1 at r1 (raw file):

# LogicTest: local

This is a good integration test, but I think this also deserves some more extensive and "unit-y" optimizer tests. I see two main things to test:

  1. That join reordering is not explored with the STRAIGHT keyword. I think this is a good place to test that: https://github.com/cockroachdb/cockroach/blob/b0e509b273235abc88d28a2dc0385c6b47c59c1b/pkg/sql/opt/xform/testdata/rules/join_order
  2. That a really high cost is given to commuted joins so that non-commuted joins are picked as the best plan by the optimizer. I think this is a good place to test that:
    # Verify that if we force lookup join but one isn't possible, the hash join has
    # huge cost (this will result in an error if we try to execbuild the result).
    opt
    SELECT k, x FROM a INNER LOOKUP JOIN b ON k=x
    ----

In both places it'd be great to test INNER, LEFT, and RIGHT joins.

If you have any questions about how these tests work, let me know.

@michae2 michae2 requested a review from mgartner January 11, 2024 00:55
Copy link
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @kevinmingtarja! This is looking great!

Reviewed 11 of 11 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @kevinmingtarja, @mgartner, and @rharding6373)


pkg/sql/opt/optbuilder/join.go line 106 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Great find! Thanks for bringing up CommuteRightJoin and CommuteLeftJoin. I think you're absolutely correct, your original implementation will force the optimizer to pick the non-commuted joins, even though those rules are still applied and the commuted joins are explored. Let's stick with your original.

It may be more efficient to not apply/explore those rules at all, but there could be some complications there, so let's not do that for now (specifically, CommuteRightJoin is a normalization rule that makes it simpler for other rules to pattern match on joins—they only need to match left joins, not right joins too. Disabling CommuteRightJoin could lead to unexpectedly poor plans when rules dependent on it aren't applied).

Just chiming in to agree that it would be nice to also support l LEFT STRAIGHT JOIN r if possible!

IMO we should allow InnerJoin, LeftOuterJoin, LeftSemiJoin, LeftAntiJoin, IntersectAllJoin, and ExceptAllJoin here (though I don't think we can produce plans with those latter 4 join types and the STRAIGHT hint yet).


pkg/sql/opt/exec/execbuilder/testdata/straight_join line 73 at r1 (raw file):

# thus enforcing t1 to be on the right side of the join.
query T
EXPLAIN (VERBOSE) SELECT * FROM t2 INNER STRAIGHT JOIN t1 ON t1.x = t2.y

It would be interesting to add another testcase which naturally produces a hash join into the left side and see that we get a hash join into the right side instead. Maybe something like:

SELECT * FROM t1 INNER JOIN t2 ON t1.x + 1 = t2.y * 2;

@kevinmingtarja kevinmingtarja force-pushed the straight-join branch 2 times, most recently from 409ae40 to 365ea77 Compare January 29, 2024 05:48
Copy link
Contributor Author

@kevinmingtarja kevinmingtarja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @michae2, and @rharding6373)


pkg/sql/opt/optbuilder/join.go line 106 at r1 (raw file):

Previously, michae2 (Michael Erickson) wrote…

Just chiming in to agree that it would be nice to also support l LEFT STRAIGHT JOIN r if possible!

IMO we should allow InnerJoin, LeftOuterJoin, LeftSemiJoin, LeftAntiJoin, IntersectAllJoin, and ExceptAllJoin here (though I don't think we can produce plans with those latter 4 join types and the STRAIGHT hint yet).

Got it! I have removed the restriction on the join type, so now l LEFT STRAIGHT JOIN r and l RIGHT STRAIGHT JOIN r is possible, plus I've added test cases for those as well.


pkg/sql/opt/exec/execbuilder/testdata/straight_join line 1 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

This is a good integration test, but I think this also deserves some more extensive and "unit-y" optimizer tests. I see two main things to test:

  1. That join reordering is not explored with the STRAIGHT keyword. I think this is a good place to test that: https://github.com/cockroachdb/cockroach/blob/b0e509b273235abc88d28a2dc0385c6b47c59c1b/pkg/sql/opt/xform/testdata/rules/join_order
  2. That a really high cost is given to commuted joins so that non-commuted joins are picked as the best plan by the optimizer. I think this is a good place to test that:
    # Verify that if we force lookup join but one isn't possible, the hash join has
    # huge cost (this will result in an error if we try to execbuild the result).
    opt
    SELECT k, x FROM a INNER LOOKUP JOIN b ON k=x
    ----

In both places it'd be great to test INNER, LEFT, and RIGHT joins.

If you have any questions about how these tests work, let me know.

Thanks for the pointers @mgartner ! This was super helpful, I've added test cases for INNER, LEFT, and RIGHT joins for (1), and for LEFT and RIGHT joins for (2).

For (2), I used exploretrace to output the transformation step that involves the commuted join with huge cost, since opt and memo only shows the best plan. Let me know if this works or if there's a better way to capture this in the test!


pkg/sql/opt/exec/execbuilder/testdata/straight_join line 73 at r1 (raw file):

Previously, michae2 (Michael Erickson) wrote…

It would be interesting to add another testcase which naturally produces a hash join into the left side and see that we get a hash join into the right side instead. Maybe something like:

SELECT * FROM t1 INNER JOIN t2 ON t1.x + 1 = t2.y * 2;

That's a good idea. I've added a test for that by adding a new unindexed column t2.z and use that as the join key, which will naturally produce a hash join with t1 as the "build" side.

@rharding6373 rharding6373 removed their request for review January 29, 2024 21:42
Copy link
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: This looks good to me! Great job @kevinmingtarja. I just have a few last suggestions about the execbuilder test.

@mgartner do you have any comments?

Reviewed 10 of 10 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @kevinmingtarja, @mgartner, and @rharding6373)


pkg/sql/opt/exec/execbuilder/testdata/straight_join line 16 at r2 (raw file):

    "created_at": "2018-01-01 1:00:00.00000+00:00",
    "row_count": 100,
    "distinct_count": 100

Let's stick a "null_count": 0 in here, too, just to be sure.


pkg/sql/opt/exec/execbuilder/testdata/straight_join line 28 at r2 (raw file):

    "distinct_count": 10000
  }
]'

Using "columns": ["y", "z"] actually creates a single multi-column stat rather than the more usual two single-column stats. Let's revise this a little to something like:

statement ok
ALTER TABLE t2 INJECT STATISTICS '[
  {
    "columns": ["x"],
    "created_at": "2018-01-01 1:00:00.00000+00:00",
    "row_count": 10000,
    "distinct_count": 10000,
    "null_count": 0
  },
  {
    "columns": ["y"],
    "created_at": "2018-01-01 1:00:00.00000+00:00",
    "row_count": 10000,
    "distinct_count": 10000,
    "null_count": 0
  },
  {
    "columns": ["z"],
    "created_at": "2018-01-01 1:00:00.00000+00:00",
    "row_count": 10000,
    "distinct_count": 10000,
    "null_count": 0
  }
]'

pkg/sql/opt/exec/execbuilder/testdata/straight_join line 218 at r2 (raw file):

# Now, the best plan should no longer be picked, as we're forcing the join order.
query T
EXPLAIN (VERBOSE) SELECT * FROM t1 LEFT STRAIGHT JOIN t2 ON t1.x = t2.z

It would be nice to also confirm that EXPLAIN (VERBOSE) SELECT * FROM t1 RIGHT STRAIGHT JOIN t2 ON t1.x = t2.z also produces the same plan except with a right outer hash join on top.

Copy link
Contributor Author

@kevinmingtarja kevinmingtarja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review and suggestions @michae2 ! I've just pushed the fixes.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner, @michae2, and @rharding6373)

Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 11 files at r1, 9 of 10 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @kevinmingtarja and @rharding6373)


pkg/sql/parser/sql.y line 13795 at r3 (raw file):

//  - STRAIGHT forces the join order, but not the algorithm; in other words, the
//    left table is always read before the right table. `STRAIGHT` can only
//    be used with INNER joins (though this is not enforced by the syntax).

The part about only being used with INNER joins is no longer correct.

nit: "in other words, the left table is always read before the right table" is not technically correct. I would just remove that part to avoid confusion.


pkg/sql/opt/exec/execbuilder/testdata/straight_join line 1 at r1 (raw file):

For (2), I used exploretrace to output the transformation step that involves the commuted join with huge cost, since opt and memo only shows the best plan. Let me know if this works or if there's a better way to capture this in the test!

Works for me. Nice job finding and making use of the exploretrace test command!


pkg/sql/opt/xform/testdata/rules/join_order line 3542 at r3 (raw file):

 ├── scan straight_join2
 └── filters
      └── straight_join1.x = y

I think it'd be good to have some tests with 2 joins where both or only one of the joins has a STRAIGHT hint.

Our current join hints (a INNER HASH JOIN b, a LEFT LOOKUP JOIN b, etc)
fixes both the join order and the join algorithm. This commit adds the
syntax and support for hinting the join order without hinting the
join algorithm. This will be useful for the (few) cases where the
optimizer processes the tables in a suboptimal order.

Resolves: cockroachdb#115308

Release note (sql change): It is now possible to hint to the optimizer
that it should plan a straight join by using the syntax `... INNER
STRAIGHT JOIN ...`. If the hint is provided, the optimizer will now
fix the join order as given in the query, even if it estimates
that a different plan using join reordering would have a lower cost.
Copy link
Contributor Author

@kevinmingtarja kevinmingtarja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @rharding6373)


pkg/sql/parser/sql.y line 13795 at r3 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

The part about only being used with INNER joins is no longer correct.

nit: "in other words, the left table is always read before the right table" is not technically correct. I would just remove that part to avoid confusion.

Done.


pkg/sql/opt/xform/testdata/rules/join_order line 3542 at r3 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

I think it'd be good to have some tests with 2 joins where both or only one of the joins has a STRAIGHT hint.

Done.

Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Great work! Thanks again for contributing!

bors r+

Reviewed 2 of 2 files at r4, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @kevinmingtarja and @rharding6373)

@craig
Copy link
Contributor

craig bot commented Feb 5, 2024

Build failed (retrying...):

@craig craig bot merged commit f68f0fe into cockroachdb:master Feb 5, 2024
7 checks passed
@craig
Copy link
Contributor

craig bot commented Feb 5, 2024

Build succeeded:

@kevinmingtarja kevinmingtarja deleted the straight-join branch February 6, 2024 02:17
@kevinmingtarja
Copy link
Contributor Author

Thanks for all the help and review folks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-community Originated from the community
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: join hint to fix join order without fixing join algorithm
5 participants