Skip to content

sql: hint joins after the JOIN keyword, not before#35354

Closed
knz wants to merge 1 commit into
cockroachdb:masterfrom
knz:20190304-sql-hints
Closed

sql: hint joins after the JOIN keyword, not before#35354
knz wants to merge 1 commit into
cockroachdb:masterfrom
knz:20190304-sql-hints

Conversation

@knz
Copy link
Copy Markdown
Contributor

@knz knz commented Mar 4, 2019

Fixes #35347.

This commit changes the syntax a MERGE JOIN b to a JOIN @MERGE b.

The reason for this is that the general syntax a b JOIN c is also
standard SQL shorthand for a AS b JOIN c. So a user can mistakenly
enter a MERGE JOIN b, expecting to specify a join hint, and instead
without noticing be requesting a AS "merge" JOIN b instead.

(The observant eye may notice that the grammar did not allow a naked
MERGE directly after an identifier, and instead required it to follow
one of the other JOIN keywords: INNER, LEFT, RIGHT etc. However, this
is a red herring: because these other JOIN keywords are optional, a
user can legitimately expect them to also be optional when a hint is
specified.)

Release note: None

This commit changes the syntax `a MERGE JOIN b` to `a JOIN @merge b`.

The reason for this is that the general syntax `a b JOIN c` is also
standard SQL shorthand for `a AS b JOIN c`. So a user can mistakenly
enter `a MERGE JOIN b`, expecting to specify a join hint, and instead
without noticing be requesting `a AS "merge" JOIN b` instead.

(The observant eye may notice that the grammar did not allow a naked
MERGE directly after an identifier, and instead required it to follow
one of the other JOIN keywords: INNER, LEFT, RIGHT etc. However, this
is a red herring: because these other JOIN keywords are optional, a
user can legitimately expect them to also be optional when a hint is
specified.)

Release note: None
@knz knz requested review from a team March 4, 2019 18:54
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@knz
Copy link
Copy Markdown
Contributor Author

knz commented Mar 4, 2019

cc @RaduBerinde @awoods187.

This PR is merely my strawman counter-proposal that both solves #35347 and also enables us to extend the syntax further with additional join flags, e.g. via the syntax @{...} which we also use to specify scan hints.

(The PR is not yet complete -- I'd need to extend the logic tests -- but I'll do that after we've settled on an approach.)

@knz knz requested a review from RaduBerinde March 4, 2019 18:56
@RaduBerinde
Copy link
Copy Markdown
Member

CC @awoods187 @andy-kimball

IMO the consistency with index hints is not a plus - table@index makes a lot of sense (and @{..} was an extension of that), whereas JOIN @MERGE seems kind of confusing to me. Would JOIN { MERGE } work? (I didn't try your approach of putting the extra chars inside opt_join_hint)

@knz
Copy link
Copy Markdown
Contributor Author

knz commented Mar 4, 2019

Maybe @ is not nice, but the braces would be a rather ... surprising new grouping.

What about + perhaps? a join +hash b?

@RaduBerinde
Copy link
Copy Markdown
Member

Don't know what to say. inner merge join was pretty self-explanatory whereas these alternatives look cryptic. I lean toward sticking to the existing syntax despite #35347 (hints are supposed to be an "advanced" feature to be used with care, so I'm not too worried about improper use).

@RaduBerinde
Copy link
Copy Markdown
Member

Forgot to add: I will defer to @awoods187 if he has a strong opinion.

@andy-kimball
Copy link
Copy Markdown
Contributor

One other piece of data: SQLS uses the INNER MERGE JOIN syntax that we've currently got implemented. I think we can still add a more general-purpose hinting syntax on top of that. But I think join hints are so common and important that it's OK to use a specialized syntax for that case. It's OK if they don't fit into the general-purpose syntax later on.

@knz
Copy link
Copy Markdown
Contributor Author

knz commented Mar 4, 2019

ok I'd agree with supporting the MSSQL syntax for convenience but for the purpose of regularity we must document syntax that has predictable behavior, and the mssql approach does not pass that criterion.

@knz
Copy link
Copy Markdown
Contributor Author

knz commented Mar 4, 2019

(i.e. I suggest we can implement both but only document the one that's idiot-proof simple and predictable)

@RaduBerinde
Copy link
Copy Markdown
Member

RaduBerinde commented Mar 5, 2019

Another datapoint - Spanner SQL uses similar syntax (INNER HASH JOIN) but also an extended syntax like JOIN@{JOIN_TYPE=HASH_JOIN}. Their best practices docs only mention the latter.

Edit: the former syntax only supports HASH (which is a reserved keyword), and it supports a HASH JOIN b without INNER.

@knz
Copy link
Copy Markdown
Contributor Author

knz commented Mar 5, 2019

My vote is to do what spanner does.

@RaduBerinde
Copy link
Copy Markdown
Member

My vote is to do what spanner does.

Note that we'd have to reserve the HASH et al keywords to support both these variants.

@knz
Copy link
Copy Markdown
Contributor Author

knz commented Mar 5, 2019

No I disagree. Supporting both variants comes back to the initial issue -- we'd need to inform users that a HASH JOIN b is not a thing, and that only a INNERT HASH JOIN b is supported (like in Spanner). i.e. Push back against Andy's concern with explanatory docs.

@RaduBerinde
Copy link
Copy Markdown
Member

Ok, so when you say "do what spanner does", what exactly do you mean? Spanner supports both a INNER HASH JOIN b and a HASH JOIN b, as well as the extended syntax. Do you mean only the extended @{JOIN_TYPE=HASH_JOIN} syntax?

@cockroachdb cockroachdb deleted a comment from knz Mar 5, 2019
@knz
Copy link
Copy Markdown
Contributor Author

knz commented Mar 5, 2019

In your comment above:

  • I read that Spanner supports a INNER HASH JOIN b
  • I read that Spanner supports a JOIN@{JOIN_TYPE=HASH_JOIN} b
  • I do not read that Spanner supports a HASH JOIN b (I would be surprised if they did)

If my reading is right, then I recommend doing these 3 things and not support a HASH JOIN b.
If my reading is wrong, then I do not recommend doing what Spanner does any more :) the idea we should not reserve more keywords is absolutely the strongest.

@knz
Copy link
Copy Markdown
Contributor Author

knz commented Mar 5, 2019

Oh I just see you edited that comment. I'll take my support of Spanner back then.

@RaduBerinde
Copy link
Copy Markdown
Member

Sorry, should have posted a separate comment.

I'm not convinced that we should keep multiple variants. If we implement and plan to document only the extended one, I think we should remove the short one and avoid this potential point of confusion.

@andy-kimball
Copy link
Copy Markdown
Contributor

My vote is to do what Spanner does and support both (eventually, since I don't think we should do anything more for this release). I think the INNER HASH JOIN syntax is more memorable and intuitive, whereas JOIN@{JOIN_TYPE=HASH_JOIN} is more general and extensible. I think supporting both is fine from a design point of view. Having multiple ways to do the same thing is fine if one is simple and specialized for the 90% case, and the other is more complex but generalized for the 100% case.

@awoods187
Copy link
Copy Markdown

We also know of another type of hint that Oracle and MySQL use that is a variant of C-style comment syntax:

specify a hint using /* + ... */
MySQL: https://dev.mysql.com/doc/refman/5.7/en/optimizer-hints.html
Oracle: https://docs.oracle.com/cd/B19306_01/server.102/b14211/hintsref.htm#i27644

I think we should proceed with the a INNER HASH JOIN b syntax and then invest more in one of the other more extensible syntax options. It will be a little annoying that users can run into implicit aliases but we can work with docs around this in the short term.

@knz
Copy link
Copy Markdown
Contributor Author

knz commented Mar 7, 2019

As discussed offline: we're keeping the current syntax. Will need documentation effort to clarify the possible ambiguity.

@knz knz closed this Mar 7, 2019
@knz knz deleted the 20190304-sql-hints branch March 7, 2019 20:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants