Skip to content

sql/parser: recognize qualified type names#47216

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
rohany:sql-gram
Apr 11, 2020
Merged

sql/parser: recognize qualified type names#47216
craig[bot] merged 1 commit intocockroachdb:masterfrom
rohany:sql-gram

Conversation

@rohany
Copy link
Copy Markdown
Contributor

@rohany rohany commented Apr 8, 2020

This PR allows the parser to recognize qualified typenames
and generic typenames.

To accomplish this, we move some type and typecast related
productions closer to how the postgres grammar operates,
and separate out type names from the unreserved_keyword_list.

Release note: None

@rohany rohany requested a review from knz April 8, 2020 18:00
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@rohany
Copy link
Copy Markdown
Contributor Author

rohany commented Apr 8, 2020

Opening up a draft with the ideas i mentioned, please take a look!

@blathers-crl
Copy link
Copy Markdown

blathers-crl Bot commented Apr 8, 2020

❌ The GitHub CI (Cockroach) build has failed on 2307f3dc.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. I am experimental - my owner is otan.

@rohany
Copy link
Copy Markdown
Contributor Author

rohany commented Apr 8, 2020

Nice, the failures don't seem worrying, just slightly different output in some error messages.

@blathers-crl
Copy link
Copy Markdown

blathers-crl Bot commented Apr 8, 2020

❌ The GitHub CI (Cockroach) build has failed on b8b78641.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. I am experimental - my owner is otan.

@knz knz changed the title parser: add unimplemented support for parsing qualified types sql/parser: recognize qualified type names Apr 9, 2020
@knz
Copy link
Copy Markdown
Contributor

knz commented Apr 9, 2020

Good start!

Here are some things that postgres supports but your patch doesn't (yet):

  • SELECT 1::notatype (custom type name without qualification)
  • CREATE TABLE t(x notatype) (ditto)
  • SELECT my.type '1' (typed literal)
  • SELECT 1::int4.mytype (all the predefined type names are also valid db/schema names)

@rohany rohany force-pushed the sql-gram branch 2 times, most recently from a416562 to 2ba58aa Compare April 9, 2020 14:41
@blathers-crl
Copy link
Copy Markdown

blathers-crl Bot commented Apr 9, 2020

❌ The GitHub CI (Cockroach) build has failed on 2ba58aa7.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. I am experimental - my owner is otan.

@rohany
Copy link
Copy Markdown
Contributor Author

rohany commented Apr 9, 2020

PTAL -- my original commit had support for the first two but I forgot to add a test.

Adding @jordanlewis here, might have some comments.

I think that we because added type aliases as keywords (something postgres doesn't do) led to a variety of conflicts. I removed the keywords for these and moved some of these aliases to be resolved by a lookup table during parsing (it replaces what we have now, but an argument could be that this should only be done during typechecking).

For these aliases, we can reference them like you mentioned above (select 1::int4.ty), but for some reserved types like select 1::int.ty both us and postgres can't.

Additionally, it seems like allowing types to be named FAMILY with no quotes causes some abiguity that I can't seem to get around, which led me to separate our type_func_name_keywords into two sets: with and without our extensions (eg. FAMILY).

Does this all make sense, or am i just making this tower of blocks more fragile? I don't have a good intuition for this yet.

@blathers-crl
Copy link
Copy Markdown

blathers-crl Bot commented Apr 9, 2020

❌ The GitHub CI (Cockroach) build has failed on 6f43cf36.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. I am experimental - my owner is otan.

@rohany rohany force-pushed the sql-gram branch 2 times, most recently from 443a0df to 21bc1b1 Compare April 9, 2020 20:37
@rohany
Copy link
Copy Markdown
Contributor Author

rohany commented Apr 9, 2020

Additionally I don't see a way of parsing certain type keywords that are used in other places (like JSON, STRING) without quotes as qualified accessors for types -- e.g. I don't see a way to get SELECT 1::json.mytype to work due to a conflict related to the <TYPEname> SCONST implicit casting syntax. This casting syntax is driving a lot of keyword separation introduced in this PR.

@blathers-crl
Copy link
Copy Markdown

blathers-crl Bot commented Apr 9, 2020

❌ The GitHub CI (Cockroach) build has failed on 21bc1b1a.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. I am experimental - my owner is otan.

@blathers-crl
Copy link
Copy Markdown

blathers-crl Bot commented Apr 10, 2020

❌ The GitHub CI (Cockroach) build has failed on d07d3c57.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. I am experimental - my owner is otan.

@knz
Copy link
Copy Markdown
Contributor

knz commented Apr 10, 2020

Well it's odd to me that this is so complex, given that pg supports it with comparatively fewer grammar rules.

@rohany
Copy link
Copy Markdown
Contributor Author

rohany commented Apr 10, 2020

I think I'm bringing us closer to pg here -- PG doesn't have all of these typenames in unreserved, so it doesn't run into the conflicts I'm trying to fix here (its unclear to me how it avoids conflicts for the few keywords I have left in unreserved_type_keyword). The apparent complexity of this PR is just in separating these unreserved_type_keywords and "FAMILY" from valid typenames. The work this PR is doing to dispatch to a known type when seeing an identifier should be removed in future PRs because it seems like type resolution that depends on a search path -- it's incorrect to be doing it at parse time here.

The most confusing this about pg's grammar around this for me is their rule for GenericType. it looks like this:

type_function_name opt_type_modifiers {
 $$ = makeTypeName($1);
 $$->typmods = $2;
 $$->location = @1;
}

and type_function_name looks like

type_function_name:	IDENT { $$ = $1; }
  | unreserved_keyword		{ $$ = pstrdup($1); }
  | type_func_name_keyword	{ $$ = pstrdup($1); }

So I don't know where they get their qualifications from.

Our conflicts arise from names potentially starting with unreserved_keywords, which causes conflicts when we need to construct types out of those keywords.

@knz
Copy link
Copy Markdown
Contributor

knz commented Apr 10, 2020

You misread pg's grammar:

GenericType:
            type_function_name opt_type_modifiers
            | type_function_name attrs opt_type_modifiers
        ;

The magic is in attrs (like it was in crdb's original .y file)

@rohany rohany force-pushed the sql-gram branch 2 times, most recently from 916f117 to 4cc9fdb Compare April 10, 2020 18:35
@rohany
Copy link
Copy Markdown
Contributor Author

rohany commented Apr 10, 2020

Alright, I was able to simplify this more. I got rid of the separation between reserved and unreserved type keywords, which don't exist in PG's grammar. Now the only separation is in the type_func_keywords to possibly contain our extensions.

Copy link
Copy Markdown
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very, very good.

Reviewed 4 of 4 files at r1, 1 of 7 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz and @rohany)


pkg/sql/parser/parse_test.go, line 3287 at r2 (raw file):

		{`SELECT 1::int4.typ`, 0, `qualified types`, ``},
		{`SELECT 1::db.schem.typ`, 0, `qualified types`, ``},
		{`CREATE TABLE t (x special.type)`, 0, `qualified types`, ``},

can you add the additional CREATE TABLE tests with composite type names containing keywords here to see which errors are reported.

Also verify:

  • <composite type> ARRAY [ ]
  • <composite type> ARRAY [ 123 ]
  • 123 IS OF (<composite type>, <composite type>)

also with mixes of keywords and non-keywords.


pkg/sql/parser/sql.y, line 8354 at r2 (raw file):

  }
| func_name '(' expr_list opt_sort_clause ')' SCONST { return unimplemented(sqllex, $1.unresolvedName().String() + "(...) SCONST") }
// The key here is that none of the keywords in the func_name_no_type_reserved

Would it be possible to pull both this rule and that below (const_typename SCONST) into a separate non-terminal called typed_literal or something and explain it as a separate thing.
This makes it easier visually to connect the two things together.


pkg/sql/parser/sql.y, line 8357 at r2 (raw file):

// production can overlap with the type rules in const_typename, otherwise
// we will have conflicts between this rule and the one below.
| func_name_no_type_reserved SCONST

Separate comment: maybe the name func_name_excluding_const_typenames would be better? This way it's clearer visually that they don't overlap when quickly glancing through.


pkg/sql/parser/sql.y, line 9802 at r2 (raw file):

// Type/function identifier without CRDB extra reserved keywords.
type_function_name_no_crdb_extra:

Keep the naming pattern from type_func_name_keyword:

type_function_name_no_crdb_extra -> type_func_name_keyword_no_crdb_extra

This ensures that both type_func_name_keyword AND type_func_name_keyword_no_crdb_extra get highlighted in a text editor search when searching for the common prefix.


pkg/sql/parser/sql.y, line 10158 at r2 (raw file):

type_func_name_keyword:
  pg_type_func_name_keyword
| cockroachdb_extra_type_func_name_keyword

You already used _no_crdb_extra above. I recommend the following renames:

  • pg_type_func_name_keyword -> func_name_keyword_no_crdb_extra
  • cockroachdb_extra_type_func_name_keyword -> func_name_keyword_crdb_extra

Copy link
Copy Markdown
Contributor Author

@rohany rohany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz)


pkg/sql/parser/parse_test.go, line 3287 at r2 (raw file):

Previously, knz (kena) wrote…

can you add the additional CREATE TABLE tests with composite type names containing keywords here to see which errors are reported.

Also verify:

  • <composite type> ARRAY [ ]
  • <composite type> ARRAY [ 123 ]
  • 123 IS OF (<composite type>, <composite type>)

also with mixes of keywords and non-keywords.

Done


pkg/sql/parser/sql.y, line 8354 at r2 (raw file):

Previously, knz (kena) wrote…

Would it be possible to pull both this rule and that below (const_typename SCONST) into a separate non-terminal called typed_literal or something and explain it as a separate thing.
This makes it easier visually to connect the two things together.

Done, that makes thing clearer.


pkg/sql/parser/sql.y, line 8357 at r2 (raw file):

Previously, knz (kena) wrote…

Separate comment: maybe the name func_name_excluding_const_typenames would be better? This way it's clearer visually that they don't overlap when quickly glancing through.

I renamed it -- after I got rid of separate typenames, this should just be named func_name_no_crdb_extra, since that is the only difference now.


pkg/sql/parser/sql.y, line 9802 at r2 (raw file):

Previously, knz (kena) wrote…

Keep the naming pattern from type_func_name_keyword:

type_function_name_no_crdb_extra -> type_func_name_keyword_no_crdb_extra

This ensures that both type_func_name_keyword AND type_func_name_keyword_no_crdb_extra get highlighted in a text editor search when searching for the common prefix.

I think type_function_name_no_crdb_extra follows along with the patterns we have so far -- it mirrors the rule type_function_name, so I don't want to change that prefix.


pkg/sql/parser/sql.y, line 10158 at r2 (raw file):

Previously, knz (kena) wrote…

You already used _no_crdb_extra above. I recommend the following renames:

  • pg_type_func_name_keyword -> func_name_keyword_no_crdb_extra
  • cockroachdb_extra_type_func_name_keyword -> func_name_keyword_crdb_extra

I did something similar. I need to end the production with keyword so the keyword generator picks it up, so I can't follow this exactly.

@rohany rohany marked this pull request as ready for review April 10, 2020 20:57
Copy link
Copy Markdown
Member

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great simplification! Looks good to me.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz and @rohany)


pkg/sql/parser/parse_test.go, line 2652 at r3 (raw file):

		{
			`SELECT CAST(1.2+2.3 AS notatype)`,
			`at or near ")": syntax error: type does not exist

Hmm... I'd definitely like to see this error improved somehow. Can it say type "notatype" does not exist?


pkg/sql/parser/sql.y, line 7355 at r3 (raw file):

          case 0:
            // Note: we can only report an unimplemented error for specific
            // known type names. Anything else may return PII.

nit: I'm not sure the phrase "PII" is used much here - how about "sensitive info"?

This PR allows the parser to recognize qualified typenames
and generic typenames.

To accomplish this, we move some type and typecast related
productions closer to how the postgres grammar operates,
and separate out type names from the unreserved_keyword_list.

Release note: None
Copy link
Copy Markdown
Contributor Author

@rohany rohany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @knz)


pkg/sql/parser/parse_test.go, line 2652 at r3 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Hmm... I'd definitely like to see this error improved somehow. Can it say type "notatype" does not exist?

Done.

Copy link
Copy Markdown
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: 💯 nice

Reviewed 2 of 7 files at r2, 3 of 5 files at r3, 2 of 2 files at r4.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis)

@rohany
Copy link
Copy Markdown
Contributor Author

rohany commented Apr 11, 2020

bors r+

TFTR!

@craig
Copy link
Copy Markdown
Contributor

craig Bot commented Apr 11, 2020

Build succeeded

@craig craig Bot merged commit aa705a8 into cockroachdb:master Apr 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants