builtins: automatically add builtins for each type cast #97093

Merged
merged 3 commits into from
Mar 2, 2023

Conversation

otan
Contributor

@otan otan commented Feb 14, 2023

In PG, casts from one type to another can also use the function syntax, e.g. date(now()) = now()::date. This is done at type resolution time.

Unfortunately we do not support that in type resolution, and from experience long ago it was tricky to do so (happy to be proven wrong).

This change instead defines a builtin for each castable type, which emulates the same behavior. We already kind of do this for oid and inet, so this isn't much worse right?
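A minimal sketch of the approach described above, assuming a toy cast table keyed by target type name. The names `castFn`, `casts`, `builtins`, and `registerCastBuiltins` are hypothetical and are not the identifiers used in this PR; the point is only that each generated builtin delegates to the existing cast code.

```go
package main

import (
	"fmt"
	"strconv"
)

// castFn converts a textual input into the target type's textual form.
// Hypothetical signature, for illustration only.
type castFn func(in string) (string, error)

// casts is a toy cast table keyed by the target type name.
var casts = map[string]castFn{
	"int8": func(in string) (string, error) {
		n, err := strconv.ParseInt(in, 10, 64)
		if err != nil {
			return "", fmt.Errorf("could not cast %q to int8: %w", in, err)
		}
		return strconv.FormatInt(n, 10), nil
	},
	"bool": func(in string) (string, error) {
		b, err := strconv.ParseBool(in)
		if err != nil {
			return "", fmt.Errorf("could not cast %q to bool: %w", in, err)
		}
		return strconv.FormatBool(b), nil
	},
}

// builtins maps function names to their implementations.
var builtins = map[string]castFn{}

// registerCastBuiltins adds one builtin per castable type. Each builtin
// calls straight into the cast implementation. Names that already have a
// hand-written builtin are left untouched.
func registerCastBuiltins() {
	for name, fn := range casts {
		if _, exists := builtins[name]; exists {
			continue
		}
		builtins[name] = fn
	}
}

func main() {
	registerCastBuiltins()
	out, err := builtins["int8"]("42")
	fmt.Println(out, err)
}
```

With this shape, `int8('42')` and `'42'::int8` go through the same code path, so the two syntaxes cannot drift apart.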

Release note (sql change): Each type cast is now expressible as a function, e.g. now()::date can be expressed as date(now()).

Resolves #97067

@otan otan requested a review from a team February 14, 2023 05:09
@rafiss
Collaborator

rafiss commented Feb 14, 2023

this is cool! did you also see #94859

there's one note about PG though -- if i recall correctly, in postgres each cast is implemented by calling the respective builtin function. this PR is going the other way though: each builtin function that is being added here is implemented by calling into the cast code. i'm not sure how much this distinction matters. one thing that could be related is in the pg_cast table, we should start filling out the castfunc column

@otan
Contributor Author

otan commented Feb 14, 2023

there's one note about PG though -- if i recall correctly, in postgres each cast is implemented by calling the respective builtin function.

that's true; ideally we make this work in the type resolution layer like pg does:

If no exact match is found, see if the function call appears to be a special type conversion request. This happens if the function call has just one argument and the function name is the same as the (internal) name of some data type. Furthermore, the function argument must be either an unknown-type literal, or a type that is binary-coercible to the named data type, or a type that could be converted to the named data type by applying that type's I/O functions (that is, the conversion is either to or from one of the standard string types). When these conditions are met, the function call is treated as a form of CAST specification. [11]
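The rule quoted above can be sketched roughly as a predicate. This is an illustration of the quoted text only, not Postgres or CockroachDB code: the type tables are toy data, and `arg`, `isCastLikeCall`, and the maps are hypothetical names. It applies only after an exact function match has already failed.

```go
package main

import "fmt"

// argKind distinguishes an unknown-type literal from a typed expression.
type argKind int

const (
	unknownLiteral argKind = iota
	typed
)

// arg describes a single function argument. typ is ignored for literals.
type arg struct {
	kind argKind
	typ  string
}

// Toy catalogs, assumed for this sketch only.
var (
	typeNames   = map[string]bool{"date": true, "text": true, "varchar": true, "int4": true}
	stringTypes = map[string]bool{"text": true, "varchar": true}
	// binaryCoercible lists {from, to} pairs that need no conversion.
	binaryCoercible = map[[2]string]bool{
		{"varchar", "text"}: true,
		{"text", "varchar"}: true,
	}
)

// isCastLikeCall reports whether fn(args...) should be treated as a CAST.
func isCastLikeCall(fn string, args []arg) bool {
	// Exactly one argument, and the function name must be a type name.
	if len(args) != 1 || !typeNames[fn] {
		return false
	}
	a := args[0]
	if a.kind == unknownLiteral {
		return true
	}
	// A type is trivially binary-coercible to itself.
	if a.typ == fn || binaryCoercible[[2]string{a.typ, fn}] {
		return true
	}
	// I/O conversion: to or from one of the standard string types.
	return stringTypes[a.typ] || stringTypes[fn]
}

func main() {
	fmt.Println(isCastLikeCall("text", []arg{{kind: typed, typ: "int4"}}))
	fmt.Println(isCastLikeCall("date", []arg{{kind: typed, typ: "int4"}}))
}
```

Note that a call like `date(now())` does not need this fallback in Postgres if a matching `date` function exists; the fallback only covers calls with no exact match.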

i don't recall it being easy to put this into type resolution (pls prove me wrong :D).


this is also annoying as this PR will probably tie us in deeper to the int8/float8 by default instead of int4/float4. basically to make this work:

query error OID out of range: -2147483649
SELECT oid(-2147483649)
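A sketch of the kind of range check that test exercises. It assumes the argument arrives as a 64-bit integer (which is why the int8 default matters here) and that negative values within the int32 range are accepted and wrapped; both the bounds and the name `intToOid` are assumptions made for illustration, inferred only from the error text above.

```go
package main

import (
	"fmt"
	"math"
)

// intToOid converts a 64-bit integer to a 32-bit OID, rejecting values
// that fit neither a signed nor an unsigned 32-bit integer.
// The accepted range is an assumption for this sketch.
func intToOid(i int64) (uint32, error) {
	if i < math.MinInt32 || i > math.MaxUint32 {
		return 0, fmt.Errorf("OID out of range: %d", i)
	}
	// Negative values in the int32 range wrap around to large OIDs.
	return uint32(i), nil
}

func main() {
	_, err := intToOid(-2147483649)
	fmt.Println(err)
}
```

If the literal were typed as int4 instead, `-2147483649` would fail earlier with an integer overflow and never reach the OID check.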

@otan otan force-pushed the autocast branch 2 times, most recently from 60991c5 to 7efa731 Compare February 14, 2023 07:52
@otan otan requested a review from a team as a code owner February 14, 2023 07:52
@otan otan force-pushed the autocast branch 3 times, most recently from e378bbe to c764856 Compare February 14, 2023 20:54
@otan otan requested a review from a team as a code owner February 14, 2023 20:54
@otan otan requested a review from cucaroach February 14, 2023 20:54
@mgartner
Collaborator

i don't recall it being easy to put this into type resolution (pls prove me wrong :D).

If you define the builtins, I wonder if the function resolution we have will work well enough. It's unlikely to work 100% consistently with Postgres until we rewrite our function/type resolution (see #75101 and #88374 (comment)).

@otan
Contributor Author

otan commented Feb 14, 2023

If you define the builtins, I wonder if the function resolution we have will work well enough.

based on my manual tests + my random test for casts it works well enough for now.

It's unlikely to work 100% consistently with Postgres until we rewrite our function/type resolution

we'll get there one day. can't say i didn't try :P

Contributor

@cucaroach cucaroach left a comment

Seems fine to me, but this is a little outside my purview; I would like to defer to Rafi or Marcus.

Reviewed 8 of 13 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @otan)


pkg/internal/sqlsmith/schema.go line 500 at r1 (raw file):

		if n := tree.Name(def.Name); n.String() != def.Name {
			// sqlsmith doesn't seem to know how to quote these.

What's going on here?

Collaborator

@mgartner mgartner left a comment

Reviewed 4 of 13 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @cucaroach and @otan)


pkg/sql/sem/builtins/pg_builtins.go line 200 at r1 (raw file):

	}
	for toOID, def := range castBuiltins {
		n := strings.ToLower(types.OidToType[toOID].SQLString())

I don't think SQLString works here - Postgres's naming of these functions doesn't match the SQLString naming of a type. For example, varchar, bit, decimal, and char don't seem to be functions in Postgres. It looks like they might exist, but with different names. For example, bpchar seems to work for the char type:

marcus=# select pg_typeof('foo'::CHAR), pg_typeof(bpchar('foo'));
 pg_typeof | pg_typeof
-----------+-----------
 character | character
(1 row)

pkg/sql/sem/builtins/cast_test.go line 68 at r1 (raw file):

	switch typ.Oid() {
	case oid.T_char:
		return "char"

oid.T_char represents the "char" type, not to be confused with the CHAR type:

marcus=# SELECT pg_typeof('foo'::CHAR), pg_typeof('foo'::"char"), pg_typeof("char"('foo'));
 pg_typeof | pg_typeof | pg_typeof
-----------+-----------+-----------
 character | "char"    | "char"
(1 row)

So you'll want to keep the double quotes in the query, e.g., SELECT "char"(...) instead of SELECT char(...). I think typ.SQLString already includes the double quotes for this type.

@otan
Contributor Author

otan commented Feb 15, 2023

pkg/sql/sem/builtins/pg_builtins.go line 200 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

I don't think SQLString works here - Postgres's naming of these functions doesn't match the SQLString naming of a type. For example, varchar, bit, decimal, and char don't seem to be functions in Postgres. It looks like they might exist, but with different names. For example, bpchar seems to work for the char type:

marcus=# select pg_typeof('foo'::CHAR), pg_typeof(bpchar('foo'));
 pg_typeof | pg_typeof
-----------+-----------
 character | character
(1 row)

i think this is still right, you just need to quote the name.

otan=# select "varbit"('0100');
 varbit
--------
 0100
(1 row)
otan=# select "numeric"(1234::int);
 numeric
---------
    1234
(1 row)
otan=# select "char"(96::int);
 char
------
 `
(1 row)

@otan
Contributor Author

otan commented Feb 15, 2023

pkg/sql/sem/builtins/cast_test.go line 68 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

oid.T_char represents the "char" type, not to be confused with the CHAR type:

marcus=# SELECT pg_typeof('foo'::CHAR), pg_typeof('foo'::"char"), pg_typeof("char"('foo'));
 pg_typeof | pg_typeof | pg_typeof
-----------+-----------+-----------
 character | "char"    | "char"
(1 row)

So you'll want to keep the double quotes in the query, e.g., SELECT "char"(...) instead of SELECT char(...). I think typ.SQLString already includes the double quotes for this type.

no this is right, tree.Name("char").String() makes it a "char".

@otan
Contributor Author

otan commented Feb 15, 2023

pkg/sql/sem/builtins/cast_test.go line 68 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

no this is right, tree.Name("char").String() makes it a "char".

ah nah good catch, had to rearrange a few things around.

@otan
Contributor Author

otan commented Feb 15, 2023

pkg/sql/sem/builtins/cast_test.go line 68 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

ah nah good catch, had to rearrange a few things around.

(sorry i mean, this is still the same, but bpchar / numeric was a surprise)

Collaborator

@rafiss rafiss left a comment

nit: can you fill in the castfunc column of pg_cast?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @cucaroach, @mgartner, and @otan)


docs/generated/sql/functions.md line 361 at r2 (raw file):

</table>

### Cast functions

i'd actually be in favor of marking all these as hidden. what do you think?

Contributor Author

@otan otan left a comment

hmm, castfunc would be "wrong" though compared to PG. is that ok?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @cucaroach, @mgartner, and @rafiss)


docs/generated/sql/functions.md line 361 at r2 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

i'd actually be in favor of marking all these as hidden. what do you think?

Done.

@rafiss
Collaborator

rafiss commented Feb 21, 2023

wrong in which way? you mean it would have different function OIDs? i think that's fine -- the OID would still refer to the correct function in pg_proc.

@otan
Contributor Author

otan commented Feb 21, 2023

hmm not so good for something like castfunc::regproc - i think i prefer NULL for now until we overhaul our type overload system to make this point the correct way (i.e. casts call funcs, rather than funcs being derived from casts)

@rafiss
Collaborator

rafiss commented Feb 21, 2023

i still don't follow. what's wrong with castfunc::regproc? i think if you shared the error or incorrect result i would understand better

@otan
Contributor Author

otan commented Feb 21, 2023

ah, nvm, i see what you're getting at.

@otan
Contributor Author

otan commented Feb 21, 2023

hmm it's a little tricky because we defined casts on all oids, but the derived functions are actually based on the family.

@otan otan force-pushed the autocast branch 5 times, most recently from bed6dd5 to f13beb7 Compare February 24, 2023 05:07
@otan otan requested review from cucaroach and removed request for cucaroach February 24, 2023 05:19
@otan
Contributor Author

otan commented Feb 28, 2023

(changes made + ready for another look!)

Collaborator

@rafiss rafiss left a comment

thanks for adding this!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @cucaroach, @mgartner, and @otan)


pkg/sql/sem/builtins/fixed_oids.go line 2048 at r6 (raw file):

	2070: `crdb_internal.num_inverted_index_entries(val: tsvector, version: int) -> int`,
	2072: `crdb_internal.upsert_dropped_relation_gc_ttl(desc_id: int, gc_ttl: interval) -> bool`,
	2073: `box2d(geometry: geometry) -> box2d`,

just wanna confirm - so if someone adds a new type later on, they'll get an init error telling them they need to add an entry into this fixed OID map?

Contributor Author

@otan otan left a comment

bors r=rafiss

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @cucaroach, @mgartner, and @rafiss)


pkg/sql/sem/builtins/fixed_oids.go line 2048 at r6 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

just wanna confirm - so if someone adds a new type later on, they'll get an init error telling them they need to add an entry into this fixed OID map?

yep!
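For readers unfamiliar with the pattern being confirmed here, this is a rough sketch of an init-time check against a fixed OID map. The map shape (OID to signature) mirrors the snippet above, but `checkFixedOIDs` and the `newtype` signature are hypothetical; the real check in the repository may be structured differently.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// checkFixedOIDs verifies that every registered builtin signature has an
// entry in the fixed OID map, and that no signature is listed twice.
func checkFixedOIDs(fixed map[uint32]string, registered []string) error {
	bySig := make(map[string]uint32, len(fixed))
	for oid, sig := range fixed {
		if prev, dup := bySig[sig]; dup {
			return fmt.Errorf("signature %q has two OIDs: %d and %d", sig, prev, oid)
		}
		bySig[sig] = oid
	}
	var missing []string
	for _, sig := range registered {
		if _, ok := bySig[sig]; !ok {
			missing = append(missing, sig)
		}
	}
	if len(missing) > 0 {
		sort.Strings(missing)
		return fmt.Errorf("builtins missing from fixed OID map: %s",
			strings.Join(missing, ", "))
	}
	return nil
}

func main() {
	fixed := map[uint32]string{
		2073: "box2d(geometry: geometry) -> box2d",
	}
	registered := []string{
		"box2d(geometry: geometry) -> box2d",
		// Hypothetical cast builtin for a newly added type.
		"newtype(string: string) -> newtype",
	}
	if err := checkFixedOIDs(fixed, registered); err != nil {
		fmt.Println("init error:", err)
	}
}
```

Failing at init rather than at query time means a missing entry is caught by the first test run after the new type is added.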

@craig
Contributor

craig bot commented Mar 1, 2023

👎 Rejected by code reviews

otan added 3 commits March 2, 2023 14:09
PG adds the mask if it doesn't exist - but only on casts (not pgwire
formatting).

Release note (bug fix): Previously, casting an `inet` to a string type
omitted the mask if a mask was not provided. This didn't match
PostgreSQL and is now resolved.

In PG, casts from one type to another can also use the function
syntax, e.g. `date(now())` = `now()::date`. This is done at type
resolution time.

Unfortunately we do not support that in type resolution, and from
experience long ago it was tricky to do so (happy to be proven wrong).

This change instead defines a builtin for each castable type, which
emulates the same behavior. We already kind of do this for `oid` and
`inet`, so this isn't much worse right?

Release note (sql change): Each type cast is now expressible as a
function, e.g. `now()::date` can be expressed as `date(now())`.

This commit correctly fills in the castfunc column in pg_catalog.pg_cast
now that we have all the builtins defined.

Release note: None

@otan
Contributor Author

otan commented Mar 2, 2023

bors r=rafiss

@craig
Contributor

craig bot commented Mar 2, 2023

Build succeeded:

Development

Successfully merging this pull request may close these issues.

sql: support postgresql date() function to convert timestamp to date
5 participants