Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: fix nested expression type-checking #109635

Merged
merged 1 commit into from Aug 31, 2023

Conversation

mgartner
Copy link
Collaborator

PR #108387 introduced new logic to type-checking that allows nested
expressions of a single expression to have different types in some cases
when the types can be implicitly casted to a common type. For example,
the expression ARRAY[1::INT, 1::FLOAT8] would have failed
type-checking but is not successfully typed as []FLOAT8.

However, #108387 does not add an implicit cast to ensure that each
nested expression actually has that type during execution, which causes
internal errors in some cases. This commit add the necessary implicit
casts.

Fixes #109629

There is no release note because the bug is not present in any releases.

Release note: None

@mgartner mgartner requested review from msirek and a team August 28, 2023 21:47
@mgartner mgartner requested a review from a team as a code owner August 28, 2023 21:47
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this!

Reviewed all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner)


pkg/sql/sem/tree/type_check.go line 2672 at r1 (raw file):

			if typ := typedExpr.ResolvedType(); !(typ.Equivalent(candidateType) || typ.Family() == types.UnknownFamily) {
				return nil, nil, unexpectedTypeError(exprs[i], candidateType, typ)
			}

This error checking code should be moved down below and altered because now that we apply casts, the type of typedExpr may change and should not error out when cast. This is in line with Postgres, for example, the following should succeed:

# The DATE should be implicitly CAST to a timestamp.
query T
SELECT ARRAY['2005-11-14 21:51:05.000581':::TIMESTAMP, '1989-01-15':::DATE];
----
{"2005-11-14 21:51:05.000581","1989-01-15 00:00:00"}

pkg/sql/sem/tree/type_check.go line 2677 at r1 (raw file):

		if !constIdxs.Empty() {
			if _, err := typeCheckSameTypedConsts(s, candidateType, true); err != nil {
				return nil, nil, err

typeCheckSameTypedConsts may error out here without giving us a chance to apply the cast. One way to handle this may be to pass a new requireType parameter as false here, meaning typeCheckSameTypedConsts should call s.exprs[i].TypeCheck instead oftypeCheckAndRequire. Similar comment for typeCheckSameTypedPlaceholders.

Since we're allowing the above loop over s.resolvableIdxs to modify candidateType, I'm wondering if we'd need to also give typeCheckSameTypedConsts and typeCheckSameTypedPlaceholders the capability of modifying candidateType in the same manner (at least in these specific calls).


pkg/sql/sem/tree/type_check.go line 2691 at r1 (raw file):

		// check again.
		// NOTE: We don't add this cast if any part of the candidate type is a
		// tuple type, becuase we don't have a way to serialize this cast over

spelling: becuase -> because


pkg/sql/sem/tree/type_check.go line 2705 at r1 (raw file):

			return false
		}
		if !hasNestedTupleType(candidateType) {

Can we detect if this type checking is being done in the optimizer vs. during distsql execution? There's this failing test case (in fakedist config), which could benefit from applying the implicit cast:

statement ok
CREATE TABLE tab (b BOOL);
INSERT INTO tab VALUES (false);

# The NULL should be cast as a named tuple, and referencing
# _case-expression_.a in the query should not cause a panic.
query I
SELECT (CASE WHEN b THEN ((ROW(1) AS a)) ELSE NULL END).a from tab
----
NULL

pkg/sql/sem/tree/type_check.go line 2707 at r1 (raw file):

		if !hasNestedTupleType(candidateType) {
			for i, e := range typedExprs {
				if !e.ResolvedType().Identical(candidateType) {

Add error checking here, something like:

if !cast.ValidCast(e.ResolvedType(), candidateType, cast.ContextImplicit) {
		typ := e.ResolvedType()
		if !(typ.Equivalent(candidateType) || typ.Family() == types.UnknownFamily) {
				return nil, nil, unexpectedTypeError(exprs[i], candidateType, typ)
		}
} else {
		typedExprs[i] = NewTypedCastExpr(e, candidateType)
}

typeCheckSameTypedConsts and typeCheckSameTypedPlaceholders never checked for valid implicit casts. And above, when cast.ValidCast(candidateType, typ, cast.ContextImplicit) is false, we never check that the cast in the other direction to the candidateType is valid.


pkg/sql/sem/tree/type_check.go line 2711 at r1 (raw file):

				}
			}
		}

Add an else clause here for the moved error checking:

		} else {
			for i, e := range typedExprs {
				if typ := e.ResolvedType(); !(typ.Equivalent(candidateType) || typ.Family() == types.UnknownFamily) {
					return nil, nil, unexpectedTypeError(exprs[i], candidateType, typ)
				}
			}
		}

@rafiss
Copy link
Collaborator

rafiss commented Aug 29, 2023

Do you know if this test can be uncommented?

# TODO(rytaft): Uncomment this case once #109105 is fixed.
# query B
# SELECT (CASE WHEN b THEN ((ROW(1) AS a)) ELSE NULL END).a from t78159
# ----
# NULL

@msirek
Copy link
Contributor

msirek commented Aug 29, 2023

Do you know if this test can be uncommented?

# TODO(rytaft): Uncomment this case once #109105 is fixed.
# query B
# SELECT (CASE WHEN b THEN ((ROW(1) AS a)) ELSE NULL END).a from t78159
# ----
# NULL

That test is still failing, but I have an open question in this review on how we can support it. Apparently there's some issue with serializing casts which use tuple types in distsql execution.

Copy link
Collaborator Author

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the thorough review.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @msirek)


pkg/sql/sem/tree/type_check.go line 2672 at r1 (raw file):

Previously, msirek (Mark Sirek) wrote…

This error checking code should be moved down below and altered because now that we apply casts, the type of typedExpr may change and should not error out when cast. This is in line with Postgres, for example, the following should succeed:

# The DATE should be implicitly CAST to a timestamp.
query T
SELECT ARRAY['2005-11-14 21:51:05.000581':::TIMESTAMP, '1989-01-15':::DATE];
----
{"2005-11-14 21:51:05.000581","1989-01-15 00:00:00"}

I've left a TODO, and I'll attempt this in a follow-up PR so it doesn't delay this fix if it's non-trivial.


pkg/sql/sem/tree/type_check.go line 2677 at r1 (raw file):

Previously, msirek (Mark Sirek) wrote…

typeCheckSameTypedConsts may error out here without giving us a chance to apply the cast. One way to handle this may be to pass a new requireType parameter as false here, meaning typeCheckSameTypedConsts should call s.exprs[i].TypeCheck instead oftypeCheckAndRequire. Similar comment for typeCheckSameTypedPlaceholders.

Since we're allowing the above loop over s.resolvableIdxs to modify candidateType, I'm wondering if we'd need to also give typeCheckSameTypedConsts and typeCheckSameTypedPlaceholders the capability of modifying candidateType in the same manner (at least in these specific calls).

According to Postgres documentation, expressions with unknown type, like these "constants" (defined as "literals" in the Postgres docs), should be ignored for choosing a candidate type, unless they all have the unknown type.

The docs don't specifically mention placeholder expressions, so I assume we are doing the wrong thing by have special logic for them.

Ultimately, all this type-checking logic is largely inconsistent with Postgres. This commit is a small step in the right direction, but it'll take a lot more effort to perfect. I'd rather not tackle that at the moment, since my goal here is to quell this specific test failure.


pkg/sql/sem/tree/type_check.go line 2691 at r1 (raw file):

Previously, msirek (Mark Sirek) wrote…

spelling: becuase -> because

Done.


pkg/sql/sem/tree/type_check.go line 2705 at r1 (raw file):

Previously, msirek (Mark Sirek) wrote…

Can we detect if this type checking is being done in the optimizer vs. during distsql execution? There's this failing test case (in fakedist config), which could benefit from applying the implicit cast:

statement ok
CREATE TABLE tab (b BOOL);
INSERT INTO tab VALUES (false);

# The NULL should be cast as a named tuple, and referencing
# _case-expression_.a in the query should not cause a panic.
query I
SELECT (CASE WHEN b THEN ((ROW(1) AS a)) ELSE NULL END).a from tab
----
NULL

I'm not sure I understand your question. This type-checking is happening during semantic analysis.


pkg/sql/sem/tree/type_check.go line 2707 at r1 (raw file):

Previously, msirek (Mark Sirek) wrote…

Add error checking here, something like:

if !cast.ValidCast(e.ResolvedType(), candidateType, cast.ContextImplicit) {
		typ := e.ResolvedType()
		if !(typ.Equivalent(candidateType) || typ.Family() == types.UnknownFamily) {
				return nil, nil, unexpectedTypeError(exprs[i], candidateType, typ)
		}
} else {
		typedExprs[i] = NewTypedCastExpr(e, candidateType)
}

typeCheckSameTypedConsts and typeCheckSameTypedPlaceholders never checked for valid implicit casts. And above, when cast.ValidCast(candidateType, typ, cast.ContextImplicit) is false, we never check that the cast in the other direction to the candidateType is valid.

Done.


pkg/sql/sem/tree/type_check.go line 2711 at r1 (raw file):

Previously, msirek (Mark Sirek) wrote…

Add an else clause here for the moved error checking:

		} else {
			for i, e := range typedExprs {
				if typ := e.ResolvedType(); !(typ.Equivalent(candidateType) || typ.Family() == types.UnknownFamily) {
					return nil, nil, unexpectedTypeError(exprs[i], candidateType, typ)
				}
			}
		}

Done.

Copy link
Contributor

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea keeping the scope of this fix limited.
:lgtm:

Reviewed 3 of 3 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner)


pkg/sql/sem/tree/type_check.go line 2672 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

I've left a TODO, and I'll attempt this in a follow-up PR so it doesn't delay this fix if it's non-trivial.

OK


pkg/sql/sem/tree/type_check.go line 2677 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

According to Postgres documentation, expressions with unknown type, like these "constants" (defined as "literals" in the Postgres docs), should be ignored for choosing a candidate type, unless they all have the unknown type.

The docs don't specifically mention placeholder expressions, so I assume we are doing the wrong thing by have special logic for them.

Ultimately, all this type-checking logic is largely inconsistent with Postgres. This commit is a small step in the right direction, but it'll take a lot more effort to perfect. I'd rather not tackle that at the moment, since my goal here is to quell this specific test failure.

OK, I wasn't aware constIdxs only dealt with unknown type expressions. It seems like this area needs a bit more investigation, but we can defer that in favor of a succinct fix here.


pkg/sql/sem/tree/type_check.go line 2705 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

I'm not sure I understand your question. This type-checking is happening during semantic analysis.

I'm just trying to understand the cases where casts involving tuple types can fail. I successfully handled this case in my branch with no error (distributed plan), so I'm wondering in which scenarios these casts could fail. In any case, this test case could be handled in a separate fix.

@mgartner
Copy link
Collaborator Author

pkg/sql/sem/tree/type_check.go line 2705 at r1 (raw file):

Previously, msirek (Mark Sirek) wrote…

I'm just trying to understand the cases where casts involving tuple types can fail. I successfully handled this case in my branch with no error (distributed plan), so I'm wondering in which scenarios these casts could fail. In any case, this test case could be handled in a separate fix.

The cast needs to be representable in a SQL string in order to distribute. AFAIK we don't have a way to represent a cast to a tuple type because there is no syntax for a specific tuple type. There is only the generic tuple type RECORD.

Did you test cases all have constant values? If so, the cast might have been folded during optimization and never made it to the DistSQL phase. A non-constant value might do the trick.

I'll try to put together a test case that fails, to be sure that this is needed.

PR cockroachdb#108387 introduced new logic to type-checking that allows nested
expressions of a single expression to have different types in some cases
when the types can be implicitly casted to a common type. For example,
the expression `ARRAY[1::INT, 1::FLOAT8]` would have failed
type-checking but is not successfully typed as `[]FLOAT8`.

However, cockroachdb#108387 does not add an implicit cast to ensure that each
nested expression actually has that type during execution, which causes
internal errors in some cases. This commit add the necessary implicit
casts.

Fixes cockroachdb#109629

There is no release note because the bug is not present in any releases.

Release note: None
Copy link
Collaborator Author

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @msirek)


pkg/sql/sem/tree/type_check.go line 2705 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

The cast needs to be representable in a SQL string in order to distribute. AFAIK we don't have a way to represent a cast to a tuple type because there is no syntax for a specific tuple type. There is only the generic tuple type RECORD.

Did you test cases all have constant values? If so, the cast might have been folded during optimization and never made it to the DistSQL phase. A non-constant value might do the trick.

I'll try to put together a test case that fails, to be sure that this is needed.

I was unable to come up with a test that breaks this — I think it's because candidateType never contains a tuple type. I've removed this logic. Thanks for pushing back on this.

@mgartner
Copy link
Collaborator Author

bors r+

@craig
Copy link
Contributor

craig bot commented Aug 31, 2023

Build succeeded:

@craig craig bot merged commit 2e02cfb into cockroachdb:master Aug 31, 2023
8 checks passed
@mgartner mgartner deleted the 109629-implicit-cast branch September 1, 2023 16:53
msirek pushed a commit to msirek/cockroach that referenced this pull request Sep 8, 2023
This uncomments a unit test from cockroachdb#78159 which was fixed by cockroachdb#109635.

Informs: cockroachdb#109105

Release note: None
msirek pushed a commit to msirek/cockroach that referenced this pull request Sep 10, 2023
Function typeCheckSameTypedExprs is updated by cockroachdb#108387 and cockroachdb#109635 to
apply implicit CASTs to the different expressions input to an array,
tuple, case expression or other similar expression, to coerce them all
to a common data type, instead of erroring out. This is not done,
however, if the data types of these expressions are not equivalent with
each other, causing some cases to error out.

The fix is to match Postgres behavior and apply the casts even for
non-equivalent types.

Informs: cockroachdb#109105

Release note (sql change): This patch modifies type checking of arrays,
tuples, and case statements to allow implicit casting of scalar
expressions referenced in these constructs to a common data type, even
for types in different type families, as long at the implicit cast is
legal.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql/tests: TestRandomSyntaxFunctions failed [array_cat_agg: 9 trailing bytes in encoded value]
4 participants