Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tree: make typeCheckSameTypedExprs order invariant for tuples #110325

Merged
merged 1 commit into from Sep 28, 2023

Conversation

msirek
Copy link
Contributor

@msirek msirek commented Sep 11, 2023

Type checking of CASE expressions (and other expressions handled by typeCheckSameTypedExprs) does not type check tuples properly because typeCheckSameTypedTupleExprs checking is only done when the tuple is the first expression in exprs.

This is fixed by finding the first tuple in exprs, searching the slice starting at index 0, and using that to drive typeCheckSameTypedExprs.

Fixes: #109105

Release note: None

@msirek msirek requested review from mgartner and a team September 11, 2023 06:26
@msirek msirek requested a review from a team as a code owner September 11, 2023 06:26
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@@ -2564,6 +2564,15 @@ type typeCheckExprsState struct {
resolvableIdxs intsets.Fast // index into exprs/typedExprs
}

func findFirstTupleIndex(exprs ...Expr) (index int, ok bool) {
for i, expr := range exprs {
if _, ok := expr.(*Tuple); ok {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know I'm a broken record here but we really need to stop inspecting the type of AST node to help determine the SQL type of expressions. All these cases are only partial fixes because they won't fix cases where the AST isn't Tuple but the SQL type produced is a tuple type.

This is fine for now, because it's a bit better than the current behavior. But maybe we should start labelling these places with TODOs that make it clear this is not the long-term solution? I don't want people to unknowingly repeat this pattern.

if _, ok := exprs[idx].(*Tuple); ok {
// typeCheckSameTypedTupleExprs expects the first expression in the slice
// to be a tuple.
exprs[0], exprs[idx] = exprs[idx], exprs[0]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced it is safe to reorder subexpressions. Won't this change the semantics of parent nodes that pass in their exprs directly from an AST node, like CoalesceExpr.TypeCheck:

typedSubExprs, retType, err := typeCheckSameTypedExprs(ctx, semaCtx, desired, expr.Exprs...)

Type checking of CASE expressions (and other expressions handled by
typeCheckSameTypedExprs) does not type check tuples properly because
typeCheckSameTypedTupleExprs checking is only done when the tuple is the
first expression in `exprs`.

This is fixed by finding the first tuple in `exprs`, searching the slice
starting at index 0, and using that to drive typeCheckSameTypedExprs.

Fixes: cockroachdb#109105

Release note: None
Copy link
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner)


pkg/sql/sem/tree/type_check.go line 2569 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

I know I'm a broken record here but we really need to stop inspecting the type of AST node to help determine the SQL type of expressions. All these cases are only partial fixes because they won't fix cases where the AST isn't Tuple but the SQL type produced is a tuple type.

This is fine for now, because it's a bit better than the current behavior. But maybe we should start labelling these places with TODOs that make it clear this is not the long-term solution? I don't want people to unknowingly repeat this pattern.

Thanks for pointing this out. I agree that we want to handle other expressions that resolve to a type family of TupleFamily. I've added a TODO. The whole pre-existing logic here makes the assumption that we're working with Tuples, for example, the comment here:

https://github.com/msirek/cockroach/blob/9e526be84c3a68929a27e62853784d0a31476d80/pkg/sql/sem/tree/type_check.go#L2987-L2990

		if !(isTuple || isNull) {
			// We avoid calling TypeCheck on Tuple exprs since that causes the
			// types to be resolved, which we only want to do later in type-checking.
			typedExpr, err := expr.TypeCheck(ctx, semaCtx, types.Any)

So it may be a much larger change to rework the logic to handle other expressions, and for this PR I'm just targeting the smallest/safest change to handle #109105.


pkg/sql/sem/tree/type_check.go line 2604 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

I'm not convinced it is safe to reorder subexpressions. Won't this change the semantics of parent nodes that pass in their exprs directly from an AST node, like CoalesceExpr.TypeCheck:

typedSubExprs, retType, err := typeCheckSameTypedExprs(ctx, semaCtx, desired, expr.Exprs...)

All of the type tests for typeCheckSameTypedExprs imply that this function finds a result type independent of the order of the input expressions, which matches my understanding of what this function is trying to do (find a common data type for all expressions in the slice). For example, both SELECT pg_typeof(COALESCE(3, 3.3)) and SELECT pg_typeof(COALESCE(3.3, 3)) result in a final type of numeric for the expression. If typeCheckSameTypedExprs were order dependent, we'd expect to see one result be bigint and the other to be numeric.

The line added below:

typedExprs[0], typedExprs[idx] = typedExprs[idx], typedExprs[0]

ensures that the typedExprs returned to the caller match up with their corresponding untyped expressions that were passed to typeCheckSameTypedExprs, so there is no change in behavior in that regard. There is no re-ordering of returned expressions relative to the input expressions.

Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 4 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @msirek)


pkg/sql/sem/tree/type_check.go line 2569 at r1 (raw file):

Previously, msirek (Mark Sirek) wrote…

Thanks for pointing this out. I agree that we want to handle other expressions that resolve to a type family of TupleFamily. I've added a TODO. The whole pre-existing logic here makes the assumption that we're working with Tuples, for example, the comment here:

https://github.com/msirek/cockroach/blob/9e526be84c3a68929a27e62853784d0a31476d80/pkg/sql/sem/tree/type_check.go#L2987-L2990

		if !(isTuple || isNull) {
			// We avoid calling TypeCheck on Tuple exprs since that causes the
			// types to be resolved, which we only want to do later in type-checking.
			typedExpr, err := expr.TypeCheck(ctx, semaCtx, types.Any)

So it may be a much larger change to rework the logic to handle other expressions, and for this PR I'm just targeting the smallest/safest change to handle #109105.

👍


pkg/sql/sem/tree/type_check.go line 2604 at r1 (raw file):
Oops, I missed that line.

All of the type tests for typeCheckSameTypedExprs imply that this function finds a result type independent of the order of the input expressions, which matches my understanding of what this function is trying to do

This is currently how our type-checking works, but, as we discovered in #108387, type-checking of expressions like these is indeed order-dependent. No need to fix that now though.


pkg/sql/logictest/testdata/logic_test/tuple line 1278 at r2 (raw file):

             WHEN a = 4 THEN ROW(1::INT, null, NULL)
             WHEN a = 5 THEN NULL
             ELSE NULL END) FROM t109105 ORDER BY 1;

This example errors with 42883: could not identify a comparison function for type unknown in Postgres. So is this PR a temporary fix for the internal error in #109105? Or is there some reason we want to diverge from Postgres here?

Copy link
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the type tests for typeCheckSameTypedExprs imply that this function finds a result type independent of the order of the input expressions, which matches my understanding of what this function is trying to do

This is currently how our type-checking works, but, as we discovered in #108387, type-checking of expressions like these is indeed order-dependent. No need to fix that now though.

Sure. Just to clarify, it is the type checking of row types in this case which tries to find a common type among all column 1 elements, and among all column 2 elements, etc. This PR is not attempting to change that behavior, just make it consistent when the first item in the slice is a null instead of a row.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner)


pkg/sql/logictest/testdata/logic_test/tuple line 1278 at r2 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

This example errors with 42883: could not identify a comparison function for type unknown in Postgres. So is this PR a temporary fix for the internal error in #109105? Or is there some reason we want to diverge from Postgres here?

See my other comment. To match postgres, typeCheckSameTypedTupleExprs should be modified so that a NULL in the same column of a row expression as a non-null errors out. Similar queries such as:

SELECT (CASE WHEN a = 3 THEN ROW(1::INT, 1.1::DECIMAL, 1.1e3::FLOAT)
             WHEN a = 2 THEN ROW(NULL, NULL, 1.1e3::FLOAT)
             WHEN a = 3 THEN ROW(NULL, 1.1::DECIMAL, NULL)
             WHEN a = 4 THEN ROW(1::INT, NULL, NULL)
             ELSE ROW(1::INT, 1.1::DECIMAL, 1.1e3::FLOAT) END) FROM t109105 ORDER BY 1;

don't error out on v23.1 and prior (but do on postgres), so the non-erroring on comparison with null issue is pre-existing and independent of tuple typing.

The motivation behind this fix is consistent typing so the execution engine does the right thing and doesn't hit an internal error.

Note that a similar query to the one which hit the internal error does not error out on postgres:

test=# SELECT (CASE WHEN b THEN ((ROW(1))) ELSE NULL END) from t78159;
 case
------
(0 rows)

The effort to get the tuple typing rules to match postgres may be non-trivial, so it's better to fix the internal error in the short term, matching our current behavior, than to try a quick fix of the postres rules.

Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r2.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @msirek)

@msirek
Copy link
Contributor Author

msirek commented Sep 28, 2023

TFTR!
bors r+

@craig
Copy link
Contributor

craig bot commented Sep 28, 2023

Build succeeded:

@craig craig bot merged commit 2111b61 into cockroachdb:master Sep 28, 2023
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: CASE with RECORD type and NULL causes internal error with vectorized engine
3 participants