Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL AST: ifnull/nullif improvements #592

Merged
merged 2 commits into from
Apr 9, 2023
Merged

Conversation

hemberger
Copy link
Contributor

Previously the IfNullReturnTypeExtension was used for both IFNULL and NULLIF. However, these SQL functions should have different behavior.

NULLIF(expr1,expr2)

Returns NULL if expr1 = expr2 is true, otherwise returns expr1.

IFNULL(expr1,expr2)

If expr1 is not NULL, IFNULL() returns expr1; otherwise it returns expr2.

Note that IFNULL has additional type-mutating logic because it has to compare the two values in a temporary table:

The default return type of IFNULL(expr1,expr2) is the more "general"
of the two expressions, in the order STRING, REAL, or INTEGER.

We now also require that the functions take exactly two arguments.

The typemix table is used for these tests since its column names clearly identify the column type (e.g. c_tinyint vs. gesperrt).

Comment on lines +61 to +62
$stmt = $pdo->query('SELECT ifnull(c_int, c_float) as col from typemix');
assertType('PDOStatement<array{col: float, 0: float}>', $stmt);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wondering why this can't get int. c_int is not nullable, shouldn't it be PDOStatement<array{col: int, 0: int}>?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a consequence of the following statement from the documentation:

The default return type of IFNULL(expr1,expr2) is the more “general” of the two expressions, in the order STRING, REAL, or INTEGER.

See https://dev.mysql.com/doc/refman/5.7/en/flow-control-functions.html#function_ifnull.

Even if expr1 is not nullable, the result type will still be the type of expr2 if its type is "more general" than expr1. In this test case, the result really will be integer-valued floats:

CREATE TEMPORARY TABLE `temp` SELECT IFNULL(c_int, c_float) FROM `typemix`;
DESCRIBE `temp`;
 
Field                     Type
IFNULL(c_int, c_float)	  double

And we can confirm this with PDO as well:

$result = $pdo->query('SELECT ifnull(c_int, c_float) as col from typemix');
var_dump($result->fetch(PDO::FETCH_ASSOC));
array (size=1)
  'col' => float 1

It's surprising behavior, to be sure, but it is accurate to the best of my knowledge.

This may be a crazy (and/or bad) idea, but... is it possible to have an IntersectionType between IntegerRangeType and FloatType?

Copy link
Owner

@staabm staabm Apr 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be a crazy (and/or bad) idea, but... is it possible to have an IntersectionType between IntegerRangeType and FloatType?

its not crazy at all. I think this should be the default return type for the IFNULL extension.
we should narrow the type only if we are sure we know better for the example at hand.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

random idea: maybe we can add some magic to our test-suite (separate PR) which executes the queries against the real database and verifies our asserted-types are really matching those happen at runtime?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its not crazy at all. I think this should be the default return type for the IFNULL extension.

Are you sure this works? When I try to intersect an IntegerType with FloatType (or numeric-string), I get a NeverType. I'm trying:

// Narrow the result type if it can only contain values from arg1
if (! $arg1ContainsNull) {
    $resultType = TypeCombinator::intersect($argType1, $resultType);
}

random idea: maybe we can add some magic to our test-suite (separate PR) which executes the queries against the real database and verifies our asserted-types are really matching those happen at runtime?

It seems like we could already do this at runtime. We already perform the actual query in QueryReflection::getResultType:

$resultType = $reflector->getResultType($queryString, $fetchType);

Then after we narrow it with SQL AST, we could make sure the narrowed type is a sub-type of the actual query result? Something like:

+$originalResultType = $resultType;
 $resultType = $parserInference->narrowResultType($queryString, $resultType);
+if (! $originalResultType->isSuperTypeOf($resultType)->yes()); {
+    throw new ShouldNotHappenException($resultType->describe(VerbosityLevel::precise()) . ' is not a subtype of ' . $originalResultType->describe(VerbosityLevel::precise()));
+}

I gave this a quick try, and it seems to properly find cases where SQL AST is returning the improper types. For example, with this SQL AST test of IF:

$stmt = $pdo->query('SELECT if(freigabe1u1 > 100, "a", 1) as col from ada');
assertType("PDOStatement<array{col: 1|'a', 0: 1|'a'}>", $stmt);

we would get PHPStan\ShouldNotHappenException: array{col: 1|'a', 0: 1|'a'} is not a subtype of array{col: string, 0: string}, because the actual return type should be array{col: '1'|'a', 0: '1':'a'}, because IF does the same thing as IFNULL where it widens the type of the result set to the more general of the two types, regardless of which arg the result value is taken from.

Copy link
Owner

@staabm staabm Apr 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure this works? When I try to intersect an IntegerType with FloatType (or numeric-string), I get a NeverType

sorry, I was talking about a union, not a intersection.


It seems like we could already do this at runtime.

the initial goal of my idea was to verify whether the asserts within the test-suite actually reflect reality.
I don't want to do this checks at runtime of phpstan-dba in the end-user project.
just a mean to harden CI and improve our test-suite.

the idea is also not only meant for the SQL AST part, but for all assertions manually noted in the test-suite.

nevertheless: I like your idea of verifying compatibility of the SQL Ast narrowed type to the native type. thats another mean we could use in the testing-job (not at scan time of the end-user project)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the feedback, it's very much appreciated!

sorry, I was talking about a union, not a intersection.

It seems to me that a union is not appropriate here since, even though the floats may be integer valued, they will never be integer type. For example, if the user calls is_int on the result, it will return false at runtime, but PHPStan would think this is maybe legal. Am I thinking about this the wrong way?


nevertheless: I like your idea of verifying compatibility of the SQL Ast narrowed type to the native type. thats another mean we could use in the testing-job (not at scan time of the end-user project)

I'll think about how we might be able to achieve this in the test suite. It seems like it might be tricky in the gatherAssertTypes framework, but maybe we can come up with something workable. Another alternative might be to enable it only in debug mode.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that a union is not appropriate here since, even though the floats may be integer valued, they will never be integer type. For example, if the user calls is_int on the result, it will return false at runtime, but PHPStan would think this is maybe legal. Am I thinking about this the wrong way?

I get it now. its a float then.


I'll think about how we might be able to achieve this in the test suite. It seems like it might be tricky in the gatherAssertTypes framework, but maybe we can come up with something workable. Another alternative might be to enable it only in debug mode.

yeah I think we need to play a bit. maybe it can be built using a rule which analyzes the tests... I am not sure yet

hemberger and others added 2 commits April 9, 2023 09:24
Previously the `IfNullReturnTypeExtension` was used for both IFNULL and
NULLIF. However, these SQL functions should have different behavior.

`NULLIF(expr1,expr2)`
> Returns NULL if expr1 = expr2 is true, otherwise returns expr1.

`IFNULL(expr1,expr2)`
> If expr1 is not NULL, IFNULL() returns expr1; otherwise it returns expr2.

Note that `IFNULL` has additional type-mutating logic because it has to
compare the two values in a temporary table:

> The default return type of IFNULL(expr1,expr2) is the more "general"
> of the two expressions, in the order STRING, REAL, or INTEGER.

We now also require that the functions take exactly two arguments.

The `typemix` table is used for these tests since its column names
clearly identify the column type (e.g. `c_tinyint` vs. `gesperrt`).
Add reference link to MySQL docs

Co-authored-by: Markus Staab <maggus.staab@googlemail.com>
@staabm staabm merged commit 3a3ba02 into staabm:main Apr 9, 2023
28 checks passed
@staabm
Copy link
Owner

staabm commented Apr 9, 2023

thanks for the in detail discussion and your analysis. I really appreciate it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants