Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement RuntimeConfig->parameterTypeValidation('strict'|'lax') and use it in DoctrineKeyValueStyleRule #563

Merged
merged 16 commits into from
Mar 16, 2023

Conversation

hemberger
Copy link
Contributor

We now require that column types do accept the value type, whereas previously we only required that column types maybe accept the value type.

This fixes some cases where errors were not properly being reported, such as if the value type is nullable, but the column has a "NOT NULL" constraint.

The one exception we make is that integer ranges are still not checked, because it is likely uncommon for codebases to have range restrictions on all the integers that are passed to the database. Checking integer ranges in this rule could maybe be part of a future "strict mode".

We now require that column types _do_ accept the value type, whereas
previously we only required that column types _maybe_ accept the value
type.

This fixes some cases where errors were not properly being reported,
such as if the value type is nullable, but the column has a "NOT NULL"
constraint.

The one exception we make is that integer ranges are still not checked,
because it is likely uncommon for codebases to have range restrictions
on all the integers that are passed to the database. Checking integer
ranges in this rule could maybe be part of a future "strict mode".
if ($argColumn->getType()->accepts($valueType, true)->no()) {
$errors[] = 'Column "' . $table->getName() . '.' . $argColumnName . '" expects value type ' . $argColumn->getType()->describe(VerbosityLevel::precise()) . ', got type ' . $valueType->describe(VerbosityLevel::precise());

// Convert IntegerRangeType column types into IntegerType so
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get why we wouldn't make use of integer ranges if present.

sure most of the time we won't get a type with a range, but if we get one, it should be compatibel wit the schema.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I've made an exception for integer ranges is because if a codebase doesn't diligently use integer ranges throughout their PHP code (i.e. they just use int), then this rule is practically unusable.

For example, in one of my relatively small projects, when I remove the IntegerRangeType transformation, I get over 1000 new errors of the kind:

Query error: Column "message.account_id" expects value type int<0, 65535>, got type int

It's my hope that some day I can be more strict about integer ranges, but I suspect that I'm not alone in living with a codebase that only uses int. In the meantime, knowing whether I have an int vs. int|null vs. float vs. string etc. is enormously valuable, and so it's my hope that we can continue to provide practical support for this situation.

Here are a couple options, but I'll do my best to accommodate whatever your preference is:

  1. Skip the type transformation if the data value type is (or contains) IntegerRangeType, so that projects that are already using integer ranges in PHP can get more precise checks.
  2. Add a configuration option to enable/disable this transformation (e.g. enableDoctrineIntegerRangeChecks).

Let me know what you think. Thanks!

Copy link
Owner

@staabm staabm Mar 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the arg-column type is based on our reflector inference, therefore will be most of the time a integer range type.

I wonder why/whether we have this very same problem in other phpstan-dba rules or whether its a already solved problem. could you check whether other rules produce the same annyoing errors?

I agree that we shouldn't be too strict and enforce integer-ranges on the user.
in case we get a integer range from the user we should validate it though

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added logic to skip the transformation if the input value type is IntegerRangeType as well. As I note in the commit message, this doesn't handle the case where the input value is a compound type that contains an IntegerRangeType, but I can add that if you prefer. I was hesitant because I'm not aware of any convenient way to express "is or contains a specific type" in the PHPStan Types system. I'm still learning - thanks for your patience and guidance!

I'll look into your question about integer range issues in other rules, and will get back to you asap.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For SyntaxErrorInPreparedStatementMethodRule and similar, I can't find any issues with integer ranges, because as far as I can tell there is no validation of the types apart from whether or not the database engine accepts it. This is something I have been pondering recently, as was planning to ask about soon anyway. :)

For example, if I have a table account with an integer column account_id, then the following prepared statement does not report an error, even though we're passing a string as the account_id instead of an int:

$stmt = $db->prepare(
    'SELECT * FROM account WHERE account_id = :account_id',
    [
        'account_id' => 'foo',
    ],
);

Presumably this is because the database engine is perfectly content to compare account_id to 'foo' even though it will never return a result. I don't know if it is even theoretically possible for phpstan-dba to detect this and report an error (it certainly seems complicated when you're doing more than just a simple equality check against a column value), but if it could, I would definitely make good use of it.

This seems a bit far afield from the current PR though, so if it's useful for you, I can submit it as a separate issue.

Use this instead of `accepts`. This fixes some additional edge cases
(such as when we should get *ERROR* and for mixed types).

Thanks to @staabm for the suggestion!
Do not transform a column IntegerRangeType if the input data is also
an IntegerRangeType. This allows for more precise checks for projects
using integer ranges in PHP.

Note that this doesn't change cases where the data IntegerRangeType is
part of a CompoundType.
@hemberger
Copy link
Contributor Author

I'm finding the conditional integer range checks added in 046ba5c (when the input value type is an integer range as well) to be counter-productive. My codebase is still fundamentally not database-int-size aware, but I perform some checks on integers, like making sure a value is positive, which makes them integer ranges. As a result, I get errors like the following:

Query error: Column "table.column" expects value type int<0, 4294967295>, got type int<1, max>

I don't see any way to achieve a middle-ground here. If I try to add size-awareness for these cases, then I end up needing it everywhere in my code. The only solution I can think of is to remove 046ba5c from this PR and possibly add a configuration option to enable/disable integer range checks.

@staabm
Copy link
Owner

staabm commented Mar 12, 2023

I don't see any way to achieve a middle-ground here. If I try to add size-awareness for these cases, then I end up needing it everywhere in my code. The only solution I can think of is to remove 046ba5c from this PR and possibly add a configuration option to enable/disable integer range checks.

I see.

lets add a new RuntimeConfig which allows to enable parameter validation with a strict and a lax mode.

…hecks

This replaces the problematic conditional integer range checks.
Now you either get no integer range type checks (default) or all
integer range types are checked (w/ config option).
@hemberger hemberger requested a review from staabm March 14, 2023 05:35
/**
* @var bool
*/
private $doctrineKeyValueIntegerRangeChecks = false;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we dhould have a more generic name and and use the same switch for parameter validation in other rules

Suggested change
private $doctrineKeyValueIntegerRangeChecks = false;
/** @var 'lax'|'strict' */
private $parameterTypeValidation ='lax';

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm certainly happy to change the option name. Could I ask, what did you envision using it for? My intention was for it to be specific to integer range types, since in my use case I don't think I'd want "lax" validation for any other type. Would you be okay with something like strictIntegerRangeChecks = bool (or integerRangeChecks = 'lax'|'strict')?

Copy link
Owner

@staabm staabm Mar 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you don't want strict/lax validation for e.g. numeric-string/non-falsy-string/non-empty-string/string ?


integerRangeChecks = 'lax'|'strict'

Sounds good

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you don't want strict/lax validation for e.g. numeric-string/non-falsy-string/non-empty-string/string ?

With the integer range checks, the issue is that the parameter types (IntegerType, in most of my cases) are wider than the database type (IntegerRangeType), and so a strict check fails for those parameters. Is the same scenario possible for strings? Looking at MysqlTypeMapper, it seems that string-like columns are all mapped to StringType, and so valid parameter types shouldn't be wider than that, even if we're being lax.

However, I could definitely be missing something here. Do you have an example in mind? I hope this doesn't come across as being difficult, I'm genuinely trying to learn so that I can become a better contributor to this repo.

Either way, I won't be able to follow-up until tomorrow, as it is midnight here. Thanks for your patience. :)

Copy link
Owner

@staabm staabm Mar 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, I could definitely be missing something here. Do you have an example in mind?

the RuntimeConfig->stringifyTypes() option turns integers into numeric-string. this means people are expected to pass numeric-string into the db layer instead of a plain string. I think thats a similar situation as with integer-ranges.

also as we progress further and support more SQL AST inference we might realize that some parameters should have a certain type, because of the place/syntax they are used in, e.g.

UPDATE your_table
SET date_field = DATE(STR_TO_DATE(:date, '%m/%d/%Y'))

where we could expect a non-empty-string for :date in the way it is used.

Either way, I won't be able to follow-up until tomorrow, as it is midnight here. Thanks for your patience. :)

you are doing a great job. no need to hurry or to be worried about anything :).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, I went with your original suggestion, then. If we end up needing more fine-grained control in the future, hopefully it won't be too much trouble to add extra options (or perhaps extra modes for this option).

Thanks for the kind words :)

This option name is more generic, so that it can be used for future
type validation configuration in the future. It currently only applies
to DoctrineKeyValueStyleRule integer range checks.
src/QueryReflection/RuntimeConfiguration.php Outdated Show resolved Hide resolved
@@ -92,7 +93,7 @@ public function testIntegerRanges(): void
],
];

QueryReflection::getRuntimeConfiguration()->enableDoctrineKeyValueIntegerRangeChecks(true);
QueryReflection::getRuntimeConfiguration()->parameterTypeValidation(RuntimeConfiguration::VALIDATION_MODE_STRICT);
Copy link
Owner

@staabm staabm Mar 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please create a separate test class for strict tests
(changing global state within a test can lead to problems)

Copy link
Owner

@staabm staabm Mar 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be more precise: we need to test the integer ranges (or the validation errors) in strict and in lax mode and assert the expectations

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, it seems like a single test class would help avoid a lot of code duplication. For example, I could have one test method (parameterized by the config option) that performs checks that don't change, and then another method where the assertions change with the config option.

So long as we are properly resetting the config option in the tearDown method, I don't think there should be any issues with the global state changing in a test method. Up to you though - if you're willing to entertain an attempt.

Copy link
Owner

@staabm staabm Mar 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can just analyze the doctrine-key-value-style-integer-ranges.php file twice.

once from each test-class and using different runtimeconfig in each test-class.

that way we also make sure when adding a new case in doctrine-key-value-style-integer-ranges.php that we get 2 expectation errors and get reminded we need to assert lax and strict variants

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that we get 2 expectation errors and get reminded we need to assert lax and strict variants

By parameterizing the test, you would get 2 expectation errors, but then you only need to change one line of code. E.g.

/**
 * @testWith ['lax']
 *           ['strict']
 */
public function testErrors(string $validationMode)
{
    QueryReflection::getRuntimeConfiguration()->parameterTypeValidation($validationMode);

    $expectedErrors = [...];

    $this->analyse(...);
}

This spares us having two exact copies of the large $expectedErrors array, and makes maintaining the tests easier.

(If you don't like the @testWith annotation, we could use a data provider method. In PHPUnit 10, the annotations are replaced with much nicer attributes, but I don't know if you're interested in updating to PHPUnit 10 any time soon.)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please don't. I want straigt forward tests which are as easy to read as we can get, without any test-framework magic.

I am fine with dataprovider itself, but I think when our test-results depend on a global switch we should have one test-class per value of that global switch. I don't mind the duplication for that at all

src/QueryReflection/RuntimeConfiguration.php Outdated Show resolved Hide resolved
@staabm
Copy link
Owner

staabm commented Mar 16, 2023

New last nits

hemberger and others added 3 commits March 16, 2023 00:08
@staabm
Copy link
Owner

staabm commented Mar 16, 2023

I will add the lax test case and merge it so I can push the next release

@hemberger
Copy link
Contributor Author

Okay, I would appreciate if you let me know next time, because I was working on it for the past 20 minutes.

@staabm
Copy link
Owner

staabm commented Mar 16, 2023

oh, I am sorry to hear that.

@staabm staabm changed the title DoctrineKeyValueStyleRule: stricter type checks Implement RuntimeConfig->parameterTypeValidation('strict'|'lax') and use it in DoctrineKeyValueStyleRule Mar 16, 2023
@staabm staabm enabled auto-merge (squash) March 16, 2023 09:27
@staabm staabm merged commit bb2b845 into staabm:main Mar 16, 2023
@hemberger hemberger deleted the doctrine-types-fix branch March 16, 2023 10:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants