Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

3.6: Fix hidden query bugs a.k.a. IS NULL issue of != NULL #11789

Closed
2 tasks done
dereuromark opened this issue Mar 5, 2018 · 17 comments
Closed
2 tasks done

3.6: Fix hidden query bugs a.k.a. IS NULL issue of != NULL #11789

dereuromark opened this issue Mar 5, 2018 · 17 comments

Comments

@dereuromark
Copy link
Member

This is a (multiple allowed):

  • bug
  • enhancement

Motivation:

  • Hidden bugs in software, often times from 2.x to 3.x upgrades
  • High level needs to know about things it shouldnt need to know
  • Methods like findOrCreate() and using $conditions array for finding + creating is way more complex currently

IN

This has been annoying me since day one of 3.x, and every time I upgrade a 2.x app, this annoys me even more.
The main reason: It contains hidden bugs as the resulting SQL query is invalid but silently just returns 0 results.
So no one ever knows if those queries work since no results just means "no hits" usually, no one would suspect the query to be invalid then, of course.

$x = $this->Logs->find()
    ->where(['comment' => null])
    ->all();

SQL:

WHERE 
  `comment` = NULL

=> 0 rows, silently returning nothing

That makes no sense to me. In what use case would one ever want to produce such a query? It does not throw a DB exception but is an invalid query that will never yield any useful result.
The desired behavior is pretty clear - make the DB check for the fields that are NULL (the developer doesnt want/need to know the specifics of the query building IMO).

$x = $this->Logs->find()
    ->where(['comment IS' => null])
    ->all();

and the resulting

WHERE 
  (`comment`) IS NULL

suddenly returns the expected 3 records.

So two things we can do for 3.6:

Solution A

The obvious one: Throw an exception for invalid queries instead of silently doing nothing creates a false sense of correctness here.
This would be somewhat in line with IN / NOT IN behavior for empty rows.

Solution B

Go back to the very intuitive and IMO most correct way of using 2.x behavior:
Once a null is passed, use that for the query with proper IS or IS NOT usage.

Especially for the positive case (IS) I do not see any harm here - especially compared to the current bugs we have due to the silent query fault.

That said I am fine with adding extra methods that would provide this. But I honestly do not think those should be necessary.
Cake ORM should go back to the dev friendly way of converting this to what the DB needs without having to force the developer to add extras everywhere.

I still have hundreds of these silent issues somewhere in upgraded apps and they are currently impossible to detect really, unless you use 100% test coverage
and always at least fixture data of >= 1 returned (never allow empty result collection for reading in tests). This is not feasable.
Let's please fix this now.

NOT IN

To complete Solution B, also the negatation should then work as expected (or throw meaningful exception as A suggests):

find()->where(['comment !=' => null])->all();

The generated query should never be

WHERE 
  `comment` != NULL

but

WHERE 
  (`comment`) IS NOT NULL

Are there any issues of doing this properly inside the ORM? Or do we have to go with the exception because certain use cases such as nested 'NOT' conditions might fail here?

Schema NOT NULL

Implications for when the schema doesnt even allow null values are simple:
We could automatically use empty string then right away (NO IS NULL check) for the condition.

This becomes necessary IMO if we start throwing exceptions - for the other case it would be a useful convenience here, as the other one would never result valid results either.

Nullable data integrety

This also relates somewhat to #9678 since it then could based on schema always just require the right empty value for the condition, and with these changes here require the right IS vs ==/NOT IS vs != usage.
But I can see why some people would not want too much magic here going on based on schema and instead probably prefer some runtime exception or more manual thought put into query building.
But with the related issue being fixed this one here will also become more clear hopefully.

@inoas
Copy link
Contributor

inoas commented Mar 5, 2018

👍 for caring and bringing up these kind of issues!

Do I get it right that one solution would be to treat NULL as NULL and '' as '' and not mix/cast those?

However I don't like rewriting "foo !=" => $nullVar to FOO IS NOT NULL. And I am not sure I like any kind of casting magic in where conditions at all. Sounds like the chances are high for head aches and security holes. Maybe sometimes a column is nullable but also can take '', or it is not nullable but it can take '', or it is not nullable and neither is '' a legit value (in terms of app domain validation)... etc.

What about adding:

->whereIsNull($exp)
->whereIsNotNull($exp)
->whereEqEmptyString($exp)
->whereEqNotEmptyString($exp)

Not sure how to (conveniently) use these with 'OR' => or or_() but then I didn't try yet.

@inoas
Copy link
Contributor

inoas commented Mar 5, 2018

Another way maybe would be to add a \Cake\ORM\Query\functions file/namespace and simply add eq(), not(), isNull(), isNotNull(), notEmptyString(), eqEmptyString() and isNullish() as well as notNullish() latter to adding checks for IS NOT NULL and != '' at the same time.

Like:

->where(isNotNull($exp))->where(notEmptyString($exp));
->where(eq($exp, $val))->where(notEmptyString($exp));
->where(notNullish($exp));

@markstory
Copy link
Member

We left out the automatic null and automatic in rewrites as they create harder to grok query conditions that are not always obvious. I personally favor throwing an exception when the ORM is going to generate effectively invalid SQL.

@raul338
Copy link
Contributor

raul338 commented Mar 5, 2018

There are functions in QueryExpression: isNull, isNotNull

Maybe there should be more examples like this

$query->where([
    $query->newExpr()->isNull('comment'),
]);

instead of

$query->where([
    'comment IS' => null,
]);

However, the problem would still exists if its used this way

$parent_id = null;
if (someBussinessRule()) {
     $parent_id = 1;
}
$query->where([
    'parent_id' => $parent_id, // some might expect this would be converted to IS NULL
]);

@inoas
Copy link
Contributor

inoas commented Mar 5, 2018

@lorenzo got any neat idea how to handle these things strict and concise?

@lorenzo
Copy link
Member

lorenzo commented Mar 6, 2018

I’m also in the camp of throwing an exception when we are going to generate invalid sql.

The story of null gets complicated in the face of custom type classes. Is null in json columns the same as a json null, for example? I rather not make magic clever transformations in conditions that were not visibly asked for

@dereuromark
Copy link
Member Author

So before we don't act for 3.6, I am fine with the exception :)

@markstory markstory self-assigned this Mar 14, 2018
@markstory
Copy link
Member

Is this partially or fully addressed by #11782 @dereuromark ?

@dereuromark
Copy link
Member Author

This is a completely separate issue.

@markstory markstory removed their assignment Mar 29, 2018
@markstory markstory modified the milestones: 3.6.0, 3.7.0 Apr 15, 2018
@dereuromark
Copy link
Member Author

So are we OK with either making it behave correctly again like it used to in 2.x (expected behavior).
Or do we throw exceptions in 4.x to at least avoid the silent broken behavior of a completly unexpected and invalid SQL syntax?

@ADmad
Copy link
Member

ADmad commented Oct 15, 2018

We can throw exception if resulting query would have

`foo` = NULL

in 4.x.

@markstory markstory modified the milestones: 3.7.0, Future, 4.0.0 Oct 29, 2018
@dereuromark
Copy link
Member Author

Shall we add a deprecation notice then in 3.7 now?
So this is already found and fixed during 3.x development?

@markstory
Copy link
Member

A deprecation notice makes sense in 3.x

@dereuromark
Copy link
Member Author

Did we already implement this for 4.0? Then we can easily backport the depreciation and protection notice.

@markstory
Copy link
Member

I don't think any deprecation notices were added for this scenario.

@dereuromark
Copy link
Member Author

I added test cases as beginning for this fix:
4.x...4.x-null

But I am not sure how to best fix up the code to throw those meaningful exceptions.
Anyone wants to add a commit to (or PR towards) this branch?

markstory added a commit to cakephp/docs that referenced this issue Aug 11, 2019
@dereuromark
Copy link
Member Author

I guess we can close this now. For 3.x there is not much we can do/help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants