Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected result when using string comparison AND IS NULL #15395

Closed
suyZhong opened this issue Jan 18, 2024 · 7 comments · Fixed by #15401
Closed

Unexpected result when using string comparison AND IS NULL #15395

suyZhong opened this issue Jan 18, 2024 · 7 comments · Fixed by #15401
Assignees
Labels
bug Clear identification of incorrect behaviour

Comments

@suyZhong
Copy link

CrateDB version

CrateDB 5.6.0-SNAPSHOT built f6f4067

CrateDB setup information

Manual build following the instructions here: https://github.com/crate/crate/blob/master/devs/docs/basics.rst

Problem description

Consider the test cases below. It is unexpected that the third query returns empty result, since the second query returns TRUE, which means that the WHERE clause in the third query should be evaluated to TRUE and thus return the row in the table.

Steps to Reproduce

DROP TABLE IF EXISTS t1;

CREATE TABLE t1(c0 VARCHAR, PRIMARY KEY(c0));
INSERT INTO t1 (c0) VALUES ('a');
REFRESH TABLE t1;

SELECT * FROM t1; -- 'a'
SELECT NOT ((t1.c0 != '1' )AND (t1.c0 IS NULL)) FROM t1; -- TRUE
SELECT * FROM t1 WHERE NOT ((t1.c0 != '1' )AND (t1.c0 IS NULL));
-- Expected: 'a'
-- Actual:   empty result

Actual Result

Empty

Expected Result

'a'

@suyZhong suyZhong added the triage An issue that needs to be triaged by a maintainer label Jan 18, 2024
@matriv matriv added bug Clear identification of incorrect behaviour and removed triage An issue that needs to be triaged by a maintainer labels Jan 18, 2024
@matriv
Copy link
Contributor

matriv commented Jan 18, 2024

Also exists in 5.5.2.

@jeeminso jeeminso self-assigned this Jan 18, 2024
@jeeminso
Copy link
Contributor

Thanks a lot for reporting, @suyZhong!

SELECT * FROM t1 WHERE NOT ((t1.c0 != '1' )AND (t1.c0 IS NULL)); is equivalent to SELECT * FROM t1 WHERE NOT (t1.c0 IS NULL); but the former uses get (invalid optimization from collect attempting to utilize the primary key in the where clause) while latter uses collect operator:

cr> EXPLAIN SELECT * FROM t1 WHERE NOT ((t1.c0 != '1' )AND (t1.c0 IS NULL)); 
                                                                                                                 
+--------------------------------------------------------------------------------------+
| QUERY PLAN                                                                           |
+--------------------------------------------------------------------------------------+
| Get[doc.t1 | c0 | DocKeys{'1'} | (NOT ((NOT (c0 = '1')) AND (c0 IS NULL)))] (rows=1) |
+--------------------------------------------------------------------------------------+

cr> EXPLAIN SELECT * FROM t1 WHERE NOT ((t1.c0 IS NULL)); 
                                                                                                                 
+------------------------------------------------------------+
| QUERY PLAN                                                 |
+------------------------------------------------------------+
| Collect[doc.t1 | [c0] | (NOT (c0 IS NULL))] (rows=unknown) |
+------------------------------------------------------------+

@jeeminso
Copy link
Contributor

Thanks again for reporting @suyZhong , this is fixed by #15401.

@matriv
Copy link
Contributor

matriv commented Jan 29, 2024

The fix causes issues with outer joins and it's being reverted with: #15468

@matriv matriv reopened this Jan 29, 2024
@jeeminso
Copy link
Contributor

jeeminso commented Jan 29, 2024

The fix causes issues with outer joins and it's being reverted with: #15468

Thank you for handling the revert @matriv. For the outer join scenario, the arg to is null was actually core.users_roles.user_id which I think could been some kind of a ScopedSymbol(or AliasSymbol?) such that it would be prevented from is null normalization. WDYT? If sounds feasible, this could be something to look into during feature freeze.

@matriv
Copy link
Contributor

matriv commented Jan 29, 2024

Thank you for handling the revert @matriv. For the outer join scenario, the arg to is null was actually core.users_roles.user_id which I think could been some kind of a ScopedSymbol(or AliasSymbol?) such that it would be prevented from is null normalization. WDYT? If sounds feasible, this could be something to look into during feature freeze.

We should not optimize/normalize the is null/is not null of a where clause, when it acts on a column of the table on the "outer" side of the join, I don't think AliasSymbol or so is enough to catch that.

@jeeminso
Copy link
Contributor

jeeminso commented Feb 1, 2024

Fixed by #15487.

@jeeminso jeeminso closed this as completed Feb 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Clear identification of incorrect behaviour
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants