Raise error on empty in_ #283

erebus1 · 2016-06-05T09:56:52Z

I think it would be good idea to add error on empy in_ usage in version 1.1.*
Because current implementation is not fully correct:
Firstly, it's not optimised, because of sequential scan
Secondly, in current implementation null in () will return null, when in sql it should return False, try this:
select null in (select 1 from some_table where false); -> false but not null

I understant that this can break some legacy code, but current implementation is obscure and quite expensive

I think it would be good idea to add error on empy in_ usage in version 1.1.* Because current implementation is not fully correct: Firstly, it's not optimised, because of sequential scan Secondly, in current implementation null in () will return null, when in sql it should return False, try this: select null in (select 1 from some_table where false); -> false but not null I understant that this can break some legacy code, but current implementation is obscure and quite expensive

zzzeek · 2016-06-06T16:33:02Z

this raises a warning which you should promote to be an exception. how to do this is described at https://docs.python.org/3/library/warnings.html. Making this raise for everyone right now would be disruptive to small-data applications that aren't impacted by performance here, yet would require difficult reorganization of their queries to not include empty results within. Each major release e.g. 1.0, 1.1 etc already produces a lot of disruptions for people usually due to uncaught behavioral regressions and throwing in forcing this to be an error in all cases would be too much at this point. I would favor any number of ways to customize what empty in() does but all of them would require the user does something codewise to make it happen.

erebus1 · 2016-06-06T19:30:00Z

Thanks, I've missed about Warning customization.

But one last question, why you decide that null in () = null,
Isn't it better to change expr != expr on 1!=1 or something like this.
I think you've meant that null can be an empty set and by boolean logic empty set is contained in each empty set.
But as I can see, databases do not share your opinion and return false instead of null.
I'll be very grateful if you provide some explanation why you have chosen such strategy or some links on such kind of explanation.

zzzeek · 2016-06-06T21:49:18Z

there's no such thing as "empty in" on the database so I'm not sure how you can say what "NULL IN ()" should evaluate to. NULL means "unknown" which is why things like "NULL = NULL" return NULL, not False. For "NULL in ()", who knows, since relational databases refuse to even evaluate "1 IN ()". The expression we have here is the closest thing we could get to as many cases as possible which you can read about here: http://docs.sqlalchemy.org/en/rel_1_0/faq/sqlexpressions.html#why-does-col-in-produce-col-col-why-not-1-0 . I think beyond that we'd have to render an enormous CASE statement which we'd rather not get into. Feel free to suggest an expression that covers all the cases in that FAQ entry more effectively on every database. this is really old stuff.

erebus1 · 2016-06-07T19:14:22Z

Yes, you are right, we can't write like this 1 in () in SQL
But we can run such code: select null in (select 1 from some_table where false); which is nearly the same.
And in such case db will return false.
I can't fully understand why you consider, that null in () should be equal to null (undefined)?

I will appreciate, if you can comment above sql expression.

I'd like to clarify some aspects of algebra:

The empty set is the unique set having no elements

The empty set is a trivial subset of itself, but not a member of itself

Cardinality of empty set is zero

All this means that there is no such element that could consist in an empty set.
So even consider that null can be any element it could not consist in an empty set.

If you will change expr!=expr on 1!=1 it will save all algebra rules and decrease DB load

Based on all of the above, I think that it would be a good idea to change expr!=expr on 1!=1
because current implementation breaks algebra rules and makes useless work.

And of course, emitting warnings is useful stuff.

zzzeek · 2016-06-07T19:23:36Z

select null in (select 1 from some_table where false)

interestingly you might have a good way for us to get an empty set into IN there.

I can't fully understand why you consider, that null in () should be equal to null (undefined)?

I don't.

All this means that there is no such element that could consist in an empty set.
So even consider that null can be any element it could not consist in an empty set.

none of that matters because relational databases don't support this case anyway and we are only trying to approximate how close we can get.

If you will change expr!=expr on 1!=1 it will save all algebra rules and decrease DB load

it does not preserve algebra. Please read http://docs.sqlalchemy.org/en/rel_1_0/faq/sqlexpressions.html#why-does-col-in-produce-col-col-why-not-1-0.

zzzeek · 2016-06-07T19:35:35Z

heres where this is at. I will see if i can find the original discussion from about 8 years ago:

test=# create table t (x integer, y integer);
CREATE TABLE

test=# insert into t (x, y) values (1, 1), (2, 2), (3, 3), (4, NULL);
INSERT 0 4


-- 1. integer version:
-- SELECT * FROM t WHERE y IN ()
test=# select * from t where 1 != 1;
 x | y 
---+---
(0 rows)

-- 2. column version:
-- SELECT * FROM t WHERE y IN () 
test=# select * from t where y != y;
 x | y 
---+---
(0 rows)

-- 3. integer version - is NULL NOT IN the empty set?
-- however, your idea of (SELECT 1 WHERE false) does suggest it is
-- SELECT * FROM t WHERE y NOT IN ()
test=# select * from t where not 1 != 1;
 x | y 
---+---
 1 | 1
 2 | 2
 3 | 3
 4 |  
(4 rows)

-- 4. column version - different answer.  
-- SELECT * FROM t WHERE y NOT IN () 
test=# select * from t where not y != y;
 x | y 
---+---
 1 | 1
 2 | 2
 3 | 3
(3 rows)

zzzeek · 2016-06-07T19:53:53Z

the longest discussion on this is on this one: https://bitbucket.org/zzzeek/sqlalchemy/issues/1628

it comes down to:

"NULL NOT IN ()"

True, False, or NULL ?

erebus1 · 2016-06-07T19:57:40Z

`

Mike, I've read this article (http://docs.sqlalchemy.org/en/rel_1_0/faq/sqlexpressions.html#why-does-col-in-produce-col-col-why-not-1-0.) after your previous comment.

Single example, that I've found was:
column not in ()
which SQL alchemy translate in
column != column

the case is when column = null
Algebra version:
null not in () -> True
SqlAlchemy version:
not null != null -> null
Naive version:
not 1 != 1 -> True

So, as I can see, changing expr != expr on 1!=1 will only repair algebra logic.

I'm sorry if I've missed something and will appreciate if you explain what

If you want to use (select x where false)
you should take into account that you should conform type of x to type of column.
Because select true in (select 1 where False)' will raise error, you should writeselect true in (select true where false)'
but any case this seems like over-engineering, and using 1!=1 with emitting warning seems pretty good (IMHO)

zzzeek · 2016-06-07T19:58:09Z

and here is why it is NULL:

 # select NULL NOT IN (1, 2, 3);
 ?column? 
----------

(1 row)

this is why we ten consider NULL NOT IN () to also be NULL. But, perhaps that's wrong. If NULL means, "might be 1, 2 or 3, who knows?" and that's why it returns NULL, then maybe "NULL NOT IN ()" is true. I really don't think this is a cut and dry issue.

zzzeek · 2016-06-07T19:59:27Z

if we decide "NULL NOT IN ()" is True then we really could say "1 != 1" and remove the warning, I think.

erebus1 · 2016-06-07T20:13:43Z

Consider such example:
'select NULL NOT IN (1, 2, 3);'
It's wrong proving, and it seems I've provided enough mathematical and practical explanation why (empty set has no elements).

Consider warning:
Even if we change expr != expr on 1!=1 it will decrease DB load, but still emit useless query.
So warning can be useful for someone.

Anyway, you are the owner, and you'll decide how it should be, I've just found some misbehaviour, and tried to fix it.

As for me, I've promoted this warning to error in debug mode, and now debug code to remove all possible situations with empty in.
But I'll be glad to see a new version of SQL alchemy without such obscure bottleneck.

zzzeek · 2016-06-07T20:18:19Z

I think the rationale for "NULL NOT IN ()" is NULL was because "NULL NOT IN (1, 2, 3)" is NULL.

Still, you get very different results with this change. I'm going to run it through a few openstack tests at https://gerrit.sqlalchemy.org/#/c/103/ and I doubt anything in openstack hits this because they likely avoid the warning.

erebus1 · 2016-06-07T20:31:37Z

Ok, let's see what it'll be.
Thanks for the good discussion.

zzzeek · 2017-02-08T22:31:45Z

https://bitbucket.org/zzzeek/sqlalchemy/issues/3907/propose-null-is-not-in-the-empty-set is opened for 1.2.

erebus1 · 2017-02-28T16:21:23Z

Cool
When do you plan to release 1.2 version?

zzzeek · 2017-02-28T16:24:44Z

hopefully not any later than early summer, more hopefully sometime in the spring. the branch is not even set up yet.

erebus1 · 2017-02-28T16:29:47Z

Ok, I'll looking forward it

zzzeek closed this Jun 6, 2016

sqlalchemy-bot mentioned this pull request Nov 27, 2018

propose null is not in the empty set sqlalchemy/sqlalchemy#3907

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Raise error on empty in_ #283

Raise error on empty in_ #283

erebus1 commented Jun 5, 2016

zzzeek commented Jun 6, 2016

erebus1 commented Jun 6, 2016

zzzeek commented Jun 6, 2016

erebus1 commented Jun 7, 2016

zzzeek commented Jun 7, 2016

zzzeek commented Jun 7, 2016

zzzeek commented Jun 7, 2016

erebus1 commented Jun 7, 2016 •

edited

zzzeek commented Jun 7, 2016 •

edited

zzzeek commented Jun 7, 2016

erebus1 commented Jun 7, 2016

zzzeek commented Jun 7, 2016

erebus1 commented Jun 7, 2016

zzzeek commented Feb 8, 2017

erebus1 commented Feb 28, 2017

zzzeek commented Feb 28, 2017

erebus1 commented Feb 28, 2017

Raise error on empty in_ #283

Raise error on empty in_ #283

Conversation

erebus1 commented Jun 5, 2016

zzzeek commented Jun 6, 2016

erebus1 commented Jun 6, 2016

zzzeek commented Jun 6, 2016

erebus1 commented Jun 7, 2016

zzzeek commented Jun 7, 2016

zzzeek commented Jun 7, 2016

zzzeek commented Jun 7, 2016

erebus1 commented Jun 7, 2016 • edited

zzzeek commented Jun 7, 2016 • edited

zzzeek commented Jun 7, 2016

erebus1 commented Jun 7, 2016

zzzeek commented Jun 7, 2016

erebus1 commented Jun 7, 2016

zzzeek commented Feb 8, 2017

erebus1 commented Feb 28, 2017

zzzeek commented Feb 28, 2017

erebus1 commented Feb 28, 2017

erebus1 commented Jun 7, 2016 •

edited

zzzeek commented Jun 7, 2016 •

edited