Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The construct "where.not" doesn't respect boolean algebra #31209

Closed
fterrazzoni opened this issue Nov 23, 2017 · 17 comments

Comments

Projects
None yet
8 participants
@fterrazzoni
Copy link

commented Nov 23, 2017

Steps to reproduce

When two predicates A and B are given to a where clause, Rails combines them using AND, which is the expected behavior.

Model.where(col1: 1, col2: 2).to_sql

# => SELECT "models".* FROM "models" WHERE "models"."col1" = 1 AND "models"."col2" = 2

When the clause is negated (using where.not), Rails distributes the NOT to every predicate before combining them using AND. Why?

Model.where.not(col1: 1, col2: 2).to_sql

# => SELECT "models".* FROM "models" WHERE ("models"."col1" != 1) AND ("models"."col2" != 2)

Expected behavior

NOT(A AND B) = NOT(A) OR NOT(B)

In my example, the query should look like:

SELECT "models".* FROM "models" WHERE ("models"."col1" != 1) OR ("models"."col2" != 2)

Actual behavior

NOT(A AND B) = NOT(A) AND NOT(B) (wrong)

It's worth noting that the correct behavior can still be obtained by writing SQL directly:

Model.where.not('"models"."col1" = 1 AND "models"."col2" = 2').to_sql
# => SELECT "models".* FROM "models" WHERE (NOT ("models"."col1" = 1 AND "models"."col2" = 2))

It makes the construct where.not(...) behave very differently depending on the kind of arguments which are passed to it.

The culprit seems to be located here :

WhereClause.new(inverted_predicates)

System configuration

Rails version: 5.1.4

Ruby version: MRI 2.4.2

@matthewd

This comment has been minimized.

Copy link
Member

commented Nov 23, 2017

There's an example in the API documentation which shows this behaviour, so it's definitely "we can't silently change this" documented. (Plus I'm sure people are depending on it, and we reallllly don't want to change how queries evaluate.)

I do wish this didn't work the way it does, though. ISTM the best path forward would be a new where.foo (inverse?) that does the Right Thing -- then we can use a deprecation to encourage people to switch. (And then, ultimately, reclaim not as an alias.)

We can do all that while only disrupting people who are passing multiple values, so it seems viable on the benefit-vs-inconvenience front.

@jeremy does that plan sound reasonable? I'm not certain about the history... was this meaning a deliberate choice that we don't want to change, or more of an edge case that "just happened"? Do you recall?

@fterrazzoni

This comment has been minimized.

Copy link
Author

commented Nov 23, 2017

Thank you for this very quick answer. Indeed, I misread the documentation and didn't see the example. So unfortunately, this isn't a bug strictly speaking...

I would love to see this behavior deprecated/changed though.

@jeremy

This comment has been minimized.

Copy link
Member

commented Nov 23, 2017

Some discussion here: #8332 (comment) (@amatsuda)

And the documentation was added later: 6b42e0f (@fxn)

@fxn

This comment has been minimized.

Copy link
Member

commented Nov 24, 2017

For me it is fine the way it is.

If where.not was meant to be defined in terms of boolean operators, this would be a bug. But where.not is just an AND of negations by definition. I personally see no problem and the behavior of where does not make me believe where.not should behave one way or another. (Let me add that in my last years in the math faculty including one of PhD I specialized on mathematical logic and foundations of math).

A different topic is if we want API to express an OR of negations.

@matthewd

This comment has been minimized.

Copy link
Member

commented Nov 25, 2017

I guess the main reason I'm inclined to change it, separate from any expectation/intuition, is that the alternative seems far more useful to me.

It would be perfectly defensible, designed from scratch, for multiple keys to where to mean OR by definition. But it doesn't, because that's not the most common thing to need to express. I claim that while any well-stated definition (and let's be honest, a single example with no mention in the text is not well-stated) is fine by definition, the current definition is not a tool people will regularly have reason to reach for -- by the same justification that we lived without an actual OR operator for so long.

We also already have another syntax to briefly AND two not-conditions together: x.where.not(a: 1).where.not(b: 2).

(Likewise, anyone currently encountering this and just wanting to produce the right expression can use x.where.not(a: 1).or(x.where.not(b: 2)))

@fxn

This comment has been minimized.

Copy link
Member

commented Nov 25, 2017

Yes, agree that ORing would have been just as defensible from scratch, and to choose the behavior based on what is the most common need (I would not be able to say which that is though).

Also, this issue means it is probably a good idea to mention the AND explicitly in the docs, which right now kinda say: "basically same as where, read the details there". If we close this one, I'll write it myself.

@octavpo

This comment has been minimized.

Copy link

commented Dec 4, 2017

I strongly believe the current behavior is the right one. The OP is based on the interpretation that the string col1: 1, col2: 2 as argument of where and where.not means a logical AND between the two conditions. I don't see a justification for this interpretation. This is just a sequence of method arguments, used as a shortcut for the alternative of specifying each of the conditions separately. In other words:

where(col1: 1, col2: 2)

is just a shortcut form for:

where(col1: 1).where(col2: 2)

I think it's highly desirable then if:

where.not(col1: 1, col2: 2)

were also just a shortcut for:

where.not(col1: 1).where.not(col2: 2)

So to give it the semantics in the OP, you'd have to give the second form the same semantics, which I think is highly problematic. You'd have to explain that where.not conditions are collected separately from where conditions, then combined with AND and the result put under a single NOT. I don't think that would be what most people would expect, especially when the conditions come from different scopes.

@kamipo kamipo added the activerecord label Dec 4, 2017

@matthewd

This comment has been minimized.

Copy link
Member

commented Dec 4, 2017

You'd have to explain that where.not conditions are collected separately from where conditions, then combined with AND and the result put under a single NOT. I don't think that would be what most people would expect, especially when the conditions come from different scopes.

That's not what's proposed here. The proposal is that where(col1: 1, col2: 2) is approximately where(col1 == 1 and col2 == 2), and that therefore where.not(col1: 1, col2: 2) would be where(not(col1 == 1 and col2 == 2)).

i.e., that "where.not" is a negation of the "where" (taking its arguments as a whole), instead of inverting each argument individually.


@octavpo where(col: [1,2]) means "where col = 1 OR col = 2". Should it then follow that where.not(col: [1,2]) must mean "where col != 1 OR col != 2"? If not, why not?

Moreover, as I suggested in #31209 (comment), this isn't about what's "right", because whatever we document is right by definition. The question is which is useful. I am contending that the useful opposite of "where these conditions are all true" is "where these conditions are not all true", and not "where these conditions are all false" -- especially when the latter has quite variable interpretation as to the boundary between each condition.

@octavpo

This comment has been minimized.

Copy link

commented Dec 4, 2017

That's not what's proposed here. The proposal is that where(col1: 1, col2: 2) is approximately where(col1 == 1 and col2 == 2), and that therefore where.not(col1: 1, col2: 2) would be where(not(col1 == 1 and col2 == 2)).

Yes but what's proposed here breaks the equivalence of where.not(col1: 1, col2: 2) and where.not(col1: 1).where.not(col2: 2). I was saying that if you still wanted to maintain the equivalence you'd have to do the above.

@octavpo where(col: [1,2]) means "where col = 1 OR col = 2". Should it then follow that where.not(col: [1,2]) must mean "where col != 1 OR col != 2"? If not, why not?

where(col: [1,2]) doesn't mean "where col = 1 OR col = 2", it means "where col in (1,2)". So it's natural that where.not(col: [1,2]) would mean "where col not in (1,2)". You don't have two method arguments here.

The problem comes from interpreting where(col1: 1, col2: 2) as meaning logical AND, when it just means 'add these two conditions to the list of where conditions'. So where.not(col1: 1, col2: 2) should just mean 'add these two negated conditions to the list'. where as currently defined in AR doesn't take logical expressions as arguments, just lists of conditions.

Of course the whole issue comes from where using this strange syntax instead of proper logical expressions, like where(col1 == 1 && col2 == 2). This couldn't quite work since && and || are not methods, but I don't see why something like where((col1 == 1) & (col2 == 2)) couldn't work, see squeel.

Moreover, as I suggested in #31209 (comment), this isn't about what's "right", because whatever we document is right by definition. The question is which is useful. I am contending that the useful opposite of "where these conditions are all true" is "where these conditions are not all true", and not "where these conditions are all false" -- especially when the latter has quite variable interpretation as to the boundary between each condition.

I'd say that what's right is what's expected according to the language semantics, so it avoids surprises. In terms of usefulness, I could also argue that it would be more useful if where(col1: 1, col2: 2) actually was translated to "where col1 = 1 or col2 = 2", since if I wanted an AND I could just write where(col1: 1).where(col2: 2). The OR is currently harder to generate.

@fterrazzoni

This comment has been minimized.

Copy link
Author

commented Dec 6, 2017

The problem comes from interpreting where(col1: 1, col2: 2) as meaning logical AND, when it just means 'add these two conditions to the list of where conditions'.

All predicates in where clause are combined using a logical AND, so this is exactly the same thing.

Most people I talked with are understanding this as where('col1 = 1 AND col2 = 2'), and would think that the following assertion is correct (whereas it's not)

where('NOT(col1 = 1 AND col2 = 2)') = where.not(col1: 1, col2: 2)

The choice of distributing the NOT operator to every argument is arbitrary. Sure, the documented behavior is right by definition, but it would be nicer and less surprising to be compliant with basic boolean algebra. Another consequence, the following assertion is also incorrect if there are more than one condition.

all = where.not(conds) + where(conds)

If you're not aware of this behavior you would probably assume it to be true. The documentation doesn't make it obvious today.

@al2o3cr

This comment has been minimized.

Copy link
Contributor

commented Dec 6, 2017

@fterrazzoni A minor quibble: all == where.not(conds) + where(conds) is also likely to be false if any of the conditions reference nullable columns. As a concrete example, given a table things with a nullable integer column a_number the code:

Thing.where(a_number: 5) + Thing.where.not(a_number: 5)

does not yield any rows with a_number NULL.

SQL logic != boolean logic.

@fterrazzoni

This comment has been minimized.

Copy link
Author

commented Dec 6, 2017

Hum, you're totally right about NULLs and indeed, my comment is only valid when all columns are NOT NULL. I should have mentioned that!

However, even in SQL three-valued logic, these statements are always equivalent:

NOT(col1 = 1 AND col2 = 2)
NOT(col1 = 1) OR NOT(col2 = 2)

The NOT never distributes like it does in Rails.

@octavpo

This comment has been minimized.

Copy link

commented Dec 7, 2017

Let me try to rephrase this. What we're talking about here is what's the "right" scope of the NOT in where.not methods, under some not clearly specified definition of "right" (useful, consistent with ruby, consistent with SQL, consistent with boolean algebra, etc). There are 3 choices I can see:

  1. It has scope over each individual condition
  2. It has scope over all conditions listed in a single whole where.not method
  3. It has scope over all conditions collected from all where.not methods

The current semantics is 1. and in my opinion it's the simplest and most consistent with the ruby language semantics, but you're right it's arbitrary. You prefer 2. and you have your arguments, but you cannot say it's less arbitrary. It's just a matter of preference.

More important from my point of view is the question of why we have to deal with these 'pseudo-logical' expressions instead of being able to write exactly the logical formulas we want. As I was saying before, squeel showed us what it could look like, and the author was saying there were discussions to incorporate it into Rails core. Does anybody know or can point to some discussion on why this never happened?

@matthewd

This comment has been minimized.

Copy link
Member

commented Dec 7, 2017

It has scope over each individual condition

When col: [1,2] is an "individual condition" but rel: { c1: 1, c2: 2 } is two, I have many doubts about the simplicity and the consistency of that definition. And that's not even touching where.not(polymorphic_association: obj), which currently indefensibly (I would hope!) expands to where.not(polymorphic_association_type: obj.class.name, polymorphic_association_id: obj.id) 🙈

squeel showed us what it could look like, and the author was saying there were discussions to incorporate it into Rails core

I know of no such discussion; it might've been before my time. Personally, I find that style of extreme pseudo-ruby DSL to be too fragile for core... but that's what gems are for.

@octavpo

This comment has been minimized.

Copy link

commented Dec 12, 2017

When col: [1,2] is an "individual condition" but rel: { c1: 1, c2: 2 } is two, I have many doubts about the simplicity and the consistency of that definition.

Well, they do generate 1 and respectively 2 SQL conditions. How many conditions does something like col: 1..10 represent then? 2? 10? An infinity? It still generates a single SQL condition.

And that's not even touching where.not(polymorphic_association: obj), which currently indefensibly (I would hope!) expands to where.not(polymorphic_association_type: obj.class.name, polymorphic_association_id: obj.id)

This is an interesting case. I'd have thought that this is exactly the expansion you'd want under your preferred scope for not. So you could use it as an argument saying that under option 2 Rails wouldn't need to do anything special about it, it would just work. Under the current scope it's definitely bad.

I was trying to look through my code for anything like this, to see what would be most useful to me. I couldn't find anything exactly like it, but I have a couple of cases where I have a negative condition on some other attribute of a polymorphic association, and in those cases what I needed is an extra condition that checks that the type is still the same as for the elements with the negated condition. So by analogy it seems that what would be most useful to me under option 1 was if it expanded to:

where(polymorphic_association_type: obj.class.name).where.not(polymorphic_association_id: obj.id)

I know of no such discussion; it might've been before my time. Personally, I find that style of extreme pseudo-ruby DSL to be too fragile for core... but that's what gems are for.

Well, I wouldn't call redefining existing methods for new types of objects extreme pseudo-ruby, I think this is pretty standard technique in object-oriented programming. And actually Arel is already using it, but it's defining them as bitwise operators. Which makes sense, and maybe it would be a little too much to overload them also as regular boolean operators, but it's still pretty bad that we don't have something similar for && and ||, that would be much more useful.

The main issue with having this as a separate gem is that it depends a lot on internal Rails stuff, so it's hard to maintain by somebody not part of Rails core team. If Rails core team were willing to maintain it that would be great.

@inopinatus

This comment has been minimized.

Copy link
Contributor

commented Feb 25, 2018

Is this related? I have scopes producing incorrect results on polymorphic "where not":

class MyThing
  belongs_to :owner, polymorphic: true
  scope :not_owned_by, ->(owner) { where.not(owner: owner) }
  #...
end

Now MyThing.not_owned_by(Account.find(172483)) incorrectly produces

SELECT "my_things".* FROM "my_things" WHERE ("my_things"."owner_type" != 'Account') AND ("my_things"."owner_id" != 172483)

which returns too few records, rather than the correct

SELECT "my_things".* FROM "my_things" WHERE NOT ("my_things"."owner_type" = 'Account' AND "my_things"."owner_id" = 172483)

which just excludes one record.

@rails-bot rails-bot bot added the stale label May 26, 2018

@rails-bot

This comment has been minimized.

Copy link

commented May 26, 2018

This issue has been automatically marked as stale because it has not been commented on for at least three months.
The resources of the Rails team are limited, and so we are asking for your help.
If you can still reproduce this error on the 5-2-stable branch or on master, please reply with all of the information you have about it in order to keep the issue open.
Thank you for all your contributions.

@rails-bot rails-bot bot closed this Jun 2, 2018

kamipo added a commit to kamipo/rails that referenced this issue Apr 19, 2019

Deprecate `where.not` working as NOR and will be changed to NAND in R…
…ails 6.1

`where.not` with polymorphic association is partly fixed incidentally at
213796f (refer rails#33493, rails#26207, rails#17010, rails#16983, rails#14161), and I've added
test case e9ba12f to avoid lose that fix accidentaly in the future.

In Rails 5.2, `where.not(polymorphic: object)` works as expected as
NAND, but `where.not(polymorphic_type: object.class.polymorphic_name, polymorphic_id: object.id)`
still unexpectedly works as NOR.

To will make `where.not` working desiredly as NAND in Rails 6.1, this
deprecates `where.not` working as NOR. If people want to continue NOR
conditions, we'd encourage to them to `where.not` each conditions
manually.

```ruby
all = [treasures(:diamond), treasures(:sapphire), cars(:honda), treasures(:sapphire)]
assert_equal all, PriceEstimate.all.map(&:estimate_of)
```

In Rails 6.0:

```ruby
sapphire = treasures(:sapphire)

nor = all.reject { |e| e.estimate_of_type == sapphire.class.polymorphic_name }.reject { |e| e.estimate_of_id == sapphire.id }
assert_equal [cars(:honda)], nor

without_sapphire = PriceEstimate.where.not(estimate_of_type: treasure.class.polymorphic_name, estimate_of_id: treasure.id)
assert_equal nor, without_sapphire.map(&:estimate_of)
```

In Rails 6.1:

```ruby
sapphire = treasures(:sapphire)

nand = all - [sapphire]
assert_equal [treasures(:diamond), cars(:honda)], nand

without_sapphire = PriceEstimate.where.not(estimate_of_type: treasure.class.polymorphic_name, estimate_of_id: treasure.id)
assert_equal nand, without_sapphire.map(&:estimate_of)
```

Resolves rails#31209.

kamipo added a commit to kamipo/rails that referenced this issue Apr 19, 2019

Deprecate `where.not` working as NOR and will be changed to NAND in R…
…ails 6.1

`where.not` with polymorphic association is partly fixed incidentally at
213796f (refer rails#33493, rails#26207, rails#17010, rails#16983, rails#14161), and I've added
test case e9ba12f to avoid lose that fix accidentally in the future.

In Rails 5.2, `where.not(polymorphic: object)` works as expected as
NAND, but `where.not(polymorphic_type: object.class.polymorphic_name, polymorphic_id: object.id)`
still unexpectedly works as NOR.

To will make `where.not` working desiredly as NAND in Rails 6.1, this
deprecates `where.not` working as NOR. If people want to continue NOR
conditions, we'd encourage to them to `where.not` each conditions
manually.

```ruby
all = [treasures(:diamond), treasures(:sapphire), cars(:honda), treasures(:sapphire)]
assert_equal all, PriceEstimate.all.map(&:estimate_of)
```

In Rails 6.0:

```ruby
sapphire = treasures(:sapphire)

nor = all.reject { |e| e.estimate_of_type == sapphire.class.polymorphic_name }.reject { |e| e.estimate_of_id == sapphire.id }
assert_equal [cars(:honda)], nor

without_sapphire = PriceEstimate.where.not(estimate_of_type: treasure.class.polymorphic_name, estimate_of_id: treasure.id)
assert_equal nor, without_sapphire.map(&:estimate_of)
```

In Rails 6.1:

```ruby
sapphire = treasures(:sapphire)

nand = all - [sapphire]
assert_equal [treasures(:diamond), cars(:honda)], nand

without_sapphire = PriceEstimate.where.not(estimate_of_type: treasure.class.polymorphic_name, estimate_of_id: treasure.id)
assert_equal nand, without_sapphire.map(&:estimate_of)
```

Resolves rails#31209.

kamipo added a commit to kamipo/rails that referenced this issue Apr 19, 2019

Deprecate `where.not` working as NOR and will be changed to NAND in R…
…ails 6.1

`where.not` with polymorphic association is partly fixed incidentally at
213796f (refer rails#33493, rails#26207, rails#17010, rails#16983, rails#14161), and I've added
test case e9ba12f to avoid lose that fix accidentally in the future.

In Rails 5.2, `where.not(polymorphic: object)` works as expected as
NAND, but `where.not(polymorphic_type: object.class.polymorphic_name,
polymorphic_id: object.id)` still unexpectedly works as NOR.

To will make `where.not` working desiredly as NAND in Rails 6.1, this
deprecates `where.not` working as NOR. If people want to continue NOR
conditions, we'd encourage to them to `where.not` each conditions
manually.

```ruby
all = [treasures(:diamond), treasures(:sapphire), cars(:honda), treasures(:sapphire)]
assert_equal all, PriceEstimate.all.map(&:estimate_of)
```

In Rails 6.0:

```ruby
sapphire = treasures(:sapphire)

nor = all.reject { |e| e.estimate_of_type == sapphire.class.polymorphic_name }.reject { |e| e.estimate_of_id == sapphire.id }
assert_equal [cars(:honda)], nor

without_sapphire = PriceEstimate.where.not(estimate_of_type: treasure.class.polymorphic_name, estimate_of_id: treasure.id)
assert_equal nor, without_sapphire.map(&:estimate_of)
```

In Rails 6.1:

```ruby
sapphire = treasures(:sapphire)

nand = all - [sapphire]
assert_equal [treasures(:diamond), cars(:honda)], nand

without_sapphire = PriceEstimate.where.not(estimate_of_type: treasure.class.polymorphic_name, estimate_of_id: treasure.id)
assert_equal nand, without_sapphire.map(&:estimate_of)
```

Resolves rails#31209.

kamipo added a commit to kamipo/rails that referenced this issue Apr 19, 2019

Deprecate `where.not` working as NOR and will be changed to NAND in R…
…ails 6.1

`where.not` with polymorphic association is partly fixed incidentally at
213796f (refer rails#33493, rails#26207, rails#17010, rails#16983, rails#14161), and I've added
test case e9ba12f to avoid lose that fix accidentally in the future.

In Rails 5.2, `where.not(polymorphic: object)` works as expected as
NAND, but `where.not(polymorphic_type: object.class.polymorphic_name,
polymorphic_id: object.id)` still unexpectedly works as NOR.

To will make `where.not` working desiredly as NAND in Rails 6.1, this
deprecates `where.not` working as NOR. If people want to continue NOR
conditions, we'd encourage to them to `where.not` each conditions
manually.

```ruby
all = [treasures(:diamond), treasures(:sapphire), cars(:honda), treasures(:sapphire)]
assert_equal all, PriceEstimate.all.map(&:estimate_of)
```

In Rails 6.0:

```ruby
sapphire = treasures(:sapphire)

nor = all.reject { |e|
  e.estimate_of_type == sapphire.class.polymorphic_name
}.reject { |e|
  e.estimate_of_id == sapphire.id
}
assert_equal [cars(:honda)], nor

without_sapphire = PriceEstimate.where.not(
  estimate_of_type: treasure.class.polymorphic_name, estimate_of_id: treasure.id
)
assert_equal nor, without_sapphire.map(&:estimate_of)
```

In Rails 6.1:

```ruby
sapphire = treasures(:sapphire)

nand = all - [sapphire]
assert_equal [treasures(:diamond), cars(:honda)], nand

without_sapphire = PriceEstimate.where.not(
  estimate_of_type: treasure.class.polymorphic_name, estimate_of_id: treasure.id
)
assert_equal nand, without_sapphire.map(&:estimate_of)
```

Resolves rails#31209.

kamipo added a commit to kamipo/rails that referenced this issue Apr 19, 2019

Deprecate `where.not` working as NOR and will be changed to NAND in R…
…ails 6.1

`where.not` with polymorphic association is partly fixed incidentally at
213796f (refer rails#33493, rails#26207, rails#17010, rails#16983, rails#14161), and I've added
test case e9ba12f to avoid lose that fix accidentally in the future.

In Rails 5.2, `where.not(polymorphic: object)` works as expected as
NAND, but `where.not(polymorphic_type: object.class.polymorphic_name,
polymorphic_id: object.id)` still unexpectedly works as NOR.

To will make `where.not` working desiredly as NAND in Rails 6.1, this
deprecates `where.not` working as NOR. If people want to continue NOR
conditions, we'd encourage to them to `where.not` each conditions
manually.

```ruby
all = [treasures(:diamond), treasures(:sapphire), cars(:honda), treasures(:sapphire)]
assert_equal all, PriceEstimate.all.map(&:estimate_of)
```

In Rails 6.0:

```ruby
sapphire = treasures(:sapphire)

nor = all.reject { |e|
  e.estimate_of_type == sapphire.class.polymorphic_name
}.reject { |e|
  e.estimate_of_id == sapphire.id
}
assert_equal [cars(:honda)], nor

without_sapphire = PriceEstimate.where.not(
  estimate_of_type: sapphire.class.polymorphic_name, estimate_of_id: sapphire.id
)
assert_equal nor, without_sapphire.map(&:estimate_of)
```

In Rails 6.1:

```ruby
sapphire = treasures(:sapphire)

nand = all - [sapphire]
assert_equal [treasures(:diamond), cars(:honda)], nand

without_sapphire = PriceEstimate.where.not(
  estimate_of_type: sapphire.class.polymorphic_name, estimate_of_id: sapphire.id
)
assert_equal nand, without_sapphire.map(&:estimate_of)
```

Resolves rails#31209.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.