Empty array interpolates as NULL in where condition #778

Closed
lighthouse-import opened this Issue May 16, 2011 · 34 comments
@lighthouse-import

Imported from Lighthouse. Original ticket at: http://rails.lighthouseapp.com/projects/8994/tickets/6133
Created by Mat Schaffer - 2011-02-19 09:28:28 UTC

This works fine:

Comment.where("id NOT IN (?)", [1,2,3])

But passed an empty array like so:

Comment.where("id NOT IN (?)", [])

Generates this query which doesn't have the intended effect of including all records:

SELECT "comments".* FROM "comments" WHERE (id NOT IN (NULL))

Seems like empty arrays should interpolate as an empty string rather than NULL.

@lighthouse-import

Imported from Lighthouse.
Comment by Mat Schaffer - 2010-12-08 16:17:23 UTC

Update: Used [''] as a workaround on sqlite. This doesn't seem to work on PostgreSQL.

@lighthouse-import

Imported from Lighthouse.
Comment by Mat Schaffer - 2010-12-08 16:35:29 UTC

Looks like both sqlite and PostgreSQL adapters (tested via heroku's console) both generate the NULL. PostgreSQL seems unhappy with the empty array though since it considers (id NOT IN ()) to be invalid.

@lighthouse-import

Imported from Lighthouse.
Comment by 2kan - 2010-12-19 11:23:15 UTC

@Mat Why do you think that

id NOT IN (NULL)

is incorrect?

For me everything works correct.

@lighthouse-import

Imported from Lighthouse.
Comment by Mat Schaffer - 2010-12-19 20:38:06 UTC

Originally I was expecting NOT IN () which at least under sqlite had the intended effect of giving me all records. But I'm pretty sure postgres didn't like that either. Right now I have an "if ids.empty?" in my model. But that seemed less than awesome.

Is there maybe a better way to do this using an arel condition that I missed?

@lighthouse-import

Imported from Lighthouse.
Comment by 2kan - 2010-12-19 22:55:14 UTC

@Mat, Oh, I think maybe you are right, and it is a bug, because in Postgres this query will return empty set of records, so in Ruby you will get an empty array. And I think we expect all records.

@lighthouse-import

Imported from Lighthouse.
Comment by gmile - 2011-02-12 12:59:08 UTC

@2kan, going further, does

IN (NULL)

ever works for you? For me, it seems that MySQL doesn't want to retrieve any records, while trying to do something like (codes from head):

Post.all(
  :joins      => "LEFT JOIN view_logs ON posts.id = view_logs.post_id",
  :conditions => { :view_logs => { :user_id => [1, 2, nil] }
}

The need to retrieve using nil may appear when using, for example LEFT JOIN, where associated records with nulls may appear. I believe, the true retrieve of records with nulls in terms of SQL should look like

view_logs.user_id IS NULL

Thus, I believe, the above Post.all(...) retrieval should generate the following SQL piece:

... WHERE (view_logs.user_id IN (1, 2) OR view_logs.user_id IS NULL)

I've made an Arel patch to fix the issue, as well as a patch with a couple of Rails test to check the behavior. Could you plese take a look on them and say what you think?

@leomao10

Any one fixed this issue? I still found this behaviour in 3.2.2.

It is quite confusion as I expect that ActiveRecord should return all Post if I go Post.where('status NOT IN (?)', [])

@agraves

Still seeing this in 3.2.8 on Postgres.

@agraves

Just to be 100% clear, I'm seeing this:

1.9.3p194 :007 > Foo.where('id NOT IN (?)', [])
  Foo Load (0.6ms)  SELECT "foos".* FROM "foos" WHERE (id NOT IN (NULL))

where I should be seeing:

1.9.3p194 :007 > Foo.where('id NOT IN (?)', [])
  Foo Load (0.6ms)  SELECT "foos".* FROM "foos" WHERE (id NOT IN (SELECT 1 WHERE false))
@steveklabnik steveklabnik reopened this Sep 16, 2012
@steveklabnik
Ruby on Rails member

Re-opening since you're seeing this on 3.2.x

@bai
bai commented Jan 7, 2013

I confirm that this issue is still relevant for 3.2.10.

@jalada

+1 also having this issue.

@guilleiguaran
Ruby on Rails member

This will be fixed with the next release of ARel 3.0.x

@agraves

Excellent! Thanks for all your hard work @guilleiguaran et al!

@guilleiguaran
Ruby on Rails member

kudos to @ernie that worked the last weekend backporting the fix to ARel 3.0.x!!!

rails/arel@d3a8fa9

@ernie

@guilleiguaran heh thanks for the kudos, but to be fair, it was only a 5 minute cherry-pick of my commit to master from a while back. :)

@ernie

I should also add that if you're using string SQL, you're still not going to see what you'd like. But as long as you're using ARel predicates (if you use :column => [] syntax, you are, via PredicateBuilder) then you should be good.

@steveklabnik
Ruby on Rails member

Since Arel 3.0.2 is now out, I'm giving this a close.

@aman199002

This issue still exists in rails 3.2.11. While using empty array it returns nil result.

Comment.where("id NOT IN (?)", [])

returns empty array.

It will work fine if array elements are joined as a string i.e:

@ids = []           #some array
Comment.where("id NOT IN (?)", @ids.join(','))
@agraves

I'm running Arel 3.0.2 and still seeing the exact same but I reported earlier.

irb(main):030:0> User.where('id NOT IN (?)', [])
  User Load (1.1ms)  SELECT "users".* FROM "users" WHERE (id NOT IN (NULL))
=> []

Am I missing something? Is there some circumstance under which this is correct behavior?

@guigs

I have the same problem using Rails 4.0.0.rc1 and arel 4.0.0 with Postgresql.

@guigs

I figured out the solution: User.where.not(id: []) works as expected in Rails 4.0.0.rc1 and is also more elegant!

But I still think that User.where('id NOT IN (?)', []) should be equivalent to User.where.not(id: [])

@agraves

@guigs Glad to hear you got something working on Rails 4! Too bad those of us on 3 though are still stuck having to write some nasty code to work around this :(

@zephyr-dev

+1 would like to see this fixed in Rails 3

@robin850
Ruby on Rails member

@zephyr-dev : This won't be fixed in Rails 3 since we are not supporting this version anymore (only 3.2.x can receive eventual security fixes).

@dmur

I think @guigs makes a good point that User.where('id NOT IN (?)', []) should be equivalent to User.where.not(id: []). Is the reasoning behind that simply that NOT IN is old syntax?

@masterkrang

not fixed in Rails 3, sorta a bummer

@masterkrang

A slightly unsafe way in Rails 3 to fix this is to just create a fake array with an id that will likely not exist. For example, in Heroku, record ids go in increments of 10, so you'd never have a record id of 2. So you could do something like this:

// check if the array count is 0
if arr.count < 1
//dump a fake value into the array
arr = [2]
end

//should never be null now
Tag.where('id not in (?)',arr)

(damn I forget how to show hash symbols in markdown)

@dmitry

@masterkrang use Markdown supported button on above of the editor (textarea) to find out how to format code.

@iDiogenes

I see this issue has been closed, but I am still seeing this behavior in Rails 4.1 with postgres.

User.where('report_ids NOT IN (?)', [])
=>   User Load (7.0ms)  SELECT "users".* FROM "users"  WHERE (report_ids NOT IN (NULL))

Since I am using postgres arrays the following worked for me for querying all none empty arrays.

User.where("report_ids != ?", "{}")
@Arugin

@iDiogenes +1, reproduced for me too in rails 4.2

@matthewd
Ruby on Rails member

@jerefrer please feel free to open a pull request if you can see a way of solving this. Scrolling through nearly 5 years of comments, I see complaints that the currently-generated SQL is wrong, but no suggestion of what it should look like instead (other than IN (), which is not well-formed SQL, and works only in SQLite).

Even absent a PR, I'd be glad to see a new issue, that specifically describes what we're doing wrong, and what "right" looks like.

I'm actually unclear as to why this behaviour is causing you any problem; AIUI, it should only be (unavoidably) problematic for NOT IN conditions.


FWIW, with Rails 5, I would spell that something like:

@messages = messages.where("subject LIKE ?", params[:search]).or(
              messages.where(sender_ids: user_ids))
@jerefrer

@matthewd As you may have seen I have removed my comments. I was getting an error because of a mistake I made: I did where('sender_id IN :user_ids', user_ids) instead of where('sender_id IN (:user_ids)', user_ids). Adding the () provided the expected resultats so my complain was not really relevant to this debate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment