Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Device.find(camera_id: nil) does not find devices #216

Open
tetherit opened this issue Nov 1, 2016 · 5 comments
Open

Device.find(camera_id: nil) does not find devices #216

tetherit opened this issue Nov 1, 2016 · 5 comments

Comments

@tetherit
Copy link

tetherit commented Nov 1, 2016

time-agent(prod)> Device.all.each {|d| p d.camera_id}
nil
nil
nil
"56f290913d1f9207691bab65"
nil
time-agent(prod)> Device.find(camera_id: '56f290913d1f9207691bab65').count
=> 1
time-agent(prod)> Device.find(camera_id: nil).count
=> 0

I'm expecting it to find the 4 devices where camera_id is nil, but it's not finding anything. Any ideas?

@soveran
Copy link
Owner

soveran commented Nov 13, 2016

Hey @xanview, sorry for not replying earlier. I was on a trip and didn't have time for a proper answer. I think at some point, in earlier versions, it was possible to index nil values. But then we decided against it because if you have a lot of nils (which is a common scenario), then the indices created are huge. What you can do is to force a value so that you can find it later (for example, you could use the string "nil").

@tetherit
Copy link
Author

Hi Soveran,

Thank you for the reply. Sure that makes sense not to index nil by default, but maybe it could be an optional option, something like:

index :device_id, index_nil: true

This way you have the best of both worlds? - without resorting to hacks such as a "nil" string value.

@soveran
Copy link
Owner

soveran commented Nov 24, 2016

I explored that solution, but in the end I didn't find it satisfying. First, a bit more about my previous proposal: instead of using the string "nil" you can use some relevant value, for example "available", or "unassigned", or anything that makes sense to the problem domain. Using "nil" indeed looks like a hack, but I think using something relevant is the best solution to this problem.

Regarding the index_nil feature, the advantage of such implementation would be to cover this use case, but sadly there are some disadvantages: it will add a check for nil and a special case to all calls, and behind the scenes it will generate a string value for the index (so there's no optimization in the final solution if we compare it with the proposal of using a meaningful value). As an aside, on a subjective level, while playing with this implementation it felt wrong to index nil values. I think it's something we shouldn't do in principle. I checked how other databases deal with this situation. Postgres recommends creating a partial index for columns with NULL. MySQL creates an index for IS [NOT] NULL only for certain storage backends and certain keys (for example, it never indexes NULL for primary or unique keys). Oracle provides a NVL (NULL value) function that can be used for creating indices with NULL values, but all the function does is translate NULL into the string "NULL", and I got the impression by reading the docs that there's no gain in using that function vs storing the "NULL" string directly. I found this explanation for the mixed support regarding that kind of indices: "By default, relational databases ignore NULL values because the relational model says that NULL means not present". That's why the default behavior when searching for IS [NOT] NULL in many databases is a full table scan.

After reflecting on this issue, I think the best approach would be to use a meaningful value for that field and rely on the existing indexing infrastructure.

@tetherit
Copy link
Author

tetherit commented Nov 27, 2016

I can see what you're saying, but in terms of other databases, I'm used to doing this with Mongo (Mongoid):

User.where(:auth_token.exists => false)

You're right, if no index is configured, it will have to scan the entire database, but at least it gives the expected behaviour of running the requested query.

Maybe it should throw an exception when searching for nil at least? - it can even link to this issue in the exception so the person has a workaround :)

@soveran
Copy link
Owner

soveran commented Nov 28, 2016

Two great ideas right there: first, raise an error if nil is used in a filtering operation, and second, the idea of providing a lot of help in the error message. I'll work on it, thanks a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant