$in operator is inefficient #3251
-
We are on version 2.2.0 (we are using blockchain on Kubernetes Docker images). Our use case is running an $in query for a particular set of documents after caching which documents a user can see, so it is critical that the query stays efficient in the long run. Our dataset isn't particularly large (around a few thousand documents in a year), but it will still affect load times on our front end. Given this background and dataset size, what do you recommend we do?
Desired Behaviour: Optimize data fetches relating to $in queries
Replies: 3 comments
-
In our development environment:
-
Thank you @kocolosk, could not have asked for a better response. We are going to look into creating a view. We are also looking into these concepts for joins in the future. Luckily we have more than enough time before our app starts crashing.
-
Hiya, the CouchDB query planner has room for improvement here. Take a look at the `indexable_fields` function: https://github.com/apache/couchdb/blob/3.1.1/src/mango/src/mango_idx_view.erl#L252-L312

Currently it does not try to satisfy `$in` or `$or` operators using an index, so it's returning everything with `docType: data` and then filtering at query time. One could certainly imagine improving the planner so it performs multiple point lookups in the index, one for each element of your `$in` array, but that's not present today. Some possible workarounds:

- … `dataId` and then combine the results in your app
- … `[ "docType", "dataId" ]`
…
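For anyone reading later, here is a rough Python sketch of the "multiple point lookups, then combine in your app" idea, done on the client side since the planner does not do it. It only builds the request bodies and merges results. The helper names `build_point_selectors` and `merge_results` are made up for illustration, and it assumes each body is POSTed to the database's `_find` endpoint with an index covering `docType` and `dataId` already in place.

```python
from typing import Any, Dict, Iterable, List


def build_point_selectors(doc_type: str, data_ids: Iterable[str]) -> List[Dict[str, Any]]:
    """Build one equality-only Mango request body per dataId.

    Hypothetical helper: each body replaces one element of the $in array
    with a plain equality match on both fields.
    """
    seen = set()
    bodies = []
    for data_id in data_ids:
        if data_id in seen:  # skip repeated ids so we do not query twice
            continue
        seen.add(data_id)
        bodies.append({"selector": {"docType": doc_type, "dataId": data_id}})
    return bodies


def merge_results(batches: Iterable[Iterable[Dict[str, Any]]]) -> List[Dict[str, Any]]:
    """Combine per-query result lists, keeping the first copy of each _id."""
    merged: Dict[str, Dict[str, Any]] = {}
    for docs in batches:
        for doc in docs:
            merged.setdefault(doc["_id"], doc)
    return list(merged.values())


if __name__ == "__main__":
    # Example ids only; a real app would take these from its cached visibility list.
    for body in build_point_selectors("data", ["a", "b"]):
        print(body)
```

This trades one large filtered scan for several small requests, so whether it helps probably depends on how many ids are in the list. Running the requests concurrently is left out to keep the sketch short.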