Fix incorrect index selection when sort specified #866
Mango has an apparently long-standing bug: if an index is deemed valid for sorting a query but is not valid according to mango_idx:is_usable, the query planner silently falls back to _all_docs, ignoring the sort order specified in the query.
This reverses the index selection logic so that the list of usable indexes is generated prior to filtering them based on the sort order specified in the query.
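As a rough illustration of the reordered selection, here is a minimal Python sketch (Mango itself is Erlang). The `Index` class, `is_usable`, `supports_sort`, and `select_indexes` are hypothetical stand-ins, and the usability and sort rules are deliberately simplified assumptions rather than the actual mango_idx logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Index:
    """Hypothetical stand-in for a Mango JSON index definition."""
    name: str
    fields: tuple


def is_usable(index, selector_fields):
    # Simplified assumption: an index is usable only if every indexed
    # field is constrained by the selector.
    return all(f in selector_fields for f in index.fields)


def supports_sort(index, sort_fields):
    # Simplified assumption: the sort fields must form a prefix of the
    # index's fields.
    return tuple(sort_fields) == index.fields[:len(sort_fields)]


def select_indexes(indexes, selector_fields, sort_fields):
    # Step 1: keep only indexes that can answer the selector.
    usable = [i for i in indexes if is_usable(i, selector_fields)]
    # Step 2: of those, keep only indexes that can satisfy the sort.
    if sort_fields:
        usable = [i for i in usable if supports_sort(i, sort_fields)]
        if not usable:
            # Fail loudly rather than silently ignoring the sort order.
            raise LookupError("no usable index for the requested sort")
    return usable
```

Filtering for usability first means an index that matches the sort but cannot answer the selector is never chosen, and a query whose sort cannot be honored produces an error instead of unsorted results.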
Similar logic is applied to use_index, allowing us to generate a more specific error message when a user specifies an index which isn't valid for the current
This fix exposes that the change introduced in #816 may cause existing queries with sort fields to fail (indeed, several queries in our test suite needed fixing), so we need to consider whether this is a severe enough breaking change to warrant a major version bump.
Run the test suite. Test that you can run Mango queries with sort criteria.
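For manual testing, a request body for the `_find` endpoint with sort criteria could be built along these lines. This is only a sketch: `build_find_request` is a hypothetical helper, and the field name `age` and the index name in the usage note are made up for illustration. The `selector`, `sort`, and `use_index` keys are the standard `_find` parameters.

```python
import json


def build_find_request(selector, sort=None, use_index=None):
    # Assemble a Mango query body; optional keys are left out when unset.
    body = {"selector": selector}
    if sort is not None:
        body["sort"] = sort
    if use_index is not None:
        body["use_index"] = use_index
    return json.dumps(body, sort_keys=True)
```

For example, `build_find_request({"age": {"$gt": 0}}, sort=[{"age": "asc"}], use_index="age-idx")` yields a body that can be POSTed to `/{db}/_find`; with this fix the results should come back sorted, or the request should fail with an error if no index can satisfy the sort.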
Related Pull Requests
This is a tricky spot. I think the situation where the user would complain about backwards compatibility is as follows:
In the scenario where we ignored the sort order and used _all_docs, I would assume the user would have already complained that their documents are not sorted.
Perhaps we should improve the error messages returned?
@tonysun83 If a
This would allow us to safely validate the selector/index combination against
@willholley : I considered appending
Index: [a, b, c]
Overall the change looks good. I can't figure out norm_deduplicate though. I assume that's to try and deduplicate the $exists check which a) I'm not sure is a requirement, and b) seems to be unused near as I can tell.
Also there's a bug in has_use_index I noted below.
Couple things to unpack in the discussion:
First, for #816 I think it's actually the opposite: it's not whether it needs a new version bump but how many versions we should be backporting this fix to. A database that gives bad answers to queries is more a bug that needs to be patched backwards, not something we cut the cord on and just bump version numbers.
There is of course the issue around notifying users that a pending change may break their application (technically a fix, but users have a different perspective :), so this is definitely something we should talk about. We'll also want to update the documentation, as we've long said that using column prefixes was a suggestion for sharing indexes, so that'll need to be fixed.
For @tonysun83's example of this changing a query from something that uses _all_docs to something that uses an existing index, I don't think that's an issue. Originally I would have ixnayed it because of the lack of range restriction on A and B, but we've already relaxed that by having an _all_docs default. Also, the behavior here is different from the issue introduced by #816 in that previously Tony's example would have returned an incorrect sort, where the new version changes that (i.e., the same set of docs would be returned in both places, just now it's sorted). I think that is quite a bit different from subtly different sets of documents being returned. Also, if a user specifies a sort and isn't detecting the wrong sort order from the _all_docs fallback, then I'm failing to see how they'd be relying on that sort, and thus fixing it seems like not an issue.
For #816 with the _all_docs fallback, is that not also more of a concern about performance rather than an app that stops working? I.e., all queries will continue to work and be "fixed", but some unknown number of queries will suddenly switch to using _all_docs and become much slower? Granted, suddenly slower performance that lasts until an index can be built isn't nothing either.
I've pushed a new fix which doesn't require rewriting the selector. Instead, we just include sort fields when considering whether a JSON index provides coverage for the selector. This seems a lot simpler and, unlike the previous implementation, doesn't impact the validity of other index types (i.e. text indexes).
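One way to picture the revised coverage check is the Python sketch below. It is an illustration under assumptions, not the Erlang code in this PR: `index_covers` is a hypothetical name, and it assumes that "coverage" means every index field is accounted for either by the selector or by the sort.

```python
def index_covers(index_fields, selector_fields, sort_fields=()):
    # Fields named in the sort are treated as accounted for, so only the
    # remaining index fields must be constrained by the selector.
    remaining = [f for f in index_fields if f not in set(sort_fields)]
    return all(f in selector_fields for f in remaining)
```

Under these assumptions, an index on `("a", "b")` does not cover a selector on `a` alone, but it does once the query also sorts on `b`. The selector itself is never rewritten.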