New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix maximum key value when using JSON indexes #881

Merged
merged 3 commits into from Oct 16, 2017

Conversation

Projects
None yet
4 participants
@willholley
Member

willholley commented Oct 9, 2017

Overview

Mango previously constrained range queries against JSON indexes
(map/reduce views) to startkey=[]&endkey=[{}]. In Mango, JSON
index keys are always compound (i.e. always arrays), but this
restriction resulted in Mango failing to match documents where
the indexed value was an object.

For example, an index with keys:

[1],
[2],
[{"foo": 3}]

would be restricted such that only [1] and [2] were returned
if a range query was issued.

On its own, this behaviour isn't necessarily unintuitive, but
it is different from the behaviour of a non-indexed Mango
query, so the query results would change in the presence of an
index.

Additonally, it prevented operators or selectors which explicitly
depend on a full index scan (such as $exists) from returning a
complete result set.

This commit changes the maximum range boundary from [{}] to {},
so all array/compound keys will be included.

Testing recommendations

Create some documents with a field that contains a mix of value types - integers, strings, objects, nulls.

Without adding an index, run some Mango queries using range operators ($lt, $lte, $gt, $gte) and those that depend on full index scans - $exists, $regex etc.

If you then add a JSON index on the field, ensure you get the same results.

GitHub issue number

Related Pull Requests

Checklist

  • Code is written and works correctly;
  • Changes are covered by tests;
  • Documentation reflects the changes;

@willholley willholley requested a review from davisp Oct 9, 2017

@davisp

davisp approved these changes Oct 9, 2017

+1

Yeap, looks like a silly bug on my part.

@janl

This comment has been minimized.

Show comment
Hide comment
@janl

janl Oct 9, 2017

Member

@willholley this seems to have broken other mango tests (see Travis for details, all builds failed).

Member

janl commented Oct 9, 2017

@willholley this seems to have broken other mango tests (see Travis for details, all builds failed).

@willholley

This comment has been minimized.

Show comment
Hide comment
@willholley

willholley Oct 10, 2017

Member

oops - this turned out to be a bit trickier than I first thought because end_key returns a list. I've changed the PR to just add a special case when we want no upper bound.

Member

willholley commented Oct 10, 2017

oops - this turned out to be a bit trickier than I first thought because end_key returns a list. I've changed the PR to just add a special case when we want no upper bound.

@willholley

This comment has been minimized.

Show comment
Hide comment
@willholley

willholley Oct 11, 2017

Member

@davisp I'd appreciate a second review of this. My niggly concern is that it dosn't distinguish between a user specified upper bound of [{}] and json_max_value, but this seems like very edgey case. In that situation, all that would happen is that the index would not be as restrictive as it could be - the correct result should still be returned, with some records potentially filtered out in memory.

Member

willholley commented Oct 11, 2017

@davisp I'd appreciate a second review of this. My niggly concern is that it dosn't distinguish between a user specified upper bound of [{}] and json_max_value, but this seems like very edgey case. In that situation, all that would happen is that the index would not be as restrictive as it could be - the correct result should still be returned, with some records potentially filtered out in memory.

willholley added some commits Oct 9, 2017

Fix maximum key value when using JSON indexes
Mango previously constrained range queries against JSON indexes
(map/reduce views) to startkey=[]&endkey=[{}]. In Mango, JSON
index keys are always compound (i.e. always arrays), but this
restriction resulted in Mango failing to match documents where
the indexed value was an object.

For example, an index with keys:

[1],
[2],
[{"foo": 3}]

would be restricted such that only [1] and [2] were returned
if a range query was issued.

On its own, this behaviour isn't necessarily unintuitive, but
it is different from the behaviour of a non-indexed Mango
query, so the query results would change in the presence of an
index.

Additonally, it prevented operators or selectors which explicitly
depend on a full index scan (such as $exists) from returning a
complete result set.

This commit changes the maximum range boundary from {} to an object
that collates higher than anything allows in JSON,
so all array/compound keys will be included.

Note that this uses an invalid UTF-8 character, so we depend
on the view engine not barfing when this is passed as a
parameter. In addition, we can't represent the value in JSON
so we need to subtitute is when returning a query plan
in the _explain endpoint.
@garrensmith

This looks great. One small change to consider otherwise I'm +1

Show outdated Hide outdated src/mango/src/mango_cursor_view.erl

@willholley willholley merged commit 641aa56 into apache:master Oct 16, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@willholley willholley deleted the willholley:mango_fix_upper_key_boundary branch Oct 16, 2017

wohali added a commit that referenced this pull request Oct 19, 2017

Fix maximum key value when using JSON indexes (#881)
Mango previously constrained range queries against JSON indexes
(map/reduce views) to startkey=[]&endkey=[{}]. In Mango, JSON
index keys are always compound (i.e. always arrays), but this
restriction resulted in Mango failing to match documents where
the indexed value was an object.

For example, an index with keys:

[1],
[2],
[{"foo": 3}]

would be restricted such that only [1] and [2] were returned
if a range query was issued.

On its own, this behaviour isn't necessarily unintuitive, but
it is different from the behaviour of a non-indexed Mango
query, so the query results would change in the presence of an
index.

Additonally, it prevented operators or selectors which explicitly
depend on a full index scan (such as $exists) from returning a
complete result set.

This commit changes the maximum range boundary from {} to a 
value that collates higher than any JSON object, so all 
array/compound keys will be included.

Note that this uses an invalid UTF-8 character, so we depend
on the view engine not barfing when this is passed as a
parameter. In addition, we can't represent the value in JSON
so we need to subtitute is when returning a query plan
in the _explain endpoint.

iilyak pushed a commit to cloudant/couchdb that referenced this pull request Nov 6, 2017

willholley added a commit to willholley/couchdb that referenced this pull request May 22, 2018

Fix maximum key value when using JSON indexes (#881)
Mango previously constrained range queries against JSON indexes
(map/reduce views) to startkey=[]&endkey=[{}]. In Mango, JSON
index keys are always compound (i.e. always arrays), but this
restriction resulted in Mango failing to match documents where
the indexed value was an object.

For example, an index with keys:

[1],
[2],
[{"foo": 3}]

would be restricted such that only [1] and [2] were returned
if a range query was issued.

On its own, this behaviour isn't necessarily unintuitive, but
it is different from the behaviour of a non-indexed Mango
query, so the query results would change in the presence of an
index.

Additonally, it prevented operators or selectors which explicitly
depend on a full index scan (such as $exists) from returning a
complete result set.

This commit changes the maximum range boundary from {} to a 
value that collates higher than any JSON object, so all 
array/compound keys will be included.

Note that this uses an invalid UTF-8 character, so we depend
on the view engine not barfing when this is passed as a
parameter. In addition, we can't represent the value in JSON
so we need to subtitute is when returning a query plan
in the _explain endpoint.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment