SOLR-13309: Add IntRangeField for Lucene's IntRange #4141
Open
gerlowskija wants to merge 13 commits into apache:main
This commit adds a new field type, IntRangeField, that can be used to
hold singular or multi-dimensional (up to 4) ranges of integers.
Field values are represented using brackets and the "TO" operator, with
commas used to delimit dimensions (when a particular field is defined as
having more than 1 dimension), e.g.
- [-1 TO 5]
- [1,2 TO 5,10]
- [1 TO 1]
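As a rough illustration of the value syntax above, the following sketch parses a string such as [1,2 TO 5,10] into per-dimension minimum and maximum arrays. The class name IntRangeValueSketch and its parse method are hypothetical and are not taken from this PR. Rejecting values whose minimum exceeds the maximum is likewise an assumption of the sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not the PR's code): parses "[1,2 TO 5,10]" style values. */
final class IntRangeValueSketch {
  // Opening bracket, the min values, the literal "TO", the max values, closing bracket.
  private static final Pattern SYNTAX =
      Pattern.compile("\\[\\s*([^\\]]+?)\\s+TO\\s+([^\\]]+?)\\s*\\]");

  final int[] mins;
  final int[] maxs;

  private IntRangeValueSketch(int[] mins, int[] maxs) {
    this.mins = mins;
    this.maxs = maxs;
  }

  static IntRangeValueSketch parse(String value) {
    Matcher m = SYNTAX.matcher(value.trim());
    if (!m.matches()) {
      throw new IllegalArgumentException("Expected [min TO max], got: " + value);
    }
    int[] mins = parseSide(m.group(1));
    int[] maxs = parseSide(m.group(2));
    if (mins.length != maxs.length) {
      throw new IllegalArgumentException("Min and max must have the same number of dimensions");
    }
    // The PR description caps ranges at 4 dimensions.
    if (mins.length > 4) {
      throw new IllegalArgumentException("At most 4 dimensions are supported");
    }
    for (int d = 0; d < mins.length; d++) {
      // Assumption of this sketch: an inverted range is treated as invalid input.
      if (mins[d] > maxs[d]) {
        throw new IllegalArgumentException("Min exceeds max in dimension " + d);
      }
    }
    return new IntRangeValueSketch(mins, maxs);
  }

  // Splits one side of the range on commas, one integer per dimension.
  private static int[] parseSide(String side) {
    String[] parts = side.split(",", -1);
    int[] out = new int[parts.length];
    for (int i = 0; i < parts.length; i++) {
      out[i] = Integer.parseInt(parts[i].trim());
    }
    return out;
  }
}
```

For example, IntRangeValueSketch.parse("[1,2 TO 5,10]") yields mins {1, 2} and maxs {5, 10}, while a value whose two sides have different dimension counts is rejected.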
IntRangeField does not support docValues or uninversion, meaning it's
primarily only used for querying. The field can be stored and returned
in search-results. Searches on these range-fields mostly rely on a
QParser, {!myRange}, which supports "intersects", "crosses", "within",
and "contains" semantics via a "criteria" local param. e.g.
- {!myRange field=price_range criteria=within}[1 TO 5]
  Matches docs whose 'price_range' field falls fully within [1 TO 5].
  A doc with [2 TO 3] would match; [3 TO 6] or [8 TO 10] would not.
- {!myRange field=price_range criteria=crosses}[1,10 TO 5,20]
  Matches docs whose 'price_range' field is partially but not fully
  contained within [1,10 TO 5,20]. A doc with [2,11 TO 6,21] would
  match, but [3,11 TO 5,19] would not.
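The four criteria can be read as simple per-dimension comparisons. The sketch below is one plausible interpretation, assuming inclusive bounds and treating "crosses" as "intersects but not within", which agrees with the examples above. The class and method names are hypothetical, and this is not the PR's or Lucene's implementation.

```java
/** Hypothetical illustration of the four criteria; bounds are assumed inclusive. */
final class RangeCriteriaSketch {
  enum Criteria { INTERSECTS, WITHIN, CONTAINS, CROSSES }

  /**
   * Tests a document range (docMin..docMax) against a query range (qMin..qMax).
   * All four arrays are assumed to have one entry per dimension.
   */
  static boolean matches(Criteria criteria, int[] docMin, int[] docMax, int[] qMin, int[] qMax) {
    boolean intersects = true;
    boolean within = true;
    boolean contains = true;
    for (int d = 0; d < docMin.length; d++) {
      // The ranges overlap in this dimension.
      intersects &= docMin[d] <= qMax[d] && docMax[d] >= qMin[d];
      // The doc range lies entirely inside the query range in this dimension.
      within &= docMin[d] >= qMin[d] && docMax[d] <= qMax[d];
      // The doc range entirely covers the query range in this dimension.
      contains &= docMin[d] <= qMin[d] && docMax[d] >= qMax[d];
    }
    switch (criteria) {
      case INTERSECTS:
        return intersects;
      case WITHIN:
        return within;
      case CONTAINS:
        return contains;
      case CROSSES:
        // Assumption: "partially but not fully contained" means overlapping, yet not within.
        return intersects && !within;
      default:
        throw new AssertionError(criteria);
    }
  }
}
```

Under these assumptions, a doc range [2,11 TO 6,21] crosses the query [1,10 TO 5,20] because it overlaps in both dimensions but extends past the query's upper bounds, whereas [3,11 TO 5,19] lies fully within the query and so does not cross it.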
TODO
- renaming of QParser, 'myRange' stinks
- general cleanup
- switch around 'external', 'internal', 'native' representations.
Still a lot of cleanup to be done here, but I thought this was ready to publish as a "draft" so folks can provide feedback on the general approach.
So, I wanted to highlight some design choices for potential reviewers - these are decisions I made in putting this together that 100% need a second set of eyes:
Still TODO
https://issues.apache.org/jira/browse/SOLR-13309
Description
Lucene offers a variety of 'Range' field types, where the value stored in the field is itself a range (e.g. [1 TO 5]). Lucene then allows efficient search on these using its RangeFieldQuery. Solr offers no similar functionality, despite having access to these underlying Lucene capabilities. We should start exposing these, beginning with what's probably the most popular option, ints.
Solution
This commit adds a new field type, IntRangeField, that can be used to hold singular or multi-dimensional (up to 4) ranges of integers.
Field values are represented using brackets and the "TO" operator, with commas used to delimit dimensions (when a particular field is defined as having more than 1 dimension), e.g.
- [-1 TO 5]
- [1,2 TO 5,10]
- [1 TO 1]
IntRangeField does not support docValues or uninversion, meaning it's primarily only used for querying. The field can be stored and returned in search-results. Searches on these range-fields rely on a new QParserPlugin implementation, {!numericRange}, which supports "intersects", "crosses", "within", and "contains" semantics via a "criteria" local param. e.g.
- {!numericRange field=price_range criteria=within}[1 TO 5]
  Matches docs whose 'price_range' field falls fully within [1 TO 5]. A doc with [2 TO 3] would match; [3 TO 6] or [8 TO 10] would not.
- {!numericRange field=price_range criteria=crosses}[1,10 TO 5,20]
  Matches docs whose 'price_range' field is partially but not fully contained within [1,10 TO 5,20]. A doc with [2,11 TO 6,21] would match, but [3,11 TO 5,19] would not.
Tests
New test classes: IntRangeFieldTest and IntRangeQParserPluginTest.