RangeQuery without lower term and inclusive=false skips blank fields [LUCENE-38] #1116

Closed
asfimport opened this issue Jun 11, 2002 · 5 comments

Comments

@asfimport

This was reported by "James Ricci" <james@riccinursery.com> at:
http://nagoya.apache.org/eyebrowse/ReadMsg?listName=lucene-user@jakarta.apache.org&msgNo=1835

When you create a range query and omit the lower term, my expectation
would be that I would find everything less than the upper term. If I pass
false for the inclusive flag, I would expect to find all terms less than
the upper term, excluding the upper term itself.

What is happening in the case of lower_term=null, upper_term=x,
inclusive=false is that empty strings are being excluded because
inclusive is set to false, and the implementation of RangeQuery creates a
default lower term of Term(fieldName, ""). Since it's not inclusive, it excludes "".
This isn't what I intended, and I don't think it's what most people would
imagine RangeQuery would do in the case I've mentioned.

I equate lower=null, upper=x, inclusive=false to Field < x. lower=null,
upper=x, inclusive=true would be Field <= x. In both cases, the only
difference should be whether or not Field = x is true for the query.
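
The reported behavior can be modeled without Lucene. The following is a minimal plain-Java sketch, not Lucene code: the class and method names are hypothetical, the upper term is assumed to be non-null, and the substitution of "" for a missing lower term is taken from the description above.

```java
// Hypothetical stand-alone model of the report; not Lucene's actual code.
class RangeSemanticsSketch {

    // Behavior as reported: a null lower term becomes "", and the single
    // inclusive flag is applied to both ends of the range.
    static boolean matchesAsReported(String term, String lower, String upper, boolean inclusive) {
        String effectiveLower = (lower == null) ? "" : lower;
        int cmpLower = term.compareTo(effectiveLower);
        int cmpUpper = term.compareTo(upper);
        boolean aboveLower = inclusive ? cmpLower >= 0 : cmpLower > 0;
        boolean belowUpper = inclusive ? cmpUpper <= 0 : cmpUpper < 0;
        return aboveLower && belowUpper;
    }

    // Behavior as expected by the reporter: a null lower term means there is
    // no lower bound, so only the comparison against the upper term matters.
    static boolean matchesAsExpected(String term, String upper, boolean inclusive) {
        int cmpUpper = term.compareTo(upper);
        return inclusive ? cmpUpper <= 0 : cmpUpper < 0;
    }

    public static void main(String[] args) {
        // The empty string is dropped by the reported behavior...
        System.out.println(matchesAsReported("", null, "m", false)); // prints false
        // ...but is less than "m", so the reporter expects it to match.
        System.out.println(matchesAsExpected("", "m", false));       // prints true
    }
}
```

In this model the two methods agree on every term except "", which is the only difference the report complains about.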


Migrated from LUCENE-38 by Otis Gospodnetic, resolved Nov 13 2008
Environment:

Operating System: other
Platform: Other

Attachments: LUCENE-38.patch, TestRangeQuery.patch

@asfimport

Dejan Nenov (migrated from JIRA)

Added additional tests, using "null" as the lower term in the range query. The tests are commented to indicate how they should be modified once LUCENE-38 is fixed.

@asfimport

Mark Miller (@markrmiller) (migrated from JIRA)

Does this need to be 'fixed'? RangeQuery now uses the semantics from ConstantScoreRangeQuery, which decided that open-ended sides of a range must be inclusive (and are converted as such if not). Is that acceptable, so we can close this bug? Or do we jump through a hoop or two for this rather niche case?
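
For illustration, the "open-ended sides must be inclusive" rule might look roughly like the sketch below. This is a hypothetical stand-alone class, not the ConstantScoreRangeQuery source; the class name, field names, and constructor are assumptions made for the example.

```java
// Hypothetical sketch of "open-ended sides are always inclusive"; not Lucene's code.
final class OpenEndedRangeSketch {
    final String lower;          // null means no lower bound
    final String upper;          // null means no upper bound
    final boolean includeLower;
    final boolean includeUpper;

    OpenEndedRangeSketch(String lower, String upper, boolean includeLower, boolean includeUpper) {
        this.lower = lower;
        this.upper = upper;
        // An open end has no boundary term to exclude, so it is forced inclusive.
        this.includeLower = (lower == null) || includeLower;
        this.includeUpper = (upper == null) || includeUpper;
    }

    boolean matches(String term) {
        if (lower != null) {
            int c = term.compareTo(lower);
            if (c < 0 || (c == 0 && !includeLower)) return false;
        }
        if (upper != null) {
            int c = term.compareTo(upper);
            if (c > 0 || (c == 0 && !includeUpper)) return false;
        }
        return true;
    }
}
```

With this rule, `new OpenEndedRangeSketch(null, "m", false, false)` ends up with `includeLower == true`, so `matches("")` is true while `matches("m")` is still false.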

@asfimport

Otis Gospodnetic (@otisg) (migrated from JIRA)

This thing is 6+ years old and I don't recall this being mentioned on the list in the last half a decade. I'll leave you the Won't Fix pleasure, Mark.

@asfimport

Michael McCandless (@mikemccand) (migrated from JIRA)

Actually, this should have already worked, because RangeTermEnum forces includeLower to be true when lowerTermText is null.

But indeed the test still fails, so I dug into it a bit and I think the test is faulty. The test expects the empty string doc ("") to be returned as a result, but the problem is that the empty string doc, when analyzed, does not produce an empty string Token. So I modified the test (attached) to use an analyzer that emits an empty string token, and then the test passes as expected.
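
To illustrate why an analyzed empty string can leave nothing to match, here is a small plain-Java sketch. It does not use Lucene's Analyzer API, and which analyzer the original test used is not stated here; the whitespace tokenizer and all names below are assumptions chosen only to show the effect.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for analyzers; not Lucene's Analyzer API.
class EmptyTokenSketch {

    // Splits on whitespace and never emits an empty token.
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char ch : text.toCharArray()) {
            if (Character.isWhitespace(ch)) {
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(ch);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Emits the whole value as a single token, even when it is "".
    static List<String> singleToken(String text) {
        return Collections.singletonList(text);
    }
}
```

Here `whitespaceTokens("")` returns an empty list, so no "" term would ever be indexed and no range could match it, whereas `singleToken("")` yields one empty-string token that an open-ended lower bound can include.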

I'll commit shortly.

@asfimport

Michael McCandless (@mikemccand) (migrated from JIRA)

Committed revision 713696.

This issue was closed.