LUCENE-9634: Fix highlighting of extended intervals matched using offset #16
@@ -250,6 +250,10 @@ public MatchesIterator getSubMatches() throws IOException {

@Override
public Query getQuery() {
return queue.top().getQuery();
if (queue.size() > 0) {
This change is needed to fix two other tests that would otherwise throw an NPE from here when matchesIterator.getQuery() is called in OffsetsFromMatchIterator.
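For readers following the diff, the failure mode and the guard can be sketched in isolation. This is a minimal illustration, not the patch itself: it uses java.util.PriorityQueue (whose peek() returns null when the queue is empty) as a stand-in for the queue in the diff, and the names EmptyQueueGuard and Sub are hypothetical. Returning null from the guarded method is an assumption; the fallback branch is not visible in the diff excerpt above.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

class EmptyQueueGuard {
    // Hypothetical stand-in for a sub-iterator held in the queue.
    static final class Sub {
        final int startPosition;
        final String query;

        Sub(int startPosition, String query) {
            this.startPosition = startPosition;
            this.query = query;
        }
    }

    private final PriorityQueue<Sub> queue =
        new PriorityQueue<>(Comparator.comparingInt((Sub s) -> s.startPosition));

    void add(Sub sub) {
        queue.add(sub);
    }

    // Unguarded: peek() returns null on an empty queue, so the field access throws an NPE.
    String getQueryUnguarded() {
        return queue.peek().query;
    }

    // Guarded: only dereference the head when the queue is non-empty.
    // The null fallback is an assumption made for this sketch.
    String getQueryGuarded() {
        if (queue.size() > 0) {
            return queue.peek().query;
        }
        return null;
    }

    public static void main(String[] args) {
        EmptyQueueGuard guard = new EmptyQueueGuard();
        System.out.println(guard.getQueryGuarded()); // empty queue, no exception
        guard.add(new Sub(3, "body:fox"));
        System.out.println(guard.getQueryGuarded());
    }
}
```

Callers of the guarded variant then need to handle a null result, which is a trade-off of this particular fallback.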
I was away for the weekend -- sorry, Zach. I'm not sure if I'm a fan of reanalyzing the token stream... This would always be my last-resort option, to be honest.
Maybe we could somehow augment the matches API so that those "correct" before/after offsets are returned or somehow provided instead? @romseygeek would be an authoritative source to ask here, I'm just a humble user of this awesome function - Alan knows the internals.
No problem Dawid, and thanks for the feedback! I agree that re-analyzing the token stream does seem wasteful if the index already has position and offset information. I think the central issue here is that the current lucene/lucene/queries/src/java/org/apache/lucene/queries/intervals/Intervals.java Lines 251 to 252 in 471f38c
However, I'm also wondering whether the position to / from offset mappings were already computed / stored for other purposes and are readily available in the OffsetRetrievalStrategy classes. If so, we could skip the re-analysis as well. Alan, could you please let me know if you have any pointers here? I'm happy to try out different approaches to find the best solution.
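To make the question concrete, here is a minimal sketch of how a match could be widened if a position-to-offset table were available. Everything in it is hypothetical: the class ExtendedOffsets, the method extend, and the two offset arrays are illustrative names, not part of Lucene. It also assumes exactly one token per position with no position gaps, which real analysis chains do not guarantee.

```java
class ExtendedOffsets {
    /**
     * Widens a match by `before` positions to the left and `after` positions to the
     * right, then maps the widened position range to character offsets.
     *
     * startOffsets[p] and endOffsets[p] are the character offsets of the token at
     * position p (a hypothetical precomputed table, one token per position).
     *
     * Returns {startOffset, endOffset} of the widened range, clamped to the field.
     */
    static int[] extend(int[] startOffsets, int[] endOffsets,
                        int matchStartPos, int matchEndPos,
                        int before, int after) {
        int lastPos = startOffsets.length - 1;
        int from = Math.max(0, matchStartPos - before);   // clamp at first position
        int to = Math.min(lastPos, matchEndPos + after);  // clamp at last position
        return new int[] { startOffsets[from], endOffsets[to] };
    }

    public static void main(String[] args) {
        // Tokens of "the quick brown fox jumps", one per position.
        int[] starts = { 0, 4, 10, 16, 20 };
        int[] ends = { 3, 9, 15, 19, 25 };
        // Match on "brown" (position 2), extended by one position on each side.
        int[] range = extend(starts, ends, 2, 2, 1, 1);
        System.out.println(range[0] + "-" + range[1]); // covers "quick brown fox"
    }
}
```

With such a table the highlighter would not need to re-analyze the text, but whether the existing offset strategies can supply it is exactly the open question in this thread.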
Hi @romseygeek, I just want to circle back to this PR to see if you could provide any guidance here.
This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the dev@lucene.apache.org list. Thank you for your contribution!
This PR is currently in draft state
Description
Fix highlighting of extended intervals matched using offset
Proposed Solution
In OffsetsFromMatchIterator, retrieve the ExtendedIntervalsSource and its before/after position values, and use them to adjust the highlight offset range matched by offset.
Tests