-
Notifications
You must be signed in to change notification settings - Fork 608
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SOLR-15595: Partial results from shard queries needlessly discarded for queries without sort fields #267
Conversation
sortFieldValues = new NamedList<>(); | ||
} | ||
// skip merging results for this shard if the sortSpec includes a non-scoredoc field but the sortFieldValues is empty. | ||
if (thisResponseIsPartial && sortFieldValues.size() == 0 && ss.includesNonScoreOrDocField()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's interesting to me that we do a similar check later on line 1122, not sure if we can combine or remove some of this logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The unmarshalSortValues method just returns an empty list if the sort if by score and/or docid and no other field. I could revise the mergeIds method to not call it for this case. The only thing that gives me pause is that unmarshalSortValues is protected, so presumably its being overridden someplace? It doesn't appear to be anywhere in the code base.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm adding skipping calling unmarshalSortValues if it is a no-op.
…. Refactor QueryComponent changes.
import org.junit.Test; | ||
import org.mockito.Mockito; | ||
|
||
import java.util.*; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: no wildcard imports please
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed import wildcard.
solrDocument.addField("score", id * 1.1F); | ||
solrDocument.addField(SORT_FIELD_NAME, id * 10); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do these values actually matter? What if we made them constant. Or equal to id?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Used just the id since the values don't matter,
assertEquals((shouldIncludePartialShardResult ? shard1Size : 0) + shard2Size, responseBuilder.getResponseDocs().size()); | ||
} | ||
|
||
private static class MockResponseBuilder extends ResponseBuilder { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I almost feel like all the Mock* classes shouldn't be in here, but could have their own classes. It looks like the convention we use is to do inner Mock classes when they are nearly trivial and give them their own files otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extracted the Mock classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The mock implementations are very bespoke to the new tests, but this way others can extend them.
// shard 1 is marked partial | ||
NamedList<Object> responseHeader1 = new NamedList<>(); | ||
responseHeader1.add(SolrQueryResponse.RESPONSE_HEADER_PARTIAL_RESULTS_KEY, Boolean.TRUE); | ||
|
||
// shard 2 is not partial | ||
NamedList<Object> responseHeader2 = new NamedList<>(); | ||
|
||
MockShardRequest shardRequest = MockShardRequest.create() | ||
.withShardResponse(responseHeader1, createSolrDocumentList(shard1Size)) | ||
.withShardResponse(responseHeader2, createSolrDocumentList(shard2Size)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be pulled out into a field that only gets created once?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
boolean needsUnmarshalling = ss.includesNonScoreOrDocField(); | ||
// skip merging results for this shard if the sortSpec sort values need to be marshalled but the sortFieldValues is empty. | ||
if (thisResponseIsPartial && sortFieldValues.size() == 0 && needsUnmarshalling) { | ||
continue; | ||
} | ||
NamedList<List<Object>> unmarshalledSortFieldValues = needsUnmarshalling ? unmarshalSortValues(ss, sortFieldValues, schema) : new NamedList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this needs more comments, or I'm being dense here. I have to mentally trace the code every time I come back to it and convince myself that it's right (which I'm sure it is because I trust you've done the testing, not because I actually understand it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some comments. Hope they help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They do! I am super happy with these, and they help me understand it much better than I thought I got it before.
…st and use a singleton for test ShardRequest.
@@ -0,0 +1,36 @@ | |||
package org.apache.solr.handler.component; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
license header
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doh!
@@ -0,0 +1,54 @@ | |||
package org.apache.solr.handler.component; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
license header
boolean needsUnmarshalling = ss.includesNonScoreOrDocField(); | ||
// skip merging results for this shard if the sortSpec sort values need to be marshalled but the sortFieldValues is empty. | ||
if (thisResponseIsPartial && sortFieldValues.size() == 0 && needsUnmarshalling) { | ||
continue; | ||
} | ||
NamedList<List<Object>> unmarshalledSortFieldValues = needsUnmarshalling ? unmarshalSortValues(ss, sortFieldValues, schema) : new NamedList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They do! I am super happy with these, and they help me understand it much better than I thought I got it before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM pending the two missing headers, and precommit/tests coming back happy
I just triggered the precommit check ;-) |
LGTM, yes! |
…or queries without sort fields (apache#267) * No longer skip merging partial shard results for non-sort queries * Add unit test for QueryComponent. Add javadoc for new SortSpec method. Refactor QueryComponent changes. * Improve javadocs. Fix parameter name in unit test class method. * Add comments to QueryComponent. Extract inner test class for added test and use a singleton for test ShardRequest.
https://issues.apache.org/jira/browse/SOLR-15595
Description
Partial shard results due to timeouts are not merged into the final query results. This is unnecessary for queries sorted only by score and/or docid.
Solution
No longer exit the shard merge early when partial results are returned for queries sorted just by score or docid.
Tests
Additional tests added for SortSpecParser for detecting that the sort only includes the score and/or docid.
Added unit test class for QueryComponent.
Checklist
Please review the following and check all that apply:
main
branch../gradlew check
.