Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add additional shards routing info in ShardSearchRequest #29533

Merged
merged 11 commits into from Apr 26, 2018

Conversation

Projects
None yet
4 participants
@jimczi
Copy link
Member

commented Apr 16, 2018

This commit propagates the preference and routing of the original SearchRequest in the ShardSearchRequest.
This information is then use to fix a bug in sliced scrolls when executed with a preference (or a routing).
Instead of computing the slice query from the total number of shards in the index, this commit computes this number from the number of shards per index that participates in the request.

Fixes #27550

Add additional shards routing info in ShardSearchRequest
This commit adds two new methods to ShardSearchRequest:
 * #numberOfShardsIndex() that returns the number of shards of this index
   that participates in the request.
 * #remapShardId() that returns the remapped shard id of this shard for this request.
   The remapped shard id is the id of the requested shard among all shards
   of this index that are part of the request. Note that the remapped shard id
   is equal to the original shard id if all shards of this index are part of the request.

These informations are useful when the _search is executed with a preference (or a routing) that
restricts the number of shards requested for an index.
This change allows to fix a bug in sliced scrolls executed with a preference (or a routing).
Instead of computing the slice query from the total number of shards in the index, this change allows to
compute this number from the number of shards per index that participates in the request.

Fixes #27550
@elasticmachine

This comment has been minimized.

Copy link
Collaborator

commented Apr 16, 2018

Pinging @elastic/es-search-aggs

@s1monw
Copy link
Contributor

left a comment

left 2 comments / ideas

* of this index that are part of the request. Note that the remapped shard id
* is equal to the original shard id if all shards of this index are part of the request.
*/
int remapShardId();

This comment has been minimized.

Copy link
@s1monw

s1monw Apr 19, 2018

Contributor

I wonder if we should call it shardRequestOrdinal shard ID is trappy at least for me since I alwasy think of ShardID.java when I read it.

This comment has been minimized.

Copy link
@jimczi

jimczi Apr 19, 2018

Author Member

+1. I pushed b90716c

@@ -318,8 +319,11 @@ public final ShardSearchTransportRequest buildShardSearchRequest(SearchShardIter
AliasFilter filter = aliasFilter.get(shardIt.shardId().getIndex().getUUID());
assert filter != null;
float indexBoost = concreteIndexBoosts.getOrDefault(shardIt.shardId().getIndex().getUUID(), DEFAULT_INDEX_BOOST);
return new ShardSearchTransportRequest(shardIt.getOriginalIndices(), request, shardIt.shardId(), getNumShards(),
filter, indexBoost, timeProvider.getAbsoluteStartMillis(), clusterAlias);
int[] indexShards = getIndexShards(shardIt.shardId().getIndex());

This comment has been minimized.

Copy link
@s1monw

s1monw Apr 19, 2018

Contributor

this is quite a complex operation since we call if for every shard and then do consume the entire iterator again. I wonder if we can pre-sort the ShardRoutings in SearchShardIterator and then calculate this on the fly and simply call SearchShardIterator#getIndexShardOrdinal() to get it?

This comment has been minimized.

Copy link
@jimczi

jimczi Apr 19, 2018

Author Member

I changed the logic to compute the needed informations only once in InitialSearchPhase constructor. I need the complete GroupShardsIterator to do so which is why it's not in SearchShardIterator but it's the same idea:
b90716c

jimczi added some commits Apr 19, 2018

Address reviews
Rename remapShardId to shardRequestOrdinal.
Compute request ordinal per shard and number of request shard per index only once.
Address reviews
Propagate preference and routing in the ShardSearchRequest in order to compute
the shard ordinal and number of requested shard lazily on each node.
The computation is done in the slice builder only when preference or routing are set
on the original search request.
@jimczi

This comment has been minimized.

Copy link
Member Author

commented Apr 20, 2018

We discussed with Simon and decided to propagate the routing and preference in the ShardSearchRequest (rather than computing the request shard ordinal and number of shards in the coordinating node). The slice builder can use this information to compute the needed information lazily. @s1monw can you take another look ?

@@ -250,7 +268,12 @@ protected void innerWriteTo(StreamOutput out, boolean asKey) throws IOException
if (out.getVersion().onOrAfter(Version.V_6_3_0)) {
out.writeOptionalBoolean(allowPartialSearchResults);
}

if (!asKey) {

This comment has been minimized.

Copy link
@s1monw

s1monw Apr 20, 2018

Contributor

can we use == false?

* slices use the same numbers.
*/
GroupShardsIterator<ShardIterator> group = buildShardIterator(clusterService, request);
assert group.size() <= numShards;

This comment has been minimized.

Copy link
@s1monw

s1monw Apr 20, 2018

Contributor

give this assert a message it's a bummer when it fails and we don't have one

numShards = group.size();
int ord = 0;
shardId = -1;
for (ShardIterator it : group) {

This comment has been minimized.

Copy link
@s1monw

s1monw Apr 20, 2018

Contributor

can you leave an inline comment what we are doing here?

final Settings settings = clusterService.getSettings();
final String[] indices;
if (request instanceof IndicesRequest) {
indices = ((IndicesRequest) request).indices();

This comment has been minimized.

Copy link
@s1monw

s1monw Apr 20, 2018

Contributor

should this always be the index name?

This comment has been minimized.

Copy link
@s1monw

s1monw Apr 20, 2018

Contributor

oh now I see. Damned I didn't think of Aliases. If you have a routing alias we have a race condition here. I think we can't do it this way. The alias might change on the way to the shard which will cause wrong results. I think that renders my idea here as invalid?

This comment has been minimized.

Copy link
@jimczi

jimczi Apr 20, 2018

Author Member

Maybe we can change the ShardSearchRequest and instead of sending the global routing for the request we send the list of extracted routings for the requested index. ShardSearchRequest#routings and when there is no alias routing it returns the global routing in a singleton ?

This comment has been minimized.

Copy link
@s1monw

s1monw Apr 21, 2018

Contributor

yeah I think the routing needs to be resolved on the coordinator

@s1monw

s1monw approved these changes Apr 24, 2018

Copy link
Contributor

left a comment

this looks great! thanks for the extra iterations!!!

@jimczi jimczi added v6.4.0 and removed v6.3.0 labels Apr 26, 2018

@jimczi jimczi merged commit 8b8c0c0 into elastic:master Apr 26, 2018

3 checks passed

CLA Commit author is a member of Elasticsearch
Details
elasticsearch-ci Build finished.
Details
elasticsearch-ci/packaging-sample Build finished.
Details

@jimczi jimczi deleted the jimczi:bug/slice_with_routing branch Apr 26, 2018

jimczi added a commit that referenced this pull request Apr 26, 2018

Add additional shards routing info in ShardSearchRequest (#29533)
This commit propagates the preference and routing of the original SearchRequest in the ShardSearchRequest.
This information is then use to fix a bug in sliced scrolls when executed with a preference (or a routing).
Instead of computing the slice query from the total number of shards in the index, this commit computes this number from the number of shards per index that participates in the request.

Fixes #27550

jimczi added a commit that referenced this pull request Apr 26, 2018

dnhatn added a commit that referenced this pull request Apr 27, 2018

Merge branch '6.x' into ccr-6.x
* 6.x:
  Watcher: Ensure mail message ids are unique per watch action (#30112)
  SQL: Correct error message (#30138)
  SQL: Add BinaryMathProcessor to named writeables list (#30127)
  Tests: Use buildDir as base for generated-resources (#30191)
  Fix SliceBuilderTests#testRandom failures
  Fix edge cases in CompositeKeyExtractorTests (#30175)
  Document time unit limitations for date histograms (#30177)
  Remove licenses missed by the migration
  [DOCS] Updates docker installation package details (#30110)
  Fix TermsSetQueryBuilder.doEquals() method (#29629)
  [Monitoring] Remove unhelpful Monitoring tests (#30144)
  [Test] Fix RenameProcessorTests.testRenameExistingFieldNullValue() (#29655)
  [test] include oss tar in packaging tests (#30155)
  TEST: Update settings should go through cluster state (#29682)
  Add additional shards routing info in ShardSearchRequest (#29533)
  Reinstate missing documentation (#28781)
  Clarify documentation of scroll_id (#29424)
  Fixes Eclipse build for sql jdbc project (#30114)
  Watcher: Fold two smoke test projects into smoke-test-watcher (#30137)

@jimczi jimczi referenced this pull request Oct 2, 2018

Closed

Connector uncorrectly determines the number of partitions #1196

1 of 1 task complete

@colings86 colings86 added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.