Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce ability to minimize round-trips in CCS #37828

Merged
merged 33 commits into from Jan 31, 2019

Conversation

Projects
None yet
7 participants
@javanna
Copy link
Member

commented Jan 24, 2019

With #37566 we have introduced the ability to merge multiple search responses into one. That makes it possible to expose a new way of executing cross-cluster search requests, that makes CCS much faster whenever there is network latency between the CCS coordinating node and the remote clusters. The coordinating node can now send a single search request to each remote cluster, which gets reduced by each one of them. from + size results are requested to each cluster, and the reduce phase in each cluster is non final (meaning that buckets are not pruned and pipeline aggs are not executed). The CCS coordinating node performs an additional, final reduction, which produces one search response out of the multiple responses received from the different clusters.

This new execution path will be activated by default for any CCS request unless a scroll is provided or inner hits are requested as part of field collapsing. The search API accepts now a new parameter called ccs_minimize_roundtrips that allows to opt-out of the default behaviour.

Relates to #32125

javanna added some commits Jan 21, 2019

wip
fix
@elasticmachine

This comment has been minimized.

Copy link
Collaborator

commented Jan 24, 2019

@jimczi
Copy link
Member

left a comment

LGTM, I left a minor comment regarding another pr that will conflict with this one.

@javanna javanna referenced this pull request Jan 24, 2019

Closed

Cross-cluster search alternate execution mode #32125

11 of 11 tasks complete

@javanna javanna changed the title Introduce ccs remote reduce mode Introduce CCS remote reduce mode Jan 24, 2019

@javanna javanna requested a review from s1monw Jan 28, 2019

@s1monw
Copy link
Contributor

left a comment

I left a question

involved, as it treats all shards the same, at the cost of sending many
requests to each remote cluster.

- `auto`: `remote` is used whenever possible. In case a scroll is provided,

This comment has been minimized.

Copy link
@s1monw

s1monw Jan 28, 2019

Contributor

so what happens if I specify remote and I have inner hits?

This comment has been minimized.

Copy link
@s1monw

s1monw Jan 28, 2019

Contributor

I wonder if we have to have a mode at all. can't we simply go with local_reduce: true|false if it's true we force local if not we try remote if possible?

This comment has been minimized.

Copy link
@javanna

javanna Jan 28, 2019

Author Member

so what happens if I specify remote and I have inner hits?

If you specify remote and you provide a scroll or request inner hits as part of fields collapsing, you get back an error. I think it's better to let users know that what they have requested is not supported when they are explicit in their request.

I wonder if we have to have a mode at all. can't we simply go with local_reduce: true|false if it's true we force local if not we try remote if possible?

I think that having remote (or false) act only as a preference is confusing, as users are requesting something that may not be possible. Ideally, if somebody sets local_reduce: false (or reduce_mode: remote), we would never go and fan out to the shards. It sounds weird to me to have an option act as a proper on/off behaviour and the other one as a preference, unless they have different names.

If we want to have only two values and get rid of the auto option, we could have reduce_mode: always_local | prefer_remote which I find much more self-explanatory than a boolean. Yet I am not sure it is much different from what we have now in the PR.

This comment has been minimized.

Copy link
@javanna

javanna Jan 28, 2019

Author Member

or we could also rename auto to prefer_remote and keep remote.

This comment has been minimized.

Copy link
@javanna

javanna Jan 28, 2019

Author Member

we discussed this a bit with @jimczi and the team, we could go with a boolean but I would like to have some force part in the name of the option. We could go with ccs_force_local_reduce which defaults to false, or maybe even ccs_optimize_for_latency which defaults to true? I think I prefer the latter as it tells when such option should be used. Any other thoughts/preference?

@clintongormley

This comment has been minimized.

Copy link
Member

commented Jan 28, 2019

We could go with ccs_force_local_reduce which defaults to false, or maybe even ccs_optimize_for_latency which defaults to true? I think I prefer the latter as it tells when such option should be used.

With ccs_optimize_for_latency it is not clear to me which option I'd be opting for - i'd have to really understand the feature. Also, given that we want to use remote reduce as the default because we think it'll be better, then i think ccs_force_local_reduce: true works as a good way to opt out of that behaviour.

@s1monw

This comment has been minimized.

Copy link
Contributor

commented Jan 28, 2019

With ccs_optimize_for_latency it is not clear to me which option I'd be opting for - i'd have to really understand the feature. Also, given that we want to use remote reduce as the default because we think it'll be better, then i think ccs_force_local_reduce: true works as a good way to opt out of that behaviour.

I am not sure if ccs_force_local_reduce explains what's going on. the reduce is not relevant here IMO. This is more about the network calls we do. Something like css_minimize_roundtrips: true|false would be more descriptive IMO.

@zuketo

This comment has been minimized.

Copy link

commented Jan 28, 2019

Does the term WAN help here? E.g. ccs_wan_optimized: true|false

"We're operating across a large network, let's keep this set to true" vs "We're operating within a local network, let's investigate changing this setting"

@javanna

This comment has been minimized.

Copy link
Member Author

commented Jan 29, 2019

I am not sure if ccs_force_local_reduce explains what's going on. the reduce is not relevant here IMO. This is more about the network calls we do. Something like css_minimize_roundtrips: true|false would be more descriptive IMO.

I agree with this, I like ccs_minimize_roundtrips, I think it explains the trade-off well, at the same time staying high-level and without mentioning reduction. I briefly chatted to @clintongormley and he also agrees. I will adapt the PR accordingly.

@s1monw

This comment has been minimized.

Copy link
Contributor

commented Jan 29, 2019

thanks @javanna

@javanna

This comment has been minimized.

Copy link
Member Author

commented Jan 31, 2019

heya @s1monw can you have another look? it should be ready.

@javanna

This comment has been minimized.

Copy link
Member Author

commented Jan 31, 2019

run elasticsearch-ci/2

@s1monw

s1monw approved these changes Jan 31, 2019

Copy link
Contributor

left a comment

LGTM thanks @javanna

@javanna javanna changed the title Introduce CCS remote reduce mode Introduce ability to minimize round-trips in CCS requests Jan 31, 2019

@javanna javanna merged commit 622fb78 into elastic:master Jan 31, 2019

8 checks passed

CLA Commit author is a member of Elasticsearch
Details
elasticsearch-ci/1 Build finished.
Details
elasticsearch-ci/2 Build finished.
Details
elasticsearch-ci/default-distro Build finished.
Details
elasticsearch-ci/docbldesx Build finished.
Details
elasticsearch-ci/docs-check Build finished.
Details
elasticsearch-ci/oss-distro-docs Build finished.
Details
elasticsearch-ci/packaging-sample Build finished.
Details
@javanna

This comment has been minimized.

Copy link
Member Author

commented Jan 31, 2019

thanks a lot everybody involved! ;)

@javanna javanna changed the title Introduce ability to minimize round-trips in CCS requests Introduce ability to minimize round-trips in CCS Jan 31, 2019

jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Jan 31, 2019

Merge branch 'master' into retention-leases-version
* master: (100 commits)
  Push primary term to replication tracker (elastic#38044)
  Introduce ability to minimize round-trips in CCS (elastic#37828)
  Don't Assert Ack on when Publish Timeout is 0 in Test (elastic#38077)
  Reduce object creation in Rounding class (elastic#38061)
  Treat put-mapping calls with `_doc` as a top-level key as typed calls. (elastic#38032)
  Fix test bug when testing the merging of mappings and templates. (elastic#38021)
  spelling: java script -- not JavaScript (elastic#37057)
  Enable SSL in reindex with security QA tests (elastic#37600)
  Disable BWC tests during backport (elastic#38074)
  SQL: Added SSL configuration options tests (elastic#37875)
  Minor fixes in the release notes script. (elastic#37967)
  Fix typo in docs. (elastic#38018)
  Update Lucene repo for 7.0.0-alpha2 (elastic#37985)
  Fix size of rolling-upgrade bootstrap config (elastic#38031)
  fix DateIndexNameProcessorTests offset pattern (elastic#38069)
  Speed up converting of temporal accessor to zoned date time (elastic#37915)
  Work around JDK8 timezone bug in tests (elastic#37968)
  Correct arg names when update mapping/settings from leader (elastic#38063)
  Introduce ssl settings to reindex from remote (elastic#37527)
  Mute testRetentionLeasesSyncOnExpiration
  ...

@colings86 colings86 added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.