Branch: master
Commits on Apr 4, 2019
  1. Make remote cluster resolution stricter (#40419)

    javanna committed Apr 4, 2019
    Remote cluster resolution is currently lenient, to support local
    indices that may contain `:` in their name. From 8.0 on, there can no
    longer be indices in the cluster that contain `:` in their name, hence
    we can make remote cluster resolution stricter. Instead of treating
    any index expression containing a `:` as a local index whenever there
    is no matching remote cluster registered, we now throw a
    `NoSuchRemoteClusterException`.
    
    Closes #37863
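
    The stricter behaviour described above can be sketched roughly as
    follows. This is a minimal illustration, not the actual Elasticsearch
    code: the class names, the empty-string key used for local indices,
    and the exact-match lookup of the alias (no wildcard support) are all
    assumptions of the sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the exception named in the commit message.
class NoSuchRemoteClusterSketchException extends RuntimeException {
    NoSuchRemoteClusterSketchException(String clusterAlias) {
        super("no such remote cluster: [" + clusterAlias + "]");
    }
}

class RemoteResolutionSketch {
    // Hypothetical key under which local index expressions are grouped.
    static final String LOCAL = "";

    // Groups index expressions by cluster alias. An expression containing ':'
    // whose alias is not registered is rejected instead of being treated as
    // the name of a local index.
    static Map<String, List<String>> groupByCluster(Set<String> registeredClusters, String... expressions) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (String expression : expressions) {
            int separator = expression.indexOf(':');
            if (separator < 0) {
                grouped.computeIfAbsent(LOCAL, key -> new ArrayList<>()).add(expression);
                continue;
            }
            String alias = expression.substring(0, separator);
            if (!registeredClusters.contains(alias)) {
                throw new NoSuchRemoteClusterSketchException(alias);
            }
            grouped.computeIfAbsent(alias, key -> new ArrayList<>()).add(expression.substring(separator + 1));
        }
        return grouped;
    }

    public static void main(String[] args) {
        Set<String> registered = Set.of("cluster_one");
        System.out.println(groupByCluster(registered, "logs", "cluster_one:metrics"));
        try {
            groupByCluster(registered, "unknown:logs");
        } catch (NoSuchRemoteClusterSketchException e) {
            System.out.println(e.getMessage());
        }
    }
}
```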
Commits on Apr 1, 2019
Commits on Mar 28, 2019
  1. Mute DataFrameAuditorIT#testAuditorWritesAudits

    javanna committed Mar 28, 2019
    Relates to #40594
  2. Move top-level pipeline aggs out of QuerySearchResult (#40319)

    javanna committed Mar 28, 2019
    As part of #40177 we have added top-level pipeline aggs to
    `InternalAggregations`. Given that `QuerySearchResult` holds an
    `InternalAggregations` instance, there is no need to keep on setting
    top-level pipeline aggs separately. Top-level pipeline aggs can then
    always be transported through `InternalAggregations`. Such change is
    made in a backwards compatible manner.
Commits on Mar 27, 2019
  1. Increase suite timeout to 30 minutes for docs tests (#40521)

    javanna committed Mar 27, 2019
    I have been hitting the suite timeout on `DocsClientYamlTestSuiteIT`.
    As far as I can see, the docs tests are taking quite a while; I assume
    it's because more and more docs snippets get added over time, which
    means more tests. The current suite timeout is the default 20 minutes.
    It takes me just a little less than 20 minutes to run these on my
    laptop. On my CI, I end up hitting the suite timeout. Hereby I propose
    that we increase the suite timeout to 30 minutes.
Commits on Mar 25, 2019
  1. Add integration tests to verify CCS output (#40038)

    javanna committed Mar 25, 2019
    We recently introduced the option to minimize network roundtrips when
    executing cross-cluster search requests. All the changes made around
    that are separately unit tested, and there are some yaml tests that
    exercise the new code-path which involves multiple coordination steps.
    This commit adds new integration tests that compare the output given by
    CCS when running the same queries using the two different execution
    modes available.
    
    Relates to #32125
Commits on Mar 20, 2019
  1. Remove throws IOException from PipelineAggregationBuilder#create (#40222)

    javanna committed Mar 20, 2019
    An IOException is never thrown by any of the existing pipeline aggregation
    builders. Removing `throws IOException` from the `create` method also allows
    removing it from a couple of other methods, which ends up simplifying
    AggregationPhase (one less catch).
Commits on Mar 19, 2019
  1. Re-enable bwc tests (#40215)

    javanna committed Mar 19, 2019
    Relates to #40177 which is now merged and backported to all branches.
  2. Remove version conditionals from InternalAggregations (#40193)

    javanna committed Mar 19, 2019
    * Remove version conditionals from InternalAggregations
    
    Version conditionals are no longer needed once #40177 is back-ported all the way to 6.7.
    
    * Disable bwc tests
    
    Relates to #40177
    
    * indentation
  3. Serialize top-level pipeline aggs as part of InternalAggregations (#40177)

    javanna committed Mar 19, 2019
    We currently convert pipeline aggregators to their corresponding
    InternalAggregation instance as part of the final reduction phase.
    They arrive at the coordinating node as part of QuerySearchResult
    objects from the shards and, although we may incrementally reduce
    aggs (hence we may have some non-final reduce and the final
    one later), all the reduction phases happen on the same node.
    
    With CCS minimizing roundtrips though, each cluster performs its
    own non-final reduction, and then serializes the results back to
    the CCS coordinating node which will perform the final coordination.
    This breaks the assumptions made up until now around reductions
    happening all on the same node.
    
    With #40101 we have made sure that top-level pipeline aggs are not
    reduced as part of the non-final reduction. The next step is to make
    sure that they don't get lost, meaning that each coordinating node
    needs to send them back to the CCS coordinating node as part of
    the top-level `InternalAggregations` object.
    
    Closes #40059
Commits on Mar 18, 2019
  1. Skip sibling pipeline aggregators reduction during non-final reduce (#40101)

    javanna committed Mar 18, 2019
    Today a coordinating node forces a final reduction of sibling pipeline aggregators whenever reducing aggs, unless it is reducing aggs incrementally. This works well for incremental reduction of aggs, but breaks CCS when minimizing roundtrips, as each cluster ends up reducing its own pipeline aggregators locally while that should only be done by the CCS coordinating node later. This causes issues because, once reduced, pipeline aggs cannot be reduced any further, which is exactly what CCS then attempts, causing errors like "java.lang.UnsupportedOperationException: Not supported" to be returned.
    
    Each coordinating node should rather honour the reduce context flag that
    indicates whether we are executing a final reduce or not. If not, it should leave the sibling pipeline aggregations alone.
    
    Note that this bug affects only pipeline aggs that don't have a parent in
    the aggs tree, while all the others work well.
    
    Relates to #40059 but does not fix it yet, as the CCS coordinating node also needs to be adapted to recreate sibling pipeline aggregators from the request.
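
    As a rough illustration of honouring the final-reduce flag, the sketch
    below merges bucket counts on every reduce but computes a sibling
    pipeline value (here a "max bucket") only when the reduce is final.
    All names are hypothetical and the data model is heavily simplified
    compared to the real aggregations framework.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class PipelineReduceSketch {
    // Hypothetical reduction result: merged bucket counts, plus the output of a
    // sibling pipeline aggregation that is only present after a final reduce.
    static final class Reduced {
        final Map<String, Long> buckets;
        final Long maxBucket; // null until a final reduce has run

        Reduced(Map<String, Long> buckets, Long maxBucket) {
            this.buckets = buckets;
            this.maxBucket = maxBucket;
        }
    }

    static Reduced reduce(List<Map<String, Long>> partials, boolean finalReduce) {
        Map<String, Long> merged = new TreeMap<>();
        for (Map<String, Long> partial : partials) {
            partial.forEach((key, count) -> merged.merge(key, count, Long::sum));
        }
        Long maxBucket = null;
        if (finalReduce) {
            // Sibling pipeline aggs run only here, on fully merged buckets.
            for (long count : merged.values()) {
                maxBucket = maxBucket == null ? count : Math.max(maxBucket, count);
            }
        }
        return new Reduced(merged, maxBucket);
    }

    public static void main(String[] args) {
        // Each cluster performs a non-final reduce and leaves the pipeline agg alone.
        Reduced clusterA = reduce(List.of(Map.of("x", 1L, "y", 5L)), false);
        Reduced clusterB = reduce(List.of(Map.of("x", 7L)), false);
        // The CCS coordinating node performs the final reduce.
        Reduced merged = reduce(List.of(clusterA.buckets, clusterB.buckets), true);
        System.out.println(merged.buckets + " max=" + merged.maxBucket);
    }
}
```

    In this example a per-cluster max would be 5 and 7, and no combination
    of those two values yields the correct global max of 8, which is why
    the pipeline step has to wait for the final reduce.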
  2. CCS: skip empty search hits when minimizing round-trips (#40098)

    javanna committed Mar 18, 2019
    When minimizing round-trips, each cluster returns its own independent
    search response. In case sort by field and/or field collapsing were
    requested, when one cluster has no results to return, the information
    about the field that sorting was based on (SortField array) as well as
    the field (and the values) that collapsing was performed on are missing
    in the search response. That causes problems as we can't build the
    proper `TopDocs` instance which would need to be either `TopFieldDocs`
    or `CollapseTopFieldDocs`. The merge routine expects that all the top
    docs are of the same exact type which can't be guaranteed. Given that
    the problematic results are empty, hence have no impact on the final
    results, we can simply skip them.
    
    Relates to #32125
    Closes #40067
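
    A simplified sketch of the idea: when merging per-cluster results,
    responses without hits carry no sort information and are skipped, so
    the remaining ones can be checked for a consistent sort field. The
    `ClusterHits` type and the use of plain `long` sort values are
    assumptions of this sketch; the real code deals with `TopDocs`
    subclasses.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class EmptyHitsMergeSketch {
    // Hypothetical per-cluster result: the sort values of the returned hits and
    // the sort field name, which is unknown (null) when there are no hits.
    static final class ClusterHits {
        final String sortField;
        final List<Long> sortValues;

        ClusterHits(String sortField, List<Long> sortValues) {
            this.sortField = sortField;
            this.sortValues = sortValues;
        }
    }

    // Merges the hits of all clusters in ascending sort order, keeping the top `size`.
    static List<Long> merge(List<ClusterHits> responses, int size) {
        List<Long> merged = new ArrayList<>();
        String sortField = null;
        for (ClusterHits response : responses) {
            if (response.sortValues.isEmpty()) {
                continue; // empty results cannot affect the final hits, skip them
            }
            if (sortField != null && !sortField.equals(response.sortField)) {
                throw new IllegalArgumentException("responses are sorted on different fields");
            }
            sortField = response.sortField;
            merged.addAll(response.sortValues);
        }
        merged.sort(Comparator.naturalOrder());
        return new ArrayList<>(merged.subList(0, Math.min(size, merged.size())));
    }

    public static void main(String[] args) {
        List<ClusterHits> responses = List.of(
            new ClusterHits("timestamp", List.of(3L, 9L)),
            new ClusterHits(null, List.of()), // cluster with no results
            new ClusterHits("timestamp", List.of(1L)));
        System.out.println(merge(responses, 2));
    }
}
```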
  3. Fix bad cross-link

    javanna committed Mar 18, 2019
    Relates to #39329
  4. [DOCS] add details on version compatibility and remote gateway selection (#40056)

    javanna committed Mar 18, 2019
    This commit clarifies how the gateway selection works when configuring
    remote clusters for CCR or CCS. Specifically, it clarifies compatibility
    between different versions which is a very common question.
Commits on Mar 14, 2019
  1. CCS: Disable minimizing round-trips when dfs is requested (#40044)

    javanna committed Mar 14, 2019
    When using DFS_QUERY_THEN_FETCH search type, the dfs phase is run and
    its results are used in the query phase to make scoring accurate.
    When using CCS, depending on whether the DFS phase runs in the CCS
    coordinating node (like if all shards were local) or in each remote
    cluster (when minimizing round-trips), scoring will differ.
    
    This commit disables minimizing round-trips whenever DFS is requested,
    as it is not currently possible to ensure that scoring is accurate in
    that case.
    
    Relates to #32125
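
    The resulting decision could be expressed along these lines. This is
    only a sketch with hypothetical names; besides the DFS check, it folds
    in the scroll and collapse-with-inner-hits exclusions mentioned in the
    Jan 31 commit that introduced the feature.

```java
class MinimizeRoundtripsSketch {
    // Hypothetical mirror of the two search types discussed in the commit message.
    enum SearchType { QUERY_THEN_FETCH, DFS_QUERY_THEN_FETCH }

    // Returns true when a CCS request may be executed with a single request
    // per remote cluster. All parameter names are assumptions of this sketch.
    static boolean shouldMinimizeRoundtrips(boolean ccsMinimizeRoundtrips,
                                            SearchType searchType,
                                            boolean scroll,
                                            boolean collapseWithInnerHits) {
        if (!ccsMinimizeRoundtrips) {
            return false; // the user opted out
        }
        if (searchType == SearchType.DFS_QUERY_THEN_FETCH) {
            return false; // scoring would depend on where the dfs phase runs
        }
        return !scroll && !collapseWithInnerHits;
    }

    public static void main(String[] args) {
        System.out.println(shouldMinimizeRoundtrips(true, SearchType.QUERY_THEN_FETCH, false, false));
        System.out.println(shouldMinimizeRoundtrips(true, SearchType.DFS_QUERY_THEN_FETCH, false, false));
    }
}
```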
Commits on Mar 5, 2019
  1. Tie-break completion suggestions with same score and surface form (#39564)

    javanna committed Mar 5, 2019
    In case multiple completion suggestion entries have the same score and
    surface form, the order in which such options will be returned is
    currently not deterministic.
    
    With this commit we introduce tie-breaking for such situations, based
    on shard id, index name, index uuid and doc id, like we already do for
    ordinary search hits. With this change we also make shardIndex
    mandatory when sorting and comparing completion suggestion options
    (it was previously only needed later, when fetching hits).
    
    Also, we need to make sure shardIndex is properly set when merging
    completion suggestions coming from multiple clusters in
    `SearchResponseMerger`.
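
    A minimal sketch of such a tie-breaking comparator is shown below. It
    is an illustration under assumptions, not the actual implementation:
    the `Option` type is hypothetical, and a single `shardIndex` integer
    stands in for the shard id, index name and index uuid mentioned above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SuggestionTieBreakSketch {
    // Hypothetical completion suggestion option.
    static final class Option {
        final String surfaceForm;
        final float score;
        final int shardIndex;
        final int docId;

        Option(String surfaceForm, float score, int shardIndex, int docId) {
            this.surfaceForm = surfaceForm;
            this.score = score;
            this.shardIndex = shardIndex;
            this.docId = docId;
        }
    }

    // Higher score first, then surface form, then shard index and doc id,
    // so that two distinct options never compare as equal.
    static final Comparator<Option> COMPARATOR = (a, b) -> {
        int cmp = Float.compare(b.score, a.score);
        if (cmp != 0) {
            return cmp;
        }
        cmp = a.surfaceForm.compareTo(b.surfaceForm);
        if (cmp != 0) {
            return cmp;
        }
        cmp = Integer.compare(a.shardIndex, b.shardIndex);
        if (cmp != 0) {
            return cmp;
        }
        return Integer.compare(a.docId, b.docId);
    };

    // Returns a sorted copy, leaving the input untouched.
    static List<Option> sorted(List<Option> options) {
        List<Option> copy = new ArrayList<>(options);
        copy.sort(COMPARATOR);
        return copy;
    }

    public static void main(String[] args) {
        Option first = new Option("nirvana", 1.0f, 0, 7);
        Option second = new Option("nirvana", 1.0f, 1, 3);
        // The result is the same regardless of the order the options arrive in.
        System.out.println(sorted(List.of(second, first)).get(0) == first);
    }
}
```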
Commits on Mar 4, 2019
  1. Remove private SearchHits.Total class (#39556)

    javanna committed Mar 4, 2019
    This is now possible as Lucene's `TotalHits` implements `equals`/`hashcode`.
    All the other methods can be in-lined in `SearchHits` instead, so there is
    no need for a specific wrapper class.
Commits on Mar 1, 2019
  1. Mute failing IndexShardIT#testPendingRefreshWithIntervalChange

    javanna committed Mar 1, 2019
    Relates to #39565
Commits on Feb 26, 2019
  1. Rename SearchRequest#withLocalReduction (#39108)

    javanna committed Feb 26, 2019
    `withLocalReduction` is confusing as `local` effectively means "local
    to the remote clusters" rather than "local to the coordinating node" where
    the method is executed. I propose we rename the method to
    `crossClusterSearch` which better resembles what the static method is
    used for.
  2. Completion suggestions to be reduced once instead of twice (#39255)

    javanna committed Feb 26, 2019
    We have been calling `reduce` against completion suggestions twice, once
    in `SearchPhaseController#reducedQueryPhase` where all suggestions get
    reduced, and once more in `SearchPhaseController#sortDocs` where we
    add the top completion suggestions to the `TopDocs` so their docs can
    be fetched. There is no need to do reduction twice. All suggestions can
    be reduced in one call, then we can filter the result and pass only the
    already reduced completion suggestions over to `sortDocs`. The small
    but important detail is that `shardIndex`, which is currently used only
    to fetch suggestion hits, needs to be set before the first reduction,
    hence outside of `sortDocs` where we have been doing it until now.
Commits on Feb 25, 2019
  1. [DOCS] Fix typo in network-host.asciidoc

    javanna committed Feb 25, 2019
Commits on Feb 15, 2019
  1. Tie break search shard iterator comparisons on cluster alias (#38853)

    javanna committed Feb 15, 2019
    `SearchShardIterator` inherits its `compareTo` implementation from `PlainShardIterator`. That is good in most of the cases, as such comparisons are based on the shard id which is unique, even when searching against indices with same names across multiple clusters (thanks to the index uuid being different). In case though the same cluster is registered multiple times with different aliases, the shard id is exactly the same, hence remote results will be returned before local ones with same shard id objects. That is because remote iterators are added before local ones, and we use a stable sorting method in `GroupShardIterators` constructor.
    
    This PR enhances `compareTo` for `SearchShardIterator` to tie break on cluster alias and introduces consistent `equals` and `hashcode` methods. This allows removing a TODO in `SearchResponseMerger`, which otherwise has to handle this special case specifically. Also, while at it, I added missing tests around equals/hashcode and compareTo and expanded existing ones.
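
    The tie-break can be sketched as follows. The class below is a
    hypothetical, stripped-down stand-in for `SearchShardIterator`; in
    particular, ordering the local cluster (null alias) before remote ones
    is an assumption of the sketch rather than something stated above.

```java
import java.util.Objects;

class ShardIteratorSketch implements Comparable<ShardIteratorSketch> {
    final String clusterAlias; // null stands for the local cluster
    final String indexUuid;
    final int shardNumber;

    ShardIteratorSketch(String clusterAlias, String indexUuid, int shardNumber) {
        this.clusterAlias = clusterAlias;
        this.indexUuid = indexUuid;
        this.shardNumber = shardNumber;
    }

    @Override
    public int compareTo(ShardIteratorSketch other) {
        // First compare what identifies the shard itself.
        int cmp = Integer.compare(shardNumber, other.shardNumber);
        if (cmp != 0) {
            return cmp;
        }
        cmp = indexUuid.compareTo(other.indexUuid);
        if (cmp != 0) {
            return cmp;
        }
        // Same shard id: tie break on the cluster alias (assumed: local first).
        if (Objects.equals(clusterAlias, other.clusterAlias)) {
            return 0;
        }
        if (clusterAlias == null) {
            return -1;
        }
        if (other.clusterAlias == null) {
            return 1;
        }
        return clusterAlias.compareTo(other.clusterAlias);
    }

    // equals and hashCode use the same fields as compareTo, keeping them consistent.
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof ShardIteratorSketch)) {
            return false;
        }
        ShardIteratorSketch that = (ShardIteratorSketch) o;
        return shardNumber == that.shardNumber
            && indexUuid.equals(that.indexUuid)
            && Objects.equals(clusterAlias, that.clusterAlias);
    }

    @Override
    public int hashCode() {
        return Objects.hash(clusterAlias, indexUuid, shardNumber);
    }
}
```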
Commits on Feb 12, 2019
  1. [TEST] address testCollectNodes rare failure (#38559)

    javanna committed Feb 12, 2019
    #37767 changed the expected exception for "no such cluster" error from
    `IllegalStateException` to a dedicated `NoSuchRemoteClusterException`.
    An assertion in `testCollectNodes` needs to be updated accordingly.
Commits on Feb 11, 2019
  1. Tie break on cluster alias when merging shard search failures (#38715)

    javanna committed Feb 11, 2019
    A recent test failure triggered an edge case scenario where failures may be coming back with the same shard id, yet from different clusters.
    This commit adapts the failures comparator to take the cluster alias into account when merging failures as part of CCS requests execution.
    Also the corresponding test has been split in two: with and without
    search shard target set to the failure.
    
    Closes #38672
  2. Clean up ShardSearchLocalRequest (#38574)

    javanna committed Feb 11, 2019
    Added a constructor accepting `StreamInput` as argument, which allowed
    making most of the instance members final as well as removing the default
    constructor.
    Removed a test only constructor in favour of invoking the existing
    constructor that takes a `SearchRequest` as first argument.
    Also removed profile members and related methods as they were all unused.
  3. Look up connection using the right cluster alias when releasing contexts (#38570)

    javanna committed Feb 11, 2019
    Whenever a phase failure is raised in AbstractSearchAsyncAction, we go and
    release search contexts of shards that successfully returned their
    results, prior to notifying the listener of the failure. In case we are
    executing a CCS request, it's important to look up the right connection
    to send the release context request to.
    
    This commit makes sure that the lookup takes the cluster alias into
    account. We used to use `null` at all times instead which is not correct
    and was not caught as any exception is caught without re-throwing it.
Commits on Feb 6, 2019
  1. Remove support for maxRetryTimeout from low-level REST client (#38085)

    javanna committed Feb 6, 2019
    We have had various reports of problems caused by the maxRetryTimeout
    setting in the low-level REST client. This setting was initially added
    in an attempt to not have requests go through retries if the request
    already took longer than the provided timeout.
    
    The implementation was problematic though as such timeout would also
    expire in the first request attempt (see #31834), would leave the
    request executing after expiration causing memory leaks (see #33342),
    and would not take into account the http client internal queuing (see #25951).
    
    Given all these issues, it seems that this custom timeout mechanism
    brings little benefit while causing a lot of harm. We should rather rely
    on the connect and socket timeouts exposed by the underlying http client
    and accept that a request can overall take longer than the configured
    timeout, which is the case even with a single retry anyways.
    
    This commit removes the `maxRetryTimeout` setting and all of its usages.
Commits on Feb 1, 2019
  1. Adjust SearchRequest version checks (#38181)

    javanna committed Feb 1, 2019
    The finalReduce flag is now supported on 6.x too, hence we need to update the version checks in master.
  2. Disable bwc tests while backporting #38104 (#38182)

    javanna committed Feb 1, 2019
    Relates to #38180
  3. Add finalReduce flag to SearchRequest (#38104)

    javanna committed Feb 1, 2019
    With #37000 we made sure that final reduction is automatically disabled
    whenever a localClusterAlias is provided with a SearchRequest.
    
    While working on #37838, we found a scenario where we do need to set a
    localClusterAlias yet we would like to perform a final reduction in the
    remote cluster: when searching on a single remote cluster.
    
    Relates to #32125
    
    This commit adds support for a separate finalReduce flag to
    SearchRequest and makes use of it in TransportSearchAction in case we
    are searching against a single remote cluster.
    
    This also makes sure that num_reduce_phases is correct when searching
    against a single remote cluster: it makes little sense to return
    `num_reduce_phases` set to `2`, which looks especially weird in case
    the search was performed against a single remote shard. We should
    perform one reduction phase only in this case and `num_reduce_phases`
    should reflect that.
    
    * line length
Commits on Jan 31, 2019
  1. Introduce ability to minimize round-trips in CCS (#37828)

    javanna committed Jan 31, 2019
    With #37566 we have introduced the ability to merge multiple search responses into one. That makes it possible to expose a new way of executing cross-cluster search requests, which makes CCS much faster whenever there is network latency between the CCS coordinating node and the remote clusters. The coordinating node can now send a single search request to each remote cluster, which gets reduced by each one of them. from + size results are requested from each cluster, and the reduce phase in each cluster is non-final (meaning that buckets are not pruned and pipeline aggs are not executed). The CCS coordinating node performs an additional, final reduction, which produces one search response out of the multiple responses received from the different clusters.
    
    This new execution path will be activated by default for any CCS request unless a scroll is provided or inner hits are requested as part of field collapsing. The search API now accepts a new parameter called ccs_minimize_roundtrips that allows opting out of the default behaviour.
    
    Relates to #32125
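
    To illustrate why from + size results are requested from each cluster,
    here is a simplified sketch in which hits are reduced to their scores.
    The class and method names are hypothetical; the real merging handles
    full search responses.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class CcsPaginationSketch {
    // What each remote cluster returns: its own top (from + size) scores, best first.
    static List<Double> clusterTop(List<Double> scores, int from, int size) {
        List<Double> sorted = new ArrayList<>(scores);
        sorted.sort(Comparator.reverseOrder());
        return new ArrayList<>(sorted.subList(0, Math.min(from + size, sorted.size())));
    }

    // What the CCS coordinating node does: merge all responses, then apply
    // `from` and `size` once, on the merged list.
    static List<Double> finalMerge(List<List<Double>> perCluster, int from, int size) {
        List<Double> all = new ArrayList<>();
        for (List<Double> clusterScores : perCluster) {
            all.addAll(clusterScores);
        }
        all.sort(Comparator.reverseOrder());
        int start = Math.min(from, all.size());
        int end = Math.min(from + size, all.size());
        return new ArrayList<>(all.subList(start, end));
    }

    public static void main(String[] args) {
        int from = 2;
        int size = 2;
        List<Double> clusterA = clusterTop(List.of(9.0, 7.0, 5.0, 1.0), from, size);
        List<Double> clusterB = clusterTop(List.of(8.0, 6.0, 2.0), from, size);
        System.out.println(finalMerge(List.of(clusterA, clusterB), from, size));
    }
}
```

    Any hit that belongs to the requested page globally is necessarily
    among the top from + size hits of its own cluster, so asking each
    cluster for that many results is enough for the final merge to produce
    the correct page.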
Commits on Jan 30, 2019
  1. Move SearchHit and SearchHits to Writeable (#37931)

    javanna committed Jan 30, 2019
    This allowed making SearchHits immutable, while quite a few fields in
    SearchHit have to stay mutable unfortunately.
    
    Relates to #34389
Commits on Jan 29, 2019
  1. Remove clusterAlias instance member from QueryShardContext (#37923)

    javanna committed Jan 29, 2019
    The clusterAlias member is only used in the copy constructor, to be able
    to reconstruct the fully qualified index. It is also possible to remove
    the instance member and add a private constructor that accepts the already built Index object which contains the cluster alias.
  2. Remove test only SearchShardTarget constructor (#37912)

    javanna committed Jan 29, 2019
    Remove SearchShardTarget test only constructor and replace all the usages with calls to the other constructor that accepts a ShardId.