Inline reroute with process of node join/master election #18938

Merged
merged 3 commits into elastic:master from bleskes:node_join_inline_reroute on Jun 17, 2016

@bleskes
Member

bleskes commented Jun 17, 2016

In the past, we had the semantics where the very first cluster state a node processed after joining could not contain shard assignments to it. This was to make sure the node cleaned up local / stale shard copies before receiving new ones that might confuse it. Since then a lot of work has been done in this area, most notably the introduction of allocation ids and #17270. This means we no longer have to be that careful and can just reroute in the same cluster state change where we process the join, keeping things simple and following the same pattern we have in other places.
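
For illustration, a minimal sketch of the pattern this PR moves to, assuming simplified stand-in types (the real ClusterState, AllocationService and join handling in Elasticsearch look different):

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Simplified sketch (not actual Elasticsearch code) of the idea in this PR:
// instead of publishing one cluster state for the join and only rerouting in a
// follow-up update, the join handler computes the reroute as part of the same
// cluster-state change. All types below are illustrative stand-ins.
final class JoinTaskExecutorSketch {

    record ClusterState(Set<String> nodes, Map<String, String> shardAssignments) {}

    interface AllocationService {
        // returns a state with shards (re)assigned across the current nodes
        ClusterState reroute(ClusterState state, String reason);
    }

    private final AllocationService allocationService;

    JoinTaskExecutorSketch(AllocationService allocationService) {
        this.allocationService = allocationService;
    }

    ClusterState processJoin(ClusterState current, String joiningNodeId) {
        Set<String> newNodes = new HashSet<>(current.nodes());
        newNodes.add(joiningNodeId);
        ClusterState withNode = new ClusterState(newNodes, current.shardAssignments());
        // Previously: return withNode and schedule a separate reroute later, so the
        // first state the new node sees never assigns shards to it.
        // Now: reroute inline, within the same cluster-state update.
        return allocationService.reroute(withNode, "node joined: " + joiningNodeId);
    }
}
```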

@bleskes
Member

bleskes commented Jun 17, 2016

@ywelsch can you take a look?

@bleskes bleskes closed this Jun 17, 2016

@bleskes bleskes reopened this Jun 17, 2016

@ywelsch
Contributor

ywelsch commented Jun 17, 2016

Left 2 comments. I really like this change!

@bleskes
Member

bleskes commented Jun 17, 2016

@ywelsch I pushed another commit addressing your comments

@ywelsch
Contributor

ywelsch commented Jun 17, 2016

Left one minor, no need for another iteration. LGTM. Thanks @bleskes!

@bleskes bleskes merged commit 46b40f7 into elastic:master Jun 17, 2016

1 check passed

CLA: Commit author is a member of Elasticsearch

@bleskes bleskes deleted the bleskes:node_join_inline_reroute branch Jun 17, 2016

@bleskes bleskes restored the bleskes:node_join_inline_reroute branch Jun 23, 2016

bleskes added a commit that referenced this pull request Jun 23, 2016

revert - Inline reroute with process of node join/master election (#18938)

There are secondary issues with async shard fetch going out to nodes before they have a cluster state published to them that need to be solved first. For example:
- async fetch uses a transport node action that resolves nodes based on the cluster state (but it's not yet exposed by ClusterService since we inline the reroute)
- after disruption, nodes will respond with an allocated shard (they didn't clean up their shards yet), which throws off decisions on the master side.
- nodes need the index meta data in question, but they may not have it if they didn't receive the latest CS
@bleskes
Member

bleskes commented Jun 23, 2016

I reverted this one due to secondary problems with async shard fetch. I'm working on fixing those before re-committing this.

From the revert commit message:

There are secondary issues with async shard fetch going out to nodes before they have a cluster state published to them that need to be solved first. For example:

  • async fetch uses a transport node action that resolves nodes based on the cluster state (but it's not yet exposed by ClusterService since we inline the reroute); see the sketch below
  • after disruption, nodes will respond with an allocated shard (they didn't clean up their shards yet), which throws off decisions on the master side.
  • nodes need the index meta data in question, but they may not have it if they didn't receive the latest CS
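
A minimal, self-contained sketch of the first point above, using hypothetical stand-in types rather than the real ClusterService / TransportNodesAction classes:

```java
import java.util.HashSet;
import java.util.Set;

// Simplified illustration of the race: with the reroute inlined, the shard-store
// fetch can be triggered before the new cluster state is exposed by ClusterService,
// so resolving the joining node by id against the *applied* state fails.
final class AsyncFetchRaceSketch {

    static final class ClusterServiceStub {
        private volatile Set<String> appliedNodeIds = Set.of("master", "data-0");

        Set<String> state() {
            return appliedNodeIds; // the last published-and-applied state
        }

        void apply(Set<String> newNodeIds) {
            appliedNodeIds = newNodeIds;
        }
    }

    public static void main(String[] args) {
        ClusterServiceStub clusterService = new ClusterServiceStub();

        // Master-side cluster state task: process the join + inline reroute.
        Set<String> newState = new HashSet<>(clusterService.state());
        newState.add("data-1"); // joining node

        // The inlined reroute kicks off an async shard-store fetch to "data-1"
        // *before* clusterService.apply(newState) has run.
        boolean resolvable = clusterService.state().contains("data-1");
        System.out.println("can resolve data-1 from applied state? " + resolvable); // false

        clusterService.apply(newState); // publication/apply happens later
        System.out.println("after apply: " + clusterService.state().contains("data-1")); // true
    }
}
```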

jasontedor added a commit that referenced this pull request Jun 23, 2016

Merge branch 'master' into feature/seq_no
* master: (416 commits)
  docs: removed obsolete information, percolator queries are not longer loaded into jvm heap memory.
  Upgrade JNA to 4.2.2 and remove optionality
  [TEST] Increase timeouts for Rest test client (#19042)
  Update migrate_5_0.asciidoc
  Add ThreadLeakLingering option to Rest client tests
  Add a MultiTermAwareComponent marker interface to analysis factories. #19028
  Attempt at fixing IndexStatsIT.testFilterCacheStats.
  Fix docs build.
  Move templates out of the Search API, into lang-mustache module
  revert - Inline reroute with process of node join/master election (#18938)
  Build valid slices in SearchSourceBuilderTests
  Docs: Convert aggs/misc to CONSOLE
  Docs: migration notes for _timestamp and _ttl
  Group client projects under :client
  [TEST] Add client-test module and make client tests use randomized runner directly
  Move upgrade test to upgrade from version 2.3.3
  Tasks: Add completed to the mapping
  Fail to start if plugin tries broken onModule
  Remove duplicated read byte array methods
  Rename `fields` to `stored_fields` and add `docvalue_fields`
  ...

bleskes added a commit that referenced this pull request Jun 27, 2016

Make shard store fetch less dependent on the current cluster state, both on master and non data nodes (#19044)

#18938 changed the timing in which we send out requests to nodes to fetch their shard stores. Instead of doing this after the cluster state resulting from the node's join was published, #18938 made it be sent concurrently with the publishing process. This revealed a couple of points where shard store fetching depends on the current state of affairs in the cluster state, both on the master and the data nodes. The problems discovered were already present without #18938 but required failures/extreme situations to surface. This PR tries to remove as many of these dependencies as possible, making shard store fetching simpler, and paves the way to re-introduce #18938, which was reverted.

These are the notable changes:
1) Allow callers of TransportNodesAction (from which shard store fetching is derived) to supply concrete disco nodes, so it won't need the cluster state to resolve them; a sketch of this follows the list. This was a problem because the cluster state containing the needed nodes was not yet made available through ClusterService. Note that long term we can expect the rest layer to resolve node ids to concrete nodes, making this mode the only one needed.
2) The data node relied on the cluster state to have the relevant index meta data so it can find the data when custom paths are used. We now fall back to reading the meta data from disk if needed.
3) The data node was relying on its own IndexService state to indicate whether the data it has corresponds to an existing allocation. This is of course something it cannot know until it has received (and processed) the new cluster state from the master. This flag in the response is now removed. This is not a problem because we used that flag to protect against double-assigning a shard to the same node, but we are already protected from that by the allocation deciders.
4) I removed the redundant filterNodeIds method in TransportNodesAction - if people want to filter they can override resolveRequest.
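
A rough sketch of change (1), using illustrative stand-in types rather than the actual TransportNodesAction / nodes-request classes:

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Objects;

// Minimal sketch of change (1): a nodes-level request that can carry concrete
// DiscoveryNode instances, so the transport action does not have to resolve
// node ids against a cluster state that may not be published yet. Types and
// names are simplified stand-ins, not the actual Elasticsearch classes.
final class NodesRequestSketch {

    record DiscoveryNode(String id, String address) {}

    static final class ShardStoresRequest {
        private final String[] nodeIds;              // resolved via the cluster state
        private final DiscoveryNode[] concreteNodes; // used as-is, no resolution needed

        ShardStoresRequest(String... nodeIds) {
            this.nodeIds = nodeIds;
            this.concreteNodes = null;
        }

        ShardStoresRequest(DiscoveryNode... concreteNodes) {
            this.nodeIds = null;
            this.concreteNodes = concreteNodes;
        }

        // The action only falls back to cluster-state resolution when the caller
        // did not supply concrete nodes.
        DiscoveryNode[] resolveNodes(Map<String, DiscoveryNode> clusterStateNodes) {
            if (concreteNodes != null) {
                return concreteNodes;
            }
            return Arrays.stream(nodeIds)
                    .map(clusterStateNodes::get)
                    .filter(Objects::nonNull)
                    .toArray(DiscoveryNode[]::new);
        }
    }
}
```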

bleskes added a commit to bleskes/elasticsearch that referenced this pull request Jul 4, 2016

bleskes added a commit that referenced this pull request Jul 18, 2016

Make static Store access shard lock aware (#19416)
We currently have a concurrency issue between the static methods on the Store class and store changes that are done via a valid open store. An example of this is the async shard fetch, which can reach out to a node while a local shard copy is shutting down (the fetch does check if we have an open shard and tries to use that first, but if the shard is shutting down, it will not be available from IndexService).

Specifically, async shard fetching tries to read metadata from the store; concurrently, the shard that is shutting down commits to Lucene, changing the segments_N file. This causes a file-not-found exception on the shard fetching side, which in turn makes the master think the shard is unusable. In tests this can cause the shard assignment to be delayed (up to 1m), which fails tests. See https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+java9-periodic/570 for details.

This is one of the things #18938 caused to bubble up.
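
A minimal sketch of the idea behind the fix, assuming simplified stand-in types (the real Store metadata reading and shard lock APIs differ):

```java
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

// Illustrative sketch of the #19416 approach: take the shard lock before reading
// on-disk metadata from a "static" code path, so the read cannot race with a
// concurrently closing shard that is still committing to Lucene. The lock and
// metadata types here are simplified stand-ins, not the real Store API.
final class ShardMetadataReader {

    interface ShardLocker {
        // Blocks until the shard lock is available or the timeout expires.
        AutoCloseable lock(String shardId, long timeoutMillis) throws Exception;
    }

    record StoreMetadata(Map<String, Long> fileToLength) {}

    StoreMetadata readMetadataSnapshot(Path shardPath, String shardId, ShardLocker locker)
            throws Exception {
        // Holding the lock guarantees no open shard is rewriting segments_N
        // underneath us while we list the files.
        try (AutoCloseable ignored = locker.lock(shardId, TimeUnit.SECONDS.toMillis(5))) {
            Map<String, Long> lengths = new HashMap<>();
            try (DirectoryStream<Path> files = Files.newDirectoryStream(shardPath)) {
                for (Path file : files) {
                    // record file lengths as a stand-in for real checksum metadata
                    lengths.put(file.getFileName().toString(), Files.size(file));
                }
            }
            return new StoreMetadata(lengths);
        }
    }
}
```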