Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cluster state and versions - when should we increment which version and how often? #14158

Closed
brwe opened this issue Oct 16, 2015 · 3 comments
Closed
Labels
discuss :Distributed/Discovery-Plugins Anything related to our integration plugins with EC2, GCP and Azure

Comments

@brwe
Copy link
Contributor

brwe commented Oct 16, 2015

While reviewing #14062 I found it somewhat difficult to reason about which version is incremented when in the cluster state. I open this issue to summarize how I think versions work right now and to maybe get some feedback on how this can be handled in a more easy to understand way.

This is the version numbers we maintain:

  • cluster state version:
    • should be incremented only by one each time a cluster state update task is processed and actually yielded any change in the whole cluster state
    • currently only changed in the cluster state update thread.
  • MetaData version:
    • should increment each time something in the cluster settings or index meta data changes.
    • currently only changed in the cluster state update thread.
  • IndexMetaData version:
    • Should increment whenever some index specific property changes (mapping, settings, etc. expect for routing changes which we track in shard version).
    • changed in many places, look at IndexMetaData.Builder.version(..) and where it is called.
  • RoutingTable version
    • should increment each time any shard routing changes
    • currently changed in the cluster state update thread and AllocationService.buildChangedResult(..).
  • Shard version:
    • incremented each time something in the routing of a single shard changes (add/remove replica, relocate shard etc.).
    • Shard version is stored in the ShardRoutings. It is updated when a single ShardRouting changes during allocation and then copied over to all ShardRoutings from the same copy in IndexRoutingTable.normalizeVersions() once allocation has finished.

Q: Cluster state version and MetaData version are updated only once per cluster state update task but the others are not or at least it is not immediately clear if they can potentially be incremented by > 1. It it OK if IndexMetaData version, RoutingTable version and shard version increment by more than one between cluster states?

While I think I understand why we do versioning now the way we do it I find it cumbersome to read and wonder if there is a cleaner way to maintain these versions.
For example: we only need the version increments before master sends the new cluster state to the other nodes. Can we build a new cluster state without incrementing any version in any of the components and then here check the difference between new and old cluster state and update all versions in one go? This would leave no questions open about when versions are updated and where. Chatted very briefly with @s1monw who thinks this will add complexity and not remove any but I have not yet given up hope and will give it a shot.

Any kind of feedback is more than welcome.

@bleskes
Copy link
Contributor

bleskes commented Oct 19, 2015

Thanks @brwe for the great write up. The way I see it the version are primarily used to allow to check for equality (with one big exception, see further). For example, this is used when updating cluster states on the nodes after receiving a new one from the master where we want to reuse local objects when the didn't change. Longer term I tend to say we should go with implementing equals methods where we need to so can avoid using this version variant (as we recently did in shard routing). We should look into how much effort this requires.

In the cluster state level, the version is also used to indicate "supersedes" relationship where a node can process only the cluster state with highest version from it's pending CS queue and it is guaranteed to include all the changes from the previous ones. It doesn't really matter at the moment if we skip a version, though I don't think it happens now.

Last - ShardRouting's version is used for selecting a primary after a full cluster restart, but we have plans to change that as you know :)

@clintongormley clintongormley added :Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. and removed :Cluster labels Feb 13, 2018
@DaveCTurner DaveCTurner added the :Distributed/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) label Mar 15, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

@DaveCTurner DaveCTurner added :Distributed/Discovery-Plugins Anything related to our integration plugins with EC2, GCP and Azure and removed :Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. :Distributed/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) labels Mar 15, 2018
@DaveCTurner
Copy link
Contributor

Closing this as there's no further action needed here. Possibly this will be revisited during the Zen 2 work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discuss :Distributed/Discovery-Plugins Anything related to our integration plugins with EC2, GCP and Azure
Projects
None yet
Development

No branches or pull requests

5 participants