Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove role re-balancing #230

Open
carlcsaposs-canonical opened this issue Apr 12, 2024 · 2 comments
Open

Remove role re-balancing #230

carlcsaposs-canonical opened this issue Apr 12, 2024 · 2 comments

Comments

@carlcsaposs-canonical
Copy link
Contributor

carlcsaposs-canonical commented Apr 12, 2024

If there are an even number of cluster-manager-eligible nodes, OpenSearch automatically excludes one from the voting configuration

There should normally be an odd number of master-eligible nodes in a cluster. If there is an even number, Elasticsearch leaves one of them out of the voting configuration to ensure that it has an odd size.

https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery-voting.html#_even_numbers_of_master_eligible_nodes

Therefore, there is no need to ensure an odd number of cluster-manager-eligible nodes by re-balancing the roles ourselves


This solves the issue with the deterministic role re-balancing in #209 where:

  • initial deployment
  • even number of units
  • leader unit is highest unit number
  • (highest unit number is not cluster manager eligible)
  • (leader unit has to start first, but must be cluster manager eligible)

Alternative solutions to that issue:

  • start highest unit as cluster manager eligible and restart it later to remove cluster-manager-eligible role (cons: extra restart, at what point do you restart unit—requires special case for initial start, could affect HA during initial start)
  • replace deterministic role re-balancing with non-deterministic role re-balancing (cons: upgrade cannot be coordinated without use of peer databag which harms rollback resiliency—detailed in Re-balance roles deterministically instead of randomly #209 (comment) or without race for lock [which has other concerns & deviates from upgrade design used in other charms])

This also removes restarts (that were needed on a unit when its roles changed)

Context: https://chat.canonical.com/canonical/pl/756bhdey33ysjx3qdee3ktgoxo

Copy link

carlcsaposs-canonical added a commit that referenced this issue Apr 12, 2024
See #230

Implemented in a minimal, hacky way to proceed with testing. Will be implemented fully (i.e. unnecessary code removed) by @Mehdi-Bendriss in another PR
carlcsaposs-canonical added a commit that referenced this issue Apr 12, 2024
See #230

Implemented in a minimal, hacky way to proceed with testing. Will be implemented fully (i.e. unnecessary code removed) by @Mehdi-Bendriss in another PR
carlcsaposs-canonical added a commit that referenced this issue Apr 12, 2024
See #230

Implemented in a minimal, hacky way to proceed with testing. Will be implemented fully (i.e. unnecessary code removed) by @Mehdi-Bendriss in another PR
carlcsaposs-canonical added a commit that referenced this issue Apr 12, 2024
See #230

Implemented in a minimal, hacky way to proceed with testing. Will be implemented fully (i.e. unnecessary code removed) by @Mehdi-Bendriss in another PR
carlcsaposs-canonical added a commit that referenced this issue Apr 12, 2024
See #230

Implemented in a minimal, hacky way to proceed with testing. Will be implemented fully (i.e. unnecessary code removed) by @Mehdi-Bendriss in another PR
@carlcsaposs-canonical
Copy link
Contributor Author

236ff8f in #211 contains initial fix for this issue

Complete fix (that removes unnecessary code) planned to be implemented by @Mehdi-Bendriss in a separate PR

carlcsaposs-canonical added a commit that referenced this issue Apr 15, 2024
See #230

Implemented in a minimal, hacky way to proceed with testing. Will be implemented fully (i.e. unnecessary code removed) by @Mehdi-Bendriss in another PR
carlcsaposs-canonical added a commit that referenced this issue Apr 17, 2024
See #230

Implemented in a minimal, hacky way to proceed with testing. Will be implemented fully (i.e. unnecessary code removed) by @Mehdi-Bendriss in another PR
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant