[Feature] Validate decision table for reconciliation as mentioned in ETCD design doc #354

Closed
Tracked by #107
abdasgupta opened this issue Jun 6, 2022 · 3 comments
Labels: kind/enhancement (Enhancement, improvement, extension), release/ga (Planned for GA (General Availability) release of the Feature), status/closed (Issue is closed, either delivered or triaged)

Comments

@abdasgupta
Contributor

Feature (What you would like to be added):
Reconciliation for multi-node clusters has already been enhanced in ETCD Druid (here). We now have to validate that implementation against the decision table in the ETCD design doc.

This issue tracks the validation.

Motivation (Why is this needed?):

Approach/Hint to implement the solution (optional):

@abdasgupta abdasgupta added the kind/enhancement Enhancement, improvement, extension label Jun 6, 2022
@abdasgupta abdasgupta added this to the 2022-Q3 milestone Jun 6, 2022
@abdasgupta abdasgupta modified the milestones: 2022-Q3, v0.12.0 Jul 4, 2022
@ashwani2k ashwani2k added the release/ga Planned for GA(General Availability) release of the Feature label Jul 6, 2022
@ashwani2k ashwani2k modified the milestones: v0.12.0, v0.13.0 Jul 12, 2022
@ishan16696 ishan16696 assigned ishan16696 and aaronfern and unassigned ishan16696 Aug 9, 2022
@aaronfern
Contributor

aaronfern commented Aug 18, 2022

After validating the decision table for reconciliation, here are my findings:

| Sr No | Event | Status | Comments |
|---|---|---|---|
| 1 | Pink of health | Complete | |
| 2 | Member status is out of sync with their leases | Complete | |
| 3 | All members are Ready but AllMembersReady condition is stale | Complete | |
| 4 | Not all members are Ready but AllMembersReady condition is stale | Complete | |
| 5 | Majority members are Ready but Ready condition is stale | Complete | |
| 6 | Majority members are NotReady but Ready condition is stale | Complete | |
| 7 | Some members have been in Unknown status for a while | Complete | |
| 8 | Some member pods are not Ready but have not had the chance to update their status | Complete | The proposal asks for a PodNotReady status; a ContainerNotReady status is used instead. |
| 9 | Quorate cluster with a minority of members NotReady | Complete | |
| 10 | Quorum lost with a majority of members NotReady | Under development | |
| 11 | Scale up of a healthy cluster | Feature complete | Feature is ready from etcd druid; not yet enabled from g/g perspective. |
| 12 | Scale down of a healthy cluster | Out of scope | No use cases of scale-down right now. |
| 13 | Superfluous member entries in Etcd status | Not required | Clusters are not expected to dynamically change sizes; single member restoration already takes care of removing a member from the cluster if that is deemed necessary; superfluous member deletion is already developed in etcd-backup-restore and can be enabled via a flag if deemed necessary in the future. |

tl;dr:
Etcd druid reconciliation is generally working as expected, with the following caveats:

  • Scale-down is deemed out of scope.
  • Quorum loss handling is still under development.
  • Scale-up is feature ready, but is yet to be enabled on g/g.
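
For anyone reading rows 9 and 10 of the table, the difference between "quorate with a minority NotReady" and "quorum lost" comes down to a strict-majority check over member statuses. Below is a minimal Python sketch of that check. It is only an illustration: the names `MemberStatus`, `quorum_size`, and `is_quorate` are hypothetical and are not taken from etcd-druid, and it assumes that only `Ready` members count towards quorum.

```python
from enum import Enum
from typing import Sequence


class MemberStatus(Enum):
    # Status names as used in the decision table; the enum itself is hypothetical.
    READY = "Ready"
    NOT_READY = "NotReady"
    UNKNOWN = "Unknown"


def quorum_size(cluster_size: int) -> int:
    """Smallest number of members that forms a strict majority."""
    return cluster_size // 2 + 1


def is_quorate(statuses: Sequence[MemberStatus]) -> bool:
    """True if enough members are Ready to form a majority.

    Assumption: NotReady and Unknown members do not count towards quorum.
    """
    ready = sum(1 for status in statuses if status is MemberStatus.READY)
    return ready >= quorum_size(len(statuses))
```

With three members, one NotReady member leaves the cluster quorate (row 9), while two NotReady members means quorum is lost (row 10).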

@abdasgupta
Contributor Author

Should we close the issue as we are done with validating the decision table?

@abdasgupta
Contributor Author

We have already validated recovery from transient quorum loss (see #436). A non-quorate cluster needs human intervention: the human operator decides how to recover it, and we will provide a playbook for guidance. Please follow #437 for more details. As the scope of this issue is complete, I am closing it.

@gardener-robot gardener-robot added the status/closed Issue is closed (either delivered or triaged) label Sep 21, 2022