Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validate conf change using v3 #16084

Closed
wants to merge 1 commit into from
Closed

Validate conf change using v3 #16084

wants to merge 1 commit into from

Conversation

geetasg
Copy link

@geetasg geetasg commented Jun 14, 2023

Related to #12913
Eventually the tests will stop setting v2store.
Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.

Copy link
Author

@geetasg geetasg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following tests also need to be updated in similar manner - to not rely on v2store.
I will update this CR if the refactoring shown in this version looks ok.

TestApplyConfigChangeUpdatesConsistIndex,
TestApplyMultiConfChangeShouldStop,
TestAddMember,
TestRemoveMember,
TestUpdateMember,

@geetasg geetasg mentioned this pull request Jun 15, 2023
21 tasks
@ptabor ptabor self-requested a review June 29, 2023 18:20
@geetasg geetasg marked this pull request as draft June 30, 2023 06:58
@geetasg
Copy link
Author

geetasg commented Jul 4, 2023

why do we need shouldApplyV3 on Validation -
When recovering from wal without a snapshot, the wal replays. The backend can have future info in that case if ci is ahead and cannot be used for validation. This is similar to how AddMember skips backend addition if ci of backend is already ahead.

@serathius serathius changed the title Update tests that exercise ConfChange entries to run without v2store Validate both v2 and v3 configuration Jul 5, 2023
@serathius
Copy link
Member

For me this is a bug fix that should be backported to v3.5. Reason is that v3.5 release was meant to switch etcd source of truth for membership from v2 to v3 store. Looks like we missed this, meaning that v3.5 incorrectly validates only v2 store, instead of validating both.

@serathius
Copy link
Member

cc @ptabor @ahrtr

@geetasg
Copy link
Author

geetasg commented Jul 5, 2023

I will prepare CR to backport to 3.5 after this review completes.

@geetasg geetasg marked this pull request as ready for review July 5, 2023 21:28
@geetasg
Copy link
Author

geetasg commented Aug 24, 2023

rebased and squashed commits

@ahrtr
Copy link
Member

ahrtr commented Aug 28, 2023

For me this is a bug fix that should be backported to v3.5. Reason is that v3.5 release was meant to switch etcd source of truth for membership from v2 to v3 store.

Agreed to backport the change to 3.5. If a member crashes before creating the v2 snapshot, when it starts again, it loads the stale membership data from v2 snapshot. Eventually may run into data inconsistency issue.

@geetasg
Copy link
Author

geetasg commented Aug 29, 2023

Thanks @ahrtr for the review. I prepared - https://docs.google.com/document/d/1RqYbNBrRyMOwbfHIfWa60Aa_PlWdl5sGZyCTuUKfaSk/edit#heading=h.5akmx4nmafa3 - to explain my current understanding of the issue and potential solution. The replay store is to protect against regression of making invalid conf change during wal replay.

Please let me know your thoughts on -

  • Do we need the replay store?
  • The replay store should should be synced up to contain full membership info once the applied index catches up to consistent index?

cc @serathius @ptabor

@ahrtr
Copy link
Member

ahrtr commented Aug 30, 2023

Per my understanding, the validation on the conf cange can be as simple as something like below. The principle is simple: we only validate it only when we are applying the conf change. Note we don't need to validate v2store, because we will eventually remove it in 3.6.

func (c *RaftCluster) ValidateConfigurationChange(cc raftpb.ConfChange, shouldApplyV3 ShouldApplyV3) error {
	if shouldApplyV3 {
		membersMap, removedMap := c.be.MustReadMembersFromBackend()
		if err := c.validateConfigurationChange(cc, membersMap, removedMap); err != nil {
			return err
		}
	} 

	return nil
}

@geetasg
Copy link
Author

geetasg commented Sep 13, 2023

Removed the replay store as per review feedback and discussion in community meeting.
Also updated the PR to not apply past entries to backend
/cc @ahrtr @serathius @ptabor

@geetasg
Copy link
Author

geetasg commented Sep 13, 2023

I will squash the two commits and potentially refactor the tests into a separate PR if the approach is agreed upon.

@ahrtr
Copy link
Member

ahrtr commented Sep 15, 2023

High level looks good to me. Ping all maintainers to take a look.

@geetasg
Copy link
Author

geetasg commented Sep 20, 2023

@serathius @ptabor ptal. thanks

@ahrtr ahrtr marked this pull request as draft September 21, 2023 08:54
@ahrtr
Copy link
Member

ahrtr commented Sep 21, 2023

@geetasg Please tidy up this PR, i.e. removing the commented out source code block. Afterwards, pls mark this PR as "ready for review", then I will take a closer look later. thx

membersMap, removedMap := membersFromStore(c.lg, c.v2store)
var membersMap map[types.ID]*Member
var removedMap map[types.ID]bool
if c.v2store != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should validate only V3, I know that for v3.6 release, v2 store will be removed, however I strongly suggest to drop V2 validation now. if the v3 state is the source of truth, then we should trust V3 validation.

Reason is avoiding the double validation which can cause issues if at any point the configurations have diverged like in #13348

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's true that we only need to verify v3 just as I mentioned in #16084 (comment). But just as I mentioned in one of previous community meetings, we should backport the change to 3.5, so let's keep the v2 validation for now. After the backport is done, we can remove it later. Please also see #16084 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not againt fixing 3.5 with backport in general, but it doesn't change our assumptions...
Someone can rollback from 3.6.x to 3.5.0 and experience consequences of V2 store being stale (but still considered as source of truth). Seems we would need another safeguard that invalidates Store V2 in etcd 3.6 such that rollback to version that 'does not understand' the invalidation is not possible...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

experience consequences of V2 store being stale (but still considered as source of truth)

Should we change the source of truth to V3 store (bbolt) in 3.5? I prefer to YES.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but it doesn't change our assumptions...
Someone can rollback from 3.6.x to 3.5.0 and experience consequences of V2 store being stale (but still considered as source of truth).

Yes, it's true that the v2 store of 3.5 might be stale right after bootstrapping (no matter it's downgraded from 3.6, or just 3.5 restarts), because 3.5 loads membership data from snapshot which might be stale; but I do not see any issue.

  • Eventually the membership data in v2 store will be in sync with v3 store when the replaying WAL entries is done;
  • Note etcd 3.5 isn't able to serve client requests until it finishes publishing the server info to cluster through raft, which must be applied after the replaying legacy WAL entries.

So

@ptabor please kindly let me know if I misunderstood your point or you have any other concern.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahrtr you are missing one important consequence of changing validation. Normal raft proposal (like PUT) can be represented as a state machine without side effect. It means that we can care about only the end state, so assuming that we all members execute same WAL entries, we are good. This however is not true for membership.

Membership "state machine" has side effect in form of ConfState, which is state of truth for raft voting members. When applying the entry we run the validation function against the membership info to decide if it should have an effect on ConfState. This means that we not only care about the state, but also transitions. Every time we replay WAL we need to guarantee that validate behaves the same. That also applies to +1/-1 etcd versions to support upgrades.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In term of "side effect on ConfState", I assume you are talking about server.go#L1989 or server.go#L2001. Pls let me know if we are not on the same page.

Note that I don't think this PR changes the validation logic on membership data. It just applies the validation on v3 store as well. If the validation on v2 store fails (or succeeds) [Of course, we will remove the validation on v2], then the validation on v3 should fails (or succeeds) as well.

Pls be more specific if you still think there is any issue.

Again, just as I mentioned in #16084 (comment), since this is the most important part, so any suggestion on how to improve the test cases is welcome.

@serathius
Copy link
Member

I think review is spinning around in circles, same problems repeated in different context, things misattributed. I see two ways out of it:

  • Write a proper design document
  • Refactor the code that change is easier to understand.

@ahrtr
Copy link
Member

ahrtr commented Sep 22, 2023

I think review is spinning around in circles, same problems repeated in different context, things misattributed.

I am strongly (but politely) against raising vague & abstract comments, especially after a long long period that there wasn't any review comments. Pls raise direct comment on code regarding any technical concern.

@serathius
Copy link
Member

serathius commented Sep 22, 2023

I am strongly (but politely) against raising vague & abstract comments, especially after a long long period that there wasn't any review comments. Pls raise direct comment on code regarding any technical concern.

Asking for design is not vague or abstract, and there is no rush in review. We should prioritize correctness and clarity when making decisions. I unfortunately make the mistake of proposing the current strategy of v2 storage removal without writing down the design.

The mistake is on me, so I will write down the design ASAP. Hope it will bring clarity that will allow everyone to agree on one solution.

@serathius
Copy link
Member

serathius commented Sep 22, 2023

Draft of the design https://docs.google.com/document/d/1Ei0v1uL8F_hmAu3klCrim__IHsF-AwkeVo-2lHM7L-c/edit?usp=sharing

Wrote down the background and my understanding of etcd membership change logic. Didn't have time to fully fresh out the proposal. Happy to continue on the document as it should be much easier to provide and respond to feedback.

@ahrtr
Copy link
Member

ahrtr commented Sep 25, 2023

I don't think this PR is that complicated, although it took me some time to clarify some concerns/comments. Previously etcd (main branch) only validates v2 store when applying conf change, now it validates both v2 and v3 stores [Of course, eventually we will remove v2 validation in 3.6 after backporting the PR to 3.5]. Overall it looks good. But the applying process is the most important workflow, so any technical insights or comments is welcome; and we are happy & open to clarify & discuss.

If anyone has any comment on the overall v2 deprecation task, please let's discuss under the umbrella ticket #12913.

One minor improvement I can think of for now is adding the verification just as I mentioned in #16084 (comment). Any suggestion on how to improve the test & verification is welcome!

With regard the concern for 3.5 of bootstrapping on stale v2 store (no matter it's downgraded from 3.6, or just 3.5 restarts), please see my comment #16084 (comment).

@geetasg
Copy link
Author

geetasg commented Sep 25, 2023

squashed commits and updated to latest

@geetasg geetasg marked this pull request as ready for review September 25, 2023 19:50
@serathius
Copy link
Member

[Of course, eventually we will remove v2 validation in 3.6 after backporting the PR to 3.5]

I'm against backporting to v3, as it doesn't bring any benefits just creates risk. We can remove v2 store in v3.6 without needing any changes to v3.5. Any design that would depend on backporting changes to v3.5 would be flawed, as for upgrade we cannot assume that users upgrade to latest v3.5 before upgrading to v3.6.

@ahrtr
Copy link
Member

ahrtr commented Sep 26, 2023

Any design that would depend on backporting changes to v3.5 would be flawed, as for upgrade we cannot assume that users upgrade to latest v3.5 before upgrading to v3.6.

Note that the PR or design doesn't depend on backporting to 3.5 at all. They are totally independent. No matter we backport the PR or not, it will not block the task of v2 deprecation in 3.6.

I'm against backporting to v3, as it doesn't bring any benefits just creates risk. We can remove v2 store in v3.6 without needing any changes to v3.5.

It's definitely nice to have to validate v3 store as well in 3.5 when applying conf change. Read #16084 (comment) as well. If the validation on v2 store succeeds (or fails), but the validation on v3 store fails (or succeeds), then it means the cluster has issue.

@serathius
Copy link
Member

serathius commented Sep 26, 2023

Me and @ahrtr met and looked at the problem together. New thing we discovered was that there was already an attempt to migrate to v3 store, however it was unfinished #12914. This led to a big problem in v3.5.0 release #13196 and required dedicated patch #13348 that reverted the switch. Unfortunatelly the fix was not backported, so currently v3.6 is in half broken state.

I think everyone agrees on the end goal, migrate etcd v3.6 to v3 store, however there are many ways of doing it. Like this PR that does only part of migration. It switches RaftCluster to v3, however WAL is still replayed from last snapshot.

My main learning from #12914 is fact that efforts can be de-prioritized, forgotten and abandoned. For efforts like migration to v3, we should ensure that any PR we merge is correct and complete on it's own and assume that it could be the last one. As so I would prefer to not change validation without migrating off v2 store. We need to have a proper plan and we need a proper design.

I will send a draft PR on how I expect migration of v2 store should look like.

@serathius
Copy link
Member

#16656 the draft. It's goal is to move etcd bootstrap to be from v3 and not last snapshot. It doesn't touch RaftCluster apart of validation that needs to be switched to v3 now.

@ahrtr
Copy link
Member

ahrtr commented Sep 27, 2023

Actually this PR includes a bug fix and a feature. We should breakdown this PR into two dedicated PRs, so that it's clearer.

Regarding #16656, it's totally an another big topic, [although it's sort of related.], I suggest not to complicate the conversation and discussion for now. We can discuss it separately after we finish the v2 deprecation task.

For the concern of #13348 not being forward ported to 3.6 (main branch), please see #16655 (comment).

I imagine @geetasg is going to finish the v2 deprecation task in the following 3~5 PRs. I agree that It would be better if there is a design documentation, so that everyone is on the same page and no any small task will be forgotten.

than consistent index.

Signed-off-by: Geeta Gharpure <geetagh@amazon.com>
@ahrtr
Copy link
Member

ahrtr commented Nov 27, 2023

Resolved in #17017.

Thank you @geetasg ! And really sorry for the bad experience you had!

@ahrtr ahrtr closed this Nov 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants