
Fix for v3.5 Ensure that cluster members stored in v2store and backend are in sync #13348

Merged: 1 commit into etcd-io:release-3.5 on Sep 25, 2021

Conversation

@serathius (Member) commented Sep 14, 2021

As discussed in #13196 there is a chance that between v3.1 and v3.4 the state of bbolt and v2store diverged. This only becomes noticeable after upgrading to v3.5, as the authoritative storage was changed from v2store to bbolt.

This PR is a direct fix for this problem in v3.5. It is not a cherry-pick from master, as v3.6 plans to remove v2store entirely. The current fix is designed to work assuming that users run v3.5.1 before upgrading to v3.6. This approach was picked to avoid delaying the removal of v2store in v3.6 (discussed in more detail on the original issue).

What this PR does:

  • AddMember/RemoveMember operations execute on both storeV2 and the backend before returning. If the storages have diverged and storeV2 has a member that is not present in the backend, re-adding this member will succeed (mirrored for removal). Adding an existing member only fails if it is present in both storages (again mirrored for removal). As writing to two different storages is not transactional, they might diverge during runtime; this design allows users to repeat the Add/Remove Member call to bring the storages back in sync, which should be enough to fix the problem since member changes are infrequent (see the sketch after this list).
  • When etcd bootstraps, RaftCluster Recover will sync the contents of the backend to match v2store. This operation is always correct and should be cheap, as the number of cluster members is usually around 3-5 and historically should not grow beyond tens. This should fix the problem of zombie members appearing after the upgrade to v3.5. I resigned from implementing this, as during Recover it is expected that v2store and the backend are out of sync: Recover is called during member bootstrap before the WAL is applied, and storeV2 and the backend are persisted at different times (storeV2 only after a snapshot, the backend every 5 seconds).
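A minimal sketch of the tolerant Add behaviour described in the first bullet; the `fakeStore` type and helper names are illustrative stand-ins, not the actual etcd membership API:

```go
package main

import (
	"errors"
	"fmt"
)

// ErrIDExists mirrors the "member already exists" failure mode.
var ErrIDExists = errors.New("member ID already exists")

// fakeStore stands in for one of the two member storages (v2store or the
// bbolt backend); the real storages are of course very different types.
type fakeStore map[uint64]string

// addMember fails only when the member is present in *both* storages, and
// otherwise writes to whichever storage is missing it, so repeating the call
// converges v2store and the backend instead of getting stuck.
func addMember(v2store, backend fakeStore, id uint64, name string) error {
	_, inStore := v2store[id]
	_, inBackend := backend[id]
	if inStore && inBackend {
		return ErrIDExists
	}
	if !inStore {
		v2store[id] = name
	}
	if !inBackend {
		backend[id] = name
	}
	return nil
}

func main() {
	v2, be := fakeStore{1: "infra1"}, fakeStore{} // diverged: member 1 missing in backend
	fmt.Println(addMember(v2, be, 1, "infra1"))   // <nil>: re-adding repairs the backend
	fmt.Println(addMember(v2, be, 1, "infra1"))   // member ID already exists
}
```

The removal path mirrors this: it only fails when the member is missing from both storages, so repeating the command is always safe.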

cc @ptabor

tx.UnsafePut(buckets.MembersRemoved, mkey, []byte("removed"))
if !unsafeMemberExists(tx, mkey) {
Contributor

Can we document this logic? So, we don't delete the mkey from the Members bucket, and expect that mkey does not exist?

Contributor

I think that deletion of mkey was moved to line 84. It's just not needed if it's not found.

But +1 to adding documentation to this and the other methods about which errors are reported.
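For reference, a rough sketch of how the behaviour discussed here could be documented in code; the bucket names, the `backendTx` interface and the sentinel error are assumptions for illustration, not the actual etcd backend API:

```go
package sketch

import "errors"

var errMemberNotFound = errors.New("member not found in Members bucket")

// backendTx is a minimal stand-in for the backend write transaction.
type backendTx interface {
	UnsafePut(bucket string, key, value []byte)
	UnsafeDelete(bucket string, key []byte)
	UnsafeGet(bucket string, key []byte) []byte
}

// deleteMemberFromBackend always records the removal in the MembersRemoved
// bucket, but deletes the Members entry only if it is actually present; its
// absence is reported to the caller rather than deleting blindly.
func deleteMemberFromBackend(tx backendTx, mkey []byte) error {
	tx.UnsafePut("members_removed", mkey, []byte("removed"))
	if tx.UnsafeGet("members", mkey) == nil {
		return errMemberNotFound
	}
	tx.UnsafeDelete("members", mkey)
	return nil
}
```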

@serathius (Member, Author)
Found one issue when running integration tests. It looks like the current implementation breaks the TestV3HashRestart test. This test starts etcd, reads the DB hash, restarts etcd, reads the hash again and checks that there is no difference. I found that the current implementation, with v2store overriding the db members, breaks this assumption.
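Roughly, the test shape being described (the `fakeEtcd` harness here is a placeholder, not the actual integration-test API): any write the server performs to the backend purely as part of restarting shows up as a hash mismatch.

```go
package sketch

import (
	"bytes"
	"testing"
)

// fakeEtcd is a placeholder for the integration-test cluster; in the real test
// the hash is computed over the bbolt database.
type fakeEtcd struct{ backend []byte }

func (e *fakeEtcd) BackendHash() []byte { return append([]byte(nil), e.backend...) }
func (e *fakeEtcd) Restart()            { /* a restart must not rewrite member data */ }

// TestHashUnchangedAfterRestart mirrors the assertion in TestV3HashRestart:
// restart the server and verify the backend hash is identical, so a member
// sync that rewrites the backend during Recover makes this check fail.
func TestHashUnchangedAfterRestart(t *testing.T) {
	e := &fakeEtcd{backend: []byte("db-contents")}
	before := e.BackendHash()
	e.Restart()
	if !bytes.Equal(before, e.BackendHash()) {
		t.Fatal("backend hash changed across restart")
	}
}
```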

@serathius (Member, Author)

Decided to drop syncing during Recover call. Updated PR description to provide the reasoning.

@ptabor (Contributor) commented Sep 17, 2021

How about syncing StoreV2 -> backend just before the storeV2 snapshot operation?

@serathius (Member, Author)
By default a snapshot happens only every 10000 WAL entries. I'm worried it might complicate and obfuscate things even more. I'm a little scared about using a StoreV2 -> backend sync. It's an internal change that is not visible in the WAL and not tied to any consistent index value. This means it has user-visible impact (it changes the hash of the data) without user interaction.

The currently proposed fix, limited to the AddMember/RemoveMember commands, will not resolve corrupted state automatically (it requires the user to notice and delete/add the members themselves). However, it gives the user full visibility into what is happening. Would love to get some opinion from @gyuho or @hexfusion.

@ptabor (Contributor) commented Sep 25, 2021

Thank you. Merging as the discussion is related to a potential next step.

I think we should have either an automated 'fix' process or a fix process that requires issuing a single command to bring the stores into sync (independently of the number of out-of-sync members).

@ptabor ptabor merged commit 4312298 into etcd-io:release-3.5 Sep 25, 2021
@serathius mentioned this pull request Sep 29, 2021
hasbro17 pushed a commit to hasbro17/etcd that referenced this pull request Feb 2, 2022
Fix for v3.5 Ensure that cluster members stored in v2store and backend are in sync
@serathius deleted the sync branch June 15, 2023 20:39
serathius added a commit to serathius/etcd that referenced this pull request Sep 26, 2023
… on v2 store.

Migration off the v2 store was unfinished during v3.5 development (etcd-io#12914), leaving RaftCluster broken and half migrated.
This was found in v3.5 and fixed in etcd-io#13348; however, it was not patched on the main branch.

This PR makes v3.6 RaftCluster consistent with v3.5 so we can cleanly
migrate off v2 store.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>