Non-zero shards may have stale values of gossiper application states of a restarted node for some time #3798

Closed
tgrabiec opened this Issue Oct 2, 2018 · 3 comments

tgrabiec commented Oct 2, 2018

Installation details
Scylla version: 2.1+
Cluster size: >1

Application states of each node are versioned per node by a pair: a generation number (more significant) and a value version. The generation number uniquely identifies the lifetime of a Scylla process and changes after every restart. Value versions start from 0 on each restart. When a node receives updates for another node's application states, it merges them into its view of that node, ignoring value updates with older versions.
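The ordering described above can be sketched as a lexicographic comparison on the (generation, version) pair. This is only an illustrative model, not Scylla's gossiper code: the `Versioned` type, `is_newer`, and `merge` are hypothetical names, and treating equal pairs as "not newer" is an assumption.

```python
from typing import Dict, NamedTuple


class Versioned(NamedTuple):
    """One application state value (hypothetical model, not a Scylla type)."""
    generation: int  # more significant: changes on every process restart
    version: int     # less significant: starts from 0 after each restart
    value: str


def is_newer(incoming: Versioned, current: Versioned) -> bool:
    # Tuples compare lexicographically, so generation dominates version.
    return (incoming.generation, incoming.version) > (current.generation, current.version)


def merge(view: Dict[str, Versioned], key: str, incoming: Versioned) -> bool:
    """Apply an update to a node's view; return True if it was accepted."""
    current = view.get(key)
    if current is None or is_newer(incoming, current):
        view[key] = incoming
        return True
    return False  # older (or equal) updates are ignored


view: Dict[str, Versioned] = {}
accepted_first = merge(view, "STATUS", Versioned(1, 40, "NORMAL"))
accepted_stale = merge(view, "STATUS", Versioned(1, 12, "shutdown"))
accepted_restart = merge(view, "STATUS", Versioned(2, 0, "NORMAL"))
```

In this model the last update wins even though its version (0) is lower than 40, because it carries a newer generation.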

The gossiper processes updates only on shard 0 and replicates value updates to the other shards. When shard 0 sees a value with a new generation, it correctly forgets all previous values. Non-zero shards, however, don't forget values from previous generations. As a result, after the generation number changes, replication fails to override the values on non-zero shards until the new value version exceeds the version held from before the restart.
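A minimal sketch of the difference between the two kinds of shard, under the assumption that the non-zero shards effectively compare value versions only. `ReplicaView` and its `track_generation` flag are hypothetical and exist just to contrast the two behaviors; the key name and version numbers are made up.

```python
class ReplicaView:
    """One shard's copy of a remote node's application states (illustrative)."""

    def __init__(self, track_generation: bool):
        self.track_generation = track_generation
        self.generation = 0
        self.values = {}  # key -> (version, value)

    def apply(self, generation: int, key: str, version: int, value: str) -> bool:
        if self.track_generation:
            if generation < self.generation:
                return False  # update from an older process lifetime
            if generation > self.generation:
                # New process lifetime: forget everything from the old one.
                self.generation = generation
                self.values.clear()
        current = self.values.get(key)
        if current is not None and version <= current[0]:
            return False  # looks older by version alone, so it is ignored
        self.values[key] = (version, value)
        return True


shard0 = ReplicaView(track_generation=True)       # behaves like shard 0
other_shard = ReplicaView(track_generation=False)  # behaves like a non-zero shard

for shard in (shard0, other_shard):
    shard.apply(1, "LOAD", 500, "before restart")
    shard.apply(2, "LOAD", 3, "after restart")

# The non-zero shard only catches up once the new version exceeds 500.
caught_up = other_shard.apply(2, "LOAD", 501, "much later")
```

After the loop, `shard0` holds the post-restart value while `other_shard` still holds the value at version 500, which is the stale window the issue describes.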

Introduced in 2d5fb9d.

@tgrabiec tgrabiec added the bug label Oct 2, 2018

@tgrabiec tgrabiec self-assigned this Oct 2, 2018

@tzach tzach added the high label Oct 3, 2018

tgrabiec commented Oct 3, 2018

One of the possible effects is that heat-weighted load-balancing will work with stale values of hit ratio for restarted replicas.

tgrabiec commented Oct 3, 2018

Another manifestation of this issue is that non-seed nodes have their own STATUS set to shutdown on non-zero shards after being restarted.

When a non-seed node restarts, it does a shadow gossip round before setting its STATUS to NORMAL. In the shadow round it learns about itself from other nodes, and sets its STATUS to shutdown on all shards with a high value version. Later, when it sets its STATUS to NORMAL, the new value takes effect only on shard 0, because on the other shards the existing version of STATUS is higher.

Note that other nodes will correctly see the restarted node's STATUS as NORMAL on all shards, because the change to shutdown is not replicated to non-zero shards, due to the effects mentioned in the issue description.

The problem doesn't happen when restarting a seed, because it doesn't do a shadow gossip round.
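The sequence in this comment can be walked through with a toy model of the restarted node's own shards. Everything here is an assumption made for illustration: the shard count, the version numbers 300 and 2, and the helper names `replicate` and `set_local`. The only behavior taken from the description is that shard 0 takes the new value while the other shards accept an update only when its version is higher.

```python
NUM_SHARDS = 4  # hypothetical shard count

# Each shard's view of the node's own STATUS, stored as (version, value).
own_status = [None] * NUM_SHARDS


def replicate(shard: int, version: int, value: str) -> None:
    """Version-only replication to a non-zero shard (the behavior at fault)."""
    current = own_status[shard]
    if current is None or version > current[0]:
        own_status[shard] = (version, value)


def set_local(version: int, value: str) -> None:
    """Shard 0 takes the value directly, then replicates it to the others."""
    own_status[0] = (version, value)
    for shard in range(1, NUM_SHARDS):
        replicate(shard, version, value)


# Shadow round: peers report the pre-restart STATUS with a high version.
set_local(version=300, value="shutdown")
# The node then sets NORMAL with a low version from its new generation.
set_local(version=2, value="NORMAL")

statuses = [value for _, value in own_status]
```

In this model `statuses` ends up as NORMAL on shard 0 and shutdown on every other shard, matching the symptom described above. A seed never performs the first `set_local` call, so its shards start empty and all accept NORMAL.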

@slivne slivne added this to the 3.0 milestone Oct 4, 2018

duarten added a commit that referenced this issue Oct 8, 2018

Merge 'Fix issues with endpoint state replication to other shards' from Tomasz

Fixes #3798
Fixes #3694

Tests:

  unit(release), dtest([new] cql_tests.py:TruncateTester.truncate_after_restart_test)

* tag 'fix-gossip-shard-replication-v1' of github.com:tgrabiec/scylla:
  gms/gossiper: Replicate enpoint states in add_saved_endpoint()
  gms/gossiper: Make reset_endpoint_state_map() have effect on all shards
  gms/gossiper: Replicate STATUS change from mark_as_shutdown() to other shards
  gms/gossiper: Always override states from older generations

avikivity added a commit that referenced this issue Oct 9, 2018

(cherry picked from commit 48ebe65)

tgrabiec added a commit that referenced this issue Oct 17, 2018

(cherry picked from commit 48ebe65)

tzach commented Oct 29, 2018

@tgrabiec @slivne we need to backport this fix to 2018.1
