Don't replicate token metadata when tokens don't change #2869

tgrabiec · 2017-10-05T21:58:45Z

When applying endpoint state, a significant amount of CPU time is spent copying token_metadata.

We do it inefficiently, for each application state change, but we probably only have to do it when tokens change.
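A minimal sketch of the idea, not Scylla's actual code: `token_metadata_sketch`, `sharded_state`, `update_tokens` and this `on_change` signature are hypothetical stand-ins. The only point is that the expensive per-shard copy is skipped unless the token set really changed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for token_metadata: endpoint -> owned tokens.
struct token_metadata_sketch {
    std::map<std::string, std::set<int64_t>> tokens_by_endpoint;

    // Returns true only if the stored tokens for the endpoint actually changed.
    bool update_tokens(const std::string& endpoint, std::set<int64_t> tokens) {
        auto it = tokens_by_endpoint.find(endpoint);
        if (it != tokens_by_endpoint.end() && it->second == tokens) {
            return false;  // same tokens as before: nothing to do
        }
        tokens_by_endpoint[endpoint] = std::move(tokens);
        return true;
    }
};

// Hypothetical stand-in for per-shard state: one metadata copy per shard.
struct sharded_state {
    token_metadata_sketch master;
    std::vector<token_metadata_sketch> shard_copies;
    std::size_t replications = 0;  // counts how often the expensive copy ran

    explicit sharded_state(std::size_t shards) : shard_copies(shards) {
        assert(shards > 0);
    }

    void replicate_to_all_shards() {
        for (auto& copy : shard_copies) {
            copy = master;  // full copy: the cost this issue wants to avoid
        }
        ++replications;
    }

    // Called for every application state change. `new_tokens` is null for
    // changes that carry no tokens (for example a hit-rate update).
    void on_change(const std::string& endpoint, const std::set<int64_t>* new_tokens) {
        bool modified = false;
        if (new_tokens) {
            modified = master.update_tokens(endpoint, *new_tokens);
        }
        if (modified) {
            replicate_to_all_shards();
        }
    }
};
```

In this sketch a state change without tokens, or one that repeats the tokens already stored, leaves `replications` untouched.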

tgrabiec · 2017-10-05T22:00:47Z

Saw apply_state_locally() wall time cut in half when applying this.

tgrabiec · 2017-10-10T12:52:40Z

This could cause a latency regression in large clusters in 2.0 due to the now-frequent updates of CACHE_HITRATES. Changes in heartbeat don't trigger this.

gleb-cloudius · 2017-10-10T12:57:19Z

On Tue, Oct 10, 2017 at 05:52:49AM -0700, Tomasz Grabiec wrote:
> This could cause latency regression in large clusters in 2.0 due to now frequent updates of CACHE_HITRATES. Changes in heartbeat don't trigger this.

I think this is the case. A 50-node cluster test started to fail, and the failure was bisected to the patch that adds CACHE_HITRATES to the gossiper state.

…

-- Gleb.

avikivity · 2017-10-10T14:36:19Z

So, it was a mistake to add CACHE_HITRATES to gossip. It was based on my wrong assumption that gossip will move node A's hitrates to node B via intermediary node C, without A talking to C directly.

Perhaps we should just drop it, and rely on piggy-backs and resets on node restarts.

gleb-cloudius · 2017-10-10T14:39:12Z

On Tue, Oct 10, 2017 at 02:36:34PM +0000, Avi Kivity wrote:
> So, it was a mistake to add CACHE_HITRATES to gossip. It was based on my wrong assumption that gossip will move node A's hitrates to node B via intermediary node C, without A talking to C directly.
>
> Perhaps we should just drop it, and rely on piggy-backs and resets on node restarts.

It will cause initial spikes after a restart, because the rebooted node will think that all other nodes have a zero cache hit rate and will send all its traffic to itself before it learns otherwise.

…

-- Gleb.

tgrabiec · 2017-10-10T14:42:53Z

10.10.2017 4:36 PM "Avi Kivity" <notifications@github.com> wrote:
> So, it was a mistake to add CACHE_HITRATES to gossip. It was based on my wrong assumption that gossip will move node A's hitrates to node B via intermediary node C, without A talking to C directly.

It could do that; why do you think it's not the case? Nodes exchange information about all other nodes in each exchange.

> Perhaps we should just drop it, and rely on piggy-backs and resets on node restarts.
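A toy model of that point: if each exchange carries state about every endpoint a node knows, A's state can reach B through C without A and B ever talking. This is a deliberately simplified sketch, not the gossiper's real protocol (no digests, no message phases); `node`, `versioned_value`, `merge_newer` and `exchange` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical versioned application state for one endpoint.
struct versioned_value {
    uint64_t version = 0;
    double hit_rate = 0.0;
};

struct node {
    std::string name;
    std::map<std::string, versioned_value> known;  // endpoint -> newest state seen

    // Publish a new value for this node's own state, bumping its version.
    void set_own(double hit_rate) {
        versioned_value& v = known[name];
        ++v.version;
        v.hit_rate = hit_rate;
    }
};

// Copy every entry from `from` into `to` where `from` holds a newer version.
inline void merge_newer(const node& from, node& to) {
    for (const auto& entry : from.known) {
        auto it = to.known.find(entry.first);
        if (it == to.known.end() || it->second.version < entry.second.version) {
            to.known[entry.first] = entry.second;
        }
    }
}

// One exchange: both sides end up with the newest version of every
// endpoint that either of them knew about.
inline void exchange(node& x, node& y) {
    assert(&x != &y);  // a node does not gossip with itself
    merge_newer(x, y);
    merge_newer(y, x);
}
```

With nodes A, B and C, calling `exchange(a, c)` and then `exchange(c, b)` leaves B holding A's value even though A and B never exchanged directly.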

avikivity · 2017-10-10T14:57:43Z

@tgrabiec I asked and was told it was point-to-point, but maybe there was a miscommunication.

avikivity · 2017-10-10T14:59:01Z

> It will cause initial spikes after restart because rebooted node will think that all other nodes have zero cache hit rate and will send all its traffic to itself before it learns otherwise.

@gleb-cloudius perhaps when we first learn about a node, we can ask it about hitrates, before we declare ourselves ready.

gleb-cloudius · 2017-10-10T15:04:23Z

On Tue, Oct 10, 2017 at 02:59:20PM +0000, Avi Kivity wrote:
> > It will cause initial spikes after restart because rebooted node will think that all other nodes have zero cache hit rate and will send all its traffic to itself before it learns otherwise.
>
> @gleb-cloudius perhaps when we first learn about a node, we can ask it about hitrates, before we declare ourselves ready.

Isn't this just a round of gossip of a sort? But less efficient, since the gossiper is not really point-to-point after all. Why not just fix the bug here instead, and not replicate token metadata when it did not change?

…

-- Gleb.

avikivity · 2017-10-10T15:26:13Z

Agree that gossip fixes are better.

btw, will gossip hitrate override piggyback hitrate? Because gossip hitrate is likely to be much older.

gleb-cloudius · 2017-10-10T15:28:42Z

On Tue, Oct 10, 2017 at 03:26:23PM +0000, Avi Kivity wrote:
> Agree that gossip fixes are better.
>
> btw, will gossip hitrate override piggyback hitrate? Because gossip hitrate is likely to be much older.

No, it will not. It is used only once when there is no info about hitrate from other sources.
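One way to read that rule, as a hedged sketch rather than the actual implementation: the gossiped value only seeds the estimate while nothing else is known, and a piggy-backed value always replaces it. `peer_hit_rate`, `on_piggyback` and `on_gossip` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical per-peer cache hit rate estimate.
struct peer_hit_rate {
    bool known = false;
    double rate = 0.0;

    // Value piggy-backed on a response from the peer: fresh, so it always wins.
    void on_piggyback(double r) {
        assert(r >= 0.0 && r <= 1.0);
        rate = r;
        known = true;
    }

    // Value learned through gossip: possibly stale, so it is only used
    // when there is no information from any other source yet.
    void on_gossip(double r) {
        assert(r >= 0.0 && r <= 1.0);
        if (!known) {
            rate = r;
            known = true;
        }
    }
};
```

Under this reading a restarted node gets a non-zero starting point from gossip, and later gossip updates never overwrite a fresher piggy-backed value.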

…

-- Gleb.

tgrabiec · 2017-10-10T16:05:15Z

2017-10-10 14:57 GMT+02:00 Gleb Natapov <notifications@github.com>:

> On Tue, Oct 10, 2017 at 05:52:49AM -0700, Tomasz Grabiec wrote:
> > This could cause latency regression in large clusters in 2.0 due to now frequent updates of CACHE_HITRATES. Changes in heartbeat don't trigger this.
>
> I think this is the case. 50 node cluster test started to fail and the failure was bisected to the patch that adds CACHE_HITRATES to gossiper state.

What kind of failure are you seeing?

gleb-cloudius · 2017-10-11T06:38:13Z

On Tue, Oct 10, 2017 at 09:05:25AM -0700, Tomasz Grabiec wrote:
> 2017-10-10 14:57 GMT+02:00 Gleb Natapov ***@***.***>:
> > On Tue, Oct 10, 2017 at 05:52:49AM -0700, Tomasz Grabiec wrote:
> > > This could cause latency regression in large clusters in 2.0 due to now frequent updates of CACHE_HITRATES. Changes in heartbeat don't trigger this.
> >
> > I think this is the case. 50 node cluster test started to fail and the failure was bisected to the patch that adds CACHE_HITRATES to gossiper state.
>
> What kind of failure are you seeing?

Not me, Shlomi. The kind you are debugging: missing rows after adding a node, because streaming was not done to a node.

…

-- Gleb.

tgrabiec added the symptom/performance Issues causing performance problems label Oct 5, 2017

tgrabiec changed the title ~~Don't replicate token metadata when tokens don~~ Don't replicate token metadata when tokens don't change Oct 5, 2017

tgrabiec added the scale/large cluster label Oct 5, 2017

tgrabiec self-assigned this Oct 6, 2017

tgrabiec added this to the 2.0 milestone Oct 6, 2017

tgrabiec added the area/gossip label Oct 16, 2017

tzach added the high label Oct 21, 2017

slivne assigned elcallio and unassigned tgrabiec Oct 29, 2017

avikivity closed this as completed in 8c257c4 Nov 1, 2017

avikivity pushed a commit that referenced this issue Nov 7, 2017

storage_service: Only replicate token metadata iff modified in on_change

59aae50

Fixes #2869 Message-Id: <20171101105629.22104-1-calle@scylladb.com> (cherry picked from commit 8c257c4)
