Reduce unnecessary CACHE_HITRATES updates in gossip #5971
Comments
@gleb-cloudius What do you think?
POC is here https://github.com/haaawk/scylla/commits/5200-v1
The link is dead. I want to revive this work. The extra unnecessary traffic is fine for a cluster in good shape, but when some of the nodes or shards are loaded, such messages and the handling of them can make the system even busier.
Ping. It is not urgent. OK, I saw you assigned someone.
@asias can you pick this up?
Can we use table ids in the state string instead of table names? Table names can be arbitrarily long, whereas ids are UUIDs with a fixed size.
This patch avoids unnecessary CACHE_HITRATES updates through gossip.

After this patch, publish CACHE_HITRATES in case:
- We haven't published it at all
- The diff is big enough and we haven't published in the last 5 seconds

Note: A peer node can learn the cache hitrate through the read_data, read_mutation_data and read_digest RPC verbs, which have cache_temperature in the response. So there is no need to update CACHE_HITRATES through gossip at high frequency.

We do the recalculation faster if the diff is bigger than 0.01. It is useful to do the calculation even if we do not publish the CACHE_HITRATES through gossip, since the recalculation will call table->set_global_cache_hit_rate to set the hitrate.

Fixes scylladb#5971
I've sent a PR here: #11079
This patch avoids unnecessary CACHE_HITRATES updates through gossip.

After this patch, publish CACHE_HITRATES in case:
- We haven't published it at all
- The diff is bigger than 1% and we haven't published in the last 5 seconds
- The diff is really big (10%)

Note: A peer node can learn the cache hitrate through the read_data, read_mutation_data and read_digest RPC verbs, which have cache_temperature in the response. So there is no need to update CACHE_HITRATES through gossip at high frequency.

We do the recalculation faster if the diff is bigger than 0.01. It is useful to do the calculation even if we do not publish the CACHE_HITRATES through gossip, since the recalculation will call table->set_global_cache_hit_rate to set the hitrate.

Fixes scylladb#5971
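
The publish conditions listed in the patch description above can be sketched roughly as follows. This is only an illustration, not the code from the PR: the `hitrate_publisher` type, its member names, and the constants are hypothetical, it tracks a single hit rate rather than one per table, and it assumes the 1% and 10% thresholds are absolute differences (0.01 and 0.10) compared with a strict "greater than".

```cpp
#include <chrono>
#include <cmath>
#include <optional>

// Hypothetical constants taken from the patch description:
// 1% and 10% diffs, and a 5 second minimum interval between publishes.
constexpr double small_diff = 0.01;
constexpr double large_diff = 0.10;
constexpr std::chrono::seconds min_publish_interval{5};

// Hypothetical helper: decides whether a new hit rate is worth gossiping.
struct hitrate_publisher {
    using clock = std::chrono::steady_clock;

    std::optional<double> last_published_rate; // empty until the first publish
    clock::time_point last_published_at{};

    bool should_publish(double current_rate, clock::time_point now) const {
        // Case 1: nothing has been published yet.
        if (!last_published_rate) {
            return true;
        }
        const double diff = std::fabs(current_rate - *last_published_rate);
        // Case 3: the diff is really big, publish regardless of timing.
        if (diff > large_diff) {
            return true;
        }
        // Case 2: a moderate diff, rate limited to one publish per interval.
        return diff > small_diff && (now - last_published_at) >= min_publish_interval;
    }

    // Record that current_rate was sent out at time now.
    void mark_published(double current_rate, clock::time_point now) {
        last_published_rate = current_rate;
        last_published_at = now;
    }
};
```

A caller would run the recalculation on every tick (so the local hit rate stays current), then call `should_publish` and, only if it returns true, send the gossip update and call `mark_published`.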
The size of the CACHE_HITRATES message is O(n), where n is the number of tables, so the message can be very big.
We update CACHE_HITRATES unconditionally and periodically, even if the values have not changed. I am wondering if we could avoid sending CACHE_HITRATES when it has not changed since the last update.
The downside of such updates: