This repository has been archived by the owner on Aug 23, 2023. It is now read-only.

Chunkspan wrapping seconds should be shifted across metrics #450

Closed
replay opened this issue Jan 6, 2017 · 5 comments

replay (Contributor) commented Jan 6, 2017

With the merge of the chunk cache we can hopefully remove a lot of requests from Cassandra by serving them out of memory.
Currently, the chunkspans of all metrics are the same, which means that all metrics persist their current chunk and allocate a new one at the same second. Assuming a query pattern where most queries can be served out of the chunk cache, this means that once per chunkspan there is a second at which all queries hit Cassandra to request the newly persisted chunks and add them to the cache. This will likely lead to a spike in the query rate on Cassandra and, in the worst case, even to timeouts.
I don't think it would be a good idea to push chunks into the cache at the time they get persisted to Cassandra, because this would likely add pressure to evict hot data from the cache in exchange for something that might never get queried.
If we could shift the second at which chunks get wrapped and distribute that wrapping second of the different metrics across time, then we could get rid of the load spike on Cassandra. For example, we could hash the metric name into a number and use hash % chunkspan as a per-metric offset of the wrapping second.
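
For illustration, here is a minimal Go sketch of that shifting idea, using hypothetical names rather than Metrictank's actual chunk code: the metric key is hashed into a stable offset within the chunkspan, so different metrics wrap their chunks at different seconds.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// wrapOffset derives a deterministic offset in [0, chunkspan) from the metric key,
// so the set of metrics spreads its chunk boundaries across the whole chunkspan.
func wrapOffset(metricKey string, chunkspan uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(metricKey))
	return h.Sum32() % chunkspan
}

// chunkStart returns the start of the chunk that ts falls into, with the wrap
// boundary shifted by the metric's offset instead of aligned to 0.
// Assumes ts >= offset, which holds for any realistic unix timestamp.
func chunkStart(ts, chunkspan, offset uint32) uint32 {
	return (ts-offset)/chunkspan*chunkspan + offset
}

func main() {
	chunkspan := uint32(600) // e.g. 10-minute chunks
	for _, key := range []string{"some.metric.a", "some.metric.b"} {
		off := wrapOffset(key, chunkspan)
		fmt.Printf("%s: offset=%d chunkStart(1000000)=%d\n",
			key, off, chunkStart(1000000, chunkspan, off))
	}
}
```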

woodsaj (Member) commented Jan 6, 2017

When persisting a chunk to Cassandra, can we just check if the last chunk is in the cache, and if so add this chunk as well? Essentially pre-filling the cache with data we expect to be queried.

So if a request for the last 1 hour of data is being executed periodically, we can just continually feed new chunks into the cache.

replay (Contributor, Author) commented Jan 6, 2017

Ok, that's possible as well. Then I'd add a method to the cache interface, AddIfHot(*chunk) or something like that, which takes a chunk and only adds it to the cache if the previous chunk is there; otherwise it drops it.
Then I'll just pass every chunk into it at the time it gets evicted from the ring buffer.
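
For illustration, a rough sketch of what such an AddIfHot could look like, with hypothetical types and method names rather than the actual Metrictank cache interface: a chunk is only admitted if its predecessor for the same metric is already cached, i.e. the series is currently being queried.

```go
package cache

// Chunk is a simplified stand-in for a persisted chunk (hypothetical type).
type Chunk struct {
	Metric string // metric key
	T0     uint32 // chunk start timestamp
	Span   uint32 // chunkspan in seconds
	Data   []byte // encoded chunk data
}

// CachedChunks holds cached chunks per metric, keyed by their start timestamp.
type CachedChunks struct {
	chunks map[string]map[uint32]*Chunk
}

func NewCachedChunks() *CachedChunks {
	return &CachedChunks{chunks: make(map[string]map[uint32]*Chunk)}
}

// Has reports whether the chunk of the given metric starting at t0 is cached.
func (c *CachedChunks) Has(metric string, t0 uint32) bool {
	_, ok := c.chunks[metric][t0]
	return ok
}

// Add unconditionally caches the chunk.
func (c *CachedChunks) Add(ch *Chunk) {
	if c.chunks[ch.Metric] == nil {
		c.chunks[ch.Metric] = make(map[uint32]*Chunk)
	}
	c.chunks[ch.Metric][ch.T0] = ch
}

// AddIfHot caches the chunk only if its predecessor (T0 - Span) is already
// cached, i.e. the metric is "hot"; otherwise the chunk is dropped.
func (c *CachedChunks) AddIfHot(ch *Chunk) {
	if c.Has(ch.Metric, ch.T0-ch.Span) {
		c.Add(ch)
	}
}
```

The persist/eviction path would then call AddIfHot for every chunk it writes out, which gives the pre-filling behaviour described above without pushing cold data into the cache.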

Dieterbe (Contributor) commented Jan 9, 2017

When persisting a chunk to Cassandra, can we just check if the last chunk is in the cache, and if so add this chunk as well? Essentially pre-filling the cache with data we expect to be queried.

+1 I also like this approach; it would solve this problem nicely. Let's first confirm that we actually see the problem manifest in deployments, and then implement cache.AddIfHot().

However, there might be other merits to the OP idea. It would address #357 (comment) (though we're still not sure if this is a problem that needs solving), and #99.
My main concern about the shifting is that it makes the system harder to reason about and understand. Personally I find it hard enough already to reason about what happens to chunks, chunk saves, synchronisation between nodes, etc. That's why I would hold off, but we can revisit this once we iron out some other things, especially things that simplify operation.

replay (Contributor, Author) commented Jan 9, 2017

I'll create a new ticket about adding new chunks to the cache if the metric is hot, and give that higher priority than this one.

stale bot commented Apr 4, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the stale label on Apr 4, 2020
stale bot closed this as completed on Apr 11, 2020