Ability to "revive" archived series #1976
Comments
you'd also need to increment the LastUpdate field, otherwise it would still be considered stale and subsequently pruned again.
A big challenge here is the re-indexing. Trying to resync a live index with what's in the persistent index seems problematic.
(A similar new "message type" has come up before. Not sure if we documented this anywhere, but we were discussing at one point how a metric delete api call is undone when an instance restarts and consumes data for that metric that was submitted before the delete call was triggered. If we were to do deletes via kafka messages, it would make sure the instances always correctly execute them, even if they were temporarily down at the time of the rest api call, or had to restart.)
Is it though? IMO long write locks are the only danger. If we limit the scope to just "Add missing" (i.e. not a full diff) we can call
This is on my near/mid-term roadmap, so I can take the implementation in a month or two if we settle on the details.
Maybe I'm missing something, but... if we were to do this via a tool that hits metrictank's api endpoint, then I envision the tool would first add the entries to the index table and then do RPC calls to all metrictank shards to add those entries to their index. But if an instance is starting up, it may miss the addition to the index table (e.g. if it just finished loading that partition but it's still loading other partitions). Trying to do an RPC call against such an instance to add index entries may be problematic because, at least currently, all index methods are only accepted once the instance is ready (loaded all index partitions and replayed kafka, amongst others). Technically, the new rpc method could bypass that restriction, but that seems like a hacky custom code path that goes against the current design, so I would rather avoid that. Also, if you hit the index while it's initializing, you compete with very long write locks.

So the alternative I propose is to extend https://github.com/grafana/metrictank/blob/master/schema/msg/format.go
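For illustration, a minimal sketch of what such an extension could look like: a new one-byte format discriminator for index control messages, mirroring the format-byte prefix approach of format.go. All names here (`FormatIndexDelete`, `ControlMsg`, `Encode`) are hypothetical, not actual metrictank identifiers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Format mirrors the one-byte discriminator that prefixes kafka messages in
// schema/msg/format.go. Constant names below are illustrative only.
type Format uint8

const (
	FormatMetricDataArrayMsgp Format = iota // stands in for the existing data formats
	FormatIndexDelete                       // hypothetical: payload is a delete query pattern
	FormatIndexRevive                       // hypothetical: payload is definitions to re-add
)

// ControlMsg is a hypothetical envelope for index-maintenance operations.
type ControlMsg struct {
	Query string `json:"query"` // e.g. a name/tag pattern for deletes
}

// Encode prefixes the payload with its format byte; consumers dispatch on it.
func Encode(f Format, m ControlMsg) ([]byte, error) {
	body, err := json.Marshal(m)
	if err != nil {
		return nil, err
	}
	return append([]byte{byte(f)}, body...), nil
}

func main() {
	buf, _ := Encode(FormatIndexDelete, ControlMsg{Query: "some.pattern.*"})
	fmt.Printf("format=%d payload=%s\n", buf[0], buf[1:])
}
```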
Ah, yes, that is a corner case. As you allude to later this is an existing case for deletes. In fact, I believe the delete requests currently fail if the entire cluster isn't healthy.
This would make deletes asynchronous in the request path? Which instance is responsible for actually deleting/archiving the record from cassandra? So long as the client doesn't need to know about kafka, partitions, how to format the mkey, etc., I think that this is reasonable.
true, but it's still better than executing incorrectly
The answer to this doesn't really change whether the delete comes in via REST or via kafka.
It's probably harmless for multiple replicas of the same shard to execute this query redundantly, but it's only needed that 1 replica per shard does it. That's also the recommended setup (write pods that have
For revival: MetricData's get published to kafka and re-ingested by the shards that own them.

For deletion: user submits a query over rest. The query gets published to kafka (whereas revival publishes MetricData's to be re-ingested, deletions are simply the query pattern). MT peers consume the query, execute it against the live index and update the cassandra index accordingly (as described above).
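Sketching the peer side of that flow, under the same hypothetical message format as in the earlier sketch (dispatch on the leading format byte; the handler functions stand in for the real index and ingest paths):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ControlMsg matches the hypothetical envelope from the earlier sketch.
type ControlMsg struct {
	Query string `json:"query"`
}

const (
	formatIndexDelete byte = 1 // must agree with the producer side
	formatIndexRevive byte = 2
)

// handle dispatches one consumed kafka message on its leading format byte.
// deleteFromIndex / reingest stand in for the real index and ingest paths.
func handle(msg []byte, deleteFromIndex func(query string), reingest func(payload []byte)) error {
	if len(msg) == 0 {
		return fmt.Errorf("empty message")
	}
	switch msg[0] {
	case formatIndexDelete:
		var cm ControlMsg
		if err := json.Unmarshal(msg[1:], &cm); err != nil {
			return err
		}
		// execute against the live index, then update the cassandra index rows
		deleteFromIndex(cm.Query)
	case formatIndexRevive:
		// revival payloads are ordinary MetricData, so they take the ingest path
		reingest(msg[1:])
	default:
		return fmt.Errorf("unknown format byte %d", msg[0])
	}
	return nil
}

func main() {
	msg := append([]byte{formatIndexDelete}, []byte(`{"query":"some.pattern.*"}`)...)
	_ = handle(msg,
		func(q string) { fmt.Println("deleting series matching", q) },
		func(p []byte) { fmt.Println("re-ingesting", len(p), "bytes") },
	)
}
```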
Funnily enough, that is how we used to have it configured, but we had to set read pods to true because our write pods are a completely different cluster (so they don't get the delete request at all).
We don't use mt-gateway. I imagine that the revival tool could have the "smarts" to do this. Since it's already crawling the archive index table, it has the partition and id. Really, it just needs the kafka broker/topic information.
Hmm, I thought that there would be a message per series (similar to how kafka-mdm-in works today). Putting the query in opens the window to differing behaviors. For example, to save on memory we prune more aggressively in our write instances than we do in the read instances. That means the write instances might not have the same "view" of the index.
Somehow you still need to execute the query though, who will it be if not your metrictank cluster? I think the proposal works for both the standard setup (same pruning rules on all instances) and yours.
IOW perhaps the real requirement is: whoever has the least aggressive pruning (and thus has the most complete view of the index) is the one who should execute the query.
Right... but my thought would be to process the query synchronously at least. Existing endpoints will likely need this anyway (to return the count of deleted series). Send the request to peers and either:

A) collect matching definitions to one node and produce the kafka messages

This means that so long as any replica is healthy for a shard group, the message gets produced and can be processed later by unhealthy instances when they catch up.
This does introduce write amplification since read nodes are generally run with replicas (we set the |
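A minimal sketch of option (A) above: fan the query out to one healthy replica per shard group, collect the matching definitions on the coordinating node, then produce one kafka message per series. The `Def` shape and the per-shard query function are assumptions for illustration, not metrictank's actual cluster RPC.

```go
package main

import (
	"fmt"
	"sync"
)

// Def is a minimal stand-in for an index entry (MKey + kafka partition).
type Def struct {
	MKey      string
	Partition int32
}

// queryFn stands in for an RPC to one healthy replica of a shard group,
// returning the definitions in that shard that match the query.
type queryFn func(query string) ([]Def, error)

// collect fans a delete/revive query out to one replica per shard group and
// gathers all matching definitions on the coordinating node (option A).
func collect(query string, shards []queryFn) ([]Def, error) {
	var (
		mu   sync.Mutex
		wg   sync.WaitGroup
		defs []Def
		errs []error
	)
	for _, q := range shards {
		wg.Add(1)
		go func(q queryFn) {
			defer wg.Done()
			res, err := q(query)
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				errs = append(errs, err)
				return
			}
			defs = append(defs, res...)
		}(q)
	}
	wg.Wait()
	if len(errs) > 0 {
		return nil, fmt.Errorf("%d shard queries failed", len(errs))
	}
	return defs, nil
}

func main() {
	shard0 := func(q string) ([]Def, error) { return []Def{{"1.abc", 0}}, nil }
	shard1 := func(q string) ([]Def, error) { return []Def{{"1.def", 1}}, nil }
	defs, err := collect("some.pattern.*", []queryFn{shard0, shard1})
	fmt.Println(defs, err) // next step: one kafka message per Def
}
```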
The only thing metrictank needs, at a minimum, is compatibility with Graphite, which has no api to delete metrics. So it sounds then like what we need is an api server that can receive delete and revival requests; in both cases it gets a list of MKeys which can be published to kafka to the proper partitions for consumption by the right metrictank shards. Something like that? (I know deletes are out of scope for this but I find it useful to mention them here as there seems to be some common ground.)
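As a sketch of that last step, assuming each index row carries its partition (as noted above, the revival tool already sees the partition and id when crawling the archive table), the api server could produce directly to that partition. The sarama calls are real library API; the topic name and message body are assumptions.

```go
package main

import (
	"log"

	"github.com/Shopify/sarama"
)

// Entry is a minimal stand-in for a row read from the archive index table:
// the series MKey plus the partition it was originally assigned to.
type Entry struct {
	MKey      string
	Partition int32
}

func main() {
	cfg := sarama.NewConfig()
	cfg.Producer.Return.Successes = true
	// honor the partition stored on each index row rather than re-hashing
	cfg.Producer.Partitioner = sarama.NewManualPartitioner

	producer, err := sarama.NewSyncProducer([]string{"kafka:9092"}, cfg)
	if err != nil {
		log.Fatal(err)
	}
	defer producer.Close()

	entries := []Entry{{MKey: "1.0123abc", Partition: 3}} // illustrative
	for _, e := range entries {
		msg := &sarama.ProducerMessage{
			Topic:     "mdm",                              // assumed topic name
			Partition: e.Partition,                        // same partition the shard consumes
			Value:     sarama.ByteEncoder([]byte(e.MKey)), // placeholder body; real layout TBD
		}
		if _, _, err := producer.SendMessage(msg); err != nil {
			log.Printf("produce failed for %s: %v", e.MKey, err)
		}
	}
}
```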
I think this sounds like a clean solution. It would also make it trivial to put the API/admin server behind authentication, and cleaner to add non-graphite standard endpoints there. Some open questions/notes (don't need answers now):

- What are some use cases for revival? Or parameters to control which series should be revived?
you mean for the api server to make modifications to the persistent index (both the live and archive tables)?
For us, the use case is "Revive series matching this name and these tags". I expect it will be an infrequent operation, so a standalone tool is fine.
Yes. Should the deletes be synchronous or rely on a metrictank instance configured to update the index (e.g. via
I guess that is common enough. I now think a standalone service will be simpler than a service + a cli tool, even if we have to grow the api over time to accommodate some use cases (e.g. "revive series matching this pattern and also another one but not if they match this substring, and only if their lastUpdate is bigger than...")
This is a bit misleading, because "synchronous" would usually imply that when the request returns, the deleted data will no longer show up. This is not true here: deleting data from the persistent store may be synchronous, but queries get served by metrictanks and they would still show the data until they consume the message. "True sync" seems impossible in light of the problems we want to solve (instances being down and/or restarting).

What's the advantage to this? I feel like it may be related to your deployment style, but right now I gotta go and can't re-read what you said on this before. However, I wanted to get some thoughts out for now.
That is one possible definition. But in "eventually consistent" platforms that isn't really true. In this case it's a consistency guarantee, I suppose. The benefit, IMO, is you don't need to know that any cluster members are consuming your control messages and are configured to update cassandra. You know that the table has been cleaned up and will eventually be reflected in running instances. It seems easy enough to start with fully async behavior and add in "consistency" declarations to the API if ever needed. In my case, I don't think I actually need it.
Also, I think that I didn't accurately convey the fact that it wouldn't be an exclusive "or". In my head, all instances configured to update the index would always do so (redundantly, if the API server was also issuing the delete/archive/revive operations to the index table). This is needed to handle small race conditions with the data input. The synchronous design would just be to add a consistency guarantee. As mentioned, we don't need to worry about that for now.
**Is your feature request related to a problem? Please describe.**
We use `mt-index-prune` to keep the `metric_idx` table lean for speedy start up. It would be nice to have a tool that could be run to "revive" series by partition/pattern.

**Describe the solution you'd like**
This would be something that could run in the background similar to `mt-index-prune` itself. Reviving a series would likely entail putting that series back into the `metric_idx` table and somehow triggering a re-index operation in metrictank (to avoid needing a rolling restart).

**Describe alternatives you've considered**
On several occasions we've needed to restore archived series, and we currently use a script that queries cassandra + publishes fake data. Making this a "feature" of metrictank would clean this process up and simplify it a little.
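For context, that workaround amounts to something like the sketch below: read archived series names out of cassandra and publish one fake datapoint per series over metrictank's carbon input, so the instance re-indexes them (and bumps LastUpdate). The keyspace, archive table name, and addresses are assumptions, not the actual script.

```go
package main

import (
	"fmt"
	"log"
	"net"
	"strings"
	"time"

	"github.com/gocql/gocql"
)

func main() {
	// read archived definitions; keyspace and table name are assumptions
	cluster := gocql.NewCluster("cassandra:9042")
	cluster.Keyspace = "metrictank"
	session, err := cluster.CreateSession()
	if err != nil {
		log.Fatal(err)
	}
	defer session.Close()

	// metrictank's carbon listener; one fake datapoint per series makes the
	// instance re-index it (note: plain carbon names only, no tagged series)
	conn, err := net.Dial("tcp", "metrictank:2003")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	var name string
	iter := session.Query(`SELECT name FROM metric_idx_archive`).Iter()
	for iter.Scan(&name) {
		// filter client-side for the series we want to revive
		if !strings.HasPrefix(name, "some.pattern.") {
			continue
		}
		// carbon line protocol: "<name> <value> <timestamp>"
		fmt.Fprintf(conn, "%s 0 %d\n", name, time.Now().Unix())
	}
	if err := iter.Close(); err != nil {
		log.Fatal(err)
	}
}
```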