
Remote storage ordering #1931

Merged
merged 3 commits into prometheus:master from 1843-remote-storage-ordering on Aug 30, 2016

Conversation

tomwilkie
Member

By splitting the single queue into multiple queues and flushing each individual queue serially (and all queues in parallel), we can guarantee that the order of timestamps in samples sent to downstream systems is preserved.

Also, rationalise the metrics exported by the queue manager.

Fixes #1843
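
For readers skimming the thread, here is a minimal, self-contained sketch of the idea described above. It is an illustration only, not the PR's actual code: the names are made up, there is no batching or error handling, and samples are sent one at a time.

```go
// Sketch: samples for the same series always hash to the same shard, each
// shard has exactly one goroutine that sends serially (preserving per-series
// timestamp order), and the shards flush in parallel.
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

type sample struct {
	series    string
	timestamp int64
	value     float64
}

type shardedQueue struct {
	shards []chan sample
	wg     sync.WaitGroup
}

func newShardedQueue(numShards, capacity int, send func(shard int, s sample)) *shardedQueue {
	q := &shardedQueue{shards: make([]chan sample, numShards)}
	for i := range q.shards {
		q.shards[i] = make(chan sample, capacity)
		q.wg.Add(1)
		go func(i int) {
			defer q.wg.Done()
			// One serial sender per shard: samples leave in the order
			// they were enqueued.
			for s := range q.shards[i] {
				send(i, s)
			}
		}(i)
	}
	return q
}

func (q *shardedQueue) append(s sample) {
	h := fnv.New32a()
	h.Write([]byte(s.series))
	// Same series -> same shard -> same serial sender.
	q.shards[int(h.Sum32())%len(q.shards)] <- s
}

func (q *shardedQueue) stop() {
	for _, ch := range q.shards {
		close(ch)
	}
	q.wg.Wait()
}

func main() {
	q := newShardedQueue(4, 100, func(shard int, s sample) {
		fmt.Printf("shard %d: %s @%d = %g\n", shard, s.series, s.timestamp, s.value)
	})
	for t := int64(0); t < 3; t++ {
		q.append(sample{series: `up{instance="a"}`, timestamp: t, value: 1})
		q.append(sample{series: `up{instance="b"}`, timestamp: t, value: 0})
	}
	q.stop()
}
```

Because a series always lands on the same shard and each shard sends strictly in enqueue order, per-series timestamps reach the downstream system in order while the shards themselves run in parallel.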

@juliusv
Member

juliusv commented Aug 29, 2016

👍 like in the downstream copy of this (tomwilkie#72 (comment)).

I agree that it seems like @brian-brazil's concerns are orthogonal to the parallelization/ordering you're doing here, but would like him to confirm.

@tomwilkie
Member Author

@brian-brazil said:
To put numbers on that, for a 100k samples/s Prometheus the chosen constants require an RTT of <10ms to work.

True, but how is that any different to the current code in master?

The current code is considered experimental, and this is a known issue.

I see; what's the issue number?

My aim was not to address all existing problems with this code in this PR, although I'd love to discuss / implement solutions to the problems you describe in the not-too-distant future, especially as we're likely to be the ones that hit them!

I fully intend to allow the parameterisation of this in the config file in the future.

I think it should be dynamic, especially as we have two users building distributed Prometheus systems where the Prometheus itself wouldn't be able to alert on the problem.

I don't follow; why can't the 'retrieval' prometheus alert on this problem? The metrics expose dropped samples and remote latencies (I also happened to clean up the metrics in this PR - mind taking a look?)

@brian-brazil
Contributor

I see; what's the issue number?

There isn't one, but it's a known problem. If we were just talking long-term storage, some flags would be okay, but that's not the case with distributed storage.

I don't follow; why can't the 'retrieval' prometheus alert on this problem?

In the Frankenstein architecture there's nothing scraping the Prometheus having the problem.

@tomwilkie
Member Author

There isn't one, but it's a known problem. If we were just talking long-term storage, some flags would be okay, but that's not the case with distributed storage.

Why so?

I'm not against making this dynamic per se, but I don't see it as a blocker for this PR - this is no worse than what exists, and I've factored out the config so we can make it configurable in the next few days. I'd be happy to make it configurable in this PR, and then, by running a bunch of these, we can learn how to make them more dynamic where appropriate.

In the Frankenstein architecture there's nothing scraping the Prometheus having the problem.

Interestingly, we have the retrieval scrapers set up to scrape themselves, so we could indeed detect this problem upstream / service side. And we're also planning to implement alerts for when we don't see any incoming samples for some duration. I think we're now straying off topic for this PR, though...

@brian-brazil
Contributor

Why so?

With long-term storage it doesn't matter if the long-term storage doesn't work, as you still have all the data in your local Prometheus and almost all your alerts and graphs will be working off that. With distributed storage, if all the data can't get to/from the remote storage, you're toast.
Depending on humans to keep a flag updated to prevent their monitoring silently breaking is not likely to work out too well.

Interestingly, we have the retrieval scrapers set up to scrape themselves, so we could indeed detect this problem upstream / service side

Only if the data is making it to the distributed storage.

And we're also planning to implement alerts for when we don't see any incoming samples for some duration.

This would present as some samples coming through, but not all. Practically speaking it'd top out at some upper bound determined by RTT, so gradual growth could get you into this situation.
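
To make the "upper bound determined by RTT" concrete, here is a back-of-the-envelope version of that ceiling. The constants (10 shards, 100 samples per send) are assumptions for this sketch, not necessarily the values chosen in this PR; with them, the ceiling comes out at the 100k samples/s for a <10ms RTT quoted earlier in the thread.

```go
// Throughput ceiling of serial senders: each shard completes at most one
// send per round trip, so the best case is shards * samplesPerSend / RTT.
package main

import (
	"fmt"
	"time"
)

func main() {
	const (
		shards         = 10  // illustrative assumption
		samplesPerSend = 100 // illustrative assumption
	)
	rtt := 10 * time.Millisecond

	ceiling := float64(shards*samplesPerSend) / rtt.Seconds()
	fmt.Printf("ceiling ≈ %.0f samples/s\n", ceiling) // prints ≈ 100000
}
```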

@juliusv
Member

juliusv commented Aug 29, 2016

@brian-brazil Can we agree that this is an orthogonal problem which can be fixed independently and that this PR doesn't materially make anything worse with regards to that?

@brian-brazil
Contributor

Okay, but this needs to be handled automatically.

@tomwilkie
Member Author

Agreed, filed #1933 for the follow up.

BatchSendDeadline time.Duration // Maximum time sample will wait in buffer.
}

var defaultConfig = StorageQueueManagerConfig{
Member


One last nit: if something is called defaultConfig, I would expect the constructor to take a pointer to a config and, if it's nil, to use the default config automatically, rather than having every caller pass it in.

Member Author

@tomwilkie tomwilkie Aug 30, 2016


👍 done, PTAL
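
For context, the constructor pattern suggested in the nit above might look roughly like the sketch below. The field names mirror the config excerpt shown in the diff; the other names and the default values are placeholders, not the PR's actual code.

```go
package main

import (
	"fmt"
	"time"
)

type StorageQueueManagerConfig struct {
	QueueCapacity     int           // Samples each shard's queue can hold (placeholder).
	Shards            int           // Number of parallel senders (placeholder).
	MaxSamplesPerSend int           // Samples per remote write (placeholder).
	BatchSendDeadline time.Duration // Maximum time a sample will wait in buffer.
}

// Placeholder defaults for the sketch, not the constants chosen in the PR.
var defaultConfig = StorageQueueManagerConfig{
	QueueCapacity:     10000,
	Shards:            10,
	MaxSamplesPerSend: 100,
	BatchSendDeadline: 5 * time.Second,
}

type StorageQueueManager struct {
	cfg StorageQueueManagerConfig
}

// NewStorageQueueManager falls back to defaultConfig when cfg is nil, so
// callers no longer have to pass the default in explicitly.
func NewStorageQueueManager(cfg *StorageQueueManagerConfig) *StorageQueueManager {
	if cfg == nil {
		cfg = &defaultConfig
	}
	return &StorageQueueManager{cfg: *cfg}
}

func main() {
	m := NewStorageQueueManager(nil) // nil means "use the defaults"
	fmt.Printf("using config: %+v\n", m.cfg)
}
```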

tomwilkie and others added 3 commits August 30, 2016 17:42
By splitting the single queue into multiple queues and flushing each individual queue serially (and all queues in parallel), we can guarantee that the order of timestamps in samples sent to downstream systems is preserved.
…n both samples and batches, in a consistent fashion.

Also, report total queue capacity of all queues, i.e. capacity * shards.
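
As an illustration of that last commit message, reporting the total capacity as capacity * shards with client_golang could look something like this. The metric name, help string, and function are assumptions for the sketch, not necessarily what this PR does.

```go
package main

import "github.com/prometheus/client_golang/prometheus"

var queueCapacity = prometheus.NewGauge(prometheus.GaugeOpts{
	Namespace: "prometheus",
	Subsystem: "remote_storage",
	Name:      "queue_capacity",
	Help:      "Total capacity of the sample queues, summed over all shards.",
})

func init() {
	prometheus.MustRegister(queueCapacity)
}

// reportCapacity publishes per-shard capacity times the number of shards,
// i.e. the total number of samples the queues can hold.
func reportCapacity(perShardCapacity, shards int) {
	queueCapacity.Set(float64(perShardCapacity * shards))
}

func main() {
	reportCapacity(10240, 10)
}
```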
@juliusv merged commit 1c27189 into prometheus:master on Aug 30, 2016
@juliusv deleted the 1843-remote-storage-ordering branch on August 30, 2016 at 15:49