
Proposal: expose `/api/v1/write` endpoint for remote_write storage API #4769

Open
valyala opened this Issue Oct 22, 2018 · 11 comments

valyala commented Oct 22, 2018

Proposal

Sometimes it is useful to have a single big Prometheus instance holding all the data from other Prometheus instances located in different networks / datacenters.

Solution

The most straightforward solution for this use case is to expose an /api/v1/write endpoint for the remote_write storage API on the big Prometheus instance, so the other instances could write to it using the standard remote storage protocol.
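For illustration, the leaf side would then need nothing beyond an ordinary remote_write block pointing at the proposed endpoint. A minimal sketch (the receiving /api/v1/write endpoint is exactly what this issue proposes and does not exist in Prometheus today; the hostname is a placeholder):

```yaml
# prometheus.yml on each leaf instance (sketch only).
global:
  external_labels:
    datacenter: dc1   # lets the central instance tell the leaves apart
remote_write:
  # central-prometheus.example.com is a placeholder for the big instance;
  # /api/v1/write is the receiver proposed in this issue.
  - url: "https://central-prometheus.example.com/api/v1/write"
```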

semyonslepov commented Oct 31, 2018

Have you tried the Prometheus federation mechanism for this purpose yet?
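For context: federation means the central Prometheus scrapes the /federate endpoint of each leaf for a selected set of series. A typical scrape config on the central server (hostnames are placeholders) looks roughly like this:

```yaml
scrape_configs:
  - job_name: "federate"
    scrape_interval: 15s
    honor_labels: true          # keep the labels as exposed by the leaves
    metrics_path: "/federate"
    params:
      "match[]":                # which series to pull from each leaf
        - '{job="prometheus"}'
        - '{__name__=~"job:.*"}'
    static_configs:
      - targets:
          - "leaf-prometheus-dc1:9090"   # placeholder leaf addresses
          - "leaf-prometheus-dc2:9090"
```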

valyala commented Nov 1, 2018

> Have you tried the Prometheus federation mechanism for this purpose yet?

Federation doesn't fit, since it requires access from the top Prometheus instance to all the remote leaf Prometheus instances in the different networks / datacenters (which may be behind NATs / firewalls with varying configs). This is harder to operate compared to the case where the leaf Prometheus instances write directly to the top Prometheus instance via the standard remote write mechanism; then it is enough to configure access to a single endpoint on the top Prometheus instance for all the leaf instances.

semyonslepov commented Nov 1, 2018

I'm not sure anyone is going to allow direct writes into the Prometheus TSDB by anything other than the Prometheus server running on the same machine, and it seems like quite a bad idea to me too.
What might work as a workaround: set up all the Prometheus servers to write via remote write to local TSDBs (InfluxDB, CrateDB, whatever), then set up replication of the data from those "small" TSDB instances into a "single big" one. The "single big" Prometheus would then have access to all that data in the "single big" TSDB via remote read.
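A rough sketch of that wiring, assuming hypothetical remote read/write adapter URLs (Prometheus talks to external TSDBs through such adapters; replicating the data from the small TSDBs into the big one is TSDB-specific and not shown):

```yaml
# prometheus.yml on each leaf: ship samples to the local TSDB
# through its remote-write adapter (hostname/port are placeholders).
remote_write:
  - url: "http://local-tsdb-adapter:9201/write"

# prometheus.yml on the "single big" Prometheus: read the consolidated
# data back from the big TSDB through its remote-read adapter.
remote_read:
  - url: "http://big-tsdb-adapter:9201/read"
```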

valyala commented Dec 6, 2018

There is another use case for an exposed remote write API: seamless integration of the push model via an external adapter service that can collect metrics in various popular formats, such as InfluxDB's line protocol, the various Graphite formats, and formats based on message queues.

juliusv commented Dec 7, 2018

Hi, this is basically a request for adding push support to Prometheus, which has come up a lot in the past, and which we've been very careful about so far.

First, a bit of background on why push support hasn't been added to Prometheus so far: Prometheus is primarily a pull-based monitoring system (which just happens to include a TSDB, but we don't see Prometheus as a TSDB first and foremost), and several of its core features assume that the server is in control of scraping metrics and attaching timestamps at its own configured pace. For example, recording and alerting rules are evaluated based on the server's notion of the current time, so the underlying metrics must arrive in lockstep with rule evaluations, which is best ensured with a pull model. Service discovery integration (with target metadata attachment etc.) and automatic target health monitoring are other features that rely on the pull model. So if we added a push endpoint and advertised it loudly, we would worry that many users who don't know 100% what they are doing would shoot themselves in the foot, thinking that Prometheus now supports "push".

However, I started a rough discussion doc a long while back about pros/cons of adding push support to Prometheus: https://docs.google.com/document/d/1H47v7WfyKkSLMrR8_iku6u9VB73WrVzBHb2SB6dL9_g/edit#heading=h.2v27snv0lsur

The consensus at the last dev summit was to add an item to the Prometheus roadmap to only allow backfilling of entire time series (not appending individual samples like via remote write): https://prometheus.io/docs/introduction/roadmap/#backfill-time-series

And yeah, in the originally described case, federation is the usually adopted architecture, which does require making individual Prometheus servers reachable from the central one.

juliusv commented Dec 7, 2018

> And yeah, in the originally described case, federation is the usually adopted architecture, which does require making individual Prometheus servers reachable from the central one.

I need to qualify that comment: federation is not recommended for transferring all data from one Prometheus server to another, as the source TSDB is not optimized for that, nor is it resource-efficient to do single humongous scrapes. It's meant for transferring a select set of aggregated series, in the thousands rather than in the millions. So unless the data that needs to be federated into the global Prometheus server is small-ish, there is indeed no great solution for this at the moment with Prometheus. Mind you, remote write is super inefficient resource-wise too, though.

Maybe you are instead looking for something like Thanos, which gives you a global view and long-term storage while also being efficient at transferring large amounts of data (much more so than remote write or federation).

bwplotka commented Dec 7, 2018

To sum up: if you use Prometheus to push its metrics via remote write to another Prometheus (not possible today, but requested in this issue) or to some long-term storage, this is still the pull model, because what matters is how the data was collected, right? (:

So if I understand this right, the main blocker for adding this is to avoid abuse, i.e. pushing arbitrary samples from other, non-pull collectors? (which is a fair point)

I'm asking because we are considering adding a remote write receiver endpoint to the Thanos system (improbable-eng/thanos#659), and we need to educate users on which to choose: sidecar + query for fresh data, or remote write.

juliusv commented Dec 7, 2018

@bwplotka This depends on your perspective. From the receiving Prometheus's perspective, the collection happens as a push, so for that server it is not a pull model. It cannot reliably do the things that Prometheus normally does with the data (attach target metadata from its own service discovery, compute rules with the confidence that timestamps will arrive in lockstep with its current time, know what data should be coming in, etc.). You are effectively treating it mainly as a TSDB then, not as a monitoring system first. For Thanos this might be fine, as Thanos aims to mainly be a long-term storage TSDB and not a monitoring system. For Prometheus, the opinion of the devs so far has been not to go in this direction.

bwplotka commented Dec 7, 2018

> You are effectively treating it mainly as a TSDB then, not as a monitoring system first.

Yup, totally agree.

valyala commented Dec 7, 2018

> Mind you, remote write is super inefficient resource-wise too, though.

Why? In my experience it works quite well for a Prometheus instance scraping 50K-100K samples/sec. Perhaps it needs to be optimized somehow for a Prometheus instance scraping millions of samples per second?

juliusv commented Dec 9, 2018

@valyala On the sending side, remote write totally blows up memory usage (or at least it used to; I'm not sure about the current state of optimization), because normally sample ingestion is a highly memory-tuned process that avoids allocations as much as possible, appends samples into the TSDB using internal TSDB series IDs when possible (rather than storing full label sets for every sample), etc. For remote write, every sample has to be re-encoded, buffered, and then sent over the wire in its fully expanded form, in protobuf format, in near real time (instead of in larger compressed batches). In contrast, Thanos just ships completed (and very well-compressed) on-disk blocks, which requires much less memory and fewer other resources.

I'm actually glad to hear that remote write still works well for you at 50K-100K samples/sec. I'd be curious how much memory the same Prometheus server would use if you turned remote write off.
