Add support for local storage engine that only supports federation queries #2174
Comments
If you're not using chunks to store data, then where are you storing it? A low retention period should do what you want, though it's best to push down your monitoring rather than introducing additional races and artifacts by using federation for everything. It sounds like what you want is a proxy server.
I think he is referring to no series files. All chunks that are not yet purged are kept in the checkpoint. This makes sense for servers with low retention. If you know you can keep everything in RAM, there is no need to ever write to series files. It should be fairly easy to implement.
onorua commented Dec 3, 2016
I second this proposal. I was really amused by how federation in Prometheus works, or rather doesn't work.
Federation is not intended to dump entire servers. It's intended to provide a limited number of aggregated stats to a higher-level Prometheus server.
This is correct, and it's why we advise against sharding unless you really have to. The vast majority of users do not have the thousands of machines in a single DC that would justify horizontal sharding.
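(For reference, a minimal sketch of what that higher-level scrape typically looks like; the target address and the `job:` prefix convention for aggregated recording-rule series are assumptions, not anything specific to this thread.)

```yaml
# Sketch: master-side scrape config that federates only aggregated
# recording-rule series from a lower-level Prometheus.
scrape_configs:
  - job_name: 'federate-dc1'
    honor_labels: true
    metrics_path: '/federate'
    params:
      'match[]':
        - '{__name__=~"job:.*"}'   # aggregated series only
    static_configs:
      - targets: ['prometheus-dc1:9090']   # assumed address
```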
I'd also find this useful. We have a use case where we need to aggregate very high-cardinality metrics and federate the aggregated metrics to another Prometheus instance. I was planning on using a short retention period, but this would be better.
In that sort of case I'd still advise keeping the data around in a Prometheus for a while, so you can debug with the raw data. If you never want to use the high-cardinality metrics, then you should remove the unwanted labels from instrumentation rather than adding a Prometheus to do so.
onorua commented Dec 4, 2016
We've got more than 3.2 million metrics from fewer than 400 servers, which was enough to use 32 GB of RAM and 12 cores and to hit throttling every 2 hours. I had to use federation just to scale beyond 500 nodes. We are currently testing Prometheus on our test cluster; a single node is good enough for small scale (<300 nodes), but not at all if you need to scale horizontally.
Makes sense.
ncabatoff added a commit to ncabatoff/prometheus that referenced this issue Dec 4, 2016
Prometheus is first and foremost not a horizontally scaling system, and trying to treat it as one will inevitably leave you disappointed. Are these 3.2 million metrics from all sorts of different applications running on these servers too? It's totally sane to have, for example, three Prometheus servers: one for node exporters, one for frontend services, one for databases. There's a chance we'll figure something out in the future once remote storage comes into play, but it likely won't remove the need to functionally shard your instances.
onorua commented Dec 5, 2016
I'm absolutely fine with the sharding idea. I did shard our nodes by node_exporter/cadvisor/our_own_exporter, but beyond its theoretical simplicity there is operational and maintenance overhead. Let me elaborate.

Right now we have 700 nodes managed by one Prometheus server (the cAdvisor fraction), which is more or less the limit for the current 32 GB RAM server. We want to replace all our 8 GB RAM compute nodes with ones four times smaller, which means we will have around 3200 nodes at peak and around 2000 off-peak. So the only way to handle this number of metrics and nodes is sharding. But I can't shard by purpose or metric type; most probably I'll shard by hash, because otherwise you are treating each Prometheus as a "pet", each with a different config that you can't automate.

So you end up with something like 6 Prometheus servers to handle this traffic solely for node_exporter, and you double that for HA of your data. Then you do maintenance, a release upgrade, a kernel upgrade, whatever, and you see a slight increase in CPU usage. The first thing you do is go to your Prometheus server and dig as deep as you can. But since you did hash-based sharding, you have no clue where to go, so you go to each and every node and run the query. Then you discover that one of the nodes was restarting and some data was not scraped during that time, so you must go to its twin to get it. Basically, to do any kind of analysis in this case you may need to do 12 times more work than if it were one server.

There is no magic in what I describe; the kind of federation we have now imposes additional overhead on the operations team, which basically forced the DO guys to reimplement Prometheus.
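(For context, a minimal sketch of the hash-based sharding described above, using Prometheus's hashmod relabel action; the shard count of 6, the shard number, and the target addresses are assumptions.)

```yaml
# Slave-side config: keep only the targets whose address hashes to this
# slave's shard number (shard 2 of 6 in this sketch).
scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets: ['node1:9100', 'node2:9100']   # full target list via SD in practice
    relabel_configs:
      - source_labels: [__address__]
        modulus: 6
        target_label: __tmp_hash
        action: hashmod
      - source_labels: [__tmp_hash]
        regex: '2'
        action: keep
```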
The trick is to federate one non-aggregated metric per host, carrying a label that indicates the slave, up into the master Prometheus, and use that to find the slave Prometheus you want.
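(A rough sketch of that trick, with all names and addresses assumed: each slave advertises itself via an external label, and the master federates the per-host `up` series alongside the aggregates, so a query on the master reveals which slave scrapes a given host.)

```yaml
# On each slave: identify the slave via an external label.
global:
  external_labels:
    slave: 'prom-slave-03'
```

```yaml
# On the master: federate aggregated series plus per-host `up`.
scrape_configs:
  - job_name: 'federate-slaves'
    honor_labels: true
    metrics_path: '/federate'
    params:
      'match[]':
        - 'up{job="node"}'          # one cheap series per host, carries the slave label
        - '{__name__=~"job:.*"}'    # aggregated recording rules
    static_configs:
      - targets: ['prom-slave-01:9090', 'prom-slave-02:9090', 'prom-slave-03:9090']
```

Querying, say, `up{instance="host42:9100"}` on the master then shows the `slave` label of the Prometheus to drill into.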
onorua commented Dec 6, 2016
@brian-brazil thanks for the suggestion! This should work in a static environment. I have trouble applying it to a dynamic environment, though.
Changing the number of slaves should be quite rare, every few years maybe. That doesn't affect the trick I proposed though.
That's what the master node is for, to do aggregation.
Nothing changes, it'll just work.
onorua commented Dec 6, 2016
With all due respect, why do you think that changing the number of nodes is, or should be, rare? I've tripled the number of nodes within the past 2 weeks, and this is just the beginning; we are not even close to the required capacity. We will quadruple the number of nodes because we are moving to micro-nodes and micro-services, which means that every couple of weeks I'll add a new Prometheus node.
Completely agree with you! That is the whole point of the ticket: let us use the master node for federation (hiding the internal topology) and for deep node analysis simultaneously, or at least transparently, and everybody will be happy.
You should be capacity planning so that it's rare. On the (large) systems I've worked on previously we doubled every 2-3 years. Adding nodes to a setup like this is disruptive; you want to keep it rare.
That's not what this issue is requesting. Using what this issue requests to address scaling issues won't help, as you're still pulling all the data into the master.
brian-brazil added the wont-fix label Jul 14, 2017
A short retention period with the new storage covers this.
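(Roughly what that looks like with the 2.x TSDB; flag spelling shifts slightly between 2.x releases, so treat this as a sketch rather than the definitive invocation.)

```sh
# Federation-only style instance on Prometheus 2.x: keep very little data.
prometheus \
  --config.file=prometheus.yml \
  --storage.tsdb.path=data/ \
  --storage.tsdb.retention=2h
```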
brian-brazil closed this Jul 14, 2017
lock bot commented Mar 23, 2019
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
ncabatoff commented Nov 8, 2016
The goal of this enhancement proposal is to reduce the resource usage for Prometheus instances that live only to collect metrics and answer federation queries. The use case is to work around networking issues by using this Prometheus instance as something like a "pull gateway", collecting metrics on the local host and allowing a downstream Prometheus to federate them over a single port, without having to expose all the ports of the underlying metric sources.
The proposal is to add a "federation" option for -storage.local.engine, which would lie in between "persisted" and "none" in terms of functionality. In this mode, indexes would be created as usual and checkpoints performed, but no sample chunks would ever be created. Running in this mode would result in a low-impact "federation" Prometheus instance that can collect metrics locally without consuming much RAM or disk IO, and can expose all the metrics on a single port via the federation API (which doesn't rely on chunk storage.)
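(For illustration, this is roughly how the proposed mode would be selected; "persisted" and "none" are the existing 1.x engine values, while "federation" is the hypothetical value this issue asks for.)

```sh
# Hypothetical: the "federation" engine value does not exist today.
prometheus \
  -config.file=prometheus.yml \
  -storage.local.engine=federation
```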
One option that already exists is to run with a very small retention period, but with a large number of metrics this makes series maintenance very busy, and it seems wasteful to spend CPU time encoding chunks when we're really only interested in the last sample value.
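(That existing workaround, sketched under the assumption of the 1.x flag names; the retention value is an arbitrary example.)

```sh
# Keep the normal persisted engine but trim retention aggressively.
prometheus \
  -config.file=prometheus.yml \
  -storage.local.retention=1h
```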
The firewall issue could instead be addressed via a proxy without involving a Prometheus at all, i.e. allow a single process to proxy metric fetching requests to all the underlying metric sources. The advantage of using Prometheus is that it simplifies some kinds of service discovery in that you don't need to have a global view of the entire system. In my case it's easier to learn what metric sources exist locally.