Proposal - config component within Thanos #387
Conversation
To be honest, I don't think it's a good idea to re-invent configuration management within Thanos. Everything in terms of correctness, from consistency to reproducibility, is going to be a huge pain. I recommend sticking to the sidecar approach and building a wrapper around the sidecar for more specific or unique uses.
Regarding your tenancy distribution: sure, it's a little overhead, but how bad is it really to have separate sets of Prometheus servers per tenant? (And from experience I can tell you some high-profile customers have strict compliance requirements that will force you to have physically separated tenants.)
In our case we really are only looking to shard targets of a single tenant across multiple Prometheus servers, so if we can handle cutting TSDB chunks directly at shutdown and uploading them to object storage, we would probably be fine with
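For context, the sidecar approach referenced above means pointing the Thanos sidecar at an object storage bucket and letting it upload blocks. A minimal sketch of what that object storage configuration looks like in current Thanos versions (bucket name and endpoint are placeholders, not a recommendation):

```yaml
# Object storage configuration consumed by the Thanos sidecar
# (placeholder values for illustration only).
type: S3
config:
  bucket: "thanos-blocks"
  endpoint: "s3.us-east-1.amazonaws.com"
```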
Some suggestions for the motivation section.
@domgreen I think the main thing here is to clearly separate the TWO goals we want to solve here:
- enable auto-scaling of scrapers to meet dynamic target demands (including scaling down if they are under-utilized)
- keep a tenant's targets together if possible (I don't think we are sure this is a must-have)
Currently, each scraper manages their own configuration via [Prometheus Configuration](https://prometheus.io/docs/prometheus/latest/configuration/configuration/) which contains information about the [scrape_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#%3Cscrape_config%3E) and targets that the scraper will be collecting metrics from.

As we start to dynamically scale the collection of metrics (new Prometheus instances) or increase the targets for a current tenant we wish to keep the collection of metrics to a consistent node and not re-allocate shards to other scrapers.
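For illustration, a minimal configuration of the kind each scraper owns today might look like the following sketch (the job name, target addresses and tenant label are made up for this example):

```yaml
# prometheus.yml owned by a single scraper (illustrative only)
global:
  scrape_interval: 30s

scrape_configs:
  - job_name: 'tenant-a-nodes'          # hypothetical per-tenant job
    static_configs:
      - targets: ['10.0.0.1:9100', '10.0.0.2:9100']
        labels:
          tenant: 'tenant-a'
```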
I would add some mention that this is not always the case and not always needed, for example:
"Most of the time you have a static number of targets, which allows you to use Prometheus hashmod
to shard targets among the available Prometheus instances. Even if they rarely change, you can quite easily add another instance and reshard, which increases time series churn, but that should be fine if not done too often.
However, this does not really work well for:
- very dynamic targets (a couple of hundred targets going up and down every couple of hours)
- multi-tenancy awareness. We might want to limit which scrapers we ask for a certain user's metrics"
I would love to boil our use case down to a very simple form that is easy to imagine. What do you think?
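To make the hashmod approach mentioned above concrete, here is a sketch of the relabelling each instance would carry; the modulus of 2, the job name and the targets are purely illustrative, and every instance uses the same config apart from the regex selecting its own shard:

```yaml
scrape_configs:
  - job_name: 'sharded-targets'              # illustrative job name
    relabel_configs:
      # Hash every target address into one of 2 buckets.
      - source_labels: [__address__]
        modulus: 2
        target_label: __tmp_hash
        action: hashmod
      # Keep only the targets for this instance's shard
      # (shard 0 here; the second instance would use regex: '1').
      - source_labels: [__tmp_hash]
        regex: '0'
        action: keep
    static_configs:
      - targets: ['10.0.0.1:9100', '10.0.0.2:9100', '10.0.0.3:9100']
```

Changing the number of instances means changing the modulus on every instance, which is exactly the resharding (and the resulting time series churn) described above.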
docs/proposals/config.md
Outdated
As we start to dynamically scale the collection of metrics (new Prometheus instances) or increase the targets for a current tenant we wish to keep the collection of metrics to a consistent node and not re-allocate shards to other scrapers.

One key use case would be keeping metrics from a given tenant (customer / team etc.) on a consistant and minimal amount of nodes.
Wonder if we should attack this goal from the very beginning. Basically, we don't really care if every query will ask 10 scrapers or 40 for just 1d of data... Or in other words, we don't know whether that is the issue or not. (:
Can we start with the first goal, supporting a massive number of dynamic targets going up and down in quite a short time?
docs/proposals/config.md
Outdated
An alternative to this is to use the existing [hashmod](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config) functionality within Prometheus to enable [horizontal scaling](https://www.robustperception.io/scaling-and-federating-prometheus/), in this scenario a user would supply their entire configuaration to every Prometheus node and use hashing to scale out.
The main downside of this is that as a Prometheus instance is added or removed from the cluster the targets will be moved to a new scrape instance. This is bad if you are wanting to keep tenants to a minimal number of Prometheus instances.
We have seen this approach also have a problem with hot shards in the past whereby a given Prometheus instance may endup being overloaded and is difficult to redistribute the load onto other Promethus instances in the scrape pool.
This is a very strong argument and IMO should be placed near the top of the motivation section.
I can agree with it, but what if we don't have any other options?
Sure, let's assume we do so. That will meet our goal to keep a tenant in a single known place. But I don't think that helps us with our main problem here. We still want dynamic scaling. A tenant's targets can go up and down quite often and in large numbers. Ideally, we would like to balance the targets equally and create scrapers in advance automatically, then tear them down when some shard is underutilized. Definitely, current
We (and I suspect many who don't run multi-tenant systems) don't have this requirement when it comes to our current Thanos use-case, which does mean it'd be a significant management overhead & source of underutilisation. Worth noting that enabling multi-tenant target distribution on the scrapers doesn't preclude the model of single-tenant target distribution.
What about the alerts? How will alerts move to scrapers? Or will everyone need to move stuff onto
@povilasv not sure how alerts could be affected here. And even so, you should not be worried: our proposal is about adding extra logic that helps with highly dynamic target changes. It will not affect the current flow, especially alerting. It will only be an optional thing. EDIT: @povilasv you are right about alert handling. This will mean the same issue with alerts as with
@povilasv @Bplotka correct, it currently only covers allocating targets to scrapers for a dynamic environment.
docs/proposals/config.md
Outdated
An alternative to this is to use the existing [hashmod](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config) functionality within Prometheus to enable [horizontal scaling](https://www.robustperception.io/scaling-and-federating-prometheus/), in this scenario a user would supply their entire configuaration to every Prometheus node and use hashing to scale out.
The main downside of this is that as a Prometheus instance is added or removed from the cluster the targets will be moved to a new scrape instance. This is bad if you are wanting to keep tenants to a minimal number of Prometheus instances.

### Prometheus & Consistant Hashing
Typo: 'consistent' instead of 'consistant'
docs/proposals/config.md
Outdated
### Prometheus & Consistant Hashing

[Issue 4309](https://github.com/prometheus/prometheus/issues/4309) looks at adding a consistant hashing algorith to Promethus which would allow us to minimise the re-allocation of targets as the scrape pool is scaled.
Typo: algorith -> algorithm
docs/proposals/config.md
Outdated
### Prometheus & Consistant Hashing

[Issue 4309](https://github.com/prometheus/prometheus/issues/4309) looks at adding a consistant hashing algorith to Promethus which would allow us to minimise the re-allocation of targets as the scrape pool is scaled.
Unfortunatley, based on discussions this does not look like it will be accepted and added to the Prometheus codebase.
Typo: Unfortunatley -> Unfortunately
docs/proposals/config.md
Outdated
### Prometheus & Consistant Hashing

[Issue 4309](https://github.com/prometheus/prometheus/issues/4309) looks at adding a consistant hashing algorith to Promethus which would allow us to minimise the re-allocation of targets as the scrape pool is scaled.
Typo: Promethus -> Prometheus
The way I understand this proposal is that you want to solve horizontal scalability of ingestion within Thanos. That in itself seems reasonable (although, as you already described, it should be opt-in), but I don't think it should be mixed with configuration management itself. I think you would be better off leaving configuration management completely up to the user and just solving the distribution of targets across multiple Prometheus servers via file discovery. That way you have a clear separation where the Thanos sidecar performs the consistent hashing (or any other sharding mechanism) to distribute the targets on ingesting Prometheus servers, and configuration stays unopinionated. (fwiw we're building integrated support for Thanos into the Prometheus Operator for Kubernetes)
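As a sketch of the separation described above (the paths, labels and addresses here are made up), each Prometheus instance would carry only a generic file-based discovery scrape config, while the external sharding component writes a per-instance target file:

```yaml
# prometheus.yml on every scraping instance (illustrative)
scrape_configs:
  - job_name: 'sharded-targets'
    file_sd_configs:
      - files:
          - /etc/prometheus/targets/*.yaml
        refresh_interval: 30s
---
# /etc/prometheus/targets/shard-0.yaml, written for this particular
# instance by the hypothetical external sharding component
- targets: ['10.0.0.1:9100', '10.0.0.2:9100']
  labels:
    tenant: 'tenant-a'
```

The sharding component then only rewrites these target files when targets move; Prometheus picks up the changes without a configuration reload, and the rest of the configuration stays entirely in the user's hands.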
What do you exactly mean by 'integral part of Thanos'? I think we are totally fine with it; however, the essential part where we want to reuse the Thanos sidecar is
@domgreen do you think it makes sense to add the scale-down logic description to this proposal?
Do you mean any particular feature for horizontal scalability of ingestion? (Prom sharding?)
As @Bplotka mentions, the main reason we were going to add it as an optional component into Thanos was that a sidecar (either as part of Thanos or separately) would be used to reload the targets on a scraper and then manage the scale-down of a scraper as outlined above. So we would have to update Thanos regardless, to ensure it knew to flush / snapshot the current WAL and upload it when all targets had been removed and the scraper is being drained.
It is also worth noting that in the scale-down scenario the "drained" scraper would still serve requests from
The integration with the Prometheus scraper will always be
@krasi-georgiev - from the other issue ... my main reservation with the DIBS approach is the pod annotations: although our scrapers (Prom/Thanos) are within k8s, our targets do not have to be within a k8s cluster, so they would not all be within pods.
@Bplotka will update with considerations around scaling down and ensure that it is explicit that we want minimal targets to move scrapers when scaling up.
The format of this proposal mirrors an existing draft proposal: thanos-io#387
@Bplotka: By 'integral part of Thanos', I mean that you're adding new functionality to the Thanos codebase (regardless of how the functionality is invoked) that assumes a new responsibility - orchestrating Prometheus' scrape configuration.
Potentially. If you're using Prometheus for scraping and persistence (and not alerting), I think it might be cleaner to build a component that does just that, giving you maximum flexibility in your implementation. If you are using Prometheus for alerting as well as scraping, then I would expect to solve this problem at the layer above using configuration management or an operator. I think Thanos is an unsuitable place for this logic because Thanos currently has no knowledge of how Prometheus scrapes the metrics - it just deals with the output from that scrape (the TSDB chunks on disk).
Looking at the discussion, it seems that the proposal covers requirements fairly specific to Improbable.
Should the proposal still be merged but with the status changed to 'rejected', with a one-line explanation linking to the discussion here?
…On Mon, 9 Jul 2018, 17:45, Dominic Green wrote: Closed #387.
I like @mattbostock’s idea. Kind of like a “non-goals of Thanos”.
Yeah, that sounds good. Will update the status and add a short summary.
Yup, your opinion was really helpful and shaped our approach. Definitely agree that it is not the best idea to put this inside Thanos and make this project even more complex. I think when we have some MVP working we will try to open-source it in a separate repo; maybe someone will find it useful. (: Cheers!
Awesome idea with
Let's not commit the .idea file and we are good to land this (:
.idea/workspace.xml
Outdated
@@ -0,0 +1,626 @@
<?xml version="1.0" encoding="UTF-8"?>
do not commit this file (:
* Add proposal for store instance high availability. The format of this proposal mirrors an existing draft proposal: #387
* Update HA store proposal based on feedback
Changes
Added a proposal document on how we can do configuration management within Thanos; this should allow an improved and more consistent experience when scaling.
cc @krasi-georgiev @brancz