# Thanos Cluster Configuration

Status: **draft** | in-review | rejected | accepted | complete

Implementation Owner: [@domgreen](https://github.com/domgreen)

## Motivation

Currently, each scraper manages its own configuration via [Prometheus Configuration](https://prometheus.io/docs/prometheus/latest/configuration/configuration/), which contains information about the [scrape_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#%3Cscrape_config%3E) and the targets that the scraper will collect metrics from.

As we start to dynamically scale the collection of metrics (new Prometheus instances) or increase the targets for a current tenant, we wish to keep the collection of metrics on a consistent node and not re-allocate shards to other scrapers.

One key use case would be keeping metrics from a given tenant (customer, team, etc.) on a consistent and minimal number of nodes.
## Proposal

As we scale out horizontally, we wish to manage configuration such as target assignment to scrapers in a more intelligent way: firstly by keeping targets on the same scrape instance as much as possible as we scale the scrape pool, and in the future by bin packing targets onto scrapers more intelligently.

We would look to add a new component to the Thanos system, `thanos config`, that would be a central point for loading configuration for the entire cluster and advertising the configuration for each sidecar via APIs.

The `thanos config` component will have a configuration endpoint that each `thanos sidecar` will call to get its own scrape_config jobs along with their targets. Once the sidecar has its jobs, it will be able to update the targets / scrape_config for the Prometheus instance it is running alongside. That instance can then be reloaded or can automatically pick up new targets via file_sd_config.

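For illustration, the file_sd_config path could look like the following: the sidecar writes a target file that Prometheus watches, and a matching job references it. The file name, job name, and labels here are hypothetical:

```yaml
# targets/tenant-a.yml: written by the sidecar from the config
# component's answer, and watched by Prometheus via file_sd_config.
- targets:
    - "10.0.0.1:9100"
    - "10.0.0.2:9100"
  labels:
    tenant: "tenant-a"

# Corresponding fragment of the Prometheus configuration:
# scrape_configs:
#   - job_name: "tenant-a"
#     file_sd_configs:
#       - files:
#           - "targets/tenant-a.yml"
```

Because Prometheus re-reads file_sd_config target files on change, updating targets this way avoids a full configuration reload.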
The config component will keep track of which sidecars are in the cluster at any time, and so will have a central view of where targets / labels are being scraped from. Therefore, as a new scrape instance comes online it joins the gossip cluster, and config will know that it can start assigning configuration to this new node.

```
   ┌─ reload ─┐             ┌─ reload ─┐
   v          │             v          │
┌──────────────────────┐   ┌────────────┬─────────┐
│ Prometheus │ Sidecar │   │ Prometheus │ Sidecar │
└─────────────────┬────┘   └────────────┴────┬────┘
                  │                          │
              GetConfig                  GetConfig
                  │                          │
                  v                          v
                ┌─────────────────────────────┐
                │           Config            │
                └──────────────┬──────────────┘
                               │
                          Read files
                               │
                               v
                ┌─────────────────────────────┐
                │          prom.yml           │
                └─────────────────────────────┘
```

## User Experience

### Use Cases

- We wish to dynamically configure which targets / labels are placed on each Prometheus instance within the Thanos cluster.
- Allocate targets to a Prometheus instance based on data such as CPU utilization of a node.
- Ensure that as we scale the number of Prometheus instances in the scrape pool we do not move scrape targets to other instances.

## Implementation

### Thanos Config

The config component will read one or many Prometheus configuration files and dynamically allocate configuration to sidecars within the cluster. It initially joins the Thanos cluster mesh and can therefore find the sidecars to which it wishes to assign configuration.

```
$ thanos config \
    --http-address   "0.0.0.0:9090" \
    --grpc-address   "0.0.0.0:9091" \
    --cluster.peers  "thanos-cluster.example.org" \
    --config         "promA.yaml;promB.yaml"
```
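As an illustration, a supplied file such as promA.yaml would be an ordinary Prometheus configuration restricted to the supported subset; the job, labels, and targets below are made up:

```yaml
# promA.yaml: an ordinary Prometheus configuration. The config
# component would read it and decide which sidecar receives each
# job's targets.
global:
  scrape_interval: 30s
  external_labels:
    tenant: "team-a"
scrape_configs:
  - job_name: "node"
    static_configs:
      - targets: ["10.0.0.1:9100", "10.0.0.2:9100"]
```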

The configuration supplied to the config component will then be distributed to the peers that config is aware of, and exposed via a gRPC API so that each sidecar can get its configuration.
The gRPC endpoint will take a request from a sidecar and, based on the sidecar making the request, give back the computed configuration.

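The RPC surface is not specified in this proposal; as a sketch, the endpoint might look like the following, where every service, message, and field name is hypothetical:

```proto
// Hypothetical sketch of the config component's gRPC surface.
syntax = "proto3";

service ConfigStore {
  // A sidecar identifies itself and receives its computed
  // scrape configuration.
  rpc GetConfig(ConfigRequest) returns (ConfigResponse);
}

message ConfigRequest {
  string sidecar_id = 1;  // e.g. the sidecar's cluster peer name
}

message ConfigResponse {
  // Rendered Prometheus scrape_config (YAML) for this sidecar.
  string scrape_config_yaml = 1;
}
```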
`thanos config` will initially support only a subset of the Prometheus configuration: a subset of global and scrape_config, along with static_sd_config and file_sd_config.

The initial configuration strategy is intended for a multi-tenant use case, and so keeps jobs with a given label together on a single Prometheus instance.

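The strategy itself is left open by this proposal. As a minimal sketch, one way to keep a tenant's jobs together and stable is to assign each previously unseen tenant to the least-loaded scraper while never moving existing assignments; the function and label names here are hypothetical:

```python
# Sketch: sticky, tenant-aware assignment of scrape jobs to scrapers.
# Jobs sharing a (hypothetical) tenant label stay on one scraper, and
# existing assignments never move when scrapers are added.

def assign(jobs, scrapers, existing=None):
    """jobs: {job_name: tenant}; scrapers: list of scraper ids;
    existing: prior {tenant: scraper} mapping to keep stable."""
    tenant_to_scraper = dict(existing or {})
    load = {s: 0 for s in scrapers}
    for scraper in tenant_to_scraper.values():
        if scraper in load:
            load[scraper] += 1
    for tenant in sorted(set(jobs.values())):
        if tenant not in tenant_to_scraper:
            # Place new tenants on the least-loaded scraper.
            target = min(load, key=lambda s: (load[s], s))
            tenant_to_scraper[tenant] = target
            load[target] += 1
    return {job: tenant_to_scraper[t] for job, t in jobs.items()}

assignment = assign(
    {"jobA": "team-1", "jobB": "team-1", "jobC": "team-2"},
    ["prom-0", "prom-1"],
)
# jobA and jobB share a tenant, so they land on the same scraper.
```

Because the prior assignment is passed back in on each call, growing the scrape pool only affects where new tenants are placed.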
### Thanos Sidecar

With the addition of `thanos config`, we will look to update the `thanos sidecar` to periodically get its configuration from the config component. Each time it does so, it will update the Prometheus instance either by updating the targets via file_sd_config or by adding a new configuration job and forcing a reload of the Prometheus configuration.

To do this, the sidecar will need to know the location of the config component.

```
$ thanos sidecar \
    --tsdb.path       "/path/to/prometheus/data/dir" \
    --prometheus.url  "http://localhost:9090" \
    --gcs.bucket      "example-bucket" \
    --cluster.peers   "thanos-cluster.example.org" \
    --config.url      "thanos-cluster.config.org"
```

### Client/Server Backwards/Forwards compatibility

This change would be fully backwards compatible: you do not have to use the config component within your Thanos setup, and can still choose to manually set up configuration and service discovery for each node.

The alternatives below might be a good starting point for users that do not need to worry about re-allocating targets when the number of Prometheus instances in a cluster scales.

## Alternatives considered

### Prometheus & Hashmod

An alternative is to use the existing [hashmod](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config) functionality within Prometheus to enable [horizontal scaling](https://www.robustperception.io/scaling-and-federating-prometheus/): in this scenario a user supplies their entire configuration to every Prometheus node and uses hashing to scale out.
The main downside is that as a Prometheus instance is added to or removed from the cluster, targets are moved to a new scrape instance. This is bad if you want to keep tenants on a minimal number of Prometheus instances.
We have also seen this approach suffer from hot shards in the past, whereby a given Prometheus instance may end up overloaded and it is difficult to redistribute the load onto other Prometheus instances in the scrape pool.
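The reallocation problem follows directly from the modulo arithmetic hashmod uses. The following simplified sketch (Prometheus actually hashes the target's full label set internally; a SHA-256 of a made-up address string stands in for it here) shows how many targets change instance when the pool grows from 3 to 4 scrapers:

```python
# Sketch of the hashmod downside: going from N to N+1 scrapers moves
# most targets, because hash(target) % N changes for nearly all hashes.
import hashlib

def shard(target, n):
    # Stable hash so the demo is reproducible across runs.
    h = int(hashlib.sha256(target.encode()).hexdigest(), 16)
    return h % n

targets = [f"10.0.0.{i}:9100" for i in range(100)]
moved = sum(shard(t, 3) != shard(t, 4) for t in targets)
print(f"{moved}/100 targets moved when scaling 3 -> 4 scrapers")
```

In expectation roughly three quarters of targets move, which is exactly the churn this proposal wants to avoid.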
### Prometheus & Consistent Hashing

[Issue 4309](https://github.com/prometheus/prometheus/issues/4309) looks at adding a consistent hashing algorithm to Prometheus, which would allow us to minimise the re-allocation of targets as the scrape pool is scaled.
Unfortunately, based on discussions this does not look like it will be accepted and added to the Prometheus codebase.
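For contrast with hashmod, a minimal consistent-hash ring (a generic sketch with hypothetical names, not the implementation discussed in the issue) moves only a small fraction of targets when a scraper is added:

```python
# Sketch: a consistent-hash ring with virtual nodes. Adding a scraper
# re-assigns only the targets that fall into the new scraper's arcs,
# roughly 1/N of them, instead of most targets as with hashmod.
import bisect
import hashlib

def _h(key):
    return int(hashlib.sha256(key.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, scrapers, vnodes=100):
        # Each scraper owns many points on the ring to even out load.
        self.points = sorted((_h(f"{s}#{i}"), s)
                             for s in scrapers for i in range(vnodes))
        self.keys = [p for p, _ in self.points]

    def owner(self, target):
        # First ring point clockwise of the target's hash (wrapping).
        i = bisect.bisect(self.keys, _h(target)) % len(self.points)
        return self.points[i][1]

targets = [f"10.0.0.{i}:9100" for i in range(100)]
before = Ring(["prom-0", "prom-1", "prom-2"])
after = Ring(["prom-0", "prom-1", "prom-2", "prom-3"])
moved = sum(before.owner(t) != after.owner(t) for t in targets)
print(f"{moved}/100 targets moved after adding one scraper")
```

With four scrapers, only about a quarter of targets are expected to move, and all of them move to the newly added instance.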