Docs: Add Documentation to do HA deduplication on Mimir Helm Deployment #2972
Good start! In general, I think we need to add the Grafana Agent setup and assume that the user already has a working Mimir, to avoid duplicating documentation.
docs/sources/operators-guide/configure/setting-helm-ha-deduplication-consul/index.md
@krajorama please help to do another pass when you are free. I will unset draft from this PR too now. Changes from your first pass:
Thanks for writing this! I would also add a mention of this article in the existing "Configuring Grafana Mimir high-availability deduplication" article.
# Configuring Grafana Mimir Helm Chart for high-availability deduplication with Consul
Grafana Mimir can deduplicate data from high-availability (HA) Prometheus setup. In this guide, you will see how to configure
nit: I'd also add a link to the HA dedup docs here, since here is the first mention of HA dedup in this doc.
You also have to configure HA for Prometheus or Grafana Agent. Last, you need a KV store. In this guide you will
use Consul. You will be guided on the setup if you haven't had one.
## Configure Prometheus or Grafana Agent to send HA external labels
This is covered by the other guide. Should we repeat the info here too?
I am worried that repeatedly referring to the external document will make readers jump back and forth between documents. External label configuration is the most important part of HA configuration, so for the continuity of the guide I feel it is better to keep it here.
- We need to add a note about what content is copied and pasted between documents so that we name one source as definitive. This is a workaround because we do not have single-sourcing capabilities for documentation that is this granular.
I decided to add a note that this section contains content similar to another HA doc. Please review and advise whether this note is useful or whether it is better to remove it.
weight: 70
---
# Configuring Grafana Mimir Helm Chart for high-availability deduplication with Consul
Can we use the name of the chart, `mimir-distributed`? Maybe a tech writer can also pitch in, but using the name seems more concrete and searchable.
I will take a look at this soon; on UTC-4 atm and might not get to this until next Monday.
Install Consul in Kubernetes using
[Consul helm chart](https://github.com/hashicorp/consul-k8s/tree/main/charts/consul). Follow the documentation to get
the chart and installing a Consul release. Put a note on the Consul endpoint, because you will need it for Mimir
configuration.
I haven't tried installing Consul. Is there anything specific about this installation that they should do? Like configure this as X or the other thing as [A, B, C]? Or would the default config work?
Default config will work.
### HA Deduplication Global
You need a Mimir setup installed by Helm. Add the following configuration to your `values.yaml` to configure HA
Nitpick about `values.yaml`: I think we've used `custom.yaml` in a few articles so far to refer to the values file that the user supplies. It makes it obvious that it's a different file from the `values.yaml` in the root of the Helm chart. That `values.yaml` contains the defaults, and Helm uses them every time, even without providing a flag.
helm -n <mimir-namespace> upgrade --install mimir grafana/mimir-distributed -f values.yaml
```
## Verifying deduplication
The distributor also exposes some metrics around this. I think it's worth mentioning. They all start with `cortex_ha_tracker_`; I think the ones that start with `cortex_ha_tracker_elected_replica_` will be most useful to users.
These are also exposed on the "Mimir / Tenants" dashboard in the "Distributor deduplicated/non-HA" panel.
on another topic - what do you think about moving this section to the other article - "Configuring Grafana Mimir high-availability deduplication"?
> the distributor also exposes some metrics around this...
I will mention the metrics info in this section.
> on another topic - what do you think about moving this section to the other article - "Configuring Grafana Mimir high-availability deduplication"?
I am fine with moving this section to "Configuring Grafana Mimir high-availability deduplication", but maybe we should first hear what @krajorama or @osg-grafana think.
Mimir does not have self monitoring like GEM so that won't make sense there in itself.
And the Mimir documentation tries to be deployment agnostic so talking about port forward seems out of scope.
Given that I'd keep these here.
@@ -0,0 +1,138 @@
---
aliases:
- /docs/mimir/latest/operators-guide/configuring/setting-helm-ha-deduplication-consul/
Change this alias and title to use an imperative: `configure-helm-ha-deduplication-consul`.
deduplication globally.
```yaml
# other configurations above
Does this mean, "Put other configurations before this one." or something else?
I mean that everything above this comment is the configuration that was added when the `mimir-distributed` Helm chart was first installed.
Anyway, to avoid confusion, I removed the whole comment.
host: <consul-endpoint> # example: http://consul.consul.svc.cluster.local:8500
```
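Since the diff excerpt in this thread shows only the last line of the block, here is a minimal sketch of what the global HA deduplication values might look like. It assumes the chart's `mimir.structuredConfig` key is used to pass Mimir configuration, that the file is the user-supplied values file, and that the label names are the defaults discussed in this PR. The `<consul-endpoint>` placeholder must be replaced with the endpoint noted during the Consul installation; this is not necessarily the exact block from the PR.

```yaml
# Sketch only: user-supplied Helm values enabling HA deduplication for all tenants.
# Assumes Mimir configuration is passed through mimir.structuredConfig.
mimir:
  structuredConfig:
    limits:
      # Accept samples from HA replica pairs and deduplicate them.
      accept_ha_samples: true
      # These two label names must match the external labels set in
      # Prometheus or Grafana Agent.
      ha_cluster_label: cluster
      ha_replica_label: __replica__
    distributor:
      ha_tracker:
        enable_ha_tracker: true
        kvstore:
          # The HA tracker needs a KV store; this guide uses Consul.
          store: consul
          consul:
            host: <consul-endpoint> # example: http://consul.consul.svc.cluster.local:8500
```

Keeping the label names next to `accept_ha_samples` makes it easier to spot a mismatch with the Prometheus side, which is the most likely reason for deduplication not taking effect.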
Upgrade the Mimir's helm release using the above configuration.
Upgrade the Mimir's helm release using the above configuration.
1. Upgrade the Mimir’s helm chart version by using the following configuration.
This looks inaccurate, so I changed it to "following".
The "following" code block is not a configuration but is a command. Hence I changed this to:
2. Upgrade the Mimir's helm release using the following command:
ha_replica_label: __replica__
```
Upgrade the Mimir's helm release using the above configuration.
- Upgrade the Mimir’s helm chart version by using the following configuration:
The image docs/sources/operators-guide/configure/setting-helm-ha-deduplication-consul/verify-deduplication.png is unreadable. Do you have one that has higher readability?
The CHANGELOG has just been cut to prepare for the next Mimir release. Please rebase
docs/sources/operators-guide/configure/configure-helm-ha-deduplication-consul/index.md
Configure Prometheus or Grafana Agent HA setup by setting label called `cluster` and `__replica__`. These two labels
name are default labels for HA setup in Grafana Mimir. If you want to change the labels, make sure to change it in Mimir
too to make Grafana Mimir and Prometheus/Grafana Agent config matches with each other, otherwise HA deduplication
wouldn't work. `cluster` label value must be same across replica belong to the same cluster. `__replica__` must be
unique across different replica in the cluster.
Configure Prometheus or Grafana Agent HA setup by setting label called `cluster` and `__replica__`. These two labels
name are default labels for HA setup in Grafana Mimir. If you want to change the labels, make sure to change it in Mimir
too to make Grafana Mimir and Prometheus/Grafana Agent config matches with each other, otherwise HA deduplication
wouldn't work. `cluster` label value must be same across replica belong to the same cluster. `__replica__` must be
unique across different replica in the cluster.
Configure the Prometheus or Grafana Agent HA setup by setting the labels named `cluster` and `__replica__`,
which are the default labels for a HA setup in Grafana Mimir. If you want to change a label,
make sure to change it in Mimir as well. This ensures that the configurations of Grafana Mimir, Prometheus, and Grafana Agent all match each other. Otherwise, HA deduplication will not work.
* The value of the `cluster` label must be same across replica that belong to the same cluster.
* The value of the `__replica__` label must be unique across different replica within the same cluster.
Please vet this for technical accuracy.
For technical accuracy, the second sentence should be something like
"If you want to change the HA labels, make sure to change them in Mimir as well."
instead of
"If you want to change a label, make sure to change it in Mimir as well."
because we are talking about a specific pair of labels with the default names `cluster` and `__replica__`.
Please review it again.
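To make the label rules in this thread concrete, here is a minimal sketch of the external labels for two Prometheus replicas that scrape the same targets. The cluster name `prom-team1` and the replica names are hypothetical values chosen for illustration; only the label names `cluster` and `__replica__` come from the text under review.

```yaml
# Sketch only: the relevant part of prometheus.yml for the first replica.
global:
  external_labels:
    cluster: prom-team1    # same value on every replica of this HA pair (hypothetical name)
    __replica__: replica1  # unique per replica within the cluster (hypothetical name)
---
# Sketch only: the same section for the second replica.
global:
  external_labels:
    cluster: prom-team1    # identical to the first replica
    __replica__: replica2  # differs from the first replica
```

If the HA label names are changed here, the matching label settings on the Mimir side have to change with them, which is the point of the wording correction above.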
host: <consul-endpoint> # example: http://consul.consul.svc.cluster.local:8500
```
2. Upgrade the Mimir's helm release using the following command:
2. Upgrade the Mimir's helm release using the following command:
2. Upgrade the Mimir's Helm chart release using the following command:
Is this Helm or the Helm chart? Please vet my change for technical accuracy.
We are upgrading the `mimir-distributed` Helm chart release. Since we are upgrading the `mimir-distributed` chart using the commands that follow that line, a new release will be installed.
See this reference
Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>
…lication-consul/index.md Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>
This is looking good to me now overall. I wonder where to add some information about using sharding in Prometheus Operator that I discovered during a support case: https://github.com/grafana/support-escalations/issues/4030#issuecomment-1262289642
With two nitpicks
Yes, it is most likely better to cover that in a separate PR. I am still more than happy to work on it.
CHANGELOG.md
@@ -138,6 +138,7 @@
### Documentation

* [ENHANCEMENT] Added documentation on how to configure storage retention. #2970
* [ENHANCEMENT] Documented how to configure HA deduplication using Consul in a Mimir Helm deployment. #2972
Please rebase to current main and move this line to around line 15 in the main/unreleased section
Sorry, I missed doing this yesterday. Updated now.
What this PR does
Add Documentation to do HA deduplication on Mimir Helm Deployment.
Which issue(s) this PR fixes or relates to
Fixes #2597
Checklist
`CHANGELOG.md` updated - the order of entries should be `[CHANGE]`, `[FEATURE]`, `[ENHANCEMENT]`, `[BUGFIX]`