Enabling availability zone awareness in metric R/W with ingesters. (#2317)

* adding Zone information to Token and Ingester description structs. Added distinct zone check in ring Get ingesters function
* updating AddIngester to include zone
* updating lifecycler config and tests to enable availability zone
* adding zone info to ring's HTTP method
* adding positive & negative zone aware replica set tests
* updating config ref doc
* updating changelog
* correcting misspell caught in linting
* updating lifecycler config for availability zone so that the docs are generated in a consistent/verifiable way but still have a decent default value
* Adding some zone based replication docs

Signed-off-by: Ken Haines <khaines@microsoft.com>
Showing 11 changed files with 277 additions and 45 deletions.
@@ -0,0 +1,30 @@
---
title: "Ingester Hand-over"
linkTitle: "Ingester Hand-over"
weight: 5
slug: ingester-handover
---

In a default configuration, time series written to ingesters are replicated based on the container/pod name of the ingester instances. It is entirely possible that all the replicas for a given time series are held within the same availability zone, even if the Cortex infrastructure spans multiple zones within the region. Storing all replicas for a given time series in one zone poses a risk of data loss if there is an outage affecting several nodes within that zone, or a total zone outage.

## Configuration

Cortex can be configured to consider an availability zone value in its replication system. Doing so mitigates the risks associated with losing multiple nodes within the same availability zone. The availability zone for an ingester can be defined on the command line of the ingester using the `ingester.availability-zone` flag, or in the YAML configuration:

```yaml
ingester:
  lifecycler:
    availability_zone: "zone-3"
```

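To illustrate how a command-line flag and a YAML key can feed the same setting, here is a minimal sketch using only Go's standard `flag` package. The `LifecyclerConfig` struct, its `Zone` field, the `parseZone` helper, and the empty default are assumptions made for illustration; they are not taken from the Cortex source. Only the flag name and the YAML key come from the text above.

```go
package main

import (
	"flag"
	"fmt"
)

// LifecyclerConfig is a hypothetical, trimmed-down stand-in for the
// ingester lifecycler settings; only the zone field is modelled here.
type LifecyclerConfig struct {
	// The struct tag mirrors the YAML key shown in the text above.
	Zone string `yaml:"availability_zone"`
}

// RegisterFlags binds the zone field to the flag named in the text above.
// The empty default is an assumption of this sketch.
func (c *LifecyclerConfig) RegisterFlags(f *flag.FlagSet) {
	f.StringVar(&c.Zone, "ingester.availability-zone", "",
		"Availability zone this ingester runs in.")
}

// parseZone parses the given arguments and returns the configured zone.
func parseZone(args []string) (string, error) {
	var cfg LifecyclerConfig
	fs := flag.NewFlagSet("ingester", flag.ContinueOnError)
	cfg.RegisterFlags(fs)
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return cfg.Zone, nil
}

func main() {
	zone, err := parseZone([]string{"-ingester.availability-zone=zone-3"})
	if err != nil {
		panic(err)
	}
	fmt.Println(zone) // prints "zone-3"
}
```

Keeping the flag name and the YAML tag on one field means both configuration paths end up at the same value, which is the relationship the section describes.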
## Zone Replication Considerations

Enabling availability zone awareness helps mitigate the risk of data loss within a single zone. However, an operator should consider the following items before enabling this feature.

### Minimum number of Zones

For Cortex to function correctly, there must be at least as many availability zones as the replica count. By default, a Cortex cluster should therefore be spread over 3 zones, because the default replica count is 3. It is safe to have more zones than the replica count, but not fewer. Having fewer availability zones than the replica count causes a replica write to be missed and, in some cases, the write fails if the availability zone count is too low.

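The effect of having too few zones can be seen in a small sketch of zone-aware replica selection. This is an illustration under stated assumptions, not the actual Cortex ring code: the `token` type, the `pickReplicas` function, and the sample ring are all hypothetical. The sketch walks the ring clockwise from the key and accepts an ingester only if its zone has not been used yet.

```go
package main

import (
	"fmt"
	"sort"
)

// token is a hypothetical ring entry: a position on the hash ring, the
// ingester that owns it, and that ingester's availability zone.
type token struct {
	Value    uint32
	Ingester string
	Zone     string
}

// pickReplicas walks the ring clockwise from the first token >= key and
// collects up to rf ingesters, skipping any whose zone is already used.
// It returns fewer than rf entries when the ring has fewer zones than rf.
func pickReplicas(ring []token, key uint32, rf int) []string {
	if len(ring) == 0 {
		return nil
	}
	// Work on a sorted copy so the caller's slice is left untouched.
	sorted := append([]token(nil), ring...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Value < sorted[j].Value })

	// Index of the first token at or after the key; wraps around below.
	start := sort.Search(len(sorted), func(i int) bool { return sorted[i].Value >= key })

	seenZone := map[string]bool{}
	var out []string
	for i := 0; i < len(sorted) && len(out) < rf; i++ {
		t := sorted[(start+i)%len(sorted)]
		if seenZone[t.Zone] {
			continue // a replica already lives in this zone
		}
		seenZone[t.Zone] = true
		out = append(out, t.Ingester)
	}
	return out
}

func main() {
	threeZones := []token{
		{100, "ingester-1", "zone-1"},
		{200, "ingester-2", "zone-1"},
		{300, "ingester-3", "zone-2"},
		{400, "ingester-4", "zone-3"},
	}
	fmt.Println(pickReplicas(threeZones, 50, 3)) // [ingester-1 ingester-3 ingester-4]

	// Only two zones remain, so one of the three replicas cannot be placed.
	twoZones := threeZones[:3]
	fmt.Println(pickReplicas(twoZones, 50, 3)) // [ingester-1 ingester-3]
}
```

With three zones, a replica count of 3 is satisfied. With two zones, the same request yields only two ingesters, which corresponds to the missed replica write described above.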
### Cost

Depending on the existing Cortex infrastructure, enabling this feature may increase running costs, as most cloud providers charge for traffic between availability zones. The most significant change would be for a Cortex cluster currently running in a single zone.