
Elasticsearch AWS EKS with multiple availability zones #927

Closed · DaveWHarvey opened this issue Nov 15, 2018 · 5 comments
Labels: stale (15 days without activity)

Comments

DaveWHarvey commented Nov 15, 2018

When I ran the elasticsearch chart on AWS after configuring multiple availability zones, some pods of the statefulsets would not start. This is because AWS EBS volumes, once created, can only be used in the AZ where they were created. A newer Kubernetes version delays binding the volume until the pod is scheduled, but that version is not yet available on EKS.

Even if it were available, Elasticsearch with shard allocation awareness enabled is strict about shard placement, and also about the routing of traffic from coordinating nodes to data nodes. An approximately even random distribution of pods across AZs is less desirable than a guaranteed even distribution, so even the features in 1.13 will not be a complete solution.

To support multiple AZs, it is best to have a storage class per AZ, as well as a statefulset per AZ that uses that storage class (along with an even distribution of worker nodes across AZs). We can then control the number of AWS volumes per AZ and be sure that, in each AZ, we launch enough pods to distribute replicas completely evenly across the AZs. A sketch of what I mean by a per-AZ storage class is below.

My intent is to learn enough about Helm to implement this in some way other than brute-force manual creation of 3 completely different statefulsets. I'd like to be able to instantiate each statefulset from a common template. Please let me know if I'm missing something.
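
For illustration, one of the per-AZ storage classes I have in mind would look roughly like this (class and zone names are just examples); each per-AZ statefulset would then point at its own class via `storageClassName`:

```yaml
# Sketch of a single per-AZ storage class; a similar one would exist for each AZ
# (e.g. gp2-us-east-1a, gp2-us-east-1b, gp2-us-east-1c).
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-us-east-1a
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  zone: us-east-1a   # EBS volumes from this class are provisioned only in this AZ
```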

@javsalgar (Contributor) commented:

Hi,

This is something we have been seeing issues with, not only on EKS but also on other clusters that have nodes in different availability zones. Having the pod scheduled in one zone and the volume provisioned in a different one makes the solution undeployable.

I'm sure that k8s will implement a solution that addresses this kind of issue.

Some threads about the issue:

kubernetes/kubernetes#49906
rancher/rancher#12704

@DaveWHarvey (Author) commented:

Kubernetes 1.12 introduced `volumeBindingMode: WaitForFirstConsumer`, which would have improved the situation, likely avoiding the pod startup failures. EKS is still at 1.11.
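
For reference, the delayed-binding storage class would look something like this (the class name is illustrative):

```yaml
# Requires Kubernetes 1.12+, so not usable on EKS while it is still at 1.11.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-delayed
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
volumeBindingMode: WaitForFirstConsumer   # provision/bind the EBS volume only after a pod is scheduled
```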

However, Elasticsearch's behavior when shard allocation awareness is set justifies more control over placement across AZs than Kubernetes will provide with a single statefulset. If there are 3 coordinating nodes and 4 AZs, and pods are distributed across AZs using anti-affinity, then no traffic will be routed to one of the AZs, at least once there is a copy of each shard in every AZ (guaranteed if there are 4 replicas). The awareness settings I'm referring to are sketched below.
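
For concreteness, the shard allocation awareness settings in question are along these lines (the zone values are examples, and the per-pod zone attribute would have to be injected somehow, e.g. from an environment variable):

```yaml
# elasticsearch.yml (sketch)
node.attr.zone: us-east-1a                              # set differently on each pod
cluster.routing.allocation.awareness.attributes: zone   # spread copies of a shard across zones
# Optional forced awareness: do not rebalance replicas into the surviving zones if one zone goes down
cluster.routing.allocation.awareness.force.zone.values: us-east-1a,us-east-1b,us-east-1c
```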

There is a tradeoff between abstraction and control. Availability zone and region boundaries have implicit and explicit costs when traffic crosses them (AWS charges for cross-AZ traffic). For Elasticsearch, our conclusion is that we need explicit control of the number of pods and PVs in each AZ. I will likely hack in data2* and data3* templates while I learn enough about Helm to do something cleaner.

@javsalgar (Contributor) commented:

I see. In principle I do not see any alternative other than creating a different statefulset per AZ. You could have a loop in Helm that creates the different statefulsets (using the `range` control structure), provided that you list each of the AZs in values.yaml; a rough sketch is below. I think your use case is interesting, so my advice would be to also raise it with the Kubernetes project so they consider this kind of case in future releases of K8s. Maybe they come up with an alternative.
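
Something along these lines (untested; the value names, labels, image and sizes are placeholders, not something the chart provides today):

```yaml
# values.yaml (assumed)
zones:
  - us-east-1a
  - us-east-1b
  - us-east-1c
dataReplicasPerZone: 2
```

```yaml
# templates/data-statefulset.yaml (sketch): renders one StatefulSet per zone
{{- range $zone := .Values.zones }}
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: {{ $.Release.Name }}-data-{{ $zone }}
spec:
  serviceName: {{ $.Release.Name }}-data
  replicas: {{ $.Values.dataReplicasPerZone }}
  selector:
    matchLabels:
      app: elasticsearch-data
      zone: {{ $zone }}
  template:
    metadata:
      labels:
        app: elasticsearch-data
        zone: {{ $zone }}
    spec:
      nodeSelector:
        failure-domain.beta.kubernetes.io/zone: {{ $zone }}   # schedule only on worker nodes in this AZ
      containers:
        - name: elasticsearch
          image: bitnami/elasticsearch   # placeholder image/tag
          env:
            - name: ELASTICSEARCH_ZONE   # hypothetical variable used to set node.attr.zone
              value: {{ $zone }}
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: gp2-{{ $zone }}   # one storage class per AZ, as discussed above
        resources:
          requests:
            storage: 10Gi
{{- end }}
```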

stale bot commented Dec 4, 2018

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

The stale bot added the "stale (15 days without activity)" label on Dec 4, 2018.
stale bot commented Dec 9, 2018

Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.
