Elasticsearch AWS EKS with multiple availability zones #927
Comments
Hi, this is something we have been experiencing issues with, not only with EKS but also with other clusters that have nodes in different availability zones. Having the pod deployed in one zone and the volume in a different one makes the solution undeployable. I'm sure that k8s will implement a solution that addresses this kind of issue. Some threads about the issue:
1.12 introduced "VolumeBindingMode: WaitForFirstConsumer", which would have improved the situation, likely avoiding the pod startup failures. EKS is at 1.11. However, Elasticsearch behavior when shard allocation awareness is set justifies more control of placement across AZs than Kubernetes will provide with a single StatefulSet. If there are 3 coordinating nodes and 4 AZs, and pods are distributed across AZs using anti-affinity, then no load will go to one AZ, at least once there is a copy of each shard in each AZ (guaranteed if there are 4 replicas). There is a tradeoff between abstraction and control. Availability zone and region boundaries have implicit and explicit costs in crossing them (AWS charges for cross-AZ traffic). For Elasticsearch, our conclusion is that we need explicit control of the number of pods and PVs in each AZ. I will likely hack in templates data2* and data3* as I try to learn enough about Helm to do something cleaner.
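For context, the shard allocation awareness behavior referred to here is driven by Elasticsearch settings along these lines (a minimal sketch; the attribute name "zone" and the AZ values are illustrative, not taken from this chart):

```yaml
# elasticsearch.yml -- sketch of shard allocation awareness.
# Each node advertises which AZ it lives in via a custom attribute.
node.attr.zone: us-east-1a

# Spread primary/replica copies of each shard across distinct "zone" values.
cluster.routing.allocation.awareness.attributes: zone

# Forced awareness: do not over-allocate copies into the surviving zones
# when one zone is down (zone list is illustrative).
cluster.routing.allocation.awareness.force.zone.values: us-east-1a,us-east-1b,us-east-1c
```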
I see. In principle I do not see any alternative other than creating different StatefulSets per AZ. You could have a loop in Helm that creates different StatefulSets (using the
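A rough sketch of what such a loop could look like, assuming a hypothetical `availabilityZones` list in values.yaml and a heavily simplified StatefulSet body (the real chart's template is much larger):

```yaml
# templates/statefulset.yaml -- sketch: one StatefulSet per AZ.
# Assumes a hypothetical values.yaml key such as:
#   availabilityZones: ["us-east-1a", "us-east-1b", "us-east-1c"]
{{- range $zone := .Values.availabilityZones }}
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: {{ $.Release.Name }}-data-{{ $zone }}
spec:
  serviceName: {{ $.Release.Name }}-data-{{ $zone }}
  replicas: 1
  selector:
    matchLabels:
      app: {{ $.Release.Name }}-data
      zone: {{ $zone }}
  template:
    metadata:
      labels:
        app: {{ $.Release.Name }}-data
        zone: {{ $zone }}
    spec:
      nodeSelector:
        # Pin the pods (and therefore their EBS volumes) to this AZ.
        failure-domain.beta.kubernetes.io/zone: {{ $zone }}
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch:6.5.4
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        # Hypothetical per-AZ storage class name, e.g. "gp2-us-east-1a".
        storageClassName: gp2-{{ $zone }}
        resources:
          requests:
            storage: 30Gi
{{- end }}
```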
This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.
Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.
When I ran the elasticsearch chart on AWS after configuring multiple availability zones, some StatefulSet pods would not start. This is due to AWS volumes: once created, they can only be used in one AZ. A newer k8s version will delay binding of the volume until the pod is created, but that is not available yet.
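The delayed binding referred to is the `volumeBindingMode: WaitForFirstConsumer` option on a StorageClass, available from Kubernetes 1.12 for EBS. A minimal sketch (the class name is illustrative):

```yaml
# With WaitForFirstConsumer, the EBS volume is only provisioned once a pod
# using the PVC has been scheduled, so it is created in that pod's AZ.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-delayed
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
volumeBindingMode: WaitForFirstConsumer
```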
Even if it was available, Elasticsearch, with shard allocation awareness enabled, is strict about placement and also about routing of traffic from coordinating nodes to data nodes. An approximately even random distribution of pods across AZs is less desirable than a guaranteed even distribution, so even the features of 1.13 will not create a complete solution.
To support multiple AZs, it is best to have a storage class per AZ, as well as a StatefulSet per AZ that uses that storage class (along with an even distribution of worker nodes across AZs). We can then configure the number of AWS volumes per AZ, and be sure that we launch, in each AZ, enough StatefulSet pods to distribute replicas completely evenly across AZs.
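A sketch of the per-AZ storage classes this would require (names and zones are illustrative); each AZ's StatefulSet would reference the matching class in its volumeClaimTemplates, pinning its EBS volumes to that zone:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-us-east-1a
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  zone: us-east-1a
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-us-east-1b
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  zone: us-east-1b
```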
My intent is to learn enough about Helm to implement this in some other way than brute-force manual creation of 3 completely different StatefulSets. I'd like to be able to instantiate each StatefulSet from a common template. Please let me know if I'm missing something.