Merge pull request #930 from rax-maas/k8s-resource-rollout
roll out k8s resources - elasticsearch
iWebi committed Sep 15, 2022
2 parents ec8e50c + 0e062c9 commit 839f41c
Showing 3 changed files with 218 additions and 0 deletions.
53 changes: 53 additions & 0 deletions contrib/blueflood-k8s/README.md
@@ -0,0 +1,53 @@
# Blueflood for Kubernetes

Building on lessons learned from [blueflood-minikube](../blueflood-minikube), this project provides a fully deployable
set of Kubernetes descriptors for Blueflood.

Start by getting your [kubectl](https://kubernetes.io/docs/tasks/tools/) connected to the cluster you want to deploy to.

This project uses [Kustomize](https://kubernetes.io/docs/tasks/manage-kubernetes-objects/kustomization/) as a light
layer of management to reduce duplication in the plain K8s resources and to manage ConfigMaps. The k8s resource files
can be used as they are, or customized through Kustomize overlays.
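
As a rough sketch of what such an overlay could look like, the following `kustomization.yaml` pulls in `base`, generates
the ConfigMap that the Elasticsearch pods expect, and shrinks the StatefulSet for a small test cluster. The overlay
path `overlays/dev`, the local `elasticsearch.yml` file, and the replica count are assumptions for illustration; none
of them are part of this project.

```yaml
# overlays/dev/kustomization.yaml (hypothetical path, not part of this project)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

# Reuse everything defined in the base directory.
resources:
  - ../../base

# Generate the ConfigMap the pods mount. The file is assumed to sit next to this kustomization.yaml.
configMapGenerator:
  - name: elasticsearch-config
    files:
      - elasticsearch.yml

# Override the replica count of the es-master StatefulSet (assumed value for a dev cluster).
replicas:
  - name: es-master
    count: 1
```

An overlay like this would be applied with `kubectl apply -k overlays/dev`, while `kubectl apply -k base` deploys the
unmodified resources.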

## General organization

Following Kustomize's recommended layout, `base` contains the main set of resources. For organizational purposes,
resources are grouped into files according to their service, such as `cassandra.yaml` or `elasticsearch.yaml`.

All resources in a given yaml file have a label named `component` whose value is equal to the file base name. Therefore,
all Cassandra resources are labeled with `component=cassandra`. This makes managing resources in the k8s cluster much
simpler. This bears repeating: *all* resources here are assigned to a component. If a resource doesn't have a
`component` label, it shouldn't exist.
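
To illustrate the labeling convention, a resource in `cassandra.yaml` might carry metadata like the following. This is
only a sketch: the resource name, port name, and port number are assumptions, and only the `component` label follows
from the convention described above.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: cassandra            # hypothetical resource name
  labels:
    component: cassandra     # equals the base name of cassandra.yaml
spec:
  selector:
    component: cassandra     # selects pods that carry the same label
  ports:
    - name: cql              # assumed port name
      port: 9042             # assumed port
```

With the label in place, a label selector such as `kubectl get all -l component=cassandra` lists the labeled resources
of that component.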

The general setup of a yaml file is as follows.

- There's a group of pods to run the actual service, organized as either a [Deployment](
https://kubernetes.io/docs/concepts/workloads/controllers/deployment/) or a
[StatefulSet](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/). Both ensure a certain number of
pods stay running with your desired configuration. A StatefulSet also ensures that if a pod dies, its replacement will
be assigned the same persistent volume that the old one was using, which is very important for data stores.

- There's a [Service](https://kubernetes.io/docs/concepts/services-networking/service/) with a
[ClusterIP](https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types) that
makes the set of pods available to other things in the cluster via DNS.

- Pods expect a [ConfigMap](https://kubernetes.io/docs/concepts/configuration/configmap/) to be present in the
cluster with all necessary config files. By convention, the ConfigMap should be named `<component>-config`. As an
example, the ConfigMap for Elasticsearch is named `elasticsearch-config`. The files from the ConfigMap are mounted as a
directory in the pods. A good way to find what config files go in the ConfigMap is to start an instance of the pod's
image and copy the default config files out of it. ConfigMaps are easy to manage with [Kustomize's configMapGenerator](
https://kubernetes.io/docs/tasks/manage-kubernetes-objects/kustomization/#configmapgenerator).

- Kubernetes supports mounting a ConfigMap directly to a file system as either a file or a directory of files. Often,
though, the ConfigMap isn't mounted directly onto the real pod due to ownership or access issues. Instead, pods have a
small [Volume](https://kubernetes.io/docs/concepts/storage/volumes/) dedicated to config files. An
[InitContainer](https://kubernetes.io/docs/concepts/workloads/pods/init-containers/) prepares the volume by copying the
files to it from the ConfigMap and setting appropriate ownership, file mode, etc.

- If necessary, pods use a [PersistentVolumeClaim](https://kubernetes.io/docs/concepts/storage/persistent-volumes/) to
request persistent storage that survives a pod restart. Pods that need such non-ephemeral data are usually managed by a
StatefulSet.

- For groups of pods that form a cluster (Cassandra and Elasticsearch), there's a headless Service. This doesn't provide
a ClusterIP. Instead, it resolves in DNS to the IPs of all the cluster members, making it easier to do cluster
discovery.
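
As a sketch of how the ConfigMap and the headless Service fit together, an `elasticsearch.yml` stored in the
`elasticsearch-config` ConfigMap could point Elasticsearch 6.8 at the headless Service for discovery. The settings below
are assumptions for illustration and are not taken from this project; the Service name `es-seed-discovery`, the data
path, and the three master-eligible nodes match the Elasticsearch resources in `base/elasticsearch.yaml`.

```yaml
# elasticsearch.yml (sketch for Elasticsearch 6.8; assumed content, not shipped with this project)
cluster.name: blueflood                      # hypothetical cluster name
network.host: 0.0.0.0                        # listen on the pod's network interfaces

# The headless Service name resolves to the IPs of its member pods.
discovery.zen.ping.unicast.hosts: es-seed-discovery

# Quorum for 3 master-eligible nodes: (3 / 2) + 1 = 2
discovery.zen.minimum_master_nodes: 2

# Assumed to match the mount path of the persistent data volume.
path.data: /elasticsearch-data-pv
```

Because the Service is headless, no load balancing sits between the nodes; each node contacts the other members
directly on the transport port.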
163 changes: 163 additions & 0 deletions contrib/blueflood-k8s/base/elasticsearch.yaml
@@ -0,0 +1,163 @@
---
# A headless service that just returns the IPs of the master Elasticsearch pods. This is for the master nodes to
# discover each other at bootstrap time.
apiVersion: v1
kind: Service
metadata:
  name: es-seed-discovery
  labels:
    component: elasticsearch
    role: master
spec:
  selector:
    component: elasticsearch
    role: master
  ports:
    - name: es-transport
      port: 9300
      protocol: TCP
  clusterIP: None
---
# The StatefulSet of Elasticsearch master nodes. These are the nodes eligible to be master at any time and also the seed
# nodes for other nodes to use when joining the cluster. For the moment, this cluster is just this set of nodes, any of
# which can be elected master. We don't differentiate other node types:
# https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-node.html
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-master
  labels:
    component: elasticsearch
    role: master
spec:
  selector:
    matchLabels:
      component: elasticsearch
      role: master
  serviceName: es-seed-discovery
  replicas: 3
  template:
    metadata:
      labels:
        component: elasticsearch
        role: master
    spec:
      initContainers:
        # Set kernel param required by ES
        - name: set-max-map-count
          image: busybox:1.27.2
          command: ['sysctl', '-w', 'vm.max_map_count=262144']
          securityContext:
            privileged: true
        # Prep data and config volumes. ES runs as user 1000, and it needs to write to its config directory, so the
        # ConfigMap has to be copied into a volume.
        - name: prep-volumes
          image: busybox:1.27.2
          command:
            - sh
            - -c
            - |
              cp -rL /elasticsearch-config-source/* /elasticsearch-config-final
              chown -R 1000:1000 /elasticsearch-config-final
              chown -R 1000:1000 /elasticsearch-data-pv
              # ES 1.7 seems to require wide-open permissions on this dir; not sure why. Do away with this once we're
              # sure we don't need 1.7 anymore.
              chmod -R 777 /elasticsearch-data-pv
          volumeMounts:
            - name: elasticsearch-data
              mountPath: /elasticsearch-data-pv
            - name: config-source
              mountPath: /elasticsearch-config-source
            - name: elasticsearch-config
              mountPath: /elasticsearch-config-final
      containers:
        - name: elasticsearch
          # Requires mode 777 on data mount, and keeps spitting out errors about disk high watermark and moving shards.
          # Mainly, can't get the initial cluster to form correctly because each pod only waits 30 seconds for
          # discovery, then never tries again.
          #image: elasticsearch:1.7
          # Seems to be the right balance between upgrading and keeping things working with minimal code change.
          image: elasticsearch:6.8.23
          # Works for ingest, but querying doesn't work because of types; might be able to work around this in code with
          # a config setting?
          #image: elasticsearch:7.17.5
          # Works great, but removed index types, so current Blueflood code doesn't work with it.
          #image: elasticsearch:8.3.3
          env:
            # Tells Elasticsearch the directory to look for config files in. This path already exists in the image, so
            # it's convenient to use.
            - name: ES_PATH_CONF
              value: "/usr/share/elasticsearch/config"
          ports:
            - containerPort: 9200
              name: http
            - containerPort: 9300
              name: es-transport
          startupProbe:
            tcpSocket:
              port: http
            # ES seems to take well over a minute to start up. I'm not sure if there's something we can do to make that
            # faster. This will wait 10 seconds * 18 attempts = 180 seconds.
            periodSeconds: 10
            failureThreshold: 18
          livenessProbe:
            tcpSocket:
              port: http
          readinessProbe:
            tcpSocket:
              port: http
          resources:
            limits:
              cpu: "1"
              memory: 4Gi
            requests:
              cpu: "0.5"
              memory: 2Gi
          volumeMounts:
            # Persistent volume for long-term data storage
            - name: elasticsearch-data
              mountPath: /elasticsearch-data-pv
            # Writable copy of the Elasticsearch config files, which the prep-volumes init container fills from the
            # ConfigMap. We mount it here.
            - name: elasticsearch-config
              mountPath: /usr/share/elasticsearch/config
      volumes:
        - name: config-source
          configMap:
            name: elasticsearch-config
  volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
        labels:
          component: elasticsearch
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 1Gi
    - metadata:
        name: elasticsearch-config
        labels:
          component: elasticsearch
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 10Mi
---
# And finally, the main service that makes Elasticsearch visible to other parts of the application.
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  labels:
    component: elasticsearch
spec:
  selector:
    component: elasticsearch
  ports:
    - name: http
      port: 9200
      protocol: TCP
    - name: es-transport
      port: 9300
      protocol: TCP
2 changes: 2 additions & 0 deletions contrib/blueflood-k8s/base/kustomization.yaml
@@ -0,0 +1,2 @@
resources:
- elasticsearch.yaml
