kubernetes: Config files + setup script for secure multiregion clusters #27092
Merged
| @@ -0,0 +1,86 @@ | ||
| # Running CockroachDB across multiple Kubernetes clusters | ||
| The script and configuration files in this directory enable deploying | ||
| CockroachDB across multiple Kubernetes clusters that are spread across different | ||
| geographic regions. It deploys a CockroachDB | ||
| [StatefulSet](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/) | ||
| into each separate cluster, and links them together using DNS. | ||
| To use the configuration provided here, check out this repository (or otherwise | ||
| download a copy of this directory), fill in the constants at the top of | ||
| [setup.py](setup.py) with the relevant information about your Kubernetes | ||
| clusters, optionally make any desired modifications to | ||
| [cockroachdb-statefulset-secure.yaml](cockroachdb-statefulset-secure.yaml) as | ||
| explained in [our Kubernetes performance tuning | ||
| guide](https://www.cockroachlabs.com/docs/stable/kubernetes-performance.html), | ||
| then finally run [setup.py](setup.py). | ||
| The script prints a lot of output as it runs. If everything succeeds, the last | ||
| line printed is `job "cluster-init-secure" created`, which means that all | ||
| resources were created and that the CockroachDB cluster should soon be | ||
| initialized with 3 pods in the "READY" state in each Kubernetes cluster. At this | ||
| point you can manage the StatefulSet in each cluster independently: scale up the | ||
| number of replicas, change their resource requests, or make other modifications | ||
| as needed. | ||
| If anything goes wrong along the way, please let us know via any of the [normal | ||
| troubleshooting | ||
| channels](https://www.cockroachlabs.com/docs/stable/support-resources.html). | ||
| While we believe this creates a highly available, maintainable multi-region | ||
| deployment, it is still pushing the boundaries of how Kubernetes is typically | ||
| used, so feedback and issue reports are greatly appreciated. | ||
| ## Limitations | ||
| ### Pod-to-pod connectivity | ||
| The deployment outlined in this directory relies on pod IP addresses being | ||
| routable even across Kubernetes clusters and regions. This achieves the best | ||
| performance, particularly when compared to alternative solutions that route all | ||
| packets between clusters through load balancers, but it means the deployment | ||
| won't work in environments where pod IPs are not routable across clusters. | ||
| This requirement is satisfied by clusters deployed in cloud environments such as | ||
| Google Kubernetes Engine, and can also be satisfied by on-premises environments | ||
| depending on the [Kubernetes networking setup](https://kubernetes.io/docs/concepts/cluster-administration/networking/) used. | ||
| To test whether your clusters will work, you can run this basic network test: | ||
| ```shell | ||
| $ kubectl run network-test --image=alpine --restart=Never -- sleep 999999 | ||
| pod "network-test" created | ||
| $ kubectl describe pod network-test | grep IP | ||
| IP: THAT-PODS-IP-ADDRESS | ||
| $ kubectl config use-context YOUR-OTHER-CLUSTERS-CONTEXT-HERE | ||
| $ kubectl run -it network-test --image=alpine --restart=Never -- ping THAT-PODS-IP-ADDRESS | ||
| If you don't see a command prompt, try pressing enter. | ||
| 64 bytes from 10.12.14.10: seq=1 ttl=62 time=0.570 ms | ||
| 64 bytes from 10.12.14.10: seq=2 ttl=62 time=0.449 ms | ||
| 64 bytes from 10.12.14.10: seq=3 ttl=62 time=0.635 ms | ||
| 64 bytes from 10.12.14.10: seq=4 ttl=62 time=0.722 ms | ||
| 64 bytes from 10.12.14.10: seq=5 ttl=62 time=0.504 ms | ||
| ... | ||
| ``` | ||
| If the pods can directly connect, you should see successful ping output like the | ||
| above. If they can't, you won't see any successful ping responses. Make sure to | ||
| delete the `network-test` pod in each cluster when you're done! | ||
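The manual test above can also be scripted. The following is a minimal Python sketch, not part of this pull request, that assumes `kubectl` is on the `PATH` and that both clusters are available as kubeconfig contexts. The function names and the `network-test` pod name reuse are illustrative choices; it uses the global `--context` flag instead of `kubectl config use-context` so that the active context is left unchanged.

```python
import subprocess

POD = "network-test"


def kubectl(context, *args):
    """Build a kubectl argv pinned to one cluster via the global --context flag."""
    return ["kubectl", "--context", context, *args]


def start_target_cmd(context):
    # Same long-sleeping alpine pod as in the manual test.
    return kubectl(context, "run", POD, "--image=alpine",
                   "--restart=Never", "--", "sleep", "999999")


def wait_ready_cmd(context):
    return kubectl(context, "wait", "--for=condition=Ready",
                   f"pod/{POD}", "--timeout=120s")


def pod_ip_cmd(context):
    return kubectl(context, "get", "pod", POD,
                   "-o", "jsonpath={.status.podIP}")


def ping_cmd(context, ip, count=5):
    # -i attaches to the pod so the ping output is returned to the caller.
    return kubectl(context, "run", "-i", POD, "--image=alpine",
                   "--restart=Never", "--", "ping", "-c", str(count), ip)


def delete_cmd(context):
    return kubectl(context, "delete", "pod", POD, "--ignore-not-found")


def ping_succeeded(output):
    """True if the output contains at least one reply line such as
    '64 bytes from 10.12.14.10: seq=1 ttl=62 time=0.570 ms'."""
    return any("bytes from" in line for line in output.splitlines())


def run(cmd):
    return subprocess.run(cmd, capture_output=True, text=True, check=False)


def check_connectivity(ctx_a, ctx_b):
    """Start a pod in ctx_a, ping it from a pod in ctx_b, then clean up both."""
    try:
        run(start_target_cmd(ctx_a))
        run(wait_ready_cmd(ctx_a))
        ip = run(pod_ip_cmd(ctx_a)).stdout.strip()
        if not ip:
            return False
        return ping_succeeded(run(ping_cmd(ctx_b, ip)).stdout)
    finally:
        for ctx in (ctx_a, ctx_b):
            run(delete_cmd(ctx))
```

Calling `check_connectivity("CONTEXT-A", "CONTEXT-B")` with your own context names returns `True` only if at least one ping reply came back, and it deletes the `network-test` pod in both clusters either way.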
| ### Exposing DNS servers to the Internet | ||
| As currently configured, the DNS servers from each Kubernetes cluster are linked | ||
| together by exposing them via a load-balanced IP address that is visible to the | ||
| public Internet. This is necessary because [Google Cloud Platform's Internal Load Balancers do not currently support clients in one region using a load balancer in another region](https://cloud.google.com/compute/docs/load-balancing/internal/#deploying_internal_load_balancing_with_clients_across_vpn_or_interconnect). | ||
| None of the services in your Kubernetes cluster will be made accessible, but | ||
| their names could leak out to a motivated attacker. If this is unacceptable, | ||
| please let us know and we can demonstrate other options. [Your voice could also | ||
| help convince Google to allow clients from one region to use an Internal Load | ||
| Balancer in another](https://issuetracker.google.com/issues/111021512), | ||
| eliminating the problem. | ||
| ## Cleaning up | ||
| To remove all the resources created in your clusters by [setup.py](setup.py), | ||
| copy the parameters you provided at the top of [setup.py](setup.py) to the top | ||
| of [teardown.py](teardown.py) and run [teardown.py](teardown.py). | ||
| ## More information | ||
| For more information on running CockroachDB in Kubernetes, please see the [README | ||
| in the parent directory](../README.md). |
| @@ -0,0 +1,27 @@ | ||
| apiVersion: v1 | ||
| kind: Pod | ||
| metadata: | ||
|   name: cockroachdb-client-secure | ||
|   labels: | ||
|     app: cockroachdb-client | ||
| spec: | ||
|   serviceAccountName: cockroachdb | ||
|   containers: | ||
|   - name: cockroachdb-client | ||
|     image: cockroachdb/cockroach:v2.0.5 | ||
|     imagePullPolicy: IfNotPresent | ||
|     volumeMounts: | ||
|     - name: client-certs | ||
|       mountPath: /cockroach-certs | ||
|     # Keep a pod open indefinitely so kubectl exec can be used to get a shell to it | ||
|     # and run cockroach client commands, such as cockroach sql, cockroach node status, etc. | ||
|     command: | ||
|     - sleep | ||
|     - "2147483648" # 2^31 | ||
|   # This pod isn't doing anything important, so don't bother waiting to terminate it. | ||
|   terminationGracePeriodSeconds: 0 | ||
|   volumes: | ||
|   - name: client-certs | ||
|     secret: | ||
|       secretName: cockroachdb.client.root | ||
|       defaultMode: 256 |
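A note on `defaultMode: 256`, which appears in each of these manifests: the value is written in decimal, and 256 is the same number as octal `0400`, meaning the mounted certificate files are readable only by their owner. The short Python check below is only an illustration of that arithmetic and is not part of the pull request.

```python
import stat

mode = 256  # value of defaultMode in the manifest, in decimal

# 256 decimal and 0o400 octal are the same integer.
assert mode == 0o400

# Render the mode the way `ls -l` would for a regular file.
permissions = stat.filemode(stat.S_IFREG | mode)
print(oct(mode), permissions)
```

The rendered permission string is `-r--------`: read access for the owner and nothing for group or others.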
| @@ -0,0 +1,28 @@ | ||
| apiVersion: batch/v1 | ||
| kind: Job | ||
| metadata: | ||
|   name: cluster-init-secure | ||
|   labels: | ||
|     app: cockroachdb | ||
| spec: | ||
|   template: | ||
|     spec: | ||
|       serviceAccountName: cockroachdb | ||
|       containers: | ||
|       - name: cluster-init | ||
|         image: cockroachdb/cockroach:v2.0.5 | ||
|         imagePullPolicy: IfNotPresent | ||
|         volumeMounts: | ||
|         - name: client-certs | ||
|           mountPath: /cockroach-certs | ||
|         command: | ||
|           - "/cockroach/cockroach" | ||
|           - "init" | ||
|           - "--certs-dir=/cockroach-certs" | ||
|           - "--host=cockroachdb-0.cockroachdb" | ||
|       restartPolicy: OnFailure | ||
|       volumes: | ||
|       - name: client-certs | ||
|         secret: | ||
|           secretName: cockroachdb.client.root | ||
|           defaultMode: 256 |
| @@ -0,0 +1,224 @@ | ||
| apiVersion: v1 | ||
| kind: ServiceAccount | ||
| metadata: | ||
|   name: cockroachdb | ||
|   labels: | ||
|     app: cockroachdb | ||
| --- | ||
| apiVersion: rbac.authorization.k8s.io/v1beta1 | ||
| kind: Role | ||
| metadata: | ||
|   name: cockroachdb | ||
|   labels: | ||
|     app: cockroachdb | ||
| rules: | ||
| - apiGroups: | ||
|   - "" | ||
|   resources: | ||
|   - secrets | ||
|   verbs: | ||
|   - create | ||
|   - get | ||
| --- | ||
| apiVersion: rbac.authorization.k8s.io/v1beta1 | ||
| kind: ClusterRole | ||
| metadata: | ||
|   name: cockroachdb | ||
|   labels: | ||
|     app: cockroachdb | ||
| rules: | ||
| - apiGroups: | ||
|   - certificates.k8s.io | ||
|   resources: | ||
|   - certificatesigningrequests | ||
|   verbs: | ||
|   - create | ||
|   - get | ||
|   - watch | ||
| --- | ||
| apiVersion: rbac.authorization.k8s.io/v1beta1 | ||
| kind: RoleBinding | ||
| metadata: | ||
|   name: cockroachdb | ||
|   labels: | ||
|     app: cockroachdb | ||
| roleRef: | ||
|   apiGroup: rbac.authorization.k8s.io | ||
|   kind: Role | ||
|   name: cockroachdb | ||
| subjects: | ||
| - kind: ServiceAccount | ||
|   name: cockroachdb | ||
|   namespace: default | ||
| --- | ||
| apiVersion: rbac.authorization.k8s.io/v1beta1 | ||
| kind: ClusterRoleBinding | ||
| metadata: | ||
|   name: cockroachdb | ||
|   labels: | ||
|     app: cockroachdb | ||
| roleRef: | ||
|   apiGroup: rbac.authorization.k8s.io | ||
|   kind: ClusterRole | ||
|   name: cockroachdb | ||
| subjects: | ||
| - kind: ServiceAccount | ||
|   name: cockroachdb | ||
|   namespace: default | ||
| --- | ||
| apiVersion: v1 | ||
| kind: Service | ||
| metadata: | ||
|   # This service is meant to be used by clients of the database. It exposes a ClusterIP that will | ||
|   # automatically load balance connections to the different database pods. | ||
|   name: cockroachdb-public | ||
|   labels: | ||
|     app: cockroachdb | ||
| spec: | ||
|   ports: | ||
|   # The main port, served by gRPC, serves Postgres-flavor SQL, internode | ||
|   # traffic and the cli. | ||
|   - port: 26257 | ||
|     targetPort: 26257 | ||
|     name: grpc | ||
|   # The secondary port serves the UI as well as health and debug endpoints. | ||
|   - port: 8080 | ||
|     targetPort: 8080 | ||
|     name: http | ||
|   selector: | ||
|     app: cockroachdb | ||
| --- | ||
| apiVersion: v1 | ||
| kind: Service | ||
| metadata: | ||
|   # This service only exists to create DNS entries for each pod in the stateful | ||
|   # set such that they can resolve each other's IP addresses. It does not | ||
|   # create a load-balanced ClusterIP and should not be used directly by clients | ||
|   # in most circumstances. | ||
|   name: cockroachdb | ||
|   labels: | ||
|     app: cockroachdb | ||
|   annotations: | ||
|     # Use this annotation in addition to the actual publishNotReadyAddresses | ||
|     # field below because the annotation will stop being respected soon but the | ||
|     # field is broken in some versions of Kubernetes: | ||
|     # https://github.com/kubernetes/kubernetes/issues/58662 | ||
|     service.alpha.kubernetes.io/tolerate-unready-endpoints: "true" | ||
|     # Enable automatic monitoring of all instances when Prometheus is running in the cluster. | ||
|     prometheus.io/scrape: "true" | ||
|     prometheus.io/path: "_status/vars" | ||
|     prometheus.io/port: "8080" | ||
| spec: | ||
|   ports: | ||
|   - port: 26257 | ||
|     targetPort: 26257 | ||
|     name: grpc | ||
|   - port: 8080 | ||
|     targetPort: 8080 | ||
|     name: http | ||
|   # We want all pods in the StatefulSet to have their addresses published for | ||
|   # the sake of the other CockroachDB pods even before they're ready, since they | ||
|   # have to be able to talk to each other in order to become ready. | ||
|   publishNotReadyAddresses: true | ||
|   clusterIP: None | ||
|   selector: | ||
|     app: cockroachdb | ||
| --- | ||
| apiVersion: policy/v1beta1 | ||
| kind: PodDisruptionBudget | ||
| metadata: | ||
|   name: cockroachdb-budget | ||
|   labels: | ||
|     app: cockroachdb | ||
| spec: | ||
|   selector: | ||
|     matchLabels: | ||
|       app: cockroachdb | ||
|   maxUnavailable: 1 | ||
| --- | ||
| apiVersion: apps/v1beta1 | ||
| kind: StatefulSet | ||
| metadata: | ||
|   name: cockroachdb | ||
| spec: | ||
|   serviceName: "cockroachdb" | ||
|   replicas: 3 | ||
|   template: | ||
|     metadata: | ||
|       labels: | ||
|         app: cockroachdb | ||
|     spec: | ||
|       serviceAccountName: cockroachdb | ||
|       affinity: | ||
|         podAntiAffinity: | ||
|           preferredDuringSchedulingIgnoredDuringExecution: | ||
|           - weight: 100 | ||
|             podAffinityTerm: | ||
|               labelSelector: | ||
|                 matchExpressions: | ||
|                 - key: app | ||
|                   operator: In | ||
|                   values: | ||
|                   - cockroachdb | ||
|               topologyKey: kubernetes.io/hostname | ||
|       containers: | ||
|       - name: cockroachdb | ||
|         image: cockroachdb/cockroach:v2.0.5 | ||
|         imagePullPolicy: IfNotPresent | ||
|         ports: | ||
|         - containerPort: 26257 | ||
|           name: grpc | ||
|         - containerPort: 8080 | ||
|           name: http | ||
|         livenessProbe: | ||
|           httpGet: | ||
|             path: "/health" | ||
|             port: http | ||
|             scheme: HTTPS | ||
|           initialDelaySeconds: 30 | ||
|           periodSeconds: 5 | ||
|         readinessProbe: | ||
|           httpGet: | ||
|             path: "/health?ready=1" | ||
|             port: http | ||
|             scheme: HTTPS | ||
|           initialDelaySeconds: 10 | ||
|           periodSeconds: 5 | ||
|           failureThreshold: 2 | ||
|         volumeMounts: | ||
|         - name: datadir | ||
|           mountPath: /cockroach/cockroach-data | ||
|         - name: certs | ||
|           mountPath: /cockroach/cockroach-certs | ||
|         env: | ||
|         - name: COCKROACH_CHANNEL | ||
|           value: kubernetes-secure | ||
|         command: | ||
|           - "/bin/bash" | ||
|           - "-ecx" | ||
|           # The use of qualified `hostname -f` is crucial: | ||
|           # Other nodes aren't able to look up the unqualified hostname. | ||
|           - "exec /cockroach/cockroach start --logtostderr --certs-dir /cockroach/cockroach-certs --advertise-host $(hostname -f) --http-host 0.0.0.0 --join JOINLIST --locality LOCALITYLIST --cache 25% --max-sql-memory 25%" | ||
|       # No pre-stop hook is required, a SIGTERM plus some time is all that's | ||
|       # needed for graceful shutdown of a node. | ||
|       terminationGracePeriodSeconds: 60 | ||
|       volumes: | ||
|       - name: datadir | ||
|         persistentVolumeClaim: | ||
|           claimName: datadir | ||
|       - name: certs | ||
|         secret: | ||
|           secretName: cockroachdb.node | ||
|           defaultMode: 256 | ||
|   podManagementPolicy: Parallel | ||
|   updateStrategy: | ||
|     type: RollingUpdate | ||
|   volumeClaimTemplates: | ||
|   - metadata: | ||
|       name: datadir | ||
|     spec: | ||
|       accessModes: | ||
|         - "ReadWriteOnce" | ||
|       resources: | ||
|         requests: | ||
|           storage: 100Gi |
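The `start` command in this StatefulSet contains two placeholders, `JOINLIST` and `LOCALITYLIST`, that have to be replaced with real values before the manifest is applied. The Python sketch below shows one way such a substitution could work. It is an illustration only and is not the contents of `setup.py`: the helper names, the per-cluster DNS suffixes, and the `cockroachdb-N.cockroachdb.<suffix>` address pattern are all assumptions. It relies only on the facts that `--join` takes a comma-separated list of addresses and `--locality` takes comma-separated `key=value` pairs.

```python
def join_addresses(dns_suffixes, replicas=3):
    """Hypothetical helper: list the pod addresses for every cluster.

    Pods in the StatefulSet are named cockroachdb-0, cockroachdb-1, ... behind
    the headless service `cockroachdb`. The per-cluster suffix is an assumed
    way of making those names resolvable from other clusters.
    """
    return [
        f"cockroachdb-{i}.cockroachdb.{suffix}"
        for suffix in dns_suffixes
        for i in range(replicas)
    ]


def render_start_command(template, join_addrs, locality):
    """Fill the JOINLIST and LOCALITYLIST placeholders in a command template.

    `locality` is an ordered list of (key, value) pairs, most general first.
    """
    join_list = ",".join(join_addrs)
    locality_list = ",".join(f"{key}={value}" for key, value in locality)
    return (template
            .replace("JOINLIST", join_list)
            .replace("LOCALITYLIST", locality_list))


# Example with made-up suffixes and locality values.
template = "cockroach start --join JOINLIST --locality LOCALITYLIST"
addrs = join_addresses(["us-east1-b", "us-west1-a"], replicas=2)
command = render_start_command(
    template, addrs, [("region", "us-east1"), ("zone", "us-east1-b")]
)
print(command)
```

Each cluster would get the same join list but its own locality values, so the substitution would run once per cluster.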