diff --git a/docs/Manual/Deployment/Kubernetes/Authentication.md b/docs/Manual/Deployment/Kubernetes/Authentication.md
new file mode 100644
index 000000000..d9bff945f
--- /dev/null
+++ b/docs/Manual/Deployment/Kubernetes/Authentication.md
@@ -0,0 +1,18 @@
+# Authentication
+
+The ArangoDB Kubernetes Operator will by default create ArangoDB deployments
+that require authentication to access the database.
+
+It uses a single JWT secret (stored in a Kubernetes secret)
+to provide *super-user* access between all servers of the deployment
+as well as access from the ArangoDB Operator to the deployment.
+
+To disable authentication, set `spec.auth.jwtSecretName` to `None`.
+
+Initially the deployment is accessible through the web user-interface and
+APIs, using the user `root` with an empty password.
+Make sure to change this password immediately after starting the deployment!
+
+## See also
+
+- [Secure connections (TLS)](./Tls.md)
diff --git a/docs/Manual/Deployment/Kubernetes/DriverConfiguration.md b/docs/Manual/Deployment/Kubernetes/DriverConfiguration.md
new file mode 100644
index 000000000..483d9de92
--- /dev/null
+++ b/docs/Manual/Deployment/Kubernetes/DriverConfiguration.md
@@ -0,0 +1,128 @@
+# Configuring your driver for ArangoDB access
+
+In this chapter you'll learn how to configure a driver for accessing
+an ArangoDB deployment in Kubernetes.
+
+The exact methods to configure a driver are specific to that driver.
+
+## Database endpoint(s)
+
+The endpoint(s) (or URLs) to communicate with are the most important
+parameter you need to configure in your driver.
+
+Finding the right endpoints depends on whether your client application is running in
+the same Kubernetes cluster as the ArangoDB deployment or not.
+
+### Client application in same Kubernetes cluster
+
+If your client application is running in the same Kubernetes cluster as
+the ArangoDB deployment, you should configure your driver to use the
+following endpoint:
+
+```text
+https://<deployment-name>.<namespace>.svc:8529
+```
+
+Only if your deployment has set `spec.tls.caSecretName` to `None` should
+you use `http` instead of `https`.
+
+### Client application outside Kubernetes cluster
+
+If your client application is running outside the Kubernetes cluster in which
+the ArangoDB deployment is running, your driver endpoint depends on the
+external-access configuration of your ArangoDB deployment.
+
+If the external-access of the ArangoDB deployment is of type `LoadBalancer`,
+then use the IP address of that `LoadBalancer` like this:
+
+```text
+https://<load-balancer-ip>:8529
+```
+
+If the external-access of the ArangoDB deployment is of type `NodePort`,
+then use the IP address(es) of the `Nodes` of the Kubernetes cluster,
+combined with the `NodePort` that is used by the external-access service.
+
+For example:
+
+```text
+https://<node-ip>:30123
+```
+
+You can find the type of external-access by inspecting the external-access `Service`.
+To do so, run the following command:
+
+```bash
+kubectl get service -n <namespace> <deployment-name>-ea
+```
+
+The output looks like this:
+
+```bash
+NAME                        TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)          AGE   SELECTOR
+example-simple-cluster-ea   LoadBalancer   10.106.175.38   192.168.10.208   8529:31890/TCP   1s    app=arangodb,arango_deployment=example-simple-cluster,role=coordinator
+```
+
+In this case the external-access is of type `LoadBalancer` with a load-balancer IP address
+of `192.168.10.208`.
+This results in an endpoint of `https://192.168.10.208:8529`.
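+
+As a minimal sketch of wiring such an endpoint into a driver, here is an
+example using the [Go driver](https://github.com/arangodb/go-driver).
+The endpoint and credentials are assumptions; substitute the values found above:
+
+```go
+package main
+
+import (
+	"crypto/tls"
+	"log"
+
+	driver "github.com/arangodb/go-driver"
+	"github.com/arangodb/go-driver/http"
+)
+
+func main() {
+	// In-cluster DNS name, load-balancer IP or node IP + node port,
+	// depending on where the client application runs (see above).
+	conn, err := http.NewConnection(http.ConnectionConfig{
+		Endpoints: []string{"https://192.168.10.208:8529"},
+		// Sketch only: skip certificate verification here; see the
+		// "TLS settings" section below for verifying against the CA.
+		TLSConfig: &tls.Config{InsecureSkipVerify: true},
+	})
+	if err != nil {
+		log.Fatal(err)
+	}
+	client, err := driver.NewClient(driver.ClientConfig{
+		Connection:     conn,
+		Authentication: driver.BasicAuthentication("root", ""), // assumed credentials
+	})
+	if err != nil {
+		log.Fatal(err)
+	}
+	_ = client // use the client to open databases, collections, etc.
+}
+```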
+
+## TLS settings
+
+As mentioned before, the ArangoDB deployment managed by the ArangoDB operator
+will use a secure (TLS) connection unless you set `spec.tls.caSecretName` to `None`
+in your `ArangoDeployment`.
+
+When using a secure connection, you can choose to verify the server certificates
+provided by the ArangoDB servers or not.
+
+If you want to verify these certificates, configure your driver with the CA certificate
+stored in a Kubernetes `Secret` found in the same namespace as the `ArangoDeployment`.
+
+The name of this `Secret` is stored in the `spec.tls.caSecretName` setting of
+the `ArangoDeployment`. If you don't set this setting explicitly, it will be
+set automatically.
+
+Then fetch the CA secret using the following command (or use a Kubernetes client library to fetch it):
+
+```bash
+kubectl get secret -n <namespace> <secret-name> --template='{{index .data "ca.crt"}}' | base64 -D > ca.crt
+```
+
+This results in a file called `ca.crt` containing a PEM encoded, x509 CA certificate.
+
+## Query requests
+
+For most client requests made by a driver, it does not matter if there is any kind
+of load-balancer between your client application and the ArangoDB deployment.
+
+{% hint 'info' %}
+Note that even a simple `Service` of type `ClusterIP` already behaves as a load-balancer.
+{% endhint %}
+
+The exception to this is cursor-related requests made to an ArangoDB `Cluster` deployment.
+The coordinator that handles an initial query request (that results in a `Cursor`)
+will save some in-memory state in that coordinator, if the result of the query
+is too big to be transferred back in the response of the initial request.
+
+Follow-up requests have to be made to fetch the remaining data.
+These follow-up requests must be handled by the same coordinator to which the initial
+request was made.
+
+As soon as there is a load-balancer between your client application and the ArangoDB cluster,
+it is uncertain which coordinator will actually handle the follow-up request.
+
+To resolve this uncertainty, make sure to run your client application in the same
+Kubernetes cluster and synchronize your endpoints before making the
+initial query request.
+This will result in the use (by the driver) of internal DNS names of all coordinators.
+A follow-up request can then be sent to exactly the same coordinator.
+
+If your client application is running outside the Kubernetes cluster, this is much harder
+to solve.
+The easiest way to work around it is to make sure that the query results are small
+enough.
+When that is not feasible, it is also possible to resolve this
+when the internal DNS names of your Kubernetes cluster are exposed to your client application
+and the resulting IP addresses are routable from your client application.
+To expose internal DNS names of your Kubernetes cluster, you can use [CoreDNS](https://coredns.io).
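+
+Putting the above together, here is a sketch (again using the Go driver) that
+verifies the server certificates against the fetched `ca.crt` and iterates over
+a cursor. The endpoint, credentials and the `demo` collection are assumptions:
+
+```go
+package main
+
+import (
+	"context"
+	"crypto/tls"
+	"crypto/x509"
+	"fmt"
+	"io/ioutil"
+	"log"
+
+	driver "github.com/arangodb/go-driver"
+	"github.com/arangodb/go-driver/http"
+)
+
+func main() {
+	// Verify server certificates against the CA fetched above
+	// (the file name is an assumption).
+	caCert, err := ioutil.ReadFile("ca.crt")
+	if err != nil {
+		log.Fatal(err)
+	}
+	pool := x509.NewCertPool()
+	if !pool.AppendCertsFromPEM(caCert) {
+		log.Fatal("failed to parse ca.crt")
+	}
+	conn, err := http.NewConnection(http.ConnectionConfig{
+		Endpoints: []string{"https://example-arangodb.default.svc:8529"},
+		TLSConfig: &tls.Config{RootCAs: pool},
+	})
+	if err != nil {
+		log.Fatal(err)
+	}
+	client, err := driver.NewClient(driver.ClientConfig{
+		Connection:     conn,
+		Authentication: driver.BasicAuthentication("root", ""), // assumed credentials
+	})
+	if err != nil {
+		log.Fatal(err)
+	}
+	db, err := client.Database(context.Background(), "_system")
+	if err != nil {
+		log.Fatal(err)
+	}
+	// The initial query creates a cursor on one coordinator; each
+	// ReadDocument call below may send a follow-up request that must
+	// reach that same coordinator (hence the in-cluster endpoint).
+	cursor, err := db.Query(context.Background(), "FOR d IN demo RETURN d", nil)
+	if err != nil {
+		log.Fatal(err)
+	}
+	defer cursor.Close()
+	for cursor.HasMore() {
+		var doc map[string]interface{}
+		if _, err := cursor.ReadDocument(context.Background(), &doc); err != nil {
+			log.Fatal(err)
+		}
+		fmt.Println(doc)
+	}
+}
+```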
diff --git a/docs/Manual/Deployment/Kubernetes/README.md b/docs/Manual/Deployment/Kubernetes/README.md
index 28849d67c..e88328853 100644
--- a/docs/Manual/Deployment/Kubernetes/README.md
+++ b/docs/Manual/Deployment/Kubernetes/README.md
@@ -1,6 +1,21 @@
 # ArangoDB Kubernetes Operator
 
-The ArangoDB Kubernetes Operator (`kube-arangodb`) is a set of two operators
-that you deploy in your Kubernetes cluster to manage deployments of the
-ArangoDB database and provide `PersistentVolumes` on local storage of your
-nodes for optimal storage performance.
+The ArangoDB Kubernetes Operator (`kube-arangodb`) is a set of operators
+that you deploy in your Kubernetes cluster to:
+
+- Manage deployments of the ArangoDB database
+- Provide `PersistentVolumes` on local storage of your nodes for optimal storage performance
+- Configure ArangoDB Datacenter to Datacenter replication
+
+Each of these uses involves a different custom resource:
+
+- Use an [`ArangoDeployment` resource](./DeploymentResource.md) to
+  create an ArangoDB database deployment.
+- Use an [`ArangoLocalStorage` resource](./StorageResource.md) to
+  provide local `PersistentVolumes` for optimal I/O performance.
+- Use an [`ArangoDeploymentReplication` resource](./DeploymentReplicationResource.md) to
+  configure ArangoDB Datacenter to Datacenter replication.
+
+Continue with [Using the ArangoDB Kubernetes Operator](./Usage.md)
+to learn how to install the ArangoDB Kubernetes operator and create
+your first deployment.
\ No newline at end of file
diff --git a/docs/Manual/Deployment/Kubernetes/Storage.md b/docs/Manual/Deployment/Kubernetes/Storage.md
index e6c5d9207..2b4b6b687 100644
--- a/docs/Manual/Deployment/Kubernetes/Storage.md
+++ b/docs/Manual/Deployment/Kubernetes/Storage.md
@@ -10,6 +10,22 @@
 In the `ArangoDeployment` resource, one can specify the type of storage
 used by groups of servers using the `spec.<group>.storageClassName` setting.
 
+This is an example of a `Cluster` deployment that stores its agent & dbserver
+data on `PersistentVolumes` that use the `my-local-ssd` `StorageClass`:
+
+```yaml
+apiVersion: "database.arangodb.com/v1alpha"
+kind: "ArangoDeployment"
+metadata:
+  name: "cluster-using-local-ssd"
+spec:
+  mode: Cluster
+  agents:
+    storageClassName: my-local-ssd
+  dbservers:
+    storageClassName: my-local-ssd
+```
+
 The amount of storage needed is configured using the
 `spec.<group>.resources.requests.storage` setting.
 
@@ -17,6 +33,22 @@
 Note that configuring storage is done per group of servers.
 It is not possible to configure storage per individual server.
 
+This is an example of a `Cluster` deployment that requests volumes of 80GB
+for every dbserver, resulting in a total storage capacity of 240GB (with 3 dbservers):
+
+```yaml
+apiVersion: "database.arangodb.com/v1alpha"
+kind: "ArangoDeployment"
+metadata:
+  name: "cluster-using-local-ssd"
+spec:
+  mode: Cluster
+  dbservers:
+    resources:
+      requests:
+        storage: 80Gi
+```
+
 ## Local storage
 
 For optimal performance, ArangoDB should be configured with locally attached
@@ -26,6 +58,28 @@
 The easiest way to accomplish this is to deploy an
 [`ArangoLocalStorage` resource](./StorageResource.md).
 The ArangoDB Storage Operator will use it to provide `PersistentVolumes` for you.
 
+This is an example of an `ArangoLocalStorage` resource that will result in
+`PersistentVolumes` created on any node of the Kubernetes cluster
+under the directory `/mnt/big-ssd-disk`:
+
+```yaml
+apiVersion: "storage.arangodb.com/v1alpha"
+kind: "ArangoLocalStorage"
+metadata:
+  name: "example-arangodb-storage"
+spec:
+  storageClass:
+    name: my-local-ssd
+  localPath:
+  - /mnt/big-ssd-disk
+```
+
+Note that using local storage requires `VolumeScheduling` to be enabled in your
+Kubernetes cluster. On Kubernetes 1.10 this is enabled by default; on version
+1.9 you have to enable it with a `--feature-gates` setting.
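+
+To verify the result, you can list the `StorageClass` and any provisioned
+`PersistentVolumes`. This is just a quick check; names and output will differ
+per cluster:
+
+```bash
+kubectl get storageclass
+kubectl get persistentvolumes
+```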
+
+### Manually creating `PersistentVolumes`
+
 The alternative is to create `PersistentVolumes` manually, for all servers that
 need persistent storage (single, agents & dbservers). E.g. for a `Cluster`
 with 3 agents and 5 dbservers, you must create 8 volumes.
@@ -54,14 +108,14 @@ metadata:
        ]}
     }'
 spec:
-    capacity:
-      storage: 100Gi
-    accessModes:
-    - ReadWriteOnce
-    persistentVolumeReclaimPolicy: Delete
-    storageClassName: local-ssd
-    local:
-      path: /mnt/disks/ssd1
+  capacity:
+    storage: 100Gi
+  accessModes:
+  - ReadWriteOnce
+  persistentVolumeReclaimPolicy: Delete
+  storageClassName: local-ssd
+  local:
+    path: /mnt/disks/ssd1
 ```
 
 For Kubernetes 1.9 and up, you should create a `StorageClass` which is configured
diff --git a/docs/Manual/Deployment/Kubernetes/Tls.md b/docs/Manual/Deployment/Kubernetes/Tls.md
index 127c664fd..be298fb68 100644
--- a/docs/Manual/Deployment/Kubernetes/Tls.md
+++ b/docs/Manual/Deployment/Kubernetes/Tls.md
@@ -1,4 +1,4 @@
-# TLS
+# Secure connections (TLS)
 
 The ArangoDB Kubernetes Operator will by default create ArangoDB deployments
 that use secure TLS connections.
@@ -23,7 +23,8 @@ kubectl get secret <deployment-name>-ca --template='{{index .data "ca.crt"}}' | base
 
 ### Windows
 
-TODO
+To install a CA certificate in Windows, follow the
+[procedure described here](http://wiki.cacert.org/HowTo/InstallCAcertRoots).
 
 ### MacOS
 
@@ -41,4 +42,13 @@ sudo /usr/bin/security remove-trusted-cert -d ca.crt
 
 ### Linux
 
-TODO
+To install a CA certificate on Linux (e.g. on Ubuntu), run:
+
+```bash
+sudo cp ca.crt /usr/local/share/ca-certificates/<name>.crt
+sudo update-ca-certificates
+```
+
+## See also
+
+- [Authentication](./Authentication.md)
diff --git a/docs/Manual/Deployment/Kubernetes/Troubleshooting.md b/docs/Manual/Deployment/Kubernetes/Troubleshooting.md
new file mode 100644
index 000000000..e28e4c9d4
--- /dev/null
+++ b/docs/Manual/Deployment/Kubernetes/Troubleshooting.md
@@ -0,0 +1,112 @@
+# Troubleshooting
+
+While Kubernetes and the ArangoDB Kubernetes operator will automatically
+resolve a lot of issues, there are always cases where human attention
+is needed.
+
+This chapter gives you tips & tricks to help you troubleshoot deployments.
+
+## Where to look
+
+In Kubernetes all resources can be inspected using `kubectl` using either
+the `get` or `describe` command.
+
+To get all details of the resource (both specification & status),
+run the following command:
+
+```bash
+kubectl get <resource-type> <resource-name> -n <namespace> -o yaml
+```
+
+For example, to get the entire specification and status
+of an `ArangoDeployment` resource named `my-arango` in the `default` namespace,
+run:
+
+```bash
+kubectl get ArangoDeployment my-arango -n default -o yaml
+# or shorter
+kubectl get arango my-arango -o yaml
+```
+
+Several types of resources (including all ArangoDB custom resources) support
+events. These events show what happened to the resource over time.
+
+To show the events (and most important resource data) of a resource,
+run the following command:
+
+```bash
+kubectl describe <resource-type> <resource-name> -n <namespace>
+```
+
+## Getting logs
+
+Another invaluable source of information is the log of containers being run
+in Kubernetes.
+These logs are accessible through the `Pods` that group these containers.
+
+To fetch the logs of the default container running in a `Pod`, run:
+
+```bash
+kubectl logs <pod-name> -n <namespace>
+# or with follow option to keep inspecting logs while they are written
+kubectl logs <pod-name> -n <namespace> -f
+```
+
+To inspect the logs of a specific container in a `Pod`, add `-c <container-name>`.
+You can find the names of the containers in the `Pod` using `kubectl describe pod ...`.
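+
+For example, the following sketch lists the `Pods`, shows the containers of
+one of them, and then follows the log of a specific container. The pod name
+and the `server` container name are assumptions; use the names reported by
+your own cluster:
+
+```bash
+kubectl get pods -n default
+kubectl describe pod my-arango-prmr-abc123 -n default
+kubectl logs my-arango-prmr-abc123 -n default -c server -f
+```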
+
+{% hint 'info' %}
+Note that the ArangoDB operators are themselves deployed as a Kubernetes `Deployment`
+with 2 replicas. This means that you will have to fetch the logs of the 2 `Pods`
+running those replicas.
+{% endhint %}
+
+## What if
+
+### The `Pods` of a deployment stay in `Pending` state
+
+There are two common causes for this:
+
+1) The `Pods` cannot be scheduled because there are not enough nodes available.
+   This is usually only the case with a `spec.environment` setting that has a value of `Production`.
+
+   Solution: Add more nodes.
+2) There are no `PersistentVolumes` available to be bound to the `PersistentVolumeClaims`
+   created by the operator.
+
+   Solution: Use `kubectl get persistentvolumes` to inspect the available `PersistentVolumes`
+   and, if needed, use the [`ArangoLocalStorage` operator](./StorageResource.md) to provision `PersistentVolumes`.
+
+### When restarting a `Node`, the `Pods` scheduled on that node remain in `Terminating` state
+
+When a `Node` no longer makes regular calls to the Kubernetes API server, it is
+marked as not available. Depending on specific settings in your `Pods`, Kubernetes
+will at some point decide to terminate the `Pod`. As long as the `Node` is not
+completely removed from the Kubernetes API server, Kubernetes will try to use
+the `Node` itself to terminate the `Pod`.
+
+The `ArangoDeployment` operator recognizes this condition and will try to replace those
+`Pods` with `Pods` on different nodes. The exact behavior differs per type of server.
+
+### What happens when a `Node` with local data is broken
+
+When a `Node` with `PersistentVolumes` hosted on that `Node` is broken and
+cannot be repaired, the data in those `PersistentVolumes` is lost.
+
+If an `ArangoDeployment` of type `Single` was using one of those `PersistentVolumes`,
+the database is lost and must be restored from a backup.
+
+If an `ArangoDeployment` of type `ActiveFailover` or `Cluster` was using one of
+those `PersistentVolumes`, it depends on the type of server that was using the volume:
+
+- If an `Agent` was using the volume, it can be repaired as long as 2 other agents are still healthy.
+- If a `DBServer` was using the volume, and the replication factor of all database
+  collections is 2 or higher, and the remaining dbservers are still healthy,
+  the cluster will duplicate the remaining replicas to
+  bring the number of replicas back to the original number.
+- If a `DBServer` was using the volume, and the replication factor of a database
+  collection is 1 and happens to be stored on that dbserver, the data is lost.
+- If a single server of an `ActiveFailover` deployment was using the volume, and the
+  other single server is still healthy, the other single server will become leader.
+  After replacing the failed single server, the new follower will synchronize with
+  the leader.
diff --git a/docs/Manual/Deployment/Kubernetes/Upgrading.md b/docs/Manual/Deployment/Kubernetes/Upgrading.md
index 56fe7989a..90e9b7e73 100644
--- a/docs/Manual/Deployment/Kubernetes/Upgrading.md
+++ b/docs/Manual/Deployment/Kubernetes/Upgrading.md
@@ -3,6 +3,8 @@
 The ArangoDB Kubernetes Operator supports upgrading an ArangoDB deployment
 from one version to the next.
 
+## Upgrade an ArangoDB deployment
+
 To upgrade a cluster, change the version by changing
 the `spec.image` setting and then apply the updated
@@ -11,6 +13,21 @@ custom resource using:
 
 ```bash
 kubectl apply -f yourCustomResourceFile.yaml
 ```
 
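+Alternatively, you can change the image directly with `kubectl patch`.
+This is only a sketch, assuming a deployment named `my-arango` and a
+hypothetical target version:
+
+```bash
+kubectl patch arango my-arango --type merge \
+  -p '{"spec":{"image":"arangodb/arangodb:3.3.10"}}'
+```
+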
+The ArangoDB operator will perform a sequential upgrade
+of all servers in your deployment. Only one server is upgraded
+at a time.
+
+For patch level upgrades (e.g. 3.3.9 to 3.3.10) each server
+is stopped and restarted with the new version.
+
+For minor level upgrades (e.g. 3.3.9 to 3.4.0) each server
+is stopped, then the new version is started with `--database.auto-upgrade`,
+and once that has finished, the new version is started with the normal arguments.
+
+The process for major level upgrades depends on the specific version.
+
+## Upgrade the operator itself
+
 To update the ArangoDB Kubernetes Operator itself to a new version,
 update the image version of the deployment resource
 and apply it using:
@@ -18,3 +35,7 @@
 ```bash
 kubectl apply -f examples/yourUpdatedDeployment.yaml
 ```
+
+## See also
+
+- [Scaling](./Scaling.md)
\ No newline at end of file
diff --git a/docs/Manual/Deployment/Kubernetes/Usage.md b/docs/Manual/Deployment/Kubernetes/Usage.md
index 5e02b338b..627bcb676 100644
--- a/docs/Manual/Deployment/Kubernetes/Usage.md
+++ b/docs/Manual/Deployment/Kubernetes/Usage.md
@@ -8,23 +8,31 @@
 cluster first. To do so, run
 (replace `<version>` with the version of the operator that you want to install):
 
 ```bash
-kubectl apply -f https://raw.githubusercontent.com/arangodb/kube-arangodb/<version>/manifests/crd.yaml
-kubectl apply -f https://raw.githubusercontent.com/arangodb/kube-arangodb/<version>/manifests/arango-deployment.yaml
+export URLPREFIX=https://raw.githubusercontent.com/arangodb/kube-arangodb/<version>/manifests
+kubectl apply -f $URLPREFIX/crd.yaml
+kubectl apply -f $URLPREFIX/arango-deployment.yaml
 ```
 
-To use `ArangoLocalStorage`, also run:
+To use `ArangoLocalStorage` resources, also run:
 
 ```bash
-kubectl apply -f https://raw.githubusercontent.com/arangodb/kube-arangodb/<version>/manifests/arango-storage.yaml
+kubectl apply -f $URLPREFIX/arango-storage.yaml
+```
+
+To use `ArangoDeploymentReplication` resources, also run:
+
+```bash
+kubectl apply -f $URLPREFIX/arango-deployment-replication.yaml
 ```
 
 You can find the latest release of the ArangoDB Kubernetes Operator
 [in the kube-arangodb repository](https://github.com/arangodb/kube-arangodb/releases/latest).
 
-## Cluster creation
+## ArangoDB deployment creation
 
-Once the operator is running, you can create your ArangoDB cluster
-by creating a custom resource and deploying it.
+Once the operator is running, you can create your ArangoDB database deployment
+by creating an `ArangoDeployment` custom resource and deploying it into your
+Kubernetes cluster.
 
 For example (all examples can be found [in the kube-arangodb repository](https://github.com/arangodb/kube-arangodb/tree/master/examples)):
 
@@ -32,9 +40,9 @@
 kubectl apply -f examples/simple-cluster.yaml
 ```
 
-## Cluster removal
+## Deployment removal
 
-To remove an existing cluster, delete the custom
+To remove an existing ArangoDB deployment, delete the custom
 resource. The operator will then delete all created resources.
 
 For example:
 
@@ -43,6 +51,10 @@
 kubectl delete -f examples/simple-cluster.yaml
 ```
 
+**Note that this will also delete all data in your ArangoDB deployment!**
+
+If you want to keep your data, make sure to create a backup before removing the deployment.
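+
+For example, a backup can be created with `arangodump`. This is a sketch;
+the endpoint and credentials are assumptions (see
+[Driver configuration](./DriverConfiguration.md) for finding your endpoint):
+
+```bash
+arangodump --server.endpoint ssl://192.168.10.208:8529 \
+  --server.username root \
+  --output-directory ./backup
+```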
+
 
 ## Operator removal
 
 To remove the entire ArangoDB Kubernetes Operator, remove all
@@ -50,6 +62,14 @@ clusters first and then remove the operator by running:
 
 ```bash
 kubectl delete deployment arango-deployment-operator
-# If `ArangoLocalStorage` is installed
+# If `ArangoLocalStorage` operator is installed
 kubectl delete deployment -n kube-system arango-storage-operator
+# If `ArangoDeploymentReplication` operator is installed
+kubectl delete deployment arango-deployment-replication-operator
 ```
+
+## See also
+
+- [Driver configuration](./DriverConfiguration.md)
+- [Scaling](./Scaling.md)
+- [Upgrading](./Upgrading.md)