Issue 121 setup ark #156

Merged
merged 3 commits into from Jan 29, 2019
194 changes: 194 additions & 0 deletions cluster-conf/ark/00-prereqs.yaml
@@ -0,0 +1,194 @@
# Copyright 2017 the Heptio Ark contributors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: backups.ark.heptio.com
  labels:
    component: ark
spec:
  group: ark.heptio.com
  version: v1
  scope: Namespaced
  names:
    plural: backups
    kind: Backup

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: schedules.ark.heptio.com
  labels:
    component: ark
spec:
  group: ark.heptio.com
  version: v1
  scope: Namespaced
  names:
    plural: schedules
    kind: Schedule

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: restores.ark.heptio.com
  labels:
    component: ark
spec:
  group: ark.heptio.com
  version: v1
  scope: Namespaced
  names:
    plural: restores
    kind: Restore

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: downloadrequests.ark.heptio.com
  labels:
    component: ark
spec:
  group: ark.heptio.com
  version: v1
  scope: Namespaced
  names:
    plural: downloadrequests
    kind: DownloadRequest

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: deletebackuprequests.ark.heptio.com
  labels:
    component: ark
spec:
  group: ark.heptio.com
  version: v1
  scope: Namespaced
  names:
    plural: deletebackuprequests
    kind: DeleteBackupRequest

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: podvolumebackups.ark.heptio.com
  labels:
    component: ark
spec:
  group: ark.heptio.com
  version: v1
  scope: Namespaced
  names:
    plural: podvolumebackups
    kind: PodVolumeBackup

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: podvolumerestores.ark.heptio.com
  labels:
    component: ark
spec:
  group: ark.heptio.com
  version: v1
  scope: Namespaced
  names:
    plural: podvolumerestores
    kind: PodVolumeRestore

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: resticrepositories.ark.heptio.com
  labels:
    component: ark
spec:
  group: ark.heptio.com
  version: v1
  scope: Namespaced
  names:
    plural: resticrepositories
    kind: ResticRepository

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: backupstoragelocations.ark.heptio.com
  labels:
    component: ark
spec:
  group: ark.heptio.com
  version: v1
  scope: Namespaced
  names:
    plural: backupstoragelocations
    kind: BackupStorageLocation

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: volumesnapshotlocations.ark.heptio.com
  labels:
    component: ark
spec:
  group: ark.heptio.com
  version: v1
  scope: Namespaced
  names:
    plural: volumesnapshotlocations
    kind: VolumeSnapshotLocation

---
apiVersion: v1
kind: Namespace
metadata:
  name: heptio-ark

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ark
  namespace: heptio-ark
  labels:
    component: ark

---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: ark
  labels:
    component: ark
subjects:
  - kind: ServiceAccount
    namespace: heptio-ark
    name: ark
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
15 changes: 15 additions & 0 deletions cluster-conf/ark/05-ark-backupstoragelocation.yaml
@@ -0,0 +1,15 @@
# Note that the bucket name is hardcoded here,
# so this manifest cannot be reused as-is for another cluster or environment.

apiVersion: ark.heptio.com/v1
kind: BackupStorageLocation
metadata:
  name: default
  namespace: heptio-ark
spec:
  provider: aws
  objectStorage:
    bucket: ark-prod-us-west-2
  config:
    region: us-west-2
    kmsKeyId: alias/ark-prod-us-west-2
9 changes: 9 additions & 0 deletions cluster-conf/ark/06-ark-volumesnapshotlocation.yaml
@@ -0,0 +1,9 @@
apiVersion: ark.heptio.com/v1
kind: VolumeSnapshotLocation
metadata:
  name: aws-default
  namespace: heptio-ark
spec:
  provider: aws
  config:
    region: us-west-2
75 changes: 75 additions & 0 deletions cluster-conf/ark/10-deployment-kube2iam.yaml
@@ -0,0 +1,75 @@
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  namespace: heptio-ark
  name: ark
  labels:
    k8s-app: ark
spec:
  replicas: 1
  template:
    metadata:
      labels:
        component: ark
        k8s-app: ark
      annotations:
        iam.amazonaws.com/role: arn:aws:iam::320464205386:role/ark-role-prod-us-west-2
    spec:
      restartPolicy: Always
      serviceAccountName: ark
      containers:
        - name: ark
          image: gcr.io/heptio-images/ark:v0.10.1
          ports:
            - name: metrics
              containerPort: 8085
          command:
            - /ark
          args:
            - server
          volumeMounts:
            - name: plugins
              mountPath: /plugins
          env:
            - name: AWS_CLUSTER_NAME
              value: kubernetes-prod-us-west-2
      volumes:
        - name: plugins
          emptyDir: {}

---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: ark
  name: ark
  namespace: heptio-ark
spec:
  endpoints:
  - interval: 60s
    port: metrics
    scheme: http
  jobLabel: k8s-ark
  selector:
    matchLabels:
      k8s-app: ark
---
apiVersion: v1
kind: Service
metadata:
  name: ark
  labels:
    name: ark
    k8s-app: ark
  namespace: heptio-ark
spec:
  selector:
    k8s-app: ark
  ports:
  - name: metrics
    protocol: TCP
    port: 8085
  clusterIP: None


12 changes: 12 additions & 0 deletions docs/runbooks/cluster-and-services.md
@@ -11,6 +11,7 @@ See the [README](/README.md) for related documents.
- [Pod stuck in ContainerCreating state](#pod-stuck-containercreating)
- [No Resources Available](#no-resources-available)
- [ConfigMap/Secret/Volume not found](#resource-not-found)
- [Disaster recovery and backups](#dr-and-backups)
- [Cluster services](#cluster-services)
- [MongoDB](#mongodb)

@@ -37,6 +38,17 @@ It can be that the ASG already hit the maximum number of instances it can scale.

Another possibility is that the Pod tries to mount a volume that was created in an Availability Zone where no nodes are currently running. We have seen this problem and worked around it by scaling up a Deployment to force the cluster to create new nodes. Once a node exists in the right Availability Zone, scale the Deployment down again.

# <a id="dr-and-backups"></a>Disaster recovery and backups

To back up Kubernetes configuration, secrets and persistent volumes we use Ark, a tool developed by Heptio that simplifies taking and restoring this kind of backup.

Ark consists of a server running in the cluster and a client running on your local machine. To schedule, manage and restore backups, first install the Ark client, which can be downloaded from the [releases page](https://github.com/heptio/ark/releases). It uses your kubeconfig file to find the right cluster, so if you can already access the Kubernetes cluster, you are all set.
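
As an illustration of scheduling, a recurring backup can be declared as a `Schedule` resource (the kind registered by the `schedules.ark.heptio.com` CRD in `00-prereqs.yaml`). This is a minimal sketch, not part of this change: the name `daily-cluster-backup`, the cron expression and the 30-day TTL are assumed values, so adjust them before applying anything.

```yaml
# Hypothetical example: name, schedule and ttl are illustrative assumptions.
apiVersion: ark.heptio.com/v1
kind: Schedule
metadata:
  name: daily-cluster-backup
  namespace: heptio-ark
spec:
  # Standard cron syntax: every day at 01:00.
  schedule: "0 1 * * *"
  template:
    # Back up every namespace.
    includedNamespaces:
    - '*'
    # Keep each backup for 30 days.
    ttl: 720h0m0s
```

Backups created from a schedule should then show up in `ark backup get` with a timestamp suffix.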

You can list the available backups by running `ark backup get`. You should see several backups, each with a timestamp. Find the one you want to restore and run `ark restore create --from-backup <backup-name>`, replacing `<backup-name>` with the name of that backup. Once this is done, you can follow the restore process by running `ark restore get`.
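
For reference, the restore command results in a `Restore` object in the `heptio-ark` namespace. The sketch below shows roughly what an equivalent manifest could look like, assuming a backup named `daily-cluster-backup-20190129010000` exists; both names are hypothetical, so use a backup name listed by `ark backup get`.

```yaml
# Hypothetical example: the restore and backup names are placeholders.
apiVersion: ark.heptio.com/v1
kind: Restore
metadata:
  name: restore-daily-cluster-backup
  namespace: heptio-ark
spec:
  # Name of an existing Backup in the same namespace.
  backupName: daily-cluster-backup-20190129010000
  # Also restore persistent volumes from their snapshots.
  restorePVs: true
```

Using the `ark` client remains the simpler path; the manifest is mainly useful for understanding what `ark restore get` reports.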

If you need more information, check the official [Ark documentation](https://heptio.github.io/ark/v0.10.0/index.html).


# <a id="cluster-services"></a>Cluster services

## <a id="mongodb"></a>MongoDB
@@ -0,0 +1 @@
data "aws_caller_identity" "current" {}