
[stable/concourse] Upgrade to Concourse 3.8.0.
Make necessary improvements to Concourse worker lifecycle management.

Add additional fatal errors emitted as of Concourse 3.8.0 that should
trigger a restart, and remove "unknown volume" as a fatal error, since it
happens normally when running multiple concourse-web pods.

Try to start workers with a clean slate by cleaning up previous
incarnations of the worker: call retire-worker and clear the
concourse-work-dir before starting.

Call retire-worker in a loop and don't exit that loop until the old
worker is gone. This allows us to remove the fixed worker.postStopDelaySeconds
duration.
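
In shell terms, the retire step amounts to the following loop, which keeps asking to retire the previous incarnation until the command reports the worker is gone (this mirrors the wrapper script added to the worker StatefulSet template below):

```sh
# Keep retiring the previous incarnation of this worker; the command's
# output contains "worker-not-found" once there is nothing left to retire.
while ! concourse retire-worker --name=${HOSTNAME} | grep -q worker-not-found; do
  sleep 5
done
```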

Add a note about persistent volumes being necessary.

Add containerPlacementStrategy, defaulting to random, to better spread
load across workers.
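
These settings can be overridden at install or upgrade time; a minimal sketch, assuming an existing release named `my-release`:

```console
$ helm upgrade my-release stable/concourse \
    --set concourse.containerPlacementStrategy=volume-locality \
    --set worker.terminationGracePeriodSeconds=120
```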
Will Tran committed Jan 7, 2018
1 parent d035ccb commit 1918e0f
Showing 6 changed files with 46 additions and 32 deletions.
4 changes: 2 additions & 2 deletions stable/concourse/Chart.yaml
@@ -1,6 +1,6 @@
name: concourse
version: 0.10.8
appVersion: 3.6.0
version: 0.11.0
appVersion: 3.8.0
description: Concourse is a simple and scalable CI system.
icon: https://avatars1.githubusercontent.com/u/7809479
keywords:
14 changes: 8 additions & 6 deletions stable/concourse/README.md
@@ -55,7 +55,7 @@ $ kubectl scale statefulset my-release-worker --replicas=3

### Restarting workers

If a worker isn't taking on work, you can restart the worker with `kubectl delete pod`. This will initiate a graceful shutdown by "retiring" the worker, with some waiting time before the worker starts up again to ensure concourse doesn't try looking for old volumes on the new worker. The values `worker.postStopDelaySeconds` and `worker.terminationGracePeriodSeconds` can be used to tune this.
If a worker isn't taking on work, you can restart the worker with `kubectl delete pod`. This will initiate a graceful shutdown by "retiring" the worker, to ensure Concourse doesn't try looking for old volumes on the new worker. The value `worker.terminationGracePeriodSeconds` can be used to provide an upper limit on graceful shutdown time before forcefully terminating the container.
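
For example, assuming a release named `my-release` (so worker pods follow the `my-release-worker-N` naming used above):

```console
$ kubectl delete pod my-release-worker-0
```

The pod's preStop hook retires the worker before the StatefulSet brings up a replacement.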

### Worker Liveness Probe

@@ -68,7 +68,7 @@ The following tables lists the configurable parameters of the Concourse chart an
| Parameter | Description | Default |
| ----------------------- | ---------------------------------- | ---------------------------------------------------------- |
| `image` | Concourse image | `concourse/concourse` |
| `imageTag` | Concourse image version | `3.3.2` |
| `imageTag` | Concourse image version | `3.8.0` |
| `imagePullPolicy` |Concourse image pull policy | `Always` if `imageTag` is `latest`, else `IfNotPresent` |
| `concourse.username` | Concourse Basic Authentication Username | `concourse` |
| `concourse.password` | Concourse Basic Authentication Password | `concourse` |
@@ -85,6 +85,7 @@ The following tables lists the configurable parameters of the Concourse chart an
| `concourse.oldResourceGracePeriod` | How long to cache the result of a get step after a newer version of the resource is found | `5m` |
| `concourse.resourceCacheCleanupInterval` | The interval on which to check for and release old caches of resource versions | `30s` |
| `concourse.baggageclaimDriver` | The filesystem driver used by baggageclaim | `naive` |
| `concourse.containerPlacementStrategy` | The selection strategy for placing containers onto workers | `random` |
| `concourse.externalURL` | URL used to reach any ATC from the outside world | `nil` |
| `concourse.dockerRegistry` | An URL pointing to the Docker registry to use to fetch Docker images | `nil` |
| `concourse.insecureDockerRegistry` | Docker registry(ies) (comma separated) to allow connecting to even if not secure | `nil` |
@@ -126,8 +127,7 @@ The following tables lists the configurable parameters of the Concourse chart an
| `worker.minAvailable` | Minimum number of workers available after an eviction | `1` |
| `worker.resources` | Concourse Worker resource requests and limits | `{requests: {cpu: "100m", memory: "512Mi"}}` |
| `worker.additionalAffinities` | Additional affinities to apply to worker pods. E.g: node affinity | `nil` |
| `worker.postStopDelaySeconds` | Time to wait after graceful shutdown of worker before starting up again | `60` |
| `worker.terminationGracePeriodSeconds` | Upper bound for graceful shutdown, including `worker.postStopDelaySeconds` | `120` |
| `worker.terminationGracePeriodSeconds` | Upper bound for graceful shutdown to allow the worker to drain its tasks | `60` |
| `worker.fatalErrors` | Newline delimited strings which, when logged, should trigger a restart of the worker | *See [values.yaml](values.yaml)* |
| `worker.updateStrategy` | `OnDelete` or `RollingUpdate` (requires Kubernetes >= 1.7) | `RollingUpdate` |
| `worker.podManagementPolicy` | `OrderedReady` or `Parallel` (requires Kubernetes >= 1.7) | `Parallel` |
@@ -205,7 +205,7 @@ concourse:
< Insert the contents of your concourse-keys/worker_key.pub file >
```

Alternativelly, you can provide those keys to `helm install` via parameters:
Alternatively, you can provide those keys to `helm install` via parameters:


```console
@@ -243,6 +243,8 @@ persistence:
size: "20Gi"
```

It is highly recommended to use Persistent Volumes for Concourse Workers; otherwise container images managed by the Worker are stored in an `emptyDir` volume on the node's disk. This will interfere with k8s ImageGC and the node's disk will fill up as a result. This will be fixed in a future release of k8s: https://github.com/kubernetes/kubernetes/pull/57020
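
If persistence has been disabled, it can be turned back on for an existing release with a values override; a sketch, assuming the chart's `persistence.enabled` flag and a release named `my-release`:

```console
$ helm upgrade my-release stable/concourse --set persistence.enabled=true
```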

### Ingress TLS

If your cluster allows automatic creation/retrieval of TLS certificates (e.g. [kube-lego](https://github.com/jetstack/kube-lego)), please refer to the documentation for that mechanism.
@@ -328,5 +330,5 @@ credentialManager:
## initial periodic token issued for concourse
## ref: https://www.vaultproject.io/docs/concepts/tokens.html#periodic-tokens
##
clientToken: PERIODIC_VAULT_TOKEN
clientToken: PERIODIC_VAULT_TOKEN
```
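
Overrides like the Vault settings above are typically kept in a local values file and passed to the install; a sketch, assuming the file is saved as `vault-values.yaml`:

```console
$ helm install stable/concourse --name my-release -f vault-values.yaml
```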
4 changes: 2 additions & 2 deletions stable/concourse/templates/configmap.yaml
@@ -20,6 +20,7 @@ data:
concourse-resource-cache-cleanup-interval: {{ .Values.concourse.resourceCacheCleanupInterval | quote }}
concourse-external-url: {{ default "" .Values.concourse.externalURL | quote }}
concourse-baggageclaim-driver: {{ .Values.concourse.baggageclaimDriver | quote }}
container-placement-strategy: {{ .Values.concourse.containerPlacementStrategy | quote }}
garden-docker-registry: {{ default "" .Values.concourse.dockerRegistry | quote }}
garden-insecure-docker-registry: {{ default "" .Values.concourse.insecureDockerRegistry | quote }}
github-auth-organization: {{ default "" .Values.concourse.githubAuthOrganization | quote }}
@@ -37,10 +38,9 @@ data:
generic-oauth-auth-url-param: {{ default "" .Values.concourse.genericOauthAuthUrlParam | quote }}
generic-oauth-scope: {{ default "" .Values.concourse.genericOauthScope | quote }}
generic-oauth-token-url: {{ default "" .Values.concourse.genericOauthTokenUrl | quote }}
worker-post-stop-delay-seconds: {{ .Values.worker.postStopDelaySeconds | quote }}
worker-fatal-errors: {{ default "" .Values.worker.fatalErrors | quote }}
{{ if .Values.credentialManager.vault }}
vault-url: {{ default "" .Values.credentialManager.vault.url | quote }}
vault-path-prefix: {{ default "/concourse" .Values.credentialManager.vault.pathPrefix | quote }}
vault-path-prefix: {{ default "/concourse" .Values.credentialManager.vault.pathPrefix | quote }}
vault-auth-backend: {{ default "" .Values.credentialManager.vault.authBackend | quote }}
{{ end }}
5 changes: 5 additions & 0 deletions stable/concourse/templates/web-deployment.yaml
@@ -302,6 +302,11 @@ spec:
- name: CONCOURSE_PROMETHEUS_BIND_PORT
value: {{ .Values.web.metrics.prometheus.port | quote }}
{{- end }}
- name: CONCOURSE_CONTAINER_PLACEMENT_STRATEGY
valueFrom:
configMapKeyRef:
name: {{ template "concourse.concourse.fullname" . }}
key: container-placement-strategy
ports:
- name: atc
containerPort: {{ .Values.concourse.atcPort }}
25 changes: 16 additions & 9 deletions stable/concourse/templates/worker-statefulset.yaml
@@ -34,8 +34,13 @@ spec:
- -c
- |-
cp /dev/null /concourse-work-dir/.liveness_probe
rm -rf /concourse-work-dir/*
while ! concourse retire-worker --name=${HOSTNAME} | grep -q worker-not-found; do
touch /concourse-work-dir/.pre_start_cleanup
sleep 5
done
rm -f /concourse-work-dir/.pre_start_cleanup
concourse worker --name=${HOSTNAME} | tee -a /concourse-work-dir/.liveness_probe
sleep ${POST_STOP_DELAY_SECONDS}
livenessProbe:
exec:
command:
@@ -49,16 +54,23 @@ spec:
>&2 echo "Fatal error detected: ${FATAL_ERRORS}"
exit 1
fi
if [ -f /concourse-work-dir/.pre_start_cleanup ]; then
>&2 echo "Still trying to clean up before starting concourse. 'fly prune-worker -w ${HOSTNAME}' might need to be called to force cleanup."
exit 1
fi
failureThreshold: 1
initialDelaySeconds: 10
periodSeconds: 10
lifecycle:
preStop:
exec:
command:
- "/bin/sh"
- "-c"
- "concourse retire-worker --name=${HOSTNAME}"
- /bin/sh
- -c
- |-
while ! concourse retire-worker --name=${HOSTNAME} | grep -q worker-not-found; do
sleep 5
done
env:
- name: CONCOURSE_TSA_HOST
valueFrom:
@@ -91,11 +103,6 @@ spec:
configMapKeyRef:
name: {{ template "concourse.concourse.fullname" . }}
key: concourse-baggageclaim-driver
- name: POST_STOP_DELAY_SECONDS
valueFrom:
configMapKeyRef:
name: {{ template "concourse.concourse.fullname" . }}
key: worker-post-stop-delay-seconds
- name: LIVENESS_PROBE_FATAL_ERRORS
valueFrom:
configMapKeyRef:
26 changes: 13 additions & 13 deletions stable/concourse/values.yaml
@@ -13,7 +13,7 @@ image: concourse/concourse
## Concourse image version.
## ref: https://hub.docker.com/r/concourse/concourse/tags/
##
imageTag: "3.5.0"
imageTag: "3.8.0"

## Specify a imagePullPolicy: 'Always' if imageTag is 'latest', else set to 'IfNotPresent'.
## ref: https://kubernetes.io/docs/user-guide/images/#pre-pulling-images
@@ -184,6 +184,12 @@ concourse:
##
baggageclaimDriver: naive

## The selection strategy for placing containers onto workers, as of Concourse 3.7 this can be
## "volume-locality" or "random". Random can better spread load across workers, see
## https://github.com/concourse/atc/pull/219 for background.
##
containerPlacementStrategy: random

## An URL pointing to the Docker registry to use to fetch Docker images.
## If unset, this will default to the Docker default
##
@@ -449,24 +455,18 @@ worker:
# value: "value"
# effect: "NoSchedule"

## Time to delay after the worker process shuts down. This inserts time between shutdown and startup
## to avoid errors caused by a worker restart.
postStopDelaySeconds: 60

## Time to allow the pod to terminate before being forcefully terminated. This should include
## postStopDelaySeconds, and should additionally provide time for the worker to retire, e.g.
## = postStopDelaySeconds + max time to allow the worker to drain its tasks. See
## https://concourse.ci/worker-internals.html for worker lifecycle semantics.
terminationGracePeriodSeconds: 120
## Time to allow the pod to terminate before being forcefully terminated. This should provide time for
## the worker to retire, i.e. drain its tasks. See https://concourse.ci/worker-internals.html for worker
## lifecycle semantics.
terminationGracePeriodSeconds: 60

## If any of the strings are found in logs, the worker's livenessProbe will fail and trigger a pod restart.
## Specify one string per line, exact matching is used.
##
## "guardian.api.garden-server.create.failed" appears when the worker's filesystem has issues.
## "unknown handle" appears if a worker didn't cleanly restart.
fatalErrors: |-
guardian.api.garden-server.create.failed
unknown handle
guardian.api.garden-server.run.failed
baggageclaim.api.volume-server.create-volume-async.failed-to-create
## Strategy for StatefulSet updates (requires Kubernetes 1.6+)
## Ref: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset
