New deployment sometimes leaves old ReplicaSet with active replicas #41641

Open
edevil opened this Issue Feb 17, 2017 · 5 comments


@edevil
Contributor
edevil commented Feb 17, 2017

Is this a request for help?
No.

What keywords did you search in Kubernetes issues before filing this one?
dangling ReplicaSet


Is this a BUG REPORT or FEATURE REQUEST?
BUG REPORT

Kubernetes version (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.2", GitCommit:"08e099554f3c31f6e6f07b448ab3ed78d0520507", GitTreeState:"clean", BuildDate:"2017-01-12T07:31:07Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1+coreos.0", GitCommit:"cc65f5321f9230bf9a3fa171155c1213d6e3480e", GitTreeState:"clean", BuildDate:"2016-12-14T04:08:28Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration: Azure A1
  • OS (e.g. from /etc/os-release):
    NAME="Container Linux by CoreOS"
    ID=coreos
    VERSION=1235.9.0
    VERSION_ID=1235.9.0
    BUILD_ID=2017-02-02-0235
    PRETTY_NAME="Container Linux by CoreOS 1235.9.0 (Ladybug)"
    ANSI_COLOR="38;5;75"
    HOME_URL="https://coreos.com/"
    BUG_REPORT_URL="https://github.com/coreos/bugs/issues"
  • Kernel (e.g. uname -a): Linux node-3-vm 4.7.3-coreos-r2 #1 SMP Thu Feb 2 02:26:10 UTC 2017 x86_64 Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz GenuineIntel GNU/Linux
  • Install tools: own
  • Others:

What happened:

Created a new deployment by changing the image tag in the deployment descriptor and running kubectl apply -f "ds.yaml". Noticed that one pod from the previous ReplicaSet kept running.
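Roughly, the workflow that triggers this looks like the following (a sketch; NEW_TAG stands in for the real tag and XXX for the registry path, as in the descriptor below):

# edit the image tag in ds.yaml (image: XXX:NEW_TAG), then re-apply it
kubectl apply -f ds.yaml --namespace=tvifttt

# afterwards, a pod from the previous ReplicaSet is sometimes still Running
kubectl get pods --namespace=tvifttt
kubectl get rs --namespace=tvifttt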

What you expected to happen:

All pods from the previous ReplicaSet should have been terminated.
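For reference, this is how I check whether a rollout has fully completed (just the commands, using the labels from the descriptor below):

# wait for the rollout to finish, then confirm only the newest ReplicaSet still has replicas
kubectl rollout status deployment/tviftttapp --namespace=tvifttt
kubectl get rs --namespace=tvifttt -l app=tviftttapp,tier=frontend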

How to reproduce it (as minimally and precisely as possible):

I cannot reliably trigger this. I do several deployments per hour and the problem only manifests occasionally.
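The closest thing to a reproducer is rolling the deployment back and forth and watching for leftovers, something like the loop below (a sketch; in practice I use kubectl apply with a changed tag, and TAG_A/TAG_B stand in for real image tags):

# alternate between two image tags and check for pods left over from the old ReplicaSet
for tag in TAG_A TAG_B TAG_A TAG_B; do
  kubectl set image deployment/tviftttapp tviftttapp=XXX:$tag --namespace=tvifttt
  kubectl rollout status deployment/tviftttapp --namespace=tvifttt
  kubectl get rs --namespace=tvifttt
done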

Anything else we need to know:

> kubectl get pods --namespace=tvifttt
NAME                                 READY     STATUS    RESTARTS   AGE
...
tviftttapp-3058473571-c0v2k          1/1       Running   0          1h <------------
tviftttapp-4232354410-c1m13          1/1       Running   0          25m
tviftttapp-4232354410-hr7b4          1/1       Running   0          25m
> kubectl get rs --namespace=tvifttt
NAME                           DESIRED   CURRENT   READY     AGE
...
tviftttapp-1055562301          0         0         0         6d
tviftttapp-1076664945          0         0         0         4d
tviftttapp-1162123786          0         0         0         9d
tviftttapp-1333959179          0         0         0         9d
tviftttapp-1334680075          0         0         0         8d
tviftttapp-146905007           0         0         0         10d
tviftttapp-1506515468          0         0         0         7d
tviftttapp-1585617524          0         0         0         8d
tviftttapp-1655740915          0         0         0         9d
tviftttapp-1870895682          0         0         0         9d
tviftttapp-2074515880          0         0         0         7d
tviftttapp-208247352           0         0         0         8d
tviftttapp-2088802728          0         0         0         10d
tviftttapp-2154994192          0         0         0         9d
tviftttapp-2321390097          0         0         0         9d
tviftttapp-2343410193          0         0         0         7d
tviftttapp-2389416336          0         0         0         6d
tviftttapp-2421529002          0         0         0         3d
tviftttapp-2612435423          0         0         0         9d
tviftttapp-2656999955          0         0         0         8d
tviftttapp-2714737223          0         0         0         3h
tviftttapp-2782697952          0         0         0         9d
tviftttapp-3058473571          1         1         1         1h    <--------------------
tviftttapp-3129580002          0         0         0         9d
tviftttapp-3322845719          0         0         0         8d
tviftttapp-3336608305          0         0         0         8d
tviftttapp-3338181169          0         0         0         3h
tviftttapp-3499661874          0         0         0         7d
tviftttapp-3560282726          0         0         0         9d
tviftttapp-3695024691          0         0         0         2d
tviftttapp-397973892           0         0         0         8d
tviftttapp-4008876597          0         0         0         7d
tviftttapp-4101544605          0         0         0         6d
tviftttapp-4232354410          2         2         2         26m
tviftttapp-461281746           0         0         0         7d
tviftttapp-461478561           0         0         0         8d
tviftttapp-732207675           0         0         0         2h
tviftttapp-835623432           0         0         0         2d
tviftttapp-875534908           0         0         0         7d

Deployment descriptor:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: tviftttapp
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: tviftttapp
        tier: frontend
    spec:
      containers:
      - name: tviftttapp
        image: XXX:TTT
        imagePullPolicy: Always
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
        ports:
        - containerPort: 80
        livenessProbe:
          httpGet:
            path: /tv/
            port: 80
          initialDelaySeconds: 15
          timeoutSeconds: 1
        readinessProbe:
          httpGet:
            path: /tv/
            port: 80
          timeoutSeconds: 1
      imagePullSecrets:
      - name: brpxprivate
  strategy:
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 2
  minReadySeconds: 2
@kargakis
Member

Can you also post the output of

kubectl get deployment tviftttapp -o yaml
kubectl get rs (dangling rs) -o yaml

when this manifests again?

@edevil
Contributor
edevil commented Feb 20, 2017

Here are the current deployment, the dangling RS, and the "good" RS.

Deployment:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "48"
    kubectl.kubernetes.io/last-applied-configuration: '{"kind":"Deployment","apiVersion":"extensions/v1beta1","metadata":{"name":"tviftttapp","creationTimestamp":null},"spec":{"replicas":2,"template":{"metadata":{"creationTimestamp":null,"labels":{"app":"tviftttapp","tier":"frontend"}},"spec":{"containers":[{"name":"tviftttapp","image":"brpxregistry-andrecabine.azurecr.io/tviftttapp:master.b34d6257","ports":[{"containerPort":80}],"resources":{"requests":{"cpu":"100m","memory":"100Mi"}},"livenessProbe":{"httpGet":{"path":"/tv/","port":80},"initialDelaySeconds":15,"timeoutSeconds":1},"readinessProbe":{"httpGet":{"path":"/tv/","port":80},"timeoutSeconds":1},"imagePullPolicy":"Always"}],"imagePullSecrets":[{"name":"brpxprivate"}]}},"strategy":{"rollingUpdate":{"maxUnavailable":0,"maxSurge":2}},"minReadySeconds":2},"status":{}}'
  creationTimestamp: 2017-02-06T16:09:12Z
  generation: 48
  labels:
    app: tviftttapp
    tier: frontend
  name: tviftttapp
  namespace: tvifttt
  resourceVersion: "7867565"
  selfLink: /apis/extensions/v1beta1/namespaces/tvifttt/deployments/tviftttapp
  uid: 99208627-ec86-11e6-844e-000d3a2709aa
spec:
  minReadySeconds: 2
  replicas: 2
  selector:
    matchLabels:
      app: tviftttapp
      tier: frontend
  strategy:
    rollingUpdate:
      maxSurge: 2
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: tviftttapp
        tier: frontend
    spec:
      containers:
      - env:
        - name: REACT_APP_USE_ESPIAL
          value: "true"
        image: XXX:master.b34d6257
        imagePullPolicy: Always
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /tv/
            port: 80
            scheme: HTTP
          initialDelaySeconds: 15
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: tviftttapp
        ports:
        - containerPort: 80
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /tv/
            port: 80
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
        terminationMessagePath: /dev/termination-log
      dnsPolicy: ClusterFirst
      imagePullSecrets:
      - name: brpxprivate
      restartPolicy: Always
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 2
  conditions:
  - lastTransitionTime: 2017-02-20T11:55:09Z
    lastUpdateTime: 2017-02-20T11:55:09Z
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  observedGeneration: 48
  replicas: 3
  unavailableReplicas: 1
  updatedReplicas: 2

Dangling RS:

apiVersion: extensions/v1beta1
kind: ReplicaSet
metadata:
  annotations:
    deployment.kubernetes.io/desired-replicas: "2"
    deployment.kubernetes.io/max-replicas: "4"
    deployment.kubernetes.io/revision: "47"
  creationTimestamp: 2017-02-20T11:53:29Z
  generation: 2
  labels:
    app: tviftttapp
    pod-template-hash: "1785829850"
    tier: frontend
  name: tviftttapp-1785829850
  namespace: tvifttt
  resourceVersion: "7867563"
  selfLink: /apis/extensions/v1beta1/namespaces/tvifttt/replicasets/tviftttapp-1785829850
  uid: 31de39a6-f763-11e6-844e-000d3a2709aa
spec:
  minReadySeconds: 2
  replicas: 1
  selector:
    matchLabels:
      app: tviftttapp
      pod-template-hash: "1785829850"
      tier: frontend
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: tviftttapp
        pod-template-hash: "1785829850"
        tier: frontend
    spec:
      containers:
      - env:
        - name: REACT_APP_USE_ESPIAL
          value: "true"
        image: XXX:master.057f2e23
        imagePullPolicy: Always
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /tv/
            port: 80
            scheme: HTTP
          initialDelaySeconds: 15
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: tviftttapp
        ports:
        - containerPort: 80
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /tv/
            port: 80
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
        terminationMessagePath: /dev/termination-log
      dnsPolicy: ClusterFirst
      imagePullSecrets:
      - name: brpxprivate
      restartPolicy: Always
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 1
  fullyLabeledReplicas: 1
  observedGeneration: 2
  readyReplicas: 1
  replicas: 1

Current RS:

apiVersion: extensions/v1beta1
kind: ReplicaSet
metadata:
  annotations:
    deployment.kubernetes.io/desired-replicas: "2"
    deployment.kubernetes.io/max-replicas: "4"
    deployment.kubernetes.io/revision: "48"
  creationTimestamp: 2017-02-20T11:56:33Z
  generation: 1
  labels:
    app: tviftttapp
    pod-template-hash: "2297993693"
    tier: frontend
  name: tviftttapp-2297993693
  namespace: tvifttt
  resourceVersion: "7867553"
  selfLink: /apis/extensions/v1beta1/namespaces/tvifttt/replicasets/tviftttapp-2297993693
  uid: 9ff8f60a-f763-11e6-844e-000d3a2709aa
spec:
  minReadySeconds: 2
  replicas: 2
  selector:
    matchLabels:
      app: tviftttapp
      pod-template-hash: "2297993693"
      tier: frontend
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: tviftttapp
        pod-template-hash: "2297993693"
        tier: frontend
    spec:
      containers:
      - env:
        - name: REACT_APP_USE_ESPIAL
          value: "true"
        image: XXX:master.b34d6257
        imagePullPolicy: Always
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /tv/
            port: 80
            scheme: HTTP
          initialDelaySeconds: 15
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: tviftttapp
        ports:
        - containerPort: 80
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /tv/
            port: 80
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
        terminationMessagePath: /dev/termination-log
      dnsPolicy: ClusterFirst
      imagePullSecrets:
      - name: brpxprivate
      restartPolicy: Always
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 1
  fullyLabeledReplicas: 2
  observedGeneration: 1
  readyReplicas: 2
  replicas: 2
@kargakis
Member

It seems that status.availableReplicas for the new ReplicaSet is never correctly updated to 2. Short term, if you stop using minReadySeconds (minReadySeconds=2 doesn't offer you much anyway), this problem should go away. Long term, we should fix the status update.
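If you want to drop the field from the live object rather than wait for the next apply, something like this should work (an untested sketch):

# remove minReadySeconds from the live Deployment with a JSON patch
kubectl patch deployment tviftttapp --namespace=tvifttt --type=json \
  -p='[{"op": "remove", "path": "/spec/minReadySeconds"}]'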

cc: @janetkuo @kubernetes/sig-apps-bugs

@kargakis
Member

@edevil @gg7 can one of you paste logs from the controller manager from when this issue occurs? The log level should ideally be set to 2.
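In case it helps to locate them: assuming the controller manager runs as a pod in kube-system (adjust if it is a systemd unit, in which case journalctl -u kube-controller-manager is the place to look), and that it is started with --v=2 or higher, something like this should pull out the relevant lines:

# find the controller manager pod and grep its logs for the affected deployment
kubectl get pods --namespace=kube-system | grep controller-manager
kubectl logs --namespace=kube-system <controller-manager-pod> | grep tviftttapp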
