
Pods fail to mount secret on k8s 1.3.0 + GKE #372

Closed
mboersma opened this issue Jul 13, 2016 · 12 comments · Fixed by #421
@mboersma
Member

mboersma commented Jul 13, 2016

Builder, database, minio, and registry all mount the "objectstorage-keyfile" secret volume. In k8s 1.3 on GKE, this began to fail (see the final Events listing):

$ kubectl --namespace=deis describe po deis-database-dyqyu 
Name:       deis-database-dyqyu
Namespace:  deis
Node:       gke-mboersma-default-pool-21df92ab-lh7d/10.240.0.16
Start Time: Wed, 13 Jul 2016 15:28:28 -0600
Labels:     app=deis-database
Status:     Pending
IP:     
Controllers:    ReplicationController/deis-database
Containers:
  deis-database:
    Container ID:   
    Image:      quay.io/deisci/postgres:canary
    Image ID:       
    Port:       5432/TCP
    QoS Tier:
      memory:       BestEffort
      cpu:      BestEffort
    State:      Waiting
      Reason:       ContainerCreating
    Ready:      False
    Restart Count:  0
    Readiness:      exec [is_running] delay=30s timeout=1s period=10s #success=1 #failure=3
    Environment Variables:
      DATABASE_STORAGE: minio
Conditions:
  Type      Status
  Initialized   True 
  Ready     False 
  PodScheduled  True 
Volumes:
  database-creds:
    Type:   Secret (a volume populated by a Secret)
    SecretName: database-creds
  objectstore-creds:
    Type:   Secret (a volume populated by a Secret)
    SecretName: objectstorage-keyfile
  deis-database-token-ovk2r:
    Type:   Secret (a volume populated by a Secret)
    SecretName: deis-database-token-ovk2r
Events:
  FirstSeen LastSeen    Count   From                            SubobjectPath   Type        Reason      Message
  --------- --------    -----   ----                            -------------   --------    ------      -------
  4m        4m      1   {default-scheduler }                            Normal      Scheduled   Successfully assigned deis-database-dyqyu to gke-mboersma-default-pool-21df92ab-lh7d
  2m        2m      1   {kubelet gke-mboersma-default-pool-21df92ab-lh7d}           Warning     FailedMount Unable to mount volumes for pod "deis-database-dyqyu_deis(bd127bc0-4940-11e6-af68-42010af001d0)": timeout expired waiting for volumes to attach/mount for pod "deis-database-dyqyu"/"deis". list of unattached/unmounted volumes=[objectstore-creds]
  2m        2m      1   {kubelet gke-mboersma-default-pool-21df92ab-lh7d}           Warning     FailedSync  Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "deis-database-dyqyu"/"deis". list of unattached/unmounted volumes=[objectstore-creds]

This appears to be related to kubernetes/kubernetes#28750 and maybe kubernetes/kubernetes#28898 and kubernetes/kubernetes#28616.

@mboersma changed the title from 'Pods fail to mount "objectstorage-keyfile" secret on k8s 1.3.0 + GKE' to 'Pods fail to mount secret on k8s 1.3.0 + GKE' on Jul 13, 2016
@felixbuenemann
Contributor

I also saw these failures with k8s 1.3.0 using Workflow v2.1.0. The first try was with kube-aws 1.3.0 / hyperkube v1.3.0_coreos.0 / CoreOS Alpha with Docker 1.11.2, and the second on kube-aws 1.3.0 / hyperkube v1.3.0_coreos.1 / CoreOS Beta with Docker 1.10.3.

Here's an example of the logged error events taken from the tectonic console:

Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "slugbuild-edible-magician-f3a78dd4-aae4512e"/"deis". list of unattached/unmounted volumes=[objectstorage-keyfile]
Unable to mount volumes for pod "slugbuild-edible-magician-f3a78dd4-aae4512e_deis(6c69628d-468f-11e6-a66d-029730afa1db)": timeout expired waiting for volumes to attach/mount for pod "slugbuild-edible-magician-f3a78dd4-aae4512e"/"deis". list of unattached/unmounted volumes=[objectstorage-keyfile]

@mboersma
Member Author

A workaround in GKE is to choose the "Change" link for your Node Pool and roll it back to k8s 1.2.5.

@rimusz
Contributor

rimusz commented Jul 22, 2016

Not seeing the issue with Workflow v2.2.0 on GKE with Kubernetes v1.3.2 using GCS storage.
Also tried on kube-solo and kube-cluster (minio as storage) and did not hit any problems there either.

@felixbuenemann
Contributor

Chances are you just haven't hit the bug yet. The bugfix was merged into kubernetes master two days ago (see kubernetes/kubernetes#28939) and will likely land in k8s 1.3.3. The problem is triggered when mounting secrets, so it doesn't matter what kind of storage you are using.

@rimusz
Contributor

rimusz commented Jul 22, 2016

@felixbuenemann you're right, I got hit by that bug on the fourth cluster on GKE.

@JorritSalverda

Just checked with 1.3.2, and it is indeed still an issue there. I'll retry later this week when 1.3.3 is available on GKE.

@sstarcher
Contributor

Looks like it will land in 1.3.4 instead.

@bacongobbler
Member

Yes, apparently @mboersma received tentative acknowledgement that it will land in 1.3.4.

@felixbuenemann
Contributor

If someone wants to check whether it's fixed, k8s 1.3.4 was released a couple of hours ago.

@sstarcher
Contributor

I'll be testing this out tomorrow.

@bacongobbler
Member

Yes, I manually tested this and it was fixed with 1.3.4-beta.0.

@mboersma
Member Author

mboersma commented Aug 1, 2016

k8s 1.3.4 was released a couple of hours ago

Excellent! Thanks @felixbuenemann, we'll test again to make sure.

(I've also manually tested with k8s v1.4.0-beta2, and the bug stayed fixed.)
