
ImagePullBackOff when node failed #2485

Closed
stepan111 opened this issue Jul 7, 2021 · 3 comments
@stepan111

Hello,

First of all, thanks for the great project!

I am running a Camel K integration on a GKE cluster with preemptible nodes (nodes are recreated daily).

Currently I have to delete and re-apply the integration each time Kubernetes migrates the pod to another node, because of an ImagePullBackOff error:

$ k get po -o wide
NAME                                READY   STATUS             RESTARTS   AGE    IP              NODE                                                  NOMINATED NODE   READINESS GATES
camel-k-operator-6f848d7b5c-kgrrp   0/2     Shutdown           0          30h    <none>          gke-sandbox-blue-sandbox-blue-default-f0ed42c1-3p5d   <none>           <none>
camel-k-operator-6f848d7b5c-wjd9n   2/2     Running            2          9h     10.100.18.144   gke-sandbox-blue-sandbox-blue-default-ee794f56-16s7   <none>           <none>
gmail-5d5b644958-hx9gm              1/2     ImagePullBackOff   0          5m1s   10.100.20.131   gke-sandbox-blue-sandbox-blue-default-f0ed42c1-3p5d   <none>           <none>
gmail-5d5b644958-sb2xt              0/2     Shutdown           0          25h    <none>          gke-sandbox-blue-sandbox-blue-default-ee794f56-16s7   <none>           <none>

I've added the following traits to the integration:

traits:
  pull-secret:
    configuration:
      auto: false
      enabled: true
      secretName: registry
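
For reference, a secret like the `registry` one referenced above would typically be created with kubectl's docker-registry secret type (the server and credential values below are placeholders):

$ k create secret docker-registry registry \
    --docker-server=<registry-url> \
    --docker-username=<user> \
    --docker-password=<password>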

But it doesn't take effect on the deployment made for this integration:

$ k get deploy/integration -o yaml | grep imagePullSecret
$ 

Operator version:

$ k get deploy/camel-k-operator -o jsonpath={.spec.template.spec.containers..image}        
docker.io/apache/camel-k:1.4.0
@nicolaferraro
Member

So, technically the pull secret is stored in the namespace and is shared among all nodes. If a node is deleted and the pod is recreated on another node (here we are in the Kubernetes domain; the Camel K operator should not interfere with that behavior), the new pod should use the same pull secret as the previous one. So I assume you're currently not using pull secrets but want to.

The trait you've configured should be correct.
Did you try to apply that configuration upon creation of the integration, e.g. using -t pull-secret.enabled=true -t pull-secret.secret-name=registry?
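
For example, a minimal sketch (the integration file name Gmail.java is hypothetical; the trait flags are the ones above):

$ kamel run Gmail.java \
    -t pull-secret.enabled=true \
    -t pull-secret.secret-name=registry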

@nicolaferraro nicolaferraro added this to the 1.6.0 milestone Jul 7, 2021
@stepan111
Author

Yes @nicolaferraro, I am using the pull-secret trait for the integration. Actually, I see two pods created when I apply the integration,
and I found that there are two ReplicaSets:

$ k get po                                                                                                                                                                                             
NAME                                READY   STATUS             RESTARTS   AGE                                                
camel-k-operator-6f848d7b5c-9jrdn   2/2     Running            2          11h                                              
gmail-7758f9577d-bhfz6              0/2     PodInitializing    0          13s                                                                                                                                                                                                   
gmail-867d487988-sk9g9              1/2     ImagePullBackOff   0          13s   

 $ k get rs
NAME                          DESIRED   CURRENT   READY   AGE
camel-k-operator-6f848d7b5c   1         1         1       13d
gmail-7758f9577d              0         0         0       3m46s
gmail-867d487988              1         1         1       3m46s

And the ReplicaSet that contains the imagePullSecrets is scaled to 0:

$ k get rs gmail-7758f9577d -o yaml | grep -A1 imagePullSecret  
            f:imagePullSecrets:
              .: {}
--
      imagePullSecrets:
      - name: registry


$ k get rs gmail-867d487988 -o yaml | grep -A1 imagePullSecret
$

It seems that both ReplicaSets are revisions of the gmail deployment made by the integration:

$ k rollout history deployment gmail        
deployment.apps/gmail 
REVISION  CHANGE-CAUSE
1         <none>
2         <none>

And the second revision doesn't contain imagePullSecrets (maybe it is not patched properly).
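
As a sanity check, one way to compare the two revisions' pod templates directly (a sketch; jsonpath output formatting varies by kubectl version):

$ k get rs gmail-7758f9577d -o jsonpath='{.spec.template.spec.imagePullSecrets}'
$ k get rs gmail-867d487988 -o jsonpath='{.spec.template.spec.imagePullSecrets}'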

From the events I see that both pods were assigned to the same node:

12m         Normal    Scheduled                     pod/gmail-7758f9577d-bhfz6    Successfully assigned camel-k/gmail-7758f9577d-bhfz6 to gke-sandbox-blue-sandbox-blue-default-f0ed42c1-d6fq
12m         Normal    Scheduled                     pod/gmail-867d487988-sk9g9    Successfully assigned camel-k/gmail-867d487988-sk9g9 to gke-sandbox-blue-sandbox-blue-default-f0ed42c1-d6fq

So I am thinking that the first pod downloads the image onto the node, and the second one reuses that cached image while the node is up. When the node crashes, the active ReplicaSet can't pull the image.
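
As a temporary workaround (a sketch only; the operator may revert manual changes on its next reconciliation), the pull secret could be patched onto the live deployment:

$ k patch deployment gmail --type merge \
    -p '{"spec":{"template":{"spec":{"imagePullSecrets":[{"name":"registry"}]}}}}'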

Thanks

@nicolaferraro nicolaferraro modified the milestones: 1.6.0, 1.7.0 Sep 7, 2021
@nicolaferraro nicolaferraro modified the milestones: 1.7.0, 1.8.0 Nov 15, 2021
@oscerd oscerd modified the milestones: 1.8.0, 1.9.0 Jan 19, 2022
@github-actions
Contributor

This issue has been automatically marked as stale due to 90 days of inactivity.
It will be closed if no further activity occurs within 15 days.
If you think that's incorrect or the issue should never go stale, simply write any comment.
Thanks for your contributions!

@oscerd oscerd modified the milestones: 1.9.0, 1.9.1 Apr 26, 2022
@oscerd oscerd modified the milestones: 1.9.1, 1.9.2 May 13, 2022
@oscerd oscerd modified the milestones: 1.9.2, 2.0.0 May 23, 2022