
imagePullSecrets not used in backup cronjob templates #78

Open
jmbarbier opened this issue Dec 26, 2021 · 4 comments

Comments

@jmbarbier commented Dec 26, 2021

If imagePullSecrets is used to pull the main database image, it is not propagated to the backup templates. So if a backup job is scheduled on a node that does not already have the database image, it fails to pull the image.

Manually adding imagePullSecrets to the CronJob definition works, but this is a dirty workaround :)
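For illustration, the manual workaround amounts to editing the generated CronJob so its pod template carries the same pull secret as the Kubegres resource. This is a minimal sketch, not the operator's output; the secret, CronJob, and image names are placeholders, and the edit is lost whenever the operator regenerates the CronJob:

```yaml
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: backup-my-kubegres        # name generated by the operator
spec:
  jobTemplate:
    spec:
      template:
        spec:
          imagePullSecrets:       # manually added; the operator does not emit this field
            - name: registry-secret
          containers:
            - name: backup-postgres
              image: my-private-registry/image:tag
```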

@jmbarbier jmbarbier changed the title imagePullSecret not used in backup cronjob templates imagePullSecrets not used in backup cronjob templates Dec 26, 2021
@alex-arica (Member) commented

Thank you for reporting this issue with imagePullSecrets used to pull the main database image.

Could you please share a YAML example where your configuration does not work?

@jmbarbier (Author) commented

Thank you for your quick reply. Here is some more info:

Applying this YAML file on a cluster (at scaleway.com => scw-xxx) with 1 node works fine:

apiVersion: v1
data:
  .dockerconfigjson: REDACTED
kind: Secret
metadata:
  name: registry-secret
  namespace: kubegres-sandbox
type: kubernetes.io/dockerconfigjson
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: kubegres-backup-issue-78-pvc
  namespace: kubegres-sandbox
spec:
  storageClassName: scw-bssd
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: kubegres.reactive-tech.io/v1
kind: Kubegres
metadata:
  name: kubegres-issue-78
  namespace: kubegres-sandbox
spec:
  replicas: 1
  image: my-private-registry/image:tag
  imagePullSecrets:
    - name: registry-secret
  database:
    size: 1Gi
    storageClassName: scw-bssd
  failover:
    isDisabled: true
  backup:
    schedule: "*/3 * * * *"
    pvcName: kubegres-backup-issue-78-pvc
    volumeMount: /var/lib/backup
  env:
    - name: POSTGRES_PASSWORD
      value: supassword
    - name: POSTGRES_REPLICATION_PASSWORD
      value: reppassword

The backup CronJob is created:

kind: CronJob
apiVersion: batch/v1beta1
metadata:
  name: backup-kubegres-issue-78
  namespace: kubegres-sandbox
  uid: 499fa9f5-5a59-48d9-8fa2-7b08c74ca34b
  resourceVersion: '1469331142'
  generation: 1
  creationTimestamp: '2021-12-27T14:02:36Z'
  ownerReferences:
    - apiVersion: kubegres.reactive-tech.io/v1
      kind: Kubegres
      name: kubegres-issue-78
      uid: db3cb326-bc37-434e-b3ba-44175b083eb6
      controller: true
      blockOwnerDeletion: true
spec:
  schedule: '*/3 * * * *'
  concurrencyPolicy: Forbid
  suspend: false
  jobTemplate:
    metadata:
      creationTimestamp: null
    spec:
      template:
        metadata:
          creationTimestamp: null
        spec:
          volumes:
            - name: backup-volume
              persistentVolumeClaim:
                claimName: kubegres-backup-issue-78-pvc
            - name: postgres-config
              configMap:
                name: base-kubegres-config
                defaultMode: 511
          containers:
            - name: backup-postgres
              image: my-private-registry/image:tag
              args:
                - sh
                - '-c'
                - /tmp/backup_database.sh
              env:
                - name: PGPASSWORD
                - name: KUBEGRES_RESOURCE_NAME
                  value: kubegres-issue-78
                - name: BACKUP_DESTINATION_FOLDER
                  value: /var/lib/backup
                - name: BACKUP_SOURCE_DB_HOST_NAME
                  value: kubegres-issue-78
                - name: POSTGRES_PASSWORD
                  value: supassword
                - name: POSTGRES_REPLICATION_PASSWORD
                  value: reppassword
              resources: {}
              volumeMounts:
                - name: backup-volume
                  mountPath: /var/lib/backup
                - name: postgres-config
                  mountPath: /tmp/backup_database.sh
                  subPath: backup_database.sh
              terminationMessagePath: /dev/termination-log
              terminationMessagePolicy: File
              imagePullPolicy: IfNotPresent
          restartPolicy: OnFailure
          terminationGracePeriodSeconds: 30
          dnsPolicy: ClusterFirst
          securityContext: {}
          schedulerName: default-scheduler
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
status:
  lastScheduleTime: '2021-12-27T14:06:00Z'
  lastSuccessfulTime: '2021-12-27T14:06:05Z'

The imagePullSecrets field is missing there.
On my single-node cluster the jobs are OK, because my private image is already present on the node (imagePullPolicy: IfNotPresent):

backup-kubegres-issue-78-27343563   1/1           17s        7m46s
backup-kubegres-issue-78-27343566   1/1           5s         4m46s

But if I add some nodes to the cluster:

NAME                                             STATUS   ROLES    AGE     VERSION
scw-k8s-solidev-default-34b29ff7154b4452a4ace2   Ready    <none>   20d     v1.23.0
scw-k8s-solidev-default-817558573a0244e09202dc   Ready    <none>   6m49s   v1.23.0
scw-k8s-solidev-default-ee42e6727e114831aeaac8   Ready    <none>   6m24s   v1.23.0

the backup job fails depending on which node it is scheduled on:

✦ ➜ kubectl get jobs
NAME                                COMPLETIONS   DURATION   AGE
backup-kubegres-issue-78-27343563   1/1           17s        9m57s
backup-kubegres-issue-78-27343566   1/1           5s         6m57s
backup-kubegres-issue-78-27343569   0/1           3m57s      3m57s
✦ ➜ kubectl describe jobs/backup-kubegres-issue-78-27343569
Name:             backup-kubegres-issue-78-27343569
Namespace:        kubegres-sandbox
(...)
Events:
  Type    Reason            Age    From            Message
  ----    ------            ----   ----            -------
  Normal  SuccessfulCreate  5m33s  job-controller  Created pod: backup-kubegres-issue-78-27343569-hb2d6
✦ ➜ kubectl describe pods/backup-kubegres-issue-78-27343569-hb2d6
Name:         backup-kubegres-issue-78-27343569-hb2d6
Namespace:    kubegres-sandbox
Priority:     0
Node:         scw-k8s-solidev-default-817558573a0244e09202dc/10.197.230.31
Start Time:   Mon, 27 Dec 2021 15:09:00 +0100
(...)
Events:
  Type     Reason                  Age                   From                     Message
  ----     ------                  ----                  ----                     -------
  Normal   Scheduled               6m12s                 default-scheduler        Successfully assigned kubegres-sandbox/backup-kubegres-issue-78-27343569-hb2d6 to scw-k8s-solidev-default-817558573a0244e09202dc
  Normal   SuccessfulAttachVolume  6m11s                 attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-cef4f654-d552-4a9d-b7c5-1accd1757530"
  Normal   Pulling                 4m46s (x4 over 6m8s)  kubelet                  Pulling image "my-private-registry/image:tag"
  Warning  Failed                  4m46s (x4 over 6m8s)  kubelet                  Failed to pull image "my-private-registry/image:tag": rpc error: code = Unknown desc = failed to pull and unpack image "my-private-registry/image:tag": failed to resolve reference "my-private-registry/image:tag": failed to authorize: failed to fetch anonymous token: unexpected status: 403 Forbidden
  Warning  Failed                  4m46s (x4 over 6m8s)  kubelet                  Error: ErrImagePull
  Warning  Failed                  4m18s (x6 over 6m7s)  kubelet                  Error: ImagePullBackOff
  Normal   BackOff                 58s (x20 over 6m7s)   kubelet                  Back-off pulling image "my-private-registry/image:tag"

The private image has not been pulled on the new nodes, so when the backup is scheduled on one of them, the container image cannot be pulled because the imagePullSecrets field is missing from the CronJob spec.

@alex-arica (Member) commented

Thank you for those details which will help me with the investigation.

@urbany commented Mar 21, 2022

Hi @alex-arica, I sent PR #103 to fix this issue.
