
When I back up a deployment and persistent volume and include a persistentVolumeClaim under the volumes tag, it gets stuck "In Progress" #2341

Closed
surekhakallam opened this issue Mar 13, 2020 · 24 comments
Labels: Needs info (Waiting for information), Needs investigation

@surekhakallam commented Mar 13, 2020

I have created the persistent volume, persistent volume claim, deployment, and service. When I include the persistentVolumeClaim tag under the volumes tag in the deployment, the restore gets stuck in the "In Progress" state and never reaches the "Completed" state.

Here the backup is shown as complete, but when I check the restic logs, they show:

msg="No completed pod volume backup found for PVC" backup=velero/nginx-backup controller=pod-volume-backup logSource="pkg/controller/pod_volume_backup_controller.go:337" name=nginx-backup-przb4 namespace=velero pvcUID=7a3d236b-8dce-415c-bb97-d70f1900f9ae
time="2020-03-13T05:32:48Z" level=info msg="No parent snapshot found for PVC, not using --parent flag for this backup" backup=velero/nginx-backup controller=pod-volume-backup logSource="pkg/controller/pod_volume_backup_controller.go:253" name=nginx-backup-przb4 namespace=velero

which suggests the PV backup was not taken.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: nginx-example
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
      annotations:
        backup.velero.io/backup-volumes: nginx-logs
    spec:
      volumes:
      - name: nginx-logs
        persistentVolumeClaim:
          claimName: nginx-logs
      containers:
      - image: nginx:1.17.6
        name: nginx
        ports:
        - containerPort: 80
        volumeMounts:
        - mountPath: "/var/log/nginx"
          name: nginx-logs

This is the deployment file I have created.

Please let me know if anything else needs to be added, and also what the problem might be.
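
As a general way to verify whether restic actually captured the volume, a sketch using standard commands (the label selector is the one velero applies to PodVolumeBackup objects; the backup name is the one used in this thread):

# List the restic volume backups created for this backup
kubectl -n velero get podvolumebackups -l velero.io/backup-name=nginx-backup

# Show the backup, including restic volume backups, in one view
velero backup describe nginx-backup --details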

@skriss (Member) commented Mar 13, 2020

Please provide all of the following:

kubectl -n nginx-example get pvc -o yaml
kubectl -n nginx-example get pv -o yaml
kubectl -n velero get pods -o yaml
kubectl -n velero get podvolumebackups -o yaml

@surekhakallam (Author) commented Mar 16, 2020

kubectl -n nginx-example get pvc -o yaml

apiVersion: v1
items:
- apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"labels":{"app":"nginx"},"name":"nginx-logs","namespace":"nginx-example"},"spec":{"accessModes":["ReadWriteOnce"],"resources":{"requests":{"storage":"50Mi"}},"storageClassName":"minio-storage"}}
      pv.kubernetes.io/bind-completed: "yes"
      pv.kubernetes.io/bound-by-controller: "yes"
    creationTimestamp: "2020-03-16T04:24:41Z"
    finalizers:
    - kubernetes.io/pvc-protection
    labels:
      app: nginx
    name: nginx-logs
    namespace: nginx-example
    resourceVersion: "3062019"
    selfLink: /api/v1/namespaces/nginx-example/persistentvolumeclaims/nginx-logs
    uid: 2284a566-93d7-4f4c-a085-9d6c7681d6ae
  spec:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 50Mi
    storageClassName: minio-storage
    volumeMode: Filesystem
    volumeName: nginx-logs
  status:
    accessModes:
    - ReadWriteOnce
    capacity:
      storage: 100Gi
    phase: Bound
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

@surekhakallam (Author) commented Mar 16, 2020

kubectl -n nginx-example get pv -o yaml

apiVersion: v1
items:
- apiVersion: v1
  kind: PersistentVolume
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"v1","kind":"PersistentVolume","metadata":{"annotations":{},"labels":{"app":"nginx"},"name":"nginx-logs"},"spec":{"accessModes":["ReadWriteOnce"],"capacity":{"storage":"100Gi"},"claimRef":{"name":"nginx-logs","namespace":"nginx-example"},"local":{"path":"/data/atom/nginx"},"nodeAffinity":{"required":{"nodeSelectorTerms":[{"matchExpressions":[{"key":"nginx-deployment","operator":"In","values":["deploy"]}]}]}},"persistentVolumeReclaimPolicy":"Retain","storageClassName":"minio-storage"}}
    creationTimestamp: "2020-03-16T04:24:41Z"
    finalizers:
    - kubernetes.io/pv-protection
    labels:
      app: nginx
    name: nginx-logs
    resourceVersion: "3062016"
    selfLink: /api/v1/persistentvolumes/nginx-logs
    uid: 5c351c70-cc3e-4303-9205-3523d213e655
  spec:
    accessModes:
    - ReadWriteOnce
    capacity:
      storage: 100Gi
    claimRef:
      apiVersion: v1
      kind: PersistentVolumeClaim
      name: nginx-logs
      namespace: nginx-example
      resourceVersion: "3062013"
      uid: 2284a566-93d7-4f4c-a085-9d6c7681d6ae
    local:
      path: /data/atom/nginx
    nodeAffinity:
      required:
        nodeSelectorTerms:
        - matchExpressions:
          - key: nginx-deployment
            operator: In
            values:
            - deploy
    persistentVolumeReclaimPolicy: Retain
    storageClassName: minio-storage
    volumeMode: Filesystem
  status:
    phase: Bound
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

@surekhakallam (Author)

Can you please help me with this?

@skriss (Member) commented Mar 16, 2020

@surekhakallam it looks like the backup completed successfully.

Did you delete the nginx-example namespace before restoring? This would be necessary, because velero will not overwrite existing resources in your cluster.

From the PVC YAML you included, it appears that the PVC was not restored by velero, since its creation timestamp is before the PodVolumeBackup's, and it doesn't have the velero labels I'd expect.

To be clear, to test this you should follow this process (a sketch of the corresponding commands follows the list):

  1. Install velero with restic enabled
  2. Annotate pods for volume backup with restic
  3. Create backup of the namespace
  4. Delete namespace
  5. Restore from the backup
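
A minimal sketch of those steps with the velero CLI. The provider, plugin version, bucket name, and credentials file below are placeholders, not values from this thread:

# 1. Install velero with restic enabled
velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.0.0 \
  --bucket velero \
  --secret-file ./credentials-velero \
  --use-restic

# 2. Annotate pods for volume backup with restic (or bake the annotation
#    into the pod template, as in the deployment above)
kubectl -n nginx-example annotate pod <POD-NAME> backup.velero.io/backup-volumes=nginx-logs

# 3. Create a backup of the namespace
velero backup create nginx-backup --include-namespaces nginx-example

# 4. Delete the namespace (and, for a statically provisioned PV with a
#    Retain policy, the released PV as well)
kubectl delete namespace nginx-example

# 5. Restore from the backup
velero restore create --from-backup nginx-backup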

@surekhakallam (Author) commented Mar 17, 2020

Yes, I am deleting the whole nginx-example namespace, and also deleting the PV as well.

And those are the steps that I followed.

@surekhakallam (Author)

OK, sure. I will try it once again from the beginning.

@surekhakallam (Author)

I will send you the complete process that I am following. Please let me know if anything needs to be corrected.

@skriss (Member) commented Mar 17, 2020

It looks like the restic backup of the volume completed just fine:

time="2020-03-17T11:36:10Z" level=info msg="Backup completed" backup=velero/nginx-backup controller=pod-volume-backup logSource="pkg/controller/pod_volume_backup_controller.go:297" name=nginx-backup-7gwqs namespace=velero

The log messages you were pointing to are not errors.

For this most recent attempt, please provide:

velero backup describe nginx-backup --details
velero restore describe <RESTORE-NAME>
velero restore logs <RESTORE-NAME>
kubectl -n nginx-example get pods -o yaml
kubectl -n nginx-example get pvc -o yaml

@surekhakallam (Author)

velero restore get

NAME                          BACKUP         STATUS       WARNINGS   ERRORS   CREATED                         SELECTOR
nginx-backup-20200317120831   nginx-backup   InProgress   0          0        2020-03-17 12:08:31 -0400 EDT

@surekhakallam (Author)

velero restore describe nginx-backup-20200317120831

Name: nginx-backup-20200317120831
Namespace: velero
Labels:
Annotations:

Phase: InProgress

Backup: nginx-backup

Namespaces:
  Included:  *
  Excluded:

Resources:
  Included:        *
  Excluded:        nodes, events, events.events.k8s.io, backups.velero.io, restores.velero.io, resticrepositories.velero.io
  Cluster-scoped:  auto

Namespace mappings:

Label selector:

Restore PVs:  auto

Restic Restores (specify --details for more information):
  New:  1

@surekhakallam (Author)

velero restore describe nginx-backup-20200317120831 --details

Name: nginx-backup-20200317120831
Namespace: velero
Labels:
Annotations:

Phase: InProgress

Backup: nginx-backup

Namespaces:
  Included:  *
  Excluded:

Resources:
  Included:        *
  Excluded:        nodes, events, events.events.k8s.io, backups.velero.io, restores.velero.io, resticrepositories.velero.io
  Cluster-scoped:  auto

Namespace mappings:

Label selector:

Restore PVs:  auto

Restic Restores:
  New:
    nginx-example/nginx-deployment-6ddf566c-5hj8m: nginx-logs

@surekhakallam (Author)

velero restore logs nginx-backup-20200317120831

Logs for restore "nginx-backup-20200317120831" are not available until it's finished processing. Please wait until the restore has a phase of Completed or Failed and try again.

@surekhakallam (Author)

kubectl -n nginx-example get pvc -o yaml

apiVersion: v1
items:
- apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"labels":{"app":"nginx"},"name":"nginx-logs","namespace":"nginx-example"},"spec":{"accessModes":["ReadWriteOnce"],"resources":{"requests":{"storage":"50Mi"}},"storageClassName":"minio-storage"}}
    creationTimestamp: "2020-03-17T16:08:32Z"
    finalizers:
    - kubernetes.io/pvc-protection
    labels:
      app: nginx
      velero.io/backup-name: nginx-backup
      velero.io/restore-name: nginx-backup-20200317120831
    name: nginx-logs
    namespace: nginx-example
    resourceVersion: "3468421"
    selfLink: /api/v1/namespaces/nginx-example/persistentvolumeclaims/nginx-logs
    uid: 038e781c-509c-4535-8069-63b852343001
  spec:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 50Mi
    storageClassName: minio-storage
    volumeMode: Filesystem
  status:
    phase: Pending
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

@surekhakallam (Author) commented Mar 17, 2020

kubectl get pv | grep nginx-logs

[root@master~]#

So the PV is not created.

@surekhakallam (Author)

kubectl get pvc -n nginx-example

NAME         STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS    AGE
nginx-logs   Pending                                      minio-storage   20m

@skriss (Member) commented Mar 17, 2020

message: '0/4 nodes are available: 1 node(s) had taints that the pod didn''t
tolerate, 3 node(s) didn''t find available persistent volumes to bind.'

Velero can't do the restic restore because the nginx pod isn't running, due to the scheduling failure above.
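
To dig into why the pod is Pending, a generic sketch using standard kubectl (the pod name is a placeholder):

# Show scheduling events and conditions for the pending pod
kubectl -n nginx-example describe pod <POD-NAME>

# Show why the PVC is still Pending
kubectl -n nginx-example describe pvc nginx-logs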

@surekhakallam (Author) commented Mar 17, 2020

Oh, OK. I am a fresher, so can you please tell me what I actually have to change? What changes do I have to make in the deployment?

@surekhakallam (Author)

Should I remove the below line?

restic.velero.io/restic: nginx-log

@skriss (Member) commented Mar 17, 2020

I'm not entirely sure; I believe it has something to do with the PV's node affinity, but it's really hard to review the YAML output when it's not formatted properly (in a Markdown code block, in a gist, etc.).

@skriss (Member) commented Mar 17, 2020

Does the node that has the local PV have a taint that can't be tolerated by the nginx pod?
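
A generic way to check, using standard kubectl (no values here are from the thread):

# Print each node's name and taints
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints

If a taint turns out to be the blocker, a toleration could be added to the deployment's pod template; the key and effect below are placeholders that would need to match the reported taint:

tolerations:
- key: "<TAINT-KEY>"      # placeholder: the taint key reported on the node
  operator: "Exists"
  effect: "NoSchedule"    # placeholder: match the taint's effect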

@surekhakallam (Author)

Actually, as far as I can tell, no. But I observed this only after we added the persistentVolumeClaim block to the code below; only then did we find this problem:

volumes:
- name: nginx-logs
  persistentVolumeClaim:
    claimName: nginx-logs

Without that it is fine, but since we want this in our code, we are trying with it.

@skriss (Member) commented Mar 18, 2020

I would recommend trying to figure out why the nginx pod can't be scheduled. It sure sounds like there's a taint/toleration issue, but without seeing the YAML for the pod, all the nodes, and the PV, it's hard for me to debug further. For reference, the pod's scheduling condition:

- lastProbeTime: null
  lastTransitionTime: "2020-03-17T16:08:32Z"
  message: '0/4 nodes are available: 1 node(s) had taints that the pod didn''t
    tolerate, 3 node(s) didn''t find available persistent volumes to bind.'
  reason: Unschedulable
  status: "False"
  type: PodScheduled
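
Given the PV's nodeAffinity shown earlier, one more generic check (a sketch, not from the thread) is whether any node actually carries the label the PV requires:

# The local PV requires a node labeled nginx-deployment=deploy
kubectl get nodes -l nginx-deployment=deploy

# If nothing matches, label the node that actually hosts /data/atom/nginx
# (<NODE-NAME> is a placeholder)
kubectl label node <NODE-NAME> nginx-deployment=deploy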

@skriss (Member) commented Mar 25, 2020

Closing this out as inactive, thanks!

skriss closed this as completed Mar 25, 2020