
Cannot restore pv from vcluster #1324

Open
SCLogo opened this issue Oct 26, 2023 · 9 comments

SCLogo commented Oct 26, 2023

What happened?

Tried to restore a PV from inside the vcluster but got these errors:

vcluster E1019 08:54:05.247327       6 pv_protection_controller.go:118] PV pvc-2a4d2509-1bea-4f47-8895-96a685c1b7eb failed with : Operation cannot be fulfilled on persistentvolumes "pvc-2a4d2509-1bea-4f47-8895-96a685c1b7eb": StorageError: invalid object, Code: 4, Key: /registry/persistentvolumes/pvc-2a4d2509-1bea-4f47-8895-96a685c1b7eb, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: b674d6b6-5cb8-4bed-a25f-58a727d35e52, UID in object meta:
syncer E1019 08:58:54.509586       1 controller.go:329] controller persistent-volume-claim: controllerGroup  controllerKind PersistentVolumeClaim: PersistentVolumeClaim klog.ObjectRef{Name:"data-redis-server-0", Namespace:"default"}: namespace default name data-redis-server-0: reconcileID "cb8db507-99b2-4092-8e0d-f88a48178ce7": Reconciler error  "vcluster-pvc-2a4d2509-1bea-4f47-8895-96a685c1b7eb-x--07089ccaa5" not found
vcluster E1024 13:02:46.770158       6 pv_protection_controller.go:118] PV pvc-aef9f769-c93b-4dd3-b9da-63cbf7c33dee failed with : Operation cannot be fulfilled on persistentvolumes "pvc-aef9f769-c93b-4dd3-b9da-63cbf7c33dee": StorageError: invalid object, Code: 4, Key: /registry/persistentvolumes/pvc-aef9f769-c93b-4dd3-b9da-63cbf7c33dee, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: 906cd14d-3bcc-49c2-a511-8657d03d28d6, UID in object meta:
E1024 12:42:16.523375       6 pv_protection_controller.go:118] PV pvc-aef9f769-c93b-4dd3-b9da-63cbf7c33dee failed with : Operation cannot be fulfilled on persistentvolumes "pvc-aef9f769-c93b-4dd3-b9da-63cbf7c33dee": the object has been modified; please apply your changes to the latest version and try again

What did you expect to happen?

To be able to restore the PV in the vcluster.

How can we reproduce it (as minimally and precisely as possible)?

  1. Run velero on both the host cluster and the vcluster
  2. Sync the PVC/PV out from the vcluster to the host
  3. Do a backup inside the vcluster
  4. velero creates the snapshot correctly
  5. Delete the PV
  6. Try to restore with velero from the snapshot created earlier

Anything else we need to know?

velero logs:

https://loft-sh.slack.com/files/U04Q21T430D/F0623UB3UKH/restore.log?origin_team=TDSP6B7DY&origin_channel=Vall_threads

pv object:

{
    "apiVersion": "v1",
    "kind": "PersistentVolume",
    "metadata": {
        "annotations": {
            "pv.kubernetes.io/provisioned-by": "kubernetes.io/aws-ebs",
            "vcluster.loft.sh/host-pv": "pvc-aef9f769-c93b-4dd3-b9da-63cbf7c33dee"
        },
        "creationTimestamp": "2023-10-24T11:56:34Z",
        "finalizers": [
            "kubernetes.io/pv-protection"
        ],
        "labels": {
            "topology.kubernetes.io/region": "us-east-2",
            "topology.kubernetes.io/zone": "us-east-2a"
        },
        "managedFields": [
            {
                "apiVersion": "v1",
                "fieldsType": "FieldsV1",
                "fieldsV1": {
                    "f:metadata": {
                        "f:annotations": {
                            ".": {},
                            "f:pv.kubernetes.io/provisioned-by": {},
                            "f:vcluster.loft.sh/host-pv": {}
                        },
                        "f:finalizers": {
                            ".": {},
                            "v:\"kubernetes.io/pv-protection\"": {}
                        },
                        "f:labels": {
                            ".": {},
                            "f:topology.kubernetes.io/region": {},
                            "f:topology.kubernetes.io/zone": {}
                        }
                    },
                    "f:spec": {
                        "f:accessModes": {},
                        "f:awsElasticBlockStore": {
                            ".": {},
                            "f:fsType": {},
                            "f:volumeID": {}
                        },
                        "f:capacity": {
                            ".": {},
                            "f:storage": {}
                        },
                        "f:claimRef": {
                            ".": {},
                            "f:apiVersion": {},
                            "f:kind": {},
                            "f:name": {},
                            "f:namespace": {},
                            "f:resourceVersion": {},
                            "f:uid": {}
                        },
                        "f:mountOptions": {},
                        "f:nodeAffinity": {
                            ".": {},
                            "f:required": {
                                ".": {},
                                "f:nodeSelectorTerms": {}
                            }
                        },
                        "f:persistentVolumeReclaimPolicy": {},
                        "f:storageClassName": {},
                        "f:volumeMode": {}
                    },
                    "f:status": {
                        "f:phase": {}
                    }
                },
                "manager": "vcluster",
                "operation": "Update",
                "time": "2023-10-24T11:56:34Z"
            }
        ],
        "name": "pvc-aef9f769-c93b-4dd3-b9da-63cbf7c33dee",
        "resourceVersion": "42760",
        "uid": "97d8ac5a-bc00-4c85-b819-39113adacd62"
    },
    "spec": {
        "accessModes": [
            "ReadWriteOnce"
        ],
        "awsElasticBlockStore": {
            "fsType": "ext4",
            "volumeID": "vol-094d041aaad5ce962"
        },
        "capacity": {
            "storage": "8Gi"
        },
        "claimRef": {
            "apiVersion": "v1",
            "kind": "PersistentVolumeClaim",
            "name": "docker-registry",
            "namespace": "repository",
            "resourceVersion": "42759",
            "uid": "60cde9fa-827b-48ce-a1b5-5399c237bed5"
        },
        "mountOptions": [
            "debug"
        ],
        "nodeAffinity": {
            "required": {
                "nodeSelectorTerms": [
                    {
                        "matchExpressions": [
                            {
                                "key": "topology.kubernetes.io/zone",
                                "operator": "In",
                                "values": [
                                    "us-east-2a"
                                ]
                            },
                            {
                                "key": "topology.kubernetes.io/region",
                                "operator": "In",
                                "values": [
                                    "us-east-2"
                                ]
                            }
                        ]
                    }
                ]
            }
        },
        "persistentVolumeReclaimPolicy": "Delete",
        "storageClassName": "gp2-delete-encrypted",
        "volumeMode": "Filesystem"
    },
    "status": {
        "phase": "Bound"
    }
}

vcluster values:

ingress:
  annotations:
    external-dns.alpha.kubernetes.io/hostname: vcluster.example.com
  enabled: true
  host: v2.devhost01.control.cluster.cloud
  ingressClassName: nginx-internal
sync:
  ingresses:
    enabled: true
  networkpolicies:
    enabled: true
  nodes:
    enabled: true
  persistentvolumeclaims:
    enabled: true
  persistentvolumes:
    enabled: true
  storageclasses:
    enabled: true
syncer:
  extraArgs:
  - --tls-san=vcluster.example.com
  - --sync-labels="app.kubernetes.io/instance"
  - --mount-physical-host-paths=true
vcluster:
  image: rancher/k3s:v1.21.14-k3s1

Host cluster Kubernetes version

Client Version: v1.28.3
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.26.

Host cluster Kubernetes distribution

kops version
Client version: 1.28.0 (git-v1.28.0)

vcluster version

vcluster version 0.16.4

vcluster Kubernetes distribution (k3s (default), k8s, k0s)

K3S

OS and Arch

OS: macOS Sonoma
FabianKramm (Member) commented

Hey @SCLogo ! Thanks for creating this issue and sorry for the delay, we will need to investigate that.


SCLogo commented Oct 31, 2023

Thanks. Let me know if you need anything else.


SCLogo commented Jan 15, 2024

@FabianKramm did you find anything?


SCLogo commented Jan 15, 2024

Found a few more things.

I found the following error in the syncer:

E0115 10:41:09.440703       1 controller.go:329] controller persistent-volume-claim: controllerGroup  controllerKind PersistentVolumeClaim: PersistentVolumeClaim klog.ObjectRef{Name:"datadir-mongodb-0", Namespace:"mongo"}: namespace mongo name datadir-mongodb-0: reconcileID "fce4801d-207c-4cef-923d-b2ec26ae9e82": Reconciler error  "vcluster-pvc-938e79e1-6842-42ae-9c07-7e0dde54337c-x--67000decb8" not found

In velero, the restored PV name does not contain the -x--67000decb8 part.
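The mismatch above fits the earlier log line: the syncer looks up a host-side name of the form `vcluster-<pv-name>-x--<hash>`, while velero restores the PV under the plain virtual name. As an illustration only, a hash-suffixed host name could be derived like this (the exact translation scheme vcluster uses is an assumption here, not taken from vcluster's source):

```python
import hashlib


def physical_pv_name(virtual_name: str, vcluster_namespace: str) -> str:
    """Illustrative sketch: build a host-side PV name by appending a short
    hash suffix, matching the shape of the observed
    'vcluster-<name>-x--<hash>' names. The real vcluster translation
    logic may differ in both the hash input and its length."""
    digest = hashlib.sha256(
        f"{virtual_name}/{vcluster_namespace}".encode()
    ).hexdigest()
    return f"vcluster-{virtual_name}-x--{digest[:10]}"


# The restored PV only carries the virtual name, so a lookup by the
# translated name fails with "not found":
name = physical_pv_name("pvc-938e79e1-6842-42ae-9c07-7e0dde54337c", "vcluster")
```

If this is roughly what happens, any tool that recreates the PV outside the syncer (like velero restore) would produce an object the syncer cannot match to its expected translated name.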


SCLogo commented Mar 28, 2024

Using velero inside the vcluster with CSI enabled: the snapshot is created successfully, but in the syncer I see the following issue:

cannot sync virtual object as unmanaged physical object exists with desired name


SCLogo commented Mar 28, 2024

Could the issue be that when velero creates the VolumeSnapshot, the syncer syncs the object out to the host cluster, the snapshotter on the host cluster creates the VolumeSnapshotContent, and that object is synced back into the vcluster? Because it was not created by vcluster, it is not managed by it, and that is why its changes are not synced out.
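The "unmanaged physical object" error suggests the syncer only adopts host objects that carry its own marker. As a minimal sketch of that idea (the label name `vcluster.loft.sh/managed-by` is an assumption for illustration; the real marker vcluster checks may differ):

```python
# Hypothetical marker label; the actual label/annotation vcluster uses may differ.
MANAGED_BY_LABEL = "vcluster.loft.sh/managed-by"


def is_managed(host_object: dict, vcluster_name: str) -> bool:
    """Return True if the host object carries the marker saying it was
    created by this vcluster's syncer; unmarked objects are treated as
    unmanaged and are not adopted or synced back."""
    labels = host_object.get("metadata", {}).get("labels") or {}
    return labels.get(MANAGED_BY_LABEL) == vcluster_name


# A VolumeSnapshotContent created directly by the host-side snapshotter
# would lack the marker, matching the "unmanaged physical object" error:
host_created = {"metadata": {"name": "snapcontent-123", "labels": {}}}
```

Under this model, the hypothesis in the comment above holds: the host-created VolumeSnapshotContent fails the managed check, so the syncer refuses to treat it as the counterpart of the virtual object with the desired name.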


SCLogo commented Apr 11, 2024

new info:

 error getting handle for DataSource Type VolumeSnapshot by Name volsync-backup-redis-master-src-x-redis-x-backup66: snapshot volsync-backup-redis-master-src-x-redis-x-backup66 not bound

I see that during sync, the spec.source.volumeSnapshotContentName field in the VolumeSnapshot object and the spec.volumeSnapshotRef.name field in the VolumeSnapshotContent object did not change. That is probably why the provisioner cannot find the objects and cannot bind them together.
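The binding between a VolumeSnapshot and its VolumeSnapshotContent is bidirectional, so both references have to agree after name translation. A small check of that invariant can be sketched as follows (the sample object names are illustrative, not taken from the logs):

```python
def is_bound(volume_snapshot: dict, content: dict) -> bool:
    """Check the bidirectional references that bind a VolumeSnapshot to its
    VolumeSnapshotContent. If either side still carries a pre-sync
    (untranslated) name, the CSI snapshotter cannot match the pair."""
    vs_ref = (volume_snapshot.get("status") or {}).get(
        "boundVolumeSnapshotContentName"
    ) or volume_snapshot.get("spec", {}).get("source", {}).get(
        "volumeSnapshotContentName"
    )
    back_ref = content.get("spec", {}).get("volumeSnapshotRef", {})
    return (
        vs_ref == content["metadata"]["name"]
        and back_ref.get("name") == volume_snapshot["metadata"]["name"]
        and back_ref.get("namespace") == volume_snapshot["metadata"]["namespace"]
    )


# Illustrative mismatch: the content's back-reference still points at the
# translated host-side snapshot name, so the pair cannot bind:
vs = {
    "metadata": {"name": "volsync-backup-redis-master-src", "namespace": "backup"},
    "spec": {"source": {"volumeSnapshotContentName": "snapcontent-abc"}},
}
vsc = {
    "metadata": {"name": "snapcontent-abc"},
    "spec": {
        "volumeSnapshotRef": {
            "name": "volsync-backup-redis-master-src-x-backup-x-vc",
            "namespace": "host-ns",
        }
    },
}
```

This matches the "snapshot ... not bound" error above: until both references are rewritten consistently during sync, the provisioner cannot resolve the pair.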


rohantmp (Contributor) commented May 7, 2024

Hi, sorry for taking a while to reply. I have started and paused trying to repro a few times (having been pulled away to other tasks). I've now skimmed some velero docs, but it would help if you could be more specific with the repro steps, assuming I'm not familiar with velero.

I may be able to read the docs and clear this up on my own, but this would speed things up:

What does it mean to:

Run velero on both host and vcluster

I understand velero is a backup tool; what are the backup source and target in this scenario?

sync out pvc pv from vcluster to host

Do you mean vcluster sync? "sync" is a vcluster term, so if you're using it to refer to a velero operation, this is unclear.

do a backup inside vcluster

Does this mean backing up the vcluster or backing up to the vcluster? I assume the former.

velero creates the snapshot correctly

Where can I observe this?

delete pv

In the vcluster?


SCLogo commented May 7, 2024

Hi, sorry for taking a while to reply. I have started and paused trying to repro a few times (having been pulled away to other tasks). I've now skimmed some velero docs, but it would help if you could be more specific with the repro steps, assuming I'm not familiar with velero.

I may be able to read the docs and clear this up on my own, but this would speed things up:

What does it mean to:

Run velero on both host and vcluster
I wanted to have separate backups for the host cluster and for the vcluster, so that when I delete something in the vcluster I can run velero restore inside the vcluster and restore the object.

I understand velero is a backup tool, what is the backup source and target in this scenario

Backup source:
velero running on the host cluster: the source should be the objects on the host cluster, excluding the vcluster namespace (and all of its resources).
velero running in the vcluster: the source should be all objects in the vcluster.

sync out pvc pv from vcluster to host

do you mean vcluster sync? sync is a vcluster term so if you're using it to refer to a valero operation, this is unclear

Yes, using vcluster sync. We have StatefulSets inside the vcluster and we need to use PV/PVC sync to have real disks.

do a backup inside vcluster

does this mean back up the vcluster or back up to the vcluster? I assume the former

Backup inside the vcluster, i.e. do a backup as if the vcluster were a separate cluster.

velero creates the snapshot correctly

where can I observe this?

At that time we used the AWS plugin for velero, and in AWS I saw the snapshots created by velero.

delete pv

in vcluster?

Correct, delete the PV/PVC. That asks the EBS controller to delete the EBS volume.
