RBD Async: Failed to mirrored Cloned PVC created from snapshot (PVC from snapshot) #2427

Open
Madhu-1 opened this issue Aug 19, 2021 · 17 comments
Assignees
Labels
component/rbd Issues related to RBD keepalive This label can be used to disable stale bot activiity in the repo

Comments

@Madhu-1
Collaborator

Madhu-1 commented Aug 19, 2021

Failed to mirror PVC created from a snapshot

Steps to Reproduce

Create a PVC
Create a snapshot of PVC
Create a PVC from snapshot
Create VolumeReplication to Enable Replication

74-00f8-11ec-89fe-0242ac110003 GRPC call: /replication.Controller/EnableVolumeReplication
I0819 14:18:34.837531       1 utils.go:178] ID: 2391 Req-ID: 0001-0009-rook-ceph-0000000000000005-33936174-00f8-11ec-89fe-0242ac110003 GRPC request: {"parameters":{"mirroringMode":"snapshot"},"secrets":"***stripped***","volume_id":"0001-0009-rook-ceph-0000000000000005-33936174-00f8-11ec-89fe-0242ac110003"}
I0819 14:18:34.856837       1 omap.go:86] ID: 2391 Req-ID: 0001-0009-rook-ceph-0000000000000005-33936174-00f8-11ec-89fe-0242ac110003 got omap values: (pool="replicapool-4", namespace="", name="csi.volume.33936174-00f8-11ec-89fe-0242ac110003"): map[csi.imageid:1972a93c67485 csi.imagename:csi-vol-33936174-00f8-11ec-89fe-0242ac110003 csi.volname:pvc-0bad8f8d-e00c-4d98-ab90-af8e5bac65be csi.volume.owner:default]
E0819 14:18:35.326354       1 replicationcontrollerserver.go:245] ID: 2391 Req-ID: 0001-0009-rook-ceph-0000000000000005-33936174-00f8-11ec-89fe-0242ac110003 failed to enable mirroring on "replicapool-4/csi-vol-33936174-00f8-11ec-89fe-0242ac110003" with error: rbd: ret=-22, Invalid argument
E0819 14:18:35.326454       1 utils.go:185] ID: 2391 Req-ID: 0001-0009-rook-ceph-0000000000000005-33936174-00f8-11ec-89fe-0242ac110003 GRPC error: rpc error: code = Internal desc = failed to enable mirroring on "replicapool-4/csi-vol-33936174-00f8-11ec-89fe-0242ac110003" with error: rbd: ret=-22, Invalid argument
rbd mirror image enable replicapool-4/csi-vol-33936174-00f8-11ec-89fe-0242ac110003 snapshot
2021-08-19T14:21:17.992+0000 7ffa219492c0 -1 librbd::api::Mirror: image_enable: mirroring is not enabled for the parent
sh-4.4# rbd info replicapool-4/csi-vol-33936174-00f8-11ec-89fe-0242ac110003
rbd image 'csi-vol-33936174-00f8-11ec-89fe-0242ac110003':
	size 1 GiB in 256 objects
	order 22 (4 MiB objects)
	snapshot_count: 0
	id: 1972a93c67485
	block_name_prefix: rbd_data.1972a93c67485
	format: 2
	features: layering, operations
	op_features: clone-child
	flags: 
	create_timestamp: Thu Aug 19 14:17:35 2021
	access_timestamp: Thu Aug 19 14:17:35 2021
	modify_timestamp: Thu Aug 19 14:17:35 2021
	parent: replicapool-4/csi-snap-13fc60cc-00f8-11ec-89fe-0242ac110003@csi-snap-13fc60cc-00f8-11ec-89fe-0242ac110003
	overlap: 1 GiB

Note: if we flatten the image before enabling mirroring, the image can be mirrored.
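
The workaround in the note above could be scripted roughly as follows. This is only a sketch: `has_parent` and `flatten_then_mirror` are hypothetical helper names (not part of ceph-csi), and it assumes the `rbd` CLI is reachable, for example from the toolbox pod.

```shell
# Hypothetical helpers, not part of ceph-csi.

# has_parent: read `rbd info` output on stdin; succeed if the image is a clone,
# i.e. the output contains a "parent:" line.
has_parent() {
    grep -q '^[[:space:]]*parent:'
}

# flatten_then_mirror: flatten a clone if needed, then enable snapshot mirroring.
# $1 is a pool/image spec such as replicapool-4/csi-vol-...
flatten_then_mirror() {
    spec="$1"
    if rbd info "$spec" | has_parent; then
        # Flattening copies the parent data into the clone and drops the parent link.
        rbd flatten "$spec" || return 1
    fi
    rbd mirror image enable "$spec" snapshot
}
```

Flattening costs space and time proportional to the image overlap, so this trades storage efficiency for the ability to mirror.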

@Madhu-1 Madhu-1 added the component/rbd Issues related to RBD label Aug 19, 2021
@Madhu-1
Collaborator Author

Madhu-1 commented Aug 19, 2021

cc @ShyamsundarR

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the wontfix This will not be worked on label Sep 18, 2021
@Rakshith-R Rakshith-R removed the wontfix This will not be worked on label Sep 20, 2021
@Rakshith-R Rakshith-R added this to the release-3.5.0 milestone Sep 20, 2021
@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the wontfix This will not be worked on label Oct 20, 2021
@github-actions

This issue has been automatically closed due to inactivity. Please re-open if this still requires investigation.

@Madhu-1 Madhu-1 added keepalive This label can be used to disable stale bot activiity in the repo and removed wontfix This will not be worked on labels Oct 28, 2021
@Madhu-1 Madhu-1 reopened this Oct 28, 2021
@humblec
Collaborator

humblec commented Jan 4, 2022

@Madhu-1 this is marked against the 3.5.0 release, so please revisit the state.

@Madhu-1 Madhu-1 removed this from the release-3.5.0 milestone Jan 4, 2022
@satoru-takeuchi
Contributor

@Madhu-1 I have some questions.

Create a snapshot of PVC
Create a PVC from snapshot

Q1. Does "snapshot" mean VolumeSnapshot resource and the corresponding clone image?
Q2. If the answer to Q1 is yes, does "a PVC from snapshot" mean a PVC whose source is the above VolumeSnapshot resource?
Q3. Does this succeed if VolumeReplication is also created for the parent image?
Q4. IMO, this problem comes from a limitation of Ceph. In other words, Ceph allows replication of a clone image iff replication is enabled on the parent image. Is my understanding correct?

@Madhu-1
Collaborator Author

Madhu-1 commented Oct 16, 2023

@Madhu-1 I have some questions.

Create a snapshot of PVC
Create a PVC from snapshot

Q1. Does "snapshot" mean VolumeSnapshot resource and the corresponding clone image?
Q2. If the answer to Q1 is yes, does "a PVC from snapshot" mean a PVC whose source is the above VolumeSnapshot resource?

Yes

Q3. Does this succeed if VolumeReplication is also created for the parent image?

Yes, but it depends: if the snapshot is not deleted and it is also replicated, we can replicate the cloned PVC as well.

Q4. IMO, this problem comes from a limitation of Ceph. In other words, Ceph allows replication of a clone image iff replication is enabled on the parent image. Is my understanding correct?

Yes. We are working on a feature to support mirroring of cloned PVCs as well.

@satoru-takeuchi
Contributor

Thank you for your answer. I understood.

Q3. Does this succeed if VolumeReplication is also created for the parent image?
Yes, but it depends: if the snapshot is not deleted and it is also replicated, we can replicate the cloned PVC as well.

IIRC, we also need to enable mirroring of the intermediate clone images ("csi-snap-XXX" or "csi-vol-XXX-temp").

@Madhu-1
Collaborator Author

Madhu-1 commented Oct 16, 2023

Thank you for your answer. I understood.

Q3. Does this succeed if VolumeReplication is also created for the parent image?
Yes, but it depends: if the snapshot is not deleted and it is also replicated, we can replicate the cloned PVC as well.

IIRC, we also need to enable mirroring of the intermediate clone images ("csi-snap-XXX" or "csi-vol-XXX-temp").

Yes, that's correct, but it's hard to maintain such a chain because the PVC and the VolumeSnapshot are independent objects. We will try to document these drawbacks. @Rakshith-R is working on this (support might land in the next one or two releases).
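
To make the chain concrete, here is a rough sketch of walking a clone's ancestry and reporting which images do not have mirroring enabled. The helper names (`rbd_info`, `parent_of`, `unmirrored_ancestors`) are hypothetical and not part of ceph-csi; the sketch assumes `rbd info` prints a `parent:` line for clones and a `mirroring state: enabled` line for mirrored images, as in the outputs pasted in this issue.

```shell
# Hypothetical sketch. rbd_info is a thin wrapper so it can be replaced,
# e.g. with a `kubectl exec ... -- rbd info` call.
rbd_info() {
    rbd info "$1"
}

# parent_of: read `rbd info` output on stdin and print the parent's pool/image,
# or nothing when the image has no parent.
parent_of() {
    sed -n 's/^[[:space:]]*parent:[[:space:]]*\([^@[:space:]]*\)@.*/\1/p'
}

# unmirrored_ancestors: walk the clone chain upward from $1 and print every
# image (including $1 itself) that does not report "mirroring state: enabled".
unmirrored_ancestors() {
    spec="$1"
    while [ -n "$spec" ]; do
        info=$(rbd_info "$spec") || return 1
        printf '%s\n' "$info" | grep -q 'mirroring state: enabled' || printf '%s\n' "$spec"
        spec=$(printf '%s\n' "$info" | parent_of)
    done
}
```

An empty result would mean every image in the chain is mirrored; it says nothing about whether the parent snapshots themselves still exist.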

@ushitora-anqou

@Madhu-1 We are trying to mirror a cloned RBD image from one cluster to another one. As you commented above, we enabled mirroring for all of the parent image, the temporary (*-temp) image, and the cloned image. However, the RBD mirroring didn't work correctly.

When we create a cloned image by a PVC-sourced PVC, ceph-csi first creates a temporary snapshot of the sourced image. Then, it deletes the snapshot after creating the cloned image from it.

# The RBD image `csi-vol-ad4b205f-6515-409d-833d-d65c6d812dd9-temp` was created by ceph-csi
$ kubectl exec -n ceph-ssd deploy/rook-ceph-tools -- rbd info -p ceph-ssd-block-pool csi-vol-ad4b205f-6515-409d-833d-d65c6d812dd9-temp
rbd image 'csi-vol-ad4b205f-6515-409d-833d-d65c6d812dd9-temp':
        size 10 GiB in 2560 objects
        order 22 (4 MiB objects)
        snapshot_count: 2
        id: 12b7d4917d0d
        block_name_prefix: rbd_data.12b7d4917d0d
        format: 2
        features: layering, deep-flatten, operations
        op_features: clone-parent, clone-child, snap-trash
        flags:
        create_timestamp: Mon Nov  6 04:56:59 2023
        access_timestamp: Mon Nov  6 04:56:59 2023
        modify_timestamp: Mon Nov  6 04:56:59 2023
        parent: ceph-ssd-block-pool/csi-vol-14794525-a741-4bab-894f-da5dbf33288d@19a7104b-2646-479c-93fb-806f1791cb56 # <--- This image is created from a snapshot
        overlap: 10 GiB
        mirroring state: enabled
        mirroring mode: snapshot
        mirroring global id: 43d70f92-99da-453a-86fb-b3ce033f7387
        mirroring primary: true

$ kubectl exec -n ceph-ssd deploy/rook-ceph-tools -- rbd snap ls -p ceph-ssd-block-pool --all csi-vol-14794525-a741-4bab-894f-da5dbf33288d
SNAPID  NAME                                                                                       SIZE    PROTECTED  TIMESTAMP                 NAMESPACE
   945  19a7104b-2646-479c-93fb-806f1791cb56                                                       10 GiB             Mon Nov  6 04:56:59 2023  trash (csi-vol-ad4b205f-6515-409d-833d-d65c6d812dd9-temp) # <--- But the snapshot is already deleted

We verified that RBD mirroring succeeds if we enabled mirroring without deleting the temporary snapshot by directly executing rbd command.

Therefore, it seems that RBD mirroring for the cloned image fails because ceph-csi deletes the temporary snapshot.

Will this be a limitation of ceph-csi? Or, is ceph-csi going to provide a workaround for this problem?
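
For anyone who wants to check for the same condition, a small filter over the `rbd snap ls --all` output shown above might look like this. `trashed_snaps` is a hypothetical helper name, and the sketch assumes the NAMESPACE column reads `trash (<original name>)` for trashed snapshots, as it does in our output.

```shell
# trashed_snaps: read `rbd snap ls --all <image>` output on stdin and print the
# NAME column of every snapshot that sits in the trash namespace.
trashed_snaps() {
    awk 'index($0, "trash (") > 0 { print $2 }'
}
```

Piping the `rbd snap ls -p ceph-ssd-block-pool --all csi-vol-14794525-a741-4bab-894f-da5dbf33288d` command above into `trashed_snaps` would list the snapshot names to compare against the `parent:` line of the clone.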

@Madhu-1
Collaborator Author

Madhu-1 commented Nov 7, 2023

Therefore, it seems that RBD mirroring for the cloned image fails because ceph-csi deletes the temporary snapshot.

I don't think deleting the snapshot is the problem; deleting the clone image created for the volume snapshot is.

@ushitora-anqou Currently it's a limitation with cephcsi but we have a plan to work on it in a few releases. @Rakshith-R can you please add more details?

@Rakshith-R
Contributor

Therefore, it seems that RBD mirroring for the cloned image fails because ceph-csi deletes the temporary snapshot.

I don't think deleting the snapshot is the problem; deleting the clone image created for the volume snapshot is.

@ushitora-anqou Currently it's a limitation with cephcsi but we have a plan to work on it in a few releases. @Rakshith-R can you please add more details?

I have yet to test this, but I'll add more details soon.

@Rakshith-R
Contributor

@Madhu-1 We are trying to mirror a cloned RBD image from one cluster to another one. As you commented above, we enabled mirroring for all of the parent image, the temporary (*-temp) image, and the cloned image. However, the RBD mirroring didn't work correctly.

When we create a cloned image by a PVC-sourced PVC, ceph-csi first creates a temporary snapshot of the sourced image. Then, it deletes the snapshot after creating the cloned image from it.

# The RBD image `csi-vol-ad4b205f-6515-409d-833d-d65c6d812dd9-temp` was created by ceph-csi
$ kubectl exec -n ceph-ssd deploy/rook-ceph-tools -- rbd info -p ceph-ssd-block-pool csi-vol-ad4b205f-6515-409d-833d-d65c6d812dd9-temp
rbd image 'csi-vol-ad4b205f-6515-409d-833d-d65c6d812dd9-temp':
        size 10 GiB in 2560 objects
        order 22 (4 MiB objects)
        snapshot_count: 2
        id: 12b7d4917d0d
        block_name_prefix: rbd_data.12b7d4917d0d
        format: 2
        features: layering, deep-flatten, operations
        op_features: clone-parent, clone-child, snap-trash
        flags:
        create_timestamp: Mon Nov  6 04:56:59 2023
        access_timestamp: Mon Nov  6 04:56:59 2023
        modify_timestamp: Mon Nov  6 04:56:59 2023
        parent: ceph-ssd-block-pool/csi-vol-14794525-a741-4bab-894f-da5dbf33288d@19a7104b-2646-479c-93fb-806f1791cb56 # <--- This image is created from a snapshot
        overlap: 10 GiB
        mirroring state: enabled
        mirroring mode: snapshot
        mirroring global id: 43d70f92-99da-453a-86fb-b3ce033f7387
        mirroring primary: true

$ kubectl exec -n ceph-ssd deploy/rook-ceph-tools -- rbd snap ls -p ceph-ssd-block-pool --all csi-vol-14794525-a741-4bab-894f-da5dbf33288d
SNAPID  NAME                                                                                       SIZE    PROTECTED  TIMESTAMP                 NAMESPACE
   945  19a7104b-2646-479c-93fb-806f1791cb56                                                       10 GiB             Mon Nov  6 04:56:59 2023  trash (csi-vol-ad4b205f-6515-409d-833d-d65c6d812dd9-temp) # <--- But the snapshot is already deleted

We verified that RBD mirroring succeeds if we enabled mirroring without deleting the temporary snapshot by directly executing rbd command.

Therefore, it seems that RBD mirroring for the cloned image fails because ceph-csi deletes the temporary snapshot.

Will this be a limitation of ceph-csi? Or, is ceph-csi going to provide a workaround for this problem?

If by temporary snapshot you mean kubernetes snapshot,
we will not be supporting mirroring of such a chain where the intermediate k8s snapshot is deleted.
A child PVC that has an intermediate parent image in the trash must be flattened before mirroring can be enabled on it. We may add an option to flatten such an image before enabling mirroring, after initial support for clones lands.

A PVC-PVC clone is recommended to overcome this.

@ushitora-anqou

@Rakshith-R Sorry. We actually created a cloned PVC from another PVC (#2426).

If by temporary snapshot you mean kubernetes snapshot,

The "temporary snapshot" means the "random snap name" in this document.

We believe that the current implementation of ceph-csi deletes the temporary snapshot, so it does not allow us to create a cloned PVC from another PVC.

@Rakshith-R
Contributor

@Rakshith-R Sorry. We actually created a cloned PVC from another PVC (#2426).

If by temporary snapshot you mean kubernetes snapshot,

The "temporary snapshot" means the "random snap name" in this document.

We believe that the current implementation of ceph-csi deletes the temporary snapshot, so it does not allow us to create a cloned PVC from another PVC.

The temporary snapshot is deleted after the cloning process.

@ushitora-anqou
Can you please write down the steps you followed?
What worked, and what did not?
And what did you expect to happen?

@ushitora-anqou

@Rakshith-R Sorry, in our last comment I replied with the wrong information. I was able to create a cloned PVC from another PVC, but was unable to mirror the RBD image associated with the cloned PVC.

The steps I took were as follows:

  1. I deployed a PVC to the primary site and created an RBD image (csi-vol-14794525-a741-4bab-894f-da5dbf33288d).

  2. I enabled mirroring between the primary site and the secondary site, referring to the Rook documentation. As a result, I confirmed that the RBD image was mirrored correctly.

  3. I deployed a cloned PVC at the primary site, specifying the above PVC in the dataSource field. As a result, two RBD images were created: csi-vol-ad4b205f-6515-409d-833d-d65c6d812dd9 and csi-vol-ad4b205f-6515-409d-833d-d65c6d812dd9-temp.

  4. I have enabled mirroring of csi-vol-ad4b205f-6515-409d-833d-d65c6d812dd9-temp by running the following rbd command.

    • kubectl exec -n ceph-ssd deploy/rook-ceph-tools -- rbd -p ceph-ssd-block-pool mirror image enable csi-vol-ad4b205f-6515-409d-833d-d65c6d812dd9-temp snapshot
  5. I have deployed the VolumeReplication resource for the cloned PVC. I expected that this would result in correct mirroring of the cloned PVC, but in fact the mirroring did not work as described below.

$ kubectl exec -n ceph-ssd deploy/rook-ceph-tools -- rbd mirror pool status -p ceph-ssd-block-pool --verbose
health: ERROR

# ... snip ...

IMAGES

# ... snip ...

# Mirroring of the cloned image was not successful.
csi-vol-ad4b205f-6515-409d-833d-d65c6d812dd9:
  global_id:   4aea893a-11a4-41ce-99b2-148e68f7d6a3
  state:       up+stopped
  description: local image is primary
  service:     a on 10.69.0.4
  last_update: 2023-11-09 01:31:55
  peer_sites:
    name: 901d9b89-a189-4496-8742-eddfa9798e71
    state: up+error
    description: error bootstrapping replay
    last_update: 2023-11-09 01:31:57

# The image automatically created by ceph-csi is also not successfully mirrored.
csi-vol-ad4b205f-6515-409d-833d-d65c6d812dd9-temp:
  global_id:   43d70f92-99da-453a-86fb-b3ce033f7387
  state:       up+stopped
  description: local image is primary
  service:     a on 10.69.0.4
  last_update: 2023-11-09 01:31:55
  peer_sites:
    name: 901d9b89-a189-4496-8742-eddfa9798e71
    state: up+error
    description: error bootstrapping replay
    last_update: 2023-11-09 01:31:57

# ... snip ...
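
A status dump like the one above can be filtered for failing images with a short awk script. This is a sketch only: `peers_in_error` is a hypothetical helper name, and it assumes the verbose layout shown here (an unindented `<image>:` line followed by an indented `peer_sites:` block).

```shell
# peers_in_error: read `rbd mirror pool status --verbose` output on stdin and
# print the name of every image that reports a peer site in the up+error state.
peers_in_error() {
    awk '
        # An unindented line ending in ":" starts a new image entry.
        /^[^ \t#].*:$/ { img = substr($0, 1, length($0) - 1); in_peer = 0; next }
        # Only states listed under peer_sites are peer states.
        /^[ \t]+peer_sites:/ { in_peer = 1; next }
        in_peer && $1 == "state:" && $2 == "up+error" { print img; in_peer = 0 }
    '
}
```

The local `state:` line (for example `up+stopped` on the primary) is ignored on purpose, because only the peer state shows whether the remote side managed to bootstrap the image.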

The result of running rbd info against csi-vol-ad4b205f-6515-409d-833d-d65c6d812dd9-temp is as follows:

$ kubectl exec -n ceph-ssd deploy/rook-ceph-tools -- rbd info -p ceph-ssd-block-pool csi-vol-ad4b205f-6515-409d-833d-d65c6d812dd9-temp
rbd image 'csi-vol-ad4b205f-6515-409d-833d-d65c6d812dd9-temp':
        size 10 GiB in 2560 objects
        order 22 (4 MiB objects)
        snapshot_count: 2
        id: 12b7d4917d0d
        block_name_prefix: rbd_data.12b7d4917d0d
        format: 2
        features: layering, deep-flatten, operations
        op_features: clone-parent, clone-child, snap-trash
        flags:
        create_timestamp: Mon Nov  6 04:56:59 2023
        access_timestamp: Mon Nov  6 04:56:59 2023
        modify_timestamp: Mon Nov  6 04:56:59 2023
        #####
        ##### ceph-csi is creating a *-temp image from a snapshot of the parent image.
        #####
        parent: ceph-ssd-block-pool/csi-vol-14794525-a741-4bab-894f-da5dbf33288d@19a7104b-2646-479c-93fb-806f1791cb56
        overlap: 10 GiB
        mirroring state: enabled ###### Mirroring is correctly enabled (but not successful)
        mirroring mode: snapshot
        mirroring global id: 43d70f92-99da-453a-86fb-b3ce033f7387
        mirroring primary: true

When I checked the RBD mirror daemon, I found the following logs, so I thought that the mirroring was not successful because the snapshot needed to create *-temp had been deleted.

debug 2023-11-06T07:39:55.000+0000 7f2182a23700 -1 librbd::image::OpenRequest: failed to find snapshot 19a7104b-2646-479c-93fb-806f1791cb56
debug 2023-11-06T07:39:55.001+0000 7f2175a09700 -1 librbd::image::CloneRequest: 0x558dda22c000 handle_open_parent: failed to open parent image: (2) No such file or directory
debug 2023-11-06T07:39:55.001+0000 7f2175a09700 -1 rbd::mirror::image_replayer::CreateImageRequest: 0x558ddaaa7e00 handle_clone_image: failed to clone image ceph-ssd-block-pool/12b77c307c09 to csi-vol-ad4b205f-6515-409d-833d-d65c6d812dd9-temp
debug 2023-11-06T07:39:55.437+0000 7f2175a09700 -1 librbd::image::CloneRequest: 0x558dda273180 handle_open_parent: failed to open parent image: (2) No such file or directory
debug 2023-11-06T07:39:55.438+0000 7f2175a09700 -1 rbd::mirror::image_replayer::CreateImageRequest: 0x558dda9c2f00 handle_clone_image: failed to clone image ceph-ssd-block-pool/12b7d4917d0d to csi-vol-ad4b205f-6515-409d-833d-d65c6d812dd9
debug 2023-11-06T07:40:25.004+0000 7f2183224700 -1 librbd::image::OpenRequest: failed to find snapshot 19a7104b-2646-479c-93fb-806f1791cb56
debug 2023-11-06T07:40:25.005+0000 7f2175a09700 -1 librbd::image::CloneRequest: 0x558dda1c6f00 handle_open_parent: failed to open parent image: (2) No such file or directory
debug 2023-11-06T07:40:25.005+0000 7f2175a09700 -1 rbd::mirror::image_replayer::CreateImageRequest: 0x558ddaa26000 handle_clone_image: failed to clone image ceph-ssd-block-pool/12b77c307c09 to csi-vol-ad4b205f-6515-409d-833d-d65c6d812dd9-temp
debug 2023-11-06T07:40:25.417+0000 7f2175a09700 -1 rbd::mirror::image_replayer::CreateImageRequest: 0x558dda9c34a0 handle_clone_image: failed to clone image ceph-ssd-block-pool/12b7d4917d0d to csi-vol-ad4b205f-6515-409d-833d-d65c6d812dd9
debug 2023-11-06T07:40:25.417+0000 7f2175a09700 -1 librbd::image::CloneRequest: 0x558dda1c6c80 handle_open_parent: failed to open parent image: (2) No such file or directory
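
To pull the missing snapshot names out of a long daemon log, something like the following could be used. `missing_snaps` is a hypothetical helper name, and the sketch assumes the `failed to find snapshot <name>` message format seen in the lines above.

```shell
# missing_snaps: read rbd-mirror log lines on stdin and print each snapshot
# name reported by "failed to find snapshot", de-duplicated.
missing_snaps() {
    sed -n 's/.*failed to find snapshot \([^[:space:]]*\).*/\1/p' | sort -u
}
```

For the log above this should reduce the repeated errors to the single snapshot name that also appears in the `parent:` line of the *-temp image.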

@ushitora-anqou

I realized that this issue is about mirroring a PVC cloned from a VolumeSnapshot, whereas my posts are about mirroring a PVC cloned from another PVC.

I'll create a POC patch to fix my case and attach it to #2426.


5 participants