New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cleanup RADOS namespace with forced deletion annotation #14052
Conversation
d6f684b
to
92c96c1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some small suggestions
logger.Error(retErr) | ||
} | ||
|
||
// TODO: rather than deleting add tasks to trash the image? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this mean to add tasks to trash it? Since force deleting, seems like we should directly delete all the images as you have here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the TODO for now.
There was one suggestion by Madhu to use following command instead of deleting the images
rbd trash mv <RBD image image>
ceph rbd task trash remove <RBD image>
pkg/daemon/ceph/client/image.go
Outdated
@@ -41,8 +41,11 @@ type CephBlockImage struct { | |||
InfoName string `json:"name"` | |||
} | |||
|
|||
func ListImages(context *clusterd.Context, clusterInfo *ClusterInfo, poolName string) ([]CephBlockImage, error) { | |||
func ListImages(context *clusterd.Context, clusterInfo *ClusterInfo, poolName, namespace string) ([]CephBlockImage, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For code readability, how about creating two methods? The latter will have exactly your implementation, and the first one just calls the second one.
func ListImages(context *clusterd.Context, clusterInfo *ClusterInfo, poolName, namespace string) ([]CephBlockImage, error) { | |
func ListImagesInPool(context *clusterd.Context, clusterInfo *ClusterInfo, poolName string) ([]CephBlockImage, error) { | |
return ListImagesInRadosNamespace(context, clusterInfo, poolName, "") | |
} | |
func ListImagesInRadosNamespace(context *clusterd.Context, clusterInfo *ClusterInfo, poolName, namespace string) ([]CephBlockImage, error) { |
pkg/daemon/ceph/client/image.go
Outdated
@@ -116,10 +133,13 @@ func CreateImage(context *clusterd.Context, clusterInfo *ClusterInfo, name, pool | |||
return &CephBlockImage{Name: name, Size: newSizeBytes}, nil | |||
} | |||
|
|||
func DeleteImage(context *clusterd.Context, clusterInfo *ClusterInfo, name, poolName string) error { | |||
func DeleteImage(context *clusterd.Context, clusterInfo *ClusterInfo, name, poolName, namespace string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, I'd suggest DeleteImageInPool()
and DeleteImageInRadosNamespace()
92c96c1
to
663fc94
Compare
806567a
to
6581c50
Compare
2a33156
to
499bee1
Compare
| Custom Resource | Ceph Resources to be cleaned up | | ||
| -------- | ------- | | ||
| CephFilesystemSubVolumeGroup | CSI stored RADOS OMAP details for pvc/volumesnapshots, subvolume snapshots, subvolume clones, subvolumes | | ||
| CephBlockPoolRadosNamespace | Images and snapshots in the rados namespace| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| CephBlockPoolRadosNamespace | Images and snapshots in the rados namespace| | |
| CephBlockPoolRadosNamespace | Images and snapshots in the RADOS namespace| |
499bee1
to
0b8517a
Compare
) | ||
|
||
func RadosNamespaceCleanup(context *clusterd.Context, clusterInfo *client.ClusterInfo, poolName, radosNamespace string) error { | ||
logger.Infof("starting clean up of CephBlockPoolRadosNamespace %q resources in blockpool %q", radosNamespace, poolName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
update blockpool
to cephblockpool
ro match the name in all other places?
return ListImagesInRadosNamespace(context, clusterInfo, poolName, "") | ||
} | ||
|
||
func ListImagesInRadosNamespace(context *clusterd.Context, clusterInfo *ClusterInfo, poolName, namespace string) ([]CephBlockImage, error) { | ||
args := []string{"ls", "-l", poolName} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not related to this PR we could have used rbd ls -p=replicapool
instead of -l
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. Just used the the existing method we had for list images.
err = cephclient.DeleteImageInRadosNamespace(context, clusterInfo, image.Name, poolName, radosNamespace) | ||
if err != nil { | ||
retErr = errors.Wrapf(err, "failed to delete image %q in pool %q.", image.Name, poolName) | ||
logger.Error(retErr) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK rhis doesn't work when we have clones created from the snapshot, when we try to delete the clones on the parent image the parent will fail saying there are clones created from the parent image
sh-4.4$ rbd clone --rbd-default-clone-format 2 replicapool/test@test replicapool/test2
sh-4.4$
sh-4.4$ rbd ls replicapool
test
test2
sh-4.4$ rbd snap purge test
rbd: error opening default pool 'rbd'
Ensure that the default pool has been created or specify an alternate pool name.
sh-4.4$ rbd snap purge replicapool/test
Removing all snapshots: 100% complete...done.
sh-4.4$ rbd rm replicapool/test
2024-04-10T18:04:15.724+0000 7f406b33b580 -1 librbd::api::Image: remove: image has snapshots - not removing
Removing image: 0% complete...failed.
rbd: image has snapshots with linked clones - these must be deleted or flattened before the image can be removed.
sh-4.4$
sh-4.4$ ceph rbd task add trash remove replicapool/911d3a445d20
{"sequence": 1, "id": "2fc6afa4-1239-45f9-98e6-2a4298d8839c", "message": "Removing image replicapool/911d3a445d20 from trash", "refs": {"action": "trash remove", "pool_name": "replicapool", "pool_namespace": "", "image_id": "911d3a445d20"}}
sh-4.4$ rbd trash mv replicapool/test2
sh-4.4$ ceph rbd task add trash remove replicapool/ee91d79d6186
{"sequence": 2, "id": "fed70961-a430-4b6d-8594-f8e18f824905", "message": "Removing image replicapool/ee91d79d6186 from trash", "refs": {"action": "trash remove", "pool_name": "replicapool", "pool_namespace": "", "image_id": "ee91d79d6186"}}
sh-4.4$ ceph rbd task add trash remove replicapool/ee91d79d6186
{"sequence": 2, "id": "fed70961-a430-4b6d-8594-f8e18f824905", "message": "Removing image replicapool/ee91d79d6186 from trash", "refs": {"action": "trash remove", "pool_name": "replicapool", "pool_namespace": "", "image_id": "ee91d79d6186"}}
please test it out and see all the images are deleted from pool/namespace and also from trash.
2f8ede6
to
57b46b2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with the latest changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some nits, LGTM
err = cephclient.MoveImageToTrashInRadosNamespace(context, clusterInfo, poolName, image.Name, radosNamespace) | ||
if err != nil { | ||
retErr = errors.Wrapf(err, "failed to delete image %q in cephblockpool %q.", image.Name, poolName) | ||
logger.Error(retErr) | ||
} | ||
err = cephclient.DeleteImageFromTrashInRadosNamespace(context, clusterInfo, poolName, image.ID, radosNamespace) | ||
if err != nil { | ||
retErr = errors.Wrapf(err, "failed to delete image %q in cephblockpool %q.", image.Name, poolName) | ||
logger.Error(retErr) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
above we already have a for loop for image and we can move this check to above for loop.
for _, image := range images { | ||
err = cephclient.MoveImageToTrashInRadosNamespace(context, clusterInfo, poolName, image.Name, radosNamespace) | ||
if err != nil { | ||
retErr = errors.Wrapf(err, "failed to delete image %q in cephblockpool %q.", image.Name, poolName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
failed to move image to trash
add radosnamespace name to the error message (please take care of it in all the places
} | ||
err = cephclient.DeleteImageFromTrashInRadosNamespace(context, clusterInfo, poolName, image.ID, radosNamespace) | ||
if err != nil { | ||
retErr = errors.Wrapf(err, "failed to delete image %q in cephblockpool %q.", image.Name, poolName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
failed to add task to remove image from trash
add radosnamespace name to the error message
pkg/daemon/ceph/client/image.go
Outdated
func ListImagesInPool(context *clusterd.Context, clusterInfo *ClusterInfo, poolName string) ([]CephBlockImage, error) { | ||
return ListImagesInRadosNamespace(context, clusterInfo, poolName, "") | ||
} | ||
|
||
func ListImagesInRadosNamespace(context *clusterd.Context, clusterInfo *ClusterInfo, poolName, namespace string) ([]CephBlockImage, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add comments for these exported functions?
cmd.JsonOutput = true | ||
buf, err := cmd.Run() | ||
if err != nil { | ||
return snapshots, errors.Wrapf(err, "failed to list snapshots of image %q in cephblockpool %q", imageName, poolName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add radosnamespace name to the error message?
cmd := NewRBDCommand(context, clusterInfo, args) | ||
_, err := cmd.Run() | ||
if err != nil { | ||
return errors.Wrapf(err, "failed to delete snapshot %q of image %q in cephblockpool %q", snapshot, imageName, poolName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add radosnamespace name to the error message?
} | ||
|
||
// DeleteSnapshotInRadosNamespace deletes a image snapshot created in block pool in a given rados namespace | ||
func DeleteSnapshotInRadosNamespace(context *clusterd.Context, clusterInfo *ClusterInfo, poolName, imageName, snapshot, namespace string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we remove below if
check as we are deleting snapshots in a rados namespace, same can be done for all other functions as well.
if namespace != "" {
args = append(args, "--namespace", namespace)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
applicable for all other functions
pkg/daemon/ceph/client/image.go
Outdated
@@ -196,3 +273,10 @@ func UnMapImage(context *clusterd.Context, clusterInfo *ClusterInfo, imageName, | |||
func getImageSpec(name, poolName string) string { | |||
return fmt.Sprintf("%s/%s", poolName, name) | |||
} | |||
|
|||
func getImageSpecInRadosNamespace(name, poolName, namespace string) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name, poolName, namespace string
to poolName, namespace,image string
lets keep it in order
pkg/daemon/ceph/client/image.go
Outdated
func getImageSpecInRadosNamespace(name, poolName, namespace string) string { | ||
return fmt.Sprintf("%s/%s/%s", poolName, namespace, name) | ||
} | ||
func getImageSnapshotSpec(name, poolName, snapshot string) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here as well
Clean up pool images and snapshots in the radosnamespace Signed-off-by: sp98 <sapillai@redhat.com>
57b46b2
to
f6b1449
Compare
Feedback addressed, or else let's follow up in a separate PR
Cleanup RADOS namespace with forced deletion annotation (backport #14052)
If RadosNamespace CR is deleted with
rook.io/force-deletion
annotation then start a clean up job that will delete the images and snapshots on the radosnamespace.Steps followed for cleanup:
Operator logs after deleting radosNamespace CR with annotation
operator-logs.txt
Checklist: