Support backup on volumeMode block via Data mover (Kopia) #6548

Closed
shubham-pampattiwar opened this issue Jul 25, 2023 · 23 comments · Fixed by #6680

Comments

@shubham-pampattiwar
Collaborator

shubham-pampattiwar commented Jul 25, 2023

Describe the problem/challenge you have

Currently, the Velero data mover does not support backup for PVs whose volumeMode is Block.

Describe the solution you'd like

Make Velero data mover backup and restore operations work for PVs with volumeMode Block.

Anything else you would like to add:

Environment:

  • Velero version (use velero version):
  • Kubernetes version (use kubectl version):
  • Kubernetes installer & version:
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):

Vote on this issue!

This is an invitation to the Velero community to vote on issues; you can see the project's top-voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.

  • 👍 for "The project would be better with this feature added"
  • 👎 for "This feature will not enhance the project in a meaningful way"
@draghuram
Contributor

@shubham-pampattiwar Considering that FSB and data mover backup work differently, it would be better to split this into two separate issues.

@dzaninovic
Contributor

I am looking into what is needed to support Block PVCs in the data mover path, so I tried to run a backup to see what would fail.

PVC provisioned from the snapshot is Pending and the error message is “failed pre-populate data from snapshot d7f830c4-280c-11ee-8a47-0242ac110003: exit status 1: tar: invalid magic”.
volumesnapshot and volumesnapshotcontent are ready.

$ kubectl get volumesnapshot -A
NAMESPACE   NAME            READYTOUSE   SOURCEPVC   SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS            SNAPSHOTCONTENT   CREATIONTIME   AGE
velero      backup3-q5b6p   true                     backup3-q5b6p           1Gi           csi-hostpath-snapclass   backup3-q5b6p     27m            27m

$ kubectl get volumesnapshotcontent -A
NAME            READYTOUSE   RESTORESIZE   DELETIONPOLICY   DRIVER                VOLUMESNAPSHOTCLASS      VOLUMESNAPSHOT   VOLUMESNAPSHOTNAMESPACE   AGE
backup3-q5b6p   true         1073741824    Delete           hostpath.csi.k8s.io   csi-hostpath-snapclass   backup3-q5b6p    velero                    27m

PVC VolumeMode is Filesystem instead of Block.

$ kubectl describe pvc backup3-q5b6p -n velero
Name:          backup3-q5b6p
Namespace:     velero
StorageClass:  csi-hostpath-sc
Status:        Pending
Volume:        
Labels:        <none>
Annotations:   volume.beta.kubernetes.io/storage-provisioner: hostpath.csi.k8s.io
               volume.kubernetes.io/storage-provisioner: hostpath.csi.k8s.io
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      
Access Modes:  
VolumeMode:    Filesystem
DataSource:
  APIGroup:  snapshot.storage.k8s.io
  Kind:      VolumeSnapshot
  Name:      backup3-q5b6p
Used By:     backup3-q5b6p
Events:
  Type     Reason              Age                From                                                                           Message
  ----     ------              ----               ----                                                                           -------
  Warning  ProvisioningFailed  26s                hostpath.csi.k8s.io_csi-hostpathplugin-0_77083f3a-3d1c-44c7-ab75-3be9f8449218  failed to provision volume with StorageClass "csi-hostpath-sc": error getting handle for DataSource Type VolumeSnapshot by Name backup3-q5b6p: snapshot backup3-q5b6p is not Ready
  Normal   Provisioning        11s (x5 over 26s)  hostpath.csi.k8s.io_csi-hostpathplugin-0_77083f3a-3d1c-44c7-ab75-3be9f8449218  External provisioner is provisioning volume for claim "velero/backup3-q5b6p"
  Warning  ProvisioningFailed  11s (x4 over 25s)  hostpath.csi.k8s.io_csi-hostpathplugin-0_77083f3a-3d1c-44c7-ab75-3be9f8449218  failed to provision volume with StorageClass "csi-hostpath-sc": rpc error: code = Unknown desc = failed pre-populate data from snapshot d7f830c4-280c-11ee-8a47-0242ac110003: exit status 1: tar: invalid magic
tar: short read
  Normal  ExternalProvisioning  4s (x3 over 26s)  persistentvolume-controller  waiting for a volume to be created, either by external provisioner "hostpath.csi.k8s.io" or manually created by system administrator

@Lyndon-Li
Contributor

Lyndon-Li commented Jul 26, 2023

Block-level backup is a valuable feature and has come up in discussion on several occasions. However, we cannot implement it on top of the current FS-level data mover. Reasons:

  1. Once a volume is mounted in block mode, we don't see files or a file system object; we see a raw disk device, and we need a special approach to read data from it. That is, block-level reads cannot use arbitrary sizes, they must be aligned with the sector size; moreover, the block device cannot be opened like an ordinary file (see the sketch below).
  2. We need to figure out how to store this block data in the backup repo. The simplest option is to save each disk as one repo object, but that is not practical:
  • A disk may be very large, and it is inefficient to save such a large object
  • The most useful feature of a block-level backup is its ability to integrate with CBT (changed block tracking), so we should design the format and organization of the data in the repo to maintain the relationship between the parent object and incremental objects

In summary, we need a new uploader and also a new data format/organization for the backup repo. As you can see in the Unified Repository design and Volume Snapshot Data Mover design, we have left space for the new uploader and repo module.
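For illustration only, here is a minimal Go sketch of the sector-aligned read constraint from reason 1 above. The device path and sector size are assumptions, and this is not Velero code:

```go
// Minimal sketch: reading a raw block device in sector-aligned chunks.
// There is no directory tree to walk here -- the device is a single
// stream of raw bytes, which is why a file-system uploader cannot
// simply be pointed at it.
package main

import (
	"fmt"
	"io"
	"os"
)

const sectorSize = 512 // assumed; real code should query the device

func readDevice(devPath string) (int64, error) {
	dev, err := os.Open(devPath)
	if err != nil {
		return 0, err
	}
	defer dev.Close()

	buf := make([]byte, 128*sectorSize) // keep reads a multiple of the sector size
	var total int64
	for {
		n, err := dev.Read(buf)
		total += int64(n)
		if err == io.EOF {
			return total, nil
		}
		if err != nil {
			return total, err
		}
		// A real uploader would hand buf[:n] to the repository here.
	}
}

func main() {
	n, err := readDevice("/dev/xvdf") // hypothetical device path
	if err != nil {
		fmt.Fprintln(os.Stderr, "read failed:", err)
		os.Exit(1)
	}
	fmt.Printf("read %d bytes\n", n)
}
```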

@Lyndon-Li
Contributor

Lyndon-Li commented Jul 26, 2023

Pod Volume Backup is not feasible even if it could read data at the block level, because it is not consistent at the block level; as a result, the restored volume would be corrupted and could not be mounted.

@draghuram
Contributor

@Lyndon-Li, You bring up some good points, but we should distinguish between FSB and the new data mover path and focus only on the latter. Since a snapshot is going to be taken of the block PV, it should be OK to read the device as a file and back it up. I agree that CBT would help, but it is going to take a long while - first for Kubernetes to agree on the API and then for CSI drivers to implement it. It would be better to have a solution in place before then. Also note that Kopia will dedup all the zero blocks in the device, but we cannot avoid the overhead of reading the device.

Finally, we have supported Block PVCs at CloudCasa using Kopia for a while and it is certainly doable. @dzaninovic is exploring the Velero data mover code (see his test results above) and he will soon follow up with his findings about possible fixes.

@shubham-pampattiwar shubham-pampattiwar changed the title Support backup on volumeMode block via FSBackup (Kopia) and built-in Data mover (Kopia) Support backup on volumeMode block via Data mover (Kopia) Jul 27, 2023
@shubham-pampattiwar
Collaborator Author

@draghuram @Lyndon-Li Updated the issue title and description; let's use this issue for the data mover.

@Lyndon-Li
Contributor

Lyndon-Li commented Jul 27, 2023

it should be ok to read the device as a file and back it up
we supported Block PVCs at CloudCasa using Kopia

Yes, it is doable to read from a block device. However, we cannot do this with the Kopia uploader, because the Kopia uploader is a file system uploader, whereas to read block data:

  • When we read from a block device, the read size cannot be arbitrary; it must be aligned with the sector size
  • For the IO mode, we need to use direct IO (DIO) instead of buffered IO, because buffered IO is much less efficient here
  • We eventually need to integrate with CBT, so the uploader should be able to query CBT in the future

By that I mean we will need a new uploader, i.e., a block uploader, to do the block device backup (see the IO sketch below).
If that is not the case in CloudCasa, or if you are directly using the Kopia tool to back up a block device, please share more details.
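As a rough illustration of the direct IO point above, here is a minimal sketch assuming a Linux host and the golang.org/x/sys/unix package; the device path and alignment value are assumptions, not Velero code:

```go
// Minimal sketch of direct IO (O_DIRECT) reads from a block device,
// which require a memory-aligned buffer and aligned read sizes.
package main

import (
	"fmt"
	"os"
	"unsafe"

	"golang.org/x/sys/unix"
)

const alignment = 4096 // typical O_DIRECT requirement; should be queried per device

// alignedBuffer returns a slice whose backing memory starts on an
// alignment boundary, as O_DIRECT expects.
func alignedBuffer(size int) []byte {
	raw := make([]byte, size+alignment)
	off := alignment - int(uintptr(unsafe.Pointer(&raw[0]))%alignment)
	return raw[off : off+size]
}

func main() {
	// Hypothetical device path, for illustration only.
	fd, err := unix.Open("/dev/xvdf", unix.O_RDONLY|unix.O_DIRECT, 0)
	if err != nil {
		fmt.Fprintln(os.Stderr, "open:", err)
		os.Exit(1)
	}
	defer unix.Close(fd)

	buf := alignedBuffer(1 << 20) // 1 MiB, a multiple of the alignment
	var total int64
	for {
		n, err := unix.Read(fd, buf)
		if n > 0 {
			total += int64(n)
			// A block uploader would push buf[:n] to the repo here, and
			// eventually consult CBT to skip unchanged extents.
		}
		if n == 0 || err != nil {
			break
		}
	}
	fmt.Printf("read %d bytes with direct IO\n", total)
}
```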

@Lyndon-Li
Contributor

Lyndon-Li commented Jul 27, 2023

it should be ok to read the device as a file and back it up

If we back up the entire block device's data as a single file and put it into the kopia repository, where the file is represented as an object in the repo, it does work.
However, that is not what we want eventually, for the reasons I mentioned originally.
For sure, we can do this as the first phase, but eventually we need to reach the final state. So we need to keep this clear if we want to do the work in phases.

@Lyndon-Li
Contributor

Lyndon-Li commented Jul 27, 2023

With all the above, I am not saying the current data movement architecture doesn't support block-level backup; as I've mentioned, we have left space for block-level backup in the Unified Repo design and Volume Snapshot Data Movement design.
I just want to say that it involves a large effort; we cannot treat it as a bug fix.
Even for the first phase, we at least need an implementation of the new block data uploader and also some repo-side changes to write the block data as a repo object.

Therefore, we can discuss this in the plan for 1.13 and see what can be included.

@Lyndon-Li
Contributor

Lyndon-Li commented Jul 27, 2023

For 1.12, the current data mover code deliberately sets the volumeMode to Filesystem since we lack data path support (from the uploader and repo) for block-level backup (that is why you get the above test result). So even if we can create a snapshot of a block mode PVC, or create a PVC from the snapshot, we have no way to back it up.
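Purely as an illustration of the behaviour described above (not the actual Velero exposer code, and with hypothetical names), the difference is roughly between pinning the intermediate backup PVC's volumeMode and propagating the source PVC's mode:

```go
// Illustrative sketch only; names are hypothetical.
package sketch

import corev1 "k8s.io/api/core/v1"

// pinnedMode mirrors the 1.12 behaviour described above: the backup PVC
// created from the snapshot is always requested with Filesystem mode,
// because the data path cannot consume a raw block device.
func pinnedMode() *corev1.PersistentVolumeMode {
	m := corev1.PersistentVolumeFilesystem
	return &m
}

// propagatedMode is what block-mode support needs instead: carry the
// source PVC's volumeMode through, so a Block volume stays Block on the
// intermediate PVC the data mover reads from.
func propagatedMode(source *corev1.PersistentVolumeClaim) *corev1.PersistentVolumeMode {
	if source.Spec.VolumeMode != nil {
		m := *source.Spec.VolumeMode
		return &m
	}
	m := corev1.PersistentVolumeFilesystem // Kubernetes default when unset
	return &m
}
```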

Moreover, in the data mover documentation, which will be ready for review around next week, we will clarify that the current data mover only supports file system mode backup.

@Lyndon-Li
Contributor

@draghuram See my comments above. Thanks.

@draghuram
Contributor

I agree that CBT is the right way to go for block PV backups, but it will take some time for the current Kubernetes CBT proposal to reach GA and for CSI drivers to implement it. In the meantime, it would be good for Velero to have a decent solution. It is possible to read the device and back it up as a file using Kopia (as we have done in CloudCasa).

I think it would be best to have a POC implementation and then discuss.

@Lyndon-Li
Contributor

@draghuram
You can try backing up block volumes through the Kopia uploader directly. It might just work (for Linux-based hosts); I've never tried it.

On the other hand, this way is not compatible with the final block-level backup target we want:

  • Support CBT from the uploader
  • Support incremental chain from the backup repo
  • Gain good performance and reasonable resource consumption for large volumes
  • Support backups of block data from non-disk objects
  • Extensible for hosts based on other OS
  • ...

Originally, I thought we at least need each phase to be compatible with the final target so that we can evolve gradually.
However, if the current way takes much less effort and has less impact (i.e., no architectural changes, no repo format changes), we can regard it as Phase-0; at least it would satisfy the requirement to back up volumes in block mode.

I will discuss this further with more members and get back to you if anything comes of it.

@draghuram
Contributor

Thanks @Lyndon-Li. I think we are in agreement, as what I have been describing is the same as your "Phase-0".

@reasonerjt
Contributor

In our internal discussion, we found it's not very clear how this will be implemented. If the change is not minimal, I would suggest a simple design to make sure there is no breaking change and that it does not block moving towards the longer-term goal described in
#6548 (comment)

@weshayutin
Contributor

@reasonerjt I suspect a discussion in upcoming community calls is required before coming to any conclusions. To your point, designs are always helpful.

@pradeepkchaturvedi pradeepkchaturvedi added the 1.13-candidate issue/pr that should be considered to target v1.13 minor release label Aug 4, 2023
@weshayutin
Contributor

@pradeepkchaturvedi the hope here is that we can get this into 1.12.x, not 1.13.0. Perhaps some discussion?

@sseago
Collaborator

sseago commented Aug 7, 2023

@weshayutin to facilitate getting this into 1.12.x, we have a PR that adds the new field needed to implement this to the provider interface, but without implementing it -- i.e., it will error out for block mode volumes, with the hope of adding support for block mode in 1.12.x. We'd like to get that PR into 1.12.0 so that the API is modified with the necessary new field before anyone is using it, but the actual implementation of block mode support would come in a later patch release. That PR is here: #6608
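For readers following along, a hypothetical sketch of the shape of change being described -- a volume-mode parameter on the provider interface that errors out for block mode until support is implemented. The names and signature here are illustrative, not the actual #6608 API:

```go
// Hypothetical sketch only; not the Velero provider interface.
package sketch

import (
	"context"
	"errors"
)

type VolumeMode string

const (
	FileSystemMode VolumeMode = "Filesystem"
	BlockMode      VolumeMode = "Block"
)

type BackupProvider interface {
	// RunBackup now receives the volume mode of the path it is given.
	RunBackup(ctx context.Context, path string, mode VolumeMode) (snapshotID string, err error)
}

type kopiaProvider struct{}

var _ BackupProvider = (*kopiaProvider)(nil)

func (p *kopiaProvider) RunBackup(ctx context.Context, path string, mode VolumeMode) (string, error) {
	if mode == BlockMode {
		// Placeholder until block-mode support is implemented in a later release.
		return "", errors.New("volumeMode Block is not yet supported by this uploader")
	}
	// ... existing filesystem backup path ...
	return "example-snapshot-id", nil
}
```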

@shawn-hurley
Contributor

Hello folks, if we need to have a quick one-off outside the community calls, I would be willing to join and explain my thought process for the implementation PR and the rest of the design.

Right now, the design and the implementation PR are out of sync, and I am testing out @Lyndon-Li's suggestion to use the virtualfs package before making changes to the design. If we need them to match right now, I can do an interim update; please let me know what you want and/or how to move this forward.
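As a rough sketch of the virtualfs direction mentioned above: wrap the raw device in a virtual file entry so the existing Kopia uploader can snapshot it as a single streaming object. The exact virtualfs function and its signature are assumed here, not verified against a specific kopia release:

```go
// Illustrative sketch only, assuming kopia's fs/virtualfs package
// exposes StreamingFileFromReader as used below.
package sketch

import (
	"os"
	"path/filepath"

	"github.com/kopia/kopia/fs"
	"github.com/kopia/kopia/fs/virtualfs"
)

// blockDeviceEntry presents a block device as a single virtual file entry
// that the Kopia uploader can walk and stream into the repo as one object.
func blockDeviceEntry(devicePath string) (fs.Entry, error) {
	f, err := os.Open(devicePath) // e.g. /dev/xvdf (hypothetical path)
	if err != nil {
		return nil, err
	}
	// *os.File satisfies the reader interface expected by virtualfs; the
	// caller is responsible for closing f once the snapshot completes.
	return virtualfs.StreamingFileFromReader(filepath.Base(devicePath), f), nil
}
```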

@pradeepkchaturvedi

@weshayutin If the community maintainers agree to address this in 1.12.x, I would recommend updating the milestone accordingly. cc: @reasonerjt

@dzaninovic
Contributor

Here is the code on which the design was based:
catalogicsoftware#1

I got backup and restore to work, but this is still a work in progress, so we will keep improving it.

@dzaninovic
Contributor

I created a PR #6680 for this issue.

@reasonerjt reasonerjt added target/1.12.1 and removed 1.13-candidate issue/pr that should be considered to target v1.13 minor release labels Aug 25, 2023
@dzaninovic
Contributor

Since I got reassigned to another high-priority internal project, I can't find enough time to work on this, so @sshende-catalogicsoftware will take over this work.
