
Adding design updates to handle block device with kopia #6590

Merged
merged 1 commit into vmware-tanzu:main from add-design-block-vol on Aug 17, 2023

Conversation

shawn-hurley
Contributor

Thank you for contributing to Velero!

Please add a summary of your change

This will add the design elements for handling block devices. We are making a small change to the VGDP provider interface so that the data type can be surfaced to and used by data movers.

I don't believe this is an API change that is user-facing or even plugin-facing. Can someone help me make sure?

Please indicate you've done the following:

  • Accepted the DCO. Commits without the DCO will delay acceptance.
  • Created a changelog file or added /kind changelog-not-required as a comment on this pull request.
  • Updated the corresponding documentation in site/content/docs/main.

@kaovilai
Contributor

kaovilai commented Aug 2, 2023

/kind changelog-not-required

@github-actions github-actions bot added the kind/changelog-not-required PR does not require a user changelog. Often for docs, website, or build changes label Aug 2, 2023
@Lyndon-Li Lyndon-Li added kind/changelog-not-required PR does not require a user changelog. Often for docs, website, or build changes and removed kind/changelog-not-required PR does not require a user changelog. Often for docs, website, or build changes labels Aug 3, 2023
@dzaninovic
Contributor

dzaninovic commented Aug 8, 2023

I got the backup and restore to work in the implementation this design is based on:
catalogicsoftware#1

There is still a lot of work remaining to be done in the implementation.

@shawn-hurley
Contributor Author

shawn-hurley commented Aug 8, 2023

@Lyndon-Li Please comment on the PR where I attempted your proposal, if you would like:

catalogicsoftware#2

I think that @dzaninovic brings up good points about the reader, but we can use the virtualfs streaming file to ensure that Kopia handles it like a streaming file.

I think that for restoration, the performance impact of writing zeros could be a real concern. I think we should really consider making a wrapper for the output, like here:

https://github.com/catalogicsoftware/velero_block/pull/1/files#diff-5fce85fc3f9b0abc12ba749f6318d311e9177847e1a7e026abd6a8ec90a43827R16-R20

Like I said in the comment on the PR, I am planning on updating the design with the above. If it makes sense, please let me know so that I can adjust when updating tomorrow :)

@Lyndon-Li
Contributor

Lyndon-Li commented Aug 9, 2023

@shawn-hurley Let me comment on your changes from catalogicsoftware#2 below, so that we can keep the discussion centralized.
Generally speaking, if both the backup and restore work through virtualfs, it will be a good choice, because it involves fewer potential breaks from future Kopia versions and also makes things simpler.

@Lyndon-Li
Contributor

1. For the IO optimization, like DIO:
As I mentioned here, the current Kopia file system uploader doesn't support this, so we need a completely separate implementation.
For this implementation, we need to implement not only a reader but also a writer (because writes also benefit from DIO).
Moreover, in order to use DIO, the IO pattern will be different, because reads & writes have to be aligned to the sector size.

For all these complications, a new uploader -- a block uploader -- is more suitable.
Therefore, together with the other requirements as listed here, I categorized this under the future block-level backup instead of the current Phase-0.

On the other hand, I am not saying that the current virtualfs implementation technically cannot support these IO optimizations. As you can see, StreamingFileFromReader accepts an io.ReadCloser interface, so we can implement the interface and open & read the block device through DIO.

In conclusion:

  1. I don't suggest we have these IO optimizations in a patch release. Regarding data consistency, there are more potential problems with block IO, so we need more time for testing & verification.
  2. If we want to do it for Phase-0 in a minor release, we need to wrap the implementation well enough that the future block-level backup can reuse it.

@Lyndon-Li
Contributor

Lyndon-Li commented Aug 9, 2023

2. For skipping zero bytes during restore:
I don't think it is a good choice.
Zero bytes are not always meaningless and should always be applied to the target disk.
Once a new volume/PV is provisioned, we cannot assume that the entire disk is all-zero. Storage systems usually don't clear disk data unless explicitly requested to, e.g., for strong security requirements (consider how long it takes to write all zeros to a large disk). As a result, if some area of the disk should be zero but we skip writing zeros to it, and that area of the provisioned disk is not zero, data corruption will happen.

Moreover, detecting zero bytes is not an easy task either. Simply enumerating bytes one by one is not a good choice; think about how much CPU it takes for a large disk.

Actually, detecting and applying zero bytes involves backup repo and storage-side support. We don't need to consider it in Phase-0, and even the initial version of block-level backup doesn't need to support it.
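For illustration only (this is not part of the design): even the cheaper detection, comparing whole blocks against a zero block with bytes.Equal instead of enumerating bytes in a Go loop, still has to read and compare every block of the disk:

```go
package main

import (
	"bytes"
	"fmt"
)

const blockSize = 4096

var zeroBlock = make([]byte, blockSize)

// isZeroBlock reports whether a block is all zero, using a single
// bytes.Equal comparison for full-size blocks instead of a per-byte
// loop. Every block of the disk still has to be read and compared.
func isZeroBlock(b []byte) bool {
	if len(b) == blockSize {
		return bytes.Equal(b, zeroBlock)
	}
	for _, c := range b {
		if c != 0 {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isZeroBlock(make([]byte, blockSize))) // all-zero block
	b := make([]byte, blockSize)
	b[blockSize-1] = 1
	fmt.Println(isZeroBlock(b)) // one non-zero byte
}
```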

@Lyndon-Li
Contributor

3. Check that the target file is a block device at the OS level
This is good to have. We get the volume mode from Kubernetes first and then double-check it from the uploader in a different way.
Moreover, I would suggest we wrap this check in a separate function, as it will be useful for future modules/functionalities.

@shawn-hurley
Contributor Author

I don't suggest we have these IO optimizations in a patch release. Regarding data consistency, there are more potential problems with block IO, so we need more time for testing & verification.
If we want to do it for Phase-0 in a minor release, we need to wrap the implementation well enough that the future block-level backup can reuse it.

I believe the answer here is to just use Golang's built-in reader from os.File. This seems to work. @dzaninovic, if we need to add the performance bits, I think it makes sense to make that part of a more full-featured block uploader that can also handle CBT.

@Lyndon-Li just making sure, but you are ok with this? @dzaninovic does this work for you?

@dzaninovic
Contributor

I believe the answer here is to just use Golang's built-in reader from os.File. This seems to work. @dzaninovic, if we need to add the performance bits, I think it makes sense to make that part of a more full-featured block uploader that can also handle CBT.

@Lyndon-Li just making sure, but you are ok with this? @dzaninovic does this work for you?

I am fine with having optimizations added later in separate PRs and using the simple way for now, if that is acceptable.

@dzaninovic
Contributor

2. For skipping zero bytes during restore: I don't think it is a good choice. Zero bytes are not always meaningless and

This can be a non-default option that users can set if they know that disks are provisioned with all zeroes.

@shawn-hurley shawn-hurley force-pushed the add-design-block-vol branch 2 times, most recently from c466d34 to 572d601 Compare August 9, 2023 16:59
@shawn-hurley
Contributor Author

@sseago @shubham-pampattiwar @Lyndon-Li @dzaninovic @weshayutin

Quick ping to inform folks that the design is updated based on the conversations. Please take a look and let me know what y'all think!

@weshayutin
Contributor

I've read through this design, which looks good, and the interface change PR 6608, which is also looking good.

sseago
sseago previously approved these changes Aug 9, 2023
@Lyndon-Li
Contributor

The current design looks good to me.

Lyndon-Li
Lyndon-Li previously approved these changes Aug 10, 2023
@Lyndon-Li
Contributor

@dzaninovic

2. For skipping zero bytes during restore: I don't think it is a good choice. Zero bytes are not always meaningless and

This can be a non-default option that users can set if they know that disks are provisioned with all zeroes.

Personally, I don't think we can offer this as an optional method. It is too dangerous: all-zero provisioning is hard to configure and detect, not all upper-level scenarios lead to a fresh provision, and detecting zero data by per-byte enumeration is not a good bargain.
Anyway, as mentioned above, detecting and applying zero bytes is an advanced feature; let's see what we can deliver in the future once we have a more mature backup, backup repo, and restore.

@dzaninovic
Contributor

I reverted catalogicsoftware#2, where virtualfs.StreamingFileFromReader() was attempted, because it was causing a backup failure.

time="2023-08-09T21:31:47Z" level=error msg="Async fs backup data path failed" dataupload=backup1-cv4mr error="Failed to run kopia backup: Failed to upload the kopia snapshot for si default@default:block1/pvc-raw: unsupported source: default@default:block1/pvc-raw" error.file="/go/pkg/mod/github.com/kopia/kopia@v0.13.0/snapshot/snapshotfs/upload.go:1291" error.function="github.com/kopia/kopia/snapshot/snapshotfs.(*Uploader).Upload" logSource="pkg/controller/data_upload_controller.go:328"

By looking at the Kopia code, I can see that it fails because fs.StreamingFile is not handled in the switch statement.
https://github.com/kopia/kopia/blob/v0.13.0/snapshot/snapshotfs/upload.go#L1291

@shawn-hurley will attempt to use this instead, so if that is successful and preferable to my original solution, we will use that:
https://github.com/kopia/kopia/blob/master/fs/virtualfs/virtualfs.go#L96

In any case, the design will have to change to account for this, so don't merge it yet.

Signed-off-by: Shawn Hurley <shawn@hurley.page>
@codecov

codecov bot commented Aug 14, 2023

Codecov Report

Merging #6590 (572d601) into main (bb96c21) will decrease coverage by 0.02%.
Report is 31 commits behind head on main.
The diff coverage is n/a.

❗ Current head 572d601 differs from pull request most recent head ebaf316. Consider uploading reports for the commit ebaf316 to get more accurate results

@@            Coverage Diff             @@
##             main    #6590      +/-   ##
==========================================
- Coverage   60.18%   60.17%   -0.02%     
==========================================
  Files         242      242              
  Lines       25640    25679      +39     
==========================================
+ Hits        15432    15452      +20     
- Misses       9135     9150      +15     
- Partials     1073     1077       +4     

see 8 files with indirect coverage changes

@shawn-hurley
Contributor Author

@Lyndon-Li @sseago @shubham-pampattiwar

I have once again updated the design. We have to do some specific things during the restore to handle the block device and how things work from the node.

We still use the virtualfs to save the data into the kopia repository.

Please let me know what y'all think!

@dzaninovic My new virtualfs PR should work for restore now!

@dzaninovic
Contributor

@dzaninovic My new virtualfs PR should work for restore now!

I merged your code and confirmed that md5 of restored data is correct.

@Lyndon-Li
Contributor

Lyndon-Li commented Aug 15, 2023

@shawn-hurley
I checked the code. The restore part looks fine, as it adds some more checks on the block device. I am just curious whether the restore-part code is a must-have (vs. nice to have)?

from your statement:

do some specific things during the restore to handle the block device and how things work from the node

It looks like this code has fixed some problems, without which the restore would fail. I didn't get what the problems are. So if that is true, could you share some more details?

@shawn-hurley
Contributor Author

It looks like this code has fixed some problems, without which the restore would fail. I didn't get what the problems are. So if that is true, could you share some more details?

The most significant change is that the built-in output tries to create a directory where the block device is. This fails, so we check whether the file is a block device, and if it is, we skip creating the directory because it already exists.

Another change is in the writing of the blocks. The default FileSystemOutput.WriteFile was changing the data in some way (the SHA sums of the resulting data were inconsistent). I did not look into why, because even if we understood it, we would either need to write the code that now exists or make some change upstream.

@kaovilai
Contributor

#6608 merged

@sseago
Collaborator

sseago commented Aug 17, 2023

"codecov/project — 60.18% (-0.01%) compared to bb96c21"
This failure isn't relevant for this PR since it only includes design documentation and no actual new code that can be tested.

@sseago sseago merged commit 0e7c417 into vmware-tanzu:main Aug 17, 2023
7 of 8 checks passed
@dzaninovic dzaninovic mentioned this pull request Aug 18, 2023