Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

qcow: Fix the calculation of the number of clusters for L1 table #4767

Merged
merged 1 commit into from
Oct 13, 2022

Conversation

lv-mz
Copy link
Contributor

@lv-mz lv-mz commented Oct 12, 2022

In function create_for_size(),it calculates l1_clusters by dividing by cluster_size, whereas it should be divided by cluster_size/size_of(u64).

This fixes: #4762

Copy link
Member

@likebreath likebreath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lv-mz Thank you for the contribution. Can you please revise your commit message based on our documentation https://github.com/cloud-hypervisor/cloud-hypervisor/blob/main/CONTRIBUTING.md?

Can you please also provide some context from the qcow spec to explain what is "l1_clusters" and why its calculation was wrong?

@lv-mz
Copy link
Contributor Author

lv-mz commented Oct 12, 2022

@lv-mz Thank you for the contribution. Can you please revise your commit message based on our documentation https://github.com/cloud-hypervisor/cloud-hypervisor/blob/main/CONTRIBUTING.md?

Can you please also provide some context from the qcow spec to explain what is "l1_clusters" and why its calculation was wrong?

@likebreath Thankyou .
In my own development, I wanted to know how to calculate the number of clusters for L1 table in cloud-hypervisor , so I found the function create_ for_ size(version: u32, size: u64), it is calculated as follows:

let l2_size: u32 = cluster_size / size_of::() as u32;
let num_clusters: u32 = div_round_up_u64(size, u64::from(cluster_size)) as u32;
let num_l2_clusters: u32 = div_round_up_u32(num_clusters, l2_size);
**let l1_clusters: u32 = div_round_up_u32(num_l2_clusters, cluster_size);**

Variables num_l2_cluster and l1_clusters are used in the following code:

QcowHeader {
            magic: QCOW_MAGIC,
            version,
            backing_file_offset: 0,
            backing_file_size: 0,
            cluster_bits: DEFAULT_CLUSTER_BITS,
            size,
            crypt_method: 0,
            l1_size: num_l2_clusters,
            l1_table_offset: u64::from(cluster_size),
            // The refcount table is after l1 + header.
            refcount_table_offset: u64::from(cluster_size * (l1_clusters + 1)),

Therefore, l1_ clusters indicates the number of clusters occupied by the L1 table in the qcow file and
num_l2_clusters(header.l1_size) indicates the number of entries in the L1 table.

I think the number of clusters for L1 table should be calculated as follows:
We divide the total number of entries in the L1 table by the number of entries in each cluster ,not by the cluster size.

For example, if the data cluster size is 64k and L1 and L2 blocks are all one cluster long, and the entyies of L1 and L2 table are 8192, then the disk space that an L1 table can manage is 8192 x 8192 x 64k = 4T. If the disk size exceeds 4T, the L1 table need two clusters.

In function create_ for_ size(version: u32, size: u64)
if the size is greater than 4T (for example, 5T) , version is 2 and cluster size is default 65536, according to the function calculation method,value of l1_clusters will be still 1, In fact, its value should be 2.

If there is any problem with my analysis, please let me know. Thank you

qcow/src/qcow.rs Outdated
Comment on lines 258 to 261
let l2_size: u32 = cluster_size / size_of::<u64>() as u32;
let num_clusters: u32 = div_round_up_u64(size, u64::from(cluster_size)) as u32;
let num_l2_clusters: u32 = div_round_up_u32(num_clusters, l2_size);
let l1_clusters: u32 = div_round_up_u32(num_l2_clusters, cluster_size);
let l1_clusters: u32 = div_round_up_u32(num_l2_clusters, l2_size);
Copy link
Member

@sboeuf sboeuf Oct 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • l2_size should really be called entries_per_cluster given we're dividing a cluster size by size_of(u64), which really means we want the number of entries/addresses that can fit into a cluster.
  • num_clusters looks good given we're dividing the total size in bytes by the size of a cluster in bytes.
  • num_l2_clusters is fine, but we could update it for more clarity.
  • l1_clusters should be called num_l1_clusters.

So here is my suggestion to make things a bit clearer:

let entries_per_cluster: u32 = cluster_size / size_of::<u64>() as u32;
let num_clusters: u32 = div_round_up_u64(size, u64::from(cluster_size)) as u32;
let num_l2_clusters: u32 = div_round_up_u32(num_clusters, entries_per_cluster);
let num_l1_clusters: u32 = div_round_up_u32(num_l2_clusters, entries_per_cluster);

So to summarize, your change looks great, but I'd like to use this opportunity to rename a few variables to increase clarity.

Copy link
Contributor Author

@lv-mz lv-mz Oct 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sboeuf
thank you very much

Copy link
Member

@sboeuf sboeuf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks 👍

@sboeuf
Copy link
Member

sboeuf commented Oct 12, 2022

/cc @rbradford

@sboeuf sboeuf requested a review from rbradford October 12, 2022 09:40
let num_clusters: u32 = div_round_up_u64(size, u64::from(cluster_size)) as u32;
let num_l2_clusters: u32 = div_round_up_u32(num_clusters, l2_size);
let l1_clusters: u32 = div_round_up_u32(num_l2_clusters, cluster_size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You missed that this is calculated a second place in this file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate? I don't get the point.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

marvin:~/src/cloud-hypervisor (main)$ git grep "let l1_clusters = "
qcow/src/qcow.rs:        let l1_clusters = div_round_up_u64(num_l2_clusters, cluster_size);
qcow/src/qcow.rs:            let l1_clusters = div_round_up_u64(u64::from(header.l1_size), cluster_size);
qcow/src/qcow.rs:        let l1_clusters = div_round_up_u64(l2_clusters, cluster_size);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh okay the same calculation is done somewhere else :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rbradford Thanks,
Yes, there are 9 places that need to be modified, and so I have restored the original representation of l1_clusters.
@sboeuf

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danielverkamp Hi, this bit of code is essentially unmodified from crosvm. Do you think this proposed change makes sense? Looks like it can be found empirically with a disk size > 4TiB

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, the new math (and better names) look correct to me. The diff is a bit hard to read due to the renames, though, so if it were me, I might split it up into two commits just to make the bugfix part totally clear.

For the original submitter, if you're willing to send this to crosvm too once it's finalized (so the change is correctly attributed), that would be greatly appreciated! (https://crosvm.dev/book/contributing/index.html)

@lv-mz lv-mz force-pushed the bug_fix branch 2 times, most recently from e9a08f9 to c1f861d Compare October 12, 2022 10:49
In function create_for_size(), when calculating the number
of clusters for the L1 table, it incorrectly divides the total number of
entries in the L1 table by the cluster size. In fact, it should be
divided by the number of entries/addresses that can fit into a cluster.

Signed-off-by: lv.mengzhao <lv.mengzhao@zte.com.cn>
@rbradford rbradford changed the title qcow: fix the caculation of l1_clusters qcow: fix the calculation of l1_clusters Oct 12, 2022
@rbradford rbradford changed the title qcow: fix the calculation of l1_clusters qcow: Fix the calculation of the number of clusters for L1 table Oct 12, 2022
@rbradford rbradford merged commit 2a45e44 into cloud-hypervisor:main Oct 13, 2022
@lv-mz lv-mz deleted the bug_fix branch October 14, 2022 01:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

qcow: error caculation of the l1_clusters in function create_for_size()
5 participants