qcow: Fix the calculation of the number of clusters for L1 table #4767

lv-mz · 2022-10-12T01:52:00Z

In function create_for_size()，it calculates l1_clusters by dividing by cluster_size, whereas it should be divided by cluster_size/size_of(u64).

This fixes: #4762

likebreath

@lv-mz Thank you for the contribution. Can you please revise your commit message based on our documentation https://github.com/cloud-hypervisor/cloud-hypervisor/blob/main/CONTRIBUTING.md?

Can you please also provide some context from the qcow spec to explain what is "l1_clusters" and why its calculation was wrong?

lv-mz · 2022-10-12T07:30:27Z

@lv-mz Thank you for the contribution. Can you please revise your commit message based on our documentation https://github.com/cloud-hypervisor/cloud-hypervisor/blob/main/CONTRIBUTING.md?

Can you please also provide some context from the qcow spec to explain what is "l1_clusters" and why its calculation was wrong?

@likebreath Thankyou .
In my own development, I wanted to know how to calculate the number of clusters for L1 table in cloud-hypervisor , so I found the function create_ for_ size(version: u32, size: u64), it is calculated as follows:

let l2_size: u32 = cluster_size / size_of::() as u32;
let num_clusters: u32 = div_round_up_u64(size, u64::from(cluster_size)) as u32;
let num_l2_clusters: u32 = div_round_up_u32(num_clusters, l2_size);
**let l1_clusters: u32 = div_round_up_u32(num_l2_clusters, cluster_size);**

Variables num_l2_cluster and l1_clusters are used in the following code：

QcowHeader {
            magic: QCOW_MAGIC,
            version,
            backing_file_offset: 0,
            backing_file_size: 0,
            cluster_bits: DEFAULT_CLUSTER_BITS,
            size,
            crypt_method: 0,
            l1_size: num_l2_clusters,
            l1_table_offset: u64::from(cluster_size),
            // The refcount table is after l1 + header.
            refcount_table_offset: u64::from(cluster_size * (l1_clusters + 1)),

Therefore, l1_ clusters indicates the number of clusters occupied by the L1 table in the qcow file and
num_l2_clusters(header.l1_size) indicates the number of entries in the L1 table.

I think the number of clusters for L1 table should be calculated as follows:
We divide the total number of entries in the L1 table by the number of entries in each cluster ，not by the cluster size.

For example, if the data cluster size is 64k and L1 and L2 blocks are all one cluster long, and the entyies of L1 and L2 table are 8192, then the disk space that an L1 table can manage is 8192 x 8192 x 64k = 4T. If the disk size exceeds 4T, the L1 table need two clusters.

In function create_ for_ size(version: u32, size: u64)
if the size is greater than 4T (for example, 5T) , version is 2 and cluster size is default 65536, according to the function calculation method，value of l1_clusters will be still 1, In fact, its value should be 2.

If there is any problem with my analysis, please let me know. Thank you

sboeuf · 2022-10-12T08:27:56Z

qcow/src/qcow.rs

        let l2_size: u32 = cluster_size / size_of::<u64>() as u32;
        let num_clusters: u32 = div_round_up_u64(size, u64::from(cluster_size)) as u32;
        let num_l2_clusters: u32 = div_round_up_u32(num_clusters, l2_size);
-        let l1_clusters: u32 = div_round_up_u32(num_l2_clusters, cluster_size);
+        let l1_clusters: u32 = div_round_up_u32(num_l2_clusters, l2_size);


l2_size should really be called entries_per_cluster given we're dividing a cluster size by size_of(u64), which really means we want the number of entries/addresses that can fit into a cluster.

num_clusters looks good given we're dividing the total size in bytes by the size of a cluster in bytes.

num_l2_clusters is fine, but we could update it for more clarity.

l1_clusters should be called num_l1_clusters.

So here is my suggestion to make things a bit clearer:

let entries_per_cluster: u32 = cluster_size / size_of::<u64>() as u32; let num_clusters: u32 = div_round_up_u64(size, u64::from(cluster_size)) as u32; let num_l2_clusters: u32 = div_round_up_u32(num_clusters, entries_per_cluster); let num_l1_clusters: u32 = div_round_up_u32(num_l2_clusters, entries_per_cluster);

So to summarize, your change looks great, but I'd like to use this opportunity to rename a few variables to increase clarity.

@sboeuf
thank you very much

sboeuf

Thanks 👍

sboeuf · 2022-10-12T09:40:20Z

/cc @rbradford

rbradford · 2022-10-12T09:43:54Z

qcow/src/qcow.rs

        let num_clusters: u32 = div_round_up_u64(size, u64::from(cluster_size)) as u32;
-        let num_l2_clusters: u32 = div_round_up_u32(num_clusters, l2_size);
-        let l1_clusters: u32 = div_round_up_u32(num_l2_clusters, cluster_size);


You missed that this is calculated a second place in this file.

Can you elaborate? I don't get the point.

marvin:~/src/cloud-hypervisor (main)$ git grep "let l1_clusters = " qcow/src/qcow.rs: let l1_clusters = div_round_up_u64(num_l2_clusters, cluster_size); qcow/src/qcow.rs: let l1_clusters = div_round_up_u64(u64::from(header.l1_size), cluster_size); qcow/src/qcow.rs: let l1_clusters = div_round_up_u64(l2_clusters, cluster_size);

Oh okay the same calculation is done somewhere else :)

@rbradford Thanks，
Yes, there are 9 places that need to be modified， and so I have restored the original representation of l1_clusters.
@sboeuf

@danielverkamp Hi, this bit of code is essentially unmodified from crosvm. Do you think this proposed change makes sense? Looks like it can be found empirically with a disk size > 4TiB

Thanks, the new math (and better names) look correct to me. The diff is a bit hard to read due to the renames, though, so if it were me, I might split it up into two commits just to make the bugfix part totally clear.

For the original submitter, if you're willing to send this to crosvm too once it's finalized (so the change is correctly attributed), that would be greatly appreciated! (https://crosvm.dev/book/contributing/index.html)

In function create_for_size(), when calculating the number of clusters for the L1 table, it incorrectly divides the total number of entries in the L1 table by the cluster size. In fact, it should be divided by the number of entries/addresses that can fit into a cluster. Signed-off-by: lv.mengzhao <lv.mengzhao@zte.com.cn>

lv-mz requested a review from a team as a code owner October 12, 2022 01:52

lv-mz force-pushed the bug_fix branch from 85d239a to 7b8e6cb Compare October 12, 2022 02:34

likebreath reviewed Oct 12, 2022

View reviewed changes

sboeuf reviewed Oct 12, 2022

View reviewed changes

lv-mz force-pushed the bug_fix branch from 7b8e6cb to 372c9bd Compare October 12, 2022 09:34

sboeuf approved these changes Oct 12, 2022

View reviewed changes

sboeuf requested a review from rbradford October 12, 2022 09:40

rbradford requested changes Oct 12, 2022

View reviewed changes

lv-mz force-pushed the bug_fix branch 2 times, most recently from e9a08f9 to c1f861d Compare October 12, 2022 10:49

lv-mz force-pushed the bug_fix branch from c1f861d to 28172fb Compare October 12, 2022 10:55

rbradford changed the title ~~qcow: fix the caculation of l1_clusters~~ qcow: fix the calculation of l1_clusters Oct 12, 2022

rbradford changed the title ~~qcow: fix the calculation of l1_clusters~~ qcow: Fix the calculation of the number of clusters for L1 table Oct 12, 2022

rbradford approved these changes Oct 13, 2022

View reviewed changes

rbradford merged commit 2a45e44 into cloud-hypervisor:main Oct 13, 2022

lv-mz deleted the bug_fix branch October 14, 2022 01:19

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

qcow: Fix the calculation of the number of clusters for L1 table #4767

qcow: Fix the calculation of the number of clusters for L1 table #4767

lv-mz commented Oct 12, 2022

likebreath left a comment

lv-mz commented Oct 12, 2022 •

edited

sboeuf Oct 12, 2022 •

edited

lv-mz Oct 12, 2022 •

edited

sboeuf left a comment

sboeuf commented Oct 12, 2022

rbradford Oct 12, 2022

sboeuf Oct 12, 2022

rbradford Oct 12, 2022

sboeuf Oct 12, 2022

lv-mz Oct 12, 2022

rbradford Oct 12, 2022

danielverkamp Oct 13, 2022

qcow: Fix the calculation of the number of clusters for L1 table #4767

qcow: Fix the calculation of the number of clusters for L1 table #4767

Conversation

lv-mz commented Oct 12, 2022

likebreath left a comment

Choose a reason for hiding this comment

lv-mz commented Oct 12, 2022 • edited

sboeuf Oct 12, 2022 • edited

Choose a reason for hiding this comment

lv-mz Oct 12, 2022 • edited

Choose a reason for hiding this comment

sboeuf left a comment

Choose a reason for hiding this comment

sboeuf commented Oct 12, 2022

rbradford Oct 12, 2022

Choose a reason for hiding this comment

sboeuf Oct 12, 2022

Choose a reason for hiding this comment

rbradford Oct 12, 2022

Choose a reason for hiding this comment

sboeuf Oct 12, 2022

Choose a reason for hiding this comment

lv-mz Oct 12, 2022

Choose a reason for hiding this comment

rbradford Oct 12, 2022

Choose a reason for hiding this comment

danielverkamp Oct 13, 2022

Choose a reason for hiding this comment

lv-mz commented Oct 12, 2022 •

edited

sboeuf Oct 12, 2022 •

edited

lv-mz Oct 12, 2022 •

edited