Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[IMPROVEMENT] Expose actual size of a logical volume #5947

Closed
shuo-wu opened this issue May 17, 2023 · 29 comments
Closed

[IMPROVEMENT] Expose actual size of a logical volume #5947

shuo-wu opened this issue May 17, 2023 · 29 comments
Assignees
Labels
area/spdk SPDK upstream/downstream area/v2-data-engine v2 data engine (SPDK) kind/improvement Request for improvement of existing function priority/0 Must be fixed in this release (managed by PO)
Milestone

Comments

@shuo-wu
Copy link
Contributor

shuo-wu commented May 17, 2023

Is your improvement request related to a feature? Please describe (👍 if you like this request)

Currently, the logic volume get result won't tell the actual size, then Longhorn cannot calculate the actual size for each volume.

	{
		"name": "dcb09a53-549b-4dd4-9612-3f129ca80518",
		"aliases": [
			"spdk-00/lvol0"
		],
		"product_name": "Logical Volume",
		"block_size": 4096,
		"num_blocks": 25600,
		"uuid": "dcb09a53-549b-4dd4-9612-3f129ca80518",
		"assigned_rate_limits": {
			"rw_ios_per_sec": 0,
			"rw_mbytes_per_sec": 0,
			"r_mbytes_per_sec": 0,
			"w_mbytes_per_sec": 0
		},
		"claimed": true,
		"claim_type": "exclusive_write",
		"zoned": false,
		"supported_io_types": {
			"read": true,
			"write": true,
			"unmap": true,
			"write_zeroes": true,
			"flush": false,
			"reset": true,
			"compare": false,
			"compare_and_write": false,
			"abort": false,
			"nvme_admin": false,
			"nvme_io": false
		},
		"driver_specific": {
			"lvol": {
				"lvol_store_uuid": "b6ec55cb-6615-42dc-8578-a4332bb4607f",
				"base_bdev": "spdk-00",
				"thin_provision": true,
				"snapshot": false,
				"clone": false
			}
		}
	}

Describe the solution you'd like

The logical volume get result should show the actual size/used block number

Additional context

N/A

@shuo-wu shuo-wu added kind/improvement Request for improvement of existing function area/v2-data-engine v2 data engine (SPDK) labels May 17, 2023
@innobead innobead added this to the v1.6.0 milestone May 17, 2023
@innobead innobead added the priority/0 Must be fixed in this release (managed by PO) label May 17, 2023
@DamiaSan
Copy link
Contributor

The size of the logical volume can be obtained with the following formula:
block_size * num_blocks. In the example above the size of spdk-00/lvol0 is 100Mb

@derekbit
Copy link
Member

derekbit commented May 17, 2023

The size of the logical volume can be obtained with the following formula: block_size * num_blocks. In the example above the size of spdk-00/lvol0 is 100Mb

lvol can be thin-provisioned.
IIRC, block_size * num_blocks is the nominal size rather than the actual size.

@DamiaSan
Copy link
Contributor

Ok sorry, actual size/used block number.
About the information you need, is it the number of allocated blocks only in the live data right?

@derekbit
Copy link
Member

Ok sorry, actual size/used block number.
About the information you need, is it the number of allocated blocks only in the live data right?

Yes, the allocated blocks.

@derekbit
Copy link
Member

Check write_zeros if impact the num_allocated_blocks

@DamiaSan
Copy link
Contributor

Check write_zeros if impact the num_allocated_blocks

@derekbit Yes, write_zeros allocate clusters in the same way a normal write do

@shuo-wu I can implement in blobstore a function to retrieve the number of allocated clusters, so the RPC can calculate the real size of the volume. To make that, I think the more performant solution is to insert another variable in spdk_blob_mut_data, otherwise I should lock cluster map every time to perform the calculation. But adding a new variable can be dangerous, because we should add the new variable update every time on upstream is made a new development where the cluster map is modified.
So i want to understand when and how often Longhorn call this RPC.

@shuo-wu
Copy link
Contributor Author

shuo-wu commented Jul 25, 2023

Longhorn SPDK server fetch all lvols (bdev_lvol_get_lvols) every 5s (we can modify it if necessary but it cannot be too large). With this fetched info, the server will calculate the lvol and replica actual size. In other words, retrieving the number of allocated clusters will happen every several seconds.

@DamiaSan
Copy link
Contributor

Ok, so the best solution is to use a new variable. We will then pay attention every time we rebase from upstream to check for new code that modify the cluster map.

@shuo-wu
Copy link
Contributor Author

shuo-wu commented Jul 25, 2023

BTW, how does the maintainer consider the missing of actual size/allocated cluster number in the lvol get result? If the upstream SPDK could accept this feature, do we still need to pay attention to the future cluster map modifications?

@DamiaSan
Copy link
Contributor

If we are able to make this development accepted on upstream, we will not need to pay this attention.

@DamiaSan
Copy link
Contributor

DamiaSan commented Aug 2, 2023

Our idea probably will be accepted on upstream and will be developed in shallow copy patch serie: if we have the actual size of a logical volume there is no need to calculate the total number of clusters to be copied

@derekbit
Copy link
Member

derekbit commented Aug 2, 2023 via email

@DamiaSan
Copy link
Contributor

DamiaSan commented Aug 2, 2023

We will add a new variable to the spdk_blob_mut_data structure to store the actual number of cluster allocated. Every time the cluster map is updated this new variable will be updated accordingly.
This new variable will be accessed both from a new RPC to know the space allocated for a logical volume and from shallow copy status RPC to know the total number of cluster to be copied.

@shuo-wu
Copy link
Contributor Author

shuo-wu commented Aug 4, 2023

Then will we deprecate the shallow copy status check API after launching this actual size feature?

@DamiaSan
Copy link
Contributor

DamiaSan commented Aug 28, 2023

Then will we deprecate the shallow copy status check API after launching this actual size feature?

No, the shallow copy status check API will read this new variable to understand how many clusters will be copied. Actually this number is calculated at the beginning of the shallow copy operation

@DamiaSan
Copy link
Contributor

DamiaSan commented Sep 6, 2023

This is done in #6388

@derekbit
Copy link
Member

derekbit commented Sep 7, 2023

Then will we deprecate the shallow copy status check API after launching this actual size feature?

No, the shallow copy status check API will read this new variable to understand how many clusters will be copied. Actually this number is calculated at the beginning of the shallow copy operation

It seems the actual size is calculated when doing shallow copy. Can we get the value anytime? cc @DamiaSan

@DamiaSan
Copy link
Contributor

DamiaSan commented Sep 7, 2023

Yes, it is exactly what I have just done. As soon as this development will end the review stage in gerrit upstream (see #6388 ), we will have a new filed in lvol dedicated section of bdev_get_bdevs RPC respose with the actual number of clusters allocated.

@longhorn-io-github-bot
Copy link

Pre Ready-For-Testing Checklist

  • Where is the reproduce steps/test steps documented?
    The reproduce steps/test steps are at:

  • Is there a workaround for the issue? If so, where is it documented?
    The workaround is at:

  • Does the PR include the explanation for the fix or the feature?

  • Does the PR include deployment change (YAML/Chart)? If so, where are the PRs for both YAML file and Chart?
    The PR for the YAML change is at:
    The PR for the chart change is at:

  • Have the backend code been merged (Manager, Engine, Instance Manager, BackupStore etc) (including backport-needed/*)?
    The PR is at

  • Which areas/issues this PR might have potential impacts on?
    Area
    Issues

  • If labeled: require/LEP Has the Longhorn Enhancement Proposal PR submitted?
    The LEP PR is at

  • If labeled: area/ui Has the UI issue filed or ready to be merged (including backport-needed/*)?
    The UI issue/PR is at

  • If labeled: require/doc Has the necessary document PR submitted or merged (including backport-needed/*)?
    The documentation issue/PR is at

  • If labeled: require/automation-e2e Has the end-to-end test plan been merged? Have QAs agreed on the automation test case? If only test case skeleton w/o implementation, have you created an implementation issue (including backport-needed/*)
    The automation skeleton PR is at
    The automation test case PR is at
    The issue of automation test case implementation is at (please create by the template)

  • If labeled: require/automation-engine Has the engine integration test been merged (including backport-needed/*)?
    The engine automation PR is at

  • If labeled: require/manual-test-plan Has the manual test plan been documented?
    The updated manual test plan is at

  • If the fix introduces the code for backward compatibility Has a separate issue been filed with the label release/obsolete-compatibility?
    The compatibility issue is filed at

@DamiaSan
Copy link
Contributor

DamiaSan commented Dec 6, 2023

Upstream review is quite slow, I will port this development in our repo
cc @shuo-wu

@DamiaSan
Copy link
Contributor

@innobead
Copy link
Member

@derekbit is the spdk-engine part also ready?

@innobead innobead added the area/spdk SPDK upstream/downstream label Dec 24, 2023
@derekbit
Copy link
Member

derekbit commented Dec 24, 2023

is the spdk-engine part also ready?

SPDK and go-spdk-engine are read. Waiting for the Shuo's snapshot part. The actual size is set in snaoshot disk (lvol) info.

@innobead
Copy link
Member

Moving back to implementation, and add @shuo-wu part of assignees.

@roger-ryao
Copy link

is the spdk-engine part also ready?

SPDK and go-spdk-engine are read. Waiting for the Shuo's snapshot part. The actual size is set in snaoshot disk (lvol) info.

Hi @innobead & @DamiaSan :
I want to confirm whether the status of this ticket should be Implement or Ready For Testing.
Because I did not find any PR related to Shuo's snapshot in this ticket.

@innobead
Copy link
Member

innobead commented Jan 3, 2024

Moved this to implementation first. Wait for @DamiaSan update.

@derekbit
Copy link
Member

derekbit commented Jan 5, 2024

Pre Ready-For-Testing Checklist

  • Where is the reproduce steps/test steps documented?
    The reproduce steps/test steps are at:
  1. Create a Longhorn system (branch: master-head)
  2. Enable v2-data-engine
  3. Create a v2 volume
  4. Write some data to v2 volume
  5. Create a snaphsot. Check the snapshot size and the actual size of the volume
  6. Repeat step 4 and 5
  • Does the PR include the explanation for the fix or the feature?

  • Have the backend code been merged (Manager, Engine, Instance Manager, BackupStore etc) (including backport-needed/*)?
    The PR is at

longhorn/longhorn-ui#676
longhorn/spdk#24
longhorn/go-spdk-helper#59

  • Which areas/issues this PR might have potential impacts on?
    Area: v2 volume, snapshot, actual size
    Issues

  • If labeled: require/LEP Has the Longhorn Enhancement Proposal PR submitted?
    The LEP PR is at

  • If labeled: area/ui Has the UI issue filed or ready to be merged (including backport-needed/*)?
    The UI issue/PR is at

#7524

  • If labeled: require/doc Has the necessary document PR submitted or merged (including backport-needed/*)?
    The documentation issue/PR is at

  • If labeled: require/automation-engine Has the engine integration test been merged (including backport-needed/*)?
    The engine automation PR is at

  • If labeled: require/manual-test-plan Has the manual test plan been documented?
    The updated manual test plan is at

  • If the fix introduces the code for backward compatibility Has a separate issue been filed with the label release/obsolete-compatibility?
    The compatibility issue is filed at

@DamiaSan
Copy link
Contributor

DamiaSan commented Jan 8, 2024

SPDK part is already merged: longhorn/spdk#24

@chriscchien
Copy link
Contributor

Verified pass on longhorn master(longhorn-ui 954076, spdk a8489c, go-spdk-helper 89d49f)

Create v2 volume, create 4 snapshots with diffenent size, from UI can see the actucal size is the sum of all snapshots.
image

Delete 1 of the snapshot, the volume actual size changed and equal to sum of all snapshots.
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/spdk SPDK upstream/downstream area/v2-data-engine v2 data engine (SPDK) kind/improvement Request for improvement of existing function priority/0 Must be fixed in this release (managed by PO)
Projects
None yet
Development

No branches or pull requests

7 participants