Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support to resize rootfs if using LVM #721

Merged
merged 17 commits into from
Mar 30, 2021
Merged

Conversation

otubo
Copy link
Contributor

@otubo otubo commented Dec 10, 2020

Proposed Commit Message

This patch adds support to resize a single partition of a VM if it's using an
LVM underneath. The patch detects if it's LVM if the given block device
is a device mapper by its name (e.g. /dev/dm-1) and if it has slave
devices under it on sysfs. After that syspath is updated to the real
block device and growpart will be called to resize it (and automatically
its Physical Volume).

The Volume Group will be updated automatically and a final call to
extend the rootfs to the remaining space available will be made.

Using the same growpart configuration, the user can specify only one
device to be resized when using LVM and growpart, otherwise cloud-init
won't know which one should be resized and will fail.

rhbz: #1810878
LP: #1799953

Signed-off-by: Eduardo Otubo otubo@redhat.com

Additional Context

The commit also includes changes to the documentation as well as to unit tests to support the changes. The unit test is a copy from the existing test but using /dev/XXdm-0 variants for LVM tests.

Test Steps

One can verify the fix by installing cloud-init on an instance using LVM and configure it to expand one single partition, e.g. the rootfs.

Checklist:

  • My code follows the process laid out in the documentation
  • I have updated or added any unit tests accordingly
  • I have updated or added any documentation accordingly

Copy link
Collaborator

@igalic igalic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙋🏻‍♀️

cloudinit/config/cc_growpart.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@smoser smoser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one general comment, and some changes requested inline.
If there is no 'lvm', then we should not try to resize with lvm (i think right now that would atually just fail/stacktrace if there is no 'lvm' command). just check if subp.which('lvm'):

cloudinit/config/cc_growpart.py Outdated Show resolved Hide resolved
cloudinit/config/cc_growpart.py Outdated Show resolved Hide resolved
cloudinit/config/cc_growpart.py Outdated Show resolved Hide resolved
cloudinit/config/cc_growpart.py Outdated Show resolved Hide resolved
cloudinit/config/cc_growpart.py Outdated Show resolved Hide resolved
cloudinit/config/cc_growpart.py Outdated Show resolved Hide resolved
@github-actions
Copy link

github-actions bot commented Jan 5, 2021

Hello! Thank you for this proposed change to cloud-init. This pull request is now marked as stale as it has not seen any activity in 14 days. If no activity occurs within the next 7 days, this pull request will automatically close.

If you are waiting for code review and you are seeing this message, apologies! Please reply, tagging mitechie, and he will ensure that someone takes a look soon.

(If the pull request is closed, please do feel free to reopen it if you wish to continue working on it.)

@github-actions github-actions bot added the stale-pr Pull request is stale; will be auto-closed soon label Jan 5, 2021
@github-actions github-actions bot closed this Jan 12, 2021
@smoser
Copy link
Collaborator

smoser commented Jan 22, 2021

This is what it looks like without LVM:

name vda1 vda2 vda3 10G <--- extra 100G --->
abstraction none none none
EFI /boot /

Based on 'growpart's simplified view of the world,
growpart currently accepts that the user wanted it to add:

  • add extra space to vda3
  • grow the filesystem on /

So after growpart, we end up with:

name vda1 vda2 vda3 110G
abstraction none none none
EFI /boot / 110G

In a simple LVM case (one PV on a VG) we might see a disk like this:

name vda1 vda2 vda3 10G <--- extra 100G --->
abstraction none none LVM PV VDA3
LVM VG VG0
EFI /boot [ LV / 8G ] [ LV /home 2G]

Assuming we can determine a simple case like this.
Then the behavior consistent with the no-LVM behavior would be
to turn the above into:

name vda1 vda2 vda3 110G
abstraction none none LVM PV VDA3
LVM VG VG0 (automatically)
EFI /boot [ LV / 108G ] [LV /home 2G]

What we've done there is:

  1. grow the VDA3 partition
  2. grow the VDA3 PV
  3. grow the VG (I'm not sure if that happens automatically)
  4. grown the / LV to take 100% of the added
  5. resized the filesystem on /

There are all sorts of ways the above could be more complicated. The vg0 could be spread across multiple PVS via RAID or stripe or linear.

Lets focus on the simple case.

I think the operations above simplify down to 2 groups (1,2,3 and 4,5).
You basically have 2 choices here:
a.) should you grow that partition and PV and VG (1,2,3)? The user could have had plans to add another partition and use it another way, but we already take that choice away in the simple case above.
b.) If you did so, then do you then grow the LV (4). The user could have had plans to use the new space in the VG otherwise. If you do grow the LV, then step 5 is obvious.

I think I'm fine with saying 'yes' to both as the default behavior, and if the user does not like it, they should disable growpart.

Also, lets limit operations to the simple case described above. If the / LV is on a VG that spans multiple PVs, then we should just do nothing.

@otubo
Copy link
Contributor Author

otubo commented Jan 22, 2021

Thanks for all the input @smoser . I believe I fixed most of the things. Feel free to leave comments on the rework :-) Also, had to force push because of outdated branch, I personally don't like merge commits. Also^2: Running into lots of tox/pylint errors, even on a clean checkout. I'll file a bug for that (but that's why I'm not running tox in its full extent on my local laptop)

cloudinit/config/cc_growpart.py Outdated Show resolved Hide resolved
cloudinit/config/cc_growpart.py Outdated Show resolved Hide resolved
cloudinit/config/cc_growpart.py Outdated Show resolved Hide resolved
@otubo otubo force-pushed the 1810878 branch 5 times, most recently from 923befc to d145939 Compare January 26, 2021 07:42
Copy link
Collaborator

@smoser smoser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we're almost there, @otubo. Thank you for your patience.

cloudinit/config/cc_growpart.py Outdated Show resolved Hide resolved
cloudinit/config/cc_growpart.py Outdated Show resolved Hide resolved
@otubo
Copy link
Contributor Author

otubo commented Jan 27, 2021

we're almost there, @otubo. Thank you for your patience.

Thank you, Scott for all the review and help merging this. Just finished the final fixes and pushed.

Copy link
Collaborator

@smoser smoser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@otubo thanks... I just have the one comment. And seeking further input from @OddBloke or @blackboxsw . It really is not a huge deal, but I just don't like stack traces that aren't errors.

cloudinit/config/cc_growpart.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@blackboxsw blackboxsw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @otubo; minor suggestion about a possible failure path on Linux and non-linux systems that probably requires a bit of work on get_pvs_for_lv and handling.

Minimally, I'd like to see get_pvs_for_lv values get checked before putting it into rpath and allowing os.path.basename to field a boolean return value.

Ideally get_pvs_for_lv would return only one data type, not str or False.

Please also change the two util.logexc to logging.warning in cases where we would like to proceed or actually raise an exception if we really want cloud-init to fall over. It's confusing to a user to have a logged exception in cloud-init that isn't actually raising an exception.

cloudinit/config/cc_growpart.py Outdated Show resolved Hide resolved
cloudinit/config/cc_growpart.py Outdated Show resolved Hide resolved
This patch adds support to resize a single partition of a VM if it's using an
LVM underneath. The patch detects if it's LVM if the given block device
is a device mapper by its name (e.g. /dev/dm-1) and if it has slave
devices under it on sysfs. After that syspath is updated to the real
block device and growpart will be called to resize it (and automatically
its Physical Volume).

The Volume Group will be updated automatically and a final call to
extend the rootfs to the remaining space available will be made.

Using the same growpart configuration, the user can specify only one
device to be resized when using LVM and growpart, otherwise cloud-init
won't know which one should be resized and will fail.

rhbz: #1810878

Signed-off-by: Eduardo Otubo <otubo@redhat.com>
* Moved LVM detection and PV information to its own function
* Other small fixes
* Adjusted the tests for the new functions
@github-actions github-actions bot added the stale-pr Pull request is stale; will be auto-closed soon label Mar 17, 2021
cloudinit/config/cc_growpart.py Outdated Show resolved Hide resolved
cloudinit/config/cc_growpart.py Outdated Show resolved Hide resolved
cloudinit/config/cc_growpart.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@smoser smoser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @otubo

@smoser smoser requested a review from blackboxsw March 22, 2021 18:14
Copy link
Collaborator

@blackboxsw blackboxsw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs fixing just so that we have a pre-flight subp.which('lvm') check before erroring and calling lvm commands.

Copy link
Collaborator

@smoser smoser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with this as it is. But if you wanted to move the sanity check to the top of get_pvs_for_lv, i would approve that change also.

Thank you so much for your patience, @otubo

@lru_cache
def get_pvs_for_lv(devpath):
    if not util.is_Linux():
        LOG.info("No support for LVM on %s", platform.system())
        return None
    if not subp.which('lvm'):
        LOG.info("No 'lvm' command present")
        return None

    # continue on with less indentation
    try:
        (out, _err) = subp.subp(...

@otubo
Copy link
Contributor Author

otubo commented Mar 25, 2021

I'm ok with this as it is. But if you wanted to move the sanity check to the top of get_pvs_for_lv, i would approve that change also.

Thank you so much for your patience, @otubo

@lru_cache
def get_pvs_for_lv(devpath):
    if not util.is_Linux():
        LOG.info("No support for LVM on %s", platform.system())
        return None
    if not subp.which('lvm'):
        LOG.info("No 'lvm' command present")
        return None

    # continue on with less indentation
    try:
        (out, _err) = subp.subp(...

I think that makes sense. Just updated. Thanks!

@blackboxsw blackboxsw removed the stale-pr Pull request is stale; will be auto-closed soon label Mar 29, 2021
Copy link
Collaborator

@blackboxsw blackboxsw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 thanks a lot @otubo all good on my front with this review. I've just clicked update branch to get latest from cloud-init tip and someone can land it if they get to it before I see it tomorrow morning.

@blackboxsw blackboxsw merged commit 74fa008 into canonical:master Mar 30, 2021
OddBloke added a commit to OddBloke/cloud-init that referenced this pull request Apr 29, 2021
This reverts commit 74fa008.

During pre-release testing, we discovered two issues with this commit.
Firstly, there's a typo in the udevadm command that causes a TypeError
for _all_ growpart executions.  Secondly, the LVM resizing does not
appear to successfully resize everything up to the LV, though some
things do get resized.

We certainly want this change, so we'll be happy to review and land it
alongside an integration test which confirms that it is working as
expected.
OddBloke added a commit that referenced this pull request May 3, 2021
This reverts commit 74fa008.

During pre-release testing, we discovered two issues with this commit.
Firstly, there's a typo in the udevadm command that causes a TypeError
for _all_ growpart executions.  Secondly, the LVM resizing does not
appear to successfully resize everything up to the LV, though some
things do get resized.

We certainly want this change, so we'll be happy to review and land it
alongside an integration test which confirms that it is working as
expected.

LP: #1922742
@otubo otubo deleted the 1810878 branch September 22, 2021 09:13
This was referenced May 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants