
Add support for managing pool members #264

Merged

Conversation

vojtechtrefny (Collaborator)

For LVM pools this adds support for adding and removing members
(PVs) from the pool (VG).

Fixes: #166


Keeping this as a draft for now; I found some issues in blivet that need to be fixed first. There is a workaround in the code for storaged-project/blivet#1034, but storaged-project/blivet#1036 needs to be fixed before releasing this.
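To make the behavior concrete, here is a hedged playbook sketch (the pool and disk names are made up for illustration; they are not taken from this PR): adding a member means extending the pool's `disks` list and re-running the role.

```yaml
# Hypothetical example: pool 'foo' previously used only sda; listing sdb as
# well asks the role to add sdb as a PV to the VG.
- hosts: all
  vars:
    storage_pools:
      - name: foo
        type: lvm
        disks:
          - sda
          - sdb
        volumes: []
  roles:
    - linux-system-roles.storage
```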

Code excerpt under review:

```python
self._pool['name'],
str(e)))

for disk in remove_disks:
```
vojtechtrefny (Collaborator Author):

Question: should we consider removing the PV unsafe? Currently the code will remove the PV from the VG even with safe mode on. That isn't a destructive operation by itself, but I am also removing the PV from the disk after that, which might be potentially dangerous. It also might not be user friendly to remove the PV just because the user specified an incomplete list of disks by accident.


In my opinion, it makes sense to require storage_safe_mode = false to remove PVs from a VG. The storage_safe_mode description in the README states that "When true (the default), an error will occur instead of automatically removing existing devices and/or formatting," and it seems that this operation would fall under "removing existing devices".

Does the role fail if the user tries to remove a PV from a VG and the PV has data on it?

vojtechtrefny (Collaborator Author):

> The storage_safe_mode description in the README states that "When true (the default), an error will occur instead of automatically removing existing devices and/or formatting.", and it seems that this operation would fall under "removing existing devices".

Thanks, I'll change the code to require safe mode to be turned off for this operation.

> Does the role fail if the user tries to remove a PV from a VG and the PV has data on it?

If there is enough free space in the VG, it will run pvmove to move the LV to an empty PV, but it will return an error if there isn't enough free space.
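From the playbook side, removal could then be sketched like this (hedged: the names are illustrative, and per the discussion above safe mode must be disabled for member removal):

```yaml
# Hypothetical example: dropping sdb from the disks list removes it from the
# VG; existing data is pvmoved first if the VG has enough free space.
- hosts: all
  vars:
    storage_safe_mode: false   # required to remove members from an existing pool
    storage_pools:
      - name: foo
        type: lvm
        disks:
          - sda   # sdb no longer listed
        volumes: []
  roles:
    - linux-system-roles.storage
```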

@richm (Contributor) commented Apr 29, 2022:

example error:

TASK [linux-system-roles.storage : manage the pools and volumes to match the specified state] ***
task path: /tmp/tmpngnprtsi/tests/roles/linux-system-roles.storage/tasks/main-blivet.yml:77
Thursday 28 April 2022  00:49:58 +0000 (0:00:00.020)       0:00:36.477 ******** 
fatal: [/cache/centos-7.qcow2c.snap]: FAILED! => {
    "actions": [],
    "changed": false,
    "crypts": [],
    "leaves": [],
    "mounts": [],
    "packages": [],
    "pools": [],
    "volumes": []
}

MSG:

failed to remove disk 'sda1' from pool 'foo': 'unicode' object has no attribute 'path'

TASK [linux-system-roles.storage : failed message] *****************************
task path: /tmp/tmpngnprtsi/tests/roles/linux-system-roles.storage/tasks/main-blivet.yml:99
Thursday 28 April 2022  00:50:01 +0000 (0:00:02.424)       0:00:38.902 ******** 
fatal: [/cache/centos-7.qcow2c.snap]: FAILED! => {
    "changed": false
}

MSG:

{'crypts': [], 'mounts': [], 'leaves': [], 'changed': False, 'actions': [], 'failed': True, 'volumes': [], 'invocation': {'module_args': {'packages_only': False, 'disklabel_type': None, 'diskvolume_mkfs_option_map': {'ext4': '-F', 'ext3': '-F', 'ext2': '-F'}, 'safe_mode': False, 'pools': [{'name': 'foo', 'encryption_password': None, 'state': 'present', 'raid_metadata_version': None, 'encryption': False, 'encryption_key_size': None, 'disks': ['nvme0n1'], 'raid_level': None, 'encryption_luks_version': None, 'raid_device_count': None, 'encryption_key': None, 'volumes': [], 'raid_chunk_size': None, 'type': 'lvm', 'encryption_cipher': None, 'raid_spare_count': None}], 'volumes': [], 'pool_defaults': {'encryption_password': None, 'raid_metadata_version': None, 'encryption': False, 'encryption_cipher': None, 'disks': [], 'encryption_key': None, 'encryption_key_size': None, 'encryption_luks_version': None, 'raid_device_count': None, 'state': 'present', 'volumes': [], 'raid_chunk_size': None, 'type': 'lvm', 'raid_level': None, 'raid_spare_count': None}, 'volume_defaults': {'raid_metadata_version': None, 'mount_device_identifier': 'uuid', 'fs_type': 'xfs', 'mount_options': 'defaults', 'size': 0, 'mount_point': '', 'compression': None, 'encryption_password': None, 'encryption': False, 'raid_level': None, 'raid_device_count': None, 'state': 'present', 'vdo_pool_size': None, 'fs_overwrite_existing': True, 'encryption_cipher': None, 'encryption_key_size': None, 'encryption_key': None, 'fs_label': '', 'encryption_luks_version': None, 'mount_passno': 0, 'raid_spare_count': None, 'cache_mode': None, 'deduplication': None, 'cached': False, 'type': 'lvm', 'disks': [], 'mount_check': 0, 'cache_size': 0, 'raid_chunk_size': None, 'cache_devices': [], 'fs_create_options': ''}, 'use_partitions': True}}, 'pools': [], 'packages': [], 'msg': "failed to remove disk 'sda1' from pool 'foo': 'unicode' object has no attribute 'path'", '_ansible_no_log': False}

Looks like it's trying to dereference a string instead of an object.

@vojtechtrefny (Collaborator Author):

> looks like trying to dereference a string instead of an object

That's another bug in blivet, already fixed upstream and tracked for RHEL at https://bugzilla.redhat.com/show_bug.cgi?id=2079220. Fedora 35 already has the fixed version, but the tests are failing on F35 due to lack of space.

@richm (Contributor) commented May 3, 2022:

fedora-35 tests are failing due to https://pagure.io/cloud-sig/issue/379

@richm (Contributor) commented May 25, 2022:

[citest bad]

@vojtechtrefny vojtechtrefny force-pushed the master_lvm-volumes-management branch from adb6e9b to bec3429 Compare June 8, 2022 13:21
@vojtechtrefny vojtechtrefny marked this pull request as ready for review June 8, 2022 13:24
library/blivet.py (review thread outdated; resolved)
@japokorn (Collaborator) left a comment:

Looks good to me.

@richm (Contributor) commented Jun 20, 2022:

Except for the question about safe_mode global, lgtm

@vojtechtrefny (Collaborator Author):

> Except for the question about safe_mode global, lgtm

Sorry, I missed the comment. You are right, the global safe_mode wasn't needed; I removed the line, thanks.

@richm richm merged commit 956ff67 into linux-system-roles:master Jun 22, 2022
richm added a commit to richm/linux-system-roles-storage that referenced this pull request Jul 19, 2022
[1.9.0] - 2022-07-19
--------------------

### New Features

- Add support for attaching LVM cache to existing LVs (linux-system-roles#273)

Fixes: linux-system-roles#252

- Add support for managing pool members (linux-system-roles#264)

For LVM pools this adds support for adding and removing members
(PVs) from the pool (VG).

* Do not allow removing members from existing pools in safe mode

- ensure role works with gather_facts: false (linux-system-roles#277)

Ensure tests work when using ANSIBLE_GATHERING=explicit

### Bug Fixes

- loop variables are scoped local - no need to reset them (linux-system-roles#282)

If you use
```yaml
  loop_control:
    loop_var: storage_test_pool
```
Then the variable `storage_test_pool` is scoped local to the task
and is undefined after the task.  In addition, referencing the
variable after the loop causes this warning:
```
[WARNING]: The loop variable 'storage_test_pool' is already in use. You should
set the `loop_var` value in the `loop_control` option for the task to something
else to avoid variable collisions and unexpected behavior.
```

- support ansible-core-2.13 (linux-system-roles#278)

Looks like ansible-core-2.13 (or latest jinja3) does not support
constructs like this:
```
var: "{{ [some list] }} + {{ [other list] }}"
```
instead, the entire thing has to be evaluated in the same jinja
evaluation context:
```
var: "{{ [some list] + [other list] }}"
```
In addition - it is an Ansible antipattern to use
```yaml
- set_fact:
    var: "{{ var + item }}"
    loop: "{{ some_list }}"
```
so that was rewritten to use filters instead
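A rough sketch of the filter-based approach (the variable names are hypothetical, not taken from the role's code):

```yaml
# Instead of growing a list with set_fact inside a loop, build it in a
# single expression using filters:
- set_fact:
    all_items: "{{ nested_lists | flatten | unique }}"
```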

### Other Changes

- ensure cryptsetup is available for testing (linux-system-roles#279)

- make min_ansible_version a string in meta/main.yml (linux-system-roles#281)

The Ansible developers say that `min_ansible_version` in meta/main.yml
must be a `string` value like `"2.9"`, not a `float` value like `2.9`.
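For reference, the corresponding meta/main.yml fragment (standard Ansible role layout) would look like:

```yaml
# meta/main.yml
galaxy_info:
  min_ansible_version: "2.9"   # quoted string, not a float
```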

- Skip the entire test_lvm_pool_members playbook with old blivet (linux-system-roles#280)

Multiple bugs in blivet were fixed in order to make the feature work, and without the correct version even the most basic test to remove a PV from a VG will fail, so we should skip the entire test with old versions of blivet.

- Skip test on el7 if blivet version is too old
- Add support for `is_rhel7`
- Refactor EL platform and version checking code
- Add a name for the `end_play` task
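The skip could be sketched roughly like this (the fact name and version variable are assumptions, not taken from this changelog):

```yaml
# Hypothetical guard at the top of the test playbook
- name: End play on hosts with a too-old blivet
  meta: end_play
  when: blivet_pkg_version is version(blivet_required_version, '<')
```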

- Add CHANGELOG.md (linux-system-roles#283)

Signed-off-by: Rich Megginson <rmeggins@redhat.com>
richm added a commit that referenced this pull request Jul 21, 2022
* Version 1.9.0 - CHANGELOG.md [citest skip]


* ensure tests work with gather_facts: false is not a new feature

Successfully merging this pull request may close these issues: vgextend failing to extend the volume group
4 participants