
Thin pool recreation logic / use of vgcfgrestore is broken #2222

Closed
rmetrich opened this issue Sep 2, 2019 · 28 comments
Assignees
Labels
bug The code does not do what it is meant to do discuss / RFC documentation enhancement Adaptions and new features no-issue-activity
Milestone

Comments

@rmetrich
Contributor

rmetrich commented Sep 2, 2019

Relax-and-Recover (ReaR) Issue Template

Fill in the following items before submitting a new issue
(quick response is not guaranteed with free support):

  • ReaR version ("/usr/sbin/rear -V"): ALL, including latest

  • OS version ("cat /etc/rear/os.conf" or "lsb_release -a" or "cat /etc/os-release"): All Linux using LVM2

  • Hardware (PC or PowerNV BareMetal or ARM) or virtual machine (KVM guest or PowerVM LPAR): ALL

  • System architecture (x86 compatible or PPC64/PPC64LE or what exact ARM device): ALL

  • Firmware (BIOS or UEFI or Open Firmware) and bootloader (GRUB or ELILO or Petitboot): ALL

  • Storage (local disk or SSD) and/or SAN (FC or iSCSI or FCoE) and/or multipath (DM or NVMe): ALL

  • Description of the issue (ideally so that others can reproduce it):

With #1806 we fixed LVM2 volume group recreation when volume groups had Thin pools.
The idea consisted of using vgcfgrestore, then deleting the Thin pools and recreating them with lvcreate commands.
It appears that this is not working properly; see Red Hat BZ https://bugzilla.redhat.com/show_bug.cgi?id=1747468 for details.

Additionally, our use of vgcfgrestore is probably not appropriate at all; it works by chance (see comment https://bugzilla.redhat.com/show_bug.cgi?id=1747468#c3 for details). Typically it works only for linear volumes, and probably won't work for cache and RAID hierarchies, or when there are existing snapshots on the system.
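
For context, the recreation flow from #1806 described above amounts to something like the following sketch. All names (VG, pool, sizes, backup path) are illustrative, and the function only prints the commands it would run, since the real thing needs actual LVM devices:

```shell
# Dry-run sketch of the #1806 approach: restore VG metadata, drop the
# thin pool that came back from metadata (its on-disk state is not
# valid), then recreate pool and thin volumes with lvcreate.
# VG/pool/LV names, sizes and the backup path are made up for illustration.
recreate_thin_pool() {
    local vg=$1 pool=$2 backup=$3
    # 1. Restore the VG metadata from the saved backup file.
    echo vgcfgrestore -f "$backup" "$vg"
    # 2. Remove the thin pool restored from metadata.
    echo lvremove -f "$vg/$pool"
    # 3. Recreate the pool and a thin volume from scratch.
    echo lvcreate -L 10G --thinpool "$pool" "$vg"
    echo lvcreate -V 5G --thin -n root "$vg/$pool"
}

recreate_thin_pool rhel pool00 /tmp/rhel.cfg
```

It is exactly step 2/3 of this flow that the BZ above shows to be fragile.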

  • Workaround, if any:

The only proper solution I see is to stop relying on vgcfgrestore at all, but then we are not capable of restoring volume groups and logical volumes with all the properties from the original system.

So what should we do now???

@pcahyna
Member

pcahyna commented Sep 2, 2019

What properties would be lost? To preserve segment properties, I proposed using lvextend for each segment separately; see https://bugzilla.redhat.com/show_bug.cgi?id=1732328#c21
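
The per-segment idea could be sketched roughly like this (all LV/PV names and extent counts are hypothetical, and the commands are only collected and printed, not executed): create the LV with only its first segment, then grow it with one lvextend call per remaining segment, pinning each segment to its original physical extents.

```shell
# Hypothetical per-segment recreation: each command pins its segment to
# an explicit PV:extent range so the on-disk placement is preserved.
segment_cmds=(
    "lvcreate -n rootlv -l 100 rhel /dev/sda2:0-99"    # segment 1
    "lvextend -l +50 rhel/rootlv /dev/sdb1:0-49"       # segment 2
    "lvextend -l +25 rhel/rootlv /dev/sdb1:100-124"    # segment 3
)
printf '%s\n' "${segment_cmds[@]}"
```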

@rmetrich
Contributor Author

rmetrich commented Sep 2, 2019

Tons of properties, and it also depends on the release of LVM2.
For example, when the LV is cached, or when the RAID level is complicated.
For now, only a small subset of properties is fetched using lvm lvs (see layout/save/GNU/Linux/220_lvm_layout.sh)

@pcahyna
Member

pcahyna commented Sep 2, 2019

@dwlehman, could you please advise on how to save the LVM configuration for later recreation on a different system, while preserving the above-mentioned properties and not relying on vgcfgrestore?

@jsmeix jsmeix added this to the ReaR future milestone Sep 3, 2019
@jsmeix jsmeix added discuss / RFC documentation enhancement Adaptions and new features labels Sep 3, 2019
@jsmeix
Member

jsmeix commented Sep 3, 2019

A general note on this kind of issue:

There are many other cases where ReaR does not support
arbitrarily sophisticated or complicated setups.

E.g. arbitrary file system tuning parameters are often not supported
(only XFS has rather complete support for that, cf. MKFS_XFS_OPTIONS)
cf. #2010
or one can set up a too complicated disk layout structure
cf. #2023
or whatever special bootloader stuff
cf. #2003
and so on ...

Currently, in most cases, "rear mkrescue" does not even detect
when things are too complicated for ReaR, so "rear mkrescue/mkbackup"
seems to "just work", but later the user learns the hard way,
when testing with "rear recover", that things don't work,
cf. #2005 (comment)

Currently this is how ReaR is meant to be used:
Do a tentative "rear mkbackup" and then verify on your replacement hardware
that "rear recover" actually works for your case, cf. in particular the section
"No disaster recovery without testing and continuous validation" in
https://en.opensuse.org/SDB:Disaster_Recovery
and several other similar places in that article.

So - from my point of view - what would help first and foremost is
that "rear mkrescue" could detect more cases (as far as already known)
where the system is "too complicated for ReaR" and error out in such cases.

Such cases (missing functionality in ReaR) would be neither an Error()
nor a BugError() but a new kind of MissingFeatureError().
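
A minimal sketch of what such a MissingFeatureError could look like (the name and message format are the suggestion above, not an existing ReaR function; here it just reports on stderr and fails, so a mkrescue-time caller could abort early):

```shell
# Hypothetical MissingFeatureError: report the unsupported setup on
# stderr and return nonzero so the calling script can abort early,
# at "rear mkrescue" time rather than at "rear recover" time.
MissingFeatureError() {
    echo "ERROR: Missing feature in ReaR: $*" >&2
    return 1
}

# Example invocation, as it might appear in a layout inspection script:
MissingFeatureError "Unable to restore VG rhel/pool00 with Thin volumes" \
    || echo "mkrescue would abort here"
```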

@jsmeix jsmeix added the bug The code does not do what it is meant to do label Sep 3, 2019
@jsmeix jsmeix modified the milestones: ReaR future, ReaR v2.6 Sep 3, 2019
@jsmeix
Member

jsmeix commented Sep 3, 2019

I missed that in current ReaR the bug is that vgcfgrestore is the wrong tool
according to https://bugzilla.redhat.com/show_bug.cgi?id=1747468#c3

@pcahyna
Member

pcahyna commented Sep 3, 2019

So, the strategy could be:

  • first of all, implement the detection of cases where the vgcfgrestore strategy will not work, and warn the user in those cases
  • then gradually implement the missing cases (if feasible).
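
The first step (detection) could look roughly like this sketch: scan `lvs -o vg_name,lv_name,lv_layout` output for thin pools before trusting the vgcfgrestore strategy. Since real lvs output needs a real LVM setup, a hard-coded sample (with made-up VG/LV names) is parsed here instead:

```shell
# Sample of what "lvs --noheadings -o vg_name,lv_name,lv_layout" might
# print on a system with a thin pool (illustrative names only).
sample_lvs_output='rhel root   thin,sparse
rhel pool00 thin,pool
rhel swap   linear'

detect_thin_pools() {
    # Print "vg/lv" for every LV whose layout field is "thin,pool".
    awk '$3 == "thin,pool" { print $1 "/" $2 }'
}

found=$(printf '%s\n' "$sample_lvs_output" | detect_thin_pools)
echo "$found"
```

A mkrescue-time check could then error out whenever `found` is non-empty.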

By the way, I looked at what Clonezilla is doing. From a quick look at the sources, they seem to restore using vgcfgrestore and don't mention anything special for thin pools at all, so they probably have not considered this use case either.

@jsmeix
Member

jsmeix commented Sep 3, 2019

A side note FYI
how one could set up basically anything with ReaR see
#2086

@jsmeix
Member

jsmeix commented Sep 3, 2019

In general regarding "warn the user" see
https://blog.schlomo.schapiro.org/2015/04/warning-is-waste-of-my-time.html

That's why I would prefer to error out with something like a

MissingFeatureError "Unable to restore VG rhel/pool00 with Thin volumes: Thin pool rhel-pool00-tpool"

plus - to "provide final power to the user" - a new config variable,
something like LVM_THIN_VOLUMES_IGNORE=( rhel/pool00 ),
where the user can specify what should be deliberately ignored,
so that he can intentionally skip the matching MissingFeatureError,
i.e. this is how the user confirms that he is aware of the issue.
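
How that suggested variable could be honored, as a self-contained sketch (the variable name is the proposal above; the membership test mirrors ReaR's IsInArray helper, re-implemented here so the snippet stands alone):

```shell
# User-provided list of thin pools to deliberately skip (proposed
# variable name from the comment above, not an existing ReaR option).
LVM_THIN_VOLUMES_IGNORE=( rhel/pool00 )

# Return 0 if $1 occurs among the remaining arguments.
is_in_array() {
    local needle=$1 element
    shift
    for element in "$@" ; do
        [ "$element" = "$needle" ] && return 0
    done
    return 1
}

if is_in_array "rhel/pool00" "${LVM_THIN_VOLUMES_IGNORE[@]}" ; then
    echo "user chose to ignore thin pool rhel/pool00, skipping the error"
fi
```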

@pcahyna
Member

pcahyna commented Sep 3, 2019

@jsmeix I actually meant to "warn the user" by erroring out; I agree that just logging a warning and continuing is not that helpful. The key part, though, is to detect the problem during the mkrescue step, not during the recovery step, when it is too late.

rmetrich added a commit to rmetrich/rear that referenced this issue Sep 4, 2019
…is broken

Removing forcibly (with '--force' passed twice) just works.

Signed-off-by: Renaud Métrich <rmetrich@redhat.com>
rmetrich added a commit that referenced this issue Sep 13, 2019
Fix for #2222 - Thin pool recreation logic / use of vgcfgrestore is broken
@jsmeix jsmeix modified the milestones: ReaR v2.6, ReaR v2.7 Apr 29, 2020
@github-actions

Stale issue message

@mailinglists35

mailinglists35 commented May 27, 2021

so, it is unclear to me what the closed status of this issue means: is ReaR able to recreate an LVM-on-thin-pool structure like this, or not?

[root@localhost ~]# lsblk -f
NAME                                          FSTYPE      LABEL          UUID                                   MOUNTPOINT
sda
├─sda1                                        ext4                       cb170528-4d10-48ac-959f-cb24feca2baa   /boot
└─sda2                                        crypto_LUKS                3d45e902-8a78-4a92-b756-636ccaa21e34
  └─luks-3d45e902-8a78-4a92-b756-636ccaa21e34 LVM2_member                hi3sil-H8Tt-PXpA-MY68-BeUc-yZPE-1YkK8v
    ├─ol-pool00_tmeta
    │ └─ol-pool00-tpool
    │   ├─ol-root                             ext4                       e6bab103-3c7c-4577-a9e3-7c319c3d2d8d   /
    │   ├─ol-swap                             swap                       7e750545-b4d9-47e8-8d66-6bbfcdf8578a   [SWAP]
    │   ├─ol-pool00
    │   └─ol-home                             ext4                       ecffaf62-4ac3-4688-bee1-654d6498b2f0   /home
    └─ol-pool00_tdata
      └─ol-pool00-tpool
        ├─ol-root                             ext4                       e6bab103-3c7c-4577-a9e3-7c319c3d2d8d   /
        ├─ol-swap                             swap                       7e750545-b4d9-47e8-8d66-6bbfcdf8578a   [SWAP]
        ├─ol-pool00
        └─ol-home                             ext4                       ecffaf62-4ac3-4688-bee1-654d6498b2f0   /home

it's created by rhel8 gui installer

@mailinglists35

mailinglists35 commented May 27, 2021

RHEL bugzilla 1747468 has had no comments since 2019; it only went into assigned status in summer 2020, and in April this year it was raised to high priority

@gdha
Member

gdha commented May 27, 2021

@mailinglists35 Please ask assistance via BZ for that particular case.
@pcahyna Is this something you can follow-up?
@jsmeix On request we re-open this issue.

@gdha gdha reopened this May 27, 2021
@pcahyna
Member

pcahyna commented May 27, 2021

@gdha hello, yes, it is on my ToDo list.

@jsmeix
Member

jsmeix commented May 28, 2021

@pcahyna
I added you to the assignees here - I hope this is OK for you.

@github-actions

Stale issue message

@mailinglists35

mailinglists35 commented Aug 9, 2021

I think the bot has closed the issue prematurely. ReaR still cannot recreate a thin pool structure. If you don't want to support this, or have no resources to do so, perhaps you should mention it. Also, the related upstream BZ is still open.

@gdha
Member

gdha commented Aug 10, 2021

@mailinglists35 @rmetrich @pcahyna On request we re-open this issue.

@gdha gdha reopened this Aug 10, 2021
@pcahyna
Member

pcahyna commented Sep 14, 2021

hello @mailinglists35 , I have a patch that improves the situation and I will open a PR. Do you have a test case that you can try?

@mailinglists35

mailinglists35 commented Sep 14, 2021 via email

@pcahyna
Member

pcahyna commented Oct 5, 2021

Hello @mailinglists35, if you have a system on which you can test, please try my branch https://github.com/pcahyna/rear/tree/thinpools-layout.

@pcahyna
Member

pcahyna commented Oct 5, 2021

@mailinglists35

ReaR still cannot recreate a thin pool structure.

By the way, what version of ReaR have you tried and what errors did you get on your thin pool layout?

@pcahyna
Member

pcahyna commented Oct 15, 2021

Hello @mailinglists35, I am still interested in what errors you got and whether my branch improves the situation.

@mailinglists35

I'm sorry I wasn't able to try your patched version until now; it will take a couple more weeks until I can get to that machine

@github-actions

Stale issue message

@mailinglists35

oh I forgot to try your branch

@gdha gdha reopened this Dec 23, 2021
@pcahyna
Member

pcahyna commented Feb 11, 2022

@mailinglists35 any updates?

@github-actions

Stale issue message
