Issue #1380 - ReaR recovery fails when the OS contains a Thin Pool/Volume #1806
Conversation
…l/Volume Using 'vgcfgrestore' to restore Thin LVs doesn't work, because the metadata isn't in the configuration file. To be able to use Thin LVs, upon failure of 'vgcfgrestore' (or migration), traditional 'vgcreate/lvcreate' commands are used instead. This approach is sub-optimal because it requires knowing all the possible options to 'lvcreate', which is hard/impossible to do. Nonetheless, this is better than nothing since the issue with 'lvcreate' options was already there in the case of migration. The code using 'lvcreate' only runs upon 'vgcfgrestore' failure, which typically happens only with Thin LVs (IMHO). Hence, on most setups, the optimized code will still run, preventing any regression from happening. Signed-off-by: Renaud Métrich <rmetrich@redhat.com>
…ove Thin volumes being recovered, since they are broken.
The proposed code does the following:
@gdha
Thin pools are widely used along with docker.
Hello @rmetrich, at first glance it looks good to me. However, I'm not that good at reading plain code :-/ ... V.
@rmetrich End of next week I can do some automated tests - this weekend is occupied (marriage of my daughter and other work duties, unfortunately). At the beginning of next week I'll take a few days off to really relax (in Germany and Luxembourg). Additional note: ReaR 2.4 is OK for me as my main customer is also struggling with this kind of stuff (with docker containers). And, I believe Red Hat had plans to replace their 2.0 version with the latest 2.4, if I'm not mistaken.
👍 Perfect!
@rmetrich "This approach is sub-optimal because it requires knowing all the possible options to 'lvcreate', which is hard/impossible to do." You may have a look at my general opinion about such cases. But I know basically nothing at all about Thin Pool/Volume or even docker.
@rmetrich When in migration mode, it is perfectly fine to ask the user. Only when not in migration mode should "rear recover" normally
@rmetrich,
FYI, the layout is the LVM sles11 default. I don't have time to look at the reason why ... but just to be sure, I've tested several times. I also confirm that I don't get any problem with the current master branch on this system. I attach the log file of rear -D recover. I'm going to test on other systems (SLES12 / RHEL7 / RHEL6 and Ubuntu).
I'll have a look on Tuesday. Thanks for reporting.
Renaud.
Out of the office / Sent from my phone.
On Sat, May 19, 2018, 18:53, Sébastien Chabrolles <notifications@github.com>
wrote:
… @rmetrich <https://github.com/rmetrich>,
I tried your patch on a sles11sp4 (POWER arch). I got the following error
during rear recover
No code has been generated to recreate fs:/ (fs).
To recreate it manually add code to /var/lib/rear/layout/diskrestore.sh or abort.
UserInput -I ADD_CODE_TO_RECREATE_MISSING_FSFS needed in /usr/share/rear/layout/prepare/default/600_show_unprocessed.sh line 33
Manually add code that recreates fs:/ (fs)
1) View /var/lib/rear/layout/diskrestore.sh
2) Edit /var/lib/rear/layout/diskrestore.sh
3) Go to Relax-and-Recover shell
4) Continue 'rear recover'
5) Abort 'rear recover'
(default '4' timeout 10 seconds)
I don't have time to look at the reason why ... but just to be sure I've
tested several times. I also confirm that I don't get any problem with the
current master branch on this system.
I attach the log file of rear -D recover in debug mode:
rear-rear-sles11-144.log
<https://github.com/rear/rear/files/2019653/rear-rear-sles11-144.log>
I'm going to test on other systems (SLES12 / RHEL7 / RHEL6 and Ubuntu).
@schabrolles From the Debug log, I do not see any
What is strange is that
Here is the log during
The issue is on /dev/mapper/rootvg-lvroot which has no
What is your LVM version?
LVM version from sles11sp4
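(For reference, the installed LVM version can be queried with the standard LVM2 command below; the command is shown for illustration only.)

lvm version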
OK, I can reproduce on RHEL 6.4, which has a similar version.
@schabrolles That should do it now. I had to add quite a lot of code for older LVM versions.
@rmetrich this is working now ... I'm planning to test it tonight on all the Linux distributions I've got.
Just tested with standard LVM (no thinpool) on:
- rhel6 (ppc64)
- rhel7 (ppc64/ppc64le)
- sles11sp4 (ppc64)
- sles12sp2 (ppc64le)
- ubuntu16.04 (ppc64le)
Tested during migration on:
- sles11sp4
- sles12sp2
- rhel7
Tested with lv_thinpool on rhel7 (ppc64le)
Waiting on the blessing of @jsmeix before I approve this PR
if lvs --noheadings -o thin_count | grep -q -v "^\s*$" ; then
    # There are Thin Pools on the system, include required binaries
    PROGS=( "${PROGS[@]}" /usr/sbin/thin_* )
@rmetrich Could you list (in a comment) which executables you need at minimum? I am thinking of thin_check (cf. Red Hat solution 3031921 - https://access.redhat.com/solutions/3031921).
Why do I ask this? I know it has been tested and approved for RHEL7 by @schabrolles - but what about SLES? Are the executables named the same? To be verified by @jsmeix.
Personally I will approve once @jsmeix gives his blessing.
I cannot test or verify this within a reasonable time,
not this week and next week I am not in the office
and afterwards - with probability one - some other
issues have piled up that have higher priority.
In fact they are not really needed except thin_check, since for thin volumes we recreate these using lvcreate commands, so I'll remove the others (only the libraries are needed). Let me test again without ...
if lvs --noheadings -o thin_count | grep -q -v "^\s*$" ; then
    # There are Thin Pools on the system, include required binaries
    PROGS=( "${PROGS[@]}" /usr/sbin/thin_* )
    LIBS=( "${LIBS[@]}" /usr/lib64/*lvm2* )
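Following the comment above about keeping only thin_check, a minimal sketch of what the reduced inclusion could look like; the exact binary and library paths are assumptions and may differ per distribution:

if lvs --noheadings -o thin_count | grep -q -v "^\s*$" ; then
    # Thin pools exist on the system: activating a thin pool runs thin_check,
    # so ship it in the rescue system together with the LVM2 libraries.
    PROGS=( "${PROGS[@]}" /usr/sbin/thin_check )   # assumed path, may be /sbin/thin_check on some distributions
    LIBS=( "${LIBS[@]}" /usr/lib64/*lvm2* )        # assumed 64-bit library location
fi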
@schabrolles @gdha
I can only blindly trust you here
(plain looking at the code changes does not help me here).
Tested (RHEL7 + RHEL6) and pushed. Would you prefer squashed commits?
I prefer the default behaviour with non-squashed commits. For example see:

* s.chabrolles@fr.ibm.com 51d5f949fdd17a0fb205d865f9a4d5704a6650fe Fri May 18 09:24:23 2018 +0200
  Merge pull request #1805 from schabrolles/get_only_partition_from_holders
  Verify if dm-X is a partition before adding to sysfs_paths
* s.chabrolles@fr.ibm.com 0ebeeac6d71cf3dba4e7e0085e56e066cf5f9db2 Tue May 15 07:23:51 2018 +0200
  Update comment
* s.chabrolles@fr.ibm.com bc1fa9878dab19ed4c5a0392d98c873c1f7845a7 Mon May 14 22:04:42 2018 +0200
  add comments
* s.chabrolles@fr.ibm.com 65d0fc32e0814545e1fa28c44586e7400a5859e9 Mon May 14 19:04:47 2018 +0200
  Verify if dm-X is a partition before adding to sysfs_paths
Alright, not touching anything then. I also prefer having the history.
Thanks to all of you for accepting this.
I noticed today that ReaR was not backing up physical-device thin pools... However, I had a "thinpool" that was mapped to a block file and then mounted via loop, and ReaR obviously dealt with it correctly because it was a regular file on /dev/sda, in the directory tree. So we may need to be careful: in this case LVM "sees" a thin pool, but that is actually a "lie"; it's really a raw image mounted as a loop device. We may get into a situation where ReaR backs it up correctly as a plain file, but then backs it up again as an LVM thin pool. In that situation, how would it re-create the physical device?
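A hypothetical reproduction of that setup, to make the scenario concrete (all paths and names below are made up for illustration):

# Create a thin pool whose only PV is a loop device backed by a plain file
# that itself lives on the regular filesystem (and is therefore also part
# of the file-level backup).
dd if=/dev/zero of=/srv/pool.img bs=1M count=1024
loopdev=$(losetup --find --show /srv/pool.img)
pvcreate "$loopdev"
vgcreate loopvg "$loopdev"
lvcreate --type thin-pool --size 900M --name pool0 loopvg
# LVM now reports a thin pool, but the "physical" device behind it is just a file.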
Fix a problem introduced in commits b184194 and 311bfb3 (PR rear#1806) that shows up when there are multiple VGs to restore.

Using variables create_thin_volumes_only and create_logical_volumes to propagate information from VG creation to LV creation does not work well in the case of multiple VGs, because the variables are global and if there are multiple VGs, their values will leak from one VG to another. The generated diskrestore.sh script does not guarantee that the LVs of a given VG are created immediately after their VG and before creating another VG. Currently, the script first creates all VGs and then all LVs, so all the LVs in all VGs will see the value of create_logical_volumes and create_thin_volumes_only from the last VG, not from their own. This matters when different VGs behave differently (typically if one has a thin pool and the other does not).

Fix by replacing the scalar values by arrays of VG names. If a given VG is in the array, it is the equivalent of the former scalar value being 1 for the given VG; if it is not in the array, it is the equivalent of a former value of 0. For the create_volume_group variable the change is not needed, but do it nevertheless for symmetry with the other variables.
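A minimal sketch of the scalar-to-array pattern described above (variable handling only; the VG names, helper function, and echo statements are illustrative, not the actual diskrestore.sh code):

# Before: one global flag leaks from the last VG into every LV creation.
# create_thin_volumes_only=1
# After: remember which VGs the flag applies to and test per-VG membership.
create_thin_volumes_only=( "vgthin" )   # example VG name

vg_listed() {
    local vg="$1" ; shift
    local entry
    for entry in "$@" ; do
        [ "$entry" = "$vg" ] && return 0
    done
    return 1
}

for vg in vgthin vgdata ; do
    if vg_listed "$vg" "${create_thin_volumes_only[@]}" ; then
        echo "$vg: recreate only its thin volumes with lvcreate"
    else
        echo "$vg: recreate all of its logical volumes"
    fi
done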
Using 'vgcfgrestore' to restore Thin LVs doesn't work, because the
metadata isn't in the configuration file.
To be able to use Thin LVs, upon failure of 'vgcfgrestore' (or migration),
traditional 'vgcreate/lvcreate' commands are used instead.
This approach is sub-optimal because it requires knowing all the
possible options to 'lvcreate', which is hard/impossible to do.
Nonetheless, this is better than nothing since the issue with 'lvcreate'
options was already there in the case of migration.
The code using 'lvcreate' only runs upon 'vgcfgrestore' failure, which
typically happens only with Thin LVs (IMHO). Hence, on most setups, the
optimized code will still run, preventing any regression from happening.
Signed-off-by: Renaud Métrich rmetrich@redhat.com
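A rough sketch of the restore flow the commit message describes: try vgcfgrestore first and only fall back to vgcreate/lvcreate when it fails. The function name, helper, and saved-configuration path are illustrative assumptions, not the actual ReaR layout code:

restore_vg() {
    local vg="$1"
    # The saved-configuration path below is an assumption for this sketch.
    if lvm vgcfgrestore -f "/var/lib/rear/layout/lvm/${vg}.cfg" "$vg" ; then
        # Metadata restored; this path cannot bring back thin pools/volumes.
        lvm vgchange --available y "$vg"
    else
        # Fallback described above: recreate the VG and its LVs with plain
        # vgcreate/lvcreate calls, limited to the options ReaR knows about.
        create_vg_and_lvs "$vg"   # hypothetical helper standing in for the generated code
    fi
}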
PLEASE TEST MORE.
I tested as follows:
I plan to test tomorrow on a system with 1 VG, no thin pool, but some special LVs (mirror, raid, etc.) to verify that vgcfgrestore also works with special LVs.