Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow full migration of /dev/disk/by-id/devices #1450

Merged
merged 17 commits into from Sep 1, 2017

Conversation

schabrolles
Copy link
Member

Currently, /dev/disk/by-id/ devices are not "always" migrated.

  • It should be taken by finalize/GNU/Linux/260_rename_diskbyid.sh script; but in its actual form, this script only migrate NEW by-id name if the real device (/dev/name) has the same name between source and target system.

example: (diskbyid_mappings file)

virtio-a1bd20bf-f66d-442c-8 vda
virtio-a1bd20bf-f66d-442c-8-part1 vda1
virtio-a1bd20bf-f66d-442c-8-part2 vda2

=> /dev/disk/by-id/virtio-a1bd20bf-f66d-442c-8 will be renamed only if /dev/vda is present on the target system.

  • Some files like /etc/lilo.conf /etc/yaboot.conf /etc/default/grub_installdevice are not part of the list of file to be migrated by finalize/GNU/Linux/260_rename_diskbyid.sh
    /etc/default/grub_installdevice is used in SuSe Linux and use /dev/disk/by-id/ devices to point to the bootloader partition. If we forget to migrate this file, yast bootloader will fail.

Proposal:

1- Add /etc/lilo.conf /etc/yaboot.conf /etc/default/grub_installdevice to FILES variable in finalize/GNU/Linux/260_rename_diskbyid.sh

2- Use $SHARE_DIR/layout/prepare/default/320_apply_mappings.sh to apply device mapping on diskbyid_mappings file to migrate /dev/old_device to /dev/new_device in case of recover on a different system.

Add this section at the beginning of finalize/GNU/Linux/260_rename_diskbyid.sh

# Apply device mapping to replace device in case of migration.
tmp_layout="$LAYOUT_FILE"
LAYOUT_FILE="$OLD_ID_FILE"
source $SHARE_DIR/layout/prepare/default/320_apply_mappings.sh
LAYOUT_FILE="$tmp_layout"

But, in order to migrate real device from diskbyid_mappings file (second column), with 320_apply_mappings.sh, those one should be in absolute PATH.

virtio-a1bd20bf-f66d-442c-8 /dev/vda
virtio-a1bd20bf-f66d-442c-8-part1 /dev/vda1
virtio-a1bd20bf-f66d-442c-8-part2 /dev/vda2

Tested with sles11 and sles12 on POWER

example of output with sles12 (/etc/default/grub_installdevice)

[...]
Restore the Mountpoints (with permissions) from /var/lib/rear/recovery/mountpoint_permissions
Patching /etc/default/grub_installdevice: Replacing [/dev/disk/by-id/virtio-3af70bf3-fa3b-40b8-9-part1] by [/dev/disk/by-id/dm-name-mpatha-part1]
[...]

 -  Apply device mapping to diskbyid_mappings file to replace device in
case of migration.
 - use full path in diskbyid_mappings file.
 - Add /etc/default/grub_installdevice to target FILES.
  - Use /dev/mapper/device instead of /dev/dm- device in
diskbyid_mappings
@schabrolles schabrolles added enhancement Adaptions and new features minor bug An alternative or workaround exists labels Aug 21, 2017
@schabrolles schabrolles added this to the ReaR v2.3 milestone Aug 21, 2017
@schabrolles schabrolles self-assigned this Aug 21, 2017
Copy link
Member

@schlomo schlomo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't test it and trust your tests. Thanks.

# Apply device mapping to replace device in case of migration.
tmp_layout="$LAYOUT_FILE"
LAYOUT_FILE="$OLD_ID_FILE"
source $SHARE_DIR/layout/prepare/default/320_apply_mappings.sh
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather see the relevant code from the other script extracted into functions and those functions used here. That will help to create a clean interface for this functionality and it will help us with future refactoring.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I just wonder if I should add this new function to lib/layout-functions.sh or create a new one named lib/apply-mappings-functions.sh ... or elsewhere ?

test -s "$sed_change_monitor" && sed_change=1

sed -i "s#$ID_FULL\$#/dev/$DEV_NAME#w $sed_change_monitor" "$realfile"
test -s "$sed_change_monitor" && sed_change=1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need to check this file after every call of sed? Isn't it enough to check it once at the end of the while loop before printing out the change info?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file will be written by sed if changes are made into $realfile.
But it is overwritten (no append).
So if on the last sed didn't make any change, the file $sed_change_monitor will be empty even if a previous sed made a change.

# udevinfo is deprecated by udevadm (SLES 10 still uses udevinfo)
UdevSymlinkName=""
type -p udevinfo >/dev/null && UdevSymlinkName="udevinfo -r / -q symlink -n"
type -p udevadm >/dev/null && UdevSymlinkName="udevadm info --root --query=symlink --name"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we have this now more than once in the ReaR code I think that we would benefit from extracting that into a function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question: where, in your opinion, should I put this function ?

UdevSymlinkName=""
type -p udevinfo >/dev/null && UdevSymlinkName="udevinfo -r / -q symlink -n"
type -p udevadm >/dev/null && UdevSymlinkName="udevadm info --root --query=symlink --name"

[[ -z "$UdevQueryName" ]] && {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No check for UdevSymlinkName?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I'll do it

@schlomo
Copy link
Member

schlomo commented Aug 22, 2017

@schabrolles I would put all these functions into layout-functions.sh because they are used in that stage.

@schabrolles
Copy link
Member Author

@schlomo for your review.
I have put all functions in layout-function.sh as per your request.
Currently, I did not update any other additional script to point to those functions. I think it is better to do it in a separate pull request for better control and limit side effect.

Tested during a disk migration on :

  • sles11 sp4
  • sles12 sp2
  • rhel7.3
  • ubuntu 16.04

Copy link
Member

@jsmeix jsmeix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think code like

test -z "$var" && BugIfError "..."

cannot work because *IfError functions
check if $? is not zero
(see lib/_input-output-functions.sh)
but '&&' means $? is zero.
E.g. on commandline:

# test -z "" && echo $?
0

I would keep it simple and do

test "$var" || BugError "..."

to ensure var is not empty and

test $var || BugError "..."

to ensure var is neither empty nor only spaces, cf. on commandline:

# var="" ; test "$var" || echo empty
empty

# var=" " ; test "$var" || echo empty

# var=" " ; test $var || echo empty or blank
empty or blank

@schlomo
Copy link
Member

schlomo commented Aug 23, 2017

In general, please try to use positive logic as much as possible. It helps a lot to read code and understand the logic.

Good:

  • check foo AND do something
  • check foo OR die with error (this is the classical assertion)

Bad:

  • check not foo or do something
  • check not foo and die with error

The negated check makes one stop and think what does it actually mean.

@jsmeix
Copy link
Member

jsmeix commented Aug 23, 2017

FYI:
In general I would avoid the *IfError functions
because they cannot work reliably in all cases
because they test $? but $? can be an unexpected value
(totally in compliance how bash works but unexpected)
in a bit more complicated cases, cf.
#1415 (comment)
where

StopIfError "USB device '$USB_DEVICE' is already mounted on $(grep -E "^$REAL_USB_DEVICE\\s" /proc/mounts | cut -d' ' -f2 |tail -1)"

did never error out as intended because the pipe after 'grep' always
returns $? with zero value because 'tail -1' results zero return code.

@schabrolles
Copy link
Member Author

@jsmeix

What about : test -z "$var" && Error "Empty string passed to UdevSymlinkName()"

@schlomo
Copy link
Member

schlomo commented Aug 24, 2017

for that type of thing I prefer test "$var" || Error "Empty string passed to UdevSymlinkName()" because it is a positive check with an else clause.

@jsmeix
Copy link
Member

jsmeix commented Aug 24, 2017

@schabrolles
see my comment
#1450 (review)
what I personally prefer - i.e.

test "$var" || Error "var must not be empty"

or even

test $var || Error "var must not be empty (or only spaces)"

But technically

test -z "$var" && Error "var must not be empty"

is also right and sufficiently well understandable.

In contrast oversophisticated stuff like

! test -z "$var" || Error "var must not be empty"

is bad because it is needlessly hard to understand
and it could even make simple minds explode ;-)

Ultimately it is your code so that the final decision
is yours what specific coding style you prefer.

@jsmeix
Copy link
Member

jsmeix commented Aug 24, 2017

I made evil typos in my prevois comment that I fixed now.
The right code is

test "$var" || Error "var must not be empty"

and

test $var || Error "var must not be empty (or only spaces)"

@schabrolles
FYI regarding 'Error' versus 'BugError':

BugError should only be used when the cause is a bug in ReaR.

Error should be used when the cause is not in ReaR itself.

For example when in script 100_prepare_it.sh a variable is set
by ReaR that is needed in a subsequent script 200_do_it.sh
then in script 200_do_it.sh an assertion should be like

# var must have been set in 100_prepare_it.sh:
test $var || BugError "var empty (or only spaces)"

but in 100_prepare_it.sh it would be usually a test like

# ... [ code that sets var ] ...
test $var || Error "Failed to set var (empty or only spaces)"

when the root cause why it failed to set var is not a bug in ReaR
but an error in the environment where that code is run
(i.e. something outside of ReaR).
And when var must be provided by the user it would be

test $var || Error "var must not be empty (or only spaces)"

@schabrolles
Copy link
Member Author

schabrolles commented Aug 24, 2017

@jsmeix
Thanks for the precious info (I learn).
This test on $var is made inside a function (provided by layout-function.sh) that need an argument (var=$1), this function is called by other ReaR scripts (like 260_rename_diskbyid.sh).
So, if $var is empty, this mean that something goes wrong in 260_rename_diskbyid.sh (which should have called the function with an argument).
Then it should be considered has a "Bug" ... Am I right ?

If yes, I see this example (in layout-function.sh):

[[ "$name" ]]
BugIfError "Empty string passed to get_device_name"

What so you think about this implementation ?

@jsmeix
Copy link
Member

jsmeix commented Aug 25, 2017

@schabrolles
you are right - a syntactically wrong function call is a bug in ReaR.
E.g. see the BugError in the UserInput function when things
are hopelessly wrong versus the 'using fallback' or 'ignored'
behaviour when there is a reasonable way to proceed.

@jsmeix
Copy link
Member

jsmeix commented Aug 25, 2017

@schabrolles
I found

    if test $sed_change -eq 1 ; then

which may show unwanted bash errors
but quoting "$sed_change" does not help here
because (on commandline):

# sed_change=" 1 " ; if test $sed_change -eq 1 ; then echo OK ; fi
OK

# sed_change="X" ; if test $sed_change -eq 1 ; then echo OK ; fi
-bash: test: X: integer expression expected

# sed_change="" ; if test $sed_change -eq 1 ; then echo OK ; fi
-bash: test: -eq: unary operator expected

# unset sed_change ; if test $sed_change -eq 1 ; then echo OK ; fi
-bash: test: -eq: unary operator expected

# sed_change=" 1 " ; if test "$sed_change" -eq 1 ; then echo OK ; fi
OK

# sed_change="X" ; if test "$sed_change" -eq 1 ; then echo OK ; fi
-bash: test: X: integer expression expected

# sed_change="" ; if test "$sed_change" -eq 1 ; then echo OK ; fi
-bash: test: : integer expression expected

# unset sed_change ; if test "$sed_change" -eq 1 ; then echo OK ; fi
-bash: test: : integer expression expected

so that - as far as I know - all what one can do is to
suppress the unwanted STDERR bash messages
that won't help in any way if they appear in the log
(but they may cause user questions that something
could be wrong in our code), so that I suggest simply

    # Avoid stderr if sed_change is not set or empty or not an integer value:
    if test "$sed_change" -eq 1 2>/dev/null ; then

(c.f. what I do in the UserInput function).

Copy link
Member

@jsmeix jsmeix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@schabrolles
I do no longer see real issues.
Feel free to merge it when it is o.k. for you.

@jsmeix
Copy link
Member

jsmeix commented Aug 25, 2017

I need to correct my wrong
#1450 (comment)
where I had written

In general I would avoid the *IfError functions
because they cannot work reliably in all cases
because they test $? but $? can be an unexpected value

That reason is nonsense because also

COMMAND || Error "..."

test the exit code (i.e. $?) of COMMAND.

The actual reason why I avoid the *IfError functions
is that they cannot work in case of 'set -e'
cf. #700
because in case of 'set -e' the bash directly exits
when a single COMMAND results non-zero exit code
and no subsequent *IfError function is ever called
that could test $? of the previous COMMAND.
For example on commandline:

# ( set -e ; cat qqq ; if (( $? != 0 )) ; then echo ERROR ; fi )
cat: qqq: No such file or directory

# ( set -e ; cat qqq || echo ERROR )
cat: qqq: No such file or directory
ERROR

# ( set -e ; if ! cat qqq ; then echo ERROR ; fi )
cat: qqq: No such file or directory
ERROR

The crucial point is that 'cat qqq || echo ERROR'
or 'if ! cat qqq ; then echo ERROR ; fi'
is a single COMMAND in bash.

@schlomo
Copy link
Member

schlomo commented Aug 25, 2017

@schabrolles @jsmeix maybe I have an idea for a much simpler solution to the sed change monitor problem:

Instead of writing to a special file write to STDOUT like this:

$ rm changed
$ date > d ; sed -i -e 's/2017/XXX/w /dev/stdout' d >>changed
$ date > d ; sed -i -e 's/2016/XXX/w /dev/stdout' d >>changed
$ wc -l changed
1 changed
$ cat changed
Fr 25. Aug 13:15:10 CEST XXX
$ if test $(wc -l < changed) -gt 0 ; then echo changed ; fi
changed

This trick also makes the code shorter and removes the need to reset the change tracking file every time. You can collect all the change infos from all the sed calls and then check in the end.

The other thing I noticed is that there is no /g flag in the sed substitution which means that sed will only replace the first occurrence of an ID on a single line. If this is not intentional then better add /g to every substitution.

@jsmeix
Copy link
Member

jsmeix commented Aug 25, 2017

@schlomo
do I understand it correctly that the actual improvement of

sed -i -e 's/THIS/THAT/gw /dev/stdout' file >>all_changes

over

sed -i -e 's/THIS/THAT/gw current_changes' file

is that appending various sed's STDOUT into all_changes
makes all changes sum up therein while in contrast
sed's own 'w current_changes' cannot append
so that one cannot collect changes of various
sed calls this way in one same file?

@schlomo
Copy link
Member

schlomo commented Aug 25, 2017

@jsmeix yes, that is the idea. Saves you from checking the result file after every sed call and counting the lines in this file avoids the problem with the quoting that you mentioned above.

@schabrolles
Copy link
Member Author

@schlomo thanks for the tip ... it is implemented now

@schabrolles
Copy link
Member Author

@jsmeix @gdha @schlomo Same for this one... Last review before merging.

Copy link
Member

@schlomo schlomo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, found some small things you can decide to leave or to improve.


# apply-mappings need one argument (file which contains disk device to migrate).
if [ -z "$1" ] ; then
LogError "apply-mappings function called without argument (file_to_migrate)"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd call BugError here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. Do you think we should generalize this test for all the function which must have an argument ?

part_base="${original}p" # append p between main device and partitions
;;
*mapper[/!]*)
case $OS_VENDOR in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about this after @jsmeix changed the way it works, but I'd have expected here $OS_MASTER_VENDOR

Copy link
Member

@jsmeix jsmeix Aug 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understood at the time of
5b35338
and
9c42a6a
I did not change the way it works.
I only set for SUSE those variables in a way
as I saw it was already done for other distributions.
I.e. I tried to be compatible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then indeed it might be better to use $OS_MASTER_VENDOR because it will catch you all Debian derivatives under Debian etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsmeix can you give me more information about this ?

  • difference between $OS_VENDOR and $OS_MASTER_VENDOR
  • Is there a variable which could act as a "family" name like FEDORA=(redhat,centos,fedora) DEBIAN=(ubuntu,debian).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsmeix Ok I got my answer from config-functions.sh

part_base="${original}p" # append p between main device and partitions
fi
;;
Ubuntu)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about Debian?

# for example : /dev/mapper/mpatha-part1
part_base="${original}-part" # append -part between main device and partitions
;;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about Arch, Gentoo etc.?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have Arch or Gentoo (not available on POWER), so I don't know.
In the current script, they will use the default /dev/mapper/mpatha1 partition name scheme.

;;
Ubuntu)
# Ubuntu 16.04 (need to check for other version) named muiltipathed partitions with
# [mapth device name] + "-part" + [part number]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment should probably read

[mpath device name] + "-part" + [part number]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

corrected. thanks

target="${target}p" # append p between main device and partitions
;;
*mapper[/!]*)
case $OS_VENDOR in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above. Since this is the same logic, could you extract that into a function like format_device_name or something like this? Makes it also easier to keep it in sync if something changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done. I've named it get_part_device_name_format()... it is a bit long, but format_device_name sound a bit "dangerous" and make me nervous :) (=> formatting a device ????)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@schabrolles
don't worry when meaningful names become longer.
But worry about meaningful and unambiguous names
(unambiguous also means names that are not misleading)
cf. https://github.com/rear/rear/wiki/Coding-Style

@jsmeix
Copy link
Member

jsmeix commented Aug 30, 2017

@schabrolles
preferably use paired parenthesis for case patterns
cf. "Paired parenthesis" in
https://github.com/rear/rear/wiki/Coding-Style

@schabrolles
Copy link
Member Author

@schlomo @jsmeix changes applied

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Adaptions and new features fixed / solved / done minor bug An alternative or workaround exists
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants