Wrong GRUB version recovered #895

Closed
GCChelp opened this issue Jun 27, 2016 · 29 comments

Comments

@GCChelp

GCChelp commented Jun 27, 2016

  • rear version (/usr/sbin/rear -V): Relax-and-Recover 1.18 / Git

  • OS version (cat /etc/rear/os.conf or lsb_release -a):
    OS_VENDOR=SUSE_LINUX
    OS_VERSION=13.1

    ARCH='Linux-i386'
    OS='GNU/Linux'
    OS_VERSION='13.1'
    OS_VENDOR='SUSE_LINUX'
    OS_VENDOR_VERSION='SUSE_LINUX/13.1'
    OS_VENDOR_ARCH='SUSE_LINUX/i386'

  • rear configuration files (cat /etc/rear/site.conf or cat /etc/rear/local.conf):
    OUTPUT=USB
    USB_DEVICE=/dev/disk/by-label/REAR-000
    BACKUP=NETFS
    BACKUP_URL=nfs://nfsserver/backups/rear
    BACKUP_OPTIONS="nfsvers=3,nolock"
    EXCLUDE_MOUNTPOINTS=( /home /scratch )
    AUTOEXCLUDE_PATH=( /media /mnt )
    AUTOEXCLUDE_AUTOFS=..
    AUTOEXCLUDE_DISKS=y
    SSH_ROOT_PASSWORD=XXX

  • Brief description of the issue
    "rear recover" restores unconfigured GRUB2, while GRUB legacy was used on the system.
    The result is an unbootable machine after the restore.

  • Work-around, if any
    nothing known

@GCChelp
Author

GCChelp commented Jun 27, 2016

rear-testupd.log-bootloader+grub.txt

@jsmeix
Attached the result of egrep -i "bootloader|grub" rear-testupd.log as separate file.
I couldn't find any entries causing the problem on first sight.
Please let me know what to search for if you have further ideas.

@jsmeix jsmeix added the bug The code does not do what it is meant to do label Jun 29, 2016
@jsmeix jsmeix added this to the Rear v1.19 milestone Jun 29, 2016
@jsmeix jsmeix self-assigned this Jun 29, 2016
@jsmeix
Member

jsmeix commented Jun 29, 2016

I assign it to myself because it is about openSUSE 13.1
even though I am not at all an expert in bootloader issues,
i.e. do not expect too much from me here.

@GCChelp
first and foremost I would like to understand
why you use GRUB legacy and not GRUB2.

On my openSUSE 13.1 test system
(cf. #870 (comment) )
I get by default GRUB 2 used as bootloader
and with that it "just works" for me.

Therefore I would like to understand the reasoning behind
your use of GRUB legacy on your system.

@GCChelp
Author

GCChelp commented Jun 29, 2016

@jsmeix
We use GRUB legacy for legacy reasons...
We have been running openSUSE on these workstations since something like 11.2, updating them as required. Now everything is updated to 13.1; only the bootloader stayed where it was.

I am not an expert in bootloader issues either. That's why I always postpone the possible project to update GRUB legacy to GRUB 2...

@jsmeix
Member

jsmeix commented Jun 29, 2016

@GCChelp
many thanks for your background information.
It helps me a lot to understand how it can happen
that an old bootloader is used on a new system:

By updating only the files of the system from older ones
(i.e. by updating the installed RPM packages)
instead of installing the whole system anew from scratch.

This means in general that rear should be prepared
to detect if an old (even outdated) bootloader is still in use.

But there is the general problem to reliably detect the actually
used bootloader, cf. "Disaster recovery does not just work" in
https://en.opensuse.org/SDB:Disaster_Recovery

I will dig into how rear detects what bootloader is used...

@jsmeix
Member

jsmeix commented Jun 29, 2016

@GCChelp
In your #895 (comment) your
"result of egrep -i 'bootloader|grub' rear-testupd.log"
contains:

+++ cat /var/lib/rear/recovery/bootloader
++ BOOTLOADER=GRUB

i.e. it has correctly detected that GRUB (not GRUB2) is used
but later it does contradictingly

++ Print 'Installing GRUB2 boot loader'
++ echo -e 'Installing GRUB2 boot loader'
++ grub_name=grub2

FYI: On my openSUSE 13.1 test system
after "rear -d -D mkbackup" I have in
var/lib/rear/recovery/bootloader

GRUB2
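
To check the recorded value in the recovery system before running "rear recover", something like the following sketch could be used. It is only an illustration: read_saved_bootloader is a hypothetical helper, not a rear function, and the "UNKNOWN" fallback is an assumption made for this example. The default path is the one shown in the log above.

```bash
#!/bin/bash
# Hypothetical helper (not part of rear): print the bootloader name
# recorded in the given file, or "UNKNOWN" if the file is missing or empty.
read_saved_bootloader() {
    local file="${1:-/var/lib/rear/recovery/bootloader}"
    local value=""
    if [[ -r "$file" ]]; then
        # Only the first line is read; a missing trailing newline is tolerated.
        read -r value < "$file" || true
    fi
    echo "${value:-UNKNOWN}"
}
```

Called without an argument it reads /var/lib/rear/recovery/bootloader, so on the reporter's system it would be expected to print GRUB.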

@GCChelp
Author

GCChelp commented Jun 29, 2016

@jsmeix

"This means in general that rear should be prepared
to detect if an old (even outdated) bootloader is still in use."

I fully agree!

"i.e. it has correctly detected that GRUB (not GRUB2) is used
but later it does contradictingly"

I am willing to provide further details or do some tests in our environment.
Just let me know what you suggest.

@jsmeix
Member

jsmeix commented Jun 29, 2016

With probability one (cf. https://en.wikipedia.org/wiki/Almost_surely )
I found the root cause:

Your "result of egrep -i 'bootloader|grub' rear-testupd.log"
contains:

+ source /usr/share/rear/finalize/Linux-i386/21_install_grub.sh
++ ((  USING_UEFI_BOOTLOADER  ))
++ [[ GRUB = \G\R\U\B ]]
+++ type -p grub-probe
+++ type -p grub2-probe
++ [[ -n /sbin/grub2-probe ]]
++ ((  USING_UEFI_BOOTLOADER  ))
2016-06-27 12:20:57.496977540 Including finalize/Linux-i386/22_install_grub2.sh
+ source /usr/share/rear/finalize/Linux-i386/22_install_grub2.sh
++ ((  USING_UEFI_BOOTLOADER  ))
+++ type -p grub-probe
+++ type -p grub2-probe
++ [[ -n /sbin/grub2-probe ]]
++ LogPrint 'Installing GRUB2 boot loader'

I.e. it runs finalize/Linux-i386/21_install_grub.sh
but that gives up because of a wrong test and
lets finalize/Linux-i386/22_install_grub2.sh do the job
which means it tries to install GRUB2.

The wrong test in finalize/Linux-i386/21_install_grub.sh is

# check the BOOTLOADER variable (read by 01_prepare_checks.sh script)
if [[ "$BOOTLOADER" = "GRUB" ]]; then
    if [[ $(type -p grub-probe) || $(type -p grub2-probe) ]]; then
        # grub2 script should handle this instead
        return
    fi
fi

In finalize/Linux-i386/22_install_grub2.sh there is the same test:

# Only for GRUB2 - GRUB Legacy will be handled by its own script
[[ $(type -p grub-probe) || $(type -p grub2-probe) ]] || return

Currently I do not understand how those tests are meant to work.
As usual there are no comments that explain the reasoning
so that I could understand the intent behind that code,
cf. "Code should be easy to understand" in
https://github.com/rear/rear/wiki/Coding-Style
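
The effect of the test in 21_install_grub.sh can be reproduced in isolation. The following is a minimal sketch, not rear code: old_logic_handler_for_grub_legacy is a hypothetical function that assumes BOOTLOADER is already "GRUB" and takes the directory list to search as its argument, so that it can be tried without touching the real PATH.

```bash
#!/bin/bash
# Hypothetical model of the quoted test, assuming BOOTLOADER="GRUB".
# $1 = colon-separated directory list that is searched instead of the real PATH.
# Prints the script that would end up handling the bootloader installation.
old_logic_handler_for_grub_legacy() {
    local PATH="$1"
    if [[ $(type -p grub-probe) || $(type -p grub2-probe) ]]; then
        # Same branch as in the log above: 21_install_grub.sh returns early.
        echo "22_install_grub2.sh"
    else
        echo "21_install_grub.sh"
    fi
}
```

As soon as an executable named grub-probe or grub2-probe exists in the searched directories, the sketch reports 22_install_grub2.sh, regardless of which bootloader the system actually boots with. That matches the log, where /sbin/grub2-probe was found.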

@GCChelp
when you are logged in as root in the recovery system
you can change whatever you want in the recovery system
before you run "rear recover".

In this case please change
/usr/share/rear/finalize/Linux-i386/21_install_grub.sh
and
/usr/share/rear/finalize/Linux-i386/22_install_grub2.sh
before you run "rear recover"
to enforce that 21_install_grub.sh does its job
and that 22_install_grub2.sh does not do anything.

For example (untested) remove the test from 21_install_grub.sh
and in 22_install_grub2.sh add a plain "return" at the beginning
so that 22_install_grub2.sh does nothing.

Then try if afterwards "rear recover" installs GRUB
(and not GRUB2).
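
The second half of that suggestion (making 22_install_grub2.sh do nothing) could be done by hand in an editor, or with a small helper like the one sketched below. Like the suggestion itself, it is untested against a real recovery system, and disable_sourced_script is a hypothetical name. It relies only on the fact that these scripts are sourced, as the log above shows, so a leading "return" ends them immediately.

```bash
#!/bin/bash
# Hypothetical helper: turn a script that is meant to be sourced into a no-op
# by putting "return 0" on its first line. The original content is kept below it.
disable_sourced_script() {
    local script="$1" body
    body="$(cat "$script")" || return 1
    printf 'return 0\n%s\n' "$body" > "$script"
}
```

In the recovery system it would be called as disable_sourced_script /usr/share/rear/finalize/Linux-i386/22_install_grub2.sh before running "rear recover". Removing the test from 21_install_grub.sh still has to be done manually.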

@jsmeix
Member

jsmeix commented Jun 29, 2016

"git blame" indicates the above mentioned test in 21_install_grub.sh
is from "Jesper Sander Lindgren" via commit 3f8b22f
and the matching test in 22_install_grub2.sh
is from @gdha via commit 079de45

@gdha
could you explain how the

[[ $(type -p grub-probe) || $(type -p grub2-probe) ]]

tests in #895 (comment) are meant to work?

From my current point of view the test looks like an
overcomplicated indirection (RFC 1925 item 6a).

I would have expected something straightforward
and simple (KISS) like in 21_install_grub.sh

# do not do anything for GRUB 2 here because
# GRUB 2 is installed via 22_install_grub2.sh
test "GRUB2" = "$BOOTLOADER" && return

and accordingly in 22_install_grub2.sh

# do not do anything for GRUB (i.e. GRUB Legacy) here
# because GRUB Legacy is installed via 21_install_grub.sh
test "GRUB" = "$BOOTLOADER" && return
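
To see what these two simple tests would do for different BOOTLOADER values, they can be put side by side in a small sketch. scripts_past_check is a hypothetical function for illustration only. It prints the names of the scripts that would continue past their respective test.

```bash
#!/bin/bash
# Hypothetical illustration of the two proposed tests.
# $1 = value of BOOTLOADER. Prints each script that would NOT return early.
scripts_past_check() {
    local BOOTLOADER="$1"
    # 21_install_grub.sh returns early only for GRUB2
    test "GRUB2" = "$BOOTLOADER" || echo "21_install_grub.sh"
    # 22_install_grub2.sh returns early only for GRUB (i.e. GRUB Legacy)
    test "GRUB" = "$BOOTLOADER" || echo "22_install_grub2.sh"
}
```

With "GRUB" only 21_install_grub.sh continues and with "GRUB2" only 22_install_grub2.sh continues. Note that with an empty or any other value both scripts would get past these particular tests, so other checks in the scripts would still be needed for that case.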

@jsmeix
Member

jsmeix commented Jun 29, 2016

Ha!
Commit 3f8b22f shows that the test was KISS before.
But currently I have no idea why it was complicated.
There must have been a reason but who knows it?

@jsmeix
Member

jsmeix commented Jun 29, 2016

Grepping in "git log" for "3f8b22f" results in:

Merge: 78a6f9f 3f8b22f
Author: Gratien D'haese 
Date:   Thu May 21 18:24:28 2015 +0200
    Merge pull request #589 from sanderu/Grub2_support_for_Linux
 

which leads to #589
that contains

Problems found:
The 21_install_grub.sh checked for GRUB2 which is
not part of the first 2048 bytes of a disk - only GRUB
was present - thus the check for grub-probe/grub2-probe.

I am afraid - I still do not understand it - I mean how does

if [[ "$BOOTLOADER" = "GRUB2" ]]; then

check the first 2048 bytes of a disk?

On my openSUSE 13.1 I have
"$BOOTLOADER" = "GRUB2"
and accordingly I think it should work to test for it.

@gdha
Member

gdha commented Jun 30, 2016

@jsmeix I believe [[ $(type -p grub-probe) || $(type -p grub2-probe) ]] was meant as a test to verify whether grub legacy or grub2 is present on the system. It does indeed look a bit strange; feel free to change it to how you think it should be.

@jsmeix
Member

jsmeix commented Jun 30, 2016

The whole logic of when which bootloader install script
actually installs its specific bootloader
confuses me.

In
https://github.com/jsmeix/rear/tree/perfer_to_run_GRUB_install_script_as_specified_issue895
I tried to overhaul finalize/Linux-i386/21_install_grub.sh
but I fear I introduced regressions.

I need to test it.

For now
https://github.com/jsmeix/rear/tree/perfer_to_run_GRUB_install_script_as_specified_issue895
is only there FYI so that you can have a look.

@gdha
could you check my comments in
https://github.com/jsmeix/rear/blob/perfer_to_run_GRUB_install_script_as_specified_issue895/usr/share/rear/finalize/Linux-i386/21_install_grub.sh
whether or not my reasoning makes sense?

In particular I have now in finalize/Linux-i386/21_install_grub.sh

# If the BOOTLOADER variable (read by finalize/default/01_prepare_checks.sh)
# is not "GRUB" (which means GRUB Legacy) skip this script (which is only for GRUB Legacy)
# because finalize/Linux-i386/22_install_grub2.sh is for installing GRUB 2
# and finalize/Linux-i386/22_install_elilo.sh is for installing elilo:
test "GRUB" = "$BOOTLOADER" || return

But I wonder what "rear recover" is intended
to do if no BOOTLOADER value exists.

Is then "NOBOOTLOADER=1" the right default?

Or should perhaps "rear recover" install GRUB 2
as fallback if no BOOTLOADER value exists?

@jsmeix
Member

jsmeix commented Jun 30, 2016

I tested
https://github.com/jsmeix/rear/tree/perfer_to_run_GRUB_install_script_as_specified_issue895
on a SLES11-SP4 system that uses GRUB Legacy:

# rpm -qa | grep -i grub
grub-0.97-162.172.1

There it still "just works" (at least for me):

RESCUE f96:~ # cat /var/lib/rear/recovery/bootloader 
GRUB
RESCUE f96:~ # rear -d -D recover
Relax-and-Recover 1.18 / Git
...
Recreated initramfs (mkinitrd).
Installing GRUB Legacy boot loader:
Installed GRUB Legacy boot loader with /boot on disk with MBR booted on 'device (hd0) /dev/sda' with 'root (hd0,1)'.
Updating udev configuration (70-persistent-net.rules)
...
Finished recovering your system. You can explore it under '/mnt/local'.

@GCChelp
can you please use
https://github.com/jsmeix/rear/tree/perfer_to_run_GRUB_install_script_as_specified_issue895
for a test on your openSUSE 13.1 system with GRUB Legacy.

I assume it will not yet work for you because I did not
remove the test for

type -p grub-probe || type -p grub2-probe

but I show now that message

Skip installing GRUB Legacy boot loader
because GRUB 2 is installed
(grub-probe or grub2-probe exist).

I assume you will get that message
but I would appreciate it if you could try it out
and confirm if my assumption is right (or not).

@jsmeix
Member

jsmeix commented Jun 30, 2016

@GCChelp
only FYI if needed how to use something like
https://github.com/jsmeix/rear/tree/perfer_to_run_GRUB_install_script_as_specified_issue895
for a test:

Basically "git clone" it into a directory and
then run rear from within that directory.

This is how I did it on my above mentioned SLES11-SP4 system:

# git clone https://github.com/jsmeix/rear.git
Cloning into 'rear'...
# cd rear
# git branch -a
* master
...
  remotes/origin/perfer_to_run_GRUB_install_script_as_specified_issue895
# git checkout -b perfer_to_run_GRUB_install_script_as_specified_issue895 origin/perfer_to_run_GRUB_install_script_as_specified_issue895
Branch perfer_to_run_GRUB_install_script_as_specified_issue895 set up to track remote branch perfer_to_run_GRUB_install_script_as_specified_issue895 from origin.
Switched to a new branch 'perfer_to_run_GRUB_install_script_as_specified_issue895'
# vi etc/rear/local.conf
...
# grep -v '^#' etc/rear/local.conf
OUTPUT=ISO
BACKUP=NETFS
BACKUP_OPTIONS="nfsvers=3,nolock"
BACKUP_URL=nfs://10.160.4.244/nfs
NETFS_KEEP_OLD_BACKUP_COPY=yes
SSH_ROOT_PASSWORD="rear"
USE_DHCLIENT="yes"
KEEP_BUILD_DIR=""
# usr/sbin/rear -d -D mkbackup
...

@jsmeix
Member

jsmeix commented Jun 30, 2016

Interestingly on my openSUSE 13.1 system I have by default
both GRUB Legacy and GRUB 2 RPMs installed:

# cat /etc/os-release
NAME=openSUSE
VERSION="13.1 (Bottle)"
VERSION_ID="13.1"
PRETTY_NAME="openSUSE 13.1 (Bottle) (x86_64)"
# rpm -qa | grep -i grub
grub2-2.00-39.1.3.x86_64
grub-0.97-194.1.2.x86_64
grub2-i386-pc-2.00-39.1.3.x86_64
grub2-x86_64-efi-2.00-39.1.3.x86_64
grub2-branding-openSUSE-13.1-10.4.13.noarch
# type -p grub-probe
# type -p grub2-probe
/usr/sbin/grub2-probe

This means the test for

type -p grub-probe || type -p grub2-probe

succeeds, with the result that "rear recover" prints

Skip installing GRUB Legacy boot loader
because GRUB 2 is installed
(grub-probe or grub2-probe exist).

@jsmeix
Member

jsmeix commented Jun 30, 2016

I tested
https://github.com/jsmeix/rear/tree/perfer_to_run_GRUB_install_script_as_specified_issue895
on a SLES12-SP1 system that uses GRUB 2:

# rpm -qa | grep -i grub
grub2-2.02~beta2-69.1.x86_64
grub2-branding-SLE-12-11.3.1.noarch
grub2-i386-pc-2.02~beta2-69.1.x86_64
grub2-snapper-plugin-2.02~beta2-69.1.noarch

There it also still "just works" (at least for me):

RESCUE e229:~ # cat /var/lib/rear/recovery/bootloader
GRUB2
RESCUE e229:~ # rear -d -D recover
Relax-and-Recover 1.18 / Git
...
Recreated initramfs (mkinitrd).
Installing GRUB2 boot loader
Finished recovering your system. You can explore it under '/mnt/local'.

@GCChelp
Author

GCChelp commented Jul 1, 2016

@jsmeix
Thanks a lot for providing the "git clone" information. Would have been hard for me without this!

"but I show now that message

Skip installing GRUB Legacy boot loader
because GRUB 2 is installed
(grub-probe or grub2-probe exist).

I assume you will get that message
but I would appreciate it if you could try it out
and confirm if my assumption is right (or not)."

You are right, I got

Skip installing GRUB Legacy boot loader because GRUB 2 is installed (grub-probe or grub2-probe exist).
Installing GRUB2 boot loader

BTW:

cat /var/lib/rear/recovery/bootloader

GRUB

@jsmeix
Member

jsmeix commented Jul 1, 2016

@GCChelp
because you do not use GRUB 2 on your system
I assume you do not need /usr/sbin/grub2-probe
on your system so that you could move it away with

# mv /usr/sbin/grub2-probe /usr/sbin/grub2-probe.away

as a dirty hack to make "rear recover" work on
your particular system, at least for now, because
I need more time to find out why that special code
is there instead of simply trusting the value "GRUB"
in /var/lib/rear/recovery/bootloader
(i.e. I would like to avoid regressions on other Linux
distributions if I simply remove that code).

@gdha
Member

gdha commented Jul 4, 2016

@jsmeix Your script version of 21_install_grub.sh looks ok to me. We can give it a try (+1)
The NOBOOTLOADER variable is meant as a last resort for when no bootloader is found and we do not know what to do. It was (and still is) up to the user to decide then. What else can we do? We could make some suggestions, but I wouldn't make a decision on the user's behalf, as some will like it while others will be angry...

jsmeix added a commit that referenced this issue Jul 4, 2016
…_as_specified_issue895

First steps towards more reliably installing the right bootloader.

Regarding installing GRUB Legacy as bootloader
overhauled finalize/Linux-i386/21_install_grub.sh and
improved finalize/default/01_prepare_checks.sh
(see issue #895)

It is still not fully reliably installing the right bootloader:
When GRUB 2 is installed in addition to GRUB Legacy
it still prefers to install GRUB 2 as bootloader
even if the BOOTLOADER variable tells "GRUB"
(which means GRUB Legacy).

I.e. further work is needed...
@jsmeix
Member

jsmeix commented Jul 4, 2016

With #900 I merged
https://github.com/jsmeix/rear/tree/perfer_to_run_GRUB_install_script_as_specified_issue895

It is still not fully reliably installing the right bootloader:
When GRUB 2 is installed in addition to GRUB Legacy
it still prefers to install GRUB 2 as bootloader
even if the BOOTLOADER variable tells "GRUB"
(which means GRUB Legacy).

I.e. further work is needed...

An idea how to further improve it:

If GRUB 2 and GRUB Legacy are installed and
the BOOTLOADER variable tells "GRUB",
then install GRUB Legacy as bootloader.

In contrast if GRUB Legacy is not installed then
do not install GRUB Legacy as bootloader
in finalize/Linux-i386/21_install_grub.sh
even if the BOOTLOADER variable tells "GRUB".
In this case hope for the best that then GRUB 2 is installed.
In finalize/Linux-i386/22_install_grub2.sh do the following:
If GRUB 2 is installed then install GRUB 2 as bootloader
even if the BOOTLOADER variable tells "GRUB".
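
The idea above amounts to a small decision table, which could be sketched like this. It is not an implementation for rear, only an illustration: choose_grub_to_install is a hypothetical function, the "yes"/"no" arguments stand in for whatever check rear would use to find out what is installed, and the "none" result for cases the idea does not mention is an assumption.

```bash
#!/bin/bash
# Hypothetical decision function for the idea described above.
# $1 = BOOTLOADER value, $2 = "yes" if GRUB Legacy is installed,
# $3 = "yes" if GRUB 2 is installed. Prints what would be installed.
choose_grub_to_install() {
    local bootloader="$1" legacy_installed="$2" grub2_installed="$3"
    if [[ "$bootloader" = "GRUB" && "$legacy_installed" = "yes" ]]; then
        # BOOTLOADER tells "GRUB" and GRUB Legacy is there: trust the value.
        echo "GRUB Legacy"
    elif [[ ( "$bootloader" = "GRUB" || "$bootloader" = "GRUB2" ) && "$grub2_installed" = "yes" ]]; then
        # GRUB Legacy is not available (or GRUB2 was recorded): use GRUB 2.
        echo "GRUB 2"
    else
        # Assumption: anything else is left to other scripts or to the user.
        echo "none"
    fi
}
```

For the system in this issue (BOOTLOADER is "GRUB", both GRUB Legacy and GRUB 2 installed) the sketch prints "GRUB Legacy", which is the behaviour the reporter needs.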

@GCChelp
Author

GCChelp commented Jul 5, 2016

@jsmeix
It seems that openSUSE 13.1 installed both, GRUB legacy and GRUB 2 by default.

Concerning the dirty hack, I found another workaround:
If I simply deinstall all GRUB 2 rpms, the machine still is able to boot and rear is finally working.
I could even perform my first successful rear recover with this setup! :-)

Regarding the BOOTLOADER variable and the file /var/lib/rear/recovery/bootloader : Where does this value come from? How is it determined and defined?

@jsmeix
Member

jsmeix commented Jul 11, 2016

@GCChelp
regarding
"openSUSE 13.1 installed both, GRUB legacy and GRUB 2 by default":
Yes, see my
#895 (comment)

Regarding
"deinstall all GRUB 2 ... successful rear recover":
Many thanks for the positive feedback!

Regarding
"BOOTLOADER ... Where does this value come from?":
This is what I need to learn and understand as a precondition
for further improving how the right bootloader gets installed
during "rear recover".

Because of this I would like to keep this issue open:
in general it is not really solved, even though for this
special case a workaround (deinstall GRUB 2) exists.

@jsmeix jsmeix added enhancement Adaptions and new features cleanup and removed bug The code does not do what it is meant to do waiting for info labels Jul 11, 2016
@stoxxys

stoxxys commented Jul 13, 2016

I had the same error in openSUSE 42.1:
You need to edit /usr/share/rear/finalize/Linux-i386/21_install_grub.sh

You need to change the bootloader settings part like this and it will work:

...

# skip if another bootloader was installed
if [[ -z "$NOBOOTLOADER" ]] ; then
    return
fi

# for UEFI systems with grub legacy we should use efibootmgr instead
[[ ! -z "$USING_UEFI_BOOTLOADER" ]] && return # not empty means UEFI booting

# check the BOOTLOADER variable (read by 01_prepare_checks.sh script)
if [[ "$BOOTLOADER" = "GRUB" || "$BOOTLOADER" = "GRUB2" ]]; then
    # grub2 script should handle this instead
    return
fi

# Only for GRUB Legacy - GRUB2 will be handled by its own script
if [[ -z "$(type -p grub)" ]]; then
    return
fi
...

@jsmeix
Member

jsmeix commented Jul 13, 2016

@stoxxys
please provide a GitHub pull request with your changes
so that I can reliably see your exact changes
(a GitHub pull request provides a nice diff)
or at least provide "diff -u" output, preferably
as an uploaded file or at least enclosed here
in <pre> ... </pre>

@jsmeix
Member

jsmeix commented Jul 13, 2016

@stoxxys
what "grub*" RPM packages have you installed on
your openSUSE Leap 42.1 system?

On my openSUSE Leap 42.1 system I have only grub2*
packages

# cat /etc/os-release 
NAME="openSUSE Leap"
VERSION="42.1"
VERSION_ID="42.1"
PRETTY_NAME="openSUSE Leap 42.1 (x86_64)"
...
# rpm -qa | grep -i grub
grub2-i386-pc-2.02~beta2-76.1.x86_64
grub2-2.02~beta2-76.1.x86_64
grub2-x86_64-efi-2.02~beta2-76.1.x86_64
grub2-branding-openSUSE-42.1-6.2.noarch
grub2-snapper-plugin-2.02~beta2-76.1.noarch

and for me "rear recover" just works there.

@jsmeix
Member

jsmeix commented Jul 13, 2016

@stoxxys
your /usr/share/rear/finalize/Linux-i386/21_install_grub.sh
in your #895 (comment)
is not the current rear master, in particular it is not
my latest version from #900

Please report issues with older rear versions as separate
issues and do not mix your issues into other issues.

In general in case of issues preferably try to reproduce them
with newest rear master code.

In general regarding how to test with
the currently newest rear GitHub master code:

# git clone https://github.com/rear/rear.git
# cd rear
# vi etc/rear/local.conf
# usr/sbin/rear -d -D mkbackup
and finally test whether "rear -d -D recover" works

@gdha
Member

gdha commented Sep 13, 2016

@GCChelp Do you still need some assistance from us or are you good to close it?

@GCChelp
Author

GCChelp commented Sep 14, 2016

@gdha
For me the workaround was fine, no further assistance required. Thanks for asking!

@jsmeix
Do you consider this issue fixed in the next Rear release?

@jsmeix
Member

jsmeix commented Sep 15, 2016

Everything that is shown as "merged" above will be in the next
Relax-and-Recover release.

@jsmeix jsmeix closed this as completed Sep 15, 2016

4 participants