Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support saving and restoring hybrid BIOS/UEFI bootloader setup and clean up bootloader detection #3145

Merged
merged 15 commits into from Feb 16, 2024

Conversation

pcahyna
Copy link
Member

@pcahyna pcahyna commented Feb 6, 2024

Pull Request Details:
  • Type: Bug Fix / Enhancement

  • Impact: High

  • Reference to related issue (URL): also helps with the problem in ErrorIfDeprecated when GRUB Legacy is used #3128

  • How was this pull request tested?
    Full backup and recovery on a hybrid UEFI/BIOS bootloader setup, with UEFI firmware.

  • Description of the changes in this pull request:

    • Fixes a bug where a system with hybrid BIOS/UEFI boot was restored without the BIOS bootloader (GRUB2) and on subsequent mkrescue it will abort because it can't detect the bootloader - the BIOS bootloader was not found in the boot sector and the presence of the BIOS boot partition confused the EFI bootloader detection. In detail:
    • Support BOOTLOADER=GRUB/GRUB2 with UEFI. This will happen with hybrid boot: BOOTLOADER=GRUB2 indicates that there is the BIOS version of GRUB2 installed and at the same time, USING_UEFI_BOOTLOADER=1 indicates theat there is also an EFI bootloader. Note that it is now perfectly legal to have USING_UEFI_BOOTLOADER=1 and BOOTLOADER neither EFI nor GRUB2-EFI nor ELILO. This will happen with hybrid BIOS/UEFI booting: BOOTLOADER will be detected as GRUB2, but USING_UEFI_BOOTLOADER=1.
    • Detect GRUB2 earlier - when saving layout. We used to distinguish between GRUB and GRUB2 when reinstalling the bootloader. This meant that the saved bootloader was wrong: GRUB for GRUB2. Detect GRUB2 earlier, already at the moment when we guess the bootloader, and save the correct value. This should help with the problem reported in PR ErrorIfDeprecated when GRUB Legacy is used #3128 (misdetection of GRUB2 as GRUB on any non-SUSE distro).
    • Don't look for EFI bootloader in the start of disk. EFI bootloader is not installed to the start of the disk, but to the EFI System partition as a regular file. It is this futile to search for EFI bootloaders in the first sectors of disks. If we find "EFI", it is only because there is a GPT, and "EFI PART" is the GPT signature (this does not tell anything about the presence of an EFI bootloader). See 67ac463#r138332329 .
    • Delete the code that checked for 'Hah!IdontNeedEFI' as a special case - it is not needed now. The code effectively skipped the check for EFI when there was a BIOS boot partition. Its presence does not indicate the absence of an EFI bootloader, though. We could be on an UEFI machine with the disk partitioned in hybrid (BIOS/UEFI compatible) mode. In this case ReaR can error out if legacy GRUB is not installed, because it does not find any bootloader (the EFI check is skipped).

@pcahyna pcahyna self-assigned this Feb 6, 2024
@pcahyna pcahyna added enhancement Adaptions and new features bug The code does not do what it is meant to do cleanup labels Feb 6, 2024
@pcahyna pcahyna added this to the ReaR v2.8 milestone Feb 6, 2024
@pcahyna pcahyna requested a review from a team February 6, 2024 18:07
# (both GRUB 2 and GRUB Legacy have the string "GRUB" in their MBR).
function is_grub2_installed () {
if type -p grub-probe >&2 || type -p grub2-probe >&2 ; then
LogPrint "GRUB 2 is installed (grub-probe or grub2-probe exist)."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think DebugPrint would be better here
to avoid that this message is shown to the user
every time when is_grub2_installed is called.
Perhaps even only Log is sufficient here.
When needed user messages can be shown
at the place where is_grub2_installed is called.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsmeix I can do that. I chose this, as the original code ( https://github.com/rear/rear/pull/3145/files#diff-078a2efa46c4e9db15409dc78a15393011d85d51b0dba2323d590e06f3db36b3L47 ) had LogPrint as well, so I wanted to preserve the information. It is true that it may be a bit too verbose, so I will do as you suggest.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I will use Log here, DebugPrint would make the message disappear even from the log if not DEBUG, and I prefer to have more verbose logs. Or, I could introduce LogDebugPrint that would log always and print only if DEBUG.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the time when I implemented that in June 2016 via
6eb37d9
there was not yet a DebugPrint function so I used LogPrint.

Over the time such LogPrint calls became more and more
so ReaR's verbose mode became more and more oververbose.

I implemented the DebugPrint function in July 2017 via
2a03091

There are many older LogPrint calls where it is
questionable if they are helpful in verbose mode
and where DebugPrint is probably better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I don't use Log() because for my personal needs
having something only in the log file is not helpful.
I never look at the log file unless I need to investigate
something and then I want debug messages.

But I do not use ReaR in production environments.
Probably in production environments there are reasons
to have certain messages only in the log file.

I use ReaR only as a developer and I call ReaR basically
always in debug mode - regadless if I need to debug something
or not - I just want to have all info already there when
I spot something where I need detailed information.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if having a message like

GRUB 2 is installed (grub-probe or grub2-probe exist).

only in the log file (but not on the user's terminal)
could be useful or helpful in production environments?

Copy link
Member

@jsmeix jsmeix Feb 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Offhandedly I think instead of an additional LogDebugPrint
it would be better to change the current DebugPrint
to log always and print only if DEBUG.

In particular when you prefer to have more verbose logs
in any case.

I also want verbose logs and I call ReaR basically
always in debug mode to get verbose logs.

But when there are more verbose logs in any case
some users may not like that and probably righfully
complain why the logs are so verbose in any case
when there is the verbose mode and the debug mode
for those who want more verbose logs.

I think the messaging/logging area is complicated
and it should be better discussed separately.

For this particular case here feel free to use
what you think is best here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I don't use Log() because for my personal needs
having something only in the log file is not helpful.
I never look at the log file unless I need to investigate
something and then I want debug messages.

the idea is that especially during recovery, one wants to have information in the log by default and not only when asked, because if a user encounters a problem but the recovered machine basically works, they may dislike to have to reproduce the issue by recovering again with debug mode, so the log from the run with default settings will be the only thing that you will have.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in 4be6d2f


label="$( get_disklabel_type "$disk" )" || return 1
# I (pcahyna) don't know whether the SUSE-specific 'gpt_sync_mbr' partition scheme
# supports GRUB installation without a bios_grub partition.
Copy link
Member

@jsmeix jsmeix Feb 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pcahyna
I think you can safely ignore the 'gpt_sync_mbr' partitioning.

Details:

I found https://build.opensuse.org/request/show/519107
where the SUSE specific patches to parted were removed
that implemented the 'gpt_sync_mbr' partitioning.
The comment therein (excerpt)

clean-up regarding pMBR handling to sync with SLE12-SP*

indicates that the 'gpt_sync_mbr' partitioning
was no longer supported since some SLES12 service pack.

Further investigation with the openSUSE build service tool 'osc' shows

# osc search parted | grep '^SUSE:SLE-12'
SUSE:SLE-12-SP1:GA                       parted
SUSE:SLE-12-SP1:Update                   parted
SUSE:SLE-12-SP2:GA                       parted
SUSE:SLE-12-SP2:Update                   parted
SUSE:SLE-12-SP3:GA                       parted
SUSE:SLE-12-SP3:Update                   parted
SUSE:SLE-12-SP4:GA                       parted
SUSE:SLE-12-SP4:Update                   parted
SUSE:SLE-12:GA                           parted

# osc cat SUSE:SLE-12-SP2:GA parted parted.changes
...
Thu Mar 24 11:52:51 UTC 2016 - puzel@suse.com

- Drop (SUSE specific) support for hybrid pMBR (gpt_sync_mbr
  label) (fate#317849)
  - remove: parted-gpt-mbr-sync.patch
  - remove: libparted-ppc-prepboot-in-syncmbr.patch
  - refresh patches

SUSE:SLE-12-SP1:GA parted parted.changes does not contain that.

So the SUSE specific patches for 'gpt_sync_mbr' were removed
for SLES12 SP2.

This means a SLES12 system that was installed with SLES12
before SLES12 SP2 could have 'gpt_sync_mbr' partitioning.
Subsequent system upgrades to further service packs only
upgrade RPM packages but leave the partitioning unchanged.
I think that one can also upgrade from SLES12 to SLES15
without new partitioning (i.e. without installing from scratch)
so there could be even SLES15 systems with old inherited
'gpt_sync_mbr' partitioning.

If issues appear in ReaR because of an old inherited
'gpt_sync_mbr' partitioning, it is SUSE's problem,
not yours.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsmeix Thanks for the explanation. The code currently does not have any gpt_sync_mbr handling, the comment explains why it is not needed. So the only thing I could improve is to delete the comment to avoid distracting the reader from more important stuff.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in acff60f

# Returns 1 if the device is an LVM physical volume
# Returns 0 otherwise or if the device doesn't exists
# Returns true if the device is an LVM physical volume
# Returns false otherwise or if the device doesn't exists
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pcahyna
thank you for cleaning up the inverse logic
of the is_disk_a_pv return code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An addedum regarding is_disk_a_pv:
I was wondering about the word 'disk' in this function
because a LVM PV can be a disk or a partition
(in general a PV is a block device, e.g. also a LUKS volume).
The is_disk_a_pv function was introduced via
#901
so the is_disk_a_pv function is meant to be used
only with a disk as argument when installing GRUB[2].

Copy link
Member

@jsmeix jsmeix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except some nitpicking all looks well to me
(from plain looking at the code)
so I approve it gratefully!

@pcahyna
Copy link
Member Author

pcahyna commented Feb 8, 2024

@rear/contributors I plan to merge it tomorrow unless there are objections.

Introduce function is_grub2_installed() with common code.
We used to distinguish between GRUB and GRUB2 when reinstalling the
bootloader. This meant that the saved bootloader was wrong: GRUB for
GRUB2. Detect GRUB2 earlier, already at the moment when we guess the
bootloader, and save the correct value.

This should help with the problem reported in PR #3128 (misdetection of
GRUB2 as GRUB on any non-SUSE distro).
EFI bootloader is not installed to the start of the disk, but to the EFI
System partition as a regular file. It is this futile to search for EFI
bootloaders in the first sectors of disks. If we find "EFI", it is only
because there is a GPT, and "EFI PART" is the GPT signature (this does
not tell anything about the presence of an EFI bootloader).

Delete the code that checked for 'Hah!IdontNeedEFI' as a special case -
it is not needed now. The code effectively skipped the check for EFI
when there was a BIOS boot partition. Its presence does not indicate
the absence of an EFI bootloader, though. We could be on an UEFI machine
with the disk partitioned in hybrid (BIOS/UEFI compatible) mode. In this
case ReaR can error out if legacy GRUB is not installed, because it does
not find any bootloader (the EFI check is skipped).

Detect EFI instead according to the USING_UEFI_BOOTLOADER variable when
no other bootloader is found.

Note that it is now perfectly legal to have USING_UEFI_BOOTLOADER=1
and BOOTLOADER neither EFI nor GRUB2-EFI nor ELILO. This will happen
with hybrid BIOS/UEFI booting: BOOTLOADER will be detected as GRUB2, but
USING_UEFI_BOOTLOADER=1.

BOOTLOADER=EFI is now intended as a fallback when no other bootloader is
found to avod an empty BOOTLOADER value.
This will happen with hybrid boot: BOOTLOADER=GRUB2 indicates that there
is the BIOS version of GRUB2 installed and at the same time,
USING_UEFI_BOOTLOADER=1 indicates theat there is also an EFI bootloader.
The function is_disk_a_pv returns with 1 if disk is a PV, 0 if it is
not.

As a nonzero return code is interpreted as "false" in shell tests,
this convention is very confusing and contrary to all other is_*
functions, which return true (exit code 0) when the condition that they
check and is expressed in their name is true (cf. is_true).

Invert the exit code of is_disk_a_pv to comply with the usual convention
and adjusts comments and callers.
This will happen with hybrid boot: BOOTLOADER=GRUB indicates that there
is the BIOS version of GRUB installed and at the same time,
USING_UEFI_BOOTLOADER=1 indicates theat there is also an EFI bootloader.
Not all disks are suitable for GRUB installation into their MBR. For
example, GPT-partitioned disks need a BIOS boot partition
( https://www.gnu.org/software/grub/manual/grub/html_node/BIOS-installation.html )
and attempting to install GRUB on disks without it makes grub-install
error out. This is shown in the recovery log ona BIOS system with
GPT-partitioned disks, where /dev/vdb and the following disks are part
of the / volume, but don't have a BIOS boot partition:

Installing GRUB2 boot loader...
Determining where to install GRUB2 (no GRUB2_INSTALL_DEVICES specified)
Found possible boot disk /dev/vda - installing GRUB2 there
Found possible boot disk /dev/vdb - installing GRUB2 there
Failed to install GRUB2 on possible boot disk /dev/vdb
Found possible boot disk /dev/vdc - installing GRUB2 there
Failed to install GRUB2 on possible boot disk /dev/vdc
Found possible boot disk /dev/vdd - installing GRUB2 there
Failed to install GRUB2 on possible boot disk /dev/vdd
Found possible boot disk /dev/vde - installing GRUB2 there
Failed to install GRUB2 on possible boot disk /dev/vde
Found possible boot disk /dev/vdf - installing GRUB2 there
Failed to install GRUB2 on possible boot disk /dev/vdf

To fix that, introduce function is_disk_grub_candidate and use it before
installing GRUB/GRUB2 on the disk. This functions currently performs the
check for the BIOS boot partition on GPT-formatted disks and the check
that the disk is not a LVM PV.
layout/save/default/445_guess_bootloader.sh needs to know whether we are
using UEFI. Source prep/default/320_include_uefi_env.sh (which sets
USING_UEFI_BOOTLOADER) in the savelayout workflow.

This is a minimally invasive change; there may be other variables set
during prep stage that are needed during savelayout (see
#1673 ) so a more general, but more
invasive solution would be to source the entire prep stage in the
savelayout workflow.
finalize/Linux-i386/660_install_grub2.sh installs GRUB2 for legacy BIOS
boot into the disks(s) MBR(s). When using EFI, this can be still used in
a hybrid BIOS/UEFI boot scenario, but we need to pass --target=i386-pc
to grub-install in this case, otherwise it will attempt to install the
EFI GRUB, not the BIOS GRUB.

See also output/USB/Linux-i386/300_create_grub.sh .
Update comments for the code that detects if the "GRUB" value for
BOOTLOADER actually means GRUB2 and emit warnings (using LogPrintError)
about "GRUB" value to mean GRUB 2 being deprecated and the GRUB Legacy
support itself being deprecated as well.

Do not error out in this situation as erroring out near the end of
recovery only to inform about a deprecated setting would be unhelpful.
Commit 67ac463 ("initial attemp to guess bootloader") introduced
bootloader detection, so it has not been true since then that
"There is no guarantee that GRUB was the boot loader used originally.
One possible attempt would be to save and restore the MBR for each disk,
but this does not guarantee a correct boot order, or even a working boot
loader config" We don't save and restore the MBR for each disk, we
detect the bootloader and reinstall it, thus avoiding the latter issue.

Mention that we make "all suitable disks" bootable, not "all disks", as
we make only the / and /boot partition disks bootable and there are
further rules to skip some disks.
BOOTLOADER = "GRUB" used to mean GRUB 2 if GRUB 2 was installed and GRUB
Legacy if not. With the improved GRUB detection, we don't want this
ambiguity any more. "GRUB" should mean GRUB Legacy and GRUB2 should mean
GRUB 2.
@pcahyna pcahyna merged commit ca99d85 into master Feb 16, 2024
21 of 22 checks passed
@pcahyna pcahyna deleted the restore-hybrid-bootloader branch February 16, 2024 10:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug The code does not do what it is meant to do cleanup enhancement Adaptions and new features fixed / solved / done
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants