Fix boot failure on ARM UEFI devices because of missing regexp module. #1010

Merged: 1 commit, Jun 17, 2020

kacf (Member) commented Jun 16, 2020

Changelog: Fix boot failure on ARM UEFI devices because of missing
`regexp` module. A typical error log looks like this:
```
Welcome to GRUB!

error: no such device: ((hd0,msdos1)/efi/boot)/EFI/BOOT/grub.cfg.
lock: OK
lock: OK
error: can't find command `regexp'.

error: disk `,msdos2' not found.

Dropping to grub prompt for unknown reason. Should never get here.
```

It happened because the binary `grub-efi-bootarm.efi` file was not
updated when the regexp module was added as a requirement. Fixed by
grabbing it from the vexpress-qemu UEFI build.

Signed-off-by: Kristian Amlie <kristian.amlie@northern.tech>
@mender-test-bot

Hello 😸 I created a pipeline for you here: Pipeline-156735344

Build Configuration Matrix

| Key | Value |
| --- | --- |
| BASE_BRANCH | warrior |
| BUILD_BEAGLEBONEBLACK | true |
| BUILD_QEMUX86_64_BIOS_GRUB | true |
| BUILD_QEMUX86_64_BIOS_GRUB_GPT | true |
| BUILD_QEMUX86_64_UEFI_GRUB | true |
| BUILD_VEXPRESS_QEMU | true |
| BUILD_VEXPRESS_QEMU_FLASH | true |
| BUILD_VEXPRESS_QEMU_UBOOT_UEFI_GRUB | true |
| META_MENDER_REV | pull/1010/head |
| POKY_REV | warrior |
| RUN_INTEGRATION_TESTS | true |
| TEST_QEMUX86_64_BIOS_GRUB | true |
| TEST_QEMUX86_64_BIOS_GRUB_GPT | true |
| TEST_QEMUX86_64_UEFI_GRUB | true |
| TEST_VEXPRESS_QEMU | true |
| TEST_VEXPRESS_QEMU_FLASH | true |
| TEST_VEXPRESS_QEMU_UBOOT_UEFI_GRUB | true |

kacf mentioned this pull request Jun 16, 2020

@eigendude

Confirmed that this fixes the problem:

Welcome to GRUB!

lock: OK
lock: OK


[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 5.3.13-jumpnow (oe-user@oe-host) (gcc version 8.3.0 (GCC)) #1 Fri Jun 12 19:58:51 UTC 2020
[    0.000000] CPU: ARMv7 Processor [413fc082] revision 2 (ARMv7), cr=10c5387d
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
[    0.000000] OF: fdt: Machine model: TI AM335x BeagleBone Black

...

kacf (Member, Author) commented Jun 17, 2020

Thanks!

kacf merged commit 9d89a7e into mendersoftware:warrior Jun 17, 2020
kacf deleted the outdated_grub-efi-bootarm.efi branch June 17, 2020 09:57

eigendude commented Jun 18, 2020

One of the downsides of an opaque binary is that it's an incredible hill climb to add a feature. Maybe I could seek some direction for this feature?

I wrote a live installer for installing to the BeagleBone's eMMC, and with just U-Boot I was able to dual-boot the entire system (SD card when present, eMMC when not). The key was for U-Boot to propagate which device it knows is booting throughout the boot process, all the way to Mender. I modified the Mender U-Boot patch to know which device to do A/B root partitioning on. Now I get to maintain a U-Boot patch of a U-Boot patch. Um... no thanks.

So I fought with the GRUB option and UEFI booting. Now, I can install and boot the UEFI system from eMMC.

However, at the GRUB step, when the binary reads grub.cfg, it needs to know which device it was booted from, and I think this has to come from U-Boot like before. If there were a simple way to pass a parameter from U-Boot to GRUB, and condition on this parameter in grub.cfg, then I could achieve dual-boot again. Indeed, U-Boot supports USB devices, so someday maybe triple-boot.

What challenge am I looking at? Do I need to find some repo for the binary file, learn the build system, modify the code, compile and deploy myself? Or is there a way to pass data from U-Boot to grub.cfg?

Thanks for your input.

drewmoseley commented Jun 18, 2020

@eigendude I am not sure about passing data from U-Boot to GRUB. I think the DTB is shared, so perhaps there is something that can be injected there. I would suggest asking this question in the forums over on Mender Hub, since it will get a wider audience there.

We would also love to see the installer you mention.

kacf (Member, Author) commented Jun 18, 2020

> One of the downsides of an opaque binary is that it's an incredible hill climb to add a feature. Maybe I could seek some direction for this feature?

The opaque binary is unfortunately necessary because of toolchain issues in Yocto. I believe it is because Yocto refuses to compile it without hard float support, despite the fact that floats are not used in GRUB, and compiling with them breaks GRUB. So we compile it with a different toolchain.

That being said, I am not a huge fan of the current approach of using a stripped-down feature set. Especially with UEFI, it makes sense that the GRUB boot loader supports a standard set of commands, so that it is fully customizable. But it would require some research into whether all modules should be included, or whether we should omit some for security or size reasons. Of course, it would increase size regardless, but I think we can live with that if it's not too big.

In the zeus branch, the method for building the binary has been cleaned up considerably, so you may find it easier to experiment with building a new one. Check out this repository and script. Still not optimal, but better than warrior.

> However, at the GRUB step, when the binary reads grub.cfg, it needs to know which device it was booted from, and I think this has to come from U-Boot like before.

I'm a little bit confused, doesn't the regexp call that was added solve exactly this?

@eigendude

Thanks, I'll ask on Mender hub. It sounds like digging is required.

Here are the two commits for the eMMC installer so far:

eigendude commented Jun 18, 2020

> I'm a little bit confused, doesn't the regexp call that was added solve exactly this?

Here is the regexp command:

regexp (.*),(.*) $root -s mender_grub_storage_device

It looks like it modifies the root variable. And how is root set? Via:

root=${mender_root} ${bootargs}

How is mender_root set? Via:

mender_root="${mender_kernel_root_base}${mender_boot_part}"

Where does mender_kernel_root_base come from? It is hard-coded to MENDER_STORAGE_DEVICE here:

mender_kernel_root_base=${MENDER_STORAGE_DEVICE_BASE}

I need to dynamically select between MENDER_STORAGE_DEVICE and MENDER_INSTALL_DEVICE based on which device we booted from, which only U-Boot figures out dynamically by trial-and-error. Better yet, remove MENDER_INSTALL_DEVICE and always select dynamically.

Linux does this by reading the root kernel variable set by U-Boot and GRUB. GRUB is loaded into and run from memory, but it must know which device to load grub.cfg from. This is the mystery I'm trying to figure out now.
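
To make the effect of that `regexp` line concrete, here is a minimal Python sketch that emulates it. It assumes GRUB's documented behaviour that `-s VAR` without an index stores the first capture group in `VAR` and leaves `$root` (GRUB's own root device, which is separate from the kernel's `root=` argument) untouched. The helper name `grub_regexp_set` is hypothetical, the sample value `hd0,msdos1` is taken from the error log in this PR, and Python's `re` stands in for GRUB's POSIX regex engine, which gives the same result for this input.

```python
import re


def grub_regexp_set(pattern, string):
    """Emulate `regexp PATTERN STRING -s VAR`.

    Returns the text of capture group 1, or None when the pattern does not
    match (in which case GRUB would leave the variable without a value).
    """
    match = re.search(pattern, string)
    if match is None or match.lastindex is None:
        return None
    return match.group(1)


# GRUB's $root as seen in the log from this PR: disk "hd0", partition "msdos1".
root = "hd0,msdos1"
mender_grub_storage_device = grub_regexp_set(r"(.*),(.*)", root)

# If the regexp command is missing, the variable stays empty. Assuming
# grub.cfg then builds a disk name from it, the result has no disk part.
missing = ""
broken_disk_name = f"{missing},msdos2"
```

Under these assumptions `mender_grub_storage_device` ends up as `hd0`, and the empty-variable case yields `,msdos2`, which resembles the "disk `,msdos2' not found" error reported above.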

eigendude commented Jun 21, 2020

> Do I need to find some repo for the binary file, learn the build system, modify the code, compile and deploy myself?

So I did exactly that... and almost even in that order 🙂

I found the EFI API between U-Boot and GRUB. U-Boot passes a handle to the disk that GRUB was loaded from, but the context identifying the disk is lost. The disk always becomes hd0. If booted from eMMC, then eMMC becomes hd0. If booted from SD, then SD becomes hd0, and eMMC becomes hd1. GRUB root is always set to hd0, but we need the kernel root to be set based on the U-Boot context.
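
The numbering described above can be illustrated with a small Python sketch. It is only a model of the observed behaviour, not GRUB code: the function name and the device labels `sd` and `emmc` are hypothetical, and it assumes the boot disk is always enumerated first.

```python
def grub_disk_names(boot_device, present_devices):
    """Return {grub_name: device}, with the boot device always listed as hd0."""
    ordered = [boot_device] + [d for d in present_devices if d != boot_device]
    return {f"hd{index}": device for index, device in enumerate(ordered)}


# SD card inserted: U-Boot boots from SD, and eMMC is the second disk.
booted_from_sd = grub_disk_names("sd", ["sd", "emmc"])

# No SD card: U-Boot falls back to eMMC, which now takes the hd0 name.
booted_from_emmc = grub_disk_names("emmc", ["emmc"])
```

In both cases `hd0` exists, but it refers to a different physical device, so the GRUB disk name alone cannot tell grub.cfg which kernel root device to use.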

So we have two options. The first is the "hacky" way, where the live installer mounts the boot partition after it's dd'ed, and then uses sed to change mmcblk0 (storage path) to mmcblk1 (install path). However, this won't allow booting from USB. The second is the "proper" way: extending the EFI API (or using an existing data path, like DTBs) to allow disk context to be passed, and using a lookup table in grub.cfg instead of hard-coding the path. The downside is we might have to maintain U-Boot and GRUB patches, but the patches would simply add a variable to an API, which is more maintainable and scalable than additional logic.
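
A rough Python sketch of the two options, for comparison. Everything here is illustrative: the function names, the boot-device labels, the `/dev/...p` path prefixes, and the USB entry are assumptions, and the sketch does not solve the open question of how U-Boot would hand the boot-device label to GRUB.

```python
def rewrite_storage_device(grub_cfg_text, storage="mmcblk0", install="mmcblk1"):
    """'Hacky' option: the equivalent of `sed s/mmcblk0/mmcblk1/g` on grub.cfg."""
    return grub_cfg_text.replace(storage, install)


# 'Proper' option: a lookup table keyed by a boot-device label that would have
# to be passed in from U-Boot (the labels and paths are hypothetical).
ROOT_BASE_BY_BOOT_DEVICE = {
    "sd": "/dev/mmcblk0p",
    "emmc": "/dev/mmcblk1p",
    "usb": "/dev/sda",
}


def kernel_root(boot_device, boot_part):
    """Build the kernel root device from the boot-device label and partition."""
    return f"{ROOT_BASE_BY_BOOT_DEVICE[boot_device]}{boot_part}"


patched_line = rewrite_storage_device("mender_kernel_root_base=/dev/mmcblk0p")
emmc_root = kernel_root("emmc", 2)
```

The rewrite bakes one device into the installed image, while the table keeps a single grub.cfg that could also cover USB once the label is available at boot time.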

I'll take the hacky option now, but USB booting is really tempting, and could benefit almost every device beyond those with eMMC. Aclima sends sensors to far-flung reaches of the earth, where connectivity isn't a given, and it would be nice to email the 200MB disk img file that can be burned to a USB stick and booted.

If I'm successful with the eMMC installer and dual boot, should I open a draft PR? I'm happy to shed visibility into the work, and if I'm able to, I'm willing to do the extra work it takes to upstream. At the very least, I can maintain the draft PR as long as it stays hacky for anyone who's comfortable with a hacky option.

kacf (Member, Author) commented Jun 22, 2020

Thanks for a great investigation!

Given that EFI is a standard, I wonder if it already has a way to pass disk context through its API. I think U-Boot implements only a subset of the standard, and might be missing this part. If so, I reckon both U-Boot and GRUB maintainers would be happy to receive patches to enable it. This is just a guess, I'm no expert on EFI, but might be worth asking on the U-Boot mailing list.

> If I'm successful with the eMMC installer and dual boot, should I open a draft PR?

Yes, please do! This sounds like a very worthwhile improvement!
