Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: add new memtest86+ to recovery options via systemd-boot #291

Open
wagnerck opened this issue Apr 11, 2022 · 19 comments
Open

Comments

@wagnerck
Copy link

The open-source memtest86+ (http://memtest.org/, distinct from the commercial memtest86 owned by PassMark at https://www.memtest86.com/) is under active development again. V6 is going to go into beta testing later this April, and it includes new features to make it useful once more, such as UEFI support, DDR5 testing, and more. But the most interesting feature that I'm seeing is that it compiles EFI executables by default, which work as expected when added to the /boot/efi partition.

This would allow us to add the new memtest86+ to our default Pop!OS installs, as another boot option that users can choose along with the recovery partition. I feel like this would be very useful for troubleshooting purposes, and could be a proof-of-concept for adding other hardware troubleshooting tools to the EFI partition. The memtest.efi file is 135k and should not cause any space issues on /boot/efi.

The following commands will build the new memtest86+ (v6 alpha) to include both an ISO you can burn to a USB stick and a memtest.efi file (although additional packages may be required, xorriso was the only additional one I had to install myself):

$ git clone https://github.com/memtest86plus/memtest86plus
$ cd memtest86plus/build64
$ sudo apt install xorriso
$ make iso
$ ls
app      floppy.img  lib          memtest.efi  memtest_shared      tests
boot     iso         Makefile     memtest.iso  memtest_shared.bin
esp.img  ldscripts   memtest.bin  memtest.mbr  system

Then copy memtest.efi to /boot/efi/EFI/memtest/memtest.efi, and create /boot/efi/loader/entries/memtest.conf that contains the following:

title memtest86+
efi /EFI/memtest/memtest.efi
options smp

This works in a UEFI VM and on my oryp6. See the following photos:
IMG_0154
IMG_0155

You can also add it to the firmware's boot device listing via efibootmgr, but I feel like the more important thing is having it in the systemd-boot menu, since that's where our users go to choose the recovery partition, and the command to open up that menu is universal across hardware (i.e. hold down the spacebar).

And if there's also a way to add this to the installer ISO, as an option in the initial boot menu, that would make this useful for folks at the factory who are doing installation and testing, as well as for people trying to install Pop!OS and can't because they have a hardware problem. I don't know if that would be done with an EFI file or with something else, like the floppy image.

I've been told that it was determined that we can't do this kind of thing with the commercial memtest86 from PassMark, but this project appears to be fully GPL2 (https://github.com/memtest86plus/memtest86plus/blob/main/LICENSE) so there shouldn't be any insurmountable licensing issues.

Caveats: memtest86+ v6 is still in an alpha state, with a beta in late April 2022, so while testing might be worth doing right now, we probably shouldn't actually add this to the shipping product until later. Maybe the beta will be stable enough to start shipping it, maybe not.

Also, we would clearly need to test this on a wide variety of hardware, especially our desktop PC line (Thelio and Meerkat) because memtest86+ has a new keyboard detection system (as outlined on the GitHub page) which may not reliably detect external USB keyboards on all systems just yet, or may require booting with CSM and/or legacy modes enabled.

I'm not sure how difficult this would be to add to our installer, but I feel like the best option here would be to treat it like the recovery partition, since it also will only be present on UEFI systems. Pop!OS could also possibly update the EFI executable independent of the OS, using the same mechanism used to update the recovery partition.

You can supposedly change the order of the options in the systemd-boot menu with a sort-key option in the configuration file (see https://systemd.io/BOOT_LOADER_SPECIFICATION/), but I couldn't get it to work. I think it needs to have a sort-key in all of the configuration files under /boot/efi/loader/entries/ for this to work correctly, but I didn't mess with the ones automatically generated by Pop!OS to test this.

Finally, even if the EFI executable doesn't turn out to be feasible to include in Pop!OS, we should be happy that there's once again going to be a fully open-source memtest that we can refer users to, that won't nag them to install a "Pro" version.

@mmstick
Copy link
Member

mmstick commented Apr 11, 2022

I think it'd just require making a debian package for it which installs to that path and adds an entry.conf for it. Shouldn't require much or any change to the installer.

@wagnerck
Copy link
Author

Is there a way to make it so a Debian package won't install on non-UEFI systems? I know there's a lot of of people (myself included) who are still running Pop!OS on BIOS hardware. That was part of why I thought that doing it as part of the installer might be a better idea.

Also, that wouldn't allow us to add it as a boot option to the installer ISO, right?

@mmstick
Copy link
Member

mmstick commented Apr 11, 2022

The installer would only make it available for new installs. A package could make it available to all systems in an update. The debian package could use a postinst script to copy the EFI binary to the EFI directory and set up the entry for it (if it even requires a boot conf), and a postrm script could remove it. The script could do nothing on a system without an EFI partition mounted.

@wagnerck
Copy link
Author

If you think that's the better solution, I'll look into how I can create a Debian package for this instead of adding it to the Pop!OS installer setup.

@thomas-zimmerman
Copy link

I think it would be better to have the deb install the files in the ESP and create the laoder entry--and on remove do nothing unless purged. This would allow the package to be included on install (creating the entry) and not be installed in the resulting system.

Support would be able to request install of the deb package again if the memory test is not found in the systemd-boot menu.

With this design, would the Pop!OS ISO have the entry as well?

@mmstick
Copy link
Member

mmstick commented Apr 11, 2022

I don't think that it would, but something like that could be added separately.

@wagnerck
Copy link
Author

I feel like having the package remain as part of the resulting install would be important because then it could be updated with a new EFI binary as part of your normal apt update+upgrade process, or through the Pop!Shop. Support would still be able to do apt reinstall --purge pop-memtest or whatever if necessary, but it would be kept up-to-date continually by default. The same package could also theoretically be expanded later to include other tools, so maybe naming it something like pop-diagnostics to start would be better? These are decisions that can made later since they won't affect the basic plan.

The Pop!OS ISO would need to have it added to the opening menu, I think? On UEFI systems it shows the "Try or Install Pop_OS" screen, which only has the one menu option at the moment. Adding a second one ("Test System Memory" perhaps) would be a matter of adding the menu option to /boot/grub/grub.cfg and have it point to the executable, I think. And yeah, this would have to be separate from a Debian package.

@wagnerck
Copy link
Author

I'm working on a proof-of-concept for this (with a package name named pop-diagnostics) and while it turned out to be fairly trivial to have the package tools copy the appropriate files to /boot/efi, I've run into a snag. Attempting to upgrade the package fails, as dpkg uses symlinks as part of the upgrade process, so it can rollback to the original files easily if the upgrade fails, and that doesn't work on an EFI partition since it's FAT32.

ckw@flatline:~/Documents/projects/memtest$ sudo dpkg -i pop-diagnostics_0.1.1_amd64.deb 
[sudo] password for ckw: 
(Reading database ... 258533 files and directories currently installed.)
Preparing to unpack pop-diagnostics_0.1.1_amd64.deb ...
systemd-boot present, continuing install
Unpacking pop-diagnostics (0.1.1) over (0.1.0) ...
dpkg: error processing archive pop-diagnostics_0.1.1_amd64.deb (--install):
 unable to make backup link of './boot/efi/EFI/memtest/memtest.efi' before installing new version: Operation not permitted
Errors were encountered while processing:
 pop-diagnostics_0.1.1_amd64.deb

This is not a unique problem. as there's a Raspbian hack to get around the same problem and a Ubuntu issue about this is almost ten years old, etc. If anyone here has any ideas about the proper way to work around this problem? I'm looking into options but they're all a bit hacky. Or would this be something better handled by the same tools that maintain the recovery partition?

@mmstick
Copy link
Member

mmstick commented May 31, 2022

Perhaps the package shouldn't be placing files in the EFI partition directly, but through a postinst / postrm script.

@wagnerck
Copy link
Author

That was in fact my thought. Right now they're being placed using a .install file:

memtest86plus/build64/memtest.efi /boot/efi/EFI/memtest/
src/memtest.conf /boot/efi/loader/entries/

If they're instead placed via a postinst script, and removed on uninstall/purge with a postrm script, dpkg won't try to do the symlink thing, right? My concern is that this does feel a bit hacky, but if this is the proper method, I'll look into doing it that way instead.

@mmstick
Copy link
Member

mmstick commented May 31, 2022

Yes there wouldn't be a symlink issue to worry about. All the packages that add files to the EFI partition (kernelstub, grub) make the EFI installation a separate part of the package installation. kernelstub gets run with initramfs updates which copies kernels over and generates configs. There could be an initramfs hook that installs the files if they're missing, if not done directly by the postinst.

@wagnerck
Copy link
Author

The package would install the files to somewhere in /usr/lib or the like as their official package locations, and then the scripts would copy them to the ESP? I think having them be part of the package scripts might be better than using initramfs hooks in this case, since errors in the hook could cause the process of creating the boot image to fail entirely. It would also allow users to install the package from the recovery partition; even though the recovery partition resets itself on restart, the files copied to the ESP would still be present.

This seems like a good way to move forward, and I'll try to find the least hacky way to get the postinst and postrm scripts to do this.

@wagnerck
Copy link
Author

I've got a prototype in place that installs, uninstalls, upgrades, etc: https://github.com/cwsystem76/pop-diagnostics

Rather than moving this repo to the pop-os github org, I think using my personal repo to experiment and then doing a new, clean repo with those experiments as a foundation would be better.

@wagnerck
Copy link
Author

wagnerck commented Jun 1, 2022

Summarizing some Slack discussion here:

  • Disagreement about whether the staging should occur in /usr/lib or /usr/share. I think that the FHS guidelines suggest it should be /usr/lib (or possible /usr/libexec), others think it should be /usr/share. This is a trivial thing to change at this point in the packaging; do you have any opinions on the subject, @mmstick ?
  • It was suggested that we have a user-accessible script in the staging area, something like memtest-utils.sh, that could be run by users to reinstall, remove, or otherwise adjust the files in the ESP. This script could then be called by the postinst and prerm package scripts to do the work instead. I think that this is a good idea and unless there are objections I'm going to add that to the next test version. This has been completed.
  • It was also suggested that the files be left in the EFI on a normal package uninstall, and only remove them when the package is purged. This seems reasonable, any objections?
  • There was talk about using kernelstub to install the files into the ESP. I don't think this is a good idea, as this would increase interdependence between packages, require adding new functionality to kernelstub, make it harder to add the files to the ESP when booted to recovery, etc. I believe we can pass on that idea.
  • Other folks in Support asked if it would be possible to add this as an option when booting from a USB stick prior to install. I'm not sure how difficult this would be, although right now the ISO boots on a UEFI system to a menu that only has one option; how do we add additional options?
  • There may be a way to make this work with GRUB systems but it would be significantly more complicated and I'm not sure if it's worth the effort, given that we haven't shipped a BIOS system in a while and it's a very small percentage of Pop!OS users on non-System76 hardware. I'm keeping my notes on this but not actively pursuing it.

@thomas-zimmerman
Copy link

Name of the update script would be memtest-update.sh

@wagnerck
Copy link
Author

wagnerck commented Jun 1, 2022

I was thinking memtest-utils.sh because it can also be used to remove it, or possibly even add launch parameters if we add that functionality.

@wagnerck
Copy link
Author

wagnerck commented Jun 2, 2022

The latest commit now has memtest-utils.sh, which is called by the package postinst and prerm scripts. An example command that a user can run in a terminal:

sudo /usr/lib/pop-diagnostics/memtest/memtest-util.sh reinstall

A future version of this script will add an "options" verb to let systemd-boot pass launch parameters to the EFI executable, like keyboard=legacy or nosm when necessary for troublesome hardware.

@wagnerck
Copy link
Author

wagnerck commented Jun 3, 2022

The "options" feature has been added to memtest-utils.sh and the readme and comments generally improved. For example:

sudo /usr/lib/pop-diagnostics/memtest/memtest-util.sh options keyboard=legacy

will write a custom configuration file that will not use the new USB keyboard drivers, and will instead rely on the UEFI firmware to initialize the keyboard in legacy mode.

@wagnerck
Copy link
Author

wagnerck commented Jul 11, 2022

The current Beta 2 build doesn't work with the latest Coreboot firmware on my oryp6, but the dev build from nightly source does. One of a few reasons to hold off on any public release until it's out of beta and we've done a lot more testing.

[edit] This problem has been resolved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants