Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

qemu-vm: remove bootDisk, refactor using make-disk-image with better EFI support #203641

Closed

Conversation

RaitoBezarius
Copy link
Member

Description of changes

This PR does multiple things :

make-disk-image refactoring in QEMU test infrastructure

As per #23052 — QEMU test infrastructure do not use ad-hoc code to build the boot disk image but reuse make-disk-image which was improved to support use-cases.

Doing this consolidates make-disk-image as one place to do these changes, it seems like the boot disk production code was doing old stuff with GRUB 0.97, I'm uncertain whether that should be cleaned up.

UEFI firmware infrastructure

As part of #187887 ; this uses consolidated UEFI firmware and variables and expose their variables to end users.

It also adds SecureBoot and System Management Mode enforcement support in the test infrastructure and the make-disk-image function.

EFIVARS can be modified as part of the disk image build process, enabling people to build a disk image with specific authenticated UEFI variables, e.g. SecureBoot keys.

Documentation

This PR adds documentation on what are the breaking changes in regard to the disk layout (storeImage is used only if needed at all.)

TODO/notes:

  • Cleanup UEFI firmware infrastructure: done in
  • Cleanup the /dev/vdb2 stuff in QEMU code.
  • Some tests such as installer depends on bootDevice being vdb or second drive.
  • GRUB version 1 may have some subtleties.
  • Assert touchEFIVars only if partition table type is hybrid or efi.
  • Use a better error message for assert.
  • Refactor in make-disk-image the EFI firmware infrastructure using ovmf: expose EFI prefixes and refactor qemu-vm with it #187887.
  • Simple tests:
  • Legacy boot without any EFI stuff should still work.
  • EFI/Hybrid boot without touching EFI vars should still work.
  • EFI/Hybrid boot with touching EFI vars should still work AND we should be able to write/read in EFIVARS.
  • Reproducibility of system image, wrt to fixed GUIDs in partition table.
  • Documentation:
    • How to test this kind of changes? What to look for?
    • Where are the breaking changes? Everything that relies on /dev/vdb2 is now incorrect.
    • Compatibility layer? Deprecation notice?
    • Prevent users to shoot their foot: increase the number of assertions & warnings accordingly in the module
  • For a future PR
    • Over {systemd-boot, GRUB1, GRUB2} cartesian product for SecureBoot test.
    • makeEc2Test is useful to initialize using an image, it would be good to extract it as a lower level primitive to start test off an already existing disk image, maybe put it directly into makeTest machinery or makeTestFromImage

@RaitoBezarius
Copy link
Member Author

Opened due to GitHub merging the old PR #191665

@RaitoBezarius
Copy link
Member Author

With this, https://hydra.nixos.org/eval/1786610 ; this PR do not introduce any failing test beyond what there was on the base nixpkgs point of this PR.

Will fix the merge conflicts now.

Copy link
Contributor

@nikstur nikstur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very good change and I'm happy to see it. Most of my comments are just minor nitpicks.

nixos/modules/virtualisation/qemu-vm.nix Outdated Show resolved Hide resolved
nixos/modules/virtualisation/qemu-vm.nix Outdated Show resolved Hide resolved
nixos/modules/virtualisation/qemu-vm.nix Outdated Show resolved Hide resolved
nixos/modules/virtualisation/qemu-vm.nix Outdated Show resolved Hide resolved
@RaitoBezarius RaitoBezarius force-pushed the simplified-qemu-boot-disks branch 2 times, most recently from 07da111 to a993ab2 Compare December 4, 2022 15:10
@RaitoBezarius RaitoBezarius marked this pull request as ready for review December 4, 2022 15:12
@samueldr
Copy link
Member

samueldr commented Dec 5, 2022

One overall note, still reading this whopper of a PR: I think the lack of separation of concerns by shoving filesystem creation duties with disk image creation duties in the same expression/derivation is likely to cause issues overall. And we're seeing even less separation of concern by addition of irrelevant machine state (EFI vars) within this, I don't really like this.

@RaitoBezarius
Copy link
Member Author

One overall note, still reading this whopper of a PR: I think the lack of separation of concerns by shoving filesystem creation duties with disk image creation duties in the same expression/derivation is likely to cause issues overall. And we're seeing even less separation of concern by addition of irrelevant machine state (EFI vars) within this, I don't really like this.

I agree with you, but this PR has only a small ambition (which took already a lot of time to do "well"), that is, to remove the ad-hoc disk image creation that existed before and put the knobs into that super function (which is bad, I know.)

Once it lands, we can:

  • split disk image creation and filesystem creation (with machine state), to expose a lower level function
  • split filesystem creation and generic machine state recording during VM operations

IMHO, it's easier to deal with it in this way also because I have a use case (SecureBoot) which can nourish the adequate refactors, some of them were already mentioned in the N previous PRs (that were closed due to my mistakes), see also #187855 which is an interesting symptom exposing ad-hoc knobs to test images, but no generic tooling to do this across the ecosystem.

@samueldr
Copy link
Member

samueldr commented Dec 5, 2022

Nevertheless, do you think it's futile spending time on making these images more deterministic?

Futile no. I think it's good to want to improve what is there.

I think my main thought is that this is the wrong model.

As a cleanup effort, a short-to-mid term effort, great! But building too large on top of this foundation is (in my opinion) not the way forward.

@roberth
Copy link
Member

roberth commented Dec 8, 2022

the lack of separation of concerns by shoving filesystem creation duties with disk image creation duties in the same expression/derivation is likely to cause issues overall.

There's a technical argument to be made for not producing a lot of garbage in the store. Such images fill up the store quickly, requiring more frequent garbage collection, reducing the utility of the local store as a cache. This gets even worse for CIs, which push build dependencies, including such images, to a persistent cache.

@samueldr
Copy link
Member

samueldr commented Dec 8, 2022

There's a technical argument to be made for not producing a lot of garbage in the store. Such images fill up the store quickly, requiring more frequent garbage collection, reducing the utility of the local store as a cache. This gets even worse for CIs, which push build dependencies, including such images, to a persistent cache.

You're right. I have been slowly working on a scheme to make the concerns separate, but coalesce in a single derivation. So this should be a non-issue in the end.

In other words, the user interface separates concerns, the implementation builds on discrete blocks, but they all (can) assemble in a single derivation.

This is anyway a requirement for it to properly work with the org's hydra and the ARM sd image, as the discrete components are too big to be cached. And as you said, it causes a lot of unneeded churn.

@RaitoBezarius
Copy link
Member Author

Ready for another round of review.

RaitoBezarius and others added 11 commits December 18, 2022 00:11
`bootDisk` was generated manually in a runVM command, this is now
refactored using `make-disk-image` facility, enabling:

- a better name: `systemImage` rather than `bootDisk`, i.e. bootDisk
  contained more than only the boot files in fact
- support for arbitrary additional paths & contents with permissions &
  ownership, not exposed yet through public API though
- UEFI firmware manipulation at make-disk-image time, e.g. SecureBoot
  can be tested
- EFI firmware usage is simplified
- bootDisk / rootDisk are clearly separated to support the case when we
  do not want the default filesystem, but still want a bootloader, or do
  not want bootloader but still want the default root filesystem, etc.
- no multiple disk when it is not needed
- OVMF firmware can be compiled with SMM support ;
- make-disk-image can enforce SMM ;
- QEMU test infrastructure can enforce SMM ;
- disable by default because it seems broken on OVMF
@@ -17,7 +17,7 @@
# either "efi" or "hybrid"
# This will be undersized slightly, as this is actually the offset of
# the end of the partition. Generally it will be 1MiB smaller.
bootSize ? "256MiB"
bootSize ? "256M"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 2900974 you are not just changing documentation but also revert `

- bootSize ? "256MiB"
+ bootSize ? "256M"

This should be in 5380f26

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right

@@ -16,7 +16,7 @@ let
};
boot.loader.systemd-boot.enable = true;
boot.loader.efi.canTouchEfiVariables = true;
environment.systemPackages = [ pkgs.efibootmgr pkgs.sbsigntool pkgs.sbctl ];
environment.systemPackages = with pkgs; [ efibootmgr sbsigntool sbctl ];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also should not be in a commit that changes documentation but in your secureboot test commit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True

@@ -43,7 +43,7 @@ in {

sizeMB = mkOption {
type = with types; either (enum [ "auto" ]) int;
default = 2048;
default = 2252;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commit does not say why the disk size was bumped.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, the Amazon test was broken due to not enough disk space, I don't know if it was induced by me or in-between latest release and master. Will double check.

@@ -10,4 +10,5 @@
<xi:include href="images/ocitools.section.xml" />
<xi:include href="images/snaptools.section.xml" />
<xi:include href="images/portableservice.section.xml" />
<xi:include href="images/makediskimage.section.xml" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you change anything about the interface? Otherwise I would pull this out into its own PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I added parameters.

@@ -316,14 +316,34 @@ in
'';
};

virtualisation.bootPartition =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was the legacy boot broken before or broken by your changes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Broken by my changes

@RaitoBezarius
Copy link
Member Author

Closed in favor of #207039 #207038 #207043.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants