Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove grub special-casing #8627

Merged
merged 1 commit into from Aug 18, 2020
Merged

Conversation

rlaager
Copy link
Member

@rlaager rlaager commented Apr 15, 2019

Motivation and Context

The GRUB restrictions are based around the pool's bootfs property. Given the current situation where GRUB is not staying current with OpenZFS pool features, having either a non-ZFS /boot or a separate pool with limited features are pretty much the only long-term answers for GRUB support. Only the second case matters in this context. For the restrictions to be useful, the bootfs property would have to be set on the boot pool, because that is where we need the restrictions, as that is the pool that GRUB reads from. The documentation for bootfs describes it as pointing to the root pool. That's also how it's used in the initramfs. ZFS does not allow setting bootfs to point to a dataset in another pool. (If it did, it'd be difficult-to-impossible to enforce these restrictions cross-pool). Accordingly, bootfs is pretty much useless for GRUB scenarios moving forward.

Even for users who have only one pool, the existing restrictions for GRUB are incomplete. They don't prevent you from enabling the unsupported checksums, for example. For that reason, I have ripped out all the GRUB restrictions.

A little longer-term, I think extending the proposed features=portable system to define a features=grub is a much more useful approach. The user could set that on the boot pool at creation, and things would Just Work.

Description

This removes the bootfs-based restrictions that are intended to address GRUB's feature limitation.

How Has This Been Tested?

I have done zero testing of this particular change.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

@rlaager rlaager mentioned this pull request Apr 15, 2019
12 tasks
@rlaager rlaager changed the title Remove grub references Remove grub special-casing Apr 15, 2019
@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Apr 15, 2019
@rlaager
Copy link
Member Author

rlaager commented Apr 17, 2019

This has been rebased on master after my changes in #8626 were merged. This eliminates the shared commit (which was already merged) and strips out the GRUB mentions from zpool-features.5.

@codecov
Copy link

codecov bot commented Apr 18, 2019

Codecov Report

Merging #8627 into master will increase coverage by 0.51%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #8627      +/-   ##
==========================================
+ Coverage   79.24%   79.75%   +0.51%     
==========================================
  Files         393      393              
  Lines      123859   123833      -26     
==========================================
+ Hits        98149    98766     +617     
+ Misses      25710    25067     -643     
Flag Coverage Δ
#kernel 80.11% <ø> (-0.13%) ⬇️
#user 66.54% <100.00%> (+2.19%) ⬆️
Impacted Files Coverage Δ
module/zfs/spa.c 87.62% <ø> (+0.20%) ⬆️
module/zfs/zfs_ioctl.c 86.42% <ø> (+0.02%) ⬆️
lib/libzfs/libzfs_pool.c 73.31% <100.00%> (-0.09%) ⬇️
module/zfs/aggsum.c 95.34% <0.00%> (-4.66%) ⬇️
module/os/linux/zfs/arc_os.c 59.67% <0.00%> (-3.23%) ⬇️
module/zfs/spa_log_spacemap.c 93.40% <0.00%> (-1.72%) ⬇️
module/zfs/zvol.c 84.06% <0.00%> (-0.34%) ⬇️
module/zfs/sa.c 84.19% <0.00%> (-0.33%) ⬇️
module/zfs/space_map.c 93.06% <0.00%> (-0.25%) ⬇️
module/os/linux/zfs/zfs_vnops.c 69.84% <0.00%> (-0.21%) ⬇️
... and 58 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fccaea4...2545268. Read the comment docs.

@rlaager
Copy link
Member Author

rlaager commented Apr 19, 2019

I rebased this on master and removed the mention about lz4.

@behlendorf
Copy link
Contributor

I've been giving this some thought and I agree we're going to need to reevaluate the grub restrictions. They made good sense when both grub and ZFS could be updated together (in OpenSolaris) but that's never been true for Linux. In fact, these days there's a decent chance the system is booting with UEFI and doesn't use grub at all. In which case, we have no boot restrictions.

@rlaager is there a compelling need to relax these restrictions right away? If not, I think we should consider tackling this with the proposed portable pools feature.

@rlaager
Copy link
Member Author

rlaager commented May 9, 2019

Since 0.8.0 is feature frozen, I think this should wait until after then, at least. I think we are in agreement on that. I might ask for an exception if this would otherwise have to wait for 0.9.0 rather than 0.8.1, but I’m not even sure how I feel about either part of that.

I think your concern might be that we are removing the working-but-incomplete GRUB restrictions (and maybe more importantly, the warnings in the documentation) without having a better replacement. If that is how you feel, we can wait until a features=grub is available.

Alternatively, we could rip out the code, since it is incomplete, but leave the restrictions in the docs. I think those are complete, from my previous PR.

@behlendorf
Copy link
Contributor

Yes, that's right. I'm not necessarily against removing these restrictions, but before doing so I'd like to have a clearer path forward. If we have a specific use case, relaying them is definitely still an option for 0.8.1.

One additional consideration is that after 0.8 we'll be bringing in the code to make it possible for FreeBSD to use this repository as its upstream. They have more control over their grub implementation and they may want to keep, or even extend, these restrictions,

@ryao
Copy link
Contributor

ryao commented Jun 4, 2020

They have more control over their grub implementation and they may want to keep, or even extend, these restrictions

FreeBSD has its own bootloader that isn't GRUB.

Someone poked me about reviewing this. I can't do it right now, but I imagine that this waiting a few more days (or possibly a week) won't hurt.

@mschilli87
Copy link
Contributor

mschilli87 commented Jun 15, 2020

Just to add to the poking: Yesterday I hit #8538 when setting up a new system without GRUB. I'll have to re-create the root pool with -O dnodesize=legacy to avoid creating a separate partition for a specific boot pool just to work around zfs not letting me boot from the pool because a legacy boot loader I don't even use doesn't support dnodesize=auto.
Given that legacy boot is (finally) slowly phased out, I suspect more and more people will be looking into GRUB-free boot (as there often is simply no need for it with UEFI). Combine that with an increase in root on zfs (due to popular distributions shipping it now), I think it would be really great to get this over a year old (relatively small) PR reviewed soon.
If there is anything I can do (other than nagging 😉) to help the process, let me know.
Personally, there is no rush as I anyways plan to re-create all datasets once #8538 #10278 is (finally) merged and made it to a stable release.
If by then I could take the extra step and re-create the pool with dnodesize=auto, I'd be a happy man. 😉

@Ornias1993
Copy link
Contributor

If there is anything I can do (other than nagging 😉) to help the process, let me know.
Personally, there is no rush as I anyways plan to re-create all datasets once #8538 #10278 is (finally) merged and made it to a stable release.
If by then I could take the extra step and re-create the pool with dnodesize=auto, I'd be a happy man. 😉

Be aware that zstd on a zfs bootpool requires some more work on grub before it would work (even after merging #10278 ) @BrainSlayer already prepared the required grub modifications in a grubfork.

@mschilli87
Copy link
Contributor

@Ornias1993: Like I said, I don't plan on using GRUB. I directly boot a CONFIG_EFI_STUB-enabled kernel placed on a FAT32 partition, the rest of the disk is in a second partition hosting the root pool.

@BrainSlayer
Copy link
Contributor

@Ornias1993 i dont think that my mods can handle the restrictions mentioned here. my mods are just a extension for the compression support. the rest remains. however the restrictions here mentioned state that grub doesnt support compression at all. thats not true. grub did support already all compression variants (except for my zstd mod which i made)

@Ornias1993
Copy link
Contributor

@Ornias1993 i dont think that my mods can handle the restrictions mentioned here. my mods are just a extension for the compression support. the rest remains. however the restrictions here mentioned state that grub doesnt support compression at all. thats not true. grub did support already all compression variants (except for my zstd mod which i made)

I wasn't talking about this PR @BrainSlayer It was a general statement about ZSTD and Grub compatiblity. I was NOT refering to this PR.

@BrainSlayer
Copy link
Contributor

@Ornias1993 okay

The GRUB restrictions are based around the pool's bootfs property.
Given the current situation where GRUB is not staying current with
OpenZFS pool features, having either a non-ZFS /boot or a separate
pool with limited features are pretty much the only long-term answers
for GRUB support.  Only the second case matters in this context.  For
the restrictions to be useful, the bootfs property would have to be set
on the boot pool, because that is where we need the restrictions, as
that is the pool that GRUB reads from. The documentation for bootfs
describes it as pointing to the root pool. That's also how it's used in
the initramfs. ZFS does not allow setting bootfs to point to a dataset
in another pool. (If it did, it'd be difficult-to-impossible to enforce
these restrictions cross-pool). Accordingly, bootfs is pretty much
useless for GRUB scenarios moving forward.

Even for users who have only one pool, the existing restrictions for
GRUB are incomplete. They don't prevent you from enabling the
unsupported checksums, for example. For that reason, I have ripped out
all the GRUB restrictions.

A little longer-term, I think extending the proposed features=portable
system to define a features=grub is a much more useful approach. The
user could set that on the boot pool at creation, and things would
Just Work.

Signed-off-by: Richard Laager <rlaager@wiktel.com>
@rlaager
Copy link
Member Author

rlaager commented Jun 16, 2020

I've rebased this on current master. It still looks correct to me, but I haven't actually tested this change in any way (even building it).

Copy link
Contributor

@pcd1193182 pcd1193182 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the general approach, but I think it would be good to replace the documentation notes we're removing with a general explanation that GRUB is falling increasingly behind in feature support, so GRUB boot pools need to be very careful about what features they're enabling. While this may be reasonably well known among the ZFS community, there are plenty of potential new users who don't know about these issues.

@rlaager
Copy link
Member Author

rlaager commented Jul 13, 2020

@pcd1193182 Do you have any specific language or ideas of where to document this?

Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rlaager unless you object I'd like to go ahead and merge this as is. We can follow up with improved documentation once we have a better idea what that should look like.

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Aug 18, 2020
@rlaager
Copy link
Member Author

rlaager commented Aug 18, 2020

@behlendorf I remain in favor of this.

@behlendorf behlendorf merged commit eaa25f1 into openzfs:master Aug 18, 2020
jsai20 pushed a commit to jsai20/zfs that referenced this pull request Mar 30, 2021
The GRUB restrictions are based around the pool's bootfs property.
Given the current situation where GRUB is not staying current with
OpenZFS pool features, having either a non-ZFS /boot or a separate
pool with limited features are pretty much the only long-term answers
for GRUB support.  Only the second case matters in this context.  For
the restrictions to be useful, the bootfs property would have to be set
on the boot pool, because that is where we need the restrictions, as
that is the pool that GRUB reads from. The documentation for bootfs
describes it as pointing to the root pool. That's also how it's used in
the initramfs. ZFS does not allow setting bootfs to point to a dataset
in another pool. (If it did, it'd be difficult-to-impossible to enforce
these restrictions cross-pool). Accordingly, bootfs is pretty much
useless for GRUB scenarios moving forward.

Even for users who have only one pool, the existing restrictions for
GRUB are incomplete. They don't prevent you from enabling the
unsupported checksums, for example. For that reason, I have ripped out
all the GRUB restrictions.

A little longer-term, I think extending the proposed features=portable
system to define a features=grub is a much more useful approach. The
user could set that on the boot pool at creation, and things would
Just Work.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes openzfs#8627
@enterit
Copy link

enterit commented Apr 14, 2021

@rlaager or anyone, could you please clarify the current state of bootfs restrictions and GRUB?
In the latest ZFS version, should it be possible to set bootfs on one pool pointing to a dataset in another pool?

You mentioned:

The documentation for bootfs describes it as pointing to the root pool. That's also how it's used in the initramfs. ZFS does not allow setting bootfs to point to a dataset in another pool. (If it did, it'd be difficult-to-impossible to enforce these restrictions cross-pool). Accordingly, bootfs is pretty much useless for GRUB scenarios moving forward.

This sounds like GRUB requires setting bootfs on the root pool. Please correct me if I am wrong.

I assumed your comment meant that the plan was to lift that limitation. I am using zfs-2.0.0-rc5, which seems to include the related commit eaa25f1a8e. In this version, I cannot set bootfs property on one pool referring to a dataset in another pool. I am getting an error:

# sudo zpool set bootfs=bpool/boot rpool
cannot set property for 'rpool': 'bpool/boot' is an invalid name

I am having issues getting GRUB working with two pools - boot pool bpool and root pool rpool. The boot pool only has features enabled that should be supported by GRUB, as specified in Debian Buster Root on ZFS / Disk Formatting. But, when I try to boot, GRUB2 seems to ignore the bootfs property of bpool, and tries to boot from rpool.
And, GRUB2 fails, presumably due to unsupported features enabled on rpool.
It prints error: compression algorithm 115 not supported and complains about other compression algorithms.

So I am trying to figure out if there is any way to force GRUB2 ignoring pools without bootfs property and successfully booting from the right pool in the two-pool setup.

@rlaager
Copy link
Member Author

rlaager commented Apr 14, 2021

This sounds like GRUB requires setting bootfs on the root pool. Please correct me if I am wrong.

It does not. Really, just pretend the bootfs property does not exist.

I am having issues getting GRUB working with two pools - boot pool bpool and root pool rpool. The boot pool only has features enabled that should be supported by GRUB, as specified in Debian Buster Root on ZFS / Disk Formatting. But, when I try to boot, GRUB2 seems to ignore the bootfs property of bpool, and tries to boot from rpool.

GRUB will boot the kernel and initrd from bpool. That will mount what is specified by root=ZFS=..., per step 5.3. Make sure you set that manually (because GRUB doesn't support all the features on rpool and thus will fail to detect it at update-grub time).

And, GRUB2 fails, presumably due to unsupported features enabled on rpool.
It prints error: compression algorithm 115 not supported and complains about other compression algorithms.

That compression error can almost certainly be ignored.

@enterit
Copy link

enterit commented Apr 15, 2021

I might be hijacking this thread to discuss my GRUB issues. I would appreciate any help. This may also be a pretty common scenario; perhaps other people facing the same issue may find this useful. If you would prefer me to move this comment to a new thread or issue, please let me know.

It is really hard to ignore compression errors, because then GRUB spits out more than 24 lines of errors in the console, prints --MORE--, and gets stuck waiting for a keypress. Which is fine for a laptop, but is pretty bad for a remote VM, where the user may not have access to graphics or serial console. So I am trying to figure out if there is a way to work around this problem, by either suppressing GRUB error messages, or by forcing it to ignore non-boot pools.

I tried multiple ways to suppress errors or configure serial terminal with an artificially large height, but it looks like GRUB just uses 24 lines by default, so I reached a dead end.

I guess my question is how GRUB makes a distinction between a boot pool and a non-boot pool.
I did not find a lot of documentation on that, so I am just following the code.

GRUB probes boot device on mkconfig:
https://github.com/rhboot/grub2/blob/a53e530f8ad3770c3b03c208c08ae4162f68e3b1/util/grub-mkconfig.in#L140-L141

GRUB_DEVICE_BOOT="`${grub_probe} --target=device /boot`"
GRUB_DEVICE_BOOT_UUID="`${grub_probe} --device ${GRUB_DEVICE_BOOT} --target=fs_uuid 2> /dev/null`" || true

GRUB_DEVICE_BOOT_UUID is not used anywhere.
GRUB_DEVICE_BOOT is used in
https://github.com/rhboot/grub2/blob/a53e530f8ad3770c3b03c208c08ae4162f68e3b1/util/grub.d/10_linux.in#L137

This code processes the boot partition correctly, and generates search call with correct boot_fs_uuid of the boot partition.
Code:

if [ x$dirname = x/ ]; then
...
else
  if [ -z "${prepare_boot_cache}" ]; then
    prepare_boot_cache="$(prepare_grub_to_access_device ${GRUB_DEVICE_BOOT} | grub_add_tab)"
  fi
  printf '%s\n' "${prepare_boot_cache}" | sed "s/^/$submenu_indentation/"
fi

Output:

insmod part_gpt
insmod zfs
if [ x$feature_platform_search_hint = xy ]; then
  search --no-floppy --fs-uuid --set=root  <boot_fs_uuid>
else
  search --no-floppy --fs-uuid --set=root <boot_fs_uuid>
fi

All this code is only used for the menu generation, and the error: compression algorithm %s not supported happens before the menu is displayed.

I also saw GRUB using bootfs in mkconfig scripts. Here, bootfs is actually a boot dataset in the root pool:
https://github.com/rhboot/grub2/blob/a53e530f8ad3770c3b03c208c08ae4162f68e3b1/util/grub.d/10_linux.in#L75

And the other comment implies that bootfs is a property of a root pool: https://github.com/rhboot/grub2/blob/a53e530f8ad3770c3b03c208c08ae4162f68e3b1/grub-core/fs/zfs/zfs.c#L3031

I was hoping that I could use bootfs to force GRUB to skip probing root pool on start, but the way it is used in 10_linux.in makes me thinking I misunderstand the purpose of this property.

@bghira
Copy link

bghira commented Apr 15, 2021

instead of fixing bootfs and making it work it's been bandied about like a hot potato no one wants to touch because it'll be misunderstood/misused. it has been proposed to remove the property entirely.

sempervictus pushed a commit to sempervictus/zfs that referenced this pull request May 31, 2021
The GRUB restrictions are based around the pool's bootfs property.
Given the current situation where GRUB is not staying current with
OpenZFS pool features, having either a non-ZFS /boot or a separate
pool with limited features are pretty much the only long-term answers
for GRUB support.  Only the second case matters in this context.  For
the restrictions to be useful, the bootfs property would have to be set
on the boot pool, because that is where we need the restrictions, as
that is the pool that GRUB reads from. The documentation for bootfs
describes it as pointing to the root pool. That's also how it's used in
the initramfs. ZFS does not allow setting bootfs to point to a dataset
in another pool. (If it did, it'd be difficult-to-impossible to enforce
these restrictions cross-pool). Accordingly, bootfs is pretty much
useless for GRUB scenarios moving forward.

Even for users who have only one pool, the existing restrictions for
GRUB are incomplete. They don't prevent you from enabling the
unsupported checksums, for example. For that reason, I have ripped out
all the GRUB restrictions.

A little longer-term, I think extending the proposed features=portable
system to define a features=grub is a much more useful approach. The
user could set that on the boot pool at creation, and things would
Just Work.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes openzfs#8627
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants