coreos-install creates ESP with FAT16 instead of the recommended FAT32 #2246

Closed
mortenlj opened this Issue Nov 11, 2017 · 10 comments

mortenlj commented Nov 11, 2017

Issue Report

Bug

Container Linux Version

coreos-install directly from the master branch; I don't know which version that corresponds to.

Environment

Bare-metal. ASUS laptop (yes, I know this is not the typical target system, but there should be no reason it shouldn't work).

Expected Behavior

The install-script creates an ESP, using FAT32, that will be recognized by the UEFI firmware and used for booting.

Actual Behavior

The FAT16 partition is not recognized as an ESP, and hence the system won't boot.

This is probably the ASUS UEFI implementation that is being more strict in what it accepts than others, but as far as I can tell, the UEFI spec does explicitly say that the ESP should use a variant of FAT32.

Reproduction Steps

  1. Install Container Linux using the coreos-install script

Other Information

https://en.wikipedia.org/wiki/Unified_Extensible_Firmware_Interface

> Such a setup is usually referred to as UEFI-GPT, while ESP is recommended to be at least 512 MiB in size and formatted with a FAT32 filesystem for maximum compatibility.
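
For anyone retracing this report, a minimal sketch of one way to see how an existing ESP was formatted. It reads the filesystem-type string that FAT boot sectors carry at byte offset 82 (FAT32) or 54 (FAT12/FAT16). The function name `fat_type_label` is hypothetical, and the string is informational only: the FAT type is formally determined by the cluster count, and firmware may use a different check, so treat the result as a hint rather than proof.

```shell
#!/bin/sh
# Print the filesystem-type string stored in a FAT boot sector.
# Assumption: $1 is a partition device or image file that begins with a
# FAT boot sector. The string is a hint, not an authoritative FAT type.
fat_type_label() {
    target=$1
    # FAT32 boot sectors keep the string at byte offset 82;
    # FAT12/FAT16 boot sectors keep it at byte offset 54.
    at82=$(dd if="$target" bs=1 skip=82 count=8 2>/dev/null | tr -d '\0')
    at54=$(dd if="$target" bs=1 skip=54 count=8 2>/dev/null | tr -d '\0')
    case "$at82" in
        FAT32*) echo FAT32; return 0 ;;
    esac
    case "$at54" in
        FAT16*) echo FAT16; return 0 ;;
        FAT12*) echo FAT12; return 0 ;;
    esac
    echo unknown
    return 1
}
```

Usage would be along the lines of `sudo sh -c '. ./fat_type.sh; fat_type_label /dev/sda1'`, where the script file name and `/dev/sda1` are placeholders for your own setup.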

bgilbert Nov 11, 2017

Member

Nice catch! If you copy the files out of the ESP, reformat the filesystem, and copy the files back, does your system boot? Be sure to label the filesystem EFI-SYSTEM, i.e.,

mkfs.vfat -F 32 -n EFI-SYSTEM /dev/sda1

coreos-install is really just downloading and writing a disk image, so the problem is in the formatting tool that runs during image creation. It seems we've always had this bug.
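
The copy-out, reformat, copy-back sequence described above could be sketched roughly as follows. This is an illustration, not a tested procedure: the function names, the mount point, and the backup directory are hypothetical, and the commands are only printed unless `RUN=1` is set, because `mkfs.vfat` destroys the existing filesystem.

```shell
#!/bin/sh
# Sketch of the backup / reformat / restore steps for an ESP.
# Assumption: commands are printed by default; set RUN=1 (as root) to
# execute them for real.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

reformat_esp() {
    esp_dev=$1   # ESP partition, e.g. /dev/sda1
    mnt=$2       # empty directory used as a mount point
    backup=$3    # empty directory that receives the ESP contents

    # Chain with && so a failed backup never reaches the mkfs step.
    run mount "$esp_dev" "$mnt" &&
    run cp -r "$mnt/." "$backup/" &&
    run umount "$mnt" &&
    run mkfs.vfat -F 32 -n EFI-SYSTEM "$esp_dev" &&
    run mount "$esp_dev" "$mnt" &&
    run cp -r "$backup/." "$mnt/" &&
    run umount "$mnt"
}
```

Calling `reformat_esp /dev/sda1 /mnt/esp /tmp/esp-backup` without `RUN=1` prints the seven commands so they can be reviewed first.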

mortenlj Nov 13, 2017

I tried copying the files out and reformatting, but for some reason I get an error about invalid GPT signature when I try booting afterwards. I had to manually add the boot entry, for reasons unknown, so I might have done something wrong there.

I booted using EFI\boot\bootx64.efi, which gave me a GRUB with three options, coreos default, usr-a and usr-b. default and usr-a gave me the invalid GPT signature error, usr-b gave me an error about vmlinux-b not found (or something similar).

> coreos-install is really just downloading and writing a disk image, so the problem is in the formatting tool that runs during image creation. It seems we've always had this bug.

Would it just be a matter of adding -F 32 to the command you linked to, or is that function used for other things too so it needs to be configurable?

bgilbert Nov 13, 2017

Member

> I tried copying the files out and reformatting, but for some reason I get an error about invalid GPT signature when I try booting afterwards. I had to manually add the boot entry, for reasons unknown, so I might have done something wrong there.

Container Linux never installs a boot entry; it always boots on EFI via Default Boot Behavior. Installing a boot entry by hand should work, or else deleting all boot entries.
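
As an illustration of installing a boot entry by hand, a hedged sketch using `efibootmgr` is shown here. The function name `add_boot_entry`, the disk, the partition number, and the label are assumptions; the loader path is the `EFI\boot\bootx64.efi` fallback loader mentioned earlier in this thread. The command is only printed unless `RUN=1` is set.

```shell
#!/bin/sh
# Print (or, with RUN=1, execute) an efibootmgr call that registers the
# fallback loader on the ESP as an explicit boot entry.
# Assumption: disk, partition number and label are supplied by the caller.
add_boot_entry() {
    disk=$1    # whole-disk device, e.g. /dev/sda
    part=$2    # partition number of the ESP, e.g. 1
    label=$3   # name shown in the firmware boot menu
    set -- efibootmgr --create --disk "$disk" --part "$part" \
        --label "$label" --loader '\EFI\boot\bootx64.efi'
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        # printf rather than echo: some shells expand backslashes in echo.
        printf '%s\n' "$*"
    fi
}
```

For example, `add_boot_entry /dev/sda 1 CoreOS` prints the command for review. The printed form drops shell quoting, so copy it with care if the label contains spaces.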

Did the switch to FAT32 at least allow you to make progress, or is it possible that the missing boot entry was the original problem as well?

What was the exact text of the invalid signature message?

> I booted using EFI\boot\bootx64.efi, which gave me a GRUB with three options, coreos default, usr-a and usr-b. default and usr-a gave me the invalid GPT signature error, usr-b gave me an error about vmlinux-b not found (or something similar).

On a freshly-installed image, USR-B won't work, and USR-A and default are equivalent except for some detection steps.

> Would it just be a matter of adding -F 32 to the command you linked to, or is that function used for other things too so it needs to be configurable?

I have a two-line patch that adds -F 32 only for partitions whose type GUID corresponds to the EFI system partition. We don't use VFAT for anything else, but the code might as well handle all cases.
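
The idea of adding `-F 32` only for EFI system partitions can be illustrated with a small shell sketch. This is not the actual patch, which is not shown in this thread; the function name `vfat_opts_for_type` is hypothetical. The GUID is the standard GPT partition type for the EFI system partition.

```shell
#!/bin/sh
# GPT partition type GUID of the EFI system partition, in lower case.
ESP_TYPE_GUID=c12a7328-f81f-11d2-ba4b-00a0c93ec93b

# Print extra mkfs.vfat options for a partition with the given type GUID.
# Prints "-F 32" for an ESP and nothing for any other type, so other
# VFAT partitions keep mkfs.vfat's automatic FAT size selection.
vfat_opts_for_type() {
    # Normalise to lower case so the comparison is case-insensitive.
    guid=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    if [ "$guid" = "$ESP_TYPE_GUID" ]; then
        echo "-F 32"
    fi
}
```

A caller could then run something like `mkfs.vfat $(vfat_opts_for_type "$type_guid") -n "$label" "$device"`, where the three variables are placeholders for values taken from the disk layout.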

mortenlj Nov 13, 2017

> Container Linux never installs a boot entry; it always boots on EFI via Default Boot Behavior. Installing a boot entry by hand should work, or else deleting all boot entries.
>
> Did the switch to FAT32 at least allow you to make progress, or is it possible that the missing boot entry was the original problem as well?

Yeah, switching to FAT32 got the partition detected by the firmware, so it was definite progress. There were no boot entries defined, so it should have used the default boot behavior, but for some reason that didn't happen. Adding the equivalent boot entry worked.

> What was the exact text of the invalid signature message?

I will try to get time for another attempt this evening, and grab a picture of it so I can get the exact text.

> On a freshly-installed image, USR-B won't work, and USR-A and default are equivalent except for some detection steps.

Ok, I suspected as much. This is to be expected then, so the remaining issue is the invalid GPT signature problem. I will get back to you with details about that when I get a chance to look at it.

mortenlj Nov 13, 2017

> What was the exact text of the invalid signature message?

Booting `CoreOS default'

error: invalid GPT signature.

Reading or updating the GPT failed!
Please file a bug with any messages above to CoreOS:
 https://issues.coreos.com

Aborted. Press enter to exit GRUB.

I pressed enter.

error: can't find command `exit'.
error: file `/coreos/vmlinuz-b' not found.

Press any key to continue...

I pressed a key...


   Failed to boot both default and fallback entries.

Press any key to continue...

Pressing any key here returns to the GRUB menu, where default, usr-a and usr-b are the options to select from.

I don't know if this is a consequence of the reformatting of the partition, or a separate issue.
I'm out travelling for a few days now, but I can continue troubleshooting this when I get back. I might also just try to run the install-script again, after your fix has been merged.
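
One quick sanity check for the on-disk GPT, sketched under assumptions: the primary GPT header sits at LBA 1 and begins with the 8-byte signature `EFI PART`. The function name `has_gpt_signature` is hypothetical, and 512-byte logical sectors are assumed unless a different size is passed. It only inspects the primary header's signature, not its CRC or the backup header, so a passing result does not rule out whatever GRUB is objecting to.

```shell
#!/bin/sh
# Succeed if the primary GPT header of a disk or image starts with the
# "EFI PART" signature.
# Assumption: $1 is the whole-disk device or image (not a partition);
# $2 is the logical sector size in bytes and defaults to 512.
has_gpt_signature() {
    disk=$1
    sector=${2:-512}
    # The primary header lives at LBA 1, i.e. one sector into the disk.
    sig=$(dd if="$disk" bs=1 skip="$sector" count=8 2>/dev/null | tr -d '\0')
    [ "$sig" = "EFI PART" ]
}
```

For example, `has_gpt_signature /dev/sda && echo "signature present"`, run as root, with `/dev/sda` standing in for the installed disk.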

bgilbert Nov 14, 2017

Member

error: can't find command 'exit' shouldn't happen. It's not directly related to the problem, but it's odd. The subsequent behavior is a result of that error.

invalid GPT signature is coming from GRUB code. It's not immediately clear why, but I agree that testing a fixed image is the next step. When you get a chance, please try this test image:

wget 'https://users.developer.core-os.net/bgilbert/boards/amd64-usr/1590.0.0%2B2017-11-13-1552/coreos_production_image.bin.bz2'
sudo coreos-install -f coreos_production_image.bin.bz2 [...]

mortenlj Nov 17, 2017

I've tested your image now, and that worked like a charm.

bgilbert Nov 18, 2017

Member

Great. The fix should be included in the next alpha (1618.0.0), due in a couple weeks. Thanks again for reporting!

@bgilbert bgilbert closed this Nov 18, 2017

ajeddeloh commented Dec 18, 2017

Reopening since we are reverting this due to a grub bug

@ajeddeloh ajeddeloh reopened this Dec 18, 2017

bgilbert Dec 18, 2017

Member

When this is fixed, the partitioning docs should be updated to say FAT32, since VFAT is not a thing.
