USB boot fails if the GPT contains no basic data or EFI partitions #130

Closed
ghost opened this issue May 21, 2020 · 18 comments

@ghost

ghost commented May 21, 2020

Describe the bug
With a Startech 3.5-inch HDD enclosure (S3510BMU33) attached to a Pi 4, I cannot get it to boot from USB MSD. If one or more of these enclosures is attached at boot time, boot appears to stop when the bootloader tries to talk to this device. I have two of these enclosures, each with a different brand of hard disk in it, and both show the same problem.

To Reproduce
Steps to reproduce the behavior:

  1. Configure Pi 4 for USB MSD boot per the instructions at https://www.raspberrypi.org/forums/viewtopic.php?f=63&t=274595#p1663644, leaving the SD card slot empty, and BOOT_ORDER=0xf41.

  2. Attach at least one Startech S3510BMU33 hard disk enclosure to the Pi 4 and power it up using its own PSU.

  3. Attach HDMI screen to the Pi 4 so you can see what is happening.

  4. Apply power to Pi 4.

  5. Observe bootloader output on HDMI screen.

Expected behaviour
Pi 4 should try SD card boot, fail, try each USB device in turn looking for a bootable device, fail to find one, and then loop back to SD card boot again.
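
As far as I understand it, the bootloader reads BOOT_ORDER one hex digit at a time, starting from the least significant digit, which matches the sequence expected above for 0xf41 (SD card, then USB MSD, then start over). A minimal sketch of that decoding follows. The function name and the mode labels are my own, and only the three digits used in this report are mapped:

```python
# Illustrative only: labels cover just the digits used in this report (0xf41).
BOOT_MODES = {0x1: "SD card", 0x4: "USB MSD", 0xF: "restart loop"}


def decode_boot_order(value):
    """Return the boot modes in the order they are tried (lowest hex digit first)."""
    modes = []
    while value:
        nibble = value & 0xF
        modes.append(BOOT_MODES.get(nibble, f"unmapped (0x{nibble:x})"))
        value >>= 4
    return modes


print(decode_boot_order(0xF41))
```

Any digit not in the table is reported as unmapped rather than guessed at.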

Screenshots
P1040683
Pi 4 gets stuck at the point shown in the screenshot, although it is still sending out trace packets over the network so it does appear to still be running OK, not hung.

Bootloader version and configuration

Bootloader configuration:
[all]
BOOT_UART=0
WAKE_ON_GPIO=1
POWER_OFF_ON_HALT=0
DHCP_TIMEOUT=45000
DHCP_REQ_TIMEOUT=4000
TFTP_FILE_TIMEOUT=30000
BOOT_ORDER=0xf41
SD_BOOT_MAX_RETRIES=1
USB_MSD_BOOT_MAX_RETRIES=1
ENABLE_SELF_UPDATE=1
DISABLE_HDMI=0
XHCI_DEBUG=0x23
NETCONSOLE=6665@169.254.1.1/eth0,6666@/

Bootloader version:
May 15 2020 11:05:52
version 23a9f59b85f5a81bb2eec455e064ef9905216322 (release)
timestamp 1589537152

USB boot (please complete the following information):
sudo lsusb -vvv output:
lsusb-vvv.txt.gz

Netconsole boot trace:
Startech-S3510BMU33-hdd-enclosure-w-hdd.pcapng.gz

Additional context
Problem occurs whether there is a usable USB MSD boot device attached or not (i.e. one that has been prepared and verified to boot the Pi 4 successfully). Problem does not occur with SD card boot, since the boot order is set to do SD card boot first. I would expect it to fail if it was the other way around, since the whole boot process stops, although the trace shows that the bootloader is still running and sending trace packets out over the network.

The network trace is done with the enclosure connected to a USB 3 hub. I have tried it with a USB 2.0 hub and it shows the same problem.

I have attempted to activate XHCI_DEBUG bits 5, 1 and 0, using the setting 0x23. I may have got that wrong.
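
As a quick check of the arithmetic only (what each bit actually enables is not covered here), setting bits 5, 1 and 0 does give 0x23. The helper names below are my own:

```python
def bitmask(*bits):
    """Combine bit positions into a single integer mask."""
    mask = 0
    for bit in bits:
        mask |= 1 << bit
    return mask


def bits_set(mask):
    """List the bit positions that are set in a mask, lowest first."""
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]


print(hex(bitmask(5, 1, 0)))
print(bits_set(0x23))
```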

@timg236
Collaborator

timg236 commented May 21, 2020

Excellent, first NET_CONSOLE report. I think this is the gold standard in bug reports.
Selecting "Analyze -> Follow -> UDP streams" shows that the bootloader is able to read from the disk and is finding a GPT partition table although possibly not finding an EFI or basic data partition because it would have stopped at that point.

If you can dump the partition table via gdisk that would be useful.

A slightly better XHCI_DEBUG would be 0x1f to show everything except the TRBs but 0x7F for everything would also be fine.

It looks as though it didn't find what it thought was a usable partition and carried on looking for the next disk. A HDMI diagnostic would probably help here

GPT 448904efc4fa451089061e1f33446f5f 000000001 074706daf num-partitions 128 entry-size 128
GPT partition: 0 c38c896ad21db21199a6080020736631 000000800..0747027ff
GPT partition: 1 00000000000000000000000000000000 000000000..000000000
GPT partition: 2 00000000000000000000000000000000 000000000..000000000
GPT partition: 3 00000000000000000000000000000000 000000000..000000000
GPT partition: 4 00000000000000000000000000000000 000000000..000000000
GPT partition: 5 00000000000000000000000000000000 000000000..000000000
GPT partition: 6 00000000000000000000000000000000 000000000..000000000
GPT partition: 7 00000000000000000000000000000000 000000000..000000000
GPT partition: 8 3b5a946ad21db21199a6080020736631 074702800..0747067ff
GPT partition: 9 00000000000000000000000000000000 000000000..000000000
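
The partition type GUIDs in the trace above can be decoded with a short sketch like the one below. It assumes the bootloader prints the 16 type-GUID bytes in on-disk order (GPT stores the first three fields little-endian), which is my reading of the log rather than something confirmed here. The two constants are the standard EFI System Partition and Microsoft basic data type GUIDs; the function names are my own:

```python
import uuid

# Well-known GPT partition type GUIDs (the two types the bootloader looks for).
EFI_SYSTEM = uuid.UUID("c12a7328-f81f-11d2-ba4b-00a0c93ec93b")
BASIC_DATA = uuid.UUID("ebd0a0a2-b9e5-4433-87c0-68b6b72699c7")


def type_guid_from_log(hex_bytes):
    """Decode 32 hex digits, assumed to be the raw on-disk (mixed-endian) GUID."""
    return uuid.UUID(bytes_le=bytes.fromhex(hex_bytes))


def looks_bootable(hex_bytes):
    """True if the type GUID is an EFI System or basic data partition."""
    return type_guid_from_log(hex_bytes) in (EFI_SYSTEM, BASIC_DATA)


# The two non-empty entries from the trace above.
for index, raw in [(0, "c38c896ad21db21199a6080020736631"),
                   (8, "3b5a946ad21db21199a6080020736631")]:
    print(index, type_guid_from_log(raw), looks_bootable(raw))
```

Under that assumption neither entry matches an EFI or basic data type, which is consistent with the bootloader not finding a usable partition on this disk.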

@ghost
Author

ghost commented May 21, 2020

I see. So the boot failure is due to my unusual partitioning, not the bridge chip. That's actually a relief, since it is meant to be one of the better ones.

The disks are part of a mirrored OpenZFS pool, which is probably not a common setup on the Pi. They were created by telling ZFS to create a mirrored pool using the entire disks, leaving it up to the zpool create command to partition the disks, which I'm guessing is the reason for the odd partitioning.

@ghost ghost changed the title from "Startech S3510BMU33 HDD enclosure blocks USB MSD boot" to "Whole disk partition of OpenZFS pool blocks USB MSD boot" May 21, 2020
@timg236
Collaborator

timg236 commented May 21, 2020

The bootloader and start.elf only know how to read FAT16/FAT32. If you can create a small FAT partition within the GPT to store those files then it should boot ok, unless the boot partition is on a different drive?

@ghost
Author

ghost commented May 21, 2020

The bootloader and start.elf only know how to read FAT16/FAT32. If you can create a small FAT partition within the GPT to store those files then it should boot ok, unless the boot partition is on a different drive?

The boot and root partitions are on a different drive, which I left disconnected when I did that trace, as the problem occurs whether the boot&root drive is attached or not. The problem is the bootloader getting stuck and not progressing past examining the drive formatted using the zpool create command. Linux seems quite happy with the drive - it is just the bootloader that is having trouble.

I've confirmed the same behaviour with a drive from another OpenZFS pool in a different model of enclosure directly attached to the Pi 4.

@lurch
Contributor

lurch commented May 21, 2020

@andrum99 Not directly-related to this bug report, but does that setup boot okay from USB MSD on a Pi3 B+?

@ghost
Author

ghost commented May 21, 2020

Another disk from an OpenZFS pool in a different model of enclosure with a different bridge:

pi@pi4b2sd:~ $ sudo blkid
/dev/mmcblk0p1: LABEL_FATBOOT="boot" LABEL="boot" UUID="4BBD-D3E7" TYPE="vfat" PARTUUID="738a4d67-01"
/dev/mmcblk0p2: LABEL="rootfs" UUID="45e99191-771b-4e12-a526-0779148892cb" TYPE="ext4" PARTUUID="738a4d67-02"
/dev/sda1: LABEL="zarchive" UUID="4136594926126255193" UUID_SUB="7529765489205740072" TYPE="zfs_member" PARTLABEL="zfs-7040622454dc10fa" PARTUUID="439905e7-ecc2-7b4e-befe-e7b5323075f8"
/dev/mmcblk0: PTUUID="738a4d67" PTTYPE="dos"
/dev/sda9: PARTUUID="519bfa4d-1f57-c448-a6d8-e65cb1685e7f"

pi@pi4b2sd:~ $ sudo gdisk -l /dev/sda
GPT fdisk (gdisk) version 1.0.3

Partition table scan:
MBR: protective
BSD: not present
APM: not present
GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sda: 1250263728 sectors, 596.2 GiB
Model: MK6459GSXP
Sector size (logical/physical): 512/4096 bytes
Disk identifier (GUID): E0880D71-DCBA-0845-A4A8-8AD20C213F0A
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 33
First usable sector is 34, last usable sector is 1250263694
Partitions will be aligned on 2048-sector boundaries
Total free space is 2669 sectors (1.3 MiB)

Number  Start (sector)    End (sector)  Size        Code  Name
   1            2048      1250246655    596.2 GiB   BF01  zfs-7040622454dc10fa
   9      1250246656      1250263039    8.0 MiB     BF07
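
The figures in the gdisk listing can be cross-checked with a few lines of arithmetic. This is only a sanity check of the numbers printed above, using the 512-byte logical sector size gdisk reports; the helper name is my own:

```python
SECTOR_BYTES = 512  # logical sector size reported by gdisk above


def sectors(start, end):
    """Number of sectors in an inclusive start..end range."""
    return end - start + 1


zfs_data = sectors(2048, 1250246655)        # partition 1 (BF01)
reserved = sectors(1250246656, 1250263039)  # partition 9 (BF07)
# Free space: the gap before partition 1 plus the gap after partition 9,
# both inside the usable range 34..1250263694.
free = sectors(34, 2047) + sectors(1250263040, 1250263694)

print(round(zfs_data * SECTOR_BYTES / 2**30, 1), "GiB")
print(reserved * SECTOR_BYTES / 2**20, "MiB")
print(free, "sectors free")
```

The two partitions fill the disk apart from the alignment gaps, so there is no room for a FAT partition without repartitioning.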

XHCI_DEBUG=0x1f netconsole log:

OpenZFS-whole-disk-pool.pcapng.gz

Screenshot:

P1040684

@timg236 - when you say:

A HDMI diagnostic would probably help here

How do I get the Pi to do that?

@ghost
Author

ghost commented May 21, 2020

@andrum99 Not directly-related to this bug report, but does that setup boot okay from USB MSD on a Pi3 B+?

I have no idea - I stopped using USB MSD boot on the Pi 3 due to finding it easier to use SD card boot, and also SD boot and root seemed to be faster than with the hard disks I was using. Do you want me to try that? The only Pi 3B+ I have is running my DNS and DHCP server (Pi-hole) so it's a bit annoying to fiddle with, but I can if required.

(I don't have SSDs to connect to the Pi - just some old 2.5" laptop hard disks, and the pair of 3.5" 1TB drives I previously used as direct attached backup drives on my laptop (which now make up an OpenZFS pool on a Pi 4B 4GB running Ubuntu). I've moved the primary backup of my laptop to the cloud, as it is easier to do off-site backups, and I now have a 10Mbps upstream broadband connection, which makes it vaguely viable).

@timg236
Collaborator

timg236 commented May 21, 2020

@timg236 - when you say:

A HDMI diagnostic would probably help here

How do I get the Pi to do that?

You can't, but once I've figured out what's going on an indicator to say no valid partitions on this disk would probably be a good idea. I think the issue is that there's a problem exiting from the scanning GPT partitions state if there are no valid EFI partitions.

Please could you confirm that if you just connect the boot drive then it finds and loads start.elf?

@timg236
Collaborator

timg236 commented May 21, 2020

@andrum99 Not directly-related to this bug report, but does that setup boot okay from USB MSD on a Pi3 B+?

I have no idea - I stopped using USB MSD boot on the Pi 3 due to finding it easier to use SD card boot, and also SD boot and root seemed to be faster than with the hard disks I was using. Do you want me to try that? The only Pi 3B+ I have is running my DNS and DHCP server (Pi-hole) so it's a bit annoying to fiddle with, but I can if required.

(I don't have SSDs to connect to the Pi - just some old 2.5" laptop hard disks, and the pair of 3.5" 1TB drives I previously used as direct attached backup drives on my laptop (which now make up an OpenZFS pool on a Pi 4B 4GB running Ubuntu). I've moved the primary backup of my laptop to the cloud, as it is easier to do off-site backups, and I now have a 10Mbps upstream broadband connection, which makes it vaguely viable).

@timg236 timg236 closed this as completed May 21, 2020
@timg236 timg236 reopened this May 21, 2020
@timg236
Collaborator

timg236 commented May 21, 2020

Ooops, pressed the wrong button!

GPT has never been supported in start.elf until now, so I'm not too worried about Pi3 GPT. I tested that a USB pen-drive with an MBR primary partition will boot on both a Pi3B+ and a Pi4.

@ghost
Author

ghost commented May 21, 2020

@timg236 - when you say:

A HDMI diagnostic would probably help here

How do I get the Pi to do that?

You can't, but once I've figured out what's going on an indicator to say no valid partitions on this disk would probably be a good idea. I think the issue is that there's a problem exiting from the scanning GPT partitions state if there are no valid EFI partitions.

Ah, I see. I've very little clue about GPT / EFI, beyond it being the successor to MBR (and Sun's VTOC).

Please could you confirm that if you just connect the boot drive then it finds and loads start.elf?

Yes - just tried it and it boots fine with just the boot drive attached (with or without USB mouse and keyboard).

(The working boot drive in this case happens to be in the same model of enclosure with the same bridge chip - 174c:1153 = ASM1153E. As you identified, the bridge chip seems to be irrelevant - it's the partitioning on the disk that seems to be the problem).

@lurch
Contributor

lurch commented May 21, 2020

Do you want me to try that? The only Pi 3B+ I have is running ...

I was merely curious. Please don't bother unless Tim asks you to :)

@ghost
Author

ghost commented May 21, 2020

GPT has never been supported in start.elf until now, so I'm not too worried about Pi3 GPT. I tested that a USB pen-drive with an MBR primary partition will boot on both a Pi3B+ and a Pi4.

The working USB MSD boot (and root) disk is MBR only - it's created by the SD Card Copier in Raspbian, based on a working SD card. (Verified with gdisk).

@timg236
Collaborator

timg236 commented May 21, 2020

I think I see the issue (from code inspection). This should fix it, or possibly complain then explode but it will indicate if my hypothesis is vaguely correct

I don't have a GPT setup and need to head off now, thanks for the detailed information

pieeprom-gpt-fix.zip

@ghost
Author

ghost commented May 21, 2020

I think I see the issue (from code inspection). This should fix it, or possibly complain then explode but it will indicate if my hypothesis is vaguely correct

I don't have a GPT setup and need to head off now, thanks for the detailed information

pieeprom-gpt-fix.zip

That bootloader firmware has fixed it. Unmodified (i.e. without changing the embedded config) I get:

GPT: no bootable partitions

then after a short delay:

USB MSD timed out after 10000 seconds

Which appears to actually be 10000 milliseconds. The bootloader then correctly continues to attempt booting as intended (in this case 0xf41). I can hotplug an SD card or the bootable hard disk and it will boot from either. Powering up or rebooting with the problem hard disk, and a bootable hard disk and/or bootable SD card also works correctly.

Many thanks 😁
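
For anyone following along, the behaviour I see with the fixed firmware can be modelled roughly as below. This is not the bootloader's code, just a sketch of the observed control flow: a disk with no bootable partition is reported and skipped instead of stalling the scan. The type labels, disk names and function names are all made up for illustration:

```python
# Hypothetical labels standing in for GPT partition type GUIDs.
BOOTABLE_TYPES = {"efi-system", "basic-data"}


def pick_boot_partition(partition_types):
    """Return the index of the first bootable partition, or None if there is none."""
    for index, ptype in enumerate(partition_types):
        if ptype in BOOTABLE_TYPES:
            return index
    return None


def scan_disks(disks):
    """Scan (name, partition_types) pairs in order; return (name, index) or None."""
    for name, partition_types in disks:
        index = pick_boot_partition(partition_types)
        if index is None:
            # Report and move on to the next disk rather than waiting on this one.
            print(f"{name}: GPT: no bootable partitions")
            continue
        return name, index
    return None  # caller falls through to the next entry in the boot order


zfs_disk = ("zfs-disk", ["solaris-zfs", "solaris-reserved"])
boot_disk = ("boot-disk", ["basic-data"])
print(scan_disks([zfs_disk, boot_disk]))
```

With only the ZFS disk attached the scan returns nothing, which corresponds to the bootloader timing out on USB MSD and carrying on with the rest of the boot order.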

@ghost
Author

ghost commented May 21, 2020

To check this wasn't due to a problem with the particular setup I had on my NAS when I created these OpenZFS pools, I checked the behaviour of the 2020-05-15 bootloader (i.e. USB MSD beta) with a whole disk OpenZFS pool on a USB stick which was created in Ubuntu 20.04 amd64 on the Pi 4. (My other pools were created on Raspbian with the 64-bit Raspbian kernel, building ZFS as an out of tree module and some other tweaking per https://github.com/andrum99/zfs-for-pi).

(openzfs/zfs#489 suggested that the particular setup I had when I created the pools, which I don't have running now, may have been the cause of the problem).

The behaviour is identical with the pool created on Ubuntu 20.04 amd64 on Pi 4 - the 2020-05-15 bootloader gets stuck on that LUN. With the fixed bootloader from the above message (2020-05-21), the bootloader works as expected, as with the other pools.

@timg236 timg236 changed the title from "Whole disk partition of OpenZFS pool blocks USB MSD boot" to "USB boot fails if the GPT contains no basic data or EFI partitions" May 26, 2020
timg236 added a commit that referenced this issue May 26, 2020
* Resolve: USB boot fails if the GPT contains no basic data or EFI partitions #130
* Resolve: Fix default BOOT_ORDER in mass storage beta #129
* Resolve: Add support for booting from a "superfloppy" disk #120
* Resolve: USB MSD timeout message - incorrect units #131
* Resolve: Recognize efi partition (0xef) as a valid boot #126
* The HDMI diagnostics screen now displays the most significant bytes of the SHA-256 of the config.txt file.
@timg236
Collaborator

timg236 commented May 26, 2020

This should be fixed by e87bc4d

The binaries are available in this release which can be flashed using the Raspberry Pi Imager
https://github.com/raspberrypi/rpi-eeprom/releases/tag/v2020.05.26-137ad

@ghost
Author

ghost commented May 26, 2020

This should be fixed by e87bc4d

The binaries are available in this release which can be flashed using the Raspberry Pi Imager
https://github.com/raspberrypi/rpi-eeprom/releases/tag/v2020.05.26-137ad

Confirmed - LGTM.
