Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression booting rpi3/4 with U-Boot on recent firmware #1619

Open
nullr0ute opened this issue Sep 1, 2021 · 46 comments
Open

Regression booting rpi3/4 with U-Boot on recent firmware #1619

nullr0ute opened this issue Sep 1, 2021 · 46 comments

Comments

@nullr0ute
Copy link

Describe the bug
Moving Fedora from firmware 7208c3d to 25e2b59 without any other changes has regressed booting with U-Boot on Fedora.

It stops the RPi3 from booting at all and the RPi4 boots but extremely slowly. The F-34 U-Boot for RPi-3 and RPi-4

To reproduce

U-Boot 2021.04 (Apr 21 2021 - 00:00:00 +0000)

DRAM:  992 MiB
RPI 3 Model B+ (0xa020d3)
MMC:   sdhci_setup_cfg: Hardware doesn't specify base clock frequency
sdhci_setup_cfg: Hardware doesn't specify base clock frequency
mmcnr@7e300000 - probe failed: -22
mmc@7e202000: 0sdhci_setup_cfg: Hardware doesn't specify base clock frequency

Loading Environment from FAT... fsm 1, hsts 00000080
read timeout error - HSTS 00000080
WARNING at drivers/mmc/bcm2835_sdhost.c:414/bcm2835_send_command()!
WARNING at drivers/mmc/bcm2835_sdhost.c:382/bcm2835_prepare_data()!
WARNING at drivers/mmc/bcm2835_sdhost.c:414/bcm2835_send_command()!
WARNING at drivers/mmc/bcm2835_sdhost.c:414/bcm2835_send_command()!
unable to select a mode
In:    serial
Out:   vidconsole
Err:   vidconsole
Net:   No ethernet found.
starting USB...
Bus usb@7e980000: USB DWC2
scanning bus usb@7e980000 for devices... 4 USB Device(s) found
       scanning usb for storage devices... 0 Storage Device(s) found
Hit any key to stop autoboot:  0 
WARNING at drivers/mmc/bcm2835_sdhost.c:414/bcm2835_send_command()!
fsm 1, hsts 00000080
read timeout error - HSTS 00000080
WARNING at drivers/mmc/bcm2835_sdhost.c:414/bcm2835_send_command()!
WARNING at drivers/mmc/bcm2835_sdhost.c:382/bcm2835_prepare_data()!
WARNING at drivers/mmc/bcm2835_sdhost.c:414/bcm2835_send_command()!
WARNING at drivers/mmc/bcm2835_sdhost.c:414/bcm2835_send_command()!
unable to select a mode
no mmc device at slot 1

Device 0: unknown device
lan78xx_eth Waiting for PHY auto negotiation to complete...... done
BOOTP broadcast 1
DHCP client bound to address 192.168.0.39 (4 ms)
Using lan78xx_eth device
TFTP from server 192.168.0.20; our IP address is 192.168.0.39

Expected behaviour
No "sdhci_setup_cfg: Hardware doesn't specify base clock frequency" errors and timeouts reading from the MMC

Actual behaviour
Timeouts on the mSD slot

System
Vanilla F-34 with just the RPi firmware updated.

  • Which model of Raspberry Pi? RPi3b/RPi3B+/RPi4
  • Which OS and version (cat /etc/rpi-issue)? Fedora release 34 (Thirty Four)
  • Which firmware version (vcgencmd version)? 2021-08-19T12:27:51 - 25e2b59
  • Which kernel version (uname -a)? 5.13.12-200.fc34.aarch64

Logs
Included above

Additional context

Looks most like the following but I will be checking all the in between builds:

firmware: arm_loader: Make most arm clock requests required
See: https://github.com/raspberrypi/firmware/issues/1598
@vfazio
Copy link

vfazio commented Sep 2, 2021

Confirming this issue... SD card throughput plummets with recent firmware.

From u-boot 2021.07 with rpi_arm64_defconfig:

fatload mmc 0:1 ${kernel_addr_r} <image>

Currently:

5259877 bytes read in 44908 ms (114.3 KiB/s)
   Uncompressing Kernel Image

Previously:

5259877 bytes read in 226 ms (22.2 MiB/s)
   Uncompressing Kernel Image

the current 1.20210831 tag is affected, 1.20210805 is not.

25e2b59 seems to introduce the issue. 69cfd9f is not affected.

Works:

root@rpi-87fa00:~# vcdbg version
vcos_build_version     = 046501ae7f6bb2bb270f609d764be02a5fb2cf65 (clean)
vcos_build_branch      = bcm2711_2
debug_sym: ReadVideoCoreMemoryBySymbol: Symbol not found: 'vcos_build_application'
vcos_build_application = ?
vcos_build_date        = Aug 18 2021
vcos_build_time        = 12:56:16
vcos_build_user        = dom
vcos_build_hostname    = buildbot
vcos_build_platform    = raspberrypi_linux
debug_sym: ReadVideoCoreMemoryBySymbol: Symbol not found: 'vcos_build_type'
vcos_build_type        = ?

affected:

root@rpi-87fa00:~# vcdbg
vcdbg - no command specified
root@rpi-87fa00:~# vcdbg version
vcos_build_version     = ef2c018dccdeb94b0376db62a2ea4c882f9b500d (clean)
vcos_build_branch      = bcm2711_2
debug_sym: ReadVideoCoreMemoryBySymbol: Symbol not found: 'vcos_build_application'
vcos_build_application = ?
vcos_build_date        = Aug 19 2021
vcos_build_time        = 12:28:18
vcos_build_user        = dom
vcos_build_hostname    = buildbot
vcos_build_platform    = raspberrypi_linux
debug_sym: ReadVideoCoreMemoryBySymbol: Symbol not found: 'vcos_build_type'
vcos_build_type        = ?

@jlinton
Copy link

jlinton commented Sep 6, 2021

This is hitting the latest edk2 builds as well. I debugged it a bit in uboot last week and the mailbox is returning a 0 clock rate. I need to see if the mailbox structure changed/whatever but I'm away from the pi's today.

@pelwell
Copy link
Contributor

pelwell commented Sep 7, 2021

Any further information you can provide about the clock in question and the circumstances surrounding the 0 rate would be appreciated.

@vfazio
Copy link

vfazio commented Sep 7, 2021

@pelwell

I put some print statements in u-boot:

MMC:   bcm2835: clk: 4 clkstate: 1
MMC:   bcm2835: clk: 4 clkrate: 0
mmc@7e202000: 0
Loading Environment from FAT... slow clock: set cdiv: 2047
slow clock: set cdiv: 2047
slow clock: set cdiv: 2047
slow clock: set cdiv: 2047
slow clock: set cdiv: 2047
OK
In:    serial
Out:   vidconsole
Err:   vidconsole
Net:   No ethernet found.
starting USB...
Bus usb@7e980000: usb dr_mode not found
USB DWC2
scanning bus usb@7e980000 for devices...
Warning: smsc95xx_eth (eth0) using random MAC address - 52:40:40:f3:94:cb
4 USB Device(s) found
       scanning usb for storage devices... 0 Storage Device(s) found
Hit any key to stop autoboot:  0
U-Boot> mmcinfo
Device: mmc@7e202000
Manufacturer ID: 3
OEM: 5344
Name: SC32G
Bus Speed: 0
Mode: SD High Speed (50MHz)
Rd Block Len: 512
SD version 3.0
High Capacity: Yes
Capacity: 29.7 GiB
Bus Width: 4-bit
Erase Group Size: 512 Bytes

on previous firmware:

MMC:   bcm2835: clk: 4 clkstate: 1
MMC:   bcm2835: clk: 4 clkrate: 250000000
mmc@7e202000: 0
Loading Environment from FAT... slow clock: set cdiv: 2047
slow clock: set cdiv: 2047
sdhost: mmclock: 122129
sdhost: set cdiv: 2045
sdhost: mmclock: 50000000
sdhost: set cdiv: 3
OK
In:    serial
Out:   vidconsole
Err:   vidconsole
Net:   No ethernet found.
starting USB...
Bus usb@7e980000: usb dr_mode not found
USB DWC2
scanning bus usb@7e980000 for devices...
Warning: smsc95xx_eth (eth0) using random MAC address - 16:d1:56:63:74:ef
4 USB Device(s) found
       scanning usb for storage devices... 0 Storage Device(s) found
Hit any key to stop autoboot:  0
U-Boot> mmc info
Device: mmc@7e202000
Manufacturer ID: 3
OEM: 5344
Name: SC32G
Bus Speed: 50000000
Mode: SD High Speed (50MHz)
Rd Block Len: 512
SD version 3.0
High Capacity: Yes
Capacity: 29.7 GiB
Bus Width: 4-bit
Erase Group Size: 512 Bytes

So on newer firmware bcm2835_get_mmc_clock
https://source.denx.de/u-boot/u-boot/-/blob/v2021.07/arch/arm/mach-bcm283x/msg.c#L74

is returning 0 for clock BCM2835_MBOX_CLOCK_ID_CORE (4)

@jlinton
Copy link

jlinton commented Sep 7, 2021

Ok, reproduced it fully with a rpi4-8G model in edk2.

When it works (with a newer sd card)

...
RpiFirmwareSetClockState: Request clock state C = 1
RpiFirmwareSetClockState: Request clock state 1 = 1
...
RpiFirmwareGetClockRate: Get Clock Rate: Tag=30002 ClockId=C
RpiFirmwareGetClockRate: Get Clock Rate (MailboxBuf=F5A5F000):
20 00 00 00 00 00 00 00
02 00 03 00 08 00 00 00
00 00 00 00 0C 00 00 00
00 00 00 00 00 00 00 00
RpiFirmwareGetClockRate: Get Clock Rate Done:
20 00 00 00 00 00 00 80
02 00 03 00 08 00 00 00
08 00 00 80 0C 00 00 00
00 E1 F5 05 00 00 00 00
RpiFirmwareGetClockRate: Get Clock Rate return: ClockRate=100000000

But that last bit when it fails:

RpiFirmwareGetClockRate: Get Clock Rate: Tag=30002 ClockId=C
RpiFirmwareGetClockRate: Get Clock Rate (MailboxBuf=F5A5F000):
20 00 00 00 00 00 00 00
02 00 03 00 08 00 00 00
00 00 00 00 0C 00 00 00
00 00 00 00 00 00 00 00
RpiFirmwareGetClockRate: Get Clock Rate Done:
20 00 00 00 00 00 00 80
02 00 03 00 08 00 00 00
08 00 00 80 0C 00 00 00
00 00 00 00 00 00 00 00
RpiFirmwareGetClockRate: Get Clock Rate return: ClockRate=0

The fedora defect: https://bugzilla.redhat.com/show_bug.cgi?id=1999180

Notes that uboot is splatting:

sdhci_setup_cfg: Hardware doesn't specify base clock frequency

which happens the same way (the mailbox is returning that the clock rate is 0)

but it also notes that its dependent on SD card, which seems to be the case since I spent an hour or so thinking I was crazy because I couldn't get it to trigger today.

Further, its not apparently some kind of powersave/whatever going on because If I issue a request to set the clockrate to 100mhz on the emmc2 right before requesting the current state it will accept the request, but then immediately return that the clockrate is 0.

@jens-maus
Copy link

Please note, that I am seeing the same behaviour here with a RaspberryPi2 and the latest 1.20210831 firmware. As soon as I use that version rather than the previous 1.20210805 one the SD card performance within u-boot drops significantly (from ~ 23MB/s to 120KB/s). So it seems that this is not limited to a Pi3 or Pi4 but also affects the older Pi2 as well.

jens-maus added a commit to jens-maus/RaspberryMatic that referenced this issue Sep 8, 2021
version breaks fast sd card use in u-boot environments due to invalid
base freq spec. This fixes #1405 and refs
raspberrypi/firmware#1619
@vfazio
Copy link

vfazio commented Sep 10, 2021

Is there any further information needed to help with this issue?

@pelwell
Copy link
Contributor

pelwell commented Sep 10, 2021

@popcornmix This seems to be a direct consequence of making GET_CLOCK_RATE specify the REQUIRED flag for all but CLOCK_ARM.

@popcornmix
Copy link
Contributor

GET_CLOCK_RATE has always been problematic, as the firmware can change the clocks for a variety of reasons.
e.g. core clock changes after initial_turbo has expired, during hdmi mode sets, when temperature is high, and during expensive gpu operations.
This caused an issue with the linux kernel's use, as sometimes it queried the clock and skipped setting it. That caused hard to reproduce errors when the clock just happened to temporarily match what was wanted, so the request was skipped, and when the firmware removed it's clock requirement the clock no longer matched what was wanted by linux.

The fix for this was to make GET_CLOCK_RATE just return the last value requested by SET_CLOCK_RATE.
The actual clock chosen is the max of all requests (i.e. what the arm has requested, plus requests from the firmware).

If uboot is just using this call to get a fixed clock without setting it, it will now get zero returned.
Note: that this was always unsafe as on different boots you could end up with a different value returned.

It may be better for uboot to call GET_MIN_CLOCK. That should return the value you were previously usually getting before the firmware update. However it will now not be subject to changes based on temperature, initial_turbo, hdmi requirements etc.

The other option is to use GET_CLOCK_MEASURED, which will return the actual value of a clock, but this will include temporary changes based on what firmware is doing, and as it's actually measured, the values have a little jitter on them.

@vfazio
Copy link

vfazio commented Sep 10, 2021

@popcornmix thanks

I thought @jlinton mentioned that setting the clock and then retrieving it still returned 0... I can try playing around with uboot to see what happens with:

  1. setting the rate then retrieving it
  2. using GET_MIN_CLOCK
  3. using GET_CLOCK_MEASURED

@popcornmix
Copy link
Contributor

I'd expect setting clock and retrieving it to give the value you set. Let me now if you can confirm that.

@jlinton
Copy link

jlinton commented Sep 10, 2021

MIN clock, or MAX clock? I think in both uboot/edk2 the returned clock rate is being used to select the smallest possible divisor for the given card (aka run it as fast as possible). I don't think either of those project is really interested in a dynamic clock value, but rather its peak. If the actual clock rate is lower, AFAIK SD cards should be able to deal with that. The problem is if it ends up going over the card max.

@popcornmix
Copy link
Contributor

MAX clock may be safer if you don't want a divided down value to exceed a specced value.

@vfazio
Copy link

vfazio commented Sep 10, 2021

I can confirm that, within u-boot at least, if i issue SET_CLOCK_RATE, i receive the requested rate in the response to the request and in a subsequent GET_CLOCK_RATE message

U-Boot 2021.07 (Sep 10 2021 - 19:03:20 +0000)

DRAM:  948 MiB
RPI Compute Module 3+ (0xa02100)
MMC:   
bcm2835: set_clock_rate response 4 250000000
bcm2835: get_clock_rate response 4 250000000
bcm2835: get_max_clock_rate response 4 400000000

Whereas without the SET_CLOCK_RATE call:

MMC:   
bcm2835: get_clock_rate response 4 0
bcm2835: get_max_clock_rate response 4 400000000

uboot will boot fine with either GET_MIN_CLOCK_RATE or GET_MAX_CLOCK_RATE, though it reads from the sd card faster with GET_MIN_CLOCK_RATE

@jlinton
Copy link

jlinton commented Sep 10, 2021

What happens if MAX is read, and then set? Does it actually adjust the clock rate near the max? I would imagine that would be the safe method of maintaining the higher speeds? (I will probably take another pass with edk2 later this evening)

@vfazio
Copy link

vfazio commented Sep 10, 2021

@jlinton i'll test that in a bit. I was trying to divine what's being done in the Linux driver, the logic is a little different than what's being done in uboot

https://github.com/raspberrypi/linux/blob/1.20210831/drivers/mmc/host/bcm2835-sdhost.c#L2127
https://github.com/raspberrypi/linux/blob/1.20210831/drivers/mmc/host/bcm2835-sdhost.c#L1558

@vfazio
Copy link

vfazio commented Sep 10, 2021

Setting the core clock (4) to the max maintains the higher speeds

MMC:   
bcm2835: get_clock_rate response 4 0
bcm2835: get_max_clock_rate response 4 400000000
bcm2835: set_clock_rate response 4 400000000
mmc@7e202000: 0
...
U-Boot> boot
5259877 bytes read in 226 ms (22.2 MiB/s)
   Uncompressing Kernel Image

though i'm a little wary of changing the core clock to max (core is currently being queried due to this note #953 (comment))

@jlinton
Copy link

jlinton commented Sep 10, 2021

So, i'm looking at the code now, and I now remember the dance edk2+ACPI is playing with Linux. (and I think this is true of DT too for the EMMC2). Linux, in the case of the EMMC2, is depending on the base clock rate to actually reflect what is in the SDHCI base clock frequency field (which is 100Mhz). In the case of the older EMMC which doesn't provide a meaningful cap's register, we are using the generic linux sdhci-caps/mask property to override the field to 250mhz.

So, we can play this dance and change the freq but we need to override the sdhci base frequency fields being fed into Linux. Leading me to think, I'm just going to validate that the requested frequency is between min->max and request 100 (or 250) mhz.

But, I'm not sure if I'm doing something wrong (I don't think so, we can override the core clock/etc) but I've tried changing the emmc2 clock (ClockId=C) with both the skip turbo and !skip turbo set, and I read it immediately back and its still zero, but only with the card that's failing. Otherwise, everything is fine (newer cards). The card that consistently fails for me is an old patriot 8GB SDHC class 4 card (SD2.0). This is following a 1.5Ghz request of the core clock, which is responding that the clock rate is what we requested.

Further, if I ignore the clock=0 return, assume the clock is what we requested, everything seem to work fine, including Linux.

@jlinton
Copy link

jlinton commented Sep 10, 2021

@popcornmix This seems to be a direct consequence of making GET_CLOCK_RATE specify the REQUIRED flag for all but CLOCK_ARM.

I assumed you were talking about something internal to the vc firmware since I don't see that here: https://github.com/raspberrypi/firmware/wiki/Mailbox-property-interface#get-clock-rate or is there a change not documented there?

@pelwell
Copy link
Contributor

pelwell commented Sep 11, 2021

@vfazio With a supporting firmware (which they all will have been since the Pi 3 was launched) the sdhost driver delegates the SD clock management to the firmware, relying on it to adjust the divisor as the core clock changes.

@jlinton Yes, I was referring to a change in the firmware code that implements GET_CLOCK_RATE.

@ultibohub
Copy link

Just wanted to add that we are seeing the same results reported above with our bare metal kernels.

Any firmware version from commit 25e2b59 always returns 0 for GET_CLOCK_RATE requests for the following clocks:

CLOCK_ID_EMMC (1)
CLOCK_ID_UART (2)
CLOCK_ID_EMMC2 (0xC)

regardless of whether SET_CLOCK_RATE is called prior or not.

These all return correct values for GET_MIN_CLOCK and GET_MAX_CLOCK and all behave correctly with any earlier firmware version.

Other clocks such as CLOCK_ID_CORE (4) return 0 prior to calling SET_CLOCK_RATE but return a correct value for GET_CLOCK_RATE afterwards.

Commit 57e7fd1 in March 2021 changed the response for CLOCK_ID_CORE (4) and others, at that point we added code to call SET_CLOCK_RATE for some of these so we could calculate correct divisors for I2C, SPI and SDHOST.

Clearly the sequence of calling SET_CLOCK_RATE and passing the value from GET_MAX_CLOCK will also be acceptable for EMMC, UART and EMMC2 but at present it simply doesn't work.

@nullr0ute
Copy link
Author

I can confirm that, within u-boot at least, if i issue SET_CLOCK_RATE, i receive the requested rate in the response to the request and in a subsequent GET_CLOCK_RATE message

Did you have a patch published somewhere to use as a testing base?

@vfazio
Copy link

vfazio commented Sep 13, 2021

@nullr0ute I'll push something soonish. There may be two patches, one to implement the logic that's being used in Linux re letting the firmware manage the clock, and another likely to have some safeguard behavior if GET_CLOCK_RATE is 0 to make a call to GET_MIN_CLOCK and SET_CLOCK_RATE and to use the clock value in the response

I need to test that second patch w/o the first patch to make sure it works as intended.

vfazio added a commit to vfazio/u-boot that referenced this issue Sep 13, 2021
In rpi-firmware 25e2b597ebfb2495eab4816a276758dcc6ea21f1 and later,
the GET_CLOCK_RATE mailbox property was changed to return the last
value set by SET_CLOCK_RATE.

raspberrypi/firmware#1619 (comment)

U-Boot does not call SET_CLOCK_RATE so bcm2835_get_mmc_clock will
return zero and can cause degraded MMC performance.

Calling SET_CLOCK_RATE fixes the frequency of the clock to a specific
value and disables the firmware's clock scaling so is not an option.

Instead, fallback to GET_MAX_CLOCK_RATE in bcm2835_get_mmc_clock if
GET_CLOCK_RATE returns zero.

Signed-off-by: Vincent Fazio <vfazio@xes-inc.com>
@vfazio
Copy link

vfazio commented Sep 13, 2021

@nullr0ute This is based on the 2021.07 release: https://github.com/vfazio/u-boot/commits/vfazio-rpi-mmc-clock-fix but they'll probably apply cleanly to 2021.04.

I'll try to submit the series to upstream tomorrow once i'm comfortable with the commit message wording.

vfazio added a commit to vfazio/u-boot that referenced this issue Sep 14, 2021
In rpi-firmware 25e2b597ebfb2495eab4816a276758dcc6ea21f1,
the GET_CLOCK_RATE mailbox property was changed to return the last
value set by SET_CLOCK_RATE.

raspberrypi/firmware#1619 (comment)

Due to this change in firmware behavior, bcm2835_get_mmc_clock now
returns a clock rate of zero since we do not issue SET_CLOCK_RATE.
This results in degraded MMC performance.

SET_CLOCK_RATE fixes the clock to a specific value and disables scaling
so is not an ideal solution.

Instead, fallback to GET_MAX_CLOCK_RATE in bcm2835_get_mmc_clock if
GET_CLOCK_RATE returns zero.

Signed-off-by: Vincent Fazio <vfazio@xes-inc.com>
vfazio added a commit to vfazio/u-boot that referenced this issue Sep 14, 2021
In rpi-firmware 25e2b597ebfb2495eab4816a276758dcc6ea21f1,
the GET_CLOCK_RATE mailbox property was changed to return the last
value set by SET_CLOCK_RATE.

raspberrypi/firmware#1619 (comment)

Due to this change in firmware behavior, bcm2835_get_mmc_clock now
returns a clock rate of zero since we do not issue SET_CLOCK_RATE.
This results in degraded MMC performance.

SET_CLOCK_RATE fixes the clock to a specific value and disables scaling
so is not an ideal solution.

Instead, fallback to GET_MAX_CLOCK_RATE in bcm2835_get_mmc_clock if
GET_CLOCK_RATE returns zero.

Signed-off-by: Vincent Fazio <vfazio@xes-inc.com>
@jlinton
Copy link

jlinton commented Sep 22, 2021

I'm just going to echo the last sentence of ultibohub's comment in case it was missed.

"Clearly the sequence of calling SET_CLOCK_RATE and passing the value from GET_MAX_CLOCK will also be acceptable for EMMC, UART and EMMC2 but at present it simply doesn't work."

I would also like clarification that SET_CLOCK_RATE will act as a clamp and the firmware won't exceed the requested value, as well as it will try and honor that value unless there is a thermal/power problem.

Thanks,

@popcornmix
Copy link
Contributor

I would also like clarification that SET_CLOCK_RATE will act as a clamp and the firmware won't exceed the requested value, as well as it will try and honor that value unless there is a thermal/power problem.

The clock will be set to the maximum of all requests.
For example the firmware may have requested a higher value than the arm for core_clock (e.g. while changing hdmi mode), as not doing so will cause a failure.

Also there may be other constraints, such as clocks which share a PLL.

You'd probably have to give an example of the clock and value set, and whether the outcome was as expected.

@ultibohub
Copy link

The clock will be set to the maximum of all requests.
For example the firmware may have requested a higher value than the arm for core_clock (e.g. while changing hdmi mode), as not doing so will cause a failure.

@popcornmix what is the expected behavior of GET_CLOCK_RATE at this point?

You'd probably have to give an example of the clock and value set, and whether the outcome was as expected.

I've just retested with commit f9bc224 and for CLOCK_ID_EMMC (1), CLOCK_ID_UART (2) and CLOCK_ID_EMMC2 (0xC) the return from GET_CLOCK_RATE is always 0 regardless of what value was set with SET_CLOCK_RATE.

Other clocks such as CORE and CPU still return the value that was set.

We've added an additional check so that if GET_CLOCK_RATE returns 0 we call GET_CLOCK_MEASURED instead but should we assume that GET_CLOCK_RATE will no longer function in a meaningful way and abandon using it completely?

We are aware of other projects seeing the same results as well.

@popcornmix
Copy link
Contributor

@popcornmix what is the expected behavior of GET_CLOCK_RATE at this point?

It returns the value requested by the arm.

I've just retested with commit f9bc224 and for CLOCK_ID_EMMC (1), CLOCK_ID_UART (2) and CLOCK_ID_EMMC2 (0xC) the return from GET_CLOCK_RATE is always 0 regardless of what value was set with SET_CLOCK_RATE.

let me check what that is doing.

@popcornmix
Copy link
Contributor

popcornmix commented Sep 23, 2021

let me check what that is doing.

I can reproduce. For those clocks it's not behaving as designed. I'll have a think about best way to fix it.

@sallevan
Copy link

BTW, the patch is not backward-compatible: firmware from 03.03.2021 doesn't work with U-Boot patched with this patch.

starting USB...
Bus xhci_pci: Register 5000420 NbrPorts 5
Starting the controller
USB XHCI 1.00
scanning bus xhci_pci for devices... XHCI timeout on event type 33... cannot recover.
BUG at drivers/usb/host/xhci-ring.c:481/xhci_wait_for_event()!
BUG!
resetting ...

@vfazio
Copy link

vfazio commented Sep 28, 2021

BTW, the patch is not backward-compatible: firmware from 03.03.2021 doesn't work with U-Boot patched with this patch.

starting USB...
Bus xhci_pci: Register 5000420 NbrPorts 5
Starting the controller
USB XHCI 1.00
scanning bus xhci_pci for devices... XHCI timeout on event type 33... cannot recover.
BUG at drivers/usb/host/xhci-ring.c:481/xhci_wait_for_event()!
BUG!
resetting ...

Does it complain if the second patch isn't applied?

@sallevan
Copy link

sallevan commented Sep 28, 2021

... and does not work with dtoverlay=mmc :)

It seems to me that it might be better approach just to copy all clocks initialisation from old start.elf into U-Boot code...

@vfazio
Copy link

vfazio commented Sep 29, 2021

Unfortunately I do not have an rpi4 to test on. I tested with a CM3 with the rpi_arm64_defconfig and didn't have any issues, so if anyone is able to assist with debug efforts I'd appreciate it.

@popcornmix
Copy link
Contributor

Are you able to test this firmware and report if it behaves as you'd like.

@vfazio
Copy link

vfazio commented Sep 29, 2021

Are you able to test this firmware and report if it behaves as you'd like.

@popcornmix who is the firmware bundle targeting? Is this wrt #1619 (comment) ?

@popcornmix
Copy link
Contributor

popcornmix commented Sep 29, 2021

The firmware is for any users experiencing "Regression booting rpi3/4 with U-Boot on recent firmware".

@vfazio
Copy link

vfazio commented Sep 29, 2021

@popcornmix works on my CM3 in a quick test. Looks like GET_CLOCK_RATE now returns a value (u-boot built without my patches on top)

U-Boot 2021.07 (Sep 29 2021 - 19:44:44 +0000)

DRAM:  948 MiB
RPI Compute Module 3+ (0xa02100)
MMC:   mmc@7e202000: 0
Loading Environment from FAT... OK
In:    serial
Out:   vidconsole
Err:   vidconsole
Net:   No ethernet found.
starting USB...
Bus usb@7e980000: usb dr_mode not found
USB DWC2
scanning bus usb@7e980000 for devices...
Warning: smsc95xx_eth (eth0) using random MAC address - fa:5d:a5:38:8c:a2
4 USB Device(s) found
       scanning usb for storage devices... 0 Storage Device(s) found
Hit any key to stop autoboot:  0
5400263 bytes read in 233 ms (22.1 MiB/s)
   Uncompressing Kernel Image
Moving Image from 0x80000 to 0x200000, end=1690000
## Flattened Device Tree blob at 2eff8f00
   Booting using the fdt blob at 0x2eff8f00
   Using Device Tree in place at 000000002eff8f00, end 000000002f002f39

Starting kernel ...
root@rpi-87fa00:~# vcgencmd version
Sep 29 2021 18:51:54
Copyright (c) 2012 Broadcom
version 5ab606ac65f577d380ddb89bf99df710b4dc34df (tainted) (release) (start_x)

@ultibohub
Copy link

Are you able to test this firmware and report if it behaves as you'd like.

@popcornmix, Tested and confirmed that GET_CLOCK_RATE now returns a valid value for EMMC, UART and EMMC2.

Tested on Pi 3B+ and Pi 4B, results from Pi 4B.

Release firmware:

C:\>gencmd version
Sep 21 2021 15:46:24
Copyright (c) 2012 Broadcom
version 0bb32cf4b147b5ad26b6dfb04186c94e3a6489c5 (clean) (release) (start)

Clock Rate
   CLOCK_ID_MMC0: 0
   CLOCK_ID_MMC1: 500000000
   CLOCK_ID_MMC2: 0
   CLOCK_ID_UART0: 0
   CLOCK_ID_UART1: 500000000

Test firmware:

C:\>gencmd version
Sep 29 2021 18:48:55
Copyright (c) 2012 Broadcom
version 5ab606ac65f577d380ddb89bf99df710b4dc34df (tainted) (release) (start)

Clock Rate
   CLOCK_ID_MMC0: 250000000
   CLOCK_ID_MMC1: 500000000
   CLOCK_ID_MMC2: 500000000
   CLOCK_ID_UART0: 48000000
   CLOCK_ID_UART1: 500000000

Seems to be resolved, thank you.

popcornmix added a commit that referenced this issue Sep 30, 2021
See: #1619

firmware: dispmanx: Demote null eptr from vcos_verify to no warning
See: raspberrypi/linux#4592

firmware: filesystem: sdcard: Probe FAT type in GPT ESD partitions
popcornmix added a commit to Hexxeh/rpi-firmware that referenced this issue Sep 30, 2021
See: raspberrypi/firmware#1619

firmware: dispmanx: Demote null eptr from vcos_verify to no warning
See: raspberrypi/linux#4592

firmware: filesystem: sdcard: Probe FAT type in GPT ESD partitions
@popcornmix
Copy link
Contributor

Fix should be in latest rpi-update firmware.

@vfazio
Copy link

vfazio commented Sep 30, 2021

We primarily use tagged versions of the rpi firmware. Is there a tag scheduled at some point?

@popcornmix
Copy link
Contributor

At some point, but I can't say when exactly. The tagging occurs when it's added to stable tree (and potentially to apt)
but we will want to keep an eye on reports of any issues before then.

@sallevan
Copy link

Are you able to test this formware and report if it behaves as you'd like.

U-Boot 2021.04 (Sep 30 2021 - 20:35:47 +0000)

RPI 4 Model B (0xc03111)

Works fine in this RPi/U-Boot pairing.

@jlinton
Copy link

jlinton commented Sep 30, 2021

So the set/get sequence seems to behave as expected now. Plus, it appears the behavior has reverted to what it was doing previously with 250Mhz->emmc, 100Mhz->emmc2 on rpi4b being reported as the default clock rates (without first requesting a clock rate). So Thumbs up there...

That said, I've seen some odd behaviors where if I leave it sitting for a few mins in grub, then reboot it via grub's "UEFI setup", it fails to write the updated variable store to the SD card. I'm not sure if this was happening before because I wasn't using this SD card that appeared to cause the 0 returns consistently from get clock.

So. I guess this fixes the primary problem, but I'm still concerned about what I've heard about SET_CLOCK, because I don't see how having the clock exceed the requested values can really be dealt with in a meaningful way.

I'm generally OK with it running slower than the requested clock, but if SET_CLOCK is just a hint, and the low level firmware on the VC asks for something closer to MAX, that is going to be a problem. Particularly if it happens after we hand off control to the OS. Meaning the only "safe" way to deal with this is going to be requesting MAX clock via SET_CLOCK, which doesn't seem optimal if we are dividing it down to 50Mhz or whatever.

@sallevan
Copy link

I did push these to U-Boot's patchwork

Fellow @vfazio, I'm sorry, but that was my fault in patching U-Boot -- your patch is working perfectly.

pbatard added a commit to pftf/RPi4 that referenced this issue Oct 19, 2021
* Also revert start4.elf to most recent version as raspberrypi/firmware#1619 (#177) should be fixed now.
@vfazio
Copy link

vfazio commented Nov 18, 2021

I think the fix has been included in a tag, at least i'm not seeing the hang as part of 1.20211029

trini pushed a commit to trini/u-boot that referenced this issue Mar 24, 2023
In rpi-firmware 25e2b597ebfb2495eab4816a276758dcc6ea21f1,
the GET_CLOCK_RATE mailbox property was changed to return the last
value set by SET_CLOCK_RATE.

raspberrypi/firmware#1619 (comment)

Due to this change in firmware behavior, bcm2835_get_mmc_clock now
returns a clock rate of zero since we do not issue SET_CLOCK_RATE.
This results in degraded MMC performance.

SET_CLOCK_RATE fixes the clock to a specific value and disables scaling
so is not an ideal solution.

Instead, fallback to GET_MAX_CLOCK_RATE in bcm2835_get_mmc_clock if
GET_CLOCK_RATE returns zero.

Signed-off-by: Vincent Fazio <vfazio@xes-inc.com>
Signed-off-by: Peter Robinson <pbrobinson@gmail.com>
mattsm pushed a commit to analogdevicesinc/u-boot that referenced this issue May 2, 2024
In rpi-firmware 25e2b597ebfb2495eab4816a276758dcc6ea21f1,
the GET_CLOCK_RATE mailbox property was changed to return the last
value set by SET_CLOCK_RATE.

raspberrypi/firmware#1619 (comment)

Due to this change in firmware behavior, bcm2835_get_mmc_clock now
returns a clock rate of zero since we do not issue SET_CLOCK_RATE.
This results in degraded MMC performance.

SET_CLOCK_RATE fixes the clock to a specific value and disables scaling
so is not an ideal solution.

Instead, fallback to GET_MAX_CLOCK_RATE in bcm2835_get_mmc_clock if
GET_CLOCK_RATE returns zero.

Signed-off-by: Vincent Fazio <vfazio@xes-inc.com>
Signed-off-by: Peter Robinson <pbrobinson@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants