bcm2835-power: Timeout waiting for grafx power OK #3046
Comments
|
Just another data-point: I built https://github.com/raspberrypi/linux/tree/rpi-5.2.y and I'm getting |
|
model: firmware version: kernel version: kernel logs: The |
|
I'm getting the same issue with RPi 3B+, Arch Linux aarch64, Kernel 5.2.10-1-ARCH. However, I have several Pi 3B+ and it is NOT happening on all of them (using the same SD card with the same image). Some of them detect the VC4 GPU during boot just fine. And with the other boards, it appears to be temperature related. When the board is at room temperature (having been unpowered for some time) the GPU is detected normally. Also, over a couple of reboots. But after some minutes, when the temperature rises above about 50 °C, the GPU is not detected any longer on reboot and the bcm2835-power log message appears. Maybe that additional piece of information helps tracking down the issue. |
|
Thanks for your report. I built the mainline kernel 5.3-rc6 with multi_v7_defconfig (Raspbian rootfs) for my RPi 3B+. Then I generated enough load to reach ~54 °C (no cpufreq enabled) and triggered a reboot. "Unfortunately" I wasn't able to reproduce the timeout. |
|
Thanks for looking into it. I have seven Pi3B+ boards and I am currently testing them all under the same conditions to see how many of them are affected (so far 2 out of 4 fail when warm, fully reproducible; the others never fail). Maybe some chips are more 'sensitive' to the power-up ramp than others. Could changing the current ramp (lower initial, lower step size, more time between steps) help? I'd try playing around with bcm2835-power.c but I have no experience integrating a custom kernel for the RPi and don't know if it is as simple as 'replace the ARCH kernel with the selfmade one'. |
|
One update: I started building the (mainline) kernel using your defconfig (arm64/configs/defconfig). I interrupted when I realized that it is going to take some time... I'll do it at home over night ;-). So it is definitely a matter of temperature, but the cut between good and bad varies from device to device. Maybe you can stress your board to higher temperatures and see if the timeout appears as well. |
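For anyone comparing several boards the way the comments above describe, here is a minimal sketch of a helper that records, per boot, whether the timeout message appeared and what the SoC temperature was. The message string is the one from this issue's title; the function names and the idea of passing in the log text and the raw temperature string are assumptions made for illustration, not part of any existing tool.

```python
# Message text taken from the title of this issue.
TIMEOUT_MSG = "bcm2835-power: Timeout waiting for grafx power OK"


def count_grafx_timeouts(kernel_log: str) -> int:
    """Count the kernel log lines that contain the grafx timeout message."""
    return sum(1 for line in kernel_log.splitlines() if TIMEOUT_MSG in line)


def millidegrees_to_celsius(raw: str) -> float:
    """Convert a thermal-zone reading (millidegrees Celsius) to degrees."""
    return int(raw.strip()) / 1000.0


def summarize_boot(kernel_log: str, raw_temp: str) -> dict:
    """Build one record per boot for later comparison across boards."""
    timeouts = count_grafx_timeouts(kernel_log)
    return {
        "timeouts": timeouts,
        "temp_c": millidegrees_to_celsius(raw_temp),
        "gpu_power_ok": timeouts == 0,
    }
```

On a running board, `kernel_log` could be the output of `dmesg`, and `raw_temp` the contents of `/sys/class/thermal/thermal_zone0/temp`, assuming the kernel in use exposes the SoC sensor as thermal zone 0.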
|
FYI I'm seeing the timeouts on my RPI3b+ with 5.3.0-rc4. Can't really say whether it's temperature related as it always fails. I can run some debugging if needed. |
|
After enabling the mainline cpufreq driver I'm seeing the timeouts, too. |
|
IIRC The main functional difference between the downstream cpufreq driver and upstream is that we're disabling turbo mode when changing the clocks. What about no cpufreq and setting arm's clock @ 1.2GHz in config.txt? |
|
I don't think there is an issue with the cpufreq driver. Since my default governor is ondemand, this causes much more CPU stress during boot. I will try to test your suggestion. |
|
My test results: |
|
@popcornmix Any idea how to analyze this further? Without documentation I don't have a clue what's going on in the new bcm2835 PM driver. |
|
I made a register dump of the PM addresses for the following cases:
Comparing both dumps showed only 1 difference:
Note: without e1dc2b2 and with force_turbo enabled I'm not able to reproduce the timeout. @anholt Is this expected? |
|
@lategoodbye the difference in PM_RSTS registers is just: so I guess first was captured after a power cycle, and second after a |
|
Okay, thanks. So the difference is unrelated. I will wait for suggestions to narrow down this issue until the release of Linux 5.4-rc1; after that I will revert e1dc2b2 according to the no-regression policy. |
…of firmware." This reverts commit e1dc2b2. see: raspberrypi/linux#3046
|
For what it is worth I am seeing this error pop up multiple times with 5.3.0 on a 3b+ running arm64/ubuntu using a mainline kernel from here: https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.3/ I'm noticing that a warm reboot using (My setup is currently headless, so I'm not seeing what comes up on the screen when this situation arises.) It seems this might be connected? (Or I can open another issue if it seems unconnected.) |
|
The error message is the same, and the fact that upstream code shows the same issue is useful datapoint. By the way, you should be able to replace |
…of firmware." Since release of the new BCM2835 PM driver there has been several reports of V3D probing issues. This is caused by timeouts during powering-up the GRAFX PM domain: bcm2835-power: Timeout waiting for grafx power OK I was able to reproduce this reliable on my Raspberry Pi 3B+ after setting force_turbo=1 in the firmware configuration. Since there are no issues using the firmware PM driver with the same setup, there must be an issue in the BCM2835 PM driver. Unfortunately there hasn't been much progress in identifying the root cause since June (mostly in the lack of documentation), so i decided to switch back until the issue in the BCM2835 PM driver is fixed. Link: raspberrypi/linux#3046 Fixes: e1dc2b2 (" ARM: bcm283x: Switch V3D over to using the PM driver instead of firmware.") Cc: stable@vger.kernel.org Signed-off-by: Stefan Wahren <wahrenst@gmx.net> Acked-by: Eric Anholt <eric@anholt.net> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
|
Yesterday I tested the revert against current mainline Linux 5.4 + Raspbian Buster with a Raspberry Pi 3 B+. Unfortunately X hangs completely during boot, so I asked Florian to drop this patch :-( |
…instead of firmware."" Because both upstream [1] and Raspbian downstream [2] kernels drops this patch. This reverts commit 655c3ca. https://phabricator.endlessm.com/T28448 [1]: https://patchwork.kernel.org/patch/11136979/#22928901 [2]: raspberrypi/linux#3046 (comment) Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
This was the reason behind the revert. But the revert causes a hang during boot of Raspbian, so I decided to drop the revert. |
|
It seems that without these reverts, the GPU will also work, so maybe these reverts cause the X hang? |
|
Add these lines to the dts file, compile it, replace the dtb with the newly compiled one, then the gpu will start working. |
Devicetree changes usually don't cause hangs; it's more likely a driver issue. With your change you combine the "best" of both power drivers. Unfortunately it's unsafe to handle the same register ranges with two Linux drivers. Currently I only see two options:
|
This solves it for me. |
|
A RPi3B+ of mine has not been used for a while. I used a new SD card and prepared it with Arch Linux ARM AArch64. I turned the RPi off yesterday evening and turned it on this morning. So, if you need another board to get some diagnostic information, I can try to provide. |
|
The consensus above is that this is caused by an incompatibility in the upstream/mainline 3B+ DTB. Edit the source file as described by @sankayop above and rebuild it (or download the prebuilt version they link to) and try with that. |
|
Thank you for your reply, will give it a try later... Will the specific change that seems to be applied to all the fedora kernel versions make it upstream? |
|
Currently for upstream i only see two "options":
I'm not happy with either of them. @sankayop's patch will enable both power drivers for the same power domain. I consider this a path to hell ... |
|
@lategoodbye Do you have a preference between 1 and 2? Is there something we can do to help? |
|
Number 1 isn't a real option, because we need this driver for Raspberry Pi 4. Number 2 should be doable for downstream, but would more likely result in a merge of both drivers for upstream. The best option would be to ask someone with a deeper understanding of BCM2835 why the ramp-up causes these random timeouts (timing issue, missing requirements, wrong order of power domain handling) and fix the bcm2835 power driver. |
|
Any updates on this? |
|
In the upstream kernel the suggested patch to revert has been applied. The hanging X issue was unrelated. |
|
When booting up my Raspberry Pi 2 Model B with Arch Linux ARM, it seems one of two things happens:
In my testing, it does seem that the first occurrence is more likely when the Pi is cooled down, rather than right after rebooting. Most of these are things that have already been pointed out, but I wanted to provide a test case for anyone else having the issue. Is the issue with X hanging being tracked anywhere? |
Here is the accepted fix: |
|
Seems the Pi 3 A+ has the same problem :-/ |
Describe the bug
Starting with Linux 5.1 there is a new power driver for BCM2835. The idea behind it is to have better control over the V3D power domain. After rollout I was informed that some RPi boards (currently a handful) have issues when enabling the V3D power domain. The ramp-up runs into a timeout (20 us), because we never get a PM_POWOK. I don't have a clue what causes this issue (timing, hardware tolerance, ...). Currently I don't have a board that is affected.
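To make the failure mode concrete, here is a small Python model of a poll-until-OK loop with a deadline, which is the general pattern the description refers to. It is only a sketch and not the driver's code: `PM_POWOK_MASK` is a hypothetical bit value, and `read_status` and `now_ns` are stand-ins for a register read and a clock so the behaviour can be exercised without hardware. Only the 20 us timeout comes from the description above.

```python
import time

# Hypothetical bit mask standing in for the PM_POWOK status flag; the real
# register layout is not given in this thread.
PM_POWOK_MASK = 0x1


def wait_for_power_ok(read_status, timeout_ns=20_000, now_ns=time.monotonic_ns):
    """Poll read_status() until the OK bit is set or timeout_ns has elapsed.

    Returns True if the OK bit was seen, False if the deadline passed first.
    The status is always read at least once before the deadline is checked.
    """
    deadline = now_ns() + timeout_ns
    while True:
        if read_status() & PM_POWOK_MASK:
            return True
        if now_ns() >= deadline:
            return False
```

If the status source never reports the OK bit, the loop gives up once the deadline passes, which corresponds to the "Timeout waiting for grafx power OK" message. Passing a larger `timeout_ns` is one way to model the "more time between steps" idea raised in the comments.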
To reproduce
Start the RPi with mainline kernel 5.1
Expected behaviour
bcm2835-power succeeds in enabling the V3D power domain
Actual behaviour
bcm2835-power fails to enable the V3D power domain because PM_POWOK stays off
System
RPI 2, RPI 3B and RPI 3B+
Firmware version (vcgencmd version): 2019-02-12, 2019-03-27
Kernel version (uname -a): Mainline Kernel / DTB 5.1
Logs
More info:
anholt#153
Additional context