Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

raspberrypi-clk: Failed to get / change pllb frequency: -12 #3479

Open
olifre opened this issue Feb 22, 2020 · 27 comments
Open

raspberrypi-clk: Failed to get / change pllb frequency: -12 #3479

olifre opened this issue Feb 22, 2020 · 27 comments

Comments

@olifre
Copy link
Contributor

olifre commented Feb 22, 2020

Describe the bug
After anything between 1-2 weeks of runtime, kernel log is flooded with messages such as:

[566789.421766] raspberrypi-clk raspberrypi-clk: Failed to change pllb frequency: -12
[566789.421784] raspberrypi-clk raspberrypi-clk: Failed to get pllb frequency: -12
[566790.128946] raspberrypi-clk raspberrypi-clk: Failed to change pllb frequency: -12
[566790.128962] raspberrypi-clk raspberrypi-clk: Failed to get pllb frequency: -12
[566790.340932] raspberrypi-clk raspberrypi-clk: Failed to change pllb frequency: -12
[566790.340948] raspberrypi-clk raspberrypi-clk: Failed to get pllb frequency: -12
[566791.499069] raspberrypi-clk raspberrypi-clk: Failed to change pllb frequency: -12
[566791.499086] raspberrypi-clk raspberrypi-clk: Failed to get pllb frequency: -12

The system still runs normally, though.

To reproduce
Let the system run for 1-2 weeks. No overclocking, no cooling issues, official power supply, no extra loads attached, no preceding kernel messages.

System
Copy and paste the results of the raspinfo command in to this section. Alternatively, copy and paste a pastebin link, or add answers to the following questions:

  • Which model of Raspberry Pi? e.g. Pi3B+, PiZeroW
Raspberry Pi 4 Model B Rev 1.1
  • Which OS and version (cat /etc/rpi-issue)?
Raspberry Pi reference 2016-11-25
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, 21a681692ac70e77d3ebe1dd2a76b31de1a72d89, stage4
  • Which firmware version (vcgencmd version)?
Feb 12 2020 12:36:21 
Copyright (c) 2012 Broadcom
version c3c8dbdf147686fb0c3f32aece709d0653368810 (clean) (release) (start)
  • Which kernel version (uname -a)?
Linux raspiserv 4.19.97-v7l+ #1294 SMP Thu Jan 30 13:21:14 GMT 2020 armv7l GNU/Linux

Logs
See report above.

Additional context
Sadly, no real trigger leading up to the issue. Reboot solves it.

@olifre
Copy link
Contributor Author

olifre commented Feb 22, 2020

Trying the commands:

# vcgencmd measure_clock pllb
frequency(100)=3000000000
# vcgencmd measure_clock arm
frequency(48)=750199232

seems to work fine in that situation.

@olifre
Copy link
Contributor Author

olifre commented Feb 22, 2020

I grepped through the logs and found the following, potentially related messages:

[567278.869559] xhci_hcd 0000:01:00.0: Ring expansion failed
[567278.869581] hub 1-1.2.4:1.0: activate --> -12
[567294.010078] xhci_hcd 0000:01:00.0: Ring expansion failed
[567294.010100] hub 1-1.2.4:1.0: activate --> -12
[567309.150218] xhci_hcd 0000:01:00.0: Ring expansion failed
[567309.150331] hub 1-1.2.4:1.0: activate --> -12

So I add my usb device tree here in addition in case it helps:

# lsusb -t
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 5000M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/1p, 480M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 480M
        |__ Port 2: Dev 3, If 0, Class=Hub, Driver=hub/4p, 480M
            |__ Port 2: Dev 4, If 0, Class=Mass Storage, Driver=usb-storage, 480M
            |__ Port 4: Dev 5, If 0, Class=Hub, Driver=hub/4p, 480M

The hub is a self-powered USB 2 hub with a USB 3 HDD attached to it (I plan to replace the hub with a USB 3 capable model in the future).

The HDD access is still fine, though.

@olifre
Copy link
Contributor Author

olifre commented Feb 22, 2020

After reading through #3040 which might be related(?), I have now added:

coherent_pool=4M

to my /boot/cmdline and rebooted. Other suggestions welcome. Will report back latest in ~2 weeks if the problem did not reappear, and earlier if it does.

@lategoodbye
Copy link
Contributor

Hi @olifre,

the -12 means ENOMEM and this is a typical cases when you running out of memory. I assume there is at least one memory leak. In case you are able to rebuild your kernel, try to enable CONFIG_DEBUG_KMEMLEAK in the kernel config under kernel hacking -> memory debugging. After that you should be able to see the leaks in /sys/kernel/debug/kmemleak.

@olifre
Copy link
Contributor Author

olifre commented Feb 23, 2020

HI @lategoodbye ,

thanks! I wondered whether this is a leak or "just" fragmentation within the coherent memory region.
I'm a bit astonished this would only hit me, though, since there is no special hardware involved.

While I regularly compile my own kernel on Desktops, I've not done so on Raspbian / Debian yet. I'm already stuck at the point where to get the upstream kernel config and Raspbian patches (if these exist) to build "the same" kernel with just a single flag changed — the docs at https://www.raspberrypi.org/documentation/linux/kernel/building.md only mention *_defconfig which will likely yield a different configuration (i.e. I would test with something different than what I use now).

So if somebody can help me out here, I can do it, but I would presume the issue is reproducible with any USB 2 hub + USB 3 mass storage device connected to a USB 3 port of the RPi 4.

@popcornmix
Copy link
Collaborator

The docs you linked to are how the standard raspbian kernel is built.
bcm2711_defconfig is the config used on a pi 4.

@olifre
Copy link
Contributor Author

olifre commented Feb 24, 2020

@popcornmix Thanks, compiling now. Hope it'll be straightforward to go back to the upstream built kernel (i.e."just upgrade").

Will report back once I found something.

I wonder if:

CONFIG_DEBUG_KMEMLEAK=Y
CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=Y

has any performance implications (if not, it might be helpful to have that by default in case these issues appear regularly).

@olifre
Copy link
Contributor Author

olifre commented Feb 24, 2020

Modified kernel is running, at least there is no immediate leak visible (even when triggering an active scan). Guess I'll have to wait a few days.
Or maybe it is "just" fragmentation after all, and compaction not working well enough?

@olifre
Copy link
Contributor Author

olifre commented Feb 24, 2020

I have by now executed all the workloads I execute on the device, including SMART tests of the attached HDD, running hd-idle, running fstrim on the SD card, running a full borg backup from the SD card to the HDD, producing some CPU load and IO load etc.

No leak seen yet. Any other ideas, or should we just wait for ~a week to see if anything changes?

@olifre
Copy link
Contributor Author

olifre commented Feb 29, 2020

@lategoodbye ,
it's 5 days now and kmemleak still shows up empty. So I think we can already exclude a leak, right?

Since I increased to coherent_pool=4M, if it is fragmentation of the coherent region or something similar, the issue might not show up at all anymore, or take months to appear.

Any other ideas on how to debug? If not, I would revert to a stock kernel (the kmemleak overhead is noticeable) keeping coherent_pool=4M and switch to my new USB 3 hub. Both these changes may hide the issue, so if there's anything else you would like to see, let me know.

@olifre
Copy link
Contributor Author

olifre commented Mar 18, 2020

@lategoodbye , the issue is now back after 23 days of uptime, so the increase to coherent_pool=4M seems to have delayed it.

However:

sudo cat /sys/kernel/debug/kmemleak

gives me only empty output.

Any ideas? Can I reboot the system and revert to a "stock" kernel, or do you have anything you'd like to get extracted?

@olifre
Copy link
Contributor Author

olifre commented Mar 20, 2020

Since there were still no leaks and no further ideas, I've now:

  • Reverted to the stock kernel.
  • Switched from the USB 2 hub to a USB 3 hub connected to the USB 3 port, keeping the same USB 3 HDD.
  • bumped coherent_pool to 8M.

I'll report back if it happens again, or in a month if it does not. Any other ideas welcome.

@dustContributor
Copy link

dustContributor commented Apr 5, 2020

I had the same issue. Logs being filled with the exact same messages. Filled my SD with logs.

Raspberry Pi 4 B 4GB
---
Raspberry Pi reference 2020-02-05
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, e577677b623b577f2a0ec7cfaffc3c27da005da3, stage2
---
Jan 30 2020 13:50:45 
Copyright (c) 2012 Broadcom
version 8f792e011c8a0ea3c5d47e7cc10172cc10b93c09 (clean) (release) (start)
---
Linux rpi4 4.19.97-v7l+ #1294 SMP Thu Jan 30 13:21:14 GMT 2020 armv7l GNU/Linux

In case it helps:

I have a WiFi dongle in a USB 2 port, and an external hard drive in a USB 3 port. The only change I've made to the RPI itself is that I've disabled the built in WiFi via dtoverlay=disable-wifi

Everything else is stock. Raspbian Lite 2020-02-05 iso. Non official 4a PSU.

I'm trying the coherent_pool flag with 8M as @olifre suggested, I'll report back if it happens again.

@lmondoux
Copy link

I'm also getting these errors all of a sudden, will report back if I find anything out.

@user747
Copy link

user747 commented May 7, 2020

seeing this on a Raspberry Pi 4 Model B Rev 1.1 with a modified kernel

raspberrypi-clk: Failed to get pllb frequency: -12
raspberrypi-clk: Failed to change pllb frequency: -12

xhci_hcd 0000:01:00.0: Ring expansion failed

Linux test 5.4.38 #0 SMP Tue May 5 07:14:40 2020 aarch64 GNU/Linux

eeprom update maybe worth trying? i dont have any usb devices plugged in

rpi-eeprom-update 
BCM2711 detected
*** UPDATE REQUIRED ***
BOOTLOADER: update required
CURRENT: Tue Sep 10 10:41:50 UTC 2019 (1568112110)
 LATEST: Thu Mar 19 14:27:25 UTC 2020 (1584628045)
 FW DIR: /lib/firmware/raspberrypi/bootloader/beta
VL805: update required
CURRENT: 000137ab
 LATEST: 000137ad

@olifre
Copy link
Contributor Author

olifre commented May 7, 2020

I'm currently running with coherent_pool=8M and am up to 45 days of uptime without encountering this issue again.

Current info (apart from the high coherent_pool size):

  • Kernel 4.19.97-v7l+ (vanilla Raspbian again)
  • I am now running with the 000137ad EEPROM version, when opening the thread, I was on an older version (the release notes do not seem to suggest significant USB changes there, though).
  • I moved from an USB 2 hub with the USB 3 HDD connected to a USB 3 hub with the same drive.

Interestingly, with this hub / firmware / Kernel, I regularly get USB disconnects and reconnects when the drive goes to standby and comes back.
Maybe this helps with the issue?

@user747
Copy link

user747 commented May 10, 2020

i rebuilt to 5.4.39 with CONFIG_DEBUG_KMEMLEAK=Y
also updated eeprom to 000137ad

oot@test:~# cat /boot/bcm2711-rpi-4-b.dtb | grep pool
coherent_pool=1M 8250.nr_uarts=1 cma=64M

24 hours in haven't seen this error yet

@olifre
Copy link
Contributor Author

olifre commented May 10, 2020

I'm currently running with coherent_pool=8M and am up to 45 days of uptime without encountering this issue again.

It came back now after 48 days. Since I often reboot more often for updates anyways, coherent_pool=8M mostly hides the issue for me by sufficiently delaying it...

@user747
Copy link

user747 commented May 11, 2020

and its back

Mon May 11 03:04:48 2020 kern.err kernel: [75908.914506] xhci_hcd 0000:01:00.0: Ring expansion failed Mon May 11 03:04:48 2020 kern.err kernel: [75908.919835] hub 1-1:1.0: activate --> -12

root@test:~# cat /sys/kernel/debug/kmemleak
root@test:~#

nothing there but i have a feeling i'm not using these tool correctly

root@test:~# mount | grep debug
nodev on /sys/kernel/debug type debugfs (rw,relatime)

root@test:~# echo scan > /sys/kernel/debug/kmemleak
ash: write error: Operation not permitted

@olifre
Copy link
Contributor Author

olifre commented Jul 25, 2020

5.4.51-v7l+ has now entered stable. I wonder if the issue will go away with that, updated just now.

@user747
Copy link

user747 commented Jul 27, 2020

still present when using 5.4.52 aarch64 GNU/Linux (patched kernel with mptcp on OpenWRT)
edit: my firmware is out of date. going to update and retry.

@user747
Copy link

user747 commented Jul 31, 2020

this might be related
raspberrypi/rpi-eeprom@d7b1a49

## 2020-07-31 Standardize USB port power control accross board revisions - BETA
   * Turn off USB power power for 1-second regardless of boot-mode. This appears
     to resolve an issue on R1.3 and older board revisions where some USB
     devices would fail upon reboot. On R1.4 USB port power is turned off
     automatically by the PMIC so this is just held in reset for longer. For
     earlier board revisions the USB port power is explicitly turned off via
     XHCI.
     This can be overriden via USB_MSD_PWR_OFF_TIME in the EEPROM config.
   * Update the to latest Broadcom memsys FW - no significant functional change.

@user747
Copy link

user747 commented Oct 20, 2020

logs still being flooded on 5.4.69
using this firmware

BCM2711 detected Dedicated VL805 EEPROM detected BOOTLOADER: up-to-date CURRENT: Thu Sep 3 12:11:43 UTC 2020 (1599135103) LATEST: Thu Sep 3 12:11:43 UTC 2020 (1599135103) FW DIR: /lib/firmware/raspberrypi/bootloader/stable VL805: up-to-date CURRENT: 000138a1 LATEST: 000138a1

Tue Oct 20 23:08:54 2020 kern.err kernel: [675644.083552] xhci_hcd 0000:01:00.0: Ring expansion failed Tue Oct 20 23:08:54 2020 kern.err kernel: [675644.088965] hub 1-1:1.0: activate --> -12

@borpin
Copy link

borpin commented Jan 31, 2021

I am seeing this too.

@877dev
Copy link

877dev commented May 3, 2021

Same here!

pi@raspberrypi:/etc $ cat /etc/os-release
PRETTY_NAME="Raspbian GNU/Linux 10 (buster)"
NAME="Raspbian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"
VERSION_CODENAME=buster
ID=raspbian
ID_LIKE=debian
HOME_URL="http://www.raspbian.org/"
SUPPORT_URL="http://www.raspbian.org/RaspbianForums"
BUG_REPORT_URL="http://www.raspbian.org/RaspbianBugs"
pi@raspberrypi:/etc $ uname -a
Linux raspberrypi 5.4.51-v7l+ #1333 SMP Mon Aug 10 16:51:40 BST 2020 armv7l GNU/Linux
pi@raspberrypi:/etc $

@ayasystems
Copy link

I have the same problem

pi@domotica:~ $ cat /etc/os-release
PRETTY_NAME="Raspbian GNU/Linux 10 (buster)"
NAME="Raspbian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"
VERSION_CODENAME=buster
ID=raspbian
ID_LIKE=debian
HOME_URL="http://www.raspbian.org/"
SUPPORT_URL="http://www.raspbian.org/RaspbianForums"
BUG_REPORT_URL="http://www.raspbian.org/RaspbianBugs"
pi@domotica:~ $ uname -a
Linux domotica 5.15.68-v7l+ #1587 SMP Tue Sep 20 11:17:01 BST 2022 armv7l GNU/Linux
pi@domotica:~ $

@popcornmix
Copy link
Collaborator

You seem to be running buster (which is deprecated) with a latest (rpi-update?) kernel.
You would be advised to update to bullseye.

What does vcgencmd version report?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants