SPI speed ~2x slower than it should be on RPi 4 #3381

Open
gtrainavicius opened this issue Dec 19, 2019 · 25 comments

Comments

@gtrainavicius
Contributor

The SPI transfer speed used on RPi4 is 2 times slower (50kHz) than requested, for example here: https://github.com/raspberrypi/linux/blob/rpi-4.19.y/sound/soc/bcm/pisound.c#L329 (100kHz)

On RPi4, requesting 100000 Hz produces a 50kHz communication speed. When the requested speed is changed to 200000 Hz, RPi4 first starts at 80kHz and then at some point switches to 100kHz, which means RPi4 is capable of running at exactly 100kHz.

On RPi3B+, requesting 100000 yields 62.5kHz speed. Requesting 200000 gives 125kHz. I assume RPi3B+ does not have in-between speeds like RPi4.

The question (and bug) is: why does RPi4 pick 50kHz when 100kHz is requested, even though it can be seen to be capable of 80kHz and 100kHz speeds?

To reproduce
Without any hat connected, you may run spidev_test to test drive /dev/spidev0.0. Either an oscilloscope or a logic analyzer can be connected to the SCK line to measure the frequency, or it can be very roughly inferred from the reported data rate and the duration of execution. Here are test results of sending 80000 bytes in 16 byte transfers using 100kHz and 200kHz requested speeds on RPi4 and 3B+. They contain SPI logic analyzer timed output, as well as the spidev_test invocation, results and additional notes at the very bottom.

spi_test_results.zip

100kHz RPi4
pi@raspberrypi:~/spidev_test $ time { ./spidev_test -S 16 -I 5000 -D /dev/spidev0.0 -s 100000; }
spi mode: 0x0
bits per word: 8
max speed: 100000 Hz (100 KHz)
rate: tx 41.1kbps, rx 41.1kbps
rate: tx 41.8kbps, rx 41.8kbps
rate: tx 41.8kbps, rx 41.8kbps
total: tx 78.1KB, rx 78.1KB

real    0m18.392s
user    0m0.056s
sys     0m0.286s

Actual transfer rate: 4351.44 B/s
Theoretical 50kHz rate: 6250 B/s
Theoretical 100kHz rate: 12500 B/s

Logic analyzer measured SCK rate: 40kHz
100kHz RPi3B+
pi@raspberrypi:~/spidev_test $ time { ./spidev_test -S 16 -I 5000 -D /dev/spidev0.0 -s 100000; }
spi mode: 0x0
bits per word: 8
max speed: 100000 Hz (100 KHz)
rate: tx 54.4kbps, rx 54.4kbps
rate: tx 64.8kbps, rx 64.8kbps
total: tx 78.1KB, rx 78.1KB

real    0m11.838s
user    0m0.050s
sys     0m0.214s

Actual transfer rate: 6763.58 B/s
Theoretical 50kHz rate: 6250 B/s
Theoretical 100kHz rate: 12500 B/s

Logic analyzer measured SCK rate: 62.5kHz
200kHz RPi4
pi@raspberrypi:~/spidev_test $ time { ./spidev_test -S 16 -I 5000 -D /dev/spidev0.0 -s 200000; }
spi mode: 0x0
bits per word: 8
max speed: 200000 Hz (200 KHz)
rate: tx 69.1kbps, rx 69.1kbps
total: tx 78.1KB, rx 78.1KB

real    0m9.369s
user    0m0.040s
sys     0m0.312s

Actual transfer rate: 8544.44 B/s
Theoretical 50kHz rate: 6250 B/s
Theoretical 100kHz rate: 12500 B/s

Logic analyzer measured SCK rate: 80kHz
200kHz RPi3B+
pi@raspberrypi:~/spidev_test $ time { ./spidev_test -S 16 -I 5000 -D /dev/spidev0.0 -s 200000; }
spi mode: 0x0
bits per word: 8
max speed: 200000 Hz (200 KHz)
rate: tx 125.1kbps, rx 125.1kbps
total: tx 78.1KB, rx 78.1KB

real    0m6.120s
user    0m0.045s
sys     0m0.232s

Actual transfer rate: 13086.32 B/s
Theoretical 50kHz rate: 6250 B/s
Theoretical 100kHz rate: 12500 B/s

Logic analyzer measured SCK rate: 125kHz

As can be seen from the results of same command being executed on RPi3B+ and RPi4, the tests finish significantly faster on RPi3B+.
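The rough inference from execution time mentioned under "To reproduce" can be sketched as follows. This is only an estimate under stated assumptions: the `real` times are copied from the runs above, the helper names are made up for illustration, and the result is a lower bound on SCK because the wall-clock time also includes process start-up and gaps between transfers.

```python
TOTAL_BYTES = 16 * 5000  # -S 16 -I 5000, i.e. 80000 bytes per run


def bytes_per_second(total_bytes, elapsed_s):
    """Average throughput over the whole run."""
    return total_bytes / elapsed_s


def min_sck_hz(byte_rate, bits_per_word=8):
    """Lower bound on SCK: every word needs bits_per_word clock cycles."""
    return byte_rate * bits_per_word


# Wall-clock ("real") times reported by `time` in the runs above.
runs = {
    "RPi4, 100kHz requested": 18.392,
    "RPi3B+, 100kHz requested": 11.838,
    "RPi4, 200kHz requested": 9.369,
    "RPi3B+, 200kHz requested": 6.120,
}

for label, elapsed in runs.items():
    rate = bytes_per_second(TOTAL_BYTES, elapsed)
    print(f"{label}: {rate:.0f} B/s, SCK >= {min_sck_hz(rate) / 1000:.1f} kHz")
```

The bounds come out below the logic analyzer readings in every case, which is what one would expect if the bus is idle for part of each run.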

Expected behaviour
100kHz gets picked when requesting 100kHz speed, or at least something closer like 80kHz.

System
RPi 3B+ and RPi 4 using latest Raspbian Lite running:

Linux raspberrypi 4.19.88-v7+ #1284 SMP Wed Dec 11 13:46:41 GMT 2019 armv7l GNU/Linux

@popcornmix
Collaborator

The SPI clock is a divided down version of the core clock.
The core clock reduces when arm is not busy.

If you want to force the full speed of SPI you need to stop the core clock from clocking down.
There are a few ways of doing this:
In config.txt

force_turbo=1

or

core_freq_min=500
core_freq=500

Or from the arm:

sudo sh -c "echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor"

@gtrainavicius
Contributor Author

Is 50 kHz the best it can do when slowed down? In all cases when running the test, the system was otherwise idle. If the system managed to run SPI at 80kHz when 200kHz was requested, this seems like a bug: it should be able to get something closer to 100kHz when asked for 100kHz...

@popcornmix
Collaborator

The core clock changes outside of the spi driver's knowledge.
Therefore it assumes the highest frequency and will run slower when the core runs slower.

Typically core_freq=500 core_freq_min=200.
So when idle you will get 2/5 of the SPI frequency you requested.

@pelwell
Contributor

pelwell commented Dec 19, 2019

The SPI interfaces (and I2C and UART1) share the same clock as the VPU cores; if the VPU clock frequency changes, so does the SPI clock. The Linux SPI driver is unaware of these clock changes, so to avoid a bus speed which is too high, the clock divisor is calculated for the turbo speed. When running at the normal speed, that divisor is too high, with the result that the bus clock is too slow. Locking the VPU/core clock to a fixed value allows the divisor to be correctly calculated without affecting the ARM clock speeds.
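A minimal sketch of the divisor effect described above. The 500 MHz and 200 MHz core clocks are the typical Pi 4 values quoted earlier in this thread; the function names and the round-up-to-even step are assumptions made for illustration, not a copy of the driver's code.

```python
TURBO_CORE_HZ = 500_000_000  # clock the divisor is calculated for
IDLE_CORE_HZ = 200_000_000   # clock the core actually runs at when idle


def spi_divisor(assumed_core_hz, requested_hz):
    """Smallest even divisor that keeps SCK at or below the request."""
    div = -(-assumed_core_hz // requested_hz)  # ceiling division
    return div + (div % 2)  # round up to even (assumption)


def actual_sck_hz(core_hz, divisor):
    """SCK produced by a fixed divisor at the current core clock."""
    return core_hz / divisor


for requested in (100_000, 200_000):
    div = spi_divisor(TURBO_CORE_HZ, requested)
    print(
        f"requested {requested} Hz: divisor {div}, "
        f"turbo {actual_sck_hz(TURBO_CORE_HZ, div):.0f} Hz, "
        f"idle {actual_sck_hz(IDLE_CORE_HZ, div):.0f} Hz"
    )
```

Under these assumptions an idle Pi 4 would produce 40 kHz for a 100 kHz request and 80 kHz for a 200 kHz request, which agrees with the logic analyzer readings in the original report.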

@edo1
Contributor

edo1 commented Dec 21, 2019

What are default frequencies for RPi 3B+ and RPi 4?
I use gpu_freq=250; is that enough to have a stable SPI frequency on RPi 3B+? Or do I have to specify core_freq_min as well?
What about RPi 4?

Does force_turbo=1 affect system stability/overheating?

@popcornmix
Collaborator

Pi 4 uses 200MHz for the minimum core clock and all older Pis use 250MHz.

gpu_freq=250 will have stable frequency on Pi0-3.

Does force_turbo=1 affect system stability/overheating?

You'll have marginally higher temperatures due to the higher clock when idle, and no change when arm is busy.

@gtrainavicius
Contributor Author

I see. I assumed the SPI driver or some other driver would handle clock changes so that they would not affect the communication too much.

Is there something that can be added to a device tree overlay EEPROM to ensure a minimum clock speed, so we can make sure that the SPI communication is within minimum and maximum allowable range?

@pelwell
Contributor

pelwell commented Jan 10, 2020

See raspberrypi/firmware#1308.

@marckleinebudde
Contributor

FYI: there are two functions that might be of interest here. You can register a notifier for clock and/or CPU frequency changes:
https://elixir.bootlin.com/linux/v5.5.7/source/drivers/cpufreq/cpufreq.c#L341
https://elixir.bootlin.com/linux/v5.5.7/source/drivers/clk/clk.c#L4136

@lurch
Contributor

lurch commented Apr 15, 2020

@marckleinebudde I suspect those are for monitoring CPU-side clock changes, whereas the issue here is GPU-side clock changes (since the SPI, I2C and UART1 peripherals are running off the VPU clock rather than the CPU clock, as @pelwell describes above).

@marckleinebudde
Contributor

@lurch I'm not familiar with the raspi clock tree, but if the GPU-side clock is not mapped to the linux clock framework, this doesn't work.

@pelwell
Contributor

pelwell commented Apr 15, 2020

The raspberrypi-clk driver sends all requests to the VPU/GPU, so it's well aware of what is changing.

@David00

David00 commented Apr 6, 2021

I am also seeing about half the SPI sampling rate on my Pi 4 when compared against my Pi 3B+.

I have tried to implement the suggestions made by @popcornmix on my Pi 4 running Raspberry Pi OS (kernel 5.4.83-v7l+), and they don't seem to have any impact on the sampling rate.

The Pi docs suggest that setting cpu_freq is not supported on the Pi 4, but they don't mention much about force_turbo=1 on the Pi 4 specifically. I tried both, and neither improved my sampling rate.

Any other ideas?

@lurch
Contributor

lurch commented Apr 6, 2021

Pi 4 running Raspberry Pi OS (kernel 5.4.83-v7l+)

Raspberry Pi OS is now using a 5.10 kernel... http://downloads.raspberrypi.org/raspios_armhf/release_notes.txt
(I've got no idea if this fixes things, but perhaps it's something you'd like to test?)

@David00

David00 commented Apr 6, 2021

I upgraded to 5.10.17-v7l+ and my SPI rates are the same. Thanks for trying, though!

@David00

David00 commented Apr 21, 2021

FYI, I continued to do some more testing on my Pi 4B. Both kernel version 5.4.x and 5.10.x exhibited the same issue with SPI data rates. I went back to using kernel version 4.19 and the issue is gone. For the record, using force_turbo or core_freq and core_freq_min did not work for me.

For future readers, assuming this issue goes unresolved...

UPDATE: (June, 2022) This may no longer be a suitable workaround if you have a newer Pi with a newer bootloader. I've had trouble getting the newer bootloaders to boot 4.x kernels, but I can't find any documentation about bootloader/kernel support.

The command (on a Raspberry Pi running Raspberry Pi OS) to install a specific kernel is:

sudo rpi-update <hash>

... where <hash> is the commit hash from the following GitHub repository that correlates to the specific kernel version you want to install:

https://github.com/Hexxeh/rpi-firmware

So, to install v4.19.118, the command is:
sudo rpi-update e1050e94821a70b2e4c72b318d6c6c968552e9a2

Simply press y at the prompt, and then reboot your Pi.

@UKHKPaul

I have been experimenting with SPI driving NeoPixels from an RPi, mostly using a Pi Zero, and didn’t really see this issue. But when I tried on a Pi 4 I had the same issue with the pulses being too slow.

Same cause: the CPU clock was slowing down (the range is much wider on the Pi 4 than on the Pi Zero).
In my case, as it’s my own SPI driver, I did a simple fix of checking the CPU speed before starting the write, and used that speed to recalculate the effective speed to drive the SPI, i.e. at full speed I use a 1:1 ratio, but as the clock speed drops I use a nominally higher speed.

In practice I also found that it helped to check the speed at the end, and if it had changed then I just did a retry.

In my case I drive 20+ NeoPixels with no observable issues.

Note: I also found that a pull-down resistor on the SPI pin from the Pi helped with stray cases.
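A rough sketch of the check, scale and retry approach described in this comment, assuming a linear relationship between the clock reading and the resulting SPI speed. All names here are hypothetical, the clock reader and the write function are injected as callables, and the demo uses made-up readings instead of real hardware.

```python
def compensated_speed(target_hz, cur_khz, max_khz):
    """Nominal speed to request so the bus runs near target_hz right now."""
    return round(target_hz * max_khz / cur_khz)


def write_with_retry(read_khz, do_write, target_hz, max_khz, attempts=3):
    """Write once per attempt; retry if the clock changed during the write."""
    for _ in range(attempts):
        before = read_khz()
        do_write(compensated_speed(target_hz, before, max_khz))
        if read_khz() == before:
            return True  # clock was stable for the whole write
    return False


# Demo with fake readings (kHz): the clock ramps up during the first write.
readings = iter([600_000, 1_500_000, 1_500_000, 1_500_000])
requested = []
ok = write_with_retry(lambda: next(readings), requested.append,
                      target_hz=1_000_000, max_khz=1_500_000)
print(ok, requested)
```

In the demo the first write is issued at a compensated 2.5 MHz, the clock change is detected afterwards, and the retry goes out at the plain 1 MHz.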

@aharish879

The SPI clock is a divided down version of the core clock. The core clock reduces when arm is not busy.

If you want to force the full speed of SPI you need to stop the core clock from clocking down. There are a few ways of doing this: In config.txt

force_turbo=1

or

core_freq_min=500
core_freq=500

Or from the arm:

sudo sh -c "echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor"

Hi,
We are trying to send 68-byte chunks continuously from user space to kernel space (spidev.c) on a Raspberry Pi 4B board. The time interval between two chunks is approximately 50us, and we want to reduce this turnaround time so that we can achieve higher data rates. We have tried the SPI full speed optimization methods above, but we are still not observing lower turnaround times. Could you please suggest any optimizations? Thank you in advance.

@5ft24dave

Has this been addressed in the recent 5.15.44 kernel?

@aharish879

aharish879 commented Jun 11, 2022

Hi,
Thank you for reply, I have used 5.10.x Kernel Version

@janvanhulzen

I have used this test on my Raspberry Pi 4 (4Gb) with OS (Debian version: 11 (bullseye)) and it seems to work as intended. I have tested spidev0.0 at 100kHz and 1 MHz.
[Images: SPI_SLCK_1MHz, SPI_SLCK_50kHz, SPI_SLCK_100kHz]

@David00

David00 commented Sep 18, 2022

@janvanhulzen, thanks for sharing the pics with everyone. (Additional context about your setup, and specific kernel version would be helpful)

I just ran a test on the latest Raspberry Pi OS Lite build (Sept 6 '22), kernel 5.15.61-v7l+, using spidev_test.c, on a Pi 4.

Without modifying the clock settings, the SPI clock is all over the place. I would check the CPU frequency immediately before running an spidev test at 1MHz and I can clearly see correlation between CPU clock speed and SPI clock speed.

cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq ; sudo ./test -I 10000 -D /dev/spidev0.0 -s 1000000 -I 10

(where test is my output binary after compiling spidev_test.c)
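The frequency check at the start of that command can also be done from a script. A small sketch, relying on the fact that `scaling_cur_freq` reports its value in kHz; the helper names are made up here, and the file is only read when it exists so the snippet also runs on other machines.

```python
from pathlib import Path

# Same sysfs file as in the command above; the value is reported in kHz.
CUR_FREQ = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq")


def parse_khz(text):
    """Convert the raw sysfs text (e.g. '600000\n') to an integer in kHz."""
    return int(text.strip())


def format_freq(khz):
    """Human-readable form of a kHz value."""
    if khz >= 1_000_000:
        return f"{khz / 1_000_000:g} GHz"
    return f"{khz / 1000:g} MHz"


if CUR_FREQ.exists():
    print("CPU:", format_freq(parse_khz(CUR_FREQ.read_text())))
else:
    print("cpufreq sysfs entry not found on this machine")
```

Logging this value next to each SPI measurement makes it easier to line up scope readings with the CPU frequency at the time of the transfer.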

The results of the SPI clock speed, as measured by my scope on the SCLK pin, at various CPU clocks are as follows:

CPU: 600 MHz
SPI: 355 kHz

CPU: 700 MHz
SPI: 404 kHz

CPU: 900 MHz
SPI: 404 kHz

CPU: 1.8 GHz
SPI: 888 kHz

You can see that the final result where the CPU clock is at its peak is the only result remotely near the requested 1 MHz SPI clock.

So, I tested with force_turbo=1 as @popcornmix recommended a while ago, and this gave me consistent results of a 1.8 GHz CPU clock and an 888 kHz SPI clock. I also measured the total power consumption of the Pi and it increased by about 100mW on average after adding force_turbo=1.

The C driver for the SPI interface demonstrates, at least, that kernel 5.15.61-v7l+ is fine.

In my application that uses the Python spidev library, my sample rates are still about half as fast compared to the sample rate on a v4 kernel, but the SPI clock as measured on the wire has no issue, so there is a problem higher up in the stack. At least now I'm pretty confident that it is not due to the kernel.

@gtrainavicius
Contributor Author

I notice that on Pi 5 the SPI speed seems to be constant, regardless of CPU scaling and current CPU frequency, while Pi 4 and earlier models keep changing the speeds.

Is it possible for a kernel module to know that it is running on a Pi 5? (Or have a device tree overlay dedicated for Pi 5 and the rest of the Pis which would pass a param to the kernel module? (in particular, I'd like to pass the SPI baud rate to use for different models))

@rdpoor

rdpoor commented Mar 8, 2024

FWIW, I'm observing what I presume is the same issue on an RPi4 running Debian 1:6.1.63-1+rpt1 (2023-11-24).

From a python script, using the spidev package, I can start the SPI SCK at 50 MHz, but about six seconds later, it drops down to 20MHz. Needless to say, I didn't expect this behavior.

Here's the CPU info in case it is useful to anyone:

$ lscpu
Architecture:            aarch64
  CPU op-mode(s):        32-bit, 64-bit
  Byte Order:            Little Endian
CPU(s):                  4
  On-line CPU(s) list:   0-3
Vendor ID:               ARM
  Model name:            Cortex-A72
    Model:               3
    Thread(s) per core:  1
    Core(s) per cluster: 4
    Socket(s):           -
    Cluster(s):          1
    Stepping:            r0p3
    CPU(s) scaling MHz:  40%
    CPU max MHz:         1500.0000
    CPU min MHz:         600.0000
    BogoMIPS:            108.00
    Flags:               fp asimd evtstrm crc32 cpuid
Caches (sum of all):
  L1d:                   128 KiB (4 instances)
  L1i:                   192 KiB (4 instances)
  L2:                    1 MiB (1 instance)
Vulnerabilities:
  Gather data sampling:  Not affected
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Not affected
  Retbleed:              Not affected
  Spec rstack overflow:  Not affected
  Spec store bypass:     Vulnerable
  Spectre v1:            Mitigation; __user pointer sanitization
  Spectre v2:            Vulnerable
  Srbds:                 Not affected
  Tsx async abort:       Not affected

@popcornmix
Collaborator

@rdpoor the solutions are any one of the following:
Switch to the performance governor:

echo performance | sudo tee /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

Boost the minimum core frequency. Add to config.txt

core_freq_min=500

Force turbo. Add to config.txt

force_turbo=1
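To check which of the config.txt options above are already present, something like the following could be used. It is a deliberately simple sketch: `parse_config` and `mitigations` are hypothetical helpers, section filters such as `[pi4]` are ignored rather than evaluated, and the sample text stands in for the real file.

```python
def parse_config(text):
    """Collect key=value settings, skipping comments and [section] lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("["):
            continue
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def mitigations(settings):
    """List the clock-related options from this thread that are set."""
    found = []
    if settings.get("force_turbo") == "1":
        found.append("force_turbo=1")
    if "core_freq_min" in settings:
        found.append(f"core_freq_min={settings['core_freq_min']}")
    return found


sample = """\
# example config.txt content (illustrative only)
dtparam=spi=on
[pi4]
core_freq_min=500
"""
print(mitigations(parse_config(sample)))
```

The governor setting is not covered here because it lives in sysfs rather than in config.txt.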
