Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SPI can suffer from overruns due to TX/RX asymmetry #350

Closed
bcantrill opened this issue Dec 28, 2021 · 1 comment
Closed

SPI can suffer from overruns due to TX/RX asymmetry #350

bcantrill opened this issue Dec 28, 2021 · 1 comment
Assignees
Labels
robustness Fixing this would improve robustness of deployed firmware stm32 Has specific implications for STM32 processors

Comments

@bcantrill
Copy link
Collaborator

With the fix for #260, we will entirely fill the TX FIFO when we can. As the data there indicates, this results in better performance -- but it also introduces an asymmetry: because we will fill the TX FIFO as much as we can but only drain the RX FIFO a byte at a time, it is relatively easy to get into RX overruns. (This can be seen when doing operations that clearly exceed the FIFO size.) The overruns are then not handled well (see #349), resulting in an infinite loop. The fix is to do to the RX side what we did to the TX side: continue to drain it as long as we can. If/when the TX FIFO empties, SPI transmission will cease until it is refilled, assuring that we throttle on our true ability to drain.

@bcantrill bcantrill self-assigned this Dec 28, 2021
bcantrill added a commit that referenced this issue Dec 28, 2021
ITM broken on Nucleo H743/H753 boards (#346)
SPI overruns are not handled, resulting in infinite loop (#349)
SPI can suffer from overruns due to TX/RX asymmetry (#350)

In addition to fixing these bugs, this work removes the Gimlet hostflash
task from the H743/H753 boards -- and replaces it with the venerable ping.
@cbiffle
Copy link
Collaborator

cbiffle commented Jan 5, 2022

I am disappointed that the STM32H7 SPI block doesn't have an interlock that can be used to stop transmission when the RX fifo approaches full. Since SPI is lock-step, you can do this sort of thing. But, I just re-read the manual, and they didn't.

We should definitely alter the driver to dequeue many received bytes when possible -- I actually have a floating change from the plane to MN where I did this, I can see about tracking it down. (I also made it stop using one syscall per byte to deliver into the borrow.)

Because system performance can vary and load-dependent failures are the literal worst, we may also want to make the driver more robust against being starved on the RX side. Concretely, rather than loading the TX fifo to capacity, we may want to only load it with the number of bytes that are free in the RX fifo. This will cause the transmitter underrun interlock to halt driving the bus before an RX overrun can occur.

That being said, we should still handle the overrun case better -- it should be an error returned to the caller, since data was lost.

@cbiffle cbiffle added robustness Fixing this would improve robustness of deployed firmware stm32 Has specific implications for STM32 processors labels Jan 5, 2022
bcantrill added a commit that referenced this issue Jan 10, 2022
This fixes the following bugs:

- Spi.exchange() hangs if RX bytes exceeds TX bytes (#341)
- ITM broken on Nucleo H743/H753 boards (#346)
- SPI overruns are not handled, resulting in infinite loop (#349)
- SPI can suffer from overruns due to TX/RX asymmetry (#350)
    
In addition to fixing these bugs, this work removes the Gimlet hostflash
task from the H743/H753 boards -- and replaces it with the venerable
"ping" task (which demonstrates a panicking task).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
robustness Fixing this would improve robustness of deployed firmware stm32 Has specific implications for STM32 processors
Projects
None yet
Development

No branches or pull requests

2 participants