Skip to content

Conversation

@Finomnis
Copy link
Contributor

@Finomnis Finomnis commented May 6, 2025

Fixes #89557.

The main architectural change is that the RX/TX halves of a transmission now signal that they are finished by disabling their interrupts.

This in turn will then be picked up further down in the ISR as the signal that the entire transmission is over.

The order of operations inside the handle_tx_irq had to be changed, so that the interrupt only gets disabled once both of those are fulfilled:

  • no more TX data available
  • physical transmission finished

Before, the IRQ got disabled while the physical transmission was not finished yet; for that to be a given we must be out of data and an active TX interrupt must exist. In the current code, it got disabled once we had no more data, but the TX interrupt wasn't set.

The second fix is that RX will be cleared and RX interrupts will be disabled every time we get an interrupt and we have no more data to send. Previously, this was only done if an RX interrupt was pending, which made it possible that the RX interrupts were still enabled at the end of the transmission.

@github-actions github-actions bot added platform: NXP Drivers NXP Semiconductors, drivers area: SPI SPI bus labels May 6, 2025
@mstumpf-vected mstumpf-vected force-pushed the fix_nxp_lpspi_isr branch 4 times, most recently from 730b71e to 91fe6c1 Compare May 6, 2025 19:31
@decsny
Copy link
Member

decsny commented May 6, 2025

@hakehuang please run full regression test on this PR

@Finomnis
Copy link
Contributor Author

Finomnis commented May 6, 2025

@decsny how do I do this and what hardware do I need?

Copy link
Member

@decsny decsny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please split into two commits, first the RX change then the TX change, and describe more clearly in the commit messages, most people will read that in the future rather than the PR description

@decsny
Copy link
Member

decsny commented May 6, 2025

@decsny how do I do this and what hardware do I need?

I pinged Hake with that message, not directed at you. He has a hardware range to test on many different platforms

BTW the PR makes sense and seems fine to me, but I will wait for his response before approving, because we are always playing whack a mole with this LPSPI since it's on dozens of different platforms and seems to behave slightly differently on all of them. So very often a fix for one platform breaks some others

@hakehuang
Copy link
Contributor

@hakehuang please run full regression test on this PR

@decsny , sure, will feedback once done

@hakehuang
Copy link
Contributor

@hakehuang please run full regression test on this PR

@decsny , I run regression test on this PR, no issues found.

decsny
decsny previously approved these changes May 7, 2025
@Finomnis
Copy link
Contributor Author

Finomnis commented May 7, 2025

please split into two commits, first the RX change then the TX change, and describe more clearly in the commit messages, most people will read that in the future rather than the PR description

I'm not sure how to split them, both changes are required to fix the problem. They all mesh together.

I can definitely write more detail into the commit message, though.

@Finomnis
Copy link
Contributor Author

Finomnis commented May 7, 2025

@decsny I added a lot of detail into the commit message. I hope it is good enough now.

@Finomnis
Copy link
Contributor Author

Finomnis commented May 7, 2025

Yah I think I'm at the point again where it can be merged once accepted :)

@Finomnis
Copy link
Contributor Author

Bump

@Finomnis
Copy link
Contributor Author

Finomnis commented May 14, 2025

@decsny @mmahadevan108 @dleach02 Bump again :) Sorry if I'm being stressful, but this is kind of being a blocker for us right now

@decsny
Copy link
Member

decsny commented May 14, 2025

@decsny @mmahadevan108 @dleach02 Bump again :) Sorry if I'm being stressful, but this is kind of being a blocker for us right now

I can't merge it.

Out of curiosity, why is it blocking you if you already fixed it? It needs to be in mainline ASAP for some reason?

@Finomnis
Copy link
Contributor Author

Finomnis commented May 14, 2025

@decsny Nah, you are right. It just feels wrong to base your product code on a temporary bugfix branch.

I apologize and take back the urgency.

@Finomnis
Copy link
Contributor Author

No changes, just rebased to newest main

@decsny
Copy link
Member

decsny commented May 14, 2025

@decsny Nah, you are right. It just feels wrong to base your product code on a temporary bugfix branch.

I apologize and take back the urgency.

IMO you should not base the final product code on the unstable indev mainline either, it should be based on a stable release point with whatever cherry picked patches you need on top, but that's up to you

@Finomnis
Copy link
Contributor Author

Finomnis commented May 14, 2025

IMO you should not base the final product code on the unstable indev mainline either, it should be based on a stable release point with whatever cherry picked patches you need on top, but that's up to you

Because a bunch of fixes I contributed were not backported to 4.1.0, and upstream has some functionality we need, especially regarding the native-sim sdl display. Don't worry, we are still in development and once we go towards releasing, we will most likely stop at the next stable release and then only update from release to release

There was a race condition where `lpspi_end_xfer` can be called multiple
times per transfer. There was the case where a TX interrupt gets
triggered without the RX interrupt being set, and TX finishes writing
its last byte. Then, `spi_context_rx_len_left() == 0` is true and
`lpspi_end_xfer` happens, but the RX interrupt is still active. Then,
when the RX interrupt happens, `lpspi_end_xfer` will get called again.

To fix that, the architecture was adjusted to only call `lpspi_end_xfer`
once no interrupts are active any more, and the disabling of the
interrupts gets used to signal the end of the TX and RX part.

Minor adjustments were necessary to use the interrupt enable signals for
this purpose; the TX irq handler had its internal order reversed,
otherwise it wasn't guaranteed that the physical transfer is finished
when we disable the interrupt.

Also, the code where the RX interrupt gets disabled had to be moved out
of the RX irq handler, because the RX interrupt also needs to be
disabled if RX is finished but no RX interrupt is currently active.

Signed-off-by: Martin Stumpf <finomnis@gmail.com>
@sonarqubecloud
Copy link

@kartben kartben merged commit 7dd6f94 into zephyrproject-rtos:main May 16, 2025
26 checks passed
@Finomnis Finomnis deleted the fix_nxp_lpspi_isr branch July 22, 2025 11:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: SPI SPI bus platform: NXP Drivers NXP Semiconductors, drivers

Projects

None yet

Development

Successfully merging this pull request may close these issues.

NXP LPSPI ISR handling dodgy and broken

7 participants