Fast double-buffered DMA #226

Merged
merged 3 commits into from Jun 7, 2021

@jordens (Member) commented May 24, 2021

When handling multiple double-buffered DMA streams, the current safe-and-universal API becomes slower than necessary.
This adds `next_dbm_transfer_with()`, a closure-based API for accessing the inactive buffer without address poisoning, compiler fences, data barriers, or buffer swapping. For some applications this reduces the overhead significantly (a speedup by a factor of 5).
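As a rough illustration of the closure-based idea, here is a minimal host-side sketch. It does not use the HAL: `MockDbmStream`, `hardware_complete`, and `with_inactive` are hypothetical names invented for this example, and the real signature of `next_dbm_transfer_with()` is not shown in this conversation. The point is only that the closure borrows the inactive buffer in place, so no buffer is moved or swapped.

```rust
/// Which of the two buffers the (mock) DMA engine is currently filling.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum CurrentBuffer {
    First,
    Second,
}

/// Hypothetical stand-in for a double-buffered DMA stream (not the HAL type).
pub struct MockDbmStream<const N: usize> {
    buffers: [[u16; N]; 2],
    /// Models the hardware "current target" bit.
    current: CurrentBuffer,
}

impl<const N: usize> MockDbmStream<N> {
    pub fn new() -> Self {
        Self {
            buffers: [[0; N]; 2],
            current: CurrentBuffer::First,
        }
    }

    /// Simulates the hardware completing a transfer: the active buffer is
    /// filled with `value` and the target flips to the other buffer.
    pub fn hardware_complete(&mut self, value: u16) {
        match self.current {
            CurrentBuffer::First => {
                self.buffers[0] = [value; N];
                self.current = CurrentBuffer::Second;
            }
            CurrentBuffer::Second => {
                self.buffers[1] = [value; N];
                self.current = CurrentBuffer::First;
            }
        }
    }

    /// Runs `f` on the buffer the DMA engine is NOT currently filling.
    /// The buffer is borrowed in place; nothing is swapped or moved.
    pub fn with_inactive<F, T>(&mut self, f: F) -> T
    where
        F: FnOnce(&mut [u16; N], CurrentBuffer) -> T,
    {
        let current = self.current;
        let inactive = match current {
            CurrentBuffer::First => &mut self.buffers[1],
            CurrentBuffer::Second => &mut self.buffers[0],
        };
        f(inactive, current)
    }
}
```

In a handler, the closure would typically copy out or process the samples and return a small result, which keeps the per-interrupt work to a borrow and a call.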

Currently there is a delay of 2 PCLK cycles (equivalent to up to 8 CPU cycles) when
clearing transfer complete flags, which allows the update to trickle through the
bus matrices, the peripheral, and the IRQ synchronizer.
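The "up to 8 CPU cycles" figure follows from the ratio between the CPU clock and the peripheral clock. A small sketch of that arithmetic is below; the function name and the example clock frequencies in the usage note are assumptions chosen for illustration, not values taken from this PR.

```rust
/// Converts a delay given in peripheral clock (PCLK) cycles into CPU cycles,
/// rounding up. Hypothetical helper for illustration only.
pub fn cpu_cycles_for_pclk_delay(pclk_cycles: u32, cpu_hz: u32, pclk_hz: u32) -> u64 {
    // Use u64 so the intermediate product cannot overflow.
    let numerator = pclk_cycles as u64 * cpu_hz as u64;
    let denominator = pclk_hz as u64;
    // Ceiling division: a partially elapsed CPU cycle still has to be waited out.
    (numerator + denominator - 1) / denominator
}
```

For example, with an assumed 400 MHz CPU clock and a 100 MHz PCLK, `cpu_cycles_for_pclk_delay(2, 400_000_000, 100_000_000)` gives 8, while a 2:1 ratio would give 4.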

In many cases the delay is not required because enough instructions
follow before exiting the ISR to fully absorb it.

This change adds a method to clear the flag without additional delay.
This adds a new unsafe method to the DMA API for significantly faster
transfer handling.

In many cases the safety and universality offered by `next_transfer_with`
is not necessary and costly in terms of cycles. This is especially the case
when multiple double-buffer DMA streams are handled together and when the
risk of losing the race to the inactive buffer against the running DMA
transfer is either acceptable or excluded by design of the handler.
@richardeoin (Member) commented

It's really interesting that such a dramatic speedup is possible. I guess that's with release mode / optimisations enabled in both cases?

The PR looks good to me.

Even without address poisoning and buffer swapping we can detect
the case where the user processing is too slow and DMA wins the race to
the inactive buffer. This returns `Err(DMAError::Overflow)` in that case
at the small expense of a single additional peripheral read.
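The overflow check described above can be sketched on the host as follows. This is not the HAL implementation: `Target`, `DmaError`, and `process_inactive` are hypothetical names, and a `Cell` stands in for the peripheral register that reports which buffer the DMA engine is currently targeting. The idea is to read the target once before the closure and once after; if it changed, the DMA engine has moved on to the buffer that was being processed.

```rust
use std::cell::Cell;

/// Which buffer the (mock) DMA engine is currently writing to.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Target {
    First,
    Second,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum DmaError {
    Overflow,
}

/// Runs `f` on the inactive buffer, then re-reads the current target.
/// `current_target` models the peripheral register (an assumption of this sketch).
pub fn process_inactive<F, T>(
    current_target: &Cell<Target>,
    buffers: &mut [[u16; 4]; 2],
    f: F,
) -> Result<T, DmaError>
where
    F: FnOnce(&mut [u16; 4]) -> T,
{
    // First read: decide which buffer is inactive.
    let before = current_target.get();
    let inactive = match before {
        Target::First => &mut buffers[1],
        Target::Second => &mut buffers[0],
    };
    let result = f(inactive);
    // Single additional read: if the target flipped while `f` ran, the DMA
    // engine won the race and may have overwritten the data being processed.
    if current_target.get() != before {
        Err(DmaError::Overflow)
    } else {
        Ok(result)
    }
}
```

In the mock, a test can flip the `Cell` from inside the closure to simulate processing that takes too long; on real hardware the flip would come from the DMA controller itself.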
@jordens jordens marked this pull request as ready for review June 1, 2021 12:27
@jordens (Member, Author) commented Jun 1, 2021

Yes. That's always in release mode. The gain comes mostly from eliminating the buffer `Option` swapping operations, several address writes and reads through the slower bus matrices, various checks of DMA state that depend on whether it is in double-buffer mode or not, and the redundant fence in the case of multi-DMA handlers.

@jordens jordens requested a review from richardeoin June 1, 2021 12:32
@richardeoin (Member) commented

Thanks!

bors r+

@bors bors bot merged commit acd47be into stm32-rs:master Jun 7, 2021
richardeoin added a commit to richardeoin/stm32h7xx-hal that referenced this pull request Jul 18, 2021
richardeoin added a commit to richardeoin/stm32h7xx-hal that referenced this pull request Aug 28, 2021
richardeoin added a commit to richardeoin/stm32h7xx-hal that referenced this pull request Sep 14, 2021