Explore DMA-based APIs built on top of &'static mut references #37
Comments
Great stuff. I like how you're tackling DMA. However, I feel there's one very important piece missing in the embedded-hal world: the ability to tie interrupts to HAL peripherals. You nicely described how DMA gets rid of the need to wait for the transfer to complete on the spot, but you still need to spend (potentially) a lot of cycles checking whether your transfer has completed at some point. Whether that is better or not depends on the application; after all, you still might do other processing in interrupt handlers while busy waiting for a single transmission to complete. |
I should probably add that I view interrupt support as a much more important topic for async IO, and DMA mostly as a method to reduce interrupt frequency and contention. DMA without interrupts is not actually that useful. |
I still need to do some reading (my DMA knowledge is extremely limited), but it would also be interesting how to model peripheral to peripheral transfers. For example, if you (for some reason) wanted to have a DMA driven UART passthrough (data comes in on one, goes out on the other), how would this be modelled? The transaction should take ownership of the serial ports and DMA channel, but wouldn't require a |
@jamesmunns As far as I know STM32 does not support peripheral to peripheral DMA transfers. Do other microcontrollers have such feature? |
@protomors based on reading this thread, I thought it might be possible, though that might be more by accident than by design. AN4031 section 1.1.4 though seems to exclude p-to-p DMA. I can find some mentions of peripheral to peripheral DMA being used on other architectures, such as the PIC32, as well as the PSoC3 (8051), but can't really find any compelling examples for Cortex-M based devices. I don't have a use case (or experience with this), but it looks like this is "not a problem" at the moment. |
Actually, I just found the PSoC5LP (Cortex-M3 based) does support this, judging by this app note from Cypress, but this still looks like a pretty rare feature (might not be worth defining interfaces/traits for the edge cases). |
@jamesmunns Plenty of Cortex-M chips have peripheral to peripheral communication, but it's often not called DMA (because it isn't necessarily just that), e.g. the Nordic nRF chips have it and call it Programmable Peripheral Interconnect (PPI); the ATSAMD calls it EVSYS. As I mentioned before, an important aspect of DMA is being able to process interrupts generated by the DMA engine, and "p-to-p" is about letting one peripheral know that some other peripheral has finished processing; whether that happens via DMA or not is unimportant. If you have to poll for DMA completion it's pretty much just a waste of resources. |
Interrupts are a topic orthogonal to the DMA that deserve their own issue / discussion thread. I have my thoughts on the topic but I won't voice them in this issue. Also you can already use HAL abstractions with interrupts using RTFM at zero cost.
You don't need interrupt handlers / callbacks to avoid busy waiting for the DMA transfer to complete either (if that's what you are implicitly suggesting). If you use cooperative multitasking (i.e. generators) you can suspend ( [...]
I don't know about peripheral to peripheral transfer but the reference implementation is missing DMA functionality like memory to memory transfers, priority levels and burst mode. However, I think that functionality could be provided as methods on the channel types; it just needs to be implemented. Mem2mem transfers sound like they may be useful to have as a trait but I'm not sure if priority and burst mode should have traits. |
What I'm trying to get at here is that there are a few ways to avoid the busy waiting and they have more to do with the task model (reactive / callback-y / interrupt-driven, cooperative, serial / blocking, etc.) than the DMA abstraction. I'd like the DMA abstraction to be general / flexible enough to support all the task models out there with minimal code duplication (I don't know if that's possible at this point though) |
Well, yes, you can. But it's a completely manual and static setup and fully separate from the hal drivers.
That's not what I meant. Even with cooperative approaches you need to periodically check in whether the transfer has finished and depending on how slow the transfer is you might still waste lots of MCU cycles doing that.
Fair enough. I still stand by my point that DMA was introduced to reduce interrupt overhead and contention and is somewhat pointless without being able to use interrupts. |
With a naive implementation, yes. Above, I was referring to a Tokio (epoll / kqueue) like implementation; there a task is only resumed when it can make progress -- this eliminates the continuous polling you mention. I already have an implementation in my head but haven't had time to test it; it uses the interrupt mechanism (NVIC), because otherwise you can't go to sleep, but doesn't involve any (device specific) interrupt handler. |
So how would a task know it can make progress without polling or being notified by an interrupt handler?
That's yet another aspect why you want to be able to use interrupts. ;) I'll be happy to look at your ideas when you have something. Just wanted to throw in that for me personally DMA is way down on the implementation list without a better way to use interrupts with drivers, it's just a ton of work to implement with very little benefit over a stupid/simple register based implementation. |
Current status of the DMA API design: I think we have a clear idea of what the API should provide: It MUST
It SHOULD support these features:
This is out of scope:
What we know so far from existing implementations:
Unresolved questions:
Attempts at the problem:
|
@japaric IIRC (at least with DMA USB drivers) regardless of word size the buffer has to be 32-bit aligned on the M3s I've used. Almost definitely varies with platform though. |
I was a bit stuck trying to figure out how to use this for a while with RTFM: japaric/ws2812b#5
I found

    type TX = Option<Either<(TX_BUF, dma1::C4, Tx<USART1>), Transfer<R, TX_BUF, dma1::C4, Tx<USART1>>>>;

I was wishing for a way to reuse a Transfer repeatedly (update buffer and 'go again') or get back all the channels and parts to make a new one over the same resources. This does that, but I wonder if there's a nicer way to put more of this in the generic hal rather than user code?
However, this (in the SYS_TICK handler of that example) surprised me:

    let (buf, c, tx) = match r.TX.take().unwrap() {
        Either::Left((buf, c, tx)) => (buf, c, tx),
        Either::Right(trans) => trans.wait(),
    };
So:
|
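For readers who want to try the idle-or-busy resource pattern from the comment above on a host machine, here is a minimal sketch. All types (`Channel`, `Tx`, `Transfer`, `TxResource`) and the `on_tick` function are hypothetical mocks, not the stm32f103xx-hal API; the "transfer" completes instantly and `wait` simply hands the parts back.

```rust
/// Minimal stand-in for the `Either` type used in the comment above.
enum Either<L, R> {
    Left(L),
    Right(R),
}

/// Hypothetical DMA channel and serial transmitter singletons.
struct Channel;
struct Tx;

type Buffer = &'static mut [u8; 4];

/// Hypothetical ongoing transfer that owns the buffer, channel and transmitter.
struct Transfer {
    buffer: Buffer,
    channel: Channel,
    tx: Tx,
}

impl Transfer {
    /// A real implementation would block until the DMA is done;
    /// this mock just returns the owned parts.
    fn wait(self) -> (Buffer, Channel, Tx) {
        (self.buffer, self.channel, self.tx)
    }
}

impl Tx {
    /// Consumes all three resources and returns the (mock) ongoing transfer.
    fn write_all(self, channel: Channel, buffer: Buffer) -> Transfer {
        Transfer { buffer, channel, tx: self }
    }
}

/// Either the idle parts (Left) or an ongoing transfer (Right).
type TxResource = Option<Either<(Buffer, Channel, Tx), Transfer>>;

/// Recovers the parts (waiting if a transfer is in flight), updates the
/// buffer, and starts the next transfer over the same resources.
fn on_tick(resource: &mut TxResource, byte: u8) {
    let (buffer, channel, tx) = match resource.take().unwrap() {
        Either::Left(parts) => parts,
        Either::Right(transfer) => transfer.wait(),
    };
    buffer[0] = byte;
    *resource = Some(Either::Right(tx.write_all(channel, buffer)));
}

fn main() {
    // Box::leak is used here only to obtain a `&'static mut` buffer on a host.
    let buffer: Buffer = Box::leak(Box::new([0u8; 4]));
    let mut resource: TxResource = Some(Either::Left((buffer, Channel, Tx)));
    on_tick(&mut resource, 7);
    on_tick(&mut resource, 9);
}
```

Note that in this shape every tick after the first goes through `wait`, which on real hardware blocks if the previous transfer has not finished yet.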
I spent some time adding DMA support to the stm32l4x6_hal crate, and wanted to add some notes. I attempted to use the Transfer and CircBuffer abstractions with serial ports and RTFM. Because the Transfer owns its Tx or Rx pin and the buffer, and |
So, the PS2, which I've been developing on, has a hardware requirement for 16-byte aligned DMA (it ignores the lower four bits of the specified memory address). For the sake of throwing ideas out, perhaps the implementation could define a zero-sized I'd rather not reinvent the DMA interface wheel if somebody's going to change its shape. |
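One way to express a 16-byte alignment requirement like the one described above is a wrapper type with an explicit alignment attribute. This is only a sketch under that assumption: the names `Aligned16` and `is_dma_aligned` are hypothetical and are not part of any proposed embedded-hal API.

```rust
use core::mem::align_of;

/// Hypothetical wrapper that forces its contents onto a 16-byte boundary.
#[repr(C, align(16))]
struct Aligned16<T>(T);

/// Checks that the wrapped buffer starts on a 16-byte boundary,
/// i.e. the lower four address bits are zero.
fn is_dma_aligned(buffer: &Aligned16<[u8; 64]>) -> bool {
    (buffer.0.as_ptr() as usize) % 16 == 0
}

fn main() {
    let buffer = Aligned16([0u8; 64]);
    println!("type alignment: {}", align_of::<Aligned16<[u8; 64]>>());
    println!("buffer aligned: {}", is_dma_aligned(&buffer));
}
```

A DMA API could then accept only buffers of such a wrapper type, so the alignment requirement is checked at compile time rather than at run time.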
I think one way to do it is to declare your resource The |
On the topic of this RFC, I see a need for non-static DMA buffer for transactions that will be using the DMA but are blocking. It might seem counterintuitive, but the byte-by-byte APIs of various MCU devices are sometimes impossible to use at the speed of the physical interface (I suspect it's because of the bus traffic in between the bytes), but there is still no interest in going full async on the API (in particular by blocking we can keep the buffer on the stack, then release it). Here is a mention of the problem: https://javakys.wordpress.com/2014/09/04/how-to-implement-full-duplex-spi-communication-using-spi-dma-mode-on-stm32f2xx-or-stm32f4xx/ Here is some code for your consideration:
On a slightly different subject, I have rendered the buffer |
@nraynaud if the operation is blocking I think there's no need for taking the receiver, or the buffer, by value. Something like Also, the blog post linked in the issue description is a bit dated by now. The embedonomicon has the latest information on DMA; there have been a few updates: Pin instead of &'static mut (which allows Box, Rc, etc) and the compiler fences have been softened (while preserving correctness). |
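To illustrate the point about blocking operations, the sketch below borrows both the transmitter and a stack-allocated buffer for the duration of the call. `MockDmaTx` and `write_all_blocking` are hypothetical names, and the DMA itself is mocked by copying bytes into a `Vec`; a real implementation would start the transfer and spin until the hardware reports completion before returning.

```rust
/// Hypothetical DMA-backed transmitter; the mock records what was "sent".
struct MockDmaTx {
    sent: Vec<u8>,
}

impl MockDmaTx {
    /// Blocking write: because the function does not return until the
    /// transfer is over, borrowing `self` and `buffer` is sufficient.
    fn write_all_blocking(&mut self, buffer: &[u8]) {
        // Real hardware: program the channel, enable it, busy-wait for completion.
        self.sent.extend_from_slice(buffer);
    }
}

/// Sends a buffer that lives on the stack and is released afterwards.
fn send_greeting(tx: &mut MockDmaTx) -> usize {
    let stack_buffer = *b"hello";
    tx.write_all_blocking(&stack_buffer);
    stack_buffer.len()
}

fn main() {
    let mut tx = MockDmaTx { sent: Vec::new() };
    let count = send_greeting(&mut tx);
    println!("sent {} bytes", count);
}
```

Since no transfer value escapes the call, the buffer does not need a stable `'static` address here.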
Thanks, I will read all that, and try to use references. It didn't occur to me. Do you have a documentation/blog post on the compiler fences and when to use them? What about |
@nraynaud the embedonomicon explains why the compiler fences are needed. Volatile ops can't be reordered with respect to each other, but non-volatile ops (like operations on buffers) can be reordered with respect to volatile ops; the compiler fences are there to prevent the latter, which can result in compiler misoptimizations. |
I've tried to rewrite the DMA transfer according to the embedonomicon here but haven't tested it yet. |
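As a rough illustration of where such fences go, the sketch below places a release fence before the volatile write that would start a transfer and an acquire fence after the volatile write that would stop it. `FakeDmaRegister` is a hypothetical stand-in for a memory-mapped register so that the code runs on a host; nothing is actually transferred.

```rust
use core::ptr;
use core::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical stand-in for a memory-mapped DMA control register.
struct FakeDmaRegister {
    enable: u32,
}

/// Prepares the buffer with ordinary writes, then "starts" the transfer.
fn start_transfer(reg: &mut FakeDmaRegister, buffer: &mut [u8; 4]) {
    buffer.copy_from_slice(&[1, 2, 3, 4]);
    // Keep the buffer writes above from being moved below the volatile write.
    compiler_fence(Ordering::Release);
    unsafe { ptr::write_volatile(&mut reg.enable, 1) };
}

/// "Stops" the transfer, then reads the buffer with ordinary reads.
fn finish_transfer(reg: &mut FakeDmaRegister, buffer: &[u8; 4]) -> u8 {
    unsafe { ptr::write_volatile(&mut reg.enable, 0) };
    // Keep the buffer reads below from being moved above the volatile write.
    compiler_fence(Ordering::Acquire);
    buffer[0]
}

fn main() {
    let mut reg = FakeDmaRegister { enable: 0 };
    let mut buffer = [0u8; 4];
    start_transfer(&mut reg, &mut buffer);
    let first = finish_transfer(&mut reg, &buffer);
    println!("first byte: {}", first);
}
```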
How are these efforts going? I noticed that the last posting on the thread is from Feb 11 2019. I am new to embedded rust and would like to play with some DMA stuff. While I could just manipulate the chip registers directly and cobble something together, there has obviously been much more thought put into this here. I would like to play with what you guys are suggesting. :) |
@justacec More DMA implementations keep appearing. I've personally worked on two ( I'm not aware of any efforts for creating a platform-independent API that could be included into |
I submitted a draft PR for the stm32f1xx-hal crate (stm32-rs/stm32f1xx-hal#244) which allows circular DMA buffers to support reading arbitrary lengths of data as it comes in. |
Summary of this thread as of 2018-04-02
Current status of the DMA API design:
I think we have a clear idea of what the API should provide:
It MUST
It SHOULD support these features:
This is out of scope:
peripherals. The API for this functionality is up to the HAL implementer to provide / design.
What we know so far from existing implementations:
- [...] `mem::forget` is safe in Rust)
- [...] `[T; N]` in the `Transfer` struct doesn't work because the address is not stable -- it changes when the `Transfer` struct is moved
- `&'static mut` fulfills these two requirements but so do `Box`, `Vec` and other heap allocated collections. See `owning_ref::StableAddress` for a more complete list.
- [...] `StableAddress`. Also the element type must be restricted -- e.g. a transfer on `&'static mut [Socket]` doesn't make sense.

Unresolved questions:
- [...] `[u16]` buffer require the buffer to be 16-bit aligned? The answer probably depends on the target device.
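The stable-address point above can be observed on a host with a small experiment. This is only a sketch: `InlineTransfer` and `BoxedTransfer` are hypothetical types, not the `Transfer` struct from stm32f103xx-hal. It compares the buffer address before and after the owning struct is moved into a new heap allocation.

```rust
/// Hypothetical transfer that stores its buffer inline.
struct InlineTransfer {
    buffer: [u8; 4],
}

/// Hypothetical transfer that owns a heap-allocated buffer.
struct BoxedTransfer {
    buffer: Box<[u8; 4]>,
}

fn inline_addr(t: &InlineTransfer) -> usize {
    t.buffer.as_ptr() as usize
}

fn boxed_addr(t: &BoxedTransfer) -> usize {
    t.buffer.as_ptr() as usize
}

/// Moving the struct moves only the `Box` pointer, not the heap buffer.
fn boxed_buffer_is_stable() -> bool {
    let transfer = BoxedTransfer { buffer: Box::new([0; 4]) };
    let before = boxed_addr(&transfer);
    let moved = Box::new(transfer);
    before == boxed_addr(&moved)
}

/// Moving the struct moves the inline array along with it.
fn inline_buffer_addresses() -> (usize, usize) {
    let transfer = InlineTransfer { buffer: [0; 4] };
    let before = inline_addr(&transfer);
    let moved = Box::new(transfer);
    (before, inline_addr(&moved))
}

fn main() {
    println!("boxed buffer stable: {}", boxed_buffer_is_stable());
    let (before, after) = inline_buffer_addresses();
    println!("inline buffer: {:#x} -> {:#x}", before, after);
}
```

A DMA engine that was handed the "before" address of the inline buffer would keep writing to it after the move, which is why the buffer type needs an address that survives moves of its owner.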
Attempts at the problem:
- Memory safe DMA transfers, a blog post that explores using `&'static mut` references to achieve memory safe DMA transfers
- `stm32f103xx-hal`, PoC implementations of the idea present in the blog post. It contains implementations of one-shot and circular DMA transfers
- [...] allocated collections like `Box` and `Vec`.

What the title says. This blog post describes the approach to building such APIs. The last part of the post covers platform agnostic traits that could be suitable for inclusion in `embedded-hal`.

This issue is for collecting feedback on the proposed approach and deciding on what should land in `embedded-hal`.