Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DMA channels #49

Merged
merged 48 commits into from
May 27, 2020
Merged

Add DMA channels #49

merged 48 commits into from
May 27, 2020

Conversation

mciantyre
Copy link
Member

The PR adds DMA channels, and a set of traits that lets us add DMA capabilities for iMXRT peripherals. I extend the UART peripheral with the new DMA capabilities.

There's a rough example in the teensy4-rs repo showing how we can use DMA to coordinate UART transfers and receives. It needs to be cleaned up and documented, but it seems to work!

There are a bunch of DMA capabilities that this PR does not support, like

  • channel arbitration modes. The DMA controller supports fixed-priority (default) or round-robin transfer scheduling. Users can't change the default right now.
  • group arbitration. There are two DMA channel groups, and we could modify how the DMA controller schedules transfers across / within the two groups. Users can't do anything with groups right now.
  • changing channel priorities. This might be the simplest to implement: we should let user's swap the priorities of two channels. This would let us keep the requirement that no two channels have the same priorities. Priorities also includes channel preemption. Users can't change channel priorities or preemption settings.

If you think we need these DMA capabilities, or have ideas on how we can add them in, let me know!

The PR closes #43 once we extend the SPI peripheral with the new DMA capabilities, and we test SPI DMA transfers.

@mciantyre mciantyre added the imxrt-hal HAL related issue label Apr 23, 2020
@mciantyre
Copy link
Member Author

Added SPI DMA transfer and receive capabilities. The dma_spi.rs example in the teensy4-rs repo is a contrived example of reading one byte from a SPI slave using DMA.

@teburd
Copy link
Member

teburd commented Apr 24, 2020

This is super, I need to look at the DMA controller on this chip closer before I can really comment well on this, will do my best to get a thorough review here soon-ish

@mciantyre
Copy link
Member Author

There's an enhanced UART DMA example in the teensy4-rs test suite. I also updated the SPI DMA example to perform full-duplex, multi-u16 transfers with a connected IMU.

@mciantyre mciantyre marked this pull request as ready for review May 1, 2020 00:22
@mciantyre mciantyre changed the title WIP: Add DMA channels Add DMA channels May 1, 2020
@mciantyre
Copy link
Member Author

I think this is ready for review. There's two working examples in the teensy4-rs repo (see links above). I added a longer doc example in the dma module, which should be checked by our tests.

SPI and UART are DMA-capable peripherals. If we want any other DMA-capable peripherals, or any other DMA features, let me know! I might play with I2C and DMA, but unless someone has a real need for DMA-capable I2C, I don't consider it a blocker for DMA.

@mciantyre mciantyre requested a review from teburd May 1, 2020 00:31
@teburd
Copy link
Member

teburd commented May 4, 2020

Fantastic, I'll spend sometime this afternoon reviewing. This is fantastic, and will certainly be needed once I get to the SAI and USB audio stuff, which I want to use on a board design I've been working on

@teburd
Copy link
Member

teburd commented May 4, 2020

First once again, I want to say how awesome of an addition this is. That said I see a few things that could perhaps be improved through either inline commentary or self describing function names that perhaps better describe whats going on.

Particularly the 4 functions that setup the TCD registers with source/destination addresses, offsets, last address "add" values, things like that. The ref manual doesn't really describe why its setting all the registers itself in its examples, I had to look at the register descriptions. That's a lot of in depth digging to better understand the ~5-10 lines of register manipulation that happen there. It'd be nice if things were self descriptive in the code there. I know what it all does today as I write this, but won't likely remember a few months from now. New people looking at that code won't either.

I think overall this is a great first round on DMA. There's features like you mentioned that are missing. One feature I particularly like in this DMA controller is its ability to automatically read or write from a ring buffer using the MOD field. Its not quite as awesome as "here's my N buffers, read from them or write to them" as the buffer must be contiguous in memory, but still something I'd likely use myself in the future. It's not entirely clear how things might change to use that feature here. the set_source/set_destination don't likely need it, but perhaps the set_destination_buffer, set_source_buffer could. This isn't something that needs to be addressed in this PR but I guess I'd like to hear your thoughts on how we might expand this API to support the MOD feature. I'd like to add a ticket then after to make sure we don't forget.

Lastly a DMA powered memcpy is always a nice addition, and this DMA controller seems to support it. Also not something that needs to be addressed here, but perhaps we can write a ticket so we don't forget.

impl Element for u64 {
const DATA_TRANSFER_ID: u16 = 3;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose to support the 32-byte burst feature here something perhaps a little more special might be needed? Would likely need to be setup to look like

impl Element for u8[32]
or
impl Element for u16[16]

And so on down the line of uint types I'd guess?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SSIZE documentation in the reference manual says that the burst is "4 beats of 64 bits." I'd be partial to only support [u64; 4], as it describes the "beats" of the burst.

Am I misinterpreting the 32-byte burst?

Tested in the teensy4-rs project, showing both DMA RX and TX
capabilities.
Effort made me realize that the UART implementation is hard-coded
for UART2's DMA signals. Fixing this in the next diff...
Builder might let us evolve the possible configurations without
breaking users. Helper functions are helpful when you don't want
to type our your explicit type.
If a user wants to transfer one `u16`, the math that computed CITER
and BITER would be zero. The invalid major loop count would trigger
a DMA error. We always want at least one major loop, so ensure that
there at least one major loop when transfering / receiving buffers.

The commit passes the `dma_uart.rs` and `dma_spi.rs` examples in the
`teensy4-rs` project. It matches e84f551 in that project.
CITER and BITER reflect the number of major loop iterations. When
used in the set_*_buffer methods, it reflects the total number of
elements in the buffer. It should not account for the element size;
that's handled by NBYTES.
We may want to set the number of transfer iterations independent of
the source / destination buffer size. This lets a higher-level DMA
abstraction specify the number of transfer iterations independent
of the buffer size. This could be used for a hypothetical peripheral-
to-peripheral transfer, which would need to specify how many elements
to transfer across the peripherals.
There's only an unsafe interface right now. I'm exploring approaches
for a safe interface for in-memory DMA transfers.
@mciantyre
Copy link
Member Author

mciantyre commented May 17, 2020

Thanks for the review. The latest update

  • adds developer docs to describe the set_source(), set_destination_buffer(), and related methods.
  • adds a buffer module, which lets users allocated and share DMA-safe source and destination buffers. It supports normal arrays, which we're calling Linear. It also supports circular buffers, which we're calling Circular. The types in the buffer module let us remove the unsafe methods from the Peripheral interface.
  • adds a Memcpy type (untested), which should transfer memory across two buffers using DMA
  • updates the docs.
  • cleans up and renames methods.

The development branch in the teensy4-rs repo has up-to-date examples of the new DMA API. Memcpy example WIP.

- should set disable on completion, since we need to explicitly
  disable the channel, which we do as of this commit
- move start() call before fence. Unsure if this has an impact at
  the moment, but the reference manual example seemed to group all
  TCD operations next to each other
- call complete on the provided buffers

At this commit, `Memcpy` doesn't work. Or, it didn't work in my
testing on a Teensy 4. I modified the `Memcpy` implementation to
call `start()` on the DMA channel in a busy-loop on `is_complete()`,
and that has the effect of driving the transfer to completion. My
current thinking is that we need to use a single major loop for in-
memory DMA transfers, similar to what's documented in the reference
manual. There's no external peripheral to request DMA services, so
we need to move all the data in one call to `start()`.
Just some code cleanup, in preparation for possible changes in the
DMA module.
This spreads the channel setup throughout the peripheral and buffer
modules. We define a `Transfer` struct, which names and documents
the values that go into defining a transfer. We separate minor loop
iterations from major loop iterations, which let us support a memcpy
with a single DMA service request.

The memcpy example I tested copied memory across two circular
buffers. I found that I was both incorrectly specifying the modulo
value, and not correctly aligning my buffer. Both modulo, and the
alignment, need to account for the size of the element! The former
is fixed here; the latter can be enforced by an error at run-time.
The runtime check wasn't accounting for the element size. It now
checks for the element size. Updated the docs.

We can't reliably run the mis-alignment example, since there's a
chance that the linker could place the buffer at a memory address
that satisfies the alignment. The test could sporadically fail. We
instead check that the example compiles.

Added docs that describe the variability of this runtime-check
approach.
Case in point: runtime checks might miss incorrectly-specified code
that happens to work.
@mciantyre
Copy link
Member Author

Update to the update:

  • refactor the DMA channel methods, leading to an approach that's easier to document, and can support both peripherals and memcpy transfers
  • test DMA-powered memory copy. The first, untested attempt didn't work. It seems to work now, and there's an example in the Teensy 4 repo, which transfers data across two circular buffers. See the commits for justification.
  • fix run-time checks for circular buffer alignment; fix modulo computation for circular buffers

I think this address all thoughts from the first review. Ready for round two!

- Return the buffers if there was a transfer error
- Handle calls to `complete()` that aren't ready
@teburd
Copy link
Member

teburd commented May 21, 2020

I'm fine with this, a few nitpicks don't block a merge in my mind. Fine to merge when you are ready

DMA ERQ is to enable hardware requests, not general DMA channel
service. I renamed the set_enable() method to set_hardware_enable()
to make it more distinct.

This means we have to change how we cancel a DMA memcpy. I added
capabilities to halt DMA transfers, and cancel a transfer. I
reduced the `Memcpy` API to signal both completion and cancel through
the `complete()` method. Tested memcpy cancellation in the Teensy 4
repo.
@mciantyre
Copy link
Member Author

Sounds good. I'll do a self-review, and merge this branch within a week. Let me know if there's anything else that might help with SAI support.

The DAM multiplexer can treat a DMA channel as "always on." When a
channel is always on, it doesn't need explicit re-activation, or
service requests, to drive the major loop to zero. This means we can
accomplish a memory copy across multiple major loops, not just one,
large major loop, without needed explicit software re-activation.

At this commit, all DMA transfers run through the DMA multiplexer.
We have a consistent method for disabling transfers; we disable
the 'enable' flag in ERQ. This undoes the changes to the enable
method introduced in the previous commit.
Users should see the RAL's DMA peripherals as in-use by the HAL.
Users will be required to unsafely steal the peripherals if they
want to bypass the HAL.
Taking care of it now before I forget again! Closes #27.
@mciantyre mciantyre merged commit f2f9f66 into imxrt-rs:master May 27, 2020
@mciantyre mciantyre deleted the features/dma branch May 27, 2020 02:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
imxrt-hal HAL related issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support SPI DMA transfers
2 participants