

Maxime Ripard

Free Electrons

maxime@free-electrons.com

© Copyright 2004-2015, Free Electrons.
Creative Commons BY-SA 3.0 license.
Corrections, suggestions, contributions and translations are welcome!





### Maxime Ripard

- Embedded Linux engineer and trainer at Free Electrons
  - Embedded Linux development: kernel and driver development, system integration, boot time and power consumption optimization, consulting, etc.
  - Embedded Linux training, Linux driver development training and Android system development training, with materials freely available under a Creative Commons license.
  - http://free-electrons.com
- Contributions
  - Kernel support for the sunXi SoCs from Allwinner
  - Contributor to a few open-source projects: Buildroot, an open-source, simple and fast embedded Linux build system, and Barebox, a modern bootloader.
- Living in **Toulouse**, south west of France



## Introduction



#### Device DMA vs...







#### DMA Controllers















#### Scatter Gather Descriptors









#### Realistic DMA Controller





## Linux Support

#### DMAEngine

- Merged in 2006, in 2.6.18
- Subsystem to handle memory-to-device transfers

#### Async TX

- Merged in 2007, in 2.6.23
- Initially part of the raid5 code to support the XScale offload engines
- Subsystem to handle memory to memory operations (memcpy, XOR, etc.)
- Implemented on top of dmaengine, but takes many shortcuts, instead of being a real client.

#### Slave consumer API

1. Request a channel: dma\_request\_channel, or one of its variants
2. Configure the channel for our use: dmaengine\_slave\_config
3. Get a transaction descriptor for our transfer: dmaengine\_prep\_\*
4. Put the transaction in the driver pending queue: dmaengine\_submit
5. Issue pending requests and wait for the completion callback, which gives your driver an update on the transfer status: dma\_async\_issue\_pending
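
The sequence above can be sketched as a small userspace model. This is not kernel code: every `mock_*` name is hypothetical, the "hardware" is a plain `memcpy`, and completion happens synchronously, whereas a real controller completes later and signals it through an interrupt. The point is only to show that submitting a descriptor queues it, and that nothing moves until pending requests are issued.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace model only: every mock_* name is hypothetical, not a kernel API. */
#define MOCK_QUEUE_LEN 4

struct mock_desc {
    void *dst;
    const void *src;
    size_t len;
    void (*callback)(void *param);   /* optional completion callback */
    void *callback_param;
};

/* Step 1 (requesting a channel) is modelled by creating one of these. */
struct mock_chan {
    size_t max_len;                  /* set by the configuration step */
    struct mock_desc pending[MOCK_QUEUE_LEN];
    size_t n_pending;
};

/* Step 2: configure the channel (stands in for dmaengine_slave_config). */
static void mock_slave_config(struct mock_chan *c, size_t max_len)
{
    c->max_len = max_len;
}

/* Step 3: build a descriptor (stands in for dmaengine_prep_*). */
static int mock_prep(const struct mock_chan *c, struct mock_desc *d,
                     void *dst, const void *src, size_t len)
{
    if (len == 0 || len > c->max_len)
        return -1;                   /* parameters not supported */
    d->dst = dst;
    d->src = src;
    d->len = len;
    d->callback = NULL;
    d->callback_param = NULL;
    return 0;
}

/* Step 4: queue the descriptor; nothing is transferred yet. */
static int mock_submit(struct mock_chan *c, const struct mock_desc *d)
{
    if (c->n_pending == MOCK_QUEUE_LEN)
        return -1;
    c->pending[c->n_pending++] = *d;
    return 0;
}

/* Step 5: start everything that is queued, then notify the client. */
static void mock_issue_pending(struct mock_chan *c)
{
    assert(c->n_pending <= MOCK_QUEUE_LEN);
    for (size_t i = 0; i < c->n_pending; i++) {
        struct mock_desc *d = &c->pending[i];

        memcpy(d->dst, d->src, d->len);   /* the "hardware" does the copy */
        if (d->callback)
            d->callback(d->callback_param);
    }
    c->n_pending = 0;
}
```

A caller would zero-initialize a `struct mock_chan`, then call `mock_slave_config`, `mock_prep`, `mock_submit` and `mock_issue_pending` in that order; the destination buffer stays untouched until the last call.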



## Slave Controller Drivers



- DMAEngine, like any framework, relies on a structure, struct dma\_device, that you have to fill with various pieces of information in order to do its job properly.
- Mostly:
  - channels: initialized list of the channels supported by the driver, with one entry per channel
  - \*\_align: alignment in bytes for the Async TX buffers



#### DMA Transfer Types 1/2

- The next step is to set which transfer types your driver supports
- This is done through the function dma\_cap\_set, which takes various flags as an argument:
  - DMA\_MEMCPY
    - Memory to memory copy
  - DMA\_SG
    - Memory to memory scatter gather
  - DMA\_INTERLEAVE
    - Memory to memory interleaved transfer
  - DMA\_XOR
    - Memory to memory XOR
  - DMA\_XOR\_VAL
    - Memory buffer parity check using XOR



#### DMA Transfer Types 2/2

- DMA\_PQ
  - Memory to memory P+Q computation
- DMA\_PQ\_VAL
  - Memory buffer parity check using P+Q
- DMA\_INTERRUPT
  - The device is able to generate a dummy transfer that will generate interrupts
- DMA\_SLAVE
  - Memory to device transfers
- DMA\_CYCLIC
  - The device is able to handle cyclic transfers


#### Weird Transfer Types

- DMA\_PRIVATE
  - Async TX doesn't go through dma\_request\_channel; it circumvents it and starts using any channel it can find.
  - It does so unless you set this flag
- DMA\_ASYNC\_TX
  - Set by the core when you support all Async TX transfer types
  - Used only if ASYNC\_TX\_ENABLE\_CHANNEL\_SWITCH is enabled
  - Used by dma\_find\_channel, which is a non-exclusive equivalent of dma\_request\_channel, used only by Async TX



#### Channels Resources Allocation

- device\_alloc\_chan\_resources and device\_free\_chan\_resources
- Called by the framework when your channel is first requested and when it is released, respectively
- Let you allocate custom resources for your channel, and free them when you're done
- Optional (since 3.20, which was released as 4.0)



#### Transaction Descriptor Retrieval Functions

- device\_prep\_dma\_\*
- Optional, but have to match the transfer types you declared
- Should create both the software descriptor, which Linux and the clients use to identify the transfer, and the matching hardware descriptor for the DMA controller.
- Should also ensure that the parameters of this transfer match what the driver supports



#### Submitting Pending Jobs

- device\_issue\_pending
- Should take the first descriptor of the transaction and start it
- Should go through all the descriptors in the list, notifying the client of the status of the transfer through an optional callback


#### Transfer Status Reporting

- device\_tx\_status
- Reports the current state of a given transaction descriptor
- Does so using the dma\_set\_residue function, and returns only a flag saying whether it's done or in progress.
- This is where the residue granularity declared by the driver comes into play.
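
To make the effect of granularity concrete, here is a userspace sketch that computes the residue (bytes still to transfer) for the same in-flight transfer under three granularities. The `model_*` names are hypothetical. One modelling choice should be read as an assumption: with descriptor granularity the function reports the full length until completion, to express that no intermediate progress is visible; a real driver with that granularity may simply not report a residue at all.

```c
#include <assert.h>
#include <stddef.h>

enum model_granularity {
    MODEL_GRANULARITY_DESCRIPTOR,  /* only "done" or "not done" is known */
    MODEL_GRANULARITY_CHUNK,       /* progress known per hardware descriptor */
    MODEL_GRANULARITY_BURST        /* progress known per burst */
};

/*
 * chunk_len[]:  length of each hardware descriptor in the transfer
 * cur:          index of the chunk being processed (n_chunks when done)
 * done_in_cur:  bytes already moved inside the current chunk
 */
static size_t model_residue(enum model_granularity g,
                            const size_t *chunk_len, size_t n_chunks,
                            size_t cur, size_t done_in_cur)
{
    size_t total = 0, left = 0;

    assert(cur <= n_chunks);
    for (size_t i = 0; i < n_chunks; i++) {
        total += chunk_len[i];
        if (i >= cur)
            left += chunk_len[i];
    }
    if (cur == n_chunks)
        return 0;                      /* transfer complete */
    assert(done_in_cur <= chunk_len[cur]);

    switch (g) {
    case MODEL_GRANULARITY_DESCRIPTOR:
        return total;                  /* no progress visible until the end */
    case MODEL_GRANULARITY_CHUNK:
        return left;                   /* whole chunks only */
    case MODEL_GRANULARITY_BURST:
        return left - done_in_cur;     /* progress inside the chunk too */
    }
    return total;
}
```

For chunks of 4096, 4096 and 1808 bytes, with the second chunk in progress and 1024 bytes of it done, the model reports 10000, 5904 and 4880 bytes respectively.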



#### Channel Configuration

- device\_control
- Takes an additional flag, to give the type of control to do on the channel
  - DMA\_PAUSE
    - Pause a given channel
  - DMA\_RESUME
    - Resume a given channel
  - DMA\_TERMINATE\_ALL
    - Aborts all transfers on a given channel
  - DMA\_SLAVE\_CONFIG
    - Configures a given channel with new parameters



#### Capabilities

- device\_slave\_caps
- Returns various pieces of information about the controller
  - Can the transfer be paused? terminated?
  - Which transfer widths are supported?
  - Which slave directions are supported?
- Used by generic layers to get an idea of what the device they're going to use is capable of (only ASoC so far)



## Recent Developments



#### Generic Capabilities (3.20, released as 4.0)

- Removed device\_slave\_caps and moved its logic into the framework
- Introduction of new variables in struct dma\_device
  - \*\_width: bitmask of supported transfer widths, both as source and destination
  - directions: bitmask of the supported slave directions (memory to device, device to memory, device to device)
  - granularity: granularity of the transfer residue your controller can provide: bursts, chunks or descriptors
- Split of the device\_control function into four independent functions: device\_config, device\_pause, device\_resume, device\_terminate\_all
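
The difference between the two styles can be sketched in userspace C. All `model_*` names are hypothetical, the channel state is reduced to three fields, and `-1` stands in for a kernel error code. The structure shows the four narrowly typed callbacks; the dispatcher shows what a single command-multiplexed entry point looks like when layered on top of them.

```c
#include <assert.h>
#include <stddef.h>

enum model_ctrl_cmd {
    MODEL_PAUSE,
    MODEL_RESUME,
    MODEL_TERMINATE_ALL,
    MODEL_SLAVE_CONFIG
};

struct model_chan {
    int paused;
    int queued;           /* number of queued transfers */
    unsigned int width;   /* configured bus width in bytes */
};

/* New style: one callback per operation, each with its own signature. */
struct model_ops {
    int (*config)(struct model_chan *c, unsigned int width);
    int (*pause)(struct model_chan *c);
    int (*resume)(struct model_chan *c);
    int (*terminate_all)(struct model_chan *c);
};

static int model_config(struct model_chan *c, unsigned int width)
{
    c->width = width;
    return 0;
}

static int model_pause(struct model_chan *c)
{
    c->paused = 1;
    return 0;
}

static int model_resume(struct model_chan *c)
{
    c->paused = 0;
    return 0;
}

static int model_terminate_all(struct model_chan *c)
{
    c->queued = 0;
    return 0;
}

/* Old style: a single entry point multiplexed on a command code. */
static int model_control(const struct model_ops *ops, struct model_chan *c,
                         enum model_ctrl_cmd cmd, unsigned long arg)
{
    assert(ops && c);
    switch (cmd) {
    case MODEL_PAUSE:
        return ops->pause ? ops->pause(c) : -1;
    case MODEL_RESUME:
        return ops->resume ? ops->resume(c) : -1;
    case MODEL_TERMINATE_ALL:
        return ops->terminate_all ? ops->terminate_all(c) : -1;
    case MODEL_SLAVE_CONFIG:
        return ops->config ? ops->config(c, (unsigned int)arg) : -1;
    }
    return -1;            /* unknown command */
}
```

With separate callbacks, the caller can tell which operations exist by checking the pointers, and the configuration argument no longer has to travel through an untyped `unsigned long`.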



#### Scheduled DMA

- Many DMA controllers have more requests than channels
- Their drivers usually each carry all the scheduling code
- Plus, every driver has a lot of administrative code that is not trivial to get right (callback deferral, allocation of the descriptors, etc.), yet is similar from one driver to another
- The Scheduled DMA framework abstracts away most of it, and only a few things remain in the drivers:
  - Interrupt management
  - LLI related functions (iterators, configuration, etc.)
  - Scheduling hints
  - Channel management (pause, resume, residues, etc.)
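
The core scheduling problem, more requests than physical channels, can be sketched in a few lines of userspace C. This is not the Scheduled DMA framework's code: the `model_*` names, the two physical channels, the FIFO policy and the non-negative integer request ids are all assumptions chosen for the example. A request starts immediately if a physical channel is idle, otherwise it waits; when a channel completes, it picks up the oldest waiting request.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_PCHANS 2      /* assumed number of physical channels */
#define MODEL_QUEUE_LEN 8
#define MODEL_IDLE   (-1)      /* no request running on a physical channel */
#define MODEL_QUEUED (-1)      /* submit(): request had to wait */
#define MODEL_FULL   (-2)      /* submit(): wait queue is full */

struct model_sched {
    int running[MODEL_NR_PCHANS];  /* request id per channel, or MODEL_IDLE */
    int queue[MODEL_QUEUE_LEN];    /* FIFO of waiting request ids */
    size_t head, count;
};

static void model_sched_init(struct model_sched *s)
{
    for (size_t i = 0; i < MODEL_NR_PCHANS; i++)
        s->running[i] = MODEL_IDLE;
    s->head = 0;
    s->count = 0;
}

/* Returns the physical channel used, MODEL_QUEUED or MODEL_FULL. */
static int model_sched_submit(struct model_sched *s, int req)
{
    assert(req >= 0);
    for (int i = 0; i < MODEL_NR_PCHANS; i++) {
        if (s->running[i] == MODEL_IDLE) {
            s->running[i] = req;
            return i;
        }
    }
    if (s->count == MODEL_QUEUE_LEN)
        return MODEL_FULL;
    s->queue[(s->head + s->count) % MODEL_QUEUE_LEN] = req;
    s->count++;
    return MODEL_QUEUED;
}

/*
 * Models the completion interrupt of one physical channel.
 * Returns the request now running on it, or MODEL_IDLE.
 */
static int model_sched_complete(struct model_sched *s, int pchan)
{
    assert(pchan >= 0 && pchan < MODEL_NR_PCHANS);
    s->running[pchan] = MODEL_IDLE;
    if (s->count) {
        s->running[pchan] = s->queue[s->head];
        s->head = (s->head + 1) % MODEL_QUEUE_LEN;
        s->count--;
    }
    return s->running[pchan];
}
```

A driver-provided scheduling hint would replace the first-free-channel and FIFO choices made here.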

# Questions?

# Maxime Ripard

maxime@free-electrons.com

Slides under CC-BY-SA 3.0

http://free-electrons.com/pub/conferences/2015/elc/ripard-dmaengine