# ELEC 3300 Introduction to Embedded Systems

## Topic 10 Direct Memory Access (DMA)

Prof. Tim Woo



## **Expected Outcomes**

- On successful completion of this topic, you will be able to
  - Classify three different I/O modules
  - Describe the basic operations of the three different I/O modules
  - Understand the hardware configurations of DMA
  - Illustrate the operation of DMA block transfer mode

## What is the function of Direct Memory Access (DMA)?

DMA provides high-speed data transfers between:

- Peripherals and memory
- Memory and memory

Data can be quickly moved by the DMA without any CPU action.

This keeps CPU resources free for other operations.

#### What is a buffer?

A buffer is an area of memory used to hold data temporarily.

Depending on the hardware subsystem, the rate at which hardware events are generated can be unpredictable. For example, if the hardware subsystem is a keypad, the time interval between two keystrokes generally varies. If both the input rate and the output rate of the buffer are known, we can calculate a buffer size such that no overflow or underflow can occur.
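As a rough illustration of that calculation, assume a simple burst model (an assumption for this sketch, not something stated on the slides): data arrives at a constant rate $r_{\text{in}}$ for a burst of duration $T$, while the consumer drains the buffer at a constant rate $r_{\text{out}} < r_{\text{in}}$. The buffer has to absorb the difference accumulated over the burst:

$$
B_{\min} = (r_{\text{in}} - r_{\text{out}})\,T
$$

For example, with $r_{\text{in}} = 1000$ bytes/s, $r_{\text{out}} = 600$ bytes/s and $T = 2$ s, the buffer needs at least $(1000 - 600) \times 2 = 800$ bytes to avoid overflow during the burst.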



Hardware typically runs much faster than software.

#### The Buffer

Caution – boundary region of data overflow starts from here.



Caution – boundary region of data underflow starts from here.
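The overflow and underflow boundaries can be made concrete with a small circular (ring) buffer. This is a minimal sketch, not the course's implementation: the names `ring_buffer`, `ring_put`, `ring_get` and the capacity `RING_CAPACITY` are all hypothetical and chosen only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_CAPACITY 8u /* hypothetical size, chosen for illustration */

typedef struct {
    uint8_t data[RING_CAPACITY];
    size_t head;  /* index of the next slot to write */
    size_t tail;  /* index of the next slot to read */
    size_t count; /* number of bytes currently stored */
} ring_buffer;

void ring_init(ring_buffer *rb) {
    rb->head = 0;
    rb->tail = 0;
    rb->count = 0;
}

/* Producer side: returns false on overflow (buffer full, byte not stored). */
bool ring_put(ring_buffer *rb, uint8_t byte) {
    if (rb->count == RING_CAPACITY) {
        return false;
    }
    rb->data[rb->head] = byte;
    rb->head = (rb->head + 1u) % RING_CAPACITY;
    rb->count++;
    return true;
}

/* Consumer side: returns false on underflow (buffer empty, *out untouched). */
bool ring_get(ring_buffer *rb, uint8_t *out) {
    if (rb->count == 0u) {
        return false;
    }
    *out = rb->data[rb->tail];
    rb->tail = (rb->tail + 1u) % RING_CAPACITY;
    rb->count--;
    return true;
}
```

A producer (for example a keypad handler) would call `ring_put`, and the main program would call `ring_get`. If the two sides run in different contexts, such as an interrupt handler and the main loop, the shared fields would also need protection against concurrent access, which this sketch leaves out.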

## Concurrency Between I/O and Programs

#### Conceptual View



#### **Physical View**

[Figure: physical view. Embedded software programs execute on the μC or μP (registers); memory (e.g. cache, RAM) holds the programs and the buffer; I/O devices include e.g. keyboard, mouse, printer, modem.]







How to write a block of data from disk to memory?

- Three different I/O modules

## I/O Module: Programmed I/O



## I/O Module: Interrupt I/O



## Increasing Concurrency Between I/O and Programs



## I/O Module: Direct Memory Access (DMA)

- What if we make the I/O devices a little smarter?
  - Make the device capable of moving data to/from memory itself
  - Advantage: it would no longer need the processor to move the data
  - Overhead: a DMA controller and additional control logic



#### DMA – How

- The device can steal memory access cycles from the bus while the processor is not reading or writing.
- Generally, a block move is set up, e.g. 512 bytes per block
  - The processor initiates the block data transfer by memory-mapped I/O
  - A whole block of data is then transferred without further processor intervention
  - To do this, the DMA controller takes control of the bus (instead of the processor) to sequentially copy all the bytes in the block
  - On completion, the DMA controller restores control of the bus to the CPU

**Memory-mapped I/O:** Uses the same address space to address both memory and I/O devices. The memory and registers of the I/O devices are mapped to address values. So when an address is accessed by the CPU, it may refer to a portion of physical RAM, or it can instead refer to memory of the I/O device. Thus, the CPU instructions used to access the memory can also be used for accessing devices.
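In C, memory-mapped I/O usually shows up as ordinary loads and stores through a `volatile` pointer. The sketch below assumes a hypothetical device with a status register and a data register; the layout `device_regs`, the flag `DEVICE_READY`, and the function `device_try_read` are invented for illustration and do not describe any real peripheral.

```c
#include <stdint.h>

/* Hypothetical register layout of a memory-mapped device. */
typedef struct {
    volatile uint32_t status; /* bit 0 set when a data word is available */
    volatile uint32_t data;   /* the received data word */
} device_regs;

#define DEVICE_READY 0x1u /* hypothetical "data ready" flag */

/* Reads one word if the device reports it is ready.
 * Returns 1 and stores the word in *out on success, 0 otherwise. */
int device_try_read(device_regs *dev, uint32_t *out) {
    if ((dev->status & DEVICE_READY) == 0u) {
        return 0;
    }
    *out = dev->data; /* a plain memory read that actually reaches the device */
    return 1;
}
```

On real hardware, `dev` would be a fixed address taken from the device's datasheet, cast to `device_regs *`. Passing the pointer in as a parameter lets the same code run on a host against an ordinary struct.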

#### **DMA Hardware**

BR: the DMA controller requests use of the bus for a DMA transfer



#### **DMA Controller**

- When inactive, the DMA controller looks like a normal interface (i.e. the internal registers can be accessed by the processor using a normal read/write bus cycle)
- But when the DMA controller is granted the bus, it generates its own read/write cycles (i.e. it issues address and control signals for the memory and device interfaces)
- This allows the DMA controller to transfer a byte (or word) between memory and device every bus cycle if desired



## DMA Operation (Block Transfer Mode)

As an example, consider reading a block from disk (i.e., I/O device):

- 1. Processor tells the Device Interface the address of block on disk.
- 2. Processor tells the DMA Controller the address and the size of block in memory.
- 3. Device Interface reads block from disk to its internal buffer, while CPU continues normally.
- 4. Once the device interface completes reading the block, it asserts REQ to DMA Controller.
- 5. DMA Controller asserts BR.
- 6. Processor asserts BG.



## DMA Operation (Block Transfer Mode)

- 7. DMA Controller puts the memory address on the Address Bus, asserts control signals to handshake with memory, and asserts Read and GO to the Device Interface.
- 8. Device Interface outputs a byte of data from its buffer to the Data Bus.
- 9. Memory uses the data and address on the bus to store the data.
- 10. DMA Controller de-asserts GO, and the Device Interface disconnects from the Data Bus.
- 11. Repeat Steps 7 to 10 until the whole block is transferred.
- 12. DMA Controller de-asserts BR and the processor regains control of the bus.
- 13. DMA Controller sends an interrupt to the Processor.
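The block-transfer sequence can be traced with a small software model. This is only a sketch of the signalling order: the structs `dma_controller` and `device_interface` and the function `dma_run_block` are hypothetical, the bus grant (BG) is assumed to be given immediately, and real hardware performs these steps with bus signals rather than C statements.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the DMA controller state. */
typedef struct {
    uint8_t *mem_addr;      /* destination address in memory (set up by the processor) */
    size_t count;           /* bytes remaining in the block */
    bool bus_request;       /* BR line */
    bool interrupt_pending; /* completion interrupt to the processor */
} dma_controller;

/* Hypothetical model of the device interface and its internal buffer. */
typedef struct {
    const uint8_t *buffer; /* block already read from disk */
    size_t pos;            /* next byte to put on the data bus */
    bool req;              /* REQ line to the DMA controller */
} device_interface;

/* Models the steps from REQ to the completion interrupt.
 * Returns the number of bytes copied. */
size_t dma_run_block(dma_controller *dma, device_interface *dev) {
    size_t copied = 0;

    if (!dev->req) {
        return 0;                 /* device has not asserted REQ yet */
    }
    dma->bus_request = true;      /* DMA controller asserts BR */
    /* Processor asserts BG: modelled as granted immediately. */

    while (dma->count > 0u) {
        uint8_t byte = dev->buffer[dev->pos++]; /* device drives the data bus */
        *dma->mem_addr++ = byte;                /* memory stores the byte */
        dma->count--;                           /* one byte done, repeat */
        copied++;
    }

    dma->bus_request = false;       /* BR de-asserted, processor regains the bus */
    dma->interrupt_pending = true;  /* completion interrupt */
    dev->req = false;
    return copied;
}
```

In this model the processor's only work is filling in `mem_addr` and `count` beforehand and reacting to `interrupt_pending` afterwards, which matches the point that the block moves without further processor intervention.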





## DMA Operation (Block Transfer Mode)

As an example, consider reading a block from disk:

The processor fetches the instruction that loads the starting address of the block in memory into the DMA controller register, and the instruction that loads the starting address of the block on the disk.



## DMA – Why

- The reasons to use DMA are
  - Higher performance: it requires only one interrupt per N-byte transfer to check the device's readiness, rather than one interrupt per byte (as in interrupt I/O).
  - Higher performance: executing N move instructions to transfer N bytes of data is slow compared to stealing a cycle from the bus and doing the transfer.
  - Used for slower devices that require a large number of block transfers, such as disks, tapes, etc.
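The interrupt-count argument can be checked with a few lines of arithmetic. The function names below are hypothetical, and the model is deliberately simplified: it assumes exactly one interrupt per byte for interrupt I/O and one completion interrupt per block for DMA.

```c
/* Interrupt-driven I/O: one interrupt for every byte transferred. */
unsigned long interrupts_per_byte_io(unsigned long n_bytes) {
    return n_bytes;
}

/* DMA: one completion interrupt per block; a partial final block
 * still needs its own transfer and therefore its own interrupt. */
unsigned long interrupts_dma(unsigned long n_bytes, unsigned long block_size) {
    if (block_size == 0ul) {
        return 0ul; /* no valid block size, nothing can be transferred */
    }
    return (n_bytes + block_size - 1ul) / block_size;
}
```

For the 512-byte block used earlier as an example, this gives 512 interrupts with interrupt I/O against a single interrupt with DMA.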

#### DMA in STM32

Figure 48. DMA block diagram in connectivity line devices

[Block diagram: Cortex-M3, Flash (FLITF), SRAM, DMA1 (Ch.1 to Ch.7) and DMA2 (Ch.1 to Ch.5) with arbiters and AHB slave ports, bridges to the APB1 and APB2 peripherals, reset & clock control (RCC), Ethernet MAC, USB OTG FS.]

Figure 49. DMA block diagram in low-, medium-, high- and XL-density devices



- The DMA2 controller is available only in high-density and XL-density devices.
- ADC3, SPI/I2S3, UART4, SDIO, TIM5, TIM6, DAC, TIM7, TIM8 DMA requests are available only in high-density devices.

RCC (reset and clock control) power-save mode: on-chip peripherals are switched off by removing access to their master clocks.

#### DMA in STM32

Figure 50. DMA1 request mapping



Must ENABLE the channel.
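The sketch below shows where the enable step sits in a typical channel set-up. It runs on a host against a stand-in struct rather than real hardware: `dma_channel_regs` and `dma_channel_start` are hypothetical names, and the register names (CCR, CNDTR, CPAR, CMAR) and bit positions are taken to follow the STM32F1 reference manual, which should be checked against the manual for the actual device. Enabling the DMA clock through the RCC is also required on real hardware and is not modelled here.

```c
#include <stdint.h>

/* Host-side stand-in for one DMA channel's registers (assumed layout). */
typedef struct {
    volatile uint32_t CCR;   /* channel configuration */
    volatile uint32_t CNDTR; /* number of data items to transfer */
    volatile uint32_t CPAR;  /* peripheral address */
    volatile uint32_t CMAR;  /* memory address */
} dma_channel_regs;

#define CCR_EN   (1u << 0) /* channel enable */
#define CCR_TCIE (1u << 1) /* transfer-complete interrupt enable */
#define CCR_MINC (1u << 7) /* increment memory address after each item */

/* Configures a peripheral-to-memory transfer, then enables the channel. */
void dma_channel_start(dma_channel_regs *ch, uint32_t periph_addr,
                       uint32_t mem_addr, uint16_t count) {
    ch->CCR &= ~CCR_EN;            /* keep the channel disabled while configuring */
    ch->CPAR = periph_addr;        /* source: peripheral data register */
    ch->CMAR = mem_addr;           /* destination: buffer in memory */
    ch->CNDTR = count;             /* block size in data items */
    ch->CCR = CCR_MINC | CCR_TCIE; /* walk through memory, interrupt at the end */
    ch->CCR |= CCR_EN;             /* nothing moves until the channel is enabled */
}
```

Writing the enable bit last is a deliberate choice in this sketch: the addresses and count are in place before the channel can respond to a request.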

Table 78. Summary of DMA1 requests for each channel

| Peripherals        | Channel 1 | Channel 2 | Channel 3           | Channel 4                         | Channel 5    | Channel 6             | Channel 7            |
|--------------------|-----------|-----------|---------------------|-----------------------------------|--------------|-----------------------|----------------------|
| ADC1               | ADC1      |           |                     |                                   |              |                       |                      |
| SPI/I<sup>2</sup>S |           | SPI1_RX   | SPI1_TX             | SPI2/I2S2_RX                      | SPI2/I2S2_TX |                       |                      |
| USART              |           | USART3_TX | USART3_RX           | USART1_TX                         | USART1_RX    | USART2_RX             | USART2_TX            |
| I<sup>2</sup>C     |           |           |                     | I2C2_TX                           | I2C2_RX      | I2C1_TX               | I2C1_RX              |
| TIM1               |           | TIM1_CH1  | TIM1_CH2            | TIM1_CH4<br>TIM1_TRIG<br>TIM1_COM | TIM1_UP      | TIM1_CH3              |                      |
| TIM2               | TIM2_CH3  | TIM2_UP   |                     |                                   | TIM2_CH1     |                       | TIM2_CH2<br>TIM2_CH4 |
| TIM3               |           | TIM3_CH3  | TIM3_CH4<br>TIM3_UP |                                   |              | TIM3_CH1<br>TIM3_TRIG |                      |
| TIM4               | TIM4_CH1  |           |                     | TIM4_CH2                          | TIM4_CH3     |                       | TIM4_UP              |
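Because each request is wired to a fixed channel, software has to pick the channel that matches the peripheral. The lookup below encodes a few rows of Table 78 as a sketch; the type `dma1_mapping` and the function `dma1_channel_for` are hypothetical helpers, and only a subset of the table is included.

```c
#include <stddef.h>
#include <string.h>

/* One entry of the request-to-channel mapping. */
typedef struct {
    const char *request; /* DMA request name as written in Table 78 */
    int channel;         /* DMA1 channel number, 1 to 7 */
} dma1_mapping;

/* Subset of Table 78 (ADC, SPI1, USART and I2C1 rows). */
static const dma1_mapping DMA1_MAP[] = {
    {"ADC1", 1},
    {"SPI1_RX", 2},   {"SPI1_TX", 3},
    {"USART3_TX", 2}, {"USART3_RX", 3},
    {"USART1_TX", 4}, {"USART1_RX", 5},
    {"USART2_RX", 6}, {"USART2_TX", 7},
    {"I2C1_TX", 6},   {"I2C1_RX", 7},
};

/* Returns the DMA1 channel for a request name, or -1 if it is not listed. */
int dma1_channel_for(const char *request) {
    size_t n = sizeof(DMA1_MAP) / sizeof(DMA1_MAP[0]);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(DMA1_MAP[i].request, request) == 0) {
            return DMA1_MAP[i].channel;
        }
    }
    return -1;
}
```

For instance, receiving on USART1 with DMA means configuring DMA1 channel 5. Requests that share a channel in the table, such as SPI1_RX and USART3_TX on channel 2, cannot both use that channel at the same time.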

### In-class activities (Question 1-2)

#### For Android devices, search **HKUST iLearn** at Play Store.



#### For iOS devices, search **HKUST iLearn** at App Store.



## A project demo on DMA application

https://www.youtube.com/watch?v=q50A47aAI9M



This test took 84.326 seconds with no DMA and no optimization. It took only 33.71 seconds with DMA and optimization, which is about 2.5 times faster.

## Reflection (Self-evaluation)

- Can you ...
  - Distinguish the three different I/O modules and their operations?
  - List some examples of applying the different I/O modules in modern computers?
  - List the hardware configurations of DMA?
  - Understand the operation of the DMA block transfer mode?

