SPI Master driver is a program that controls ESP32's SPI peripherals while they function as masters.
ESP32 integrates four SPI peripherals.
- SPI0 and SPI1 are used internally to access the ESP32's attached flash memory and thus are currently not open to users. They share one signal bus via an arbiter.
- SPI2 and SPI3 are general purpose SPI controllers, sometimes referred to as HSPI and VSPI, respectively. They are open to users. SPI2 and SPI3 have independent signal buses with the same respective names. Each bus has three CS lines to drive up to three SPI slaves.
The terms used in relation to the SPI master driver are given in the table below.
Term | Definition |
---|---|
Host | The SPI controller peripheral inside ESP32 that initiates SPI transmissions over the bus, and acts as an SPI Master. This may be the SPI2 or SPI3 peripheral. (The driver will also support the SPI1 peripheral in the future.) |
Device | SPI slave device. An SPI bus may be connected to one or more Devices. Each Device shares the MOSI, MISO and SCLK signals but is only active on the bus when the Host asserts the Device's individual CS line. |
Bus | A signal bus, common to all Devices connected to one Host. In general, a bus includes the following lines: MISO, MOSI, SCLK, one or more CS lines, and, optionally, QUADWP and QUADHD. So Devices are connected to the same lines, with the exception that each Device has its own CS line. Several Devices can also share one CS line if connected in the daisy-chain manner. |
|
Master In, Slave Out, a.k.a. Q. Data transmission from a Device to Host. |
|
Master Out, Slave In, a.k.a. D. Data transmission from a Host to Device. |
|
Serial Clock. Oscillating signal generated by a Host that keeps the transmission of data bits in sync. |
|
Chip Select. Allows a Host to select individual Device(s) connected to the bus in order to send or receive data. |
|
Write Protect signal. Only used for 4-bit (qio/qout) transactions. |
|
Hold signal. Only used for 4-bit (qio/qout) transactions. |
|
The action of activating a line. The opposite action of returning the line back to inactive (back to idle) is called de-assertion. |
Transaction | One instance of a Host asserting a CS line, transferring data to and from a Device, and de-asserting the CS line. Transactions are atomic, which means they can never be interrupted by another transaction. |
Launch edge | Edge of the clock at which the source register launches the signal onto the line. |
Latch edge | Edge of the clock at which the destination register latches in the signal. |
The SPI master driver governs communications of Hosts with Devices. The driver supports the following features:
- Multi-threaded environments
- Transparent handling of DMA transfers while reading and writing data
- Automatic time-division multiplexing of data coming from different Devices on the same signal bus
Warning
The SPI master driver has the concept of multiple Devices connected to a single bus (sharing a single ESP32 SPI peripheral). As long as each Device is accessed by only one task, the driver is thread safe. However, if multiple tasks try to access the same SPI Device, the driver is not thread-safe. In this case, it is recommended to either:
- Refactor your application so that each SPI peripheral is only accessed by a single task at a time.
- Add a mutex lock around the shared Device using :c:macro:`xSemaphoreCreateMutex`.
An SPI bus transaction consists of five phases which can be found in the table below. Any of these phases can be skipped.
Phase | Description |
---|---|
Command | In this phase, a command (0-16 bit) is written to the bus by the Host. |
Address | In this phase, an address (0-64 bit) is transmitted over the bus by the Host. |
Write | Host sends data to a Device. This data follows the optional command and address phases and is indistinguishable from them at the electrical level. |
Dummy | This phase is configurable and is used to meet the timing requirements. |
Read | Device sends data to its Host. |
.. todo:: Add a package diagram.
The attributes of a transaction are determined by the bus configuration structure :cpp:type:`spi_bus_config_t`, device configuration structure :cpp:type:`spi_device_interface_config_t`, and transaction configuration structure :cpp:type:`spi_transaction_t`.
An SPI Host can send full-duplex transactions, during which the read and write phases occur simultaneously. The total transaction length is determined by the sum of the following members:
- :cpp:member:`spi_device_interface_config_t::command_bits`
- :cpp:member:`spi_device_interface_config_t::address_bits`
- :cpp:member:`spi_transaction_t::length`
While the member :cpp:member:`spi_transaction_t::rxlength` only determines the length of data received into the buffer.
In half-duplex transactions, the read and write phases are not simultaneous (one direction at a time). The lengths of the write and read phases are determined by :cpp:member:`length` and :cpp:member:`rxlength` members of the struct :cpp:type:`spi_transaction_t` respectively.
The command and address phases are optional, as not every SPI device requires a command and/or address. This is reflected in the Device's configuration: if :cpp:member:`command_bits` and/or :cpp:member:`address_bits` are set to zero, no command or address phase will occur.
The read and write phases can also be optional, as not every transaction requires both writing and reading data. If :cpp:member:`rx_buffer` is NULL and :cpp:type:`SPI_TRANS_USE_RXDATA` is not set, the read phase is skipped. If :cpp:member:`tx_buffer` is NULL and :cpp:type:`SPI_TRANS_USE_TXDATA` is not set, the write phase is skipped.
The driver supports two types of transactions: the interrupt transactions and polling transactions. The programmer can choose to use a different transaction type per Device. If your Device requires both transaction types, see :ref:`mixed_transactions`.
Interrupt transactions will block the transaction routine until the transaction completes, thus allowing the CPU to run other tasks.
An application task can queue multiple transactions, and the driver will automatically handle them one-by-one in the interrupt service routine (ISR). It allows the task to switch to other procedures until all the transactions complete.
Polling transactions do not use interrupts. The routine keeps polling the SPI Host's status bit until the transaction is finished.
All the tasks that use interrupt transactions can be blocked by the queue. At this point, they will need to wait for the ISR to run twice before the transaction is finished. Polling transactions save time otherwise spent on queue handling and context switching, which results in smaller transaction intervals. The disadvantage is that the CPU is busy while these transactions are in progress.
The :cpp:func:`spi_device_polling_end` routine needs an overhead of at least 1 us to unblock other tasks when the transaction is finished. It is strongly recommended to wrap a series of polling transactions using the functions :cpp:func:`spi_device_acquire_bus` and :cpp:func:`spi_device_release_bus` to avoid the overhead. For more information, see :ref:`bus_acquiring`.
During the command and address phases, the members :cpp:member:`cmd` and :cpp:member:`addr` in the struct :cpp:type:`spi_transaction_t` are sent to the bus, nothing is read at this time. The default lengths of the command and address phases are set in :cpp:type:`spi_device_interface_config_t` by calling :cpp:func:`spi_bus_add_device`. If the flags :cpp:type:`SPI_TRANS_VARIABLE_CMD` and :cpp:type:`SPI_TRANS_VARIABLE_ADDR` in the member :cpp:member:`spi_transaction_t::flags` are not set, the driver automatically sets the length of these phases to default values during Device initialization.
If the lengths of the command and address phases need to be variable, declare the struct :cpp:type:`spi_transaction_ext_t`, set the flags :cpp:type:`SPI_TRANS_VARIABLE_CMD` and/or :cpp:type:`SPI_TRANS_VARIABLE_ADDR` in the member :cpp:member:`spi_transaction_ext_t::base` and configure the rest of base as usual. Then the length of each phase will be equal to :cpp:member:`command_bits` and :cpp:member:`address_bits` set in the struct :cpp:type:`spi_transaction_ext_t`.
Normally, the data that needs to be transferred to or from a Device will be read from or written to a chunk of memory indicated by the members :cpp:member:`rx_buffer` and :cpp:member:`tx_buffer` of the structure :cpp:type:`spi_transaction_t`. If DMA is enabled for transfers, the buffers are required to be:
- Allocated in DMA-capable internal memory. If :ref:`external PSRAM is enabled<dma-capable-memory>`, this means using
pvPortMallocCaps(size, MALLOC_CAP_DMA)
.- 32-bit aligned (staring from a 32-bit boundary and having a length of multiples of 4 bytes).
If these requirements are not satisfied, the transaction efficiency will be affected due to the allocation and copying of temporary buffers.
Note
Half-duplex transactions with both read and write phases are not supported when using DMA. For details and workarounds, see :ref:`spi_known_issues`.
Sometimes you might want to send SPI transactions exclusively and continuously so that it takes as little time as possible. For this, you can use bus acquiring, which helps to suspend transactions (both polling or interrupt) to other devices until the bus is released. To acquire and release a bus, use the functions :cpp:func:`spi_device_acquire_bus` and :cpp:func:`spi_device_release_bus`.
.. todo:: Organize the Driver Usage into subsections that will reflect the general usage experience of the users, e.g., Configuration Add stuff about the configuration API here, and the various options in configuration (e.g., configure for interrupt vs. polling), and optional configuration Transactions Describe how to execute a normal transaction (i.e., where data is larger than 32 bits). Describe how to configure between big and little-endian. - Add subsub section on how to optimize when transmitting less than 32 bits - Add subsub section on how to transmit mixed transactions to the same device
Initialize an SPI bus by calling the function :cpp:func:`spi_bus_initialize`. Make sure to set the correct I/O pins in the struct :cpp:type:`spi_bus_config_t`. Set the signals that are not needed to
-1
.Register a Device connected to the bus with the driver by calling the function :cpp:func:`spi_bus_add_device`. Make sure to configure any timing requirements the device might need with the parameter
dev_config
. You should now have obtained the Device's handle which will be used when sending a transaction to it.To interact with the Device, fill one or more :cpp:type:`spi_transaction_t` structs with any transaction parameters required. Then send the structs either using a polling transaction or an interrupt transaction:
- :ref:`Interrupt <interrupt_transactions>`
- Either queue all transactions by calling the function :cpp:func:`spi_device_queue_trans` and, at a later time, query the result using the function :cpp:func:`spi_device_get_trans_result`, or handle all requests synchronously by feeding them into :cpp:func:`spi_device_transmit`.
- :ref:`Polling <polling_transactions>`
- Call the function :cpp:func:`spi_device_polling_transmit` to send polling transactions. Alternatively, if you want to insert something in between, send the transactions by using :cpp:func:`spi_device_polling_start` and :cpp:func:`spi_device_polling_end`.
(Optional) To perform back-to-back transactions with a Device, call the function :cpp:func:`spi_device_acquire_bus` before sending transactions and :cpp:func:`spi_device_release_bus` after the transactions have been sent.
(Optional) To unload the driver for a certain Device, call :cpp:func:`spi_bus_remove_device` with the Device handle as an argument.
(Optional) To remove the driver for a bus, make sure no more drivers are attached and call :cpp:func:`spi_bus_free`.
The example code for the SPI master driver can be found in the :example:`peripherals/spi_master` directory of ESP-IDF examples.
When the transaction data size is equal to or less than 32 bits, it will be sub-optimal to allocate a buffer for the data. The data can be directly stored in the transaction struct instead. For transmitted data, it can be achieved by using the :cpp:member:`tx_data` member and setting the :cpp:type:`SPI_TRANS_USE_TXDATA` flag on the transmission. For received data, use :cpp:member:`rx_data` and set :cpp:type:`SPI_TRANS_USE_RXDATA`. In both cases, do not touch the :cpp:member:`tx_buffer` or :cpp:member:`rx_buffer` members, because they use the same memory locations as :cpp:member:`tx_data` and :cpp:member:`rx_data`.
An SPI Host reads and writes data into memory byte by byte. By default, data is sent with the most significant bit (MSB) first, as LSB first used in rare cases. If a value less than 8 bits needs to be sent, the bits should be written into memory in the MSB first manner.
For example, if 0b00010
needs to be sent, it should be written into a uint8_t
variable, and the length for reading should be set to 5 bits. The Device will still receive 8 bits with 3 additional "random" bits, so the reading must be performed correctly.
On top of that, ESP32 is a little-endian chip, which means that the least significant byte of uint16_t
and uint32_t
variables is stored at the smallest address. Hence, if uint16_t
is stored in memory, bits [7:0] are sent first, followed by bits [15:8].
For cases when the data to be transmitted has the size differing from uint8_t
arrays, the following macros can be used to transform data to the format that can be sent by the SPI driver directly:
- :c:macro:`SPI_SWAP_DATA_TX` for data to be transmitted
- :c:macro:`SPI_SWAP_DATA_RX` for data received
To reduce coding complexity, send only one type of transactions (interrupt or polling) to one Device. However, you still can send both interrupt and polling transactions alternately. The notes below explain how to do this.
The polling transactions should be initiated only after all the polling and interrupt transactions are finished.
Since an unfinished polling transaction blocks other transactions, please do not forget to call the function :cpp:func:`spi_device_polling_end` after :cpp:func:`spi_device_polling_start` to allow other transactions or to allow other Devices to use the bus. Remember that if there is no need to switch to other tasks during your polling transaction, you can initiate a transaction with :cpp:func:`spi_device_polling_transmit` so that it will be ended automatically.
In-flight polling transactions are disturbed by the ISR operation to accommodate interrupt transactions. Always make sure that all the interrupt transactions sent to the ISR are finished before you call :cpp:func:`spi_device_polling_start`. To do that, you can keep calling :cpp:func:`spi_device_get_trans_result` until all the transactions are returned.
To have better control of the calling sequence of functions, send mixed transactions to the same Device only within a single task.
Most of ESP32's peripheral signals have direct connection to their dedicated IO_MUX pins. However, the signals can also be routed to any other available pins using the less direct GPIO matrix. If at least one signal is routed through the GPIO matrix, then all signals will be routed through it.
The GPIO matrix introduces flexibility of routing but also brings the following disadvantages:
- Increases the input delay of the MISO signal, which makes MISO setup time violations more likely. If SPI needs to operate at high speeds, use dedicated IO_MUX pins.
- Allows signals with clock frequencies only up to 40 MHz, as opposed to 80 MHz if IO_MUX pins are used.
Note
For more details about the influence of the MISO input delay on the maximum clock frequency, see :ref:`timing_considerations`.
The IO_MUX pins for SPI buses are given below.
Pin Name | SPI2 | SPI3 |
---|---|---|
GPIO Number | ||
CS0* | 15 | 5 |
SCLK | 14 | 18 |
MISO | 12 | 19 |
MOSI | 13 | 23 |
QUADWP | 2 | 22 |
QUADHD | 4 | 21 |
- Only the first Device attached to the bus can use the CS0 pin.
There are three factors limiting the transfer speed:
- Transaction interval
- SPI clock frequency
- Cache miss of SPI functions, including callbacks
The main parameter that determines the transfer speed for large transactions is clock frequency. For multiple small transactions, the transfer speed is mostly determined by the length of transaction intervals.
Transaction interval is the time that software requires to set up SPI peripheral registers and to copy data to FIFOs, or to set up DMA links.
Interrupt transactions allow appending extra overhead to accommodate the cost of FreeRTOS queues and the time needed for switching between tasks and the ISR.
For interrupt transactions, the CPU can switch to other tasks when a transaction is in progress. This saves the CPU time but increases the interval. See :ref:`interrupt_transactions`. For polling transactions, it does not block the task but allows to do polling when the transaction is in progress. For more information, see :ref:`polling_transactions`.
If DMA is enabled, setting up the linked list requires about 2 us per transaction. When a master is transferring data, it automatically reads the data from the linked list. If DMA is not enabled, the CPU has to write and read each byte from the FIFO by itself. Usually, this is faster than 2 us, but the transaction length is limited to 64 bytes for both write and read.
Typical transaction interval timings for one byte of data are given below.
Typical Transaction Time (us) | ||
---|---|---|
Interrupt | Polling | |
DMA | 24 | 8 |
No DMA | 22 | 7 |
Transferring each byte takes eight times the clock period 8/fspi. If the clock frequency is too high, the use of some functions might be limited. See :ref:`timing_considerations`.
The default config puts only the ISR into the IRAM. Other SPI related functions, including the driver itself and the callback, might suffer from the cache miss and will need to wait until the code is read from the flash. Select :ref:`CONFIG_SPI_MASTER_IN_IRAM` to put the whole SPI driver into IRAM and put the entire callback(s) and its callee functions into IRAM to prevent cache miss.
For an interrupt transaction, the overall cost is 20+8n/Fspi[MHz] [us] for n bytes transferred in one transaction. Hence, the transferring speed is: n/(20+8n/Fspi). An example of transferring speed at 8 MHz clock speed is given in the following table.
Frequency (MHz) |
Transaction Interval (us) |
Transaction Length (bytes) |
Total Time (us) |
Total Speed (KBps) |
---|---|---|---|---|
8 | 25 | 1 | 26 | 38.5 |
8 | 25 | 8 | 33 | 242.4 |
8 | 25 | 16 | 41 | 490.2 |
8 | 25 | 64 | 89 | 719.1 |
8 | 25 | 128 | 153 | 836.6 |
When a transaction length is short, the cost of transaction interval is high. If possible, try to squash several short transactions into one transaction to achieve a higher transfer speed.
Please note that the ISR is disabled during flash operation by default. To keep sending transactions during flash operations, enable :ref:`CONFIG_SPI_MASTER_ISR_IN_IRAM` and set :cpp:class:`ESP_INTR_FLAG_IRAM` in the member :cpp:member:`spi_bus_config_t::intr_flags`. In this case, all the transactions queued before starting flash operations will be handled by the ISR in parallel. Also note that the callback of each Device and their callee functions should be in IRAM, or your callback will crash due to cache miss. For more details, see :ref:`iram-safe-interrupt-handlers`.
As shown in the figure below, there is a delay on the MISO line after the SCLK launch edge and before the signal is latched by the internal register. As a result, the MISO pin setup time is the limiting factor for the SPI clock speed. When the delay is too long, the setup slack is < 0, and the setup timing requirement is violated, which results in the failure to perform the reading correctly.
The maximum allowed frequency is dependent on:
input_delay_ns
- maximum data valid time on the MISO bus after a clock cycle on SCLK starts- If the IO_MUX pin or the GPIO Matrix is used
When the GPIO matrix is used, the maximum allowed frequency is reduced to about 33~77% in comparison to the existing input delay. To retain a higher frequency, you have to use the IO_MUX pins or the dummy bit workaround. You can obtain the maximum reading frequency of the master by using the function :cpp:func:`spi_get_freq_limit`.
Dummy bit workaround: Dummy clocks, during which the Host does not read data, can be inserted before the read phase begins. The Device still sees the dummy clocks and sends out data, but the Host does not read until the read phase comes. This compensates for the lack of the MISO setup time required by the Host and allows the Host to do reading at a higher frequency.
In the ideal case, if the Device is so fast that the input delay is shorter than an APB clock cycle - 12.5 ns - the maximum frequency at which the Host can read (or read and write) in different conditions is as follows:
Frequency Limit (MHz) | Dummy Bits Used By Driver | Comments | |
---|---|---|---|
GPIO matrix | IO_MUX pins | ||
26.6 | 80 | No | |
40 | -- | Yes | Half-duplex, no DMA allowed |
If the Host only writes data, the dummy bit workaround and the frequency check can be disabled by setting the bit SPI_DEVICE_NO_DUMMY in the member :cpp:member:`spi_device_interface_config_t::flags`. When disabled, the output frequency can be 80MHz, even if the GPIO matrix is used.
:cpp:member:`spi_device_interface_config_t::flags`
The SPI master driver can work even if the :cpp:member:`input_delay_ns` in the structure :cpp:type:`spi_device_interface_config_t` is set to 0. However, setting an accurate value helps to:
- Calculate the frequency limit for full-duplex transactions
- Compensate the timing correctly with dummy bits for half-duplex transactions
You can approximate the maximum data valid time after the launch edge of SPI clocks by checking the statistics in the AC characteristics chapter of your Device's specification or measure the time on an oscilloscope or logic analyzer.
Please note that the actual PCB layout design and the excessive loads may increase the input delay. It means that non-optimal wiring and/or a load capacitor on the bus will most likely lead to the input delay values exceeding the values given in the Device specification or measured while the bus is floating.
Some typical delay values are shown in the following table.
Device | Input delay (ns) |
---|---|
Ideal Device | 0 |
ESP32 slave using IO_MUX* | 50 |
ESP32 slave using GPIO_MUX* | 75 |
ESP32's slave device is on a different physical chip. |
The MISO path delay (valid time) consists of a slave's input delay plus master's GPIO matrix delay. This delay determines the frequency limit above which full-duplex transfers will not work as well as the dummy bits used in the half-duplex transactions. The frequency limit is:
Freq limit [MHz] = 80 / (floor(MISO delay[ns]/12.5) + 1)
The figure below shows the relationship between frequency limit and input delay. Two extra APB clock cycle periods should be added to the MISO delay if the master uses the GPIO matrix.
Corresponding frequency limits for different Devices with different input delay times are shown in the table below.
Master | Input delay (ns) | MISO path delay (ns) | Freq. limit (MHz) |
---|---|---|---|
IO_MUX (0ns) | 0 | 0 | 80 |
50 | 50 | 16 | |
75 | 75 | 11.43 | |
GPIO (25ns) | 0 | 25 | 26.67 |
50 | 75 | 11.43 | |
75 | 100 | 8.89 |
Half-duplex transactions are not compatible with DMA when both writing and reading phases are used.
If such transactions are required, you have to use one of the alternative solutions:
Use full-duplex transactions instead.
Disable DMA by setting the bus initialization function's last parameter to 0 as follows:
ret=spi_bus_initialize(VSPI_HOST, &buscfg, 0);
This can prohibit you from transmitting and receiving data longer than 64 bytes.
Try using the command and address fields to replace the write phase.
Full-duplex transactions are not compatible with the dummy bit workaround, hence the frequency is limited. See :ref:`dummy bit speed-up workaround <dummy_bit_workaround>`.
cs_ena_pretrans
is not compatible with the command and address phases of full-duplex transactions.
The code example for displaying graphics on an ESP32-WROVER-KIT's 320x240 LCD screen can be found in the :example:`peripherals/spi_master` directory of ESP-IDF examples.