



# **Application Note QP™ and ARM Cortex-M with IAR**



# **Table of Contents**

- 1 Introduction
- 1.1 About the QP Port to ARM Cortex-M
- 1.1.1 "Kernel-Aware" and "Kernel-Unaware" Interrupts
- 1.1.2 Assigning Interrupt Priorities
- 1.1.3 The Use of the FPU (Cortex-M4F)
- 1.1.4 Cortex Microcontroller Software Interface Standard (CMSIS)
- 1.2 About QP™
- 1.3 About QM™
- 1.4 Licensing QP
- 1.5 Licensing QM™
- 2 Directories and Files
- 2.1 Building the QP Libraries
- 2.2 Building and Debugging the Examples
- 3 The Vanilla Port
- 3.1 The qep\_port.h Header File
- 3.2 The QF Port Header File
- 3.3 Handling Interrupts in the Non-Preemptive Vanilla Kernel
- 3.3.1 The Interrupt Vector Table
- 3.4 Using the FPU in the "Vanilla" Port (Cortex-M4F)
- 3.4.2 FPU used in the ISRs
- 4 The QK Port
- 4.1 Single-Stack, Preemptive Multitasking on ARM Cortex-M
- 4.1.1 Examples of Various Preemption Scenarios in QK
- 4.2 Using the FPU with the preemptive QK kernel (Cortex-M4F)
- 4.2.1 FPU used in ONE task only and not in any ISRs
- 4.2.2 FPU used in more than one task or the ISRs
- 4.3 The QK Port Header File
- 4.3.1 The QK Critical Section
- 4.4 QK Platform-Specific Code for ARM Cortex-M
- 4.6 Writing ISRs for QK
- 4.7 QK Idle Processing Customization in QK\_onIdle()
- 4.8.1 Interrupt Nesting Test
- 4.8.2 Task Preemption Test
- 4.8.3 Testing the FPU (Cortex-M4F)
- 4.8.4 Other Tests
- 5 QS Software Tracing Instrumentation
- 5.1 QS Time Stamp Callback QS\_onGetTime()
- 5.2 QS Trace Output in QF\_onIdle()/QK\_onIdle()
- 5.3 Invoking the QSpy Host Application
- 6 Related Documents and References
- 7 Contact Information





## 1 Introduction

This Application Note describes how to use the QP™ state machine framework with the ARM Cortex-M processors (Cortex-M0/M0+/M1/M3/M4 and **M4F**, based on the ARMv6-M and ARMv7-M architectures). Two main implementation options are covered: the cooperative "Vanilla" kernel and the preemptive QK kernel, both available in QP. The port assumes QP version **5.2.0** or higher.

**NOTE:** The interrupt disabling policy for ARM Cortex-M3/M4 has changed in QP 5.1. Interrupts are now disabled more selectively using the BASEPRI register, which allows disabling only interrupts with priorities below a certain level and **never disables interrupts with priorities above this level** ("zero interrupt latency"). This means that leaving the interrupt priority at the default value of zero (the highest priority) is most likely **incorrect**, because such free-running interrupts **cannot** call any QP services. See Section 1.1.1 for more information.

To focus the discussion, this Application Note uses the IAR Embedded Workbench® for ARM (EWARM version **6.60** KickStart™ edition, which is available as a free download from the <u>IAR website</u>) as well as the EK-LM3S811 and EK-TM4C123GXL boards from Texas Instruments, as shown in Figure 1. However, the source code for the QP port described here is generic for all ARM Cortex-M devices and runs without modifications on all ARM Cortex-M cores.

The provided application examples also illustrate using the **QM™** modeling tool for designing QP applications graphically and generating code automatically.



Figure 1: The EK-LM3S811 and EK-TM4C123GXL boards used to test the ARM Cortex-M port.



### 1.1 About the QP Port to ARM Cortex-M

In contrast to the traditional ARM7/ARM9 cores, ARM Cortex-M cores contain such standard components as the Nested Vectored Interrupt Controller (NVIC) and the System Timer (SysTick). With the provision of these standard components, it is now possible to provide fully portable system-level software for ARM Cortex-M. Therefore, this QP port to ARM Cortex-M can be much more complete than a port to the traditional ARM7/ARM9 and the software is guaranteed to work on any ARM Cortex-M silicon.

The non-preemptive cooperative kernel implementation is very simple on ARM Cortex-M, perhaps simpler than on any other processor, mainly because Interrupt Service Routines (ISRs) are regular C functions on ARM Cortex-M.

However, when it comes to handling preemptive multitasking, ARM Cortex-M is a unique processor unlike any other. Section 4 of this application note describes in detail the unique implementation of the preemptive, run-to-completion QK kernel (described in Chapter 10 in [PSiCC2]) on ARM Cortex-M.

**NOTE:** This Application Note pertains both to C and C++ versions of the QP™ state machine frameworks. Most of the code listings in this document refer to the C version. Occasionally the C code is followed by the equivalent C++ implementation to show the C++ differences whenever such differences become important.

#### 1.1.1 "Kernel-Aware" and "Kernel-Unaware" Interrupts

Starting from QP 5.1.0, the QP port to ARM Cortex-M3/M4 **never completely disables interrupts**, even inside the critical sections. On Cortex-M3/M4 (ARMv7-M architectures), the QP port disables interrupts selectively using the BASEPRI register. As shown in Figure 2 and Figure 3, this policy divides interrupts into "kernel-unaware" interrupts, which are never disabled, and "kernel-aware" interrupts, which are disabled in the QP critical sections. **Only "kernel-aware" interrupts are allowed to call QP services**. "Kernel-unaware" interrupts are **not allowed** to call any QP services and they can communicate with QP only by triggering a "kernel-aware" interrupt (which can post or publish events).

**NOTE:** The BASEPRI register is not implemented in the ARMv6-M architecture (**Cortex-M0**/M0+), so Cortex-M0/M0+ need to use the PRIMASK register to disable interrupts globally. In other words, in Cortex-M0/M0+ ports, all interrupts are "kernel-aware".
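The selective masking described above can be illustrated with a small host-side model of the ARMv7-M BASEPRI rule: a BASEPRI value of zero masks nothing, and a non-zero value masks every interrupt whose priority value is numerically equal or greater. This is only a sketch, not the QP port code. The function names and the threshold 0x3F used in the usage note are assumptions for illustration; priorities here are the raw 8-bit register values, of which only the most significant bits are implemented.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask that selects the implemented (most significant) priority bits.
 * prio_bits is the number of implemented bits, expected in 1..8. */
uint8_t implemented_mask(unsigned prio_bits) {
    return (uint8_t)(0xFFU << (8U - prio_bits));
}

/* Model of the BASEPRI masking rule (illustrative helper, not a QP API).
 * Returns true if an interrupt with raw priority raw_prio is held off
 * while BASEPRI contains basepri. Unimplemented low bits are ignored. */
bool basepri_masks(uint8_t basepri, uint8_t raw_prio, unsigned prio_bits) {
    uint8_t mask  = implemented_mask(prio_bits);
    uint8_t level = (uint8_t)(basepri & mask);

    /* BASEPRI == 0 means "no masking at all" */
    return (level != 0U) && ((uint8_t)(raw_prio & mask) >= level);
}
```

With the assumed threshold 0x3F and 3 priority bits, raw priority 0x00 is never masked, while 0x20 and above are masked, which matches the kernel-unaware/kernel-aware split in Figure 2. With 4 priority bits the same threshold leaves 0x00 through 0x20 unmasked and masks 0x30 and above, as in Figure 3.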

Figure 2: Kernel-aware and kernel-unaware interrupts with 3 priority bits implemented in NVIC

| Interrupt type           | NVIC priority bits | CMSIS priority for NVIC\_SetPriority() | Masking                                   |
|--------------------------|--------------------|----------------------------------------|-------------------------------------------|
| Kernel-unaware interrupt | <b>000</b> 00000   | 0                                      | Never disabled                            |
| Kernel-aware interrupt   | <b>001</b> 00000   | 1 = QF\_AWARE\_ISR\_CMSIS\_PRI         | Disabled in critical sections             |
| Kernel-aware interrupt   | <b>010</b> 00000   | 2                                      | Disabled in critical sections             |
| Kernel-aware interrupt   | <b>011</b> 00000   | 3                                      | Disabled in critical sections             |
| Kernel-aware interrupt   | <b>100</b> 00000   | 4                                      | Disabled in critical sections             |
| Kernel-aware interrupt   | <b>101</b> 00000   | 5                                      | Disabled in critical sections             |
| Kernel-aware interrupt   | <b>110</b> 00000   | 6                                      | Disabled in critical sections             |
| PendSV interrupt for QK  | <b>111</b> 00000   | 7                                      | Should not be used for regular interrupts |



Figure 3: Kernel-aware and kernel-unaware interrupts with 4 priority bits implemented in NVIC

| Interrupt type           | NVIC priority bits | CMSIS priority for NVIC\_SetPriority() | Masking                                   |
|--------------------------|--------------------|----------------------------------------|-------------------------------------------|
| Kernel-unaware interrupt | <b>0000</b> 0000   | 0                                      | Never disabled                            |
| Kernel-unaware interrupt | <b>0001</b> 0000   | 1                                      | Never disabled                            |
| Kernel-unaware interrupt | <b>0010</b> 0000   | 2                                      | Never disabled                            |
| Kernel-aware interrupt   | <b>0011</b> 0000   | 3 = QF\_AWARE\_ISR\_CMSIS\_PRI         | Disabled in critical sections             |
| Kernel-aware interrupt   | <b>0100</b> 0000   | 4                                      | Disabled in critical sections             |
| Kernel-aware interrupt   | <b>0101</b> 0000   | 5                                      | Disabled in critical sections             |
| Kernel-aware interrupt   | <b>0110</b> 0000   | 6                                      | Disabled in critical sections             |
| Kernel-aware interrupt   | <b>0111</b> 0000   | 7                                      | Disabled in critical sections             |
| ...                      | ...                | ...                                    | Disabled in critical sections             |
| Kernel-aware interrupt   | <b>1101</b> 0000   | 13                                     | Disabled in critical sections             |
| Kernel-aware interrupt   | <b>1110</b> 0000   | 14                                     | Disabled in critical sections             |
| PendSV interrupt for QK  | <b>1111</b> 0000   | 15                                     | Should not be used for regular interrupts |

As illustrated in Figure 2 and Figure 3, the number of interrupt priority bits actually available is implementation dependent, meaning that the various ARM Cortex-M silicon vendors can provide a different number of priority bits, varying from just 3 bits (which is the minimum for the ARMv7-M architecture) up to 8 bits. For example, the TI Stellaris/Tiva-C microcontrollers implement only 3 priority bits (see Figure 2). On the other hand, the STM32 MCUs implement 4 priority bits (see Figure 3). The CMSIS standard provides the macro \_\_NVIC\_PRIO\_BITS, which specifies the number of NVIC priority bits defined in a given ARM Cortex-M implementation.

Another important fact to note is that the ARM Cortex-M core stores the interrupt priority values in the **most significant bits** of its eight-bit interrupt priority registers inside the NVIC (Nested Vectored Interrupt Controller). For example, if an implementation of an ARM Cortex-M microcontroller only implements three priority bits, then these three bits are shifted up to be bits five, six, and seven, respectively. The unimplemented bits can be written as zero or one and always read as zero.

And finally, the NVIC uses an **inverted priority numbering scheme** for interrupts, in which priority zero (0) is the highest possible priority (highest urgency) and larger priority numbers denote lower-priority interrupts. So, for example, an interrupt of priority 2 can preempt an interrupt of priority 3, but an interrupt of priority 3 cannot preempt another interrupt of the same priority 3. The default priority of all interrupts out of reset is zero (0).
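The inverted numbering is easy to get backwards, so the preemption rule is spelled out here as a one-line predicate. This is a sketch with an illustrative function name (not a CMSIS or QP function), and it assumes that all implemented priority bits are used as preemption priority (no sub-priority bits).

```c
#include <stdbool.h>
#include <stdint.h>

/* NVIC preemption rule under the inverted numbering scheme:
 * a numerically SMALLER priority value is MORE urgent, and only a
 * strictly more urgent interrupt can preempt the active one.
 * nvic_can_preempt() is a hypothetical helper for illustration. */
bool nvic_can_preempt(uint8_t new_prio, uint8_t active_prio) {
    return new_prio < active_prio;
}
```

For instance, `nvic_can_preempt(2U, 3U)` is true, while `nvic_can_preempt(3U, 3U)` and `nvic_can_preempt(3U, 2U)` are both false.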



The CMSIS provides the function NVIC\_SetPriority(), which you should use to set the priority of every interrupt.

**NOTE:** The priority numbers passed to <code>NVIC\_SetPriority()</code> differ from the raw values stored in the <code>NVIC</code> registers. Figure 2 and Figure 3 show these numbers as the "CMSIS priorities".
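The relationship between a CMSIS priority number and the raw register value follows directly from the "most significant bits" rule described above: the CMSIS number is shifted left by the number of unimplemented bits. The sketch below models this on the host. The helper names are illustrative assumptions, and `prio_bits` stands in for the CMSIS macro \_\_NVIC\_PRIO\_BITS.

```c
#include <stdint.h>

/* Raw 8-bit NVIC priority register value for a given CMSIS priority.
 * prio_bits is the number of implemented priority bits (1..8).
 * Illustrative helper, not part of CMSIS. */
uint8_t nvic_raw_priority(uint32_t cmsis_prio, unsigned prio_bits) {
    return (uint8_t)((cmsis_prio << (8U - prio_bits)) & 0xFFU);
}

/* Numerically largest (least urgent) CMSIS priority available,
 * i.e. the level that the figures reserve for PendSV in QK. */
uint32_t lowest_cmsis_priority(unsigned prio_bits) {
    return 0xFFU >> (8U - prio_bits);
}
```

With 3 priority bits, CMSIS priority 1 maps to the raw value 0x20 and the lowest level is 7. With 4 bits, CMSIS priority 3 maps to 0x30 and the lowest level is 15.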



#### 1.1.2 Assigning Interrupt Priorities

The example projects accompanying this Application Note demonstrate the recommended way of assigning interrupt priorities in your applications. The initialization consists of two steps: (1) you enumerate the "kernel-unaware" and "kernel-aware" interrupt priorities, and (2) you assign the priorities by calling the NVIC\_SetPriority() CMSIS function. Listing 1 illustrates these steps, with the explanation section following immediately after the code.

Listing 1: Assigning the interrupt priorities (see file bsp.c in the example projects).

```
     /*
     * Assign a priority to EVERY ISR explicitly by calling NVIC_SetPriority().
     * DO NOT LEAVE THE ISR PRIORITIES AT THE DEFAULT VALUE!
     */
 (1) enum KernelUnawareISRs {                                  /* see NOTE00 */
         /* ... */
 (2)     MAX_KERNEL_UNAWARE_CMSIS_PRI                    /* keep always last */
     };
     /* "kernel-unaware" interrupts can't overlap "kernel-aware" interrupts */
 (3) Q_ASSERT_COMPILE(MAX_KERNEL_UNAWARE_CMSIS_PRI <= QF_AWARE_ISR_CMSIS_PRI);

 (4) enum KernelAwareISRs {
 (5)     GPIOPORTA_PRIO = QF_AWARE_ISR_CMSIS_PRI,              /* see NOTE00 */
         SYSTICK_PRIO,
         /* ... */
 (6)     MAX_KERNEL_AWARE_CMSIS_PRI                      /* keep always last */
     };
     /* "kernel-aware" interrupts should not overlap the PendSV priority */
 (7) Q_ASSERT_COMPILE(MAX_KERNEL_AWARE_CMSIS_PRI <= (0xFF >> (8 - __NVIC_PRIO_BITS)));

 (8) void QF_onStartup(void) {
         /* set up the SysTick timer to fire at BSP_TICKS_PER_SEC rate */
         SysTick_Config(ROM_SysCtlClockGet() / BSP_TICKS_PER_SEC);

         /* assign all priority bits for preemption-prio. and none to sub-prio. */
 (9)     NVIC_SetPriorityGrouping(0U);

         /* set priorities of ALL ISRs used in the system, see NOTE00
         *
         * Assign a priority to EVERY ISR explicitly by calling NVIC_SetPriority().
         * DO NOT LEAVE THE ISR PRIORITIES AT THE DEFAULT VALUE!
         */
(10)     NVIC_SetPriority(SysTick_IRQn,   SYSTICK_PRIO);
(11)     NVIC_SetPriority(GPIOPortA_IRQn, GPIOPORTA_PRIO);
         /* ... */
         /* enable IRQs... */
(12)     NVIC_EnableIRQ(GPIOPortA_IRQn);
     }
```

- (1) The enumeration KernelUnawareISRs lists the priority numbers for the "kernel-unaware" interrupts. These priorities start with zero (highest possible). The priorities are suitable as the argument for the NVIC\_SetPriority() CMSIS function.



**NOTE:** The NVIC allows you to assign the same priority level to multiple interrupts, so you can have more ISRs than priority levels running as "kernel-unaware" or "kernel-aware" interrupts.

- (2) The last value in the enumeration, MAX\_KERNEL\_UNAWARE\_CMSIS\_PRI, keeps track of the maximum priority used for a "kernel-unaware" interrupt.
- (3) The compile-time assertion ensures that the "kernel-unaware" interrupt priorities do not overlap the "kernel-aware" interrupts, which start at QF\_AWARE\_ISR\_CMSIS\_PRI.
- (4) The enumeration KernelAwareISRs lists the priority numbers for the "kernel-aware" interrupts.
- (5) The "kernel-aware" interrupt priorities start with the QF\_AWARE\_ISR\_CMSIS\_PRI offset, which is provided in the qf\_port.h header file.
- (6) The last value in the enumeration, MAX\_KERNEL\_AWARE\_CMSIS\_PRI, keeps track of the maximum priority used for a "kernel-aware" interrupt.
- (7) The compile-time assertion ensures that the "kernel-aware" interrupt priorities do not overlap the lowest priority level reserved for the PendSV exception (see Section 4.4).
- (8) The QF\_onStartup() callback function is where you set up the interrupts.
- (9) This call to the CMSIS function NVIC\_SetPriorityGrouping() assigns all the priority bits to be preempt priority bits, leaving no priority bits as subpriority bits. This preserves the direct relationship between the interrupt priorities and the ISR preemption rules. This is the default configuration out of reset for the ARM Cortex-M3/M4 cores, but it can be changed by some vendor-supplied startup code. To avoid any surprises, the call to NVIC\_SetPriorityGrouping(0U) is recommended.
- (10-11) The interrupt priorities for **all** interrupts ("kernel-unaware" and "kernel-aware" alike) are set explicitly by calls to the CMSIS function NVIC\_SetPriority().
- (12) All used IRQ interrupts need to be explicitly enabled by calling the CMSIS function NVIC\_EnableIRQ().
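The two compile-time checks from Listing 1 can be tried out on a host compiler with the following sketch. It swaps the QP macro Q\_ASSERT\_COMPILE for the C11 keyword `_Static_assert`, and every `DEMO_` name and value is an assumption made for illustration (3 priority bits and a kernel-aware boundary of 1, as in Figure 2). On a real target these values come from the CMSIS device header and from qf\_port.h.

```c
/* Assumed values for illustration only (see Figure 2). */
#define DEMO_NVIC_PRIO_BITS          3
#define DEMO_QF_AWARE_ISR_CMSIS_PRI  1

enum DemoKernelUnawareISRs {
    DEMO_FAST_TIMER_PRIO,               /* hypothetical ISR, priority 0 */
    DEMO_MAX_KERNEL_UNAWARE_CMSIS_PRI   /* keep always last */
};
/* "kernel-unaware" priorities must stay above the kernel-aware range */
_Static_assert(DEMO_MAX_KERNEL_UNAWARE_CMSIS_PRI
                   <= DEMO_QF_AWARE_ISR_CMSIS_PRI,
               "too many kernel-unaware priority levels");

enum DemoKernelAwareISRs {
    DEMO_GPIOPORTA_PRIO = DEMO_QF_AWARE_ISR_CMSIS_PRI,
    DEMO_SYSTICK_PRIO,
    DEMO_MAX_KERNEL_AWARE_CMSIS_PRI     /* keep always last */
};
/* "kernel-aware" priorities must not run past the lowest level */
_Static_assert(DEMO_MAX_KERNEL_AWARE_CMSIS_PRI
                   <= (0xFF >> (8 - DEMO_NVIC_PRIO_BITS)),
               "too many kernel-aware priority levels");
```

Adding more enumerators than the hardware has priority levels makes the build fail instead of silently assigning a priority that does not exist. Since several ISRs may share one level, the usual fix is to give two enumerators the same value.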

#### 1.1.3 The Use of the FPU (Cortex-M4F)

The QP ports described in this Application Note now also support the ARM Cortex-**M4F**. Compared to all other members of the Cortex-M family, the Cortex-M4F includes the single-precision variant of the ARMv7-M **Floating-Point Unit** (FPv4-SP). The hardware FPU implementation adds an extra floating-point register bank consisting of S0–S31 and some other FPU registers. This FPU register set represents additional context that needs to be **preserved** across interrupts and task switching (e.g., in the preemptive QK kernel).

The Cortex-M4F has a very interesting feature called **lazy stacking** [ARM AN298]. This feature avoids an increase of interrupt latency by skipping the stacking of floating-point registers, if not required, that is:

- if the interrupt handler does not use the FPU, or
- if the interrupted program does not use the FPU.

If the interrupt handler has to use the FPU and the interrupted context has also previously used the FPU, then the stacking of floating-point registers takes place at the point in the program where the interrupt handler first uses the FPU. The lazy stacking feature is programmable and by default it is turned **ON**. All QP ports to Cortex-M4F (both the cooperative Vanilla port and the preemptive QK port) are designed to **take advantage of the lazy stacking feature**.
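As a rough illustration of the rule above, the following host-side sketch models two things: whether lazy stacking is configured, and whether the floating-point registers actually get written to the stack for a given interrupt. The bit positions correspond to the ASPEN (bit 31) and LSPEN (bit 30) fields of the ARMv7-M FPCCR register. The `DEMO_` macros and function names are assumptions for illustration and are not taken from the QP port.

```c
#include <stdbool.h>
#include <stdint.h>

/* FPCCR control bits (ARMv7-M): automatic and lazy FP state preservation */
#define DEMO_FPCCR_ASPEN ((uint32_t)1U << 31)
#define DEMO_FPCCR_LSPEN ((uint32_t)1U << 30)

/* Lazy stacking is in effect when both ASPEN and LSPEN are set. */
bool lazy_stacking_enabled(uint32_t fpccr) {
    uint32_t both = DEMO_FPCCR_ASPEN | DEMO_FPCCR_LSPEN;
    return (fpccr & both) == both;
}

/* With lazy stacking, the FP registers are actually saved only if
 * the interrupted code had used the FPU AND the handler uses it too. */
bool fp_registers_get_stacked(bool interrupted_code_used_fpu,
                              bool handler_uses_fpu) {
    return interrupted_code_used_fpu && handler_uses_fpu;
}
```

For example, `lazy_stacking_enabled(0xC0000000U)` is true because both bits are set, and `fp_registers_get_stacked(true, false)` is false because a handler that never touches the FPU never triggers the deferred save.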

Not only does the QK port work with lazy FPU stacking, but the preemptive QK kernel also offers very significant advantages both in time (CPU cycles) and stack space compared to any traditional blocking RTOS. Please refer to Section 4.2 for details of preserving the FPU context in the QK kernel.



#### 1.1.4 Cortex Microcontroller Software Interface Standard (CMSIS)

The ARM Cortex examples provided with this Application Note are compliant with the Cortex Microcontroller Software Interface Standard (CMSIS).



### 1.2 About QP™

**QP™** is a family of very lightweight, open source, state machine-based frameworks for developing event-driven applications. QP enables building well-structured embedded applications as a set of concurrently executing hierarchical state machines (UML statecharts) directly in C or C++ **without big tools**. QP is described in great detail in the book "Practical UML Statecharts in C/C++, Second Edition: Event-Driven Programming for Embedded Systems" [PSiCC2] (Newnes, 2008).

As shown in Figure 4, QP consists of a universal UML-compliant event processor (QEP), a portable real-time framework (QF), a tiny run-to-completion kernel (QK), and software tracing instrumentation (QS). Current versions of QP include: QP/C™ and QP/C++™, which require about 4KB of code and a few hundred bytes of RAM, and the ultralightweight QP-nano, which requires only 1-2KB of code and just several bytes of RAM.



Figure 4: QP components and their relationship with the target hardware, board support package (BSP), and the application



QP can work with or without a traditional RTOS or OS. In the simplest configuration, QP can completely **replace** a traditional RTOS. QP includes a simple non-preemptive scheduler and a fully preemptive kernel (QK). QK is smaller and faster than most traditional preemptive kernels or RTOS, yet offers fully deterministic, preemptive execution of embedded applications. QP can manage up to 63 concurrently executing tasks structured as state machines (called active objects in UML).

QP/C and QP/C++ can also work with a traditional OS/RTOS to take advantage of existing device drivers, communication stacks, and other middleware. QP has been ported to Linux/BSD, Windows, VxWorks, ThreadX, uC/OS-II, FreeRTOS.org, and other popular OS/RTOS.



### 1.3 About QM™

**QM™** (QP™ Modeler) is a free, cross-platform, graphical UML modeling tool for designing and implementing real-time embedded applications based on the QP™ state machine frameworks. QM™ itself is based on the Qt framework and therefore runs natively on Windows, Linux, and Mac OS X.

QM™ provides an intuitive diagramming environment for creating good-looking hierarchical state machine diagrams and a hierarchical outline of your entire application. QM™ eliminates coding errors by automatic generation of compact C or C++ code that is 100% traceable from your design. Please visit <u>state-machine.com/qm</u> for more information about QM™.

The code accompanying this App Note contains three application examples: the Dining Philosopher Problem [AN-DPP], the PEdestrian LIght CONtrolled [AN-PELICAN] crossing, and the "Fly 'n' Shoot" game simulation for the EK-LM3S811 board (see Chapter 1 in [PSiCC2]), all modeled with QM.



**NOTE:** The provided QM model files assume QM version 3.0.0 or higher.


Figure 5: The PELICAN example model opened in the QM™ modeling tool



### 1.4 Licensing QP

The **Generally Available (GA)** distributions of QP available for download from the <u>www.state-machine.com/downloads</u> website are offered under the same licensing options as the QP baseline code. These available licenses are:

- The GNU General Public License version 2 (GPL) as published by the Free Software Foundation and appearing in the file GPL.TXT included in the packaging of every Quantum Leaps software distribution. The GPL *open source* license allows you to use the software at no charge under the condition that if you redistribute the original software or applications derived from it, the complete source code for your application must be also available under the conditions of the GPL (GPL Section 2[b]).
- One of several Quantum Leaps commercial licenses, which are designed for customers who wish to retain the proprietary status of their code and therefore cannot use the GNU General Public License. The customers who license Quantum Leaps software under the commercial licenses do not use the software under the GPL and therefore are not subject to any of its terms.

For more information, please visit the licensing section of our website at: <a href="www.state-machine.com/licensing">www.state-machine.com/licensing</a>.




### 1.5 Licensing QM™

The QM™ graphical modeling tool available for download from the <a href="www.state-machine.com/downloads">www.state-machine.com/downloads</a> website is **free** to use, but is not open source. During the installation you will need to accept a basic End-User License Agreement (EULA), which legally protects Quantum Leaps from any warranty claims, prohibits removing any copyright notices from QM, selling it, and creating similar competitive products.





## 2 Directories and Files

The code for the QP port to ARM Cortex-M with the IAR EWARM toolset is part of the standard QP distribution, which also contains example applications. Specifically, for this port the files are placed in the following directories:

Listing 2: Directories and files pertaining to the ARM Cortex-M QP port with IAR EWARM included in the standard QP distribution.

```
qpc/                              - QP/C directory (qpcpp for QP/C++)
 +-include/                       - QP public include files
 | +-qassert.h                    - QP platform-independent public include
 | +-qevt.h                       - QEvt declaration
 | +-qep.h                        - QEP platform-independent public include
 | +-qf.h                         - QF platform-independent public include
 | +-. . .
 | +-qp_port.h                    - QP platform-dependent public include
 +-ports/                         - QP ports
 | +-arm-cm/                      - ARM Cortex-M port
 | | +-cmsis/                     - CMSIS (Cortex-M Software Interface Standard)
 | | | +-core_cm0plus.h
 | | | +-. . .
 | | +-qk/                        - QK (Quantum Kernel) ports
 | | | +-iar/                     - IAR ARM compiler
 | | | | +-dbg/                   - Debug build
 | | | | | +-libqp_cortex-m3.a    - QP library for Cortex-M3
 | | | | | +-libqp_cortex-m4f.a   - QP library for Cortex-M4F
 | | | | +-rel/                   - Release build
 | | | | +-make_cortex-m3.bat     - Batch file to build QP libraries for Cortex-M3
 | | | | +-make_cortex-m4f.bat    - Batch file to build QP libraries for Cortex-M4F
 | | +-vanilla/                   - "vanilla" ports
 | | | +-iar/                     - IAR ARM compiler
 | | | | +-dbg/                   - Debug build
 | | | | | +-libqp_cortex-m3.a    - QP library for Cortex-M3
 | | | | | +-libqp_cortex-m4f.a   - QP library for Cortex-M4F
 | | | | +-rel/                   - Release build
 | | | | +-spy/                   - Spy build
 | | | | +-make_cortex-m3.bat     - Batch file to build QP libraries for Cortex-M3
 | | | | +-make_cortex-m4f.bat    - Batch file to build QP libraries for Cortex-M4F
```





```
 +-examples/                      - subdirectory containing the QP example files
 | +-arm-cm/                      - ARM Cortex-M port
 | | +-qk/                        - QK examples (preemptive kernel)
 | | | +-iar/                     - IAR ARM compiler
 | | | | +-dpp-qk_ek-tm4c123gxl/  - DPP example for EK-TM4C123GXL (Cortex-M4F)
 | | | | | +-dbg/                 - directory containing the Debug build
 | | | | | +-rel/                 - directory containing the Release build
 | | | | | +-spy/                 - directory containing the Spy build
 | | | | | +-dpp.eww              - IAR workspace for the IAR Embedded Workbench
 | | | | | +-tm4c123gh6pm.icf     - linker command file for TM4C123GH6PM MCU
 | | | | | +-bsp.c                - Board Support Package for the DPP application
 | | | | | +-bsp.h                - BSP header file
 | | | | | +-dpp.qm               - the DPP model file for QM
 | | | | | +-dpp.h                - the DPP header file
 | | | | | +-main.c               - the main function
 | | | | | +-philo.c              - the Philosopher active object
 | | | | | +-table.c              - the Table active object
 | | | | +-dpp-qk_ek-lm3s811/     - Dining Philosophers example for EK-LM3S811
 | | | | | +-dbg/                 - directory containing the Debug build
 | | | | | +-rel/                 - directory containing the Release build
 | | | | | +-spy/                 - directory containing the Spy build
 | | | | | +-dpp.eww              - IAR workspace for the IAR Embedded Workbench
 | | | | | +-lm3s811.icf          - linker command file for LM3S811 MCU
 | | | | | +-lm3s_config.h        - CMSIS-compliant configuration for LM3Sxx MCUs
 | | | | | +-bsp.c                - Board Support Package for the DPP application
 | | | | | +-bsp.h                - BSP header file
 | | | | | +-dpp.qm               - the DPP model file for QM
 | | | | | +-. . .
 | | | | +-game-qk_ek-lm3s811/    - "Fly 'n' Shoot" game example for EK-LM3S811
 | | | | | +-. . .
 | | | | | +-game.eww             - IAR workspace for the IAR Embedded Workbench
 | | | | | +-lm3s811.icf          - linker command file for LM3S811 MCU
 | | | | | +-bsp.c                - Board Support Package for this application
 | | | | | +-bsp.h                - BSP header file
 | | | | | +-game.qm              - the "Fly 'n' Shoot" game model file for QM
 | | | | | +-game.h               - the game header file
 | | | | | +-main.c               - the main function
 | | | | | +-missile.c            - the Missile active object
 | | | | | +-ship.c               - the Ship active object
 | | | | | +-tunnel.c             - the Tunnel active object
 | | +-vanilla/                   - "vanilla" examples (non-preemptive scheduler of QF)
 | | | +-iar/                     - IAR EWARM compiler
 | | | | +-dpp_ek-tm4c123gxl/     - DPP example for EK-TM4C123GXL (Cortex-M4F)
 | | | | | +-. . .
 | | | | +-. . .
```



### 2.1 Building the QP Libraries

All QP components are deployed as libraries that you statically link to your application. The pre-built libraries for QEP, QF, QS, and QK are provided inside the <qp>\ports\arm-cm\ directory (see Listing 2). This section describes the steps you need to take to rebuild the libraries yourself.

**NOTE:** To achieve commonality among different development tools, Quantum Leaps software does not use the vendor-specific IDEs, such as the IAR Embedded Workbench IDE, for building the QP libraries. Instead, QP supports a *command-line* build process based on simple batch scripts.

The code distribution contains the batch file make\_<core>.bat for building all the libraries located in the <qp>\ports\arm-cm\... directory. For example, to build the debug version of all the QP libraries for ARM Cortex-M, with the IAR ARM compiler and the QK kernel, you open a console window on a Windows PC, change directory to <qp>\ports\arm-cm\qk\iar\, and invoke the batch file by typing the following command at the command prompt:

make\_cortex-m3

The build process should produce the QP library in the location <qp>\ports\arm-cm\qk\iar\dbg\. The batch files assume that the ARM toolset has been installed in the directory C:\tools\IAR\ARM 6.5.

**NOTE:** You need to adjust the symbol IAR\_ARM at the top of the batch scripts if you've installed the IAR ARM compiler into a different directory.

In order to take advantage of the QS ("spy") instrumentation, you need to build the QS version of the QP libraries. You achieve this by invoking the make\_cortex-m3.bat utility with the "spy" target, like this:

make\_cortex-m3 spy

The make process should produce the QP libraries in the directory: <qp>\ports\arm-cm\vanilla\iar\spy\.

You choose the build configuration by providing a target to the <code>make\_cortex-m3.bat</code> utility. The default target is "dbg". The other targets are "rel" and "spy". The following table summarizes the targets accepted by <code>make\_cortex-m3.bat</code>.

Table 1: Make targets for the Debug, Release, and Spy software configurations

| Software Version | Build command                             |
|------------------|-------------------------------------------|
| Debug (default)  | make_cortex-m3<br>make_cortex-m4f         |
| Release          | make_cortex-m3 rel<br>make_cortex-m4f rel |
| Spy              | make_cortex-m3 spy<br>make_cortex-m4f spy |



## 2.2 Building and Debugging the Examples

The example applications for ARM Cortex-M have been tested with the EK-LM3S811 evaluation board from Texas Instruments (see Figure 1) and the IAR EWARM toolset. The examples contain the IAR EWARM workspaces, so that you can conveniently build and debug the examples from the IAR IDE. The provided IAR workspaces support building the Debug, Release, and Spy configurations.

**NOTE:** The provided Make files also assume that you have defined the environment variable QPC, if you are using the QP/C framework or the environment variable QPCPP, if you are using the QP/C++ framework. These environment variables must contain the paths to the installation directories of the QP/C and QP/C++ frameworks, respectively.

Defining the QP framework locations in environment variables allows you to locate your application in any directory or file system, regardless of the relative path to the QP frameworks.



Figure 6: Debugging the DPP example in the IAR EWARM IDE



## 3 The Vanilla Port

The "vanilla" port shows how to use QP™ on a "bare metal" ARM Cortex-M-based system with the cooperative "vanilla" kernel. In the "vanilla" version of QP, the only component requiring platform-specific porting is QF. The other two components, QEP and QS, require merely recompilation and will not be discussed here. With the vanilla port you're not using the QK component.

## 3.1 The qep\_port.h Header File

The QEP header file for the ARM Cortex-M port is located in <qp>\ports\arm-cm\vanilla\iar\qep\_port.h.

Listing 3 shows the qep\_port.h header file for ARM Cortex-M/IAR. The IAR compiler is a standard C99 compiler, so I simply include the <stdint.h> header file that defines the platform-specific exact-width integer types.

### Listing 3: The qep\_port.h header file for ARM Cortex-M/IAR.

## 3.2 The QF Port Header File

The QF header file for the ARM Cortex-M port is located in <qp>\ports\arm-cm\vanilla\iar\qf\_port.h. This file specifies the interrupt locking/unlocking policy (QF critical section) as well as the configuration constants for QF (see Chapter 8 in [PSiCC2]).

The most important porting decision you need to make in the qf\_port.h header file is the policy for locking and unlocking interrupts. ARM Cortex-M allows using the simplest "unconditional interrupt unlocking" policy (see Section 7.3.2 of the book "Practical UML Statecharts in C/C++, Second Edition" [PSiCC2]), because ARM Cortex-M is equipped with the standard nested vectored interrupt controller (NVIC) and generally runs ISRs with interrupts unlocked. Listing 4 shows the qf\_port.h header file for ARM Cortex-M/IAR.

#### Listing 4: The qf\_port.h header file for ARM Cortex-M/IAR.

```
     /* The maximum number of active objects in the application, see NOTE1 */
(1)  #define QF_MAX_ACTIVE              32

     /* The number of system clock tick rates */
(2)  #define QF_MAX_TICK_RATE

     /* QF interrupt disable/enable and log2()... */
(3)  #if (__CORE__ == __ARM6M__)         /* Cortex-M0/M0+/M1 ?, see NOTE2 */

(4)      #define QF_INT_DISABLE()       __disable_interrupt()
(5)      #define QF_INT_ENABLE()        __enable_interrupt()

         /* QF-aware ISR priority for CMSIS function NVIC_SetPriority(), NOTE2 */
(6)      #define QF_AWARE_ISR_CMSIS_PRI 0

         /* macro to put the CPU to sleep inside QF_idle() */
(7)      #define QF_CPU_SLEEP() do { \
             __WFI(); \
             QF_INT_ENABLE(); \
         } while (0)

(8)  #else                               /* Cortex-M3/M4/M4F, see NOTE3 */

(9)      #define QF_INT_DISABLE()       __set_BASEPRI(QF_BASEPRI)
(10)     #define QF_INT_ENABLE()        __set_BASEPRI(0U)

         /* BASEPRI limit for QF-aware ISR priorities, see NOTE4 */
(11)     #define QF_BASEPRI             (0xFFU >> 2)

         /* QF-aware ISR priority for CMSIS function NVIC_SetPriority(), NOTE5 */
(12)     #define QF_AWARE_ISR_CMSIS_PRI (QF_BASEPRI >> (8 - __NVIC_PRIO_BITS))

         /* macro to put the CPU to sleep inside QF_idle() */
(13)     #define QF_CPU_SLEEP() do { \
             __disable_interrupt(); \
             QF_INT_ENABLE(); \
             __WFI(); \
             __enable_interrupt(); \
         } while (0)

         /* Cortex-M3/M4/M4F provide the CLZ instruction for fast LOG2 */
(14)     #define QF_LOG2(n_) ((uint8_t)(32U - __CLZ(n_)))
     #endif

     /* QF critical section entry/exit... */
(15) /* QF_CRIT_STAT_TYPE not defined: "unconditional interrupt unlocking" policy */
(16) #define QF_CRIT_ENTRY(dummy)       QF_INT_DISABLE()
(17) #define QF_CRIT_EXIT(dummy)        QF_INT_ENABLE()
(18) #define QF_CRIT_EXIT_NOP()         __no_operation()

(19) #include <intrinsics.h>            /* IAR intrinsic functions */

(18) #include "qep_port.h"              /* QEP port */
(19) #include "qvanilla.h"              /* "Vanilla" cooperative kernel */
(20) #include "qf.h"                    /* QF platform-independent public interface */
```

(1) The QF\_MAX\_ACTIVE specifies the maximum number of active object priorities in the application. You always need to provide this constant. Here, QF\_MAX\_ACTIVE is set to 32 to save some memory. You can increase this limit up to the maximum limit of 63 active object priorities in the system.

**NOTE:** The qf\_port.h header file does not change the default settings for the remaining object sizes inside QF. Please refer to Chapter 8 of [PSiCC2] for a discussion of all configurable QF parameters.

- (2) The QF\_MAX\_TICK\_RATE specifies the maximum number of clock tick rates for QP time events. You don't need to specify this limit, in which case the default of a single clock rate will be chosen.
- (3) As described in Section 1.1.1, the interrupt disabling policy for the ARMv6-M architecture (Cortex-M0/M0+) is different from the policy for the ARMv7-M. The macro \_\_CORE\_\_ is defined as \_\_ARM6M\_\_ based on the command-line parameters for the Cortex-M0/M0+.
- (4) For the ARMv6-M architecture, the interrupt disabling policy uses the PRIMASK register to disable interrupts globally. The QF\_INT\_DISABLE() macro resolves in this case to the intrinsic IAR function \_\_disable\_interrupt(), which in turn generates the single "CPSID i" Thumb2 instruction.
- (5) For the ARMv6-M architecture, the QF\_INT\_ENABLE() macro resolves to the intrinsic IAR function \_\_enable\_interrupt(), which in turn generates the single "CPSIE i" Thumb2 instruction.



- (6) For the ARMv6-M architecture, the QF\_AWARE\_ISR\_CMSIS\_PRI priority level is defined as zero, meaning that all interrupts are "kernel-aware", because all interrupt priorities are disabled by the kernel.
- (7) The macro QF\_CPU\_SLEEP() specifies how to enter the CPU sleep mode safely in the cooperative Vanilla kernel (see also Section 3.5). For the ARMv6-M architecture, the macro QF\_CPU\_SLEEP() first stops the CPU with the WFI instruction (Wait For Interrupt) and after the CPU is woken up by an interrupt, re-enables interrupts with the PRIMASK. This is possible, because the ARM Cortex-M CPU can be woken up by an interrupt, even though PRIMASK is set.
- (8) As described in Section 1.1.1, the interrupt disabling policy for the ARMv7-M architecture (Cortex-M3/M4/M4F) uses the BASEPRI register.
- (9) For the ARMv7-M architecture, the QF\_INT\_DISABLE() macro resolves to the intrinsic IAR function \_\_set\_BASEPRI(QF\_BASEPRI), which sets the BASEPRI register to the value specified in the QF\_BASEPRI argument (see step (11) below).
- (10) For the ARMv7-M architecture, the QF\_INT\_ENABLE() macro resolves to the intrinsic IAR function \_\_set\_BASEPRI(0U), which disables BASEPRI interrupt masking.
- (11) The QF\_BASEPRI value is defined such that it is the lowest priority for the minimum number of 3 priority-bits that the ARMv7-M architecture must provide. This partitions the interrupts into "kernel-unaware" and "kernel-aware" interrupts, as shown in Figure 2 and Figure 3.
- (12) For the ARMv7-M architecture, the QF\_AWARE\_ISR\_CMSIS\_PRI priority level suitable for the CMSIS function NVIC\_SetPriority() is determined by the QF\_BASEPRI value.
- (13) The macro QF\_CPU\_SLEEP() specifies how to enter the CPU sleep mode safely in the cooperative Vanilla kernel (see also Section 3.5). For the ARMv7-M architecture, the macro QF\_CPU\_SLEEP() first disables interrupts by setting the PRIMASK, then clears the BASEPRI to enable all "kernel-aware" interrupts and only then stops the CPU with the WFI instruction (Wait For Interrupt). After the CPU is woken up by an interrupt, interrupts are re-enabled with the PRIMASK. This sequence is necessary, because the ARM Cortex-M3/M4 cores cannot be woken up by any interrupt blocked by the BASEPRI register.
- (14) The macro QF\_LOG2() is defined to take advantage of the CLZ instruction (Count Leading Zeroes), which is available in the ARMv7-M architecture.

**NOTE:** The CLZ instruction is not implemented in the Cortex-M0/M0+/M1 (ARMv6-M architecture). If the QF\_LOG2() macro is not defined, the QP framework will use the log2 implementation based on a lookup table.

- (15) The QF\_CRIT\_STAT\_TYPE is not defined, which means that the simple policy of "unconditional interrupt locking and unlocking" is applied.
- (16) The critical section entry macro disables interrupts by the policy established above.
- (17) The critical section exit macro re-enables interrupts by the policy established above.
- (18) The macro QF\_CRIT\_EXIT\_NOP() provides the protection against merging two critical sections occurring back-to-back in the QP code.
- (19) The IAR header file <intrinsics.h> declares the prototypes of the interrupt locking/unlocking functions.
- (18) This QF port uses the QEP event processor for implementing active object state machines.
- (19) This QF port uses the cooperative "vanilla" kernel.
- (20) The QF port must always include the platform-independent qf.h header file.



## 3.3 Handling Interrupts in the Non-Preemptive Vanilla Kernel

ARM Cortex-M has been specifically designed to enable writing ISRs as plain C functions, without any special interrupt entry or exit requirements. These ISRs are perfectly adequate for the non-preemptive Vanilla kernel.

Typically, ISRs are not part of the generic QP port, because it's much more convenient to define ISRs at the application level. The following listing shows all the ISRs in the DPP example application. Please note that the SysTick\_Handler() ISR invokes QF\_TICK\_X() to perform QF time-event management. (The SysTick\_Handler() also updates the timestamp used in the QS software tracing instrumentation; see the upcoming Section 5.)

**NOTE:** This Application Note complies with the CMSIS standard, which dictates the names of all exception handlers and IRQ handlers.

```
void SysTick_Handler(void) {
    . . .
    QF_TICK_X(0U, &l_SysTick_Handler);    /* process all armed time events */
    . . .
}
```

## 3.3.1 The Interrupt Vector Table

The CMSIS-compliant file startup\_ewarm.c contains an interrupt vector table (also called the exception vector table) starting usually at address 0x00000000, typically in ROM. The vector table contains the initialization value for the main stack pointer on reset, and the entry point addresses for all exception handlers. The exception number defines the order of entries in the vector table.

The ARM Cortex-M architecture requires you to place the initial Main Stack Pointer and the addresses of all exception handlers and ISRs into the Interrupt Vector Table, which is typically allocated in ROM. With the IAR compiler, the vector table is initialized in the startup\_ewarm.c C-language module located in the CMSIS directory.

Listing 5: The interrupt vector table defined in startup\_ewarm.c (IAR compiler).

```
//*****************************************************************************
// Reserve space for the system stack.
//*****************************************************************************
(1) static unsigned long pulStack[STACK_SIZE/sizeof(unsigned long)] @ ".noinit";

//*****************************************************************************
// A union that describes the entries of the vector table. The union is needed
// since the first entry is the stack pointer and the remainder are function
// pointers.
//*****************************************************************************
    typedef union {
        void (*pfnHandler)(void);
        unsigned long ulPtr;
    } uVectorEntry;

//*****************************************************************************
// The vector table. Note that the proper constructs must be placed on this to
// ensure that it ends up at physical address 0x0000.0000.
//*****************************************************************************
(2) __root const uVectorEntry __vector_table[] @ ".intvec" = {
(3)     { .ulPtr = (unsigned long)pulStack + sizeof(pulStack) },
(4)     __iar_program_start,                    // The reset handler
        NMI_Handler,                            // The NMI handler
        HardFault_Handler,                      // The hard fault handler
        0,                                      // Reserved
        SVC_Handler,                            // SVCall handler
        DebugMon_Handler,                       // Debug monitor handler
        0,                                      // Reserved
        PendSV_Handler,                         // The PendSV handler
        SysTick_Handler,                        // The SysTick handler
        // External Interrupts
        GPIOPortA_IRQHandler,                   // GPIO Port A
        GPIOPortB_IRQHandler,                   // GPIO Port B
    };
```

(1) The main stack is statically allocated in the .noinit section.

**NOTE:** You need to define the macro **STACK\_SIZE** (typically on the command line) for your specific application. All QP ports, including the Vanilla port and the QK port, use only the main stack (the C-stack). The user stack pointer is not used at all.

- (2) The vector table is placed in the .intvec section and is declared with the \_\_root extended keyword, which keeps the linker from discarding it.
- (3) The first element of the Cortex-M vector table is the top of the C-stack.
- (4) The reset vector points to the startup code, which in case of the IAR toolset is the \_\_iar\_program\_start() entry point.

## 3.4 Using the FPU in the "Vanilla" Port (Cortex-M4F)

If you have the Cortex-M4F CPU and your application uses the hardware FPU, it should be enabled because it is turned off out of reset. The CMSIS-compliant way of turning the FPU on looks as follows:

```
SCB->CPACR |= (0xFU << 20);
```

**NOTE:** The FPU must be enabled before executing any floating point instruction. An attempt to execute a floating point instruction will fault if the FPU is not enabled.

Depending on whether or not you use the FPU in your ISRs, the "Vanilla" QP port allows you to configure the FPU in various ways, as described in the following sub-sections.

#### 3.4.1 FPU NOT used in the ISRs

If you use the FPU only at the task-level (inside active objects) and **none** of your ISRs use the FPU, you can setup the FPU **not** to use the automatic state preservation and **not** to use the lazy stacking feature as follows:

```
FPU->FPCCR &= ~((1U << FPU_FPCCR_ASPEN_Pos) | (1U << FPU_FPCCR_LSPEN_Pos));
```

With this setting, the Cortex-M4F processor handles the ISRs in the exact-same way as Cortex-M0-M3, that is, only the standard interrupt frame with R0-R3,R12,LR,PC,xPSR is used. This scheme is the fastest and incurs no additional CPU cycles to save and restore the FPU registers.

**NOTE:** This FPU setting will lead to FPU errors if any of the ISRs start to use the FPU.



#### 3.4.2 FPU used in the ISRs

If you use the FPU both at the task-level (inside active objects) and in any of your ISRs as well, you should setup the FPU to use the automatic state preservation and the lazy stacking feature as follows:

```
FPU->FPCCR |= (1U << FPU_FPCCR_ASPEN_Pos) | (1U << FPU_FPCCR_LSPEN_Pos);
```

This will enable the "lazy stacking feature" of the Cortex-M4F processor. The "automatic state saving" and "lazy stacking" are enabled by default, so you typically don't need to change these settings.

**NOTE:** As described in the ARM Application Note "Cortex-M4(F) Lazy Stacking and Context Switching" [ARM AN298], the FPU automatic state saving requires **more stack** plus additional CPU time to save the FPU registers, but only when the FPU is actually used.

## 3.5 Idle Loop Customization in the "Vanilla" Port

As described in Chapter 7 of [PSiCC2], the "vanilla" port uses the non-preemptive scheduler built into QF. If no events are available, the non-preemptive scheduler invokes the platform-specific callback function QF\_onIdle(), which you can use to save CPU power, or perform any other "idle" processing (such as Quantum Spy software trace output).

**NOTE:** The idle callback QF\_onIdle() must be invoked with interrupts disabled, because the idle condition can be changed by any interrupt that posts events to event queues. QF\_onIdle() must internally enable interrupts, ideally atomically with putting the CPU to the power-saving mode (see also Chapter 7 in [PSiCC2]).

Because QF\_onIdle() must enable interrupts internally, the signature of the function depends on the interrupt locking policy. In case of the simple "unconditional interrupt locking and unlocking" policy, which is used in this port, QF\_onIdle() takes no parameters.

Listing 6 shows an example implementation of QF\_onIdle() for the Stellaris MCU. Other embedded microcontrollers (e.g., ST's STM32) handle the power-saving mode very similarly.

### Listing 6: QF\_onIdle() callback.

- (1) The cooperative Vanilla kernel calls the QF\_onIdle() callback with interrupts disabled, to avoid race condition with interrupts that can post events to active objects and thus invalidate the idle condition.
- (2) The sleep mode is used only in the non-debug configuration, because sleep mode stops CPU clock, which can interfere with debugging.



- (3) The Thumb2 instruction WFI (Wait for Interrupt) stops the CPU clock. Note that the CPU stops executing at this line and that interrupts are still **disabled**. An active interrupt first starts the CPU clock again, so the CPU starts executing again. The interrupt that woke the CPU up is serviced only after interrupts are unlocked in line (4).
- (4-5) The QF\_onIdle() callback must re-enable interrupts in every path through the code.

**NOTE:** The idle callback QF\_onIdle() must unlock interrupts in every path through the code.



## 4 The QK Port

This section describes how to use QP on ARM Cortex-M with the **preemptive** QK real-time kernel described in Chapter 10 of [PSiCC2]. The benefit is a very fast, fully deterministic task-level response; the execution timing of the high-priority tasks (active objects) will be virtually insensitive to any changes in the lower-priority tasks. The downside is a bigger RAM requirement for the stack. Additionally, as with any preemptive kernel, you must be very careful to avoid any sharing of resources among concurrently executing active objects, or if you do need to share resources, you need to protect them with the QK priority-ceiling mutex (again see Chapter 10 of [PSiCC2]).

**NOTE:** The preemptive configuration with QK uses **more stack** than the non-preemptive "Vanilla" configuration. You need to adjust the size of this stack to be large enough for your application.

## 4.1 Single-Stack, Preemptive Multitasking on ARM Cortex-M

The ARM Cortex-M architecture provides a rather unorthodox way of implementing preemptive multitasking, which is designed primarily for the traditional real-time kernels that use multiple per-task stacks. This section explains how the run-to-completion preemptive QK kernel works on ARM Cortex-M.

- 1. The ARM Cortex-M processor executes application code in the Privileged Thread mode, which is exactly the mode entered out of reset. The exceptions (including all interrupts) are always processed in the Privileged Handler mode.
- 2. QK uses only the Main Stack Pointer (QK is a single stack kernel). The Process Stack Pointer is not used and is not initialized.
- 3. The QK port uses the PendSV (exception number 14) and the SVCall (exception number 11) to perform asynchronous preemptions and context switch, respectively (see Chapter 10 in [PSiCC2]). The application code (your code) **must** initialize the Interrupt Vector Table with the addresses of the PendSV\_Handler and SVC\_Handler exception handlers. Additionally, the interrupt table must be initialized with the SysTick handler that calls QF\_tick().
- 4. The application code (your code) **must** call the function QK\_init() to set the priority of the PendSV exception to the lowest level in the whole system (0xFF), and the priority of SVCall to the highest in the system (0x00). The function QK\_init() sets the priorities of exceptions 14 and 11 to the numerical values of 0xFF and 0x00, respectively. The priorities are set with interrupts disabled, but the interrupt status is restored upon the function return.

**NOTE:** The Stellaris ARM Cortex-M silicon supports only 3 most-significant bits of priority, therefore writing 0xFF to a priority register reads back 0xE0. See also Section 1.1.2.

- 5. It is strongly recommended that you do **not** assign the lowest priority (0xFF) to any interrupt in your application. With 3 MSB-bits of priority, this leaves the following 7 priority levels for you (listed from the lowest to the highest urgency): 0xC0, 0xA0, 0x80, 0x60, 0x40, 0x20, and 0x00 (the highest priority).
- 6. Every ISR **must** set the pending flag for the PendSV exception in the NVIC. This is accomplished in the macro QK\_ISR\_EXIT(), which **must** be called just before exiting from all ISRs (see upcoming Section 4.3).
- 7. ARM Cortex-M enters interrupt context without locking interrupts (without setting the PRIMASK bit). Generally, you should not lock interrupts inside ISRs. In particular, the QF services QF\_publish(), QF\_tick(), and QActive\_postFIFO() should be called with interrupts enabled, to avoid nesting of critical sections.



**NOTE:** If you don't wish an interrupt to be preempted by another interrupt, you can always prioritize that interrupt in the NVIC to a higher level (use a lower numerical value of priority).

- 8. In ARM Cortex-M, the whole prioritization of interrupts, including the PendSV exception, is performed entirely by the NVIC. Because the PendSV has the lowest priority in the system, the NVIC tail-chains to the PendSV exception only after exiting the last nested interrupt.
- 9. The restoring of the 8 registers comprising the interrupt stack frame in PendSV is wasteful in a single-stack kernel (see Figure 7(3) and (8)), but is necessary to perform full interrupt return from PendSV to signal End-Of-Interrupt to the NVIC.
- 10. The pushing of the 8 registers comprising the interrupt stack frame upon entry to SVCall is wasteful in a single-stack kernel (see Figure 7(10) and (12)), but is necessary to perform full interrupt return to the preempted context through the SVCall's return.
- 11. For Cortex-M4F processors with hardware FPU, the application can choose between two policies of using the FPU (see Section 4.2).

## 4.1.1 Examples of Various Preemption Scenarios in QK

Figure 7 illustrates several preemption scenarios in QK.



Figure 7: Various preemption scenarios in the QK preemptive kernel for ARM Cortex-M.

- (0) The time line in Figure 7 begins with the QK executing the idle loop.
- (1) At some point an interrupt occurs and the CPU immediately suspends the idle loop, pushes the interrupt stack frame to the Main Stack and starts executing the ISR.
- (2) The ISR performs its work, and in QK always sets the pending flag for the **PendSV** exception in the NVIC. The priority of the PendSV exception is configured to be the lowest of all exceptions, so the ISR continues executing and PendSV exception remains pending. At the ISR return, the CPU performs tail-chaining to the pending PendSV exception.



- (3) The whole job of the PendSV exception is to synthesize an interrupt stack frame on top of the stack and perform an interrupt return.
- (4) The PC (exception return address) of the synthesized stack frame is set to QK\_schedule() (more precisely to a thin wrapper around QK\_schedule(), see Section 4.4), so the PendSV exception returns to the QK scheduler. The scheduler discovers that the Low-priority task is ready to run (the ISR has posted an event to this task). The QK scheduler enables interrupts and launches the Low-priority task, which is simply a C-function call in QK. The Low-priority task (active object) starts running. Some time later another interrupt occurs. The Low-priority task is suspended and the CPU pushes the interrupt stack frame to the Main Stack and starts executing the ISR.
- (5) The Low-priority ISR runs and sets the pending flag for the PendSV exception in the NVIC. Before the Low-priority ISR completes, it too gets preempted by a High-priority ISR. The CPU pushes another interrupt stack frame and starts executing the High-priority ISR.
- (6) The High-priority ISR again sets the pending flag for the PendSV exception (setting an already set flag is not an error). When the High-priority ISR returns, the NVIC does not tail-chain to the PendSV exception, because a higher-priority ISR than PendSV is still active. The NVIC performs the normal interrupt return to the preempted Low-priority interrupt, which finally completes.
- (7) Upon the exit from the Low-priority ISR, the NVIC performs tail-chaining to the pending PendSV exception.
- (8) The PendSV exception synthesizes an interrupt stack frame to return to the QK scheduler.
- (9) The QK scheduler detects that the High-priority task is ready to run and launches the High-priority task (normal C-function call). The High-priority task runs to completion and returns to the scheduler. The scheduler does not find any more higher-priority tasks to execute and needs to return to the preempted task. The only way to restore the interrupted context in ARM Cortex-M is through the interrupt return, but the task is executing outside of the interrupt context (in fact, tasks are executing in the Privileged Thread mode). The task enters the Handler mode by causing the synchronous **SVCall** exception.
- (10) The only job of the SVCall exception is to discard its own interrupt stack frame and return using the interrupt stack frame that has been on the stack from the moment of task preemption.
- (11) The Low-priority task, which has been preempted all that time, resumes and finally runs to completion and returns to the QK scheduler. The scheduler does not find any more tasks to launch and causes the synchronous SVCall exception.
- (12) The SVCall exception discards its own interrupt stack frame and returns using the interrupt stack frame from the preempted task context.

## 4.2 Using the FPU with the preemptive QK kernel (Cortex-M4F)

If you have the Cortex-M4F CPU and your application uses the hardware FPU, it should be enabled because it is turned off out of reset. The CMSIS-compliant way of turning the FPU on looks as follows:

SCB->CPACR |= (0xFU << 20);

**NOTE:** The FPU must be enabled before executing any floating point instruction. An attempt to execute a floating point instruction will fault if the FPU is not enabled.

Depending on how you use the FPU in your tasks (active objects) and ISRs, the QK port allows you to configure the FPU in various ways, as described in the following sub-sections.



## 4.2.1 FPU used in ONE task only and not in any ISRs

If you use the FPU only in a single task (active object) and **none** of your ISRs use the FPU, you can set up the FPU **not** to use the automatic state preservation and **not** to use the lazy stacking feature as follows:

```
FPU->FPCCR &= ~((1U << FPU_FPCCR_ASPEN_Pos) | (1U << FPU_FPCCR_LSPEN_Pos));
```

With this setting, the Cortex-M4F processor handles the ISRs in the exact-same way as Cortex-M0-M3, that is, only the standard interrupt frame with R0-R3,R12,LR,PC,xPSR is used. This scheme is the fastest and incurs no additional CPU cycles to save and restore the FPU registers.

**NOTE:** This FPU setting will lead to **FPU errors** if more than one task or any of the ISRs start to use the FPU.

#### 4.2.2 FPU used in more than one task or the ISRs

If you use the FPU in more than one of the tasks (active objects) or in any of your ISRs, you should setup the FPU to use the automatic state preservation and the lazy stacking feature as follows:

```
FPU->FPCCR |= (1U << FPU_FPCCR_ASPEN_Pos) | (1U << FPU_FPCCR_LSPEN_Pos);
```

This is actually the default setting of the hardware FPU and is **recommended for the QK port**, because it is safer in view of code evolution. Future changes to the application can easily introduce FPU use in multiple active objects, which would be unsafe if the FPU context was not preserved automatically.

**NOTE:** As described in the ARM Application Note "Cortex-M4(F) Lazy Stacking and Context Switching" [ARM AN298], the FPU automatic state saving requires **more stack** plus additional CPU time to save the FPU registers, but only when the FPU is actually used.

## 4.3 The QK Port Header File

In the QK port, you use a configuration very similar to that of the "Vanilla" port described earlier. This section describes only the differences specific to the QK component.

**NOTE:** Like any **preemptive** kernel, QK needs to be notified about entering and exiting the interrupt context in order to perform a context switch, if necessary.

#### Listing 7: qk\_port.h header file

```
(1) #define QK_ISR_ENTRY() do { \
(2)    QF_INT_DISABLE(); \
(3)    ++QK_intNest_; \
(4)    QF_INT_ENABLE(); \
    } while (0)

(5) #define QK_ISR_EXIT() do { \
(6)    QF_INT_DISABLE(); \
(7)    --QK_intNest_; \
```



- (1) The QK\_ISR\_ENTRY() macro notifies QK about entering an ISR. The macro body is surrounded by the do {...} while (0) loop, which is the standard way of grouping instructions without creating a dangling-else or other syntax problems. In ARM Cortex-M, this macro is called with interrupts unlocked, because the ARM Cortex-M hardware does not set the PRIMASK upon interrupt entry.
- (2) Interrupts are disabled at the ARM Cortex-M core level to perform the following actions atomically.
- (3) The QK interrupt nesting level QK\_intNest\_ is incremented to account for entering an ISR. This prevents invoking the QK scheduler from event posting functions (such as QACTIVE\_POST() or QACTIVE\_POST\_LIFO()) to perform a synchronous preemption.
- (4) Interrupts are enabled at the ARM Cortex-M core level to allow interrupt preemptions.
- (5) The QK\_ISR\_EXIT() macro notifies QK about exiting an ISR.
- (6) Interrupts are disabled at the ARM Cortex-M core level to perform the following actions atomically.
- (7) The QK interrupt nesting level QK\_intNest\_ is decremented to account for exiting an ISR. This balances step (3).
- (8) This test calls the QK\_schedPrio\_() function to check whether a higher-priority task than the current one exists. If such a task exists, the QK\_schedPrio\_() function returns its priority; otherwise it returns 0.

**NOTE:** This test is repeated in PendSV\_Handler() before actually invoking the QK scheduler. However, testing this condition sooner is an optimization that avoids pending the exception when no context switch is needed.

(9) This write to the NVIC\_INT\_CTRL register sets the pending flag for the PendSV exception.

**NOTE:** Setting the pending flag for the PendSV exception in every ISR is absolutely **critical** for proper operation of QK. It really does not matter at which point during the ISR execution this happens. Here the PendSV is pended at the exit from the ISR, but it could as well be pended upon the entry to the ISR, or anywhere in the middle.

- (10) Interrupts are enabled to perform regular exit from the ISR.
- (11) The QK port header file must include the platform-independent QK interface qk.h.
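The bookkeeping performed by the two macros can be tried out on a host PC. The following is a minimal sketch under stated assumptions, not the actual port code: all names (`model_isr_entry()`, `model_isr_exit()`, `model_isr_post()`, `model_sched_prio()`, and the four variables) are hypothetical stand-ins for the QK internals and for the PendSV pending flag in the NVIC, and the interrupt disabling of steps (2), (4), (6), and (10) is omitted because the model runs in a single thread.

```c
#include <stdint.h>

/* Host-side model only; every name here is a hypothetical stand-in. */
static uint8_t intNest;        /* models QK_intNest_ */
static uint8_t pendsvPending;  /* models the PendSV pending flag in the NVIC */
static uint8_t readyPrio;      /* highest priority that has an event queued */
static uint8_t currPrio;       /* priority of the preempted task */

/* models QK_schedPrio_(): nonzero only if a higher-priority task is ready */
static uint8_t model_sched_prio(void) {
    return (readyPrio > currPrio) ? readyPrio : (uint8_t)0;
}

/* models QK_ISR_ENTRY(), step (3) */
static void model_isr_entry(void) {
    ++intNest;
}

/* models QK_ISR_EXIT(), steps (7)-(9) */
static void model_isr_exit(void) {
    --intNest;
    if (model_sched_prio() != (uint8_t)0) {
        pendsvPending = 1U;    /* step (9): pend the PendSV exception */
    }
}

/* models an ISR posting an event to the task of priority 'prio' */
static void model_isr_post(uint8_t prio) {
    if (prio > readyPrio) {
        readyPrio = prio;
    }
}
```

In this model, two nested ISRs that post an event to a priority-6 task while a priority-3 task is preempted leave `intNest` at zero and `pendsvPending` set after both exits, whereas posting to a priority-2 task leaves the flag clear.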

## 4.3.1 The QK Critical Section

The interrupt disabling policy in the QK port is the same as in the vanilla port. Please refer to the earlier Section 3.2 for the description of the critical section implementation.



## 4.4 QK Platform-Specific Code for ARM Cortex-M

The QK port to ARM Cortex-M requires coding the PendSV and SVCall exceptions in assembly. This ARM Cortex-M-specific code is located in the file <qp>\ports\arm-cm\qk\iar\qk\_port.s.

### Listing 8: QK\_init() function for ARM Cortex-M (file qk\_port.s)

```
         RSEG CODE:CODE:NOROOT(2)

         PUBLIC  QK_init
         PUBLIC  PendSV_Handler      ; CMSIS-compliant PendSV exception name
         PUBLIC  SVC_Handler         ; CMSIS-compliant SVC exception name

         EXTERN  QK_schedPrio_       ; external reference
         EXTERN  QK_sched_           ; external reference

     ; The QK_init function sets the priorities of PendSV and SVCall exceptions
     ; to 0xFF and 0x00, respectively. The function internally disables
     ; interrupts, but restores the original interrupt lock before exit.
 (1) QK_init:
 (2)     MRS     r0,PRIMASK          ; store the state of the PRIMASK in r0
 (3)     CPSID   i                   ; disable interrupts (set PRIMASK)

 (4)     LDR     r1,=0xE000ED18      ; System Handler Priority Register
 (5)     LDR     r2,[r1,#8]          ; load the System 12-15 Priority Register
 (6)     MOVS    r3,#0xFF
 (7)     LSLS    r3,r3,#16
 (8)     ORRS    r2,r3               ; set PRI_14 (PendSV) to 0xFF
 (9)     STR     r2,[r1,#8]          ; write the System 12-15 Priority Register
(10)     LDR     r2,[r1,#4]          ; load the System 8-11 Priority Register
(11)     LSLS    r3,r3,#8
(12)     BICS    r2,r3               ; set PRI_11 (SVCall) to 0x00
(13)     STR     r2,[r1,#4]          ; write the System 8-11 Priority Register

(14)     MSR     PRIMASK,r0          ; restore the original PRIMASK
(15)     BX      lr                  ; return to the caller
```

- (1) The QK\_init() function sets the priority of the PendSV exception (number 14) to the lowest level 0xFF. The priority of the SVCall exception (number 11) is set to the highest level 0x00 to avoid preemption of this exception.
- (2) The PRIMASK register is stored in r0.
- (3) Interrupts are locked by setting the PRIMASK.
- (4) The address of the NVIC System Handler Priority Register 0 is loaded into r1.
- (5) The contents of the NVIC System Handler Priority Register 2 (note the offset of 8) are loaded into r2.
- (6-7) The mask value of 0xFF0000 is synthesized in r3.
- (8) The mask is then applied to set the priority byte PRI\_14 to 0xFF without changing the other priority bytes in this register.
- (9) The contents of r2 are stored in the NVIC System Handler Priority Register 2 (note the offset of 8).
- (10) The contents of the NVIC System Handler Priority Register 1 (note the offset of 4) are loaded into r2.
- (11) The mask value of 0xFF000000 is synthesized in r3.
- (12) The mask is then applied to set the priority byte PRI\_11 to 0x00 without changing the other priority bytes in this register.
- (13) The contents of r2 are stored in the NVIC System Handler Priority Register 1 (note the offset of 4).
- (14) The original PRIMASK value is restored.
- (15) The function QK\_init() returns to the caller.
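The two read-modify-write operations of steps (5)-(13) can be checked in C on plain 32-bit values. This is a minimal sketch, assuming the byte layout described above (PRI\_14 in bits 23:16 of the System 12-15 register, PRI\_11 in bits 31:24 of the System 8-11 register); the function and macro names are hypothetical and do not appear in the port.

```c
#include <stdint.h>

#define PRI_14_MASK  ((uint32_t)0xFFU << 16)  /* PendSV byte, System 12-15 register */
#define PRI_11_MASK  ((uint32_t)0xFFU << 24)  /* SVCall byte, System 8-11 register */

/* steps (6)-(8): force PRI_14 to 0xFF, leave the other three bytes alone */
static uint32_t shpr_pendsv_lowest(uint32_t shpr12_15) {
    return shpr12_15 | PRI_14_MASK;
}

/* steps (11)-(12): force PRI_11 to 0x00, leave the other three bytes alone */
static uint32_t shpr_svcall_highest(uint32_t shpr8_11) {
    return shpr8_11 & ~PRI_11_MASK;
}
```

For example, an input of 0x12345678 to `shpr_pendsv_lowest()` yields 0x12FF5678, and an input of 0xAB345678 to `shpr_svcall_highest()` yields 0x00345678.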

#### Listing 9: PendSV\_Handler() function for ARM Cortex-M (file qk\_port.s).

```
 (1) PendSV_Handler:
 (2)     PUSH    {lr}                ; push the exception lr (EXC_RETURN)

 (3) #if (__CORE__ == __ARM6M__)     ; Cortex-M0/M0+/M1 ?
 (4)     CPSID   i                   ; disable interrupts (set PRIMASK)
 (5) #else                           ; Cortex-M3/M4/M4F
 (6)     MOVS    r0,#(0xFF >> 2)     ; Keep in synch with QF_BASEPRI in qf_port.h!
 (7)     MSR     BASEPRI,r0          ; disable interrupts at processor level
     #endif

 (8)     BL      QK_schedPrio_       ; check if we have preemption
 (9)     CMP     r0,#0               ; is prio == 0 ?
(10)     BNE.N   scheduler           ; if prio != 0, branch to scheduler

     #if (__CORE__ == __ARM6M__)     ; Cortex-M0/M0+/M1 ?
(11)     CPSIE   i                   ; enable interrupts (clear PRIMASK)
     #else                           ; Cortex-M3/M4/M4F
(12)     MSR     BASEPRI,r0          ; enable interrupts (r0 == 0 at this point)
     #endif

(13)     POP     {r0}                ; pop the EXC_RETURN into r0 (low register)
(14)     BX      r0                  ; exception-return to the task

(15) scheduler:
(16)     SUB     sp,sp,#4            ; align the stack to 8-byte boundary
(17)     MOVS    r3,#1
(18)     LSLS    r3,r3,#24           ; r3:=(1 << 24), set the T bit  (new xpsr)
(19)     LDR     r2,=QK_sched_       ; address of the QK scheduler   (new pc)
(20)     LDR     r1,=svc_ret         ; return address after the call (new lr)
(21)     PUSH    {r1-r3}             ; push xpsr,pc,lr
(22)     SUB     sp,sp,#(4*4)        ; don't care for r12,r3,r2,r1
(23)     PUSH    {r0}                ; push the prio argument        (new r0)
(24)     MOVS    r0,#0x6
(25)     MVNS    r0,r0               ; r0:=~0x6=0xFFFFFFF9
(26)     BX      r0                  ; exception-return to the scheduler

(27) svc_ret:
     #if (__CORE__ == __ARM6M__)     ; Cortex-M0/M0+/M1 ?
(28)     CPSIE   i                   ; enable interrupts (clear PRIMASK)
     #else                           ; Cortex-M3/M4/M4F
(29)     MOVS    r0,#0
(30)     MSR     BASEPRI,r0          ; enable interrupts
     #endif

(31) #ifdef __ARMVFP__               ; If Vector FPU used--clear CONTROL[2] (FPCA bit)
(32)     MRS     r0,CONTROL          ; r0 := CONTROL
(33)     MOVS    r1,#4               ; r1 := 0x04 (FPCA bit)
(34)     BICS    r0,r1               ; r0 := r0 & ~r1
(35)     MSR     CONTROL,r0          ; CONTROL := r0
     #endif

(36)     SVC     #0                  ; SV exception returns to the preempted task
```

- (1) The PendSV\_Handler exception is always entered via tail-chaining from the last nested interrupt (see Section 4.1).
- (2) The exception lr (EXC\_RETURN) is pushed to the stack.

**NOTE:** In the presence of the FPU (Cortex-M4F), the EXC\_RETURN[4] bit carries the information about the stack frame format used, where EXC\_RETURN[4] == 0 means that the stack contains room for the S0-S15 and FPSCR registers in addition to the usual R0-R3, R12, LR, PC, xPSR registers. This information must be preserved in order to properly return from the exception at the end.

- (3) For the ARMv6-M architecture...
- (4) Interrupts are disabled by setting the PRIMASK.
- (5) For the ARMv7-M architecture...
- (6-7) Interrupts are disabled by setting the BASEPRI register.

**NOTE:** The value moved to BASEPRI must be identical to QF\_BASEPRI used in qf\_port.h, see Section 3.2.

- (8) The function <code>QK\_schedPrio\_</code> is called to find the highest-priority task ready to run. The function is designed to be called with interrupts disabled and returns the priority of this task (in r0), or zero if the currently preempted task has the highest priority.
- (9) The returned priority is tested against zero.
- (10) The branch to the QK scheduler (label scheduler) is taken if the priority is not zero.
- (11) For the ARMv6-M architecture, interrupts are enabled by clearing the PRIMASK.
- (12) For the ARMv7-M architecture, interrupts are enabled by setting the BASEPRI register to zero. (Please note that r0 must be zero at this point, so MOV r0, #0 is skipped).
- (13) The saved EXC\_RETURN is popped from the stack to r0. NOTE: the r0 register is used instead of lr because the Cortex-M0 instruction set has only limited access to the high registers (r8-r15).
- (14) This BX instruction causes exception-return to the preempted task. (Exception-return pops the 8-register exception stack frame and changes the processor state to the task-level).
- (15) The scheduler label is reached only when the function <code>QK\_schedPrio\_</code> has returned non-zero task priority. This means that the QK scheduler needs to be invoked to call this task and potentially any tasks that nest on it. The call to the QK scheduler must also perform the mode switch to the task-level.
- (16) The stack pointer is aligned to the 8-byte boundary.



**NOTE:** The exception stack frame that is about to be built on top of the current stack must be aligned at an 8-byte boundary. This alignment has been lost in step (2), where the EXC\_RETURN from lr has been pushed to the stack. In step (16), the stack is aligned again by growing the stack by four more bytes. (The stack grows towards lower addresses in ARM Cortex-M, so the stack pointer is decremented).

- (17-18) The value (1 << 24) is synthesized in r3. This value is going to be stacked and later restored to xPSR register (only the T bit set).
- (19) The address of the QK scheduler function QK\_sched\_ is loaded into r2. This will be pushed to the stack as the PC register value.
- (20) The address of the svc\_ret label is loaded into r1. This will be pushed to the stack as the lr register value.

**NOTE:** The address of the <code>svc\_ret</code> label must be a THUMB address, that is, the least-significant bit of this address must be set (this address must be an **odd** number). This is essential for the correct return of the QK scheduler with setting the THUMB bit in the PSR. Without the LS-bit set, the ARM Cortex-M CPU will clear the T bit in the PSR and cause the Hard Fault. The IAR assembler/linker synthesizes the correct THUMB address of the <code>svc\_ret</code> label without any extra work on the programmer's part.

- (21) Registers r3, r2 and r1 are pushed onto the stack.
- (22) The stack pointer is adjusted to leave room for 4 registers. The actual stack contents for these registers is irrelevant.
- (23) The original priority returned in r0 from QK\_schedPrio\_ is pushed to the stack. This will be restored to r0 register value. This operation completes the synthesis of the exception stack frame. After this step the stack looks as follows:

```
Hi memory
           (optionally S0-S15, FPSCR), if EXC_RETURN[4]==0
           xPSR
           pc (interrupt return address)
           lr
           r12
           r3
           r2
           r1
           r0
           EXC_RETURN (pushed in Listing 9(2))
old SP --> "aligner" (added in Listing 9(16))
           xPSR == 0x01000000
           PC == QK_sched_
           lr == svc_ret
           r12 don't care
           r3 don't care
           r2 don't care
           r1 don't care
    SP --> r0 == priority returned from QK_schedPrio_()
Low memory
```

(24-25) The special exception-return value 0xFFFFFFF9 is synthesized in r0 (two instructions are used to make the code compatible with Cortex-M0, which has no barrel shifter). NOTE: the r0 register is used instead of lr because the Cortex-M0 instruction set has only limited access to the high registers (r8-r15).



**NOTE:** The exception-return value is consistent with the synthesized stack-frame with the lr[4] bit set to 1, which means that the FPU registers are **not** included in this stack frame.

(26) The PendSV exception returns using the special value 0xFFFFFFF9 in the r0 register (return to Privileged Thread mode using the Main Stack pointer). The synthesized stack frame actually causes a function call to the QK\_sched\_() function in C.

**NOTE:** The return from the PendSV exception just executed switches the ARM Cortex-M core to the Privileged Thread mode. The QK\_sched\_ function re-enables interrupts before launching any task, so the tasks always run in the Thread mode with interrupts enabled and can be preempted by interrupts of any priority.

**NOTE:** In the presence of the FPU, the exception-return to the QK scheduler does **not** change any of the FPU status bits, such as CONTROL.FPCA or LSPACT.

- (27) The QK scheduler QK\_sched\_() returns to the svc\_ret label, because this return address is pushed to the stack in step (21). Please note that the address of the svc\_ret label must be a THUMB address (see also NOTE after step (20)).
- (28) For the ARMv6-M architecture, interrupts are enabled by clearing the PRIMASK.
- (29-30) For the ARMv7-M architecture, interrupts are enabled by setting the BASEPRI register to zero. (Unlike in step (12), r0 is not known to be zero at this point, so it is cleared explicitly in step (29)).
- (31) The following code is assembled conditionally only when the FPU is actually used.
- (32-35) The read-modify-write code clears bit 2 of the CONTROL register. This bit, called CONTROL.FPCA (Floating-Point Context Active), causes generating the FPU-type stack frame, if the bit is set and the "automatic state saving" of the FPU is configured.

**NOTE:** Clearing the CONTROL.FPCA bit is safe in this situation, because the SVC exception is not using the FPU. Also, note that the CONTROL.FPCA bit is restored from ~EXC\_RETURN[4] when the SVC exception returns to the task level (see Listing 10(3)).

(36) The synchronous SVC exception is called to put the CPU into the exception mode and correctly return to the thread level.
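The stack frame synthesized in steps (17)-(23) can be pictured as a C structure. The following is a minimal sketch, assuming the basic (non-FPU) 8-register frame layout shown in the diagram above; the type name `ExcFrame`, the function `make_sched_frame()`, and any addresses passed to it are hypothetical and serve only to illustrate which values end up in which slot.

```c
#include <stdint.h>

/* Basic (non-FPU) exception stack frame, lowest address first.
 * Hypothetical type for illustration; the port builds this frame
 * directly on the stack in assembly. */
typedef struct {
    uint32_t r0;    /* priority returned from QK_schedPrio_() */
    uint32_t r1;    /* don't care */
    uint32_t r2;    /* don't care */
    uint32_t r3;    /* don't care */
    uint32_t r12;   /* don't care */
    uint32_t lr;    /* address of svc_ret */
    uint32_t pc;    /* address of QK_sched_ */
    uint32_t xpsr;  /* only the T bit set */
} ExcFrame;

#define XPSR_T_BIT ((uint32_t)1U << 24)

/* fills a frame the way steps (17)-(23) of Listing 9 do */
static ExcFrame make_sched_frame(uint32_t prio,
                                 uint32_t schedAddr,
                                 uint32_t retAddr)
{
    ExcFrame f;
    f.r0   = prio;        /* step (23) */
    f.r1   = 0U;          /* step (22): contents irrelevant */
    f.r2   = 0U;
    f.r3   = 0U;
    f.r12  = 0U;
    f.lr   = retAddr;     /* steps (20)-(21) */
    f.pc   = schedAddr;   /* steps (19), (21) */
    f.xpsr = XPSR_T_BIT;  /* steps (17)-(18), (21) */
    return f;
}
```

The structure occupies 8 words (32 bytes), which is why the stack pointer must be 8-byte aligned before the frame is built.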

#### Listing 10: SVC\_Handler() function for ARM Cortex-M (file qk\_port.s).

```
    ; The SVC_Handler exception handler is used for returning back to the
    ; interrupted task. The SVCall exception simply removes its own interrupt
    ; stack frame from the stack and returns to the preempted task using the
    ; interrupt stack frame that must be at the top of the stack.
(1) SVC_Handler:
(2)     ADD     sp,sp,#(9*4)        ; remove one 8-register exception frame
                                    ; plus the "aligner" from the stack
(3)     POP     {r0}                ; pop the original EXC_RETURN into r0
(4)     BX      r0                  ; return to the preempted task

        ALIGNROM 2,0xFF             ; make sure the END is properly aligned
```

(1) The job of the SVCall exception is to discard its own stack frame and cause the exception-return to the original preempted task context. The stack contents just before returning from SVCall exception is shown below:



```
Hi memory
           (optionally S0-S15, FPSCR), if EXC_RETURN[4]==0
           xPSR
           pc (interrupt return address)
           lr
           r12
           r3
           r2
           r1
           r0
    SP --> EXC_RETURN (pushed in Listing 9(2))
           "aligner" (added in Listing 9(16))
           xPSR don't care
           PC don't care
           lr don't care
           r12 don't care
           r3 don't care
           r2 don't care
           r1 don't care
old SP --> r0 don't care
Low memory
```

- (2) The stack pointer is adjusted to un-stack the 8 registers of the interrupt stack frame corresponding to the SVCall exception itself plus the "aligner" added to the stack in Listing 9(16).
- (3) The EXC\_RETURN saved in Listing 9(2) is popped from the stack into r0 (low register for Cortex-M0 compatibility).
- (4) SVCall exception returns to the interrupted task level using the original EXC\_RETURN, which codifies the stack frame type.
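The way EXC\_RETURN encodes the return context can be illustrated with a few bit tests. This is a minimal sketch with hypothetical function names, assuming the architectural meaning of the low bits of EXC\_RETURN (bit 4: frame type, bit 3: return to Thread mode, bit 2: stack pointer selection); it is not part of the port.

```c
#include <stdint.h>

/* EXC_RETURN[4] == 0: the frame also holds S0-S15 and FPSCR (Cortex-M4F) */
static int exc_return_has_fpu_frame(uint32_t excReturn) {
    return (excReturn & ((uint32_t)1U << 4)) == 0U;
}

/* EXC_RETURN[3] == 1: return to Thread mode (0: return to Handler mode) */
static int exc_return_to_thread(uint32_t excReturn) {
    return (excReturn & ((uint32_t)1U << 3)) != 0U;
}

/* EXC_RETURN[2] == 0: return using the Main Stack pointer (1: Process SP) */
static int exc_return_uses_msp(uint32_t excReturn) {
    return (excReturn & ((uint32_t)1U << 2)) == 0U;
}
```

Applied to the value 0xFFFFFFF9 synthesized in Listing 9(24-25), the three functions report a basic frame without FPU registers, a return to Thread mode, and the use of the Main Stack pointer.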

## 4.5 Setting up and Starting Interrupts in QF\_onStartup()

Setting up interrupts (e.g., SysTick) for the preemptive QK kernel is identical to the non-preemptive case. Please refer to Section 3.3 (Handling Interrupts in the Non-Preemptive Vanilla Kernel).

## 4.6 Writing ISRs for QK

QK must be informed about entering and exiting every ISR, so that it can perform asynchronous preemptions. You inform the QK kernel about the ISR entry and exit through the macros QK\_ISR\_ENTRY() and QK\_ISR\_EXIT(), respectively. You need to call these macros in every ISR. Examples of such ISRs can be found in the file <qp>\examples\arm-cm\qk\iar\dpp-qk-ev-lm3s811\bsp.c.
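As a minimal sketch of the required ISR shape (not the actual contents of bsp.c), the following host-compilable example brackets the ISR body with the two macros. The macro definitions, the `trace` buffer, and `tick_processing()` are hypothetical stand-ins that only record the call order; in a real application the macros come from the QK port header and the body performs the actual interrupt processing.

```c
#include <stdint.h>

/* Hypothetical stand-ins so that the sketch compiles on a host PC. */
static char     trace[8];
static uint32_t traceLen;

#define QK_ISR_ENTRY()  ((void)(trace[traceLen++] = 'E'))
#define QK_ISR_EXIT()   ((void)(trace[traceLen++] = 'X'))

/* stand-in for the ISR body, e.g., clock-tick processing */
static void tick_processing(void) {
    trace[traceLen++] = 'T';
}

void SysTick_Handler(void) {
    QK_ISR_ENTRY();      /* inform QK about entering the ISR */
    tick_processing();   /* the actual work of the ISR */
    QK_ISR_EXIT();       /* inform QK about exiting the ISR */
}
```

The important property is that the entry and exit macros are the first and the last statements of the ISR, so that all event posting happens between them.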



## 4.7 QK Idle Processing Customization in QK\_onIdle()

QK can very easily detect the situation when no events are available, in which case QK calls the QK\_onIdle() callback. You can use QK\_onIdle() to suspend the CPU to save power, if your CPU supports such a power-saving mode. Please note that QK\_onIdle() is called repetitively from the event loop whenever the event loop has no more events to process, in which case only an interrupt can provide new events. The QK\_onIdle() callback is called with interrupts **enabled** (which is in contrast to the QF\_onIdle() callback used in the non-preemptive configuration, see Section 3.5).

The Thumb-2 instruction set used exclusively in ARM Cortex-M provides a special instruction WFI (Wait-for-Interrupt) for stopping the CPU clock, as described in the "ARMv7-M Reference Manual" [ARMv7-M]. The following Listing 11 shows the QK\_onIdle() callback that puts ARM Cortex-M into the idle power-saving mode.

### Listing 11 QK\_onIdle() for the preemptive QK configuration.

```
(1) void QK_onIdle(void) {
        /* toggle the User LED on and then off, see NOTE01 */
(2)     QF_INT_DISABLE();
(3)     GPIOC->DATA_Bits[USER_LED] = USER_LED;   /* turn the User LED on  */
(4)     GPIOC->DATA_Bits[USER_LED] = 0;          /* turn the User LED off */
(5)     QF_INT_ENABLE();

(6) #ifdef Q_SPY
        . . .
(7) #elif defined NDEBUG          /* sleep mode interferes with debugging */
        /* put the CPU and peripherals to the low-power mode, see NOTE02
        * you might need to customize the clock management for your application,
        * see the datasheet for your particular ARM Cortex-M MCU.
        */
(8)     __WFI();                                    /* Wait-For-Interrupt */
    #endif
    }
```

- (1) The QK\_onIdle() function is called with interrupts enabled.
- (2) The interrupts are disabled to prevent preemptions when the LED is on.
- (3-4) This QK port uses the USER LED of the EV-LM3S811 board to visualize the idle loop activity. The LED is rapidly toggled on and off as long as the idle condition is maintained, so the brightness of the LED is proportional to the CPU idle time (the wasted cycles). Please note that the LED is on in the critical section, so the LED intensity does not reflect any ISR or other processing.
- (5) Interrupts are re-enabled.

**NOTE:** Obviously, toggling the USER LED is optional and is not necessary for correctness of the QK-port. You can eliminate code in lines (2-5) in your application.

- (6) This part of the code is only used in the QSpy build configuration. In this case the idle callback is used to transmit the trace data using the UART of the ARM Cortex-M device.
- (7) The following code is only executed when no debugging is necessary (release version).
- (8) The WFI instruction is generated using inline assembly.



## 4.8 Testing QK Preemption Scenarios

The DPP example application includes special instrumentation for convenient testing of various preemption scenarios, such as those illustrated in Figure 8.

The technique described in this section will allow you to trigger an interrupt at any machine instruction and observe the preemptions it causes. The interrupt used for the testing purposes is the GPIOA interrupt (INTID == 0). The ISR for this interrupt is shown below:

```
void GPIOPortA_IRQHandler(void) {
    QK_ISR_ENTRY();
    QActive_postFIFO(AO_Table, Q_NEW(QEvent, MAX_PUB_SIG)); /* for testing */
    QK_ISR_EXIT();
}
```

The ISR, like all interrupts in the system, invokes the macros <code>QK\_ISR\_ENTRY()</code> and <code>QK\_ISR\_EXIT()</code>, and also posts an event to the <code>Table</code> active object, which has higher priority than any of the Philosopher active objects.

Figure 8 Triggering the GPIOA interrupt from the IAR EWARM debugger.



Figure 8 shows how to trigger the GPIOA interrupt from the IAR EWARM debugger. From the debugger you need to first open the memory view (see the bottom-right corner of Figure 8). Next, type the address **0xE000EF00** in the Go to field and press Enter. This is the address of the NVIC.STIR register, which stands for the Software Trigger Interrupt Register. This write-only register is useful for software-triggering various interrupts by writing the INTID to it. To trigger the GPIOA interrupt (INTID == 0) you need to write 0 to the byte at address **0xE000EF00** by clicking on this byte, typing one zero, and pressing the Enter key.



**NOTE:** The more convenient way to trigger the interrupt would be to access the NVIC.STIR register from the registers view. However, in IAR EWARM 6.50 the NVIC register option is missing (which is a known issue mentioned in the Errata).
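The same trigger can also be issued from code instead of the debugger. The following is a minimal sketch with hypothetical names (`nvic_sw_trigger()`, `NVIC_STIR_ADDR`, `NVIC_STIR_INTID`), assuming the STIR address given above and the INTID field in bits [8:0]. The register pointer is passed as a parameter so that the function can be exercised against an ordinary variable on a host PC.

```c
#include <stdint.h>

#define NVIC_STIR_ADDR   0xE000EF00UL  /* Software Trigger Interrupt Register */
#define NVIC_STIR_INTID  0x1FFUL       /* INTID field, bits [8:0] */

/* writes the interrupt ID to the register pointed to by 'stir' */
static void nvic_sw_trigger(volatile uint32_t *stir, uint32_t intid) {
    *stir = (uint32_t)(intid & NVIC_STIR_INTID);
}
```

On the target, the call `nvic_sw_trigger((volatile uint32_t *)NVIC_STIR_ADDR, 0U);` would correspond to typing the zero in the memory view. Please note that the STIR register exists only in ARMv7-M devices (Cortex-M3/M4), not in Cortex-M0.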

The general testing strategy is to break into the application at an interesting place for preemption, set breakpoints to verify which path through the code is taken, and trigger the GPIO interrupt. Next, you need to free-run the code (don't use single stepping) so that the NVIC can perform prioritization. You observe the order in which the breakpoints are hit. This procedure will become clearer after a few examples.

#### 4.8.1 Interrupt Nesting Test

The first interesting test is verifying the correct tail-chaining to the PendSV exception after the interrupt nesting occurs, as shown in Figure 7(7). To test this scenario, you place a breakpoint inside the <code>GPIOPortA\_IRQHandler()</code> and also inside the <code>SysTick\_Handler()</code> ISR. When the breakpoint in the SysTick ISR is hit, you remove this breakpoint and place another breakpoint at the very next machine instruction (use the Disassembly window) and also another breakpoint on the first instruction of the <code>PendSV\_Handler()</code> handler. Next you trigger the GPIOA interrupt per the instructions given in the previous section. You hit the Run button.

The pass criteria of this test are as follows:

- 1. The first breakpoint hit is the one inside the <code>GPIOPortA\_IRQHandler()</code> function, which means that GPIO ISR preempted the SysTick ISR.
- 2. The second breakpoint hit is the one in the SysTick\_Handler(), which means that the SysTick ISR continues after the GPIOA ISR completes.
- 3. The last breakpoint hit is the one in PendSV\_Handler() exception handler, which means that the PendSV exception is tail-chained only after all interrupts are processed.

You need to remove all breakpoints before proceeding to the next test.

#### 4.8.2 Task Preemption Test

The next interesting test is verifying that tasks can preempt each other. You set a breakpoint anywhere in the Philosopher state machine code. You run the application until the breakpoint is hit. After this happens, you remove the original breakpoint and place another breakpoint at the very next machine instruction (use the Disassembly window). You also place a breakpoint inside the GPIOPortA\_IRQHandler() interrupt handler and on the first instruction of the PendSV\_Handler() handler. Next you trigger the GPIOA interrupt per the instructions given in the previous section. You hit the Run button.

The pass criteria of this test are as follows:

- 1. The first breakpoint hit is the one inside the <code>GPIOPortA\_IRQHandler()</code> function, which means that GPIO ISR preempted the <code>Philosopher</code> task.
- 2. The second breakpoint hit is the one in <code>PendSV\_Handler()</code> exception handler, which means that the PendSV exception is activated before the control returns to the preempted <code>Philosopher</code> task.
- 3. After hitting the breakpoint in the PendSV\_Handler() handler, you single step into the QK\_sched\_() function. You verify that the scheduler invokes a state handler from the Table state machine. This proves that the Table task preempts the Philosopher task.
- 4. After this you free-run the application and verify that the next breakpoint hit is the one inside the Philosopher state machine. This validates that the preempted task continues executing only after the preempting task (the Table state machine) completes.



#### 4.8.3 Testing the FPU (Cortex-M4F)

In order to test the FPU, the Board Support Package (BSP) for the Cortex-M4F EK-LM4F120XL board (see Figure 1) uses the FPU in the following contexts:

- In the idle loop via the QK\_onIdle() callback (QP priority 0)
- In the task level via the BSP\_random() function called from all five Philo active objects (QP priorities 1-5).
- In the task level via the BSP\_displayPhiloStat() function called from the Table active object (QP priority 6)
- In the ISR level via the SysTick\_Handler() ISR (priority above all tasks)

To test the FPU, you could step through the code in the debugger and verify that the expected FPU-type exception stack frame is used and that the FPU registers are saved and restored by the "lazy stacking feature" when the FPU is actually used.

Next, you can selectively comment out the FPU code at various levels of priority and verify that the QK context switching works as expected with both types of exception stack frames (with and without the FPU).

#### 4.8.4 Other Tests

Other interesting tests that you can perform include changing priority of the GPIOA interrupt to be lower than the priority of SysTick to verify that the PendSV is still activated only after all interrupts complete.

In yet another test you could post an event to Philosopher active object rather than Table active object from the GPIOPortA\_IRQHandler() function to verify that the QK scheduler will not preempt the Philosopher task by itself. Rather the next event will be queued and the Philosopher task will process the queued event only after completing the current event processing.
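The preemption rule exercised by these tests can be modeled in a few lines of C. This is a minimal sketch under stated assumptions: the ready set is represented as a bitmask in which bit (p-1) is set when the task of priority p has an event queued, and the function names `highest_ready()` and `model_sched_prio()` are hypothetical. The real QK\_schedPrio\_() works on the QF internal data structures.

```c
#include <stdint.h>

/* returns the highest priority present in the ready set, or 0 if empty */
static uint8_t highest_ready(uint32_t readySet) {
    uint8_t p = 0U;
    while (readySet != 0U) {   /* position of the most-significant 1 bit */
        ++p;
        readySet >>= 1;
    }
    return p;
}

/* models the rule: preempt only for a priority above the current one */
static uint8_t model_sched_prio(uint32_t readySet, uint8_t currPrio) {
    uint8_t p = highest_ready(readySet);
    return (p > currPrio) ? p : (uint8_t)0;
}
```

With a Philosopher of priority 3 running, an event for Table (priority 6) gives `model_sched_prio(1U << 5, 3U) == 6`, so the preemption takes place. An event for the same Philosopher gives `model_sched_prio(1U << 2, 3U) == 0`, so the event stays queued until the current processing completes.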



# **5 QS Software Tracing Instrumentation**

Quantum Spy (QS) is a software tracing facility built into all QP components and also available to the Application code. QS allows you to gain unprecedented visibility into your application by selectively logging almost all interesting events occurring within state machines, the framework, the kernel, and your application code. QS software tracing is minimally intrusive, offers precise time-stamping, sophisticated runtime filtering of events, and good data compression (please refer to "QSP Reference Manual" section in the "QP/C Reference Manual" and also to Chapter 11 in [PSiCC2]).

This QDK demonstrates how to use QS to generate a real-time trace of a running QP application. Normally, the QS instrumentation is inactive and does not add any overhead to your application, but you can turn the instrumentation on by defining the Q\_SPY macro and recompiling the code.

QS can be configured to send the real-time data out of the serial port of the target device. On the LM3S811 MCU, QS uses the built-in UART to send the trace data out. The EV-LM3S811 board has the UART connected to the virtual COM port provided by the USB debugger (see Figure 1), so the QSPY host application can conveniently receive the trace data on the host PC. The QS platform-dependent implementation is located in the file bsp.c and looks as follows:

#### Listing 12 QSpy implementation to send data out of the UART0 of the LM3S811 MCU.

```
 (1) #ifdef Q_SPY

 (4)     QSTimeCtr QS_tickTime_;
         QSTimeCtr QS_tickPeriod_;

 (5)     enum QSDppRecords {
             QS_PHILO_DISPLAY = QS_USER
         };
     /*.....*/
 (6) uint8_t QS_onStartup(void const *arg) {
 (7)     static uint8_t qsBuf[4*256];            /* buffer for Quantum Spy */
         uint32_t tmp;
 (8)     QS_initBuf(qsBuf, sizeof(qsBuf));

         /* enable the peripherals used by the UART0 */
         SYSCTL->RCGC1 |= (1 << 0);               /* enable clock to UART0 */
         SYSCTL->RCGC2 |= (1 << 0);               /* enable clock to GPIOA */
         __NOP();                            /* wait after enabling clocks */
         __NOP();
         __NOP();

         /* configure UART0 pins for UART operation */
         tmp = (1 << 0) | (1 << 1);
         GPIOA->DIR   &= ~tmp;
         GPIOA->AFSEL |= tmp;
         GPIOA->DR2R  |= tmp;  /* set 2mA drive, DR4R and DR8R are cleared */
         GPIOA->SLR   &= ~tmp;
         GPIOA->ODR   &= ~tmp;
         GPIOA->PUR   &= ~tmp;
         GPIOA->PDR   &= ~tmp;
         GPIOA->DEN   |= tmp;

         /* configure the UART for the desired baud rate, 8-N-1 operation */
         tmp = (((SystemFrequency * 8) / UART_BAUD_RATE) + 1) / 2;
         UART0->IBRD  = tmp / 64;
         UART0->FBRD  = tmp % 64;
         UART0->LCRH  = 0x60;                 /* configure 8-N-1 operation */
         UART0->LCRH |= 0x10;
         UART0->CTL   = (1 << 0) | (1 << 8) | (1 << 9);

         QS_tickPeriod_ = SystemFrequency / BSP_TICKS_PER_SEC;
         QS_tickTime_ = QS_tickPeriod_;  /* to start the timestamp at zero */
         return (uint8_t)1;                              /* return success */
     }
     /*.....*/
 (9) void QS_onCleanup(void) {
     }
     /*.....*/
(10) void QS_onFlush(void) {
         uint16_t fifo = UART_TXFIFO_DEPTH;               /* Tx FIFO depth */
         uint8_t const *block;
         QF_INT_LOCK(dummy);
         while ((block = QS_getBlock(&fifo)) != (uint8_t *)0) {
             QF_INT_UNLOCK(dummy);
                                          /* busy-wait until TX FIFO empty */
             while ((UART0->FR & UART_FR_TXFE) == 0) {
             }
             while (fifo-- != 0) {              /* any bytes in the block? */
                 UART0->DR = *block++;             /* put into the TX FIFO */
             }
             fifo = UART_TXFIFO_DEPTH;        /* re-load the Tx FIFO depth */
             QF_INT_LOCK(dummy);
         }
         QF_INT_UNLOCK(dummy);
     }
     /*.....*/
(11) QSTimeCtr QS_onGetTime(void) {      /* invoked with interrupts locked */
(12)     if ((HWREG(NVIC_ST_CTRL) & NVIC_ST_CTRL_COUNT) == 0) { /* COUNT not set? */
(13)         return QS_tickTime_ - (QSTimeCtr)SysTick->VAL;
         }
         else { /* the rollover occurred, but the SysTick ISR did not run yet */
(14)         return QS_tickTime_ + QS_tickPeriod_ - (QSTimeCtr)SysTick->VAL;
         }
(15) }
     #endif                                                       /* Q_SPY */
```

- (1) The QS instrumentation is enabled only when the macro Q\_SPY is defined.
- (2-3) The QS implementation uses the UART driver provided in the Luminary Micro library.
- (4) These variables are used for time-stamping the QS data records. The QS\_tickTime\_ variable is used to hold the 32-bit-wide SysTick timestamp at tick. The QS\_tickPeriod\_ variable holds the nominal number of hardware clock ticks between two subsequent SysTicks. The SysTick ISR increments QS\_tickTime\_ by QS\_tickPeriod\_.
- (5) This enumeration defines application-specific QS trace record(s), to demonstrate how to use them.
- (6) You need to define the QS\_onStartup() callback to initialize the QS software tracing.
- (7) You should adjust the QS buffer size (in bytes) to your particular application.
- (8) You always need to call QS\_initBuf() from QS\_onStartup() to initialize the trace buffer.
- (9) The QS\_onCleanup() callback performs the cleanup of QS. Here nothing needs to be done.
- (10) The QS\_onFlush() callback flushes the QS trace buffer to the host. Typically, the function busy-waits for the transfer to complete. It is only used in the initialization phase for sending the QS dictionary records to the host (please refer to "QSP Reference Manual" section in the "QP/C Reference Manual" and also to Chapter 11 in [PSiCC2]).
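The baud-rate arithmetic used in Listing 12 computes the divisor in units of 1/64 (that is, 64 * clock / (16 * baud), rounded to the nearest integer) and then splits it into the integer and fractional parts. The following minimal sketch isolates this calculation; the type `UartDivisor`, the function `uart_divisor()`, and the example values of 50 MHz and 115200 baud are assumptions made for illustration only.

```c
#include <stdint.h>

/* hypothetical helper type: integer and fractional baud-rate divisors */
typedef struct {
    uint32_t ibrd;   /* value for the IBRD register */
    uint32_t fbrd;   /* value for the FBRD register (0..63) */
} UartDivisor;

/* same arithmetic as in Listing 12; note that sysClkHz * 8 must fit
 * in 32 bits, which limits sysClkHz to below about 536 MHz */
static UartDivisor uart_divisor(uint32_t sysClkHz, uint32_t baud) {
    uint32_t tmp = (((sysClkHz * 8U) / baud) + 1U) / 2U;
    UartDivisor d;
    d.ibrd = tmp / 64U;
    d.fbrd = tmp % 64U;
    return d;
}
```

For the assumed 50 MHz clock and 115200 baud, the intermediate value is 1736, which gives an integer divisor of 27 and a fractional divisor of 8.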

## 5.1 QS Time Stamp Callback QS\_onGetTime()

The platform-specific QS port must provide the function QS\_onGetTime() (Listing 12(11)), which returns the current time stamp with 32-bit resolution. To provide such a fine-granularity time stamp, the ARM Cortex-M port uses the SysTick facility, which is the same timer already used to generate the system clock-tick interrupt.

NOTE: The QS\_onGetTime() callback is always called with interrupts locked.

Figure 9 shows how the SysTick Current Value Register reading is extended to 32 bits. The SysTick Current Value Register (NVIC\_ST\_CURRENT) counts down from the reload value stored in the SysTick Reload Value Register (NVIC\_ST\_RELOAD). When NVIC\_ST\_CURRENT reaches 0, the hardware automatically reloads the NVIC\_ST\_CURRENT counter from NVIC\_ST\_RELOAD on the subsequent clock tick. Simultaneously, the hardware sets the NVIC\_ST\_CTRL\_COUNT flag, which "remembers" that the reload has occurred.

The system clock tick ISR SysTick\_Handler() keeps updating the "tick count" variable QS\_tickTime\_ by incrementing it each time by QS\_tickPeriod\_. The clock-tick ISR also clears the NVIC\_ST\_CTRL\_COUNT flag.



Figure 9 Using the SysTick Current Value Register to provide 32-bit QS time stamp.



Listing 12(11-15) shows the implementation of the function QS\_onGetTime(), which combines all this information to produce a monotonic time stamp.

- (12) The QS\_onGetTime() function tests the NVIC\_ST\_CTRL\_COUNT flag. This flag being set means that NVIC\_ST\_CURRENT has rolled over to zero, but the SysTick ISR has not run yet (because interrupts are still locked).
- (13) Most of the time the NVIC\_ST\_CTRL\_COUNT flag is not set, and the time stamp is simply the sum QS\_tickTime\_ + (-HWREG(NVIC\_ST\_CURRENT)). Please note that the NVIC\_ST\_CURRENT register is negated to make it an up-counter rather than a down-counter.
- (14) If the NVIC\_ST\_CTRL\_COUNT flag is set, the QS\_tickTime\_ counter has missed one update period and must be additionally incremented by QS\_tickPeriod\_.
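The rollover handling described above can be tried out on a host machine with a small simulation. The following is only a sketch under stated assumptions, not the code of the port: the type `SimTimer` and the functions `sim_init()`, `sim_clock()`, `sim_tick_isr()`, and `sim_get_time()` are hypothetical names, the SysTick registers are modeled as plain struct members, and the "tick time" is assumed to start at one full period so that the first time stamps are small.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side model of the SysTick timer plus the two QS variables.
 * All names here are hypothetical stand-ins, not part of the QP port. */
typedef struct {
    uint32_t reload;      /* role of NVIC_ST_RELOAD                      */
    uint32_t current;     /* role of NVIC_ST_CURRENT (counts down)       */
    int      count_flag;  /* role of NVIC_ST_CTRL_COUNT                  */
    uint32_t tick_time;   /* role of QS_tickTime_                        */
    uint32_t tick_period; /* role of QS_tickPeriod_                      */
} SimTimer;

static void sim_init(SimTimer *t, uint32_t period) {
    assert(period > 1U);
    t->reload      = period - 1U;
    t->current     = t->reload;
    t->count_flag  = 0;
    t->tick_period = period;
    t->tick_time   = period;  /* assumption: start one period ahead */
}

/* Advance the simulated hardware by one clock cycle. */
static void sim_clock(SimTimer *t) {
    if (t->current == 0U) {
        t->current    = t->reload;  /* automatic reload...             */
        t->count_flag = 1;          /* ...remembered in the COUNT flag */
    }
    else {
        --t->current;
    }
}

/* What the clock-tick ISR does for time-stamping purposes. */
static void sim_tick_isr(SimTimer *t) {
    t->tick_time += t->tick_period;
    t->count_flag = 0;
}

/* Time stamp as computed with interrupts locked. */
static uint32_t sim_get_time(SimTimer const *t) {
    if (!t->count_flag) {                        /* no pending rollover */
        return t->tick_time - t->current;
    }
    /* rollover occurred, but the tick ISR did not run yet */
    return t->tick_time + t->tick_period - t->current;
}
```

In this model, reading the time just after a rollover gives the same value whether or not `sim_tick_isr()` has already run, which is the property that keeps the time stamp monotonic.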

## 5.2 QS Trace Output in QF\_onIdle()/QK\_onIdle()

To be minimally intrusive, the actual output of the QS trace data happens when the system has nothing else to do, that is, during the idle processing. The following code snippet shows the code placed either in the QF\_onIdle() callback ("Vanilla" port), or in the QK\_onIdle() callback (QK port):

## Listing 13 QS trace output using the UART0 of the Stellaris LM3S811 MCU

```
    #define UART_TXFIFO_DEPTH 16
    void QK_onIdle(void) {
    #ifdef Q_SPY
(1)     if ((UART0->FR & UART_FR_TXFE) != 0) {                      /* TX done? */
(2)         uint16_t fifo = UART_TXFIFO_DEPTH;       /* max bytes we can accept */
(3)         uint8_t const *block;
(4)         QF_INT_LOCK(ignore);
(5)         block = QS_getBlock(&fifo);    /* try to get next block to transmit */
(6)         QF_INT_UNLOCK(ignore);
(7)         while (fifo-- != 0) {                    /* any bytes in the block? */
(8)             UART0->DR = *block++;                      /* put into the FIFO */
            }
        }
    #elif defined NDEBUG                /* sleep mode interferes with debugging */
```

- (1) The UART\_FR\_TXFE flag is set when the TX FIFO becomes empty. If the flag is set, the TX FIFO can be filled with up to 16 bytes of new data.
- (2) The fifo variable is initialized with the maximum number of bytes the QS\_getBlock() function can deliver (see "QS Programmer's Manual").
- (3) The block variable is the pointer to the contiguous data block returned from the QS\_getBlock() function (see "QS Programmer's Manual").
- (4) Interrupts are locked to call QS\_getBlock().
- (5) The function QS\_getBlock() returns a contiguous data block of up to UART\_TXFIFO\_DEPTH bytes. The function also returns the actual number of bytes available in the fifo variable (passed as a pointer).
- (6) The interrupts are unlocked after the call to QS\_getBlock().
- (7) The while() loop goes over all bytes delivered from QS\_getBlock(). (NOTE: if zero bytes are delivered, the loop does not execute even once.)
- (8) The next byte pointed to by the block pointer is inserted into the TX FIFO and the block pointer is advanced to the next byte.
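The steps (1) through (8) can be exercised off-target with a simulated FIFO. The sketch below is an illustration under assumptions rather than the port's code: `TraceBuf`, `SimFifo`, `trace_get_block()`, and `idle_pass()` are hypothetical names, `trace_get_block()` only imitates the in/out size parameter described for QS\_getBlock() over a simple linear buffer, and the interrupt locking of the real port is reduced to comments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TX_FIFO_DEPTH 16U  /* same depth as UART_TXFIFO_DEPTH in Listing 13 */

typedef struct {            /* hypothetical linear trace buffer */
    uint8_t const *data;
    size_t len;             /* total bytes stored        */
    size_t tail;            /* bytes already handed out  */
} TraceBuf;

typedef struct {            /* hypothetical model of the UART TX FIFO */
    uint8_t  bytes[TX_FIFO_DEPTH];
    uint16_t used;          /* 0 means "TX FIFO empty" */
} SimFifo;

/* Imitates the in/out parameter of QS_getBlock(): on entry *pn is the
 * maximum wanted, on exit the number of bytes actually delivered. */
static uint8_t const *trace_get_block(TraceBuf *tb, uint16_t *pn) {
    size_t avail;
    uint8_t const *block;
    assert(tb->tail <= tb->len);
    avail = tb->len - tb->tail;
    if (avail == 0U) {
        *pn = 0U;
        return NULL;
    }
    if (avail < *pn) {
        *pn = (uint16_t)avail;
    }
    block = &tb->data[tb->tail];
    tb->tail += *pn;
    return block;
}

/* One pass of the idle callback; returns the number of bytes queued. */
static uint16_t idle_pass(TraceBuf *tb, SimFifo *fifo) {
    uint16_t n = TX_FIFO_DEPTH;        /* max bytes the FIFO can accept */
    uint16_t sent;
    uint8_t const *block;
    if (fifo->used != 0U) {            /* TX FIFO not empty yet? */
        return 0U;
    }
    /* the real port locks interrupts here */
    block = trace_get_block(tb, &n);
    /* ...and unlocks them here */
    sent = n;
    while (n-- != 0U) {                /* skipped when nothing delivered */
        fifo->bytes[fifo->used++] = *block++;
    }
    return sent;
}
```

Because each pass moves at most one FIFO-full of data and returns immediately when the FIFO is still busy, the idle callback never blocks, in contrast to the busy-waiting flush used during initialization.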

## 5.3 Invoking the QSpy Host Application

The QSPY host application receives the QS trace data, parses it, and displays it on the host workstation (currently Windows or Linux). For the configuration options chosen in this port, you invoke the QSPY host application as follows (please refer to the "QSP Reference Manual" section in the "QP/C Reference Manual" and also to Chapter 11 in [PSiCC2]):

qspy -cCOM5

The specific COM port obviously depends on how the Tiva-C virtual COM port enumerates on your machine. You might want to open the COM ports in the Device Manager to find out the COM port number.

# **Related Documents and References**

| Document | Location |
|----------|----------|
| [PSiCC2] "Practical UML Statecharts in C/C++, Second Edition", Miro Samek, Newnes, 2008 | Available from most online book retailers, such as amazon.com. See also: http://www.statemachine.com/psicc2.htm |
| [Samek+ 06b] "Build a Super Simple Tasker", Miro Samek and Robert Ward, Embedded Systems Design, July 2006 | http://www.embedded.com/showArticle.jhtml?articleID=190302110 |
| [ARMv7-M] "ARM v7-M Architecture Application Level Reference Manual", ARM Limited | Available from http://infocenter.arm.com/help/ |
| [Cortex-M3] "Cortex™-M3 Technical Reference Manual", ARM Limited | Available from http://infocenter.arm.com/help/ |
| [ARM AN298] ARM Application Note 298 "Cortex-M4(F) Lazy Stacking and Context Switching", ARM 2012 | Available from http://infocenter.arm.com/help/topic/com.arm.doc.dai0298a/DAI0298A_cortex_m4f_lazy_stacking_and_context_switching.pdf |
| [Luminary 12] "LM3S811 Microcontroller Data Sheet", Texas Instruments, 2012 | Texas Instruments literature number SPMS150I |
| [Tiva-C 13] "Tiva™ TM4C123GH6PM Microcontroller (identical to LM4F230H5QR)", Texas Instruments, 2013 | Texas Instruments literature number SPMS376B |
| [IAR 13a] "ARM® IAR C/C++ Compiler Reference Guide v6.60", IAR 2013 | Available in PDF as part of the ARM KickStart™ kit in the file EWARM_CompilerReference.pdf. |
| [IAR 13b] "IAR Linker and Library Tools Reference Guide v6.60", IAR 2013 | Available in PDF as part of the ARM KickStart™ kit |
| [IAR 13c] "ARM® Embedded Workbench® IDE User Guide v6.60", IAR 2013 | Available in PDF as part of the ARM KickStart™ kit in the file EWARM_UserGuide.pdf. |

# 7 Contact Information

Quantum Leaps, LLC 103 Cobble Ridge Drive Chapel Hill, NC 27516 USA

+1 866 450 LEAP (toll free, USA only) +1 919 869-2998 (FAX)

e-mail: info@quantum-leaps.com
WEB: http://www.quantum-leaps.com
http://www.state-machine.com



"Practical UML Statecharts in C/C++, Second Edition: Event Driven Programming for Embedded Systems", by Miro Samek, Newnes, 2008

#### **Legal Disclaimers**

Information in this document is believed to be accurate and reliable. However, Quantum Leaps does not give any representations or warranties, expressed or implied, as to the accuracy or completeness of such information and shall have no liability for the consequences of use of such information.

Quantum Leaps reserves the right to make changes to information published in this document, including without limitation specifications and product descriptions, at any time and without notice. This document supersedes and replaces all information supplied prior to the publication hereof.

All designated trademarks are the property of their respective owners.

