# General modules and function definitions used in the lecture
import numpy as np
import matplotlib.pyplot as plt

plt.rcParams["figure.figsize"] = [8.00, 4.5]
plt.rcParams["figure.autolayout"] = True
plt.rcParams["axes.grid"] = True
plt.rcParams["axes.xmargin"] = 0.0

%matplotlib ipympl

<img src="figures/NTNU_logo_vertical.svg" align="left" style="width: 30%">

<br clear="all" />
<br></br>

# Intro to CMSIS-DSP

* **Course AIS2201 - Signal Processing**
* **Week 39, 2024**
* **Lecturer: Kai Erik Hoff**

# Topics

* What is `arm_math.h`?
* C arrays and pointers
* Example of a typical `arm_math.h` function
* How to import into an STM32 project

# What is CMSIS-DSP?

* **CMSIS-DSP** is an open-source software library that implements common compute processing functions optimized for use on Arm Cortex-M and Cortex-A processors. [\[1\]](https://arm-software.github.io/CMSIS_6/latest/DSP/index.html)


* Part of [ARM](https://en.wikipedia.org/wiki/ARM_Architecture_(company))'s [Common Microcontroller Software Interface Standard (CMSIS)](https://arm-software.github.io/CMSIS_6/latest/General/index.html)

* The functions process signals as arrays, handling many samples per call
    * Similar in concept to e.g. `numpy`

# Arrays in `C`


* Used to store a sequence of values of a certain type:

>```C
>int my_array[] = {1, 2, 3, 4, 5}; // Declares an array of length 5 with integer values 1-5
>```

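* On 32-bit Arm Cortex-M targets such as the STM32, datatype `int` is 4 bytes, so the above declaration of `my_array` will allocate $5\cdot 4 = 20$ bytes for storage of array values. (The size of `int` is platform-dependent; on some 8- and 16-bit microcontrollers it is 2 bytes.)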

* Array elements are accessed using indexing:

>```C
> int value_at_2 = my_array[2];
>```

* The compiler does not prevent indexing out of bounds. The following typically compiles and runs without any error message, even though its behavior is undefined:

>```C
> int mystery_value = my_array[5]; // Undefined behavior: typically reads whatever is stored "next to" the array in system memory and interprets it as an 'int'
>```
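Since the language itself does not check indices, the check has to be written by hand wherever an index comes from outside the program's control. A minimal sketch, assuming a hypothetical helper named `read_or_default` (not part of any library), could look like this:

```c
#include <stddef.h>

/* Hypothetical helper (illustration only): returns arr[index] if the index
   is inside the array, otherwise returns the supplied fallback value. */
int read_or_default(const int arr[], size_t length, size_t index, int fallback)
{
    if (index >= length) {
        return fallback;   /* out of bounds: never touch the array */
    }
    return arr[index];
}
```

The caller must still supply the correct `length`; the helper only guards against indices that exceed it.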

# Pointers in `C`

* A pointer is a reference to the memory location where a value is stored.<br>**Quick overview:**


>```C
> int my_value = 10;  // Declares integer variable
> int *value_ptr = &my_value;  // Creates pointer variable which contains the address of 'my_value' (e.g. 0xd2e6eea4)
> int new_value = *value_ptr;  // Extracts value stored in address referenced by 'value_ptr' (i.e. 10)
> /* The variable 'new_value' is now equal to 10 */ 
>```

* Pointers allow functions to edit variables which are fed as input.

>```C
> #include <stdio.h>
> 
> /* Example function description */
> void increment(int* x){
>     *x = *x + 1;  // Modifies the variable that 'x' points to
> }
> 
> /* Example function call */
> int main(void) {
>     int my_value = 10;
>     increment(&my_value);
>     printf("%d", my_value); // Prints '11'
>     return 0;
> }
>```
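The same mechanism lets a function modify more than one variable at a time. As a small sketch (the name `swap_ints` is made up for this illustration), two values can be exchanged through their addresses:

```c
/* Hypothetical example: exchanges the values stored at addresses a and b. */
void swap_ints(int *a, int *b)
{
    int tmp = *a;  /* read the value a points to */
    *a = *b;       /* overwrite it with the value b points to */
    *b = tmp;      /* store the saved value where b points */
}
```

A call such as `swap_ints(&x, &y);` leaves `x` holding the old value of `y` and vice versa.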

# Arrays are pointers!

* In most expressions, the name of an array declared using `[]` is converted to ("decays to") a pointer to the location of the first element in the array.
    * Indexing becomes equivalent to incrementing the pointer address:

>```C
> int my_array[] = {1, 2, 3, 4, 5}; // Declares an array of length 5 with integer values 1-5
> int a = my_array[3]; // Assigns value at index 3 to variable 'a'
> int b = *(my_array + 3); // Assigns integer stored 3 elements past the start of 'my_array' to variable 'b'
> printf("%d %d", a, b); // Prints '4 4'
>```

* Functions processing arrays use pointers.
    * To avoid out-of-bounds errors, information about array size **must** be passed as well.

>```C
> #include <stdio.h>
> 
> void square(const int* x, int* y, int size){
>     for (int i=0; i<size; i++){
>         y[i] = x[i]*x[i];
>     }
> }
> 
> int main(void) {
>     int my_array[] = {1, 2, 3, 4, 5}; // Declares an array of length 5 with integer values 1-5
>     int out_array[5] = {0}; // Allocate memory for output array, initialized to zero
>     square(my_array, out_array, 5); // Compute elementwise squared
>     printf("%d^2 = %d", my_array[2], out_array[2]); // Prints 3^2 = 9
>     return 0;
> }
>```
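Hard-coding the size at every call is easy to get wrong. One common idiom, sketched here with a hypothetical macro `ARRAY_LEN` and a made-up function `sum_ints`, computes the length from the array itself at the call site. This only works where the true array is visible; inside a function the parameter is just a pointer and `sizeof` would give the pointer size instead.

```c
#include <stddef.h>

/* Number of elements in a true array. NOT valid for a pointer parameter. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical example function: sums 'size' integers starting at x. */
long sum_ints(const int *x, size_t size)
{
    long total = 0;
    for (size_t i = 0; i < size; i++) {
        total += x[i];
    }
    return total;
}
```

With `int my_array[] = {1, 2, 3, 4, 5};`, the call `sum_ints(my_array, ARRAY_LEN(my_array))` passes the length 5 without it being typed by hand.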

* Arrays can be cast to different types (*Not generally advisable, but common practice when using DMA.*)

>```C
> uint32_t my_array[] = {0x1111aaaa}; // Declares an array of length 1 holding a single 32-bit value
> uint16_t* array_copy = (uint16_t*) my_array; // Views the same memory as a sequence of 16-bit values
> printf("%x %x", array_copy[1], array_copy[0]); // Prints '1111 aaaa' on a little-endian system such as STM32
>```
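The order in which the two halves appear through the cast pointer depends on the byte order of the processor. Where the goal is only to split a 32-bit word, a sketch that does not depend on byte order can use shifts and masks instead (the function names below are made up for this illustration):

```c
#include <stdint.h>

/* Upper 16 bits of a 32-bit word, independent of byte order. */
uint16_t high_half(uint32_t word)
{
    return (uint16_t)(word >> 16);
}

/* Lower 16 bits of a 32-bit word, independent of byte order. */
uint16_t low_half(uint32_t word)
{
    return (uint16_t)(word & 0xFFFFu);
}
```

For the value `0x1111aaaa`, `high_half` gives `0x1111` and `low_half` gives `0xaaaa` on any platform.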

# Using the CMSIS-DSP Library

* The full library is part of the firmware package installed with CubeMX
    * Location: `<install_directory>\STM32Cube\Repository\STM32Cube_FW_Fx_Vx.xx.x\Drivers\CMSIS\DSP`

* Both input and output arrays are passed as arguments along with number of data points.
    * Allocate space for input *and* output in separate arrays, then pass both input and output to function.
    * Example using [arm_abs_f32](https://arm-software.github.io/CMSIS_5/DSP/html/group__BasicAbs.html#ga0e8fc7df3033cdbe9cda8a766a46e6d9):

>```C
>float32_t input[2] = {-5.0f, 5.0f};
>float32_t output[2] = {0}; // Allocate memory for output array
>arm_abs_f32(input, output, 2); // 'output' now contains {5.0, 5.0}
>```
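To make the calling convention concrete without needing the library, here is a plain C sketch with the same argument pattern (source pointer, destination pointer, number of samples). The name `abs_f32_sketch` is hypothetical, and this is not the library's implementation, which is optimized for the target processor.

```c
#include <stdint.h>

/* Hypothetical stand-in with the same argument pattern as the example above:
   writes the absolute value of each of 'block_size' samples in src to dst. */
void abs_f32_sketch(const float *src, float *dst, uint32_t block_size)
{
    for (uint32_t i = 0; i < block_size; i++) {
        dst[i] = (src[i] < 0.0f) ? -src[i] : src[i];
    }
}
```

The function returns nothing; the result is delivered entirely through the `dst` array supplied by the caller.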



* Each mathematical function has a number of variants depending on data type
    * Data type specified at end of function name
    * Floating point data types used: `f16`, `f32` & `f64`
    * Fixed point data types used: `q7`, `q15` & `q31`
        * Functionally integers, but interpreted as having "normalized range" $-1 \leq x < 1$

* Include instructions found [here](https://community.st.com/t5/stm32-mcus/how-to-integrate-cmsis-dsp-libraries-on-a-stm32-project/ta-p/666790)
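The "normalized range" interpretation of fixed-point values mentioned above can be illustrated for a 16-bit value (named `q15` in CMSIS-DSP, since 15 bits hold the fraction). The sketch below uses hypothetical helper names and plain C; it assumes the scaling factor $2^{15} = 32768$ and saturates values outside the representable range.

```c
#include <stdint.h>

/* Hypothetical helper: converts a float to a 16-bit fixed-point value that
   represents the range -1 <= x < 1. Out-of-range inputs are saturated. */
int16_t float_to_q15_sketch(float x)
{
    float scaled = x * 32768.0f;
    if (scaled >= 32767.0f) {
        return INT16_MAX;
    }
    if (scaled <= -32768.0f) {
        return INT16_MIN;
    }
    return (int16_t)scaled;  /* truncates toward zero */
}

/* Hypothetical helper: interprets a 16-bit fixed-point value as a float. */
float q15_to_float_sketch(int16_t q)
{
    return (float)q / 32768.0f;
}
```

The stored integer 16384 therefore stands for 0.5, and the most negative integer, -32768, stands for exactly -1. The value +1 itself cannot be represented, which is why the range is open at the upper end.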

# Double Buffering

* Overall program structure for handling sample stream in "batches"



<img src="figures/DualBufferAnim.gif" width="80%" style="margin-left:100px" />

* Total sample buffer size must be $2\times$ FFT window size

* Alternating halves of the `ADC_buffer` are copied to FFT input.
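The copy step can be sketched in plain C as follows. The window size of 8, the 16-bit sample type, and the helper name `copy_half_to_fft_input` are assumptions chosen for illustration. In a real project the function would typically be triggered when a circular DMA transfer signals that one half of the buffer is full, which is not shown here.

```c
#include <stddef.h>
#include <stdint.h>

#define WINDOW_SIZE 8                  /* assumed FFT window size (sketch only) */
#define BUFFER_SIZE (2 * WINDOW_SIZE)  /* total buffer is twice the window size */

/* Hypothetical helper: copies one half of the sample buffer to the FFT input.
   half = 0 selects the first half, any other value selects the second half. */
void copy_half_to_fft_input(const uint16_t *adc_buffer, float *fft_input, int half)
{
    const uint16_t *src = adc_buffer + (half ? WINDOW_SIZE : 0);
    for (size_t i = 0; i < WINDOW_SIZE; i++) {
        fft_input[i] = (float)src[i];  /* convert raw sample to float */
    }
}
```

While one half is being copied and processed, new samples are written to the other half, so no samples are lost as long as processing finishes before that half fills up.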

# FFT on an STM32





* Makes use of "basic" FFT algorithms such as *radix-2*.
    * Divide and conquer algorithm expects window size $N=2^k, k \in \mathbb{Z}^+$
    * Due to library limitations (precomputed tables exist only for certain sizes), FFT will only work for window sizes $N \in \{16, 32, 64, 128, 256, 512, 1024, 2048, 4096\}$

* FFT functions found in folder `TransformFunctions`
    * Requires inclusion of `"arm_const_structs.h"` in addition to `"arm_math.h"`

* An FFT-algorithm `struct` instance must be initialized before calculating FFT
    * Accomplished with separate `init`-function
    * Window size is determined here!
    * Links the struct to the precomputed twiddle-factor and bit-reversal tables for the chosen window size

* Choose between `cfft` and `rfft`.
    * `cfft` produces two-sided DFT. <br>Expects complex input.
    * `rfft` produces one-sided DFT. <br>Expects real-valued input.

* Separate functions for different data types:
    * Floating-point: `f16`, `f32`, `f64`
    * Fixed-point: `q15`, `q31`
    * Example: `arm_cfft_f32` for calculating two-sided DFT with 32-bit floating point values.
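Two practical details follow from the points above: the window size should be checked before initialization, and the complex input that `cfft` expects is stored interleaved as `{re0, im0, re1, im1, ...}`, so a buffer of $N$ complex samples holds $2N$ values. The sketch below shows both steps in plain C; the helper names are hypothetical and not part of CMSIS-DSP.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if n is a power of two in the range 16..4096,
   i.e. one of the window sizes listed above. */
bool is_supported_fft_size(uint32_t n)
{
    return n >= 16u && n <= 4096u && (n & (n - 1u)) == 0u;
}

/* Hypothetical helper: packs n real samples into an interleaved complex
   buffer of length 2*n, with all imaginary parts set to zero. */
void pack_real_as_complex(const float *real_in, float *complex_out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        complex_out[2 * i]     = real_in[i];  /* real part */
        complex_out[2 * i + 1] = 0.0f;        /* imaginary part */
    }
}
```

Packing real samples this way is only needed when a two-sided DFT is wanted from real-valued data; `rfft` takes the real-valued array directly.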

## Example:

```C
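/* Parameters */
uint16_t fftSize = 1024;
uint8_t ifftFlag = 0;      // 0 = forward FFT, 1 = inverse FFT
uint8_t doBitReverse = 1;  // 1 = apply bit reversal so the output is in normal order

/* Create struct for computing CFFT */
arm_cfft_instance_f32 varInstCfftF32;

/* Initialize FFT struct; returns ARM_MATH_SUCCESS if 'fftSize' is supported */
arm_status status = arm_cfft_init_f32(&varInstCfftF32, fftSize);

/* Process the data through the CFFT/CIFFT module.
 'testInput_f32_10khz' (declared elsewhere) is a float32_t array of length
 2*fftSize holding interleaved complex samples {re, im, re, im, ...}.
 Take note: the data in 'testInput_f32_10khz' is modified by the function
 'arm_cfft_f32' and will contain the FFT output once the function has
 executed. */
arm_cfft_f32(&varInstCfftF32, testInput_f32_10khz, ifftFlag, doBitReverse);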
```