# 10 Automated Batch Processing

This notebook chains all previous steps into a single automated run. It takes a selection of `.tflite` models, deploys each one to an MCU target, runs benchmark inference, and collects and saves the evaluation results.

With this notebook you can evaluate the impact of optimizations such as quantization, CMSIS-NN, compiler optimization flags, the FPU, and more.

In [1]:
%run '00_README.ipynb'
%run 'H01_Models.ipynb'
%run 'H02_TFL-Conversion.ipynb'
%run 'H03_TFLu.ipynb'
%run 'H04_MCU-Verification.ipynb'
%run 'H05_RocketLogger.ipynb'
%run 'H06_Energy-Parser.ipynb'
%run 'H10_Batch-Processing.ipynb'

Imported helper functions from 00_README.ipynb
Imported all modules.
	Tensorflow Version:  2.2.0
	Numpy Version:  1.19.0
	Pandas Version:  1.0.5
Imported helper functions from H01_Models.ipynb
Imported helper functions from H02_TFL-Conversion.ipynb
Imported helper functions from H03_TFLu.ipynb
Imported helper functions from H04_MCU-Verification.ipynb


In [4]:
display(model_selection)

Dropdown(description='Select model:', options=('keras-model/01aA_LeNet-MNIST.h5', 'keras-model/01aW_LeNet-MNIS…

In [5]:
# load model
tf_model_file = model_selection.value
tf_model = tf.keras.models.load_model(tf_model_file)

# set model name
model_name = get_tf_model_string(tf_model_file)

In [6]:
data_selection = widgets.Dropdown(
    options=sorted(glob.glob("keras-model/*.py")),
    description='Select model:',
    layout=Layout(width='100%')
)
display(data_selection)

Dropdown(description='Select model:', layout=Layout(width='100%'), options=('keras-model/01a_LeNet-MNIST_data.…

In [7]:
tf_model_data = data_selection.value
%run -i {tf_model_data}

The input length is 576 and the output length 10


### Which converted model would you like to deploy?

## Which quantizations?

In [8]:
# TODO: turn this into checkboxes
display(tfl_model_selections)

SelectMultiple(description='Select model:', layout=Layout(height='200px', width='100%'), options=('TFLite-mode…

In [9]:
tfl_model_selections.value

('TFLite-model/01f_Depthwise-Conv_f-100_K-1_Q-full.tflite',
 'TFLite-model/01f_Depthwise-Conv_f-100_K-3_Q-full.tflite',
 'TFLite-model/01f_Depthwise-Conv_f-100_K-5_Q-full.tflite',
 'TFLite-model/01f_Depthwise-Conv_f-101_K-1_Q-full.tflite',
 'TFLite-model/01f_Depthwise-Conv_f-101_K-3_Q-full.tflite',
 'TFLite-model/01f_Depthwise-Conv_f-101_K-5_Q-full.tflite',
 'TFLite-model/01f_Depthwise-Conv_f-102_K-1_Q-full.tflite',
 'TFLite-model/01f_Depthwise-Conv_f-102_K-3_Q-full.tflite',
 'TFLite-model/01f_Depthwise-Conv_f-102_K-5_Q-full.tflite',
 'TFLite-model/01f_Depthwise-Conv_f-103_K-1_Q-full.tflite',
 'TFLite-model/01f_Depthwise-Conv_f-103_K-3_Q-full.tflite',
 'TFLite-model/01f_Depthwise-Conv_f-103_K-5_Q-full.tflite',
 'TFLite-model/01f_Depthwise-Conv_f-104_K-1_Q-full.tflite',
 'TFLite-model/01f_Depthwise-Conv_f-104_K-3_Q-full.tflite',
 'TFLite-model/01f_Depthwise-Conv_f-104_K-5_Q-full.tflite',
 'TFLite-model/01f_Depthwise-Conv_f-105_K-1_Q-full.tflite',
 'TFLite-model/01f_Depthwise-Conv_f-105_

### Select a base (default) model which will be used as a reference
This is usually the non-optimized float32 `.tflite` model, i.e. the one converted without quantization (`Q-none`).

In [10]:
basic_model_file = f'./TFLite-model/{model_name}_Q-none.tflite'
basic_model_size = os.path.getsize(basic_model_file) / 1024
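
The baseline size in KiB is mainly useful when compared against each selected model. A minimal sketch of that comparison, assuming only that the model files exist on disk; `size_kib` and `relative_size` are hypothetical helpers and not part of the toolchain:

```python
import os


def size_kib(path):
    """Return the size of a file in KiB (bytes / 1024)."""
    return os.path.getsize(path) / 1024


def relative_size(model_file, baseline_file):
    """Return the size of model_file as a fraction of baseline_file."""
    return size_kib(model_file) / size_kib(baseline_file)
```

Calling `relative_size(tfl_model_file, basic_model_file)` would then give a value below 1 for any model that is smaller than the reference.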

## Target device and serial

## Complete verification or how much data?

## Which optimizations?
You can select multiple entries by pressing `CTRL` and clicking.

In [11]:
display(cmsis_selections, fpu_selections, compiler_selections)

SelectMultiple(description='Select cmsis modes:', options=(('none', './TFLu_benchmark-model_mbed'), ('cmsis-nn…

SelectMultiple(description='Select FPU modes:', options=(('FPU disabled', 0), ('FPU enabled', 1)), value=())

SelectMultiple(description='Select compiler options:', options=('-Ofast', '-Os'), value=())

In [None]:
# update mbed according to

## Non-loopable parameters

In [12]:
target_mcu = ''

display(target_selections,
        baudrate_slider,
        cycles_selection,
        layers_selection,
        reporting_selection,
        input_selection,
        energy_selection)

SelectMultiple(description='Select targets:', options=('auto', 'NUCLEO_L496ZG', 'NUCLEO_F767ZI', '...'), value…

FloatLogSlider(value=1000000.0, description='Baudrate', max=6.0, min=4.0, step=1.0)

Checkbox(value=False, description='Benchmark in cycles (instead of us)', indent=False)

Checkbox(value=False, description='Benchmark with layer granularity (instead of a whole inference)', indent=Fa…

Checkbox(value=True, description='Report the results of the inference via UART', indent=False)

Checkbox(value=True, description='Enabling custom input via UART (required for automated verification)', inden…

Checkbox(value=False, description='Enable custom settings for an energy measurement', indent=False)

In [13]:
# init
# write constants which stay across the optimizations
# like example pic
# download mbed

#for x
#    patch_arena_size(mbed_dirs[mbed_dir], arena_size_kb)

    
# we don't write the constants here, as they include the filename of the model


In [14]:
# update all mbeds

In [15]:
# prepare dataframe / table for saving everything
# create folder with all the individual savings

In [16]:
arguments = set_compilation_macros(INPUT_LENGTH, OUTPUT_LENGTH,
                                   baudrate=int(baudrate_slider.value),
                                   cycles=cycles_selection.value,
                                   layers=layers_selection.value,
                                   reporting=reporting_selection.value,
                                   manual_input=input_selection.value,
                                   energy=energy_selection.value)
print(arguments)

-D INPUT_LENGTH=576 -D OUTPUT_LENGTH=10 -D BAUDRATE=1000000 -D BENCHMARK_LAYERS -D ENERGY_MEASUREMENT 
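
To illustrate how such a macro string can be assembled, here is a minimal sketch. It is not the implementation of `set_compilation_macros`; `build_macro_string` is a hypothetical helper, and only the macro names visible in the output above are used:

```python
def build_macro_string(values, flags):
    """Build a '-D NAME=VALUE ... -D FLAG ...' string for the compiler.

    values: dict mapping macro name -> value
    flags:  dict mapping macro name -> bool; only True entries are emitted
    """
    parts = [f"-D {name}={value}" for name, value in values.items()]
    parts += [f"-D {name}" for name, enabled in flags.items() if enabled]
    return " ".join(parts)


macros = build_macro_string(
    {"INPUT_LENGTH": 576, "OUTPUT_LENGTH": 10, "BAUDRATE": 1000000},
    {"BENCHMARK_LAYERS": True, "ENERGY_MEASUREMENT": True},
)
print(macros)
```

Value macros and boolean flags are kept in separate dictionaries so that a disabled checkbox simply produces no `-D` entry.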


## Building and Flashing

In [17]:
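# `.value` holds the selected project directories, so match on the option labels instead
if any('cmsis-nn' in label for label in cmsis_selections.label):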
    update_mbed_project()
    mbed_dir = './TFLu_benchmark-model_mbed_cmsis-nn'
else:
    update_mbed_project(cmsis=False)
    mbed_dir = './TFLu_benchmark-model_mbed'

Updating repository ...
Saved working directory and index state WIP on master: a54197a layer gpio now toggles
Already up to date.
Done.


In [19]:
if 'LeNet' in model_name:
    patch_arena_size(cmsis_selections.value[0], 60)
elif 'ResNet' in model_name:
    patch_arena_size(cmsis_selections.value[0], 150)
else:
    patch_arena_size(cmsis_selections.value[0], 220)
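
The `if`/`elif` chain above can also be written as a lookup table, which is easier to extend when more model families are added. A minimal sketch using the sizes from the cell above; `arena_size_for` is a hypothetical helper:

```python
# Arena sizes in kB per model family, taken from the cell above.
ARENA_SIZE_KB = {"LeNet": 60, "ResNet": 150}
DEFAULT_ARENA_SIZE_KB = 220


def arena_size_for(model_name):
    """Return the arena size in kB for the first family found in model_name."""
    for family, size_kb in ARENA_SIZE_KB.items():
        if family in model_name:
            return size_kb
    return DEFAULT_ARENA_SIZE_KB
```

With this helper the cell reduces to `patch_arena_size(cmsis_selections.value[0], arena_size_for(model_name))`. Note that only the first selected Mbed project is patched this way; patching every selected project would require a loop over `cmsis_selections.value`.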


In [20]:
# print a final summary of all the settings and the upcoming benchmarks

In [21]:
fpus = fpu_selections.value
compiler_flags = compiler_selections.value
tfl_model_files_selection = tfl_model_selections.value
mbed_dirs = cmsis_selections.value

target_devices = target_selections.value
baudrate = int(baudrate_slider.value)

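# the benchmark loop also iterates over the target devices, so include them in the count
total_combinations = (len(target_devices) * len(fpus) * len(compiler_flags)
                      * len(tfl_model_files_selection) * len(mbed_dirs))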
print(f"We have a total of {total_combinations} combinations which will be benchmarked.")

We have a total of 126 combinations which will be benchmarked.
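
The runs that will be executed are the Cartesian product of all selections, including the target devices. A minimal sketch of enumerating them with `itertools.product`; the selections below are placeholders (the model file names are hypothetical), whereas in the notebook they come from the widgets:

```python
from itertools import product

# Placeholder selections; in the notebook these come from the widgets.
target_devices = ("NUCLEO_L496ZG",)
compiler_flags = ("-Ofast", "-Os")
fpus = (0, 1)
mbed_dirs = ("./TFLu_benchmark-model_mbed",)
tfl_model_files_selection = ("model_a.tflite", "model_b.tflite")  # hypothetical names

# Same nesting order as the benchmark loop: the last argument varies fastest.
combinations = list(product(target_devices, compiler_flags, fpus,
                            mbed_dirs, tfl_model_files_selection))
print(f"{len(combinations)} runs planned")
for target, flag, fpu, mbed_dir, model_file in combinations[:3]:
    print(target, flag, fpu, mbed_dir, model_file)
```

Listing the combinations up front makes it possible to review or filter the plan before any hardware is flashed.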


In [22]:
!mbed detect

[mbed] Working path "/Users/nope/ownCloud/projects/masterthesis/jupyter-workspace/ML-on-MCU_toolchain" (program)
       Limited information will be shown about connected targets/boards
---
[mbed] Detected "NUCLEO_L496ZG" connected to "/Volumes/NODE_L496ZG" and using com port "/dev/tty.usbmodem14103"


In [23]:
serial_port = '/dev/tty.usbmodem14103'
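
Instead of copying the port by hand, it could be extracted from the `mbed detect` output. A minimal sketch, assuming the line format shown above; other Mbed CLI versions may print a different format, and `parse_mbed_detect` is a hypothetical helper:

```python
import re

# Pattern modelled on the `mbed detect` line shown above; other Mbed CLI
# versions may format this line differently.
DETECT_PATTERN = re.compile(
    r'Detected "(?P<target>[^"]+)" connected to "(?P<mount>[^"]+)" '
    r'and using com port "(?P<port>[^"]+)"'
)


def parse_mbed_detect(output):
    """Return a list of (target, mount point, serial port) tuples."""
    return [(m["target"], m["mount"], m["port"])
            for m in DETECT_PATTERN.finditer(output)]


sample = ('[mbed] Detected "NUCLEO_L496ZG" connected to "/Volumes/NODE_L496ZG" '
          'and using com port "/dev/tty.usbmodem14103"')
print(parse_mbed_detect(sample))
```

In the notebook, the text to parse could be captured with `output = !mbed detect` and joined with newlines before calling the helper.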

In [24]:
# set up the RocketLogger
RL = RocketLogger('192.168.2.2', '~/.ssh/eth/id_rsa')

RocketLogger v1.1.5

RocketLogger Configuration:
  Sampling rate:    64kSps
  Data aggregation: downsampleaverage
  Update rate:      1Hz
  Webserver:        enabled
  Digital inputs:   enabled
  File format:      binary
  File name:        /home/rocketlogger/data/init-filename_2021-04-01.rld
  Channels:         0,2,
  Sample limit:     no limit



In [None]:
%%time

print("Starting ...\n++++")

# requires recompilation of the project
for target_device in target_devices:
    for compiler_flag in compiler_flags:
        print("Compiler flag is", compiler_flag)

        # requires recompilation of the project
        for fpu_status in fpus:
            print("FPU status:", fpu_status)

            for mbed_dir in mbed_dirs: # cmsis-nn and none
                print("\tUsing the following mbed:", mbed_dir)

                # loop over the model files innermost, as switching the model does not require recompiling the whole project
                for tfl_model_file in tfl_model_files_selection:

                    try:
                        table_layers_summary, table_energy_summary = run_benchmark(RL, target_device, serial_port,
                                                                                   baudrate, tfl_model_file, mbed_dir, 
                                                                                   compiler_flag, fpu_status,
                                                                                   no_samples=10)

                        table_firmware = table_firmware.append(table_layers_summary, ignore_index=True)
                        table_energy = table_energy.append(table_energy_summary, ignore_index=True)
                    except Exception as err:
                        print("Error encountered; skipping. Error:", err)
                    #print(model_information)
                    print("-------------\n\nNext ...")

print("Fin.")

table_firmware.to_pickle(f"results/F-combined_no-cmsis_{target_device[0]}_{model_name}_layer_results_{date.today()}.pkl")
table_firmware.to_excel(f"results/F-combined_no-cmsis_{target_device[0]}_{model_name}_layer_results_{date.today()}.xlsx")


table_energy.to_pickle(f"results/F-combined_no-cmsis_{target_device[0]}_{model_name}_energy_results_{date.today()}.pkl")
table_energy.to_excel(f"results/F-combined_no-cmsis_{target_device[0]}_{model_name}_energy_results_{date.today()}.xlsx")


Starting ...
++++
Compiler flag is -Ofast
FPU status: 1
	Using the following mbed: ./TFLu_benchmark-model_mbed
Benchmarking: 01f_Depthwise-Conv_f-100_K-1_Q-full
Writing 'TFLite-model/01f_Depthwise-Conv_f-100_K-1_Q-full.tflite' to './TFLu_benchmark-model_mbed'
Writing the model was successful.
Writing image no. 7270 to './TFLu_benchmark-model_mbed'
Writing was successful.
	Building & flashing ...
	Finished building & flashing.
	Getting size of the binary blob ...
	First emasurmeent which will be discarded - warm up.
'rocketlogger set -r 64k -ch 0,2 -c -f /home/rocketlogger/data/NUCLEO_L496ZG_01f_Depthwise-Conv_f-100_K-1_2021-04-01.rld -d -format "bin" -C "" -s'
RocketLogger v1.1.5

RocketLogger Configuration:
  Sampling rate:    64kSps
  Data aggregation: downsampleaverage
  Update rate:      1Hz
  Webserver:        enabled
  Digital inputs:   enabled
  File format:      binary
  File name:        /home/rocketlogger/data/NUCLEO_L496ZG_01f_Depthwise-Conv_f-100_K-1_2021-04-01.rld
  Channe