# Benchmarking TFLite Models  

This notebook requires the native Linux benchmark binary, which is available from the TensorFlow Lite performance measurement page:  
[https://www.tensorflow.org/lite/performance/measurement](https://www.tensorflow.org/lite/performance/measurement)  

Place the binary in the `benchmarking` folder.

This notebook must be run under Linux.
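
The two requirements above (Linux host, binary in the `benchmarking` folder) can be checked before running any cell. The following is a minimal sketch, not part of the original notebook: `check_benchmark_setup` is a hypothetical helper, and the binary path is assumed from the relative path used by the shell cells further down.

```python
import os
import platform
from pathlib import Path


def check_benchmark_setup(binary_path):
    """Return a list of problems; an empty list means the setup looks usable."""
    problems = []
    if platform.system() != "Linux":
        problems.append(f"expected Linux, found {platform.system()}")
    binary = Path(binary_path)
    if not binary.is_file():
        problems.append(f"benchmark binary not found: {binary}")
    elif not os.access(binary, os.X_OK):
        problems.append(f"benchmark binary is not executable: {binary}")
    return problems


# Assumed location, matching the relative path used by the shell cells below.
for problem in check_benchmark_setup("benchmarking/linux_x86-64_benchmark_model"):
    print("Setup problem:", problem)
```

If the binary exists but is not executable, `chmod +x` on the file is usually enough.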

TensorFlow Lite benchmark tools currently measure and calculate statistics for the following important performance metrics:

- Initialization time
- Inference time of warmup state
- Inference time of steady state
- Memory usage during initialization time
- Overall memory usage

In [10]:
import os, sys, math, datetime
import pathlib
from pathlib import Path

# import workbench.config.config
from workbench.config.config import initialize
from workbench.utils.utils import create_filepaths
from workbench.wandb import wandb_model_DB, get_model_DB_run_id_from_architecture, get_architecture_from_model_DB_run_id

import wandb

In [11]:
import re
from matplotlib import pyplot as plt
#import plotly.express as px
import pandas as pd


# enable plotly in VS Code
#import plotly.io as pio
#pio.renderers.default = "notebook_connected"
#pio.renderers.default = "plotly_mimetype+notebook"

import wandb

In [12]:
# Configure pandas to show all columns & rows
pd.set_option('display.max_columns', None)
#pd.set_option('display.max_rows', None)
pd.set_option('display.max_colwidth', None)

In [13]:
models_dir = initialize()

In [14]:
automated = False

# A `global` statement has no effect at module level, so a plain assignment is enough.
model_name = "mobilenetv1_0.1_96_c3_o2_l5.MV1"
#model_name = "mobilenetv2_0.5_96_c3_o2_l5"
#model_name = "mobilenetv2_0.25_96_c3_o2_t5l512.MV1"


In [15]:

models_path, models_summary_path, models_image_path, models_layer_df_path, models_tf_path, models_tflite_path, models_tflite_opt_path = create_filepaths(model_name)

/mnt/c/tiny_mlc/tiny_cnn/models


In [16]:
models_benchmark_path = models_dir.joinpath(model_name, f"{model_name}_benchmark.txt")
models_benchmark_path
models_performance_path = models_dir.joinpath(model_name, f"{model_name}_performance.txt")
models_performance_path

PosixPath('/mnt/c/tiny_mlc/tiny_cnn/models/mobilenetv1_0.1_96_c3_o2_l5.MV1/mobilenetv1_0.1_96_c3_o2_l5.MV1_performance.txt')

In [17]:
models_tflite_opt_path.as_posix()

'/mnt/c/tiny_mlc/tiny_cnn/models/mobilenetv1_0.1_96_c3_o2_l5.MV1/mobilenetv1_0.1_96_c3_o2_l5.MV1_INT8.tflite'

# Benchmarking for non-quantized .tflite file

In [18]:
# ! ./benchmarking/linux_x86-64_benchmark_model \
#     --graph=$models_tflite_path \
#     --num_threads=1 \
#     --enable_op_profiling=true \
#     | tee $models_benchmark_path
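
The shell invocation used in these cells can also be assembled in Python, which makes it easier to switch between the quantized and non-quantized model files. This is only a sketch under stated assumptions: `build_benchmark_cmd` and `run_benchmark` are hypothetical helpers, and the flags are limited to the ones that already appear in this notebook.

```python
import subprocess
from pathlib import Path


def build_benchmark_cmd(binary, graph, num_threads=1, op_profiling=True, peak_memory=False):
    """Build the argument list for the benchmark binary, using only flags seen in this notebook."""
    profiling_flag = "true" if op_profiling else "false"
    cmd = [
        str(binary),
        f"--graph={graph}",
        f"--num_threads={num_threads}",
        f"--enable_op_profiling={profiling_flag}",
    ]
    if peak_memory:
        cmd.append("--report_peak_memory_footprint=true")
    return cmd


def run_benchmark(cmd, log_path):
    """Run the command and write its output to log_path.

    stderr is merged into stdout so the log holds everything the binary prints;
    the `| tee` pipeline in the shell cells only captures stdout.
    """
    result = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True, check=False
    )
    Path(log_path).write_text(result.stdout)
    return result.stdout


# Show the command without running it; "model.tflite" is a placeholder path.
print(" ".join(build_benchmark_cmd("./benchmarking/linux_x86-64_benchmark_model", "model.tflite")))
```

In the notebook, `run_benchmark(cmd, models_benchmark_path)` would then take the place of the `!` cell.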

# Performance options for quantized .tflite file

In [19]:
! ./benchmarking/linux_x86-64_benchmark_model_performance_options \
    --graph=$models_tflite_opt_path \
    --num_threads=1 \
    --enable_op_profiling=true \
    --report_peak_memory_footprint=true \
    | tee $models_performance_path

INFO: STARTING!
INFO: The list of TFLite runtime options to be benchmarked: [all]
INFO: Log parameter values verbosely: [0]
INFO: Num threads: [4]
INFO: Report the peak memory footprint: [1]
INFO: Graph: [/mnt/c/tiny_mlc/tiny_cnn/models/mobilenetv1_0.1_96_c3_o2_l5.MV1/mobilenetv1_0.1_96_c3_o2_l5.MV1_INT8.tflite]
INFO: Enable op profiling: [1]
INFO: #threads used for CPU inference: [4]
INFO: Use gpu: [0]
INFO: Use xnnpack: [1]
INFO: Loaded model /mnt/c/tiny_mlc/tiny_cnn/models/mobilenetv1_0.1_96_c3_o2_l5.MV1/mobilenetv1_0.1_96_c3_o2_l5.MV1_INT8.tflite
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
INFO: XNNPACK delegate created.
INFO: Explicitly applied XNNPACK delegate, and the model graph will be partially executed by the delegate w/ 2 delegate kernels.
INFO: The input model file size (MB): 0.086776
INFO: Initialized session in 139.423ms.
INFO: Running benchmark for at least 1 iterations and at least 0.5 seconds but terminate if exceeding 150 seconds.
INFO: count=2669 first=4

# Benchmarking for quantized .tflite file

In [20]:
! ./benchmarking/linux_x86-64_benchmark_model \
    --graph=$models_tflite_opt_path \
    --num_threads=1 \
    --enable_op_profiling=true \
    --report_peak_memory_footprint=true \
    | tee $models_benchmark_path

STARTING!
Log parameter values verbosely: [0]
Num threads: [1]
Report the peak memory footprint: [1]
Graph: [/mnt/c/tiny_mlc/tiny_cnn/models/mobilenetv1_0.1_96_c3_o2_l5.MV1/mobilenetv1_0.1_96_c3_o2_l5.MV1_INT8.tflite]
Enable op profiling: [1]
#threads used for CPU inference: [1]
Loaded model /mnt/c/tiny_mlc/tiny_cnn/models/mobilenetv1_0.1_96_c3_o2_l5.MV1/mobilenetv1_0.1_96_c3_o2_l5.MV1_INT8.tflite
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
The input model file size (MB): 0.086776
Initialized session in 170.103ms.
Running benchmark for at least 1 iterations and at least 0.5 seconds but terminate if exceeding 150 seconds.
count=2075 first=10421 curr=199 min=195 max=10421 avg=238.764 std=238

Running benchmark for at least 50 iterations and at least 1 seconds but terminate if exceeding 150 seconds.
count=4053 first=221 curr=205 min=202 max=832 avg=230.364 std=44

Inference timings in us: Init: 170103, First inference: 10421, Warmup (avg): 238.764, Inference (avg): 230.364
Not
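
The saved benchmark log can be turned into numbers for later plotting or logging. The sketch below is an illustration rather than the notebook's own parser: `parse_summary_line` and `parse_stat_lines` are hypothetical helpers, and they assume the log keeps the line formats shown in the output above.

```python
import re


def parse_summary_line(text):
    """Extract the fields of the 'Inference timings in us:' line as floats (microseconds)."""
    marker = "Inference timings in us:"
    for line in text.splitlines():
        if marker in line:
            _, _, fields = line.partition(marker)
            pairs = re.findall(r"([^,:]+):\s*([0-9.]+)", fields)
            return {name.strip(): float(value) for name, value in pairs}
    return {}


def parse_stat_lines(text):
    """Return one dict per 'count=... first=...' line, in order of appearance."""
    stats = []
    for line in text.splitlines():
        if re.search(r"\bcount=\d+", line):
            pairs = re.findall(r"(\w+)=([0-9.]+)", line)
            stats.append({name: float(value) for name, value in pairs})
    return stats


# Sample lines copied from the benchmark output above.
sample_log = """count=2075 first=10421 curr=199 min=195 max=10421 avg=238.764 std=238
count=4053 first=221 curr=205 min=202 max=832 avg=230.364 std=44
Inference timings in us: Init: 170103, First inference: 10421, Warmup (avg): 238.764, Inference (avg): 230.364
"""

summary = parse_summary_line(sample_log)
stats = parse_stat_lines(sample_log)
print(summary["Inference (avg)"], stats[-1]["std"])
```

In the notebook, the text would come from `models_benchmark_path.read_text()` instead of the inline sample.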

In [21]:
if not automated:
    ! code $models_benchmark_path

# Finding the tensor arena size

In [22]:
arena_size_path = Path.cwd().parent.joinpath("tflite-find-arena-size","build",  "find-arena-size")
arena_size_path

PosixPath('/mnt/c/tiny_mlc/tflite-find-arena-size/build/find-arena-size')

In [34]:
%%capture arena_size
! $arena_size_path $models_tflite_path


In [36]:
arena_size_raw = arena_size.stdout.strip()
arena_size_raw

'{"arena_size": 159408}'

In [37]:
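import ast

try:
    # The tool prints a dict literal such as '{"arena_size": 159408}'.
    arena_size_dict = ast.literal_eval(arena_size_raw)
    arena_size = arena_size_dict["arena_size"]
except (ValueError, SyntaxError, TypeError, KeyError):
    # Fall back to 0 when the output cannot be parsed.
    arena_size = 0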

In [27]:
run_id = get_model_DB_run_id_from_architecture(model_name)
run_id

't3imun41'

In [29]:
if len(run_id) > 1:

    PROJECT = "model_DB"

    run = wandb.init(
        # Set the project where this run will be logged
        project=PROJECT,
        id=run_id,
        resume="allow",
    )

    run.log({"arena_size": arena_size})

    wandb.finish()

else:
    print(f"Could not find run_id {run_id}!")

Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.
wandb: Currently logged in as: susbrock. Use `wandb login --relogin` to force relogin


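Run history:

| Metric | Value |
|---|---|
| arena_size | ▁ |

Run summary:

| Metric | Value |
|---|---|
| allocate_tensors_ms_% | 0.209 |
| allocate_tensors_ms_avg | 0.028 |
| allocate_tensors_ms_first | 0.028 |
| arena_size | 0.0 |
| first_inference_us | 7424.0 |
| inference_avg_us | 330.124 |
| init_us | 455845.0 |
| initialization_ms | 455.845 |
| model_size_MB | 0.08718 |
| modify_graph_with_delegate_mem_KB | 836.0 |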
