In [1]:
from IPython.display import Code

# Example: Compare MIPS of RISC-V Instruction Set Simulators

Typically, MLonMCU is used to benchmark TinyML workloads on real hardware or simulators. However, its flexibility also allows some interesting experiments not directly related to embedded ML. In the following, the performance of several RISC-V instruction set simulators is compared using the MLonMCU command line interface and the Python API.

## Supported components

**Models:** Any (`resnet` and `toycar` used below)

**Frontends:** Any (`tflite` used below)

**Frameworks/Backends:** Any (`tvmaot` used below)

**Platforms/Targets:** `etiss`, `spike`, `ovpsim` (`etiss_pulpino` and `spike` used below)

## Prerequisites

If not done already, set up a virtual Python environment and install the required packages into it (see `requirements.txt`).

In [2]:
Code(filename="requirements.txt")

Set up MLonMCU as usual, i.e. initialize an environment and install all required dependencies. Feel free to use the following minimal `environment.yml.j2` template:

In [3]:
Code(filename="environment.yml.j2")

Do not forget to set your `MLONMCU_HOME` environment variable first if not using the default location!
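A quick way to confirm which location is active before launching a flow is to inspect the variable from Python. The sketch below is only a convenience check: the helper name `resolve_mlonmcu_home` is hypothetical and not part of MLonMCU, and it deliberately does not guess where the default location is.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_mlonmcu_home(env: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Return the directory named by MLONMCU_HOME, or None if it is unset or empty.

    Hypothetical helper for illustration; not part of the MLonMCU API.
    """
    value = env.get("MLONMCU_HOME", "").strip()
    return Path(value).expanduser() if value else None


home = resolve_mlonmcu_home()
if home is None:
    print("MLONMCU_HOME is not set; the default location will be used.")
else:
    print(f"MLONMCU_HOME points to {home} (exists: {home.is_dir()})")
```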

## Usage

If supported by the selected target, the measured MIPS (of the simulation) is part of the report printed or returned by MLonMCU. The following shows how to filter out unneeded information and how to increase the accuracy of the MIPS value.

### A) Command Line Interface

Let's start with an example benchmark of two models using two different RISC-V simulators:

In [4]:
!mlonmcu flow run resnet toycar --backend tvmaot --target etiss_pulpino --target spike -c run.export_optional=1

INFO - Loading environment cache from file
INFO - Successfully initialized cache
INFO - [session-13]  Processing stage LOAD
INFO - [session-13]  Processing stage BUILD
ERROR - Package "tflite.Model" is not installed. Hint: "pip install tlcpack[tvmc]".
ERROR - The process returned an non-zero exit code 5! (CMD: `/tmp/CodeSizeComparison-TsW6/venv/bin/python3 -m tvm.driver.tvmc compile /tmp/CodeSizeComparison-TsW6/workspace/temp/sessions/13/runs/0/resnet.tflite --target c --target-c-mcpu generic-rv32 --target-c-model etiss-rv32gc --runtime crt --executor aot --pass-config tir.disable_vectorize=True --pass-config tir.usmp.enable=False --dump-code relay --opt-level 3 --output /tmp/tmp82ik2eb3/default.tar -f mlf --model-format tflite --runtime-crt-system-lib 0 --executor-aot-unpacked-api 0 --executor-aot-interface-api packed --target-c-constants-byte-alignment 16 --target-c-workspace-byte-alignment 16`)
Traceback (most recent call last):
  File "/work/git/mlonmcu/mlonmcu/mlonmcu/sessi

The MIPS value can be found in the column next to the Cycles (which in this case actually count instructions). However, the report contains a lot of further information that we want to filter out next. This can be achieved using the `filter_cols` postprocess.
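Conceptually, keeping a fixed set of columns is a simple projection over the report rows. The following is a minimal sketch of that idea in plain Python, not MLonMCU's actual implementation. The function `keep_columns`, the `Session` and `Run` column names, and all values in the sample row are assumptions made up for illustration; they are not measurements.

```python
from typing import Dict, Iterable, List


def keep_columns(rows: List[Dict[str, object]], keep: Iterable[str]) -> List[Dict[str, object]]:
    """Return copies of the rows restricted to the columns in `keep`, in that order.

    Columns listed in `keep` that a row does not contain are silently skipped.
    """
    keep = list(keep)
    return [{col: row[col] for col in keep if col in row} for row in rows]


# Placeholder report row; the numbers are invented and are not measurements.
report_rows = [
    {"Session": 0, "Run": 0, "Model": "resnet", "Target": "spike", "Cycles": 1000, "MIPS": 1.0},
]

filtered = keep_columns(report_rows, ["Model", "Target", "MIPS"])
print(filtered)  # [{'Model': 'resnet', 'Target': 'spike', 'MIPS': 1.0}]
```

On the command line, the real postprocess is enabled with `--postprocess filter_cols` and configured through `filter_cols.keep`.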

In [5]:
!mlonmcu flow run resnet toycar --backend tvmaot --target etiss_pulpino --target spike --postprocess filter_cols --config filter_cols.keep="Model,Target,MIPS" -c run.export_optional=1

INFO - Loading environment cache from file
INFO - Successfully initialized cache
INFO - [session-14]  Processing stage LOAD
INFO - [session-14]  Processing stage BUILD
ERROR - Package "tflite.Model" is not installed. Hint: "pip install tlcpack[tvmc]".
ERROR - The process returned an non-zero exit code 5! (CMD: `/tmp/CodeSizeComparison-TsW6/venv/bin/python3 -m tvm.driver.tvmc compile /tmp/CodeSizeComparison-TsW6/workspace/temp/sessions/14/runs/0/resnet.tflite --target c --target-c-mcpu generic-rv32 --target-c-model etiss-rv32gc --runtime crt --executor aot --pass-config tir.disable_vectorize=True --pass-config tir.usmp.enable=False --dump-code relay --opt-level 3 --output /tmp/tmpicoxsv6v/default.tar -f mlf --model-format tflite --runtime-crt-system-lib 0 --executor-aot-unpacked-api 0 --executor-aot-interface-api packed --target-c-constants-byte-alignment 16 --target-c-workspace-byte-alignment 16`)
Traceback (most recent call last):
  File "/work/git/mlonmcu/mlonmcu/mlonmcu/sessi

That looks much cleaner! However, the numbers seem quite low, especially for the smaller `toycar` (MLPerfTiny Anomaly Detection) model. Let's see if the MIPS increase when running more than a single inference. We are using the `benchmark` feature for this.

*Hint*: Since we are now running about 60 times as many inferences (1 + 10 + 50 instead of 1 per model and target), the following cell will likely need a few minutes to execute.

In [6]:
!mlonmcu flow run resnet toycar --backend tvmaot --target etiss_pulpino --target spike --postprocess config2cols --postprocess filter_cols --config filter_cols.keep="Model,Target,MIPS,config_benchmark.num_runs" --feature benchmark --config-gen benchmark.num_runs=1 --config-gen benchmark.num_runs=10 --config-gen benchmark.num_runs=50 -c run.export_optional=1

INFO - Loading environment cache from file
INFO - Successfully initialized cache
INFO - [session-15]  Processing stage LOAD
INFO - [session-15]  Processing stage BUILD
ERROR - Package "tflite.Model" is not installed. Hint: "pip install tlcpack[tvmc]".
ERROR - The process returned an non-zero exit code 5! (CMD: `/tmp/CodeSizeComparison-TsW6/venv/bin/python3 -m tvm.driver.tvmc compile /tmp/CodeSizeComparison-TsW6/workspace/temp/sessions/15/runs/0/resnet.tflite --target c --target-c-mcpu generic-rv32 --target-c-model etiss-rv32gc --runtime crt --executor aot --pass-config tir.disable_vectorize=True --pass-config tir.usmp.enable=False --dump-code relay --opt-level 3 --output /tmp/tmpmk4t7kxn/default.tar -f mlf --model-format tflite --runtime-crt-system-lib 0 --executor-aot-unpacked-api 0 --executor-aot-interface-api packed --target-c-constants-byte-alignment 16 --target-c-workspace-byte-alignment 16`)
Traceback (most recent call last):
  File "/work/git/mlonmcu/mlonmcu/mlonmcu/sessi

This looks more promising. The experiment shows that MIPS measurements might not be accurate for short-running simulations. Also, Spike seems to be more than twice as fast as ETISS.
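One plausible explanation for the low values on short runs is a fixed startup cost of the simulator that is amortized as more instructions are executed. The sketch below models this under a simplifying assumption: total wall time is a constant overhead plus the time spent simulating at a steady rate. All parameter values (`instr_per_inference`, `steady_mips`, `overhead_s`) are hypothetical and chosen only to show the trend; they are not measured on ETISS or Spike.

```python
def measured_mips(num_runs: int, instr_per_inference: float,
                  steady_mips: float, overhead_s: float) -> float:
    """Estimate the MIPS a simulator would report for `num_runs` inferences.

    Assumed model: wall time = fixed overhead + instructions / steady rate.
    """
    instructions = num_runs * instr_per_inference
    wall_time_s = overhead_s + instructions / (steady_mips * 1e6)
    return instructions / wall_time_s / 1e6


# Hypothetical parameters, for illustration only.
for num_runs in (1, 10, 50):
    mips = measured_mips(num_runs, instr_per_inference=1_000_000,
                         steady_mips=100.0, overhead_s=0.5)
    print(f"num_runs={num_runs:>2}: {mips:.2f} MIPS")
```

Under this model the reported value approaches `steady_mips` only as the number of simulated instructions grows, which matches the reason for raising `benchmark.num_runs`.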

### B) Python Scripting

Some imports

In [7]:
from tempfile import TemporaryDirectory
from pathlib import Path
import pandas as pd

from mlonmcu.context.context import MlonMcuContext
from mlonmcu.session.run import RunStage

Benchmark Configuration

In [8]:
FRONTEND = "tflite"
MODELS = ["resnet", "toycar"]
BACKEND = "tvmaot"
PLATFORM = "mlif"
TARGETS = ["etiss_pulpino", "spike"]
POSTPROCESSES = ["config2cols", "filter_cols"]
FEATURES = ["benchmark"]
CONFIG = {
    "filter_cols.keep": ["Model", "Target", "MIPS", "config_benchmark.num_runs"], "run.export_optional": True
}
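The configuration above spans a run matrix: every model is combined with every target and every `benchmark.num_runs` value. As a rough sketch of how many runs to expect, the combinations can be enumerated with `itertools.product`. The list name `NUM_RUNS` is hypothetical; its values `[1, 10]` are the ones used by the script in this notebook.

```python
from itertools import product

# Values copied from the benchmark configuration of this notebook.
MODELS = ["resnet", "toycar"]
TARGETS = ["etiss_pulpino", "spike"]
NUM_RUNS = [1, 10]  # hypothetical name; values of benchmark.num_runs

# One run per (model, target, num_runs) combination.
run_matrix = list(product(MODELS, TARGETS, NUM_RUNS))

for index, (model, target, num_runs) in enumerate(run_matrix):
    print(index, model, target, num_runs)

print(f"{len(run_matrix)} runs in total")
```

Adding a third `benchmark.num_runs` value, as in the command line example, would raise the total from 8 to 12 runs.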

Initialize a session and run the benchmarks

In [9]:
with MlonMcuContext() as context:
    with context.create_session() as session:
        # Create one run per (model, target, num_runs) combination.
        for model in MODELS:
            for target in TARGETS:
                for num in [1, 10]:  # Removed 50 to cut down runtime
                    cfg = CONFIG.copy()
                    cfg["benchmark.num_runs"] = num
                    run = session.create_run(config=cfg)
                    run.add_frontend_by_name(FRONTEND, context=context)
                    run.add_features_by_name(FEATURES, context=context)
                    run.add_model_by_name(model, context=context)
                    run.add_backend_by_name(BACKEND, context=context)
                    run.add_platform_by_name(PLATFORM, context=context)
                    run.add_target_by_name(target, context=context)
                    run.add_postprocesses_by_name(POSTPROCESSES)
        # Process all queued runs at once and collect the combined report.
        session.process_runs(context=context)
        report = session.get_reports()
assert "Failing" not in report.df.columns
report.df

INFO - Loading environment cache from file
INFO - Successfully initialized cache
INFO - [session-16] Processing all stages
ERROR - Package "tflite.Model" is not installed. Hint: "pip install tlcpack[tvmc]".
ERROR - The process returned an non-zero exit code 5! (CMD: `/tmp/CodeSizeComparison-TsW6/venv/bin/python3 -m tvm.driver.tvmc compile /tmp/CodeSizeComparison-TsW6/workspace/temp/sessions/16/runs/0/resnet.tflite --target c --target-c-mcpu generic-rv32 --target-c-model etiss-rv32gc --runtime crt --executor aot --pass-config tir.disable_vectorize=True --pass-config tir.usmp.enable=False --dump-code relay --opt-level 3 --output /tmp/tmpivsn_d3_/default.tar -f mlf --model-format tflite --runtime-crt-system-lib 0 --executor-aot-unpacked-api 0 --executor-aot-interface-api packed --target-c-constants-byte-alignment 16 --target-c-workspace-byte-alignment 16`)
Traceback (most recent call last):
  File "/work/git/mlonmcu/mlonmcu/mlonmcu/session/run.py", line 1045, in process
    func()
  File 

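|   | Model | Target | config_benchmark.num_runs |
|---|--------|---------------|----|
| 0 | resnet | etiss_pulpino | 1  |
| 1 | resnet | etiss_pulpino | 10 |
| 2 | resnet | spike         | 1  |
| 3 | resnet | spike         | 10 |
| 4 | toycar | etiss_pulpino | 1  |
| 5 | toycar | etiss_pulpino | 10 |
| 6 | toycar | spike         | 1  |
| 7 | toycar | spike         | 10 |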
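To compare the simulators directly, the report rows can be grouped by model and number of runs, and the MIPS of one target divided by the MIPS of another. The sketch below assumes the report contains a `MIPS` column and that the rows are available as dictionaries, for example via `report.df.to_dict("records")`. The function `speedup_by_model` is a hypothetical helper, and the MIPS values in the sample rows are placeholders, not measurements.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def speedup_by_model(rows: List[dict], baseline: str, candidate: str) -> Dict[Tuple[str, int], float]:
    """Return {(model, num_runs): candidate MIPS / baseline MIPS}.

    Combinations that lack a row for either target are skipped.
    """
    mips = defaultdict(dict)
    for row in rows:
        key = (row["Model"], row["config_benchmark.num_runs"])
        mips[key][row["Target"]] = row["MIPS"]
    return {
        key: by_target[candidate] / by_target[baseline]
        for key, by_target in mips.items()
        if baseline in by_target and candidate in by_target
    }


# Placeholder rows; the MIPS values are invented for illustration.
rows = [
    {"Model": "toycar", "Target": "etiss_pulpino", "config_benchmark.num_runs": 10, "MIPS": 10.0},
    {"Model": "toycar", "Target": "spike", "config_benchmark.num_runs": 10, "MIPS": 25.0},
]

print(speedup_by_model(rows, baseline="etiss_pulpino", candidate="spike"))
```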
