# Example: Compare MIPS of RISC-V Instruction Set Simulators

Typically, MLonMCU is used to benchmark TinyML workloads on real hardware or simulators. However, its flexibility also allows some interesting experiments not directly related to embedded ML. In the following, the performance of some RISC-V ISA simulators is compared using the MLonMCU command line or Python API.

## Supported components

**Models:** Any (`resnet` and `toycar` used below)

**Frontends:** Any (`tflite` used below)

**Frameworks/Backends:** Any (`tvmaot` used below)

**Platforms/Targets:** `etiss_pulpino`, `spike`, `ovpsim` (`etiss_pulpino` and `spike` used below)

## Prerequisites

Set up MLonMCU as usual, i.e. initialize an environment and install all required dependencies. Feel free to use the following minimal `environment.yml.j2` template:

```yaml
---
home: "{{ home_dir }}"
logging:
  level: DEBUG
  to_file: false
  rotate: false
cleanup:
  auto: true
  keep: 10
paths:
  deps: deps
  logs: logs
  results: results
  plugins: plugins
  temp: temp
  models:
    - "{{ home_dir }}/models"
    - "{{ config_dir }}/models"
repos:
  tvm:
    url: "https://github.com/apache/tvm.git"
    ref: de6d8067754d746d88262c530b5241b5577b9aae
  etiss:
    url: "https://github.com/tum-ei-eda/etiss.git"
    ref: 4d2d26fb1fdb17e1da3a397c35d6f8877bf3ceab
  spike:
    url: "https://github.com/riscv-software-src/riscv-isa-sim.git"
    ref: 0bc176b3fca43560b9e8586cdbc41cfde073e17a
  spikepk:
    url: "https://github.com/riscv-software-src/riscv-pk.git"
    ref: 7e9b671c0415dfd7b562ac934feb9380075d4aa2
  mlif:
    url: "https://github.com/tum-ei-eda/mlonmcu-sw.git"
    ref: 4b9a32659f7c5340e8de26a0b8c4135ca67d64ac
frameworks:
  default: tvm
  tvm:
    enabled: true
    backends:
      default: tvmaot
      tvmaot:
        enabled: true
        features: []
    features: []
frontends:
  tflite:
    enabled: true
    features: []
toolchains:
  gcc: true
platforms:
  mlif:
    enabled: true
    features: []
targets:
  default: spike
  spike:
    enabled: true
    features: []
  etiss_pulpino:
    enabled: true
    features: []
```

Do not forget to set your `MLONMCU_HOME` environment variable first if not using the default location!
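
As a quick sanity check before running any benchmark, the variable can be inspected from Python. This is only a minimal sketch using the standard library; the helper name `describe_mlonmcu_home` is hypothetical and not part of MLonMCU.

```python
import os


def describe_mlonmcu_home(environ=os.environ):
    """Report whether MLONMCU_HOME is set (hypothetical helper, not an MLonMCU API)."""
    home = environ.get("MLONMCU_HOME")
    if home:
        return f"MLONMCU_HOME is set to {home}"
    return "MLONMCU_HOME is not set; the default location will be used"


print(describe_mlonmcu_home())
```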

## Usage

If supported by the selected target, the measured MIPS (of the simulation) is part of the report printed/returned by MLonMCU. The following shows how to filter out unwanted information and how to increase the accuracy of the MIPS value.

### A) Command Line Interface

Let's start with an example benchmark of two models using two different RISC-V simulators:

In [5]:
!mlonmcu flow run resnet toycar --backend tvmaot --target etiss_pulpino --target spike

INFO - Loading environment cache from file
INFO - Successfully initialized cache
INFO - Loading extensions.py (User)
INFO - [session-373]  Processing stage LOAD
INFO - [session-373]  Processing stage BUILD
INFO - [session-373]  Processing stage COMPILE
INFO - [session-373]  Processing stage RUN
INFO - All runs completed successfuly!
INFO - Postprocessing session report
INFO - [session-373] Done processing runs
INFO - Report:
   Session  Run   Model Frontend Framework Backend Platform         Target    Cycles        MIPS  Total ROM  Total RAM  ROM read-only  ROM code  ROM misc  RAM data  RAM zero-init data Features                                             Config Postprocesses Comment
0      373    0  resnet   tflite       tvm  tvmaot     mlif  etiss_pulpino  82445446   68.000000     233324     124785         167488     65692       144      2485              122300       []  {'tflite.use_inout_data': False, 'tflite.visua...            []       -
1      373    1  resnet   tflite       

The MIPS value can be found in the column next to the Cycles (which in this case actually count instructions). However, the report contains a lot of further information that we want to filter out next. This can be achieved using the `filter_cols` postprocess.
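
Conceptually, a column filter of this kind keeps only the listed columns of every report row. The following plain-Python sketch illustrates the idea on one row taken from the report above; the function `keep_columns` is a hypothetical stand-in and not the actual `filter_cols` implementation.

```python
def keep_columns(rows, keep):
    """Return copies of the rows that contain only the listed columns, in that order."""
    return [{col: row[col] for col in keep if col in row} for row in rows]


# One row from the unfiltered report (only a few of its columns are reproduced here)
report_rows = [
    {
        "Session": 373,
        "Run": 0,
        "Model": "resnet",
        "Target": "etiss_pulpino",
        "Cycles": 82445446,
        "MIPS": 68.0,
        "Total ROM": 233324,
    },
]

filtered = keep_columns(report_rows, ["Model", "Target", "MIPS"])
print(filtered)
```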

In [6]:
!mlonmcu flow run resnet toycar --backend tvmaot --target etiss_pulpino --target spike --postprocess filter_cols --config filter_cols.keep="Model,Target,MIPS"

INFO - Loading environment cache from file
INFO - Successfully initialized cache
INFO - Loading extensions.py (User)
INFO - [session-374]  Processing stage LOAD
INFO - [session-374]  Processing stage BUILD
INFO - [session-374]  Processing stage COMPILE
INFO - [session-374]  Processing stage RUN
INFO - [session-374]  Processing stage POSTPROCESS
INFO - All runs completed successfuly!
INFO - Postprocessing session report
INFO - [session-374] Done processing runs
INFO - Report:
    Model         Target        MIPS
0  resnet  etiss_pulpino   68.000000
1  resnet          spike  246.446167
2  toycar  etiss_pulpino    2.000000
3  toycar          spike   48.016112


That looks much cleaner! However, the numbers seem quite low, especially for the smaller `toycar` (MLPerfTiny Anomaly Detection) model. Let's see if the MIPS will increase when running more than a single inference. We are using the `benchmark` feature for this.
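
One plausible explanation for low values on short runs is a fixed simulator startup cost that is amortized over the executed instructions. This is an assumption, not something measured in this example. The sketch below models it with purely hypothetical numbers (instruction count, peak MIPS, and startup time are all made up for illustration).

```python
def effective_mips(instr_per_inference, num_runs, peak_mips, startup_seconds):
    """Effective MIPS when a fixed startup time is added to the pure simulation time."""
    total_instr = instr_per_inference * num_runs
    sim_seconds = startup_seconds + total_instr / (peak_mips * 1e6)
    return total_instr / sim_seconds / 1e6


# Hypothetical values: 1e6 instructions per inference, 100 MIPS peak, 0.5 s startup
results = {n: effective_mips(1_000_000, n, 100.0, 0.5) for n in (1, 10, 50)}
for n, value in results.items():
    print(f"num_runs={n}: {value:.2f} MIPS")
```

Under this simple model the reported value approaches the peak rate only when the simulation runs long enough for the startup time to become negligible.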

*Hint*: Since we are now running roughly 60 times as many inferences, the following cell will likely need a few minutes to execute.

In [7]:
!mlonmcu flow run resnet toycar --backend tvmaot --target etiss_pulpino --target spike --postprocess config2cols --postprocess filter_cols --config filter_cols.keep="Model,Target,MIPS,config_benchmark.num_runs" --feature benchmark --config-gen benchmark.num_runs=1 --config-gen benchmark.num_runs=10 --config-gen benchmark.num_runs=50

INFO - Loading environment cache from file
INFO - Successfully initialized cache
INFO - Loading extensions.py (User)
INFO - [session-375]  Processing stage LOAD
INFO - [session-375]  Processing stage BUILD
INFO - [session-375]  Processing stage COMPILE
INFO - [session-375]  Processing stage RUN
INFO - [session-375]  Processing stage POSTPROCESS
INFO - All runs completed successfuly!
INFO - Postprocessing session report
INFO - [session-375] Done processing runs
INFO - Report:
     Model         Target        MIPS config_benchmark.num_runs
0   resnet  etiss_pulpino   68.000000                         1
1   resnet          spike  209.017940                         1
2   resnet  etiss_pulpino  127.000000                        10
3   resnet          spike  318.765075                        10
4   resnet  etiss_pulpino  136.000000                        50
5   resnet          spike  312.104929                        50
6   toycar  etiss_pulpino    2.000000                         1
7   toyc

This looks more promising. The experiment shows that MIPS measurements might not be accurate for short-running simulations. Also, spike seems to be more than twice as fast as ETISS.
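
The speed ratio can be checked directly from the `resnet` rows of the report above. The sketch below only redoes that arithmetic; the dictionary layout is chosen for illustration.

```python
# MIPS values of the resnet rows, keyed by benchmark.num_runs
mips = {
    1: {"etiss_pulpino": 68.0, "spike": 209.017940},
    10: {"etiss_pulpino": 127.0, "spike": 318.765075},
    50: {"etiss_pulpino": 136.0, "spike": 312.104929},
}

# Ratio of spike MIPS to ETISS MIPS for each num_runs setting
speedups = {n: row["spike"] / row["etiss_pulpino"] for n, row in mips.items()}

for n, ratio in speedups.items():
    print(f"num_runs={n}: spike is {ratio:.2f}x as fast as ETISS")
```

For all three settings the ratio stays above 2, which is consistent with the statement above.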

### B) Python Scripting

Some imports

In [4]:
from tempfile import TemporaryDirectory
from pathlib import Path
import pandas as pd

from mlonmcu.context.context import MlonMcuContext
from mlonmcu.session.run import RunStage

Benchmark Configuration

In [14]:
FRONTEND = "tflite"
MODELS = ["resnet", "toycar"]
BACKEND = "tvmaot"
PLATFORM = "mlif"
TARGETS = ["etiss_pulpino", "spike"]
POSTPROCESSES = ["config2cols", "filter_cols"]
FEATURES = ["benchmark"]
CONFIG = {
    "filter_cols.keep": ["Model", "Target", "MIPS", "config_benchmark.num_runs"]
}

Initialize a session and run all benchmarks

In [17]:
with MlonMcuContext() as context:
    session = context.create_session()

    def add_run(model, target, num):
        # Copy the shared config so that each run gets its own benchmark.num_runs
        cfg = CONFIG.copy()
        cfg["benchmark.num_runs"] = num
        run = session.create_run(config=cfg)
        run.add_frontend_by_name(FRONTEND, context=context)
        run.add_features_by_name(FEATURES, context=context)
        run.add_model_by_name(model, context=context)
        run.add_backend_by_name(BACKEND, context=context)
        run.add_platform_by_name(PLATFORM, context=context)
        run.add_target_by_name(target, context=context)
        run.add_postprocesses_by_name(POSTPROCESSES)

    # Create one run per (model, target, num_runs) combination
    for model in MODELS:
        for target in TARGETS:
            for num in [1, 10]:  # Removed 50 to cut down runtime
                add_run(model, target, num)

    # Process all stages of every run, then collect the report
    session.process_runs(context=context)
    report = session.get_reports()
report.df

INFO - Loading environment cache from file
INFO - Successfully initialized cache
INFO - Loading extensions.py (User)
INFO - [session-382] Processing all stages
INFO - All runs completed successfuly!
INFO - Postprocessing session report
INFO - [session-382] Done processing runs


    Model         Target   MIPS  config_benchmark.num_runs
0  resnet  etiss_pulpino   70.0                          1
1  resnet  etiss_pulpino  127.0                         10
2  resnet          spike    NaN                          1
3  resnet          spike    NaN                         10
4  toycar  etiss_pulpino    2.0                          1
5  toycar  etiss_pulpino   18.0                         10
6  toycar          spike    NaN                          1
7  toycar          spike    NaN                         10
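
The nested loops in the script create one run per combination of model, target, and `benchmark.num_runs`, which is why the report has eight rows. The following standard-library sketch enumerates the same combinations; it only illustrates the run matrix and does not call MLonMCU.

```python
from itertools import product

MODELS = ["resnet", "toycar"]
TARGETS = ["etiss_pulpino", "spike"]
NUM_RUNS = [1, 10]

# Same iteration order as the nested for loops: model, then target, then num_runs
combinations = list(product(MODELS, TARGETS, NUM_RUNS))

for index, (model, target, num) in enumerate(combinations):
    print(index, model, target, num)
```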
