
# Example: Memory Tracing for Detailed Dynamic Memory Estimation Using ETISS

When measuring the RAM usage of a program, we can distinguish two classes:
- Static RAM usage (known after compilation/linking)
- Dynamic RAM usage (e.g. max. Heap/Stack utilization)

The following example shows how to use the `trace` feature of the `etiss_pulpino` target to measure dynamic RAM usage, in addition to static usage, for a simple benchmark.

## Supported components

**Models:** Any (`sine_model` used below)

**Frontends:** Any (`tflite` used below)

**Frameworks/Backends:** Any (`tvmaotplus` and `tvmrt` used below)

**Platforms/Targets:** `etiss_pulpino` only (`spike`, `ovpsim`, `gvsoc` will be added later)

**Features:** `trace` feature needs to be enabled

## Prerequisites

Set up MLonMCU as usual, i.e. initialize an environment and install all required dependencies. Feel free to use the following minimal `environment.yml.j2` template:

```yaml
---
home: "{{ home_dir }}"
logging:
  level: DEBUG
  to_file: false
  rotate: false
cleanup:
  auto: true
  keep: 10
paths:
  deps: deps
  logs: logs
  results: results
  plugins: plugins
  temp: temp
  models:
    - "{{ home_dir }}/models"
    - "{{ config_dir }}/models"
repos:
  tvm:
    url: "https://github.com/apache/tvm.git"
    ref: de6d8067754d746d88262c530b5241b5577b9aae
  etiss:
    url: "https://github.com/tum-ei-eda/etiss.git"
    ref: 4d2d26fb1fdb17e1da3a397c35d6f8877bf3ceab
  mlif:
    url: "https://github.com/tum-ei-eda/mlonmcu-sw.git"
    ref: 4b9a32659f7c5340e8de26a0b8c4135ca67d64ac
frameworks:
  default: tvm
  tvm:
    enabled: true
    backends:
      default: tvmaotplus
      tvmaotplus:
        enabled: true
        features: []
      tvmrt:
        enabled: true
        features: []
    features: []
frontends:
  tflite:
    enabled: true
    features: []
toolchains:
  gcc: true
platforms:
  mlif:
    enabled: true
    features: []
targets:
  default: etiss_pulpino
  etiss_pulpino:
    enabled: true
    features:
      trace: true
```

Do not forget to set your `MLONMCU_HOME` environment variable first if not using the default location!
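When working from a notebook, one possible way to do this is to set the variable from Python before the first `!mlonmcu` call or before creating the context, since child processes inherit the kernel's environment. The following is only a sketch; the path is a hypothetical placeholder, not a location MLonMCU prescribes.

```python
import os

# Hypothetical placeholder: replace with the directory of your own environment.
ENV_DIR = "/path/to/mlonmcu_env"

# setdefault() keeps a value that was already exported in the shell
# and only falls back to ENV_DIR if the variable is unset.
os.environ.setdefault("MLONMCU_HOME", ENV_DIR)

print("MLONMCU_HOME =", os.environ["MLONMCU_HOME"])
```

Using `setdefault` instead of a plain assignment avoids silently overriding a value that was already configured outside the notebook.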

## Usage

*Warning:* Memory tracing writes a log of every single memory access to disk. This can drastically slow down execution and produce a large amount of data. (For larger models, this might exceed 10GB per inference!)

### A) Command Line Interface

As an example, let's compare `tvmaotplus` (MicroTVM's lightweight Ahead-of-Time runtime) with `tvmrt` (MicroTVM's legacy graph runtime).

To use the `trace` feature, just add `--feature trace` to the command line:

In [1]:
!mlonmcu flow run sine_model --backend tvmaotplus --backend tvmrt --target etiss_pulpino --feature trace

INFO - Loading environment cache from file
INFO - Successfully initialized cache


INFO - [session-79]  Processing stage LOAD
INFO - [session-79]  Processing stage BUILD


INFO - [session-79]  Processing stage COMPILE


INFO - [session-79]  Processing stage RUN


INFO - All runs completed successfuly!
INFO - Postprocessing session report
INFO - [session-79] Done processing runs
INFO - Report:
   Session  Run       Model Frontend Framework     Backend Platform         Target  Total Cycles  Total Instructions  Total CPI  Total ROM  Total RAM  ROM read-only  ROM code  ROM misc  RAM data  RAM zero-init data  RAM stack  RAM heap  Validation Features                                             Config Postprocesses Comment
0       79    0  sine_model   tflite       tvm  tvmaotplus     mlif  etiss_pulpino          1891                1891        1.0      56534       4552           4360     52030       144      2492                 284        736      1040        True  [trace]  {'sine_model.output_shapes': {'Identity': [1, ...            []       -
1       79    1  sine_model   tflite       tvm       tvmrt     mlif  etiss_pulpino        329503              329503        1.0      82884     141420          12576     70164       144      2500        

By using the `filter_cols` postprocess, we can strip all unneeded information from the benchmark report to make it a bit more readable:

In [2]:
!mlonmcu flow run sine_model --backend tvmaotplus --backend tvmrt --target etiss_pulpino --feature trace \
    --postprocess filter_cols --config filter_cols.keep="Backend,Total RAM,RAM data,RAM zero-init data,RAM stack,RAM heap"

INFO - Loading environment cache from file
INFO - Successfully initialized cache


INFO - [session-80]  Processing stage LOAD
INFO - [session-80]  Processing stage BUILD


INFO - [session-80]  Processing stage COMPILE


INFO - [session-80]  Processing stage RUN


INFO - [session-80]  Processing stage POSTPROCESS


INFO - All runs completed successfuly!
INFO - Postprocessing session report
INFO - [session-80] Done processing runs
INFO - Report:
      Backend  Total RAM  RAM data  RAM zero-init data  RAM stack  RAM heap
0  tvmaotplus       4552      2492                 284        736      1040
1       tvmrt     141420      2500              132292       5588      1040


For this simple benchmark, the `tvmrt` backend uses about 7.6 times as much stack as `tvmaotplus` (5588 vs. 736). However, this is small compared to the total RAM usage of `tvmrt` in this scenario (141420), which is dominated by zero-initialized data (132292).
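The stack comparison can be checked with a few lines of Python. This is a minimal sketch that uses only the numbers copied from the filtered report above; the variable names are chosen for illustration.

```python
# Values copied from the filtered report (columns "Total RAM" and "RAM stack").
ram = {
    "tvmaotplus": {"total": 4552, "stack": 736},
    "tvmrt": {"total": 141420, "stack": 5588},
}

# How much more stack does tvmrt need compared to tvmaotplus?
stack_ratio = ram["tvmrt"]["stack"] / ram["tvmaotplus"]["stack"]
print(f"stack ratio tvmrt/tvmaotplus: {stack_ratio:.1f}x")

# Which share of the total RAM does the stack take for each backend?
stack_share = {
    backend: 100 * values["stack"] / values["total"]
    for backend, values in ram.items()
}
for backend, share in stack_share.items():
    print(f"{backend}: stack is {share:.1f}% of total RAM")
```

Looking at the share rather than the absolute value shows why the larger stack of `tvmrt` matters little here: it makes up only a small fraction of that backend's total RAM.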

### B) Python Scripting

Some Imports

In [4]:
from tempfile import TemporaryDirectory
from pathlib import Path
import pandas as pd

from mlonmcu.context.context import MlonMcuContext
from mlonmcu.session.run import RunStage

Benchmark Configuration

In [5]:
FRONTEND = "tflite"
MODEL = "sine_model"
BACKENDS = ["tvmaotplus", "tvmrt"]
PLATFORM = "mlif"
TARGET = "etiss_pulpino"
FEATURES = ["trace"]
CONFIG = {}
POSTPROCESSES = []
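As an alternative to leaving `POSTPROCESSES` empty, the `filter_cols` postprocess from the command line example could presumably be enabled here as well. The sketch below assumes that the Python API accepts the same `filter_cols.keep` key and the same comma-separated value as the `--config` flag; `KEEP_COLUMNS` is a helper name introduced for illustration.

```python
# Columns to keep, identical to the ones used in the command line example.
KEEP_COLUMNS = [
    "Backend",
    "Total RAM",
    "RAM data",
    "RAM zero-init data",
    "RAM stack",
    "RAM heap",
]

# Assumption: the config key and value format mirror the CLI usage
# `--postprocess filter_cols --config filter_cols.keep="..."`.
POSTPROCESSES = ["filter_cols"]
CONFIG = {"filter_cols.keep": ",".join(KEEP_COLUMNS)}

print(CONFIG["filter_cols.keep"])
```

Keeping the column names in a list and joining them avoids quoting mistakes in the comma-separated string.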

Initialize and run a single benchmark


In [6]:
with MlonMcuContext() as context:
    with context.create_session() as session:
        for backend in BACKENDS:
            run = session.create_run(config=CONFIG)
            run.add_features_by_name(FEATURES, context=context)
            run.add_frontend_by_name(FRONTEND, context=context)
            run.add_model_by_name(MODEL, context=context)
            run.add_backend_by_name(backend, context=context)
            run.add_platform_by_name(PLATFORM, context=context)
            run.add_target_by_name(TARGET, context=context)
            run.add_postprocesses_by_name(POSTPROCESSES)
        session.process_runs(context=context)
        report = session.get_reports()
assert "Failing" not in report.df.columns
report.df

INFO - Loading environment cache from file
INFO - Successfully initialized cache
INFO - [session-81] Processing all stages
INFO - All runs completed successfuly!
INFO - Postprocessing session report
INFO - [session-81] Done processing runs


|   | Session | Run | Model | Frontend | Framework | Backend | Platform | Target | Total Cycles | Total Instructions | Total CPI | Total ROM | Total RAM | ROM read-only | ROM code | ROM misc | RAM data | RAM zero-init data | RAM stack | RAM heap | Validation | Features | Config | Postprocesses | Comment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 81 | 0 | sine_model | tflite | tvm | tvmaotplus | mlif | etiss_pulpino | 1891 | 1891 | 1.0 | 56534 | 4552 | 4360 | 52030 | 144 | 2492 | 284 | 736 | 1040 | True | `[trace]` | `{'sine_model.output_shapes': {'Identity': [1, ...` | `[]` | - |
| 1 | 81 | 1 | sine_model | tflite | tvm | tvmrt | mlif | etiss_pulpino | 329503 | 329503 | 1.0 | 82884 | 141420 | 12576 | 70164 | 144 | 2500 | 132292 | 5588 | 1040 | True | `[trace]` | `{'sine_model.output_shapes': {'Identity': [1, ...` | `[]` | - |


Filter out irrelevant data (using pandas here instead of MLonMCU postprocesses)

In [7]:
df = report.df
df[["Backend", "Total RAM", "RAM data", "RAM zero-init data", "RAM stack", "RAM heap"]]

|   | Backend | Total RAM | RAM data | RAM zero-init data | RAM stack | RAM heap |
|---|---|---|---|---|---|---|
| 0 | tvmaotplus | 4552 | 2492 | 284 | 736 | 1040 |
| 1 | tvmrt | 141420 | 2500 | 132292 | 5588 | 1040 |
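To relate these columns back to the two classes from the introduction, the values can be grouped into a static and a dynamic part. This is a sketch under the assumption that `RAM data` and `RAM zero-init data` form the static part, while `RAM stack` and `RAM heap` form the dynamic part; the column names `RAM static` and `RAM dynamic` are introduced here for illustration. The DataFrame is rebuilt from the values shown above so that the snippet runs on its own; in the notebook, `report.df` would be used instead.

```python
import pandas as pd

# Values copied from the filtered output above.
df = pd.DataFrame(
    {
        "Backend": ["tvmaotplus", "tvmrt"],
        "Total RAM": [4552, 141420],
        "RAM data": [2492, 2500],
        "RAM zero-init data": [284, 132292],
        "RAM stack": [736, 5588],
        "RAM heap": [1040, 1040],
    }
)

# Assumption: initialized and zero-initialized data are known after linking
# (static), while stack and heap usage is only known after tracing (dynamic).
df["RAM static"] = df["RAM data"] + df["RAM zero-init data"]
df["RAM dynamic"] = df["RAM stack"] + df["RAM heap"]

# Sanity check: both parts should add up to the reported total.
assert (df["RAM static"] + df["RAM dynamic"] == df["Total RAM"]).all()

print(df[["Backend", "RAM static", "RAM dynamic", "Total RAM"]])
```

For the numbers above, the four components add up exactly to `Total RAM` for both backends, which supports reading the total as the sum of the static and dynamic parts.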
