# Example: Validate Model Outputs for correctness

For safety-critical applications, we cannot accept deviations in the model outputs caused by the deployment method. The following shows how to verify that the generated model outputs are as expected.

*Warning:* The current version can only verify the bit-exactness of model outputs. It is therefore very sensitive to even small deviations from the reference (golden) outputs. This limitation might be lifted in a future revision of MLonMCU's `validate` feature.
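To illustrate why a bit-exact check is so strict, the following minimal sketch compares two quantized output vectors both bit-exactly and with a small tolerance. It is not how MLonMCU's `validate` feature is implemented; the function name `compare_outputs`, the `max_lsb_diff` parameter, and the sample values are hypothetical and chosen only for illustration.

```python
def compare_outputs(actual, golden, max_lsb_diff=0):
    """Compare two quantized (integer) output vectors.

    Returns a tuple (bit_exact, within_tolerance, worst_diff), where
    worst_diff is the largest absolute element-wise difference.
    """
    if len(actual) != len(golden):
        raise ValueError("output length mismatch")
    diffs = [abs(a - g) for a, g in zip(actual, golden)]
    worst = max(diffs, default=0)
    return worst == 0, worst <= max_lsb_diff, worst


# Made-up int8 values: a single element differs by one quantization step.
golden = [5, 233, 2]
actual = [6, 233, 2]

print(compare_outputs(actual, golden))                  # (False, False, 1)
print(compare_outputs(actual, golden, max_lsb_diff=1))  # (False, True, 1)
```

A difference of a single quantization step already fails the bit-exact check, even though it would pass a check that tolerates one step.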

## Supported components

**Models:** Any (`aww` and `resnet` used below)

**Frontends:** Any (`tflite` used below)

**Frameworks/Backends:** Any (`tvmaotplus` and `tflmi` used below)

**Platforms/Targets:** Any target supported by `mlif` or `espidf` platform

**Features:** The `validate` and `debug` platform features have to be enabled

## Prerequisites

Set up MLonMCU as usual, i.e. initialize an environment and install all required dependencies. Feel free to use the following minimal `environment.yml.j2` template:

```yaml
---
home: "{{ home_dir }}"
logging:
  level: DEBUG
  to_file: false
  rotate: false
cleanup:
  auto: true
  keep: 10
paths:
  deps: deps
  logs: logs
  results: results
  plugins: plugins
  temp: temp
  models:
    - "{{ home_dir }}/models"
    - "{{ config_dir }}/models"
repos:
  tensorflow:
    url: "https://github.com/tensorflow/tflite-micro.git"
    ref: f050eec7e32a0895f7658db21a4bdbd0975087a5
  tvm:
    url: "https://github.com/apache/tvm.git"
    ref: de6d8067754d746d88262c530b5241b5577b9aae
  etiss:
    url: "https://github.com/tum-ei-eda/etiss.git"
    ref: 4d2d26fb1fdb17e1da3a397c35d6f8877bf3ceab
  mlif:
    url: "https://github.com/tum-ei-eda/mlonmcu-sw.git"
    ref: 4b9a32659f7c5340e8de26a0b8c4135ca67d64ac
frameworks:
  default: tvm
  tflm:
    enabled: true
    backends:
      default: tflmi
      tflmi:
        enabled: true
        features: []
    features: []
  tvm:
    enabled: true
    backends:
      default: tvmaotplus
      tvmaotplus:
        enabled: true
        features: []
    features: []
frontends:
  tflite:
    enabled: true
    features: []
toolchains:
  gcc: true
platforms:
  mlif:
    enabled: true
    features:
      debug: true
      validate: true
targets:
  default: etiss_pulpino
  etiss_pulpino:
    enabled: true
    features: []
```

Do not forget to set your `MLONMCU_HOME` environment variable first if not using the default location!

## Usage

*Hint:* Because the program is built in debug mode and runs one inference for each provided input-output combination, the simulation time will likely increase by a significant factor. Add the `--parallel` flag to your command line to allow MLonMCU to run multiple simulations in parallel.

*Hint:* We are not able to provide reference data for every model in our model zoo. If you want to add reference data for your own models, see: TODO

### A) Command Line Interface

As an example, let's see if the `tflmi` and `tvmaotplus` backends produce different model outputs for the same model.

To enable validation, add `--feature debug --feature validate` (or the short form `-f debug -f validate`, as used here) to the command line:

In [1]:
!python -m mlonmcu.cli.main flow run aww resnet -b tflmi -b tvmaotplus -t etiss_pulpino -f debug -f validate

INFO - Loading environment cache from file
INFO - Successfully initialized cache


INFO -  Processing stage LOAD
INFO -  Processing stage BUILD


INFO -  Processing stage COMPILE


INFO -  Processing stage RUN


ERROR - A platform error occured during the simulation. Reason: OUTPUT_MISSMATCH


INFO - All runs completed successfuly!
INFO - Postprocessing session report
INFO - Done processing runs
INFO - Report:
   Session  Run   Model Frontend Framework     Backend Platform         Target  Total Cycles  Total Instructions  Total CPI  Total ROM  Total RAM  ROM read-only  ROM code  ROM misc  RAM data  RAM zero-init data  Validation           Features                                             Config Postprocesses Comment
0        0    0     aww   tflite      tflm       tflmi     mlif  etiss_pulpino    1412910849          1412910849        1.0     417540      36084         109424    307968       148      1764               34320        True  [debug, validate]  {'aww.output_shapes': {'Identity': [1, 12]}, '...            []       -
1        0    1     aww   tflite       tvm  tvmaotplus     mlif  etiss_pulpino     252817568           252817568        1.0     133126      59360          57816     75166       144      1760               57600        True  [debug, validate]  {'

Since we are building in debug mode, most of the reported metrics are not meaningful. Let's get rid of them using the `filter_cols` postprocess:

In [2]:
!python -m mlonmcu.cli.main flow run aww resnet -b tflmi -b tvmaotplus -t etiss_pulpino -f debug -f validate \
        --postprocess filter_cols --config filter_cols.keep="Model,Backend,Validation"

INFO - Loading environment cache from file
INFO - Successfully initialized cache


INFO - [session-1]  Processing stage LOAD


INFO - [session-1]  Processing stage BUILD


INFO - [session-1]  Processing stage COMPILE


INFO - [session-1]  Processing stage RUN


ERROR - A platform error occured during the simulation. Reason: OUTPUT_MISSMATCH


INFO - [session-1]  Processing stage POSTPROCESS


INFO - All runs completed successfuly!
INFO - Postprocessing session report
INFO - [session-1] Done processing runs
INFO - Report:
    Model     Backend  Validation
0     aww       tflmi        True
1     aww  tvmaotplus        True
2  resnet       tflmi        True
3  resnet  tvmaotplus       False


By inspecting the `Validation` column or the `OUTPUT_MISSMATCH` error printed earlier, you can see that one out of four validations has failed (at least at the time of testing this example). TVM not being bit-accurate for quantized models is a known issue that needs further investigation.
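If you want to pick out the failing combinations programmatically, the filtered report printed above can be parsed with a few lines of Python. This is only a sketch that works on the plain-text table as shown (whitespace-separated columns without spaces inside the values); the helper name `failed_validations` is hypothetical and not part of MLonMCU.

```python
REPORT = """\
    Model     Backend  Validation
0     aww       tflmi        True
1     aww  tvmaotplus        True
2  resnet       tflmi        True
3  resnet  tvmaotplus       False
"""


def failed_validations(report_text):
    """Return (model, backend) pairs whose Validation column is not True."""
    lines = [ln for ln in report_text.splitlines() if ln.strip()]
    header = lines[0].split()
    failed = []
    for line in lines[1:]:
        fields = line.split()[1:]  # drop the leading row index
        row = dict(zip(header, fields))
        if row["Validation"] != "True":
            failed.append((row["Model"], row["Backend"]))
    return failed


print(failed_validations(REPORT))  # [('resnet', 'tvmaotplus')]
```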

It is also possible to find out which model output caused the mismatch by looking at the simulation outputs:

In [3]:
!python -m mlonmcu.cli.main flow run resnet -b tvmaotplus -t etiss_pulpino -f debug -f validate \
        --config etiss_pulpino.print_outputs=1

INFO - Loading environment cache from file
INFO - Successfully initialized cache


INFO - [session-2]  Processing stage LOAD
INFO - [session-2]  Processing stage BUILD


INFO - [session-2]  Processing stage COMPILE


INFO - [session-2]  Processing stage RUN
=== Setting up configurations ===
Initializer::loadIni(): Ini sucessfully loaded /tmp/ValidateOutputs-RcEj/workspace/deps/install/etiss/examples/base.ini
Initializer::loadIni(): Ini sucessfully loaded /tmp/etiss_dynamic_aZ1f9oAxIa.ini
Initializer::loadIni(): Ini sucessfully loaded /tmp/tmp2kw4e79f/custom.ini
  Load Configs from .ini files:
ETISS: Info: Created new config container: global
ETISS: Info:   [BoolConfigurations]
ETISS: Info:     arch.enable_semihosting=false,
ETISS: Info:     arch.or1k.ignore_sr_iee=false,
ETISS: Info:     etiss.enable_dmi=true,
ETISS: Info:     etiss.load_integrated_libraries=true,
ETISS: Info:     etiss.log_pc=false,
ETISS: Info:     jit.debug=false,
ETISS: Info:     jit.gcc.cleanup=true,
ETISS: Info:     jit.verify=false,
ETISS: Info:     testing=false,
ETISS: Info:   [IntConfigurations]
ETISS: Info:     arch.or1k.if_stall_cycles=0,
ETISS: Info:     arch.rv32imacfdpv.mstatus_fs=1,
ETISS: Info: 

Program start.


Category 0: 0.03125
Category 1: 0.91015625
Category 2: 0.0078125
Category 3: 0.05078125
Category 4: 0
Category 5: 0
Category 6: 0
Category 7: 0
Category 8: 0
Category 9: 0.00390625
Predicted category: 1
MLIF: Wrong output in category 0! Expected 0.01953125
# Setup Cycles: 141
# Setup Instructions: 141
# Run Cycles: 687052903
# Run Instructions: 687052903
# Total Cycles: 1374132527
# Total Instructions: 1374132527
Program finish.
MLONMCU EXIT: 18


exit called with code: 18
CPU Time: 42.9447s    Simulation Time: 14.6125s
CPU Cycles (estimated): 1.37423e+09
MIPS (estimated): 94.0446
=== Simulation end ===

CPU0 exited with exception: 0x80000000: Finished cpu execution. This is the proper way to exit from etiss::CPUCore::execute.
ERROR - A platform error occured during the simulation. Reason: OUTPUT_MISSMATCH


heap starts at: 0x81a700
=== Results ===
ROM usage:        245.3 kB (0x3be48)
  read-only data: 170.5 kB (0x299d8)
  code:           74.7 kB (0x123e0)
  other required: 144 Bytes (0x90)
RAM usage:        108.3 kB (0x1a6f0) [stack and heap usage not included]
  data:           1.8 kB (0x6e0)
  zero-init data: 106.5 kB (0x1a010)
  stack:          unknown [missing trace file]
  heap:           unknown [missing trace file]


INFO - All runs completed successfuly!
INFO - Postprocessing session report
INFO - [session-2] Done processing runs
INFO - Report:
   Session  Run   Model Frontend Framework     Backend Platform         Target  Total Cycles  Total Instructions  Total CPI  Total ROM  Total RAM  ROM read-only  ROM code  ROM misc  RAM data  RAM zero-init data  Validation           Features                                             Config Postprocesses Comment
0        2    0  resnet   tflite       tvm  tvmaotplus     mlif  etiss_pulpino    1374132527          1374132527        1.0     245320     108272         170456     74720       144      1760              106512       False  [validate, debug]  {'resnet.output_shapes': {'Identity_int8': [1,...            []       -
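The relevant lines of the simulation output above can also be extracted automatically. The following sketch pairs each `MLIF: Wrong output in category ...` message with the value actually printed for that category. The regular expressions are derived solely from the log shown in this example and may need adjusting for other models or targets; the helper name `find_mismatches` is hypothetical.

```python
import re

# Excerpt of the simulation output shown above.
LOG = """\
Category 0: 0.03125
Category 1: 0.91015625
Category 2: 0.0078125
Category 3: 0.05078125
Category 4: 0
Category 5: 0
Category 6: 0
Category 7: 0
Category 8: 0
Category 9: 0.00390625
Predicted category: 1
MLIF: Wrong output in category 0! Expected 0.01953125
"""

CATEGORY_RE = re.compile(r"^Category (\d+): (\S+)$", re.MULTILINE)
WRONG_RE = re.compile(
    r"^MLIF: Wrong output in category (\d+)! Expected (\S+)$", re.MULTILINE
)


def find_mismatches(log_text):
    """Return a list of (category, actual, expected) tuples."""
    actual = {int(idx): float(val) for idx, val in CATEGORY_RE.findall(log_text)}
    return [
        (int(idx), actual.get(int(idx)), float(expected))
        for idx, expected in WRONG_RE.findall(log_text)
    ]


for category, got, expected in find_mismatches(LOG):
    print(f"category {category}: got {got}, expected {expected}, "
          f"difference {got - expected}")
# category 0: got 0.03125, expected 0.01953125, difference 0.01171875
```

Assuming an int8 output with a scale of 1/256 (an assumption, not stated in the log), this difference would correspond to three quantization steps.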
