In [1]:
from IPython.display import Code

# Example: How to target a UMA-integrated backend using MLonMCU

This example shows how to use MLonMCU to run a neural network model on a custom accelerator that has been integrated into TVM using UMA. The accelerator examples are Vanilla (https://tvm.apache.org/docs/tutorial/uma.html) and QVanilla, a quantized version of Vanilla.

## Supported components

**Models:** Any (depends on the data-type supported by the accelerator)

**Frontends:** Any (`tflite` used below)

**Frameworks/Backends:** TVM (`tvmaot` and `tvmaotplus`; `tvmaotplus` used below)

**Platforms/Targets:** `mlif` platform (`etiss` target used below)

**Features:** `uma_backends` and `vanilla_accelerator` 

## Prerequisites

Set up MLonMCU as usual, i.e. initialize an environment and install all required dependencies. To create the MLonMCU environment, use the existing UMA template as shown in the command below:

python -m mlonmcu.cli.main init /path/to/workspace -t uma

The UMA template (uma.yml.j2) is identical to the "environment.yml.j2" file shown below. As you can see, it requires a specific patch of TVM. It also clones one additional library, "etiss_accelerator_plugins", which contains the ETISS plugins that simulate the Vanilla and QVanilla accelerators.

In [2]:
Code(filename="environment.yml.j2")

Do not forget to set your `MLONMCU_HOME` environment variable first if not using the default location!

## Usage

The UMA backends for Vanilla and QVanilla have been added to the MLonMCU resources (mlonmcu/resources/frameworks/tvm/tvmc_extension), so you can use them directly as TVM backends in MLonMCU. This directory and the name defined for each backend are passed as the `uma_dir` and `uma_target` attributes of the `uma_backends` feature.

Below are three examples based on the accelerator models.
The first one targets Vanilla as a hardcoded C model, similar to the example in the UMA documentation.
Simulated models of Vanilla and QVanilla have also been developed; they act as memory-mapped peripherals of the ETISS target. The other two examples therefore show how to use the ETISS plugins to offload the supported patterns.

### A) Vanilla as a hardcoded C model.

You can use the umatest.tflite model as a float model for testing the Vanilla accelerator.

`uma_backends.uma_dir`= $(pwd)/../../../resources/frameworks/tvm

`uma_backends.uma_target`=vanilla_accelerator
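
The command line for this example follows a regular pattern: each feature is enabled with `-f` and each configuration key is passed as `-c key=value`. The following is a minimal sketch of assembling such a command in Python. The helper `build_flow_command` is hypothetical (it is not part of MLonMCU), and the `uma_dir` value is a placeholder path that you need to adapt to your checkout.

```python
import shlex


def build_flow_command(model, backend, target, features, config):
    """Assemble the argument list for `mlonmcu flow run` (hypothetical helper)."""
    cmd = ["python3", "-m", "mlonmcu.cli.main", "flow", "run", model,
           "--backend", backend, "--target", target]
    for feature in features:
        cmd += ["-f", feature]  # enable a feature
    for key, value in config.items():
        cmd += ["-c", f"{key}={value}"]  # set a configuration key
    return cmd


cmd = build_flow_command(
    "umatest",
    backend="tvmaotplus",
    target="etiss",
    features=["uma_backends"],
    config={
        "tvmaotplus.desired_layout": "NCHW",
        # Placeholder path (assumption): point this at your MLonMCU resources.
        "uma_backends.uma_dir": "/path/to/mlonmcu/resources/frameworks/tvm",
        "uma_backends.uma_target": "vanilla_accelerator",
    },
)
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` if you prefer launching the flow from a script instead of a notebook cell.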

In [3]:
!python3 -m mlonmcu.cli.main flow run umatest --backend tvmaotplus --target etiss -c tvmaotplus.desired_layout=NCHW -f uma_backends -c uma_backends.uma_dir=$(pwd)/../../../resources/frameworks/tvm -c uma_backends.uma_target=vanilla_accelerator

INFO - Loading environment cache from file
INFO - Successfully initialized cache


INFO -  Processing stage LOAD
INFO -  Processing stage BUILD


INFO -  Processing stage COMPILE


INFO -  Processing stage RUN


INFO - All runs completed successfuly!
INFO - Postprocessing session report
INFO - Done processing runs


INFO - Report:
   Session  Run    Model Frontend Framework     Backend Platform Target  Total Cycles  Total Instructions  Total CPI  Total ROM  Total RAM  ROM read-only  ROM code  ROM misc  RAM data  RAM zero-init data  Validation        Features                                             Config Postprocesses Comment
0        0    0  umatest   tflite       tvm  tvmaotplus     mlif  etiss       4055888             4055888        1.0     281524     223556         237960     43552        12      1728              221828        True  [uma_backends]  {'umatest.metadata_path': 'definition.yml', 'u...            []       -


### B) Vanilla as an ETISS plugin.

The file named `conv2dnchw.cc` in the UMA backend structure of each accelerator is the accelerator interface. It can be a hardcoded implementation of the supported operators (as in the previous example) or, for instance, a driver that configures the plugin's register interface. Both interface types (conv2dnchw.cc & conv2dnchw1.cc) are present in the directory, so please check which one is actually called (simply swap the file names if needed).

To use the respective plugin in ETISS, enable the `vanilla_accelerator` feature of the ETISS target and pass the name of the plugin via its `plugin_name` attribute.
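
Interchanging the two interface file names can be scripted. The sketch below swaps two files through a temporary name and demonstrates it in a throwaway directory. The helper `swap_files` is hypothetical, and the file contents are placeholders: which variant is stored under which name in your checkout is not assumed here.

```python
from pathlib import Path
import tempfile


def swap_files(a: Path, b: Path) -> None:
    """Exchange the names of two existing files via a temporary name."""
    tmp = a.with_name(a.name + ".swap")
    a.rename(tmp)
    b.rename(a)
    tmp.rename(b)


# Demonstration in a temporary directory with placeholder contents.
with tempfile.TemporaryDirectory() as d:
    backend_dir = Path(d)
    first = backend_dir / "conv2dnchw.cc"
    second = backend_dir / "conv2dnchw1.cc"
    first.write_text("// variant A (placeholder)\n")
    second.write_text("// variant B (placeholder)\n")

    swap_files(first, second)

    first_content = first.read_text()
    second_content = second.read_text()
    print(first_content, second_content, sep="")
```

In a real setup, `first` and `second` would point at the two files inside the accelerator's UMA backend directory.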

`uma_backends.uma_dir` = $(pwd)/../../../resources/frameworks/tvm

`uma_backends.uma_target` = vanilla_accelerator

`vanilla_accelerator.plugin_name` = VanillaAccelerator

In [4]:
!python3 -m mlonmcu.cli.main flow run umatest.tflite --backend tvmaotplus --target etiss -c tvmaotplus.desired_layout=NCHW -f uma_backends -c uma_backends.uma_dir=$(pwd)/../../../resources/frameworks/tvm -c uma_backends.uma_target=vanilla_accelerator -c etiss.print_outputs=1 -f vanilla_accelerator -c vanilla_accelerator.plugin_name=VanillaAccelerator

INFO - Loading environment cache from file
INFO - Successfully initialized cache


INFO - [session-1]  Processing stage LOAD
INFO - [session-1]  Processing stage BUILD


INFO - [session-1]  Processing stage COMPILE


INFO - [session-1]  Processing stage RUN


=== Setting up configurations ===
Initializer::loadIni(): Ini sucessfully loaded /tmp/TVM-UMA-IfkK/workspace/deps/install/etiss/examples/base.ini
Initializer::loadIni(): Ini sucessfully loaded /tmp/etiss_dynamic_5TPkyAMzRN.ini
Initializer::loadIni(): Ini sucessfully loaded /tmp/tmp743vde_b/custom.ini
  Load Configs from .ini files:
ETISS: Info: Created new config container: global
ETISS: Info:   [BoolConfigurations]
ETISS: Info:     arch.enable_semihosting=true,
ETISS: Info:     arch.or1k.ignore_sr_iee=false,
ETISS: Info:     etiss.enable_dmi=true,
ETISS: Info:     etiss.load_integrated_libraries=true,
ETISS: Info:     etiss.log_pc=false,
ETISS: Info:     jit.debug=false,
ETISS: Info:     jit.gcc.cleanup=true,
ETISS: Info:     jit.verify=false,
ETISS: Info:     testing=false,
ETISS: Info:   [IntConfigurations]
ETISS: Info:     arch.or1k.if_stall_cycles=0,
ETISS: Info:     arch.rv32imacfdpv.mstatus_fs=1,
ETISS: Info:     etiss.max_block_size=100,
ETISS: Info:     ETI

Program start.
# Setup Cycles: 42
# Setup Instructions: 42
# Run Cycles: 4055654
# Run Instructions: 4055654
# Total Cycles: 4055888
# Total Instructions: 4055888
Program finish.
MLONMCU EXIT: 0
CPU Time: 0.129764s    Simulation Time: 0.477722s
CPU Cycles (estimated): 4.15245e+06
MIPS (estimated): 8.69219
=== Simulation end ===

CPU0 exited with exception: 0x80000000: Finished cpu execution. This is the proper way to exit from etiss::CPUCore::execute.


heap starts at: 0x836960
=== Results ===
ROM usage:        281.5 kB (0x44bb4)
  read-only data: 238.0 kB (0x3a188)
  code:           43.6 kB (0xaa20)
  other required: 12 Bytes (0xc)
RAM usage:        223.6 kB (0x36944) [stack and heap usage not included]
  data:           1.7 kB (0x6c0)
  zero-init data: 221.8 kB (0x36284)
  stack:          unknown [missing trace file]
  heap:           unknown [missing trace file]


INFO - All runs completed successfuly!


INFO - Postprocessing session report
INFO - [session-1] Done processing runs
INFO - Report:
   Session  Run    Model Frontend Framework     Backend Platform Target  Total Cycles  Total Instructions  Total CPI  Total ROM  Total RAM  ROM read-only  ROM code  ROM misc  RAM data  RAM zero-init data  Validation                             Features                                             Config Postprocesses Comment
0        1    0  umatest   tflite       tvm  tvmaotplus     mlif  etiss       4055888             4055888        1.0     281524     223556         237960     43552        12      1728              221828        True  [uma_backends, vanilla_accelerator]  {'umatest.metadata_path': 'definition.yml', 'u...            []       -


Now you can see that the simulation starts by initializing the VanillaAccelerator plugin.

### C) QVanilla as an ETISS plugin.

As stated before, QVanilla is a quantized version of Vanilla that can perform quantized conv2d and bias addition. As with Vanilla, the convolution must have a stride of one and same padding. This accelerator has been implemented in ETISS as a zero-cycle model (QVanillaAccelerator) and as a model with timing considerations (QVanillaAcceleratorT).
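
To make the stride and padding constraint concrete, the sketch below checks whether a conv2d configuration has stride one and padding that preserves the spatial size. It uses the standard convolution output-size formula. The function names are hypothetical, and the actual pattern matching in the UMA backend may apply different or additional checks.

```python
def conv_out_size(size, kernel, stride, pad_total, dilation=1):
    """Standard conv output size along one spatial axis."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (size + pad_total - effective_kernel) // stride + 1


def meets_constraints(in_hw, kernel_hw, strides, pads_total):
    """True if all strides are 1 and the output keeps the input size (same padding)."""
    if any(s != 1 for s in strides):
        return False
    return all(
        conv_out_size(i, k, s, p) == i
        for i, k, s, p in zip(in_hw, kernel_hw, strides, pads_total)
    )


# 3x3 kernel, stride 1, one pixel of padding on each side (2 in total per axis).
print(meets_constraints((32, 32), (3, 3), (1, 1), (2, 2)))
# Same layer with stride 2 does not satisfy the constraint.
print(meets_constraints((32, 32), (3, 3), (2, 2), (2, 2)))
```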

Please make sure that you are using the right interface (conv2dnchw.cc) to configure the registers of the respective plugin.

Here, we use a very small model named qnn_model.tflite to test the flow, but feel free to use the existing quantized models in MLonMCU such as aww, vww, resnet, or any other model that contains operators supported by QVanilla.

`uma_backends.uma_dir` = $(pwd)/../../../resources/frameworks/tvm

`uma_backends.uma_target` = q_vanilla_accelerator

`vanilla_accelerator.plugin_name` = QVanillaAccelerator or QVanillaAcceleratorT

In [5]:
!python3 -m mlonmcu.cli.main flow run qnn_model.tflite --backend tvmaotplus --target etiss -c tvmaotplus.desired_layout=NCHW -f uma_backends -c uma_backends.uma_dir=$(pwd)/../../../resources/frameworks/tvm -c uma_backends.uma_target=q_vanilla_accelerator -c etiss.print_outputs=1 -f vanilla_accelerator -c vanilla_accelerator.plugin_name=QVanillaAcceleratorT

INFO - Loading environment cache from file
INFO - Successfully initialized cache


INFO - [session-2]  Processing stage LOAD
INFO - [session-2]  Processing stage BUILD


INFO - [session-2]  Processing stage COMPILE


INFO - [session-2]  Processing stage RUN
=== Setting up configurations ===
Initializer::loadIni(): Ini sucessfully loaded /tmp/TVM-UMA-IfkK/workspace/deps/install/etiss/examples/base.ini
Initializer::loadIni(): Ini sucessfully loaded /tmp/etiss_dynamic_z55P0ayTsd.ini
Initializer::loadIni(): Ini sucessfully loaded /tmp/tmpe9ocuare/custom.ini
  Load Configs from .ini files:
ETISS: Info: Created new config container: global
ETISS: Info:   [BoolConfigurations]
ETISS: Info:     arch.enable_semihosting=true,
ETISS: Info:     arch.or1k.ignore_sr_iee=false,
ETISS: Info:     etiss.enable_dmi=true,
ETISS: Info:     etiss.load_integrated_libraries=true,
ETISS: Info:     etiss.log_pc=false,
ETISS: Info:     jit.debug=false,
ETISS: Info:     jit.gcc.cleanup=true,
ETISS: Info:     jit.verify=false,
ETISS: Info:     testing=false,
ETISS: Info:   [IntConfigurations]
ETISS: Info:     arch.or1k.if_stall_cycles=0,
ETISS: Info:     arch.rv32imacfdpv.mstatus_fs=1,
ETISS: Info:     etiss

Program start.
start time= 311593750
start cpu cycle= 9971


# Setup Cycles: 42
# Setup Instructions: 42
# Run Cycles: 527029
# Run Instructions: 527029
# Total Cycles: 527262
# Total Instructions: 527262
Program finish.
MLONMCU EXIT: 0
CPU Time: 0.0171297s    Simulation Time: 0.367881s
CPU Cycles (estimated): 548150
MIPS (estimated): 1.49002
=== Simulation end ===

CPU0 exited with exception: 0x80000000: Finished cpu execution. This is the proper way to exit from etiss::CPUCore::execute.


heap starts at: 0x805760
=== Results ===
ROM usage:        247.1 kB (0x3c50c)
  read-only data: 204.2 kB (0x31dd0)
  code:           42.8 kB (0xa730)
  other required: 12 Bytes (0xc)
RAM usage:        22.3 kB (0x5744) [stack and heap usage not included]
  data:           1.7 kB (0x6c0)
  zero-init data: 20.6 kB (0x5084)
  stack:          unknown [missing trace file]
  heap:           unknown [missing trace file]


INFO - All runs completed successfuly!
INFO - Postprocessing session report
INFO - [session-2] Done processing runs
INFO - Report:
   Session  Run      Model Frontend Framework     Backend Platform Target  Total Cycles  Total Instructions  Total CPI  Total ROM  Total RAM  ROM read-only  ROM code  ROM misc  RAM data  RAM zero-init data  Validation                             Features                                             Config Postprocesses Comment
0        2    0  qnn_model   tflite       tvm  tvmaotplus     mlif  etiss        527262              527262        1.0     247052      22340         204240     42800        12      1728               20612        True  [vanilla_accelerator, uma_backends]  {'qnn_model.metadata_path': 'definition.yml', ...            []       -
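
The `# ... Cycles` and `# ... Instructions` lines printed by the simulated program can be collected for further analysis, for example to compare runs with different plugins. The following is a minimal sketch that parses these lines with a regular expression; the helper `parse_metrics` is hypothetical, and the sample text is copied from the QVanillaAcceleratorT output above.

```python
import re

# Matches lines such as "# Total Cycles: 527262".
METRIC_RE = re.compile(r"^# (?P<name>[A-Za-z ]+): (?P<value>\d+)$", re.MULTILINE)


def parse_metrics(log: str) -> dict:
    """Return a mapping from metric name to integer value."""
    return {m["name"]: int(m["value"]) for m in METRIC_RE.finditer(log)}


log = """Program start.
# Setup Cycles: 42
# Setup Instructions: 42
# Run Cycles: 527029
# Run Instructions: 527029
# Total Cycles: 527262
# Total Instructions: 527262
Program finish.
"""

metrics = parse_metrics(log)
cpi = metrics["Total Cycles"] / metrics["Total Instructions"]
print(metrics["Run Cycles"], cpi)
```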
