<a href="https://colab.research.google.com/github/uwsampl/tutorial/blob/master/notebook/06_TVM_Tutorial_MicroTVM.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Please run the following block to ensure TVM is set up for *this notebook*. Each notebook may have its own runtime.

In [1]:
try:
  import google.colab
  IN_COLAB = True
except ImportError:
  IN_COLAB = False

if IN_COLAB:
    ! gsutil cp "gs://tvm-fcrc-binariesd5fce43e-8373-11e9-bfb6-0242ac1c0002/tvm.tar.gz" /tmp/tvm.tar.gz
    ! mkdir -p /tvm
    ! tar -xf /tmp/tvm.tar.gz --strip-components=4 --directory /tvm
    ! ls -la /tvm
    ! bash /tvm/package.sh
    # Add TVM to the Python path.
    import sys
    sys.path.append('/tvm/python')
    sys.path.append('/tvm/topi/python')
    ! pip install mxnet
else:
    print("Notebook executing locally, skipping Colab setup ...")



Copying gs://tvm-fcrc-binariesd5fce43e-8373-11e9-bfb6-0242ac1c0002/tvm.tar.gz...
\
Operation completed over 1 objects/115.9 MiB.                                    
total 164
drwxr-xr-x 21 root root  4096 Jun 15 00:25 .
drwxr-xr-x  1 root root  4096 Jun 15 00:25 ..
drwx------  8 root root  4096 May 31 08:14 3rdparty
drwx------ 12 root root  4096 Jun 14 21:19 apps
drwx------  3 root root  4096 Jun 15 00:20 build
drwx------  4 root root  4096 Jun 14 21:19 cmake
-rw-------  1 root root 10778 Jun 14 21:19 CMakeLists.txt
drwx------  6 root root  4096 Jun 14 21:19 conda
-rw-------  1 root root  5736 Jun 14 21:19 CONTRIBUTORS.md
drwx------  3 root root  4096 Jun 14 21:19 docker
drwx------ 11 root root  4096 Jun 14 21:19 docs
drwx------  4 root root  4096 Jun 14 21:19 golang
drwx------  3 root root  4096 May 31 08:14 include
-rw-------  1 root root 10542 Jun 14 21:19 Jenkinsfile
drwx------  6 root root  4096 Jun 14 21:19 jvm
-rw-------  1 root root 11357 Jun 14 21:19 LICENSE
-rw-------  1 root

# MicroTVM

In this notebook, we're going to cover:

*   the C code generation backend
*   the graph runtime
*   a demo of ResNet on the graph runtime
*   the low-level device interface

TODO: Do we even show the low-level device interface?

First, we run some necessary imports.

In [0]:
import os

import numpy as np
import tvm
from tvm.contrib import graph_runtime, util
from tvm import relay
import tvm.micro as micro

## C Codegen

Let's take a look at the C codegen backend for a simple function.

In [3]:
# First, we construct the function.
ty = relay.TensorType(shape=(1024,), dtype="float32")
x = relay.var("x", ty)
y = relay.var("y", ty)
func = relay.Function([x, y], relay.add(x, y))

# Then build it with the appropriate configuration.
with tvm.build_config(disable_vectorize=True):
  _, src_mod, _ = relay.build(func, target="c", params={})

# Now, we can see the source that was generated.
print(src_mod.get_source())

#include "tvm/runtime/c_runtime_api.h"
#include "tvm/runtime/c_backend_api.h"
#include "tvm/runtime/micro/utvm_device_lib.h"
extern void* __tvm_module_ctx = NULL;
#ifdef __cplusplus
extern "C"
#endif
TVM_DLL int32_t fused_add( void* args,  void* arg_type_ids, int32_t num_args) {
  if (!((num_args == 3))) {
    TVMAPISetLastError("fused_add: num_args should be 3");
    return -1;
  }
  void* arg0 = (((TVMValue*)args)[0].v_handle);
  int32_t arg0_code = (( int32_t*)arg_type_ids)[0];
  void* arg1 = (((TVMValue*)args)[1].v_handle);
  int32_t arg1_code = (( int32_t*)arg_type_ids)[1];
  void* arg2 = (((TVMValue*)args)[2].v_handle);
  int32_t arg2_code = (( int32_t*)arg_type_ids)[2];
  float* placeholder = (float*)(((TVMArray*)arg0)[0].data);
  int64_t* arg0_shape = (int64_t*)(((TVMArray*)arg0)[0].shape);
  int64_t* arg0_strides = (int64_t*)(((TVMArray*)arg0)[0].strides);
  if (!(arg0_strides == NULL)) {
    if (!((1 == ((int32_t)arg0_strides[0])))) {
      TVMAPISetLastError("arg0.strides: e
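
The generated `fused_add` shown above receives an array of arguments, checks that exactly three were passed, and returns `-1` on error. A rough Python analogue may make that structure easier to follow. This is only a sketch: the name `fused_add_reference` is hypothetical, and treating the third argument as the output buffer is an assumption based on the two-input `add` being compiled.

```python
import numpy as np

def fused_add_reference(args):
    """Rough Python analogue of the generated C `fused_add` (illustrative only)."""
    # The generated code rejects calls that do not pass exactly three tensors.
    if len(args) != 3:
        return -1
    lhs, rhs, out = args  # assumption: two inputs followed by one output buffer
    # The C kernel loops over the elements; NumPy does the same in one call.
    np.add(lhs, rhs, out=out)
    return 0

a = np.arange(1024, dtype="float32")
b = np.ones(1024, dtype="float32")
c = np.empty(1024, dtype="float32")
status = fused_add_reference([a, b, c])
```

The remaining checks in the generated source (strides, shapes, type codes) guard the same call before the arithmetic runs.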

In [0]:
def build_c_module(func):
  with tvm.build_config(disable_vectorize=True):
    _, src_mod, _ = relay.build(func, target="c", params={})
  return src_mod

In [0]:
func = relay.Function([], relay.const(1.0))
print(build_c_module(func).get_source())

TVMError: ignored

Try building some other functions and see what source is generated.

In [0]:
# TODO: Try another function.
func = ...
print(build_c_module(func).get_source())

## Graph Runtime Execution

The primary execution strategy for running models with MicroTVM is the graph runtime.



In [8]:
DEVICE_TYPE = "host"
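# Prefix prepended to binutils tool names when cross-compiling (for example,
# "riscv64-unknown-elf-"); it is empty here because the host's native toolchain is used.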
BINUTIL_PREFIX = ""

shape = (1024,)
dtype = "float32"
    
# Construct Relay program.
x = relay.var("x", relay.TensorType(shape=shape, dtype=dtype))
xx = relay.multiply(x, x)
z = relay.add(xx, relay.const(1.0))
func = relay.Function([x], z)

with micro.Session(DEVICE_TYPE, BINUTIL_PREFIX) as sess:
    mod, params = sess.micro_build(func)

    mod.set_input(**params)
    x_in = np.random.uniform(size=shape[0]).astype(dtype)
    print(f"input: {x_in}")
    mod.run(x=x_in)
    result = mod.get_output(0).asnumpy()

    print(f"result: {result}")

input: [0.29276216 0.39650485 0.71247035 ... 0.8835988  0.80035686 0.70527506]
result: [1.0857097 1.1572161 1.507614  ... 1.7807468 1.6405711 1.4974129]
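
One way to sanity-check the output is to compare it against a NumPy reference of the same computation, `x * x + 1`. The sketch below does this for the first few values printed above; the helper name `reference` is hypothetical, and the tolerance is an assumption chosen for `float32` arithmetic.

```python
import numpy as np

def reference(x):
    # Same computation as the Relay program: x * x + 1.0
    return x * x + np.float32(1.0)

x_in = np.random.uniform(size=1024).astype("float32")
expected = reference(x_in)

# First elements of the input and result printed above.
sample_in = np.array([0.29276216, 0.39650485, 0.71247035], dtype="float32")
sample_out = np.array([1.0857097, 1.1572161, 1.507614], dtype="float32")
np.testing.assert_allclose(reference(sample_in), sample_out, rtol=1e-5)
```

Inside the session, the same check could be applied to the full output with `np.testing.assert_allclose(result, reference(x_in), rtol=1e-5)`.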


## ResNet-18 Demo

Before we get into the mechanics of MicroTVM, we start with a motivating example.  In the code block below, we perform image recognition on a picture of a cat using ResNet-18.

First, we import the model from MXNet's model zoo and use TVM's MXNet model importer.

In [0]:
import ast

import mxnet as mx
from mxnet.gluon.model_zoo.vision import get_model
from mxnet.gluon.utils import download

# Fetch a mapping from class IDs to human-readable labels.
synset_url = "".join(["https://gist.githubusercontent.com/zhreshold/",
                      "4d0b62f3d01426887599d4f7ede23ee5/raw/",
                      "596b27d23537e5a1b5751d2b0481ef172f58b539/",
                      "imagenet1000_clsid_to_human.txt"])
synset_name = "synset.txt"
download(synset_url, synset_name)
with open(synset_name) as f:
    # The file holds a Python dict literal; literal_eval parses it without executing code.
    synset = ast.literal_eval(f.read())

block = get_model("resnet18_v1", pretrained=True)
# The model input is a single 224x224 RGB image in NCHW layout.
func, params = relay.frontend.from_mxnet(
    block, shape={"data": (1, 3, 224, 224)})

ModuleNotFoundError: ignored

Now, we download a picture of a cat and convert it to a format the model can understand.

In [0]:
from mxnet.gluon.utils import download
from PIL import Image

dtype = "float32"

img_name = "cat.png"
download("https://github.com/dmlc/mxnet.js/blob/master/data/cat.png?raw=true",
         img_name)
image = Image.open(img_name).resize((224, 224))
image = np.array(image) - np.array([123., 117., 104.])
image /= np.array([58.395, 57.12, 57.375])
image = image.transpose((2, 0, 1))
image = image[np.newaxis, :]
image = tvm.nd.array(image.astype(dtype))

# TODO: print image
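
The preprocessing steps above can be tried without downloading anything. The sketch below applies the same per-channel normalization and layout change to a synthetic image; the function name `preprocess` and the constant-valued test image are hypothetical stand-ins, and the code assumes a three-channel input.

```python
import numpy as np

def preprocess(rgb):
    """Normalize an HxWx3 image and convert it to a float32 NCHW batch."""
    mean = np.array([123.0, 117.0, 104.0])
    std = np.array([58.395, 57.12, 57.375])
    out = (np.asarray(rgb, dtype="float64") - mean) / std  # per-channel normalize
    out = out.transpose((2, 0, 1))                          # HWC -> CHW
    return out[np.newaxis, :].astype("float32")             # add batch axis

# A synthetic 224x224 RGB image stands in for the resized cat picture.
fake_image = np.full((224, 224, 3), 128, dtype="uint8")
batch = preprocess(fake_image)
print(batch.shape, batch.dtype)  # (1, 3, 224, 224) float32
```

The resulting shape matches the `(batch, channels, height, width)` layout the ResNet-18 model expects.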

With that, we can create a MicroTVM session, build a graph runtime module, then run the model.


In [0]:
DEVICE_TYPE = "host"
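# Prefix prepended to binutils tool names when cross-compiling (for example,
# "riscv64-unknown-elf-"); it is empty here because the host's native toolchain is used.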
BINUTIL_PREFIX = ""

with micro.Session(DEVICE_TYPE, BINUTIL_PREFIX) as sess:
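    # NOTE: `relay_micro_build` is not defined in this notebook; the earlier
    # graph runtime cell builds its module with `sess.micro_build(func)`.
    mod, params = relay_micro_build(func, BINUTIL_PREFIX, params=params)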
    # Set model weights.
    mod.set_input(**params)
    # Execute with `image` as the input.
    mod.run(data=image)
    # Get outputs.
    tvm_output = mod.get_output(0)

    prediction_idx = np.argmax(tvm_output.asnumpy()[0])
    prediction = synset[prediction_idx]
    print(prediction)
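
Taking the `argmax` reports only the single best class. A small extension, sketched below, lists the top few candidates instead. The helper `top_k`, the scores, and the labels are all hypothetical stand-ins for `tvm_output.asnumpy()[0]` and `synset`.

```python
import numpy as np

def top_k(scores, labels, k=5):
    """Return the k highest-scoring (label, score) pairs, best first."""
    order = np.argsort(scores)[::-1][:k]
    return [(labels[int(i)], float(scores[i])) for i in order]

# Hypothetical stand-ins for the model output and the class-label mapping.
scores = np.array([0.1, 2.5, -0.3, 1.7], dtype="float32")
labels = {0: "goldfish", 1: "tabby cat", 2: "beagle", 3: "tiger cat"}
best = top_k(scores, labels, k=2)
```

The first entry of `best` always agrees with the `argmax` lookup used in the cell above.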

## The Low-Level Device Interface

The low-level device interface is used by TVM to communicate with MicroDevices. It exposes three important functions to interact with the MicroDevice: 
- Read from device memory.
- Write to device memory.
- Start function execution.


The example below simulates a MicroDevice with a region of memory allocated on the host machine. `HostLowLevelDevice` gives MicroTVM users an easy test setup to experiment with, without having to connect to and debug an external MicroDevice. The `LowLevelDevice` interface can be implemented for any MicroDevice, for example over JTAG or a network connection, so that TVM can run on your own device.

TODO: Add prints that are only on the `tutorial` branch?


```
#include <sys/mman.h>
#include <cstring>
#include "low_level_device.h"
#include "micro_common.h"

namespace tvm {
namespace runtime {
/*!
 * \brief emulated low-level device on host machine
 */
class HostLowLevelDevice final : public LowLevelDevice {
 public:
  /*!
   * \brief constructor to initialize on-host memory region to act as device
   * \param num_bytes size of the emulated on-device memory region
   */
  explicit HostLowLevelDevice(size_t num_bytes)
    : size_(num_bytes) {
    size_t size_in_pages = (num_bytes + kPageSize - 1) / kPageSize;
    int mmap_prot = PROT_READ | PROT_WRITE | PROT_EXEC;
    int mmap_flags = MAP_ANONYMOUS | MAP_PRIVATE;
    base_addr_ = DevBaseAddr(
      (reinterpret_cast<std::uintptr_t>(
        mmap(nullptr, size_in_pages * kPageSize, mmap_prot, mmap_flags, -1, 0))));
  }

  /*!
   * \brief destructor to deallocate on-host device region
   */
  ~HostLowLevelDevice() {
    munmap(base_addr_.cast_to<void*>(), size_);
  }

  void Write(DevBaseOffset offset,
             void* buf,
             size_t num_bytes) final {
    void* addr = (offset + base_addr_).cast_to<void*>();
    std::memcpy(addr, buf, num_bytes);
  }

  void Read(DevBaseOffset offset,
            void* buf,
            size_t num_bytes) final {
    void* addr = (offset + base_addr_).cast_to<void*>();
    std::memcpy(buf, addr, num_bytes);
  }

  void Execute(DevBaseOffset func_offset, DevBaseOffset breakpoint) final {
    DevAddr func_addr = func_offset + base_addr_;
    reinterpret_cast<void (*)(void)>(func_addr.value())();
  }

  DevBaseAddr base_addr() const final {
    return base_addr_;
  }

  const char* device_type() const final {
    return "host";
  }

 private:
  /*! \brief base address of the micro device memory region */
  DevBaseAddr base_addr_;
  /*! \brief size of memory region */
  size_t size_;
};

const std::shared_ptr<LowLevelDevice> HostLowLevelDeviceCreate(size_t num_bytes) {
  std::shared_ptr<LowLevelDevice> lld =
      std::make_shared<HostLowLevelDevice>(num_bytes);
  return lld;
}
}  // namespace runtime
}  // namespace tvm
```
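
To experiment with the read and write halves of this interface from Python, a toy analogue backed by a `bytearray` is enough. The sketch below is not part of TVM: the class name `ToyHostDevice` is hypothetical, `PAGE_SIZE = 4096` is an assumed stand-in for `kPageSize`, and `Execute` is omitted because a `bytearray` cannot hold runnable machine code.

```python
PAGE_SIZE = 4096  # assumption: stands in for kPageSize in the C++ code

class ToyHostDevice:
    """Toy analogue of HostLowLevelDevice backed by a bytearray."""

    def __init__(self, num_bytes):
        # Round the request up to a whole number of pages, as the constructor does.
        pages = (num_bytes + PAGE_SIZE - 1) // PAGE_SIZE
        self.memory = bytearray(pages * PAGE_SIZE)

    def write(self, offset, buf):
        # Copy host bytes into "device" memory at the given offset.
        self.memory[offset:offset + len(buf)] = buf

    def read(self, offset, num_bytes):
        # Copy bytes out of "device" memory.
        return bytes(self.memory[offset:offset + num_bytes])

dev = ToyHostDevice(5000)
dev.write(16, b"\x01\x02\x03\x04")
data = dev.read(16, 4)
```

A request for 5000 bytes is rounded up to two pages, and the bytes written at offset 16 are read back unchanged.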

# Homework: RISC-V

To use RISC-V as a target device, there are some extra steps that can't be done within this notebook.

First, you will need to download and compile [TVM](https://github.com/dmlc/tvm) on your own machine.

Next, you will need [Spike](https://github.com/riscv/riscv-isa-sim) (a RISC-V ISA simulator) and [OpenOCD](https://github.com/ntfreak/openocd) (provides a high-level debugging interface to compatible devices).

TODO: Flesh out.