[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/uwsampl/tutorial/blob/master/notebook/06_TVM_Tutorial_MicroTVM.ipynb)

Please run the following block to ensure TVM is set up for *this notebook*. Each notebook may have its own runtime.

In [0]:
! gsutil cp "gs://tvm-fcrc-binariesd5fce43e-8373-11e9-bfb6-0242ac1c0002/tvm.tar.gz" /tmp/tvm.tar.gz
! mkdir -p /tvm
! tar -xf /tmp/tvm.tar.gz --strip-components=4 --directory /tvm
! ls -la /tvm
# Move this block after we are done with pkg step
! bash /tvm/package.sh
import sys
sys.path.append('/tvm/python')
sys.path.append('/tvm/topi/python')

# MicroTVM


#### TODO: should we add some basic info about MicroTVM itself? People might miss things during the talk.

First, we run some necessary imports.


In [0]:
import os

import numpy as np
import tvm
from tvm.contrib import graph_runtime, util
from tvm import relay
import tvm.micro as micro


## C Codegen

Let's take a look at the C codegen backend for a simple function.

In [0]:
# First, we construct the function.
ty = relay.TensorType(shape=(1024,), dtype="float32")
x = relay.var("x", ty)
y = relay.var("y", ty)
# Both `x` and `y` must be parameters; otherwise `y` is a free variable.
func = relay.Function([x, y], relay.add(x, y))

# Then build it with the appropriate configuration.
with tvm.build_config(disable_vectorize=True):
  _, src_mod, _ = relay.build(func, target="c", params={})

# Now, we can see the source that was generated.
print(src_mod.get_source())

In [0]:
def build_c_module(func):
  with tvm.build_config(disable_vectorize=True):
    _, src_mod, _ = relay.build(func, target="c", params={})
  return src_mod

In [0]:
func = relay.Function([], relay.const(1.0))
print(build_c_module(func).get_source())

Try building some other functions and see what source is generated.

In [0]:
# TODO: Try another function.
func = ...
print(build_c_module(func).get_source())

## ResNet-18 Demo

Before we get into the mechanics of MicroTVM, we start with a motivating example.  In the code block below, we perform image recognition on a picture of a cat using ResNet-18.

First, we import the model from MXNet's model zoo and use TVM's MXNet model importer.

In [0]:
import ast

import mxnet as mx
from mxnet.gluon.model_zoo.vision import get_model
from mxnet.gluon.utils import download

# Fetch a mapping from class IDs to human-readable labels.
synset_url = "".join(["https://gist.githubusercontent.com/zhreshold/",
                      "4d0b62f3d01426887599d4f7ede23ee5/raw/",
                      "596b27d23537e5a1b5751d2b0481ef172f58b539/",
                      "imagenet1000_clsid_to_human.txt"])
synset_name = "synset.txt"
download(synset_url, synset_name)
with open(synset_name) as f:
    # The file holds a Python dict literal; parse it without executing code.
    synset = ast.literal_eval(f.read())
    
block = get_model("resnet18_v1", pretrained=True)
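# The input image is prepared in the next cell, so its NCHW shape
# (batch, channels, height, width) is written out explicitly here.
func, params = relay.frontend.from_mxnet(
    block, shape={"data": (1, 3, 224, 224)})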

Now, we download a picture of a cat and convert it to a format the model can understand.

In [0]:

from mxnet.gluon.utils import download
from PIL import Image

dtype = "float32"

# Read raw image and preprocess into the format ResNet can work on.
img_name = "cat.png"
download("https://github.com/dmlc/mxnet.js/blob/master/data/cat.png?raw=true",
         img_name)
image = Image.open(img_name).resize((224, 224))
image = np.array(image) - np.array([123., 117., 104.])
image /= np.array([58.395, 57.12, 57.375])
image = image.transpose((2, 0, 1))
image = image[np.newaxis, :]
image = tvm.nd.array(image.astype(dtype))
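
The preprocessing above can be checked without downloading anything. The following is a minimal sketch that runs the same normalization and layout change on a synthetic array, assuming a 3-channel RGB input. The names `fake_image`, `normalized`, and `batch` are illustrative and not part of the notebook.

```python
import numpy as np

# Synthetic stand-in for a 224x224 RGB image in (height, width, channel) layout.
rng = np.random.default_rng(0)
fake_image = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)

# Per-channel mean and standard deviation, as used in the cell above.
mean = np.array([123., 117., 104.])
std = np.array([58.395, 57.12, 57.375])

# Broadcasting applies mean/std along the last (channel) axis.
normalized = (fake_image - mean) / std

# HWC -> CHW, then add a leading batch axis to get NCHW.
batch = normalized.transpose((2, 0, 1))[np.newaxis, :].astype("float32")

print(batch.shape, batch.dtype)
```

The resulting shape is what the model importer expects for its `data` input: one image, three channels, 224 by 224 pixels.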

With that, we can create a MicroTVM session, build a graph runtime module, then run the model.


In [0]:
DEVICE_TYPE = "host"
BINUTIL_PREFIX = ""

def relay_micro_build(func, binutil_prefix, params=None):
    """Build `func` with the C backend and wrap it in a graph runtime module."""
    with tvm.build_config(disable_vectorize=True):
        graph, c_mod, params = relay.build(func, target="c", params=params)

    # Use the `binutil_prefix` argument rather than the global constant.
    micro_mod = micro.create_micro_mod(c_mod, binutil_prefix)
    ctx = tvm.micro_dev(0)
    mod = graph_runtime.create(graph, micro_mod, ctx)
    return mod, params
  

with micro.Session(DEVICE_TYPE, BINUTIL_PREFIX) as sess:
    mod, params = relay_micro_build(func, BINUTIL_PREFIX, params=params)
    # Set model weights.
    mod.set_input(**params)
    # Execute with `image` as the input.
    mod.run(data=image)
    # Get output.
    tvm_output = mod.get_output(0)

    prediction_idx = np.argmax(tvm_output.asnumpy()[0])
    prediction = synset[prediction_idx]
    assert prediction == "tiger cat"
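
The cell above only checks the single best class. As a rough sketch of how the raw output scores could be turned into a ranked list, the snippet below applies a softmax and picks the top entries. It uses made-up scores and a four-entry label table; `softmax`, `top_k`, `toy_labels`, and `toy_scores` are hypothetical names for illustration, not part of TVM or the notebook.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(scores, labels, k=3):
    """Return the k most likely (label, probability) pairs, best first."""
    probs = softmax(scores)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(labels[i], probs[i]) for i in order[:k]]


# Made-up data standing in for `synset` and the model output.
toy_labels = {0: "tabby", 1: "tiger cat", 2: "Egyptian cat", 3: "lynx"}
toy_scores = [2.0, 3.5, 1.0, -0.5]

ranked = top_k(toy_scores, toy_labels, k=2)
for name, prob in ranked:
    print(f"{name}: {prob:.3f}")
```

With the real model, `toy_scores` would be replaced by the first row of `tvm_output.asnumpy()` and `toy_labels` by `synset`.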

## The Low-Level Device Interface

The low-level device interface is used by TVM to communicate with MicroDevices. It exposes three important functions to interact with the MicroDevice: 
- Read from device memory.
- Write to device memory.
- Start function execution.


Below is an example which uses an allocated region of memory on the host machine to simulate a MicroDevice. This HostLowLevelDevice provides MicroTVM users with an easy test setup to experiment with, without having to communicate and debug external MicroDevices. However, the LowLevelDevice interface can be implemented for any MicroDevice, for example over JTAG or a network connection, which will enable you to run TVM on your own MicroDevice.


In [0]:
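#include <sys/mman.h>
#include <cstdint>  // std::uintptr_t
#include <cstring>  // std::memcpy
#include <memory>   // std::shared_ptr, std::make_shared
#include "low_level_device.h"
#include "micro_common.h"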

namespace tvm {
namespace runtime {
/*!
 * \brief emulated low-level device on host machine
 */
class HostLowLevelDevice final : public LowLevelDevice {
 public:
  /*!
   * \brief constructor to initialize on-host memory region to act as device
   * \param num_bytes size of the emulated on-device memory region
   */
  explicit HostLowLevelDevice(size_t num_bytes)
    : size_(num_bytes) {
    size_t size_in_pages = (num_bytes + kPageSize - 1) / kPageSize;
    int mmap_prot = PROT_READ | PROT_WRITE | PROT_EXEC;
    int mmap_flags = MAP_ANONYMOUS | MAP_PRIVATE;
    base_addr_ = DevBaseAddr(
      (reinterpret_cast<std::uintptr_t>(
        mmap(nullptr, size_in_pages * kPageSize, mmap_prot, mmap_flags, -1, 0))));
  }

  /*!
   * \brief destructor to deallocate on-host device region
   */
  ~HostLowLevelDevice() {
    munmap(base_addr_.cast_to<void*>(), size_);
  }

  void Write(DevBaseOffset offset,
             void* buf,
             size_t num_bytes) final {
    void* addr = (offset + base_addr_).cast_to<void*>();
    std::memcpy(addr, buf, num_bytes);
  }

  void Read(DevBaseOffset offset,
            void* buf,
            size_t num_bytes) final {
    void* addr = (offset + base_addr_).cast_to<void*>();
    std::memcpy(buf, addr, num_bytes);
  }

  void Execute(DevBaseOffset func_offset, DevBaseOffset breakpoint) final {
    DevAddr func_addr = func_offset + base_addr_;
    reinterpret_cast<void (*)(void)>(func_addr.value())();
  }

  DevBaseAddr base_addr() const final {
    return base_addr_;
  }

  const char* device_type() const final {
    return "host";
  }

 private:
  /*! \brief base address of the micro device memory region */
  DevBaseAddr base_addr_;
  /*! \brief size of memory region */
  size_t size_;
};

const std::shared_ptr<LowLevelDevice> HostLowLevelDeviceCreate(size_t num_bytes) {
  std::shared_ptr<LowLevelDevice> lld =
      std::make_shared<HostLowLevelDevice>(num_bytes);
  return lld;
}
}  // namespace runtime
}  // namespace tvm
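
To make the three operations concrete, here is a toy Python analogue of the host device. It is only a sketch under stated assumptions: memory is a `bytearray`, the page size is assumed to be 4096 bytes, and "execution" looks up a Python callable registered at an offset instead of jumping to machine code. `ToyLowLevelDevice`, `PAGE_SIZE`, and `add_one` are hypothetical names, not part of TVM.

```python
PAGE_SIZE = 4096  # assumed page size; the C++ code uses kPageSize


class ToyLowLevelDevice:
    """Toy stand-in for a low-level device, backed by a bytearray."""

    def __init__(self, num_bytes):
        # Round the region up to a whole number of pages (ceiling division).
        num_pages = (num_bytes + PAGE_SIZE - 1) // PAGE_SIZE
        self.memory = bytearray(num_pages * PAGE_SIZE)
        # Hypothetical: maps an offset to a Python callable, not machine code.
        self.functions = {}

    def _check(self, offset, num_bytes):
        if offset < 0 or num_bytes < 0 or offset + num_bytes > len(self.memory):
            raise ValueError("access outside device memory")

    def write(self, offset, buf):
        """Copy `buf` into device memory starting at `offset`."""
        self._check(offset, len(buf))
        self.memory[offset:offset + len(buf)] = buf

    def read(self, offset, num_bytes):
        """Return `num_bytes` bytes of device memory starting at `offset`."""
        self._check(offset, num_bytes)
        return bytes(self.memory[offset:offset + num_bytes])

    def execute(self, func_offset):
        """Run the callable registered at `func_offset`."""
        self.functions[func_offset](self)


def add_one(dev):
    # "Device code": increment the 32-bit little-endian integer at offset 0.
    value = int.from_bytes(dev.read(0, 4), "little")
    dev.write(0, (value + 1).to_bytes(4, "little"))


dev = ToyLowLevelDevice(5000)
dev.functions[64] = add_one
dev.write(0, (41).to_bytes(4, "little"))   # write input to device memory
dev.execute(64)                            # start function execution
result = int.from_bytes(dev.read(0, 4), "little")  # read the result back
print(result, len(dev.memory))
```

The host side never touches device state directly: it writes inputs, starts execution, and reads results back, which is the same pattern a JTAG or network-backed device would follow.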

# Extra Credit: RISC-V

TODO: Fill in steps to download and install RISC-V tools

# Graveyard

## Graph Runtime Execution

The primary execution strategy for running models with MicroTVM is the graph runtime.  Here's a helper function for building a graph runtime module from a Relay function.

In [0]:
def micro_build_graph_runtime(func, device_type, params=None):
    """Create a graph runtime module with a micro device context."""
    with tvm.build_config(disable_vectorize=True):
        with relay.build_config(opt_level=3):
            graph, host_mod, params = relay.build(func, target="c",
                                                  params=params)

    micro_mod = micro.from_source_module(host_mod, device_type)
    ctx = tvm.micro_dev(0)
    graph_mod = graph_runtime.create(graph, micro_mod, ctx)
    return graph_mod, params

Being able to construct a graph runtime module means we can now plug in existing models and they just work.

In [0]:
from tvm.relay.testing import resnet

resnet_func, params = resnet.get_workload(num_classes=10,
                                          num_layers=18,
                                          image_shape=(3, 32, 32))
# Remove the final softmax layer, because uTVM does not currently support
# it.
resnet_func_no_sm = relay.Function(resnet_func.params,
                                   resnet_func.body.args[0],
                                   resnet_func.ret_type)

# For now, we use the "host" emulated device for execution.  In the next
# section, we'll implement our own simple low-level device.
device_type = "host"
micro.init(device_type)
mod, params = micro_build_graph_runtime(resnet_func_no_sm, device_type,
                                        params=params)
# Set the model weights.
mod.set_input(**params)
# Generate random input.
data = np.random.uniform(size=mod.get_input(0).shape)
mod.run(data=data)
result = mod.get_output(0).asnumpy()
# We gave a random input, so all we want is a result with some nonzero
# entries.
assert result.sum() != 0.0