[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/uwsampl/tutorial/blob/master/notebook/02_TVM_Tutorial_Relay.ipynb)

Please run the following block to ensure TVM is set up for *this notebook*; each notebook may have its own runtime.

In [57]:
try:
    import google.colab
    IN_COLAB = True
except ImportError:
    IN_COLAB = False

if IN_COLAB:
    ! gsutil cp "gs://tvm-fcrc-binariesd5fce43e-8373-11e9-bfb6-0242ac1c0002/tvm.tar.gz" /tmp/tvm.tar.gz
    ! mkdir -p /tvm
    ! tar -xf /tmp/tvm.tar.gz --strip-components=4 --directory /tvm
    ! ls -la /tvm
    ! bash /tvm/package.sh
    # Add TVM to the Python path.
    import sys
    sys.path.append('/tvm/python')
    sys.path.append('/tvm/topi/python')
else:
    print("Notebook executing locally, skipping Colab setup ...")

Notebook executing locally, skipping Colab setup ...


# Relay: an Extensible Deep Learning IR

Last year TVM introduced Relay IR, a second-generation high-level IR for deep learning.

Relay's design comes from a simple insight: the critical difference between regular IRs
and deep learning IRs is the primitive values they manipulate. Relay combines well-known
insights from the programming languages community with TVM's existing
infrastructure to provide state-of-the-art performance.

If you are familiar with ideas from programming languages or with existing computation graph
representations, we will connect Relay to that knowledge during this tutorial.

We will first cover the design of Relay, then elaborate on how one can use it to
accomplish a wide variety of tasks. This piece of the tutorial focuses directly
on Relay, but Relay will be present throughout all of the content today because it serves
as the interface layer to TVM.

...

In [58]:
import tvm
from tvm import relay
import tvm.relay.testing
from tvm.relay.expr_functor import ExprMutator
import torch
import torchvision
import onnx

## Language 

We will briefly introduce the concepts of Relay below before showing how to use Relay to accomplish specific tasks.
You can find a full language specification [here](https://docs.tvm.ai/langref/index.html).

### Variables 

In [59]:
# A single Relay variable; the string is only a name hint.
x = relay.var('x')

# A Relay variable with an explicit dtype (the default is float32).
x = relay.var('x', dtype='int32')

# A Relay variable with an explicit shape.
x = relay.var('x', shape=(10, 1))

### Operators

Relay provides high-performance operators defined in TVM that implement the primitive operations needed by deep learning applications. Operators can be applied to arguments just like regular Python or C++ functions. Common arithmetic operations are provided both by name and through operator overloading.

Variables can be used to construct Relay *expressions*, which replace the concept of graphs present in previous frameworks. A Relay expression can be viewed much like a graph with extra functionality, as we will see later in this tutorial.

In [60]:
w = relay.op.add(x, x)
print(w)

v0.0.1
free_var %x: Tensor[(10, 1), float32]
add(%x, %x)


In [61]:
z = x + x
print(z)

v0.0.1
free_var %x: Tensor[(10, 1), float32]
add(%x, %x)


### Functions

The fundamental packaging of computation in Relay is the function. A function combines a set of inputs with a Relay expression. From one point of view, a Relay function is no different from a function in today's programming languages; from another, it replaces named subgraphs.

In [62]:
f = relay.Function([x], z)
print(f)

v0.0.1
fn (%x: Tensor[(10, 1), float32]) {
  add(%x, %x)
}


### Module

Finally, we can give functions a global name and package many of them together into a module. When we add a function to the module, it is type checked beforehand.

When we print the module, the program appears annotated with all of its type information.

In [63]:
mod = relay.Module({})
fname = relay.GlobalVar('f')
mod[fname] = f

print(mod)

v0.0.1
def @f(%x: Tensor[(10, 1), float32]) -> Tensor[(10, 1), float32] {
  add(%x, %x) /* ty=Tensor[(10, 1), float32] */
}



## Frontends

Relay comes with a variety of frontends and supports most major frameworks, including TensorFlow, PyTorch, MXNet, ONNX, Keras and Caffe2.

Below we provide a couple of examples of using these frontends to import models into Relay.

You can find specific tutorials on deploying pretrained models below:

- [ONNX](https://docs.tvm.ai/tutorials/frontend/from_onnx.html#sphx-glr-tutorials-frontend-from-onnx-py)
- [TensorFlow](https://docs.tvm.ai/tutorials/frontend/from_tensorflow.html#sphx-glr-tutorials-frontend-from-tensorflow-py)
- [Keras](https://docs.tvm.ai/tutorials/frontend/from_keras.html#sphx-glr-tutorials-frontend-from-keras-py)
- [PyTorch](https://tvm.ai/2019/05/30/pytorch-frontend)
- [Caffe2](https://docs.tvm.ai/tutorials/frontend/from_caffe2.html#sphx-glr-tutorials-frontend-from-caffe2-py)

In [66]:
torch_resnet18 = torchvision.models.resnet18()
dummy_input = torch.randn(10, 3, 224, 224)
torch.onnx.export(torch_resnet18, dummy_input, "resnet.onnx", verbose=True)
onnx_resnet18 = onnx.load('resnet.onnx')
func, params = relay.frontend.from_onnx(onnx_resnet18, shape={ '0': (10, 3, 224, 224) })
print(func)

graph(%0 : Float(10, 3, 224, 224)
      %1 : Float(64, 3, 7, 7)
      %2 : Float(64)
      %3 : Float(64)
      %4 : Float(64)
      %5 : Float(64)
      %6 : Long()
      %7 : Float(64, 64, 3, 3)
      %8 : Float(64)
      %9 : Float(64)
      %10 : Float(64)
      %11 : Float(64)
      %12 : Long()
      %13 : Float(64, 64, 3, 3)
      %14 : Float(64)
      %15 : Float(64)
      %16 : Float(64)
      %17 : Float(64)
      %18 : Long()
      %19 : Float(64, 64, 3, 3)
      %20 : Float(64)
      %21 : Float(64)
      %22 : Float(64)
      %23 : Float(64)
      %24 : Long()
      %25 : Float(64, 64, 3, 3)
      %26 : Float(64)
      %27 : Float(64)
      %28 : Float(64)
      %29 : Float(64)
      %30 : Long()
      %31 : Float(128, 64, 3, 3)
      %32 : Float(128)
      %33 : Float(128)
      %34 : Float(128)
      %35 : Float(128)
      %36 : Long()
      %37 : Float(128, 128, 3, 3)
      %38 : Float(128)
      %39 : Float(128)
      %40 : Float(128)
      %41 : Float(128)
      %42 :



TVMError: Traceback (most recent call last):
  [bt] (8) 9   libtvm.dylib                        0x0000000115285fdd tvm::relay::backend::RelayBuildModule::GetFunction(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::shared_ptr<tvm::runtime::ModuleNode> const&)::'lambda1'(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)::operator()(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const + 429
  [bt] (7) 8   libtvm.dylib                        0x00000001152861f2 tvm::relay::backend::RelayBuildModule::Build(tvm::relay::Function, tvm::Map<tvm::Integer, tvm::Target, void, void> const&, tvm::Target const&) + 130
  [bt] (6) 7   libtvm.dylib                        0x00000001152863f9 tvm::relay::backend::RelayBuildModule::BuildRelay(tvm::relay::Function, std::__1::unordered_map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, tvm::runtime::NDArray, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, tvm::runtime::NDArray> > > const&) + 345
  [bt] (5) 6   libtvm.dylib                        0x00000001153542ca tvm::relay::ModuleNode::FromExpr(tvm::relay::Expr const&, tvm::Map<tvm::relay::GlobalVar, tvm::relay::Function, void, void> const&) + 938
  [bt] (4) 5   libtvm.dylib                        0x0000000115352b17 tvm::relay::ModuleNode::Add(tvm::relay::GlobalVar const&, tvm::relay::Function const&, bool) + 151
  [bt] (3) 4   libtvm.dylib                        0x00000001156315b8 tvm::relay::InferType(tvm::relay::Function const&, tvm::relay::Module const&, tvm::relay::GlobalVar const&) + 472
  [bt] (2) 3   libtvm.dylib                        0x0000000115630687 tvm::relay::TypeInferencer::Infer(tvm::relay::Expr) + 135
  [bt] (1) 2   libtvm.dylib                        0x000000011531ee23 tvm::relay::ErrorReporter::RenderErrors(tvm::relay::Module const&, bool) + 5555
  [bt] (0) 1   libtvm.dylib                        0x0000000114ed5949 dmlc::LogMessageFatal::~LogMessageFatal() + 57
  [bt] (8) 9   libtvm.dylib                        0x0000000115352b17 tvm::relay::ModuleNode::Add(tvm::relay::GlobalVar const&, tvm::relay::Function const&, bool) + 151
  [bt] (7) 8   libtvm.dylib                        0x00000001156315b8 tvm::relay::InferType(tvm::relay::Function const&, tvm::relay::Module const&, tvm::relay::GlobalVar const&) + 472
  [bt] (6) 7   libtvm.dylib                        0x000000011563066b tvm::relay::TypeInferencer::Infer(tvm::relay::Expr) + 107
  [bt] (5) 6   libtvm.dylib                        0x000000011564d05a tvm::relay::TypeSolver::Solve() + 1114
  [bt] (4) 5   libtvm.dylib                        0x000000011564d6f8 tvm::TypedEnvFunc<bool (tvm::Array<tvm::relay::Type, void> const&, int, tvm::Attrs const&, tvm::relay::TypeReporter const&)>::operator()(tvm::Array<tvm::relay::Type, void> const&, int, tvm::Attrs const&, tvm::relay::TypeReporter const&) const + 328
  [bt] (3) 4   libtvm.dylib                        0x00000001153a6ba9 std::__1::__function::__func<void tvm::runtime::TypedPackedFunc<bool (tvm::Array<tvm::relay::Type, void> const&, int, tvm::Attrs const&, tvm::relay::TypeReporter const&)>::AssignTypedLambda<bool (*)(tvm::Array<tvm::relay::Type, void> const&, int, tvm::Attrs const&, tvm::relay::TypeReporter const&)>(bool (*)(tvm::Array<tvm::relay::Type, void> const&, int, tvm::Attrs const&, tvm::relay::TypeReporter const&))::'lambda'(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*), std::__1::allocator<void tvm::runtime::TypedPackedFunc<bool (tvm::Array<tvm::relay::Type, void> const&, int, tvm::Attrs const&, tvm::relay::TypeReporter const&)>::AssignTypedLambda<bool (*)(tvm::Array<tvm::relay::Type, void> const&, int, tvm::Attrs const&, tvm::relay::TypeReporter const&)>(bool (*)(tvm::Array<tvm::relay::Type, void> const&, int, tvm::Attrs const&, tvm::relay::TypeReporter const&))::'lambda'(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)>, void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)>::operator()(tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&) + 137
  [bt] (2) 3   libtvm.dylib                        0x00000001153a6c4f void tvm::runtime::detail::unpack_call_dispatcher<bool, 0, 4, bool (*)(tvm::Array<tvm::relay::Type, void> const&, int, tvm::Attrs const&, tvm::relay::TypeReporter const&)>::run<tvm::runtime::TVMArgValue, tvm::runtime::TVMArgValue, tvm::runtime::TVMArgValue, tvm::runtime::TVMArgValue>(bool (* const&)(tvm::Array<tvm::relay::Type, void> const&, int, tvm::Attrs const&, tvm::relay::TypeReporter const&), tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*, tvm::runtime::TVMArgValue&&, tvm::runtime::TVMArgValue&&, tvm::runtime::TVMArgValue&&, tvm::runtime::TVMArgValue&&) + 95
  [bt] (1) 2   libtvm.dylib                        0x00000001154b479e tvm::relay::ConcatenateRel(tvm::Array<tvm::relay::Type, void> const&, int, tvm::Attrs const&, tvm::relay::TypeReporter const&) + 1918
  [bt] (0) 1   libtvm.dylib                        0x0000000114ed5949 dmlc::LogMessageFatal::~LogMessageFatal() + 57
  File "/Users/jroesch/Git/tvm/src/relay/ir/error.cc", line 132
TVMError:
Error(s) have occurred. We have annotated the program with them:

In `main`:
v0.0.1
fn (%v0: Tensor[(10, 3, 224, 224), float32]) {
  %0 = nn.conv2d(%v0, meta[relay.Constant][0], strides=[2, 2], padding=[3, 3], kernel_size=[7, 7])
  %1 = nn.batch_norm(%0, meta[relay.Constant][1], meta[relay.Constant][2], meta[relay.Constant][3], meta[relay.Constant][4], epsilon=1e-05)
  %2 = %1.0
  %3 = nn.relu(%2)
  %4 = nn.max_pool2d(%3, pool_size=[3, 3], strides=[2, 2], padding=[1, 1])
  %5 = nn.conv2d(%4, meta[relay.Constant][5], padding=[1, 1], kernel_size=[3, 3])
  %6 = nn.batch_norm(%5, meta[relay.Constant][6], meta[relay.Constant][7], meta[relay.Constant][8], meta[relay.Constant][9], epsilon=1e-05)
  %7 = %6.0
  %8 = nn.relu(%7)
  %9 = nn.conv2d(%8, meta[relay.Constant][10], padding=[1, 1], kernel_size=[3, 3])
  %10 = nn.batch_norm(%9, meta[relay.Constant][11], meta[relay.Constant][12], meta[relay.Constant][13], meta[relay.Constant][14], epsilon=1e-05)
  %11 = %10.0
  %12 = add(%11, %4)
  %13 = nn.relu(%12)
  %14 = nn.conv2d(%13, meta[relay.Constant][15], padding=[1, 1], kernel_size=[3, 3])
  %15 = nn.batch_norm(%14, meta[relay.Constant][16], meta[relay.Constant][17], meta[relay.Constant][18], meta[relay.Constant][19], epsilon=1e-05)
  %16 = %15.0
  %17 = nn.relu(%16)
  %18 = nn.conv2d(%17, meta[relay.Constant][20], padding=[1, 1], kernel_size=[3, 3])
  %19 = nn.batch_norm(%18, meta[relay.Constant][21], meta[relay.Constant][22], meta[relay.Constant][23], meta[relay.Constant][24], epsilon=1e-05)
  %20 = %19.0
  %21 = add(%20, %13)
  %22 = nn.relu(%21)
  %23 = nn.conv2d(%22, meta[relay.Constant][25], strides=[2, 2], padding=[1, 1], kernel_size=[3, 3])
  %24 = nn.batch_norm(%23, meta[relay.Constant][26], meta[relay.Constant][27], meta[relay.Constant][28], meta[relay.Constant][29], epsilon=1e-05)
  %25 = %24.0
  %26 = nn.relu(%25)
  %27 = nn.conv2d(%26, meta[relay.Constant][30], padding=[1, 1], kernel_size=[3, 3])
  %28 = nn.batch_norm(%27, meta[relay.Constant][31], meta[relay.Constant][32], meta[relay.Constant][33], meta[relay.Constant][34], epsilon=1e-05)
  %29 = %28.0
  %30 = nn.conv2d(%22, meta[relay.Constant][35], strides=[2, 2], kernel_size=[1, 1])
  %31 = nn.batch_norm(%30, meta[relay.Constant][36], meta[relay.Constant][37], meta[relay.Constant][38], meta[relay.Constant][39], epsilon=1e-05)
  %32 = %31.0
  %33 = add(%29, %32)
  %34 = nn.relu(%33)
  %35 = nn.conv2d(%34, meta[relay.Constant][40], padding=[1, 1], kernel_size=[3, 3])
  %36 = nn.batch_norm(%35, meta[relay.Constant][41], meta[relay.Constant][42], meta[relay.Constant][43], meta[relay.Constant][44], epsilon=1e-05)
  %37 = %36.0
  %38 = nn.relu(%37)
  %39 = nn.conv2d(%38, meta[relay.Constant][45], padding=[1, 1], kernel_size=[3, 3])
  %40 = nn.batch_norm(%39, meta[relay.Constant][46], meta[relay.Constant][47], meta[relay.Constant][48], meta[relay.Constant][49], epsilon=1e-05)
  %41 = %40.0
  %42 = add(%41, %34)
  %43 = nn.relu(%42)
  %44 = nn.conv2d(%43, meta[relay.Constant][50], strides=[2, 2], padding=[1, 1], kernel_size=[3, 3])
  %45 = nn.batch_norm(%44, meta[relay.Constant][51], meta[relay.Constant][52], meta[relay.Constant][53], meta[relay.Constant][54], epsilon=1e-05)
  %46 = %45.0
  %47 = nn.relu(%46)
  %48 = nn.conv2d(%47, meta[relay.Constant][55], padding=[1, 1], kernel_size=[3, 3])
  %49 = nn.batch_norm(%48, meta[relay.Constant][56], meta[relay.Constant][57], meta[relay.Constant][58], meta[relay.Constant][59], epsilon=1e-05)
  %50 = %49.0
  %51 = nn.conv2d(%43, meta[relay.Constant][60], strides=[2, 2], kernel_size=[1, 1])
  %52 = nn.batch_norm(%51, meta[relay.Constant][61], meta[relay.Constant][62], meta[relay.Constant][63], meta[relay.Constant][64], epsilon=1e-05)
  %53 = %52.0
  %54 = add(%50, %53)
  %55 = nn.relu(%54)
  %56 = nn.conv2d(%55, meta[relay.Constant][65], padding=[1, 1], kernel_size=[3, 3])
  %57 = nn.batch_norm(%56, meta[relay.Constant][66], meta[relay.Constant][67], meta[relay.Constant][68], meta[relay.Constant][69], epsilon=1e-05)
  %58 = %57.0
  %59 = nn.relu(%58)
  %60 = nn.conv2d(%59, meta[relay.Constant][70], padding=[1, 1], kernel_size=[3, 3])
  %61 = nn.batch_norm(%60, meta[relay.Constant][71], meta[relay.Constant][72], meta[relay.Constant][73], meta[relay.Constant][74], epsilon=1e-05)
  %62 = %61.0
  %63 = add(%62, %55)
  %64 = nn.relu(%63)
  %65 = nn.conv2d(%64, meta[relay.Constant][75], strides=[2, 2], padding=[1, 1], kernel_size=[3, 3])
  %66 = nn.batch_norm(%65, meta[relay.Constant][76], meta[relay.Constant][77], meta[relay.Constant][78], meta[relay.Constant][79], epsilon=1e-05)
  %67 = %66.0
  %68 = nn.relu(%67)
  %69 = nn.conv2d(%68, meta[relay.Constant][80], padding=[1, 1], kernel_size=[3, 3])
  %70 = nn.batch_norm(%69, meta[relay.Constant][81], meta[relay.Constant][82], meta[relay.Constant][83], meta[relay.Constant][84], epsilon=1e-05)
  %71 = %70.0
  %72 = nn.conv2d(%64, meta[relay.Constant][85], strides=[2, 2], kernel_size=[1, 1])
  %73 = nn.batch_norm(%72, meta[relay.Constant][86], meta[relay.Constant][87], meta[relay.Constant][88], meta[relay.Constant][89], epsilon=1e-05)
  %74 = %73.0
  %75 = add(%71, %74)
  %76 = nn.relu(%75)
  %77 = nn.conv2d(%76, meta[relay.Constant][90], padding=[1, 1], kernel_size=[3, 3])
  %78 = nn.batch_norm(%77, meta[relay.Constant][91], meta[relay.Constant][92], meta[relay.Constant][93], meta[relay.Constant][94], epsilon=1e-05)
  %79 = %78.0
  %80 = nn.relu(%79)
  %81 = nn.conv2d(%80, meta[relay.Constant][95], padding=[1, 1], kernel_size=[3, 3])
  %82 = nn.batch_norm(%81, meta[relay.Constant][96], meta[relay.Constant][97], meta[relay.Constant][98], meta[relay.Constant][99], epsilon=1e-05)
  %83 = %82.0
  %84 = add(%83, %76)
  %85 = nn.relu(%84)
  %86 = nn.global_avg_pool2d(%85)
  %87 = shape_of(%86, dtype="int32")
  %88 = take(%87, int64(0), axis=0)
  %89 = expand_dims(%88, axis=0)
  %90 = expand_dims(int64(-1), axis=0)
  %91 = (%89, %90)
  concatenate(%91) an internal invariant was violated while typechecking your program [14:25:05] /Users/jroesch/Git/tvm/src/relay/op/tensor/transform.cc:204: Check failed: e_dtype == dtype (int64 vs. int32) : relay.concatenate requires all tensors have the same dtype;
}
// meta data omitted. you can use show_meta_data=True to include meta data


## Text Format

## Executing Relay


## Pass Manager

Relay has a flexible and configurable pass manager with an elegant API which can be used to easily compose and schedule pass pipelines. We believe an easy-to-configure pipeline is important to enable intelligent exploration between a variety of 


## Optimizations

Defining optimizations to transform your program is straightforward in Relay.

For example, let's define a constant evaluator for Relay.
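The idea behind a constant evaluator can be sketched without TVM at all. The block below is a toy in plain Python, not Relay's API: the `Const`, `Var`, `Add`, and `Mul` classes and the `fold_constants` function are hypothetical names introduced only for illustration. It walks an expression bottom-up and replaces any operator whose inputs are all constants with the computed value, which is the same shape of rewrite an `ExprMutator`-style pass would perform on Relay expressions.

```python
from dataclasses import dataclass
from typing import Union


# A hypothetical miniature expression IR (not Relay's own classes).
@dataclass(frozen=True)
class Const:
    value: float


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Add:
    lhs: "Expr"
    rhs: "Expr"


@dataclass(frozen=True)
class Mul:
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[Const, Var, Add, Mul]


def fold_constants(expr: Expr) -> Expr:
    """Rewrite bottom-up, evaluating operators whose inputs are all constants."""
    if isinstance(expr, (Const, Var)):
        return expr
    lhs = fold_constants(expr.lhs)
    rhs = fold_constants(expr.rhs)
    if isinstance(lhs, Const) and isinstance(rhs, Const):
        if isinstance(expr, Add):
            return Const(lhs.value + rhs.value)
        return Const(lhs.value * rhs.value)
    # At least one input is only known at run time, so keep the operator.
    return type(expr)(lhs, rhs)


# (2 + 3) * x: the addition can be evaluated, the multiplication cannot.
folded = fold_constants(Mul(Add(Const(2.0), Const(3.0)), Var("x")))
print(folded)
```

Because the rewrite recurses into children before inspecting the parent, nested constant subexpressions collapse in a single traversal.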

## Quantization

## Heterogeneous Execution

Relay supports a high-level interface for scheduling computation across multiple heterogeneous devices. An interesting property of this pass is that it is not special: it is built using Relay's standard machinery for passes.

We implement this by using an annotation to mark which computations we would like to schedule on which device, 
and a pass inserts all the appropriate calls to synchronize memory across devices. 
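To make the annotate-then-rewrite flow concrete, here is a toy sketch in plain Python that is independent of TVM. The `Op` tuple, the `annotate` and `insert_device_copies` helpers, and the `device_copy(...)` naming are all hypothetical; the sketch assumes a simple linear chain of operators rather than a full Relay expression, and it only illustrates the idea of inserting a transfer wherever the device changes.

```python
from typing import List, NamedTuple


class Op(NamedTuple):
    name: str
    device: str  # hypothetical device label, e.g. "cpu" or "gpu"


def annotate(names: List[str], default: str = "cpu") -> List[Op]:
    """Annotation step: mark every conv2d for the GPU, everything else default."""
    return [Op(n, "gpu" if n == "conv2d" else default) for n in names]


def insert_device_copies(ops: List[Op]) -> List[Op]:
    """Rewrite step: add a copy wherever consecutive ops run on different devices."""
    result: List[Op] = []
    for op in ops:
        if result and result[-1].device != op.device:
            src, dst = result[-1].device, op.device
            # The copy is recorded on the destination device in this toy model.
            result.append(Op(f"device_copy({src}->{dst})", dst))
        result.append(op)
    return result


chain = annotate(["conv2d", "relu", "conv2d", "softmax"])
scheduled = insert_device_copies(chain)
for op in scheduled:
    print(op.device, op.name)
```

Keeping annotation and copy insertion as two separate steps mirrors the description above: the user only states where computations should run, and the rewrite derives the necessary transfers.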

The pass below uses this machinery to schedule all convolutions onto the GPU.

In [52]:
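class ScheduleConv2d(ExprMutator):
    """Annotate every nn.conv2d call so that it runs on the given device."""

    def __init__(self, device):
        self.device = device
        super().__init__()

    def visit_call(self, expr):
        # Rewrite the callee and arguments first, then decide whether to annotate.
        visit = super().visit_call(expr)
        if expr.op == tvm.relay.op.get("nn.conv2d"):
            return relay.annotation.on_device(visit, self.device)
        else:
            return visit

def schedule_conv2d_on_gpu(expr):
    # Place all convolutions on the first GPU.
    sched = ScheduleConv2d(tvm.gpu(0))
    return sched.visit(expr)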

In [53]:
# TODO(@jroesch): use pass manager
resnet, params = relay.testing.resnet.get_workload()
print(resnet)
resnet = schedule_conv2d_on_gpu(resnet)
print(resnet)
resnet = relay.ir_pass.rewrite_annotated_ops(resnet, 0)
print(resnet)

v0.0.1
fn (%data: Tensor[(1, 3, 224, 224), float32], %bn_data_gamma: Tensor[(3,), float32], %bn_data_beta: Tensor[(3,), float32], %bn_data_moving_mean: Tensor[(3,), float32], %bn_data_moving_var: Tensor[(3,), float32], %conv0_weight: Tensor[(64, 3, 7, 7), float32], %bn0_gamma: Tensor[(64,), float32], %bn0_beta: Tensor[(64,), float32], %bn0_moving_mean: Tensor[(64,), float32], %bn0_moving_var: Tensor[(64,), float32], %stage1_unit1_bn1_gamma: Tensor[(64,), float32], %stage1_unit1_bn1_beta: Tensor[(64,), float32], %stage1_unit1_bn1_moving_mean: Tensor[(64,), float32], %stage1_unit1_bn1_moving_var: Tensor[(64,), float32], %stage1_unit1_conv1_weight: Tensor[(64, 64, 3, 3), float32], %stage1_unit1_bn2_gamma: Tensor[(64,), float32], %stage1_unit1_bn2_beta: Tensor[(64,), float32], %stage1_unit1_bn2_moving_mean: Tensor[(64,), float32], %stage1_unit1_bn2_moving_var: Tensor[(64,), float32], %stage1_unit1_conv2_weight: Tensor[(64, 64, 3, 3), float32], %stage1_unit1_sc_weight: Tensor[(64, 64, 1, 1)

## Virtual Machine

Relay supports three execution mechanisms. Of these, the VM provides the best balance between performance and expressivity, but it is also the newest and is still alpha quality. We are working on shipping the remaining pieces of the VM over the next few weeks, with a beta version coming soon.

Details about the VM can be found here..., but users can use the VM today from the high-level executor interface in Python, and directly in C++.


## Ahead of time compilation
A final example of how what 

## VTA
...