# Architecture

![image](https://github.com/dmlc/web-data/raw/master/tvm/tutorial/tvm_support_list.png)

![image](https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2017/10/04/nnvm-1.gif)

## NNVM – Computation graph intermediate representation (IR) stack

1. NNVM represents workloads from different frameworks as standardized computation graphs and then translates these high-level graphs into execution graphs.
2. The computation graph is expressed in a framework-agnostic format, inspired by the layer definitions in Keras and the tensor operators in NumPy.
3. NNVM also ships with routines, called passes following the LLVM convention, that manipulate these graphs. A pass either adds new attributes to the graph that are needed to execute it, or modifies the graph to improve efficiency.
4. The NNVM compiler compiles a high-level computation graph into optimized machine code.
5. NNVM provides the specification of the computation graph and its operators, together with graph optimization routines. The operators themselves are implemented and optimized for the target hardware using TVM.
6. The compiler is reported to match and even outperform state-of-the-art performance on two radically different hardware targets: ARM CPUs and NVIDIA GPUs.
7. NNVM is a runtime-agnostic compiler.

NNVM provides:

1. An interface for graph definition
2. An optimizer
3. A runtime for kernel functions on various hardware
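
The two kinds of passes mentioned above (one that only attaches attributes, one that rewrites the graph) can be sketched in plain Python. This is a toy model under stated assumptions: `Node`, `infer_shape`, and `eliminate_common_subexpr` are hypothetical names invented for illustration and are not part of the NNVM API.

```python
# Toy illustration only: these names are hypothetical and are NOT the NNVM API.
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str                                    # unique node name
    op: str                                      # "variable" or an operator name
    inputs: list = field(default_factory=list)   # names of the input nodes
    attrs: dict = field(default_factory=dict)    # attributes attached by passes


def infer_shape(nodes, input_shapes):
    """Pass that only adds a "shape" attribute; the graph structure is unchanged."""
    by_name = {n.name: n for n in nodes}
    for node in nodes:  # nodes are assumed to be listed in topological order
        if node.op == "variable":
            node.attrs["shape"] = input_shapes[node.name]
        elif node.op == "elemwise_add":
            lhs, rhs = (by_name[i].attrs["shape"] for i in node.inputs)
            assert lhs == rhs, "elemwise_add needs equal input shapes"
            node.attrs["shape"] = lhs
    return nodes


def eliminate_common_subexpr(nodes, outputs):
    """Pass that rewrites the graph: nodes with the same op and inputs are merged."""
    seen = {}      # (op, inputs) -> name of the first node that computes it
    renamed = {}   # name of a removed duplicate -> name of the surviving node
    kept = []
    for node in nodes:
        inputs = [renamed.get(i, i) for i in node.inputs]
        if node.op != "variable":
            key = (node.op, tuple(inputs))
            if key in seen:
                renamed[node.name] = seen[key]
                continue
            seen[key] = node.name
        kept.append(Node(node.name, node.op, inputs, dict(node.attrs)))
    return kept, [renamed.get(o, o) for o in outputs]


# c = (x + y) + (x + y), where the sum x + y is built twice as "a" and "b".
graph = [
    Node("x", "variable"),
    Node("y", "variable"),
    Node("a", "elemwise_add", ["x", "y"]),
    Node("b", "elemwise_add", ["x", "y"]),
    Node("c", "elemwise_add", ["a", "b"]),
]
optimized, outputs = eliminate_common_subexpr(graph, ["c"])
infer_shape(optimized, {"x": (4,), "y": (4,)})
for node in optimized:
    print(node.name, node.op, node.inputs, node.attrs)
```

After the rewriting pass the duplicate node `b` is gone and `c` reads `a` twice; the shape pass then annotates every remaining node without changing the structure.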

## TVM – Tensor IR stack

1. TVM implements the operators used in computation graphs and optimizes them for the target backend hardware.
2. Unlike NNVM, it provides a hardware-independent, domain-specific language that simplifies operator implementation at the tensor index level.
3. TVM also offers scheduling primitives, such as multi-threading, tiling, and caching, to optimize the computation so that it fully utilizes the hardware resources.
4. These schedules are hardware-dependent. They can either be hand-coded, or an optimized schedule can be searched for automatically.
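
To make the idea of a schedule concrete, the sketch below writes the same matrix multiplication with two loop structures: a plain one and a tiled one. It is a pure-Python toy, not the TVM scheduling API; the function names and the `tile` parameter are assumptions made for illustration. In TVM the computation is declared once and the loop structure is derived by applying primitives, whereas here both loop nests are written out by hand.

```python
# Toy illustration only: plain Python loops, not the TVM scheduling API.

def matmul_naive(a, b):
    """Reference loop structure: three plain nested loops."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                c[i][j] += a[i][p] * b[p][j]
    return c


def matmul_tiled(a, b, tile=2):
    """Same computation, with the i and j loops split into tiles.

    Each (tile x tile) block of the output is finished before the next one
    starts. On real hardware this loop order is what improves cache reuse.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for i0 in range(0, n, tile):                          # outer loop over row tiles
        for j0 in range(0, m, tile):                      # outer loop over column tiles
            for i in range(i0, min(i0 + tile, n)):        # rows inside the tile
                for j in range(j0, min(j0 + tile, m)):    # columns inside the tile
                    for p in range(k):
                        c[i][j] += a[i][p] * b[p][j]
    return c


a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
b = [[1, 0, 2], [0, 1, 0], [3, 0, 1]]
# The schedule changes the iteration order, never the result.
print(matmul_tiled(a, b, tile=2) == matmul_naive(a, b))
```

The result is identical for any tile size; only the order in which the output elements are visited differs, which is why the best choice depends on the hardware.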

![image](https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2017/10/07/nnvm-2-2.gif)

## Let's walk through a sample program

### TensorFlow Example

```python
import tensorflow as tf

a = tf.Variable([1, 2, 3, 4])
b = tf.Variable([4, 4, 4, 4])
c = tf.add(a, b)
print(c)
```

```
tf.Tensor([5 6 7 8], shape=(4,), dtype=int32)
```


### NNVM Example

Create the graph using ```nnvm```:

```python
import nnvm
import nnvm.symbol as sym

x = sym.Variable("x")
y = sym.Variable("y")
z = sym.elemwise_add(x, y)
compute_graph = nnvm.graph.create(z)
print(compute_graph.ir())
```

```
Graph(%x, %y) {
  %2 = elemwise_add(%x, %y)
  ret %2
}
```


Compile the graph using ```nnvm```:

```python
shape = (4,)
deploy_graph, lib, params = nnvm.compiler.build(
    compute_graph, target="llvm", shape={"x": shape}, dtype="float32"
)
```

Deploy the graph on the target CPU or GPU:

```python
import tvm
from tvm.contrib import graph_runtime

# tvm.cpu(0) matches the "llvm" target used when the graph was compiled
module = graph_runtime.create(deploy_graph, lib, tvm.cpu(0))
```

Prepare the input data as NumPy arrays:

```python
import numpy as np

x_np = np.array([1, 2, 3, 4]).astype("float32")
y_np = np.array([4, 4, 4, 4]).astype("float32")
```

Set the inputs of the model:

```python
module.set_input(x=x_np, y=y_np)
```

```
<tvm.contrib.graph_runtime.GraphModule at 0x10bce93d0>
```

Run the graph and read back the first output:

```python
module.run()
out = module.get_output(0, out=tvm.nd.empty(shape))
print(out.asnumpy())
```

```
[5. 6. 7. 8.]
```
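
The set-input, run, get-output sequence used above can be mimicked by a small pure-Python executor. The sketch below is only a conceptual stand-in: `ToyGraphModule` and its `KERNELS` table are hypothetical and do not reflect how `tvm.contrib.graph_runtime` is implemented, which runs compiled kernels rather than Python functions.

```python
# Toy illustration only: a hypothetical stand-in, not tvm.contrib.graph_runtime.

class ToyGraphModule:
    """Executes a list of (output_name, op, input_names) steps in order."""

    # One Python "kernel" per operator; a real runtime calls compiled code here.
    KERNELS = {"elemwise_add": lambda x, y: [p + q for p, q in zip(x, y)]}

    def __init__(self, steps, outputs):
        self.steps = steps        # execution graph in topological order
        self.outputs = outputs    # names of the graph outputs
        self.values = {}          # storage for inputs and intermediate results

    def set_input(self, **inputs):
        self.values.update(inputs)

    def run(self):
        for name, op, input_names in self.steps:
            args = [self.values[i] for i in input_names]
            self.values[name] = self.KERNELS[op](*args)

    def get_output(self, index):
        return self.values[self.outputs[index]]


toy_module = ToyGraphModule([("z", "elemwise_add", ["x", "y"])], outputs=["z"])
toy_module.set_input(x=[1.0, 2.0, 3.0, 4.0], y=[4.0, 4.0, 4.0, 4.0])
toy_module.run()
result = toy_module.get_output(0)
print(result)  # [5.0, 6.0, 7.0, 8.0]
```

Keeping the graph, the kernels, and the input values separate is what lets the same deployed graph run repeatedly with new inputs.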
