# Instrumentation Tool Example

In this notebook, we build an instrumentation tool with the Amanda framework step by step.
The example demonstrates how to implement instrumentation tools with Amanda's APIs and apply them to different DNN models.

First, install Amanda and its dependencies by following the installation instructions in the [README](../../../README.md).


## Prepare a CNN model

We start by defining a simple convolutional neural network (CNN) model with the [PyTorch](https://pytorch.org/) machine learning library.

In [1]:
import torch
import torch.nn as nn

class ConvNeuralNet(nn.Module):
    def __init__(self, num_classes):
        super(ConvNeuralNet, self).__init__()
        self.conv_layer1 = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3)
        self.conv_layer2 = nn.Conv2d(in_channels=32, out_channels=32, kernel_size=3)
        self.max_pool1 = nn.MaxPool2d(kernel_size = 2, stride = 2)
        
        self.conv_layer3 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3)
        self.conv_layer4 = nn.Conv2d(in_channels=64, out_channels=64, kernel_size=3)
        self.max_pool2 = nn.MaxPool2d(kernel_size = 2, stride = 2)
        
        self.fc1 = nn.Linear(1600, 128)  # 1600 = 64 channels * 5 * 5 for 32x32 inputs
        self.relu1 = nn.ReLU()
        self.fc2 = nn.Linear(128, num_classes)
    
    def forward(self, x):
        out = self.conv_layer1(x)
        out = self.conv_layer2(out)
        out = self.max_pool1(out)
        
        out = self.conv_layer3(out)
        out = self.conv_layer4(out)
        out = self.max_pool2(out)
                
        out = out.reshape(out.size(0), -1)
        
        out = self.fc1(out)
        out = self.relu1(out)
        out = self.fc2(out)
        return out

The following lines run the network (forward propagation).
Calling the model invokes the `forward` method of the `ConvNeuralNet` object, which applies the operators defined above.
Without loss of generality, we use a randomly initialized input sample.

In [2]:
X = torch.rand((1, 3, 32, 32))
model = ConvNeuralNet(num_classes=10)

Y = model(X)
print(Y)

tensor([[-0.0489,  0.0333,  0.0456, -0.0176,  0.0560,  0.0398,  0.0094, -0.0745,
         -0.1290,  0.0474]], grad_fn=<AddmmBackward>)
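The `1600` input features of `fc1` follow from the layer shapes. The sketch below redoes that arithmetic in plain Python, assuming the PyTorch defaults used in the model (convolutions with stride 1 and no padding, pooling with kernel 2 and stride 2). The helper names `conv_out` and `pool_out` are made up for this illustration.

```python
def conv_out(size, kernel_size, stride=1, padding=0):
    # Output length of a convolution along one spatial dimension.
    return (size + 2 * padding - kernel_size) // stride + 1


def pool_out(size, kernel_size=2, stride=2):
    # Output length of max pooling along one spatial dimension (no padding).
    return (size - kernel_size) // stride + 1


size = 32                      # input images are 32 x 32
size = conv_out(size, 3)       # conv_layer1 -> 30
size = conv_out(size, 3)       # conv_layer2 -> 28
size = pool_out(size)          # max_pool1   -> 14
size = conv_out(size, 3)       # conv_layer3 -> 12
size = conv_out(size, 3)       # conv_layer4 -> 10
size = pool_out(size)          # max_pool2   -> 5

flattened = 64 * size * size   # 64 channels after conv_layer4
print(flattened)               # 1600
```

A different input resolution changes `flattened`, so the first argument of `nn.Linear` would need to change with it.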


use another model

## Convolution operator counting tool

The previous code shows a typical way to define and run a DNN model. In practice, we often want to analyze or debug such a model.
For example, we may want to get the execution trace of the operators or dump the output tensor of a particular operator.
To begin with, we show an example that counts the occurrences of convolution operators.
Intuitively, this can be done by reading through the source code or by inserting code into the DNN model definition.
A better way is to use the module hook API, as shown [later in this notebook](#module-hook).
However, these methods are coupled with the DNN source code and cannot be generalized to other analysis tasks.

To this end, we borrow the concept of instrumentation from program analysis.
Such tasks on DNN models can be implemented with the DNN model instrumentation abstraction.
The instrumentation tool that counts convolution operators is defined as follows.

It consists of two parts: registering analysis routines, which locate particular operators, and inserting instrumentation routines, which execute the target code (here, incrementing a counter).
These are the fundamental components of an instrumentation tool.
More complex tools can be implemented following the same rationale.
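The split between the two kinds of routines can be illustrated without any framework. The following is a toy sketch, not Amanda's implementation: `ToyContext`, `run_op`, and the two stand-in operators are hypothetical names invented here. The analysis routine inspects each operator and decides whether to attach something; the instrumentation routine is the attached code that runs next to the operator.

```python
class ToyContext:
    """Holds the operator being executed and the routines attached to it."""

    def __init__(self, op):
        self.op = op
        self.before = []

    def insert_before_op(self, routine):
        self.before.append(routine)


def run_op(op, inputs, analysis_routines):
    """Run one operator, letting every analysis routine instrument it first."""
    context = ToyContext(op)
    for analyze in analysis_routines:
        analyze(context)
    for routine in context.before:
        inputs = routine(*inputs)  # instrumentation routines may rewrite inputs
    return op(*inputs)


# Two stand-in "operators"; only their names matter for the filter.
def conv2d(x):
    return x + 1


def relu(x):
    return max(x, 0)


stats = {"count": 0}


def count(*inputs):
    # Instrumentation routine: bump the counter, pass inputs through unchanged.
    stats["count"] += 1
    return inputs


def find_conv(context):
    # Analysis routine: select operators by name.
    if "conv2d" in context.op.__name__:
        context.insert_before_op(count)


x = -3
for op in [conv2d, relu, conv2d]:
    x = run_op(op, (x,), [find_conv])

print(stats["count"])  # 2
```

Returning `inputs` from the instrumentation routine keeps the operator's behavior unchanged, which is why a pure counting tool does not alter the model output.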

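The tool below relies on three Amanda APIs: `add_inst_for_op` registers an analysis routine, `OpContext.get_op` returns the operator being processed, and `OpContext.insert_before_op` inserts an instrumentation routine before that operator.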

In [3]:
import amanda

class CountOPTool(amanda.Tool):
    def __init__(self, op_name: str):
        super().__init__()
        self.counter = 0
        self.op_name = op_name
        self.add_inst_for_op(self.callback)

    # analysis routine, filter conv2d operators
    def callback(self, context: amanda.OpContext):
        op = context.get_op()
        if self.op_name in op.__name__:
            context.insert_before_op(self.counter_op)

    # instrumentation routine: op for counting
    def counter_op(self, *inputs):
        self.counter += 1
        return inputs



This instrumentation tool can then be applied to the DNN execution with the `amanda.apply(tool: amanda.Tool)` API.
Every DNN model executed within this context is instrumented by the framework.

In [4]:
tool = CountOPTool("conv2d")

with amanda.apply(tool):
    Y = model(X)
    print(f"Execution time of conv2d op: {tool.counter}")

Execution time of conv2d op: 4


instrumentation routine
analysis routine

## Instrument the backward process

The mapping of forward and backward process

One to many mapping

accumulate op

In [5]:
class CountOPTool(amanda.Tool):
    def __init__(self, op_name: str, backward_op_name: str):
        super().__init__()
        self.counter = 0
        self.backward_counter = 0
        self.op_name = op_name
        self.backward_op_name = backward_op_name
        self.add_inst_for_op(self.callback)
        self.add_inst_for_op(self.backward_callback, backward=True, require_outputs=True)

    # analysis routine, filter conv2d operators
    def callback(self, context: amanda.OpContext):
        op = context.get_op()
        if self.op_name in op.__name__:
            context.insert_before_op(self.counter_op)

    # analysis routine, filter backward convolution operators
    def backward_callback(self, context: amanda.OpContext):
        op = context.get_backward_op()
        if self.backward_op_name in op.__name__:
            context.insert_after_backward_op(self.counter_backward_op)

    # instrumentation routine: op for counting
    def counter_op(self, *inputs):
        self.counter += 1
        return inputs
    
    def counter_backward_op(self, *inputs):
        self.backward_counter += 1
        return inputs

Similarly, we can apply the updated counter tool to the DNN execution.
Note that an explicit backward pass is invoked.

In [6]:
tool = CountOPTool(op_name="conv2d", backward_op_name="Conv")
X = torch.rand((1, 3, 32, 32))
model = ConvNeuralNet(10)

with amanda.tool.apply(tool):
    Y = model(X)
    Y.backward(torch.rand_like(Y))

    print(f"Execution time of conv2d op: {tool.counter}, backward op: {tool.backward_counter}")

Execution time of conv2d op: 4, backward op: 4
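Equal forward and backward counts are what one would expect if every forward convolution contributes one backward operator to the backward pass. A toy tape-based sketch of that pairing follows. It is a simplification: `forward`, `backward`, and the `tape` list are hypothetical, and real autograd engines may emit several backward operators for one forward operator.

```python
tape = []  # forward operators recorded in execution order


def forward(op_names):
    """Run the forward pass, recording each operator on the tape."""
    for name in op_names:
        tape.append(name)


def backward(after_backward_op):
    """Replay the tape in reverse, calling the hook after each backward op."""
    for name in reversed(tape):
        after_backward_op(name + "_backward")


layers = ["conv2d", "conv2d", "max_pool2d", "conv2d", "conv2d",
          "max_pool2d", "linear", "relu", "linear"]
forward(layers)

backward_seen = []
backward(backward_seen.append)

forward_convs = sum("conv2d" in name for name in tape)
backward_convs = sum("conv2d" in name for name in backward_seen)
print(forward_convs, backward_convs)  # 4 4
print(backward_seen[0])               # linear_backward
```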


`one to many mapping`
demonstrate with graph

## One tool for all models



In [7]:
from torchvision.models import resnet50

x = torch.rand((1, 3, 227, 227))
model = resnet50()

tool = CountOPTool(op_name="conv2d", backward_op_name="Conv")

with amanda.tool.apply(tool):

    y = model(x)
    y.backward(torch.rand_like(y))
    print(f"Execution time of conv2d op: {tool.counter}, backward op: {tool.backward_counter}")


Execution time of conv2d op: 53, backward op: 53


In [8]:
class CountOPTool(amanda.Tool):
    def __init__(self, op_name: str, backward_op_name: str):
        super().__init__()
        self.counter = 0
        self.backward_counter = 0
        self.op_name = op_name
        self.backward_op_name = backward_op_name
        self.add_inst_for_op(self.callback)
        self.add_inst_for_op(self.backward_callback, backward=True, require_outputs=True)

    # analysis routine, filter conv2d operators
    def callback(self, context: amanda.OpContext):
        op = context.get_op()
        if self.op_name in op.name:
            context.insert_before_op(self.counter_op)

    # analysis routine, filter backward convolution operators
    def backward_callback(self, context: amanda.OpContext):
        op = context.get_backward_op()
        if self.backward_op_name in op.name:
            context.insert_after_backward_op(self.counter_backward_op)

    # instrumentation routine: op for counting
    def counter_op(self, *inputs):
        self.counter += 1
        return inputs
    
    def counter_backward_op(self, *inputs):
        self.backward_counter += 1
        return inputs

## TensorFlow and context mapping

The analysis routine and the instrumentation routine may seem identical in eager mode execution.
Things are different in TensorFlow's graph mode execution.
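The difference can be pictured with a toy graph executor. In graph mode the graph is built once and may then be run many times, so a routine that fires at construction time and a routine that fires at execution time are no longer called equally often. The sketch is a plain-Python illustration under that assumption; `build_graph` and `run_graph` are hypothetical and do not model TensorFlow's or Amanda's internals.

```python
calls = {"analysis": 0, "instrumentation": 0}


def instrumentation(value):
    # Runs every time the instrumented operator executes.
    calls["instrumentation"] += 1
    return value


def analysis(op_name):
    # Runs once per operator, while the graph is being built.
    calls["analysis"] += 1
    return [instrumentation] if op_name == "Conv2D" else []


def build_graph(op_names):
    """Construct the graph once: a list of (op_name, attached routines)."""
    return [(name, analysis(name)) for name in op_names]


def run_graph(graph, value):
    """Execute the graph; attached routines run before their operator."""
    for _name, routines in graph:
        for routine in routines:
            value = routine(value)
    return value


graph = build_graph(["Conv2D", "Relu", "Conv2D"])
for _ in range(3):          # e.g. three runs of the same graph
    run_graph(graph, 1.0)

print(calls)  # {'analysis': 3, 'instrumentation': 6}
```

In eager mode each operator is analyzed and executed in the same step, which is why the two routines look interchangeable there.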

insert graph.

In [9]:
import tensorflow as tf
tf.logging.set_verbosity(tf.logging.ERROR)
from examples.common.tensorflow.model.resnet_50 import ResNet50

model = ResNet50()
x = tf.random.uniform(shape=[8, 224, 224, 3])

tool = CountOPTool(op_name="Conv2D", backward_op_name="Conv2DBackpropFilter")

with amanda.tool.apply(tool):
    y = model(x)
    with tf.Session() as session:
        session.run(tf.initialize_all_variables())
        g = tf.gradients(y, x)

        session.run(g)
print(tool.counter, tool.backward_counter)

106 53


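The problem is that the two libraries follow different naming conventions for operators: the PyTorch version of the tool reads `op.__name__`, while the TensorFlow version reads `op.name`.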
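The tool matches operators by substring, so both the spelling and the case of the names matter. The snippet below only exercises Python's `in` operator on a few operator names. The TensorFlow names are the ones passed to the tool in this notebook plus `Conv2DBackpropInput`; the PyTorch backward name `CudnnConvolutionBackward` is an illustrative assumption, since the actual name depends on the PyTorch version and device.

```python
def matches(pattern, names):
    """Return the names that contain `pattern` (case-sensitive substring test)."""
    return [name for name in names if pattern in name]


torch_names = ["conv2d", "max_pool2d", "CudnnConvolutionBackward"]  # assumed names
tf_names = ["Conv2D", "Conv2DBackpropInput", "Conv2DBackpropFilter", "Relu"]

print(matches("conv2d", torch_names))  # ['conv2d']
print(matches("conv2d", tf_names))     # []  (TensorFlow spells it 'Conv2D')
print(matches("Conv2D", tf_names))     # also matches both backprop operators
print(matches("Conv2DBackpropFilter", tf_names))  # ['Conv2DBackpropFilter']
```

Because `"Conv2D"` is a prefix of the backprop operator names, a substring filter can select more operators than intended; an exact comparison (`pattern == name`) avoids that when only the forward operator is wanted.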

In [10]:
from amanda.tools.mapping import MappingTool

def torch_op_name_rule(context: amanda.OpContext):
    context["op_name"] = context.get_op().__name__
    context["backward_op_name"] = context.get_backward_op().__name__ if context.get_backward_op() is not None else None


def tf_op_name_rule(context: amanda.OpContext):
    context["op_name"] = context.get_op().name if context.get_op() is not None else None
    context["backward_op_name"] = context.get_backward_op().name if context.get_backward_op() is not None else None

mapping_tool = MappingTool(
    rules=[
        ["pytorch", torch_op_name_rule],
        ["tensorflow", tf_op_name_rule],
    ]
)

We update `CountOPTool` to depend on the `MappingTool`, whose rules handle the naming conventions of the different frameworks.
This reflects Amanda's rationale: unify the programming model and interface, and offload case-by-case conversions into reusable mappings.
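The division of labor can be sketched with plain dictionaries: each framework gets a small rule that writes a normalized key into a shared context, and the tool reads only that key. Everything here is hypothetical scaffolding (`FakeTFOp`, `fake_torch_conv2d`, `normalize`, `RULES`) rather than the actual `MappingTool` implementation.

```python
class FakeTFOp:
    """Stand-in for a TensorFlow operator, which exposes `.name`."""

    def __init__(self, name):
        self.name = name


def fake_torch_conv2d():
    """Stand-in for a PyTorch function, which exposes `.__name__`."""


# One rule per framework; each writes the same key into the context.
def torch_rule(context):
    context["op_name"] = context["op"].__name__


def tf_rule(context):
    context["op_name"] = context["op"].name


RULES = {"pytorch": torch_rule, "tensorflow": tf_rule}


def normalize(framework, op):
    """Build a context and apply the rule registered for `framework`."""
    context = {"op": op}
    RULES[framework](context)
    return context


def is_conv(context):
    # Framework-agnostic check: only the normalized key is consulted.
    return "conv2d" in context["op_name"].lower()


print(is_conv(normalize("pytorch", fake_torch_conv2d)))      # True
print(is_conv(normalize("tensorflow", FakeTFOp("Conv2D"))))  # True
print(is_conv(normalize("tensorflow", FakeTFOp("Relu"))))    # False
```

Supporting another framework then means adding one rule to `RULES`, while `is_conv` stays untouched.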

In [11]:
class CountOPTool(amanda.Tool):
    def __init__(self, op_name: str, backward_op_name: str):
        super().__init__()

        # specify tool dependencies
        self.depends_on(mapping_tool)

        self.counter = 0
        self.backward_counter = 0
        self.op_name = op_name
        self.backward_op_name = backward_op_name
        self.add_inst_for_op(self.callback)
        self.add_inst_for_op(self.backward_callback, backward=True, require_outputs=True)

    # analysis routine, filter conv2d operators
    def callback(self, context: amanda.OpContext):
        if self.op_name in context["op_name"]:
            context.insert_before_op(self.counter_op)

    # analysis routine, filter backward convolution operators
    def backward_callback(self, context: amanda.OpContext):
        if self.backward_op_name in context["backward_op_name"]:
            context.insert_after_backward_op(self.counter_backward_op)

    # instrumentation routine: op for counting
    def counter_op(self, *inputs):
        self.counter += 1
        return inputs
    
    def counter_backward_op(self, *inputs):
        self.backward_counter += 1
        return inputs

In [12]:
from torchvision.models import resnet50

x = torch.rand((1, 3, 227, 227))
model = resnet50()

tool = CountOPTool(op_name="conv2d", backward_op_name="Conv")

with amanda.tool.apply(tool):

    y = model(x)
    y.backward(torch.rand_like(y))
    print(f"Execution time of conv2d op: {tool.counter}, backward op: {tool.backward_counter}")

Execution time of conv2d op: 53, backward op: 53


In [13]:
import tensorflow as tf
from examples.common.tensorflow.model.resnet_50 import ResNet50

model = ResNet50()
x = tf.random.uniform(shape=[8, 224, 224, 3])

tool = CountOPTool(op_name="Conv2D", backward_op_name="Conv2DBackpropFilter")

with amanda.tool.apply(tool):
    y = model(x)
    with tf.Session() as session:
        session.run(tf.initialize_all_variables())
        g = tf.gradients(y, x)

        session.run(g)
print(tool.counter, tool.backward_counter)

212 159


## Control APIs


## Module hook


with module api
and hook