1. Introduction to CustomOp


To add a new operation to FINN, you create a subclass of `CustomOp`. The subclass provides execution, code generation and other functionality for that operation.

In [1]:
from finn.custom_op.base import CustomOp
dir(CustomOp)

['__abstractmethods__',
 '__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_abc_cache',
 '_abc_negative_cache',
 '_abc_negative_cache_version',
 '_abc_registry',
 'execute_node',
 'get_nodeattr',
 'get_nodeattr_allowed_values',
 'get_nodeattr_def',
 'get_nodeattr_types',
 'infer_node_datatype',
 'make_shape_compatible_op',
 'set_nodeattr',
 'verify_node']

(note: the `CustomOp` base class has moved into `finn-base`: https://github.com/Xilinx/finn-base/blob/dev/src/finn/custom_op/base.py -- the `finn` Docker container already has `finn-base` set up as a dependency)

Some points of importance:

1. `CustomOp` instances (in Python) are not meant to store any data, only provide functionality on top of data stored in ONNX. Each `CustomOp` instance has a member `self.onnx_node` which gives access to the ONNX `NodeProto` with attributes. There is also a custom attribute setter/getter system in `CustomOp` to make this process easier.

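2. `CustomOp` subclasses need to implement the abstract methods among those listed above (in the example below: `get_nodeattr_types`, `make_shape_compatible_op`, `infer_node_datatype`, `execute_node` and `verify_node`). The attribute helpers such as `get_nodeattr` and `set_nodeattr` are inherited from the base class.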

3. To be discoverable in the custom op register, `CustomOp` subclasses must set the `domain` field to the name of the Python module they appear in. For instance, to use the custom `Im2Col` op type from [here](https://github.com/Xilinx/finn-base/blob/dev/src/finn/custom_op/general/im2col.py), the ONNX node must use `domain=finn.custom_op.general`.
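
To make the first point concrete, here is a minimal sketch of a wrapper that keeps no state of its own. `FakeNode` and `AttrWrapper` are hypothetical stand-ins written for this illustration, not FINN or ONNX classes, and the sketch ignores the `required` flag. It only mirrors the idea that every attribute read and write goes through the underlying node.

```python
# Hypothetical stand-ins: FakeNode and AttrWrapper are illustration-only names,
# not FINN or ONNX classes.
class FakeNode:
    """Plays the role of an ONNX NodeProto: the only place data is stored."""

    def __init__(self, **attrs):
        self.attrs = dict(attrs)


class AttrWrapper:
    """Stateless helper: every read and write goes through self.onnx_node."""

    def __init__(self, onnx_node):
        self.onnx_node = onnx_node

    def get_nodeattr_types(self):
        # name of attribute : (dtype, required, default value)
        return {
            "exponent": ("i", True, 0),
            "exec_mode": ("s", True, "python"),
        }

    def get_nodeattr(self, name):
        # undeclared attribute names raise KeyError here
        dtype, required, default = self.get_nodeattr_types()[name]
        # fall back to the declared default when the node lacks the attribute
        # (assumption of this sketch: the required flag is not enforced)
        return self.onnx_node.attrs.get(name, default)

    def set_nodeattr(self, name, value):
        if name not in self.get_nodeattr_types():
            raise KeyError("attribute %s is not declared" % name)
        self.onnx_node.attrs[name] = value


node = FakeNode(exponent=3)
a = AttrWrapper(node)
b = AttrWrapper(node)  # a second wrapper around the same node
a.set_nodeattr("exec_mode", "c++")
print(b.get_nodeattr("exponent"), b.get_nodeattr("exec_mode"))
```

Because the wrappers hold only a reference to the node, a value set through `a` is visible through `b`. Wrapper instances can therefore be created and discarded freely.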


## A Simple CustomOp Example

Let's make a simple CustomOp that raises its input to a given exponent (specified as attribute). For now it'll only work in Python, but later we'll add C++ execution capability too.

In [2]:
from onnx import helper
import numpy as np

class MyPythonPowerOp(CustomOp):
    
    # here we use the CustomOp attribute system to make it easier
    # to set/get custom attributes on this node
    def get_nodeattr_types(self):
        return {
            # each entry is:
            # name of attribute : (dtype, required, default value)
            # dtype follows the ONNX attribute protobuf so
            # "i" is int, "s" is string, "f" is float,
            # "ints" is a list of integers...
            # also good practice to document what each attribute does here:
            
            # which integer power to raise the input to
            "exponent" : ("i", True, 0),
            # execution mode : currently only python
            "exec_mode" : ("s", True, "python"),
        }
    
    # return an ONNX node that has the same shape inference behavior
    # here we want in shape = out shape, so we use the ONNX ReLU
    # node to mimic its shape inference behavior
    # we have access to the entire ModelWrapper to help make this decision
    # (the parameter called model)
    def make_shape_compatible_op(self, model):
        node = self.onnx_node
        # make a Relu node connected to the same in-out tensors to get
        # shape inference
        # a general-purpose alternative is to use a Constant node that 
        # produces the desired shape
        return helper.make_node("Relu", [node.input[0]], [node.output[0]])

    # used for FINN DataType inference: set the output tensors' datatypes
    # accordingly for this node
    # here we assume input datatype = output datatype
    # we have access to the entire ModelWrapper to help make this decision
    # (the parameter called model)
    def infer_node_datatype(self, model):
        node = self.onnx_node
        # data type stays the same
        dtype = model.get_tensor_datatype(node.input[0])
        model.set_tensor_datatype(node.output[0], dtype)
    
    # execute this node
    # context: used for both input and output, dictionary of named
    #          tensors
    # graph: the ONNX GraphProto (ModelWrapper.graph), generally 
    #         not needed to execute a single node
    def execute_node(self, context, graph):
        exec_mode = self.get_nodeattr("exec_mode")
        if exec_mode == "python":
            # get names of node input and output tensors
            i_name = self.onnx_node.input[0]
            o_name = self.onnx_node.output[0]
            # grab input tensor from context
            i_tensor = context[i_name]
            # get which power to raise to from attribute
            expnt = self.get_nodeattr("exponent")
            # compute and put output into context
            o_tensor = np.power(i_tensor, expnt)
            context[o_name] = o_tensor
        else:
            raise Exception("Only python exec_mode is supported")
        
    # can use to do a sanity check of all the node's properties
    # optional, not implemented here
    def verify_node(self):
        pass
        
        

To make sure our custom op is available, it needs to be registered. The best practice for this is to create a submodule under `finn.custom_op` which includes a `custom_op` dictionary that maps strings (op names) to classes (op implementations). Since we're in a Jupyter notebook, we'll just hijack an existing submodule's dictionary at runtime like this:

In [23]:
import finn.custom_op.general as general
general.custom_op["MyPythonPowerOp"] = MyPythonPowerOp

We can see which custom ops are registered under this submodule by looking at the dictionary:

In [24]:
general.custom_op

{'DebugMarker': finn.custom_op.general.debugmarker.DebugMarker,
 'QuantAvgPool2d': finn.custom_op.general.quantavgpool2d.QuantAvgPool2d,
 'MaxPoolNHWC': finn.custom_op.general.maxpoolnhwc.MaxPoolNHWC,
 'StreamingDataflowPartition': finn.custom_op.general.streamingdataflowpartition.StreamingDataflowPartition,
 'MultiThreshold': finn.custom_op.general.multithreshold.MultiThreshold,
 'XnorPopcountMatMul': finn.custom_op.general.xnorpopcount.XnorPopcountMatMul,
 'Im2Col': finn.custom_op.general.im2col.Im2Col,
 'MyPythonPowerOp': __main__.MyPythonPowerOp}
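
The lookup that such a dictionary enables can be sketched as follows. This is an illustration under stated assumptions, not FINN's actual registry code: `my_ops_demo`, `DummyPowerOp` and `lookup_custom_op` are made-up names, and the module is created in memory so the example runs without FINN installed.

```python
import importlib
import sys
import types


class DummyPowerOp:
    """Hypothetical op implementation used only for this sketch."""

    def __init__(self, onnx_node):
        self.onnx_node = onnx_node


# Build a throwaway module that plays the role of a custom op submodule.
# "my_ops_demo" is a made-up name, not a FINN module.
demo_module = types.ModuleType("my_ops_demo")
demo_module.custom_op = {"DummyPowerOp": DummyPowerOp}
sys.modules["my_ops_demo"] = demo_module


def lookup_custom_op(domain, op_type):
    """Resolve (domain, op_type) to a class via the module's custom_op dict."""
    module = importlib.import_module(domain)
    try:
        return module.custom_op[op_type]
    except KeyError:
        raise KeyError("op_type %s not registered in %s" % (op_type, domain))


op_class = lookup_custom_op("my_ops_demo", "DummyPowerOp")
print(op_class.__name__)
```

The node's `domain` string selects the module and its `op_type` string selects the class. This is why both must match what was registered.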

## Let's Try Out our CustomOp

We'll manually build a small ONNX graph containing our node in order to try out some of the functionality. This would normally go into the unit test for this CustomOp.

In [25]:
from finn.core.modelwrapper import ModelWrapper
from onnx import TensorProto

def make_graph(ishape, exp, op_type = "MyPythonPowerOp"):
    inp = helper.make_tensor_value_info(
        "inp", TensorProto.FLOAT, ishape
    )
    outp = helper.make_tensor_value_info(
        "outp", TensorProto.FLOAT, ishape
    )

    custom_node = helper.make_node(
        # op type string in ONNX, what we used to register the custom op
        op_type,
        # name of input tensor
        ["inp"],
        # name of output tensor
        ["outp"],
        # specify domain s.t. FINN can find our op under this submodule
        domain="finn.custom_op.general",
        # set up attributes
        exponent = int(exp),
        exec_mode = "python"
    )

    graph = helper.make_graph(
        nodes=[custom_node], name="custom_graph", inputs=[inp], outputs=[outp]
    )
    model = helper.make_model(graph, producer_name="custom-model")
    return ModelWrapper(model)

In [26]:
# generate a small graph with our custom op
input_shape = (1, 2, 4)
ret_model = make_graph(input_shape, 2)
ret_model.model.graph.node

[input: "inp"
output: "outp"
op_type: "MyPythonPowerOp"
attribute {
  name: "exec_mode"
  s: "python"
  type: STRING
}
attribute {
  name: "exponent"
  i: 2
  type: INT
}
domain: "finn.custom_op.general"
]

In [27]:
from finn.core.datatype import DataType
from finn.util.basic import gen_finn_dt_tensor

# generate a random input of e.g. signed 4-bit values
random_input = gen_finn_dt_tensor(DataType.INT4, input_shape)
random_input


array([[[-1., -7.,  5.,  3.],
        [-5.,  4.,  6.,  0.]]], dtype=float32)

In [28]:
from finn.core.onnx_exec import execute_onnx

# run with FINN's execute_onnx
inp_dict = {"inp" : random_input}
ret = execute_onnx(ret_model, inp_dict)
ret

{'outp': array([[[ 1., 49., 25.,  9.],
         [25., 16., 36.,  0.]]], dtype=float32)}
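
The `context` dictionary that `execute_node` reads from and writes to can be illustrated with plain Python lists. This sketch assumes that nodes are visited in order and share one dictionary of named tensors. The `run_power_node` helper and the intermediate tensor name `mid` are hypothetical.

```python
def run_power_node(context, i_name, o_name, exponent):
    """Read the input tensor by name and store the result by name."""
    context[o_name] = [x ** exponent for x in context[i_name]]


# two hypothetical nodes chained through the tensor named "mid":
# each entry is (input tensor name, output tensor name, exponent)
nodes = [
    ("inp", "mid", 2),
    ("mid", "outp", 3),
]
context = {"inp": [-1.0, -7.0, 5.0, 3.0]}
for i_name, o_name, exponent in nodes:
    run_power_node(context, i_name, o_name, exponent)

print(context["mid"])
print(context["outp"])
```

Each node only needs the names of its own input and output tensors. The dictionary carries intermediate results from one node to the next.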

## A CustomOp with C++ Generation

In [29]:
from finn.util.basic import make_build_dir, CppBuilder
import subprocess

# derive from our previous example
class MyMixedPowerOp(MyPythonPowerOp):
    
    # here we use the CustomOp attribute system to make it easier
    # to set/get custom attributes on this node
    def get_nodeattr_types(self):
        return {
            # each entry is:
            # name of attribute : (dtype, required, default value)
            # dtype follows the ONNX attribute protobuf so
            # "i" is int, "s" is string, "f" is float,
            # "ints" is a list of integers...
            # also good practice to document what each attribute does here:
            
            # which integer power to raise the input to
            "exponent" : ("i", True, 0),
            # execution mode : python or c++
            "exec_mode" : ("s", True, "python"),
            # code generation directory
            "codegen_dir" : ("s", False, ""),
        }
    
    def my_custom_cpp_gen(self):
        codegen_dir = make_build_dir(prefix="my_custom_op")
        # set attribute for codegen dir
        self.set_nodeattr("codegen_dir", codegen_dir)
        # generate some C++ code
        cpp_code = """
#include <iostream>
#include <fstream>
using namespace std;
#define EXPONENT %d

int main(int argc, char **argv) {
    ifstream infile("input.txt");
    ofstream outfile("output.txt");
    
    float elem;
    while (infile >> elem)
    {
        float res = 1.0;
        for(int i=0; i < EXPONENT; i++) {
            res *= elem;
        }
        outfile << res << "\\n";
    }

    return 0;
}
        """ % (self.get_nodeattr("exponent"))
        with open(codegen_dir+"/top.cpp", "w") as f:
            f.write(cpp_code)
        builder = CppBuilder()
        # compiler flags: select the C++11 standard and enable -O3 optimization
        builder.append_includes("--std=c++11")
        builder.append_includes("-O3")
        builder.append_sources(codegen_dir + "/*.cpp")
        builder.set_executable_path(codegen_dir + "/node_model")
        builder.build(codegen_dir)
    
    # execute this node
    # context: used for both input and output, dictionary of named
    #          tensors
    # graph: the ONNX GraphProto (ModelWrapper.graph), generally 
    #         not needed to execute a single node
    def execute_node(self, context, graph):
        exec_mode = self.get_nodeattr("exec_mode")
        # get names of node input and output tensors
        i_name = self.onnx_node.input[0]
        o_name = self.onnx_node.output[0]
        # grab input tensor from context
        i_tensor = context[i_name]
        # get which power to raise to from attribute
        expnt = self.get_nodeattr("exponent")
        if exec_mode == "python":
            # compute and put output into context
            o_tensor = np.power(i_tensor, expnt)
            context[o_name] = o_tensor
        elif exec_mode == "c++":
            build_dir = self.get_nodeattr("codegen_dir")
            # save input as txt, could preprocess, change layout etc..
            np.savetxt(build_dir+"/input.txt", i_tensor.flatten())
            bash_command = ["./node_model"]
            proc_run = subprocess.Popen(bash_command, cwd=build_dir, stdout=subprocess.PIPE)
            proc_run.communicate()
            o_tensor = np.loadtxt(build_dir+"/output.txt")
            o_tensor = o_tensor.reshape(i_tensor.shape)
            context[o_name] = o_tensor
        else:
            raise Exception("Only python and c++ exec_mode is supported")
        
    # can use to do a sanity check of all the node's properties
    # optional, not implemented here
    def verify_node(self):
        pass
        
        

In [30]:
# register our new op
general.custom_op["MyMixedPowerOp"] = MyMixedPowerOp

# make graph with new op
mixedop_graph = make_graph(input_shape, 2, op_type = "MyMixedPowerOp")
mixedop_graph.graph.node

[input: "inp"
output: "outp"
op_type: "MyMixedPowerOp"
attribute {
  name: "exec_mode"
  s: "python"
  type: STRING
}
attribute {
  name: "exponent"
  i: 2
  type: INT
}
domain: "finn.custom_op.general"
]

In [31]:
from finn.custom_op.registry import getCustomOp

# get FINN wrapper for this node, with all the functionality
op_inst = getCustomOp(mixedop_graph.model.graph.node[0])
print("Available functions: " + str(dir(op_inst)))
# query some attributes
print("codegen_dir: " + op_inst.get_nodeattr("codegen_dir"))
print("exec_mode: " + op_inst.get_nodeattr("exec_mode"))

Available functions: ['__abstractmethods__', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_abc_cache', '_abc_negative_cache', '_abc_negative_cache_version', '_abc_registry', 'execute_node', 'get_nodeattr', 'get_nodeattr_allowed_values', 'get_nodeattr_def', 'get_nodeattr_types', 'infer_node_datatype', 'make_shape_compatible_op', 'my_custom_cpp_gen', 'onnx_node', 'set_nodeattr', 'verify_node']
codegen_dir: 
exec_mode: python


## Implement a code generation transformation


In [33]:
#from finn.transformation.base import Transformation
# can derive from NodeLocalTransformation for faster (parallel) execution
from finn.transformation.base import NodeLocalTransformation
import os

class MyNodeLocalCodeGen(NodeLocalTransformation):
    
    # will get called (possibly in parallel) for each node
    def applyNodeLocal(self, node):
        # keep track whether we changed anything
        modified_graph = False
        # check node type before we do anything
        if node.op_type == "MyMixedPowerOp":
            # get FINN wrapper for this node, with all the functions
            op_inst = getCustomOp(node)
            if not os.path.isdir(op_inst.get_nodeattr("codegen_dir")):
                # call the codegen function we defined
                # this will modify the underlying node by setting attribute
                op_inst.my_custom_cpp_gen()
                # codegen function modifies attribute
                modified_graph = True
        # important: must return modified_graph = False at some point
        # otherwise transformation will run in infinite loop!
        return (node, modified_graph)
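
The warning about returning `modified_graph = False` is easier to see with a sketch of a driver loop that re-applies a transformation until it reports no change. `apply_until_stable` and `codegen_once` are hypothetical helpers written for this illustration, and a plain dictionary stands in for the model. The sketch assumes the caller keeps re-applying while the flag is `True`.

```python
def apply_until_stable(model, apply_fn, max_iters=100):
    """Re-apply apply_fn until it reports that nothing changed."""
    for iteration in range(1, max_iters + 1):
        model, modified = apply_fn(model)
        if not modified:
            return model, iteration
    # guard added for this sketch so a faulty transformation cannot loop forever
    raise RuntimeError("transformation never reported modified=False")


def codegen_once(model):
    """Hypothetical transformation: fill in a missing codegen_dir once."""
    modified = False
    if model.get("codegen_dir", "") == "":
        model = dict(model, codegen_dir="/tmp/example_dir")
        modified = True
    return model, modified


result, n_calls = apply_until_stable({"codegen_dir": ""}, codegen_once)
print(result["codegen_dir"], n_calls)
```

The first call sets the attribute and reports a change. The second call finds nothing to do and ends the loop. A transformation that always returned `True` would never terminate without the `max_iters` guard.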

In [34]:
mixedop_graph_new = mixedop_graph.transform(MyNodeLocalCodeGen())

In [35]:
new_op_inst = getCustomOp(mixedop_graph_new.graph.node[0])
codegen_dir = new_op_inst.get_nodeattr("codegen_dir")
print(codegen_dir)

/tmp/finn_dev_maltanar/my_custom_opva5okbax


In [36]:
! ls {codegen_dir}

compile.sh  node_model	top.cpp


In [37]:
! cat {codegen_dir}/top.cpp


#include <iostream>
#include <fstream>
using namespace std;
#define EXPONENT 2

int main(int argc, char **argv) {
    ifstream infile("input.txt");
    ofstream outfile("output.txt");
    
    float elem;
    while (infile >> elem)
    {
        float res = 1.0;
        for(int i=0; i < EXPONENT; i++) {
            res *= elem;
        }
        outfile << res << "\n";
    }

    return 0;
}
        

### Manually generate input and run C++ node model

In [38]:
! echo "7.0 8.0 9.0" > {codegen_dir}/input.txt

In [39]:
! cd {codegen_dir}; ./node_model

In [40]:
! cat {codegen_dir}/output.txt

49
64
81


In [41]:
! rm {codegen_dir}/*.txt

### Use FINN execution flow

In [42]:
# generate a random input of e.g. signed 4-bit values
random_input = gen_finn_dt_tensor(DataType.INT4, input_shape)
random_input

array([[[ 6., -3., -3.,  3.],
        [-7., -3.,  3.,  6.]]], dtype=float32)

In [43]:
# run with FINN's execute_onnx, custom node will use Python execution
new_op_inst.set_nodeattr("exec_mode", "python")
inp_dict = {"inp" : random_input}
ret = execute_onnx(mixedop_graph_new, inp_dict)
ret

{'outp': array([[[36.,  9.,  9.,  9.],
         [49.,  9.,  9., 36.]]], dtype=float32)}

In [44]:
# run with FINN's execute_onnx, custom node will use C++ execution
new_op_inst.set_nodeattr("exec_mode", "c++")
ret = execute_onnx(mixedop_graph_new, inp_dict)
ret

{'outp': array([[[36.,  9.,  9.,  9.],
         [49.,  9.,  9., 36.]]])}
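
The C++ execution path exchanges data through text files: flatten the tensor, write `input.txt`, run the program, read `output.txt`, and restore the shape. The sketch below imitates that round trip using only the standard library. `fake_node_model` is a hypothetical Python stand-in for the compiled binary, and the helper names and the 2x4 layout are assumptions made for this illustration.

```python
import os
import tempfile


def write_values(path, values):
    """Write one value per line, similar in spirit to saving a flat array."""
    with open(path, "w") as f:
        for v in values:
            f.write("%.18e\n" % v)


def fake_node_model(build_dir, exponent):
    """Python stand-in for the compiled binary: input.txt -> output.txt."""
    with open(os.path.join(build_dir, "input.txt")) as fin:
        elems = [float(tok) for tok in fin.read().split()]
    with open(os.path.join(build_dir, "output.txt"), "w") as fout:
        for elem in elems:
            fout.write("%g\n" % (elem ** exponent))


def read_values(path):
    """Read back one float per non-empty line."""
    with open(path) as f:
        return [float(line) for line in f if line.strip()]


def reshape_2d(flat, rows, cols):
    """Restore a (rows, cols) layout from the flattened list."""
    assert len(flat) == rows * cols
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


tensor = [[6.0, -3.0, -3.0, 3.0], [-7.0, -3.0, 3.0, 6.0]]
flat = [v for row in tensor for v in row]
with tempfile.TemporaryDirectory() as build_dir:
    write_values(os.path.join(build_dir, "input.txt"), flat)
    fake_node_model(build_dir, 2)
    out_flat = read_values(os.path.join(build_dir, "output.txt"))
result = reshape_2d(out_flat, 2, 4)
print(result)
```

The text files carry no shape information, so the caller must reshape the result itself, as `execute_node` does with `i_tensor.shape`. Note also that the C++ result above is printed without `dtype=float32`: `np.loadtxt` returns 64-bit floats by default, so a cast would be needed if matching output datatypes matter.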