# FINN - CustomOps
-----------------------------------------------------------------
<font size="3">This notebook gives a more detailed insight into FINN custom operation nodes. </font>

<font size="3">The following `showSrc` function is used to print the source code of functions and classes in this Jupyter notebook: </font>

In [1]:
import inspect

def showSrc(what):
    print("".join(inspect.getsourcelines(what)[0]))

<font size="3">FINN uses many custom operations (`op_type` in ONNX NodeProto) that are not defined in the ONNX operator schema. These custom nodes are marked with `domain="finn"` in the protobuf to identify them as such. These nodes can represent specific operations that we need for low-bit networks, or operations that are specific to a particular hardware backend.

A very abstract version of a custom op node representing a streaming fc layer is shown below. </font>

## Outline
---------------------------
* <font size="3">Basic FINN-ONNX node</font>
* <font size="3">CustomOp class</font>
* <font size="3">HLS FINN-ONNX node</font>
* <font size="3">HLSCustomOp class</font>

## Basic FINN-ONNX node

<font size="3">A FINN-ONNX node is an ONNX NodeProto with several additional attributes, so it can be created with the ONNX helper function `helper.make_node`. The procedure is shown here using a multithreshold node as an example. </font>

`multithreshold_node = helper.make_node(
    "MultiThreshold",
    ["v", "thresholds"],
    ["out"],
    domain="finn",
    out_scale=2.0,
    out_bias=-1.0,
    out_dtype="",
)`


<font size="3">The `helper.make_node` function takes the op_type as its first argument, in this case *MultiThreshold*. The inputs and outputs follow. Besides the data input, the multithreshold node has an additional input for the threshold values. 

The `domain` attribute marks the node as a FINN-ONNX node. It must be set to `"finn"` so that functions working with FINN-ONNX nodes can directly recognize it as a CustomOp. The attributes `out_scale` and `out_bias` are specific to the multithreshold node and manipulate the output value. `out_dtype` contains the output data type.
    
**Note**: each FINN-ONNX node has its own special attributes, which must be set correctly to ensure proper processing.</font>

## CustomOp class

<font size="3">Custom Ops are represented in FINN as ONNX nodes on the one hand and by a CustomOp class on the other hand. This allows easier access to different attributes and introduces special custom op functions. See below for the standard CustomOp class.</font>

In [2]:
from finn.custom_op import CustomOp
showSrc(CustomOp)

class CustomOp(ABC):
    """CustomOp class all custom op nodes are based on. Contains different functions 
    every custom node should have. Some as abstract methods, these have to be filled when
    writing a new custom op node."""
    def __init__(self, onnx_node):
        super().__init__()
        self.onnx_node = onnx_node

    def get_nodeattr(self, name):
        """Get a node attribute by name. Data is stored inside the ONNX node's
        AttributeProto container. Attribute must be part of get_nodeattr_types.
        Default value is returned if attribute is not set."""
        try:
            (dtype, req, def_val) = self.get_nodeattr_types()[name]
            attr = get_by_name(self.onnx_node.attribute, name)
            if attr is not None:
                # dtype indicates which ONNX Attribute member to use
                # (such as i, f, s...)
                ret = attr.__getattribute__(dtype)
                if dtype == "s":
                    # decode string attribut

<font size="3">When the class is instantiated, the ONNX node is passed in so that all of its attributes can be accessed from within the class. Every instance provides the functions `get_nodeattr()` and `set_nodeattr()` for this purpose. In addition, the class declares four abstract methods, which are described in more detail in the comments of the code and are explained below using the multithreshold node as an example. </font>

In [3]:
from finn.custom_op.multithreshold import MultiThreshold
showSrc(MultiThreshold)

class MultiThreshold(CustomOp):
    """Class that corresponds to a multithresholding node."""
    def get_nodeattr_types(self):
        return {
            "out_dtype": ("s", True, ""),
            "out_scale": ("f", False, 1.0),
            "out_bias": ("f", False, 0.0),
        }

    def make_shape_compatible_op(self):
        node = self.onnx_node
        return helper.make_node("Relu", [node.input[0]], [node.output[0]])

    def infer_node_datatype(self, model):
        node = self.onnx_node
        odt = self.get_nodeattr("out_dtype")
        model.set_tensor_datatype(node.output[0], DataType[odt])

    def execute_node(self, context, graph):
        node = self.onnx_node
        # save inputs
        v = context[node.input[0]]
        thresholds = context[node.input[1]]
        # retrieve attributes if output scaling is used
        out_scale = self.get_nodeattr("out_scale")
        out_bias = self.get_nodeattr("out_bias")
        # calculate output
        output = multithresh

<font size="3"> `get_nodeattr_types`: returns a dict of the attributes permitted for the node. Each of the special multithreshold attributes maps to a triple with the following values: </font>
* <font size="3">`dtype`: indicates which member of the ONNX AttributeProto will be utilized </font>
* <font size="3">`require`: indicates whether this attribute is required </font>
* <font size="3">`default_value`: indicates the default value that will be used if the attribute is not set </font>
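
<font size="3">The lookup rule behind these triples can be sketched in plain Python. This is a minimal illustration, not FINN's implementation: a plain dict stands in for the ONNX attribute container, `lookup_attr` is a hypothetical helper, and raising an error for an unset required attribute is an assumption. </font>

```python
# (dtype, required, default) triples, copied from the MultiThreshold class above
ATTR_TYPES = {
    "out_dtype": ("s", True, ""),
    "out_scale": ("f", False, 1.0),
    "out_bias": ("f", False, 0.0),
}


def lookup_attr(attr_types, node_attrs, name):
    """Return the attribute value if it is set, otherwise its declared default.

    node_attrs is a plain dict standing in for the ONNX attribute container.
    """
    if name not in attr_types:
        raise KeyError("attribute %s is not declared for this op" % name)
    # dtype would select the AttributeProto member (i, f, s...); unused in this sketch
    dtype, required, default = attr_types[name]
    if name in node_attrs:
        return node_attrs[name]
    if required:
        # assumption: an unset required attribute is treated as an error
        raise ValueError("required attribute %s is not set" % name)
    return default


node_attrs = {"out_dtype": "BIPOLAR", "out_scale": 2.0}
scale = lookup_attr(ATTR_TYPES, node_attrs, "out_scale")  # set on the node
bias = lookup_attr(ATTR_TYPES, node_attrs, "out_bias")  # falls back to the default
```

<font size="3">Here `scale` comes from the node itself, while `bias` is taken from the declared default because the node does not set `out_bias`. </font>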

<font size="3">`make_shape_compatible_op`: The FINN flow applies the transformation pass [infer_shapes](https://github.com/Xilinx/finn/blob/master/src/finn/transformation/infer_shapes.py) to the graph in various places. For this transformation to work on CustomOps, they must first be converted to standard ONNX nodes with the same shape behavior, that is, nodes with the same relationship between input and output shape. 

This conversion is what `make_shape_compatible_op` does. Since the output shape of a multithreshold node is the same as its input shape, it can be replaced by a `"Relu"` node from the standard ONNX node library.</font>

<font size="3">`infer_node_datatype`: sets the output tensor data type according to the attribute `out_dtype` </font>

<font size="3">`execute_node`: executes the node. The functionality to implement depends on the CustomOp. For the multithreshold node, the input values and the thresholds are extracted first, then the attributes for the output scaling are retrieved, and finally the output is calculated by a separate function. For more details on this function, see the code [here](https://github.com/Xilinx/finn/blob/master/src/finn/custom_op/multithreshold.py). </font>
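
<font size="3">To make the computation concrete, the following plain-Python sketch shows one way a multithreshold output could be calculated. It is an illustration under stated assumptions, not the FINN function: inputs are nested lists with one list per channel, and a threshold is assumed to count as crossed when the value is greater than or equal to it. </font>

```python
def multithreshold(values, thresholds, out_scale=1.0, out_bias=0.0):
    """Count, per channel, how many thresholds each value reaches, then scale and shift.

    values:     list of channels, each a list of numbers
    thresholds: one list of thresholds per channel
    """
    outputs = []
    for channel_values, channel_thresholds in zip(values, thresholds):
        row = []
        for x in channel_values:
            # assumption: a threshold counts as crossed when x >= threshold
            crossed = sum(1 for t in channel_thresholds if x >= t)
            row.append(out_scale * crossed + out_bias)
        outputs.append(row)
    return outputs


# one channel, a single threshold at 0.0, scale and bias as in the example node
bipolar = multithreshold([[-0.5, 0.0, 0.7]], [[0.0]], out_scale=2.0, out_bias=-1.0)
# one channel, three thresholds, default scale and bias
counts = multithreshold([[-1.0, 0.5, 3.0]], [[0.0, 1.0, 2.0]])
```

<font size="3">With a single threshold, `out_scale=2.0` and `out_bias=-1.0` map the threshold count {0, 1} to the values {-1, +1}, which is why the example node at the top of this notebook uses these settings. </font>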

<font size="3">FINN has a subset of CustomOps that correspond to the [finn-hls](https://finn-hlslib.readthedocs.io/en/latest/) library. In the next part of the Jupyter notebook these are described in more detail. </font>

## HLS FINN-ONNX node

<font size="3">Creating an HLS FINN-ONNX node looks very similar to creating a basic FINN-ONNX node. However, three new attributes (`backend`, `code_gen_dir` and `executable_path`) are introduced that are necessary to process HLS FINN-ONNX nodes in FINN.</font>

`FCLayer_node = helper.make_node(
    "StreamingFCLayer_Batch",
    node_inp_list,
    node_outp_list,
    domain="finn",
    backend="fpgadataflow",
    code_gen_dir="",
    executable_path="",
    resType="ap_resource_lut()",
    MW=mw,
    MH=mh,
    SIMD=simd,
    PE=pe,
    inputDataType=<FINN DataType>,
    weightDataType=<FINN DataType>,
    outputDataType=<FINN DataType>,
    ActVal=actval,
    binaryXnorMode=<0/1>,
    noActivation=<0/1>
)`

<font size="3">`"StreamingFCLayer_Batch"` is the op_type, followed by the inputs and outputs. Up to this point the call is the same as building a default ONNX node without additional attributes. Because this is a FINN custom op, the attribute `domain="finn"` must be set. The streaming fc layer is a custom op from the [finn-hls](https://finn-hlslib.readthedocs.io/en/latest/) library, which is recorded in the node through the `backend` attribute. To execute a custom op from the [finn-hls](https://finn-hlslib.readthedocs.io/en/latest/) library, the corresponding C++ code must be generated and compiled into an executable. The `code_gen_dir` attribute specifies where the generated code is stored, and `executable_path` specifies the path to the resulting executable. In addition to the data types of the input and output tensors, the node carries further attributes that correspond to the parameters of the matching [finn-hls](https://finn-hlslib.readthedocs.io/en/latest/) library function. More detailed information can be found in the documentation of [finn-hlslib](https://finn-hlslib.readthedocs.io/en/latest/).</font>

## HLSCustomOp class

<font size="3">If it is a node from the [finn-hls](https://finn-hlslib.readthedocs.io/en/latest/) library another class is used which is derived from the CustomOp class:</font>

In [4]:
from finn.custom_op.fpgadataflow import HLSCustomOp
showSrc(HLSCustomOp)

class HLSCustomOp(CustomOp):
    """HLSCustomOp class all custom ops that correspond to a finn-hlslib 
    function are based on. Contains different functions every fpgadataflow 
    custom node should have. Some as abstract methods, these have to be filled
    when writing a new fpgadataflow custom op node."""
    def __init__(self, onnx_node):
        super().__init__(onnx_node)

        self.code_gen_dict = {}

        # getting templates from templates.py

        # template for single node execution
        self.docompute_template = templates.docompute_template

        # templates for single node ip generation
        # cpp file
        self.ipgen_template = templates.ipgen_template
        # tcl script
        self.ipgentcl_template = templates.ipgentcl_template

    def get_nodeattr_types(self):
        return {
            "backend": ("s", True, "fpgadataflow"),
            "code_gen_dir_npysim": ("s", False, ""),
            "code_gen_dir_ipgen": ("s", False, ""),
           

<font size="3">When an instance of this class is created, a template is set up that forms the layout of the C++ code used to execute the node. It contains some general constructs, such as the inclusion of bnn-library.h, which contains the references to the finn-hls library, and of cnpy.h and npy2apintstream.hpp, which support passing Python numpy arrays into C++. The idea of the template is to replace the variables marked with `$ $` with C++ calls during code generation. The filled-in template can then be written to a .cpp file and compiled.

**`get_nodeattr_types()`**: each instance of the HLSCustomOp class must have the attributes `code_gen_dir` and `executable_path`, because executing these nodes requires generating C++ code and building the corresponding executable.

</font>



<font size="3">**`code_generation(model)`**: calls all functions required for code generation, replaces the `$ $` variables in the template accordingly and writes the result to a .cpp file. Almost all of these subfunctions are abstract methods of the class, so each custom op node provides its own implementation. `generate_params()` is a special case: it is a regular method rather than an abstract one, and its default body is only `pass`. Custom op nodes that have no parameters to generate can therefore leave it untouched, and the step is effectively skipped. A streaming fc layer node, for example, does need parameter generation. What such a parameter generation can look like is described later in this notebook.
</font>
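
<font size="3">The placeholder replacement can be sketched with plain string substitution. The template text, the placeholder names `$DEFINES$` and `$DOCOMPUTE$`, and the helper `fill_template` are hypothetical and chosen only for this illustration. The real templates live in FINN's templates.py. </font>

```python
# hypothetical template; the placeholder names are made up for this sketch
TEMPLATE = """#include "bnn-library.h"
$DEFINES$

int main() {
$DOCOMPUTE$
}
"""


def fill_template(template, code_gen_dict):
    """Replace every $NAME$ placeholder with its generated lines."""
    for placeholder, lines in code_gen_dict.items():
        template = template.replace(placeholder, "\n".join(lines))
    return template


# each placeholder maps to the list of code lines generated for it
code_gen_dict = {
    "$DEFINES$": ["#define MW 4", "#define MH 2"],
    "$DOCOMPUTE$": ["    // the call into the HLS library would go here"],
}
cpp_source = fill_template(TEMPLATE, code_gen_dict)
```

<font size="3">After the call, `cpp_source` contains no remaining placeholders and could be written to a .cpp file for compilation. </font>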

<font size="3">**`compile_singlenode_code()`**: builds the compile command for the generated code. It creates an instance of the `CppBuilder()` class and assembles the various components of the command. The `.build` function creates the executable, and the corresponding attribute is then set. The class `CppBuilder` is a transformation and a more detailed description can be found in Jupyter notebook [FINN-CodeGenerationAndCompilation](FINN-CodeGenerationAndCompilation.ipynb).
</font>

<font size="3">**`dynamic_input_to_npy(context, count)`**: creates a .npy file for each input of the node. The files are stored in the directory specified by `code_gen_dir`. The argument `count` specifies the number of inputs, and `context` contains the input values.</font>

<font size="3">**`npy_to_dynamic_output(context)`**: reads the output values and sets the `context` dictionary accordingly. The C++ executable of the node writes its output values to a .npy file when it runs. </font>

<font size="3">**`exec_precompiled_singlenode_model()`**: executes the precompiled executable specified in `executable_path`</font>

<font size="3">**`execute_node(context,graph)`**: first calls `dynamic_input_to_npy()`, then runs the executable using `exec_precompiled_singlenode_model()`, and finally reads the output .npy file with `npy_to_dynamic_output()`.</font>
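
<font size="3">The three steps of `execute_node` can be sketched with the Python standard library. This is an illustration only, with several stand-ins: text files replace the .npy files, a Python one-liner replaces the compiled C++ executable, and `run_single_node` and the file names are hypothetical. </font>

```python
import os
import subprocess
import sys
import tempfile


def run_single_node(context, input_names, output_name, command, work_dir):
    """Write inputs to files, run an external program, read its output back."""
    # step 1: dump every input tensor to its own file (stand-in for .npy files)
    for i, name in enumerate(input_names):
        with open(os.path.join(work_dir, "input_%d.txt" % i), "w") as f:
            f.write(" ".join(str(x) for x in context[name]))
    # step 2: run the executable inside the working directory
    subprocess.run(command, cwd=work_dir, check=True)
    # step 3: read the result file and store it in the context
    with open(os.path.join(work_dir, "output.txt")) as f:
        context[output_name] = [float(x) for x in f.read().split()]


# stand-in "executable": a Python one-liner that doubles every input value
DOUBLER = (
    "vals = open('input_0.txt').read().split();"
    "open('output.txt', 'w').write(' '.join(str(2 * float(v)) for v in vals))"
)

context = {"inp": [1.0, 2.5, -3.0]}
with tempfile.TemporaryDirectory() as work_dir:
    run_single_node(context, ["inp"], "outp", [sys.executable, "-c", DOUBLER], work_dir)
```

<font size="3">After the call, the `context` dictionary holds the result under the output name, in the same way that `npy_to_dynamic_output()` fills in the node's output tensor. </font>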

#### Generate Parameters
<font size="3">Parameters have to be generated for specific types of HLSCustomOps. A streaming fc layer, for example, has weights and activation values, which are written to separate .h files and added to the template using `#include`. For the streaming fc layer, the parameter generation looks like this:
</font>

In [5]:
from finn.custom_op.fpgadataflow.streamingfclayer_batch import StreamingFCLayer_Batch
showSrc(StreamingFCLayer_Batch.generate_params)

    def generate_params(self, model, path):
        """Saves weights into params.h and if existing thresholds into thresh.h."""
        code_gen_dir = path
        # weights
        weights = model.get_initializer(self.onnx_node.input[1])
        # convert weights into hlslib-compatible format
        weight_tensor = self.get_hls_compatible_weight_tensor(weights)
        export_wdt = self.get_weight_datatype()
        # we have converted bipolar weights to binary for export,
        # so use it as such for weight generation
        if self.get_weight_datatype() == DataType.BIPOLAR:
            export_wdt = DataType.BINARY
        weight_hls_code = numpy_to_hls_code(
            weight_tensor, export_wdt, "weights", True, True
        )
        # write weights into params.h
        # code_gen_dir = self.get_nodeattr("code_gen_dir_npysim")
        f_weights = open("{}/params.h".format(code_gen_dir), "w")

        if export_wdt.bitwidth() != 1:
            f_weights.write(
               

<font size="3">First, the weight values are extracted with `get_initializer()` using the ModelWrapper. This assumes that the second input of the streaming fc layer holds the weights. After a few manipulations the weights are written to `params.h`. If there are threshold values, they are prepared and written to `thresh.h`. </font>
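
<font size="3">A heavily simplified sketch of this export step is given here. It is not what `numpy_to_hls_code` produces: there is no bit packing, the weights are plain nested lists, the helpers `bipolar_to_binary` and `to_c_initializer` are hypothetical, and the bipolar-to-binary mapping (w + 1) / 2 is assumed. </font>

```python
import os
import tempfile


def bipolar_to_binary(weights):
    """Map bipolar weights {-1, +1} to binary {0, 1} via (w + 1) / 2."""
    return [[(w + 1) // 2 for w in row] for row in weights]


def to_c_initializer(matrix, c_type, name):
    """Format a 2D list as a C array definition (no bit packing in this sketch)."""
    rows = ["{" + ", ".join(str(v) for v in row) + "}" for row in matrix]
    return "%s %s[%d][%d] = {%s};" % (
        c_type, name, len(matrix), len(matrix[0]), ", ".join(rows)
    )


weights = [[1, -1, 1], [-1, -1, 1]]
code = to_c_initializer(bipolar_to_binary(weights), "int", "weights")

# write the generated definition into params.h inside a temporary directory
with tempfile.TemporaryDirectory() as code_gen_dir:
    path = os.path.join(code_gen_dir, "params.h")
    with open(path, "w") as f:
        f.write(code + "\n")
    with open(path) as f:
        written = f.read()
```

<font size="3">The resulting header can then be pulled into the generated C++ code with `#include "params.h"`, which matches how the template refers to the parameter files. </font>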