# DNN-Based DDoS Anomaly Detection in the Network Data Plane

In this 4-part notebook series, we show how a quantized neural network (QNN) can be trained to classify packets as belonging to DDoS (malicious) or regular (benign) network traffic flows. The model is trained with quantized weights and activations, and we use the [Brevitas](https://github.com/Xilinx/brevitas) framework to train the QNN. The model is then converted into an FPGA-friendly RTL implementation for high-throughput inference, which can be integrated with a packet-processing pipeline in the network data plane.

This notebook series is composed of 4 parts. Below is a brief summary of what each part covers.

[Part 1](./1-train.ipynb): How to use Brevitas to train a quantized neural network for our target application, which is classifying packets as belonging to malicious/DDoS or benign/normal network traffic flows. The output trained model at the end of this part is a pure software implementation, i.e. it cannot be converted to a custom RTL FINN model to run on an FPGA just yet.

[Part 2](./2-prepare.ipynb): This notebook takes the software model from the previous part and prepares it for a hardware-friendly implementation using the FINN framework. It describes the "surgery" steps applied to the software model so that FINN can generate hardware from it. We also verify that none of the changes made in this notebook affect the output predictions of the modified model.

[Part 3](./3-build.ipynb): In this notebook, we use the FINN framework to build the custom RTL accelerator for our target model. FINN can generate a variety of RTL accelerators, and this notebook covers some build configuration parameters that influence these outputs.

[Part 4](./4-verify.ipynb): The generated hardware is simulated using cycle-accurate RTL simulation tools, and its outputs are compared against the original software-only model trained in part one. The output model from this step is now ready to be integrated into a larger FPGA design, which in this context is a packet-processing network data plane pipeline designed to distinguish anomalous DDoS flows from benign flows.

This tutorial series is a supplement to our demo paper presented at the EuroP4 2023 workshop, titled [Enabling DNN Inference in the Network Data Plane](https://dl.acm.org/doi/10.1145/3630047.3630191). You can cite our work using the following BibTeX snippet:

```
@inproceedings{siddhartha2023enabling,
  title={Enabling DNN Inference in the Network Data Plane},
  author={Siddhartha and Tan, Justin and Bansal, Rajesh and Chee Cheun, Huang and Tokusashi, Yuta and Yew Kwan, Chong and Javaid, Haris and Baldi, Mario},
  booktitle={Proceedings of the 6th on European P4 Workshop},
  pages={65--68},
  year={2023}
}
```

# Part 2: Perform surgery on software model to prepare it for hardware acceleration

In this part, we take the software model trained with Brevitas in the previous notebook ([Part One](./1-train.ipynb)) and perform model "surgery" on it to prepare it for hardware generation using the FINN framework. This is a common preprocessing step that needs to be carried out before going through the FINN model generation process. Depending on the model that was trained in part one, some of these steps may need to be customized for optimal results.

### House-keeping

Let's first get started with some house-keeping; we will import necessary libraries/packages and declare global constants for this notebook, similar to the house-keeping in part one.

Quick note: **always import onnx before torch**. This is a workaround for a [known bug](https://github.com/onnx/onnx/issues/2394).

In [2]:
import os
import onnx
import json
from os.path import join
from copy import deepcopy
import torch
import torch.nn as nn
import numpy as np
from torch.utils.data import DataLoader
from brevitas.nn import QuantLinear, QuantReLU, QuantIdentity
from brevitas.export import export_qonnx
from finn.transformation.qonnx.convert_qonnx_to_finn import ConvertQONNXtoFINN
from finn.util.visualization import showInNetron
from qonnx.util.cleanup import cleanup as qonnx_cleanup
from qonnx.core.modelwrapper import ModelWrapper
from qonnx.core.datatype import DataType
from qonnx.transformation.general import GiveReadableTensorNames, GiveUniqueNodeNames, RemoveStaticGraphInputs
from qonnx.transformation.infer_shapes import InferShapes
from qonnx.transformation.infer_datatypes import InferDataTypes
from qonnx.transformation.fold_constants import FoldConstants

from utils.common import bcolors
from utils.dataset import CICIDS2017_PerPacket
from utils.train_test import test, verify, verify_onnx

# Setting seeds for reproducibility
torch.manual_seed(0)
np.random.seed(0)

# Path to this end-to-end example's directory
EXAMPLE_DIR = join(os.environ['FINN_ROOT'], "notebooks/end2end_example/ddos-anomaly-detector")

# Path to build directory from part one
BUILD_DIR_P1 = join(EXAMPLE_DIR, "build", "part_01")

# Path to build directory to write outputs from this notebook to
BUILD_DIR = join(EXAMPLE_DIR, "build", "part_02")
os.makedirs(BUILD_DIR, exist_ok=True)

# Path to where datasets are stored
DATASET_DIR = join(EXAMPLE_DIR, "data")

# get the target device to run model on
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Target device: {device}")

Target device: cpu


There are two additional house-keeping steps for this notebook: (i) load the binarized test set for verification purposes, and (ii) load the trained model from part one.

In [3]:
# let's load the dataset_metadata.json config parameters
with open(join(BUILD_DIR_P1, "dataset_metadata.json"), "r") as fp:
    dataset_metadata = json.load(fp)

# extract feature names from metadata
feature_columns = [x[0] for x in dataset_metadata["ordering"]]

test_set_fpath = join(DATASET_DIR, "cicids2017-split.test.csv")
dataset = CICIDS2017_PerPacket(test_set_fpath)
test_set, _ = dataset.get_binarized_dataset(feature_columns)

# Batch size to use for inference
batch_size = 1000
test_loader = DataLoader(test_set, batch_size=batch_size, shuffle=False)

Loading CIC-IDS2017 per-packet-level dataset
Loaded dataset of length = 130777
Dataset statistics: 67375/130777 (51.52% TRUE labels)
Dataset metadata: {
    "total_in_bitwidth": 128,
    "ordering": [
        [
            "total_bytes",
            32
        ],
        [
            "duration_usec",
            64
        ],
        [
            "total_pkts",
            16
        ],
        [
            "total_urg",
            16
        ]
    ],
    "total_out_bitwidth": 1
}
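The metadata above describes four features whose bit widths add up to the 128-bit input vector (32 + 64 + 16 + 16). The implementation of `get_binarized_dataset()` is not shown in this notebook, so the following is only a minimal sketch of how such a binarization could work. The helper names `to_bits` and `binarize_record`, the example record values, and the most-significant-bit-first ordering are all assumptions made for illustration.

```python
# Field names and widths copied from the "ordering" entry of the metadata above.
ORDERING = [
    ("total_bytes", 32),
    ("duration_usec", 64),
    ("total_pkts", 16),
    ("total_urg", 16),
]


def to_bits(value, width):
    """Return `value` as a list of `width` bits, MSB first (assumed bit order)."""
    if not 0 <= value < (1 << width):
        raise ValueError(f"{value} does not fit in {width} bits")
    return [(value >> shift) & 1 for shift in range(width - 1, -1, -1)]


def binarize_record(record, ordering=ORDERING):
    """Concatenate the bit representation of each feature in metadata order."""
    bits = []
    for name, width in ordering:
        bits.extend(to_bits(record[name], width))
    return bits


# Hypothetical feature values for a single packet record.
example = {"total_bytes": 1500, "duration_usec": 250000, "total_pkts": 3, "total_urg": 0}
vector = binarize_record(example)
print(len(vector))  # one bit per model input
```

Each element of `vector` is a binary {0, 1} value, which matches the input range the trained network expects before any bipolar conversion is applied.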


In [4]:
# let's declare the model architecture -- note that this must be identical to the 
# model used in part one, or else there may be errors loading in the trained weights
input_size = dataset_metadata["total_in_bitwidth"]
hidden1 = 32
hidden2 = 32
weight_bit_width = 2
act_bit_width = 2
num_classes = dataset_metadata["total_out_bitwidth"]

model = nn.Sequential(
      QuantLinear(input_size, hidden1, bias=True, weight_bit_width=weight_bit_width),
      nn.BatchNorm1d(hidden1),
      nn.Dropout(0.5),
      QuantReLU(bit_width=act_bit_width),
      QuantLinear(hidden1, hidden2, bias=True, weight_bit_width=weight_bit_width),
      nn.BatchNorm1d(hidden2),
      nn.Dropout(0.5),
      QuantReLU(bit_width=act_bit_width),
      QuantLinear(hidden2, num_classes, bias=True, weight_bit_width=weight_bit_width)
)

# Make sure the model is on CPU before loading a pretrained state_dict
model = model.cpu()

# Load pretrained weights onto the CPU, even if they were saved from a GPU
trained_state_dict = torch.load(join(BUILD_DIR_P1, "trained_model.pth"), map_location="cpu")
model.load_state_dict(trained_state_dict, strict=True)

<All keys matched successfully>

Now that the model is loaded with our pre-trained weights, let's verify that it delivers the same test accuracy as observed in the previous notebook.

In [5]:
# Move model to target device and run inference on test set
model.to(device)
print(f"Trained model test accuracy = {100*test(model, test_loader, device):.4f}%")

Trained model test accuracy = 85.1426%


### Network Surgery

Often, it is desirable to make some changes to our trained network prior to generating FPGA RTL using FINN. This step is generally known as "network surgery". Whether it is needed depends on the model, and it is not always required, but in this case we want to make a couple of changes to get better results with FINN.

We start by first moving the model to the CPU.

In [6]:
# Move the model to CPU before surgery
model = model.cpu()

One common surgery step is to pad the input vector to a byte-aligned number of bits, i.e., a multiple of 8b. For example, in [this notebook](https://github.com/Xilinx/finn/blob/v0.10/notebooks/end2end_example/cybersecurity/1-train-mlp-with-brevitas.ipynb), the input vector is 593b wide and is padded to 600b to make the folding (parallelization) for the first layer simpler. The padding is done with zero values, and subsequent inference requests are made with an extra 7b of zero-padding on each input. The first layer's weight matrix is also affected, and zero weights must be added to the matrix. To see how that can be done, please refer to the "Network Surgery" section in the same notebook. **In this example notebook, our inputs are already byte-aligned, so we skip this step.**
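Although padding is skipped here, the idea can be checked in isolation. The sketch below is not the code from the referenced notebook; it uses NumPy with made-up integer weights, a hypothetical `pad_to_multiple` helper, and an arbitrary number of output features, and only demonstrates that zero-padding both the inputs and the first-layer weights leaves the layer output unchanged.

```python
import numpy as np


def pad_to_multiple(n, multiple=8):
    """Round n up to the next multiple (ceiling division)."""
    return -(-n // multiple) * multiple


rng = np.random.default_rng(0)
in_bits = 593        # input width quoted in the text
out_features = 64    # arbitrary value chosen for this illustration
padded_bits = pad_to_multiple(in_bits)  # 600

# Binary inputs and small integer weights, stored as (out, in) like torch.nn.Linear.
x = rng.integers(0, 2, size=(4, in_bits))
W = rng.integers(-2, 2, size=(out_features, in_bits))

# Append zero columns to both the inputs and the weight matrix.
extra = padded_bits - in_bits
x_padded = np.pad(x, ((0, 0), (0, extra)))
W_padded = np.pad(W, ((0, 0), (0, extra)))

# The extra terms are all zero, so the matrix product is identical.
assert np.array_equal(x @ W.T, x_padded @ W_padded.T)
print(x_padded.shape, W_padded.shape)
```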

We start by first making a copy of the original software model.

In [7]:
surgery_model = deepcopy(model)

Next, we'll modify the expected input/output ranges. In FINN, we prefer to work with bipolar {-1, +1} instead of binary {0, 1} values. To achieve this, we'll create a "wrapper" model that handles the pre/postprocessing as follows:

* on the input side, we'll pre-process by (x + 1) / 2 in order to map incoming {-1, +1} inputs to the {0, 1} inputs that the trained network expects. Since we're just adding and multiplying by scalars, these operations can be [*streamlined*](https://finn.readthedocs.io/en/latest/nw_prep.html#streamlining-transformations) by FINN and implemented with no extra cost.

* on the output side, we'll add a binary quantizer which maps everything below 0 to -1 and everything above 0 to +1. This is essentially the same behavior as the sigmoid we used earlier, except the outputs are bipolar instead of binary.
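The two mappings can be sanity-checked with a few lines of NumPy. This is a minimal sketch rather than part of the FINN flow: it assumes the sigmoid output in part one was thresholded at 0.5, uses made-up logits and weights, and avoids a logit of exactly 0, where the result would depend on the quantizer's tie-breaking. The last check illustrates one way to see why the scalar pre-processing is cheap: it can be folded into the following linear layer. This is not a description of what FINN's streamlining does internally.

```python
import numpy as np


def bipolar_to_binary(x):
    """Map {-1, +1} to {0, 1}, the input pre-processing used by the wrapper."""
    return (x + 1.0) / 2.0


def binary_to_bipolar(b):
    """Map {0, 1} back to {-1, +1}."""
    return 2.0 * b - 1.0


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


bipolar_in = np.array([-1.0, 1.0, 1.0, -1.0])
assert np.array_equal(bipolar_to_binary(bipolar_in), [0.0, 1.0, 1.0, 0.0])

# Thresholding the sigmoid at 0.5 makes the same decision as taking the sign
# of the logit; only the output encoding differs (binary vs. bipolar).
logits = np.array([-3.2, -0.1, 0.4, 2.5])
binary_pred = (sigmoid(logits) > 0.5).astype(np.float64)
bipolar_pred = np.where(logits > 0, 1.0, -1.0)
assert np.array_equal(binary_to_bipolar(binary_pred), bipolar_pred)

# Folding the scalar ops into a linear layer: W @ ((x + 1) / 2) equals
# (W @ x + row sums of W) / 2, so no separate pre-processing stage is needed.
W = np.array([[1, -2, 0, 1], [-1, 1, 1, -2], [0, 0, 1, 1]], dtype=np.float64)
folded = (W @ bipolar_in + W.sum(axis=1)) / 2.0
assert np.array_equal(W @ bipolar_to_binary(bipolar_in), folded)
```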

In [8]:
class ExportModel(nn.Module):
    def __init__(self, my_pretrained_model):
        super(ExportModel, self).__init__()
        self.pretrained = my_pretrained_model
        self.qnt_output = QuantIdentity(
            quant_type='binary', 
            scaling_impl_type='const',
            bit_width=1, min_val=-1.0, max_val=1.0)
    
    def forward(self, x):
        # assume x contains bipolar {-1,1} elems
        # shift from {-1,1} -> {0,1} since that is the
        # input range for the trained network
        x = (x + torch.tensor([1.0]).to(x.device)) / 2.0  
        out_original = self.pretrained(x)
        out_final = self.qnt_output(out_original)   # output as {-1,1}     
        return out_final

model_for_export = ExportModel(surgery_model)
model_for_export.to(device)

ExportModel(
  (pretrained): Sequential(
    (0): QuantLinear(
      in_features=128, out_features=32, bias=True
      (input_quant): ActQuantProxyFromInjector(
        (_zero_hw_sentinel): StatelessBuffer()
      )
      (output_quant): ActQuantProxyFromInjector(
        (_zero_hw_sentinel): StatelessBuffer()
      )
      (weight_quant): WeightQuantProxyFromInjector(
        (_zero_hw_sentinel): StatelessBuffer()
        (tensor_quant): RescalingIntQuant(
          (int_quant): IntQuant(
            (float_to_int_impl): RoundSte()
            (tensor_clamp_impl): TensorClampSte()
            (delay_wrapper): DelayWrapper(
              (delay_impl): _NoDelay()
            )
          )
          (scaling_impl): StatsFromParameterScaling(
            (parameter_list_stats): _ParameterListStats(
              (first_tracked_param): _ViewParameterWrapper(
                (view_shape_impl): OverTensorView()
              )
              (stats): _Stats(
                (stats_impl): AbsM

We can run inference on our test set using the same `test()` helper function as before, except that we now pass `bipolar=True` to indicate that the export model accepts bipolar inputs and produces bipolar outputs. There should be no change in accuracy.

In [9]:
print(f"Bipolar export model test accuracy = {100*test(model_for_export, test_loader, device, bipolar=True):.4f}%")

Bipolar export model test accuracy = 85.1426%


### Verification of bipolar model

We can go a step further and verify that each output from this model matches the expected output from our original trained Brevitas model. We provide a `verify()` method to simplify this verification step in software. Note that we only take a subset of the test set for doing the verification for faster completion. The terms in a confusion matrix -- true-positive (`tp`), true-negative (`tn`), false-positive (`fp`), and false-negative (`fn`) -- are also printed so we can be confident that both positive and negative classes were being predicted.
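The implementation of `verify()` lives in `utils/train_test.py` and is not reproduced here. As a rough illustration of the bookkeeping such a check might perform, the sketch below compares two lists of predictions and tallies the confusion-matrix terms, treating the reference model's output as ground truth. The function name `compare_predictions` and the sample predictions are hypothetical.

```python
def compare_predictions(reference, candidate, positive=1):
    """Tally agreement between a reference model and a candidate model.

    The reference output is treated as ground truth, so any mismatch shows
    up as a false positive or a false negative of the candidate.
    """
    if len(reference) != len(candidate):
        raise ValueError("prediction lists must have the same length")

    counts = {"ok": 0, "nok": 0, "tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for ref, cand in zip(reference, candidate):
        ref_pos = ref == positive
        cand_pos = cand == positive
        if ref_pos and cand_pos:
            counts["tp"] += 1
        elif not ref_pos and not cand_pos:
            counts["tn"] += 1
        elif cand_pos:
            counts["fp"] += 1
        else:
            counts["fn"] += 1
    counts["ok"] = counts["tp"] + counts["tn"]
    counts["nok"] = counts["fp"] + counts["fn"]
    return counts


# Made-up predictions: the candidate disagrees with the reference twice.
reference = [1, 0, 1, 1, 0]
candidate = [1, 0, 0, 1, 1]
result = compare_predictions(reference, candidate)
print(result)
```

A passing verification corresponds to `nok == 0`, with both `tp` and `tn` non-zero so that neither class is missing from the subset.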

In [10]:
num_verif = 1000
verif_tensors = test_set.tensors[0][:num_verif]
if verify(model_for_export, model, verif_tensors, device, bipolar=True):
    print(f"{bcolors.OKGREEN}Model output matches with reference Brevitas software model output!{bcolors.ENDC}")
else:
    print(f"{bcolors.FAIL}Model output differs from reference Brevitas software model output! Something went wrong...{bcolors.ENDC}")

ok 1000 nok 0 (tp=367, tn=633, fp=0, fn=0): 100%|███████████████| 1000/1000 [00:04<00:00, 232.09it/s]

Model output matches with reference Brevitas software model output!

### Exporting the model

Now for the final step in this notebook: exporting the model into the QONNX format.

[ONNX](https://onnx.ai/) is an open format built to represent machine learning models, and the FINN compiler expects an ONNX model as input. We'll now export our network into ONNX to be imported and used in FINN for the next notebooks. Note that the particular ONNX representation used for FINN differs from standard ONNX; you can read more about this [here](https://finn.readthedocs.io/en/latest/internals.html#intermediate-representation-finn-onnx).

The cell below shows how we export a trained Brevitas network into a FINN-compatible ONNX representation. Brevitas exports to the QONNX format, but the FINN compiler works on an intermediate representation called FINN-ONNX, so a conversion is needed. That conversion is a FINN compiler transformation. To apply it, we first wrap the exported model in a [ModelWrapper](https://finn.readthedocs.io/en/latest/internals.html#modelwrapper), a wrapper around the ONNX model that provides several helper functions to make the model easier to work with. We then call the conversion function to obtain the model in FINN-ONNX format.

In [11]:
# declare path to output ONNX file to save to
model_for_export_fpath = join(BUILD_DIR, "ready-for-finn.onnx")

# Start the export process
input_shape = (1, dataset_metadata["total_in_bitwidth"])

# create a dummy bipolar {-1, +1} input tensor to trace the model during export
# (the upper bound of randint is exclusive, so use 2 to draw both 0 and 1)
input_a = np.random.randint(0, 2, size=input_shape).astype(np.float32)
input_a = 2 * input_a - 1
scale = 1.0
input_t = torch.from_numpy(input_a * scale)

# Move to CPU before export
model_for_export.cpu()

# Export to ONNX
export_qonnx(
    model_for_export,
    export_path=model_for_export_fpath,
    input_t=input_t
)

# clean-up
qonnx_cleanup(model_for_export_fpath, out_file=model_for_export_fpath)

# ModelWrapper
model_for_export = ModelWrapper(model_for_export_fpath)

# Setting the input datatype explicitly because it doesn't get derived from the export function
model_for_export.set_tensor_datatype(model_for_export.graph.input[0].name, DataType["BIPOLAR"])
model_for_export = model_for_export.transform(ConvertQONNXtoFINN())
model_for_export.save(model_for_export_fpath)

print(f"Model saved to {model_for_export_fpath}")



Model saved to /home/sids/workspace/project_find_ml/finn/notebooks/end2end_example/ddos-anomaly-detector/build/part_02/ready-for-finn.onnx


### Viewing the ONNX Model in Netron

We can visualize what the exported ONNX model looks like using [Netron](https://github.com/lutzroeder/netron), which is a visualizer for neural networks and allows interactive investigation of network properties. For example, you can click on the individual nodes and view the properties. Particular things of note:

* The input tensor `global_in` is annotated with `finn_datatype = BIPOLAR`
* The input preprocessing (x + 1) / 2 is exported as part of the network (initial `Add` and `Div` layers)
* Brevitas `QuantLinear` layers are exported to ONNX as `MatMul`. Since no input padding was needed, the shape of the first MatMul node's weight parameter is 128x32 (`num_inputs` x `size_of_first_layer`).
* The weight parameters (second inputs) for MatMul nodes are annotated with `finn_datatype = INT2`
* The quantized activations are exported as `MultiThreshold` nodes with `module = qonnx.custom_op.general`
* There's a final `MultiThreshold` node with threshold = 0 (second input) to produce the final bipolar output (this is the `qnt_output` from `ExportModel`)
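To build some intuition for the `MultiThreshold` nodes listed above, here is a minimal NumPy sketch of the operation. It assumes the node counts how many thresholds each input meets or exceeds and then applies an output scale and bias (the `out_scale` and `out_bias` parameters below mirror the attribute names of the QONNX op). The threshold values for the 2-bit activation are made up for illustration and are not taken from the exported model.

```python
import numpy as np


def multithreshold(x, thresholds, out_scale=1.0, out_bias=0.0):
    """Count, per channel, how many thresholds each input meets or exceeds.

    x:          array of shape (batch, channels)
    thresholds: array of shape (channels, num_thresholds), ascending per row
    """
    counts = (x[:, :, None] >= thresholds[None, :, :]).sum(axis=-1)
    return out_scale * counts + out_bias


# Final bipolar output: a single threshold at 0, rescaled from {0, 1} to {-1, +1}.
final = multithreshold(
    np.array([[-0.7], [0.3], [1.5]]),
    np.array([[0.0]]),
    out_scale=2.0,
    out_bias=-1.0,
)

# A 2-bit activation needs 3 thresholds to produce the 4 levels 0..3.
act = multithreshold(
    np.array([[0.2], [0.9], [1.7], [4.0]]),
    np.array([[0.5, 1.5, 2.5]]),
)
print(final.ravel(), act.ravel())
```

In general, an n-bit quantized activation maps to 2^n - 1 thresholds per channel, which is why the single-threshold node at the output is enough for a 1-bit bipolar result.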

In [12]:
showInNetron(model_for_export_fpath)

Serving '/home/sids/workspace/project_find_ml/finn/notebooks/end2end_example/ddos-anomaly-detector/build/part_02/ready-for-finn.onnx' at http://0.0.0.0:8081


### Verification of ONNX model

We can also verify the exported ONNX model to make sure that we have not lost any functional correctness in the steps above. Since this is now an ONNX model, we need ONNX runtime to execute the graph and produce the inference output.

Before running the verification, we need to prepare our FINN-ONNX model. In particular, all the intermediate tensors need to have statically defined shapes. To do this, we apply some graph transformations to the model like a kind of "tidy-up" to make it easier to process. 

**Graph transformations in FINN:** The whole FINN compiler is built around the idea of transformations, which gradually transform the model into a synthesizable hardware description. Although FINN offers functionality that automatically calls a standard sequence of transformations (covered in the next notebook), you can also manually call individual transformations (like we do here), as well as adding your own transformations, to create custom flows. You can read more about these transformations in [this notebook](https://github.com/Xilinx/finn/blob/v0.10/notebooks/end2end_example/bnn-pynq/tfc_end2end_example.ipynb).
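The transformation pattern itself is simple enough to sketch without FINN installed. The toy code below is only a stand-in: the "model" is a plain list of operator names, and the class `RemoveIdentityNodes` and the free function `transform` are hypothetical. It assumes the convention that a transformation returns the (possibly modified) model together with a flag indicating whether it should be applied again.

```python
class Transformation:
    """Toy base class: apply() returns (model, needs_another_pass)."""

    def apply(self, model):
        raise NotImplementedError


def transform(model, transformation):
    """Re-apply the transformation until it reports no further changes."""
    modified = True
    while modified:
        model, modified = transformation.apply(model)
    return model


class RemoveIdentityNodes(Transformation):
    """Hypothetical clean-up pass that drops one 'Identity' node per call."""

    def apply(self, model):
        for index, op_type in enumerate(model):
            if op_type == "Identity":
                # Return a new list so the caller's graph is left untouched.
                return model[:index] + model[index + 1:], True
        return model, False


toy_graph = ["Add", "Identity", "MatMul", "Identity", "MultiThreshold"]
cleaned = transform(toy_graph, RemoveIdentityNodes())
print(cleaned)
```

Chaining several such passes, each one small and independently testable, is the same style used by the sequence of `model.transform(...)` calls in the next cell.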

In [13]:
# load the model from file
model_for_verif = ModelWrapper(model_for_export_fpath)

# apply the tidy-up transformations
model_for_verif = model_for_verif.transform(InferShapes())
model_for_verif = model_for_verif.transform(FoldConstants())
model_for_verif = model_for_verif.transform(GiveUniqueNodeNames())
model_for_verif = model_for_verif.transform(GiveReadableTensorNames())
model_for_verif = model_for_verif.transform(InferDataTypes())
model_for_verif = model_for_verif.transform(RemoveStaticGraphInputs())

# save the verification model
model_for_verif_fpath = join(BUILD_DIR, "ready-for-finn-verif.onnx")
model_for_verif.save(model_for_verif_fpath)

**Would the FINN compiler still work if we didn't do this?** The compilation step in the next notebook applies these transformations internally and would work fine, but we're going to use FINN's verification capabilities below, and these require the tidy-up transformations. Note that in the FINN v0.10 release, `qonnx_cleanup()` already handles most, if not all, of these transformations, so they may not be strictly needed for the verification step. However, running them again doesn't produce any unwanted side effects and can be thought of as a sanity check for edge cases where some annotations are missing. This is especially important for the `InferShapes()` and `InferDataTypes()` transformations, as the ONNX runtime needs to know the shapes and data types of tensors in order to process the graph during inference.

We can view our "verification" model after the transformations. Note that all intermediate tensors must have their shapes specified (indicated by numbers next to the arrows going between layers). Additionally, the `InferDataTypes()` transformation has propagated quantization annotations to the outputs of `MultiThreshold` layers (expand by clicking the + next to the name of the tensor to see the quantization annotation) and the final output tensor.

In [14]:
showInNetron(model_for_verif_fpath)

Stopping http://0.0.0.0:8081
Serving '/home/sids/workspace/project_find_ml/finn/notebooks/end2end_example/ddos-anomaly-detector/build/part_02/ready-for-finn-verif.onnx' at http://0.0.0.0:8081


Similar to the verification step above, we have provided a helper function, `verify_onnx()`, that verifies that the outputs of the ONNX model match those from the golden Brevitas reference model.

In [15]:
num_verif = 100
verif_tensors = test_set.tensors[0][:num_verif]
if verify_onnx(model_for_verif, model, verif_tensors, device):
    print(f"{bcolors.OKGREEN}FINN-ONNX model output matches with reference Brevitas software model output!{bcolors.ENDC}")
else:
    print(f"{bcolors.FAIL}FINN-ONNX model output differs from reference Brevitas software model output! Something went wrong...{bcolors.ENDC}")

ok 100 nok 0 (tp=35, tn=65, fp=0, fn=0): 100%|█████████████████████| 100/100 [00:05<00:00, 17.87it/s]

FINN-ONNX model output matches with reference Brevitas software model output!


That's it! If all the verification steps passed, we are ready to move on to the next notebook in this series: [Part 3](./3-build.ipynb).