# Use a pre-trained neural network to reproduce published results

Paper: *"Fine-grained TLS services classification with reject option"*

In [None]:
import sys
!{sys.executable} -m pip install "torch>=1.10" --index-url https://download.pytorch.org/whl/cu118
# scikit-learn is listed explicitly because the last cell imports sklearn.metrics
!{sys.executable} -m pip install cesnet_datazoo cesnet_models tqdm scikit-learn

Load a pre-trained neural network. We use mm-CESNET-v1, the first version of the multi-modal CESNET architecture. The selected weights were trained on data from week 40 of the CESNET-TLS22 dataset.

In [2]:
from cesnet_models.models import MM_CESNET_V1_Weights, mm_cesnet_v1

pretrained_weights = MM_CESNET_V1_Weights.CESNET_TLS22_WEEK40
model = mm_cesnet_v1(weights=pretrained_weights)

Downloading: "https://liberouter.org/datazoo/download?bucket=cesnet-models&file=mmv1_CESNET_TLS22_WEEK40.pth" to C:\Users\janlu/.cache\torch\hub\checkpoints\mmv1_CESNET_TLS22_WEEK40.pth
100%|██████████| 4.70M/4.70M [00:00<00:00, 8.75MB/s]


Download the CESNET-TLS22 dataset and initialize its dataset class.

Prepare the dataset configuration:

- Select the test period (`W-2021-41`). Samples from this week are used to test the model.
- Use the data transforms bundled with the pre-trained weights.
- Select the same application classes the model was trained on.
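
The last point matters because a classifier only outputs class indices, and those indices are meaningful only relative to a fixed class list. The following is a minimal sketch of that idea in plain Python. The class names and the helper `decode_predictions` are hypothetical, and the sketch assumes that output index `i` corresponds to the `i`-th entry of the class list; in this notebook the real list comes from `pretrained_weights.meta["classes"]`.

```python
# Hypothetical class list, for illustration only. In the notebook the real
# list is taken from pretrained_weights.meta["classes"].
model_classes = ["alpha-service", "beta-service", "gamma-service"]


def decode_predictions(pred_indices, classes):
    """Map predicted class indices back to class names.

    Assumes index i refers to classes[i] (an assumption of this sketch).
    """
    return [classes[i] for i in pred_indices]


decoded = decode_predictions([2, 0, 1], model_classes)
print(decoded)
```

If the dataset were configured with a different set of classes, the same indices would decode to different names, and the reported accuracy would no longer describe the pre-trained model.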

In [5]:
from cesnet_datazoo.datasets import CESNET_TLS22
from cesnet_datazoo.config import DatasetConfig, AppSelection

dataset = CESNET_TLS22(data_root="data/CESNET-TLS22/", size="XS")

dataset_config = DatasetConfig(
    dataset=dataset,
    test_period_name="W-2021-41",
    ppi_transform=pretrained_weights.transforms["ppi_transform"],
    flowstats_transform=pretrained_weights.transforms["flowstats_transform"],
    use_tcp_features=pretrained_weights.meta["use_tcp_features"],
    apps_selection=AppSelection.FIXED,
    apps_selection_fixed_known=pretrained_weights.meta["classes"],
    need_train_set=False,
    return_tensors=True,
)

dataset.set_dataset_config_and_initialize(dataset_config)
test_dataloader = dataset.get_test_dataloader()

Iterate over the test dataloader and use the model to compute predictions. Use a GPU if one is available.

In [6]:
import torch
from torch import nn
from torch.utils.data import DataLoader
from tqdm import tqdm

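def compute_model_predictions(model: nn.Module, test_dataloader: DataLoader, device) -> tuple[torch.Tensor, torch.Tensor]:
    """Run the model over the test dataloader and return (y_true, y_pred) as CPU tensors."""
    model.eval()  # switch to inference mode (affects layers such as dropout and batch norm)
    y_true = []
    y_pred = []
    with torch.no_grad():  # gradients are not needed for evaluation
        # The first element of each batch is not needed for prediction and is ignored.
        for _, batch_ppi, batch_flowstats, batch_labels in tqdm(test_dataloader, total=len(test_dataloader)):
            batch_ppi, batch_flowstats, batch_labels = batch_ppi.to(device), batch_flowstats.to(device), batch_labels.to(device)
            out = model((batch_ppi, batch_flowstats))
            # The predicted class is the index of the largest output score in each row.
            preds = out.argmax(dim=1)
            y_true.append(batch_labels)
            y_pred.append(preds)
    # Concatenate the per-batch results and move them to the CPU for scoring.
    y_true, y_pred = torch.cat(y_true).cpu(), torch.cat(y_pred).cpu()
    return y_true, y_pred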

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = model.to(device)
y_true, y_pred = compute_model_predictions(model, test_dataloader, device=device)

100%|██████████| 2355/2355 [00:21<00:00, 111.38it/s]
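
The prediction step in the loop above, `out.argmax(dim=1)`, reduces each row of model outputs to the index of its largest score. As a rough illustration, the same reduction can be sketched in plain Python on made-up scores. The helper `argmax_rows` and the numbers below are hypothetical and are not produced by the model.

```python
def argmax_rows(scores):
    """Return, for each row, the index of its largest value."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


# Made-up output scores for a batch of three samples and three classes.
batch_scores = [
    [0.1, 2.3, -0.5],
    [1.7, 0.2, 0.9],
    [-1.0, -0.2, 0.4],
]

preds = argmax_rows(batch_scores)
print(preds)  # [1, 0, 2]
```

Because only the position of the largest score matters, no softmax is needed before taking the argmax.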


Finally, compute the classification accuracy.

In [7]:
from sklearn.metrics import accuracy_score

accuracy = accuracy_score(y_true, y_pred)
print(f"Accuracy: {accuracy:.4f}")

Accuracy: 0.9702
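
For reference, classification accuracy is the fraction of samples whose predicted label equals the true label. The sketch below computes it by hand on toy labels. The helper `fraction_correct` and the toy data are hypothetical, chosen only to make the arithmetic easy to check.

```python
def fraction_correct(true_labels, predicted_labels):
    """Return the share of positions where the two label sequences agree."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Toy labels: four of the five predictions match.
toy_true = [0, 1, 2, 2, 1]
toy_pred = [0, 1, 1, 2, 1]

toy_accuracy = fraction_correct(toy_true, toy_pred)
print(f"Accuracy: {toy_accuracy:.4f}")  # Accuracy: 0.8000
```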
