Convolutional (Conv2d)
https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d

Average Pooling
https://pytorch.org/docs/stable/generated/torch.nn.AvgPool2d.html#torch.nn.AvgPool2d

Normalization
https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm1d.html#torch.nn.BatchNorm1d

Spatial Dropout
https://pytorch.org/docs/stable/generated/torch.nn.Dropout2d.html#torch.nn.Dropout2d

Fully connected layer -> nn.Linear()


loss: cross entropy

activation after the linear layer: ReLU

activation after the 16 output neurons: softmax

they used L1, L2 and dropout, but I did not look into that in detail

batch size: 64 and 128 gave the best results

they checked experimentally that 100 epochs is enough for convergence

Adam with a learning rate of 0.0001
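
The notes above list softmax as the output activation and cross entropy as the loss. In PyTorch the two are usually fused: `F.cross_entropy` takes raw logits and applies log-softmax internally, which is why the models below end in a plain `nn.Linear`. A minimal plain-Python sketch of that computation, with made-up logits (the helper names `log_softmax` and `cross_entropy_from_logits` are assumptions for illustration, not PyTorch functions):

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax: subtract the max before exponentiating."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum_exp for v in logits]

def cross_entropy_from_logits(logits, target_index):
    """Cross entropy for one sample: negative log-probability of the true class."""
    return -log_softmax(logits)[target_index]

# Hypothetical logits for a 4-class problem; class 2 is the true label.
logits = [1.0, -0.5, 3.0, 0.2]
loss = cross_entropy_from_logits(logits, target_index=2)

# Same value computed the long way: softmax first, then -log(p[target]).
exps = [math.exp(v) for v in logits]
probs = [e / sum(exps) for e in exps]
assert abs(loss - (-math.log(probs[2]))) < 1e-12
print(f"loss = {loss:.4f}")
```

Adding an explicit softmax layer on top of this would apply softmax twice during training, so the softmax is better left to inference code that needs probabilities.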


### 1D, 2D and 3D network models

In [7]:
class Model1D(nn.Module):
    def __init__(self):
        super().__init__()
        self.convblock1 = nn.Sequential(
            nn.Conv1d(in_channels=1, out_channels=32, kernel_size=5, padding='same'),
            nn.ReLU(inplace=True),
            nn.BatchNorm1d(32),
            nn.AvgPool1d(2),
            nn.Dropout(p=0.05)
        )

        self.convblock2 = nn.Sequential(
            nn.Conv1d(in_channels=32, out_channels=64, kernel_size=5, padding='same'),
            nn.ReLU(inplace=True),
            nn.BatchNorm1d(64),
            nn.AvgPool1d(2),
            nn.Dropout(p=0.05)
        )

        self.convblock3 = nn.Sequential(
            nn.Conv1d(in_channels=64, out_channels=128, kernel_size=5, padding='same'),
            nn.ReLU(inplace=True),
            nn.BatchNorm1d(128),
            nn.AvgPool1d(2),
            nn.Dropout(p=0.05)
        )

        self.convblock4 = nn.Sequential(
            nn.Conv1d(in_channels=128, out_channels=256, kernel_size=5, padding='same'),
            nn.ReLU(inplace=True),
            nn.BatchNorm1d(256),
            nn.AvgPool1d(2),
            nn.Dropout(p=0.05)
        )

        # The block name is open to discussion; it is just what came to mind and I am not attached to it.
        self.evaluator = nn.Sequential(
            nn.Flatten(),
            nn.Linear(1024, 512),
            nn.ReLU(inplace=True),
            nn.Linear(512, 16)
            # No activation at the end: cross entropy applies (log-)softmax itself, so the model returns raw logits.
        )

    def forward(self, x):
        x = self.convblock1(x)
        x = self.convblock2(x)
        x = self.convblock3(x)
        x = self.convblock4(x)
        x = self.evaluator(x)
        return x

In [8]:
class Model2D(nn.Module):
    def __init__(self, num_classes=16):
        # num_classes defaults to 16 (the value used so far) and can be overridden,
        # e.g. by AnomalyClassifier, which passes the number of classes explicitly.
        super().__init__()
        self.convblock1 = nn.Sequential(
            nn.Conv2d(in_channels=1, out_channels=32, kernel_size=5, padding='same'),
            nn.ReLU(inplace=True),
            nn.BatchNorm2d(32),
            nn.AvgPool2d(2),
            nn.Dropout(p=0.05)
        )

        self.convblock2 = nn.Sequential(
            nn.Conv2d(in_channels=32, out_channels=64, kernel_size=5, padding='same'),
            nn.ReLU(inplace=True),
            nn.BatchNorm2d(64),
            nn.AvgPool2d(2),
            nn.Dropout(p=0.05)
        )

        self.convblock3 = nn.Sequential(
            nn.Conv2d(in_channels=64, out_channels=128, kernel_size=5, padding='same'),
            nn.ReLU(inplace=True),
            nn.BatchNorm2d(128),
            nn.AvgPool2d(2),
            nn.Dropout(p=0.05)
        )

        self.convblock4 = nn.Sequential(
            nn.Conv2d(in_channels=128, out_channels=256, kernel_size=5, padding='same'),
            nn.ReLU(inplace=True),

            # The paper applies normalization here; PyTorch raised an error when I tried it.
            # The input to this layer has shape [batch_size, 256, 1, 1]. BatchNorm2d computes
            # its statistics per channel over the batch and spatial dimensions together, so
            # with batch_size > 1 it works and the output is not all zeros.
            # With a single sample in training mode there is only one value per channel:
            # shifting its mean to 0 would turn the value itself into 0, and PyTorch raises
            # an error instead. So the layer is fine for batched training, but not for
            # batch_size == 1 in train mode.

            nn.BatchNorm2d(256),
            nn.Dropout(p=0.05)
        )

        self.evaluator = nn.Sequential(
            nn.Flatten(),
            nn.Linear(256, 512),
            nn.ReLU(inplace=True),
            nn.Linear(512, num_classes)
        )

    def forward(self, x):
        x = torch.reshape(x, (x.shape[0],1,8,8)) # this reshape could be moved elsewhere if a better place turns up
        x = self.convblock1(x)
        x = self.convblock2(x)
        x = self.convblock3(x)
        x = self.convblock4(x)
        x = self.evaluator(x)
        return x
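
The comment in `convblock4` is about what batch normalization does to an input of shape `[batch_size, 256, 1, 1]`. According to the `BatchNorm2d` documentation, the mean and variance are computed per channel over the batch and spatial dimensions together, so with a 1x1 feature map the statistics still come from `batch_size` values. A plain-Python sketch of the normalization formula for a single channel follows; it omits the learnable scale and shift, uses `eps=1e-5` (the PyTorch default), and the function name `batch_norm_channel` is made up for illustration:

```python
import math

def batch_norm_channel(values, eps=1e-5):
    """Normalize one channel's values using the biased variance,
    as batch norm does in training mode (affine parameters omitted)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return [(v - mean) / math.sqrt(var + eps) for v in values]

# One channel of a [4, 256, 1, 1] batch: four values, one per sample.
batch_of_four = batch_norm_channel([0.3, 1.2, 0.0, 2.5])

# One channel of a [1, 256, 1, 1] batch: a single value.
batch_of_one = batch_norm_channel([0.3])

print(batch_of_four)  # zero mean, but the individual values differ
print(batch_of_one)   # [0.0]: a single value always normalizes to zero
```

In training mode PyTorch rejects the single-value case with an error ("Expected more than 1 value per channel when training"), which would explain an error seen when one sample at a time is pushed through the model. With batches of 64 the layer has enough values per channel to work with.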

In [9]:
class Model3D(nn.Module):
    def __init__(self, num_classes=16):
        # num_classes defaults to 16 (the value used so far) and can be overridden,
        # e.g. by AnomalyClassifier, which passes the number of classes explicitly.
        super().__init__()
        self.convblock1 = nn.Sequential(
            nn.Conv3d(in_channels=1, out_channels=32, kernel_size=5, padding='same'),
            nn.ReLU(inplace=True),
            nn.BatchNorm3d(32),
            nn.AvgPool3d(2),
            nn.Dropout(p=0.05)
        )

        self.convblock2 = nn.Sequential(
            nn.Conv3d(in_channels=32, out_channels=64, kernel_size=5, padding='same'),
            nn.ReLU(inplace=True),
            nn.BatchNorm3d(64),
            nn.AvgPool3d(2),
            nn.Dropout(p=0.05)
        )

        # No pooling from here on: after two poolings the 4x4x4 volume is already 1x1x1.
        self.convblock3 = nn.Sequential(
            nn.Conv3d(in_channels=64, out_channels=128, kernel_size=5, padding='same'),
            nn.ReLU(inplace=True),
            nn.BatchNorm3d(128),
            nn.Dropout(p=0.05)
        )

        self.convblock4 = nn.Sequential(
            nn.Conv3d(in_channels=128, out_channels=256, kernel_size=5, padding='same'),
            nn.ReLU(inplace=True),
            nn.BatchNorm3d(256),
            nn.Dropout(p=0.05)
        )

        self.evaluator = nn.Sequential(
            nn.Flatten(),
            nn.Linear(256, 512),
            nn.ReLU(inplace=True),
            nn.Linear(512, num_classes)
        )

    def forward(self, x):
        x = torch.reshape(x, (x.shape[0],1,4,4,4))
        x = self.convblock1(x)
        x = self.convblock2(x)
        x = self.convblock3(x)
        x = self.convblock4(x)
        x = self.evaluator(x)
        return x
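
The `in_features` of the first `nn.Linear` in each model follows from the input shape and the number of pooling layers: convolutions with `padding='same'` keep the spatial size, and each average pooling with kernel 2 halves it (floor division). A small plain-Python check of that arithmetic (the helper `flattened_features` is hypothetical, written only for this illustration):

```python
def flattened_features(spatial_shape, n_pools, out_channels=256, pool=2):
    """Number of features after nn.Flatten(), assuming size-preserving
    convolutions and `n_pools` pooling layers with kernel/stride `pool`."""
    size = out_channels
    for dim in spatial_shape:
        for _ in range(n_pools):
            dim //= pool  # pooling with kernel 2 halves each spatial dim
        size *= dim
    return size

# 64 input features viewed as a 1D signal, an 8x8 image, or a 4x4x4 volume.
print(flattened_features((64,), n_pools=4))       # Model1D: 256 * 4
print(flattened_features((8, 8), n_pools=3))      # Model2D: 256 * 1 * 1
print(flattened_features((4, 4, 4), n_pools=2))   # Model3D: 256 * 1 * 1 * 1
```

This is also why `Model2D` has only three pooling layers and `Model3D` only two: one more pooling would shrink a spatial dimension to zero.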

# Lightning

### imports

In [4]:
!pip install polars --upgrade

Collecting polars
  Downloading polars-0.20.26-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (28.0 MB)
Installing collected packages: polars
  Attempting uninstall: polars
    Found existing installation: polars 0.20.2
    Uninstalling polars-0.20.2:
      Successfully uninstalled polars-0.20.2
Successfully installed polars-0.20.26


In [5]:
! pip install pytorch-lightning

Collecting pytorch-lightning
  Downloading pytorch_lightning-2.2.4-py3-none-any.whl (802 kB)
Collecting torchmetrics>=0.7.0 (from pytorch-lightning)
  Downloading torchmetrics-1.4.0.post0-py3-none-any.whl (868 kB)
Collecting lightning-utilities>=0.8.0 (from pytorch-lightning)
  Downloading lightning_utilities-0.11.2-py3-none-any.whl (26 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch>=1.13.0->pytorch-lightning)
  Using cached nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)
Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch>=1.13.0->pytorch-lightning)
  Using cached nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)
Collecting nvidia-cuda-cupti-cu12==12.1.105 (from torch>=1.13.0-

In [6]:
import os
import gdown
import torch
import pytorch_lightning as pl
import polars
from torch.utils.data import DataLoader, random_split
import torch.nn as nn
from torchmetrics.functional import accuracy
import torch.nn.functional as F

### data

In [10]:
class LazyDataset(torch.utils.data.Dataset):
  """Stores integer labels and builds the one-hot target only when a sample is requested."""

  def __init__(self, inputs: torch.Tensor, labels: torch.Tensor, n_classes: int):
    self.inputs = inputs
    self.labels = labels
    self.n_classes = n_classes

  def __len__(self):
    return self.labels.shape[0]

  def __getitem__(self, idx):
    # One-hot encode the label on the fly (float targets, which F.cross_entropy accepts).
    target = torch.zeros(self.n_classes)
    target[self.labels[idx].item()] = 1
    return self.inputs[idx], target

In [11]:
from torchvision.transforms import Normalize

class IoTDataModule(pl.LightningDataModule):
  def __init__(self, file_id: str, file_name: str, batch_size: int=64, binary_classification: bool=False):
    super().__init__()
    self.file_id = file_id
    self.file_name = file_name
    self.batch_size = batch_size
    self.binary_classification = binary_classification

    if self.binary_classification:
      self.n_classes = 2
      self.mapping = {"Normal": 0, "Anomaly": 1}
    else:
      self.n_classes = 5
      self.mapping = {'Normal': 0, 'Mirai': 1, 'DoS': 2, 'Scan': 3, 'MITM ARP Spoofing': 4}

  def prepare_data(self):
    if not os.path.exists(self.file_name):
      gdown.download(id=self.file_id, output=self.file_name)

  def setup(self,stage = None):
    data = torch.reshape(polars.read_csv(self.file_name, columns = range(64)).cast(polars.Float32).to_torch(), (-1, 1, 64))

    if self.binary_classification:
      labels = torch.reshape(polars.read_csv(self.file_name, columns = [64])['Label'].replace(self.mapping, return_dtype=polars.UInt8).to_torch(), (-1, 1))
    else:
      #labels = torch.reshape(polars.read_csv(self.file_name, columns = [65])['Cat'].replace(self.mapping, return_dtype=polars.UInt8).to_torch(), (-1, 1))
      labels = polars.read_csv(self.file_name, columns = [65])['Cat'].replace(self.mapping, return_dtype=polars.UInt8).to_torch()
    dataset = LazyDataset(data, labels, self.n_classes)
    del data, labels

    # Train, test and validation splits.
    l = len(dataset)
    train_and_val_size = int(l * .75)
    dataset, self.test_dataset = random_split(dataset, [train_and_val_size, l - train_and_val_size])

    ll = len(dataset)
    train_size = int( ll * .9)
    self.train_dataset, self.val_dataset = random_split(dataset, [train_size, ll - train_size])
    del dataset

  def train_dataloader(self):
    return DataLoader(self.train_dataset, batch_size=self.batch_size, shuffle=True)

  def val_dataloader(self):
    return DataLoader(self.val_dataset, batch_size=self.batch_size)

  def test_dataloader(self):
    return DataLoader(self.test_dataset, batch_size=self.batch_size)
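
The two-stage `random_split` in `setup` first sets aside 25% of the samples for testing and then divides the remaining 75% into 90% training and 10% validation, which works out to roughly 67.5% / 7.5% / 25% of the full dataset. A plain-Python sketch of the size arithmetic, including the `int()` truncation (the function name and the dataset size of 100,000 are made up for illustration, not taken from the CSV file):

```python
def split_sizes(n_samples, train_val_fraction=0.75, train_fraction=0.9):
    """Mirror the two-stage split in `setup`: first carve off the test set,
    then split the remainder into train and validation."""
    train_val = int(n_samples * train_val_fraction)
    test = n_samples - train_val
    train = int(train_val * train_fraction)
    val = train_val - train
    return train, val, test

# Hypothetical dataset size, chosen only to make the proportions visible.
train, val, test = split_sizes(100_000)
print(train, val, test)
```

Because the remainders are computed by subtraction, the three sizes always add up to the dataset length. Note that `random_split` draws a different split on every run unless it is given a seeded `generator` argument.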

In [12]:
dm = IoTDataModule(file_id='1k8RqOM7hBvL8uomcKnnLr5HasYRWnYeW', file_name='iot-intrusion_with_headers.csv')
dm.prepare_data()
dm.setup()

Downloading...
From (original): https://drive.google.com/uc?id=1k8RqOM7hBvL8uomcKnnLr5HasYRWnYeW
From (redirected): https://drive.google.com/uc?id=1k8RqOM7hBvL8uomcKnnLr5HasYRWnYeW&confirm=t&uuid=fc74117d-6526-4f39-93b1-2db44bc61c87
To: /content/iot-intrusion_with_headers.csv
100%|██████████| 212M/212M [00:01<00:00, 106MB/s]


### model

In [13]:
class Model1D(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        self.convblock1 = nn.Sequential(
            nn.Conv1d(in_channels=1, out_channels=32, kernel_size=5, padding='same'),
            nn.ReLU(),
            nn.BatchNorm1d(32),
            nn.AvgPool1d(2),
            nn.Dropout(p=0.05)
        )

        self.convblock2 = nn.Sequential(
            nn.Conv1d(in_channels=32, out_channels=64, kernel_size=5, padding='same'),
            nn.ReLU(inplace=True),
            nn.BatchNorm1d(64),
            nn.AvgPool1d(2),
            nn.Dropout(p=0.05)
        )

        self.convblock3 = nn.Sequential(
            nn.Conv1d(in_channels=64, out_channels=128, kernel_size=5, padding='same'),
            nn.ReLU(inplace=True),
            nn.BatchNorm1d(128),
            nn.AvgPool1d(2),
            nn.Dropout(p=0.05)
        )

        self.convblock4 = nn.Sequential(
            nn.Conv1d(in_channels=128, out_channels=256, kernel_size=5, padding='same'),
            nn.ReLU(inplace=True),
            nn.BatchNorm1d(256),
            nn.AvgPool1d(2),
            nn.Dropout(p=0.05)
        )

        # The block name is open to discussion; it is just what came to mind and I am not attached to it.
        self.evaluator = nn.Sequential(
            nn.Flatten(),
            nn.Linear(1024, 512),
            nn.ReLU(inplace=True),
            nn.Linear(512, num_classes)
            # No activation at the end: cross entropy applies (log-)softmax itself, so the model returns raw logits.
        )

    def forward(self, x):
      x = self.convblock1(x)
      x = self.convblock2(x)
      x = self.convblock3(x)
      x = self.convblock4(x)
      x = self.evaluator(x)
      return x

In [14]:
class AnomalyClassifier(pl.LightningModule):

  def __init__(self, lr, model_type=1, binary_classification=False):
    super().__init__()
    self.save_hyperparameters()
    self.lr = lr
    self.current_epoch_training_loss = torch.tensor(0.0)
    self.training_step_outputs = []
    self.validation_step_outputs = []

    self.num_classes = 2 if binary_classification else 5

    if model_type == 1:
      self.model = Model1D(self.num_classes)
    elif model_type == 2:
      self.model = Model2D(self.num_classes)
    else:
      self.model = Model3D(self.num_classes)

  def forward(self, x):
    return self.model(x)

  def compute_loss(self, x, y):
    #print(x)
    return F.cross_entropy(x, y)

  def common_step(self, batch, batch_idx):
    x, y = batch
    outputs = self(x)
    loss = self.compute_loss(outputs,y)
    return loss, outputs, y

  def common_test_valid_step(self, batch, batch_idx):
    loss, outputs, y = self.common_step(batch, batch_idx)
    preds = torch.argmax(outputs, dim=1)
    z = torch.argmax(y, dim=1)
    acc = accuracy(preds, z, num_classes = self.num_classes, task="multiclass")
    return loss, acc

  def training_step(self, batch, batch_idx):
    loss, outputs, y = self.common_step(batch, batch_idx)
    self.training_step_outputs.append(loss.detach())  # detach so the stored losses do not keep the autograd graph alive
    preds = torch.argmax(outputs, dim=1)
    z = torch.argmax(y, dim=1)
    acc = accuracy(preds, z, num_classes = self.num_classes, task="multiclass")
    #print(f'train_loss: {loss}, train_acc: {acc}')
    self.log_dict({"train_loss": loss, "train_accuracy": acc}, on_step = False, on_epoch = True, prog_bar = True) # logger=True?
    return {'loss':loss}

  def on_train_epoch_end(self):
    outs = torch.stack(self.training_step_outputs)
    self.current_epoch_training_loss = outs.mean()
    #print(f'train loss: {self.current_epoch_training_loss}')
    self.training_step_outputs.clear()

  def validation_step(self, batch, batch_idx):
    loss, acc = self.common_test_valid_step(batch, batch_idx)
    self.validation_step_outputs.append(loss)

    self.log_dict({'val_loss': loss, 'val_acc': acc}, on_step=True, on_epoch=True, prog_bar=True, logger=True)
    #print(f'val_loss: {loss}, val_acc: {acc}')
    return {'val_loss':loss, 'val_acc': acc}

  def on_validation_epoch_end(self):
    outs = torch.stack(self.validation_step_outputs)
    avg_loss = outs.mean()
    self.logger.experiment.add_scalars('train and val losses', {'train': self.current_epoch_training_loss.item() , 'val': avg_loss.item()}, self.current_epoch)
    self.validation_step_outputs.clear()

  def test_step(self, batch, batch_idx):
    loss, acc = self.common_test_valid_step(batch, batch_idx)

    self.log_dict({'test_loss': loss, 'test_acc': acc}, on_step=True, on_epoch=True, prog_bar=True, logger=True)
    return {'test_loss': loss, 'test_acc': acc}

  def configure_optimizers(self):
    optimizer =  torch.optim.Adam(self.parameters(), lr=self.lr)
    lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)
    return [optimizer], [lr_scheduler]
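
`configure_optimizers` pairs Adam with `StepLR(step_size=3, gamma=0.1)`. When a scheduler is returned this way, Lightning steps it once per epoch by default, so the learning rate is multiplied by 0.1 every three epochs. A plain-Python sketch of the resulting schedule for the values used in this notebook (the function `step_lr` is a hypothetical re-implementation of the decay rule, not the PyTorch class):

```python
def step_lr(initial_lr, epoch, step_size=3, gamma=0.1):
    """Learning rate under a StepLR-style schedule: multiply by `gamma`
    once every `step_size` epochs."""
    return initial_lr * gamma ** (epoch // step_size)

# Values from this notebook: lr=0.0001, step_size=3, gamma=0.1, 10 epochs.
schedule = [step_lr(1e-4, epoch) for epoch in range(10)]
for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch}: lr = {lr:.0e}")
```

With these settings the learning rate is 1000 times smaller in the last of the 10 epochs than in the first, so most of the learning happens early on. The notes at the top mention only the Adam learning rate, not a schedule, so it may be worth comparing against a run without the scheduler.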

### logger

In [15]:
from pytorch_lightning.loggers import TensorBoardLogger

In [16]:
logger = TensorBoardLogger("lightning_logs", name="iot_model")

### training

In [17]:
from pytorch_lightning.callbacks import ModelCheckpoint
MODEL_CKPT_PATH = 'checkpoints/'
MODEL_CKPT = 'model-{epoch:02d}-{val_loss:.2f}'

checkpoint_callback = ModelCheckpoint(
    monitor='val_loss',
    dirpath=MODEL_CKPT_PATH,
    filename=MODEL_CKPT,
    save_top_k=3,
    mode='min')
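
With `monitor='val_loss'`, `mode='min'` and `save_top_k=3`, the callback keeps the three checkpoints with the lowest validation loss seen so far. The selection rule can be sketched in plain Python as below; this is a conceptual illustration, not Lightning's implementation, and the loss values are made up rather than taken from a run of this notebook:

```python
import heapq

def top_k_checkpoints(val_losses, k=3):
    """Return (epoch, val_loss) pairs for the k lowest validation losses,
    i.e. what a monitor='val_loss', mode='min' policy would keep."""
    return heapq.nsmallest(k, enumerate(val_losses), key=lambda pair: pair[1])

# Made-up validation losses for six epochs (not results from this notebook).
val_losses = [0.52, 0.31, 0.27, 0.29, 0.26, 0.30]
kept = top_k_checkpoints(val_losses)
print(kept)
```

After training, `checkpoint_callback.best_model_path` points at the single best of the kept checkpoints, which is what the commented-out `load_from_checkpoint` line in the next cell relies on.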

In [26]:
classifier = AnomalyClassifier(0.0001, model_type=1)
#classifier = AnomalyClassifier.load_from_checkpoint(checkpoint_callback.best_model_path)
trainer = pl.Trainer(accelerator="auto", max_epochs=10, precision=32, logger=logger, callbacks=[checkpoint_callback])

INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:IPU available: False, using: 0 IPUs
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs


In [27]:
trainer.fit(classifier, dm)

INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO:pytorch_lightning.callbacks.model_summary:
  | Name  | Type    | Params
----------------------------------
0 | model | Model1D | 744 K 
----------------------------------
744 K     Trainable params
0         Non-trainable params
744 K     Total params
2.977     Total estimated model params size (MB)


INFO:pytorch_lightning.utilities.rank_zero:`Trainer.fit` stopped: `max_epochs=10` reached.


# Sandbox

Here I test various things, kept separate so they do not get mixed up with the rest.