<a href="https://colab.research.google.com/github/esthy13/cil-intrusion-detection/blob/main/notebooks/1_der_draft.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# DER++ for Intrusion Detection (CIC-IDS)

Minimal **working** implementation of **Dark Experience Replay++**
for class-incremental intrusion detection.

In [29]:
!git clone https://github.com/esthy13/cil-intrusion-detection
%cd cil-intrusion-detection
!git pull
# resetting the path to content to avoid issues in the rest of the notebook
%cd ..

fatal: destination path 'cil-intrusion-detection' already exists and is not an empty directory.
/content/cil-intrusion-detection
Already up to date.
/content


In [30]:
import os
import glob
import numpy as np
import pandas as pd
import random
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader
from sklearn.metrics import accuracy_score, f1_score

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

## Dataset

In [31]:
class IDSBaseDataset(Dataset):
    def __init__(self, root_dir, split="train"):
        """
        root_dir: path to 2017/
        split: 'train' or 'test'
        """
        csv_dir = os.path.join(root_dir, split)
        csvs = glob.glob(os.path.join(csv_dir, "*.csv"))
        assert len(csvs) > 0, f"No CSV files found in {csv_dir}"

        df = pd.concat([pd.read_csv(c) for c in csvs], ignore_index=True)

        labels = list(df["Label"].unique())

        if "benign" not in labels:
            raise ValueError("Dataset must contain a 'benign' class")

        # Enforcing benign as class 0
        labels = ["benign"] + sorted([l for l in labels if l != "benign"])

        self.classes = labels
        self.class_to_idx = {c: i for i, c in enumerate(self.classes)}

        self.x = df.drop(columns=["Label"]).values.astype(np.float32)
        self.y = np.array(
            [self.class_to_idx[label] for label in df["Label"]],
            dtype=np.int64
        )

    def __len__(self):
        return len(self.y)

    def __getitem__(self, idx):
        return torch.tensor(self.x[idx]), torch.tensor(self.y[idx])

    def set_features(self, new_x):
        assert new_x.shape == self.x.shape
        self.x = new_x.astype(np.float32)

## Task builder

In [32]:
class RemappedSubset(Dataset):
    """
    Subset that remaps global class indices to [0..C-1]
    """
    def __init__(self, dataset, indices, class_ids):
        self.dataset = dataset
        self.indices = indices
        self.class_map = {cid: i for i, cid in enumerate(class_ids)}

    def __len__(self):
        return len(self.indices)

    def __getitem__(self, idx):
        x, y = self.dataset[self.indices[idx]]
        return x, torch.tensor(self.class_map[y.item()])


In [33]:
def build_task(dataset, class_names):
    class_ids = [dataset.class_to_idx[c] for c in class_names]
    idxs = np.where(np.isin(dataset.y, class_ids))[0]
    return RemappedSubset(dataset, idxs, class_ids)
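
To make the remapping concrete, the short sketch below reproduces the `class_map` construction from `RemappedSubset` on plain Python data. The global indices are the ones the notebook prints further down for the 2017 data; the example task and the five labels are made up for illustration.

```python
# Global indices as printed by the notebook for the 2017 data (alphabetical after benign).
class_to_idx = {"benign": 0, "bot": 1, "ddos": 2, "dos": 3,
                "ftp-patator": 4, "portscan": 5, "ssh-patator": 6, "web-attack": 7}

task_classes = ["benign", "dos", "ddos"]           # example task, in scenario order
class_ids = [class_to_idx[c] for c in task_classes]
class_map = {cid: i for i, cid in enumerate(class_ids)}

global_labels = [0, 3, 3, 2, 0]                    # made-up labels for five samples
local_labels = [class_map[g] for g in global_labels]
print(class_ids, class_map, local_labels)
```

The task-local label of a class depends on its position in the class list passed to `build_task`, not on its global index, so the list order has to stay stable across tasks for stored labels to remain meaningful.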

In [34]:
def build_scenario(all_classes, attacks_pattern, benign_class="benign"):
    """
    all_classes: ordered list of class names (benign must be first)
    attacks_pattern: list of ints, number of NEW attacks per task
                     e.g. [1,1,1] or [3,2] or [5]
    benign_class: name of benign class (default: 'benign')

    Returns:
        tasks: list of lists of class names (cumulative)
    """

    if benign_class not in all_classes:
        raise ValueError(f"Benign class '{benign_class}' not found in classes")

    if all_classes[0] != benign_class:
        raise ValueError(
            f"Benign class must be index 0, got {all_classes[0]}"
        )

    attack_classes = [c for c in all_classes if c != benign_class]

    if sum(attacks_pattern) != len(attack_classes):
        raise ValueError(
            f"Invalid attacks_pattern: sum={sum(attacks_pattern)}, "
            f"but there are {len(attack_classes)} attack classes"
        )

    tasks = []
    current_index = 0

    for n_new in attacks_pattern:
        current_index += n_new
        seen_attacks = attack_classes[:current_index]
        seen_classes = [benign_class] + seen_attacks
        tasks.append(seen_classes)

    return tasks
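
As a quick illustration of the cumulative task lists that `build_scenario` returns, the sketch below re-implements the same slicing logic in a standalone helper. The name `cumulative_tasks` and the shortened class list are assumptions made for this example and are not part of the notebook.

```python
from itertools import accumulate

def cumulative_tasks(all_classes, attacks_pattern, benign_class="benign"):
    """Hypothetical stand-in for build_scenario: benign plus all attacks seen so far."""
    attacks = [c for c in all_classes if c != benign_class]
    if sum(attacks_pattern) != len(attacks):
        raise ValueError("attacks_pattern must cover every attack class exactly once")
    # Running totals give the number of attack classes visible at each task.
    return [[benign_class] + attacks[:n] for n in accumulate(attacks_pattern)]

demo_classes = ["benign", "dos", "ddos", "portscan"]
demo_tasks = cumulative_tasks(demo_classes, [1, 2])
for task_id, classes in enumerate(demo_tasks):
    print(task_id, classes)
```

With the pattern `[1, 2]`, the first task contains benign plus one attack and the second task contains all four classes, since every task includes everything seen before it.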


In [35]:
class UpToNormalizer:
    """
    Continual min-max normalizer using only past and present data
    """
    def __init__(self):
        self.min = None
        self.max = None

    def update(self, x):
        """
        x: numpy array [N, D]
        """
        batch_min = x.min(axis=0)
        batch_max = x.max(axis=0)

        if self.min is None:
            self.min = batch_min
            self.max = batch_max
        else:
            self.min = np.minimum(self.min, batch_min)
            self.max = np.maximum(self.max, batch_max)

    def normalize(self, x):
        """
        x: numpy array [N, D]
        """
        eps = 1e-8
        return (x - self.min) / (self.max - self.min + eps)
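
One way to use such a normalizer across tasks is sketched below, under the assumption that a single instance is kept alive for the whole scenario and that it is always applied to the raw (unscaled) features. The class name `RunningMinMax` and the toy arrays are hypothetical.

```python
import numpy as np

class RunningMinMax:
    """Hypothetical variant of the normalizer above, kept alive across tasks."""
    def __init__(self, eps=1e-8):
        self.min = None
        self.max = None
        self.eps = eps

    def update(self, x):
        batch_min, batch_max = x.min(axis=0), x.max(axis=0)
        self.min = batch_min if self.min is None else np.minimum(self.min, batch_min)
        self.max = batch_max if self.max is None else np.maximum(self.max, batch_max)

    def normalize(self, x):
        return (x - self.min) / (self.max - self.min + self.eps)

# Made-up raw features for two consecutive tasks (2 samples, 2 features each).
raw_task_0 = np.array([[0.0, 10.0], [2.0, 20.0]])
raw_task_1 = np.array([[4.0, 5.0], [1.0, 40.0]])

norm = RunningMinMax()
norm.update(raw_task_0)
scaled_0 = norm.normalize(raw_task_0)      # scaled with task-0 statistics only
norm.update(raw_task_1)
scaled_1 = norm.normalize(raw_task_1)      # scaled with statistics from tasks 0 and 1
```

Because the statistics only ever widen, data from a later task is scaled with the range observed so far, and no information from future tasks leaks into earlier ones.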

## Model

In [36]:
class CILModel(nn.Module):
    def __init__(self, input_dim, feature_dim=128):
        super().__init__()

        self.feature_extractor = nn.Sequential(
            nn.Linear(input_dim, 256),
            nn.ReLU(),
            nn.Linear(256, feature_dim)
        )

        self.classifier = None  # ONLY for training

    def forward(self, x):
        feats = self.feature_extractor(x)
        feats = F.normalize(feats, dim=1)

        if self.classifier is not None:
            logits = self.classifier(feats)
            return logits, feats

        return feats

    def update_classifier(self, num_classes):
        self.classifier = nn.Linear(
            self.feature_extractor[-1].out_features,
            num_classes
        )
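
Note that `update_classifier` builds a fresh `nn.Linear` each time it is called, so the output weights learned for earlier classes are re-initialized. A common alternative, which is not what this notebook does, is to grow the head while copying the existing rows. The helper name `expand_linear_head` is hypothetical, and the sketch assumes the number of classes never shrinks.

```python
import torch
import torch.nn as nn

def expand_linear_head(old_head, in_features, num_classes):
    """Return a new nn.Linear with num_classes outputs, copying rows of old_head if given."""
    new_head = nn.Linear(in_features, num_classes)
    if old_head is not None:
        n_old = old_head.out_features
        # Copy without tracking gradients; rows for new classes keep their fresh init.
        with torch.no_grad():
            new_head.weight[:n_old].copy_(old_head.weight)
            new_head.bias[:n_old].copy_(old_head.bias)
    return new_head

head = expand_linear_head(None, in_features=128, num_classes=2)
bigger = expand_linear_head(head, in_features=128, num_classes=5)
print(bigger.weight.shape)
```

With this approach the logits for old classes stay comparable to the ones stored in the replay buffer, which is the quantity the DER++ logit-matching term relies on.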

## Replay Buffer (DER++)

In [37]:
class ReservoirBuffer:
    def __init__(self, size):
        self.size = size
        self.indices = []
        self.labels = []
        self.logits = []
        self.n_seen = 0

    def add(self, indices, labels, logits):
        """
        indices: list[int]
        labels: tensor [B]
        logits: tensor [B, C]
        """
        indices = indices.tolist() if torch.is_tensor(indices) else indices
        labels = labels.detach().cpu()
        logits = logits.detach().cpu()

        for idx, y, logit in zip(indices, labels, logits):
            self.n_seen += 1

            if len(self.indices) < self.size:
                self.indices.append(idx)
                self.labels.append(y)
                self.logits.append(logit)
            else:
                j = random.randint(0, self.n_seen - 1)
                if j < self.size:
                    self.indices[j] = idx
                    self.labels[j] = y
                    self.logits[j] = logit

    def sample(self, batch_size, current_n_classes):
        if len(self.indices) == 0:
            return None

        idxs = np.random.choice(
            len(self.indices),
            min(batch_size, len(self.indices)),
            replace=False
        )

        indices = [self.indices[i] for i in idxs]
        labels = torch.stack([self.labels[i] for i in idxs])

        padded_logits = []
        for i in idxs:
            logit = self.logits[i]

            if logit.shape[0] < current_n_classes:
                pad = torch.zeros(
                    current_n_classes - logit.shape[0]
                )
                logit = torch.cat([logit, pad], dim=0)

            padded_logits.append(logit)

        logits = torch.stack(padded_logits)

        return indices, labels, logits
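
The replacement rule in `add` is standard reservoir sampling. The standalone sketch below (pure Python, with a hypothetical `reservoir_sample` helper and a toy stream of ten integers) checks empirically that every item ends up in the buffer with roughly the same frequency.

```python
import random
from collections import Counter

def reservoir_sample(stream, size, rng):
    """Keep a uniform random sample of `size` items from an iterable of unknown length."""
    kept = []
    for n_seen, item in enumerate(stream, start=1):
        if len(kept) < size:
            kept.append(item)
        else:
            j = rng.randint(0, n_seen - 1)   # inclusive on both ends
            if j < size:
                kept[j] = item
    return kept

rng = random.Random(0)
counts = Counter()
n_trials = 20000
for _ in range(n_trials):
    counts.update(reservoir_sample(range(10), size=3, rng=rng))

# Each of the 10 items should be kept in roughly 3/10 of the trials.
frequencies = {item: counts[item] / n_trials for item in range(10)}
print(frequencies)
```

The practical consequence for replay is that early and late samples of the stream are equally likely to survive, without the buffer needing to know the stream length in advance.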


## DER++ Training Loop
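
For reference, the per-batch objective computed in `train_task` below can be written as follows. The notation is introduced here for illustration: $f_\theta$ denotes the model logits, $(x, y)$ the current batch, $\mathcal{B}$ the replay buffer, and $z'$ the logits stored with a buffered sample; $\alpha$ and $\beta$ correspond to the `alpha` and `beta` arguments. In this notebook a single buffer batch is drawn and used for both replay terms.

$$
\mathcal{L} = \mathrm{CE}\big(f_\theta(x),\, y\big)
+ \alpha \, \mathrm{MSE}\big(f_\theta(x'),\, z'\big)
+ \beta \, \mathrm{CE}\big(f_\theta(x'),\, y'\big),
\qquad (x', y', z') \sim \mathcal{B}
$$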

In [38]:
def train_task(model, loader, buffer, optimizer,
               alpha=0.5, beta=0.5, epochs=1):

    ce = nn.CrossEntropyLoss()
    model.train()

    for _ in range(epochs):
        for batch_idx, (x, y) in enumerate(loader):

            x, y = x.to(device), y.to(device)

            # Forward current batch
            logits, _ = model(x)
            loss = ce(logits, y)

            # ----- DER++ Replay -----
            buf = buffer.sample(len(x), model.classifier.out_features)

            if buf is not None:
                replay_indices, replay_labels, replay_logits = buf

                # Re-fetch from BASE dataset (global indices)
                bx = torch.stack([
                    loader.dataset.dataset[i][0]
                    for i in replay_indices
                ]).to(device)

                by = replay_labels.to(device)
                blog = replay_logits.to(device)

                replay_out, _ = model(bx)

                # Expand stored logits if classifier grew
                if blog.shape[1] < replay_out.shape[1]:
                    pad = torch.zeros(
                        blog.shape[0],
                        replay_out.shape[1] - blog.shape[1],
                        device=device
                    )
                    blog = torch.cat([blog, pad], dim=1)

                # DER++
                loss += alpha * F.mse_loss(replay_out, blog)
                loss += beta * ce(replay_out, by)

            # Backprop
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

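            # ----- Add current batch to buffer -----
            # NOTE: this positional lookup matches the samples in the batch only
            # when the loader iterates in order (shuffle=False). With shuffle=True
            # the stored indices will not line up with the stored labels and logits.
            original_indices = [
                loader.dataset.indices[i]
                for i in range(
                    batch_idx * loader.batch_size,
                    batch_idx * loader.batch_size + len(x)
                )
            ]

            buffer.add(original_indices, y, logits)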
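
The `original_indices` computation above derives global indices from the batch position, which only holds if batches arrive in dataset order. One possible workaround, sketched here under the assumption that the subset can be wrapped to return its own global index, is to let the index travel with each sample. The `IndexedSubset` class and the toy data are hypothetical, and a shuffled loader is emulated with plain Python.

```python
import random

class IndexedSubset:
    """Hypothetical subset wrapper that also returns the global index of each sample."""
    def __init__(self, base, indices):
        self.base = base          # any sequence of (x, y) pairs
        self.indices = indices    # global indices into `base`

    def __len__(self):
        return len(self.indices)

    def __getitem__(self, i):
        x, y = self.base[self.indices[i]]
        return x, y, self.indices[i]

base = [(f"x{i}", i % 2) for i in range(10)]
subset = IndexedSubset(base, indices=[1, 3, 5, 7, 9])

# Emulate a shuffled loader: visit subset positions in random order, in batches of 2.
order = list(range(len(subset)))
random.Random(0).shuffle(order)
batches = [[subset[i] for i in order[s:s + 2]] for s in range(0, len(order), 2)]

for batch in batches:
    for x, y, global_idx in batch:
        # The index travels with the sample, so it stays correct under shuffling.
        assert base[global_idx] == (x, y)
```

With a PyTorch `DataLoader`, the default collate function would batch the integer indices into a tensor, which `ReservoirBuffer.add` already converts back to a list.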

In [39]:
def evaluate(model, dataset, seen_classes):
    model.eval()

    eval_dataset = build_task(dataset, seen_classes)
    loader = DataLoader(eval_dataset, batch_size=256, shuffle=False)

    all_preds, all_targets = [], []

    with torch.no_grad():
        for x, y in loader:
            x = x.to(device)
            logits, _ = model(x)
            preds = logits.argmax(1).cpu().numpy()

            all_preds.append(preds)
            all_targets.append(y.numpy())

    all_preds = np.concatenate(all_preds)
    all_targets = np.concatenate(all_targets)

    acc = accuracy_score(all_targets, all_preds)
    f1  = f1_score(all_targets, all_preds, average="macro")

    return acc, f1

## Run Experiment

In [40]:
# !unzip cil-intrusion-detection/data/processed/2017.zip

In [41]:
# Paths
DATA_ROOT = "2017"  # <-- folder created by unzip

# Datasets
train_dataset = IDSBaseDataset(DATA_ROOT, split="train")
test_dataset  = IDSBaseDataset(DATA_ROOT, split="test")

print(train_dataset.class_to_idx)


{'benign': 0, 'bot': 1, 'ddos': 2, 'dos': 3, 'ftp-patator': 4, 'portscan': 5, 'ssh-patator': 6, 'web-attack': 7}


In [47]:
input_dim = train_dataset.x.shape[1]

# Task definition (example)
all_classes = [
    "benign",
    "dos",
    "ddos",
    "portscan",
    "ssh-patator",
    "ftp-patator",
    "web-attack",
    "bot"
]

scenarios = []

# # Scenario A: 2+1+1+1+1+1+1 (classes per task; the first task includes benign)
# scenarios.append(build_scenario(all_classes, [1,1,1,1,1,1,1]))

# Scenario B: 4+4 (benign + 3 attacks, then 4 more attacks)
scenarios.append(build_scenario(all_classes, [3, 4]))

# # Scenario C: 2+3+3
# scenarios.append(build_scenario(all_classes, [1, 3, 3]))

for scenario_id, tasks in enumerate(scenarios):
    print(f"\n=== Scenario {scenario_id+1} ===")

    # Initialize CILModel with default feature_dim (128) if not specified.
    # The number of classes will be handled by update_classifier.
    model = CILModel(input_dim).to(device)
    buffer = ReservoirBuffer(size=4000)

    # Reset datasets (important!)
    train_dataset = IDSBaseDataset(DATA_ROOT, split="train")
    test_dataset  = IDSBaseDataset(DATA_ROOT, split="test")

    for task_id, seen_classes in enumerate(tasks):
        print(f"\n=== Task {task_id}: {seen_classes}")

        # Update the classifier for the current number of classes for this task
        model.update_classifier(len(seen_classes))
        model.classifier = model.classifier.to(device)

        # Re-initialize optimizer after classifier update to include new parameters
        # and apply the correct learning rate based on task_id
        if task_id == 0:
            optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
        else:
            # The original code used 1e-3 for task_id > 0
            optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

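        # NOTE: the normalizer is re-created for every task and is fitted on
        # features that earlier tasks already rescaled in place via set_features().
        normalizer = UpToNormalizer()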

        # --- UP-TO NORMALIZATION STEP ---
        task_dataset = build_task(train_dataset, seen_classes)
        task_x = np.stack([task_dataset[i][0].numpy() for i in range(len(task_dataset))])

        normalizer.update(task_x)

        train_dataset.set_features(
            normalizer.normalize(train_dataset.x)
        )
        test_dataset.set_features(
            normalizer.normalize(test_dataset.x)
        )

        # train_dataset.set_features(train_dataset.x)
        # test_dataset.set_features(test_dataset.x)
        # --------------------------------

        train_loader = DataLoader(
            build_task(train_dataset, seen_classes),
            batch_size=128,
            shuffle=True
        )

        train_task(model, train_loader, buffer, optimizer)

        acc, f1 = evaluate(model, test_dataset, seen_classes)
        print(f"Accuracy: {acc:.4f} | Macro-F1: {f1:.4f}")


=== Scenario 1 ===

=== Task 0: ['benign', 'dos']
Accuracy: 0.9565 | Macro-F1: 0.8486

=== Task 1: ['benign', 'dos', 'ddos']
Accuracy: 0.9737 | Macro-F1: 0.9339

=== Task 2: ['benign', 'dos', 'ddos', 'portscan']
Accuracy: 0.9206 | Macro-F1: 0.8157

=== Task 3: ['benign', 'dos', 'ddos', 'portscan', 'ssh-patator']
Accuracy: 0.9323 | Macro-F1: 0.6852

=== Task 4: ['benign', 'dos', 'ddos', 'portscan', 'ssh-patator', 'ftp-patator']
Accuracy: 0.9305 | Macro-F1: 0.7826

=== Task 5: ['benign', 'dos', 'ddos', 'portscan', 'ssh-patator', 'ftp-patator', 'web-attack']
Accuracy: 0.9510 | Macro-F1: 0.7065

=== Task 6: ['benign', 'dos', 'ddos', 'portscan', 'ssh-patator', 'ftp-patator', 'web-attack', 'bot']
Accuracy: 0.9541 | Macro-F1: 0.4548

=== Scenario 2 ===

=== Task 0: ['benign', 'dos', 'ddos', 'portscan']
Accuracy: 0.8083 | Macro-F1: 0.2235

=== Task 1: ['benign', 'dos', 'ddos', 'portscan', 'ssh-patator', 'ftp-patator', 'web-attack', 'bot']
Accuracy: 0.9469 | Macro-F1: 0.4378

=== Scenario 3 ==

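**Scenario 1** does not perform well: accuracy stays between 0.92 and 0.97, but macro-F1 falls from 0.8486 on Task 0 to 0.4548 on Task 6, which suggests that the model fails to learn the newly added classes.

**Scenario 2** and **Scenario 3**: the macro-F1 score improves as more classes are added (from 0.2235 to 0.4378 in Scenario 2), which suggests that the model works better when new classes arrive in larger groups.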
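
The gap between accuracy and macro-F1 in the results above is typical of imbalanced data. The toy example below (made-up labels, plain Python rather than the scikit-learn calls used in `evaluate`) shows how a model that ignores a rare class can keep a high accuracy while its macro-F1 is roughly halved.

```python
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1; a class with no true positives scores 0."""
    scores = []
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy data (made up): 95 benign flows, 5 attack flows, and a model that always predicts benign.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"macro-F1 = {macro_f1(y_true, y_pred):.4f}")
```

Since every class contributes equally to macro-F1, each attack class that the model never predicts pulls the average down regardless of how few samples it has.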