# Experience Replay (ER) Prototype for Class-Incremental Learning IDS

This notebook provides a minimal prototype of the **Experience Replay (ER)** strategy
used in **Class-Incremental Learning (CIL)** for Intrusion Detection Systems.

The goal of this notebook is **not** to build the full training pipeline, but to:
- understand what Experience Replay is,
- show how a replay memory works,
- demonstrate how past samples can be reused when learning new classes.

This prototype will later be translated into the `strategies.py` module of the project repository.


## Experience Replay in Class-Incremental Learning

In Class-Incremental Learning, the model is trained on a sequence of tasks.
Each task introduces new attack classes, while previously learned classes
are no longer directly available in the training data.
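
To make the idea of a task sequence concrete, the sketch below splits an ordered list of classes into consecutive, disjoint tasks. The helper name `make_class_tasks` and the attack-class names are illustrative assumptions, not part of the project code or of any specific dataset.

```python
from typing import List, Sequence


def make_class_tasks(classes: Sequence[str], classes_per_task: int) -> List[List[str]]:
    """Split an ordered list of classes into consecutive, disjoint tasks."""
    if classes_per_task <= 0:
        raise ValueError("classes_per_task must be positive")
    return [
        list(classes[i:i + classes_per_task])
        for i in range(0, len(classes), classes_per_task)
    ]


# Hypothetical class names, used only for illustration.
classes = ["benign", "dos", "port_scan", "brute_force", "botnet", "infiltration"]
tasks = make_class_tasks(classes, classes_per_task=2)

for task_id, task_classes in enumerate(tasks):
    print(f"Task {task_id}: {task_classes}")
```

In this toy split, the training data for task 1 contains only `port_scan` and `brute_force` samples, so the classes from task 0 are seen again only if they are replayed from memory.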

This often leads to **catastrophic forgetting**, where the model loses the
ability to recognize older attack types.

**Experience Replay (ER)** mitigates this problem by storing a small set of
samples from previous tasks in a memory buffer. During training on a new task,
these stored samples are replayed together with the current task data.


In [1]:
import random
import torch


In [2]:
class ReplayBuffer:
    """
    Simple Experience Replay buffer.

    This buffer stores a limited number of past samples (x, y) from
    previously seen tasks and allows random sampling of replay batches.
    """

    def __init__(self, capacity: int):
        """
        Parameters
        ----------
        capacity : int
            Maximum number of samples that can be stored in memory.
        """
        if capacity <= 0:
            raise ValueError("capacity must be a positive integer")
        self.capacity = capacity
        self.x = []  # stored input samples
        self.y = []  # stored labels

    def __len__(self):
        """Return the current number of stored samples."""
        return len(self.y)

    def add_batch(self, x_batch: torch.Tensor, y_batch: torch.Tensor):
        """
        Add a batch of samples to the replay memory.

        Once the memory is full, each incoming sample overwrites a
        randomly chosen stored sample (simple baseline strategy).

        Parameters
        ----------
        x_batch : torch.Tensor
            Batch of input features.
        y_batch : torch.Tensor
            Corresponding labels.
        """
        for x_i, y_i in zip(x_batch, y_batch):
            if len(self.y) < self.capacity:
                self.x.append(x_i.detach().cpu())
                self.y.append(int(y_i))
            else:
                j = random.randint(0, self.capacity - 1)
                self.x[j] = x_i.detach().cpu()
                self.y[j] = int(y_i)

    def sample(self, batch_size: int, device: str):
        """
        Sample a random mini-batch from the replay memory.

        Parameters
        ----------
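        batch_size : int
            Maximum number of replay samples to return. Fewer are
            returned if the memory holds fewer samples.
        device : str
            Device where tensors should be moved ('cpu' or 'cuda').

        Returns
        -------
        x : torch.Tensor or None
            Replay input batch, or None if the memory is empty.
        y : torch.Tensor or None
            Replay label batch, or None if the memory is empty.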
        """
        if len(self.y) == 0:
            return None, None

        idx = random.sample(
            range(len(self.y)),
            k=min(batch_size, len(self.y))
        )

        x = torch.stack([self.x[i] for i in idx]).to(device)
        y = torch.tensor([self.y[i] for i in idx], dtype=torch.long, device=device)

        return x, y


## Testing the Replay Buffer

To check that the replay buffer behaves as expected, we test it using
randomly generated data. The test confirms that the buffer never grows
beyond its capacity and that replay batches can be retrieved.


In [3]:
# Create a small replay buffer
buffer = ReplayBuffer(capacity=5)

# Fake data simulating features and labels
x_fake = torch.randn(10, 3)          # 10 samples, 3 features
y_fake = torch.randint(0, 4, (10,))  # labels in range [0, 3]

# Add samples to memory
buffer.add_batch(x_fake, y_fake)

print("Replay buffer size:", len(buffer))

# Sample a replay batch
x_rep, y_rep = buffer.sample(batch_size=2, device="cpu")
print("Replay batch shapes:", x_rep.shape, y_rep.shape)


Replay buffer size: 5
Replay batch shapes: torch.Size([2, 3]) torch.Size([2])


## Interpretation

The replay buffer stores at most a fixed number of samples from previous tasks.
Once the buffer is full, each incoming sample overwrites a randomly chosen
stored sample, so older samples are gradually displaced by newer ones.
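
Because `add_batch` overwrites a random slot for every incoming sample once the buffer is full, the memory drifts toward recently seen data. One common alternative is reservoir sampling, which keeps a new sample only with probability `capacity / n_seen`, so every sample seen so far is equally likely to be in memory. The sketch below is a minimal illustration under that assumption. The class name `ReservoirBuffer` and its attributes are hypothetical and not part of the project code.

```python
import random
from typing import Any, List, Tuple


class ReservoirBuffer:
    """Replay memory that keeps a uniform random subset of all samples seen."""

    def __init__(self, capacity: int, seed: int = 0):
        self.capacity = capacity
        self.items: List[Tuple[Any, int]] = []  # stored (x, y) pairs
        self.n_seen = 0                         # total samples offered so far
        self.rng = random.Random(seed)          # local RNG for reproducibility

    def __len__(self) -> int:
        return len(self.items)

    def add(self, x: Any, y: int) -> None:
        """Offer one sample to the memory (reservoir sampling, Algorithm R)."""
        self.n_seen += 1
        if len(self.items) < self.capacity:
            self.items.append((x, y))
            return
        # Draw a slot in [0, n_seen); the sample is kept only if the slot
        # falls inside the buffer, i.e. with probability capacity / n_seen.
        j = self.rng.randrange(self.n_seen)
        if j < self.capacity:
            self.items[j] = (x, y)


memory = ReservoirBuffer(capacity=5)
for i in range(1000):
    memory.add(x=[float(i)], y=i)  # placeholder features; label records arrival order

print(len(memory), memory.n_seen)  # 5 1000
```

Whether this is preferable to the simple baseline depends on the task sequence. Class-balanced selection is another option when some attack classes are rare.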

During training on a new task, a mini-batch from the replay buffer can be
combined with the current task mini-batch. This allows the model to continue
seeing older classes, reducing catastrophic forgetting.
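
The combination step can be sketched as a single training step that concatenates the current mini-batch with a replay mini-batch. This is a minimal sketch, not the project's training loop. The function name `er_training_step`, the toy linear model, and the random data are assumptions for illustration, and `x_rep, y_rep` stand in for the output of `buffer.sample(...)`.

```python
import torch
import torch.nn as nn


def er_training_step(model, optimizer, x_cur, y_cur, x_rep=None, y_rep=None):
    """Run one optimisation step on the current batch plus an optional replay batch."""
    if x_rep is not None:
        # Replay samples are simply appended to the current mini-batch.
        x = torch.cat([x_cur, x_rep], dim=0)
        y = torch.cat([y_cur, y_rep], dim=0)
    else:
        # Empty memory (e.g. during the first task): train on current data only.
        x, y = x_cur, y_cur

    model.train()
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    optimizer.step()
    return loss.item(), x.shape[0]


torch.manual_seed(0)
model = nn.Linear(3, 4)  # toy classifier: 3 features, 4 classes
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x_cur, y_cur = torch.randn(8, 3), torch.randint(2, 4, (8,))  # new classes 2 and 3
x_rep, y_rep = torch.randn(2, 3), torch.randint(0, 2, (2,))  # replayed classes 0 and 1

loss, n = er_training_step(model, optimizer, x_cur, y_cur, x_rep, y_rep)
print(f"trained on {n} samples, loss = {loss:.4f}")
```

Passing `None` for the replay batch matches the `(None, None)` return value of `ReplayBuffer.sample` when the memory is empty, so the same step can be used for the first task.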

This mechanism represents the core idea of Experience Replay and will be
integrated into the main training loop of the project.
