<a href="https://colab.research.google.com/github/JhonatanBilbao/Tesis-articulos/blob/main/RUIDS_Colab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🧠 RUIDS: Robust Unsupervised Intrusion Detection System

Implementation based on the paper:

> **Wei Wang, Songlei Jian, Yusong Tan, Qingbo Wu, Chenlin Huang** (2023)  
> *Robust unsupervised network intrusion detection with self-supervised masked context reconstruction*  
> Computers & Security, Volume 128, 2023, 103131  
> https://doi.org/10.1016/j.cose.2023.103131

---

**Paper summary**:

RUIDS is a robust intrusion detection system that works without labels, using:

- Self-supervised learning with a masked context reconstruction scheme.
- Transformers to capture temporal relationships.
- Reconstruction to learn robust representations.
- Scoring based on reconstruction error and contrastive loss to detect anomalies.
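
To make the masking idea in the list above concrete, here is a minimal NumPy sketch of hiding a random subset of features so that a model would have to reconstruct them from the visible ones. The helper names `random_feature_mask` and `apply_mask`, the batch size, and the 10% ratio are assumptions for illustration and are not taken from the paper. Note that `hidden` is True where a feature is hidden, which is the opposite convention to the `mask` variable used in the model cell later in this notebook.

```python
import numpy as np

def random_feature_mask(shape, mask_ratio, rng):
    """Return a boolean array that is True where a feature is hidden."""
    return rng.random(shape) < mask_ratio

def apply_mask(x, hidden):
    """Zero out the hidden features and leave the visible ones untouched."""
    x_masked = x.copy()
    x_masked[hidden] = 0.0
    return x_masked

rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 41))  # hypothetical batch: 1000 records, 41 features
hidden = random_feature_mask(x.shape, mask_ratio=0.1, rng=rng)
x_masked = apply_mask(x, hidden)

print(f"fraction of hidden features: {hidden.mean():.3f}")
```

A reconstruction model would receive `x_masked` as input and be trained to predict the original values of `x`.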

---


### 📌 Initial setup and libraries

In [1]:
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
SEED = 42
np.random.seed(SEED)
torch.manual_seed(SEED)



### 📌 Synthetic data (you can replace it with KDD, UNSW, CICIDS)

In [2]:
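from sklearn.datasets import make_classification

# Synthetic stand-in for an IDS dataset: class 0 = normal (~70%), class 1 = attack (~30%)
X, y = make_classification(n_samples=10000, n_features=41, n_informative=30,
                           n_redundant=5, n_clusters_per_class=2, weights=[0.7, 0.3], random_state=SEED)

# Note: the scaler is fitted on all samples, including the ones later used for testing
scaler = StandardScaler()
X = scaler.fit_transform(X)

# Train on normal samples only; hold out the other half of the normal samples for testing
X_train, X_test, y_train, y_test = train_test_split(X[y == 0], y[y == 0], test_size=0.5, random_state=SEED)
X_test_attack = X[y == 1]
y_test_attack = y[y == 1]
# The test set combines the held-out normal samples with all attack samples
X_test_full = np.vstack([X_test, X_test_attack])
y_test_full = np.concatenate([y_test, y_test_attack])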
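
The cell above fits the scaler on every sample before splitting, so statistics from the test and attack samples leak into the training data. A possible alternative, sketched below with NumPy only, computes the scaling statistics from the normal training rows and reuses them for the test set. The helpers `fit_standardizer` and `standardize` and the toy Gaussian data are assumptions for illustration; with scikit-learn the same idea would be `fit` on the training split followed by `transform` on the rest.

```python
import numpy as np

def fit_standardizer(X_ref):
    """Compute per-feature mean and std from the reference (training) rows only."""
    mean = X_ref.mean(axis=0)
    std = X_ref.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero on constant features
    return mean, std

def standardize(X, mean, std):
    """Apply previously fitted statistics to any array with the same features."""
    return (X - mean) / std

rng = np.random.default_rng(42)
normal = rng.normal(loc=0.0, scale=1.0, size=(700, 5))  # hypothetical normal records
attack = rng.normal(loc=3.0, scale=2.0, size=(300, 5))  # hypothetical attack records

# Split the normal records first, then fit the statistics on the training half only
half = len(normal) // 2
train_raw, test_normal_raw = normal[:half], normal[half:]
mean, std = fit_standardizer(train_raw)

X_train_s = standardize(train_raw, mean, std)
X_test_s = standardize(np.vstack([test_normal_raw, attack]), mean, std)
y_test = np.concatenate([np.zeros(len(test_normal_raw), dtype=int),
                         np.ones(len(attack), dtype=int)])

print(X_train_s.shape, X_test_s.shape)
```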


### 📌 Simple Transformer block

In [3]:
class TransformerBlock(nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim=dim, num_heads=heads, batch_first=True)
        self.ff = nn.Sequential(
            nn.LayerNorm(dim),
            nn.Linear(dim, dim * 2),
            nn.ReLU(),
            nn.Linear(dim * 2, dim)
        )

    def forward(self, x):
        # Self-attention with a residual connection
        attn_output, _ = self.attn(x, x, x)
        x = x + attn_output
        # LayerNorm + feed-forward, again with a residual connection
        return self.ff(x) + x


### 📌 RUIDS model definition

In [4]:
class RUIDS(nn.Module):
    def __init__(self, input_dim, hidden_dim=64):
        super().__init__()
        self.encoder = nn.Linear(input_dim, hidden_dim)
        self.transformer = TransformerBlock(hidden_dim)
        self.decoder = nn.Linear(hidden_dim, input_dim)

    def forward(self, x, mask_ratio=0.1):
        batch_size, feat_dim = x.shape
        # mask is True for features that stay visible; about mask_ratio of them are hidden.
        # Create it on the same device as x to avoid mixing CPU and GPU tensors.
        mask = torch.rand(batch_size, feat_dim, device=x.device) > mask_ratio
        x_masked = x.clone()
        x_masked[~mask] = 0  # zero out the hidden features
        x_encoded = self.encoder(x_masked)
        # Each record is passed to the Transformer block as a sequence of length 1
        x_transformed = self.transformer(x_encoded.unsqueeze(1)).squeeze(1)
        x_decoded = self.decoder(x_transformed)
        return x_decoded, mask
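
In `forward`, `x_encoded.unsqueeze(1)` turns each record into a sequence with a single token. The sketch below, which uses a standalone `nn.MultiheadAttention` layer with arbitrary small dimensions chosen for illustration, shows what self-attention does in that situation: the softmax over a single key always yields a weight of 1, so the attention step reduces to linear projections of the token itself.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
attn = nn.MultiheadAttention(embed_dim=8, num_heads=2, batch_first=True)

x = torch.randn(4, 8)   # hypothetical batch of 4 encoded records
seq = x.unsqueeze(1)    # shape (4, 1, 8): one token per record

with torch.no_grad():
    out, weights = attn(seq, seq, seq)

print(out.shape)      # torch.Size([4, 1, 8])
print(weights.shape)  # torch.Size([4, 1, 1])
# With a single key per record, every attention weight equals 1
print(torch.allclose(weights, torch.ones_like(weights)))  # True
```

For attention to relate several items, the input would need more than one token per sample, for example a window of consecutive records. That variant is not part of this notebook.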


### 📌 Training function

In [5]:
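def train_ruids(model, X_train, epochs=20, lr=1e-3, alpha=1.0):
    # Note: alpha is accepted but not used anywhere in this function
    model.train()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    X_tensor = torch.tensor(X_train, dtype=torch.float32).to(device)

    for epoch in range(epochs):
        # Full-batch training: one forward/backward pass over all samples per epoch
        out, mask = model(X_tensor)
        # mask is True for the visible (unmasked) features, so the loss covers those positions
        loss = F.mse_loss(out[mask], X_tensor[mask])
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        print(f"Epoch {epoch+1}: Loss = {loss.item():.4f}")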
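
The `mask` returned by the model is True for the features that were kept, so `F.mse_loss(out[mask], X_tensor[mask])` measures the error on the visible positions. A masked-reconstruction objective is often computed on the hidden positions instead. The sketch below shows that variant; the helper `hidden_position_loss` and the toy tensors are assumptions for illustration, and whether this matches the loss used in the paper is not confirmed here.

```python
import torch
import torch.nn.functional as F

def hidden_position_loss(out, target, keep_mask):
    """MSE restricted to the features that were zeroed out (keep_mask == False)."""
    hidden = ~keep_mask
    if hidden.sum() == 0:
        # Nothing was hidden in this batch; fall back to the full MSE
        return F.mse_loss(out, target)
    return F.mse_loss(out[hidden], target[hidden])

target = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
out = torch.tensor([[1.0, 0.0, 3.0], [4.0, 5.0, 9.0]])
keep_mask = torch.tensor([[True, False, True], [True, True, False]])

# Hidden positions are (0, 1) and (1, 2): squared errors 4 and 9, mean 6.5
loss = hidden_position_loss(out, target, keep_mask)
print(loss.item())  # 6.5
```

Inside `train_ruids`, this would replace the `loss = ...` line with `loss = hidden_position_loss(out, X_tensor, mask)`. The losses printed later in this notebook were produced with the original loss and would differ.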


### 📌 Evaluation function

In [6]:
def evaluate(model, X_test, y_test):
    model.eval()
    X_tensor = torch.tensor(X_test, dtype=torch.float32).to(device)
    with torch.no_grad():
        # forward() still applies random masking here (mask_ratio=0.1), so scores depend on the random mask
        out, mask = model(X_tensor)
        # Anomaly score: mean squared reconstruction error per sample
        scores = ((out - X_tensor)**2).mean(dim=1).cpu().numpy()

    auc = roc_auc_score(y_test, scores)
    # Flag the 30% of test samples with the highest scores as attacks
    pred = (scores > np.percentile(scores, 70)).astype(int)
    acc = accuracy_score(y_test, pred)
    f1 = f1_score(y_test, pred)
    print(f"Accuracy: {acc:.4f}, F1-score: {f1:.4f}, AUC: {auc:.4f}")
    return acc, f1, auc
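
`evaluate` derives its threshold from the 70th percentile of the test scores, which assumes the share of attacks in the test set is known in advance. One label-free alternative, sketched below, is to set the threshold from the scores of normal training data and accept a fixed false-positive rate. The helpers `fit_threshold` and `predict_anomalies`, the 95th percentile, and the gamma-distributed toy scores are all assumptions for illustration.

```python
import numpy as np

def fit_threshold(train_scores, quantile=95.0):
    """Pick the score below which `quantile` percent of normal training samples fall."""
    return float(np.percentile(train_scores, quantile))

def predict_anomalies(scores, threshold):
    """Label a sample as an attack (1) when its score exceeds the threshold."""
    return (scores > threshold).astype(int)

rng = np.random.default_rng(0)
# Hypothetical reconstruction errors: attacks are shifted towards larger values
train_scores = rng.gamma(2.0, 0.5, size=5000)
test_scores = np.concatenate([rng.gamma(2.0, 0.5, size=700),
                              rng.gamma(2.0, 0.5, size=300) + 3.0])
y_true = np.concatenate([np.zeros(700, dtype=int), np.ones(300, dtype=int)])

threshold = fit_threshold(train_scores)
y_pred = predict_anomalies(test_scores, threshold)

print(f"threshold = {threshold:.3f}")
print(f"flagged {y_pred.sum()} of {len(y_pred)} test samples")
```

In this notebook, the training scores could be obtained by running the trained model on `X_train` and computing the same per-sample error as in `evaluate`.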


### 📌 Final training and evaluation

In [7]:
model = RUIDS(input_dim=X.shape[1]).to(device)
train_ruids(model, X_train)
evaluate(model, X_test_full, y_test_full)


Epoch 1: Loss = 1.1369
Epoch 2: Loss = 1.0866
Epoch 3: Loss = 1.0356
Epoch 4: Loss = 0.9866
Epoch 5: Loss = 0.9464
Epoch 6: Loss = 0.9061
Epoch 7: Loss = 0.8648
Epoch 8: Loss = 0.8290
Epoch 9: Loss = 0.7940
Epoch 10: Loss = 0.7601
Epoch 11: Loss = 0.7274
Epoch 12: Loss = 0.6968
Epoch 13: Loss = 0.6690
Epoch 14: Loss = 0.6403
Epoch 15: Loss = 0.6134
Epoch 16: Loss = 0.5885
Epoch 17: Loss = 0.5626
Epoch 18: Loss = 0.5404
Epoch 19: Loss = 0.5180
Epoch 20: Loss = 0.4963
Accuracy: 0.6671, F1-score: 0.5641, AUC: 0.7278


(0.6670763558150253, 0.564071615369141, np.float64(0.7278403008854161))