# SNN-IDS Audit Notebook (CSE-CIC-IDS2018)

This notebook is designed to be auditable: every choice is documented file by file and, where relevant, line by line. It includes:
- Environment and data setup
- Reproducible data pipeline
- Training recipe tuned for tabular data (GRU over time windows)
- Metrics and calibration
- Smoke test and full test (windows: 5s, 1m, 5m; LR grid)

Note: the code lives in the repository; the notebook calls its functions without duplicating any logic.


## 1) Environment setup

Minimum requirements:
- Python 3.10+
- packages: pandas, numpy, scikit-learn, tensorflow, matplotlib, seaborn, tqdm

In Colab, run the cells below; locally, make sure `pip install -r requirements.txt` has been run first.
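Before running the cells, it can help to confirm that the interpreter and packages meet the requirements listed above. The following is a minimal sketch: `check_environment` and `REQUIRED_MODULES` are hypothetical names introduced here, not part of the repository, and the list uses import names (scikit-learn is imported as `sklearn`).

```python
import importlib.util
import sys

# Import names for the packages listed above (assumption: scikit-learn -> "sklearn").
REQUIRED_MODULES = ["pandas", "numpy", "sklearn", "tensorflow", "matplotlib", "seaborn", "tqdm"]


def check_environment(modules=REQUIRED_MODULES, min_python=(3, 10)):
    """Return (python_ok, missing) without importing the packages themselves."""
    python_ok = sys.version_info[:2] >= min_python
    # find_spec returns None when a top-level module cannot be located.
    missing = [name for name in modules if importlib.util.find_spec(name) is None]
    return python_ok, missing


python_ok, missing = check_environment()
print("Python >= 3.10:", python_ok)
print("Missing packages:", missing or "none")
```

Using `find_spec` rather than `import` keeps the check fast, since heavy packages such as tensorflow are located but not loaded.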


In [1]:
# Colab only: clone the repository and install the dependencies (comment these lines out when running locally)
!git clone --single-branch --branch fix-branch https://github.com/devedale/snn-ids.git
#!git clone https://github.com/devedale/snn-ids.git
%cd snn-ids
!pip install -q pandas numpy scikit-learn tensorflow matplotlib seaborn tqdm


Cloning into 'snn-ids'...
remote: Enumerating objects: 364, done.
remote: Counting objects: 100% (13/13), done.
remote: Compressing objects: 100% (10/10), done.
remote: Total 364 (delta 7), reused 3 (delta 3), pack-reused 351 (from 2)
Receiving objects: 100% (364/364), 30.27 MiB | 40.84 MiB/s, done.
Resolving deltas: 100% (180/180), done.
/content/snn-ids


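## 2) Data setup
Download the CSE-CIC-IDS2018 CSVs into the folder that `config.py` expects, i.e. `DATA_CONFIG["dataset_path"]` (printed by the next cell; in the run recorded here it is `data`, not `data/cicids/2018`). In Colab you can load the files from your Drive or use the Kaggle API.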
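To check what is actually present in the dataset folder before preprocessing, a small listing helper is enough. This is a sketch under assumptions: `list_dataset_csvs` is a hypothetical helper, and the path `data` is taken from the value printed in this run, so adjust it to your `DATA_CONFIG["dataset_path"]`.

```python
from pathlib import Path


def list_dataset_csvs(data_dir):
    """Return {file name: size in bytes} for the CSV files directly under data_dir."""
    root = Path(data_dir)
    if not root.is_dir():
        return {}
    return {p.name: p.stat().st_size for p in sorted(root.glob("*.csv"))}


found = list_dataset_csvs("data")  # assumed path; see DATA_CONFIG["dataset_path"]
for name, size in found.items():
    # Zero-byte files are placeholders (e.g. created with `touch`), not real data.
    note = "  (empty placeholder)" if size == 0 else ""
    print(f"{name}: {size} bytes{note}")
```

Reporting sizes makes it easy to tell full CSVs apart from the empty placeholder files created when only the preprocessed cache is downloaded.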


In [2]:
#Uncomment to download full starting dataset
#!curl -L -o ./data/cicids2018.zip  https://www.kaggle.com/api/v1/datasets/download/edoardodalesio/intrusion-detection-evaluation-dataset-cic-ids2018
#!unzip -o ./data/cicids2018.zip  "Tuesday-20-02-2018.csv"   "Wednesday-21-02-2018.csv"  "Thursday-22-02-2018.csv" "Friday-23-02-2018.csv" -d ./data # or #!unzip -o ./data/cicids2018.zip -d ./data

# Import the project modules
import os, json
import numpy as np
import pandas as pd
from config import DATA_CONFIG, PREPROCESSING_CONFIG, TRAINING_CONFIG, BENCHMARK_CONFIG
from preprocessing.process import preprocess_pipeline
from training.train import train_model
from evaluation.metrics import evaluate_model_comprehensive

print("Config dataset:", DATA_CONFIG["dataset_path"])




# Download the preprocessed cache (comment these lines out to skip it)
!mkdir -p preprocessed_cache
!curl -L -o ./preprocessed_cache/preprocessed_cache.zip  https://www.kaggle.com/api/v1/datasets/download/edoardodalesio/cic-ids-2018-benign-vs-attack
!unzip -o ./preprocessed_cache/preprocessed_cache.zip -d ./preprocessed_cache
# Create empty placeholder CSVs with the original file names
!touch data/Friday-02-03-2018.csv data/Friday-16-02-2018.csv data/Friday-23-02-2018.csv data/Thursday-01-03-2018.csv data/Thursday-15-02-2018.csv data/Thursday-22-02-2018.csv data/Tuesday-20-02-2018.csv data/Wednesday-14-02-2018.csv data/Wednesday-21-02-2018.csv data/Wednesday-28-02-2018.csv



Config dataset: data
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  9.8G  100  9.8G    0     0  79.5M      0  0:02:07  0:02:07 --:--:-- 66.8M
Archive:  ./preprocessed_cache/preprocessed_cache.zip
  inflating: ./preprocessed_cache/Friday-02-03-2018/attack_records.csv  
  inflating: ./preprocessed_cache/Friday-02-03-2018/benign_records.csv  
  inflating: ./preprocessed_cache/Friday-16-02-2018/attack_records.csv  
  inflating: ./preprocessed_cache/Friday-16-02-2018/benign_records.csv  
  inflating: ./preprocessed_cache/Friday-23-02-2018/attack_records.csv  
  inflating: ./preprocessed_cache/Friday-23-02-2018/benign_records.csv  
  inflating: ./preprocessed_cache/Thursday-01-03-2018/attack_records.csv  
  inflating: ./preprocessed_cache/Thursday-01-03-2018/benign_records.csv  
  inflating: ./prepro

In [3]:
!python3 benchmark-progressive.py

2025-08-24 21:44:46.038377: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1756071886.079282    5819 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1756071886.092660    5819 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1756071886.129618    5819 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1756071886.129684    5819 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1756071886.129689    5819 computation_placer.cc:177] computation placer alr

In [6]:
!zip -r risultati.zip benchmark_results/ progressive_benchmark_20250824_214453/ models/ benchmark_run_20250824_223858.zip benchmark_run_20250824_224433.zip benchmark_run_20250824_225704.zip benchmark_run_20250824_230738.zip

  adding: benchmark_results/ (stored 0%)
  adding: benchmark_results/20250824_215219_gru_evaluation/ (stored 0%)
  adding: benchmark_results/20250824_215219_gru_evaluation/visualizations/ (stored 0%)
  adding: benchmark_results/20250824_215219_gru_evaluation/visualizations/roc_curves.png (deflated 8%)
  adding: benchmark_results/20250824_215219_gru_evaluation/visualizations/accuracy_per_class.png (deflated 14%)
  adding: benchmark_results/20250824_215219_gru_evaluation/visualizations/confusion_matrix_cybersecurity.png (deflated 25%)
  adding: benchmark_results/20250824_215219_gru_evaluation/visualizations/confusion_matrix_detailed.png (deflated 19%)
  adding: benchmark_results/20250824_215219_gru_evaluation/evaluation_report.json (deflated 87%)
  adding: benchmark_results/20250824_215219_gru_evaluation/visualization_summary.json (deflated 34%)
  adding: benchmark_results/20250824_214833_lstm_evaluation/ (stored 0%)
  adding: benchmark_results/20250824_214833_lstm_evaluation/visualizati

Usage examples:

  # 1. Run a quick smoke test to check that everything works
  python3 benchmark.py --smoke-test

  # 2. Run a single test with a specific model and sample size
  python3 benchmark.py --model gru --sample-size 20000

  # 3. Run the full benchmark on all models with the default hyperparameters
  python3 benchmark.py --full

  # 4. Run the full benchmark with a custom sample size
  python3 benchmark.py --full --sample-size 50000

  # 5. Run a single test with custom hyperparameters (note: they must be in the format expected by the training module)
  python3 benchmark.py --model lstm --epochs 15 --batch-size 128 --learning-rate 0.0005
        '''
    )
    
    # Main arguments for selecting the run mode
    parser.add_argument('--smoke-test', action='store_true', help='Run a fast, lightweight smoke test.')
    parser.add_argument('--full', action='store_true', help='Run the full benchmark over multiple models and hyperparameters.')
    
    # Arguments for the basic configuration
    parser.add_argument('--sample-size', type=int, help='Total number of samples to use (BENIGN + ATTACK).')
    parser.add_argument('--data-path', type=str, help='Path to the directory containing the dataset CSV files.')
    parser.add_argument('--output-dir', type=str, default='benchmark_results', help='Directory where results are saved.')

    # Arguments for the model configuration (used in single tests or as overrides)
    parser.add_argument('--model', choices=['dense', 'gru', 'lstm'], help='Model type to test in a single run.')
    parser.add_argument('--epochs', type=int, help="Override the number of training epochs (e.g. 10).")
    parser.add_argument('--batch-size', type=int, help="Override the training batch size (e.g. 64).")
    parser.add_argument('--learning-rate', type=float, help="Override the learning rate (e.g. 0.001).")
    
    args = parser.parse_args()

## 3) Smoke test (GRU)
Runs a reduced pipeline to verify the flow end to end: security-oriented class balancing, IP-to-octet encoding, windowing, GRU training (K-Fold), and evaluation with PNG output.
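The IP-to-octet step mentioned above can be illustrated as follows. This is a sketch under assumptions, not the repository's implementation: `ip_to_octets` is a hypothetical helper, and mapping unparseable addresses to four zeros is a choice made here for illustration.

```python
import ipaddress


def ip_to_octets(ip):
    """Split a dotted IPv4 string into four integer octets.

    Assumption: values that are not valid IPv4 addresses map to [0, 0, 0, 0].
    """
    try:
        # .packed is the 4-byte big-endian form; list() yields one int per octet.
        return list(ipaddress.IPv4Address(ip).packed)
    except ValueError:  # AddressValueError is a subclass of ValueError
        return [0, 0, 0, 0]


print(ip_to_octets("192.168.1.10"))   # [192, 168, 1, 10]
print(ip_to_octets("not-an-ip"))      # [0, 0, 0, 0]
```

Encoding each octet as a separate numeric column keeps some subnet structure visible to the model, which a single hashed or label-encoded address column would hide.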


In [None]:
# Smoke test
from sklearn.model_selection import train_test_split

# Overrides for a quick run
PREPROCESSING_CONFIG['sample_size'] = 3000
TRAINING_CONFIG['model_type'] = 'gru'
TRAINING_CONFIG['hyperparameters']['epochs'] = [2]
TRAINING_CONFIG['hyperparameters']['batch_size'] = [32]

X, y, label_encoder = preprocess_pipeline()

# Hold out the test split before training so the evaluation uses samples the model has not seen
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
model, log, model_path = train_model(X_tr, y_tr, model_type='gru')

# Quick evaluation on the held-out split
report = evaluate_model_comprehensive(model, X_te, y_te, class_names=label_encoder.classes_.tolist(), output_dir='notebook_eval/smoke')
report['basic_metrics']['accuracy']


## 4) Full test (GRU) with 5s, 1m, 5m windows and LR grid
In this test we vary:
- time windows: `window_size` and `step` chosen to match the 5s, 1m and 5m resolutions
- learning rate: `[1e-3, 5e-4, 1e-4]`
- a moderate number of epochs, to keep run times reasonable
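How `window_size` and `step` shape the model input can be sketched with plain NumPy. `make_windows` is a hypothetical helper, not the repository's windowing code. Both parameters count rows here, so how a given pair maps to 5s, 1m or 5m depends on how the pipeline orders and samples flows, which this sketch does not model.

```python
import numpy as np


def make_windows(features, window_size, step):
    """Stack overlapping windows: (n_rows, n_features) -> (n_windows, window_size, n_features)."""
    n_rows, n_features = features.shape
    if n_rows < window_size:
        return np.empty((0, window_size, n_features), dtype=features.dtype)
    # One window starts every `step` rows; the last start leaves room for a full window.
    starts = range(0, n_rows - window_size + 1, step)
    return np.stack([features[s:s + window_size] for s in starts])


flows = np.arange(100 * 4, dtype=np.float32).reshape(100, 4)  # 100 toy rows, 4 toy features
for window_size, step in [(10, 5), (50, 10)]:
    print(window_size, step, make_windows(flows, window_size, step).shape)
# 10 5 (19, 10, 4)
# 50 10 (6, 50, 4)
```

The number of windows is `(n_rows - window_size) // step + 1`, so a larger window or step yields fewer training samples from the same data. When `step < window_size`, consecutive windows share rows, which matters when splitting windows into train and test sets.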


In [None]:
from copy import deepcopy
from sklearn.model_selection import train_test_split

results = []
base_prep = deepcopy(PREPROCESSING_CONFIG)
base_train = deepcopy(TRAINING_CONFIG)

# Grid of windows (in timesteps) and learning rates
window_configs = [
    {"name": "5s", "window_size": 10, "step": 5},
    {"name": "1m", "window_size": 60//6, "step": 10},  # e.g. 10 timesteps
    {"name": "5m", "window_size": 50, "step": 10},
]
lr_grid = [1e-3, 5e-4, 1e-4]

for wc in window_configs:
    PREPROCESSING_CONFIG['use_time_windows'] = True
    PREPROCESSING_CONFIG['window_size'] = wc['window_size']
    PREPROCESSING_CONFIG['step'] = wc['step']

    for lr in lr_grid:
        TRAINING_CONFIG['model_type'] = 'gru'
        TRAINING_CONFIG['hyperparameters']['epochs'] = [5]
        TRAINING_CONFIG['hyperparameters']['batch_size'] = [64]
        TRAINING_CONFIG['hyperparameters']['learning_rate'] = [lr]

        print(f"\n=== Config: {wc['name']} | lr={lr} ===")
        X, y, le = preprocess_pipeline()
        # Hold out the test split before training so the evaluation uses unseen samples
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
        model, log, path = train_model(X_tr, y_tr, model_type='gru')
        rep = evaluate_model_comprehensive(model, X_te, y_te, le.classes_.tolist(), output_dir=f'notebook_eval/{wc["name"]}_lr{lr}')
        results.append({'window': wc['name'], 'lr': lr, 'accuracy': rep['basic_metrics']['accuracy']})

# Restore the original config
PREPROCESSING_CONFIG.update(base_prep)
TRAINING_CONFIG.update(base_train)

pd.DataFrame(results).sort_values('accuracy', ascending=False).head()


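## 5) Reproducibility
We set the seeds to make results repeatable (within the limits of the hardware). Run this cell before the preprocessing and training cells: seeds set afterwards have no effect on runs that have already completed.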


In [None]:
import os, random
import numpy as np
import tensorflow as tf

SEED = 42
os.environ['PYTHONHASHSEED'] = str(SEED)
random.seed(SEED)
np.random.seed(SEED)
tf.random.set_seed(SEED)

print('Seed set:', SEED)


## 6) Data and feature audit
We check the class distribution, the class percentages, and whether the relevant attacks are present in the selected sample.


In [None]:
from collections import Counter

def audit_distribution(y, label_encoder):
    counts = Counter(y)
    classes = label_encoder.classes_.tolist()
    dist = {classes[i]: int(counts.get(i, 0)) for i in range(len(classes))}
    total = sum(dist.values())
    df = pd.DataFrame({
        'classe': list(dist.keys()),
        'conteggio': list(dist.values())
    }).sort_values('conteggio', ascending=False)
    df['percentuale'] = (df['conteggio'] / total * 100).round(2)
    return df

# Live example (reuses y and label_encoder if they exist)
try:
    # Print explicitly: a bare expression inside a try block is not displayed by the notebook
    print(audit_distribution(y, label_encoder))
except NameError:
    print('Run the smoke test first to create X, y and label_encoder')


## 7) Metrics and calibration
In addition to the standard metrics, we add ECE (Expected Calibration Error) to assess how well the predicted probabilities are calibrated.
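For reference, the equal-width binned estimator used in the cell below can be written as follows, where $B_b$ is the set of samples whose top-class confidence falls in bin $b$, $M$ is the number of bins, and $N$ is the total number of samples:

$$
\mathrm{ECE} = \sum_{b=1}^{M} \frac{|B_b|}{N}\,\bigl|\mathrm{acc}(B_b) - \mathrm{conf}(B_b)\bigr|,
\qquad
\mathrm{acc}(B_b) = \frac{1}{|B_b|}\sum_{i \in B_b} \mathbf{1}[\hat{y}_i = y_i],
\quad
\mathrm{conf}(B_b) = \frac{1}{|B_b|}\sum_{i \in B_b} \max_k \hat{p}_{i,k}
$$

As an illustrative calculation with made-up numbers: if 60 of 100 samples fall in a bin with mean confidence 0.9 and accuracy 0.8, and the other 40 fall in a bin with mean confidence 0.6 and accuracy 0.5, then $\mathrm{ECE} = 0.6 \cdot |0.8 - 0.9| + 0.4 \cdot |0.5 - 0.6| = 0.10$.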


In [None]:
import numpy as np

def expected_calibration_error(y_true, y_proba, n_bins=10):
    # Bin samples by their top-class (maximum) probability
    confidences = y_proba.max(axis=1)
    predictions = y_proba.argmax(axis=1)
    accuracies = (predictions == y_true).astype(float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for i in range(n_bins):
        mask = (confidences > bins[i]) & (confidences <= bins[i+1])
        if mask.any():
            avg_conf = confidences[mask].mean()
            avg_acc = accuracies[mask].mean()
            ece += np.abs(avg_acc - avg_conf) * mask.mean()
    return float(ece)

# Example: use the model from the smoke test, if available
try:
    y_proba = model.predict(X_te, verbose=0)
    print('ECE:', expected_calibration_error(y_te, y_proba, n_bins=15))
except Exception as e:
    print('Run the smoke test and evaluation first to create model, X_te and y_te')


## 8) File-by-file documentation
This section explains the implementation choices in the key files: `preprocessing/process.py`, `training/train.py`, `evaluation/metrics.py`, `benchmark.py` and `config.py`.


In [None]:
import inspect, textwrap
import preprocessing.process as P
import training.train as T
import evaluation.metrics as E
import benchmark as B
import config as C

def show_source(obj, start=None, end=None):
    src = inspect.getsource(obj)
    if start or end:
        lines = src.splitlines()
        src = "\n".join(lines[start:end])
    print(textwrap.dedent(src))

print('--- config.py (main sections) ---')
print('DATA_CONFIG:'); print(C.DATA_CONFIG)
print('\nPREPROCESSING_CONFIG:'); print(C.PREPROCESSING_CONFIG)
print('\nTRAINING_CONFIG:'); print(C.TRAINING_CONFIG)

print('\n--- preprocessing.process: load_and_balance_dataset ---')
show_source(P.load_and_balance_dataset)
print('\n--- preprocessing.process: preprocess_pipeline ---')
show_source(P.preprocess_pipeline)

print('\n--- training.train: _train_k_fold ---')
show_source(T._train_k_fold)
print('\n--- training.train: _train_split ---')
show_source(T._train_split)

print('\n--- evaluation.metrics: evaluate_model_comprehensive ---')
show_source(E.evaluate_model_comprehensive)

print('\n--- benchmark.SNNIDSBenchmark (run_smoke_test) ---')
show_source(B.SNNIDSBenchmark.run_smoke_test)


### Design notes
- No hard-coded values: all choices live in `config.py`; the notebook applies overrides only for experiments.
- Reproducible pipeline: sampling and balancing are documented; seeds are fixed.
- Training recipe for tabular data: GRU on 3D windows, per-fold scaling, StratifiedKFold.
- Metrics and PNGs: detailed confusion matrix, cybersecurity confusion matrix, ROC, per-class accuracy, ECE.
- Auditable notebook: uses `inspect` to show the source code that is executed.
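The per-fold scaling noted above can be sketched as follows for 3D windowed input. This is an illustration under assumptions, not the code in `training/train.py`: `fit_feature_scaler` and `scale` are hypothetical helpers, the data is random, and the fixed index split stands in for one fold that `StratifiedKFold` would normally provide.

```python
import numpy as np


def fit_feature_scaler(X_train):
    """Per-feature mean/std over samples and timesteps of a (n, timesteps, features) array."""
    mean = X_train.mean(axis=(0, 1))
    std = X_train.std(axis=(0, 1))
    std[std == 0] = 1.0  # constant features: avoid division by zero
    return mean, std


def scale(X, mean, std):
    """Standardize X with statistics computed elsewhere (broadcast over the feature axis)."""
    return (X - mean) / std


rng = np.random.default_rng(42)
X = rng.normal(loc=5.0, scale=3.0, size=(200, 10, 4))        # toy windows, not real data
train_idx, val_idx = np.arange(0, 160), np.arange(160, 200)  # stand-in for one fold

mean, std = fit_feature_scaler(X[train_idx])  # statistics from the training fold only
X_train = scale(X[train_idx], mean, std)
X_val = scale(X[val_idx], mean, std)          # validation reuses the training statistics
print(X_train.shape, X_val.shape)
```

Fitting the statistics inside each fold keeps validation data from influencing the scaling, which is the leakage the recipe is meant to avoid.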


## 9) Limitations
- Results on rare classes should be interpreted with caution; we always provide a per-class breakdown.
- Calibration (ECE) is informative but not exhaustive.
- The "security" balancing reduces bias but is no substitute for realistic data-collection protocols.
- We avoid leakage by scaling per fold; further audits are still recommended in operational environments.
- Production use requires cost-sensitive evaluation and drift monitoring.
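As an illustration of the cost-sensitive evaluation mentioned in the last point, one simple option is to weight a confusion matrix by a cost matrix. This is a sketch only: `expected_cost` is a hypothetical helper, and the counts and costs below are invented for the example, not results from this notebook.

```python
import numpy as np


def expected_cost(confusion, cost_matrix):
    """Average cost per sample: sum(confusion * cost) / total samples.

    confusion[i, j] counts samples of true class i predicted as class j;
    cost_matrix[i, j] is the cost assigned to that outcome.
    """
    confusion = np.asarray(confusion, dtype=float)
    cost_matrix = np.asarray(cost_matrix, dtype=float)
    return float((confusion * cost_matrix).sum() / confusion.sum())


# Rows/columns: [BENIGN, ATTACK]. Illustrative numbers only.
confusion = [[900, 100],   # 100 false alarms
             [20, 80]]     # 20 missed attacks
costs = [[0, 1],           # a false alarm costs 1 unit
         [10, 0]]          # a missed attack costs 10 units
print(expected_cost(confusion, costs))
```

With these numbers, the 20 missed attacks contribute twice as much cost as the 100 false alarms, a trade-off that plain accuracy does not show.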
