# 05 Advanced Model Poisoning Attacks and Adaptive Aggregation Strategies

This notebook extends the Federated Learning-based Intrusion Detection System (FL-IDS) for Industrial IoT (IIoT) by implementing advanced model update poisoning attacks and adaptive aggregation strategies.

---

### Objectives:
- **Simulate Advanced Model Update Poisoning Attacks:**
  - Sign Flipping Attack
  - Scaling Attack
  - Sybil Attack (Coordinated Attackers)
  - Adaptive Gradient-based Attack

- **Implement Adaptive Aggregation Strategies:**
  - Drift-Aware Weighting of Client Updates
  - Performance Feedback-based Aggregation

- **Simulate Realistic FL Scenarios:**
  - Client Dropout Simulation
  - Data and Computational Heterogeneity

- **Evaluate System Resilience under Adversarial Conditions**
  - Impact on Global Autoencoder Performance
  - Comparative Analysis with Robust Aggregation Methods



In [1]:
import os
import numpy as np
import pandas as pd
import random
import tensorflow as tf
from tensorflow.keras.models import load_model, clone_model
from tensorflow.keras.losses import MeanSquaredError
import joblib

# For reproducibility
SEED = 42
random.seed(SEED)
np.random.seed(SEED)
tf.random.set_seed(SEED)

# === Defining Paths ===
base_path = "D:/August-Thesis/FL-IDS-Surveillance"
results_path = os.path.join(base_path, "notebooks/results")

model_path = os.path.join(results_path, "models/unsupervised/federated/final_federated_autoencoder_20rounds.h5")
scaler_path = os.path.join(results_path, "scalers/minmax_scaler_client_3.pkl")  

data_path = os.path.join(base_path, "data/processed/federated/unsupervised")
test_path = os.path.join(base_path, "data/processed/surv_unsupervised/test_mixed.csv")

client_ids = [f"client_{i}" for i in range(1, 6)]

# === Loading the model (uncompiled) ===
global_model = load_model(model_path, compile=False)
global_model.compile(optimizer="adam", loss=MeanSquaredError())

# === Loading necessary Scaler ===
minmax_scaler = joblib.load(scaler_path)

feature_cols = list(minmax_scaler.feature_names_in_)


# === Load Client Data ===
client_dfs = {
    cid: pd.read_csv(os.path.join(data_path, cid, "train.csv")) 
    for cid in client_ids
}

# === Load Testing Set ===
test_df = pd.read_csv(test_path, low_memory = False)
X_test = test_df[feature_cols].astype(float)
y_true = test_df["Attack_label"].values
X_test_scaled = minmax_scaler.transform(test_df[feature_cols])

print("The setup is complete:")
print(f"- Loaded model: {os.path.basename(model_path)}")
print(f"- Clients: {client_ids}")
print(f"- Test samples: {X_test.shape[0]}")


The setup is complete:
- Loaded model: final_federated_autoencoder_20rounds.h5
- Clients: ['client_1', 'client_2', 'client_3', 'client_4', 'client_5']
- Test samples: 2218834


In [2]:
from sklearn.metrics import f1_score, precision_score, recall_score

# Use the best threshold from Notebook 03
threshold = 0.000639

# Baseline evaluation to establish pre-attack performance
reconstructed = global_model.predict(X_test_scaled, verbose=0)
reconstruction_errors = np.mean(np.square(X_test_scaled - reconstructed), axis=1)
y_pred = (reconstruction_errors > threshold).astype(int)

# Metrics
f1_baseline = f1_score(y_true, y_pred)
precision_baseline = precision_score(y_true, y_pred)
recall_baseline = recall_score(y_true, y_pred)

print(f"Baseline Performance Before Attack:")
print(f"- F1 Score:      {f1_baseline:.4f}")
print(f"- Precision:     {precision_baseline:.4f}")
print(f"- Recall:        {recall_baseline:.4f}")


Baseline Performance Before Attack:
- F1 Score:      0.8373
- Precision:     0.9877
- Recall:        0.7266


In [3]:
def sign_flipping_attack(global_weights, local_weights, scale_factor=1.0):
    """
    Inverts and scales the attacker's update (local minus global weights).

    Args:
        global_weights: List of global model weights
        local_weights: List of local model weights from attacker
        scale_factor: Multiplier for amplification (default = 1.0)
    
    Returns:
        poisoned_weights: List of poisoned weights to submit
    """
    poisoned_weights = []
    for gw, lw in zip(global_weights, local_weights):
        delta = lw - gw
        flipped = gw - scale_factor * delta
        poisoned_weights.append(flipped)
    return poisoned_weights


In [4]:
def scaling_attack(global_weights, local_weights, scale_factor=10.0):
    """
    Amplifies the local update from an attacker before submission.

    Args:
        global_weights: List of global model weights
        local_weights: List of attacker's local weights
        scale_factor: Multiplier to exaggerate the delta (default = 10.0)

    Returns:
        poisoned_weights: List of exaggerated weights
    """
    poisoned_weights = []
    for gw, lw in zip(global_weights, local_weights):
        delta = lw - gw
        scaled = gw + scale_factor * delta
        poisoned_weights.append(scaled)
    return poisoned_weights


In [5]:
def generate_sybil_updates(poisoned_weights, num_sybil_nodes):
    """
    Creates multiple identical poisoned updates for Sybil clients.

    Args:
        poisoned_weights: List of weights from the attacking client
        num_sybil_nodes: Number of fake clients to simulate

    Returns:
        List of poisoned updates (one per Sybil node)
    """
    return [poisoned_weights for _ in range(num_sybil_nodes)]
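
As written, `generate_sybil_updates` returns the same list object `num_sybil_nodes` times, so an in-place change to one Sybil update would change all of them. That is harmless for plain averaging, but a copy-based variant is safer if later code modifies updates per client. The sketch below is illustrative only; `generate_sybil_updates_copied` is a hypothetical name, not part of the notebook.

```python
import numpy as np

def generate_sybil_updates_copied(poisoned_weights, num_sybil_nodes):
    """Hypothetical variant: one independent copy of the weights per Sybil node."""
    return [[w.copy() for w in poisoned_weights] for _ in range(num_sybil_nodes)]

poisoned = [np.ones((2, 2)), np.zeros(2)]

# Behaviour of the original helper: three references to the same list
aliased = [poisoned for _ in range(3)]
# Copy-based variant: three independent lists of arrays
copied = generate_sybil_updates_copied(poisoned, 3)

# An in-place edit through one alias is visible through every alias...
aliased[0][0][0, 0] = 99.0
shared_value = aliased[1][0][0, 0]

# ...but the copies made beforehand keep the original value
independent_value = copied[1][0][0, 0]
```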


In [6]:
def adaptive_gradient_attack(global_weights, local_weights, learning_rate=1.0, noise_std=0.0):
    """
    Computes a crafted adversarial update using the difference between local and global weights.

    Args:
        global_weights: List of global model weights
        local_weights: List of local attacker model weights
        learning_rate: Scalar to amplify attack strength
        noise_std: Standard deviation of optional Gaussian noise added to the gradient (default = 0.0, no noise)

    Returns:
        poisoned_weights: List of adversarially crafted model weights
    """
    poisoned_weights = []
    for gw, lw in zip(global_weights, local_weights):
        gradient = lw - gw
        if noise_std > 0:
            noise = np.random.normal(loc=0.0, scale=noise_std, size=gradient.shape)
            gradient = gradient + noise
        adversarial_update = gw + learning_rate * gradient
        poisoned_weights.append(adversarial_update)
    return poisoned_weights
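
With `noise_std=0.0`, `adaptive_gradient_attack` computes `gw + learning_rate * (lw - gw)`, which is the same expression as `scaling_attack` with `scale_factor=learning_rate`. The two attacks only differ once noise is enabled. The check below uses condensed copies of the two functions and random toy weights (the shapes are arbitrary assumptions).

```python
import numpy as np

def scaling_attack(global_weights, local_weights, scale_factor=10.0):
    # Condensed copy of the notebook's scaling attack
    return [gw + scale_factor * (lw - gw) for gw, lw in zip(global_weights, local_weights)]

def adaptive_gradient_attack(global_weights, local_weights, learning_rate=1.0, noise_std=0.0):
    # Condensed copy of the notebook's adaptive gradient attack
    poisoned = []
    for gw, lw in zip(global_weights, local_weights):
        gradient = lw - gw
        if noise_std > 0:
            gradient = gradient + np.random.normal(0.0, noise_std, size=gradient.shape)
        poisoned.append(gw + learning_rate * gradient)
    return poisoned

# Toy "model" with two layers; shapes are arbitrary
rng = np.random.default_rng(0)
global_w = [rng.normal(size=(4, 3)), rng.normal(size=3)]
local_w = [g + 0.01 * rng.normal(size=g.shape) for g in global_w]

scaled = scaling_attack(global_w, local_w, scale_factor=5.0)

# Without noise the two attacks produce the same weights
no_noise = adaptive_gradient_attack(global_w, local_w, learning_rate=5.0, noise_std=0.0)
same_without_noise = all(np.allclose(a, b) for a, b in zip(no_noise, scaled))

# With noise the adaptive variant deviates from pure scaling
np.random.seed(0)
noisy = adaptive_gradient_attack(global_w, local_w, learning_rate=5.0, noise_std=0.01)
differs_with_noise = not all(np.allclose(a, b) for a, b in zip(noisy, scaled))
```

This suggests that the milder effect of the adaptive run here comes from its settings (one attacker, `learning_rate=5.0`) and not from a different update rule.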


## Testing the Effects of the Attacks

In [9]:
feature_names = feature_cols

In [10]:
from sklearn.metrics import f1_score, precision_score, recall_score, confusion_matrix
import pandas as pd

def run_sign_flipping_attack_simulation(
    global_model,
    client_dfs,
    scaler,
    feature_names,
    X_test_scaled,
    y_true,
    threshold=0.000639,
    attacker_id="client_3",
    attack_round=6,
    scale_factor=1.5,
    save_path="D:/August-Thesis/FL-IDS-Surveillance/notebooks/results/poisoning_sign_flipping_attack.csv"
):
    model_copy = clone_model(global_model)
    model_copy.set_weights(global_model.get_weights())
    model_copy.compile(optimizer="adam", loss=MeanSquaredError())

    f1_list, precision_list, recall_list = [], [], []
    fp_rates, fn_rates = [], []

    for round_num in range(1, 21):
        local_weights = []

        for cid in client_dfs:  # iterate over the clients passed in, not the global client_ids
            df = client_dfs[cid]
            X_local = df[feature_names].astype(float)
            X_scaled = scaler.transform(X_local)

            local_model = clone_model(model_copy)
            local_model.set_weights(model_copy.get_weights())
            local_model.compile(optimizer="adam", loss=MeanSquaredError())
            local_model.fit(X_scaled, X_scaled, epochs=1, batch_size=256, verbose=0)

            weights = local_model.get_weights()

            # Apply attack if this is the attacker and attack round started
            if cid == attacker_id and round_num >= attack_round:
                weights = sign_flipping_attack(model_copy.get_weights(), weights, scale_factor=scale_factor)

            local_weights.append(weights)

        # FedAvg aggregation (uniform mean over all submitted updates)
        averaged_weights = [
            np.mean([w[i] for w in local_weights], axis=0)
            for i in range(len(local_weights[0]))
        ]
        model_copy.set_weights(averaged_weights)

        # Evaluation
        preds = model_copy.predict(X_test_scaled, verbose=0)
        recon_errors = np.mean(np.square(X_test_scaled - preds), axis=1)
        y_pred = (recon_errors > threshold).astype(int)

        f1 = f1_score(y_true, y_pred)
        prec = precision_score(y_true, y_pred)
        rec = recall_score(y_true, y_pred)

        tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
        fp_rate = fp / (fp + tn + 1e-6)
        fn_rate = fn / (fn + tp + 1e-6)

        f1_list.append(f1)
        precision_list.append(prec)
        recall_list.append(rec)
        fp_rates.append(fp_rate)
        fn_rates.append(fn_rate)

        print(f"SignFlip | Round {round_num:02d} | F1: {f1:.4f} | P: {prec:.4f} | R: {rec:.4f} | FP Rate: {fp_rate:.4f} | FN Rate: {fn_rate:.4f}")

    # Saving results and exporting it to csv
    df_out = pd.DataFrame({
        "Round": list(range(1, 21)),
        "F1": f1_list,
        "Precision": precision_list,
        "Recall": recall_list,
        "FP Rate": fp_rates,
        "FN Rate": fn_rates
    })
    df_out.to_csv(save_path, index=False)
    print(f"\nResults saved to: {save_path}")

    return df_out
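
The aggregation step in these simulations is a uniform mean over client weights. FedAvg as it is usually described weights each client by its number of local samples, which matters when client datasets differ in size. The following is a minimal sketch of both variants; `fedavg` and `num_samples` are hypothetical names introduced for illustration, not part of this notebook.

```python
import numpy as np

def fedavg(client_weights, num_samples=None):
    """
    Average per-layer weights across clients.

    client_weights: list (one entry per client) of lists of np.ndarray layers
    num_samples:    optional list of local dataset sizes; if None, every client
                    gets the same coefficient (the behaviour of the loops above)
    """
    if num_samples is None:
        coeffs = np.full(len(client_weights), 1.0 / len(client_weights))
    else:
        coeffs = np.asarray(num_samples, dtype=float)
        coeffs = coeffs / coeffs.sum()

    num_layers = len(client_weights[0])
    return [
        sum(c * w[i] for c, w in zip(coeffs, client_weights))
        for i in range(num_layers)
    ]

# Two toy clients with constant-valued layers
clients = [
    [np.full((2, 2), 1.0), np.full(2, 1.0)],
    [np.full((2, 2), 3.0), np.full(2, 3.0)],
]

uniform = fedavg(clients)                # every entry is 2.0
weighted = fedavg(clients, [300, 100])   # every entry is 0.75 * 1 + 0.25 * 3 = 1.5
```

Under sample-size weighting, a malicious client could also try to inflate its reported sample count, so the uniform mean used here is a reasonable simplification for comparing attacks.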


In [11]:
results_sign_flip = run_sign_flipping_attack_simulation(
    global_model,
    client_dfs,
    minmax_scaler,
    feature_names,
    X_test_scaled,
    y_true
)


SignFlip | Round 01 | F1: 0.8368 | P: 0.9878 | R: 0.7258 | FP Rate: 0.0034 | FN Rate: 0.2742
SignFlip | Round 02 | F1: 0.8363 | P: 0.9879 | R: 0.7250 | FP Rate: 0.0033 | FN Rate: 0.2750
SignFlip | Round 03 | F1: 0.8356 | P: 0.9879 | R: 0.7240 | FP Rate: 0.0033 | FN Rate: 0.2760
SignFlip | Round 04 | F1: 0.8350 | P: 0.9879 | R: 0.7231 | FP Rate: 0.0033 | FN Rate: 0.2769
SignFlip | Round 05 | F1: 0.8348 | P: 0.9879 | R: 0.7227 | FP Rate: 0.0033 | FN Rate: 0.2773
SignFlip | Round 06 | F1: 0.8356 | P: 0.9879 | R: 0.7240 | FP Rate: 0.0033 | FN Rate: 0.2760
SignFlip | Round 07 | F1: 0.8360 | P: 0.9877 | R: 0.7247 | FP Rate: 0.0034 | FN Rate: 0.2753
SignFlip | Round 08 | F1: 0.8364 | P: 0.9877 | R: 0.7253 | FP Rate: 0.0034 | FN Rate: 0.2747
SignFlip | Round 09 | F1: 0.8370 | P: 0.9877 | R: 0.7262 | FP Rate: 0.0034 | FN Rate: 0.2738
SignFlip | Round 10 | F1: 0.8363 | P: 0.9885 | R: 0.7248 | FP Rate: 0.0032 | FN Rate: 0.2752
SignFlip | Round 11 | F1: 0.8354 | P: 0.9886 | R: 0.7233 | FP Rate: 0.

In [12]:
from sklearn.metrics import f1_score, precision_score, recall_score, confusion_matrix
import pandas as pd

def run_scaling_attack_simulation(
    global_model,
    client_dfs,
    scaler,
    feature_names,
    X_test_scaled,
    y_true,
    threshold=0.000639,
    malicious_clients=("client_3", "client_4"),  # tuple avoids a mutable default argument
    attack_round=6,
    scale_factor=10.0,
    save_path="D:/August-Thesis/FL-IDS-Surveillance/notebooks/results/poisoning_scaling_attack.csv"
):
    model_copy = clone_model(global_model)
    model_copy.set_weights(global_model.get_weights())
    model_copy.compile(optimizer="adam", loss=MeanSquaredError())

    f1_list, precision_list, recall_list = [], [], []
    fp_rates, fn_rates = [], []

    for round_num in range(1, 21):
        local_weights = []

        for cid in client_dfs:  # iterate over the clients passed in, not the global client_ids
            df = client_dfs[cid]
            X_local = df[feature_names].astype(float)
            X_scaled = scaler.transform(X_local)

            local_model = clone_model(model_copy)
            local_model.set_weights(model_copy.get_weights())
            local_model.compile(optimizer="adam", loss=MeanSquaredError())
            local_model.fit(X_scaled, X_scaled, epochs=1, batch_size=256, verbose=0)

            weights = local_model.get_weights()

            # Apply scaling attack if client is malicious after attack_round
            if cid in malicious_clients and round_num >= attack_round:
                weights = scaling_attack(model_copy.get_weights(), weights, scale_factor=scale_factor)

            local_weights.append(weights)

        # FedAvg Aggregation
        averaged_weights = [
            np.mean([w[i] for w in local_weights], axis=0)
            for i in range(len(local_weights[0]))
        ]
        model_copy.set_weights(averaged_weights)

        # Evaluate global model
        preds = model_copy.predict(X_test_scaled, verbose=0)
        recon_errors = np.mean(np.square(X_test_scaled - preds), axis=1)
        y_pred = (recon_errors > threshold).astype(int)

        f1 = f1_score(y_true, y_pred)
        prec = precision_score(y_true, y_pred)
        rec = recall_score(y_true, y_pred)

        tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
        fp_rate = fp / (fp + tn + 1e-6)
        fn_rate = fn / (fn + tp + 1e-6)

        f1_list.append(f1)
        precision_list.append(prec)
        recall_list.append(rec)
        fp_rates.append(fp_rate)
        fn_rates.append(fn_rate)

        print(f"ScalingAttack | Round {round_num:02d} | F1: {f1:.4f} | P: {prec:.4f} | R: {rec:.4f} | FP Rate: {fp_rate:.4f} | FN Rate: {fn_rate:.4f}")

    # Saving results
    df_out = pd.DataFrame({
        "Round": list(range(1, 21)),
        "F1": f1_list,
        "Precision": precision_list,
        "Recall": recall_list,
        "FP Rate": fp_rates,
        "FN Rate": fn_rates
    })
    df_out.to_csv(save_path, index=False)
    print(f"\nResults saved to: {save_path}")

    return df_out


In [13]:
results_scaling_attack = run_scaling_attack_simulation(
    global_model,
    client_dfs,
    minmax_scaler,
    feature_names,
    X_test_scaled,
    y_true
)


ScalingAttack | Round 01 | F1: 0.8368 | P: 0.9878 | R: 0.7258 | FP Rate: 0.0034 | FN Rate: 0.2742
ScalingAttack | Round 02 | F1: 0.8363 | P: 0.9879 | R: 0.7250 | FP Rate: 0.0033 | FN Rate: 0.2750
ScalingAttack | Round 03 | F1: 0.8356 | P: 0.9879 | R: 0.7240 | FP Rate: 0.0033 | FN Rate: 0.2760
ScalingAttack | Round 04 | F1: 0.8350 | P: 0.9879 | R: 0.7231 | FP Rate: 0.0033 | FN Rate: 0.2769
ScalingAttack | Round 05 | F1: 0.8348 | P: 0.9879 | R: 0.7227 | FP Rate: 0.0033 | FN Rate: 0.2773
ScalingAttack | Round 06 | F1: 0.8270 | P: 0.9877 | R: 0.7113 | FP Rate: 0.0033 | FN Rate: 0.2887
ScalingAttack | Round 07 | F1: 0.7509 | P: 0.7231 | R: 0.7809 | FP Rate: 0.1117 | FN Rate: 0.2191
ScalingAttack | Round 08 | F1: 0.4391 | P: 0.2822 | R: 0.9885 | FP Rate: 0.9386 | FN Rate: 0.0115
ScalingAttack | Round 09 | F1: 0.4275 | P: 0.2719 | R: 1.0000 | FP Rate: 1.0000 | FN Rate: 0.0000
ScalingAttack | Round 10 | F1: 0.4275 | P: 0.2719 | R: 1.0000 | FP Rate: 1.0000 | FN Rate: 0.0000
ScalingAttack | Roun

In [16]:
def run_sybil_attack_simulation(
    global_model,
    client_dfs,
    scaler,
    feature_names,
    X_test_scaled,
    y_true,
    base_malicious_client="client_3",
    num_sybil=3,
    attack_round=6,
    scale_factor=10.0,
    threshold=0.000639,
    save_path="D:/August-Thesis/FL-IDS-Surveillance/notebooks/results/poisoning_sybil_attack.csv"
):
    model_copy = clone_model(global_model)
    model_copy.set_weights(global_model.get_weights())
    model_copy.compile(optimizer="adam", loss=MeanSquaredError())

    f1_list, precision_list, recall_list = [], [], []
    fp_rates, fn_rates = [], []

    client_ids = list(client_dfs.keys())
    sybil_ids = [f"sybil_{i}" for i in range(num_sybil)]

    for round_num in range(1, 21):
        local_weights = []

        for cid in client_ids:
            df = client_dfs[cid]
            X_local = df[feature_names].astype(float)
            X_scaled = scaler.transform(X_local)

            local_model = clone_model(model_copy)
            local_model.set_weights(model_copy.get_weights())
            local_model.compile(optimizer="adam", loss=MeanSquaredError())
            local_model.fit(X_scaled, X_scaled, epochs=1, batch_size=256, verbose=0)

            weights = local_model.get_weights()

            if cid == base_malicious_client and round_num >= attack_round:
                poisoned_weights = scaling_attack(model_copy.get_weights(), weights, scale_factor=scale_factor)
                for _ in sybil_ids:
                    local_weights.append(poisoned_weights)
            else:
                local_weights.append(weights)

        averaged_weights = [
            np.mean([w[i] for w in local_weights], axis=0)
            for i in range(len(local_weights[0]))
        ]
        model_copy.set_weights(averaged_weights)

        preds = model_copy.predict(X_test_scaled, verbose=0)
        recon_errors = np.mean(np.square(X_test_scaled - preds), axis=1)
        y_pred = (recon_errors > threshold).astype(int)

        f1 = f1_score(y_true, y_pred)
        prec = precision_score(y_true, y_pred)
        rec = recall_score(y_true, y_pred)

        tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
        fp_rate = fp / (fp + tn + 1e-6)
        fn_rate = fn / (fn + tp + 1e-6)

        print(f"SybilAttack | Round {round_num:02d} | F1: {f1:.4f} | P: {prec:.4f} | R: {rec:.4f} | FP Rate: {fp_rate:.4f} | FN Rate: {fn_rate:.4f}")

        f1_list.append(f1)
        precision_list.append(prec)
        recall_list.append(rec)
        fp_rates.append(fp_rate)
        fn_rates.append(fn_rate)

    df_out = pd.DataFrame({
        "Round": list(range(1, 21)),
        "F1": f1_list,
        "Precision": precision_list,
        "Recall": recall_list,
        "FP Rate": fp_rates,
        "FN Rate": fn_rates
    })

    df_out.to_csv(save_path, index=False)
    print(f"\nResults saved to: {save_path}")
    return df_out


In [17]:
results_sybil_attack = run_sybil_attack_simulation(
    global_model,
    client_dfs,
    minmax_scaler,
    feature_names,
    X_test_scaled,
    y_true
)


SybilAttack | Round 01 | F1: 0.8368 | P: 0.9878 | R: 0.7258 | FP Rate: 0.0034 | FN Rate: 0.2742
SybilAttack | Round 02 | F1: 0.8363 | P: 0.9879 | R: 0.7250 | FP Rate: 0.0033 | FN Rate: 0.2750
SybilAttack | Round 03 | F1: 0.8356 | P: 0.9879 | R: 0.7240 | FP Rate: 0.0033 | FN Rate: 0.2760
SybilAttack | Round 04 | F1: 0.8350 | P: 0.9879 | R: 0.7231 | FP Rate: 0.0033 | FN Rate: 0.2769
SybilAttack | Round 05 | F1: 0.8348 | P: 0.9879 | R: 0.7227 | FP Rate: 0.0033 | FN Rate: 0.2773
SybilAttack | Round 06 | F1: 0.8263 | P: 0.9876 | R: 0.7102 | FP Rate: 0.0033 | FN Rate: 0.2898
SybilAttack | Round 07 | F1: 0.6929 | P: 0.5974 | R: 0.8248 | FP Rate: 0.2075 | FN Rate: 0.1752
SybilAttack | Round 08 | F1: 0.4275 | P: 0.2719 | R: 1.0000 | FP Rate: 1.0000 | FN Rate: 0.0000
SybilAttack | Round 09 | F1: 0.4275 | P: 0.2719 | R: 1.0000 | FP Rate: 1.0000 | FN Rate: 0.0000
SybilAttack | Round 10 | F1: 0.4275 | P: 0.2719 | R: 1.0000 | FP Rate: 1.0000 | FN Rate: 0.0000
SybilAttack | Round 11 | F1: 0.4275 | P:

In [18]:
def run_adaptive_gradient_attack(
    global_model,
    client_dfs,
    scaler,
    feature_names,
    X_test_scaled,
    y_true,
    attacker_id="client_3",
    attack_round=6,
    learning_rate=5.0,
    noise_std=0.0,
    threshold=0.000639,
    save_path="D:/August-Thesis/FL-IDS-Surveillance/notebooks/results/poisoning_adaptive_gradient_attack.csv"
):
    model_copy = clone_model(global_model)
    model_copy.set_weights(global_model.get_weights())
    model_copy.compile(optimizer="adam", loss=MeanSquaredError())

    f1_list, precision_list, recall_list = [], [], []
    fp_rates, fn_rates = [], []

    client_ids = list(client_dfs.keys())

    for round_num in range(1, 21):
        local_weights = []

        for cid in client_ids:
            df = client_dfs[cid]
            X_local = df[feature_names].astype(float)
            X_scaled = scaler.transform(X_local)

            local_model = clone_model(model_copy)
            local_model.set_weights(model_copy.get_weights())
            local_model.compile(optimizer="adam", loss=MeanSquaredError())
            local_model.fit(X_scaled, X_scaled, epochs=1, batch_size=256, verbose=0)

            weights = local_model.get_weights()

            if cid == attacker_id and round_num >= attack_round:
                poisoned_weights = adaptive_gradient_attack(
                    global_weights=model_copy.get_weights(),
                    local_weights=weights,
                    learning_rate=learning_rate,
                    noise_std=noise_std
                )
                local_weights.append(poisoned_weights)
            else:
                local_weights.append(weights)

        averaged_weights = [
            np.mean([w[i] for w in local_weights], axis=0)
            for i in range(len(local_weights[0]))
        ]
        model_copy.set_weights(averaged_weights)

        preds = model_copy.predict(X_test_scaled, verbose=0)
        recon_errors = np.mean(np.square(X_test_scaled - preds), axis=1)
        y_pred = (recon_errors > threshold).astype(int)

        f1 = f1_score(y_true, y_pred)
        prec = precision_score(y_true, y_pred)
        rec = recall_score(y_true, y_pred)

        tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
        fp_rate = fp / (fp + tn + 1e-6)
        fn_rate = fn / (fn + tp + 1e-6)

        print(f"AdaptiveGradAttack | Round {round_num:02d} | F1: {f1:.4f} | P: {prec:.4f} | R: {rec:.4f} | FP Rate: {fp_rate:.4f} | FN Rate: {fn_rate:.4f}")

        f1_list.append(f1)
        precision_list.append(prec)
        recall_list.append(rec)
        fp_rates.append(fp_rate)
        fn_rates.append(fn_rate)

    df_out = pd.DataFrame({
        "Round": list(range(1, 21)),
        "F1": f1_list,
        "Precision": precision_list,
        "Recall": recall_list,
        "FP Rate": fp_rates,
        "FN Rate": fn_rates
    })

    df_out.to_csv(save_path, index=False)
    print(f"\nResults saved to: {save_path}")
    return df_out


In [19]:
adaptive_results = run_adaptive_gradient_attack(
    global_model,
    client_dfs,
    minmax_scaler,
    feature_names,
    X_test_scaled,
    y_true
)


AdaptiveGradAttack | Round 01 | F1: 0.8368 | P: 0.9878 | R: 0.7258 | FP Rate: 0.0034 | FN Rate: 0.2742
AdaptiveGradAttack | Round 02 | F1: 0.8363 | P: 0.9879 | R: 0.7250 | FP Rate: 0.0033 | FN Rate: 0.2750
AdaptiveGradAttack | Round 03 | F1: 0.8356 | P: 0.9879 | R: 0.7240 | FP Rate: 0.0033 | FN Rate: 0.2760
AdaptiveGradAttack | Round 04 | F1: 0.8350 | P: 0.9879 | R: 0.7231 | FP Rate: 0.0033 | FN Rate: 0.2769
AdaptiveGradAttack | Round 05 | F1: 0.8348 | P: 0.9879 | R: 0.7227 | FP Rate: 0.0033 | FN Rate: 0.2773
AdaptiveGradAttack | Round 06 | F1: 0.8330 | P: 0.9879 | R: 0.7201 | FP Rate: 0.0033 | FN Rate: 0.2799
AdaptiveGradAttack | Round 07 | F1: 0.8333 | P: 0.9877 | R: 0.7207 | FP Rate: 0.0034 | FN Rate: 0.2793
AdaptiveGradAttack | Round 08 | F1: 0.8339 | P: 0.9876 | R: 0.7216 | FP Rate: 0.0034 | FN Rate: 0.2784
AdaptiveGradAttack | Round 09 | F1: 0.8264 | P: 0.9878 | R: 0.7103 | FP Rate: 0.0033 | FN Rate: 0.2897
AdaptiveGradAttack | Round 10 | F1: 0.8308 | P: 0.9877 | R: 0.7169 | FP R

## Summary: Advanced Model Poisoning Attacks (Autoencoder-based FL-IDS)

This section summarizes the impact of four advanced model poisoning attacks on the Federated Autoencoder Intrusion Detection System.

| **Attack Type**       | **Malicious Client(s)** | **Peak FP Rate** | **Peak FN Rate** | **Lowest F1 Score** | **Notes** |
|-----------------------|-------------------------|------------------|------------------|---------------------|-----------|
| **Sign Flipping**     | `client_3`              | 0.0034           | 0.2832           | ~0.8310             | Very mild impact, surprising resilience, possibly due to low magnitude or model saturation |
| **Scaling Attack**    | `client_3`, `client_4`  | 1.0000           | 0.0000           | **0.4275**          | Catastrophic collapse: FP rate 0.94 at Round 8 and 1.00 from Round 9 in the logged rounds |
| **Sybil Attack**      | `client_3` + 2 Sybils (3 identical updates) | 1.0000 | 0.0000    | **0.4275**          | Collapse to FP rate 1.00 at Round 8, one round earlier than scaling; same end state |
| **Adaptive Gradient** | `client_3`              | 0.0034           | 0.2899           | ~0.8264             | Steady drift, moderate degradation, stealthy and hard to detect |

---

### Observations:
- **Scaling and Sybil Attacks** are the most damaging: within a few rounds the global model flags every test sample as an attack (FP rate 1.0, recall 1.0), and F1 falls to 0.4275.
- **Adaptive Gradient** shifts the metrics only slightly (lowest F1 about 0.8264), which makes it harder to spot from round-level metrics alone.
- **Sign Flipping** had limited impact in this setup, plausibly because a single flipped update with `scale_factor=1.5` is outweighed by four honest updates under uniform averaging.
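
One rough way to read these results is to ask how far uniform averaging moves the global model relative to a single honest update. The sketch below makes a strong simplifying assumption that does not hold in practice: every client computes the same local delta. Under that assumption the aggregated step is a fixed multiple of the honest delta. `fedavg_step_factor` is a hypothetical helper; the client counts and scale factors are the defaults of the simulation functions above.

```python
def fedavg_step_factor(num_honest, num_poisoned, scale):
    """
    Multiple of the honest delta applied by a uniform mean when `num_poisoned`
    updates are gw + scale * delta and `num_honest` updates are gw + delta.
    Idealised: assumes all clients share the same delta.
    """
    return (num_honest + num_poisoned * scale) / (num_honest + num_poisoned)

factors = {
    # client_3 submits gw - 1.5 * delta, four honest clients
    "sign_flipping": fedavg_step_factor(4, 1, -1.5),
    # client_3 and client_4 submit gw + 10 * delta, three honest clients
    "scaling": fedavg_step_factor(3, 2, 10.0),
    # client_3's update is replaced by three copies of gw + 10 * delta
    "sybil": fedavg_step_factor(4, 3, 10.0),
    # client_3 submits gw + 5 * delta, four honest clients
    "adaptive_gradient": fedavg_step_factor(4, 1, 5.0),
}

for name, value in factors.items():
    print(f"{name:18s} step factor: {value:.3f}")
```

Under this idealisation, sign flipping only halves the honest step, the adaptive run overshoots by a factor below 2, and the scaling and Sybil runs overshoot by a factor above 4 in every round, which is in line with the ordering in the table.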

---

Next, we proceed to implement **defense strategies**:
- **Coordinate-wise Median Aggregation**
- **Krum Aggregation**
- **Performance Feedback Aggregation**


In [20]:
def median_aggregation(client_weights_list):
    """
    Performs element-wise median aggregation across client weights.
    
    Args:
        client_weights_list: List of weight lists from all clients
    Returns:
        aggregated_weights: List of aggregated weights (same shape as client_weights)
    """
    num_layers = len(client_weights_list[0])
    aggregated_weights = []

    for layer_idx in range(num_layers):
        # Stack weights for this layer from all clients
        stacked_layer_weights = np.stack([client[layer_idx] for client in client_weights_list], axis=0)
        # Compute median across clients (axis=0)
        layer_median = np.median(stacked_layer_weights, axis=0)
        aggregated_weights.append(layer_median)

    return aggregated_weights
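
To see why the element-wise median helps against a scaled update, compare it with the mean on a toy example with four honest clients and one extreme outlier. The values are made up for illustration; `median_aggregation` is repeated in condensed form so the snippet runs on its own.

```python
import numpy as np

def median_aggregation(client_weights_list):
    # Condensed copy of the function above: element-wise median per layer
    num_layers = len(client_weights_list[0])
    return [
        np.median(np.stack([c[i] for c in client_weights_list], axis=0), axis=0)
        for i in range(num_layers)
    ]

# Four honest clients with similar weights, one attacker with a huge update
honest = [[np.full((2, 2), v), np.full(2, v)] for v in (0.9, 1.0, 1.1, 1.0)]
attacker = [np.full((2, 2), 100.0), np.full(2, 100.0)]
updates = honest + [attacker]

# The mean is dragged toward the attacker: (0.9 + 1.0 + 1.1 + 1.0 + 100) / 5 = 20.8
mean_layer0 = np.mean([u[0] for u in updates], axis=0)

# The median of [0.9, 1.0, 1.0, 1.1, 100] is 1.0, so the outlier is ignored
median_layer0 = median_aggregation(updates)[0]
```

The median tolerates outliers only while fewer than half of the submitted updates are malicious per coordinate.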


In [21]:
from sklearn.metrics import f1_score, precision_score, recall_score

NUM_ROUNDS = 20
malicious_client_id = "client_3"
attack_type = "Scaling_MedianDef"
scaling_factor = 100  # strong attack
threshold = 0.000639  
results_path = "D:/August-Thesis/FL-IDS-Surveillance/notebooks/results/defense_median_scaling_attack.csv"

# Initialize a fresh global model
global_model_defended = clone_model(global_model)
global_model_defended.set_weights(global_model.get_weights())
global_model_defended.compile(optimizer='adam', loss=MeanSquaredError())

metrics_log = []

for round_num in range(1, NUM_ROUNDS + 1):
    collected_weights = []

    for client_id, client_data in client_dfs.items():
        local_model = clone_model(global_model_defended)
        local_model.set_weights(global_model_defended.get_weights())
        local_model.compile(optimizer='adam', loss=MeanSquaredError())

        X_scaled = minmax_scaler.transform(client_data[feature_cols])
        local_model.fit(X_scaled, X_scaled, epochs=1, batch_size=256, verbose=0)

        local_weights = local_model.get_weights()

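        # Apply scaling attack on the malicious client (active from round 1 in this cell).
        # Note: w + s * (w - gw) equals gw + (1 + s) * (w - gw), so the effective
        # factor is scaling_factor + 1 relative to scaling_attack().
        if client_id == malicious_client_id:
            local_weights = [w + scaling_factor * (w - gw)
                             for w, gw in zip(local_weights, global_model_defended.get_weights())]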

        collected_weights.append(local_weights)

    # Median Aggregation
    aggregated_weights = median_aggregation(collected_weights)
    global_model_defended.set_weights(aggregated_weights)

    # Evaluate
    reconstructed = global_model_defended.predict(X_test_scaled, verbose=0)
    reconstruction_errors = np.mean(np.square(X_test_scaled - reconstructed), axis=1)
    y_pred = (reconstruction_errors > threshold).astype(int)

    f1 = f1_score(y_true, y_pred)
    p = precision_score(y_true, y_pred)
    r = recall_score(y_true, y_pred)

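    # NOTE: these are fractions of all test samples, not the per-class rates
    # fp / (fp + tn) and fn / (fn + tp) logged by the attack simulations above.
    fp_rate = np.mean((y_pred == 1) & (y_true == 0))
    fn_rate = np.mean((y_pred == 0) & (y_true == 1))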

    metrics_log.append({
        "Round": round_num,
        "F1": f1,
        "Precision": p,
        "Recall": r,
        "FP Rate": fp_rate,
        "FN Rate": fn_rate
    })

    print(f"{attack_type} | Round {round_num:02d} | F1: {f1:.4f} | P: {p:.4f} | R: {r:.4f} | FP Rate: {fp_rate:.4f} | FN Rate: {fn_rate:.4f}")


df_metrics = pd.DataFrame(metrics_log)
df_metrics.to_csv(results_path, index=False)  # results_path is set at the top of this cell
print(f"\nResults saved to: {results_path}")


Scaling_MedianDef | Round 01 | F1: 0.8372 | P: 0.9878 | R: 0.7264 | FP Rate: 0.0024 | FN Rate: 0.0744
Scaling_MedianDef | Round 02 | F1: 0.8376 | P: 0.9880 | R: 0.7270 | FP Rate: 0.0024 | FN Rate: 0.0742
Scaling_MedianDef | Round 03 | F1: 0.8362 | P: 0.9879 | R: 0.7249 | FP Rate: 0.0024 | FN Rate: 0.0748
Scaling_MedianDef | Round 04 | F1: 0.8349 | P: 0.9879 | R: 0.7229 | FP Rate: 0.0024 | FN Rate: 0.0753
Scaling_MedianDef | Round 05 | F1: 0.8350 | P: 0.9879 | R: 0.7231 | FP Rate: 0.0024 | FN Rate: 0.0753
Scaling_MedianDef | Round 06 | F1: 0.8350 | P: 0.9879 | R: 0.7231 | FP Rate: 0.0024 | FN Rate: 0.0753
Scaling_MedianDef | Round 07 | F1: 0.8350 | P: 0.9884 | R: 0.7229 | FP Rate: 0.0023 | FN Rate: 0.0753
Scaling_MedianDef | Round 08 | F1: 0.8322 | P: 0.9883 | R: 0.7186 | FP Rate: 0.0023 | FN Rate: 0.0765
Scaling_MedianDef | Round 09 | F1: 0.8302 | P: 0.9885 | R: 0.7157 | FP Rate: 0.0023 | FN Rate: 0.0773
Scaling_MedianDef | Round 10 | F1: 0.8286 | P: 0.9885 | R: 0.7132 | FP Rate: 0.002

Results saved to: D:/August-Thesis/FL-IDS-Surveillance/notebooks/results/defense_median_scaling_attack.csv
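
The "FP Rate" and "FN Rate" columns in the defense logs are not on the same scale as those in the attack simulations. The attack cells divide by the size of the relevant class (`fp / (fp + tn)`, `fn / (fn + tp)`), while the defense cell takes `np.mean(...)` over all test samples. The toy example below, with made-up labels, shows the difference and how to convert one into the other.

```python
import numpy as np

# Toy labels: 3 attacks (1) and 7 benign samples (0)
y_true = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0])
y_pred = np.array([1, 1, 0, 1, 0, 0, 0, 0, 0, 0])

# Defense-cell style: fraction of ALL samples that are FP / FN
fp_fraction = np.mean((y_pred == 1) & (y_true == 0))   # 1 of 10 samples
fn_fraction = np.mean((y_pred == 0) & (y_true == 1))   # 1 of 10 samples

# Attack-cell style: per-class rates, obtained by dividing by the class share
fp_rate = fp_fraction / np.mean(y_true == 0)   # 1 of 7 benign samples
fn_rate = fn_fraction / np.mean(y_true == 1)   # 1 of 3 attack samples

# The per-class FN rate is the complement of recall
recall = np.sum((y_pred == 1) & (y_true == 1)) / np.sum(y_true == 1)
```

The logged values fit this reading: in Round 01 of the median run, recall is 0.7264, so the per-class FN rate is about 0.2736, and 0.0744 / 0.2736 is about 0.272, close to the attack share implied by the precision of 0.2719 when every sample is flagged.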


In [25]:
from tensorflow.keras.models import load_model
from tensorflow.keras.losses import MeanSquaredError
from sklearn.metrics import f1_score, precision_score, recall_score
import joblib
import pandas as pd
import numpy as np
import os

# === Paths ===
base_path = "D:/August-Thesis/FL-IDS-Surveillance"
results_dir = os.path.join(base_path, "notebooks/results")
model_path = os.path.join(results_dir, "models/unsupervised/federated/final_federated_autoencoder_20rounds.h5")
scaler_path = os.path.join(results_dir, "scalers/minmax_scaler_client_3.pkl")
test_path = os.path.join(base_path, "data/processed/surv_unsupervised/test_mixed.csv")

# === Load model and scaler
global_model = load_model(model_path, compile=False)
global_model.compile(optimizer="adam", loss=MeanSquaredError())
minmax_scaler = joblib.load(scaler_path)

# === Loading test set
test_df = pd.read_csv(test_path, low_memory=False)  # same option as in the first cell
feature_cols = list(minmax_scaler.feature_names_in_)
X_test_scaled = minmax_scaler.transform(test_df[feature_cols])
y_true = test_df["Attack_label"].values

# === Evaluate the baseline
reconstructed = global_model.predict(X_test_scaled, verbose=0)
reconstruction_errors = np.mean(np.square(X_test_scaled - reconstructed), axis=1)
threshold = 0.000639  # validated threshold
y_pred = (reconstruction_errors > threshold).astype(int)

print("F1:", f1_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall:", recall_score(y_true, y_pred))




F1: 0.8372996201343141
Precision: 0.987711770830047
Recall: 0.7266437993935586


In [28]:
from sklearn.metrics import f1_score, precision_score, recall_score
import numpy as np
import pandas as pd
import os

# === Krum Aggregation ===
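def krum_aggregation(weights_list, f):
    """
    Krum: return the single client update whose summed squared distance
    to its n - f - 2 nearest neighbours is smallest.

    Args:
        weights_list: List of per-client weight lists
        f: Assumed number of malicious clients
    """
    n = len(weights_list)
    scores = []
    for i in range(n):
        distances = []
        for j in range(n):
            if i != j:
                # Squared L2 distance summed over all layers
                dist = sum(np.sum((w1 - w2) ** 2) for w1, w2 in zip(weights_list[i], weights_list[j]))
                distances.append(dist)
        distances.sort()
        score = sum(distances[:n - f - 2])
        scores.append(score)
    selected_idx = np.argmin(scores)
    return weights_list[selected_idx]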

# === Scaling Attack ===
def scaling_attack(global_weights, local_weights, scale_factor=10.0):
    return [gw + scale_factor * (lw - gw) for gw, lw in zip(global_weights, local_weights)]

# === Simulation ===
NUM_ROUNDS = 20
malicious_client_id = "client_3"
attack_start_round = 6
threshold = 0.000639
f = 1  # 1 attacker

global_model_krum = clone_model(global_model)
global_model_krum.set_weights(global_model.get_weights())
global_model_krum.compile(optimizer="adam", loss=MeanSquaredError())

metrics_log = []

for round_num in range(1, NUM_ROUNDS + 1):
    collected_weights = []

    for cid in client_ids:
        local_model = clone_model(global_model_krum)
        local_model.set_weights(global_model_krum.get_weights())
        local_model.compile(optimizer="adam", loss=MeanSquaredError())

        X_scaled = minmax_scaler.transform(client_dfs[cid][feature_cols]) 
        local_model.fit(X_scaled, X_scaled, epochs=1, batch_size=256, verbose=0)
        local_weights = local_model.get_weights()

        if cid == malicious_client_id and round_num >= attack_start_round:
            local_weights = scaling_attack(global_model_krum.get_weights(), local_weights)

        collected_weights.append(local_weights)

    aggregated_weights = krum_aggregation(collected_weights, f if round_num >= attack_start_round else 0)
    global_model_krum.set_weights(aggregated_weights)

    # Evaluate
    preds = global_model_krum.predict(X_test_scaled, verbose=0)
    errors = np.mean(np.square(X_test_scaled - preds), axis=1)
    y_pred = (errors > threshold).astype(int)

    f1 = f1_score(y_true, y_pred)
    p = precision_score(y_true, y_pred)
    r = recall_score(y_true, y_pred)
    # NOTE: fractions of all test samples, not per-class rates (fp / (fp + tn), fn / (fn + tp))
    fp_rate = np.mean((y_true == 0) & (y_pred == 1))
    fn_rate = np.mean((y_true == 1) & (y_pred == 0))

    print(f"Krum_Scaling | Round {round_num:02d} | F1: {f1:.4f} | P: {p:.4f} | R: {r:.4f} | FP Rate: {fp_rate:.4f} | FN Rate: {fn_rate:.4f}")

    metrics_log.append({
        "Round": round_num, "F1": f1, "Precision": p, "Recall": r,
        "FP Rate": fp_rate, "FN Rate": fn_rate
    })

# Save results
df_metrics = pd.DataFrame(metrics_log)
save_path = os.path.join(results_dir, "defense_krum_scaling_attack.csv")
df_metrics.to_csv(save_path, index=False)
print(f"\nResults saved to: {save_path}")


Krum_Scaling | Round 01 | F1: 0.8385 | P: 0.9878 | R: 0.7284 | FP Rate: 0.0024 | FN Rate: 0.0738
Krum_Scaling | Round 02 | F1: 0.8379 | P: 0.9879 | R: 0.7275 | FP Rate: 0.0024 | FN Rate: 0.0741
Krum_Scaling | Round 03 | F1: 0.8383 | P: 0.9880 | R: 0.7280 | FP Rate: 0.0024 | FN Rate: 0.0739
Krum_Scaling | Round 04 | F1: 0.8373 | P: 0.9880 | R: 0.7265 | FP Rate: 0.0024 | FN Rate: 0.0743
Krum_Scaling | Round 05 | F1: 0.8367 | P: 0.9880 | R: 0.7256 | FP Rate: 0.0024 | FN Rate: 0.0746
Krum_Scaling | Round 06 | F1: 0.8369 | P: 0.9880 | R: 0.7259 | FP Rate: 0.0024 | FN Rate: 0.0745
Krum_Scaling | Round 07 | F1: 0.8366 | P: 0.9880 | R: 0.7254 | FP Rate: 0.0024 | FN Rate: 0.0746
Krum_Scaling | Round 08 | F1: 0.8359 | P: 0.9879 | R: 0.7245 | FP Rate: 0.0024 | FN Rate: 0.0749
Krum_Scaling | Round 09 | F1: 0.8339 | P: 0.9884 | R: 0.7212 | FP Rate: 0.0023 | FN Rate: 0.0758
Krum_Scaling | Round 10 | F1: 0.8323 | P: 0.9884 | R: 0.7188 | FP Rate: 0.0023 | FN Rate: 0.0765
Krum_Scaling | Round 11 | F1: 
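
A small toy run can help confirm that Krum selects an honest update when one client submits an extreme one. The sketch below repeats `krum_aggregation` in condensed form and uses made-up single-layer weights. Krum's guarantee is usually stated for n >= 2f + 3 clients, which the setup here (n = 5, f = 1) meets with equality.

```python
import numpy as np

def krum_aggregation(weights_list, f):
    # Condensed copy of the function above
    n = len(weights_list)
    scores = []
    for i in range(n):
        distances = sorted(
            sum(np.sum((w1 - w2) ** 2) for w1, w2 in zip(weights_list[i], weights_list[j]))
            for j in range(n) if j != i
        )
        # Sum of distances to the n - f - 2 closest other updates
        scores.append(sum(distances[:n - f - 2]))
    return weights_list[int(np.argmin(scores))]

# Four honest clients clustered near 1.0, one attacker far away
rng = np.random.default_rng(0)
honest = [[1.0 + 0.01 * rng.normal(size=(2, 2))] for _ in range(4)]
attacker = [np.full((2, 2), 50.0)]
updates = honest + [attacker]

selected = krum_aggregation(updates, f=1)

# Krum returns one of the submitted updates unchanged, here an honest one
picked_honest = any(selected is u for u in honest)
```

Because Krum returns a single client's weights and discards the rest, it trades some of the averaging benefit for robustness, which may explain small differences from the median run.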