# Federated Learning: Attack and Defense Demonstration

In this notebook, we demonstrate an attack on a federated learning environment and a way to defend against it. Federated learning is a machine learning technique in which a model is trained across multiple devices, each of which keeps its own data. This approach is mainly used to preserve data privacy, or when it would be impossible to transfer the data to a central server.

However, it is possible to attack a federated learning environment, particularly through the use of malicious devices that introduce noisy data. In this demonstration, we will simulate this kind of attack and then introduce a defense mechanism to mitigate it.

### Import necessary libraries and define the model

In [None]:
import numpy as np
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

# Define a simple linear regression model
class LinearRegression:
    def __init__(self, learning_rate=0.0001, weight=None, bias=None):
        self.learning_rate = learning_rate
        self.weight = weight if weight is not None else np.random.randn(10)  # Initialize weight as a vector of size 10
        self.bias = bias if bias is not None else np.random.randn()

    def predict(self, x):
        return np.dot(self.weight, x) + self.bias

    def update(self, gradient_weight, gradient_bias):
        self.weight -= self.learning_rate * gradient_weight
        self.bias -= self.learning_rate * gradient_bias

    def mse(self, data):
        total_error = 0
        for x, y in data:
            y_pred = self.predict(x)
            total_error += (y - y_pred) ** 2
        return total_error / len(data)

## Linear Regression Model
We defined a simple linear regression model with the following components:
- **Initialization**: The model has a learning rate, a weight vector, and a bias. The weight vector has 10 entries, one per input feature.
- **Predict**: This function predicts the output for an input `x` using $y = w \cdot x + b$, where `w` is the weight vector and `b` is the bias.
- **Update**: This function updates the model's weights and bias from the gradients. The learning rate sets the step size of the update.
- **Mean Squared Error (MSE)**: This function calculates the mean squared error over the given data. It measures the model's performance: lower values mean better performance.
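
As a rough illustration, the same prediction and MSE can be computed for a whole batch at once. The sketch below assumes the data is held as a feature matrix `X_batch` of shape `(n, 10)` and a target vector `y_batch`; these names and the helper `batch_mse` are illustrative and not part of the notebook.

```python
import numpy as np

def batch_mse(weight, bias, X_batch, y_batch):
    """Mean squared error of y = w . x + b over all rows of X_batch."""
    y_pred = X_batch @ weight + bias  # one prediction per row
    return np.mean((y_batch - y_pred) ** 2)

# Tiny made-up batch: 3 samples with 10 features each
rng = np.random.default_rng(0)
X_batch = rng.normal(size=(3, 10))
y_batch = rng.normal(size=3)
weight, bias = rng.normal(size=10), 0.5

# Same quantity computed point by point, as LinearRegression.mse does
loop_mse = sum((y - (np.dot(weight, x) + bias)) ** 2
               for x, y in zip(X_batch, y_batch)) / len(y_batch)
print(np.isclose(batch_mse(weight, bias, X_batch, y_batch), loop_mse))
```

The batch form gives the same value as the loop; it only avoids iterating over the samples in Python.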


Define functions to simulate local training on devices and federated learning:

In [None]:
# Simulate local training on devices
def local_training(model, data, epochs=100):
    for _ in range(epochs):
        gradient_weight = 0
        gradient_bias = 0
        for x, y in data:
            y_pred = model.predict(x)
            gradient_weight += -2 * x * (y - y_pred)
            gradient_bias += -2 * (y - y_pred)
        model.update(gradient_weight, gradient_bias)

# Federated learning simulation
def federated_learning(global_model, local_data_list, epochs=100):
    num_devices = len(local_data_list)
    for _ in range(epochs):
        global_gradient_weight = 0
        global_gradient_bias = 0
        for local_data in local_data_list:
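            # Copy the weights: update() modifies them in place, so sharing the array
            # with the global model would let local training overwrite it directly
            local_model = LinearRegression(weight=global_model.weight.copy(), bias=global_model.bias)
            local_training(local_model, local_data)
            global_gradient_weight += (local_model.weight - global_model.weight) / num_devices  # Average difference
            global_gradient_bias += (local_model.bias - global_model.bias) / num_devices  # Average difference
        # Move the global model by the average difference to the local models
        global_model.weight = global_model.weight + global_gradient_weight
        global_model.bias = global_model.bias + global_gradient_bias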

# Generate random data for devices
def generate_data(num_points, x_range, y_range):
    x = np.random.uniform(x_range[0], x_range[1], num_points)
    y = np.random.uniform(y_range[0], y_range[1], num_points)
    return list(zip(x, y))

## Local Training and Federated Learning
### Local Training
The `local_training` function simulates the training of a model on a local device. Steps:

1. For each epoch, initialize the gradients for weight and bias to 0.
2. For each data point, predict the output and calculate the gradients for weight and bias based on the error.
3. After processing all of the data points, update the model's weight and bias using the summed gradients.
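
The gradients accumulated in step 2 can also be written in matrix form. The sketch below is an illustration only: the helper `summed_gradients` and the random batch are made up, and the loop at the end reproduces the per-point accumulation for comparison.

```python
import numpy as np

def summed_gradients(weight, bias, X_batch, y_batch):
    """Gradients of the summed squared error over a batch."""
    residual = y_batch - (X_batch @ weight + bias)  # y - y_pred for every row
    gradient_weight = -2 * (X_batch.T @ residual)   # sum of -2 * x * (y - y_pred)
    gradient_bias = -2 * residual.sum()             # sum of -2 * (y - y_pred)
    return gradient_weight, gradient_bias

# Made-up batch: 5 samples with 10 features each
rng = np.random.default_rng(0)
X_batch, y_batch = rng.normal(size=(5, 10)), rng.normal(size=5)
weight, bias = rng.normal(size=10), 0.1

# Reference: accumulate the gradients point by point
loop_weight, loop_bias = 0, 0
for x, y in zip(X_batch, y_batch):
    error = y - (np.dot(weight, x) + bias)
    loop_weight += -2 * x * error
    loop_bias += -2 * error

vec_weight, vec_bias = summed_gradients(weight, bias, X_batch, y_batch)
print(np.allclose(vec_weight, loop_weight), np.isclose(vec_bias, loop_bias))
```

Because the gradients are summed rather than averaged, their size grows with the number of samples on a device, which is one reason a small learning rate is needed.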

### Federated Learning
The `federated_learning` function simulates the federated learning process. Steps:

1. For each epoch, initialize the global gradients for weight and bias to zero.
2. For each local dataset (representing a device's data):
   - Create a local model with the same weight and bias as the global model.
   - Train the local model using the `local_training` function.
   - Compute the difference between the local model's weight and bias and the global model's, divide it by the number of devices, and add it to the global gradients, so that the accumulated value is the average difference across devices.
3. After processing all local datasets, the global model's weight and bias are updated using the accumulated global gradients.
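
The aggregation in steps 2 and 3 can be illustrated in isolation. The sketch below assumes the server moves the global weights by the full average difference to the local weights (a server step size of 1, as in federated averaging); the function name `average_update` and the numbers are made up.

```python
import numpy as np

def average_update(global_weight, local_weights):
    """Return the global weights moved by the mean difference to the local weights."""
    deltas = [w - global_weight for w in local_weights]
    return global_weight + np.mean(deltas, axis=0)

global_weight = np.array([0.0, 0.0])
local_weights = [np.array([1.0, 2.0]), np.array([3.0, 0.0])]  # two devices

new_global = average_update(global_weight, local_weights)
print(new_global)  # [2. 1.]
```

With equal weighting, the new global weights are simply the mean of the local weights.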

### Data Generation
The `generate_data` function generates random data for the devices: it draws random `x` and `y` values uniformly from the specified ranges and returns them as a list of `(x, y)` pairs. It is not called in the rest of this notebook, which uses a real dataset instead.

Next, we'll load a real dataset, preprocess it, and simulate the federated learning process:

In [None]:
# Load the diabetes dataset
diabetes = datasets.load_diabetes()
X, y = diabetes.data, diabetes.target

# Standardize the dataset
scaler = StandardScaler()
X = scaler.fit_transform(X)

# Split the dataset into chunks to simulate data on different devices
num_devices = 4
device_data = []
for _ in range(num_devices):
    X_train, X, y_train, y = train_test_split(X, y, test_size=0.75)
    device_data.append(list(zip(X_train, y_train)))

# Initialize a global model
global_model = LinearRegression()

# Evaluate model's metrics before federated learning
all_data = [item for sublist in device_data for item in sublist]
initial_mse = global_model.mse(all_data)
print(f'Initial MSE: {initial_mse}')

# Perform federated learning with the devices
federated_learning(global_model, device_data)

# Evaluate model's metrics after federated learning
post_devices_mse = global_model.mse(all_data)
print(f'MSE after training: {post_devices_mse}')

## Data Preprocessing and Initial Federated Learning
In the above code, we perform the following steps:

1. **Load the Diabetes Dataset**: Load the diabetes dataset from the `sklearn.datasets` module. This dataset has 10 baseline variables (age, sex, BMI, average blood pressure, and six blood serum measurements) for 442 diabetes patients. The target variable is a measurement of the disease's progression one year after the baseline.

2. **Data Standardization**: Standardize the features using the `StandardScaler` from `sklearn.preprocessing`, so that each feature has zero mean and unit variance.

3. **Simulate Data on Different Devices**: Split the dataset into chunks to simulate data distributed across different devices. In other words, each device gets its own subset of the data. This mirrors federated learning environments, where each device (a mobile phone, for example) holds local data that isn't shared directly due to privacy concerns.

4. **Initialize a Global Model**: Create an instance of our `LinearRegression` class, which acts as our global model.

5. **Evaluate Model Before Training**: Before beginning the federated learning process, we measure the model's performance on all data using the Mean Squared Error (MSE). This gives us a baseline.

6. **Federated Learning**: Perform federated learning using the `federated_learning` function. After training, we measure the model's performance again to see the difference (improvement or degradation).

The outputs show the model's MSE before and after federated learning.
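
The splitting loop above hands each device a quarter of whatever data remains, so later devices receive fewer samples than earlier ones. If equal-sized shards are preferred, one option is to shuffle the sample indices and cut them into equal parts. This is a sketch of an alternative and not the notebook's approach; the helper `split_into_shards` is hypothetical.

```python
import numpy as np

def split_into_shards(num_samples, num_devices, seed=0):
    """Shuffle the sample indices and cut them into near-equal shards."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(num_samples)
    return np.array_split(indices, num_devices)

shards = split_into_shards(442, 4)  # 442 samples, as in the diabetes dataset
print([len(s) for s in shards])  # [111, 111, 110, 110]
```

Each shard holds row indices, so a device's data would be selected with `X[shard]` and `y[shard]`.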

Next, we introduce an attack on the model by adding a noisy device and measure its impact on the model.

In [None]:
# Introduce a fifth device with noisy data
num_devices += 1
X_noisy, _, y_noisy, _ = train_test_split(X, y, test_size=0.75)
X_noisy += np.random.normal(0, 5, X_noisy.shape)  # Adding large noise to features
y_noisy += np.random.normal(0, 100, y_noisy.shape)  # Adding large noise to targets
device_data.append(list(zip(X_noisy, y_noisy)))

# Retrain the model with the new device included
federated_learning(global_model, device_data)

# Evaluate model's metrics after federated learning with 5 devices
all_data += list(zip(X_noisy, y_noisy))
post_5_devices_mse = global_model.mse(all_data)
print(f'MSE after poisoning: {post_5_devices_mse}')

## Adversarial Attack: Introducing Noisy Data
Above, we simulated an adversarial attack by introducing a 5th device with noisy data. Steps:

1. **Introduce a Fifth Device**: We add a new device to the simulation. This device will send noisy data to the model.

2. **Generate Noisy Data**: For this device, we add a lot of noise to the features (`X_noisy`) and targets (`y_noisy`). This is meant to simulate a scenario in which someone attempts to sabotage the learning environment.

3. **Retrain the Model with Noisy Data**: We then redo the federated learning, including the noisy data.

4. **Evaluate Model's Metrics After Poisoning**: After training, we measure the model's performance once again. This shows the effect that the noisy 5th device had on model performance.

Based on the output, the model's performance clearly suffered from the introduction of the noisy device.
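
The effect of a single noisy device on a plain average can be seen with a few made-up numbers. In the sketch below, the four "honest" updates and the one noisy update are invented for illustration and are not taken from the notebook's run.

```python
import numpy as np

# Made-up weight updates from four honest devices (two weights each)
honest_updates = [np.array([0.10, -0.20]), np.array([0.12, -0.18]),
                  np.array([0.09, -0.22]), np.array([0.11, -0.20])]
# Made-up update from one device trained on heavily perturbed data
noisy_update = np.array([5.0, 4.0])

mean_honest = np.mean(honest_updates, axis=0)
mean_with_noisy = np.mean(honest_updates + [noisy_update], axis=0)

print("honest only:", mean_honest)
print("with noisy device:", mean_with_noisy)
```

Because every device carries the same weight in a plain average, one outlier is enough to pull the aggregate far from the value the honest devices agree on.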

## Averaged Federated Learning with Trust Scores
Now we're going to implement a defense against this attack. The idea is to assign each device a trust score that reflects its trustworthiness. The score is updated based on how large an update the device contributes to the model.

Here's a step-by-step breakdown of the process:

In [None]:
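# Redo but with averaged federated learning
# Restore the full standardized dataset: the earlier splitting loop consumed X and y
X = scaler.fit_transform(diabetes.data)
y = diabetes.target.copy()

# Split the dataset into chunks to simulate data on different devices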
num_devices = 4
device_data = []
for _ in range(num_devices):
    X_train, X, y_train, y = train_test_split(X, y, test_size=0.75)
    device_data.append(list(zip(X_train, y_train)))

# Initialize a global model
global_model = LinearRegression()

# Evaluate model's metrics before federated learning
all_data = [item for sublist in device_data for item in sublist]
initial_mse = global_model.mse(all_data)
print(f'Initial MSE: {initial_mse}')

1. **Data Splitting for Devices:**
   - Same as before.

2. **Global Model Initialization:**
   - Same as before.

3. **Evaluation Before Training:**
   - Same as before.

### Federated Learning with Model Averaging

This is where we introduce the model averaging and trust scores:

In [None]:
# Federated learning simulation with model averaging
def averaged_federated_learning(global_model, local_data_list, epochs=100):
    for _ in range(epochs):
        global_gradient_weight = 0
        global_gradient_bias = 0
        total_trust = sum(trust_scores)

        for device_id, local_data in enumerate(local_data_list):
            # Skip devices with trust score below threshold
            if trust_scores[device_id] < trust_threshold:
                continue

            # Copy the weights so local training cannot modify the global model in place
            local_model = LinearRegression(weight=global_model.weight.copy(), bias=global_model.bias)
            local_training(local_model, local_data)

            # Weighted averaging using trust scores
            weight_diff = (local_model.weight - global_model.weight)
            bias_diff = (local_model.bias - global_model.bias)

            global_gradient_weight += (weight_diff * trust_scores[device_id]) / total_trust
            global_gradient_bias += (bias_diff * trust_scores[device_id]) / total_trust

            # Adjust trust scores based on the magnitude of the update
            if np.linalg.norm(weight_diff) > 0.05:  # Example threshold; tune it to the scale of typical updates
                trust_scores[device_id] *= 0.8  # Decrease trust score
            else:
                # Increase trust score, capped at 1.0 (an assumed ceiling) so it doesn't grow indefinitely
                trust_scores[device_id] = min(trust_scores[device_id] * 1.05, 1.0)

        # Move the global model by the trust-weighted average difference
        global_model.weight = global_model.weight + global_gradient_weight
        global_model.bias = global_model.bias + global_gradient_bias

4. **Model Averaging Based on Trust Scores:**
   - This time, instead of simply averaging the updates sent to the model, we weight each update by the trust score of the device sending it.
   - All devices initially have a trust score of 1.0; the default is to trust.
   - When the model aggregates the updates, it does so using a trust-weighted average.
   - If the magnitude of a device's model update is greater than a set limit (in this case, 0.05), the trust score of that device is lowered. A large update may mean that the device is far off from the others, indicating the presence of noisy data. On the other hand, if the update is small, the trust score is slightly increased.
   - Devices whose trust score falls below the threshold (0.5) are deemed untrustworthy and are ignored in future updates.

5. **Training with Averaged Federated Learning:**
   - Retrain the global model and evaluate.
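
The two ingredients of the defense, the trust-weighted average and the trust adjustment, can be sketched on their own. The helper names `trust_weighted_average` and `adjust_trust` are hypothetical, and two choices here are assumptions of this sketch: the average is normalized over the trusted devices only, and trust is capped at 1.0.

```python
import numpy as np

def trust_weighted_average(updates, trust_scores, trust_threshold=0.5):
    """Average the updates of trusted devices, weighted by their trust scores."""
    active = [(u, t) for u, t in zip(updates, trust_scores) if t >= trust_threshold]
    if not active:
        return np.zeros_like(updates[0])  # nobody is trusted: no update
    total = sum(t for _, t in active)
    return sum(u * t for u, t in active) / total

def adjust_trust(trust, update, norm_limit=0.05):
    """Lower trust after a large update; otherwise raise it slightly (capped at 1.0)."""
    if np.linalg.norm(update) > norm_limit:
        return trust * 0.8
    return min(trust * 1.05, 1.0)

# Made-up updates: two trusted devices and one below the trust threshold
updates = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([10.0, 10.0])]
scores = [1.0, 1.0, 0.4]
print(trust_weighted_average(updates, scores))  # [0.5 0.5]
```

The third device's large update has no influence because its score is below the threshold. How quickly a device reaches that point depends on the limit and the multipliers, which would need tuning to the scale of typical honest updates.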

In [None]:
# Initialize trust scores for all devices
trust_scores = [1.0 for _ in range(num_devices)]
trust_threshold = 0.5  # Threshold below which a device is considered untrustworthy

# Perform federated learning with the devices
averaged_federated_learning(global_model, device_data)

# Evaluate model's metrics after federated learning
post_devices_mse = global_model.mse(all_data)
print(f'MSE after averaging: {post_devices_mse}')

6. **Introducing the Noisy Device:**

In [None]:
# Introduce a fifth device with noisy data
X_noisy, _, y_noisy, _ = train_test_split(X, y, test_size=0.75)
X_noisy += np.random.normal(0, 5, X_noisy.shape)  # Adding large noise to features
y_noisy += np.random.normal(0, 100, y_noisy.shape)  # Adding large noise to targets
device_data.append(list(zip(X_noisy, y_noisy)))
# Add a trust score for the new device
trust_scores.append(1.0)

7. **Training with the Noisy Device:**

In [None]:
# Retrain the model with the poisoned data, this time with averaged federated learning
averaged_federated_learning(global_model, device_data)

# Evaluate model's metrics after federated learning with 5 devices
all_data += list(zip(X_noisy, y_noisy))
post_5_devices_mse = global_model.mse(all_data)
print(f'MSE after poisoning but with averaged federated learning: {post_5_devices_mse}')

8. **Analysis and Conclusion:**
   - The results show the ability of the trust-score defense to protect the federated learning environment from malicious devices.
   - In the recorded run, the initial MSE was `29482.5`, and the MSE after federated learning, before the noisy 5th device was introduced, was `26469.9`. This shows that the model benefited from federated learning.
   - After introducing the noisy 5th device and retraining, the MSE was `29356.4`. This is well above the `26469.9` reached before the attack and almost back at the initial MSE, which means that the model's performance was worsened by this malicious device.
   - The introduction of trust scores, though, was able to mitigate the negative impact of the noise.
   - When re-instantiated, the initial model had an MSE of `28427.9`. After averaged federated learning on the four original devices, the model's MSE was `26383.7`. These results are very similar to the first time through.
   - However, when the model was poisoned but trained using the trust score system, the MSE was `23259.9`. This is the ***lowest MSE of all***.
   - This suggests that this approach was not only able to defend against the attack, but actually improved the model's overall performance.