# Federated models: regression using the California housing database

This notebook shows how to train a regression model in a federated learning environment.
The notebook on [Linear regression for a simple 2D case](./federated_models_linear_regression.ipynb) explains the basic concepts of the framework, so here we move slightly faster.
## The data 
First, we load a dataset (included in the framework) to allow for regression experiments.

In [None]:
import shfl
from shfl.data_base.california_housing import CaliforniaHousing

database = CaliforniaHousing()
train_data, train_labels, test_data, test_labels = database.load_data()

Now, we are going to explore the data:

In [None]:
print("Shape of train_data: " + str(train_data.shape))
print("Shape of train_labels: " + str(train_labels.shape))
print("One sample features: " + str(train_data[0]))
print("One sample label: " + str(train_labels[0]))

## The model
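Next, we define the model: a small fully connected network with one hidden layer of 8 units, trained with the mean squared error loss and the Adam optimizer, and monitored with the mean absolute error: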

In [None]:
import tensorflow as tf

def model_builder():
    # create model
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.Dense(8, input_dim=8, kernel_initializer='normal', activation='relu'))
    model.add(tf.keras.layers.Dense(1, kernel_initializer='normal'))
    
    # Define configuration
    criterion = tf.keras.losses.MeanSquaredError()
    optimizer = tf.keras.optimizers.Adam()
    metrics = [tf.keras.metrics.mae]
    
    return shfl.model.DeepLearningModel(model=model, criterion=criterion, optimizer=optimizer, metrics=metrics)
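
To get a feeling for the size of the model that each client trains and shares, we can count its trainable parameters by hand. The sketch below assumes that both `Dense` layers keep their bias terms (the Keras default); `dense_param_count` is a helper introduced here for illustration only.

```python
def dense_param_count(n_inputs, n_units):
    """Trainable parameters of a fully connected layer with bias."""
    # One weight per (input, unit) pair plus one bias per unit
    return n_inputs * n_units + n_units

hidden = dense_param_count(8, 8)  # 8 input features -> 8 hidden units
output = dense_param_count(8, 1)  # 8 hidden units -> 1 predicted value
total_params = hidden + output
print(hidden, output, total_params)  # 72 9 81
```

These are the parameters that are exchanged and aggregated in every federated round.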

## Run the federated learning experiment
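We now define the federated environment: the data is distributed in an IID fashion over 20 nodes, and the clients' models are aggregated by federated averaging: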

In [None]:
iid_distribution = shfl.data_distribution.IidDataDistribution(database)
federated_data, test_data, test_label = iid_distribution.get_federated_data(num_nodes=20, percent=10)

aggregator = shfl.federated_aggregator.FedAvgAggregator()
federated_government = shfl.federated_government.FederatedGovernment(model_builder, federated_data, aggregator)
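
To illustrate what an IID distribution over nodes means, here is a minimal sketch in plain Python. It is not the framework's implementation: `iid_split` is a hypothetical helper, and we assume that `percent` denotes the share of the training samples that is distributed among the nodes.

```python
import random

def iid_split(samples, num_nodes, percent, seed=0):
    """Shuffle, keep `percent` % of the samples, and deal them evenly to nodes."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    kept = shuffled[: len(shuffled) * percent // 100]
    # Node i receives every num_nodes-th sample, so node sizes differ by at most one
    return [kept[i::num_nodes] for i in range(num_nodes)]

nodes = iid_split(list(range(1000)), num_nodes=20, percent=10)
print([len(node) for node in nodes])
```

Since the samples are shuffled before they are dealt, every node receives a random subset drawn from the same underlying distribution.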

The labels of each node are reshaped into a column vector, so that they match the shape of the model's output:

In [None]:
import numpy as np

class Reshape(shfl.private.FederatedTransformation):
    
    def apply(self, labeled_data):
        labeled_data.label = np.reshape(labeled_data.label, (labeled_data.label.shape[0], 1))
        
shfl.private.federated_operation.apply_federated_transformation(federated_data, Reshape())

We reshape the test labels in the same way and run three rounds of federated learning:

In [None]:
test_label = np.reshape(test_label, (test_label.shape[0], 1))
federated_government.run_rounds(3, test_data, test_label)
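
In each round, the clients train locally and the server combines their parameters into a global model. The following is a minimal sketch of federated averaging in plain Python, assuming that all clients are weighted equally; `federated_average` is a name chosen for illustration and is not part of the framework.

```python
def federated_average(client_params):
    """Element-wise mean of each parameter vector across clients."""
    num_clients = len(client_params)
    averaged = []
    # zip(*client_params) groups the same layer from every client together
    for layer_group in zip(*client_params):
        averaged.append([sum(values) / num_clients for values in zip(*layer_group)])
    return averaged

# Three clients, each holding two parameter vectors ("layers")
clients = [
    [[1.0, 2.0], [0.5]],
    [[3.0, 4.0], [1.5]],
    [[5.0, 6.0], [2.5]],
]
global_params = federated_average(clients)
print(global_params)  # [[3.0, 4.0], [1.5]]
```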

## Add differential privacy 

We now add differential privacy to our federated learning experiment and assess its effect on the quality of the global model. The Sherpa.ai framework makes this possible in a few steps: once a sensitivity has been chosen, we can run the private federated experiment with the desired differential privacy mechanism.

### Model's sensitivity
We will apply the Laplace mechanism, employing a fixed sensitivity for the model.
Intuitively, the model's sensitivity is the maximum change in the output when a single training sample is changed or removed.
The choice of the sensitivity is critical because it determines the amount of noise that is applied: too much noise distorts the parameters and might result in an unusable model.
We can sample the model's sensitivity using the functionality provided by the framework:

In [None]:
from shfl.differential_privacy import SensitivitySampler
from shfl.differential_privacy import L1SensitivityNorm


class UniformDistribution(shfl.differential_privacy.ProbabilityDistribution):
    """
    Implement Uniform sampling over the data
    """
    def __init__(self, sample_data):
        self._sample_data = sample_data

    def sample(self, sample_size):
        row_indices = np.random.randint(low=0, high=self._sample_data.shape[0], size=sample_size, dtype='l')
        
        return self._sample_data[row_indices, :]
    

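class DeepLearningSample(shfl.model.DeepLearningModel):
    """
    Adds the "get" method to the model's class: it trains the model on
    the sampled data and returns the resulting parameters
    """
    def get(self, data_array):
        # The last column holds the labels; the remaining columns are the features
        data = data_array[:, 0:-1]
        labels = data_array[:, -1].reshape(-1,1)
        self.train(data, labels)
        
        return self.get_model_params()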


def model_builder_sample():
    # create model
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.Dense(8, input_dim=8, kernel_initializer='normal', activation='relu'))
    model.add(tf.keras.layers.Dense(1, kernel_initializer='normal'))
    
    # Define configuration
    criterion = tf.keras.losses.MeanSquaredError()
    optimizer = tf.keras.optimizers.Adam()
    metrics = [tf.keras.metrics.mae]
    
    return DeepLearningSample(model=model, criterion=criterion, optimizer=optimizer, metrics=metrics)


class L1SensitivityNormLists(L1SensitivityNorm):
    """
    Computes the L1 norm of the difference between each pair of parameter arrays
    in the lists x_1 and x_2, and returns the largest of these norms
    """
    def compute(self, x_1, x_2):
        x = []
        for x_1_i, x_2_i in zip(x_1, x_2):
            x.append(np.sum(np.abs(x_1_i - x_2_i)))
        
        return np.max(x) # Alternatively, the per-array norms could be returned as an array

    
sample_data = np.hstack((train_data, train_labels.reshape(-1,1)))
distribution = UniformDistribution(sample_data)
sampler = SensitivitySampler()
n_samples = 100
max_sensitivity, mean_sensitivity = sampler.sample_sensitivity(
    model_builder_sample(), 
    L1SensitivityNormLists(), distribution, n=n_samples, gamma=0.05)
print("Max sensitivity from sampling: " + str(max_sensitivity))
print("Mean sensitivity from sampling: " + str(mean_sensitivity))
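
The idea behind sampling a sensitivity can be illustrated on a much simpler query than a neural network. The sketch below is not the framework's algorithm: it assumes a plain mean as the query, draws pairs of neighbouring datasets that differ in a single element, and records how much the output changes. All names (`mean_query`, `sample_sensitivity`) are hypothetical.

```python
import random

def mean_query(dataset):
    return sum(dataset) / len(dataset)

def sample_sensitivity(query, population, dataset_size, n_trials, seed=0):
    """Estimate the max and mean output change between neighbouring datasets."""
    rng = random.Random(seed)
    differences = []
    for _ in range(n_trials):
        base = [rng.choice(population) for _ in range(dataset_size - 1)]
        # The two neighbouring datasets share all but their last element
        db_1 = base + [rng.choice(population)]
        db_2 = base + [rng.choice(population)]
        differences.append(abs(query(db_1) - query(db_2)))
    return max(differences), sum(differences) / len(differences)

population = [i / 100 for i in range(101)]  # values in [0, 1]
max_s, mean_s = sample_sensitivity(mean_query, population, dataset_size=50, n_trials=200)
print("Max sensitivity from sampling: " + str(max_s))
print("Mean sensitivity from sampling: " + str(mean_s))
```

For this toy query the exact sensitivity is known: changing one of 50 values in $[0, 1]$ moves the mean by at most $1/50$, so the sampled maximum can never exceed that bound. For a trained model no such closed form is available, which is why the sensitivity is estimated by sampling.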

### Run the federated learning experiment with differential privacy
The Laplace mechanism provided by the Sherpa.ai Federated Learning and Differential Privacy Framework is then assigned as the private access type to the model parameters of each client in a new `FederatedGovernment` object. 
This results in an $\epsilon$-differentially private federated learning (FL) model.
For example, by choosing the value $\epsilon = 0.5$ together with the sampled mean sensitivity, we can run the FL experiment with differential privacy (DP):

In [None]:
from shfl.differential_privacy import LaplaceMechanism

params_access_definition = LaplaceMechanism(sensitivity=mean_sensitivity, epsilon=0.5)
federated_governmentDP = shfl.federated_government.FederatedGovernment(
    model_builder, federated_data, aggregator, model_params_access=params_access_definition)
federated_governmentDP.run_rounds(3, test_data, test_label)
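
To make the role of the sensitivity and of $\epsilon$ explicit, here is a minimal sketch of the Laplace mechanism in plain Python: each parameter is perturbed with noise drawn from a Laplace distribution of scale $b = \text{sensitivity} / \epsilon$. This is an illustration under our own assumptions, not the framework's implementation; `laplace_noise` and `laplace_mechanism` are hypothetical helpers.

```python
import random

def laplace_noise(scale, rng):
    """Draw from Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def laplace_mechanism(params, sensitivity, epsilon, rng):
    """Add independent Laplace noise of scale sensitivity / epsilon to each value."""
    scale = sensitivity / epsilon
    return [value + laplace_noise(scale, rng) for value in params]

rng = random.Random(0)
params = [0.2, -1.0, 0.7]  # illustrative values, not real model weights
noisy_params = laplace_mechanism(params, sensitivity=0.1, epsilon=0.5, rng=rng)
print(noisy_params)
```

A larger sensitivity or a smaller $\epsilon$ increases the noise scale, which strengthens privacy at the cost of accuracy. Comparing the metrics of the two experiments above shows how much accuracy is traded for privacy with the chosen values.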