<a href="https://colab.research.google.com/github/ghimirebimal/ML-Projects/blob/main/Federated_Learning_using_CelebA_Image_dataset.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Lab 8: Federated Learning using CelebA Image dataset
### Deliver local models to edge devices to train and aggregate the local models to create a global model.

### Learning Objectives
* Understand the basics of Federated Learning.
* Learn to create clients and data shards to set up a federated learning environment.
* Create a global model by aggregating weights of all the trained local models.


<img src='https://drive.google.com/uc?id=1gtt3xdGCvXaj40MgsdK67PGm6CFEbohF' width=500cm>

Source: https://miro.medium.com/max/1400/1*HaH611vAy2eB1e42vz3X4g.png

### Imports
Import all the libraries required for the lab, including NumPy, Pandas, TensorFlow, and the Keras layers used to build the neural network.

In [None]:
import numpy as np
import pandas as pd
import cv2
import random
from imutils import paths

from sklearn.utils import shuffle
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelBinarizer

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Activation
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD
from tensorflow.keras import backend as K

### Mount Google Drive

In the code cell below, we mount Google Drive in the Colab environment so that we have access to our copy of the dataset.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


### Read CSV
Use Pandas to load CelebA Image CSV file.

In [None]:
mydata = pd.read_csv('/content/drive/My Drive/Intro2MLDatasets/Lab8/list_attr_celeba.csv')

### Details on Dataset
Use Pandas functions to inspect the shape and the first 10 rows of the dataset. Other functions can be used to explore the dataset further.

In [None]:
mydata.shape

(202599, 41)

In [None]:
mydata.head(10)

Unnamed: 0,image_id,5_o_Clock_Shadow,Arched_Eyebrows,Attractive,Bags_Under_Eyes,Bald,Bangs,Big_Lips,Big_Nose,Black_Hair,Blond_Hair,Blurry,Brown_Hair,Bushy_Eyebrows,Chubby,Double_Chin,Eyeglasses,Goatee,Gray_Hair,Heavy_Makeup,High_Cheekbones,Male,Mouth_Slightly_Open,Mustache,Narrow_Eyes,No_Beard,Oval_Face,Pale_Skin,Pointy_Nose,Receding_Hairline,Rosy_Cheeks,Sideburns,Smiling,Straight_Hair,Wavy_Hair,Wearing_Earrings,Wearing_Hat,Wearing_Lipstick,Wearing_Necklace,Wearing_Necktie,Young
0,000001.jpg,-1,1,1,-1,-1,-1,-1,-1,-1,-1,-1,1,-1,-1,-1,-1,-1,-1,1,1,-1,1,-1,-1,1,-1,-1,1,-1,-1,-1,1,1,-1,1,-1,1,-1,-1,1
1,000002.jpg,-1,-1,-1,1,-1,-1,-1,1,-1,-1,-1,1,-1,-1,-1,-1,-1,-1,-1,1,-1,1,-1,-1,1,-1,-1,-1,-1,-1,-1,1,-1,-1,-1,-1,-1,-1,-1,1
2,000003.jpg,-1,-1,-1,-1,-1,-1,1,-1,-1,-1,1,-1,-1,-1,-1,-1,-1,-1,-1,-1,1,-1,-1,1,1,-1,-1,1,-1,-1,-1,-1,-1,1,-1,-1,-1,-1,-1,1
3,000004.jpg,-1,-1,1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,1,-1,-1,1,-1,-1,-1,-1,1,-1,1,-1,1,1,-1,1
4,000005.jpg,-1,1,1,-1,-1,-1,1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,1,-1,-1,-1,-1,1,1,-1,-1,1,-1,-1,-1,-1,-1,-1,-1,-1,1,-1,-1,1
5,000006.jpg,-1,1,1,-1,-1,-1,1,-1,-1,-1,-1,1,-1,-1,-1,-1,-1,-1,1,-1,-1,1,-1,-1,1,-1,-1,-1,-1,-1,-1,-1,-1,1,1,-1,1,-1,-1,1
6,000007.jpg,1,-1,1,1,-1,-1,1,1,1,-1,-1,-1,1,-1,-1,-1,-1,-1,-1,-1,1,-1,-1,-1,1,-1,-1,1,-1,-1,-1,-1,1,-1,-1,-1,-1,-1,-1,1
7,000008.jpg,1,1,-1,1,-1,-1,1,-1,1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,1,-1,-1,-1,1,-1,-1,1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,1
8,000009.jpg,-1,1,1,-1,-1,1,1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,1,1,-1,1,-1,-1,1,1,-1,1,-1,1,-1,1,-1,-1,1,-1,1,-1,-1,1
9,000010.jpg,-1,-1,1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,1,1,-1,-1,-1,-1,1,-1,-1,-1,-1,-1,-1,-1,-1,1,-1,-1,1,-1,-1,1


### Drop features
Drop the features that are not needed. Smiling is the label to be predicted, and image_id is used as an identifier. All other attributes are excluded because in this lab the pixels of each image serve as the features and the Smiling attribute as the label, so the model learns to predict from the pixels alone.

In [None]:
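#Keep only image_id and Smiling; the columns are named explicitly because recent pandas versions no longer accept a positional axis argument
mydata.drop(columns=mydata.columns.difference(['image_id','Smiling']), inplace=True)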
mydata.head(10)

Unnamed: 0,image_id,Smiling
0,000001.jpg,1
1,000002.jpg,1
2,000003.jpg,-1
3,000004.jpg,-1
4,000005.jpg,-1
5,000006.jpg,-1
6,000007.jpg,-1
7,000008.jpg,-1
8,000009.jpg,1
9,000010.jpg,-1


### Labels List
Create a list named labels that holds the values of the Smiling attribute. This list will be used later to create the training and testing data.

In [None]:
labels = list()

# Iterate over dataframe to store target labels for attribute "Smiling"
for (columnName, columnData) in mydata.items():  # items() replaces iteritems(), which was removed in pandas 2.0
  if columnName == 'Smiling':
    for i in range(0, 2988):
      labels.append(columnData.values[i])

### Read image to convert into an array
The load function takes a list of image paths. It reads each image in grayscale, flattens the pixel values into a one-dimensional array, and scales them to the range [0, 1]. It returns a list with one scaled pixel array per image.

In [None]:
#Given a path, iterate over and read each image and convert to an array 
def load(path):
  data = list()

  for imgpath in path:
    im_grayscale = cv2.imread(imgpath, 0)
    image = np.array(im_grayscale).flatten()
    #As the max pixel value is 255, dividing numpy array by 255 scales the image between value 0 and 1.
    data.append(image/255)

  return data
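
The flatten-and-scale step can be tried without any image files. The sketch below is a minimal illustration that uses a synthetic array as a stand-in for one grayscale CelebA image; `fake_image` and `features` are hypothetical names, and the 218 x 178 size is the grayscale image shape quoted later in this lab.

```python
import numpy as np

# Synthetic stand-in for one grayscale image (218 rows x 178 columns, values 0-255).
rng = np.random.default_rng(0)
fake_image = rng.integers(0, 256, size=(218, 178), dtype=np.uint8)

# Flatten to a 1-D vector and divide by 255 so every value lies in [0, 1].
features = fake_image.flatten() / 255

print(features.shape)                        # (38804,)
print(features.min() >= 0.0, features.max() <= 1.0)
```

The length 38804 is 218 * 178, which is the input size passed to the model further below.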

### Image path
Build the list of image paths from the image folder and call the load function. image_list and labels must have the same length.

In [None]:
image_path = '/content/drive/My Drive/Intro2MLDatasets/images_celeba'
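#Sort the paths so the image order follows the file names and lines up with the label order from the CSV
#(this assumes the folder holds the images for the first rows of the CSV)
img_paths = sorted(paths.list_images(image_path))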
image_list = load(img_paths)

In [None]:
image_list[0]

### Convert labels into binary and split data
Convert the label into binary format and split the data into train and test.

In [None]:
#convert the label into binary data 0 or 1
lb = LabelBinarizer()
labels = lb.fit_transform(labels)

#Split feature and label into train and test sets
x_train, x_test, y_train, y_test = train_test_split(image_list, labels, test_size=0.1, shuffle=True)
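
The effect of the binarization step on CelebA-style labels can be sketched with NumPy alone. This is not the scikit-learn implementation, only an illustration of the mapping it produces for two classes: -1 becomes 0, 1 stays 1, and the result is a column of shape (n_samples, 1). The sample values are made up.

```python
import numpy as np

# Made-up Smiling values in the CelebA convention: 1 = smiling, -1 = not smiling.
raw_labels = np.array([1, 1, -1, -1, 1])

# Map -1 -> 0 and 1 -> 1, and keep a column shape of (n_samples, 1).
binary_labels = (raw_labels == 1).astype(int).reshape(-1, 1)

print(binary_labels.shape)             # (5, 1)
print(binary_labels.ravel().tolist())  # [1, 1, 0, 0, 1]
```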

### Create clients
This function creates a dictionary for federated learning in which each key is a client name and each value is that client's list of (image pixels, label) pairs.


In [None]:
"""
Creates a data shard for each client to train its local model.
"""
def create_clients(image_list, labels, num_clients=10):
  client_shards_dict = {}

  #list containing unique name of each client created
  client_names = []
  for i in range(num_clients):
    client_names.append('{}_{}'.format('client', i+1))

  #zip the image and corresponding label as a list and shuffle the data
  data = list(zip(image_list, labels))
  random.shuffle(data)
  
  #create one shard of random (image, label) pairs for each client
  #shards is a list of lists of (image pixels, label) tuples
  shards = []
  size = len(data)//num_clients
  for i in range(0, size*num_clients, size):
    shards.append(data[i:i + size])

  #create dictionary where key is the client name and value is the data with list of (image pixel, label) assigned to the client
  for i in range(len(client_names)):
    client_shards_dict[client_names[i]] = shards[i]

  return client_shards_dict  
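
The sharding logic can be checked on a small example. The sketch below uses a hypothetical helper, `make_shards`, that mirrors the shuffle-and-slice steps above on plain integers. It also shows a consequence of the integer division: samples that do not fit into equal-sized shards are left unassigned.

```python
import random

def make_shards(items, num_clients):
    """Shuffle items and cut them into num_clients equal-sized shards."""
    items = list(items)
    random.shuffle(items)
    size = len(items) // num_clients
    return [items[i:i + size] for i in range(0, size * num_clients, size)]

random.seed(0)
shards = make_shards(range(25), num_clients=10)

print(len(shards))                         # 10
print([len(s) for s in shards])            # ten shards of 2 samples each
print(25 - sum(len(s) for s in shards))    # 5 samples are not assigned to any client
```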

### Call create_clients
Call create_clients function to create the dictionary and save it in the variable clients. 

In [None]:
"""
Call create_clients function to create data shards for each client using only training datasets.
"""
clients = create_clients(x_train, y_train, num_clients=10)

### Create tensors
This function unzips a client's shard into images and labels and builds a TensorFlow dataset from tensor slices. It returns the dataset shuffled and split into batches of 32.

In [None]:
"""
Takes data_shard as an argument and creates a tensorflow_dataset object.
"""
def batch_data(data_shard):
  data, label = zip(*data_shard)
  dataset = tf.data.Dataset.from_tensor_slices((list(data), list(label)))
  return dataset.shuffle(len(label)).batch(32)

### Call batch_data
Create a dictionary that maps each client name to a TensorFlow dataset holding batches of that client's images and their corresponding labels.

In [None]:
"""
Call batch_data to convert each data shard into a dataset object that is ready to use with the tensorflow framework.
"""
clients_batched = {}
for (client_name, data) in clients.items():
  clients_batched[client_name] = batch_data(data)

### Create tensorflow object for test data
As with the training data, we now create a TensorFlow dataset for testing.

In [None]:
"""
Create a tensorflow dataset object to test the global model after several iterations of federated learning.
"""
test_batched = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(len(y_test))

### Define Multilayer Perceptron ANN
Define a class that gives the skeleton to create the neural network.

In [None]:
"""
Define a class for Multilayer perceptron NN.
"""
class SimpleMLP:
  @staticmethod
  def build(shape, classes):
    model = Sequential()
    model.add(Dense(200, input_shape=(shape,), activation="relu"))
    model.add(Dense(200, activation="relu"))
    model.add(Dense(classes, activation="sigmoid"))
    return model
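
To get a feel for the size of this network, the number of trainable parameters can be worked out by hand. The sketch below assumes the input length used later in this lab (218 * 178 grayscale pixels) and a single output unit; `dense_params` is a hypothetical helper that counts one weight per input-output pair plus one bias per output.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

# Input pixels, two hidden layers of 200 units, one sigmoid output.
layer_sizes = [218 * 178, 200, 200, 1]
total = sum(dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:]))

print(layer_sizes[0])  # 38804
print(total)           # 7801401
```

Almost all of these parameters sit in the first layer, and every client sends all of them to the server in each round.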

### Initialize variables to train model
Initialize all the variables like epoch, learning_rate, metrics to evaluate, and the optimizer to be used in the neural network.

In [None]:
#Define the number of FL iterations, the evaluation metric, the binary cross-entropy loss between labels and predictions, and the SGD optimizer
FL_epochs = 2
lr = 0.01
loss='binary_crossentropy'
metrics=['accuracy']
optimizer = SGD(learning_rate = lr, decay= lr / FL_epochs, momentum = 0.9)

### Calculate weighted scale of dataset for each client
For each client, calculate the fraction of the total training data held by that client. This fraction is later used to weight the client's contribution to the global model.

In [None]:
def weight_scaling_factor(clients_batched, client):
  client_names = list(clients_batched.keys())

  #bs is the batch size, read from the number of samples in the client's first batch
  bs = list(clients_batched[client])[0][0].shape[0]
  
  #clients_cardinality comprises the cardinality of all the dataset 
  clients_cardinality = []
  for client_name in client_names:
    clients_cardinality.append(tf.data.experimental.cardinality(clients_batched[client_name]).numpy())

  #sum of cardinality of the entire dataset
  global_count = sum(clients_cardinality) * bs

  #cardinality of a dataset specific to this client
  local_count = tf.data.experimental.cardinality(clients_batched[client]).numpy() * bs

  #return the scale of the local dataset compared to entire dataset 
  return local_count / global_count
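
In weight_scaling_factor, the batch size `bs` multiplies both the local and the global count, so the returned factor reduces to a ratio of batch counts. The sketch below compares that ratio with the exact fraction of samples. The shard sizes and helper names are hypothetical, and the batch count assumes the last partial batch is kept, as with `batch(32)` above.

```python
import math

def batch_count(n_samples, batch_size=32):
    """Number of batches when the last partial batch is kept."""
    return math.ceil(n_samples / batch_size)

def batch_ratio_factor(shard_sizes, client, batch_size=32):
    """Factor as computed in the lab: the batch size cancels, leaving a ratio of batch counts."""
    counts = {name: batch_count(n, batch_size) for name, n in shard_sizes.items()}
    return counts[client] / sum(counts.values())

def sample_ratio_factor(shard_sizes, client):
    """Exact fraction of all samples held by one client."""
    return shard_sizes[client] / sum(shard_sizes.values())

equal = {f"client_{i + 1}": 268 for i in range(10)}   # equal shards, as create_clients produces
uneven = {"client_1": 268, "client_2": 100}           # hypothetical unequal shards

print(batch_ratio_factor(equal, "client_1"), sample_ratio_factor(equal, "client_1"))
print(batch_ratio_factor(uneven, "client_1"), sample_ratio_factor(uneven, "client_1"))
```

With equal shards both factors agree, which is the situation in this lab. With unequal shards the batch-based factor is only an approximation of the true data fraction.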

### Scale the weights
The function below returns the list of weights of a local model after each weight array is multiplied by the scaling factor computed by the function above.

In [None]:
#Returns the list of lists of weights scaled to the specific client dataset scale 
def scale_model_weights(weight, scalar):
  final_weight = []
  steps = len(weight)
  for i in range(steps):
    final_weight.append(weight[i] * scalar)  #weight[i] is a list of weights
  return final_weight

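### Calculate average weights from the local models
Given a list of scaled weight lists, one per local model, sum the corresponding layers across models. Because every local model was already multiplied by its scaling factor, this sum is the weighted average used to update the global model.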

In [None]:
#Returns the list of average_weights for the global model 
def sum_scaled_weights(scaled_local_weight_list):
  average_weights = list()
  for weight_list_tuple in zip(*scaled_local_weight_list):
    layer_mean = tf.math.reduce_sum(weight_list_tuple, axis=0) #sum the pre-scaled weights of this layer across all models, which gives the weighted mean
    average_weights.append(layer_mean.numpy())
  
  return average_weights
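
The scale-then-sum aggregation can be followed by hand on a tiny example. The sketch below is a NumPy illustration with two hypothetical clients, each holding a one-layer model (a kernel and a bias), and made-up scaling factors of 0.25 and 0.75.

```python
import numpy as np

# Two hypothetical clients, each with a one-layer model: [kernel, bias].
client_weights = [
    [np.array([[1.0, 2.0]]), np.array([0.0])],
    [np.array([[3.0, 6.0]]), np.array([4.0])],
]
scaling_factors = [0.25, 0.75]   # made-up fractions of the data held by each client

# Scale every layer of every client, then sum layer by layer across clients.
scaled = [[layer * s for layer in weights]
          for weights, s in zip(client_weights, scaling_factors)]
global_weights = [np.sum(layers, axis=0) for layers in zip(*scaled)]

print(global_weights[0].tolist())   # [[2.5, 5.0]]
print(global_weights[1].tolist())   # [3.0]
```

Each global value is the weighted mean of the client values, for example 0.25 * 1.0 + 0.75 * 3.0 = 2.5 for the first kernel entry.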

### Evaluate the model
The function below helps evaluate the model with accuracy and loss metrics. 

In [None]:
#Test model to make predictions on the available test dataset 
def test_model(x_test, y_test, model, epoch):
  #The model ends in a sigmoid, so its outputs are probabilities, not logits
  bce = tf.keras.losses.BinaryCrossentropy()
  probs = model.predict(x_test)

  #Compute the loss on the predicted probabilities, before thresholding
  loss = bce(y_test, probs)

  #Threshold at 0.5 to get hard 0/1 predictions for the accuracy score
  pred = (probs > 0.5).astype(int)
  acc = accuracy_score(y_test, pred)
  print('epoch: {} | global_acc: {:.2%} | global_loss: {}'.format(epoch, acc, loss))
  return acc, loss
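
Accuracy and binary cross-entropy measure different things: accuracy looks only at the thresholded 0/1 predictions, while the loss is defined on the predicted probabilities. The sketch below is a small NumPy illustration with made-up values; `binary_crossentropy` is a hypothetical helper, not the Keras implementation.

```python
import numpy as np

def binary_crossentropy(y_true, p, eps=1e-7):
    """Mean binary cross-entropy of probabilities p against 0/1 labels."""
    p = np.clip(p, eps, 1 - eps)   # avoid log(0)
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

y_true = np.array([1, 0, 1, 0])
probs = np.array([0.9, 0.2, 0.4, 0.6])   # made-up sigmoid outputs

hard = (probs > 0.5).astype(int)          # thresholded predictions: [1, 0, 0, 1]
accuracy = float(np.mean(hard == y_true))
loss = binary_crossentropy(y_true, probs)

print(accuracy)          # 0.5
print(round(loss, 4))
```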

### Create global model
The code below is the main part of the program and builds the global model. In each federated round, a local model is created for every client, initialized with the current global weights, and trained on that client's batched data. The weights of each local model are then scaled and summed to update the weights of the global model.

In [None]:
tf.get_logger().setLevel('ERROR')
global_MLP = SimpleMLP()
#CelebA image shape is (218, 178, 3), Grayscale CelebA image shape is (218, 178, 1)
global_model = global_MLP.build(38804, 1)

for FL_epoch in range(FL_epochs):
  global_weights = global_model.get_weights()
  scaled_local_weight_list = []
 
  #randomly shuffle the client names
  client_names = list(clients_batched.keys())
  random.shuffle(client_names)

  """
  For each client, creates a local model and sets the weight to the global model weights.
  Trains the model using the data shard for the specific client.
  Generates the new local weight from the model and scales it to the fraction of the dataset.
  """
  
  for client in client_names:
    local_MLP = SimpleMLP()
    local_model = local_MLP.build(38804, 1)
    local_model.compile(loss=loss, optimizer=optimizer, metrics=metrics)

    local_model.set_weights(global_weights)
    local_model.fit(clients_batched[client], epochs=1)
   
    scaling_factor = weight_scaling_factor(clients_batched, client)
    scaled_weights = scale_model_weights(local_model.get_weights(), scaling_factor)
    scaled_local_weight_list.append(scaled_weights)

    K.clear_session()
  
  #The list of lists containing weights of each local model is then reverse zipped and aggregated weight of each layer from all the local model is calculated
  average_weights = sum_scaled_weights(scaled_local_weight_list)
    
  global_model.set_weights(average_weights)

  for(xt, yt) in test_batched:
    global_acc, global_loss = test_model(xt, yt, global_model, FL_epoch)
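
The same round structure can be run in a few seconds on a toy problem. The sketch below is a simplified NumPy illustration, not the lab's model: it swaps the MLP for a two-parameter linear regression, uses three hypothetical clients with noise-free synthetic data, and weights each client by its exact share of the samples.

```python
import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])   # weights that generated the synthetic data

# Three hypothetical clients holding different amounts of data.
clients = {}
for name, n in [("client_1", 50), ("client_2", 30), ("client_3", 20)]:
    X = rng.normal(size=(n, 2))
    clients[name] = (X, X @ true_w)

def local_update(w, X, y, lr=0.1, steps=5):
    """Start from the global weights and take a few gradient steps on local data."""
    w = w.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)   # gradient of the mean squared error
        w -= lr * grad
    return w

global_w = np.zeros(2)
total = sum(len(y) for _, y in clients.values())

for round_idx in range(20):
    # Each client trains locally; its weights are scaled by its share of the data.
    scaled = [local_update(global_w, X, y) * (len(y) / total)
              for X, y in clients.values()]
    # Summing the scaled weights gives the new global model.
    global_w = np.sum(scaled, axis=0)

print(np.round(global_w, 3))   # should be close to true_w
```

No client ever shares its `X` or `y` with the aggregation step; only the locally updated weights are combined.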

**What is the advantage of using Federated Learning? [3 points]**

The main advantage of federated learning is that it allows collaborative learning without sharing raw data. Each device sends only its model parameters to a server, which aggregates the parameters received from the distributed devices. This framework helps preserve privacy, enhances security, and reduces bandwidth requirements.

**How can we achieve security and privacy using Federated Learning in ML based systems? [4 points]**

In federated learning, clients or devices are only required to send model parameters, while the data remains local on the device. Because only model parameters are exchanged, the framework both preserves privacy and enhances security: with just the model parameters, it is difficult for an attacker or the server to recover the sensitive information contained in the data.

**What are the possible ways that one can use for updating global model parameters in Federated Learning? [2 points]**

The common way to update the global parameters is to aggregate the local parameters received from the clients. The most common aggregation method is federated averaging, which is performed on the server side; the global model is then updated with the aggregated parameters. After this, the global model is tested and, if the result is satisfactory, the updated global parameters are sent back to the end devices.

**What are the challenges you see in FL? [1 point]**

The main challenge in FL is that end devices are usually resource-constrained in terms of power, computation, storage, and communication bandwidth, which limits their ability to participate in federated learning and achieve its full potential. Another challenge is that end IoT devices are often vulnerable, and attackers might take control of such devices to degrade the performance of the global model.