<a href="https://colab.research.google.com/github/harikuts/federated-learning-trials/blob/master/FederatedLearningRepro.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Overview

This notebook works toward reproducing the results of the original paper on federated learning.

## Plan

The roadmap for development is as follows:
*   Construct standard MNIST example.
*   To be continued.




# Standard MNIST

We start from baseline implementations of the standard MNIST example. A Keras implementation stands as the first example, but we will port it over to TensorFlow, which provides more low-level functionality.

## Example 1: Keras Implementation

A standard MNIST example from Keras (https://keras.io/examples/mnist_cnn/) serves as the baseline against which we compare our federated model.

In [0]:
from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K

import pdb

# Configuration
batch_size = 128
num_classes = 10
epochs = 12

# input image dimensions
img_rows, img_cols = 28, 28

# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

# pdb.set_trace()

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

## Example 2.0: TensorFlow Implementation (Easy)

We begin with a tutorial provided by TensorFlow (https://www.tensorflow.org/tutorials/quickstart/beginner), which already uses MNIST as its example.
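
The model in the cell below ends in a `Dense(10)` layer with no activation, so it outputs raw logits, and the loss is built with `from_logits=True`. As a rough NumPy sketch of what that combination computes: the logits are turned into probabilities with a softmax, and the loss is the negative log-probability of the true class. The helper names `softmax` and `sparse_crossentropy_from_logits` are illustrative and not part of TensorFlow.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability; the result is unchanged
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

def sparse_crossentropy_from_logits(labels, logits):
    # Mean negative log-probability assigned to the integer class labels
    probs = softmax(logits)
    picked = probs[np.arange(len(labels)), labels]
    return float(np.mean(-np.log(picked)))

# A model that is uniform over the 10 digits should have a loss of ln(10)
uniform_logits = np.zeros((1, 10))
initial_loss = sparse_crossentropy_from_logits(np.array([3]), uniform_logits)
print(initial_loss, np.log(10))
```

This gives a quick sanity check for the first loss value printed during training: an untrained model should start near ln(10), roughly 2.3.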

In [14]:
from __future__ import absolute_import, division, print_function, unicode_literals


import tensorflow as tf

tf.enable_eager_execution()

# Import MNIST data
mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Create model
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(32, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10)
])

# Predictions
predictions = model(x_train[:1]).numpy()
# Softmax
tf.nn.softmax(predictions).numpy()

# Defining the loss function
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
loss_fn(y_train[:1], predictions).numpy()

# Compile model
model.compile(optimizer='adam', loss=loss_fn, metrics=['accuracy'])

# Fit model; validation_data is an argument of fit(), not compile()
model.fit(x_train, y_train, epochs=16, validation_data=(x_test, y_test))

Train on 60000 samples
Epoch 1/16
Epoch 2/16
Epoch 3/16
Epoch 4/16
Epoch 5/16
Epoch 6/16
Epoch 7/16
Epoch 8/16
Epoch 9/16
Epoch 10/16
Epoch 11/16
Epoch 12/16
Epoch 13/16
Epoch 14/16
Epoch 15/16
Epoch 16/16


<tensorflow.python.keras.callbacks.History at 0x7eff419781d0>

In [0]:
# print(model.get_weights()[0].shape)
# print(model.get_weights()[1].shape)
# print(model.get_weights()[2].shape)
# print(model.get_weights()[3].shape)

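# Sanity check: Python's sum() over a list of arrays adds them elementwise,
# so dividing by the number of arrays gives their elementwise mean
import numpy as np
a = np.array([1, 2, 3, 4])

b = a + a
b = sum([a,a])
print(b)
b = b / 2
print(b)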

[2 4 6 8]
[1. 2. 3. 4.]


# Experimental Approaches

## Federated Learning Validation

### Network Model
Here we use nodes to carry models. This avoids instantiating a new model each time weights have to be transferred; instead, the state of each model is preserved in the node that it resides in.
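
A minimal sketch of this idea, using a hypothetical `ToyModel` stand-in (not a Keras class) that only mimics `get_weights` and `set_weights`: the node builds its model once and later overwrites the weights in place, so the model object survives every transfer.

```python
import numpy as np

class ToyModel:
    """Hypothetical stand-in for a Keras model; holds a list of weight arrays."""
    def __init__(self):
        self.weights = [np.zeros((2, 2)), np.zeros(2)]
    def get_weights(self):
        return [w.copy() for w in self.weights]
    def set_weights(self, new_weights):
        self.weights = [np.array(w, dtype=float) for w in new_weights]

class Node:
    def __init__(self, model_generator):
        # The model is created once, when the node is created
        self.model = model_generator()
    def receive(self, weights):
        # Transfer weights into the existing model instead of rebuilding it
        self.model.set_weights(weights)

node = Node(ToyModel)
model_before = node.model
node.receive([np.ones((2, 2)), np.ones(2)])
# Check that the same model object is still in use after the transfer
print(node.model is model_before)
```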

In [0]:
from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import random
import pdb

# Used to start execution ASAP
tf.enable_eager_execution()

# Configuration
num_clients = 4
num_epochs = 2
num_server_rounds = 4
num_client_rounds = 4
nonIID = True
print ("Configuration:" + \
       "\n\t%d clients." % (num_clients) + \
       "\n\t%d training epochs." % (num_epochs)  + \
       "\n\tUsing %sIID data." % ("non-" if nonIID else ""))

# Server class
class Server:
  def __init__(self, modelGenerator):
    self.model = modelGenerator()
    self.clients = []
    self.neighbors = []
# Client class
class Client:
  def __init__(self, modelGenerator):
    self.model = modelGenerator()
    self.neighbors = []
    self.x_data = None
    self.y_data = None
    self.data_size = None
  def plotAccuracy(self, histories):
    # Compile histories
    categorical_accuracy = []
    val_categorical_accuracy = []
    for history in histories:
      categorical_accuracy = categorical_accuracy + history.history['acc']
      # val_categorical_accuracy = val_categorical_accuracy + history.history['val_categorical_accuracy']
    # The history of our accuracy during training.
    plt.plot(categorical_accuracy)
    plt.plot(val_categorical_accuracy)
    plt.title('Model Accuracy')
    plt.ylabel('Accuracy')
    plt.xlabel('Number of epochs')
    plt.legend(['train', 'validation'], loc='upper left')
    return plt
  def train(self):
    history = self.model.fit(self.x_data, self.y_data, epochs=num_epochs)
    # print(history.history.keys())
    # self.accPlot = self.plotAccuracy([history])

# NN model generator function
def createNN():
  model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10)
  ])
  loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
  model.compile(optimizer='adam', loss=loss_fn, metrics=['accuracy'])
  return model
# Weight averaging
def averageWeights(weightsList, weighting=None):
  denominator = len(weightsList)
  new_weights = []
  if weighting is None:
    # Handle IID data (balanced)
    for part in range(len(weightsList[0])):
      part_stack = [weights[part] for weights in weightsList]
      new_stack = sum(part_stack) / denominator
      new_stack = np.array(new_stack)
      new_weights.append(new_stack)
    return new_weights
  else:
    for part in range(len(weightsList[0])):
      part_stack = [weights[part] for weights in weightsList]
      # part_stack = np.array(part_stack) * weighting
      for i in range(len(weighting)):
        part_stack[i] = part_stack[i] * weighting[i]
      new_stack = sum(part_stack)
      new_stack = np.array(new_stack)
      new_weights.append(new_stack)
    return new_weights


# Create the network
print ("\nCreating a network...")
server = Server(createNN)
for i in range(num_clients):
  server.clients.append(Client(createNN))

# Import MNIST data
print ("\nDownloading MNIST data...")
mnist = tf.keras.datasets.mnist
# Load data into trains
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Splitting the dataset for different clients
print ("\nSplitting data into different clients...")
if nonIID:
  print ("\tRandomly assigning ranges of data...")
  percentageMarkers = []
  for i in range(num_clients-1):
    percentageMarkers.append(random.random())
  percentageMarkers.append(1.0)
  percentageMarkers = sorted(percentageMarkers)
else:
  print ("\tUniformly assigning ranges of data")
  percentageMarkers = [1/num_clients * (n+1) for n in range(num_clients)]
# Storing each subset of data in a client
print ("\tStoring subsets of data into each client...")
xMarkers = [int(marker * len(x_train)) for marker in percentageMarkers]
yMarkers = [int(marker * len(y_train)) for marker in percentageMarkers]
for j in range(len(percentageMarkers)):
  server.clients[j].x_data = x_train[(xMarkers[j-1] if j > 0 else 0):xMarkers[j]]
  server.clients[j].y_data = y_train[(yMarkers[j-1] if j > 0 else 0):yMarkers[j]]
  server.clients[j].data_size = len(server.clients[j].x_data)

# Client data diagnostic
print ("\nFinished setting up client data!")
for client in server.clients:
  print ("\tClient %d:\tX: %d\tY: %d" % (server.clients.index(client), len(client.x_data), len(client.y_data)))

# Server action
server_accuracies = []
server_losses = []
for server_round in range(num_server_rounds):
  print("\nSERVER ROUND ", server_round, ":\n")
  # Save server model weights
  global_weights = server.model.get_weights()
  # Clients' actions
  client_weight_list = []
  for client in server.clients:
    print("\nCLIENT ", server.clients.index(client), ":\n")
    # Initialize recorded weights
    round_weight_list = []
    for client_round in range(num_client_rounds):
      # Accept global weights
      client.model.set_weights(global_weights)
      # Train
      client.train()
      # Record weights
      round_weight_list.append(client.model.get_weights())
    client_weight_list.append(averageWeights(round_weight_list))
  client_data_sizes = [client.data_size for client in server.clients]
  client_weighting = np.array(client_data_sizes) / sum(client_data_sizes)
  server.model.set_weights(averageWeights(client_weight_list, weighting=client_weighting))
  loss, acc = server.model.evaluate(x_test, y_test)
  print("\nSERVER ROUND ", server_round, " ACCURACY: ", acc, "\n")
  server_accuracies.append(acc)
  server_losses.append(loss)
# Report the collected metrics once, after all server rounds have finished
print("FINAL RESULTS:\nAccuracies: ", server_accuracies, "\nLoss: ", server_losses)
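
The server step above weights each client's parameters by its share of the training data. A small NumPy sketch of that aggregation follows. `weighted_average` is an illustrative helper rather than the notebook's `averageWeights`, and the toy arrays stand in for real layer weights.

```python
import numpy as np

def weighted_average(weights_list, data_sizes):
    # Each client's coefficient is its fraction of the total number of samples
    coeffs = np.array(data_sizes, dtype=float) / sum(data_sizes)
    averaged = []
    # Combine the clients layer by layer
    for layer_group in zip(*weights_list):
        averaged.append(sum(c * w for c, w in zip(coeffs, layer_group)))
    return averaged

# Two toy clients, each with one "kernel" and one "bias" array
client_a = [np.full((2, 2), 1.0), np.array([0.0, 0.0])]
client_b = [np.full((2, 2), 5.0), np.array([4.0, 8.0])]

# Client A holds three times as much data as client B
global_weights = weighted_average([client_a, client_b], data_sizes=[300, 100])
print(global_weights[0])
print(global_weights[1])
```

With equal data sizes the coefficients are all `1 / len(weights_list)`, so the result reduces to the plain mean.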

## Decentralized Learning
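
In the decentralized setup there is no server: every client trains locally and then averages weights with its neighbors. In the cell below, clients call `communityLearn` one after another, so a client that updates later reads neighbor weights that have already been averaged. One possible alternative, sketched here under the assumption of a synchronous exchange, is to snapshot every client's `(data_size, weights)` message first and only then apply the averages. `ToyClient` and `synchronous_round` are hypothetical names, and a single array stands in for a model's weights.

```python
import numpy as np

class ToyClient:
    """Hypothetical client holding one weight array instead of a full model."""
    def __init__(self, weights, data_size):
        self.weights = np.array(weights, dtype=float)
        self.data_size = data_size
        self.neighbors = []

def synchronous_round(clients):
    # Phase 1: freeze every client's outgoing message before anyone updates
    messages = {id(c): (c.data_size, c.weights.copy()) for c in clients}
    # Phase 2: each client averages the frozen messages of its neighbors
    for client in clients:
        received = [messages[id(n)] for n in client.neighbors]
        total = sum(size for size, _ in received)
        client.weights = sum((size / total) * w for size, w in received)

clients = [ToyClient([0.0, 0.0], 100), ToyClient([4.0, 8.0], 300)]
# Fully connected network in which every client also counts itself
for c in clients:
    c.neighbors = list(clients)

synchronous_round(clients)
print([c.weights for c in clients])
```

Because the messages are frozen first, the result does not depend on the order in which the clients are visited.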

In [13]:
from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import random
import pdb

# Used to start execution ASAP
tf.enable_eager_execution()

# Configuration
num_clients = 4
num_epochs = 2
num_learning_rounds = 4
num_client_rounds = 1
nonIID = True
print ("Configuration:" + \
       "\n\t%d clients." % (num_clients) + \
       "\n\t%d training epochs." % (num_epochs)  + \
       "\n\tUsing %sIID data." % ("non-" if nonIID else ""))

# Client class
class Client:
  def __init__(self, modelGenerator):
    self.model = modelGenerator()
    self.neighbors = []
    self.x_data = None
    self.y_data = None
    self.data_size = None
    self.accuracy_history = []
    self.loss_history = []
  def plotAccuracy(self, histories):
    # Compile histories
    categorical_accuracy = []
    val_categorical_accuracy = []
    for history in histories:
      categorical_accuracy = categorical_accuracy + history.history['acc']
      # val_categorical_accuracy = val_categorical_accuracy + history.history['val_categorical_accuracy']
    # The history of our accuracy during training.
    plt.plot(categorical_accuracy)
    plt.plot(val_categorical_accuracy)
    plt.title('Model Accuracy')
    plt.ylabel('Accuracy')
    plt.xlabel('Number of epochs')
    plt.legend(['train', 'validation'], loc='upper left')
    return plt
  def train(self):
    history = self.model.fit(self.x_data, self.y_data, epochs=num_epochs)
    # print(history.history.keys())
    # self.accPlot = self.plotAccuracy([history])
  def test(self, x, y):
    loss, acc = self.model.evaluate(x, y)
    self.accuracy_history.append(acc)
    self.loss_history.append(loss)
    # return loss, acc
  def setOutgoingMessage(self):
    self.outgoing_message = (self.data_size, self.model.get_weights())
  def communityLearn(self):
    # incoming_messages = [neighbor.outgoing_message for neighbor in self.neighbors]
    # incoming_messages.append((self.data_size, self.outgoing_message))
    ordered_sizes = [neighbor.data_size for neighbor in self.neighbors]
    ordered_sizes = np.array(ordered_sizes) / sum(ordered_sizes)
    ordered_weights = [neighbor.model.get_weights() for neighbor in self.neighbors]
    new_weights = averageWeights(ordered_weights, ordered_sizes)
    # print(self.model.get_weights() == new_weights)
    self.model = createNN()
    self.model.set_weights(new_weights)

# NN model generator function
def createNN():
  model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10)
  ])
  loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
  model.compile(optimizer='adam', loss=loss_fn, metrics=['accuracy'])
  return model
# Weight averaging
def averageWeights(weightsList, weighting=None):
  denominator = len(weightsList)
  new_weights = []
  if weighting is None:
    # Handle IID data (balanced)
    for part in range(len(weightsList[0])):
      part_stack = [weights[part] for weights in weightsList]
      new_stack = sum(part_stack) / denominator
      new_stack = np.array(new_stack)
      new_weights.append(new_stack)
    return new_weights
  else:
    for part in range(len(weightsList[0])):
      part_stack = [weights[part] for weights in weightsList]
      # part_stack = np.array(part_stack) * weighting
      for i in range(len(weighting)):
        part_stack[i] = part_stack[i] * weighting[i]
      new_stack = sum(part_stack)
      new_stack = np.array(new_stack)
      new_weights.append(new_stack)
    return new_weights


# Create the network
print ("\nCreating a network...")
clientList = []
for i in range(num_clients):
  clientList.append(Client(createNN))
# Add neighbors
for client in clientList:
  for neighbor in clientList:
    client.neighbors.append(neighbor)

# Import MNIST data
print ("\nDownloading MNIST data...")
mnist = tf.keras.datasets.mnist
# Load data into trains
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Splitting the dataset for different clients
print ("\nSplitting data into different clients...")
if nonIID:
  print ("\tRandomly assigning ranges of data...")
  percentageMarkers = []
  for i in range(num_clients-1):
    percentageMarkers.append(random.random())
  percentageMarkers.append(1.0)
  percentageMarkers = sorted(percentageMarkers)
else:
  print ("\tUniformly assigning ranges of data")
  percentageMarkers = [1/num_clients * (n+1) for n in range(num_clients)]
# Storing each subset of data in a client
print ("\tStoring subsets of data into each client...")
xMarkers = [int(marker * len(x_train)) for marker in percentageMarkers]
yMarkers = [int(marker * len(y_train)) for marker in percentageMarkers]
for j in range(len(percentageMarkers)):
  clientList[j].x_data = x_train[(xMarkers[j-1] if j > 0 else 0):xMarkers[j]]
  clientList[j].y_data = y_train[(yMarkers[j-1] if j > 0 else 0):yMarkers[j]]
  clientList[j].data_size = len(clientList[j].x_data)

# Client data diagnostic
print ("\nFinished setting up client data!")
for client in clientList:
  print ("\tClient %d:\tX: %d\tY: %d" % (clientList.index(client), len(client.x_data), len(client.y_data)))

for learning_round in range(num_learning_rounds):
  print("\nLEARNING ROUND ", learning_round, ":\n")
  # Have each client learn on its data
  for client in clientList:
    print ("\nROUND", learning_round, ", CLIENT", clientList.index(client), "TRAINING\n")
    client.train()
  # Communicate and learn
  for client in clientList:
    print ("\nROUND", learning_round, ", CLIENT", clientList.index(client), "LEARNING\n")
    client.communityLearn()
  # Test at the end of this round
  for client in clientList:
    print ("\nROUND", learning_round, ", CLIENT", clientList.index(client), "TESTING\n")
    client.test(x_test, y_test)
# Print out results
for client in clientList:
  print("Client ", clientList.index(client), ":\n")
  print("\tAccuracy History: ", client.accuracy_history, "\n")
  print("\tLoss History: ", client.loss_history, "\n")

Configuration:
	4 clients.
	2 training epochs.
	Using non-IID data.

Creating a network...

Downloading MNIST data...

Splitting data into different clients...
	Randomly assigning ranges of data...
	Storing subsets of data into each client...

Finished setting up client data!
	Client 0:	X: 24679	Y: 24679
	Client 1:	X: 356	Y: 356
	Client 2:	X: 8755	Y: 8755
	Client 3:	X: 26210	Y: 26210

LEARNING ROUND  0 :


ROUND 0 , CLIENT 0 TRAINING

Train on 24679 samples
Epoch 1/2
Epoch 2/2

ROUND 0 , CLIENT 1 TRAINING

Train on 356 samples
Epoch 1/2
Epoch 2/2

ROUND 0 , CLIENT 2 TRAINING

Train on 8755 samples
Epoch 1/2
Epoch 2/2

ROUND 0 , CLIENT 3 TRAINING

Train on 26210 samples
Epoch 1/2
Epoch 2/2

ROUND 0 , CLIENT 0 LEARNING


ROUND 0 , CLIENT 1 LEARNING


ROUND 0 , CLIENT 2 LEARNING


ROUND 0 , CLIENT 3 LEARNING


ROUND 0 , CLIENT 0 TESTING


ROUND 0 , CLIENT 1 TESTING


ROUND 0 , CLIENT 2 TESTING


ROUND 0 , CLIENT 3 TESTING


LEARNING ROUND  1 :


ROUND 1 , CLIENT 0 TRAINING

Train on 24679