<a href="https://colab.research.google.com/github/harikuts/federated-learning-trials/blob/master/FederatedLearningRepro.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Overview

This notebook contains the reproduction of results of the original paper on federated learning.

## Plan

The roadmap for development is as follows:
*   Construct standard MNIST example.
*   To be continued.




# Standard MNIST

There are baseline implementations of a standard example of MNIST. A Keras implementation staands as the first example, but we will port this over to Tensorflow as it provides more low-level functionality.

## Example 1: Keras Implementation

A standard MNIST example from Keras (https://keras.io/examples/mnist_cnn/) is used as a basis to compare our fedeerated model to.

In [0]:
from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K

import pdb

# Configuration
batch_size = 128
num_classes = 10
epochs = 12

# input image dimensions
img_rows, img_cols = 28, 28

# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

# pdb.set_trace()

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

## Example 2.0: Tensorflow Implementation (Easy)

We begin by using a tutorial provided by Tensorflow (https://www.tensorflow.org/tutorials/quickstart/beginner). It already happens to use the MNIST example.

In [0]:
from __future__ import absolute_import, division, print_function, unicode_literals


import tensorflow as tf

tf.enable_eager_execution()

# Import MNIST data
mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Create model
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10)
])

# Predictions
predictions = model(x_train[:1]).numpy()
# Softmax
tf.nn.softmax(predictions).numpy()

# Defining the loss function
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
loss_fn(y_train[:1], predictions).numpy()

# Compile model
model.compile(optimizer='adam', loss=loss_fn, metrics=['accuracy'])

# Fit model
model.fit(x_train, y_train, epochs=5)

Train on 60000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


AttributeError: ignored

In [0]:
print(model.get_weights()[0].shape)
print(model.get_weights()[1].shape)
print(model.get_weights()[2].shape)
print(model.get_weights()[3].shape)

(784, 128)
(128,)
(128, 10)
(10,)


# Federated Mode Simulations

## Learning Instance Approach (defunct)

In this section, a beginning model is made that describes the higher level behavior of data interactions. I abandoned this model to pursue a simulation closer to the network level.

In [0]:
import tensorflow as tf
import tensorflow_datasets as tfds
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.layers import Conv2D, MaxPooling2D

import random
import pdb

# Configuration
batch_size = 128
num_classes = 10
epochs = 12

# Federated configuration
num_clients = 4
num_server_rounds = 8
num_client_rounds = 2

# mnist_train = tfds.load(name="mnist", split="train")
# mnist_train = mnist_train.repeat().shuffle(1024).batch(32)
# mnist_train = mnist_train.prefetch(tf.data.experimental.AUTOTUNE)
# mnist_test, info = tfds.load("mnist", split="test", with_info=True)

mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Splitting the dataset for different clients
nonIID = True
if nonIID:
  percentageMarkers = []
  for i in range(num_clients-1):
    percentageMarkers.append(random.random())
  percentageMarkers.append(1.0)
  percentageMarkers = sorted(percentageMarkers)
else:
  percentageMarkers = [1/num_clients * (n+1) for n in range(num_clients)]

# pdb.set_trace()

client_x_trains = []
client_y_trains = []
xMarkers = [int(marker * len(x_train)) for marker in percentageMarkers]
yMarkers = [int(marker * len(y_train)) for marker in percentageMarkers]
# pdb.set_trace()
for j in range(len(percentageMarkers)):
  client_x_trains.append(x_train[(xMarkers[j-1] if j > 0 else 0):xMarkers[j]])
  client_y_trains.append(x_train[(yMarkers[j-1] if j > 0 else 0):yMarkers[j]])

# pdb.set_trace()

# Model creation function
def createCNN():
  model = Sequential()
  model.add(Conv2D(32, kernel_size=(3, 3),
                  activation='relu',
                  input_shape=(1, 28, 28)))
  model.add(Conv2D(64, (3, 3), activation='relu'))
  model.add(MaxPooling2D(pool_size=(2, 2)))
  model.add(Dropout(0.25))
  model.add(Flatten())
  model.add(Dense(128, activation='relu'))
  model.add(Dropout(0.5))
  model.add(Dense(10, activation='softmax'))
  model.compile(
            optimizer=keras.optimizers.Adadelta(),
            loss='sparse_categorical_crossentropy',
            metrics=['sparse_categorical_accuracy'])
  return model

def createNN():
  model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10)
  ])
  return model

# Global model initialization
global_model = createCNN()
global_model.fit(x_train, y_train, epochs=epochs)
first_model_done = False


# Server action
for server_round in range(num_server_rounds):
  print("SERVER ROUND ", server_round, ":")
  # Clients' actions
  for client in range(num_clients):
    print("\tCLIENT ", client, "...")
    # Accept the global model
    if first_model_done:
      global_weights = global_model.get_weights()
    local_weights = []
    # Per each round
    for client_round in range(num_client_rounds):
      print("\t\tRound ", client_round)
      # Train on the local model
      round_model = createCNN()
      if first_model_done:
        round_model.set_weights(global_weights)
      # round_model.fit(client_x_trains[client], client_y_trains[client],
      #     batch_size=batch_size,
      #     epochs=epochs,
      #     verbose=1,
      #     validation_data=(x_test, y_test))
      # local_weights.append(round_model.get_weights())
      first_model_done = True
    pdb.set_trace()
      




ValueError: ignored

## Network Model Approach

Here we use nodes to carry models. The reason for doing this to prevent the instantiation of new models each time weights have to be transferred. Instead, the state of each model can be preserved in the node that it resides in.

In [19]:
from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow as tf
import random
import pdb

# Used to start execution ASAP
tf.enable_eager_execution()

# Configuration
num_clients = 4
num_epochs = 5
num_server_rounds = 2
num_client_rounds = 2
nonIID = False
print ("Configuration:" + \
       "\n\t%d clients." % (num_clients) + \
       "\n\t%d training epochs." % (num_epochs)  + \
       "\n\tUsing %sIID data." % ("non-" if nonIID else ""))

# Server class
class Server:
  def __init__(self, modelGenerator):
    self.model = modelGenerator()
    self.clients = []
    self.neighbors = []
# Client class
class Client:
  def __init__(self, modelGenerator):
    self.model = modelGenerator()
    self.neighbors = []
    self.x_data = None
    self.y_data = None
  def train(self):
    self.model.fit(self.x_data, self.y_data, epochs=num_epochs)
# NN model generator function
def createNN():
  model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10)
  ])
  loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
  model.compile(optimizer='adam', loss=loss_fn, metrics=['accuracy'])
  return model
# Weight averaging
def averageWeights(weightsList):
  pass

# Create the network
print ("\nCreating a network...")
server = Server(createNN)
for i in range(num_clients):
  server.clients.append(Client(createNN))

# Import MNIST data
print ("\nDownloading MNIST data...")
mnist = tf.keras.datasets.mnist
# Load data into trains
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Splitting the dataset for different clients
print ("\nSplitting data into different clients...")
if nonIID:
  print ("\tRandomly assigning ranges of data...")
  percentageMarkers = []
  for i in range(num_clients-1):
    percentageMarkers.append(random.random())
  percentageMarkers.append(1.0)
  percentageMarkers = sorted(percentageMarkers)
else:
  print ("\tUniformly assigning ranges of data")
  percentageMarkers = [1/num_clients * (n+1) for n in range(num_clients)]
# Storing each subset of data in a client
print ("\tStoring subsets of data into each client...")
xMarkers = [int(marker * len(x_train)) for marker in percentageMarkers]
yMarkers = [int(marker * len(y_train)) for marker in percentageMarkers]
for j in range(len(percentageMarkers)):
  server.clients[j].x_data = x_train[(xMarkers[j-1] if j > 0 else 0):xMarkers[j]]
  server.clients[j].y_data = y_train[(yMarkers[j-1] if j > 0 else 0):yMarkers[j]]

# Client data diagnostic
print ("\nFinished setting up client data!")
for client in server.clients:
  print ("\tClient %d:\tX: %d\tY: %d" % (server.clients.index(client), len(client.x_data), len(client.y_data)))

# Server action
for server_round in range(num_server_rounds):
  print("\nSERVER ROUND ", server_round, ":\n")
  # Save server model weights
  global_weights = server.model.get_weights()
  # Clients' actions
  client_weight_list = []
  for client in server.clients:
    # Initialize recorded weights
    round_weight_list = []
    for client_round in range(num_client_rounds):
      # Accept global weights
      client.model.set_weights(global_weights)
      # Train
      client.train()
      # Record weights
      round_weight_list.append(client.model.get_weights())
    client_weight_list.append(averageWeights(round_weight_list))
  server.model.set_weights(averageWeights(client_weight_list))


Configuration:
	4 clients.
	5 training epochs.
	Using IID data.

Creating a network...

Downloading MNIST data...

Splitting data into different clients...
	Uniformly assigning ranges of data
	Storing subsets of data into each client...

Finished setting up client data!
	Client 0:	X: 15000	Y: 15000
	Client 1:	X: 15000	Y: 15000
	Client 2:	X: 15000	Y: 15000
	Client 3:	X: 15000	Y: 15000

SERVER ROUND  0 :

Train on 15000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Train on 15000 samples
Epoch 1/5
Epoch 2/5

KeyboardInterrupt: ignored