## Personalized Learning (Localized Learning?)

#### This notebook includes the following online models:
1. A single global model with all data
2. Multiple local models (starting from a single global model)
   1. that are updated with new data
   2. that exchange data in clusters
   3. that exchange parameters in clusters

  
#### The dataset that is used for this project is [CIFAR-100 dataset][1]
* Has 100 classes containing 600 images each

#### New data are fed according to the following rules:
1. Distributed, according to superclasses
  * Clusters will only be updated with data that belongs to a specific superclass
  * We update the NN by
    1. Changing all parameters of the NN
    2. Only changing the last few layers, as in many MTL models
2. Randomly (why?)
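
The two update options under rule 1 can be illustrated without any framework. The sketch below is only an illustration, not the training code used in this notebook: `sgd_step` and its `trainable_from` argument are hypothetical names, and the weights and gradients are toy arrays. In Keras, the same effect is usually obtained by setting `layer.trainable = False` on the layers that should stay fixed before compiling the model.

```python
import numpy as np

def sgd_step(weights, grads, lr=0.1, trainable_from=0):
    """Apply one SGD step to the tensors at index >= trainable_from.

    weights, grads: lists of NumPy arrays, one entry per parameter tensor.
    trainable_from=0 updates every tensor; a larger value leaves the
    earlier tensors unchanged (they are "frozen").
    """
    updated = []
    for i, (w, g) in enumerate(zip(weights, grads)):
        if i >= trainable_from:
            updated.append(w - lr * g)
        else:
            updated.append(w.copy())
    return updated

# Toy "network" with three parameter tensors and constant gradients.
weights = [np.ones((2, 2)), np.ones((2, 2)), np.ones(2)]
grads = [np.full_like(w, 0.5) for w in weights]

all_updated = sgd_step(weights, grads, trainable_from=0)  # option 1: all parameters
last_only = sgd_step(weights, grads, trainable_from=2)    # option 2: last tensor only
```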

#### We expect this project to answer the following research questions:
1. If models are updated with data (or parameters) shared within a cluster, can each model perform well enough on the labels that matter to that cluster?
  * For example, the performance of the cluster that is updated with the "Vehicles" superclass is assessed only on the labels that belong to that superclass.
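
Restricting the assessment to a cluster's own labels could look like the sketch below. `accuracy_on_labels` is a hypothetical helper and the labels and predictions are made-up values; it only shows how a per-cluster score differs from the score over all labels.

```python
import numpy as np

def accuracy_on_labels(y_true, y_pred, labels):
    """Accuracy computed only over samples whose true label is in `labels`."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    mask = np.isin(y_true, labels)
    if not mask.any():
        raise ValueError("no samples with the requested labels")
    return float(np.mean(y_true[mask] == y_pred[mask]))

# Toy integer labels and predictions (illustrative values only).
y_true = np.array([0, 1, 2, 7, 8, 9])
y_pred = np.array([0, 1, 1, 0, 0, 9])

cluster_labels = [0, 1, 2]  # labels this cluster is responsible for
cluster_acc = accuracy_on_labels(y_true, y_pred, cluster_labels)
overall_acc = accuracy_on_labels(y_true, y_pred, np.arange(10))
```

Here the model is right on two of the three samples from its own labels but on only three of the six samples overall, so the two scores answer different questions.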
  
[1]: https://www.cs.toronto.edu/~kriz/cifar.html

#### Questions

Retraining: how does it work? <br>
How do we compare these models?


### Implementation with Custom Neural Network and MNIST dataset

In [1]:
%load_ext tensorboard

In [2]:
from __future__ import print_function
import tensorflow.keras as keras
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras import backend as K

In [3]:
import matplotlib

In [4]:
import datetime
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import matplotlib.lines as mlines

In [5]:
import tensorflow as tf
log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)

In [6]:
tf.__version__

'1.15.2'

In [7]:
# Hyperparameters
batch_size = 50
epochs = 20

# input image dimensions
img_rows, img_cols = 28, 28

#### Load MNIST dataset

In [8]:
# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

In [9]:
if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

In [10]:
x_train.shape

(60000, 28, 28, 1)

In [11]:
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255

In [12]:
global_dataset_size = 6000
local_dataset_size = 40000

In [13]:
import utils

In [14]:
import importlib
importlib.reload(utils)

<module 'utils' from '/home/seth/projects/fed-learn-experiment/utils.py'>

In [16]:
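# Split the training set by label: digits 0-4 (used for model1) and 5-9 (used for model2)
x_train_0_to_4, y_train_0_to_4 = utils.filter_data_by_labels(x_train, y_train, np.arange(5))
x_train_5_to_9, y_train_5_to_9 = utils.filter_data_by_labels(x_train, y_train, np.arange(5)+5)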

In [17]:
x_test_0_to_4, y_test_0_to_4 = utils.filter_data_by_labels(x_test, y_test, np.arange(5))

In [18]:
x_test_5_to_9, y_test_5_to_9 = utils.filter_data_by_labels(x_test, y_test, np.arange(5)+5)
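
The `utils` module is not shown in this notebook. As a rough guide to what a label filter of this kind does, here is a minimal sketch under the assumption that it keeps the samples whose integer label is in the given set; `filter_by_labels` is a hypothetical stand-in, not the actual `utils.filter_data_by_labels`.

```python
import numpy as np

def filter_by_labels(x, y, labels):
    """Return the samples of (x, y) whose integer label is in `labels`."""
    mask = np.isin(y, labels)
    return x[mask], y[mask]

# Toy data: six "images" of shape (2, 2, 1) with integer labels.
x = np.arange(24, dtype="float32").reshape(6, 2, 2, 1)
y = np.array([0, 5, 3, 9, 4, 7])

x_low, y_low = filter_by_labels(x, y, np.arange(5))        # labels 0..4
x_high, y_high = filter_by_labels(x, y, np.arange(5) + 5)  # labels 5..9
```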

In [19]:
# convert class vectors to binary class matrices
num_classes = 10
y_train_0_to_4 = keras.utils.to_categorical(y_train_0_to_4, num_classes)
y_train_5_to_9 = keras.utils.to_categorical(y_train_5_to_9, num_classes)
y_test_0_to_4 = keras.utils.to_categorical(y_test_0_to_4, num_classes)
y_test_5_to_9 = keras.utils.to_categorical(y_test_5_to_9, num_classes)

In [20]:
y_test = keras.utils.to_categorical(y_test, num_classes)
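
For reference, the one-hot conversion can be reproduced in plain NumPy. The sketch below is an illustration, not the Keras implementation, and `one_hot` is a hypothetical name. Because `num_classes` is 10 for every subset, a sample from the 5-9 split still gets a 10-column row, with zeros in columns 0-4.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Map integer labels of shape (n,) to a float32 matrix of shape (n, num_classes)."""
    labels = np.asarray(labels, dtype=int)
    return np.eye(num_classes, dtype="float32")[labels]

encoded = one_hot([5, 9, 7], num_classes=10)
```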

### Define models and compile & fit function

In [21]:
def custom_model():
    model = Sequential()
    model.add(Flatten(input_shape=input_shape))
    model.add(Dense(200, activation='relu'))
    model.add(Dense(200, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    return model

In [22]:
def compile_model(model):  
    # initiate SGD optimizer
    opt = keras.optimizers.SGD(lr=0.1)
    model.compile(loss='mean_squared_error', optimizer=opt, metrics=['accuracy'])

In [23]:
def compile_model_lr(model, lr):
    # initiate SGD optimizer with the given learning rate, decay and Nesterov momentum
    opt = keras.optimizers.SGD(lr=lr, decay=1e-6, momentum=0.9, nesterov=True)
    model.compile(loss='mean_squared_error', optimizer=opt, metrics=['accuracy'])

In [24]:
def fit_model_global(model, epochs):
    # X_global and Y_global must be defined before this function is called
    now = datetime.datetime.now()
    print("Training date and time : ")
    print(now.strftime("%Y-%m-%d %H:%M:%S"))
    return model.fit(X_global, Y_global,
                      batch_size=100,
                      epochs=epochs,
                      shuffle=True, callbacks=[tensorboard_callback])

In [25]:
def fit_model_with_datasets(model, epochs, x_train, y_train):
    now = datetime.datetime.now()
    print ("Training date and time : ")
    print (now.strftime("%Y-%m-%d %H:%M:%S"))
    return model.fit(x_train, y_train,
                      batch_size=batch_size,
                      epochs=epochs,
                      shuffle=True, validation_split=0.1, verbose=1)

In [26]:
init_model = custom_model()

Instructions for updating:
If using Keras pass *_constraint arguments to layers.


In [27]:
model1 = custom_model()
model2 = custom_model()
model1.set_weights(init_model.get_weights())
model2.set_weights(init_model.get_weights())
compile_model(model1)
compile_model(model2)

In [57]:
his = fit_model_with_datasets(model1, 0, x_train_0_to_4, y_train_0_to_4)

Training date and time : 
2020-05-20 14:54:02
Train on 27536 samples, validate on 3060 samples


In [59]:
his.history

{}

In [28]:
fit_model_with_datasets(model1, 30, x_train_0_to_4, y_train_0_to_4)
fit_model_with_datasets(model2, 30, x_train_5_to_9, y_train_5_to_9)

Training date and time : 
2020-05-18 16:37:27
Train on 27536 samples, validate on 3060 samples
Epoch 1/30
...
Epoch 30/30
Training date and time : 
2020-05-18 16:37:55
Train on 26463 samples, validate on 2941 samples
Epoch 1/30
...
Epoch 30/30


<tensorflow.python.keras.callbacks.History at 0x7efb799e6080>

### Aggregate models

In [29]:
agg_model = custom_model()
weights = [model1.get_weights(), model2.get_weights()]
# Average the two models' weights tensor by tensor (equal weight per model)
agg_weights = [np.mean(layer_weights, axis=0) for layer_weights in zip(*weights)]
agg_model.set_weights(agg_weights)
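
The cell above gives both models equal weight. A slightly more general version is sketched below on toy arrays. The optional `sample_counts` argument is an assumption added for illustration (it weights each model by how much data it was trained on) and is not something this notebook does; `average_weights` is a hypothetical name.

```python
import numpy as np

def average_weights(weight_sets, sample_counts=None):
    """Average several models' weights tensor by tensor.

    weight_sets: list of weight lists, e.g. [m.get_weights() for m in models].
    sample_counts: optional per-model sample counts used as averaging weights;
    None gives every model the same weight.
    """
    return [
        np.average(np.stack(tensors), axis=0, weights=sample_counts)
        for tensors in zip(*weight_sets)
    ]

# Two toy "models", each with one kernel and one bias.
w_a = [np.zeros((2, 2)), np.zeros(2)]
w_b = [np.ones((2, 2)), np.ones(2)]

equal = average_weights([w_a, w_b])                           # every entry 0.5
weighted = average_weights([w_a, w_b], sample_counts=[3, 1])  # every entry 0.25
```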

In [30]:
model1.evaluate(x=x_test_0_to_4, y=y_test_0_to_4, verbose=1)



[0.0024359100811186126, 0.9857949]

In [31]:
model1.evaluate(x=x_test_5_to_9, y=y_test_5_to_9, verbose=1)



[0.1668007507586131, 0.0]

In [32]:
model1.evaluate(x=x_test, y=y_test, verbose=1)



[0.08233365869522094, 0.5066]

In [33]:
model2.evaluate(x=x_test_0_to_4, y=y_test_0_to_4, verbose=1)



[0.17027101150037344, 0.0]

In [34]:
model2.evaluate(x=x_test_5_to_9, y=y_test_5_to_9, verbose=1)



[0.004858046754005197, 0.96811354]

In [35]:
model2.evaluate(x=x_test, y=y_test, verbose=1)



[0.08986376942396164, 0.4706]

In [36]:
compile_model(agg_model)
agg_model.evaluate(x=x_test, y=y_test, verbose=1)



[0.038877179938554766, 0.8343]

In [37]:
agg_model.evaluate(x=x_test_0_to_4, y=y_test_0_to_4, verbose=1)



[0.04094871223091755, 0.82681453]

In [38]:
agg_model.evaluate(x=x_test_5_to_9, y=y_test_5_to_9, verbose=1)



[0.03668717683937245, 0.8422135]
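
The three accuracies reported for `agg_model` are consistent with each other: the accuracy on the full test set should be the sample-weighted average of the accuracies on the two subsets. The check below assumes the subset sizes are 5139 (digits 0-4) and 4861 (digits 5-9), which are the standard MNIST test-label counts; they are not printed anywhere in this notebook.

```python
# Accuracies reported by the evaluate() cells above for agg_model.
acc_0_to_4 = 0.82681453
acc_5_to_9 = 0.8422135

# Assumed sizes of the two test subsets (standard MNIST test-label counts).
n_0_to_4 = 5139
n_5_to_9 = 4861

# Sample-weighted average of the two subset accuracies.
overall = (acc_0_to_4 * n_0_to_4 + acc_5_to_9 * n_5_to_9) / (n_0_to_4 + n_5_to_9)
print(round(overall, 4))
```

Under that assumption the result rounds to 0.8343, the accuracy reported on the full test set.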

In [39]:
import semantic_drift

In [40]:
semantic_drift.l2_distance(model1, model2)

5.7130294

In [41]:
semantic_drift.l2_distance(init_model, model1)

3.7985344

In [42]:
semantic_drift.l2_distance(init_model, model2)

4.434219
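
The `semantic_drift` module is not shown here. One plausible reading of an L2 distance between two models is the Euclidean distance between their flattened weight vectors. The sketch below implements that reading on toy arrays; it is an assumption about what `l2_distance` computes, and `l2_distance_between_weights` is a hypothetical name.

```python
import numpy as np

def l2_distance_between_weights(weights_a, weights_b):
    """Euclidean distance between two weight lists, treated as flat vectors."""
    flat_a = np.concatenate([np.ravel(w) for w in weights_a])
    flat_b = np.concatenate([np.ravel(w) for w in weights_b])
    return float(np.linalg.norm(flat_a - flat_b))

# Toy weight lists; with Keras models these would come from model.get_weights().
w_a = [np.zeros((2, 2)), np.zeros(2)]
w_b = [np.full((2, 2), 3.0), np.array([0.0, 8.0])]

dist = l2_distance_between_weights(w_a, w_b)  # sqrt(4 * 3**2 + 8**2)
```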