# Final Prototype for TFF Skin Lesion Type Classification
## Simulating Federated Learning

---

This notebook works through the ETL process required to package the data into a federated tensor format and proposes a way to simulate federated model training with two clients.

---

**Original data source:**
The HAM10000 dataset served as the training set for the [ISIC 2018 challenge (Task 3)](https://arxiv.org/abs/1902.03368). The official validation and test sets of this challenge are available, without ground-truth labels, through the challenge website https://challenge2018.isic-archive.com/.

**Data import source for this notebook:**
The HAM10000 data used in this notebook is loaded via the Kaggle API from this [Kaggle data source](https://www.kaggle.com/kmader/skin-cancer-mnist-ham10000).

The original dataset is also published on Harvard Dataverse: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/DBW86T

---

The author of this notebook used code from: [Federated Learning for Image Classification](https://www.tensorflow.org/federated/tutorials/federated_learning_for_image_classification) and [Elena Gavagnin](https://github.com/gavagnin)

## 1. Installations
Google Colab is a Jupyter notebook environment that runs entirely in the cloud. Usually, the environment is already set up with TensorFlow 2, but TensorFlow Federated needs to be installed manually.


In [0]:
pip install folium==0.2.1

Collecting folium==0.2.1
  Downloading https://files.pythonhosted.org/packages/72/dd/75ced7437bfa7cb9a88b96ee0177953062803c3b4cde411a97d98c35adaf/folium-0.2.1.tar.gz (69kB)
Building wheels for collected packages: folium
  Building wheel for folium (setup.py) ... done
  Created wheel for folium: filename=folium-0.2.1-cp36-none-any.whl size=79979 sha256=daf03c59734b5285a85717e4f095095414bc8bf3740871df78f2d898759e691b
  Stored in directory: /root/.cache/pip/wheels/b8/09/f0/52d2ef419c2aaf4fb149f92a33e0008bdce7ae81

In [0]:
!pip install imgaug==0.2.5

Collecting imgaug==0.2.5
  Downloading https://files.pythonhosted.org/packages/d2/60/a06a48d85a7e9062f5870347a3e3e953da30b37928d43b380c949bca458a/imgaug-0.2.5.tar.gz (562kB)
Building wheels for collected packages: imgaug
  Building wheel for imgaug (setup.py) ... done
  Created wheel for imgaug: filename=imgaug-0.2.5-cp36-none-any.whl size=561439 sha256=20627f02b5f08e63aa8f265f5240b81fb804d90a3bfd137c636e26702dec8cd3
  Stored in directory: /root/.cache/pip/wheels/31/48/c8/ca3345e8582a078de94243996e148377ef66fdb845557bae0b
Successfully built imgaug
Installing collected packages: imgaug
  Found existing installation: imgaug 0.2.9
    Uninstalling imgaug-0.2.9:
      Successfully uninstalled imgaug-0.2.9
Successfully installed imgaug-0.2.5


In [0]:
!pip install --upgrade tensorflow-probability

Collecting tensorflow-probability
  Downloading https://files.pythonhosted.org/packages/61/c5/783644c55074f42070acfa1662145f4a0c59ff425495194aa2dc4052f22a/tensorflow_probability-0.10.0-py2.py3-none-any.whl (3.5MB)
Installing collected packages: tensorflow-probability
  Found existing installation: tensorflow-probability 0.10.0rc0
    Uninstalling tensorflow-probability-0.10.0rc0:
      Successfully uninstalled tensorflow-probability-0.10.0rc0
Successfully installed tensorflow-probability-0.10.0


In [0]:
#@test {"skip": true}
!pip install --quiet --upgrade tensorflow_federated

In [0]:
import collections

import numpy as np
import tensorflow as tf
import tensorflow_federated as tff

tf.compat.v1.enable_v2_behavior()

In [0]:
@tff.federated_computation
def hello_world():
  return 'Hello, World!'

hello_world()

b'Hello, World!'

In [0]:
!pip install -q kaggle

In [0]:
!pip install imread

Collecting imread
  Downloading https://files.pythonhosted.org/packages/91/48/6725bcdf0d8c7ad1204579d882ae3b74052d444e926eb804c61a665e148a/imread-0.7.4-cp36-cp36m-manylinux2010_x86_64.whl (1.6MB)
Installing collected packages: imread
Successfully installed imread-0.7.4


### 1.1 Check the installed version

In [0]:
import tensorflow as tf
print(tf.__version__)

2.2.0


In [0]:
import tensorflow_federated as tff
print(tff.__version__)

0.14.0


In [0]:
import numpy as np
print(np.__version__)

1.18.4


### 1.2 Setting up the Kaggle API
Using the Kaggle API on Google Colab lets you work with the dataset directly, without downloading it to your local machine and uploading it again. The disadvantage of this approach is that Colab storage does not persist between sessions: the downloaded datasets and the `kaggle.json` file are gone when a session ends and have to be set up again.

This article [Setting Up Kaggle in Google Colab](https://towardsdatascience.com/setting-up-kaggle-in-google-colab-ebb281b61463) by Anne Bonner (2018) is helpful to set up your own Kaggle API Token in Google Colab. Once your token is set up, uncomment the code in the section below and run the notebook as usual.

In [0]:
# mount your google drive so you can save to it. You'll need to put in a token.
#from google.colab import drive
#drive.mount('/content/gdrive')

In [0]:
#from google.colab import files
#files.upload()

In [0]:
# create environment variables for kaggle to authenticate with
#import os

#os.environ['KAGGLE_USERNAME'] = "kaggle-username"
#os.environ['KAGGLE_KEY'] = "kaggle_key"

In [0]:
# let's list what's in the directory
#os.listdir()

In [0]:
# make directory named kaggle and copy kaggle.json file there
#!mkdir -p ~/.kaggle
#!cp kaggle.json ~/.kaggle/

In [0]:
# let's make a new directory for c_skin
#os.mkdir('c_skin')
#os.listdir()
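
As a complement to the commented cells above, the credentials file can also be written from Python. The sketch below assumes the usual layout of `kaggle.json` (a JSON object with `username` and `key`) and restricts the file to owner-only access. The helper name `write_kaggle_credentials` and the temporary demo directory are hypothetical; in Colab the target directory would be `~/.kaggle`.

```python
import json
import os
import stat
import tempfile


def write_kaggle_credentials(username, key, config_dir):
    """Write a kaggle.json credentials file that only the owner can read."""
    os.makedirs(config_dir, exist_ok=True)
    path = os.path.join(config_dir, "kaggle.json")
    with open(path, "w") as handle:
        json.dump({"username": username, "key": key}, handle)
    # Owner read/write only, so the token is not readable by other users.
    os.chmod(path, 0o600)
    return path


# Demo against a temporary directory with placeholder values.
demo_dir = tempfile.mkdtemp()
credentials_path = write_kaggle_credentials("kaggle-username", "kaggle_key", demo_dir)

with open(credentials_path) as handle:
    stored = json.load(handle)
mode = stat.S_IMODE(os.stat(credentials_path).st_mode)
```

Writing the file this way avoids uploading `kaggle.json` by hand in every new session, but the token itself should still not be committed to a shared notebook.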

## 2. Extract Data

This section reuses code components from the notebook [Initial Data Exploration and ETL Process](https://github.com/ChristinaSalker/federated-learning/blob/master/tff_skin_lesion_classification_etl_v6.ipynb).

### 2.1 Download HAM10000 data

The raw data is downloaded as a zip file directly from Kaggle and extracted to the `c_skin` directory.

In [0]:
# get the dataset from kaggle and load it into c_skin
!kaggle datasets download -d kmader/skin-cancer-mnist-ham10000 -p 'c_skin'

Downloading skin-cancer-mnist-ham10000.zip to c_skin
100% 5.20G/5.20G [01:14<00:00, 11.1MB/s]
100% 5.20G/5.20G [01:14<00:00, 75.2MB/s]


In [0]:
# unzip the file into /c_skin
!unzip -o c_skin/skin-cancer-mnist-ham10000.zip -d c_skin

Streaming output truncated to the last 5000 lines.
  inflating: c_skin/ham10000_images_part_2/ISIC_0029326.jpg  
  inflating: c_skin/ham10000_images_part_2/ISIC_0029327.jpg  
  inflating: c_skin/ham10000_images_part_2/ISIC_0029328.jpg  
  inflating: c_skin/ham10000_images_part_2/ISIC_0029329.jpg  
  inflating: c_skin/ham10000_images_part_2/ISIC_0029330.jpg  
  inflating: c_skin/ham10000_images_part_2/ISIC_0029331.jpg  
  inflating: c_skin/ham10000_images_part_2/ISIC_0029332.jpg  
  inflating: c_skin/ham10000_images_part_2/ISIC_0029333.jpg  
  inflating: c_skin/ham10000_images_part_2/ISIC_0029334.jpg  
  inflating: c_skin/ham10000_images_part_2/ISIC_0029335.jpg  
  inflating: c_skin/ham10000_images_part_2/ISIC_0029336.jpg  
  inflating: c_skin/ham10000_images_part_2/ISIC_0029337.jpg  
  inflating: c_skin/ham10000_images_part_2/ISIC_0029338.jpg  
  inflating: c_skin/ham10000_images_part_2/ISIC_0029339.jpg  
  inflating: c_skin/ham10000_images_part_2/ISIC_0029340.jpg  
  inf

In [0]:
# run this command to see all the files unzipped into the c_skin directory
!ls c_skin

ham10000_images_part_1	HAM10000_metadata.csv  hmnist_8_8_RGB.csv
HAM10000_images_part_1	hmnist_28_28_L.csv     skin-cancer-mnist-ham10000.zip
ham10000_images_part_2	hmnist_28_28_RGB.csv
HAM10000_images_part_2	hmnist_8_8_L.csv


### 2.2 Create df with image path

In [0]:
# first import the usual frameworks
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import collections
import warnings
import json
import os
from glob import glob

from IPython.core.display import display, HTML

# for processing images
from skimage import data, io, filters

# configure things
warnings.filterwarnings('ignore')

pd.options.display.float_format = '{:,.2f}'.format  
pd.options.display.max_columns = 999

%load_ext autoreload
%autoreload 2
%matplotlib inline



In [0]:
# building the path column for the individual images takes a few steps
# first define base_skin_dir, which is used to build imageid_path_dict
base_skin_dir = os.path.join('c_skin')

In [0]:
# map each image id (the file name without its extension) to the image's file path
imageid_path_dict = {os.path.splitext(os.path.basename(x))[0]: x
                     for x in glob(os.path.join(base_skin_dir, '*', '*.jpg'))}
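
To make the dictionary comprehension above easier to follow, here is a small sketch of the same mapping applied to two made-up paths instead of the `glob` result. The file names are illustrative only.

```python
import os

# Hypothetical paths in the same layout as the unzipped c_skin directory.
example_paths = [
    os.path.join("c_skin", "ham10000_images_part_1", "ISIC_0024306.jpg"),
    os.path.join("c_skin", "HAM10000_images_part_2", "ISIC_0029323.jpg"),
]

# basename() drops the directories, splitext() drops the ".jpg" extension.
example_dict = {os.path.splitext(os.path.basename(p))[0]: p for p in example_paths}

found = example_dict.get("ISIC_0029323")
missing = example_dict.get("ISIC_9999999")  # unknown ids map to None
```

Because dictionary keys are unique, an image that exists in more than one folder (the `ls` output lists both lower-case and upper-case image directories) keeps whichever path `glob` yields last. Ids without a matching file return `None` from `.get`, which would later surface as a missing value in the `path` column.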

In [0]:
# let's read the HAM10000_metadata.csv into the tile_df
tile_df = pd.read_csv('c_skin/HAM10000_metadata.csv')
tile_df.head()

Unnamed: 0,lesion_id,image_id,dx,dx_type,age,sex,localization
0,HAM_0000118,ISIC_0027419,bkl,histo,80.0,male,scalp
1,HAM_0000118,ISIC_0025030,bkl,histo,80.0,male,scalp
2,HAM_0002730,ISIC_0026769,bkl,histo,80.0,male,scalp
3,HAM_0002730,ISIC_0025661,bkl,histo,80.0,male,scalp
4,HAM_0001466,ISIC_0031633,bkl,histo,75.0,male,ear


In [0]:
# let's create a new column for the image path
tile_df['path'] = tile_df['image_id'].map(imageid_path_dict.get)
tile_df.sample(5)

Unnamed: 0,lesion_id,image_id,dx,dx_type,age,sex,localization,path
1979,HAM_0007586,ISIC_0029819,mel,histo,70.0,female,upper extremity,c_skin/HAM10000_images_part_2/ISIC_0029819.jpg
2852,HAM_0000872,ISIC_0029323,bcc,histo,75.0,male,chest,c_skin/HAM10000_images_part_2/ISIC_0029323.jpg
8339,HAM_0002216,ISIC_0026305,nv,histo,60.0,male,back,c_skin/ham10000_images_part_1/ISIC_0026305.jpg
4188,HAM_0000240,ISIC_0028552,nv,follow_up,60.0,male,trunk,c_skin/ham10000_images_part_1/ISIC_0028552.jpg
3498,HAM_0006859,ISIC_0032321,nv,follow_up,35.0,male,lower extremity,c_skin/HAM10000_images_part_2/ISIC_0032321.jpg


In [0]:
# create dictionary of the different lesion types - this will be needed to index the skin lesion types numerically
lesion_type_dict = {
    'nv': 'Melanocytic nevi',
    'mel': 'Melanoma',
    'bkl': 'Benign keratosis-like lesions',
    'bcc': 'Basal cell carcinoma',
    'akiec': 'Actinic keratoses',
    'vasc': 'Vascular lesions',
    'df': 'Dermatofibroma'
}

In [0]:
# create the cell_type_idx column, which represents the lesion type as a numeric value
# later in the notebook the cell_type_idx and the image features will be needed for the tf data pipeline
tile_df['cell_type'] = tile_df['dx'].map(lesion_type_dict.get) 
tile_df['cell_type_idx'] = pd.Categorical(tile_df['cell_type']).codes
tile_df.sample(5)

Unnamed: 0,lesion_id,image_id,dx,dx_type,age,sex,localization,path,cell_type,cell_type_idx
1710,HAM_0005256,ISIC_0033728,mel,histo,80.0,male,back,c_skin/HAM10000_images_part_2/ISIC_0033728.jpg,Melanoma,5
878,HAM_0002757,ISIC_0024909,bkl,consensus,85.0,female,face,c_skin/ham10000_images_part_1/ISIC_0024909.jpg,Benign keratosis-like lesions,2
3590,HAM_0001123,ISIC_0027554,nv,follow_up,50.0,female,lower extremity,c_skin/ham10000_images_part_1/ISIC_0027554.jpg,Melanocytic nevi,4
5102,HAM_0005029,ISIC_0031514,nv,follow_up,50.0,male,trunk,c_skin/HAM10000_images_part_2/ISIC_0031514.jpg,Melanocytic nevi,4
6923,HAM_0001088,ISIC_0032078,nv,histo,45.0,female,upper extremity,c_skin/HAM10000_images_part_2/ISIC_0032078.jpg,Melanocytic nevi,4
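
The numeric codes in `cell_type_idx` follow from how `pd.Categorical` orders its categories: when no explicit category list is passed, it uses the sorted unique values. The plain-Python sketch below reproduces that ordering for the seven lesion names. It is an illustration of the mapping, not the pandas implementation.

```python
lesion_names = [
    "Melanocytic nevi",
    "Melanoma",
    "Benign keratosis-like lesions",
    "Basal cell carcinoma",
    "Actinic keratoses",
    "Vascular lesions",
    "Dermatofibroma",
]

# Sorted unique names get consecutive integer codes, starting at 0.
categories = sorted(set(lesion_names))
name_to_code = {name: code for code, name in enumerate(categories)}
code_to_name = {code: name for name, code in name_to_code.items()}
```

This agrees with the sample rows above, where `Melanoma` is 5, `Melanocytic nevi` is 4 and `Benign keratosis-like lesions` is 2. Keeping `code_to_name` around makes it possible to translate model predictions back into lesion names later.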


### 2.3 Loading and resizing images

In [0]:
# resize the images with PIL to 28x28 pixels (3 RGB channels, so each array is 28x28x3)
from PIL import Image

tile_df['image'] = tile_df['path'].map(lambda x: np.asarray(Image.open(x).resize((28,28))))

In [0]:
# let's check the image size distribution again and see if the resizing worked 
tile_df['image'].map(lambda x: x.shape).value_counts()

(28, 28, 3)    10015
Name: image, dtype: int64

## 3. Data Preprocessing

Instead of preprocessing the data for centralized model training, the data needs to be preprocessed for federated ML. One of the key challenges to solve in this section is partitioning the data across different clients.

### 3.1 Select relevant data from df

In [0]:
# select only the relevant columns and rename them
tile_df = tile_df[['image','cell_type_idx']]

ham_dataset = tile_df.rename(columns={"image": "pixels", "cell_type_idx": "label"})
ham_dataset.sample(6)

Unnamed: 0,pixels,label
6944,"[[[114, 80, 86], [125, 91, 97], [135, 103, 111...",4
2011,"[[[241, 178, 176], [243, 180, 178], [244, 182,...",5
6404,"[[[236, 150, 156], [238, 150, 160], [238, 151,...",4
245,"[[[163, 150, 158], [163, 149, 156], [165, 149,...",2
4373,"[[[223, 135, 139], [223, 134, 142], [223, 135,...",4
7775,"[[[207, 190, 223], [210, 194, 226], [212, 194,...",4


### 3.2 Select features and target


In [0]:
import tensorflow as tf
from tensorflow import keras

In [0]:
# extract the pixels 
features = ham_dataset.drop(columns=['label'])
features.head()

Unnamed: 0,pixels
0,"[[[192, 153, 193], [195, 155, 192], [197, 154,..."
1,"[[[27, 16, 32], [69, 49, 76], [122, 93, 126], ..."
2,"[[[192, 138, 153], [200, 144, 162], [202, 142,..."
3,"[[[40, 21, 31], [95, 61, 73], [143, 102, 118],..."
4,"[[[159, 114, 140], [194, 144, 173], [215, 162,..."


In [0]:
# extract the label for each image
target = ham_dataset['label']
target.head()

0    2
1    2
2    2
3    2
4    2
Name: label, dtype: int8

### 3.3 Train test split

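Rather than producing a train/test split, `train_test_split` is used here to partition the data between two clients. With `train_size=0.40` and `test_size=0.40`, each client receives about 40% of the 10,015 images, and the remaining 20% are left unused.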

In [0]:
from sklearn.model_selection import train_test_split

x_client_1, x_client_2, y_client_1, y_client_2 = train_test_split(features, target, test_size=0.40, train_size=0.40, random_state=10015)
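
The effect of the call above can be sketched in plain Python: shuffle the sample indices once, then hand two disjoint slices to the two clients. This is only an illustration of the partitioning idea, not scikit-learn's implementation. The helper `split_two_clients` is hypothetical, and scikit-learn's own rounding may differ by one sample.

```python
import random


def split_two_clients(n_samples, percent_per_client, seed):
    """Shuffle sample indices and return two disjoint client slices plus the rest."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    size = n_samples * percent_per_client // 100
    return indices[:size], indices[size:2 * size], indices[2 * size:]


client_1_idx, client_2_idx, unused_idx = split_two_clients(10015, 40, seed=10015)
sizes = (len(client_1_idx), len(client_2_idx), len(unused_idx))
overlap = set(client_1_idx) & set(client_2_idx)
```

With 10,015 images and 40% per client, the sketch yields 4006 indices per client and leaves 2003 unused, which is consistent with the `(4006, 28, 28, 3)` shape printed for client 1 further down.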

In [0]:
# convert each client's pixel column to an array, then normalize it by
# subtracting the client's mean and dividing by the client's standard deviation
Xclient_1 = np.asarray(x_client_1['pixels'].tolist())
Xclient_2 = np.asarray(x_client_2['pixels'].tolist())

Xclient_1_mean = np.mean(Xclient_1)
Xclient_1_std = np.std(Xclient_1)

Xclient_2_mean = np.mean(Xclient_2)
Xclient_2_std = np.std(Xclient_2)

Xclient_1 = (Xclient_1 - Xclient_1_mean)/Xclient_1_std
Xclient_2 = (Xclient_2 - Xclient_2_mean)/Xclient_2_std
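
The normalization in the cell above is a standardization over all pixel values of a client. A minimal sketch on a handful of made-up pixel values shows the effect; it uses the standard library instead of NumPy, with the population standard deviation, which is what `np.std` computes by default.

```python
import statistics


def standardize(values):
    """Subtract the mean and divide by the population standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


# Illustrative pixel intensities in the 0-255 range.
pixels = [0.0, 64.0, 128.0, 192.0, 255.0]
normalized = standardize(pixels)

normalized_mean = statistics.mean(normalized)
normalized_std = statistics.pstdev(normalized)
```

After the transformation the values have mean 0 and standard deviation 1 (up to floating-point error). As in the notebook, each client uses its own statistics, so no pixel statistics need to be shared between clients.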

In [0]:
# let's have a look at the normalized image data
Xclient_1[0]

array([[[ 1.84246645e+00, -2.01279202e-02,  4.48463022e-02],
        [ 1.84246645e+00,  4.48463022e-02,  1.09820524e-01],
        [ 1.86412453e+00,  2.31882281e-02,  1.53136673e-01],
        ...,
        [ 1.86412453e+00,  2.31882281e-02,  1.31478599e-01],
        [ 1.84246645e+00,  2.31882281e-02,  8.81624504e-02],
        [ 1.79915031e+00, -6.34440684e-02,  2.31882281e-02]],

       [[ 1.79915031e+00, -8.51021425e-02, -1.06760217e-01],
        [ 1.82080838e+00, -4.17859943e-02, -6.34440684e-02],
        [ 1.84246645e+00, -2.01279202e-02, -2.01279202e-02],
        ...,
        [ 1.86412453e+00,  1.53136673e-01,  2.18110895e-01],
        [ 1.82080838e+00,  1.53015394e-03,  6.65043763e-02],
        [ 1.73417608e+00, -1.28418291e-01, -1.06760217e-01]],

       [[ 1.79915031e+00, -6.34440684e-02, -1.28418291e-01],
        [ 1.84246645e+00, -4.17859943e-02, -1.06760217e-01],
        [ 1.86412453e+00,  2.31882281e-02, -2.01279202e-02],
        ...,
        [ 1.84246645e+00,  8.81624504e-02,

In [0]:
# after performing train_test_split, client 1 has 4006 images
print(Xclient_1.shape)

(4006, 28, 28, 3)


In [0]:
# perform label encoding by converting a class vector (integers) to a binary class matrix
# labels are 7 different classes of skin lesion types from 0 to 6

from tensorflow.keras.utils import to_categorical

Yclient_1 = to_categorical(y_client_1, num_classes = 7)
Yclient_2 = to_categorical(y_client_2, num_classes = 7)

In [0]:
Yclient_1[0]

array([0., 0., 0., 0., 1., 0., 0.], dtype=float32)
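
For readers unfamiliar with one-hot encoding, the following plain-Python sketch mirrors what `to_categorical` does for a single label. The helper `one_hot` is hypothetical and only meant to illustrate the encoding.

```python
def one_hot(label, num_classes):
    """Return a list with 1.0 at position `label` and 0.0 everywhere else."""
    if not 0 <= label < num_classes:
        raise ValueError(f"label {label} is outside 0..{num_classes - 1}")
    return [1.0 if index == label else 0.0 for index in range(num_classes)]


encoded = one_hot(4, 7)
# The position of the single 1.0 recovers the original class index.
decoded = encoded.index(1.0)
```

Label 4 becomes a vector with a single 1.0 in the fifth position, matching the `Yclient_1[0]` output above. Because the labels are one-hot vectors from here on, any loss or metric used later has to accept one-hot targets rather than integer class ids.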

In [0]:
print(Yclient_1.shape)

(4006, 7)


### 3.4 tf.data.Dataset.from_tensor_slices()

In [0]:
client_1 = tf.data.Dataset.from_tensor_slices((Xclient_1, Yclient_1))
client_1

<TensorSliceDataset shapes: ((28, 28, 3), (7,)), types: (tf.float64, tf.float32)>

In [0]:
client_2 = tf.data.Dataset.from_tensor_slices((Xclient_2, Yclient_2))
client_2

<TensorSliceDataset shapes: ((28, 28, 3), (7,)), types: (tf.float64, tf.float32)>

In [0]:
client_1.element_spec

(TensorSpec(shape=(28, 28, 3), dtype=tf.float64, name=None),
 TensorSpec(shape=(7,), dtype=tf.float32, name=None))

In [0]:
client_2.element_spec

(TensorSpec(shape=(28, 28, 3), dtype=tf.float64, name=None),
 TensorSpec(shape=(7,), dtype=tf.float32, name=None))

### 3.5 tff.simulation.ClientData.from_clients_and_fn()

In [0]:
dataset_paths = {
  "client_1" : client_1,
  "client_2" : client_2
}

def create_tf_dataset_for_client_fn(id):
   path = dataset_paths.get(id)
   if path is None:
     raise ValueError(f'No dataset for client {id}')
   return path

source = tff.simulation.ClientData.from_clients_and_fn(
  list(dataset_paths.keys()), create_tf_dataset_for_client_fn)

In [0]:
source

<tensorflow_federated.python.simulation.client_data.ConcreteClientData at 0x7f47edd05550>

In [0]:
source.element_type_structure

(TensorSpec(shape=(28, 28, 3), dtype=tf.float64, name=None),
 TensorSpec(shape=(7,), dtype=tf.float32, name=None))

In [0]:
source.client_ids

['client_1', 'client_2']

In [0]:
source.datasets

<bound method ClientData.datasets of <tensorflow_federated.python.simulation.client_data.ConcreteClientData object at 0x7f47edd05550>>

### 3.6 tff.simulation.ClientData.create_tf_dataset_for_client()

In [0]:
tff_dataset = source.create_tf_dataset_for_client(
        source.client_ids[0]
    )
print(type(tff_dataset))
example_element = next(iter(tff_dataset))
print(example_element)

<class 'tensorflow.python.data.ops.dataset_ops.TensorSliceDataset'>
(<tf.Tensor: shape=(28, 28, 3), dtype=float64, numpy=
array([[[ 1.84246645e+00, -2.01279202e-02,  4.48463022e-02],
        [ 1.84246645e+00,  4.48463022e-02,  1.09820524e-01],
        [ 1.86412453e+00,  2.31882281e-02,  1.53136673e-01],
        ...,
        [ 1.86412453e+00,  2.31882281e-02,  1.31478599e-01],
        [ 1.84246645e+00,  2.31882281e-02,  8.81624504e-02],
        [ 1.79915031e+00, -6.34440684e-02,  2.31882281e-02]],

       [[ 1.79915031e+00, -8.51021425e-02, -1.06760217e-01],
        [ 1.82080838e+00, -4.17859943e-02, -6.34440684e-02],
        [ 1.84246645e+00, -2.01279202e-02, -2.01279202e-02],
        ...,
        [ 1.86412453e+00,  1.53136673e-01,  2.18110895e-01],
        [ 1.82080838e+00,  1.53015394e-03,  6.65043763e-02],
        [ 1.73417608e+00, -1.28418291e-01, -1.06760217e-01]],

       [[ 1.79915031e+00, -6.34440684e-02, -1.28418291e-01],
        [ 1.84246645e+00, -4.17859943e-02, -1.06760217e

In [0]:
NUM_CLIENTS = 2
NUM_EPOCHS = 5
BATCH_SIZE = 20
SHUFFLE_BUFFER = 100
PREFETCH_BUFFER=10

def preprocess(dataset):

  def batch_format_fn(pixels, label):
    """Return a batch of images and one-hot labels as an `OrderedDict`."""
    # `from_tensor_slices((X, Y))` yields (pixels, label) tuples, not dicts,
    # so `map` passes the two components as separate arguments.
    # Cast the pixels to float32 to match the Keras model's weights and keep
    # the (28, 28, 3) image shape that the convolutional layers expect.
    return collections.OrderedDict(
        x=tf.cast(pixels, tf.float32),
        y=label)

  return dataset.repeat(NUM_EPOCHS).shuffle(SHUFFLE_BUFFER).batch(
      BATCH_SIZE).map(batch_format_fn).prefetch(PREFETCH_BUFFER)

In [0]:
preprocessed_example_dataset = preprocess(tff_dataset)

sample_batch = tf.nest.map_structure(lambda x: x.numpy(),
                                     next(iter(preprocessed_example_dataset)))

sample_batch

In [0]:
tff_dataset.element_spec

(TensorSpec(shape=(28, 28, 3), dtype=tf.float64, name=None),
 TensorSpec(shape=(7,), dtype=tf.float32, name=None))
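
The constants in the preprocessing cell determine how much data a client processes in one federated round. The arithmetic below is a sketch under the assumption that a client holds 4006 images (the size printed for client 1) and that the pipeline repeats the data before batching, as `preprocess` does.

```python
import math

NUM_EPOCHS = 5
BATCH_SIZE = 20
samples_per_client = 4006  # from Xclient_1.shape in section 3.3

# repeat(NUM_EPOCHS) comes before batch(), so batches are cut from the
# concatenated epochs and only the very last batch can be smaller.
examples_per_round = samples_per_client * NUM_EPOCHS
batches_per_round = math.ceil(examples_per_round / BATCH_SIZE)
last_batch_size = examples_per_round - (batches_per_round - 1) * BATCH_SIZE
```

Under these assumptions a client sees 20,030 examples per round, split into 1002 batches, the last of which holds 10 examples. The leading batch dimension therefore has to stay flexible (`None`) in any input specification.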

## 4. Create the Keras Model

### 4.1 tff.learning.from_keras_model()

For background on Keras input arguments (`input_shape`, `units`, `batch_size`, `dim`, etc.), see this Stack Overflow question:

https://stackoverflow.com/questions/44747343/keras-input-explanation-input-shape-units-batch-size-dim-etc

In [0]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

In [0]:
def create_keras_model():
  # The input shape is declared once by `tf.keras.Input`; the first Conv2D
  # layer does not need its own `input_shape` argument.
  return tf.keras.models.Sequential([
      tf.keras.Input(shape=(28, 28, 3)),
      tf.keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu', padding='same'),
      tf.keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu', padding='same'),
      tf.keras.layers.MaxPool2D(pool_size=(2, 2)),
      tf.keras.layers.Dropout(0.25),

      tf.keras.layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
      tf.keras.layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
      tf.keras.layers.MaxPool2D(pool_size=(2, 2)),
      tf.keras.layers.Dropout(0.40),

      tf.keras.layers.Flatten(),
      tf.keras.layers.Dropout(0.5),
      tf.keras.layers.Dense(7, activation='softmax')])

In [0]:
create_keras_model().summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_4 (Conv2D)            (None, 28, 28, 32)        896       
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 28, 28, 32)        9248      
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 14, 14, 32)        0         
_________________________________________________________________
dropout_3 (Dropout)          (None, 14, 14, 32)        0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 14, 14, 64)        18496     
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 14, 14, 64)        36928     
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 7, 7, 64)         
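
The parameter counts in the summary can be checked by hand: a `Conv2D` layer has one weight per kernel position and input channel plus one bias, for each filter. The sketch below recomputes the four convolutional counts shown above. The count for the final `Dense` layer is derived from the architecture (7 x 7 x 64 flattened inputs, 7 units) and is not visible in the truncated summary, so treat it as a calculation rather than recorded output.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units


conv_params = [
    conv2d_params(3, 3, 3, 32),   # first conv layer, RGB input
    conv2d_params(3, 3, 32, 32),
    conv2d_params(3, 3, 32, 64),
    conv2d_params(3, 3, 64, 64),
]
flattened_inputs = 7 * 7 * 64  # output shape of the last pooling layer
dense_layer_params = dense_params(flattened_inputs, 7)
total_params = sum(conv_params) + dense_layer_params
```

The convolutional layers come to 896, 9248, 18496 and 36928 parameters, matching the summary. Pooling, dropout and flatten layers add none.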

In [0]:
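def model_fn():
  # We _must_ create a new model here, and _not_ capture it from an external
  # scope. TFF will call this within different graph contexts.
  keras_model = create_keras_model()
  # TFF needs the spec of one *batched* element: float32 images of shape
  # (batch, 28, 28, 3) and one-hot labels of shape (batch, 7).
  input_spec = collections.OrderedDict(
      x=tf.TensorSpec(shape=[None, 28, 28, 3], dtype=tf.float32),
      y=tf.TensorSpec(shape=[None, 7], dtype=tf.float32))
  return tff.learning.from_keras_model(
      keras_model,
      input_spec=input_spec,
      # The labels are one-hot encoded, so use the non-sparse loss and metric.
      loss=tf.keras.losses.CategoricalCrossentropy(),
      metrics=[tf.keras.metrics.CategoricalAccuracy()])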

## 5. Training the Model

### 5.1 tff.learning.build_federated_averaging_process()

Note: the default server optimizer function is tf.keras.optimizers.SGD with a learning rate of 1.0, which corresponds to adding the model delta to the current server model. This recovers the original FedAvg algorithm in McMahan et al., 2017. More sophisticated federated averaging procedures may use different learning rates or server optimizers.
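
To illustrate the note above, here is a minimal plain-Python sketch of one server update in federated averaging. It assumes clients are weighted by their number of examples and that model weights are flat lists of floats. The function `federated_average` and the example numbers are hypothetical; this is not TFF's implementation.

```python
def federated_average(server_weights, client_weights, client_sizes, server_lr=1.0):
    """Apply the example-weighted mean of the client deltas to the server model."""
    total = sum(client_sizes)
    new_weights = []
    for i, w in enumerate(server_weights):
        # Model delta: weighted average of (client weight - server weight).
        delta = sum(
            (cw[i] - w) * n / total
            for cw, n in zip(client_weights, client_sizes))
        # With server_lr = 1.0 the delta is simply added to the server model.
        new_weights.append(w + server_lr * delta)
    return new_weights


server = [0.0, 1.0]
clients = [[1.0, 3.0], [3.0, 5.0]]  # weights after local training
sizes = [4006, 4006]                # illustrative client dataset sizes

updated = federated_average(server, clients, sizes)
half_step = federated_average(server, clients, sizes, server_lr=0.5)
```

With a server learning rate of 1.0 the new server model equals the weighted mean of the client models, here `[2.0, 4.0]`. A smaller rate such as 0.5 moves the server only part of the way towards that mean.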

In [0]:
iterative_process = tff.learning.build_federated_averaging_process(
    model_fn,
    client_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=0.02),
    server_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=1.0))

In [0]:
# Alternative: federated SGD instead of federated averaging. `model_fn` is a
# required argument; a separate name keeps the FedAvg process above intact.
sgd_process = tff.learning.build_federated_sgd_process(model_fn)

In [0]:
str(iterative_process.initialize.type_signature)

### 5.2 iterative_process.initialize()

In [0]:
state = iterative_process.initialize()

In [0]:
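# `next` expects one preprocessed tf.data.Dataset per client, not the ClientData object
federated_train_data = [
    preprocess(source.create_tf_dataset_for_client(client_id))
    for client_id in source.client_ids
]

state, metrics = iterative_process.next(state, federated_train_data)
print('round  1, metrics={}'.format(metrics))

In [0]:
NUM_ROUNDS = 11
for round_num in range(2, NUM_ROUNDS):
  state, metrics = iterative_process.next(state, federated_train_data)
  print('round {:2d}, metrics={}'.format(round_num, metrics))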