This is a read-only notebook. To run this, you can first make a copy of it, and then run the new notebook.

# Federated Learning for Text Generation

This Python notebook contains the code for the Network Softwarization final project.

In this project, we use Federated Learning to build a text-generation model. We first load a pre-trained Keras model and then refine it with federated training. The reasons for taking this approach are explained in the report.

The pre-trained model was trained on the text of two of Charles Dickens' books; for the federated learning part, we use a federated version of the works of Shakespeare provided by TFF.

This project is based on TensorFlow's federated learning tutorial for text generation.

## Install and test `tensorflow_federated`

Install `tensorflow_federated` package.

In [None]:
#@test {"skip": true}
!pip install --quiet --upgrade tensorflow_federated

  Building wheel for gast (setup.py) ... done
ERROR: tensorflow-probability 0.10.0rc0 has requirement gast>=0.3.2, but you'll have gast 0.2.2 which is incompatible.
ERROR: datascience 0.10.6 has requirement folium==0.2.1, but you'll have folium 0.8.3 which is incompatible.
ERROR: albumentations 0.1.12 has requirement imgaug<0.2.7,>=0.2.5, but you'll have imgaug 0.2.9 which is incompatible.


Import necessary packages and test to see if `tff` is working.

In [None]:
import collections
import functools
import os
import time
import numpy as np
import tensorflow as tf
import tensorflow_federated as tff

tf.compat.v1.enable_v2_behavior()
np.random.seed(0)

# Test that TFF is working:
tff.federated_computation(lambda: 'Hello, World!')()

b'Hello, World!'

## Load the pre-trained model

We start with an RNN model that generates ASCII characters and then refine it via federated learning. The model was pre-trained following the TensorFlow tutorial [Text generation with an RNN](https://www.tensorflow.org/tutorials/sequences/text_generation).

So that the works of Shakespeare can be kept for the federated learning step, the model was pre-trained on the text of Charles Dickens' [A Tale of Two Cities](http://www.ibiblio.org/pub/docs/books/gutenberg/9/98/98.txt) and [A Christmas Carol](http://www.ibiblio.org/pub/docs/books/gutenberg/4/46/46.txt), and the final model was saved with `tf.keras.models.save_model(include_optimizer=False)`.
   
After this step, we will use federated learning to fine-tune this model for Shakespeare, using a federated version of the data provided by TFF in `tff.simulation.datasets.shakespeare.load_data()`.

Generate the vocab lookup tables

In [None]:
# A fixed vocabulary of ASCII chars that occur in the works of Shakespeare and Dickens:
vocab = list('dhlptx@DHLPTX $(,048cgkoswCGKOSW[_#\'/37;?bfjnrvzBFJNRVZ"&*.26:\naeimquyAEIMQUY]!%)-159\r')

# Creating a mapping from unique characters to indices
char2idx = {u:i for i, u in enumerate(vocab)}
idx2char = np.array(vocab)
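
As a quick sanity check of the two lookup tables, the sketch below encodes a string to indices and decodes it back. It uses a small toy vocabulary (an assumption made to keep the example short) rather than the full `vocab` above, and the helper names `encode` and `decode` are hypothetical.

```python
# Toy vocabulary (illustrative only; the notebook uses the full ASCII vocab).
toy_vocab = list('abcdefgh ')

# Same construction as in the cell above: char -> index and index -> char.
toy_char2idx = {u: i for i, u in enumerate(toy_vocab)}
toy_idx2char = list(toy_vocab)


def encode(text):
    """Map each character to its integer index (hypothetical helper)."""
    return [toy_char2idx[c] for c in text]


def decode(ids):
    """Map integer indices back to a string (hypothetical helper)."""
    return ''.join(toy_idx2char[i] for i in ids)


ids = encode('a bad cab')
print(ids)
print(decode(ids))
```

Encoding followed by decoding should return the original string, as long as every character is present in the vocabulary.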

Load the pre-trained model


In [None]:
def load_model(batch_size):
  urls = {
      1: 'https://storage.googleapis.com/tff-models-public/dickens_rnn.batch1.kerasmodel',
      8: 'https://storage.googleapis.com/tff-models-public/dickens_rnn.batch8.kerasmodel'}
  assert batch_size in urls, 'batch_size must be in ' + str(urls.keys())
  url = urls[batch_size]
  local_file = tf.keras.utils.get_file(os.path.basename(url), origin=url)  
  return tf.keras.models.load_model(local_file, compile=False)

To test the pre-trained model, we feed it a start string and get a generated string back.

In [None]:
def generate_text(model, start_string):
  # From https://www.tensorflow.org/tutorials/sequences/text_generation
  num_generate = 200
  input_eval = [char2idx[s] for s in start_string]
  input_eval = tf.expand_dims(input_eval, 0)
  text_generated = []
  temperature = 1.0

  model.reset_states()
  for i in range(num_generate):
    predictions = model(input_eval)
    predictions = tf.squeeze(predictions, 0)
    predictions = predictions / temperature
    predicted_id = tf.random.categorical(
        predictions, num_samples=1)[-1, 0].numpy()
    input_eval = tf.expand_dims([predicted_id], 0)
    text_generated.append(idx2char[predicted_id])

  return (start_string + ''.join(text_generated))
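
The sampling step in `generate_text()` divides the logits by `temperature` and then draws one index with `tf.random.categorical`, which samples from the softmax of its input. The plain-Python sketch below mirrors that step without TensorFlow; the function names `temperature_probs` and `sample_id` and the example logits are assumptions made for illustration.

```python
import math
import random


def temperature_probs(logits, temperature=1.0):
    """Return softmax(logits / temperature) as a list of probabilities."""
    scaled = [x / temperature for x in logits]
    # Subtract the max for numerical stability before exponentiating.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_id(logits, temperature=1.0, rng=None):
    """Draw one index according to the temperature-scaled distribution."""
    rng = rng or random.Random()
    probs = temperature_probs(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


example_logits = [2.0, 1.0, 0.1]  # made-up logits for a 3-char vocabulary
for t in (0.5, 1.0, 2.0):
    print(t, [round(p, 3) for p in temperature_probs(example_logits, t)])
print(sample_id(example_logits, temperature=1.0, rng=random.Random(0)))
```

Lower temperatures concentrate probability on the highest logit, which makes the generated text more predictable; higher temperatures flatten the distribution and make it more varied.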

In [None]:
# Text generation requires a batch_size=1 model.
keras_model_batch1 = load_model(batch_size=1)
print(generate_text(keras_model_batch1, 'What of TensorFlow Federated, you ask? '))

Downloading data from https://storage.googleapis.com/tff-models-public/dickens_rnn.batch1.kerasmodel
What of TensorFlow Federated, you ask? Shall I
do well now, I heard her staircases much behind the counter.

"In your arms towards the Notion


The hard was demained until at the topic, from turbidly until Younce
gentleman, accordin


## Load and Preprocess the Federated Shakespeare Data



To provide a realistic non-IID data distribution, TFF provides the `tff.simulation.datasets` package. In this package, datasets are split into "clients", where each client corresponds to the data on a particular device that might participate in federated learning.

`tff.simulation.datasets.shakespeare.load_data()` returns the train and test Shakespeare federated datasets.

The datasets are structured as follows: each client key consists of the name of the play joined with the name of the character. For example, `MUCH_ADO_ABOUT_NOTHING_OTHELLO` corresponds to the lines for the character Othello in the play *Much Ado About Nothing*.

Note that in a real federated learning scenario, clients are never identified or tracked by IDs, but for simulation it is useful to work with keyed datasets.
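
To make the keying scheme concrete, here is a minimal sketch that groups lines of text by a play-plus-character key. The helper `make_client_key` and the placeholder records are hypothetical; this is not how TFF builds its dataset, only an illustration that reproduces the key format seen in this notebook.

```python
import re
from collections import defaultdict


def make_client_key(play, character):
    """Build a PLAY_CHARACTER style key (illustrative; not TFF's own code)."""
    raw = '{}_{}'.format(play, character).upper()
    # Replace every run of non-alphanumeric characters with one underscore.
    return re.sub(r'[^A-Z0-9]+', '_', raw).strip('_')


# Toy (play, character, line) records; the line texts are placeholders.
records = [
    ('Much Ado About Nothing', 'Othello', 'first placeholder line'),
    ('The Tragedy of King Lear', 'King', 'second placeholder line'),
    ('Much Ado About Nothing', 'Othello', 'third placeholder line'),
]

# Each key plays the role of one simulated client holding its own lines.
client_data = defaultdict(list)
for play, character, line in records:
    client_data[make_client_key(play, character)].append(line)

for key in sorted(client_data):
    print(key, len(client_data[key]))
```

Because each client holds only one character's lines, the per-client data differ in size and style, which is what makes the distribution non-IID.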

In [None]:
train_data, test_data = tff.simulation.datasets.shakespeare.load_data()

Downloading data from https://storage.googleapis.com/tff-datasets-public/shakespeare.tar.bz2


Here, for example, we can look at some data from King Lear:

In [None]:
# Here the play is "The Tragedy of King Lear" and the character is "King".
raw_example_dataset = train_data.create_tf_dataset_for_client('THE_TRAGEDY_OF_KING_LEAR_KING')

To prepare the data for training, we use `tf.data.Dataset` transformations.

In [None]:
# Input pre-processing parameters
SEQ_LENGTH = 100
BATCH_SIZE = 8
NUM_EPOCHS = 5
BUFFER_SIZE = 10000  # For dataset shuffling

# Construct a lookup table to map string chars to indexes,
# using the vocab loaded above:
table = tf.lookup.StaticHashTable(
  tf.lookup.KeyValueTensorInitializer(
    keys=vocab,
    values=tf.constant(list(range(len(vocab))), dtype=tf.int64)),
  default_value=0)


def to_ids(x):
  s = tf.reshape(x['snippets'], shape=[1])
  chars = tf.strings.bytes_split(s).values
  ids = table.lookup(chars)
  return ids


def split_input_target(chunk):
  input_text = tf.map_fn(lambda x: x[:-1], chunk)
  target_text = tf.map_fn(lambda x: x[1:], chunk)
  return (input_text, target_text)


def preprocess(dataset):
  return (
      # Try multiple epochs of local training
      dataset.repeat(NUM_EPOCHS)
      # Map ASCII chars to int64 indexes using the vocab
      .map(to_ids)
      # Split into individual chars
      .unbatch()
      # Form example sequences of SEQ_LENGTH +1
      .batch(SEQ_LENGTH + 1, drop_remainder=True)
      # Shuffle and form minibatches
      .shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)
      # And finally split into (input, target) tuples, each of length SEQ_LENGTH.
      .map(split_input_target))

Note that both when forming the original sequences and when forming the batches above, we use `drop_remainder=True` for simplicity. Because `repeat(NUM_EPOCHS)` is applied before batching, this means that characters (clients) whose repeated text contains fewer than `(SEQ_LENGTH + 1) * BATCH_SIZE` chars will have empty datasets.
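
The effect of the two `drop_remainder=True` steps can be checked with simple integer arithmetic. The sketch below is an approximation of the pipeline in plain Python, not the `tf.data` code itself; `num_batches` and `split_input_target_str` are hypothetical helpers. Since `repeat(NUM_EPOCHS)` runs before batching in `preprocess()`, the character count that matters is the one after repetition.

```python
SEQ_LENGTH = 100
BATCH_SIZE = 8
NUM_EPOCHS = 5


def num_batches(num_chars, num_epochs=1,
                seq_length=SEQ_LENGTH, batch_size=BATCH_SIZE):
    """Count full minibatches when both batching steps drop remainders."""
    total_chars = num_chars * num_epochs
    # First batching step: sequences of seq_length + 1 chars each.
    num_sequences = total_chars // (seq_length + 1)
    # Second batching step: minibatches of batch_size sequences.
    return num_sequences // batch_size


def split_input_target_str(chunk):
    """Shift-by-one split, shown on a string instead of an id tensor."""
    return chunk[:-1], chunk[1:]


# Without repetition, 808 = (100 + 1) * 8 chars is the smallest non-empty case.
print(num_batches(807), num_batches(808))
# With NUM_EPOCHS repetitions applied first, far fewer raw chars are needed.
print(num_batches(161, NUM_EPOCHS), num_batches(162, NUM_EPOCHS))
print(split_input_target_str('hello'))
```

The last line shows why sequences are `SEQ_LENGTH + 1` long: dropping the last char gives the input and dropping the first gives the target, each of length `SEQ_LENGTH`.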

Now we can preprocess our `raw_example_dataset`:

In [None]:
example_dataset = preprocess(raw_example_dataset)

## Compile the model and test on the preprocessed data

In order to evaluate our model, we need to compile it with a loss function and metrics.

Furthermore, to measure char-level accuracy, we need to define a new metric class. Char-level accuracy is the fraction of predictions where the highest probability was put on the correct next char.

In [None]:
class FlattenedCategoricalAccuracy(tf.keras.metrics.SparseCategoricalAccuracy):

  def __init__(self, name='accuracy', dtype=tf.float32):
    super().__init__(name, dtype=dtype)

  def update_state(self, y_true, y_pred, sample_weight=None):
    y_true = tf.reshape(y_true, [-1, 1])
    y_pred = tf.reshape(y_pred, [-1, len(vocab), 1])
    return super().update_state(y_true, y_pred, sample_weight)
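
The metric above flattens the batch and time dimensions and then counts how often the arg-max prediction matches the true next char. The plain-Python sketch below computes the same quantity on nested lists; the function name `char_accuracy` and the tiny example inputs are assumptions made for illustration.

```python
def char_accuracy(y_true, logits):
    """Fraction of positions whose arg-max logit equals the true char id.

    y_true: nested list [batch][time] of integer char ids.
    logits: nested list [batch][time][vocab] of floats.
    """
    # Flatten batch and time, as the reshape calls in the metric do.
    flat_true = [t for seq in y_true for t in seq]
    flat_logits = [step for seq in logits for step in seq]
    correct = sum(
        1 for t, step in zip(flat_true, flat_logits)
        if max(range(len(step)), key=step.__getitem__) == t)
    return correct / len(flat_true)


# Two sequences of two time steps each, over a made-up 3-char vocabulary.
example_true = [[0, 2], [1, 1]]
example_logits = [
    [[3.0, 1.0, 0.0], [0.0, 1.0, 2.0]],
    [[0.0, 0.5, 0.2], [2.0, 1.0, 0.0]],
]
print(char_accuracy(example_true, example_logits))
```

In this example three of the four positions are predicted correctly; only the last position, where the arg-max is 0 but the true id is 1, counts as a miss.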

Now we compile the model and evaluate it on our `example_dataset`. After that, we compare this accuracy with the accuracy on completely random data.

In [None]:
BATCH_SIZE = 8

# Load the model into keras_model
keras_model = load_model(batch_size=BATCH_SIZE)

# compile our keras_model
keras_model.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=[FlattenedCategoricalAccuracy()])

# Compute loss and accuracy on an example Shakespeare character
loss, accuracy = keras_model.evaluate(example_dataset.take(5), verbose=0)
print('Evaluating on an example Shakespeare character:')
print('Loss:', loss)
print('Accuracy:', accuracy)

# Compare with the accuracy on completely random data
random_indexes = np.random.randint(
    low=0, high=len(vocab), size=1 * BATCH_SIZE * (SEQ_LENGTH + 1))
data = collections.OrderedDict(snippets=tf.constant(
    ''.join(np.array(vocab)[random_indexes]), shape=[1, 1]))
random_dataset = preprocess(tf.data.Dataset.from_tensor_slices(data))

random_guessed_accuracy = 1.0 / len(vocab)
print('Expected accuracy for random guessing:', random_guessed_accuracy)

loss, accuracy = keras_model.evaluate(random_dataset, steps=10, verbose=0)
print('Evaluating on completely random data:', accuracy)

Evaluating on an example Shakespeare character:
Loss: 3.253553628921509
Accuracy: 0.41275
Expected accuracy for random guessing: 0.011627906976744186




Evaluating on completely random data: 0.012
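
For context on these numbers, guessing uniformly at random is correct with probability `1 / V`, and a predictor that assigns equal probability to every char has cross-entropy `ln V`, where `V` is the vocabulary size. The sketch below computes both baselines; `VOCAB_SIZE = 86` is an assumption inferred from the printed value of `1.0 / len(vocab)` above.

```python
import math

# Assumption: vocabulary size implied by the printed 1 / len(vocab) above.
VOCAB_SIZE = 86

# Probability of a correct guess when choosing a char uniformly at random.
uniform_accuracy = 1.0 / VOCAB_SIZE
# Cross-entropy (in nats) of a uniform distribution over the vocabulary.
uniform_loss = math.log(VOCAB_SIZE)

print('Uniform-guess accuracy:', uniform_accuracy)
print('Uniform-guess loss:', uniform_loss)
```

The uniform loss comes out at roughly 4.45, so the loss reported above for the Shakespeare example is already below this baseline before any federated training.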


## Fine-tune the model with Federated Learning

To connect to the TFF Core layer, we need to provide a function that TFF can call, so that it can introduce our model into a graph that TFF Core controls.

To do so, we need to clone our `keras_model` inside a function called `create_tff_model()`, which TFF will call to produce a new copy of the model inside the graph that it will serialize. It is important to construct all the necessary objects we will need inside this function.

In [None]:
def create_tff_model():
  # TFF uses a `dummy_batch` so it knows the types and shapes that your model expects.
  x = np.random.randint(1, len(vocab), size=[BATCH_SIZE, SEQ_LENGTH])
  dummy_batch = collections.OrderedDict(x=x, y=x)
  keras_model_clone = tf.keras.models.clone_model(keras_model)
  return tff.learning.from_keras_model(
      keras_model_clone,
      dummy_batch=dummy_batch,
      loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
      metrics=[FlattenedCategoricalAccuracy()])

Now everything is ready to construct a Federated Averaging iterative process and fine-tune our pre-trained model.

Note that, for the reasons explained in the report, we feed the final weights back into the original Keras model. This lets us use a compiled Keras model to perform standard evaluation after each round of federated training.

In [None]:
# This command builds all the TensorFlow graphs and serializes them: 
fed_avg = tff.learning.build_federated_averaging_process(
    model_fn=create_tff_model,
    # `learning_rate` is the current name of the deprecated `lr` argument.
    client_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=0.7),
    client_weight_fn=lambda _: tf.constant(1.0),
    server_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=0.9))
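
Conceptually, each round of Federated Averaging lets every selected client train locally, averages the resulting weight changes, and applies that average on the server. The sketch below is a simplified illustration over flat lists of floats, not TFF's implementation: local training is left out, `fed_avg_round` is a hypothetical name, and the unweighted mean is meant to correspond to the constant `client_weight_fn` used above.

```python
def fed_avg_round(global_weights, client_weights_list, server_lr=1.0):
    """One simplified Federated Averaging round over flat lists of floats.

    Each client's update is its locally trained weights minus the global
    weights. The server takes the unweighted mean of these updates and
    moves the global weights in that direction, scaled by server_lr.
    """
    num_clients = len(client_weights_list)
    mean_delta = [
        sum(cw[i] - g for cw in client_weights_list) / num_clients
        for i, g in enumerate(global_weights)]
    return [g + server_lr * d for g, d in zip(global_weights, mean_delta)]


# Made-up numbers: one global model and two clients after local training.
global_w = [0.0, 1.0]
clients_w = [[1.0, 1.0], [3.0, 3.0]]

print(fed_avg_round(global_w, clients_w, server_lr=1.0))
print(fed_avg_round(global_w, clients_w, server_lr=0.5))
```

With `server_lr=1.0` the new global weights are simply the average of the client weights; a smaller server learning rate, such as the 0.9 used above, moves only part of the way towards that average.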









Now that we have built all the TensorFlow graphs and serialized them, we write our training and evaluation loop. Note that, in order to accelerate our training step, we select three clients, and train our model only on them. In other words, we will overfit our model on these three clients.

In [None]:
def data(client, source=train_data):
  return preprocess(
      source.create_tf_dataset_for_client(client)).take(5)

# our three selected clients
clients = ['ALL_S_WELL_THAT_ENDS_WELL_CELIA',
           'MUCH_ADO_ABOUT_NOTHING_OTHELLO',
           'THE_TRAGEDY_OF_KING_LEAR_KING']

train_datasets = [data(client) for client in clients]

# We concatenate the test datasets for evaluation with Keras.
test_dataset = functools.reduce(
    lambda d1, d2: d1.concatenate(d2),
    [data(client, test_data) for client in clients])

Because `clone_model()` does not clone the weights, and because we want the initial state produced by `fed_avg.initialize()` to start from the pre-trained weights, we set the model weights in the server state directly from the loaded model.

In [None]:
NUM_ROUNDS = 30

# The state of the FL server, containing the model and optimization state.
state = fed_avg.initialize()

state = tff.learning.state_with_new_model_weights(
    state,
    trainable_weights=[v.numpy() for v in keras_model.trainable_weights],
    non_trainable_weights=[v.numpy() for v in keras_model.non_trainable_weights])


def keras_evaluate(state, round_num):
  # Take our global model weights and push them back into a Keras model to
  # use its standard `.evaluate()` method.
  keras_model = load_model(batch_size=BATCH_SIZE)
  keras_model.compile(
      loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=[FlattenedCategoricalAccuracy()])
  tff.learning.assign_weights_to_keras_model(keras_model, state.model)
  loss, accuracy = keras_model.evaluate(example_dataset, steps=2, verbose=0)
  print('\tEval: loss={l:.3f}, accuracy={a:.3f}'.format(l=loss, a=accuracy))

for round_num in range(NUM_ROUNDS):
  print('Round {r}'.format(r=round_num + 1))
  keras_evaluate(state, round_num)
  state, metrics = fed_avg.next(state, train_datasets)
  print('\tTrain: loss={l:.3f}, accuracy={a:.3f}'.format(
      l=metrics.loss, a=metrics.accuracy))

keras_evaluate(state, NUM_ROUNDS + 1)

Round 1
	Eval: loss=3.085, accuracy=0.417
	Train: loss=3.138, accuracy=0.420
Round 2
	Eval: loss=2.825, accuracy=0.453
	Train: loss=2.700, accuracy=0.460
Round 3
	Eval: loss=2.394, accuracy=0.460
	Train: loss=2.430, accuracy=0.477
Round 4
	Eval: loss=2.046, accuracy=0.525
	Train: loss=2.219, accuracy=0.512
Round 5
	Eval: loss=2.036, accuracy=0.520
	Train: loss=2.148, accuracy=0.511
Round 6
	Eval: loss=1.783, accuracy=0.576
	Train: loss=2.045, accuracy=0.528
Round 7
	Eval: loss=1.714, accuracy=0.575
	Train: loss=1.950, accuracy=0.537
Round 8
	Eval: loss=1.686, accuracy=0.570
	Train: loss=1.858, accuracy=0.562
Round 9
	Eval: loss=1.571, accuracy=0.588
	Train: loss=1.726, accuracy=0.576
Round 10
	Eval: loss=1.635, accuracy=0.596
	Train: loss=1.701, accuracy=0.577
Round 11
	Eval: loss=1.546, accuracy=0.611
	Train: loss=1.619, accuracy=0.592
Round 12
	Eval: loss=1.350, accuracy=0.640
	Train: loss=1.506, accuracy=0.619
Round 13
	Eval: loss=1.346, accuracy=0.676
	Train: loss=1.461, accuracy=0

We can test the fine-tuned model by calling `generate_text()` with a Keras model and a start string as inputs. Note that text generation requires `batch_size=1`.

In [None]:
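# The global `keras_model` still holds the pre-trained weights, because
# `keras_evaluate()` only updates its own local copy. Push the trained server
# weights into it first, then copy them into the batch_size=1 model.
tff.learning.assign_weights_to_keras_model(keras_model, state.model)
keras_model_batch1.set_weights([v.numpy() for v in keras_model.weights])
# Text generation requires batch_size=1
print(generate_text(keras_model_batch1, 'What of TensorFlow Federated, you ask? '))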

What of TensorFlow Federated, you ask? Shall I

"Tell me what is it."

"She had an impact feelly in a traband, and came running at emerge carlied with
visible besides anything good remembran every head. My mother in the old law, seeme


Based on what we have done, we expect the generated text to resemble the data from our three chosen clients. Selecting more clients, training for more rounds, and changing the clients between rounds should give better results.