<a href="https://colab.research.google.com/github/ElnathanTiokou/federated-Amine/blob/master/Copy_of_Tensorflow_federated_framewrok_1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

##### Copyright 2020 The TensorFlow Authors.

In [1]:
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://www.tensorflow.org/federated/tutorials/building_your_own_federated_learning_algorithm"><img src="https://www.tensorflow.org/images/tf_logo_32px.png" />View on TensorFlow.org</a>
  </td>
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/tensorflow/federated/blob/v0.21.0/docs/tutorials/building_your_own_federated_learning_algorithm.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/tensorflow/federated/blob/v0.21.0/docs/tutorials/building_your_own_federated_learning_algorithm.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
  <td>
    <a href="https://storage.googleapis.com/tensorflow_docs/federated/docs/tutorials/building_your_own_federated_learning_algorithm.ipynb"><img src="https://www.tensorflow.org/images/download_logo_32px.png" />Download notebook</a>
  </td>
</table>

## Environment installation

Before we start, let's set up the environment.

In [2]:

!pip install --quiet  tensorflow-federated==0.20.0   # The latest version of tensorflow-federated is not working with the colab python version
!pip install --quiet --upgrade nest-asyncio

import nest_asyncio
nest_asyncio.apply()

In [3]:
import tensorflow as tf
import tensorflow_federated as tff

# Federated Learning Algorithm


In this section, we aim to use the Federated Core to implement Federated Averaging directly.


## Preparing the input data
Since real-world federated data is not easily accessible, we load and preprocess the EMNIST dataset included in TFF. The loader returns it already split into two parts: training data and test data.

In [4]:
emnist_train, emnist_test = tff.simulation.datasets.emnist.load_data()

Downloading emnist_all.sqlite.lzma: 100%|██████████| 170507172/170507172 [00:55<00:00, 3040350.48it/s]


In order to feed the dataset into our model, we batch the data, flatten each image, and convert each batch into a tuple of the form (flattened images, labels).

In [5]:
NUM_CLIENTS = 100    # Number of selected clients, simulating a cross-device setting
BATCH_SIZE = 20

def preprocess(dataset):

  def batch_format_fn(element):
    """Flatten a batch of EMNIST data and return a (features, label) tuple."""
    return (tf.reshape(element['pixels'], [-1, 784]), 
            tf.reshape(element['label'], [-1, 1]))

  return dataset.batch(BATCH_SIZE).map(batch_format_fn)
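
To make the batching and flattening concrete without TensorFlow, here is a minimal plain-Python sketch of the same idea. The names `toy_flatten_image` and `toy_make_batches` and the fake data are illustrative assumptions, not part of TFF or of this notebook.

```python
# Framework-free sketch of "batch, then flatten". All names are illustrative.
TOY_BATCH_SIZE = 20


def toy_flatten_image(image):
    """Turn a 28x28 nested list into a flat list of 784 pixel values."""
    return [pixel for row in image for pixel in row]


def toy_make_batches(examples, batch_size=TOY_BATCH_SIZE):
    """Group (image, label) pairs into (features, labels) batches.

    The last batch may be smaller than batch_size, which matches the
    default behaviour of dataset.batch().
    """
    batches = []
    for start in range(0, len(examples), batch_size):
        chunk = examples[start:start + batch_size]
        features = [toy_flatten_image(image) for image, _ in chunk]
        labels = [[label] for _, label in chunk]  # shape [batch, 1]
        batches.append((features, labels))
    return batches


# 45 fake examples: blank 28x28 images with labels 0..9 repeating.
fake_examples = [([[0.0] * 28 for _ in range(28)], i % 10) for i in range(45)]
toy_batches = toy_make_batches(fake_examples)
toy_batch_sizes = [len(features) for features, _ in toy_batches]
```

With 45 examples and a batch size of 20, this yields two full batches and one partial batch, each holding 784-element feature rows.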

We now select the first `NUM_CLIENTS` clients (sorted by client ID), and apply the preprocessing above to their datasets.

In [6]:
client_ids = sorted(emnist_train.client_ids)[:NUM_CLIENTS]
federated_train_data = [preprocess(emnist_train.create_tf_dataset_for_client(x))
  for x in client_ids
]

## Preparing the model

To keep the analysis simple, we use a small Keras model. This model (implemented via `tf.keras`) has a single dense layer with 10 units, followed by a softmax layer.

In [7]:
def create_keras_model():
  initializer = tf.keras.initializers.GlorotNormal(seed=0)
  return tf.keras.models.Sequential([
      tf.keras.layers.Input(shape=(784,)),
      tf.keras.layers.Dense(10, kernel_initializer=initializer),
      tf.keras.layers.Softmax(),
  ])

In order to use this model in TFF, we wrap the Keras model as a [`tff.learning.Model`](https://www.tensorflow.org/federated/api_docs/python/tff/learning/Model). This allows us to perform the model's [forward pass](https://www.tensorflow.org/federated/api_docs/python/tff/learning/Model#forward_pass) within TFF, and [extract model outputs](https://www.tensorflow.org/federated/api_docs/python/tff/learning/Model#report_local_unfinalized_metrics). 

In [8]:
def model_fn():
  keras_model = create_keras_model()
  return tff.learning.from_keras_model(
      keras_model,
      input_spec=federated_train_data[0].element_spec,
      loss=tf.keras.losses.SparseCategoricalCrossentropy(),
      metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])

In [9]:
model_fn()

<tensorflow_federated.python.learning.keras_utils._KerasModel at 0x7f8d121b3c90>

# Implementation of the Federated Learning pipeline

The `tff.learning` API allows one to create many variants of Federated Averaging. Here, we instead assemble the algorithm ourselves so that each of its parts is visible.

In many cases, federated algorithms have 4 main components:

1. A server-to-client broadcast step.
2. A local client update step.
3. A client-to-server upload step.
4. A server update step.

In TFF, we generally represent federated algorithms as a [`tff.templates.IterativeProcess`](https://www.tensorflow.org/federated/api_docs/python/tff/templates/IterativeProcess) (which we refer to as just an `IterativeProcess` throughout). This is a class that contains `initialize` and `next` functions. Here, `initialize` is used to initialize the server, and `next` will perform one communication round of the federated algorithm. Let's write a skeleton of what our iterative process for FedAvg should look like.

First, we have an initialize function that simply creates a `tff.learning.Model`, and returns its trainable weights.

In [10]:
def initialize_fn():
  model = model_fn()
  return model.trainable_variables

This function looks good, but as we will see later, we will need to make a small modification to make it a "TFF computation".

We also want to sketch the `next_fn`.

In [11]:
def next_fn(server_weights, federated_dataset):
  # Broadcast the server weights to the clients.
  server_weights_at_client = broadcast(server_weights)

  # Each client computes their updated weights.
  client_weights = client_update(federated_dataset, server_weights_at_client)

  # The server averages these updates.
  mean_client_weights = mean(client_weights)

  # The server updates its model.
  server_weights = server_update(mean_client_weights)

  return server_weights
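
The skeleton above is not runnable yet, because `broadcast`, `client_update`, `mean`, and `server_update` are placeholders. As a rough illustration only, the following plain-Python sketch fills them in with toy stand-ins. The `toy_` functions and the local update rule (moving halfway toward the client's data mean) are assumptions made for this example, not TFF API and not the training procedure used later.

```python
# Plain-Python stand-ins for the four steps. None of this is TFF API.
def toy_broadcast(server_weights, num_clients):
    """Step 1: every client receives its own copy of the server weights."""
    return [list(server_weights) for _ in range(num_clients)]


def toy_client_update(dataset, weights):
    """Step 2 (toy rule): move each weight halfway toward the data mean."""
    target = sum(dataset) / len(dataset)
    return [w + 0.5 * (target - w) for w in weights]


def toy_mean(client_weights):
    """Step 3: average the uploaded weights coordinate by coordinate."""
    num_clients = len(client_weights)
    return [sum(ws) / num_clients for ws in zip(*client_weights)]


def toy_server_update(mean_client_weights):
    """Step 4: vanilla FedAvg replaces the server weights by the average."""
    return list(mean_client_weights)


def toy_next_fn(server_weights, federated_dataset):
    weights_at_clients = toy_broadcast(server_weights, len(federated_dataset))
    client_weights = [
        toy_client_update(dataset, weights)
        for dataset, weights in zip(federated_dataset, weights_at_clients)
    ]
    return toy_server_update(toy_mean(client_weights))


toy_federated_dataset = [[1.0, 3.0], [4.0], [6.0, 8.0]]  # one list per client
toy_new_weights = toy_next_fn([0.0, 0.0], toy_federated_dataset)
```

Each call to `toy_next_fn` corresponds to one communication round: the three clients end up at 1.0, 2.0, and 3.5, and the server keeps their average.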

We'll focus on implementing these four components separately. We first focus on the parts that can be implemented in pure TensorFlow, namely the client and server update steps.


## TensorFlow Blocks 

### Client update

We will use our `tff.learning.Model` to do client training in essentially the same way you would train a TensorFlow model. In particular, we will use `tf.GradientTape` to compute the gradient on batches of data, then apply these gradients using a `client_optimizer`. We focus only on the trainable weights.


In [12]:
@tf.function
def client_update(model, dataset, server_weights, client_optimizer):
  """Performs training (using the server model weights) on the client's dataset."""
  # Initialize the client model with the current server weights.
  client_weights = model.trainable_variables
  # Assign the server weights to the client model.
  tf.nest.map_structure(lambda x, y: x.assign(y),
                        client_weights, server_weights)

  # Use the client_optimizer to update the local model.
  for batch in dataset:
    with tf.GradientTape() as tape:
      # Compute a forward pass on the batch of data
      outputs = model.forward_pass(batch)

    # Compute the corresponding gradient
    grads = tape.gradient(outputs.loss, client_weights)
    grads_and_vars = zip(grads, client_weights)

    # Apply the gradient using a client optimizer.
    client_optimizer.apply_gradients(grads_and_vars)

  return client_weights
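
The control flow of the client update (start from the server weights, then take one gradient step per batch) can be hard to see through the TensorFlow details. The sketch below mirrors that loop in plain Python for a one-parameter model `y = w * x` with squared-error loss. The model, the data, and the name `toy_client_update` are assumptions chosen to keep the arithmetic checkable by hand.

```python
def toy_client_update(dataset, server_weight, learning_rate=0.05):
    """One pass of mini-batch SGD on a one-parameter model y = w * x."""
    w = server_weight  # start from the weight broadcast by the server
    for batch in dataset:
        # Gradient of the mean squared error (w * x - y)**2 with respect to w.
        grad = sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)
        # Plain SGD step, playing the role of client_optimizer.apply_gradients.
        w = w - learning_rate * grad
    return w


# Two batches of (x, y) pairs generated by the true relation y = 2 * x.
toy_dataset = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]
toy_updated_weight = toy_client_update(toy_dataset, server_weight=0.0)
```

After the first batch the weight moves from 0.0 to 0.5, and after the second it reaches 1.85, approaching the true value of 2.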

### Server Update

The server update for FedAvg is simpler than the client update. We will implement "vanilla" federated averaging, in which we simply replace the server model weights by the average of the client model weights. Again, we only focus on the trainable weights.

In [13]:
@tf.function
def server_update(model, mean_client_weights):
  """Updates the server model weights as the average of the client model weights."""
  model_weights = model.trainable_variables
  # Assign the mean client weights to the server model.
  tf.nest.map_structure(lambda x, y: x.assign(y),
                        model_weights, mean_client_weights)
  return model_weights

The snippet could be simplified by simply returning the `mean_client_weights`. However, more advanced implementations of Federated Averaging use `mean_client_weights` with more sophisticated techniques, such as momentum or adaptivity.

**Challenge**: Implement a version of `server_update` that updates the server weights to be the midpoint of model_weights and mean_client_weights. (Note: This kind of "midpoint" approach is analogous to recent work on the [Lookahead optimizer](https://arxiv.org/abs/1907.08610)!).
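
One possible way to think about the challenge, sketched in plain Python rather than TensorFlow: treat the server update as an interpolation between the current server weights and the mean client weights. The function name and the `server_learning_rate` parameter are hypothetical. A value of 0.5 gives the midpoint, and 1.0 recovers the "vanilla" replacement described above. In the TensorFlow version, this would also require the current server weights to be available inside `server_update`.

```python
def toy_interpolating_server_update(model_weights, mean_client_weights,
                                    server_learning_rate=0.5):
    """Move the server weights part of the way toward the client average."""
    return [
        w + server_learning_rate * (m - w)
        for w, m in zip(model_weights, mean_client_weights)
    ]


toy_midpoint = toy_interpolating_server_update([0.0, 2.0], [1.0, 4.0])
toy_vanilla = toy_interpolating_server_update(
    [0.0, 2.0], [1.0, 4.0], server_learning_rate=1.0)
```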

So far, we've only written pure TensorFlow code. This is by design, as TFF allows you to use much of the TensorFlow code you're already familiar with. However, now we have to specify the **orchestration logic**, that is, the logic that dictates what the server broadcasts to the client, and what the client uploads to the server.

This will require the *Federated Core* of TFF.

# Building your own Federated Learning algorithm, revisited

Using the Federated Core, we can now build our own federated learning algorithm. Remember that above, we defined an `initialize_fn` and `next_fn` for our algorithm. The `next_fn` will make use of the `client_update` and `server_update` we defined using pure TensorFlow code.

However, in order to make our algorithm a federated computation, we will need both the `next_fn` and `initialize_fn` to each be a `tff.federated_computation`.

## TensorFlow Federated blocks 

### Creating the initialization computation

The initialize function will be quite simple: We will create a model using `model_fn`. However, remember that we must separate out our TensorFlow code using `tff.tf_computation`.

In [14]:
@tff.tf_computation
def server_init():
  model = model_fn()
  return model.trainable_variables

We can then pass this directly into a federated computation using `tff.federated_value`.

In [15]:
@tff.federated_computation
def initialize_fn():
  return tff.federated_value(server_init(), tff.SERVER)

### Creating the `next_fn`

We now use our client and server update code to write the actual algorithm. We will first turn our `client_update` into a `tff.tf_computation` that accepts a client dataset and server weights, and outputs the updated client weights.

We will need the corresponding types to properly decorate our function. The dataset type follows from the model's input spec: remember that we took 28 by 28 images (with integer labels) and flattened them. The type of the server weights can be extracted directly from our model by using the `server_init` function above.

In [16]:

model_weights_type = server_init.type_signature.result
tf_dataset_type = tff.SequenceType(model_fn().input_spec)

By examining the type signature, we can see the architecture of our model.

In [17]:
str(model_weights_type)

'<float32[784,10],float32[10]>'
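
As a quick sanity check on that signature, the sketch below counts the trainable parameters it describes. Parsing the printed string with a regular expression is only an illustration for this particular output, and `toy_shapes_from_signature` is a hypothetical helper. It is not how one would normally inspect TFF types.

```python
import re
from math import prod


def toy_shapes_from_signature(signature):
    """Extract tensor shapes from a compact type string such as the one above."""
    return [
        tuple(int(dim) for dim in dims.split(','))
        for dims in re.findall(r'\[([\d,]+)\]', signature)
    ]


toy_shapes = toy_shapes_from_signature('<float32[784,10],float32[10]>')
# 784 * 10 kernel entries plus 10 biases.
toy_num_parameters = sum(prod(shape) for shape in toy_shapes)
```

The two shapes correspond to the kernel and bias of the single dense layer, for 7850 trainable parameters in total.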

We can now create our `tff.tf_computation` for the client update.

In [18]:
@tff.tf_computation(tf_dataset_type, model_weights_type)
def client_update_fn(tf_dataset, server_weights):
  model = model_fn()
  client_optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
  return client_update(model, tf_dataset, server_weights, client_optimizer)

The `tff.tf_computation` version of the server update can be defined in a similar way, using types we've already extracted.

In [19]:
@tff.tf_computation(model_weights_type)
def server_update_fn(mean_client_weights):
  model = model_fn()
  return server_update(model, mean_client_weights)

Last, but not least, we need to create the `tff.federated_computation` that brings this all together. This function will accept two *federated values*, one corresponding to the server weights (with placement `tff.SERVER`), and the other corresponding to the client datasets (with placement `tff.CLIENTS`).

Note that both these types were defined above! We simply need to give them the proper placement using `tff.FederatedType`.

In [20]:
federated_server_type = tff.FederatedType(model_weights_type, tff.SERVER)
federated_dataset_type = tff.FederatedType(tf_dataset_type, tff.CLIENTS)

Remember the 4 elements of an FL algorithm?

1. A server-to-client broadcast step.
2. A local client update step.
3. A client-to-server upload step.
4. A server update step.

Now that we've built up the above, each part can be compactly represented as a single line of TFF code. This simplicity is why we had to take extra care to specify things such as federated types!

In [21]:
@tff.federated_computation(federated_server_type, federated_dataset_type)
def next_fn(server_weights, federated_dataset):
  # Broadcast the server weights to the clients.
  server_weights_at_client = tff.federated_broadcast(server_weights)

  # Each client computes their updated weights.
  client_weights = tff.federated_map(
      client_update_fn, (federated_dataset, server_weights_at_client))
  
  # The server averages these updates.
  mean_client_weights = tff.federated_mean(client_weights)

  # The server updates its model.
  server_weights = tff.federated_map(server_update_fn, mean_client_weights)

  return server_weights
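
Note that `tff.federated_mean`, as called above, gives every client the same weight. Federated Averaging is often described with each client weighted by its number of local examples instead, and `tff.federated_mean` takes an optional `weight` argument that can be used for this. The plain-Python sketch below contrasts the two averages. The function names and numbers are illustrative assumptions.

```python
def toy_unweighted_mean(client_weights):
    """Average client weight vectors, treating all clients equally."""
    num_clients = len(client_weights)
    return [sum(ws) / num_clients for ws in zip(*client_weights)]


def toy_weighted_mean(client_weights, num_examples):
    """Average client weight vectors, weighted by local example counts."""
    total = sum(num_examples)
    return [
        sum(n * w for n, w in zip(num_examples, ws)) / total
        for ws in zip(*client_weights)
    ]


toy_client_weights = [[1.0, 0.0], [3.0, 4.0]]
toy_num_examples = [10, 30]  # the second client holds three times more data

toy_unweighted = toy_unweighted_mean(toy_client_weights)
toy_weighted = toy_weighted_mean(toy_client_weights, toy_num_examples)
```

The weighted average is pulled toward the client with more data, which matters when client dataset sizes differ widely.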

We now have a `tff.federated_computation` for both the algorithm initialization, and for running one step of the algorithm. To finish our algorithm, we pass these into `tff.templates.IterativeProcess`.

In [22]:
federated_algorithm = tff.templates.IterativeProcess(
    initialize_fn=initialize_fn,
    next_fn=next_fn
)

In the resulting process, `federated_algorithm.initialize` is a no-arg function that returns a single-layer model (with a 784-by-10 weight matrix, and 10 bias units) placed at the server. Let's look at the type signature of `federated_algorithm.next`.

In [23]:
str(federated_algorithm.next.type_signature)

'(<server_weights=<float32[784,10],float32[10]>@SERVER,federated_dataset={<float32[?,784],int32[?,1]>*}@CLIENTS> -> <float32[784,10],float32[10]>@SERVER)'

Here, we see that `federated_algorithm.next` accepts a server model and client data, and returns an updated server model.
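
The calling pattern this implies (call `initialize` once, then feed the returned state back into `next` each round) can be sketched without TFF. `ToyIterativeProcess` below is a hypothetical stand-in that only mimics the `initialize`/`next` contract. It is not `tff.templates.IterativeProcess`, and the toy update rule is an assumption.

```python
class ToyIterativeProcess:
    """Hypothetical stand-in that mimics the initialize/next contract."""

    def __init__(self, initialize_fn, next_fn):
        self.initialize = initialize_fn
        self.next = next_fn


def toy_initialize_fn():
    return 0.0  # a single scalar "model weight"


def toy_next_fn(server_weight, federated_dataset):
    # Toy local update: each client moves halfway toward its data mean.
    client_weights = [
        server_weight + 0.5 * (sum(data) / len(data) - server_weight)
        for data in federated_dataset
    ]
    # Vanilla FedAvg: the server adopts the average of the client weights.
    return sum(client_weights) / len(client_weights)


toy_process = ToyIterativeProcess(toy_initialize_fn, toy_next_fn)
toy_data = [[1.0, 3.0], [6.0]]  # client data means are 2.0 and 6.0

toy_state = toy_process.initialize()
toy_history = [toy_state]
for _ in range(3):
    toy_state = toy_process.next(toy_state, toy_data)
    toy_history.append(toy_state)
```

The state moves 0.0, 2.0, 3.0, 3.5 over three rounds, approaching the average of the client means (4.0).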

## Evaluating the algorithm

Let's run a few rounds, and see how the loss changes. First, we will define an evaluation function using the *centralized* approach: we create a centralized evaluation dataset, and then apply the same preprocessing we used for the training data.

In [24]:
central_emnist_test = emnist_test.create_tf_dataset_from_all_clients()
central_emnist_test = preprocess(central_emnist_test)

Next, we write a function that accepts a server state, and uses Keras to evaluate on the test dataset. If you're familiar with `tf.keras`, this will all look familiar, though note the use of `set_weights`!

In [25]:
def evaluate(server_state):
  keras_model = create_keras_model()
  keras_model.compile(
      loss=tf.keras.losses.SparseCategoricalCrossentropy(),
      metrics=[tf.keras.metrics.SparseCategoricalAccuracy()]  
  )
  keras_model.set_weights(server_state)
  # Return the [loss, accuracy] values so that callers can store the result.
  return keras_model.evaluate(central_emnist_test)
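
To help interpret the two numbers that the evaluation reports, here is a minimal plain-Python sketch of what sparse categorical cross-entropy and sparse categorical accuracy compute. The `toy_` functions are simplified stand-ins written for this example, not the Keras implementations.

```python
import math


def toy_sparse_categorical_crossentropy(probabilities, labels):
    """Mean negative log-probability assigned to the true class."""
    losses = [-math.log(p[y]) for p, y in zip(probabilities, labels)]
    return sum(losses) / len(losses)


def toy_sparse_categorical_accuracy(probabilities, labels):
    """Fraction of examples whose most probable class equals the label."""
    hits = 0
    for p, y in zip(probabilities, labels):
        predicted = max(range(len(p)), key=lambda i: p[i])  # argmax
        hits += int(predicted == y)
    return hits / len(labels)


# A model that outputs a uniform distribution over 10 classes,
# evaluated on one example of each class.
toy_uniform = [[0.1] * 10 for _ in range(10)]
toy_labels = list(range(10))

toy_loss = toy_sparse_categorical_crossentropy(toy_uniform, toy_labels)
toy_accuracy = toy_sparse_categorical_accuracy(toy_uniform, toy_labels)
```

A model that guesses uniformly over 10 classes has a loss of ln(10), about 2.30, and an accuracy of 0.1 here, which gives a baseline for reading the results.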

Now, let's initialize our algorithm and evaluate on the test set.

In [26]:
server_state = federated_algorithm.initialize()
evaluate(server_state)



Let's train for a few rounds.

The more rounds we run, the more the loss decreases. In our runs, 15 rounds gave an accuracy of 0.09 and 30 rounds gave an accuracy of 0.104. We also tried 100 rounds, which took more than 1 hour to run and gave an accuracy of 0.3.

Note that with 10 output classes, an accuracy around 0.1 is close to chance level, so the model has barely started to learn after 30 rounds.

In [27]:
for round_num in range(30):
  server_state = federated_algorithm.next(server_state, federated_train_data)

In [33]:
result = evaluate(server_state)



At this point, let's stop and think about what we've accomplished. We've implemented Federated Averaging directly by combining pure TensorFlow code (for the client and server updates) with federated computations from the Federated Core of TFF.



# Implementation or Simulation of Attacks

---


In [None]:
from attacks.data_attacker import LabelAttacker, NoiseMutator, OverlapMutator, DeleteMutator, UnbalanceMutator
from fed.aggregators import MedianAggregator, FedAvgAggregator, KrumAggregator, TrimmedMeanAggregator
from config import get_model, load_data, get_num_round, get_param, throw_conf_error, get_result_dir
from fed.federated import FedTester
from attacks.model_attacker import SignFlipModelAttacker, BackdoorAttack, RandomModelAttacker
from utils.util import ADNIDataset, Dataset
import numpy as np

In [None]:
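# NOTE: `dataset`, `get_aggregator`, and `get_attacks` are not defined or
# imported in this notebook; they must be provided by the surrounding project.
fed_tester = FedTester(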
    get_model,
    dataset,
    get_aggregator(),
    **get_attacks()
)