# Federated Learning (FL) with Google Cloud, Flower, and MedMNIST

In this tutorial we run a simple FL training job using the Flower framework, the MedMNIST dataset, and Google Cloud, where we create three virtual machines: two act as clients and one as the server.
For background on FL, the following reading is recommended: https://flower.dev/docs/tutorial/Flower-0-What-is-FL.html.

## Dataset
PneumoniaMNIST: https://medmnist.com/ <br>
How to obtain the data: https://github.com/MedMNIST/MedMNIST/blob/main/examples/getting_started_without_PyTorch.ipynb
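
As a rough sketch of loading the data without PyTorch, the snippet below reads a MedMNIST-style `.npz` archive with NumPy only. It assumes the archive stores its arrays under the keys `train_images`, `train_labels`, `val_images`, `val_labels`, `test_images`, and `test_labels`. The helper name `load_npz_splits` is hypothetical, and this project's own `utils.load_data` may work differently.

```python
import os
import tempfile

import numpy as np


def load_npz_splits(path):
    """Load (images, labels) pairs for the train/val/test splits of a .npz archive."""
    splits = {}
    with np.load(path) as data:
        for name in ("train", "val", "test"):
            # Assumed key layout: "<split>_images" and "<split>_labels".
            images = data[f"{name}_images"].astype("float32") / 255.0  # scale to [0, 1]
            labels = data[f"{name}_labels"].reshape(-1)  # (N, 1) -> (N,)
            splits[name] = (images, labels)
    return splits


if __name__ == "__main__":
    # Demonstration on a tiny synthetic archive, not on the real dataset.
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.npz")
    fake = {}
    for split, n in (("train", 8), ("val", 2), ("test", 4)):
        fake[f"{split}_images"] = np.zeros((n, 28, 28), dtype=np.uint8)
        fake[f"{split}_labels"] = np.zeros((n, 1), dtype=np.uint8)
    np.savez(demo_path, **fake)
    for split, (x, y) in load_npz_splits(demo_path).items():
        print(split, x.shape, y.shape)
```

To use it on the real data, pass the path of the downloaded `pneumoniamnist.npz` file instead of the synthetic one.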

## Configuration used
120 GB storage <br>
8 GB RAM <br>
OS: debian-11-bullseye-v20230206 <br>
Firewall rules (ingress):
- HTTP and HTTPS
- tcp 5000: connection to Jupyter Notebook
- tcp 8080: access to the server
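
Before starting a training run, it can help to confirm that the firewall rules above actually let traffic through. The snippet below is a minimal sketch using only the Python standard library; `is_port_open` is a hypothetical helper name, and the demonstration probes a local listener rather than a real VM.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, name resolution failure
        return False


if __name__ == "__main__":
    # Local demonstration: listen on an ephemeral port, then probe it.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        listener.listen(1)
        demo_port = listener.getsockname()[1]
        print(is_port_open("127.0.0.1", demo_port))
```

From a client VM, the same function could be called with the server's external IP and port 8080 once the server process is running. A closed port and a blocked port both return `False`, so a failure does not by itself identify the firewall as the cause.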

## Tutorial

The following steps must be executed on every machine. To begin, install git and clone this project's repository:

- <code>sudo apt-get install git</code>

- <code>git clone https://github.com/thborba/federated-learning.git</code>

Install Jupyter Notebook:

- <code>sudo apt-get install python3-pip</code>

- <code>python3 -m pip install jupyter</code>

- <code>export PATH=$PATH:~/.local/bin</code>

The following configuration is required to access the notebook from another computer:

- <code>jupyter notebook --generate-config</code>

- <code>nano ~/.jupyter/jupyter_notebook_config.py</code>
    
    Copy and paste the following text:

        c = get_config()
        c.NotebookApp.ip = '*'
        c.NotebookApp.open_browser = False
    
    Press CTRL + X, then Y, then Enter to save.

Start Jupyter:

- <code>jupyter notebook --port=5000</code>

## Installing dependencies

In [None]:
!python3 -m pip install -r requirements.txt

## Running the server

In [None]:
from typing import Dict, Optional, Tuple
import flwr as fl
import utils

SERVER_ADDRESS = "[::]:8080"  # listen on all interfaces, so the server is reachable via both its internal and external IP

def main() -> None:
    model = utils.get_model()
    model.compile("adam", "binary_crossentropy", metrics=["accuracy"])

    strategy = fl.server.strategy.FedAvg(
        fraction_fit=0.2,
        fraction_evaluate=0.2,
        min_fit_clients=2,
        min_evaluate_clients=2,
        min_available_clients=2,
        evaluate_fn=get_evaluate_fn(model),
        on_fit_config_fn=fit_config,
        on_evaluate_config_fn=evaluate_config,
        initial_parameters=fl.common.ndarrays_to_parameters(model.get_weights()),
    )

    fl.server.start_server(
        server_address=SERVER_ADDRESS,
        config=fl.server.ServerConfig(num_rounds=4),
        strategy=strategy,
    )


def get_evaluate_fn(model):
    """Return an evaluation function for server-side evaluation."""
    _, (x_test, y_test) = utils.load_data()
    # The `evaluate` function will be called after every round
    def evaluate(
        server_round: int,
        parameters: fl.common.NDArrays,
        config: Dict[str, fl.common.Scalar],
    ) -> Optional[Tuple[float, Dict[str, fl.common.Scalar]]]:
        model.set_weights(parameters)  # Update model with the latest parameters
        loss, accuracy = model.evaluate(x_test, y_test)
        return loss, {"test_accuracy": accuracy}


    return evaluate


def fit_config(rnd: int):
    """Return training configuration dict for each round.

    Keep batch size fixed at 32, train for one local epoch in the first
    round, and increase to four local epochs from round two onwards.
    """
    config = {
        "batch_size": 32,
        "local_epochs": 1 if rnd < 2 else 4,
    }
    return config


def evaluate_config(rnd: int):
    """Return evaluation configuration dict for each round.

    Perform five local evaluation steps on each client (i.e., use five
    batches) during rounds one to three, then increase to ten local
    evaluation steps.
    """
    val_steps = 5 if rnd < 4 else 10
    return {"val_steps": val_steps}

main()
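
For intuition about the `FedAvg` strategy used above: after each round, the server combines the weights returned by the clients into a new global model by averaging them, weighted by the number of training examples each client reports. The following is a simplified NumPy sketch of that idea, not Flower's actual implementation; `federated_average` is a hypothetical helper name.

```python
import numpy as np


def federated_average(client_updates):
    """Average model weights across clients, weighted by example count.

    client_updates: list of (weights, num_examples) pairs, where `weights`
    is a list of NumPy arrays (one per layer) with matching shapes.
    """
    total_examples = sum(num_examples for _, num_examples in client_updates)
    num_layers = len(client_updates[0][0])
    return [
        sum(weights[i] * num_examples for weights, num_examples in client_updates)
        / total_examples
        for i in range(num_layers)
    ]


if __name__ == "__main__":
    # Two toy clients with a single one-layer "model" each.
    client_a = ([np.array([1.0, 1.0])], 100)  # 100 training examples
    client_b = ([np.array([3.0, 5.0])], 300)  # 300 training examples
    print(federated_average([client_a, client_b]))
```

Because the average is weighted, a client holding more data pulls the global model further toward its own weights than a client holding less.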

## Running the client

In [2]:
import os
import utils
import flwr as fl

# External IP and port of the server VM; replace with your own server's address
SERVER_ADDRESS = "35.234.149.156:8080"

# Make TensorFlow logs less verbose
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

# Define Flower client
class Client(fl.client.NumPyClient):
    def __init__(self, model, x_train, y_train, x_test, y_test):
        self.model = model
        self.x_train, self.y_train = x_train, y_train
        self.x_test, self.y_test = x_test, y_test

    def fit(self, parameters, config):
        """Train parameters on the locally held training set."""

        # Update local model parameters
        self.model.set_weights(parameters)

        # Get hyperparameters for this round
        batch_size: int = config["batch_size"]
        epochs: int = config["local_epochs"]

        # Train the model using hyperparameters from config
        history = self.model.fit(
            self.x_train,
            self.y_train,
            batch_size,
            epochs,
            validation_split=0.1,
        )

        # Return updated model parameters and results
        parameters_prime = self.model.get_weights()
        num_examples_train = len(self.x_train)
        results = {
            "loss": history.history["loss"][0],
            "accuracy": history.history["accuracy"][0],
            "val_loss": history.history["val_loss"][0],
            "val_accuracy": history.history["val_accuracy"][0],
        }
        return parameters_prime, num_examples_train, results

    def evaluate(self, parameters, config):
        """Evaluate parameters on the locally held test set."""

        # Update local model with global parameters
        self.model.set_weights(parameters)

        # Get config values
        steps: int = config["val_steps"]

        # Evaluate global model parameters on the local test data and return results
        loss, accuracy = self.model.evaluate(self.x_test, self.y_test, 32, steps=steps)
        num_examples_test = len(self.x_test)
        return loss, num_examples_test, {"test_accuracy": accuracy}


def main() -> None:
    model = utils.get_model()
    model.compile("adam", "binary_crossentropy", metrics=["accuracy"])

    (x_train, y_train), (x_test, y_test) = utils.load_data(1)
    client = Client(model, x_train, y_train, x_test, y_test)
    
    fl.client.start_numpy_client(
        server_address=SERVER_ADDRESS,
        client=client,
    )
    
main()

INFO flower 2023-02-28 01:03:30,772 | connection.py:102 | Opened insecure gRPC connection (no certificates were passed)
DEBUG flower 2023-02-28 01:03:30,774 | connection.py:39 | ChannelConnectivity.IDLE
DEBUG flower 2023-02-28 01:03:30,775 | connection.py:39 | ChannelConnectivity.CONNECTING


Using downloaded and verified file: /home/marcela_andrade/.medmnist/pneumoniamnist.npz
Using downloaded and verified file: /home/marcela_andrade/.medmnist/pneumoniamnist.npz


DEBUG flower 2023-02-28 01:03:31,281 | connection.py:39 | ChannelConnectivity.READY


Epoch 1/4
Epoch 2/4
Epoch 3/4
Epoch 4/4
Epoch 1/4
Epoch 2/4
Epoch 3/4
Epoch 4/4
Epoch 1/4
Epoch 2/4
Epoch 3/4
Epoch 4/4


DEBUG flower 2023-02-28 01:05:34,320 | connection.py:121 | gRPC channel closed
INFO flower 2023-02-28 01:05:34,321 | app.py:149 | Disconnect and shut down
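
The client above calls `utils.load_data(1)`, whose implementation is not shown here. If the argument selects which share of the data a client trains on (an assumption), one simple way to give each client a disjoint shard is to split the arrays into contiguous parts and pick one by index. The sketch below illustrates that idea only; `take_partition` and its parameters are hypothetical and may not match what `utils.load_data` does.

```python
import numpy as np


def take_partition(x, y, idx, num_partitions):
    """Return the idx-th of `num_partitions` contiguous, disjoint shards of (x, y)."""
    if not 0 <= idx < num_partitions:
        raise ValueError("idx must satisfy 0 <= idx < num_partitions")
    # array_split tolerates lengths that are not divisible by num_partitions.
    x_parts = np.array_split(x, num_partitions)
    y_parts = np.array_split(y, num_partitions)
    return x_parts[idx], y_parts[idx]


if __name__ == "__main__":
    # Toy data: ten "images" of shape (2, 2) with one label each.
    x = np.arange(40).reshape(10, 2, 2)
    y = np.arange(10)
    x_shard, y_shard = take_partition(x, y, idx=1, num_partitions=2)
    print(x_shard.shape, y_shard)
```

With two clients, one would use `idx=0` and the other `idx=1`, so no example is shared between them.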
