**Table of contents**<a id='toc0_'></a>    
- [Importing libraries](#toc1_)    
  - [Importing packages or modules from the OpenFL library](#toc1_1_)    
  - [Importing packages or modules from the PyTorch library](#toc1_2_)    
  - [Importing other required packages or modules](#toc1_3_)    
- [Defining the training model](#toc2_)    
  - [Defining the data loaders](#toc2_1_)    
  - [Defining the CNN model](#toc2_2_)    
  - [Defining the inference function used for testing](#toc2_3_)    
- [Defining the federated learning rules](#toc3_)    
  - [Method for averaging the federated learning weights](#toc3_1_)    


# <a id='toc1_'></a>[Importing libraries](#toc0_)

## <a id='toc1_1_'></a>[Importing packages or modules from the OpenFL library](#toc0_)


In [1]:
from openfl.experimental.workflow.interface import Aggregator, Collaborator, FLSpec
from openfl.experimental.workflow.placement import aggregator, collaborator
from openfl.experimental.workflow.runtime import LocalRuntime

## <a id='toc1_2_'></a>[Importing packages or modules from the PyTorch library](#toc0_)


In [2]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import DataLoader

from torchsummary import summary

from torchvision import datasets, transforms

## <a id='toc1_3_'></a>[Importing other required packages or modules](#toc0_)


In [3]:
from copy import deepcopy

import numpy as np

from termcolor import cprint

# <a id='toc2_'></a>[Defining the training model](#toc0_)

## <a id='toc2_1_'></a>[Defining the data loaders](#toc0_)


In [4]:
data_path = "/tmp/files/"

tensor_cifar10 = datasets.CIFAR10(
    data_path, train=True, download=True, transform=transforms.ToTensor()
)

tensor_images = torch.stack([tensor_image for tensor_image, _ in tensor_cifar10], dim=3)

tensor_images.shape

torch.Size([3, 32, 32, 50000])

In [5]:
tensor_mean = tensor_images.view(3, -1).mean(dim=1)
tensor_mean

tensor([0.4914, 0.4822, 0.4465])

In [6]:
tensor_std = tensor_images.view(3, -1).std(dim=1)
tensor_std

tensor([0.2470, 0.2435, 0.2616])
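
As a minimal, dependency-free sketch of the per-channel statistics computed above: flattening each channel across all images and taking its mean and sample standard deviation is what `view(3, -1).mean(dim=1)` and `.std(dim=1)` do (`torch.std` applies Bessel's correction by default, as `statistics.stdev` does). The tiny `images` list and the helper name `channel_stats` are illustrative assumptions, not part of the notebook.

```python
from statistics import mean, stdev


def channel_stats(images):
    """Return per-channel (means, stds) for images given as nested lists [N][C][H][W]."""
    n_channels = len(images[0])
    means, stds = [], []
    for c in range(n_channels):
        # Flatten every pixel of channel c across all images, rows, and columns.
        pixels = [v for img in images for row in img[c] for v in row]
        means.append(mean(pixels))
        stds.append(stdev(pixels))  # sample std (n - 1), like torch.std's default
    return means, stds


# Two hypothetical 3-channel images of height 1 and width 2.
images = [
    [[[0.0, 1.0]], [[0.5, 0.5]], [[0.2, 0.4]]],
    [[[1.0, 0.0]], [[0.5, 0.5]], [[0.6, 0.8]]],
]
means, stds = channel_stats(images)
print(means, stds)
```

On the real dataset the same reduction runs over 50,000 images of 32x32 pixels per channel, which is why the notebook does it with tensors rather than Python lists.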

In [7]:
transform_train = transforms.Compose(
    [
        transforms.Resize((32, 32)),
        transforms.RandomHorizontalFlip(),
        transforms.RandomRotation(10),
        transforms.RandomAffine(0, shear=10, scale=(0.8, 1.2)),
        transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
        transforms.ToTensor(),
        transforms.Normalize(tensor_mean, tensor_std),
    ]
)


transform_test = transforms.Compose(
    [
        transforms.Resize((32, 32)),
        transforms.ToTensor(),
        transforms.Normalize(tensor_mean, tensor_std),
    ]
)

cifar10_train = datasets.CIFAR10(
    "/tmp/files/",
    train=True,
    download=True,
    transform=transform_train,
)

cifar10_test = datasets.CIFAR10(
    "/tmp/files/",
    train=False,
    download=True,
    transform=transform_test,
)
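
`transforms.Normalize` applies `(x - mean) / std` to each channel. The sketch below works through that arithmetic for a single RGB pixel using the rounded statistics printed earlier; the helper names `normalize_pixel` and `denormalize_pixel` are hypothetical and exist only for this illustration.

```python
# Rounded per-channel statistics printed earlier in the notebook.
MEAN = (0.4914, 0.4822, 0.4465)
STD = (0.2470, 0.2435, 0.2616)


def normalize_pixel(rgb):
    """Apply (x - mean) / std to one RGB pixel with values in [0, 1]."""
    return tuple((x - m) / s for x, m, s in zip(rgb, MEAN, STD))


def denormalize_pixel(rgb):
    """Invert the normalization, e.g. before displaying an image."""
    return tuple(x * s + m for x, m, s in zip(rgb, MEAN, STD))


# A pixel equal to the channel means maps to zero in every channel.
print(normalize_pixel(MEAN))
# A white pixel ends up roughly two standard deviations above zero.
print(normalize_pixel((1.0, 1.0, 1.0)))
```

The inverse helper is useful when plotting samples taken from the normalized loaders, since normalized values fall outside the displayable `[0, 1]` range.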

## <a id='toc2_2_'></a>[Defining the CNN model](#toc0_)


In [8]:
if torch.backends.mps.is_available():
    cprint("MPS is available", "green")
    device = torch.device("mps:0")
elif torch.cuda.is_available():
    cprint("CUDA is available", "green")
    device = torch.device("cuda:0")
else:
    cprint("CUDA and MPS are not available", "red")
    cprint("Using CPU", "red")
    device = torch.device("cpu")

MPS is available


In [9]:
class LeNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, 3, 1, padding=1)
        self.conv2 = nn.Conv2d(16, 32, 3, 1, padding=1)
        self.conv3 = nn.Conv2d(32, 64, 3, 1, padding=1)
        self.fc1 = nn.Linear(4 * 4 * 64, 500)
        self.dropout1 = nn.Dropout(0.5)
        self.fc2 = nn.Linear(500, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv3(x))
        x = F.max_pool2d(x, 2, 2)
        x = x.view(-1, 4 * 4 * 64)
        x = F.relu(self.fc1(x))
        x = self.dropout1(x)
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)

In [10]:
model = LeNet()
summary(model, next(iter(cifar10_test))[0].shape)

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 16, 32, 32]             448
            Conv2d-2           [-1, 32, 16, 16]           4,640
            Conv2d-3             [-1, 64, 8, 8]          18,496
            Linear-4                  [-1, 500]         512,500
           Dropout-5                  [-1, 500]               0
            Linear-6                   [-1, 10]           5,010
Total params: 541,094
Trainable params: 541,094
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.01
Forward/backward pass size (MB): 0.23
Params size (MB): 2.06
Estimated Total Size (MB): 2.30
----------------------------------------------------------------
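
The shapes and parameter counts in the summary can be reproduced by hand. The sketch below assumes only the layer definitions of `LeNet` shown above (3x3 convolutions with stride 1 and padding 1, each followed by a 2x2 max-pool); the helper names `conv_params` and `linear_params` are introduced here for illustration.

```python
def conv_params(in_ch, out_ch, kernel):
    """Weights plus one bias per output channel for a square-kernel Conv2d."""
    return out_ch * in_ch * kernel * kernel + out_ch


def linear_params(in_features, out_features):
    """Weights plus one bias per output feature for a Linear layer."""
    return in_features * out_features + out_features


size = 32  # CIFAR-10 images are 32x32
for _ in range(3):
    # A 3x3 convolution with stride 1 and padding 1 keeps the spatial size;
    # each 2x2 max-pool with stride 2 halves it: 32 -> 16 -> 8 -> 4.
    size //= 2
flat_features = size * size * 64  # inputs to fc1 after flattening

layer_params = {
    "conv1": conv_params(3, 16, 3),
    "conv2": conv_params(16, 32, 3),
    "conv3": conv_params(32, 64, 3),
    "fc1": linear_params(flat_features, 500),
    "fc2": linear_params(500, 10),
}
total = sum(layer_params.values())
print(size, flat_features, layer_params, total)
```

This is why `fc1` is declared with `4 * 4 * 64` input features, and why almost all of the model's parameters sit in that single layer.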


In [11]:
def count_parameters(model):
    length = 67
    names = [n for (n, p) in model.named_parameters() if p.requires_grad]
    name = "total parameters"
    names.append(name)
    max_length = max(map(len, names))
    formatted_names = [f"{f'  {n} ':.<{max_length + 3}}" for n in names]
    params = [p.numel() for p in model.parameters() if p.requires_grad]
    params.append(sum(params))
    formatted_params = [f"{f' {p}  ':.>{length - max_length - 3}}" for p in params]

    for n, p in zip(formatted_names[:-1], formatted_params[:-1]):
        cprint((n + p), "magenta")
    cprint(" " + "_" * (length - 2) + " ", "magenta")
    cprint(
        (formatted_names[-1] + formatted_params[-1]),
        "magenta",
        end="\n\n",
    )

    return names, params


names, params = count_parameters(model)

  conv1.weight .............................................. 432
  conv1.bias ................................................. 16
  conv2.weight ............................................. 4608
  conv2.bias ................................................. 32
  conv3.weight ............................................ 18432
  conv3.bias ................................................. 64
  fc1.weight ............................................. 512000
  fc1.bias .................................................. 500
  fc2.weight ............................................... 5000
  fc2.bias ................................................... 10
 _________________________________________________________________
  total parameters ....................................... 541094



## <a id='toc2_3_'></a>[Defining the inference function used for testing](#toc0_)

In [12]:
def inference(network, test_loader):
    # Put the module in evaluation mode.
    network.eval()
    network.to(device)
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = network(data)
            test_loss += F.cross_entropy(output, target, reduction="sum").item()
            pred = output.data.max(dim=1, keepdim=True)[1]
            correct += pred.eq(target.data.view_as(pred)).sum()
    test_loss /= len(test_loader.dataset)
    cprint(
        "Test set: Avg. loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)".format(
            test_loss,
            correct,
            len(test_loader.dataset),
            100.0 * correct / len(test_loader.dataset),
        ),
        "magenta",
        attrs=["underline"],
        end="\n\n",
    )
    return float(correct / len(test_loader.dataset))

In [13]:
test_loader = DataLoader(cifar10_test, batch_size=500, shuffle=False)

inference(model, test_loader)

Test set: Avg. loss: 2.3027, Accuracy: 968/10000 (10%)



0.09679999947547913
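
These figures are what an untrained 10-class classifier is expected to produce: near-uniform predictions give a cross-entropy close to ln(10) and an accuracy near 10%. The short calculation below makes that baseline explicit, assuming a perfectly uniform output and balanced classes (an idealization, since the freshly initialized network is only approximately uniform).

```python
import math

num_classes = 10

# A classifier that assigns equal probability to every class.
uniform_prob = 1.0 / num_classes

# Cross-entropy of the true class under that uniform prediction: -ln(1/10) = ln(10).
chance_loss = -math.log(uniform_prob)

# Expected accuracy when guessing among balanced classes.
chance_accuracy = 1.0 / num_classes

print(f"chance loss = {chance_loss:.4f}, chance accuracy = {chance_accuracy:.0%}")
```

A reported loss of 2.3027 and an accuracy of 9.68% are therefore consistent with a model that has not learned anything yet, which makes them a useful reference point for the federated rounds that follow.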

# <a id='toc3_'></a>[Defining the federated learning rules](#toc0_)

## <a id='toc3_1_'></a>[Method for averaging the federated learning weights](#toc0_)

In [14]:
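def FedAvg(models, weights=None):
    # The first model is updated in place and returned as the averaged model.
    new_model = models[0]
    state_dicts = [model.state_dict() for model in models]
    state_dict = new_model.state_dict()
    # Iterate over the keys of the first state dict so a single-model list also works.
    for key in state_dicts[0]: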
        if state_dict[key].dim() != 0:
            state_dict[key] = torch.from_numpy(
                np.average(
                    [state[key].cpu().numpy() for state in state_dicts],
                    axis=0,
                    weights=weights,
                )
            )
        else:
            state_dict[key] = torch.from_numpy(
                np.average(
                    [state[key].reshape(1).cpu().numpy() for state in state_dicts],
                    axis=0,
                    weights=weights,
                )
            )
    new_model.load_state_dict(state_dict)
    return new_model
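
To make the averaging rule concrete, here is a dependency-free sketch of the same element-wise weighted mean that `np.average(..., axis=0, weights=weights)` computes for each entry of the state dict. The function `weighted_average` and the two tiny "state dicts" are hypothetical stand-ins; real state dicts hold tensors rather than lists of floats.

```python
def weighted_average(state_dicts, weights=None):
    """Average matching entries of several {name: list of floats} dicts."""
    if weights is None:
        weights = [1.0] * len(state_dicts)
    total = sum(weights)
    averaged = {}
    for key in state_dicts[0]:
        # Group the i-th element of this entry across all collaborators,
        # then take the weighted sum divided by the total weight.
        columns = zip(*(state[key] for state in state_dicts))
        averaged[key] = [
            sum(w * v for w, v in zip(weights, column)) / total for column in columns
        ]
    return averaged


# Two hypothetical collaborators with a two-element "layer" each.
site_a = {"fc.weight": [1.0, 2.0], "fc.bias": [0.0]}
site_b = {"fc.weight": [3.0, 6.0], "fc.bias": [1.0]}

print(weighted_average([site_a, site_b]))                  # plain mean
print(weighted_average([site_a, site_b], weights=[3, 1]))  # site_a counts three times as much
```

The notebook calls `FedAvg` without `weights`, so it takes the plain mean. Passing weights proportional to each collaborator's number of training samples is a common variant when the shards differ in size.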

In [15]:
learning_rate = 0.01
log_interval = 10
momentum = 0.5


class FederatedFlow(FLSpec):

    def __init__(self, model=None, optimizer=None, rounds=3, **kwargs):
        super().__init__(**kwargs)
        # Use a custom model together with the optimizer that goes with it.
        if model is not None:
            self.model = model
            self.optimizer = optimizer
        else:
            # Otherwise, load the `LeNet()` model and configure an optimizer that applies
            # only to that model.
            self.model = LeNet()
            self.optimizer = optim.SGD(
                self.model.parameters(), lr=learning_rate, momentum=momentum
            )
        self.rounds = rounds

    # An aggregator is the central node of federated learning.

    # The aggregator starts with an optionally passed-in model and optimizer.

    # The aggregator begins the flow with the `start` task, where the list of collaborators is
    # taken from the runtime (`self.collaborators = self.runtime.collaborators`) and then used
    # as the list of participants that run the task named in `self.next`,
    # `aggregated_model_validation`.
    @aggregator
    def start(self):
        cprint("Performing initialization for model", "black", attrs=["bold"])
        self.collaborators = self.runtime.collaborators
        self.private = 10
        self.current_round = 0
        self.next(
            self.aggregated_model_validation,
            foreach="collaborators",
            exclude=["private"],
        )

    # The model, the optimizer, and anything else not explicitly excluded in the `self.next`
    # call are passed from the aggregator's `start` task to the collaborator's
    # `aggregated_model_validation` task.

    # Where each task runs is determined by the placement decorator that precedes its
    # definition (`@aggregator` or `@collaborator`).

    # Once each collaborator (defined in the runtime) has completed the
    # `aggregated_model_validation` task, it passes its current state to the `train` task, from
    # `train` to `local_model_validation`, and finally to `join` on the aggregator.

    # It is in `join` that the model weights are averaged and the next round can begin.
    @collaborator
    def aggregated_model_validation(self):
        cprint(
            f"Performing aggregated model validation for collaborator {self.input}",
            "red",
            attrs=["bold"],
        )
        self.agg_validation_score = inference(self.model, self.test_loader)
        cprint(
            f"{self.input} value of {self.agg_validation_score}",
            "red",
            attrs=["underline"],
        )
        self.next(self.train)

    @collaborator
    def train(self):
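        # Train this collaborator's copy of the model carried by the flow.
        self.model.to(device)
        # Assumption: flow attributes may be copied independently between steps, so rebuild
        # the optimizer on this model's parameters, reusing its class and hyperparameters.
        if self.optimizer is None:
            self.optimizer = optim.SGD(
                self.model.parameters(), lr=learning_rate, momentum=momentum
            )
        else:
            self.optimizer = type(self.optimizer)(
                self.model.parameters(), **self.optimizer.defaults
            )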
        self.model.train()
        for batch_idx, (data, target) in enumerate(self.train_loader):
            data, target = data.to(device), target.to(device)
            self.optimizer.zero_grad()
            output = self.model(data)
            loss = F.cross_entropy(output, target)
            loss.backward()
            self.optimizer.step()
            if batch_idx % log_interval == 0:
                cprint(
                    "Train Epoch: 1 [{}/{} ({:.0f}%)]\tLoss: {:.6f}".format(
                        batch_idx * len(data),
                        len(self.train_loader.dataset),
                        100.0 * batch_idx / len(self.train_loader),
                        loss.item(),
                    ),
                    "yellow",
                )
                self.loss = loss.item()
                torch.save(self.model.state_dict(), "model.pth")
                torch.save(self.optimizer.state_dict(), "optimizer.pth")
        self.training_completed = True
        self.next(self.local_model_validation)

    @collaborator
    def local_model_validation(self):
        self.local_validation_score = inference(self.model, self.test_loader)
        cprint(
            f"Doing local model validation for collaborator {self.input}: \
                {self.local_validation_score}",
            "white",
        )
        self.next(self.join, exclude=["training_completed"])

    @aggregator
    def join(self, inputs):
        self.average_loss = sum(input.loss for input in inputs) / len(inputs)
        self.aggregated_model_accuracy = sum(
            input.agg_validation_score for input in inputs
        ) / len(inputs)
        self.local_model_accuracy = sum(
            input.local_validation_score for input in inputs
        ) / len(inputs)
        cprint(
            f"Average aggregated model validation values = \
                {self.aggregated_model_accuracy}",
            "green",
        )
        cprint(f"Average training loss = {self.average_loss}", "green")
        cprint(
            f"Average local model validation values = \
            {self.local_model_accuracy}",
            "green",
        )
        self.model = FedAvg([input.model for input in inputs])
        self.optimizer = [input.optimizer for input in inputs][0]
        self.current_round += 1
        if self.current_round < self.rounds:
            self.next(
                self.aggregated_model_validation,
                foreach="collaborators",
                exclude=["private"],
            )
        else:
            self.next(self.end)

    @aggregator
    def end(self):
        cprint("This is the end of the flow", "black")

Aggregator step "start" registered
Collaborator step "aggregated_model_validation" registered
Collaborator step "train" registered
Collaborator step "local_model_validation" registered
Aggregator step "join" registered
Aggregator step "end" registered
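
The order in which the registered steps run can be sketched without OpenFL. The plain-Python trace below follows the description in the comments of `FederatedFlow`: `start` on the aggregator, then three collaborator steps per participant, then `join`, repeated for each round, and finally `end`. The function `step_order` is a hypothetical helper, and the actual runtime may interleave the collaborators differently from this simplified loop.

```python
def step_order(collaborators, rounds):
    """List (step, participant) pairs in the order the flow describes them."""
    trace = [("start", "aggregator")]
    for _ in range(rounds):
        for name in collaborators:
            # Each collaborator runs its three steps on its own copy of the state.
            trace.append(("aggregated_model_validation", name))
            trace.append(("train", name))
            trace.append(("local_model_validation", name))
        # The aggregator averages the collaborators' models before the next round.
        trace.append(("join", "aggregator"))
    trace.append(("end", "aggregator"))
    return trace


for step, who in step_order(["Portland", "Seattle"], rounds=2):
    print(f"{who:>10}: {step}")
```

Because validation of the aggregated model happens at the start of each round, the aggregated accuracy reported in `join` describes the model produced by the previous round's averaging.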


In [16]:
random_seed = 1
torch.manual_seed(random_seed)

<torch._C.Generator at 0x1205f0670>

In [17]:
batch_size_train = 64

# Setup participants
# Named `aggregator_node` so the `aggregator` placement decorator imported above is not shadowed.
aggregator_node = Aggregator()
aggregator_node.private_attributes = {}

# Setup collaborators with private attributes
collaborator_names = ["Portland", "Seattle", "Chandler", "Bangalore"]
collaborators = [Collaborator(name=name) for name in collaborator_names]
# The loop variable is `collab` so the `collaborator` decorator is not shadowed either.
for idx, collab in enumerate(collaborators):
    local_train = deepcopy(cifar10_train)
    local_test = deepcopy(cifar10_test)
    # Give each collaborator an interleaved, non-overlapping shard of the data.
    local_train.data = cifar10_train.data[idx :: len(collaborators)]
    local_train.targets = cifar10_train.targets[idx :: len(collaborators)]
    local_test.data = cifar10_test.data[idx :: len(collaborators)]
    local_test.targets = cifar10_test.targets[idx :: len(collaborators)]
    collab.private_attributes = {
        "train_loader": DataLoader(
            local_train, batch_size=batch_size_train, shuffle=True
        ),
        "test_loader": DataLoader(
            local_test, batch_size=batch_size_train, shuffle=True
        ),
    }

local_runtime = LocalRuntime(
    aggregator=aggregator_node, collaborators=collaborators, backend="single_process"
)
print(f"Local runtime collaborators = {local_runtime.collaborators}")

Local runtime collaborators = ['Portland', 'Seattle', 'Chandler', 'Bangalore']
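
The slicing `data[idx :: len(collaborators)]` used above gives each collaborator every fourth sample, starting at its own offset. A small sketch with integer stand-ins for dataset indices shows that the resulting shards are disjoint and together cover the whole dataset; the helper name `shard` is introduced here for illustration.

```python
def shard(items, num_shards):
    """Split items into interleaved shards using items[idx::num_shards]."""
    return [items[idx::num_shards] for idx in range(num_shards)]


names = ["Portland", "Seattle", "Chandler", "Bangalore"]
samples = list(range(10))  # stand-in for dataset indices

shards = shard(samples, len(names))
for name, part in zip(names, shards):
    print(name, part)
```

With the 50,000 training and 10,000 test images of CIFAR-10, this scheme gives each of the four collaborators 12,500 training and 2,500 test samples, which matches the `/2500` denominators in the validation logs.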


In [18]:
import os
if os.environ.get("USERNAME") is None:
    os.environ["USERNAME"] = "Hao"

In [19]:
import getpass
print(getpass.getuser())

haozhang


In [20]:
learning_rate = 0.001
optimizer = optim.Adam(model.parameters(), lr=learning_rate)
flflow = FederatedFlow(model, optimizer, rounds=10, checkpoint=True)
flflow.runtime = local_runtime
flflow.run()

Created flow FederatedFlow

Calling start
Performing initialization for model
Saving data artifacts for start
Saved data artifacts for start

Calling aggregated_model_validation
Performing aggregated model validation for collaborator Portland
Test set: Avg. loss: 2.3023, Accuracy: 230/2500 (9%)

Portland value of 0.09200000017881393
Saving data artifacts for aggregated_model_validation
Saved data artifacts for aggregated_model_validation

Calling train
Saving data artifacts for train
Saved data artifacts for train

Calling local_model_validation
Test set: Avg. loss: 1.5842, Accuracy: 1063/2500 (43%)

Doing local model validation for collaborator Portland:                 0.4251999855041504
Sa

In [21]:
print(
    f'Sample of the final model weights: {flflow.model.state_dict()["conv1.weight"][0]}'
)

print(
    f"\nFinal aggregated model accuracy for {flflow.rounds} rounds of training: \
        {flflow.aggregated_model_accuracy}"
)

Sample of the final model weights: tensor([[[-0.0145,  0.0039,  0.0102],
         [-0.0523,  0.0303,  0.0802],
         [ 0.0495, -0.0267,  0.1297]],

        [[-0.0949,  0.0476,  0.0694],
         [-0.1488,  0.0469,  0.1080],
         [-0.1946, -0.0422,  0.0222]],

        [[-0.0294,  0.0298,  0.1073],
         [-0.0938, -0.0925,  0.1693],
         [-0.0752, -0.0937,  0.2290]]], device='mps:0')

Final aggregated model accuracy for 10 rounds of training:         0.7294000089168549
