# Flower tutorial
## Strategies in federated learning

Let’s move beyond FedAvg with Flower Strategies!

In this notebook, we see how we can gradually enhance our system by customizing the strategy, initializing parameters on the server side, choosing a different strategy, and evaluating models on the server-side. That’s quite a bit of flexibility with so little code, right?

In the later sections, we see how we can communicate arbitrary values between server and clients to fully customize client-side execution. With that capability, we build a large-scale federated learning simulation using the Flower Virtual Client Engine and run an experiment involving 1000 clients in the same workload, all in a Jupyter Notebook!

[tutorial link](https://flower.dev/docs/tutorial/Flower-2-Strategies-in-FL-PyTorch.html)

In [1]:
from collections import OrderedDict
from typing import Dict, List, Optional, Tuple

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.transforms as transforms
from torch.utils.data import DataLoader, random_split
from torchvision.datasets import CIFAR10

import flwr as fl

DEVICE = torch.device("cpu")  # Try "cuda" to train on GPU
print(
    f"Training on {DEVICE} using PyTorch {torch.__version__} and Flower {fl.__version__}"
)

Training on cpu using PyTorch 2.0.0 and Flower 1.5.0.dev20230427


# Data loading
Let’s now load the CIFAR-10 training and test set, partition them into ten smaller datasets (each split into training and validation set), and wrap everything in their own DataLoader. We introduce a new parameter num_clients which allows us to call load_datasets with different numbers of clients.

In [2]:
NUM_CLIENTS = 10


def load_datasets(num_clients: int):
    # Download and transform CIFAR-10 (train and test)
    transform = transforms.Compose(
        [transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]
    )
    trainset = CIFAR10("./dataset", train=True, download=True, transform=transform)
    testset = CIFAR10("./dataset", train=False, download=True, transform=transform)

    # Split training set into `num_clients` partitions to simulate different local datasets
    partition_size = len(trainset) // num_clients
    lengths = [partition_size] * num_clients
    datasets = random_split(trainset, lengths, torch.Generator().manual_seed(42))

    # Split each partition into train/val and create DataLoader
    trainloaders = []
    valloaders = []
    for ds in datasets:
        len_val = len(ds) // 10  # 10 % validation set
        len_train = len(ds) - len_val
        lengths = [len_train, len_val]
        ds_train, ds_val = random_split(ds, lengths, torch.Generator().manual_seed(42))
        trainloaders.append(DataLoader(ds_train, batch_size=32, shuffle=True))
        valloaders.append(DataLoader(ds_val, batch_size=32))
    testloader = DataLoader(testset, batch_size=32)
    return trainloaders, valloaders, testloader


trainloaders, valloaders, testloader = load_datasets(NUM_CLIENTS)

Files already downloaded and verified
Files already downloaded and verified
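
To make the partition sizes concrete, the size arithmetic in load_datasets can be replayed without downloading anything. This is a minimal sketch that assumes the standard CIFAR-10 training set size of 50,000 images; `partition_sizes` is a hypothetical helper and not part of the notebook.

```python
from typing import Tuple


def partition_sizes(num_train_examples: int, num_clients: int) -> Tuple[int, int, int]:
    """Replay the size arithmetic used in load_datasets (hypothetical helper)."""
    partition_size = num_train_examples // num_clients
    len_val = partition_size // 10  # 10 % of each partition is held out
    len_train = partition_size - len_val
    return partition_size, len_train, len_val


# Assumption: the CIFAR-10 training set holds 50,000 images.
print(partition_sizes(50_000, 10))  # (5000, 4500, 500)
```

With ten clients, each client therefore trains on 4,500 images and validates on 500. Note that `random_split` expects the lengths to add up to the dataset size, so a `num_clients` value that does not divide the training set evenly would likely need the remainder handled explicitly.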


# Model training/evaluation
Let’s continue with the usual model definition (including set_parameters and get_parameters), training and test functions:

In [3]:
class Net(nn.Module):
    def __init__(self) -> None:
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


def get_parameters(net) -> List[np.ndarray]:
    return [val.cpu().numpy() for _, val in net.state_dict().items()]


def set_parameters(net, parameters: List[np.ndarray]):
    params_dict = zip(net.state_dict().keys(), parameters)
    state_dict = OrderedDict({k: torch.Tensor(v) for k, v in params_dict})
    net.load_state_dict(state_dict, strict=True)


def train(net, trainloader, epochs: int):
    """Train the network on the training set."""
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(net.parameters())
    net.train()
    for epoch in range(epochs):
        correct, total, epoch_loss = 0, 0, 0.0
        for images, labels in trainloader:
            images, labels = images.to(DEVICE), labels.to(DEVICE)
            optimizer.zero_grad()
            outputs = net(images)
            loss = criterion(outputs, labels)  # Reuse the forward pass computed above
            loss.backward()
            optimizer.step()
            # Metrics
            epoch_loss += loss.item()  # Accumulate a plain float instead of a tensor
            total += labels.size(0)
            correct += (torch.max(outputs.data, 1)[1] == labels).sum().item()
        epoch_loss /= len(trainloader.dataset)
        epoch_acc = correct / total
        print(f"Epoch {epoch+1}: train loss {epoch_loss}, accuracy {epoch_acc}")


def test(net, testloader):
    """Evaluate the network on the entire test set."""
    criterion = torch.nn.CrossEntropyLoss()
    correct, total, loss = 0, 0, 0.0
    net.eval()
    with torch.no_grad():
        for images, labels in testloader:
            images, labels = images.to(DEVICE), labels.to(DEVICE)
            outputs = net(images)
            loss += criterion(outputs, labels).item()
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    loss /= len(testloader.dataset)
    accuracy = correct / total
    return loss, accuracy

# Flower client
To implement the Flower client, we (again) create a subclass of flwr.client.NumPyClient and implement the three methods get_parameters, fit, and evaluate. Here, we also pass the cid to the client and use it to log additional details:

In [4]:
class FlowerClient(fl.client.NumPyClient):
    def __init__(self, cid, net, trainloader, valloader):
        self.cid = cid
        self.net = net
        self.trainloader = trainloader
        self.valloader = valloader

    def get_parameters(self, config):
        print(f"[Client {self.cid}] get_parameters")
        return get_parameters(self.net)

    def fit(self, parameters, config):
        print(f"[Client {self.cid}] fit, config: {config}")
        set_parameters(self.net, parameters)
        train(self.net, self.trainloader, epochs=1)
        return get_parameters(self.net), len(self.trainloader), {}

    def evaluate(self, parameters, config):
        print(f"[Client {self.cid}] evaluate, config: {config}")
        set_parameters(self.net, parameters)
        loss, accuracy = test(self.net, self.valloader)
        return float(loss), len(self.valloader), {"accuracy": float(accuracy)}


def client_fn(cid) -> FlowerClient:
    net = Net().to(DEVICE)
    trainloader = trainloaders[int(cid)]
    valloader = valloaders[int(cid)]
    return FlowerClient(cid, net, trainloader, valloader)

# Strategy customization
So far, everything should look familiar if you’ve worked through the introductory notebook. With that, we’re ready to introduce a number of new features.

# Server-side parameter initialization
Flower, by default, initializes the global model by asking one random client for the initial parameters. In many cases, we want more control over parameter initialization though. Flower therefore allows you to directly pass the initial parameters to the Strategy:

In [5]:
# Create an instance of the model and get the parameters
params = get_parameters(Net())

# Pass parameters to the Strategy for server-side parameter initialization
strategy = fl.server.strategy.FedAvg(
    fraction_fit=0.3,
    fraction_evaluate=0.3,
    min_fit_clients=3,
    min_evaluate_clients=3,
    min_available_clients=NUM_CLIENTS,
    initial_parameters=fl.common.ndarrays_to_parameters(params),
)

# Specify client resources if you need GPU (defaults to 1 CPU and 0 GPU)
client_resources = None
if DEVICE.type == "cuda":
    client_resources = {"num_gpus": 1}

# Start simulation
fl.simulation.start_simulation(
    client_fn=client_fn,
    num_clients=NUM_CLIENTS,
    config=fl.server.ServerConfig(num_rounds=3),  # Just three rounds
    strategy=strategy,
    client_resources=client_resources,
)

INFO flwr 2023-05-23 13:06:03,350 | app.py:146 | Starting Flower simulation, config: ServerConfig(num_rounds=3, round_timeout=None)
2023-05-23 13:06:05,029	INFO worker.py:1625 -- Started a local Ray instance.
INFO flwr 2023-05-23 13:06:05,667 | app.py:180 | Flower VCE: Ray initialized with resources: {'node:127.0.0.1': 1.0, 'CPU': 8.0, 'memory': 8031669453.0, 'object_store_memory': 2147483648.0}
INFO flwr 2023-05-23 13:06:05,668 | server.py:86 | Initializing global parameters
INFO flwr 2023-05-23 13:06:05,668 | server.py:269 | Using initial parameters provided by strategy
INFO flwr 2023-05-23 13:06:05,668 | server.py:88 | Evaluating initial parameters
INFO flwr 2023-05-23 13:06:05,668 | server.py:101 | FL starting
DEBUG flwr 2023-05-23 13:06:05,669 | server.py:218 | fit_round 1: strategy sampled 3 clients (out of 10)


[2m[36m(launch_and_fit pid=7660)[0m [Client 0] fit, config: {}


DEBUG flwr 2023-05-23 13:06:09,295 | server.py:232 | fit_round 1 received 3 results and 0 failures
DEBUG flwr 2023-05-23 13:06:09,299 | server.py:168 | evaluate_round 1: strategy sampled 3 clients (out of 10)
DEBUG flwr 2023-05-23 13:06:10,946 | server.py:182 | evaluate_round 1 received 3 results and 0 failures
DEBUG flwr 2023-05-23 13:06:10,947 | server.py:218 | fit_round 2: strategy sampled 3 clients (out of 10)


[2m[36m(launch_and_fit pid=7660)[0m Epoch 1: train loss 0.06482589989900589, accuracy 0.22333333333333333
[2m[36m(launch_and_evaluate pid=7660)[0m [Client 3] evaluate, config: {}


DEBUG flwr 2023-05-23 13:06:13,822 | server.py:232 | fit_round 2 received 3 results and 0 failures
DEBUG flwr 2023-05-23 13:06:13,825 | server.py:168 | evaluate_round 2: strategy sampled 3 clients (out of 10)
DEBUG flwr 2023-05-23 13:06:15,466 | server.py:182 | evaluate_round 2 received 3 results and 0 failures
DEBUG flwr 2023-05-23 13:06:15,466 | server.py:218 | fit_round 3: strategy sampled 3 clients (out of 10)


[2m[36m(launch_and_fit pid=7661)[0m [Client 4] fit, config: {}[32m [repeated 5x across cluster] (Ray deduplicates logs by default. Set RAY_DEDUP_LOGS=0 to disable log deduplication, or see https://docs.ray.io/en/master/ray-observability/ray-logging.html#log-deduplication for more options.)[0m
[2m[36m(launch_and_fit pid=7660)[0m Epoch 1: train loss 0.05658993124961853, accuracy 0.326[32m [repeated 3x across cluster][0m
[2m[36m(launch_and_evaluate pid=7661)[0m [Client 0] evaluate, config: {}[32m [repeated 5x across cluster][0m


DEBUG flwr 2023-05-23 13:06:18,430 | server.py:232 | fit_round 3 received 3 results and 0 failures
DEBUG flwr 2023-05-23 13:06:18,433 | server.py:168 | evaluate_round 3: strategy sampled 3 clients (out of 10)
DEBUG flwr 2023-05-23 13:06:20,112 | server.py:182 | evaluate_round 3 received 3 results and 0 failures
INFO flwr 2023-05-23 13:06:20,112 | server.py:147 | FL finished in 14.443642458005343
INFO flwr 2023-05-23 13:06:20,113 | app.py:218 | app_fit: losses_distributed [(1, 0.0629815646012624), (2, 0.056064457734425865), (3, 0.05286578257878621)]
INFO flwr 2023-05-23 13:06:20,113 | app.py:219 | app_fit: metrics_distributed_fit {}
INFO flwr 2023-05-23 13:06:20,113 | app.py:220 | app_fit: metrics_distributed {}
INFO flwr 2023-05-23 13:06:20,113 | app.py:221 | app_fit: losses_centralized []
INFO flwr 2023-05-23 13:06:20,113 | app.py:222 | app_fit: metrics_centralized {}


History (loss, distributed):
	round 1: 0.0629815646012624
	round 2: 0.056064457734425865
	round 3: 0.05286578257878621

Passing initial_parameters to the FedAvg strategy prevents Flower from asking one of the clients for the initial parameters. If we look closely, we can see that the logs do not show any calls to the FlowerClient.get_parameters method.
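
The same logs also report `strategy sampled 3 clients (out of 10)` in every round, which follows from `fraction_fit=0.3` and `min_fit_clients=3`. The sketch below is a simplified, standalone illustration, assuming the strategy takes the larger of `int(fraction_fit * available clients)` and `min_fit_clients`; `num_fit_clients` is a hypothetical helper here, not a call into Flower.

```python
def num_fit_clients(num_available: int, fraction_fit: float, min_fit_clients: int) -> int:
    """Approximate how many clients a FedAvg-style strategy samples per round.

    Assumption: the sample size is the fraction of available clients,
    truncated to an integer, but never below `min_fit_clients`.
    """
    return max(int(num_available * fraction_fit), min_fit_clients)


print(num_fit_clients(10, 0.3, 3))     # 3, matching the logs above
print(num_fit_clients(10, 0.1, 3))     # 3, the minimum takes over
print(num_fit_clients(1000, 0.05, 3))  # 50
```

Under this assumption, lowering `fraction_fit` below 0.3 would not reduce the number of sampled clients in this notebook unless `min_fit_clients` is lowered as well.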

# Starting with a customized strategy
We’ve seen the function start_simulation before. It accepts a number of arguments, amongst them the client_fn used to create FlowerClient instances, the number of clients to simulate (num_clients), the number of rounds (num_rounds, set through ServerConfig), and the strategy.

The strategy encapsulates the federated learning approach/algorithm, for example, FedAvg or FedAdagrad. Let’s try to use a different strategy this time:

In [6]:
# Create FedAdagrad strategy
strategy = fl.server.strategy.FedAdagrad(
    fraction_fit=0.3,
    fraction_evaluate=0.3,
    min_fit_clients=3,
    min_evaluate_clients=3,
    min_available_clients=NUM_CLIENTS,
    initial_parameters=fl.common.ndarrays_to_parameters(get_parameters(Net())),
)

# Start simulation
fl.simulation.start_simulation(
    client_fn=client_fn,
    num_clients=NUM_CLIENTS,
    config=fl.server.ServerConfig(num_rounds=3),  # Just three rounds
    strategy=strategy,
    client_resources=client_resources,
)

INFO flwr 2023-05-23 13:06:20,130 | app.py:146 | Starting Flower simulation, config: ServerConfig(num_rounds=3, round_timeout=None)


[2m[36m(launch_and_fit pid=7661)[0m [Client 8] fit, config: {}[32m [repeated 3x across cluster][0m
[2m[36m(launch_and_fit pid=7661)[0m Epoch 1: train loss 0.05372537672519684, accuracy 0.37555555555555553[32m [repeated 5x across cluster][0m
[2m[36m(launch_and_evaluate pid=7660)[0m [Client 6] evaluate, config: {}


2023-05-23 13:06:24,278	INFO worker.py:1625 -- Started a local Ray instance.
INFO flwr 2023-05-23 13:06:24,875 | app.py:180 | Flower VCE: Ray initialized with resources: {'memory': 7960094311.0, 'node:127.0.0.1': 1.0, 'object_store_memory': 2147483648.0, 'CPU': 8.0}
INFO flwr 2023-05-23 13:06:24,875 | server.py:86 | Initializing global parameters
INFO flwr 2023-05-23 13:06:24,876 | server.py:269 | Using initial parameters provided by strategy
INFO flwr 2023-05-23 13:06:24,876 | server.py:88 | Evaluating initial parameters
INFO flwr 2023-05-23 13:06:24,876 | server.py:101 | FL starting
DEBUG flwr 2023-05-23 13:06:24,876 | server.py:218 | fit_round 1: strategy sampled 3 clients (out of 10)


[2m[36m(launch_and_fit pid=7689)[0m [Client 9] fit, config: {}


DEBUG flwr 2023-05-23 13:06:28,476 | server.py:232 | fit_round 1 received 3 results and 0 failures
DEBUG flwr 2023-05-23 13:06:28,481 | server.py:168 | evaluate_round 1: strategy sampled 3 clients (out of 10)
DEBUG flwr 2023-05-23 13:06:30,111 | server.py:182 | evaluate_round 1 received 3 results and 0 failures
DEBUG flwr 2023-05-23 13:06:30,112 | server.py:218 | fit_round 2: strategy sampled 3 clients (out of 10)


[2m[36m(launch_and_fit pid=7689)[0m Epoch 1: train loss 0.06498049199581146, accuracy 0.2311111111111111
[2m[36m(launch_and_evaluate pid=7689)[0m [Client 2] evaluate, config: {}


DEBUG flwr 2023-05-23 13:06:33,031 | server.py:232 | fit_round 2 received 3 results and 0 failures
DEBUG flwr 2023-05-23 13:06:33,036 | server.py:168 | evaluate_round 2: strategy sampled 3 clients (out of 10)


[2m[36m(launch_and_fit pid=7689)[0m 
[2m[36m(launch_and_fit pid=7683)[0m [Client 3] fit, config: {}[32m [repeated 5x across cluster][0m


DEBUG flwr 2023-05-23 13:06:34,682 | server.py:182 | evaluate_round 2 received 3 results and 0 failures
DEBUG flwr 2023-05-23 13:06:34,682 | server.py:218 | fit_round 3: strategy sampled 3 clients (out of 10)


[2m[36m(launch_and_fit pid=7683)[0m Epoch 1: train loss 0.9056510925292969, accuracy 0.272[32m [repeated 5x across cluster][0m
[2m[36m(launch_and_evaluate pid=7683)[0m [Client 5] evaluate, config: {}[32m [repeated 5x across cluster][0m


DEBUG flwr 2023-05-23 13:06:37,609 | server.py:232 | fit_round 3 received 3 results and 0 failures
DEBUG flwr 2023-05-23 13:06:37,613 | server.py:168 | evaluate_round 3: strategy sampled 3 clients (out of 10)
DEBUG flwr 2023-05-23 13:06:39,258 | server.py:182 | evaluate_round 3 received 3 results and 0 failures
INFO flwr 2023-05-23 13:06:39,258 | server.py:147 | FL finished in 14.381816749984864
INFO flwr 2023-05-23 13:06:39,258 | app.py:218 | app_fit: losses_distributed [(1, 8.853475362141927), (2, 0.6242502880096436), (3, 0.07416640790303548)]
INFO flwr 2023-05-23 13:06:39,259 | app.py:219 | app_fit: metrics_distributed_fit {}
INFO flwr 2023-05-23 13:06:39,259 | app.py:220 | app_fit: metrics_distributed {}
INFO flwr 2023-05-23 13:06:39,259 | app.py:221 | app_fit: losses_centralized []
INFO flwr 2023-05-23 13:06:39,259 | app.py:222 | app_fit: metrics_centralized {}


[2m[36m(launch_and_fit pid=7683)[0m [Client 2] fit, config: {}[32m [repeated 3x across cluster][0m


History (loss, distributed):
	round 1: 8.853475362141927
	round 2: 0.6242502880096436
	round 3: 0.07416640790303548
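
To give a feel for how an adaptive strategy differs from plain averaging, here is a minimal pure-Python sketch of an Adagrad-style server update in the spirit of FedAdagrad. It is an illustration under stated assumptions, not Flower's implementation: the function name `fedadagrad_step`, the flat-list weight representation, and the default values `eta=0.1` and `tau=1e-9` are all chosen for this example.

```python
import math
from typing import List, Tuple


def fedadagrad_step(
    current: List[float],
    aggregated: List[float],
    accumulator: List[float],
    eta: float = 0.1,   # Assumed server learning rate
    tau: float = 1e-9,  # Assumed small constant for numerical stability
) -> Tuple[List[float], List[float]]:
    """Apply one Adagrad-style server update to a flat list of weights."""
    new_weights, new_accumulator = [], []
    for w, agg, v in zip(current, aggregated, accumulator):
        delta = agg - w        # Pseudo-gradient: averaged client model minus server model
        v = v + delta * delta  # Accumulate squared updates per coordinate
        new_weights.append(w + eta * delta / (math.sqrt(v) + tau))
        new_accumulator.append(v)
    return new_weights, new_accumulator


weights, acc = fedadagrad_step([1.0, -2.0], [1.5, -2.0], [0.0, 0.0])
print(weights, acc)
```

In this sketch, the first update moves every changed coordinate by roughly `eta`, regardless of how large the averaged client update was. FedAvg, by contrast, would simply adopt the aggregated weights. This normalization may be one reason the loss in the first round above is much higher than with FedAvg.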

# Server-side parameter evaluation
Flower can evaluate the aggregated model on the server-side or on the client-side. Client-side and server-side evaluation are similar in some ways, but different in others.

__Centralized Evaluation__ (or server-side evaluation) is conceptually simple: it works the same way that evaluation in centralized machine learning does. If there is a server-side dataset that can be used for evaluation purposes, then that’s great. We can evaluate the newly aggregated model after each round of training without having to send the model to clients. We’re also fortunate in the sense that our entire evaluation dataset is available at all times.

__Federated Evaluation__ (or client-side evaluation) is more complex, but also more powerful: it doesn’t require a centralized dataset and allows us to evaluate models over a larger set of data, which often yields more realistic evaluation results. In fact, many scenarios require us to use __Federated Evaluation__ if we want to get representative evaluation results at all. But this power comes at a cost: once we start to evaluate on the client side, we should be aware that our evaluation dataset can change over consecutive rounds of learning if those clients are not always available. Moreover, the dataset held by each client can also change over consecutive rounds. This can lead to evaluation results that are not stable, so even if we would not change the model, we’d see our evaluation results fluctuate over consecutive rounds.
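
To illustrate how per-client results can be combined into the single "distributed" loss reported per round, here is a minimal sketch of an example-weighted average. `weighted_loss_avg` is defined locally for illustration and the input numbers are made up; the weighting scheme is an assumption about how a FedAvg-style strategy aggregates evaluation results.

```python
from typing import List, Tuple


def weighted_loss_avg(results: List[Tuple[int, float]]) -> float:
    """Average client losses, weighting each by the client's reported size."""
    total = sum(num_examples for num_examples, _ in results)
    return sum(num_examples * loss for num_examples, loss in results) / total


# Hypothetical (weight, loss) pairs reported by three sampled clients
client_results = [(16, 0.06), (16, 0.05), (8, 0.07)]
print(round(weighted_loss_avg(client_results), 4))  # 0.058
```

Because a different subset of clients is sampled in each round, the inputs to this average change from round to round, which is one source of the fluctuation described above. Note that the FlowerClient in this notebook reports `len(self.valloader)`, that is, the number of batches, as its weight.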

We’ve seen how federated evaluation works on the client side (i.e., by implementing the evaluate method in FlowerClient). Now let’s see how we can evaluate aggregated model parameters on the server-side:

In [7]:
# The `evaluate` function will be called by Flower after every round
def evaluate(
    server_round: int,
    parameters: fl.common.NDArrays,
    config: Dict[str, fl.common.Scalar],
) -> Optional[Tuple[float, Dict[str, fl.common.Scalar]]]:
    net = Net().to(DEVICE)
    valloader = valloaders[0]
    set_parameters(net, parameters)  # Update model with the latest parameters
    loss, accuracy = test(net, valloader)
    print(f"Server-side evaluation loss {loss} / accuracy {accuracy}")
    return loss, {"accuracy": accuracy}

In [8]:
strategy = fl.server.strategy.FedAvg(
    fraction_fit=0.3,
    fraction_evaluate=0.3,
    min_fit_clients=3,
    min_evaluate_clients=3,
    min_available_clients=NUM_CLIENTS,
    initial_parameters=fl.common.ndarrays_to_parameters(get_parameters(Net())),
    evaluate_fn=evaluate,  # Pass the evaluation function
)

fl.simulation.start_simulation(
    client_fn=client_fn,
    num_clients=NUM_CLIENTS,
    config=fl.server.ServerConfig(num_rounds=3),  # Just three rounds
    strategy=strategy,
    client_resources=client_resources,
)

INFO flwr 2023-05-23 13:06:39,275 | app.py:146 | Starting Flower simulation, config: ServerConfig(num_rounds=3, round_timeout=None)


[2m[36m(launch_and_fit pid=7683)[0m Epoch 1: train loss 0.10485149919986725, accuracy 0.11622222222222223[32m [repeated 3x across cluster][0m
[2m[36m(launch_and_evaluate pid=7683)[0m [Client 1] evaluate, config: {}[32m [repeated 3x across cluster][0m


2023-05-23 13:06:43,522	INFO worker.py:1625 -- Started a local Ray instance.
INFO flwr 2023-05-23 13:06:44,143 | app.py:180 | Flower VCE: Ray initialized with resources: {'memory': 7939730637.0, 'CPU': 8.0, 'node:127.0.0.1': 1.0, 'object_store_memory': 2147483648.0}
INFO flwr 2023-05-23 13:06:44,144 | server.py:86 | Initializing global parameters
INFO flwr 2023-05-23 13:06:44,144 | server.py:269 | Using initial parameters provided by strategy
INFO flwr 2023-05-23 13:06:44,144 | server.py:88 | Evaluating initial parameters
INFO flwr 2023-05-23 13:06:44,199 | server.py:91 | initial parameters (loss, other metrics): 0.07388043880462647, {'accuracy': 0.076}
INFO flwr 2023-05-23 13:06:44,200 | server.py:101 | FL starting
DEBUG flwr 2023-05-23 13:06:44,200 | server.py:218 | fit_round 1: strategy sampled 3 clients (out of 10)


Server-side evaluation loss 0.07388043880462647 / accuracy 0.076
[2m[36m(launch_and_fit pid=7707)[0m [Client 0] fit, config: {}


DEBUG flwr 2023-05-23 13:06:47,809 | server.py:232 | fit_round 1 received 3 results and 0 failures
INFO flwr 2023-05-23 13:06:47,866 | server.py:119 | fit progress: (1, 0.06223023772239685, {'accuracy': 0.27}, 3.6656037499778904)
DEBUG flwr 2023-05-23 13:06:47,866 | server.py:168 | evaluate_round 1: strategy sampled 3 clients (out of 10)


[2m[36m(launch_and_fit pid=7707)[0m Epoch 1: train loss 0.06600439548492432, accuracy 0.22511111111111112
Server-side evaluation loss 0.06223023772239685 / accuracy 0.27


DEBUG flwr 2023-05-23 13:06:49,515 | server.py:182 | evaluate_round 1 received 3 results and 0 failures
DEBUG flwr 2023-05-23 13:06:49,516 | server.py:218 | fit_round 2: strategy sampled 3 clients (out of 10)


[2m[36m(launch_and_evaluate pid=7707)[0m [Client 9] evaluate, config: {}


DEBUG flwr 2023-05-23 13:06:52,447 | server.py:232 | fit_round 2 received 3 results and 0 failures
INFO flwr 2023-05-23 13:06:52,500 | server.py:119 | fit progress: (2, 0.05776379370689392, {'accuracy': 0.32}, 8.300287666963413)
DEBUG flwr 2023-05-23 13:06:52,501 | server.py:168 | evaluate_round 2: strategy sampled 3 clients (out of 10)


[2m[36m(launch_and_fit pid=7709)[0m [Client 8] fit, config: {}[32m [repeated 5x across cluster][0m
Server-side evaluation loss 0.05776379370689392 / accuracy 0.32
[2m[36m(launch_and_fit pid=7709)[0m Epoch 1: train loss 0.05801357701420784, accuracy 0.31355555555555553[32m [repeated 5x across cluster][0m


DEBUG flwr 2023-05-23 13:06:54,164 | server.py:182 | evaluate_round 2 received 3 results and 0 failures
DEBUG flwr 2023-05-23 13:06:54,165 | server.py:218 | fit_round 3: strategy sampled 3 clients (out of 10)


[2m[36m(launch_and_evaluate pid=7708)[0m [Client 2] evaluate, config: {}[32m [repeated 5x across cluster][0m


DEBUG flwr 2023-05-23 13:06:57,061 | server.py:232 | fit_round 3 received 3 results and 0 failures
INFO flwr 2023-05-23 13:06:57,114 | server.py:119 | fit progress: (3, 0.054261915922164915, {'accuracy': 0.386}, 12.914160416985396)
DEBUG flwr 2023-05-23 13:06:57,114 | server.py:168 | evaluate_round 3: strategy sampled 3 clients (out of 10)


Server-side evaluation loss 0.054261915922164915 / accuracy 0.386
[2m[36m(launch_and_fit pid=7709)[0m [Client 6] fit, config: {}[32m [repeated 3x across cluster][0m
[2m[36m(launch_and_fit pid=7707)[0m Epoch 1: train loss 0.0543978214263916, accuracy 0.3611111111111111


DEBUG flwr 2023-05-23 13:06:58,768 | server.py:182 | evaluate_round 3 received 3 results and 0 failures
INFO flwr 2023-05-23 13:06:58,769 | server.py:147 | FL finished in 14.568740582966711
INFO flwr 2023-05-23 13:06:58,769 | app.py:218 | app_fit: losses_distributed [(1, 0.0626326318581899), (2, 0.05687365523974101), (3, 0.05304962309201558)]
INFO flwr 2023-05-23 13:06:58,769 | app.py:219 | app_fit: metrics_distributed_fit {}
INFO flwr 2023-05-23 13:06:58,769 | app.py:220 | app_fit: metrics_distributed {}
INFO flwr 2023-05-23 13:06:58,770 | app.py:221 | app_fit: losses_centralized [(0, 0.07388043880462647), (1, 0.06223023772239685), (2, 0.05776379370689392), (3, 0.054261915922164915)]
INFO flwr 2023-05-23 13:06:58,770 | app.py:222 | app_fit: metrics_centralized {'accuracy': [(0, 0.076), (1, 0.27), (2, 0.32), (3, 0.386)]}


[2m[36m(launch_and_fit pid=7708)[0m Epoch 1: train loss 0.05410698428750038, accuracy 0.358


History (loss, distributed):
	round 1: 0.0626326318581899
	round 2: 0.05687365523974101
	round 3: 0.05304962309201558
History (loss, centralized):
	round 0: 0.07388043880462647
	round 1: 0.06223023772239685
	round 2: 0.05776379370689392
	round 3: 0.054261915922164915
History (metrics, centralized):
{'accuracy': [(0, 0.076), (1, 0.27), (2, 0.32), (3, 0.386)]}
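
The centralized metrics are plain lists of `(round, value)` tuples, so they are easy to post-process. As a small sketch, the snippet below copies the accuracy list from the output above into a local variable (rather than reading it from a live simulation) and picks the best round; the variable names are chosen for this example.

```python
# Accuracy history copied from the "History (metrics, centralized)" output above
accuracy_history = [(0, 0.076), (1, 0.27), (2, 0.32), (3, 0.386)]

# Pick the (round, accuracy) pair with the highest accuracy
best_round, best_accuracy = max(accuracy_history, key=lambda item: item[1])
print(f"Best centralized accuracy {best_accuracy} in round {best_round}")

# Accuracy gained in each round relative to the previous one
gains = [
    (rnd, round(acc - prev_acc, 3))
    for (_, prev_acc), (rnd, acc) in zip(accuracy_history, accuracy_history[1:])
]
print(gains)
```

Round 0 corresponds to the evaluation of the initial parameters before any training takes place, which is why the centralized history has one more entry than the distributed one.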

# Sending/receiving arbitrary values to/from clients
In some situations, we want to configure client-side execution (training, evaluation) from the server side. One example is the server asking the clients to train for a certain number of local epochs. Flower provides a way to send configuration values from the server to the clients using a dictionary. Let’s look at an example where the clients receive values from the server through the `config` parameter in `fit` (`config` is also available in `evaluate`). In this example, `fit` reads `server_round` and `local_epochs` from this dictionary and uses those values to improve the logging and configure the number of local training epochs:

In [9]:
class FlowerClient(fl.client.NumPyClient):
    def __init__(self, cid, net, trainloader, valloader):
        self.cid = cid
        self.net = net
        self.trainloader = trainloader
        self.valloader = valloader

    def get_parameters(self, config):
        print(f"[Client {self.cid}] get_parameters")
        return get_parameters(self.net)

    def fit(self, parameters, config):
        # Read values from config
        server_round = config["server_round"]
        local_epochs = config["local_epochs"]

        # Use values provided by the config
        print(f"[Client {self.cid}, round {server_round}] fit, config: {config}")
        set_parameters(self.net, parameters)
        train(self.net, self.trainloader, epochs=local_epochs)
        return get_parameters(self.net), len(self.trainloader), {}

    def evaluate(self, parameters, config):
        print(f"[Client {self.cid}] evaluate, config: {config}")
        set_parameters(self.net, parameters)
        loss, accuracy = test(self.net, self.valloader)
        return float(loss), len(self.valloader), {"accuracy": float(accuracy)}


def client_fn(cid) -> FlowerClient:
    net = Net().to(DEVICE)
    trainloader = trainloaders[int(cid)]
    valloader = valloaders[int(cid)]
    return FlowerClient(cid, net, trainloader, valloader)
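
Note that `fit` above indexes the dictionary directly, so it would raise a `KeyError` if the server sent an empty config (as happened in the earlier runs, where the logs show `config: {}`). A more defensive variant could fall back to defaults. The helper below is a hypothetical sketch; the name `read_fit_config` and the default values are assumptions made for this example.

```python
from typing import Dict, Tuple, Union

# Simplified stand-in for the scalar value types a config dictionary can carry
Scalar = Union[bool, bytes, float, int, str]


def read_fit_config(
    config: Dict[str, Scalar],
    default_round: int = 0,
    default_epochs: int = 1,
) -> Tuple[int, int]:
    """Read `server_round` and `local_epochs`, falling back to defaults."""
    server_round = int(config.get("server_round", default_round))
    local_epochs = int(config.get("local_epochs", default_epochs))
    return server_round, local_epochs


print(read_fit_config({"server_round": 2, "local_epochs": 2}))  # (2, 2)
print(read_fit_config({}))                                      # (0, 1)
```

Whether silent defaults or a loud failure is preferable depends on the setup; failing fast can make a misconfigured strategy easier to spot.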

So how can we send this config dictionary from server to clients? The built-in Flower Strategies provide a way to do this, and it works similarly to the way server-side evaluation works. We provide a function to the strategy, and the strategy calls this function for every round of federated learning:

In [10]:
def fit_config(server_round: int):
    """Return training configuration dict for each round.

    Perform the first round of training with one local epoch, increase to two
    local epochs from the second round onwards.
    """
    config = {
        "server_round": server_round,  # The current round of federated learning
        "local_epochs": 1 if server_round < 2 else 2,  # One epoch in round 1, two afterwards
    }
    return config
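
To check what the clients will receive, the function can be called directly for each round before handing it to a strategy. The sketch below repeats the same schedule expression in a standalone `fit_config` so that it runs on its own:

```python
from typing import Dict


def fit_config(server_round: int) -> Dict[str, int]:
    """Same schedule as above, repeated here so the snippet is self-contained."""
    return {
        "server_round": server_round,
        "local_epochs": 1 if server_round < 2 else 2,
    }


for server_round in range(1, 4):
    print(fit_config(server_round))
# {'server_round': 1, 'local_epochs': 1}
# {'server_round': 2, 'local_epochs': 2}
# {'server_round': 3, 'local_epochs': 2}
```

The condition `server_round < 2` is true only for the first round, so two local epochs are used from round 2 onwards.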

Next, we’ll just pass this function to the FedAvg strategy before starting the simulation:

In [11]:
strategy = fl.server.strategy.FedAvg(
    fraction_fit=0.3,
    fraction_evaluate=0.3,
    min_fit_clients=3,
    min_evaluate_clients=3,
    min_available_clients=NUM_CLIENTS,
    initial_parameters=fl.common.ndarrays_to_parameters(get_parameters(Net())),
    evaluate_fn=evaluate,
    on_fit_config_fn=fit_config,  # Pass the fit_config function
)

fl.simulation.start_simulation(
    client_fn=client_fn,
    num_clients=NUM_CLIENTS,
    config=fl.server.ServerConfig(num_rounds=3),  # Just three rounds
    strategy=strategy,
    client_resources=client_resources,
)

INFO flwr 2023-05-23 13:06:58,789 | app.py:146 | Starting Flower simulation, config: ServerConfig(num_rounds=3, round_timeout=None)


[2m[36m(launch_and_evaluate pid=7707)[0m [Client 6] evaluate, config: {}
[2m[36m(launch_and_fit pid=7709)[0m Epoch 1: train loss 0.0548231303691864, accuracy 0.3526666666666667


2023-05-23 13:07:03,061	INFO worker.py:1625 -- Started a local Ray instance.
INFO flwr 2023-05-23 13:07:03,646 | app.py:180 | Flower VCE: Ray initialized with resources: {'object_store_memory': 2147483648.0, 'memory': 8005319066.0, 'node:127.0.0.1': 1.0, 'CPU': 8.0}
INFO flwr 2023-05-23 13:07:03,647 | server.py:86 | Initializing global parameters
INFO flwr 2023-05-23 13:07:03,647 | server.py:269 | Using initial parameters provided by strategy
INFO flwr 2023-05-23 13:07:03,647 | server.py:88 | Evaluating initial parameters
INFO flwr 2023-05-23 13:07:03,703 | server.py:91 | initial parameters (loss, other metrics): 0.0739201283454895, {'accuracy': 0.092}
INFO flwr 2023-05-23 13:07:03,703 | server.py:101 | FL starting
DEBUG flwr 2023-05-23 13:07:03,704 | server.py:218 | fit_round 1: strategy sampled 3 clients (out of 10)


Server-side evaluation loss 0.0739201283454895 / accuracy 0.092
[2m[36m(launch_and_fit pid=7731)[0m [Client 8, round 1] fit, config: {'server_round': 1, 'local_epochs': 1}


DEBUG flwr 2023-05-23 13:07:07,308 | server.py:232 | fit_round 1 received 3 results and 0 failures
INFO flwr 2023-05-23 13:07:07,363 | server.py:119 | fit progress: (1, 0.06284520053863525, {'accuracy': 0.276}, 3.658823541016318)
DEBUG flwr 2023-05-23 13:07:07,363 | server.py:168 | evaluate_round 1: strategy sampled 3 clients (out of 10)


[2m[36m(launch_and_fit pid=7732)[0m Epoch 1: train loss 0.06438609957695007, accuracy 0.23355555555555554
Server-side evaluation loss 0.06284520053863525 / accuracy 0.276


DEBUG flwr 2023-05-23 13:07:09,050 | server.py:182 | evaluate_round 1 received 3 results and 0 failures
DEBUG flwr 2023-05-23 13:07:09,051 | server.py:218 | fit_round 2: strategy sampled 3 clients (out of 10)


[2m[36m(launch_and_evaluate pid=7730)[0m [Client 1] evaluate, config: {}
[2m[36m(launch_and_fit pid=7730)[0m [Client 1, round 2] fit, config: {'server_round': 2, 'local_epochs': 2}[32m [repeated 5x across cluster][0m


DEBUG flwr 2023-05-23 13:07:13,274 | server.py:232 | fit_round 2 received 3 results and 0 failures
INFO flwr 2023-05-23 13:07:13,327 | server.py:119 | fit progress: (2, 0.05364082360267639, {'accuracy': 0.402}, 9.622870624996722)
DEBUG flwr 2023-05-23 13:07:13,327 | server.py:168 | evaluate_round 2: strategy sampled 3 clients (out of 10)


[2m[36m(launch_and_fit pid=7732)[0m Epoch 2: train loss 0.05363083258271217, accuracy 0.36466666666666664[32m [repeated 6x across cluster][0m
Server-side evaluation loss 0.05364082360267639 / accuracy 0.402


DEBUG flwr 2023-05-23 13:07:15,020 | server.py:182 | evaluate_round 2 received 3 results and 0 failures
DEBUG flwr 2023-05-23 13:07:15,020 | server.py:218 | fit_round 3: strategy sampled 3 clients (out of 10)


[2m[36m(launch_and_evaluate pid=7732)[0m [Client 6] evaluate, config: {}[32m [repeated 2x across cluster][0m
[2m[36m(launch_and_fit pid=7730)[0m [Client 0, round 3] fit, config: {'server_round': 3, 'local_epochs': 2}[32m [repeated 3x across cluster][0m


DEBUG flwr 2023-05-23 13:07:19,376 | server.py:232 | fit_round 3 received 3 results and 0 failures
INFO flwr 2023-05-23 13:07:19,431 | server.py:119 | fit progress: (3, 0.05107601547241211, {'accuracy': 0.428}, 15.727594875032082)
DEBUG flwr 2023-05-23 13:07:19,432 | server.py:168 | evaluate_round 3: strategy sampled 3 clients (out of 10)


[2m[36m(launch_and_fit pid=7731)[0m Epoch 2: train loss 0.04899758845567703, accuracy 0.42933333333333334[32m [repeated 6x across cluster][0m
Server-side evaluation loss 0.05107601547241211 / accuracy 0.428


DEBUG flwr 2023-05-23 13:07:21,094 | server.py:182 | evaluate_round 3 received 3 results and 0 failures
INFO flwr 2023-05-23 13:07:21,094 | server.py:147 | FL finished in 17.39067466603592
INFO flwr 2023-05-23 13:07:21,095 | app.py:218 | app_fit: losses_distributed [(1, 0.0625706917444865), (2, 0.05246995917956034), (3, 0.050402129093805946)]
INFO flwr 2023-05-23 13:07:21,095 | app.py:219 | app_fit: metrics_distributed_fit {}
INFO flwr 2023-05-23 13:07:21,095 | app.py:220 | app_fit: metrics_distributed {}
INFO flwr 2023-05-23 13:07:21,096 | app.py:221 | app_fit: losses_centralized [(0, 0.0739201283454895), (1, 0.06284520053863525), (2, 0.05364082360267639), (3, 0.05107601547241211)]
INFO flwr 2023-05-23 13:07:21,096 | app.py:222 | app_fit: metrics_centralized {'accuracy': [(0, 0.092), (1, 0.276), (2, 0.402), (3, 0.428)]}


[2m[36m(launch_and_evaluate pid=7731)[0m [Client 0] evaluate, config: {}[32m [repeated 4x across cluster][0m


History (loss, distributed):
	round 1: 0.0625706917444865
	round 2: 0.05246995917956034
	round 3: 0.050402129093805946
History (loss, centralized):
	round 0: 0.0739201283454895
	round 1: 0.06284520053863525
	round 2: 0.05364082360267639
	round 3: 0.05107601547241211
History (metrics, centralized):
{'accuracy': [(0, 0.092), (1, 0.276), (2, 0.402), (3, 0.428)]}

As we can see, the client logs now include the current round of federated learning (which they read from the config dictionary). We can also configure local training to run for one epoch during the first round of federated learning, and then for two epochs from the second round onward, which is what the local_epochs values in the logs for rounds 2 and 3 show.

Clients can also return arbitrary values to the server. To do so, they return a dictionary from fit and/or evaluate. We have seen and used this concept throughout this notebook without mentioning it explicitly: our FlowerClient returns a dictionary containing a custom key/value pair as the third return value in evaluate.
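
The empty metrics_distributed {} entry in the log above reflects that the strategy was not given a function for combining these client-side dictionaries. The following is a minimal sketch of such a combination, assuming each client reports an accuracy value together with its number of evaluation examples. The function name weighted_average and the sample values are illustrative and not part of the original notebook; Flower's FedAvg accepts a function of this shape through its evaluate_metrics_aggregation_fn argument.

```python
from typing import Dict, List, Tuple

Metrics = Dict[str, float]


def weighted_average(metrics: List[Tuple[int, Metrics]]) -> Metrics:
    """Average client accuracies, weighted by each client's example count."""
    total_examples = sum(num_examples for num_examples, _ in metrics)
    if total_examples == 0:
        # Nothing to aggregate (no clients or no examples reported)
        return {}
    weighted_sum = sum(num_examples * m["accuracy"] for num_examples, m in metrics)
    return {"accuracy": weighted_sum / total_examples}


# Hypothetical results from three clients: (num_examples, metrics dictionary)
client_results = [
    (5, {"accuracy": 0.2}),
    (5, {"accuracy": 0.4}),
    (10, {"accuracy": 0.6}),
]
print(weighted_average(client_results))
```

Weighting by the number of examples keeps clients with larger validation sets from being under-represented in the aggregated value.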

# Scaling federated learning
As a last step in this notebook, let’s see how we can use Flower to experiment with a large number of clients.

In [12]:
NUM_CLIENTS = 1000

trainloaders, valloaders, testloader = load_datasets(NUM_CLIENTS)

Files already downloaded and verified
Files already downloaded and verified


We now have 1000 partitions, each holding 45 training and 5 validation examples. Given that the number of training examples on each client is quite small, we should probably train the model a bit longer, so we configure the clients to perform 3 local training epochs. We should also adjust the fraction of clients selected during each round (we don’t want all 1000 clients participating in every round), so we set fraction_fit to 0.025, which means that only 2.5% of available clients (so 25 clients) will be selected for training each round, and fraction_evaluate to 0.05, so that 5% of available clients (so 50 clients) will be selected for evaluation:
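
These numbers can be checked with a little arithmetic. The following is a minimal sketch, assuming the 50,000-image CIFAR-10 training set, a validation share of one tenth of each partition (consistent with the 45/5 split stated above), and a sampling rule of "fraction of available clients, but at least the configured minimum", which matches the 25 and 50 sampled clients reported in the logs below. The helper names partition_sizes and num_sampled are hypothetical.

```python
CIFAR10_TRAIN_SIZE = 50_000  # number of images in the CIFAR-10 training set
NUM_CLIENTS = 1000


def partition_sizes(total: int, num_clients: int, val_divisor: int = 10):
    """Return (train, val) example counts for one client partition."""
    per_client = total // num_clients
    num_val = per_client // val_divisor  # assumed: one tenth held out for validation
    return per_client - num_val, num_val


def num_sampled(fraction: float, available: int, minimum: int) -> int:
    """Assumed rule: a fraction of the available clients, but at least `minimum`."""
    return max(int(available * fraction), minimum)


print(partition_sizes(CIFAR10_TRAIN_SIZE, NUM_CLIENTS))  # per-client (train, val)
print(num_sampled(0.025, NUM_CLIENTS, 20))  # clients sampled for fit
print(num_sampled(0.05, NUM_CLIENTS, 40))  # clients sampled for evaluate
```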

In [13]:
def fit_config(server_round: int):
    config = {
        "server_round": server_round,
        "local_epochs": 3,
    }
    return config


strategy = fl.server.strategy.FedAvg(
    fraction_fit=0.025,  # Train on 25 clients (each round)
    fraction_evaluate=0.05,  # Evaluate on 50 clients (each round)
    min_fit_clients=20,
    min_evaluate_clients=40,
    min_available_clients=NUM_CLIENTS,
    initial_parameters=fl.common.ndarrays_to_parameters(get_parameters(Net())),
    on_fit_config_fn=fit_config,
)

fl.simulation.start_simulation(
    client_fn=client_fn,
    num_clients=NUM_CLIENTS,
    config=fl.server.ServerConfig(num_rounds=3),  # Just three rounds
    strategy=strategy,
    client_resources=client_resources,
)

INFO flwr 2023-05-23 13:07:22,332 | app.py:146 | Starting Flower simulation, config: ServerConfig(num_rounds=3, round_timeout=None)


[2m[36m(launch_and_fit pid=7730)[0m Epoch 2: train loss 0.047881197184324265, accuracy 0.44533333333333336[32m [repeated 2x across cluster][0m
[2m[36m(launch_and_evaluate pid=7730)[0m [Client 2] evaluate, config: {}[32m [repeated 2x across cluster][0m


2023-05-23 13:07:26,174	INFO worker.py:1625 -- Started a local Ray instance.
INFO flwr 2023-05-23 13:07:26,788 | app.py:180 | Flower VCE: Ray initialized with resources: {'node:127.0.0.1': 1.0, 'CPU': 8.0, 'memory': 7935764071.0, 'object_store_memory': 2147483648.0}
INFO flwr 2023-05-23 13:07:26,790 | server.py:86 | Initializing global parameters
INFO flwr 2023-05-23 13:07:26,790 | server.py:269 | Using initial parameters provided by strategy
INFO flwr 2023-05-23 13:07:26,790 | server.py:88 | Evaluating initial parameters
INFO flwr 2023-05-23 13:07:26,791 | server.py:101 | FL starting
DEBUG flwr 2023-05-23 13:07:26,791 | server.py:218 | fit_round 1: strategy sampled 25 clients (out of 1000)


[2m[36m(launch_and_fit pid=7750)[0m [Client 275, round 1] fit, config: {'server_round': 1, 'local_epochs': 3}
[2m[36m(launch_and_fit pid=7753)[0m Epoch 1: train loss 0.10210651904344559, accuracy 0.1111111111111111
[2m[36m(launch_and_fit pid=7753)[0m [Client 208, round 1] fit, config: {'server_round': 1, 'local_epochs': 3}[32m [repeated 5x across cluster][0m


[2m[36m(raylet)[0m Spilled 2801 MiB, 32 objects, write throughput 1269 MiB/s. Set RAY_verbose_spill_logs=0 to disable this message.
DEBUG flwr 2023-05-23 13:07:43,845 | server.py:232 | fit_round 1 received 25 results and 0 failures
DEBUG flwr 2023-05-23 13:07:43,868 | server.py:168 | evaluate_round 1: strategy sampled 50 clients (out of 1000)


[2m[36m(launch_and_evaluate pid=7747)[0m [Client 717] evaluate, config: {}
[2m[36m(launch_and_fit pid=7752)[0m Epoch 3: train loss 0.10181637108325958, accuracy 0.15555555555555556[32m [repeated 74x across cluster][0m
[2m[36m(launch_and_fit pid=7747)[0m [Client 273, round 1] fit, config: {'server_round': 1, 'local_epochs': 3}[32m [repeated 19x across cluster][0m
[2m[36m(launch_and_evaluate pid=7747)[0m [Client 144] evaluate, config: {}
[2m[36m(launch_and_evaluate pid=7749)[0m [Client 116] evaluate, config: {}
[2m[36m(launch_and_evaluate pid=7752)[0m [Client 157] evaluate, config: {}
[2m[36m(launch_and_evaluate pid=7752)[0m [Client 492] evaluate, config: {}
[2m[36m(launch_and_evaluate pid=7752)[0m [Client 616] evaluate, config: {}
[2m[36m(launch_and_evaluate pid=7747)[0m [Client 633] evaluate, config: {}[32m [repeated 14x across cluster][0m
[2m[36m(launch_and_evaluate pid=7754)[0m [Client 722] evaluate, config: {}[32m [repeated 7x across cluster][0m

DEBUG flwr 2023-05-23 13:08:11,519 | server.py:182 | evaluate_round 1 received 50 results and 0 failures
DEBUG flwr 2023-05-23 13:08:11,520 | server.py:218 | fit_round 2: strategy sampled 25 clients (out of 1000)


[2m[36m(launch_and_evaluate pid=7750)[0m [Client 9] evaluate, config: {}[32m [repeated 17x across cluster][0m
[2m[36m(launch_and_fit pid=7747)[0m [Client 412, round 2] fit, config: {'server_round': 2, 'local_epochs': 3}
[2m[36m(launch_and_fit pid=7747)[0m Epoch 1: train loss 0.10265219211578369, accuracy 0.15555555555555556
[2m[36m(launch_and_fit pid=7747)[0m Epoch 2: train loss 0.10128385573625565, accuracy 0.24444444444444444
[2m[36m(launch_and_fit pid=7747)[0m Epoch 3: train loss 0.10030465573072433, accuracy 0.26666666666666666
[2m[36m(launch_and_fit pid=7747)[0m [Client 958, round 2] fit, config: {'server_round': 2, 'local_epochs': 3}
[2m[36m(launch_and_fit pid=7747)[0m Epoch 1: train loss 0.10269943624734879, accuracy 0.08888888888888889
[2m[36m(launch_and_fit pid=7747)[0m Epoch 2: train loss 0.10125084221363068, accuracy 0.17777777777777778
[2m[36m(launch_and_fit pid=7747)[0m Epoch 3: train loss 0.10079773515462875, accuracy 0.15555555555555556
[2m

[2m[36m(raylet)[0m Spilled 4718 MiB, 53 objects, write throughput 1412 MiB/s.
DEBUG flwr 2023-05-23 13:08:25,605 | server.py:232 | fit_round 2 received 25 results and 0 failures
DEBUG flwr 2023-05-23 13:08:25,624 | server.py:168 | evaluate_round 2: strategy sampled 50 clients (out of 1000)


[2m[36m(launch_and_evaluate pid=7748)[0m [Client 499] evaluate, config: {}
[2m[36m(launch_and_fit pid=7749)[0m [Client 637, round 2] fit, config: {'server_round': 2, 'local_epochs': 3}[32m [repeated 21x across cluster][0m
[2m[36m(launch_and_fit pid=7747)[0m Epoch 3: train loss 0.099659264087677, accuracy 0.17777777777777778[32m [repeated 66x across cluster][0m
[2m[36m(launch_and_evaluate pid=7748)[0m [Client 261] evaluate, config: {}[32m [repeated 2x across cluster][0m
[2m[36m(launch_and_evaluate pid=7751)[0m [Client 127] evaluate, config: {}[32m [repeated 14x across cluster][0m
[2m[36m(launch_and_evaluate pid=7747)[0m [Client 336] evaluate, config: {}[32m [repeated 11x across cluster][0m


DEBUG flwr 2023-05-23 13:08:54,585 | server.py:182 | evaluate_round 2 received 50 results and 0 failures
DEBUG flwr 2023-05-23 13:08:54,586 | server.py:218 | fit_round 3: strategy sampled 25 clients (out of 1000)


[2m[36m(launch_and_fit pid=7752)[0m [Client 914, round 3] fit, config: {'server_round': 3, 'local_epochs': 3}
[2m[36m(launch_and_fit pid=7752)[0m Epoch 1: train loss 0.1014876440167427, accuracy 0.24444444444444444
[2m[36m(launch_and_fit pid=7752)[0m Epoch 2: train loss 0.09928058832883835, accuracy 0.24444444444444444
[2m[36m(launch_and_evaluate pid=7752)[0m [Client 236] evaluate, config: {}[32m [repeated 22x across cluster][0m
[2m[36m(launch_and_fit pid=7752)[0m [Client 954, round 3] fit, config: {'server_round': 3, 'local_epochs': 3}[32m [repeated 5x across cluster][0m
[2m[36m(launch_and_fit pid=7750)[0m Epoch 3: train loss 0.0975508987903595, accuracy 0.28888888888888886[32m [repeated 13x across cluster][0m


DEBUG flwr 2023-05-23 13:09:08,655 | server.py:232 | fit_round 3 received 25 results and 0 failures
DEBUG flwr 2023-05-23 13:09:08,675 | server.py:168 | evaluate_round 3: strategy sampled 50 clients (out of 1000)


[2m[36m(launch_and_evaluate pid=7751)[0m [Client 251] evaluate, config: {}
[2m[36m(launch_and_fit pid=7751)[0m [Client 303, round 3] fit, config: {'server_round': 3, 'local_epochs': 3}[32m [repeated 19x across cluster][0m
[2m[36m(launch_and_fit pid=7751)[0m Epoch 3: train loss 0.09924256801605225, accuracy 0.26666666666666666[32m [repeated 60x across cluster][0m
[2m[36m(launch_and_evaluate pid=7751)[0m [Client 328] evaluate, config: {}
[2m[36m(launch_and_evaluate pid=7751)[0m [Client 560] evaluate, config: {}
[2m[36m(launch_and_evaluate pid=7751)[0m [Client 507] evaluate, config: {}
[2m[36m(launch_and_evaluate pid=7751)[0m [Client 122] evaluate, config: {}
[2m[36m(launch_and_evaluate pid=7751)[0m [Client 753] evaluate, config: {}
[2m[36m(launch_and_evaluate pid=7753)[0m [Client 803] evaluate, config: {}[32m [repeated 7x across cluster][0m
[2m[36m(launch_and_evaluate pid=7752)[0m [Client 420] evaluate, config: {}[32m [repeated 13x across cluster][0m

[2m[36m(raylet)[0m Spilled 8254 MiB, 83 objects, write throughput 1720 MiB/s.
DEBUG flwr 2023-05-23 13:09:36,307 | server.py:182 | evaluate_round 3 received 50 results and 0 failures
INFO flwr 2023-05-23 13:09:36,307 | server.py:147 | FL finished in 129.51526524999645
INFO flwr 2023-05-23 13:09:36,307 | app.py:218 | app_fit: losses_distributed [(1, 0.45949037551879895), (2, 0.45577912521362296), (3, 0.4533768520355224)]
INFO flwr 2023-05-23 13:09:36,308 | app.py:219 | app_fit: metrics_distributed_fit {}
INFO flwr 2023-05-23 13:09:36,308 | app.py:220 | app_fit: metrics_distributed {}
INFO flwr 2023-05-23 13:09:36,308 | app.py:221 | app_fit: losses_centralized []
INFO flwr 2023-05-23 13:09:36,308 | app.py:222 | app_fit: metrics_centralized {}


[2m[36m(launch_and_evaluate pid=7753)[0m [Client 528] evaluate, config: {}[32m [repeated 19x across cluster][0m


History (loss, distributed):
	round 1: 0.45949037551879895
	round 2: 0.45577912521362296
	round 3: 0.4533768520355224