FederatedScope is a flexible FL framework that enables users to implement complex FL algorithms simply and intuitively. In this tutorial, we show how to implement diverse personalized FL algorithms.

## Background

In an FL course, multiple clients cooperatively learn models without directly sharing their private data. As a result, these clients can differ arbitrarily in their underlying **data distribution** and in **system resources** such as computational power and communication bandwidth.

- On one hand, data quantity skew, feature distribution skew, label distribution skew, and temporal skew are pervasive in real-world applications, since different users generate data through different usage patterns.
  Simply applying a shared global model to all participants might lead to sub-optimal performance.
- On the other hand, the degree of participation can vary across FL participants because of their different hardware capabilities and network conditions.

It is challenging to make full use of local data under such systematic heterogeneity. As a natural and effective approach to these challenges, personalization has gained increasing attention in recent years. Personalized FL (pFL) creates strong demand for various customized FL implementations; for example, personalization may exist in

- Model objects, optimizers and hyper-parameters
- Model sub-modules
- Client-end behaviors such as regularization and multi-model interaction
- Server-end behaviors such as model interpolation

We have implemented several state-of-the-art (SOTA) pFL methods to meet the above requirements. We demonstrate FedBN in this playground. More examples can be found in our [tutorial](https://federatedscope.io/docs/pfl/), which shows how powerful and flexible the FederatedScope framework is for implementing pFL extensions.


In [None]:
import os
import sys

file_dir = os.path.join(os.getcwd(), '..')
sys.path.append(file_dir)

## The necessity of personalization: A multi-task example

### Setting
Federated learning can be modeled well as multi-task learning by viewing each client's learning as a subtask. A common multi-task learning scenario is that individual subtasks use different, non-overlapping features and labels.
Although the features and labels are superficially different, the capabilities of the subtask models can be transferred and enhanced through the federated process.

Note that when the features and labels have different shapes, the input and output layers must be personalized so that the remaining parameters stay compatible for aggregation.
Here we use a multi-task graph dataset as an example, since it is relatively small and lets us check the performance quickly.
For larger datasets, please refer to the examples in `scripts/personalization_exp_scripts`.
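
To make the compatibility point concrete, here is a minimal sketch in plain Python that does not use any FederatedScope API. The parameter names and shapes are made up for illustration. It separates the parameters that two clients could average element-wise from those that must stay local because their shapes differ.

```python
# Hypothetical parameter shapes for two clients whose tasks differ in
# input feature size and number of classes (names are illustrative only).
client_a = {"encoder.weight": (9, 64), "gnn.0.weight": (64, 64), "clf.weight": (64, 2)}
client_b = {"encoder.weight": (5, 64), "gnn.0.weight": (64, 64), "clf.weight": (64, 3)}


def split_by_compatibility(shapes_a, shapes_b):
    """Return (shareable, local_only) parameter names for two clients."""
    shareable, local_only = [], []
    for name in sorted(shapes_a.keys() & shapes_b.keys()):
        if shapes_a[name] == shapes_b[name]:
            shareable.append(name)   # same shape: can be averaged element-wise
        else:
            local_only.append(name)  # shape mismatch: must be personalized
    return shareable, local_only


shareable, local_only = split_by_compatibility(client_a, client_b)
print("shareable:", shareable)
print("local only:", local_only)
```

In this toy setup only the middle layer is shareable, which mirrors the idea of keeping the input and output layers local.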

### Configuration
**FederatedScope** organizes its configuration through an extension of `yacs.config.CfgNode`. Please refer to our [official documentation](https://federatedscope.io/docs/own-case/) for specific instructions on how to configure the initial `global_cfg` and customize your own configuration. By default, we provide built-in personalization-related configurations in `federatedscope/core/configs/`.

In [1]:
from federatedscope.core.configs.config import global_cfg
import torch

pfl_cfg = global_cfg.clone()
pfl_cfg.merge_from_file("FederatedScope/federatedscope/gfl/baseline/fedavg_gnn_minibatch_on_multi_task.yaml")
pfl_cfg.use_gpu = torch.cuda.is_available()

print(pfl_cfg.personalization)



K: 5
beta: 1.0
cfg_check_funcs: []
local_param: ['encoder_atom', 'encoder', 'clf']
local_update_steps: 1
lr: 0.5
regular_weight: 0.1
share_non_trainable_para: False


As you can see, the input and output layers are set as local personalized parameters through configuration alone.
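
As a rough illustration of how such a keyword list could partition a model, the sketch below assumes that a parameter is kept local whenever its name contains one of the `local_param` keywords. The actual matching rule inside FederatedScope may differ, and the parameter names here are hypothetical.

```python
local_param = ["encoder_atom", "encoder", "clf"]

# Hypothetical parameter names, loosely modeled on a GNN for graph classification.
param_names = [
    "encoder_atom.atom_embedding_list.0.weight",
    "encoder.weight",
    "gnn.convs.0.nn.0.weight",
    "gnn.convs.1.nn.0.weight",
    "linear.0.weight",
    "clf.weight",
    "clf.bias",
]


def is_local(name, keywords):
    """Assumed rule: a parameter is local if its name contains any keyword."""
    return any(keyword in name for keyword in keywords)


shared_names = [n for n in param_names if not is_local(n, local_param)]
local_names = [n for n in param_names if is_local(n, local_param)]

print("shared with the server:", shared_names)
print("kept on the client:", local_names)
```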

### 1. Prepare dataset and model
Now, let's load the graph multi-task dataset and check the GIN model.

In [2]:
from federatedscope.core.auxiliaries.data_builder import get_data

data, modified_cfg = get_data(pfl_cfg.clone())
pfl_cfg.merge_from_other_cfg(modified_cfg)
print(f"=====The data config is =====\n{pfl_cfg.data}\n")
print(f"=====The data meta-info is =====\n{data}\n")
print(f"=====The train data at client 1 is =====\n{data[1]['train'].dataset}\n")

=====The data config is =====
batch_size: 64
cSBM_phi: [0.5, 0.5, 0.5]
cfg_check_funcs: []
drop_last: False
graphsaint:
  cfg_check_funcs: []
  num_steps: 30
  walk_length: 2
loader: 
num_workers: 0
pre_transforms: constant_feat
root: data/
shuffle: True
sizes: [10, 5]
splits: [0.8, 0.1, 0.1]
splitter: louvain
subsample: 1.0
transforms: 
type: graph_multi_domain_mol

=====The data meta-info is =====
{1: {'num_label': 2, 'train': <torch_geometric.loader.dataloader.DataLoader object at 0x7f1f033a2250>, 'val': <torch_geometric.loader.dataloader.DataLoader object at 0x7f1ed40fa460>, 'test': <torch_geometric.loader.dataloader.DataLoader object at 0x7f1ed40fe430>}, 2: {'num_label': 2, 'train': <torch_geometric.loader.dataloader.DataLoader object at 0x7f1ed4103400>, 'val': <torch_geometric.loader.dataloader.DataLoader object at 0x7f1ed4027c10>, 'test': <torch_geometric.loader.dataloader.DataLoader object at 0x7f1ed402cbe0>}, 3: {'num_label': 2, 'train': <torch_geometric.loader.dataloader.Data

In [3]:
print(f"=====The model config is =====\n{pfl_cfg.model}\n")

=====The model config is =====
cfg_check_funcs: []
dropout: 0.5
embed_size: 8
gnn_layer: 2
graph_pooling: mean
hidden: 64
in_channels: 0
model_num_per_trainer: 1
num_item: 0
num_user: 0
out_channels: 0
task: graph
type: gin
use_bias: True



### 2. Check the performance w/o FedBN
Now, let's check the FL performance of FedAvg. We first check several task-specific configurations specified by our yaml:

In [4]:
print(f"=====Evaluation related configs=====\n{pfl_cfg.eval}\n")
print(f"=====Federated setting related configs=====\n{pfl_cfg.federate}\n")

=====Evaluation related configs=====
best_res_update_round_wise_key: val_loss
cfg_check_funcs: []
freq: 5
metrics: ['acc', 'correct']
monitoring: []
report: ['weighted_avg', 'avg', 'fairness', 'raw']
save_data: False
split: ['test', 'val']

=====Federated setting related configs=====
batch_or_epoch: batch
cfg_check_funcs: []
client_num: 7
data_weighted_aggr: False
ignore_weight: False
join_in_info: []
local_update_steps: 1
make_global_eval: False
method: FedAvg
mode: standalone
online_aggr: False
restore_from: 
sample_client_num: 7
sample_client_rate: -1.0
save_to: 
share_local_model: False
total_round_num: 400
use_ss: False



We can also see that we keep only the parameters of the pre-layers and the last layer local to handle the multi-task case:

In [5]:
print(pfl_cfg.personalization) 

K: 5
beta: 1.0
cfg_check_funcs: []
local_param: ['encoder_atom', 'encoder', 'clf']
local_update_steps: 1
lr: 0.5
regular_weight: 0.1
share_non_trainable_para: False


Now let's run our API to see the performance of FedAvg! This will take a few minutes.

In [6]:
from federatedscope.core.fed_runner import FedRunner
from federatedscope.core.auxiliaries.worker_builder import get_server_cls, get_client_cls
from federatedscope.core.auxiliaries.utils import setup_seed, update_logger

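# Set up logging and seed the random number generators from the config
update_logger(pfl_cfg)
setup_seed(pfl_cfg.seed)

# Build the runner with the server and client classes chosen by the config, then start the FL course
Fed_runner = FedRunner(data=data,
                       server_class=get_server_cls(pfl_cfg),
                       client_class=get_client_cls(pfl_cfg),
                       config=pfl_cfg.clone())
Fed_runner.run()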

2022-04-24 06:17:42,822 (utils:89) INFO: the output dir is exp/
2022-04-24 06:17:43,102 (fed_runner:181) INFO: Server #0 has been set up ... 
2022-04-24 06:17:43,115 (fed_runner:220) INFO: Client 1 has been set up ... 
2022-04-24 06:17:43,137 (fed_runner:220) INFO: Client 2 has been set up ... 
2022-04-24 06:17:43,153 (fed_runner:220) INFO: Client 3 has been set up ... 
2022-04-24 06:17:43,175 (fed_runner:220) INFO: Client 4 has been set up ... 
2022-04-24 06:17:43,186 (fed_runner:220) INFO: Client 5 has been set up ... 
2022-04-24 06:17:43,203 (fed_runner:220) INFO: Client 6 has been set up ... 
2022-04-24 06:17:43,220 (fed_runner:220) INFO: Client 7 has been set up ... 
2022-04-24 06:17:43,221 (trainer:260) INFO: Model meta-info: <class 'federatedscope.gfl.model.graph_level.GNN_Net_Graph'>.
2022-04-24 06:17:43,222 (trainer:268) INFO: Num of original para names: 43.
2022-04-24 06:17:43,222 (trainer:269) INFO: Num of original trainable para names: 29.
2022-04-24 06:17:43,223 (trainer:2

{'client_individual': {'val_loss': 10.580272614955902,
  'test_correct': 12.0,
  'test_acc': 0.46472019464720193,
  'test_avg_loss': 0.4127966678142548,
  'test_total': 19.0,
  'test_loss': 25.78902691602707,
  'val_correct': 14.0,
  'val_acc': 0.5158150851581509,
  'val_avg_loss': 0.5246114015579224,
  'val_total': 19.0},
 'client_summarized_weighted_avg': {'val_loss': 176.37432350558842,
  'test_correct': 72.85714285714286,
  'test_acc': 0.6151990349819059,
  'test_avg_loss': 0.6274365048644338,
  'test_total': 118.42857142857143,
  'test_loss': 174.82009630626015,
  'val_correct': 74.0,
  'val_acc': 0.626360338573156,
  'val_avg_loss': 0.6361808064770959,
  'val_total': 118.14285714285714},
 'client_summarized_avg': {'val_loss': 75.16021813665118,
  'test_correct': 72.85714285714286,
  'test_acc': 0.6680161353452165,
  'test_avg_loss': 0.6111628547531363,
  'test_total': 118.42857142857143,
  'test_loss': 74.30640893323081,
  'val_correct': 74.0,
  'val_acc': 0.6889739882351612,
  '

We finally log the best results based on the validation datasets.
You can see that FedAvg achieves 0.615 accuracy under the weighted summarization. Besides, the standard deviation of test accuracy is 0.140, and the top and bottom accuracy deciles are 0.86 and 0.465, respectively.
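
For readers unfamiliar with these summaries, the following sketch shows one way such numbers could be computed from per-client results using only the standard library. The per-client values are hypothetical (not the tutorial's data), and the decile definition is an assumption that may differ from what FederatedScope reports.

```python
import statistics

# Hypothetical per-client test accuracies and test-set sizes (not the tutorial's data).
test_acc = [0.5, 0.6, 0.7, 0.8]
test_total = [10, 20, 30, 40]

# Plain average: every client counts equally.
avg_acc = statistics.fmean(test_acc)

# Weighted average: clients with more test samples count more.
weighted_acc = sum(a * n for a, n in zip(test_acc, test_total)) / sum(test_total)

# Spread across clients (population standard deviation).
std_acc = statistics.pstdev(test_acc)

# Decile cut points; the first and last are taken as the bottom and top deciles.
deciles = statistics.quantiles(test_acc, n=10, method="inclusive")
bottom_decile, top_decile = deciles[0], deciles[-1]

print(f"avg={avg_acc:.3f} weighted={weighted_acc:.3f} std={std_acc:.3f}")
print(f"bottom decile={bottom_decile:.3f} top decile={top_decile:.3f}")
```

A small standard deviation and a high bottom decile both indicate that accuracy is spread more evenly across clients.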

### 3. Check the performance w/ FedBN
Now, let's use FedBN and see how the performance changes!

In [7]:
print(f"=====[Before] Federated personalization related configs=====\n{pfl_cfg.personalization}\n")

pfl_cfg.merge_from_file("FederatedScope/scripts/personalization_exp_scripts/fedbn/fedbn_gnn_minibatch_on_multi_task.yaml")
pfl_cfg.use_gpu = torch.cuda.is_available()

print(f"=====[After] Federated personalization related configs=====\n{pfl_cfg.personalization}\n")

=====[Before] Federated personalization related configs=====
K: 5
beta: 1.0
cfg_check_funcs: []
local_param: ['encoder_atom', 'encoder', 'clf']
local_update_steps: 1
lr: 0.5
regular_weight: 0.1
share_non_trainable_para: False

=====[After] Federated personalization related configs=====
K: 5
beta: 1.0
cfg_check_funcs: []
local_param: ['encoder_atom', 'encoder', 'clf', 'norms']
local_update_steps: 1
lr: 0.5
regular_weight: 0.1
share_non_trainable_para: False



The modification here is to add the `norms` parameters to `local_param`, so that they are filtered out of the aggregation and kept local. Let's run the API to check the performance of FedBN!
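
As a rough picture of what this setting means for one aggregation round, the sketch below averages only the parameters that match no local keyword and leaves the rest untouched on each client. It is a simplified illustration in plain Python, not FederatedScope's implementation: the parameter names are hypothetical, parameters are flattened to lists of floats, and a uniform average is assumed.

```python
def aggregate_shared(client_states, local_keywords):
    """Average only parameters whose names match no local keyword."""
    shared_names = [
        name for name in client_states[0]
        if not any(keyword in name for keyword in local_keywords)
    ]
    averaged = {}
    for name in shared_names:
        # Group the i-th entry of this parameter across all clients.
        columns = zip(*(state[name] for state in client_states))
        averaged[name] = [sum(values) / len(client_states) for values in columns]
    return averaged


def apply_global(state, averaged):
    """Overwrite shared parameters; leave local ones untouched."""
    return {name: averaged.get(name, values) for name, values in state.items()}


# Hypothetical two-client example.
local_keywords = ["encoder_atom", "encoder", "clf", "norms"]
client_1 = {"gnn.convs.0.weight": [1.0, 3.0], "gnn.norms.0.weight": [0.5], "clf.weight": [2.0]}
client_2 = {"gnn.convs.0.weight": [3.0, 5.0], "gnn.norms.0.weight": [1.5], "clf.weight": [4.0]}

averaged = aggregate_shared([client_1, client_2], local_keywords)
new_client_1 = apply_global(client_1, averaged)
new_client_2 = apply_global(client_2, averaged)
print(new_client_1)
print(new_client_2)
```

After the round, both clients hold the same convolution weights, while each keeps its own normalization and classifier parameters.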

In [8]:
from federatedscope.core.fed_runner import FedRunner
from federatedscope.core.auxiliaries.worker_builder import get_server_cls, get_client_cls


Fed_runner = FedRunner(data=data,
                       server_class=get_server_cls(pfl_cfg),
                       client_class=get_client_cls(pfl_cfg),
                       config=pfl_cfg.clone())
Fed_runner.run()

2022-04-24 06:23:27,658 (fed_runner:181) INFO: Server #0 has been set up ... 
2022-04-24 06:23:27,669 (fed_runner:220) INFO: Client 1 has been set up ... 
2022-04-24 06:23:27,691 (fed_runner:220) INFO: Client 2 has been set up ... 
2022-04-24 06:23:27,708 (fed_runner:220) INFO: Client 3 has been set up ... 
2022-04-24 06:23:27,730 (fed_runner:220) INFO: Client 4 has been set up ... 
2022-04-24 06:23:27,742 (fed_runner:220) INFO: Client 5 has been set up ... 
2022-04-24 06:23:27,759 (fed_runner:220) INFO: Client 6 has been set up ... 
2022-04-24 06:23:27,778 (fed_runner:220) INFO: Client 7 has been set up ... 
2022-04-24 06:23:27,779 (trainer:260) INFO: Model meta-info: <class 'federatedscope.gfl.model.graph_level.GNN_Net_Graph'>.
2022-04-24 06:23:27,780 (trainer:268) INFO: Num of original para names: 43.
2022-04-24 06:23:27,781 (trainer:269) INFO: Num of original trainable para names: 29.
2022-04-24 06:23:27,781 (trainer:272) INFO: Num of preserved para names in local update: 12. 
Pres

{'client_individual': {'val_loss': 5.833064764738083,
  'test_correct': 16.0,
  'test_acc': 0.4605263157894737,
  'test_avg_loss': 0.179232417345047,
  'test_total': 19.0,
  'test_loss': 7.801190614700317,
  'val_correct': 16.0,
  'val_acc': 0.5588235294117647,
  'val_avg_loss': 0.2110978102684021,
  'val_total': 19.0},
 'client_summarized_weighted_avg': {'val_loss': 146.69982919835698,
  'test_correct': 87.71428571428571,
  'test_acc': 0.7406513872135102,
  'test_avg_loss': 0.4958598550826994,
  'test_total': 118.42857142857143,
  'test_loss': 147.5788726323836,
  'val_correct': 84.85714285714286,
  'val_acc': 0.7182587666263603,
  'val_avg_loss': 0.5176812625493508,
  'val_total': 118.14285714285714},
 'client_summarized_avg': {'val_loss': 61.16034344690187,
  'test_correct': 89.0,
  'test_acc': 0.7788663172997756,
  'test_avg_loss': 0.46336370370533403,
  'test_total': 118.42857142857143,
  'test_loss': 58.72397426622255,
  'val_correct': 84.85714285714286,
  'val_acc': 0.7392645588

We can see that this simple personalized method achieves 0.741 accuracy under the weighted summarization. Besides, the standard deviation of test accuracy is 0.138, and the top and bottom accuracy deciles are 0.927 and 0.514, respectively.

Compared with FedAvg, we gain a significant accuracy improvement (12.6 percentage points)! The fairness-related metrics also improve: the standard deviation, the top decile, and the bottom decile improve by 0.2, 6.7, and 4.9 percentage points, respectively.
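
The comparison can be reproduced from the rounded numbers reported in this tutorial. The short sketch below computes the absolute changes in percentage points and, as an extra reference, the relative accuracy gain over FedAvg; the dictionary keys are names chosen here for illustration.

```python
# Numbers reported in this tutorial (test split, rounded to three decimals).
fedavg = {"weighted_acc": 0.615, "std": 0.140, "top_decile": 0.860, "bottom_decile": 0.465}
fedbn = {"weighted_acc": 0.741, "std": 0.138, "top_decile": 0.927, "bottom_decile": 0.514}

# Absolute change in percentage points; for std, a decrease is the improvement.
delta_points = {key: round((fedbn[key] - fedavg[key]) * 100, 1) for key in fedavg}

# Relative accuracy gain over the FedAvg baseline, in percent.
relative_gain = round(
    (fedbn["weighted_acc"] - fedavg["weighted_acc"]) / fedavg["weighted_acc"] * 100, 1
)

print(delta_points)
print(f"relative accuracy gain: {relative_gain}%")
```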

Try other pFL algorithms from our [tutorial](https://federatedscope.io/docs/pfl/), and you are welcome to [contribute](https://federatedscope.io/docs/contributor/) more pFL algorithms!