
[ICML 2023] Optimizing the Collaboration Structure in Cross-Silo Federated Learning. Wenxuan Bao, Haohan Wang, Jun Wu, Jingrui He.


baowenxuan/FedCollab


FedCollab

This is the official implementation of the following paper:

Wenxuan Bao, Haohan Wang, Jun Wu, Jingrui He. Optimizing the Collaboration Structure in Cross-Silo Federated Learning. ICML 2023.

[Website] [Arxiv] [Poster] [Slides]

Introduction

FedCollab is an algorithm that alleviates negative transfer in federated learning by clustering clients into non-overlapping coalitions based on (1) the clients' data quantities and (2) their pairwise distribution distances.
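To make the idea of a "collaboration structure" concrete, here is a minimal sketch that brute-forces every partition of a handful of clients and keeps the cheapest one. The function names and, in particular, `toy_cost` are hypothetical: the cost is an illustrative stand-in that rewards larger coalitions and penalizes distribution distance. It is not the objective optimized in the paper or in this repository.

```python
from typing import Iterator, List, Sequence

Partition = List[List[int]]


def partitions(items: Sequence[int]) -> Iterator[Partition]:
    """Yield every split of `items` into non-overlapping, non-empty groups."""
    if not items:
        yield []
        return
    first, rest = items[0], list(items[1:])
    for sub in partitions(rest):
        # Put `first` into each existing group in turn...
        for i in range(len(sub)):
            yield sub[:i] + [[first] + sub[i]] + sub[i + 1:]
        # ...or give it a group of its own.
        yield [[first]] + sub


def toy_cost(partition: Partition, sizes: Sequence[int],
             dist: Sequence[Sequence[float]]) -> float:
    """Hypothetical per-client cost summed over all clients (assumption):
    a 1/sqrt(coalition size) term plus the size-weighted mean distance
    from the client to the members of its coalition."""
    total = 0.0
    for group in partition:
        n_group = sum(sizes[j] for j in group)
        for i in group:
            shift = sum(sizes[j] * dist[i][j] for j in group) / n_group
            total += 1.0 / n_group ** 0.5 + shift
    return total


def best_partition(sizes: Sequence[int],
                   dist: Sequence[Sequence[float]]) -> Partition:
    """Exhaustive search; only feasible for a handful of clients."""
    clients = list(range(len(sizes)))
    return min(partitions(clients), key=lambda p: toy_cost(p, sizes, dist))


if __name__ == "__main__":
    # Four equally sized clients: 0 and 1 share a distribution, as do 2 and 3.
    sizes = [100, 100, 100, 100]
    dist = [[0.0, 0.0, 1.0, 1.0],
            [0.0, 0.0, 1.0, 1.0],
            [1.0, 1.0, 0.0, 0.0],
            [1.0, 1.0, 0.0, 0.0]]
    print(sorted(sorted(g) for g in best_partition(sizes, dist)))
```

The number of partitions grows very quickly with the number of clients, so exhaustive search like this is only useful as an illustration on tiny inputs.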

Requirements

  • python 3.8.5
  • cudatoolkit 10.2.89
  • cudnn 7.6.5
  • pytorch 1.11.0
  • torchvision 0.12.0
  • numpy 1.18.5
  • tqdm 4.65.0
  • matplotlib 3.7.1
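As a convenience, the pinned Python packages above can be compared against the active environment. The sketch below is not part of the repository. It assumes the packages are registered under the distribution names `torch`, `torchvision`, `numpy`, `tqdm`, and `matplotlib`. It skips `cudatoolkit` and `cudnn`, which are not Python distributions.

```python
from importlib import metadata  # in the standard library since Python 3.8

# Versions copied from the requirements list; distribution names are assumed.
PINNED = {
    "torch": "1.11.0",
    "torchvision": "0.12.0",
    "numpy": "1.18.5",
    "tqdm": "4.65.0",
    "matplotlib": "3.7.1",
}


def compare_versions(pinned, lookup=metadata.version):
    """Return {name: (wanted, found, matches)} for each pinned package.

    `found` is None when the package is not installed. A local build
    suffix such as "+cu102" is ignored when comparing.
    """
    report = {}
    for name, wanted in pinned.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        matches = found is not None and found.split("+")[0] == wanted
        report[name] = (wanted, found, matches)
    return report


if __name__ == "__main__":
    for name, (wanted, found, ok) in compare_versions(PINNED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name:12s} wanted {wanted:8s} found {found}  [{status}]")
```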

Run

Here we provide an example script for experiments with FedAvg.

  1. Generate the data partition
    cd ./example/${setting}
    ./data_prepare.sh

Replace ${setting} with one of label_shift, feature_shift, or concept_shift. The dataset and its partition will be saved to ~/data.

  2. Estimate the pairwise divergence between each pair of clients
    ./distance_estimate.sh

The estimated pairwise divergence will be saved to ./divergence.

  3. Solve for the optimal collaboration structure
    ./collaboration_solve.sh

The solved collaboration structure will be saved to ./collab.

  4. Run clustered federated learning
    ./clustered_fl.sh

The training history will be saved to ./history.

  5. Calculate the metrics (Acc, IPR, RSD)
    ./stats.sh

The statistics will be printed to the console.
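The five steps above can also be chained from Python. The following is a minimal sketch, not something the repository ships. `build_commands` and `run_pipeline` are hypothetical helper names, and the sketch assumes the scripts are Bash scripts that can be invoked with `bash` from inside `example/<setting>`.

```python
import subprocess
from pathlib import Path
from typing import List

# Setting names and script names are taken from the steps above.
SETTINGS = ("label_shift", "feature_shift", "concept_shift")
STEPS = (
    "data_prepare.sh",         # 1. generate the data partition
    "distance_estimate.sh",    # 2. estimate pairwise divergence
    "collaboration_solve.sh",  # 3. solve for the collaboration structure
    "clustered_fl.sh",         # 4. run clustered federated learning
    "stats.sh",                # 5. print Acc, IPR, RSD
)


def build_commands(setting: str) -> List[List[str]]:
    """Return the commands for one setting, in execution order."""
    if setting not in SETTINGS:
        raise ValueError(f"unknown setting {setting!r}; expected one of {SETTINGS}")
    # Assumption: the scripts are Bash scripts, so `bash <script>` works
    # even if the execute bit is not set.
    return [["bash", step] for step in STEPS]


def run_pipeline(repo_root: str, setting: str) -> None:
    """Run all steps inside <repo_root>/example/<setting>, stopping on failure."""
    workdir = Path(repo_root) / "example" / setting
    for cmd in build_commands(setting):
        print("running:", " ".join(cmd))
        # check=True raises CalledProcessError if a step exits non-zero.
        subprocess.run(cmd, cwd=workdir, check=True)
```

For example, `run_pipeline(".", "label_shift")` called from the repository root would run all five scripts for the label shift setting and abort at the first failing step.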

Expected outputs

The expected estimated pairwise divergence, collaboration structure, and training history are included in this repository. We also list the expected accuracy (Acc), incentivized participation rate (IPR), and reward standard deviation (RSD) below. Note that these are results from a single seed, whereas the paper reports results over five different random seeds.

Label shift (FashionMNIST)

| Method | Acc | IPR | RSD |
|---|---|---|---|
| Local Train | 0.8580 | - | - |
| FedAvg (Global Train) | 0.4674 | 0.4500 | 0.4076 |
| FedAvg + FedCollab | 0.9247 | 1.0000 | 0.0626 |

Feature shift (Rotated CIFAR-10)

| Method | Acc | IPR | RSD |
|---|---|---|---|
| Local Train | 0.3829 | - | - |
| FedAvg (Global Train) | 0.4447 | 0.9000 | 0.0408 |
| FedAvg + FedCollab | 0.5286 | 1.0000 | 0.0413 |

Concept shift (Coarse CIFAR-100)

| Method | Acc | IPR | RSD |
|---|---|---|---|
| Local Train | 0.3032 | - | - |
| FedAvg (Global Train) | 0.2649 | 0.5000 | 0.1111 |
| FedAvg + FedCollab | 0.4041 | 1.0000 | 0.0256 |
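For readers who want to compute comparable numbers from their own per-client accuracies, the sketch below shows one plausible reading of the three metrics. It rests on assumptions that may differ from what `stats.sh` does: a client's reward is taken to be its accuracy under the method minus its local-training accuracy, IPR is the fraction of clients with a non-negative reward, and RSD is the population standard deviation of the rewards. The function name `summarize` is hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, Sequence


def summarize(local_acc: Sequence[float],
              method_acc: Sequence[float]) -> Dict[str, float]:
    """Compute Acc, IPR and RSD from per-client accuracies (assumed definitions)."""
    if not local_acc or len(local_acc) != len(method_acc):
        raise ValueError("need one local and one method accuracy per client")
    # Assumed reward: accuracy gain over training on local data only.
    rewards = [m - l for l, m in zip(local_acc, method_acc)]
    return {
        "Acc": mean(method_acc),
        # Share of clients that are not hurt by collaborating.
        "IPR": sum(r >= 0 for r in rewards) / len(rewards),
        # Spread of the rewards across clients (population std, an assumption).
        "RSD": pstdev(rewards),
    }


if __name__ == "__main__":
    # Made-up accuracies for four clients, for illustration only.
    print(summarize([0.5, 0.5, 0.5, 0.5], [0.6, 0.4, 0.7, 0.5]))
```

Under these definitions the Local Train row has no IPR or RSD, since every reward would be zero by construction, which is consistent with the dashes in the tables above.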

Citation

If you find this paper or repository helpful in your research, please consider giving a star ⭐️ and citing our paper:

@inproceedings{bao2023fedcollab,
  author    = {Wenxuan Bao and Haohan Wang and Jun Wu and Jingrui He},
  title     = {Optimizing the Collaboration Structure in Cross-Silo Federated Learning},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning (ICML'23)},
  year      = {2023},
}
