FedER: Federated Learning through Experience Replay and Privacy-Preserving Data Synthesis

Matteo Pennisi, Federica Proietto Salanitri, Giovanni Bellitto, Bruno Casella, Marco Aldinucci, Simone Palazzo, Concetto Spampinato

Overview

Official PyTorch implementation of the paper "FedER: Federated Learning through Experience Replay and Privacy-Preserving Data Synthesis".

Abstract

In the medical field, multi-center collaborations are often sought to yield more generalizable findings by leveraging the heterogeneity of patient and clinical data. However, recent privacy regulations hinder the possibility of sharing data and, consequently, of developing machine learning-based solutions that support diagnosis and prognosis. Federated learning (FL) aims at sidestepping this limitation by bringing AI-based solutions to data owners and only sharing local AI models, or parts thereof, which then need to be aggregated. However, most existing federated learning solutions are still in their infancy and show several shortcomings, from the lack of a reliable and effective aggregation scheme able to retain the knowledge learned locally, to weak privacy preservation, as real data may be reconstructed from model updates. Furthermore, the majority of these approaches, especially those dealing with medical data, rely on a centralized distributed learning strategy that poses robustness, scalability and trust issues. In this paper we present a federated and decentralized learning strategy, FedER, that, by exploiting experience replay and generative adversarial concepts, effectively integrates features from local nodes, providing models able to generalize across multiple datasets while maintaining privacy. FedER is tested on two tasks, tuberculosis and melanoma classification, using multiple datasets in order to simulate realistic non-i.i.d. medical data scenarios. Results show that our approach achieves performance comparable to standard (non-federated) learning and significantly outperforms state-of-the-art federated methods in their centralized (thus, more favourable) formulation.

Method

How to run

The code expects a JSON file containing the image paths and their respective labels, formatted as follows:

{
  "train": {
    "pos": [
      {"image": #path, "label": #class},
      ...
      {"image": #path, "label": #class}
    ],
    "neg": [
      {"image": #path, "label": #class},
      ...
    ]
  },
  "test": [
    {"image": #path, "label": #class},
    ...
    {"image": #path, "label": #class}
  ]
}
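
A file in this layout can be generated with a few lines of Python. The following is a minimal sketch and not part of the repository: the helper names `build_split` and `check_split`, the example paths, and the use of integer labels 1 and 0 for the "pos" and "neg" groups are assumptions made for illustration. Adapt the label values to whatever the dataset loaders actually expect.

```python
import json


def build_split(train_pos, train_neg, test_items, pos_label=1, neg_label=0):
    """Assemble the nested dict; test_items is a list of (path, label) pairs.

    The default label values are an assumption, not taken from the repository.
    """
    def entries(paths, label):
        return [{"image": str(p), "label": label} for p in paths]

    return {
        "train": {
            "pos": entries(train_pos, pos_label),
            "neg": entries(train_neg, neg_label),
        },
        "test": [{"image": str(p), "label": lbl} for p, lbl in test_items],
    }


def check_split(split):
    """Raise ValueError if any entry has keys other than 'image' and 'label'."""
    groups = [split["train"]["pos"], split["train"]["neg"], split["test"]]
    for group in groups:
        for item in group:
            if set(item) != {"image", "label"}:
                raise ValueError(f"unexpected keys in entry: {item}")
    return True


if __name__ == "__main__":
    # Hypothetical paths, used only to show the resulting structure.
    split = build_split(
        train_pos=["data/pos/0001.png"],
        train_neg=["data/neg/0001.png"],
        test_items=[("data/test/0001.png", 1)],
    )
    check_split(split)
    print(json.dumps(split, indent=2))
```

Redirecting the printed output to a file gives a JSON document with the "train"/"pos", "train"/"neg" and "test" groups described above.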

Pre-requisites:

NVIDIA GPU (tested on an NVIDIA GeForce RTX 3090)
Requirements (see env.yml)

Train Example

python federated_simulation.py --n_nodes 2 --dataset Tuberculosis --n_rounds 100 --num_epochs 100 --buffer_size 512 --learning_rate 1e-4 --setting non-IID

Notes

The GAN weights can be downloaded from here
