nliulab/FedScore

FedScore: A privacy-preserving framework for federated scoring system development

FedScore is a framework for developing scoring systems across multiple sites in a privacy-preserving way. The R and Python code provided in this repository implements the proposed FedScore algorithm.

The FedScore paper describes the framework for binary outcomes, and our latest preprint extends FedScore to survival outcomes. An application of FedScore to two real-world EHR datasets is described in this tiny paper.

Introduction

Cross-institutional collaboration has gained popularity in recent years as a way to accelerate medical research and facilitate quality improvement. Federated learning (FL) avoids data sharing by training algorithms collectively without exchanging patient-level data. However, most FL applications on medical image data use black-box models from computer vision. Interpretable models, by contrast, have seen far fewer FL applications despite their popularity in clinical research.

As a type of interpretable risk scoring model, scoring systems have been employed in practically every diagnostic area of medicine. However, they have usually been developed from single-source data, which limits their application at other sites when the development data are too small or not representative. Although scoring systems can be developed on pooled data, pooling is time-consuming and often infeasible because of privacy restrictions.

To fill this gap, we propose FedScore, a first-of-its-kind framework for building federated scoring systems across multiple sites.
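For readers unfamiliar with scoring systems, the sketch below shows the general idea of a point-based score: each variable is split into intervals, each interval carries integer points, and a patient's risk score is the sum of those points. The variables, cut-offs, and point values are hypothetical and chosen only for illustration; they are not taken from FedScore or from any published score.

```python
from bisect import bisect_right

# Hypothetical point table (illustration only, not a FedScore output).
# For each variable: sorted cut-offs and the points for each interval,
# so len(points) == len(cutoffs) + 1.
SCORE_TABLE = {
    "age": {"cutoffs": [50, 70], "points": [0, 2, 4]},
    "heart_rate": {"cutoffs": [60, 100], "points": [1, 0, 3]},
    "systolic_bp": {"cutoffs": [90, 140], "points": [5, 0, 1]},
}


def total_score(patient, table=SCORE_TABLE):
    """Sum the points of the interval that each variable's value falls into."""
    score = 0
    for name, spec in table.items():
        # bisect_right returns the index of the interval containing the value;
        # a value equal to a cut-off falls into the upper interval.
        interval = bisect_right(spec["cutoffs"], patient[name])
        score += spec["points"][interval]
    return score


patient = {"age": 72, "heart_rate": 88, "systolic_bp": 85}
print(total_score(patient))  # 4 (age) + 0 (heart rate) + 5 (blood pressure) = 9
```

In a federated setting, the goal is to arrive at one such shared table without any site revealing its patient-level records.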

The figure below provides a high-level overview of the FedScore algorithm (example of survival outcomes):

Figure 1: Overview of the FedScore algorithm (survival outcomes)

Versions

FedScore is available in two programming languages, each targeting a different type of FL framework (engineering-based or statistics-based). For a comprehensive overview, refer to the review paper and benchmarking study.

R Version (This Repository)

  • Programming Language: R
  • FL Algorithms (Model-Specific): ODAL2, dCLR, ODAC & ODACH
  • FL Framework Type: Statistics-Based
  • Communication Efficiency: One-shot
  • Types of outcomes supported: Binary and survival outcomes

Python Version

  • Programming Language: Python
  • FL Algorithms (Model-Agnostic): FedAvg etc. (Availability consistent with the Flower framework)
  • FL Framework Type: Engineering-Based
  • Communication Efficiency: Requires multiple rounds of communications
  • Types of outcomes supported: Binary outcomes only
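
To make the "multiple rounds of communications" entry concrete, here is a minimal sketch of the aggregation step in FedAvg: the server replaces the global parameters with the sample-size-weighted average of the parameters returned by each site. This is a generic illustration in plain Python, not the code used by the Python version of FedScore or by Flower; the function name `fedavg` and the example numbers are assumptions.

```python
def fedavg(site_weights, site_sizes):
    """Return the sample-size-weighted average of per-site parameter vectors."""
    total = sum(site_sizes)
    n_params = len(site_weights[0])
    return [
        sum(w[j] * n for w, n in zip(site_weights, site_sizes)) / total
        for j in range(n_params)
    ]


# Hypothetical local coefficient vectors from three sites after one round.
site_weights = [[0.2, -1.0], [0.4, -0.8], [0.0, -1.2]]
site_sizes = [100, 300, 100]

global_weights = fedavg(site_weights, site_sizes)
print(global_weights)
```

In a multi-round scheme, the sites would continue training from `global_weights` and the averaging would repeat every round. The one-shot algorithms listed for the R version avoid this loop: each site communicates its summary information once.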

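In summary, choose the R version for one-shot, statistics-based FL with binary or survival outcomes, and the Python version for multi-round, engineering-based FL with binary outcomes. Refer to the respective repositories for detailed documentation and implementation details.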

Usage

System requirements

To run the R and Python code, you will need:

  • R packages: AutoScore, tidyverse, ggplot2, mle.tools, rjson, doParallel, foreach, dplyr, survival, data.table, pda, survAUC, rstudioapi
  • Python modules: sys (part of the Python standard library, so no separate installation is needed)

Running the demo

To run the demo scripts, follow the step-by-step instructions provided in examples. We have provided a demo for homogeneous data with binary outcomes and two demos for homogeneous and heterogeneous data with survival outcomes.

Citation

Li, S., Ning, Y., Ong, M.E., Chakraborty, B., Hong, C., Xie, F., ... & Liu, N. (2023). FedScore: A privacy-preserving framework for federated scoring system development. Journal of Biomedical Informatics, 104485. ISSN 1532-0464. https://doi.org/10.1016/j.jbi.2023.104485

Li, S., Shang, Y., Wang, Z., Wu, Q., Hong, C., Ning, Y., ... & Liu, N. (2024). Developing Federated Time-to-Event Scores Using Heterogeneous Real-World Survival Data. arXiv preprint arXiv:2403.05229.

Contact
