FedScore: A privacy-preserving framework for federated scoring system development

FedScore is a framework for developing scoring systems across multiple sites in a privacy-preserving way. The R and Python code provided in this repository implements the proposed FedScore algorithm.

The FedScore paper describes the framework for binary outcomes, and our latest preprint extends FedScore to survival outcomes. An implementation of FedScore on two real-world EHR datasets is described in this tiny paper.

Introduction

Cross-institutional collaboration has gained popularity in recent years as a way to accelerate medical research and facilitate quality improvement. Federated learning (FL) avoids data sharing by training algorithms collectively without exchanging patient-level data. However, most FL applications to medical imaging data use black-box models from computer vision. Interpretable models, by contrast, have seen far fewer FL applications despite their popularity in clinical research.

Scoring systems are a type of interpretable risk model and have been employed in practically every diagnostic area of medicine. However, they have usually been created from single-source data, which limits their application at other sites when the development data has an insufficient sample size or is not representative. Although scoring systems can be developed on pooled data, pooling is time-consuming and difficult to achieve because of privacy restrictions.

To fill this gap, we propose FedScore, a first-of-its-kind framework for building federated scoring systems across multiple sites.

The figure below provides a high-level overview of the FedScore algorithm (example of survival outcomes):

Versions

FedScore is available in two programming languages, each targeting a different type of FL framework (statistics-based for R, engineering-based for Python). For a comprehensive overview, refer to the review paper and benchmarking study.

R Version (This Repository)

Programming Language: R
FL Algorithms (Model-Specific): ODAL2, dCLR, ODAC & ODACH
FL Framework Type: Statistics-Based
Communication Efficiency: One-shot
Types of outcomes supported: Binary and survival outcomes

Python Version (FedScore-Python Repository)

Programming Language: Python
FL Algorithms (Model-Agnostic): FedAvg etc. (Availability consistent with the Flower framework)
FL Framework Type: Engineering-Based
Communication Efficiency: Requires multiple rounds of communication
Types of outcomes supported: Binary outcomes only

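In summary, choose the R version for one-shot, statistics-based FL or when survival outcomes are needed, and the Python version for model-agnostic, multi-round FL on binary outcomes. Refer to the respective repositories for detailed documentation and implementation details.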
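To make the difference in communication cost concrete, here is a minimal sketch of the aggregation step at the heart of FedAvg, the algorithm named for the Python version. It is not the FedScore or Flower implementation: the function name `fedavg` and the example values are hypothetical and chosen only for illustration. Each site sends its locally trained parameter vector and its sample size, and the server returns the sample-size-weighted average.

```python
from typing import List, Sequence


def fedavg(site_params: Sequence[Sequence[float]],
           site_sizes: Sequence[int]) -> List[float]:
    """Sample-size-weighted average of per-site parameter vectors.

    Hypothetical helper for illustration; not part of FedScore or Flower.
    """
    if not site_params or len(site_params) != len(site_sizes):
        raise ValueError("need one sample size per site and at least one site")
    dim = len(site_params[0])
    if any(len(p) != dim for p in site_params):
        raise ValueError("all sites must send vectors of the same length")
    total = sum(site_sizes)
    if total <= 0:
        raise ValueError("total sample size must be positive")
    # Weight each site's parameters by its share of the total sample size.
    return [
        sum(p[j] * n for p, n in zip(site_params, site_sizes)) / total
        for j in range(dim)
    ]


# Two hypothetical sites: site A has 1 record, site B has 3 records.
global_params = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_params)  # [2.5, 3.5]
```

In a full FedAvg run this step is repeated: the server sends the averaged parameters back, each site continues training locally, and the cycle continues for several rounds. That repetition is why the engineering-based version needs multiple rounds of communication, whereas the one-shot algorithms in the R version need only a single exchange of summary information.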

Usage

System requirements

To run the R and Python code, you will need:

R packages: AutoScore, tidyverse, ggplot2, mle.tools, rjson, doParallel, foreach, dplyr, survival, data.table, pda, survAUC, rstudioapi
Python packages: sys (part of the Python standard library, so no separate installation is needed)

Running the demo

To run the demo scripts, follow the step-by-step instructions provided in the examples folder. There are three demos: one for homogeneous data with binary outcomes, and two for survival outcomes (one with homogeneous data and one with heterogeneous data).

Citation

Li, S., Ning, Y., Ong, M.E., Chakraborty, B., Hong, C., Xie, F., ... & Liu, N. (2023). FedScore: A privacy-preserving framework for federated scoring system development. Journal of Biomedical Informatics, 104485. ISSN 1532-0464. https://doi.org/10.1016/j.jbi.2023.104485

Li, S., Shang, Y., Wang, Z., Wu, Q., Hong, C., Ning, Y., ... & Liu, N. (2024). Developing Federated Time-to-Event Scores Using Heterogeneous Real-World Survival Data. arXiv preprint arXiv:2403.05229.

Contact

Siqi Li (Email: siqili@u.duke.nus.edu)
Qiming Wu (Email: wuqiming@duke-nus.edu.sg)
Nan Liu (Email: liu.nan@duke-nus.edu.sg)

