This is the official implementation of the ICCV 2023 paper "Knowledge-Aware Federated Active Learning with Non-IID Data".
In this paper, we propose a federated active learning paradigm to efficiently learn a global model with a limited annotation budget while protecting data privacy in a decentralized learning manner. The main challenge faced by federated active learning is the mismatch between the active sampling goal of the global model on the server and that of the asynchronous local clients. This becomes even more significant when data is distributed non-IID across local clients.
We propose a federated active learning scheme, namely Knowledge-Aware Federated Active Learning (KAFAL), that comprises two key components: Knowledge-Specialized Active Sampling (KSAS) and Knowledge-Compensatory Federated Update (KCFU).
Because of this mismatch, data that is informative to a local client may not be informative to the global model under non-IID data distributions, so relying on either model alone for active sampling is unreliable. We therefore let each client intensify its specialized knowledge (knowledge of its common classes) when computing the discrepancy, so that it samples data that is informative with respect to that specialized knowledge. Concretely, we introduce a knowledge-specialized KL-divergence that amplifies the KL-divergence on the classes considered to carry the client's specialized knowledge (the frequent classes in its training data), and thus selects more informative data points from those classes for labelling.
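As a rough illustration of this idea, the sketch below scores unlabelled samples with a class-weighted KL-divergence between the local and global model predictions, where the weighting amplifies the client's frequent classes. The function name, the weighting formula, and the `beta` parameter are illustrative assumptions, not the paper's exact formulation (see `main.py` for the actual implementation).

```python
import torch
import torch.nn.functional as F


def specialized_kl_scores(local_logits, global_logits, class_freq, beta=2.0):
    """Score unlabelled samples by a class-weighted KL divergence (sketch).

    local_logits, global_logits: (N, C) logits from the local and global
        models on the client's unlabelled pool.
    class_freq: (C,) per-class counts in this client's training data.
    beta: amplification strength for the client's frequent ("specialized")
        classes; the exact weighting here is an illustrative assumption.
    """
    p_local = F.softmax(local_logits, dim=1)
    log_p_local = F.log_softmax(local_logits, dim=1)
    log_p_global = F.log_softmax(global_logits, dim=1)

    # Per-class KL terms: p_local * (log p_local - log p_global), shape (N, C).
    kl_terms = p_local * (log_p_local - log_p_global)

    # Up-weight terms for classes that are frequent on this client, so the
    # discrepancy is dominated by the client's specialized knowledge.
    freq = class_freq.float() / class_freq.sum()
    weights = 1.0 + beta * freq / freq.max()

    return (kl_terms * weights).sum(dim=1)  # one score per unlabelled sample


# Example query step: annotate the `budget` highest-scoring samples.
# scores = specialized_kl_scores(local_logits, global_logits, class_freq)
# query_idx = torch.topk(scores, k=budget).indices
```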
The local data on each client follows its own realistic distribution [18], leaving a non-uniform class distribution on each client. Moreover, KSAS, which tends to annotate data carrying specialized knowledge, further increases the imbalance in the labelled data. We therefore introduce KCFU, which combines a balanced classifier with a knowledge-compensatory strategy: the former prevents the model from becoming biased towards common classes during training, and the latter compensates for each client's knowledge of its weak classes to reduce the statistical heterogeneity across clients.
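A minimal sketch of how such a compensatory local update could look is given below: the supervised loss is combined with a distillation term from the frozen global model that is weighted towards the client's weak (rare) classes. The loss form, the weighting, and the temperature `tau` are assumptions for illustration only, and the balanced classifier component is not shown.

```python
import torch
import torch.nn.functional as F


def kcfu_local_loss(local_logits, global_logits, labels, class_freq, tau=1.0):
    """Illustrative local objective with a knowledge-compensatory term.

    local_logits: (N, C) logits from the client's local model.
    global_logits: (N, C) logits from the frozen global model on the same batch.
    labels: (N,) ground-truth labels of the client's labelled data.
    class_freq: (C,) per-class counts on this client; rare classes are "weak".
    tau: softmax temperature (an assumed hyper-parameter).
    """
    # Standard supervised loss on the client's labelled data.
    ce = F.cross_entropy(local_logits, labels)

    # Rare (weak) classes on this client receive larger compensation weights,
    # so the global model's knowledge fills in what the client lacks.
    inv_freq = 1.0 / (class_freq.float() + 1e-8)
    weights = inv_freq / inv_freq.sum()

    # Class-weighted soft-label distillation from the global model.
    p_global = F.softmax(global_logits / tau, dim=1)
    log_p_local = F.log_softmax(local_logits / tau, dim=1)
    kd = -(weights * p_global * log_p_local).sum(dim=1).mean()

    return ce + kd
```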
- Run
  ```
  git clone https://github.com/anonydoe/Knowledge-Aware-Federated-Active-Learning-with-Non-IID-Data.git
  ```
  to download the project.
- Run
  ```
  pip install -r requirements.txt
  ```
  to install the required packages.
- Run the code with
  ```
  python3 main.py
  ```
  or alternatively
  ```
  python3 -u main.py > log.txt
  ```
  to save the log to a file.
```bibtex
@misc{cao2023knowledgeaware,
  title={Knowledge-Aware Federated Active Learning with Non-IID Data},
  author={Yu-Tong Cao and Ye Shi and Baosheng Yu and Jingya Wang and Dacheng Tao},
  year={2023},
  eprint={2211.13579},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
```