FedCSGA: Evolutionary Client Selection with Joint Statistical and System Heterogeneity in Federated Learning
This repository implements a Federated Learning (FL) framework that enables adaptive client selection using Genetic Algorithms (GA). The framework provides comprehensive tools for comparing and evaluating different client selection strategies across various data distributions and heterogeneity levels in federated learning simulations.
Note: This code has been written and tested on Ubuntu and should work seamlessly on any Linux-based distribution. Windows users may need to adjust some steps.
Prerequisites:

- Git
- Python 3.x
- pip (Python package installer)
- Miniconda (recommended for environment management)
Follow these steps to set up and use this repository:
1. Clone the repository:

   ```bash
   git clone https://github.com/nclabteam/FedCSGA.git
   ```

2. After cloning, navigate to the cloned directory and open a terminal.

3. Ensure `pip` is installed. If not, install it using:

   ```bash
   sudo apt install python3-pip
   ```

4. We will use Miniconda to create a virtual environment. (If you already have Conda installed, skip steps 4 and 5.) Download Miniconda with:

   ```bash
   curl https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -o Miniconda3-latest-Linux-x86_64.sh
   ```

5. Then install Miniconda using the command below:

   ```bash
   bash Miniconda3-latest-Linux-x86_64.sh
   ```

6. Create a new virtual environment using Conda:
   ```bash
   conda env create -f environment.yaml
   ```

   This will create a virtual environment named `fedcsga` based on the `environment.yaml` file.
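   For orientation only, a Conda environment file generally looks like the sketch below. The package list here is a guess, not the repository's actual file, so always use the `environment.yaml` that ships with the repo.

   ```yaml
   # Illustrative sketch only; install from the repo's real environment.yaml.
   name: fedcsga        # the environment name used by `conda activate` in the next step
   channels:
     - defaults
   dependencies:
     - python=3.10      # any Python 3.x satisfies the prerequisites
     - pip
   ```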
7. Activate the virtual environment:

   ```bash
   conda deactivate
   conda activate fedcsga
   ```
8. You can change the configuration as needed in the `config.yaml` file.

9. To scale to a few hundred clients, we can run Flower in simulation mode on a single machine like this:

   ```bash
   python main.py
   ```

   or

   ```bash
   python main.py /path/to/config.yaml
   ```

   This script reads the configuration from the `config.yaml` file and starts the simulation. The outputs will be saved in the `out` directory.
🔧 Description of the config.yaml file
The `config.yaml` file is the configuration file used to train a Federated Learning model with this framework. It is divided into three sections: `common`, `server`, and `client`.
The `common` section contains the configurations shared across the framework:

- `data_type`: The data distribution type used in the training process. Currently supported data distributions are [`iid`, `dirichlet_niid`]. A detailed explanation can be found here.
- `dataset`: The dataset used in the training process.
- `dirichlet_alpha`: Used when `data_type` is set to `dirichlet_niid`. It specifies the Dirichlet concentration parameter.
- `target_acc`: The target accuracy the model should achieve (must be > 0).
- `model`: The model architecture used in the training process.
- `optimizer`: The optimizer used in the training process. It can be either `sgd` or `adam`.
- `seed`: Fixes the random seed for reproducibility.
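As an illustrative sketch only: the field names below come from the list above, but every value, including the dataset and model names, is a placeholder; check the repository's `config.yaml` for the options it actually supports.

```yaml
# Illustrative values only; see the repo's config.yaml for real options.
common:
  data_type: dirichlet_niid   # iid or dirichlet_niid
  dataset: cifar10            # placeholder dataset name
  dirichlet_alpha: 0.5        # only read when data_type is dirichlet_niid
  target_acc: 0.8             # target accuracy to reach (> 0)
  model: cnn                  # placeholder model name
  optimizer: sgd              # sgd or adam
  seed: 42                    # fixed seed for reproducibility
```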
The `server` section contains the configurations for the server that coordinates the Federated Learning process:

- `num_rounds`: The maximum number of rounds for the training process.
- `num_clients`: The total number of clients participating in training.
- `fraction_fit`: The fraction of participating clients used for training in each round.
- `fraction_evaluate`: The fraction of participating clients used for evaluation in each round.
- `min_fit_clients`: The minimum number of participating clients required for training in each round.
- `address`: The IP address of the server.
- `strategy`: The strategy used for Federated Learning. Currently supported strategies are [`fedcsga`, `fedavg`, `fedprox`, `powd`, `greedyfed`, `afl`, `greedyfl`].
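Again as a hedged sketch with placeholder values, not the repository's defaults:

```yaml
# Illustrative values only.
server:
  num_rounds: 100          # maximum number of training rounds
  num_clients: 100         # total clients in the federation
  fraction_fit: 0.1        # fraction of clients that train each round
  fraction_evaluate: 0.1   # fraction of clients that evaluate each round
  min_fit_clients: 10      # minimum clients required for training per round
  address: "0.0.0.0:8080"  # placeholder server address
  strategy: fedcsga        # fedcsga, fedavg, fedprox, powd, greedyfed, afl, greedyfl
```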
The `client` section contains the configurations for the clients participating in the Federated Learning process (an illustrative sketch follows the list):

- `epochs`: The number of epochs for each client's training process.
- `batch_size`: The batch size for each client's training process.
- `lr`: The learning rate for each client's training process.
- `save_train_res`: Whether to save the training results. If `true`, training results (accuracy, loss, time, etc.) are saved to the `out` directory.
- `total_cpus`: Number of CPU cores assigned to the whole simulation.
- `total_gpus`: Number of GPUs assigned to the whole simulation.
- `gpu`: `true` or `false`; whether to use a GPU for training. Defaults to `false`.
- `num_cpus`: Number of CPU cores assigned to each client. The default is `1`.
- `num_gpus`: Fraction of a GPU assigned to each client. (`num_cpus` and `num_gpus` apply only in simulation mode, i.e. when `simulation` is set to `true`.)

For more details on simulation mode, please refer to Flower's simulation guide and Ray's scheduling documentation.
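An illustrative sketch of the `client` section, again with placeholder values rather than the repository's defaults:

```yaml
# Illustrative values only.
client:
  epochs: 5              # local epochs per client per round
  batch_size: 32
  lr: 0.01
  save_train_res: true   # write accuracy/loss/time results to the out directory
  total_cpus: 8          # CPU cores for the whole simulation
  total_gpus: 1          # GPUs for the whole simulation
  gpu: false             # whether clients train on GPU
  num_cpus: 1            # CPU cores per simulated client
  num_gpus: 0.1          # fraction of a GPU per simulated client
```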
🧹 Code formatting

To format the codebase, install `isort` and `black`, then run:

```bash
pip install isort black
isort mak --profile=black
black mak
```

Or run:

```bash
bash code_formatting.sh
```