IDA_2024_demo

1. Introduction

This repository allows you to re-create the series of experiments presented in the paper 'Amplified Contribution Analysis for Federated Learning', as well as to run custom experiments using the attached libraries for generating federated data splits and performing federated training.

2. How to Install

a) Clone the repo using:

git clone https://github.com/MKZuziak/IDA_2024_demo.git

b) Navigate to the cloned folder and install the virtual environment with all the dependencies:

poetry install

c) Enter the shell inside the poetry virtual environment:

poetry shell

or run individual commands inside the virtual environment without entering the shell:

poetry run [cmd]
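For example (the cli.py invocation itself is described in the next section):

poetry run python cli.py -m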

3. How to Use

The Demo Library supports two ways of configuring your simulation: you can either run the simulation from a pre-configured JSON file, or insert all the parameters manually. Navigate inside the ida_2024_demo/ida_2024_demo folder and run:

python cli.py -c -p configuration.json

or

python cli.py -m

to insert all the parameters manually. To change the directory in which the results of the simulation will be stored, use the '-s' option with the desired global path:

python cli.py -c configuration.json -s global/path/to/desired/directory || python cli.py -m -s global/path/to/desired/directory

4. Using the JSON format

All the required parameters are placed inside the configuration.json file. The template is as follows:

{
    "dataset": "mnist",      ### You can select: mnist, fmnist and cifar10 [str]
    "split": "homogeneous",      ### homogeneous, heterogeneous_size or dirchlet [str]
    "clients": 15,      ### Number of clients [int]
    "transformations": [0, 1, 2, 3, 4, 5],      ### Select clients (by ID) which will be noised out [list[int]]
    "iterations": 40,      ### Number of iterations [int]
    "sample_size": 3,      ### Size of the sample  [int]
    "global_learning_rate": 1.0,      ### Global Learning Rate [float]
    "local_epochs": 3,      ### Number of local epochs [int]
    "optimizer": "Adam",    ### Optimizer to choose from [str]
    "batch_size": 32,      ### Batch size [int]
    "learning_rate": 0.001,      ### Local learning rate [float]
    "alpha-amplification": 2      ### Alpha amplification [int]
}

Please note that in this case, alpha-amplification must be an int, as this is interpreted as the number of copies that will be added to the set (Remark no. 1 in the original paper).
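As a quick sanity check before launching a run, the configuration file can be loaded and inspected with a few lines of Python. This is only an illustrative sketch: the key names follow the template above, but the checks themselves are not part of the demo library.

import json

# Illustrative sketch (not part of the demo library): load configuration.json
# and verify a few of the fields described in the template above.
with open("configuration.json") as f:
    config = json.load(f)

assert config["dataset"] in {"mnist", "fmnist", "cifar10"}
assert config["split"] in {"homogeneous", "heterogeneous_size", "dirchlet"}
assert isinstance(config["clients"], int) and config["clients"] > 0
# Remark no. 1: alpha-amplification is a number of copies, so it must be an int.
assert isinstance(config["alpha-amplification"], int)

print("configuration.json looks well-formed")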

5. Output format

The script produces two kinds of output: dataset generation output and simulation output. The dataset generation output is saved in the current working directory at the time the python cli.py script is invoked. It contains the dataset in the universal HuggingFace format, HuggingFace Arrow files that allow quick loading of the cached dataset, the distribution blueprint, and visualizations of the transformations; it can be discarded or preserved, depending on your preferences. The simulation output is saved in a 'demo_results' folder and contains an 'archiver' subfolder with all the results: metrics from all training rounds, contribution evaluation metrics, preserved global and local models, and more.
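For instance, assuming the HuggingFace datasets package is available in the environment, the cached dataset written during generation can be reloaded with load_from_disk; the path below is only a placeholder for wherever the Arrow files ended up in your working directory.

from datasets import load_from_disk

# Placeholder path: point this at the directory containing the generated
# HuggingFace Arrow files inside your working directory.
dataset = load_from_disk("path/to/generated/dataset")
print(dataset)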
