Dynamic averaging framework for federated machine learning algorithms
- Free software: MIT license
- Python >= 3.6
- Protocol Buffers
- Occupancy data (https://archive.ics.uci.edu/ml/datasets/Occupancy+Detection+)
- MNIST (http://yann.lecun.com/exdb/mnist/)
- Fashion MNIST (https://github.com/zalandoresearch/fashion-mnist)
- Parameter quantization
- Fault-tolerant training - the server keeps track of each client's most representative data samples. If a worker goes down, the server redistributes those samples to the clients that are still working
- Periodic communication - workers communicate parameters only at a selected interval
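The periodic and dynamic-averaging communication schemes can be sketched as follows. This is a minimal, self-contained illustration in plain NumPy, not the package's API; the names comm_period and delta_threshold mirror the config options described further below.

import numpy as np

def local_training(w, grad_fn, n_iterations=100, lr=0.01,
                   comm_period=10, delta_threshold=None):
    """Sketch of a worker's communication schedule.

    Periodic scheme: push parameters every comm_period local steps.
    Dynamic averaging: additionally push whenever the local model has
    drifted more than delta_threshold from the last synchronised copy.
    """
    w_ref = w.copy()  # last parameters the server has seen
    n_sends = 0
    for i in range(n_iterations):
        w = w - lr * grad_fn(w)  # one local SGD step
        drifted = (delta_threshold is not None
                   and float(np.sum((w - w_ref) ** 2)) > delta_threshold)
        if (i + 1) % comm_period == 0 or drifted:
            w_ref = w.copy()  # stand-in for an actual network send
            n_sends += 1
    return w, n_sends

# Toy run: minimise ||w||^2 (gradient 2w) and count the "sends".
w_final, n_sends = local_training(np.ones(5), lambda w: 2 * w, delta_threshold=0.5)
print(n_sends, w_final)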
git clone https://github.com/sashlinreddy/dyn-fed.git
First, you need to compile the protocol buffer file. The definitions are in the dfl.proto file.
Compilation is executed with the following command:
protoc -I=protos/ --python_out=dyn_fed/proto/ protos/dfl.proto
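If compilation succeeds, the generated bindings should import cleanly. A quick check (dfl_pb2 is the module name protoc derives from dfl.proto; the message names it lists depend on the definitions in that file):

from dyn_fed.proto import dfl_pb2  # module generated by protoc

# List the message types defined in dfl.proto.
print(list(dfl_pb2.DESCRIPTOR.message_types_by_name))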
Start a tmux session and set the log directory:

tmux
export LOGDIR=${PWD}/logs
./scripts/client_local.sh $N_WORKERS -v $VERBOSE -m $MODEL_TYPE # Run in a separate window
./scripts/server_local.sh $N_WORKERS -v $VERBOSE -m $MODEL_TYPE # Enable "setw synchronize-panes on" in tmux; use Ctrl+B, : to open the tmux command prompt
To view the results on TensorBoard, assuming you are in the parent directory:
tensorboard --logdir=logs
Go to http://localhost:6006.
The Slurm launch script generates a multi-prog configuration on the fly with the desired arguments. The command below launches a job with the default arguments specified in the server execution script; alternatively, arguments can be passed to the job submission as follows:
sbatch -n $ntasks dyn-fed/scripts/slurm_launch.sh -m $MODEL_TYPE -v $VERBOSE
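For reference, a Slurm multi-prog file simply maps task ranks to the command each rank runs, one line per rank or range of ranks. The generated file will look roughly like the following (illustrative only; the actual commands, scripts, and arguments are produced by slurm_launch.sh):

0 scripts/server_local.sh ...
1-3 scripts/client_local.sh ...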
To run the training experiments (from within the project's virtual environment, shown here as (ftml)):

(ftml) $ python dyn-fed/examples/train_experiments.py
To cancel all of your queued jobs whose IDs start with $JOBIDSTART:

squeue -u $USER | grep $JOBIDSTART | awk '{print $1}' | xargs -n 1 scancel
The model can be configured in the config file. The dataset can be configured in this file as well, along with the following parameters:
- model
  - type: Type of model
  - n_iterations: Number of iterations
  - shuffle: Whether or not to shuffle the data in each iteration
- data
  - name: Dataset name
  - shuffle: Whether or not to shuffle the dataset # Not used
  - batch_size: Data batch size
  - shuffle_buffer_size: Shuffle buffer size
  - noniid: Whether or not to make the dataset non-IID
  - unbalanced: Whether or not to make the dataset unbalanced
- optimizer
  - name: Name of the optimizer (currently supports sgd and adam)
  - learning_rate: Rate at which the model learns
    - MNIST: SGD: 0.01, Adam: 0.001
    - Fashion-MNIST: SGD: 0.01, Adam: 0.001
- distribute
  - strategy: Name of the distribution strategy
  - remap: Redistribution strategy
  - quantize: Whether or not to quantize parameters when communicating them
  - comm_period: How often to communicate parameters (see the communication sketch above)
  - delta_switch: When to switch to communicating every iteration
  - delta_threshold: Divergence threshold for the dynamic averaging scheme (see the dynamic averaging paper)
  - timeout: Time allowed for clients to join
  - send_gradients: Whether or not to send gradients back to the server
  - shared_folder: Dataset to be used
- executor
  - scenario: Scenario type; see the code for more details
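Putting the options together, a config covering these fields might look like the sketch below, shown as a Python dict purely for illustration. All values here are assumptions; the authoritative format and defaults are in the repository's config file.

config = {
    "model": {"type": "logistic", "n_iterations": 100, "shuffle": True},
    "data": {"name": "mnist", "batch_size": 64, "shuffle_buffer_size": 100,
             "noniid": False, "unbalanced": False},
    "optimizer": {"name": "sgd", "learning_rate": 0.01},
    "distribute": {"strategy": "dynamic_average", "remap": None,
                   "quantize": False, "comm_period": 10,
                   "delta_switch": 1e-4, "delta_threshold": 0.5,
                   "timeout": 120, "send_gradients": False,
                   "shared_folder": "data/mnist"},
    "executor": {"scenario": 0},
}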
To view the results on TensorBoard:
sbatch scripts/tensorboard_slurm.sh
Check the output log located in $HOME/logs.
Create an SSH tunnel to the cluster node running TensorBoard:

ssh username@$clusterIP -L $localPort:$clusterNodeip:$clusterNodePort

Then view TensorBoard at http://localhost:6006 :)
To pull results from the log files into a CSV, run the following:
python dyn-fed/dyn_fed/utils/logger_parser.py logs/slurm/[fashion-mnist|mnist]/ fault-tolerant-ml/data/[fashion_mnist|mnist]_results.csv
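The resulting CSV can then be loaded for analysis, for example with pandas (the path matches the command above; the column names depend on what logger_parser writes):

import pandas as pd

# Load the parsed results and take a first look.
results = pd.read_csv("fault-tolerant-ml/data/mnist_results.csv")
print(results.head())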
To run the tests:

nosetests -vv
- Build and test on a mobile app
- Automatic differentiation lecture notes
- Automatic differentiation wiki
- https://archive.ics.uci.edu/ml/datasets/YearPredictionMSD
- https://github.com/joelgrus/autograd/tree/master
- https://github.com/ddbourgin
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage
project template.