
Dynamic Averaging Techniques for Federated Machine Learning

Dynamic averaging framework for federated machine learning algorithms


  • Free software: MIT license

Prerequisites

Datasets

Features

  • Parameter quantization
  • Fault-tolerant training - the server keeps track of each client's most representative data samples. If a worker goes down, the server redistributes those points to the clients that continue to do work
  • Periodic communication - workers only communicate parameters at a selected interval (see the sketch below)
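A minimal sketch of the periodic-communication and quantization ideas above, not the repo's actual API: send_to_server is a hypothetical stand-in for the framework's transport, and the local training step is faked with random noise.

import numpy as np

def quantize(params, dtype=np.float16):
    # Cast parameters to a lower-precision dtype before sending.
    return params.astype(dtype)

def send_to_server(payload):
    # Hypothetical stand-in for the framework's real communication layer.
    print(f"sent {payload.nbytes} bytes")

def train(n_iterations=100, comm_period=10, use_quantize=True):
    params = np.random.randn(784, 10).astype(np.float32)
    for i in range(n_iterations):
        params -= 0.01 * np.random.randn(*params.shape).astype(np.float32)  # fake local step
        if (i + 1) % comm_period == 0:  # communicate only every comm_period iterations
            send_to_server(quantize(params) if use_quantize else params)

train()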

Development

git clone https://github.com/sashlinreddy/dyn-fed.git

First you need to compile the protocol buffer file. The message definitions are in protos/dfl.proto.

Compilation is executed with the following command:

protoc -I=protos/ --python_out=dyn_fed/proto/ protos/dfl.proto
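The generated module lands in dyn_fed/proto/ as dfl_pb2.py. A quick smoke test, using only the standard protobuf-generated API; SetupMessage is a placeholder name, so substitute a message actually defined in dfl.proto:

from dyn_fed.proto import dfl_pb2

msg = dfl_pb2.SetupMessage()       # hypothetical message type, not from the repo
payload = msg.SerializeToString()  # standard generated-protobuf API
round_trip = dfl_pb2.SetupMessage()
round_trip.ParseFromString(payload)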

Local development (With tmux)

tmux
export LOGDIR=${PWD}/logs
./scripts/client_local.sh $N_WORKERS -v $VERBOSE -m $MODEL_TYPE # Run in a separate window
./scripts/server_local.sh $N_WORKERS -v $VERBOSE -m $MODEL_TYPE # Enable "setw synchronize-panes on" in tmux. Use Ctrl+B,: to open the tmux command prompt

To view the results on TensorBoard, assuming you are in the parent directory:

tensorboard --logdir=logs

Go to http://localhost:6006.

Running on SLURM cluster

The SLURM launch script generates a multi-prog configuration file on the fly with the desired arguments. Submitted without arguments, it launches a job with the default arguments specified in the server execution script. However, arguments can be passed to the job submission as below:

sbatch -n $ntasks dyn-fed/scripts/slurm_launch.sh -m $MODEL_TYPE -v $VERBOSE
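For reference, a sketch of what "generates a multi-prog on the fly" could mean: srun's --multi-prog format takes one line per task rank (or rank range), with rank 0 here acting as the server. The module paths and flags are illustrative, not the repo's actual entry points.

def write_multiprog(path, n_workers, model_type, verbose):
    # One task rank (or range) per line, in srun --multi-prog format.
    with open(path, "w") as f:
        f.write(f"0 python -m dyn_fed.server -m {model_type} -v {verbose}\n")
        f.write(f"1-{n_workers} python -m dyn_fed.client -m {model_type} -v {verbose}\n")

write_multiprog("jobs.conf", n_workers=4, model_type="logistic", verbose=10)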

Generate multiple experiments

(ftml) $ python dyn-fed/examples/train_experiments.py

Cancel multiple jobs

squeue -u $USER | grep $JOBIDSTART | awk '{print $1}' | xargs -n 1 scancel

Setup config

The model is configured via the config file. The dataset can be set in this file, along with the following parameters:

  • model

    • type: Type of model
    • n_iterations: No. of iterations
    • shuffle: Whether or not to shuffle the data in each iteration
  • data

    • name: Dataset name
    • shuffle: Whether or not to shuffle dataset # Not used
    • batch_size: Data batch size
    • shuffle_buffer_size: Shuffle buffer size
    • noniid: Whether or not to make dataset noniid
    • unbalanced: Whether or not to make dataset unbalanced
  • optimizer

    • learning_rate: Rate at which model learns
      • MNIST: SGD: 0.01, Adam: 0.001
      • Fashion-MNIST: SGD: 0.01, Adam: 0.001
    • name: Name of optimizer (Currently supports sgd and adam)
  • distribute

    • strategy: Name of distribution strategy
    • remap: Redistribution strategy
    • quantize: Whether or not to use quantization when communicating parameters
    • comm_period: How often to communicate parameters
    • delta_switch: When to switch to every-iteration communication
    • delta_threshold: Divergence threshold for the dynamic averaging protocol (see the dynamic averaging paper)
    • timeout: How long to wait for clients to join
    • send_gradients: Whether or not to send gradients back to the server
    • shared_folder: Path to the dataset to be used
  • executor

    • scenario: Scenario type, see code for more details
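As a reference point, a hypothetical config mirroring the keys above; the concrete values, the strategy name, and the on-disk format are illustrative, not taken from the repo:

config = {
    "model": {"type": "logistic", "n_iterations": 100, "shuffle": True},
    "data": {"name": "mnist", "shuffle": True, "batch_size": 64,
             "shuffle_buffer_size": 100, "noniid": False, "unbalanced": False},
    "optimizer": {"name": "sgd", "learning_rate": 0.01},
    "distribute": {"strategy": "periodic", "remap": 0, "quantize": False,
                   "comm_period": 10, "delta_switch": 1e-4, "delta_threshold": 0.8,
                   "timeout": 120, "send_gradients": True,
                   "shared_folder": "data/mnist"},
    "executor": {"scenario": 1},
}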

View Results

To view the results on TensorBoard:

sbatch scripts/tensorboard_slurm.sh

Check the output log located in $HOME/logs.

Next, create an SSH tunnel to the cluster node:

ssh username@$clusterIP -L $localPort:$clusterNodeip:$clusterNodePort

View TensorBoard on http://localhost:6006 :)

To pull results from the log files, run the following:

python dyn-fed/dyn_fed/utils/logger_parser.py logs/slurm/[fashion-mnist|mnist]/ fault-tolerant-ml/data/[fashion_mnist|mnist]_results.csv

Run tests

nosetests -vv

Future

  • Build and test on a mobile app

Links

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
