<a href="https://colab.research.google.com/github/NMashalov/FederationLearning/blob/master/Fed.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Research on zeroth-order (gradient-free) optimization in federated learning

The code is heavily inspired by the implementation in the paper:

https://arxiv.org/pdf/2304.07861.pdf


First, we work with a synthetic objective that needs no data samples:
$$
    f(x,\xi) = b_\xi + \|x\|_\infty
$$

$\xi$ is sampled from $\mathcal{N}(0, I)$.

$x$ is restricted to the simplex:

$$
    D = \{x : \|x\|_1 = 1,\ x \ge 0\}
$$
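
The cells shown here do not enforce the simplex constraint themselves. One common way to keep an iterate feasible is a Euclidean projection after each update. The sketch below uses the standard sort-based projection; the helper name `project_to_simplex` is hypothetical and is not part of the original code.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of a 1-D vector onto {x : sum(x) = 1, x >= 0}.

    Hypothetical helper (an assumption, not taken from the notebook).
    """
    n = v.shape[0]
    u = np.sort(v)[::-1]                 # coordinates in decreasing order
    css = np.cumsum(u) - 1.0             # cumulative sums shifted by the target mass
    ind = np.arange(1, n + 1)
    cond = u - css / ind > 0             # coordinates that stay positive
    rho = ind[cond][-1]                  # size of the support
    theta = css[cond][-1] / rho          # shift applied to every coordinate
    return np.maximum(v - theta, 0.0)

x_proj = project_to_simplex(np.array([2.0, 0.0]))
x_same = project_to_simplex(np.array([0.2, 0.3, 0.5]))
```

A point that already lies on the simplex is left unchanged, and any other point is mapped to its nearest feasible point.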


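Recall that the infinity norm is the largest absolute value among all coordinates:

$$
    \|x\|_\infty = \max\limits_{i=1,\dots,n} |x_i|
$$

On the simplex all coordinates are non-negative, so this reduces to $\max_i x_i$.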
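
As a quick numerical check of this definition, the snippet below compares NumPy's infinity norm with the plain coordinate maximum for a point on the simplex. The vector `x` is an arbitrary illustrative value, not data from the notebook.

```python
import numpy as np

# an arbitrary point on the simplex: non-negative coordinates that sum to 1
x = np.array([0.1, 0.6, 0.3])

inf_norm = np.linalg.norm(x, ord=np.inf)   # max_i |x_i|
coord_max = x.max()                        # max_i x_i

print(inf_norm, coord_max)
```

The two values coincide here because every coordinate is non-negative.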

## Algorithm implementation

In [13]:
### HYPERPARAMS

# vector dim
D = 20
# number of machines
M = 32
# smoothing gamma
G = 0.001

In [17]:
import numpy as np
# fix random vector b

b = np.random.randn(D)

b = b / np.linalg.norm(b)
b.shape

(20,)

In [19]:
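def loss(x, noise=0.0):
    """
    Compute the objective <b, x> + max_i x_i, plus optional Gaussian noise.

    Arguments:
        x: array of shape [Machines number, Vector Dim] or a single vector [Vector Dim]
        noise: standard deviation of the additive Gaussian noise
    """
    # axis=-1 handles both a single vector and a batch of vectors
    value = x @ b + np.max(x, axis=-1)
    # one independent noise draw per evaluated point
    return value + noise * np.random.randn(*np.shape(value))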

Gradient approximation

Two points, $\ell_1$:
$$
\nabla f_\gamma (x, \xi, e) = \frac{d}{2\gamma}
\left(f_{\delta_1}(x + \gamma e, \xi) - f_{\delta_2}(x - \gamma e, \xi)\right) \operatorname{sign}(e)
$$


Two points, $\ell_2$:
$$
\nabla f_\gamma (x, \xi, e) = \frac{d}{2\gamma}
\left(f_{\delta_1}(x + \gamma e, \xi) - f_{\delta_2}(x - \gamma e, \xi)\right) e
$$
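
The two estimators can be sketched side by side as follows. This is a minimal illustration under stated assumptions, not the notebook's implementation: the names `sample_l2_sphere`, `sample_l1_sphere` and `two_point_grad` are hypothetical, the function `f` is assumed to accept a batch of points of shape `[n, d]`, and the $\ell_1$ directions are assumed to be drawn uniformly from the unit $\ell_1$ sphere by normalizing Laplace samples.

```python
import numpy as np

def sample_l2_sphere(n, d, rng):
    """n directions uniform on the unit l2 sphere, shape [n, d]."""
    e = rng.standard_normal((n, d))
    return e / np.linalg.norm(e, axis=1, keepdims=True)

def sample_l1_sphere(n, d, rng):
    """n directions on the unit l1 sphere, shape [n, d] (assumed sampling scheme)."""
    e = rng.laplace(size=(n, d))
    return e / np.abs(e).sum(axis=1, keepdims=True)

def two_point_grad(f, x, gamma, method, rng, n=1):
    """Return n independent two-point gradient estimates at x, shape [n, d]."""
    d = x.shape[0]
    if method == 'l2':
        e = sample_l2_sphere(n, d, rng)
        direction = e
    elif method == 'l1':
        e = sample_l1_sphere(n, d, rng)
        direction = np.sign(e)
    else:
        raise ValueError(f'unknown method: {method}')
    # symmetric finite difference along each sampled direction, shape [n]
    diff = f(x + gamma * e) - f(x - gamma * e)
    return d / (2 * gamma) * diff[:, np.newaxis] * direction

# toy check on a linear function, whose true gradient is the vector a
a = np.array([1.0, -2.0, 0.5])
f_linear = lambda X: X @ a
rng = np.random.default_rng(0)
g_l2 = two_point_grad(f_linear, np.zeros(3), 1e-3, 'l2', rng, n=200_000).mean(axis=0)
g_l1 = two_point_grad(f_linear, np.zeros(3), 1e-3, 'l1', rng, n=200_000).mean(axis=0)
```

Averaged over many directions, both estimates should land close to `a` in this linear toy case; a single estimate is noisy, which is why the step size matters in the optimization loop.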

In [20]:
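def sample_spherical(npoints, ndim=D):
    """Sample npoints directions uniformly on the unit l2 sphere; returns shape [ndim, npoints]."""
    # normalizing the Gaussian columns yields uniformly distributed unit vectors
    vec = np.random.randn(ndim, npoints)
    vec /= np.linalg.norm(vec, axis=0)
    return vec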

In [24]:
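import typing as tp

MethodType = tp.Literal['l1', 'l2']

def calc_grad(x: np.ndarray, method_type: MethodType = 'l2') -> np.ndarray:
    """Two-point gradient estimate for every machine; x has shape [M, D]."""
    if method_type == 'l2':
        # sample_spherical returns [D, M]; transpose to one direction per machine
        e = sample_spherical(x.shape[0]).T
        # finite difference along each direction, one scalar per machine
        diff = loss(x + G * e) - loss(x - G * e)
        return D / (2 * G) * diff[:, np.newaxis] * e
    elif method_type == 'l1':
        # the l1 estimator is not implemented yet
        raise NotImplementedError('l1')
    else:
        raise NotImplementedError(f'Unknown method: {method_type}')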

## FedAvg

In [25]:
def broadcast_avg(pool):
    """
    Helper function for FedAc and FedAvg: average the weights across workers and broadcast the result.
    """
    avg = pool.mean(axis=0)
    pool = np.repeat(avg[np.newaxis, :], pool.shape[0], axis=0)
    return pool
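
A small worked example may make the synchronization step concrete. The block below restates the same averaging idea with `np.tile` so that it runs on its own; the toy `pool` values are illustrative only.

```python
import numpy as np

def broadcast_avg(pool):
    """Average the rows of pool and give every worker a copy of the mean."""
    avg = pool.mean(axis=0)
    return np.tile(avg, (pool.shape[0], 1))

# three workers, two parameters each (illustrative values)
pool = np.array([[1.0, 2.0],
                 [3.0, 6.0],
                 [5.0, 10.0]])
synced = broadcast_avg(pool)
print(synced)   # every row equals the column-wise mean [3., 6.]
```

After the call all workers hold identical weights, so local trajectories diverge only between two synchronization rounds.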

In [27]:
import pandas as pd

def fedavg(eta, M, K, T, record_intvl=512, print_intvl=8192, SEED=0):
    """
    Simulate Federated Averaging (FedAvg, a.k.a. Local SGD or Parallel SGD).

    Arguments:
        eta:    learning rate
        M:      number of workers
        K:      synchronization interval (i.e., number of local steps)
        T:      total parallel runtime
        record_intvl:   compute the population loss every record_intvl steps
        print_intvl:    logging interval (currently unused)
        SEED:   random seed

    Return:
        A pandas.Series object with the recorded population loss.
    """
    # fix the random seed for reproducibility
    np.random.seed(SEED)
    # common initial weights, replicated on every worker: shape [M, D]
    common_init_w = np.random.randn(D)
    w_pool = np.repeat(common_init_w[np.newaxis, :], M, axis=0)

    seq = pd.Series(name='loss', dtype=float)
    for iter_cnt in range(T + 1):
        if iter_cnt % K == 0:
            # synchronization round: average and broadcast the weights
            w_pool = broadcast_avg(w_pool)

            # the loss is recorded only at synchronization steps
            if iter_cnt % record_intvl == 0:
                seq.at[iter_cnt] = loss(w_pool[0, :])

        # local step with the zeroth-order gradient estimate
        w_pool -= eta * calc_grad(w_pool)
    return seq
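
To see the local-step and averaging pattern in isolation, here is a minimal self-contained sketch on a toy problem. It is an assumption-laden illustration rather than the notebook's experiment: `fedavg_sketch` and `noisy_quadratic_grad` are hypothetical names, and the objective is $\tfrac{1}{2}\|w\|_2^2$ with additive Gaussian gradient noise instead of the zeroth-order estimate.

```python
import numpy as np

def fedavg_sketch(grad_fn, w0, eta, n_workers, sync_every, n_steps, seed=0):
    """Run local gradient steps on every worker and average every sync_every steps."""
    rng = np.random.default_rng(seed)
    # every worker starts from the same point: shape [n_workers, dim]
    w_pool = np.tile(w0, (n_workers, 1))
    for step in range(n_steps):
        if step % sync_every == 0:
            # synchronization: replace each worker's weights by the average
            w_pool = np.tile(w_pool.mean(axis=0), (n_workers, 1))
        # independent local step on each worker
        w_pool = w_pool - eta * grad_fn(w_pool, rng)
    return w_pool.mean(axis=0)

def noisy_quadratic_grad(w_pool, rng):
    """Stochastic gradient of 0.5 * ||w||^2: the exact gradient w plus noise."""
    return w_pool + 0.1 * rng.standard_normal(w_pool.shape)

w_final = fedavg_sketch(noisy_quadratic_grad, np.ones(5), eta=0.1,
                        n_workers=8, sync_every=4, n_steps=200)
print(np.linalg.norm(w_final))
```

The averaged iterate should end up near the minimizer at the origin; increasing `sync_every` lets workers drift further apart between averaging rounds.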

In [28]:
# calc_grad expects the per-machine iterates, an array of shape [M, D]
calc_grad(np.random.randn(M, D))

## Smooth function