## 4.1 Introduction

Section 4 discusses how to handle two types of bias that arise when a finite-sum problem is solved with stochastic gradient descent (SGD) in a decentralized environment. The finite-sum problem is the minimization of an average of functions:
\begin{equation*}
    \min_{x} f(x), \quad f(x) = \frac{1}{M} \sum_{i=1}^{M} f_i(x)
\end{equation*}
In a machine learning setting with $M$ data points, this problem can be interpreted as minimizing the average of $M$ error functions, where $f_i$ is the error evaluated at the $i$th data point. We assume that each $f_i$ is $L_i$-smooth and that $f(x)$ is $\mu$-strongly convex. The first bias comes from inaccurate gradient estimation due to the stochastic nature of SGD, and it can be resolved with the stochastic variance reduced gradient (SVRG) method. The second bias is the consensus bias that gradient estimation incurs in a decentralized environment, and it can be addressed with exact diffusion.
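
To make the first source of error concrete, the following is a minimal NumPy sketch, not taken from the later sections. It assumes a hypothetical least-squares instance, $f_i(x) = \frac{1}{2}(a_i^T x - b_i)^2$, with made-up sizes `M` and `d`, and compares a plain single-sample gradient with the standard SVRG estimator built from a snapshot point `x_snap`. Both estimators average to the full gradient, but the SVRG estimator deviates from it far less when `x` is close to the snapshot.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed for this illustration
M, d = 200, 5                    # hypothetical number of data points and dimension
A = rng.standard_normal((M, d))  # row i is the feature vector a_i
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(M)

def grad_i(x, i):
    """Gradient of f_i(x) = 0.5 * (a_i^T x - b_i)^2."""
    return (A[i] @ x - b[i]) * A[i]

def full_grad(x):
    """Gradient of f(x) = (1/M) * sum_i f_i(x)."""
    return A.T @ (A @ x - b) / M

def svrg_grad(x, x_snap, g_snap, i):
    """Standard SVRG estimator: sample gradient corrected by a snapshot."""
    return grad_i(x, i) - grad_i(x_snap, i) + g_snap

x_snap = np.zeros(d)                         # snapshot point
g_snap = full_grad(x_snap)                   # full gradient at the snapshot
x = x_snap + 0.01 * rng.standard_normal(d)   # current iterate near the snapshot
g = full_grad(x)

# Evaluate both estimators for every possible sample index i.
sgd_samples = np.array([grad_i(x, i) for i in range(M)])
svrg_samples = np.array([svrg_grad(x, x_snap, g_snap, i) for i in range(M)])

# Under uniform sampling, both estimators are unbiased for the full gradient.
sgd_mean = sgd_samples.mean(axis=0)
svrg_mean = svrg_samples.mean(axis=0)

# Mean squared deviation from the full gradient (the estimator's variance).
sgd_var = np.mean(np.sum((sgd_samples - g) ** 2, axis=1))
svrg_var = np.mean(np.sum((svrg_samples - g) ** 2, axis=1))
print(f"SGD variance: {sgd_var:.3e}, SVRG variance: {svrg_var:.3e}")
```

The sketch only illustrates the gradient estimators on a single machine; it does not involve BlueFog or any decentralized communication.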

### 4.1.1 Organization

Section 4 is organized as follows:

- Sec. 4.2 Stochastic Variance Reduced Gradient Descent

- Sec. 4.3 Decentralized Exact Diffusion

- Sec. 4.4 Combining SVRG and Exact Diffusion

### 4.1.2 Initialize BlueFog and test it

All content in this section is presented in a Jupyter notebook, and all experimental examples are written with BlueFog and ipyparallel. Readers who are not familiar with running BlueFog in an IPython notebook environment are encouraged to read Sec. [HelloWorld section] first. In the following code, we initialize BlueFog and test whether it works properly.

After running the following cell, you should see the ids of the ipyparallel engines, one per CPU. We use 4 CPUs for the experiments in this section.

In [1]:
import ipyparallel as ipp

rc = ipp.Client(profile='bluefog')
rc.ids

[0, 1, 2, 3]

Let each agent import the necessary modules and then initialize BlueFog. You should see output like:

> \[stdout:0\] Hello, I am 1 among 4 processes
> 
> ...

In [2]:
%%px
import numpy as np
import bluefog.torch as bf
import torch
from bluefog.common import topology_util
import networkx as nx

bf.init()
print(f"Hello, I am {bf.rank()} among {bf.size()} processes")

[stdout:0] Hello, I am 2 among 4 processes
[stdout:1] Hello, I am 1 among 4 processes
[stdout:2] Hello, I am 0 among 4 processes
[stdout:3] Hello, I am 3 among 4 processes


Push the seed to each agent so that the simulation is reproducible.

In [3]:
dview = rc[:] # A DirectView of all engines
dview.block=True

# Push the data into all workers
dview.push({'seed': 2021}, block=True)

[None, None, None, None]

After running the following code, you should see output like:

> \[stdout:0\] I received seed as value:  2021
> 
> ...

In [5]:
%%px
print("I received seed as value: ", seed)

[stdout:0] I received seed as value:  2021
[stdout:1] I received seed as value:  2021
[stdout:2] I received seed as value:  2021
[stdout:3] I received seed as value:  2021
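
This section does not yet show how the shared seed is consumed. The following is a minimal sketch of one common pattern, offered as an assumption rather than as the method used in later sections: each agent derives its own random generator from the pair (seed, rank), so that agents draw different local data while every rerun reproduces the same draws. The helper `local_data` is hypothetical, and the ranks are simulated with a plain loop here; inside a `%%px` cell the rank would come from `bf.rank()` instead.

```python
import numpy as np

seed = 2021       # the value pushed to every engine above
num_agents = 4    # matches the 4 engines used in this section

def local_data(seed, rank, n=3):
    """Hypothetical helper: draw an agent's local data from a (seed, rank) stream."""
    rng = np.random.default_rng([seed, rank])  # distinct stream per rank
    return rng.standard_normal(n)

# Simulate two independent runs over all ranks.
first_run = [local_data(seed, r) for r in range(num_agents)]
second_run = [local_data(seed, r) for r in range(num_agents)]

for r, data in enumerate(first_run):
    print(f"rank {r}: {data}")
```

Seeding with the pair (seed, rank) rather than the seed alone keeps the agents' data from being identical copies of each other, which matters when each agent is meant to hold a different portion of the data.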


Congratulations! Your BlueFog is initialized and tested successfully.