# Distributed computing using `dask`
----

- `LocalCluster`: the scheduler and workers all run on a single machine (e.g., a laptop or desktop)

### **Import modules**

In [None]:
import warnings
warnings.filterwarnings('ignore')
import numpy as np
import dask
from dask.distributed import Client, LocalCluster
import dask.array as da

### **Setup local cluster**

In [None]:
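# Serial baseline: one worker with a single thread
# cluster = LocalCluster(n_workers=1, threads_per_worker=1)
# Default: dask picks the number of workers and threads from the available cores
cluster = LocalCluster()
# Connect a client so that subsequent dask computations run on this cluster
client = Client(cluster)
client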

### **Visualize the tasks**

- Open the dask dashboard using the link shown in the `client` output above
- The URL has the form `http://<localhost or server ip>:<port>/status`
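
If the link is not visible in the cell output, it can also be read from the client's `dashboard_link` attribute. The sketch below is self-contained, so it starts its own small in-process cluster; the names `demo_cluster`, `demo_client`, and `link` are hypothetical and chosen only for illustration. In this notebook, evaluating `client.dashboard_link` on the existing `client` should be enough.

```python
from dask.distributed import Client, LocalCluster

# Hypothetical throwaway cluster for illustration only; in the notebook the
# existing `client` from the setup cell would be used instead.
# dashboard_address=":0" asks for any free port, to avoid clashing with a
# dashboard that is already running.
with LocalCluster(n_workers=1, threads_per_worker=1, processes=False,
                  dashboard_address=":0") as demo_cluster:
    with Client(demo_cluster) as demo_client:
        # URL of the dashboard status page for this cluster
        link = demo_client.dashboard_link
        print(link)
```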

### **Comparison of `numpy` vs. `dask` performance**

**Test computation:**

*Define:*

\begin{equation}
\mathbf{X} \in \mathbb{R}^{n_i \times n_j}
\end{equation}

where:
- $n_i = n_j = 10000$

Let's compute $y$ using the following expression:

\begin{equation}
    y = \sum_j \langle \mathbf{X} \rangle_j, \qquad \langle \mathbf{X} \rangle_j = \frac{1}{n_i} \sum_i X_{ij}
\end{equation}
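
As a quick sanity check, the same expression can be worked by hand on a tiny array. This is a minimal sketch that reads the expression the way the code cells in this notebook do (`mean(axis=0)` followed by `sum()`); the array `X_small` and the other names are made up for illustration.

```python
import numpy as np

# Small hypothetical 2 x 3 array (n_i = 2 rows, n_j = 3 columns)
X_small = np.array([[1.0, 2.0, 3.0],
                    [4.0, 5.0, 6.0]])

# Average over the first axis: one mean per column -> [2.5, 3.5, 4.5]
col_means = X_small.mean(axis=0)

# Sum the column means -> 10.5
y_small = col_means.sum()

# Equivalent closed form: the total sum divided by the number of rows
y_check = X_small.sum() / X_small.shape[0]

print(col_means, y_small, y_check)
```

For samples drawn uniformly from $[0, 1)$, each column mean is close to $0.5$, so $y$ should be close to $n_j / 2$.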

#### **Define problem size:**

In [None]:
size = (10000, 10000)  # (n_i, n_j)

#### **Using numpy (single-threaded)**

In [None]:
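%%time
# Allocate the full array in memory and fill it with uniform samples
x = np.random.uniform(low=0., high=1.0, size=size)
# Mean over the first axis, then sum of the resulting vector
y = x.mean(axis=0).sum()
print(y)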

#### **Using dask (distributed)**

In [None]:
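%%time
# Lazy, chunked array: no samples are generated yet
x = da.random.uniform(low=0., high=1.0, size=size)
# Builds the task graph only; y is still a lazy dask array
y = x.mean(axis=0).sum()
# Run the graph on the cluster and return the result
y_val = y.compute()
print(y_val)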
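
The dask cell above leaves the chunk size to dask's defaults. The sketch below shows one way to set chunks explicitly and to confirm that the chunked computation matches numpy on the same data. The array shape, the chunk shape, and all variable names are illustrative assumptions, not values used elsewhere in this notebook.

```python
import numpy as np
import dask.array as da

# Small hypothetical problem so the check runs quickly
rng = np.random.default_rng(0)
x_np = rng.uniform(0.0, 1.0, size=(2000, 2000))

# Wrap the same data as a dask array split into 500 x 500 blocks
x_da = da.from_array(x_np, chunks=(500, 500))
print(x_da.numblocks)  # number of blocks along each axis

# Building the expression is lazy; nothing is computed yet
y_lazy = x_da.mean(axis=0).sum()

# compute() runs the task graph (on the active Client if one exists,
# otherwise on dask's default local scheduler)
y_da = y_lazy.compute()

# Reference result from plain numpy on identical data
y_np = x_np.mean(axis=0).sum()
print(y_da, y_np)
```

Because both results come from the same samples, they should agree to floating-point tolerance, which is not true when the numpy and dask cells each draw their own random numbers.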

### **Close the cluster**

In [None]:
client.close()   # disconnect the client from the scheduler
cluster.close()  # shut down the scheduler and its workers