# Process an array with multiple threads

Multiple threads can process different parts of the same array simultaneously. `dask` provides this with little effort: replace the `numpy` functions with their `dask` equivalents. The key concept is the chunk: each chunk of data is processed separately, possibly by a different thread. For a matrix, for example, we define a 2D block size; each block can then be processed independently and the partial results combined to produce the final answer.
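
To make the chunk idea concrete, here is a minimal hand-rolled sketch that uses only `numpy` and the standard library. It is not how `dask` is implemented; the helpers `iter_blocks` and `blockwise_sum_of_squares` are hypothetical names introduced for illustration. Each block is reduced independently in a thread pool and the partial results are then accumulated.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def iter_blocks(shape, block_shape):
    """Yield (row_slice, col_slice) pairs that tile a 2D array."""
    n_rows, n_cols = shape
    block_rows, block_cols = block_shape
    for r in range(0, n_rows, block_rows):
        for c in range(0, n_cols, block_cols):
            yield slice(r, r + block_rows), slice(c, c + block_cols)


def blockwise_sum_of_squares(x, block_shape, max_workers=4):
    """Sum x**2 one block at a time, then accumulate the partial sums."""

    def partial_sum(block_slices):
        # Each call only touches its own block of the array.
        block = x[block_slices]
        return float(np.sum(block ** 2))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partial_sums = list(pool.map(partial_sum, iter_blocks(x.shape, block_shape)))
    return sum(partial_sums)


# A small 4 x 6 matrix split into four 2 x 3 blocks.
x = np.arange(24, dtype=np.float64).reshape(4, 6)
total = blockwise_sum_of_squares(x, block_shape=(2, 3))
```

`dask` automates this pattern: it picks the block boundaries from the `chunks` argument, schedules the per-block work on a thread pool, and assembles the result.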

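### Library Dependencies

This notebook needs `mkl`, `numpy`, and `dask`. Install mkl with pip: ```pip install mkl```. Install numpy with pip: ```pip install numpy```. Install dask with array support with pip: ```pip install "dask[array]"```. The `mkl` Python module imported below is provided by the `mkl-service` package, so if `import mkl` fails, try ```pip install mkl-service```.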

In [None]:
# On some platforms numpy is already multithreaded through Intel MKL.
# For this example we disable that multithreading so numpy uses a single thread.
import mkl
mkl.set_num_threads(1)

In [None]:
import numpy as np
import dask.array as da

In [None]:
A = np.random.rand(20000,4000)

`%whos` is a magic function provided by `IPython` that lists the defined variables along with their memory consumption.
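
As a cross-check outside `IPython`, the expected size of the array can be worked out by hand. The sketch below assumes the array holds 64-bit floats, which is what `np.random.rand` returns; the variable names are illustrative only.

```python
import numpy as np

shape = (20000, 4000)
itemsize = np.dtype(np.float64).itemsize  # bytes per element for float64

# Total size = number of elements * bytes per element.
expected_bytes = shape[0] * shape[1] * itemsize
expected_megabytes = expected_bytes / 1e6  # decimal megabytes
```

For an array that already exists, `A.nbytes` gives the same figure without any arithmetic.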

In [None]:
%whos

In [None]:
A

First, let's perform some operations on the matrix in pure `numpy`, using a single thread.

In [None]:
%time B = A**2 + np.sin(A) * A * np.log(A)

## Processing with dask

First create a chunked `dask` array from the `numpy` array

In [None]:
A_dask = da.from_array(A, chunks=(2000, 1000))

In [None]:
A_dask.numblocks
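
The value of `numblocks` follows directly from the array shape and the chunk shape: along each axis it is the axis length divided by the chunk length, rounded up. The sketch below reproduces that arithmetic with a hypothetical helper, `expected_numblocks`, that is not part of `dask`.

```python
import math


def expected_numblocks(shape, chunks):
    """Number of blocks per axis: ceil(axis length / chunk length)."""
    return tuple(math.ceil(n / c) for n, c in zip(shape, chunks))


# Same shape and chunk size as the array above.
blocks = expected_numblocks((20000, 4000), (2000, 1000))
total_blocks = math.prod(blocks)
```

With these sizes the array splits into 10 blocks along the rows and 4 along the columns, so each elementwise operation is applied to 40 independent blocks.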

Then replace each function with the equivalent provided by `dask`, which implements most of the `numpy` functions and operations. Building the expression is lazy: nothing is computed until `compute()` is called.

In [None]:
compute_B = (A_dask**2 + da.sin(A_dask) * A_dask * da.log(A_dask))

In [None]:
%time B_dask = compute_B.compute(num_workers=1)

In [None]:
%time B_dask = compute_B.compute(num_workers=4)

In [None]:
%time B_dask = compute_B.compute(num_workers=12)

In [None]:
%time B_dask = compute_B.compute(num_workers=24)
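
Outside a notebook, where `%time` is not available, the same comparison can be sketched with `time.perf_counter`. The example below uses a much smaller array so that it runs quickly, and it shifts the random values away from zero so that the logarithm stays finite; the array size, chunk shape, and worker counts are arbitrary choices for illustration, not recommendations.

```python
import time

import dask.array as da
import numpy as np

rng = np.random.default_rng(0)
# Values in [0.1, 1.1) so that log() never sees zero.
small = rng.random((2000, 400)) + 0.1
small_dask = da.from_array(small, chunks=(500, 200))

# Lazy expression: no computation happens on this line.
expr = small_dask**2 + da.sin(small_dask) * small_dask * da.log(small_dask)

timings = {}
for n_workers in (1, 2, 4):
    start = time.perf_counter()
    result = expr.compute(scheduler="threads", num_workers=n_workers)
    timings[n_workers] = time.perf_counter() - start

# Single-threaded numpy reference for checking the result.
reference = small**2 + np.sin(small) * small * np.log(small)
```

On an array this small the scheduling overhead may outweigh any gain from extra threads, so the timings are only meaningful for arrays closer in size to the one used above.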

In [None]:
assert np.allclose(B, B_dask)