In [1]:
import numpy as np
from dask import array as da

The **Dask Array** interface closely mirrors the **NumPy API**.

The main difference is an additional `chunks` argument, which sets the size of the blocks the array is split into:

In [2]:
np_arr = np.random.randint(20, size=20)
np_arr

array([ 4,  6,  6,  5,  6,  7,  2,  3, 15, 17, 13, 11, 10, 18,  1,  6, 14,
       10, 10,  1])

In [3]:
dask_arr = da.random.randint(20, size=20, chunks=3)
dask_arr

| | Array | Chunk |
|---|---|---|
| Bytes | 160 B | 24 B |
| Shape | (20,) | (3,) |
| Count | 7 Tasks | 7 Chunks |
| Type | int64 | numpy.ndarray |


Because Dask evaluates lazily, the output above shows only metadata (size, shape, chunking, and dtype); the array values have not been computed yet.

To force the computation, call the `.compute()` method:

In [4]:
dask_arr.compute()

array([13, 10, 11, 16,  2, 16, 19,  6, 16,  5, 16, 11, 14, 14,  6,  5, 17,
        7, 17, 18])
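
As a minimal sketch of what lazy evaluation means in practice, the example below chains operations on a Dask array and only triggers the work at `.compute()`. It uses a deterministic `np.arange` input instead of random data so the result is reproducible; the names `x`, `lazy_total`, and `total` are illustrative and do not come from the cells above.

```python
import numpy as np
from dask import array as da

# Deterministic input (an assumption for this sketch, not the random array above).
x = da.from_array(np.arange(20), chunks=3)

# Building an expression only records tasks; no values are computed here.
lazy_total = (x + 1).sum()

# .compute() runs the task graph and returns a concrete NumPy value.
total = lazy_total.compute()
print(total)
```

Until `.compute()` is called, `lazy_total` is still a Dask array describing the work to be done, not a number.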

You can also inspect the chunk sizes:

In [5]:
dask_arr.chunks

((3, 3, 3, 3, 3, 3, 2),)
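
The tuple above follows from simple arithmetic: 20 elements split into blocks of 3 give six full blocks and one remaining block of 2. The sketch below reproduces that split in plain Python. `chunk_sizes` is a hypothetical helper written for illustration; it is not part of Dask and does not claim to be how Dask computes chunks internally.

```python
def chunk_sizes(length, chunk):
    """Split `length` elements into blocks of `chunk`, with a smaller final block if needed."""
    full, remainder = divmod(length, chunk)
    sizes = [chunk] * full
    if remainder:
        # The leftover elements form one last, shorter block.
        sizes.append(remainder)
    return tuple(sizes)

print(chunk_sizes(20, 3))  # (3, 3, 3, 3, 3, 3, 2)
```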

You can also create a **Dask array** from existing data in a **NumPy array** (or a **list**):

In [6]:
dask_arr_from_np = da.from_array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], chunks=5)

dask_arr_from_np

| | Array | Chunk |
|---|---|---|
| Bytes | 80 B | 40 B |
| Shape | (10,) | (5,) |
| Count | 2 Tasks | 2 Chunks |
| Type | int64 | numpy.ndarray |
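
To check that wrapping existing data changes only how it is partitioned, not the values, here is a small sketch that starts from a NumPy array instead of a list. The names `np_data`, `dask_data`, and `roundtrip` are illustrative.

```python
import numpy as np
from dask import array as da

# Existing in-memory data: the integers 1 through 10.
np_data = np.arange(1, 11)

# Wrap it in a Dask array split into blocks of 5 elements.
dask_data = da.from_array(np_data, chunks=5)
print(dask_data.chunks)  # ((5, 5),)

# Computing the Dask array gives back the original values.
roundtrip = dask_data.compute()
print(np.array_equal(roundtrip, np_data))  # True
```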
