# 2.6. Dask Array

![](dask-array-black-text.svg)

*From the Dask documentation:*
> Dask Array implements a subset of the NumPy ndarray interface using 
> blocked algorithms, cutting up the large array into many small arrays. 
> This lets us compute on arrays larger than memory using all of our cores. 
> We coordinate these blocked algorithms using dask graphs.

Dask Arrays provide "a parallel, larger-than-memory, n-dimensional array using blocked algorithms. Simply put: distributed Numpy.

- **Parallel:** Uses all of the cores on your computer
- **Larger-than-memory:** Lets you work on datasets that are larger than your available memory by breaking up your array into many small pieces, operating on those pieces in an order that minimizes the memory footprint of your computation, and effectively streaming data from disk.
- **Blocked Algorithms:** Perform large computations by performing many smaller computations"

*Taken from the Dask tutorial: https://github.com/dask/dask-tutorial*
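The "blocked algorithms" idea can be illustrated without Dask at all. The following is a minimal sketch in plain NumPy, not Dask's actual implementation: the helper `blocked_sum` and the block size are hypothetical names and values chosen for illustration. It sums a large array by summing small blocks and then combining the partial results.

```python
import numpy as np


def blocked_sum(x, block_size):
    """Sum a 1-D array by summing fixed-size blocks, then combining the partial sums.

    Hypothetical helper for illustration only; Dask builds a task graph
    instead of looping like this.
    """
    partial_sums = [
        x[start:start + block_size].sum()
        for start in range(0, x.shape[0], block_size)
    ]
    # Combine the small results into the final answer
    return sum(partial_sums)


x = np.arange(1_000_000, dtype=np.int64)
total = blocked_sum(x, block_size=100_000)
```

Each block's sum is independent of the others, which is what allows a scheduler such as Dask's to run the pieces in parallel and to hold only a few blocks in memory at a time.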

In [None]:
import dask.array as da
import netCDF4 as nc4

## Creating Dask Arrays from NumPy Arrays

One of the easiest ways to create a Dask Array is directly from a NumPy array, using the `from_array` function of `dask.array`. This function accepts anything that is "array-like," meaning an object that has `shape`, `ndim`, and `dtype` attributes and supports NumPy-style slicing, so it can also accept a netCDF4 variable from netCDF4-python.
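As a small sketch of the simplest case, the example below wraps an in-memory NumPy array with `from_array`. The array contents and the chunk shape `(2, 3)` are arbitrary choices made for illustration, not values taken from this tutorial's data.

```python
import numpy as np
import dask.array as da

# A small in-memory NumPy array (illustrative data only)
x_np = np.arange(24, dtype=np.float64).reshape(4, 6)

# Wrap it as a Dask Array split into blocks of 2 rows by 3 columns
x_da = da.from_array(x_np, chunks=(2, 3))

# Operations on a Dask Array are lazy: this only builds a task graph
mean_lazy = x_da.mean()

# compute() runs the graph and returns a concrete NumPy value
mean_value = mean_lazy.compute()
```

Inspecting `x_da.chunks` shows how the array was split along each dimension, and nothing is actually calculated until `compute()` is called.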

In [None]:
exfile = '../data/e5.oper.an.sfc.128_034_sstk.regn320sc.2010010100_2010013123.nc'

In [None]:
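# Open the netCDF file (read-only is the default mode)
ncds = nc4.Dataset(exfile)
# Display a summary of the dataset's dimensions, variables, and attributes
ncds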