# Daskified unyt Arrays  

Notes on how yt uses unyt arrays.

Part of the Daskening of yt relies on adding Dask support to unyt arrays ([PR 185](https://github.com/yt-project/unyt/pull/185)). Because this feature has potential users beyond the yt community, it is worth walking through its usage.


## example usage 

In [1]:
from unyt import dask_array as unyt_dask_array, unyt_quantity, unyt_array
from dask import array as da
import numpy as np

In [2]:
x1 = unyt_dask_array.unyt_from_dask(da.random.random((int(1e6),), chunks=int(1e5)), 'm')

In [3]:
x1

|       | Array      | Chunk         |
|-------|------------|---------------|
| Bytes | 8.00 MB    | 800.00 kB     |
| Shape | (1000000,) | (100000,)     |
| Count | 10 Tasks   | 10 Chunks     |
| Type  | float64    | numpy.ndarray |
| Units | m          | m             |


In [4]:
x1.to('cm')

|       | Array      | Chunk         |
|-------|------------|---------------|
| Bytes | 8.00 MB    | 800.00 kB     |
| Shape | (1000000,) | (100000,)     |
| Count | 20 Tasks   | 10 Chunks     |
| Type  | float64    | numpy.ndarray |
| Units | cm         | cm            |


In [5]:
x2 = unyt_dask_array.unyt_from_dask(0.001 * da.random.random((int(1e6),), chunks=int(1e5)), 'km')

In [6]:
x = (x1 + x2).to('m')
x

|       | Array      | Chunk         |
|-------|------------|---------------|
| Bytes | 8.00 MB    | 800.00 kB     |
| Shape | (1000000,) | (100000,)     |
| Count | 60 Tasks   | 10 Chunks     |
| Type  | float64    | numpy.ndarray |
| Units | m          | m             |


In [7]:
x.mean().compute()

unyt_quantity(1.00010563, 'm')

In [8]:
mask = np.greater(x1, unyt_quantity(50, 'cm').to(x1.units))  # np.greater used because the > operator may be broken?
mask

|       | Array      | Chunk         |
|-------|------------|---------------|
| Bytes | 1000.00 kB | 100.00 kB     |
| Shape | (1000000,) | (100000,)     |
| Count | 20 Tasks   | 10 Chunks     |
| Type  | bool       | numpy.ndarray |


In [9]:
x[mask].mean().compute()

unyt_quantity(1.24989267, 'm')

## Design

The approach to a daskified unyt array is to create a new Dask Collection that has the behavior of both a dask array and a unyt array.

Insert some code snippets...

## a comparison

Unyt dask arrays retain the performance benefits of plain dask arrays, as the timings below show.


In [10]:
array_shape = (int(1e8), )
chunk_size = int(1e6)

In [11]:
plain_numpy = np.ones(array_shape[0])
plain_unyt = unyt_array(plain_numpy,'m')
plain_dask = da.ones(array_shape[0], chunks = (chunk_size,))
unyt_dask = unyt_dask_array.unyt_from_dask(plain_dask,'m')

Operations for all four arrays:

In [12]:
%%timeit 
(plain_numpy ** 2).mean()

313 ms ± 39.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [13]:
%%timeit 
(plain_unyt ** 2).mean()

302 ms ± 24.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [14]:
%%timeit 
(plain_dask ** 2).mean().compute()

113 ms ± 11.2 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [15]:
%%timeit 
(unyt_dask ** 2).mean().compute()

95.5 ms ± 1.28 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


Operations with unit conversions:

In [16]:
%%timeit 
(plain_unyt.to('cm') ** 2).mean()

447 ms ± 8.18 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [17]:
%%timeit 
(unyt_dask.to('cm') ** 2).mean().compute()

142 ms ± 3.78 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


### an aside on when to convert units

When stringing together operations, you can sometimes save on computation by delaying a scalar operation until after any reductions.
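Delaying the scalar multiplication is valid here because the mean is linear. For a constant scale factor $c$ (assuming a purely multiplicative conversion with no offset, such as m to cm) and an array $x$ with $N$ elements:

$$
\frac{1}{N}\sum_{i=1}^{N} (c\,x_i)^2 = c^2 \cdot \frac{1}{N}\sum_{i=1}^{N} x_i^2
$$

The left-hand side scales all $N$ elements before reducing, while the right-hand side applies a single multiplication to the reduced value. Conversions that include an offset (temperature scales, for instance) do not commute with the reduction in this way.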

In [18]:
result = ((100 * plain_numpy) ** 2).mean()
result_convert_after = (plain_numpy ** 2).mean() * (100 ** 2)

print([result, result_convert_after, result == result_convert_after])

[10000.0, 10000.0, True]


Since our unit conversions are simply scalar multiplications, the unit equivalent would be:

In [19]:
result = (plain_unyt.to('cm') ** 2).mean()
result_convert_after = (plain_unyt ** 2).mean().to('cm * cm')

print([result, result_convert_after, result == result_convert_after])

[unyt_quantity(10000., 'cm**2'), unyt_quantity(10000., 'cm**2'), array(True)]


In [20]:
%%timeit 
(plain_unyt ** 2).mean().to('cm*cm')

280 ms ± 14.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [21]:
%%timeit 
(unyt_dask ** 2).mean().to('cm*cm').compute()

105 ms ± 4.81 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


### final performance comparison

SHOW plot: time vs array size for each, for different number of processors 
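One possible way to gather the data for such a plot is sketched below. It is only a starting point: the helper `time_over_sizes` and the `numpy_workload` factory are hypothetical names made up for this example, only the plain NumPy case is shown, and the array sizes are kept small so that it runs quickly.

```python
import timeit

import numpy as np


def numpy_workload(n):
    """Build the array up front and return the operation to be timed."""
    data = np.ones(n)  # setup cost, deliberately excluded from the timing
    return lambda: (data ** 2).mean()


def time_over_sizes(factories, sizes, repeat=3):
    """Return {label: [best wall time in seconds for each size]}."""
    timings = {label: [] for label in factories}
    for n in sizes:
        for label, make in factories.items():
            run = make(n)
            # Run the operation `repeat` times and keep the fastest run.
            best = min(timeit.repeat(run, number=1, repeat=repeat))
            timings[label].append(best)
    return timings


sizes = [10**3, 10**4, 10**5]
timings = time_over_sizes({"plain_numpy": numpy_workload}, sizes)
for label, times in timings.items():
    for n, t in zip(sizes, times):
        print(f"{label:>12}  n={n:<8d} best={t:.6f} s")
```

Factories for the unyt, dask, and unyt-dask cases could be added to the dictionary in the same way, with the dask variants calling `.compute()` inside the timed function. Varying the number of processors would need additional scheduler configuration that is not shown here.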