# Parallelization Testing

In this notebook, I will learn how to use dask within xarray to parallelize code and speed up parts of the Argo analysis. I'll start by running a simple test case (which I hope to find) in xarray's documentation. If this works, I will then move on to running the depth-to-density interpolation function to see if that comes with speed improvements too.

In [79]:
import dask.array as da
import numpy as np
import xarray as xr

In [81]:
factor = 10
lat, lon, time = 256, 512, 52596*factor
data = da.random.random((time,lat,lon),chunks=(100,256,512))
ds = xr.Dataset(
    {
        "data": (["time", "latitude", "longitude"], data)
    },
    coords={
        "time": np.arange(time),
        "latitude": np.linspace(-90, 90, lat),
        "longitude": np.linspace(-180, 180, lon)
    }
)

In [82]:
%time result = ds.mean('time').compute()

CPU times: user 54min 17s, sys: 13min 43s, total: 1h 8min 1s
Wall time: 1min 51s
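
As a rough sanity check on the test case above, the chunk size, total array size, and CPU-to-wall-time ratio can be worked out by hand. The sketch below assumes the float64 dtype that `da.random.random` produces by default and uses only the numbers printed above; the variable names are illustrative.

```python
# Sizes for the test array: shape (time, lat, lon), chunks (100, 256, 512).
factor = 10
lat, lon, time_len = 256, 512, 52596 * factor
itemsize = 8  # bytes per float64 value (assumed dtype)

chunk_bytes = 100 * lat * lon * itemsize
total_bytes = time_len * lat * lon * itemsize
n_chunks = -(-time_len // 100)  # ceiling division along the time axis

print(f"chunk size : {chunk_bytes / 2**20:.0f} MiB")
print(f"total size : {total_bytes / 2**30:.1f} GiB in {n_chunks} chunks")

# CPU time vs. wall time, taken from the %time output above.
cpu_seconds = (54 * 60 + 17) + (13 * 60 + 43)  # user + sys
wall_seconds = 1 * 60 + 51
parallel_ratio = cpu_seconds / wall_seconds
print(f"CPU / wall : {parallel_ratio:.1f}")
```

A ratio well above 1 suggests that many cores were busy at once; a ratio near 1 would suggest the work ran on a single core.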


In [2]:
import xarray as xr
import matplotlib.pyplot as plt
import matplotlib as mpl
from matplotlib.path import Path
import seaborn as sns
import pandas as pd
import numpy as np
from importlib import reload
import cartopy.crs as ccrs
import cmocean.cm as cmo
import gsw

In [3]:
import os
os.chdir('/home.ufs/amf2288/argo-intern/funcs')
import density_funcs as df
import EV_funcs as ef
import filt_funcs as ff
import plot_funcs as pf
import processing_funcs as prf

In [4]:
reload(df)
reload(ef)
reload(ff)
reload(prf)

<module 'processing_funcs' from '/home/amf2288/argo-intern/funcs/processing_funcs.py'>

In [68]:
natl = xr.open_dataset('/swot/SUM05/amf2288/sync-boxes/lon:(-25,-20)_lat:(-70,70)_ds_z.nc')
datl = xr.open_dataset('/swot/SUM05/amf2288/sync-boxes/lon:(-25,-20)_lat:(-70,70)_ds_z.nc').chunk({'N_PROF':1000})

Chunk and save as a zarr

In [83]:
datl

[Dataset repr for datl; variable names were not preserved in this export. Every variable is chunked along N_PROF into 11 chunks (2 graph layers).]

Variable kind,Array shape,Chunk shape,Array bytes,Chunk bytes
1-D (datetime64[ns] / float64 / int64 / object),"(10970,)","(1000,)",85.70 kiB,7.81 kiB
2-D float64,"(10970, 500)","(1000, 500)",41.85 MiB,3.81 MiB
2-D float32,"(10970, 500)","(1000, 500)",20.92 MiB,1.91 MiB


In [73]:
%time float(natl.CT.mean())

CPU times: user 43.3 ms, sys: 18.1 ms, total: 61.4 ms
Wall time: 57.9 ms


6.800860854562184

In [74]:
%time float(datl.CT.mean())

CPU times: user 85.5 ms, sys: 66.1 ms, total: 152 ms
Wall time: 63.5 ms


6.800860854562181

In [69]:
%time natl.CT.groupby('LATITUDE').mean();

CPU times: user 6.36 s, sys: 228 ms, total: 6.59 s
Wall time: 6.59 s


In [76]:
%time datl.CT.groupby('LATITUDE').mean();

CPU times: user 17.1 s, sys: 96.9 ms, total: 17.2 s
Wall time: 17.2 s


Okay, something is not working as expected, because the xarray dataset loaded with dask takes longer than the one loaded without it. A few thoughts:
- It's possible the chunks are too small, so the overhead added for each calculation overwhelms any advantage of running in parallel.
- Maybe it's not using multiple cores at all: the CPU time is about the same as the wall time, which isn't a good sign.
- Maybe this calculation isn't time-consuming enough for dask to make a difference at all?

The first thing to look into is definitely the second bullet point. If the processes aren't running on multiple cores, then nothing else is going to work either.

Okay, I went to http://gyre.ldeo.columbia.edu:19999/#menu_users_submenu_cpu;theme=slate;help=true and the natl and datl runs both sat right at (or slightly over) 100% CPU. So I don't think anything is being parallelized. What to try next?
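
One way to test the second bullet directly is to time the same dask computation under the single-threaded and threaded schedulers and compare CPU time with wall time. This is a minimal sketch on synthetic data, not the Argo dataset; the array size and the helper `time_mean` are made up for illustration.

```python
import time
import dask.array as da

def time_mean(scheduler):
    """Return (mean, wall seconds, CPU seconds) for one scheduler."""
    # Synthetic array: 16 chunks of 1000 x 1000 float64 values (about 8 MB each).
    x = da.random.random((4000, 4000), chunks=(1000, 1000))
    wall_start, cpu_start = time.perf_counter(), time.process_time()
    value = float(x.mean().compute(scheduler=scheduler))
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return value, wall, cpu

results = {}
for scheduler in ("synchronous", "threads"):
    results[scheduler] = time_mean(scheduler)
    value, wall, cpu = results[scheduler]
    print(f"{scheduler:12s} mean={value:.4f} wall={wall:.3f}s cpu={cpu:.3f}s")
```

If the threaded run shows CPU time clearly larger than wall time, multiple cores are probably being used. If both schedulers give similar numbers, the computation is likely running serially.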

In [78]:
datl

[Dataset repr for datl, identical to the In [83] output: every variable is split into 11 chunks of up to 1000 profiles along N_PROF.]

- Don't use Argo data to start.
- Start with a self-contained example (creating arrays of arbitrary values using numpy). Test on the LEAP hub first, because we know that parallelization works there.
- May need to look at the task graph to see how the computation is being structured.
- Individual chunks should be about 100 MB in size; the total dataset should be about 2 GB.

- Try a few versions: 1) create an array in memory (just calling np in the notebook), 2) save that array to disk, then load it back so that it is lazily loaded.
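
The plan in these notes could be sketched as follows. The first part builds a lazy dask array of roughly 2 GB with roughly 100 MB chunks and inspects the chunking and task graph without computing anything. The second part compares an in-memory numpy array with a chunked dask view of the same data, at a much smaller size so that it runs quickly. All shapes and names here are illustrative assumptions, and the save-then-reload version is not covered.

```python
import numpy as np
import dask.array as da

# Part 1: lazy array, about 2 GB total in chunks of about 100 MB (float64).
big = da.random.random((2000, 256, 512), chunks=(100, 256, 512))
chunk_bytes = int(np.prod(big.chunksize)) * big.dtype.itemsize
print(f"total {big.nbytes / 1e9:.2f} GB, "
      f"{big.npartitions} chunks of {chunk_bytes / 1e6:.0f} MB")

# Building the reduction only creates a task graph; nothing is computed yet.
lazy_mean = big.mean(axis=0)
n_tasks = len(lazy_mean.__dask_graph__())
print(f"task graph has {n_tasks} tasks")

# Part 2: version 1 from the notes, at reduced size: an array created in
# memory with numpy, then wrapped in dask chunks for comparison.
rng = np.random.default_rng(0)
small = rng.random((200, 64, 128))
small_chunked = da.from_array(small, chunks=(50, 64, 128))

numpy_mean = small.mean(axis=0)
dask_mean = small_chunked.mean(axis=0).compute()
print("results match:", np.allclose(numpy_mean, dask_mean))
```

Calling `lazy_mean.visualize()` would draw the task graph, though it needs the optional graphviz dependency.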