# Individual Dataset preprocessing

🚨 Run this using the project image: `quay.io/jbusecke/scale-aware-air-sea:68b654d76dce` (I think `7a675e9538c5` also works). 🚨

Newer tags (I tried `3dd162bc47c3`, which broke badly) cause issues with xarray writing to zarr (see notes below).

- [ ] TODO: Find out which version of the pangeo-notebook image introduced this regression!
    - Is this related to gcm-filters at all?

I produced this data with a fairly wild mix of dask clusters (gateway, coiled, possibly a local cluster).

- [ ] TODO: Reproduce this with a single type of cluster? Coiled is much cheaper thanks to spot instances, but the free core-hours are limited; something to discuss later. Rerunning everything right now would be very wasteful, so this should be saved for a review or another revision.


## Notes
In the past I had trouble writing stores properly, but these problems seem to be transient. They include:

- all-NaN chunks/slices in the final array
- not being able to write a store again after deleting it with fsspec/gcsfs (running `fs.invalidate_cache()` seems to fix this, but it is still annoying)
- some algorithms just silently fail (see comments above and [here](https://github.com/ocean-transport/scale-aware-air-sea/issues/68))
- simply selecting a few time slices for the `all_terms` output causes issues
- coiled fails with the regridding?
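
The delete-then-rewrite problem from the list above can be sketched as a small helper. This is only a sketch: `remove_store` is a hypothetical name, and it assumes `fs` is an fsspec-style filesystem (such as `gcsfs.GCSFileSystem`) that provides `exists`, `rm` and `invalidate_cache`. That a stale listing cache is the cause is my reading of the symptom, not something confirmed here.

```python
def remove_store(fs, path):
    """Delete `path` if it exists, then drop the filesystem's cached listings.

    `fs` is assumed to be an fsspec-style filesystem object.
    `remove_store` is a hypothetical helper, not part of this project.
    """
    if fs.exists(path):
        # Zarr stores are directories of many objects, so remove recursively
        fs.rm(path, recursive=True)
    # Clear cached listings so later exists()/write calls see the current state
    fs.invalidate_cache()
```

Calling `remove_store(fs, path)` before the write loop keeps the deletion and the cache invalidation together, so the second step cannot be forgotten.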

In [1]:
# This should really be baked into the docker image, but for now I keep making too many changes.

In [2]:
# !pip install -e /home/jovyan/PROJECTS/scale-aware-air-sea --no-deps

In [3]:
from scale_aware_air_sea.parameters import get_params
from scale_aware_air_sea.stages import preprocess
from scale_aware_air_sea.stages_tests import test_data_preprocessing, test_smoothed_data, test_data_flux
from scale_aware_air_sea.utils import weighted_coarsen, filter_inputs_dataset, to_zarr_split

import numpy as np
import xarray as xr
import gcsfs

In [4]:
# Reproducibility info
import os
os.environ['JUPYTER_IMAGE']

'quay.io/jbusecke/scale-aware-air-sea:68b654d76dce'

In [5]:
xr.__version__

'2023.2.0'

In [6]:
fs = gcsfs.GCSFileSystem(requester_pays=True)

In [7]:
# Write the datasets out to scratch (they are very large!)
# TODO: I could probably achieve this with a decorator (wrapping a function that takes the path as input).
# For now, let's just write them to disk.
# Load global parameters
params = get_params('v1.0.1', test=False)
models = ['CM26', 'CESM']
full_check = False  # run computationally expensive tests
plot = False
xr.set_options(keep_attrs=True)

<xarray.core.options.set_options at 0x7aa4ba545a80>
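
The decorator idea from the TODO in the cell above might look roughly like the following. This is a minimal sketch, not what the notebook actually does: `cached_to_store` and its `exists`, `write` and `read` arguments are hypothetical names, and the caller supplies the three callables.

```python
import functools


def cached_to_store(exists, write, read):
    """Build a decorator that recomputes a result only if its store is missing.

    exists(path) -> bool, write(result, path) and read(path) are supplied by
    the caller; all names here are hypothetical.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(path, *args, **kwargs):
            if not exists(path):
                print(f"Did not find {path}. Recomputing output")
                write(func(*args, **kwargs), path)
            print(f"Reloading data from {path}")
            # Always return the reloaded result, never the in-memory one
            return read(path)
        return wrapper
    return decorator


# Hypothetical usage with the names from this notebook:
# cached = cached_to_store(
#     fs.exists,
#     lambda ds, p: ds.to_zarr(p),
#     lambda p: xr.open_dataset(p, engine="zarr", chunks={}),
# )
# ds_reloaded = cached(preprocess)(path, fs, model)
```

Returning the reloaded object rather than the freshly computed one mirrors the loops below, where tests always run on what was actually written.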

In [8]:
import dask
from dask_gateway import Gateway

gateway = Gateway()

# close existing clusters
open_clusters = gateway.list_clusters()
print(list(open_clusters))
if len(open_clusters)>0:
    for c in open_clusters:
        cluster = gateway.connect(c.name)
        cluster.shutdown()  

# options = gateway.cluster_options()
# options.worker_memory = 110 #in anticipation of the new workers.
# options.worker_cores = 14
# display(options)

# # Create a cluster with those options
# cluster = gateway.new_cluster(options)
# display(cluster)
# client = cluster.get_client()
# # cluster.adapt(1, 100)
# cluster.scale(2)
# client

options = gateway.cluster_options()
# options.worker_resource_allocation = '4CPU, 28.9Gi'
options.worker_resource_allocation = '8CPU, 57.9Gi'
options.idle_timeout_minutes = 30
# # options.worker_memory = 28*3 #in anticipation of the new workers.
# # options.worker_cores = 12 
# # options.worker_memory = 110 #in anticipation of the new workers.
# # options.worker_cores = 15
# options.worker_memory = 7 #in anticipation of the new workers.
# options.worker_cores = 1

# Tried to get more memory per worker to cope with the poor scaling of the flux calculation.
# Did not get this to work, so back to full utilization and `to_zarr_split`.
display(options)

[]



Options<instance_type='n2-highmem-16',
        worker_resource_allocation='8CPU, 57.9Gi',
        image='quay.io/jbusecke/scale-aware-air-sea:68b654d76dce',
        environment={'SCRATCH_BUCKET': 'gs://leap-scratch/jbusecke',
         'PANGEO_SCRATCH': 'gs://leap-scratch/jbusecke'},
        idle_timeout_minutes=30>


In [9]:
# Create a cluster with those options
cluster = gateway.new_cluster(options)
display(cluster)
client = cluster.get_client()
# cluster.scale(2)
cluster.scale(50)
client


Connection method: Cluster object
Cluster type: dask_gateway.GatewayCluster
Dashboard: /services/dask-gateway/clusters/prod.cc44595957e548c8bb4682b171108fd7/status


In [10]:
# from distributed import Client
# client = Client()
# client

In [11]:
# FIXME: This produces NaN-only slices in the preprocessed data (maybe a dependency issue with xesmf?)

# import coiled
# import dask
# cluster = coiled.Cluster(
#     n_workers=40,
#     worker_cpu=[2,4],
#     worker_memory=["8GiB", "16GiB"],
#     spot_policy='spot_with_fallback',
#     account='jbusecke',
#     wait_for_workers=False,
# )
# cluster.set_keepalive("30 minutes")
# client = cluster.get_client()
# display(cluster)
# client

In [22]:
fs.invalidate_cache()

In [23]:
fs.exists(path)

False

In [24]:
data_preprocessing = {}
for model in models:
    path = params['paths'][model]['preprocessing']['scratch']
    
    if not fs.exists(path):
        
        print(f'Did not find {path}. Recomputing output')
        ds = preprocess(fs, model)
        ds.attrs['model'] = model
        
        print(f"Start Writing to zarr {path}")
        ds.to_zarr(path)
    
    print(f"Reloading data from {path}")
    ds_reloaded = xr.open_dataset(path, engine='zarr', chunks={})
    display(ds_reloaded)
    
    print(f"Testing reloaded data")
    test_data_preprocessing(ds_reloaded, full_check=full_check)
    
    data_preprocessing[model] = ds_reloaded

Did not find gs://leap-scratch/jbusecke/scale-aware-air-sea/v1.0.1/temp/CM26.zarr. Recomputing output
CM26: Loading Data
Load Data
Interpolating ocean velocities
Modify units
CM26: Align in time
CM26: Regridding atmosphere (this takes a while, because we are computing the weights on the fly)
CM26: Merging on ocean tracer grid
CM26: Calculate relative wind
CM26: Drop extra coords
Start Writing to zarr gs://leap-scratch/jbusecke/scale-aware-air-sea/v1.0.1/temp/CM26.zarr
Reloading data from gs://leap-scratch/jbusecke/scale-aware-air-sea/v1.0.1/temp/CM26.zarr


[xarray HTML repr of the reloaded CM26 dataset. Variable names were lost in the export; the repeated per-variable tables reduce to the following array layouts, each with 2 graph layers.]

| Data type | Shape | Chunk shape | Array size | Chunk size | Dask chunks |
|---|---|---|---|---|---|
| float32 | (2700, 3600) | (2700, 3600) | 37.08 MiB | 37.08 MiB | 1 |
| bool | (7305, 2700, 3600) | (3, 2700, 3600) | 66.13 GiB | 27.81 MiB | 2435 |
| float32 | (7305, 2700, 3600) | (3, 2700, 3600) | 264.51 GiB | 111.24 MiB | 2435 |


Testing reloaded data
Reloading data from gs://leap-scratch/jbusecke/scale-aware-air-sea/v1.0.1/temp/CESM.zarr


[xarray HTML repr of the reloaded CESM dataset. Variable names were lost in the export; the repeated per-variable tables reduce to the following array layouts, each with 2 graph layers.]

| Data type | Shape | Chunk shape | Array size | Chunk size | Dask chunks |
|---|---|---|---|---|---|
| float64 | (2400, 3600) | (2400, 3600) | 65.92 MiB | 65.92 MiB | 1 |
| bool | (730, 2400, 3600) | (1, 2400, 3600) | 5.87 GiB | 8.24 MiB | 730 |
| float32 | (730, 2400, 3600) | (1, 2400, 3600) | 23.50 GiB | 32.96 MiB | 730 |


Testing reloaded data
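
Since all-NaN slices were one of the failure modes listed in the notes, a cheap spot check on a small subset can be sketched as below. This is not the logic of `test_data_preprocessing`, which is not shown in this notebook. `all_nan_time_steps` is a hypothetical helper, and it assumes the leading axis is time and that the subset fits in memory.

```python
import numpy as np


def all_nan_time_steps(values):
    """Return the indices along axis 0 whose entire slice is NaN."""
    values = np.asarray(values, dtype=float)
    # Reduce over every axis except the first (assumed to be time)
    other_axes = tuple(range(1, values.ndim))
    return np.flatnonzero(np.isnan(values).all(axis=other_axes))


# Small demonstration: only time step 2 is entirely NaN
demo = np.ones((4, 2, 3))
demo[2] = np.nan
demo[0, 0, 0] = np.nan  # a single missing value does not count
print(all_nan_time_steps(demo))

# For the full dask-backed arrays, a lazy xarray equivalent would be something
# like ds[name].isnull().all(["yt_ocean", "xt_ocean"]), assuming those are the
# horizontal dimension names of the variable in question.
```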


## Coarsen/Filter the inputs

In [27]:
smoothed_data_raw = {'filter':{}, 'coarse':{}}
for smoothing_method in ['coarse', 'filter']:
    for model in models:
        path = params['paths'][model]['smoothing'][smoothing_method]
        if not fs.exists(path):
            print(f'Did not find {path}. Recomputing output')
            ds_in = data_preprocessing[model]
            display(ds_in)

            
            if smoothing_method == 'coarse':
                print(f"Coarsening {model}")
                ds_out = weighted_coarsen(
                    ds_in, 
                    {'xt_ocean':params['n_coarsen'], 'yt_ocean':params['n_coarsen']}, 
                    'area_t'
                )
                ds_out.attrs['n_coarsen'] = params['n_coarsen']
                
            elif smoothing_method == 'filter':
                
                print(f"Filtering {model}")
                
                # The filtering fails when `ice_mask` is present, for reasons not yet understood.
                # TODO: debug what is going on here.
                ds_in = ds_in.drop_vars(['ice_mask'])
                # `ice_mask` can always be recovered from the original dataset.
                
                ds_out = filter_inputs_dataset(
                    ds_in,
                    ['yt_ocean', 'xt_ocean'], 
                    params['filter_scale'], 
                    filter_type=params['filter_type'],
                )
                ds_out.attrs['filter_type'] = params['filter_type']
                ds_out.attrs['filter_scale'] = params['filter_scale']

            ds_out.attrs['smoothing_method'] = smoothing_method
            display(ds_out)

            print(f"Start Writing to zarr {path} (Size in memory: {ds_out.nbytes/1e12}TB)")
            if smoothing_method=='coarse':
                cluster.scale(100)
            elif smoothing_method=='filter':
                cluster.scale(700)

            ds_out.to_zarr(path)
        print(f"Reloading data from {path}")
        ds_reloaded = xr.open_dataset(path, engine='zarr', chunks={})
        
        #TODO: remove this for final version
        ds_reloaded.attrs['model'] = model
        # TODO: end
        
        display(ds_reloaded)
        print(f"Testing reloaded output")
        test_smoothed_data(data_preprocessing[model], ds_reloaded, plot=plot, full_check=full_check)
        smoothed_data_raw[smoothing_method][model] = ds_reloaded

Reloading data from gs://leap-persistent/jbusecke/scale-aware-air-sea/v1.0.1/smoothing/CM26_coarse_50.zarr


[xarray HTML repr of the reloaded coarsened CM26 dataset. Variable names were lost in the export; the repeated per-variable tables reduce to the following array layouts, each with 2 graph layers.]

| Data type | Shape | Chunk shape | Array size | Chunk size | Dask chunks |
|---|---|---|---|---|---|
| float32 | (54, 72) | (54, 72) | 15.19 kiB | 15.19 kiB | 1 |
| float64 | (7305, 54, 72) | (3, 54, 72) | 216.69 MiB | 91.12 kiB | 2435 |
| float32 | (7305, 54, 72) | (3, 54, 72) | 108.34 MiB | 45.56 kiB | 2435 |


Testing reloaded output
Reloading data from gs://leap-persistent/jbusecke/scale-aware-air-sea/v1.0.1/smoothing/CESM_coarse_50.zarr


Dataset repr, one row per distinct array layout:
Arrays,Data type,Shape,Chunk shape,Array bytes,Chunk bytes,Chunks
2D,float64,"(48, 72)","(48, 72)",27.00 kiB,27.00 kiB,1
3D,float64,"(730, 48, 72)","(1, 48, 72)",19.25 MiB,27.00 kiB,730


Testing reloaded output
Reloading data from gs://leap-persistent/jbusecke/scale-aware-air-sea/v1.0.1/smoothing/CM26_filter.zarr


Dataset repr, one row per distinct array layout:
Arrays,Data type,Shape,Chunk shape,Array bytes,Chunk bytes,Chunks
2D (single chunk),float32,"(2700, 3600)","(2700, 3600)",37.08 MiB,37.08 MiB,1
2D (chunked),float32,"(2700, 3600)","(338, 450)",37.08 MiB,594.14 kiB,64
3D,float32,"(7305, 2700, 3600)","(3, 2700, 3600)",264.51 GiB,111.24 MiB,2435


Testing reloaded output
Reloading data from gs://leap-persistent/jbusecke/scale-aware-air-sea/v1.0.1/smoothing/CESM_filter.zarr


Dataset repr, one row per distinct array layout:
Arrays,Data type,Shape,Chunk shape,Array bytes,Chunk bytes,Chunks
2D,float64,"(2400, 3600)","(300, 450)",65.92 MiB,1.03 MiB,64
3D,float32,"(730, 2400, 3600)","(1, 2400, 3600)",23.50 GiB,32.96 MiB,730


Testing reloaded output
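
The array sizes and chunk counts reported in the reprs above can be checked by hand, which is a cheap sanity test before launching a write of this size. The following is a minimal sketch using only the standard library; the helper names `array_nbytes` and `n_chunks` are hypothetical (they are not part of this project, xarray, or dask), and the shapes are copied from the `CM26_filter.zarr` output.

```python
import math


def array_nbytes(shape, itemsize):
    """Uncompressed size in bytes of an array with this shape and item size."""
    return math.prod(shape) * itemsize


def n_chunks(shape, chunks):
    """Number of chunks when each dimension is split into blocks of the given size."""
    return math.prod(math.ceil(s / c) for s, c in zip(shape, chunks))


# Values copied from the CM26_filter.zarr repr; float32 means 4 bytes per element.
shape, chunks, itemsize = (7305, 2700, 3600), (3, 2700, 3600), 4

total_gib = array_nbytes(shape, itemsize) / 2**30
chunk_mib = array_nbytes(chunks, itemsize) / 2**20
print(f"{total_gib:.2f} GiB, {chunk_mib:.2f} MiB, {n_chunks(shape, chunks)} chunks")
```

These numbers should agree with the `Bytes` and `Dask graph` rows of the repr. The same helpers also cover the horizontally chunked 2D arrays, where the last chunk along each dimension is only partially filled.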


## Recompute fluxes from 

In [28]:
from aerobulk import noskin
def compute_fluxes(
    ds,
    algo,
    method,
    sst_name = 'surface_temp',
    t_name = 't_ref',
    q_name = 'q_ref',
    u_name = 'u_relative',
    v_name = 'v_relative',
    slp_name = 'slp',
    skin_correction = False
):
    ds = ds.copy() # TODO: Does this help with inplace modification? If so, why?
    # input dependent on method
    # FIXME: Technically we should probably make the 'filtered' suffix optional, since this could also be coarsened?
    if method == 'smooth_tracer':
        sst = ds[sst_name+'_filtered']
        t = ds[t_name+'_filtered']
        q = ds[q_name+'_filtered']
        u = ds[u_name]
        v = ds[v_name]
        slp = ds[slp_name+'_filtered']
    elif method == 'smooth_vel':
        sst = ds[sst_name]
        t = ds[t_name]
        q = ds[q_name]
        u = ds[u_name+'_filtered']
        v = ds[v_name+'_filtered']
        slp = ds[slp_name]
    elif method == 'smooth_vel_tracer_atmos':
        sst = ds[sst_name]
        t = ds[t_name+'_filtered']
        q = ds[q_name+'_filtered']
        u = ds[u_name+'_filtered_atmos_only']
        v = ds[v_name+'_filtered_atmos_only']
        slp = ds[slp_name+'_filtered']
    elif method == 'smooth_vel_tracer_ocean':
        sst = ds[sst_name+'_filtered']
        t = ds[t_name]
        q = ds[q_name]
        u = ds[u_name+'_filtered_ocean_only']
        v = ds[v_name+'_filtered_ocean_only']
        slp = ds[slp_name]
    elif method == 'smooth_vel_ocean':
        sst = ds[sst_name]
        t = ds[t_name]
        q = ds[q_name]
        u = ds[u_name+'_filtered_ocean_only']
        v = ds[v_name+'_filtered_ocean_only']
        slp = ds[slp_name]
    elif method == 'smooth_all':
        sst = ds[sst_name+'_filtered']
        t = ds[t_name+'_filtered']
        q = ds[q_name+'_filtered']
        u = ds[u_name+'_filtered']
        v = ds[v_name+'_filtered']
        slp = ds[slp_name+'_filtered']
    elif method == 'smooth_none':
        sst = ds[sst_name]
        t = ds[t_name]
        q = ds[q_name]
        u = ds[u_name]
        v = ds[v_name]
        slp = ds[slp_name]
    else:
        raise ValueError(f'`method` {method} not recognized')
        
    # if skin_correction:
    #     func = noskin
    
    ## test ranges on first timestep
    noskin(
        sst.isel(time=0),
        t.isel(time=0),
        q.isel(time=0),
        u.isel(time=0),
        v.isel(time=0),
        slp=slp.isel(time=0),
        algo=algo,
        zt=2,
        zu=10,
        input_range_check=True
    )
    
    
    ds_out = xr.Dataset()
    (
        ds_out['ql'],
        ds_out['qh'],
        ds_out['taux'],
        ds_out['tauy'],
        ds_out['evap']
    ) =  noskin(
        sst,
        t,
        q,
        u,
        v,
        slp=slp,
        algo=algo,
        zt=2,
        zu=10,
        input_range_check=False
    )
    return ds_out


def _concat_flux_methods(ds, algo, smoothing_fields, skin_correction):
    algo_datasets = []
    for method in smoothing_fields:
        ds_method = compute_fluxes(ds, algo, method, skin_correction=skin_correction)
        ds_method = ds_method.assign_coords(smoothing=method)
        algo_datasets.append(ds_method)

    ds_algo = xr.concat(algo_datasets, dim='smoothing')
    # `skin` was an undefined name here; the intended suffix is the literal string '_skin'
    ds_algo = ds_algo.assign_coords({'algo':f"{algo}{'_skin' if skin_correction else ''}"})
    return ds_algo

def _combine_filtered_unfiltered(ds_unfiltered: xr.Dataset, ds_filtered: xr.Dataset) -> xr.Dataset:
    
    ds_out = ds_unfiltered.copy() # shallow copy, so the caller's dataset does not gain the new variables in place
    
    filter_vars = ['u_relative', 'v_relative', 'surface_temp', 't_ref', 'q_ref', 'slp', 'u_ocean', 'v_ocean', 'u_ref', 'v_ref']
     # add the filtered variables to the original dataset
    for var in filter_vars:
        ds_out[var+'_filtered'] = ds_filtered[var]
    
    # Rebuild a second relative wind from only ocean filtered velocities 
    # I think this is easy enough to do on the fly and avoids storing 4 more full fields in the filtered output 
    ds_out['u_relative_filtered_ocean_only'] = ds_out['u_ref'] - ds_out['u_ocean_filtered']
    ds_out['v_relative_filtered_ocean_only'] = ds_out['v_ref'] - ds_out['v_ocean_filtered']

    ds_out['u_relative_filtered_atmos_only'] = ds_out['u_ref_filtered'] - ds_out['u_ocean']
    ds_out['v_relative_filtered_atmos_only'] = ds_out['v_ref_filtered'] - ds_out['v_ocean']
    return ds_out
    

def flux_compute_wrapper_filter(unfiltered:xr.Dataset, filtered:xr.Dataset, algo_options:list[tuple[str, bool]], smoothing_fields: list[str], ice_mask:xr.DataArray) -> xr.Dataset:
    """Wrapper to apply the flux computation for different algorithms and smoothed fields
    
    """
    # merge filtered variables into unfiltered dataset
    ds = _combine_filtered_unfiltered(unfiltered, filtered)
    
    # calculate fluxes for each specified algo and concat
    datasets = []
    for algo, skin_correction in algo_options:
        ds_algo = _concat_flux_methods(ds, algo, smoothing_fields, skin_correction)
        datasets.append(ds_algo)
    ds_out = xr.concat(datasets, dim='algo')
    
    # mask with ice_mask
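    # The ice mask needs to be applied afterwards, because NaNs that vary in space and time lead to problems with the flux calculation
    ds_out = ds_out.where(ice_mask)
    # use the `filtered` argument here (the original referenced `ds_filter`, which is not defined in this function)
    ds_out.attrs = ds_out.attrs | filtered.attrs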
    return ds_out

# coarsened data needs a totally different wrapper!!!
def flux_compute_wrapper_coarse(coarsened:xr.Dataset, filtered_fluxes:xr.Dataset, coarsen_dim_dict:dict[str, int]) -> xr.Dataset:
    
    datasets = []
    # align both datasets on inner for time
    exclude_dims = set(filtered_fluxes.dims)-set(['time'])
    filtered_fluxes, coarsened  = xr.align(filtered_fluxes, coarsened, join='inner', exclude=exclude_dims)
    
    # make sure the area of the filtered_fluxes array is masked consistently with the variables (needed for `weighted_coarsen`)
    nanmask = np.isnan(filtered_fluxes.qh.isel(time=0).drop_vars('time'))
    area_masked = filtered_fluxes.area_t.where(~nanmask, 0.0)
    filtered_fluxes = filtered_fluxes.assign_coords(area_t=area_masked)
    
    # for this dataset we only have two methods
    # 1. The equivalent to 'smooth_all' where we compute fluxes on the coarsened output
    # This needs to be computed over every algo
    # to avoid misalignment, let's capture the algo from the filtered_fluxes dataset
    # TODO: Once we actually implement the skin_correction, that needs to be somehow extracted from the algo dimension
    # for now hardcode
    skin_correction = False
    iter_algos = filtered_fluxes.algo.data
    # in case there is only a single algo
    filtered_fluxes = filtered_fluxes.squeeze()
    for algo in iter_algos:
        # The equivalent to 'smooth_all' where we compute fluxes on the coarsened output
        ds_coarse_single_algo = compute_fluxes(coarsened, algo, 'smooth_none', skin_correction=skin_correction)
        # FIXME: this naming is not easy to understand. I am computing the fluxes on the coarsened dataset,
        # which is equivalent to 'smooth_all' in filtered_fluxes. But for `compute_fluxes` we need to pretend that this is the full-res dataset (input 'smooth_none').
        # This is obviously a poor design for this function and should be remedied.
        # I could rename the coarsened data before?
        ds_coarse_single_algo = ds_coarse_single_algo.assign_coords(smoothing='smooth_all')
        datasets.append(ds_coarse_single_algo)
    ds_all_coarse = xr.concat(datasets, dim='algo')
    # and the equivalent to smooth none, where we coarsen the full res flux output
    # this is based on the other precomputed data, so all the algos are there already
    
    # keep only the variables we also have on the filtered_fluxes data
    ds_all_coarse = ds_all_coarse[list(filtered_fluxes.data_vars)]
    
    ds_none_coarse = weighted_coarsen(
        filtered_fluxes.sel(smoothing='smooth_none'), 
        coarsen_dim_dict,
        'area_t'
    )
    ds_none_coarse = ds_none_coarse.assign_coords(smoothing='smooth_none')
    
    # See note about the ice mask above: `ds_none_coarse` already incorporates the ice mask (because it is applied as part of the flux calculation), and so this dataset has *an*
    # ice mask. For now let's just apply this to the final output.
    # TODO: discuss if the coarsening is appropriate for treating the ice mask. Currently, AFAIK, every larger box that has at least a single valid value in it will show up in the coarsened
    # output; thus e.g. Central America's land barrier disappears.
    ice_mask_coarse = ~np.isnan(ds_none_coarse.ql.reset_coords(drop=True))
    ds_all_coarse = ds_all_coarse.where(ice_mask_coarse)
    ds_all_coarse.coords['ice_mask'] = ice_mask_coarse

    # Finally concat along the 'smoothing' dimension (FIXME: better naming for this would be nice).
    ds_full_coarse = xr.concat([ds_none_coarse, ds_all_coarse], dim='smoothing', compat='override', coords='minimal')
    ds_full_coarse.attrs = ds_full_coarse.attrs | coarsened.attrs
    return ds_full_coarse
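
The area masking above (setting `area_t` to zero wherever the flux is NaN) matters because the coarsening is an area-weighted mean. The project's `weighted_coarsen` is not shown in this notebook, so the following is only a minimal NumPy sketch of the idea, under the assumption that it computes a weighted block mean. The function name `weighted_block_mean` and the toy arrays are hypothetical.

```python
import numpy as np

def weighted_block_mean(field, area, n):
    """Area-weighted mean over non-overlapping n x n blocks; NaN cells get zero weight."""
    ny, nx = field.shape
    assert ny % n == 0 and nx % n == 0, "shape must be divisible by n"
    missing = np.isnan(field)
    # Zero out the area wherever the field is missing (mirrors `area_masked` above)
    area = np.where(missing, 0.0, area)
    weighted = np.where(missing, 0.0, field) * area
    # Reshape to (blocks_y, n, blocks_x, n) and sum within each block
    num = weighted.reshape(ny // n, n, nx // n, n).sum(axis=(1, 3))
    den = area.reshape(ny // n, n, nx // n, n).sum(axis=(1, 3))
    # Blocks without any valid cell have zero total area and become NaN
    return np.where(den > 0, num / np.where(den > 0, den, 1.0), np.nan)

field = np.array([
    [1.0, 3.0, np.nan, np.nan],
    [1.0, 3.0, np.nan, np.nan],
    [2.0, np.nan, 4.0, 4.0],
    [2.0, 2.0, 4.0, 4.0],
])
area = np.ones_like(field)
coarse = weighted_block_mean(field, area, 2)
print(coarse)
```

Note that the lower-left block stays valid although one of its four cells is missing. This is the behavior flagged in the ice-mask TODO: any coarse box with at least one valid cell appears in the output.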

In [29]:
fs.invalidate_cache() # needed to sync gcsfs when deleting large stores...

In [30]:
flux_time_slice = {
    'prod': slice(0,None),
    'appendix': slice(0, 365)
}
smoothing_fields = {
    'prod':[
        'smooth_none',
        'smooth_vel_tracer_ocean',
        'smooth_vel_tracer_atmos',
        'smooth_all'
    ],
    'appendix':[
        'smooth_none',
        'smooth_tracer',
        'smooth_vel',
        'smooth_vel_tracer_ocean',
        'smooth_vel_tracer_atmos',
        'smooth_all'
    ],
}
algo_options = {
    'prod':[
        ('ecmwf', False), 
    ],
    'appendix':
    [
        ('ncar', False), 
        ('ecmwf', False), 
        # ('coare3p0', False), # I honestly think we do not need to include this (too similar to the newer coare)
        ('coare3p6', False), # This produces errors for CESM (TODO: raise issue).
        ('andreas', False), # This produces errors for CESM (TODO: raise issue).
        # Weirdly, though, this failed again. I suspect it is an issue with too-high winds or some other out-of-range value...
    ]
}

flux_data = {'filter':{'prod':{}, 'appendix':{}}, 'coarse':{'prod':{}, 'appendix':{}}}

# CESM_fluxes_filter_appendix fails. I suspect due to some issue with large wind values or similar. 
# Needs debugging, but for now I will only run appendix for CM2.6?

# for smoothing_method in ['filter', 'coarse']:
#     for production_spec in ['prod', 'appendix']:
#         for model in models:
for smoothing_method, production_spec, model in [
    # ('filter', 'prod', 'CM26'),
    # ('filter', 'prod', 'CESM'),
    ('filter', 'appendix', 'CM26'),
    ('filter', 'appendix', 'CESM'), # this one fails
    ('coarse', 'prod', 'CM26'),
    ('coarse', 'prod', 'CESM'),
    ('coarse', 'appendix', 'CM26'),
    ('coarse', 'appendix', 'CESM'), # can't do this one without the filtered fluxes
]:
    path = params['paths'][model]['fluxes'][smoothing_method][production_spec]

    if not fs.exists(path):
        print(f'Did not find {path}. Recomputing output')
        
        ds_unfiltered = data_preprocessing[model].isel(time=flux_time_slice[production_spec])
        ds_filter = smoothed_data_raw['filter'][model].isel(time=flux_time_slice[production_spec])
        ds_coarse = smoothed_data_raw['coarse'][model].isel(time=flux_time_slice[production_spec])
        
        print(f"Computing Fluxes {model} {smoothing_method} {production_spec}")
        print(algo_options[production_spec])
        
        a_options = algo_options[production_spec]
        if model == 'CESM':
            # The CESM data has issues with most of the algos.
            # For now only doing ncar and ecmwf
            print('Ad-hoc fix for CESM. Only calculate ecmwf and ncar')
            a_options = [(a, b) for a, b in a_options if a in ['ncar', 'ecmwf']]
        
        if smoothing_method == 'filter':
            ds_out = flux_compute_wrapper_filter(
                ds_unfiltered,
                ds_filter,
                a_options,
                smoothing_fields[production_spec],
                ds_unfiltered['ice_mask'],
            )
        elif smoothing_method == 'coarse':
            # this needs the filtered flux input
            # not a good design here, since we need to keep the loops in this specific order
            ds_filter_fluxes = flux_data['filter'][production_spec][model]

            ds_out = flux_compute_wrapper_coarse(
                ds_coarse,
                ds_filter_fluxes,
                {'xt_ocean':params['n_coarsen'], 'yt_ocean':params['n_coarsen']},
            )
            
            # again, some issue with the ice mask
            ds_out = ds_out.drop(['ice_mask'])

        ds_out.attrs['production_spec'] = production_spec
        ds_out.attrs['model'] = model
        
        # retain only heat fluxes
        ds_out = ds_out[['qh', 'ql']]
        
        # testing: 
        ds_out = ds_out.drop([va for va in ['ice_mask'] if va in ds_out.variables])
        
        display(ds_out)
        
        print(f"Start Writing to zarr {path} (Size in memory: {ds_out.nbytes/1e12}TB)")
        # this somehow still does not work in one go for filtering
        if smoothing_method == 'coarse':
            ds_out.to_zarr(path)
        elif smoothing_method == 'filter':
            # split_interval = 30 if production_spec == 'appendix' else 600
            split_interval = 60 if production_spec == 'appendix' else 600
            to_zarr_split(ds_out, fs.get_mapper(path), split_interval=split_interval)
            # to_zarr_split(ds_out, fs.get_mapper(path), split_dim='algo', split_interval=1)

    print(f"Reloading data from {path}")
    ds_reloaded = xr.open_dataset(path, engine='zarr', chunks={})
    display(ds_reloaded)
    
    print("Testing reloaded output")
    test_data_flux(ds_reloaded, full_check=full_check, plot=plot)
    
    flux_data[smoothing_method][production_spec][model] = ds_reloaded

Reloading data from gs://leap-persistent/jbusecke/scale-aware-air-sea/v1.0.1/fluxes/CM26_fluxes_filter_appendix.zarr


[Dataset display; variable names were not preserved in this export]
1 x 2D array: 37.08 MiB, shape (2700, 3600), 1 chunk of (2700, 3600) [37.08 MiB], float32
2 x 2D array: 37.08 MiB each, shape (2700, 3600), 64 chunks of (338, 450) [594.14 kiB], float32
2 x 5D array: 396.50 GiB each, shape (5, 6, 365, 2700, 3600), 3660 chunks of (1, 1, 3, 2700, 3600) [111.24 MiB], float32


Testing reloaded output
smoothing_method
production_spec
model
Reloading data from gs://leap-persistent/jbusecke/scale-aware-air-sea/v1.0.1/fluxes/CESM_fluxes_filter_appendix.zarr


[Dataset display; variable names were not preserved in this export]
3 x 2D array: 65.92 MiB each, shape (2400, 3600), 1 chunk of (2400, 3600) [65.92 MiB], float64
2 x 5D array: 140.98 GiB each, shape (2, 6, 365, 2400, 3600), 4380 chunks of (1, 1, 1, 2400, 3600) [32.96 MiB], float32


Testing reloaded output
smoothing_method
production_spec
model
Reloading data from gs://leap-persistent/jbusecke/scale-aware-air-sea/v1.0.1/fluxes/CM26_fluxes_coarse_50_prod.zarr


[Dataset display; variable names were not preserved in this export]
3 x 2D array: 15.19 kiB each, shape (54, 72), 1 chunk of (54, 72) [15.19 kiB], float32
2 x 5D array: 216.69 MiB each, shape (2, 7305, 54, 72, 1), 4870 chunks of (1, 3, 54, 72, 1) [45.56 kiB], float32


Testing reloaded output
smoothing_method
production_spec
model
Reloading data from gs://leap-persistent/jbusecke/scale-aware-air-sea/v1.0.1/fluxes/CESM_fluxes_coarse_50_prod.zarr


[Dataset display; variable names were not preserved in this export]
3 x 2D array: 27.00 kiB each, shape (48, 72), 1 chunk of (48, 72) [27.00 kiB], float64
2 x 5D array: 38.50 MiB each, shape (2, 730, 48, 72, 1), 1460 chunks of (1, 1, 48, 72, 1) [27.00 kiB], float64


Testing reloaded output
smoothing_method
production_spec
model
Reloading data from gs://leap-persistent/jbusecke/scale-aware-air-sea/v1.0.1/fluxes/CM26_fluxes_coarse_50_appendix.zarr


[Dataset display; variable names were not preserved in this export]
1 x 3D array: 75.94 kiB, shape (54, 72, 5), 1 chunk of (54, 72, 5) [75.94 kiB], float32
2 x 2D array: 15.19 kiB each, shape (54, 72), 1 chunk of (54, 72) [15.19 kiB], float32
2 x 5D array: 54.14 MiB each, shape (2, 5, 365, 54, 72), 4880 chunks of (1, 1, 3, 51, 51) [30.48 kiB], float32


Testing reloaded output
smoothing_method
production_spec
model
Reloading data from gs://leap-persistent/jbusecke/scale-aware-air-sea/v1.0.1/fluxes/CESM_fluxes_coarse_50_appendix.zarr


[Dataset display; variable names were not preserved in this export]
1 x 3D array: 54.00 kiB, shape (48, 72, 2), 1 chunk of (48, 72, 2) [54.00 kiB], float64
2 x 2D array: 27.00 kiB each, shape (48, 72), 1 chunk of (48, 72) [27.00 kiB], float64
2 x 5D array: 38.50 MiB each, shape (2, 2, 365, 48, 72), 2920 chunks of (1, 1, 1, 48, 69) [25.88 kiB], float64


Testing reloaded output
smoothing_method
production_spec
model
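
The filtered fluxes in the loop above are written with `to_zarr_split` and a `split_interval` of 60 or 600 time steps, because a single `to_zarr` call did not work for them. The implementation of `to_zarr_split` is not shown here. As a rough sketch, assuming it cuts the time axis into consecutive pieces and writes them one after another, the slicing could look like this (`split_slices` is a hypothetical helper, not part of the project):

```python
def split_slices(n, split_interval):
    """Return consecutive slices of at most `split_interval` elements covering range(n)."""
    if split_interval < 1:
        raise ValueError("split_interval must be a positive integer")
    return [
        slice(start, min(start + split_interval, n))
        for start in range(0, n, split_interval)
    ]

# 365 daily time steps (the 'appendix' case) in pieces of 60
slices = split_slices(365, 60)
for s in slices:
    print(s.start, s.stop)
```

Each piece could then be selected with `ds.isel(time=s)` and written with `Dataset.to_zarr`, using `append_dim='time'` for every piece after the first. Whether the project helper works this way is an assumption.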


## Decompose small scale signals

In [31]:
def small_scale_decomposition(ds: xr.Dataset) -> xr.Dataset:
    """This encodes a lot of the theoretical discussions I had with Dhruv"""
    
    # FIXME: This attribute should be set in the smoothing/coarsening steps.
    method = ds.attrs.get('smoothing_method', None)
    if method is None or method not in ['coarse', 'filter']:
        raise ValueError(
            "Input dataset needs to contain a dataset attribute `smoothing_method` indicating "
            "if the dataset was filtered (value='filter') or coarsened (value='coarse')."
        )
    
    def filt(ds: xr.Dataset) -> xr.Dataset:
        """Utility function to reapply filtering as smoother"""
        return filter_inputs_dataset(ds, ['yt_ocean', 'xt_ocean'], params['filter_scale']) # TODO: should I pass the params in a better way? Maybe create a filtering class?
    
    # NOTE: We don't need to reapply the smoothing for coarsened data, because coarsening is a Reynolds operator (applying it twice changes nothing)
    
    tdict = {}
    
    # Q_H (AB) - high resolution input
    if method == 'filter':
        tdict['Q_H'] = ds.sel(smoothing='smooth_none')
        tdict['Q_H_bar'] = filt(tdict['Q_H'])
    elif method == 'coarse':
        tdict['Q_H_bar'] = ds.sel(smoothing='smooth_none')
        
    # Q_L low resolution input
    if method == 'filter': 
        tdict['Q_L'] = ds.sel(smoothing='smooth_all')
        tdict['Q_L_bar'] = filt(tdict['Q_L'])
        tdict['Q_L_prime'] = tdict['Q_L'] - tdict['Q_L_bar'] # TODO: I could potentially compute this on the fly...
    elif method == 'coarse':
        tdict['Q_L_bar'] = ds.sel(smoothing='smooth_all')
        
    
    # Inferred Small scale
    if method == 'filter':
        tdict['Q_star'] = tdict['Q_H_bar'] - tdict['Q_L']
        tdict['Q_star_star'] = tdict['Q_H_bar'] - tdict['Q_L_bar']
    elif method == 'coarse':
        tdict['Q_star_star'] = tdict['Q_H_bar'] - tdict['Q_L_bar']
        
    # mixed low resolution input (filtered only)
    if method == 'filter':
        if 'smooth_vel_tracer_ocean' in ds.smoothing and 'smooth_vel_tracer_atmos' in ds.smoothing:
            tdict['Q_L_ocean'] = ds.sel(smoothing='smooth_vel_tracer_ocean')
            tdict['Q_L_ocean_bar'] = filt(tdict['Q_L_ocean'])

            tdict['Q_L_atmos'] = ds.sel(smoothing='smooth_vel_tracer_atmos')
            tdict['Q_L_atmos_bar'] = filt(tdict['Q_L_atmos'])

            tdict['Q_star_star_ocean'] = tdict['Q_H_bar'] - tdict['Q_L_ocean_bar']
            tdict['Q_star_star_atmos'] = tdict['Q_H_bar'] - tdict['Q_L_atmos_bar']

        if 'smooth_vel' in ds.smoothing and 'smooth_tracer' in ds.smoothing:
            tdict['Q_L_vel'] = ds.sel(smoothing='smooth_vel')
            tdict['Q_L_vel_bar'] = filt(tdict['Q_L_vel'])

            tdict['Q_L_tracer'] = ds.sel(smoothing='smooth_tracer')
            tdict['Q_L_tracer_bar'] = filt(tdict['Q_L_tracer'])

            tdict['Q_star_star_vel'] = tdict['Q_H_bar'] - tdict['Q_L_vel_bar']
            tdict['Q_star_star_tracer'] = tdict['Q_H_bar'] - tdict['Q_L_tracer_bar']
    
    # concat into a single dataset
    datasets = [tdict[t].assign_coords(term=t).drop([dvar for dvar in ['smoothing'] if dvar in tdict[t]]) for t in tdict.keys()]
    ds_out = xr.concat(datasets, dim='term', combine_attrs="override")
    ds_out.attrs = ds.attrs
    return ds_out
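
The function above reapplies the filter to obtain the `_bar` terms but skips that step for coarsened data, on the grounds that coarsening is a Reynolds operator. One property behind that statement is idempotence: averaging an already block-averaged field changes nothing, whereas a second pass of a smoothing filter does. The sketch below illustrates this in 1D with NumPy. `block_mean` and `running_mean` are hypothetical stand-ins, not the project's `weighted_coarsen` or `filter_inputs_dataset`.

```python
import numpy as np

def block_mean(x, n):
    """Average blocks of n points, then repeat the means back to full length."""
    return np.repeat(x.reshape(-1, n).mean(axis=1), n)

def running_mean(x, width=3):
    """Periodic boxcar filter (stand-in for a spatial smoothing filter)."""
    half = width // 2
    return sum(np.roll(x, s) for s in range(-half, half + 1)) / width

x = np.array([0.0, 4.0, 1.0, 3.0, 2.0, 6.0, 5.0, 7.0])

coarse_once = block_mean(x, 2)
coarse_twice = block_mean(coarse_once, 2)
filtered_once = running_mean(x)
filtered_twice = running_mean(filtered_once)

# Block averaging is idempotent, the boxcar filter is not
coarse_is_idempotent = np.allclose(coarse_once, coarse_twice)
filter_is_idempotent = np.allclose(filtered_once, filtered_twice)
print(coarse_is_idempotent, filter_is_idempotent)
```

This is why the filter branch needs both `Q_star = Q_H_bar - Q_L` and `Q_star_star = Q_H_bar - Q_L_bar`, while the coarse branch only defines `Q_star_star`.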

In [32]:
def _test_timesteps(ds:xr.Dataset):
    assert 'model' in ds.attrs
    prod_spec = ds.attrs.get('production_spec')
    if prod_spec == 'appendix':
        assert len(ds.time) == 365
    elif prod_spec == 'prod':
        if ds.attrs['model'] == 'CESM':
            assert len(ds.time) == 730
        elif ds.attrs['model'] == 'CM26':
            assert len(ds.time) == 7305

def test_data_results(ds:xr.Dataset, plot=False, full_check=False):
    for attr in ['smoothing_method', 'production_spec', 'model', 'time_spec']:
        print(attr)
        assert attr in ds.attrs.keys()
    if 'time' in ds.dims:
        _test_timesteps(ds)

    assert len(ds.term.data) == len(set(ds.term.data))
    
    assert set(['Q_H_bar', 'Q_star_star']).issubset(set(ds.term.data))

    # test that there are no all nan maps anywhere
    if full_check:
        nan_test = np.isnan(ds).all(['xt_ocean', 'yt_ocean']).to_array().sum()
        assert nan_test.data == 0

    if plot:
        for va in ds.data_vars:
            plt.figure()
            da_plot = ds[va].isel(algo=0, xt_ocean=slice(0, 5), yt_ocean=slice(0, 5))
            display(da_plot)
#             if 'time' in da_plot.dims:
#                 if len(da_plot.time) > 5:
#                     da_plot = da_plot.isel(time=[0,90,180]) # load here to separate loading from plotting issues
#                 else:
#                     da_plot = da_plot
#                 kwargs = dict(col='term', row='time', robust=True)
                
#             else:
                
#                 kwargs= dict(col='term', robust=True)
            
#             da_plot.plot(**kwargs)
#             plt.show()
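
The `full_check` branch in `test_data_results` counts horizontal maps that consist only of NaNs, which is one of the write failures listed in the notes at the top. A minimal NumPy sketch of the same check follows, assuming the two horizontal dimensions are the last two axes. `count_all_nan_maps` and the toy array are hypothetical.

```python
import numpy as np

def count_all_nan_maps(data):
    """Count 2D maps (last two axes) that contain only NaN values."""
    return int(np.isnan(data).all(axis=(-2, -1)).sum())

# Toy array with dims (algo, time, y, x)
data = np.ones((2, 3, 4, 5))
data[0, 0, 0, 0] = np.nan  # a single missing cell does not count
data[1, 2] = np.nan        # one completely empty map

n_bad = count_all_nan_maps(data)
print(n_bad)
```

A nonzero count would correspond to a failing `assert nan_test.data == 0` in the test above.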

In [33]:
# paths = [
#     # 'gs://leap-persistent/jbusecke/scale-aware-air-sea/v1.0.1/results/CM26_fluxes_filter_decomposed_mean_prod.zarr',
#     # 'gs://leap-persistent/jbusecke/scale-aware-air-sea/v1.0.1/results/CESM_fluxes_filter_decomposed_mean_prod.zarr',
#     # 'gs://leap-persistent/jbusecke/scale-aware-air-sea/v1.0.1/results/CM26_fluxes_coarse_decomposed_mean_prod.zarr',
#     # 'gs://leap-persistent/jbusecke/scale-aware-air-sea/v1.0.1/results/CESM_fluxes_coarse_decomposed_mean_prod.zarr',
#     # 'gs://leap-persistent/jbusecke/scale-aware-air-sea/v1.0.1/results/CM26_fluxes_filter_decomposed_native_all_terms.zarr',
#     # 'gs://leap-persistent/jbusecke/scale-aware-air-sea/v1.0.1/results/CM26_fluxes_coarse_decomposed_native_all_terms.zarr',
#     # 'gs://leap-scratch/jbusecke/scale-aware-air-sea/v1.0.1/results/CM26_fluxes_filter_decomposed_native_all_terms.zarr',
#     # 'gs://leap-scratch/jbusecke/scale-aware-air-sea/v1.0.1/results/CM26_fluxes_coarse_decomposed_native_all_terms.zarr',
# ]
# for path in paths:
#     if fs.exists(path):
#         fs.rm(path, recursive=True)

In [34]:
# Define a dict tree that can be written into at multiple levels
# https://stackoverflow.com/questions/5369723/multi-level-defaultdict-with-variable-depth/8702435#8702435
from collections import defaultdict
nested_dict = lambda: defaultdict(nested_dict)
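
As a small illustration of how this recursive `defaultdict` behaves, the sketch below writes one value four levels deep without creating the intermediate levels by hand. The key names mirror the loop further down. The placeholder value and the `to_plain` helper are assumptions added for this example only.

```python
from collections import defaultdict


def nested_dict():
    """Return a defaultdict whose missing keys are themselves nested dicts."""
    return defaultdict(nested_dict)


demo = nested_dict()
# Intermediate levels are created on first access
demo['filter']['appendix']['mean']['CM26'] = 'dataset placeholder'


def to_plain(d):
    """Recursively convert nested defaultdicts to regular dicts (hypothetical helper)."""
    return {k: to_plain(v) if isinstance(v, defaultdict) else v for k, v in d.items()}


plain = to_plain(demo)
print(plain)

# Caveat: merely reading a missing key also creates it
_ = demo['coarse']
print('coarse' in demo)  # True
```

Because lookups create keys as a side effect, it is safer to check membership with `in` than to index into the tree when testing whether a result already exists.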


In [35]:
fs.invalidate_cache() # needed to sync gcsfs when deleting large stores...

In [36]:
fix_persistent_bug = False
keep_terms = {
    # TODO: I think I should write one native-time-resolution dataset with just ['Q_H_bar', 'Q_star_star'] for the histograms in fig 1 and 2 for the final pub.
    'prod': ['Q_H_bar', 'Q_star_star', 'Q_star_star_ocean', 'Q_star_star_atmos'],
    'appendix': ['Q_H_bar', 'Q_star_star', 'Q_star_star_ocean','Q_star_star_atmos','Q_star_star_vel', 'Q_star_star_tracer'],
    'all_terms': [
        'Q_H', 'Q_H_bar', 'Q_L', 'Q_L_bar', 'Q_L_prime', 'Q_star', 'Q_star_star', 'Q_L_ocean',
        'Q_L_ocean_bar', 'Q_L_atmos', 'Q_L_atmos_bar', 'Q_star_star_ocean', 'Q_star_star_atmos', 'Q_L_vel',
        'Q_L_vel_bar', 'Q_L_tracer', 'Q_L_tracer_bar', 'Q_star_star_vel', 'Q_star_star_tracer'
    ],
}

result_data = nested_dict()

for smoothing_method, production_spec, model, time_spec in [
    
    # ('filter', 'prod', 'CM26', 'mean'),
    # ('filter', 'prod', 'CM26', 'native'), # Ideally we want this for the histogram on fig 1 and the 2d histogram on  
    # ('filter', 'prod', 'CESM', 'mean'), 
    # # ('filter', 'prod', 'CESM', 'native'), # This is hella expensive. only do if totally necessary
    ('filter', 'appendix', 'CM26', 'mean'),
    ('filter', 'appendix', 'CM26', 'native'),
    ('filter', 'appendix', 'CESM', 'mean'), # this one fails
    ('filter', 'appendix', 'CESM', 'native'), # this one fails
    # ('filter', 'all_terms', 'CM26', 'mean'), #TODO; activate
    # ('filter', 'all_terms', 'CM26', 'native'),
    # ('filter', 'all_terms', 'CESM', 'mean'),
    # ('filter', 'all_terms', 'CESM', 'native'), # this would need a CESM appendix output. Probably not needed at this point.
    
    # ('coarse', 'prod', 'CM26', 'mean'),
    ('coarse', 'prod', 'CM26', 'native'), # I could consider writing these if we need them..
    # ('coarse', 'prod', 'CESM', 'mean'),
    # ('coarse', 'prod', 'CESM', 'native'), # I could consider writing these if we need them..
    # ('coarse', 'appendix', 'CM26', 'mean'),
    # ('coarse', 'appendix', 'CM26', 'native'),
    # ('coarse', 'appendix', 'CESM', 'mean'), # can't do this one without the filtered
    # ('coarse', 'appendix', 'CESM', 'native'), # can't do this one without the filtered
    # ('coarse', 'all_terms', 'CM26', 'mean'), # TODO: activate
    # ('coarse', 'all_terms', 'CM26', 'native'),
    # ('coarse', 'all_terms', 'CESM', 'mean'),
    # ('coarse', 'all_terms', 'CESM', 'native'), # Needs CESM appendix output (see above)
]:
    path = params['paths'][model]['results'][smoothing_method][time_spec][production_spec]

    if not fs.exists(path):
        print(f'Did not find {path}. Recomputing output')
        if production_spec == 'all_terms':
            # for the all_terms production_spec use the appendix flux data (more algos)
            # and then restrict it to a few days, otherwise this is getting VERY LARGE
            # FIXME: This is a bit hacky for now but works fine. Let's see how this fits into the beam data model.
            ds_flux = flux_data[smoothing_method]['appendix'][model]
            
            # FIXME: I do not understand why the encoding chunks vs dask chunks gets messed up
            # But for now try to brute-force fix it with rechunking?
            ds_flux = ds_flux.isel(time=[0, 90, 180, 270])
        else:
            ds_flux = flux_data[smoothing_method][production_spec][model]
        
        print(f"Computing Small Scale Decomposition {model} {smoothing_method} {production_spec} {time_spec}")
    
        if time_spec == 'mean':
            ds_flux = ds_flux.mean('time')
    
        ds_out = small_scale_decomposition(ds_flux)
    
        # reduce the terms for the production output to save space
        ds_out = ds_out.sel(term=[t for t in keep_terms[production_spec] if t in ds_out.term])
    
        ds_out.attrs['time_spec'] = time_spec
        ds_out.attrs['production_spec'] = production_spec
        
        if time_spec == 'native':
            cluster.scale(200)
        else:
            cluster.scale(100)
        
        if fix_persistent_bug:
        
            # Work around the encoding-chunks vs. dask-chunks mismatch: strip the chunk encoding and rechunk explicitly.
            ds_out = strip_chunk_encoding(ds_out).chunk({'time':3, 'xt_ocean':-1, 'yt_ocean':-1, 'algo':1, 'term': 1})

            # ds_out = ds_out.persist()
            # # persisting helps, but leads to chunking issues again.

            display(ds_out)
            # end debug
            # overwrite path for testing
            # path = 'gs://leap-scratch/jbusecke/test.zarr'
            # ok there is something wrong with the persistent bucket....need a reproducer

            # ds_out.to_zarr(path, mode='w', safe_chunks=False) #, mode='w'
            # ds_out.to_zarr('some_local_store.zarr', mode='w')

            # There seems to be something wrong with the persistent bucket here.
            # Let's try this: write to the scratch path, then move the store on the backend.
            scratch_path = path.replace('persistent', 'scratch')
            print(f"DEBUG: Start Writing to zarr {scratch_path} (Size in memory: {ds_out.nbytes/1e12}TB)")
            ds_out.to_zarr(scratch_path)
            print(f"DEBUG: Moving {scratch_path} to {path}")
            fs.mv(scratch_path, path, recursive=True)
            # fs.rm(scratch_path, recursive=True)
        else:
            print(f"Start Writing to zarr {path} (Size in memory: {ds_out.nbytes/1e12}TB)") 
            display(ds_out)
            if 'time' in ds_out.dims:
                to_zarr_split(ds_out, fs.get_mapper(path), split_interval=100)
            else:
                ds_out.to_zarr(path)


    print(f"Reloading data from {path}")
    ds_reloaded = xr.open_dataset(path, engine='zarr', chunks={})
    display(ds_reloaded)
    
    print("Testing reloaded output")
    test_data_results(ds_reloaded, full_check=True, plot=False)

    result_data[smoothing_method][production_spec][time_spec][model] = ds_reloaded

Reloading data from gs://leap-persistent/jbusecke/scale-aware-air-sea/v1.0.1/results/CM26_fluxes_filter_decomposed_mean_appendix.zarr


Distinct dask array layouts in the displayed dataset:
Shape,Chunk shape,Array bytes,Chunk bytes,Chunks,Data type
"(2700, 3600)","(2700, 3600)",37.08 MiB,37.08 MiB,1,float32
"(2700, 3600)","(338, 450)",37.08 MiB,594.14 kiB,64,float32
"(6, 5, 2700, 3600)","(1, 1, 2700, 3600)",1.09 GiB,37.08 MiB,30,float32

Testing reloaded output
smoothing_method
production_spec
model
time_spec
Reloading data from gs://leap-persistent/jbusecke/scale-aware-air-sea/v1.0.1/results/CM26_fluxes_filter_decomposed_native_appendix.zarr


Distinct dask array layouts in the displayed dataset:
Shape,Chunk shape,Array bytes,Chunk bytes,Chunks,Data type
"(2700, 3600)","(2700, 3600)",37.08 MiB,37.08 MiB,1,float32
"(2700, 3600)","(338, 450)",37.08 MiB,594.14 kiB,64,float32
"(6, 5, 365, 2700, 3600)","(1, 1, 3, 2700, 3600)",396.50 GiB,111.24 MiB,3660,float32

Testing reloaded output
smoothing_method
production_spec
model
time_spec
Reloading data from gs://leap-persistent/jbusecke/scale-aware-air-sea/v1.0.1/results/CESM_fluxes_filter_decomposed_mean_appendix.zarr


Distinct dask array layouts in the displayed dataset:
Shape,Chunk shape,Array bytes,Chunk bytes,Chunks,Data type
"(2400, 3600)","(2400, 3600)",65.92 MiB,65.92 MiB,1,float64
"(6, 2, 2400, 3600)","(1, 1, 2400, 3600)",395.51 MiB,32.96 MiB,12,float32

Testing reloaded output
smoothing_method
production_spec
model
time_spec
Reloading data from gs://leap-persistent/jbusecke/scale-aware-air-sea/v1.0.1/results/CESM_fluxes_filter_decomposed_native_appendix.zarr


Distinct dask array layouts in the displayed dataset:
Shape,Chunk shape,Array bytes,Chunk bytes,Chunks,Data type
"(2400, 3600)","(2400, 3600)",65.92 MiB,65.92 MiB,1,float64
"(6, 2, 365, 2400, 3600)","(1, 1, 1, 2400, 3600)",140.98 GiB,32.96 MiB,4380,float32

Testing reloaded output
smoothing_method
production_spec
model
time_spec
Reloading data from gs://leap-persistent/jbusecke/scale-aware-air-sea/v1.0.1/results/CM26_fluxes_coarse_decomposed_native_prod_50.zarr


Distinct dask array layouts in the displayed dataset:
Shape,Chunk shape,Array bytes,Chunk bytes,Chunks,Data type
"(54, 72)","(54, 72)",15.19 kiB,15.19 kiB,1,float32
"(2, 7305, 54, 72, 1)","(1, 3, 54, 72, 1)",216.69 MiB,45.56 kiB,4870,float32

Testing reloaded output
smoothing_method
production_spec
model
time_spec
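
For outputs with a `time` dimension, the loop above hands the write to `to_zarr_split` with `split_interval=100`. The implementation of that project helper is not shown in this notebook, so the sketch below is only an assumption about the general idea: partition the time axis into consecutive batches that can be written one after another. The function name `time_slices` is hypothetical.

```python
def time_slices(n_time: int, split_interval: int) -> list:
    """Split range(n_time) into consecutive slices of at most `split_interval` steps."""
    if split_interval <= 0:
        raise ValueError("split_interval must be a positive integer")
    return [
        slice(start, min(start + split_interval, n_time))
        for start in range(0, n_time, split_interval)
    ]


# 365 daily steps in batches of 100 give three full batches and a remainder of 65
slices = time_slices(365, 100)
for sl in slices:
    print(sl.start, sl.stop)
```

With xarray, each batch could then be selected with `ds.isel(time=sl)` and written in turn, for example by creating the store with the first batch and passing `append_dim='time'` to `to_zarr` for the later ones. Whether `to_zarr_split` works this way is not confirmed here.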


In [37]:
cluster.shutdown()