# This notebook shows how to access NetCDF files over the internet using xarray, and how to open multiple files as a single chunked dataset in a simple manner using [kerchunk](https://fsspec.github.io/kerchunk/)

In the Dask tutorial, we load data from a catalogue that is accessible from the cloud.  This notebook shows how that catalogue was created.  

## 1. How to open a NetCDF file (or other formats, e.g. TIFF) over the internet using xarray and fsspec

We have placed multiple NetCDF files on a Swift object store at [CESNET](https://www.eosc.eu/members/cesnet).


In [52]:
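import fsspec
import xarray as xr

filepath = "https://object-store.cloud.muni.cz/swift/v1/foss4g-data/"

# Reading the container URL returns a plain-text listing of its objects.
with fsspec.open(filepath, "r") as f:
    print(f.read())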

CGLS_LTS_1999_2019/
CGLS_LTS_1999_2019/c_gls_NDVI-LTS_1999-2019-0101_GLOBE_VGT-PROBAV_V3.0.1.nc
CGLS_LTS_1999_2019/c_gls_NDVI-LTS_1999-2019-0111_GLOBE_VGT-PROBAV_V3.0.1.nc
CGLS_LTS_1999_2019/c_gls_NDVI-LTS_1999-2019-0121_GLOBE_VGT-PROBAV_V3.0.1.nc
CGLS_LTS_1999_2019/c_gls_NDVI-LTS_1999-2019-0201_GLOBE_VGT-PROBAV_V3.0.1.nc
CGLS_LTS_1999_2019/c_gls_NDVI-LTS_1999-2019-0211_GLOBE_VGT-PROBAV_V3.0.1.nc
CGLS_LTS_1999_2019/c_gls_NDVI-LTS_1999-2019-0221_GLOBE_VGT-PROBAV_V3.0.1.nc
CGLS_LTS_1999_2019/c_gls_NDVI-LTS_1999-2019-0301_GLOBE_VGT-PROBAV_V3.0.1.nc
CGLS_LTS_1999_2019/c_gls_NDVI-LTS_1999-2019-0311_GLOBE_VGT-PROBAV_V3.0.1.nc
CGLS_LTS_1999_2019/c_gls_NDVI-LTS_1999-2019-0321_GLOBE_VGT-PROBAV_V3.0.1.nc
CGLS_LTS_1999_2019/c_gls_NDVI-LTS_1999-2019-0401_GLOBE_VGT-PROBAV_V3.0.1.nc
CGLS_LTS_1999_2019/c_gls_NDVI-LTS_1999-2019-0411_GLOBE_VGT-PROBAV_V3.0.1.nc
CGLS_LTS_1999_2019/c_gls_NDVI-LTS_1999-2019-0421_GLOBE_VGT-PROBAV_V3.0.1.nc
CGLS_LTS_1999_2019/c_gls_NDVI-LTS_1999-2019-0501_GLOBE_VGT-PROBAV_V3

Using xarray and fsspec, you can load these datasets into your notebook without explicitly downloading the files.  

In [21]:
filename='CGLS_LTS_1999_2019/c_gls_NDVI-LTS_1999-2019-1221_GLOBE_VGT-PROBAV_V3.0.1.nc'

In [23]:
%%time
# Open the remote file through fsspec; xarray reads only the bytes it needs.
f = fsspec.open(filepath + filename)
ds = xr.open_dataset(f.open(), engine='h5netcdf')
ds

CPU times: user 136 ms, sys: 40 ms, total: 176 ms
Wall time: 253 ms


The next cells are an optional aside on Zarr: they write a small subset of the dataset to a local Zarr store and show how the store is laid out on disk. Skip them if Zarr has already been covered elsewhere.


In [39]:
ds.sel(lat=slice(80.,70.),lon=slice(70.,90)).chunk({"lat":1000,"lon":1000})#.to_zarr

|       | Array        | Chunk         |
|-------|--------------|---------------|
| Bytes | 9.58 MiB     | 3.81 MiB      |
| Shape | (1121, 2240) | (1000, 1000)  |
| Count | 7 Tasks      | 6 Chunks      |
| Type  | float32      | numpy.ndarray |

(The output repeats this same dask array summary for every data variable.)


In [40]:
ds.sel(lat=slice(80.,70.),lon=slice(70.,90)).chunk({"lat":1000,"lon":1000}).to_zarr('test.zarr',mode='w')

<xarray.backends.zarr.ZarrStore at 0x7f98ab35cf90>

In [35]:
!ls -lart test.zarr

total 64
drwxr-xr-x  5 jovyan jovyan  4096 Aug  1 16:12 ..
-rw-r--r--  1 jovyan jovyan    24 Aug  1 16:12 .zgroup
-rw-r--r--  1 jovyan jovyan   956 Aug  1 16:12 .zattrs
drwxr-xr-x  2 jovyan jovyan  4096 Aug  1 16:12 lon
drwxr-xr-x  2 jovyan jovyan  4096 Aug  1 16:12 crs
drwxr-xr-x  2 jovyan jovyan  4096 Aug  1 16:12 lat
-rw-r--r--  1 jovyan jovyan 11311 Aug  1 16:12 .zmetadata
drwxr-xr-x 11 jovyan jovyan  4096 Aug  1 16:12 .
drwxr-xr-x  2 jovyan jovyan  4096 Aug  1 16:12 max
drwxr-xr-x  2 jovyan jovyan  4096 Aug  1 16:12 mean
drwxr-xr-x  2 jovyan jovyan  4096 Aug  1 16:12 median
drwxr-xr-x  2 jovyan jovyan  4096 Aug  1 16:12 min
drwxr-xr-x  2 jovyan jovyan  4096 Aug  1 16:12 nobs
drwxr-xr-x  2 jovyan jovyan  4096 Aug  1 16:12 stdev


In [36]:
!ls -lart test.zarr/max

total 68
-rw-r--r--  1 jovyan jovyan   345 Aug  1 16:12 .zarray
-rw-r--r--  1 jovyan jovyan   522 Aug  1 16:12 .zattrs
drwxr-xr-x 11 jovyan jovyan  4096 Aug  1 16:12 ..
-rw-r--r--  1 jovyan jovyan 11898 Aug  1 16:12 0.0
-rw-r--r--  1 jovyan jovyan 14182 Aug  1 16:12 0.1
-rw-r--r--  1 jovyan jovyan  4082 Aug  1 16:12 0.2
-rw-r--r--  1 jovyan jovyan  5869 Aug  1 16:12 1.0
-rw-r--r--  1 jovyan jovyan  7979 Aug  1 16:12 1.1
-rw-r--r--  1 jovyan jovyan  4082 Aug  1 16:12 1.2
drwxr-xr-x  2 jovyan jovyan  4096 Aug  1 16:12 .
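
The six chunk files listed above (`0.0` to `1.2`) follow directly from the array shape and the requested chunk size. Here is a minimal sketch of that arithmetic in plain Python, assuming the `.`-separated chunk keys seen in the listing; the helper name `chunk_keys` is made up for illustration.

```python
import itertools
import math

def chunk_keys(shape, chunks):
    """Return the chunk keys of a regular chunk grid, e.g. '0.0', '0.1', ..."""
    # Number of chunks along each dimension (the last chunk may be partial).
    grid = [math.ceil(n / c) for n, c in zip(shape, chunks)]
    return [".".join(map(str, idx))
            for idx in itertools.product(*(range(g) for g in grid))]

# Shape and chunk size taken from the dask summary of the selected region.
keys = chunk_keys(shape=(1121, 2240), chunks=(1000, 1000))
print(len(keys), keys)
```

Two chunks along `lat` times three along `lon` gives the six files; the chunks at the edges hold fewer values, which is consistent with some files in the listing being smaller.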


The optional Zarr aside ends here.

## 2. Access multiple files as a single chunked dataset in a simple manner using [kerchunk](https://fsspec.github.io/kerchunk/)

Opening multiple files at once is very useful for optimizing a workflow. Each file can later serve as a 'chunk' that Dask processes in parallel.  For example, 
we can use [`xr.open_mfdataset`](https://docs.xarray.dev/en/stable/generated/xarray.open_mfdataset.html) to open multiple files.  

Some file formats (HDF5, NetCDF4 built on HDF5, ..) are already 'chunked' internally. Using kerchunk, we can load multiple files as one dataset whose chunks follow both the file boundaries and the chunks stored inside each file.  

### 2.1 How do we do it?

First, upgrade kerchunk to make sure that you have version 0.0.7.

In [87]:
!pip install --upgrade kerchunk

Collecting kerchunk
  Downloading kerchunk-0.0.7-py3-none-any.whl (41 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 41.6/41.6 kB 1.6 MB/s eta 0:00:00
Installing collected packages: kerchunk
  Attempting uninstall: kerchunk
    Found existing installation: kerchunk 0.0.6
    Uninstalling kerchunk-0.0.6:
      Successfully uninstalled kerchunk-0.0.6
Successfully installed kerchunk-0.0.7


In [41]:
import fsspec



We create the list of file URLs:

In [43]:
fileb="https://object-store.cloud.muni.cz/swift/v1/foss4g-data/CGLS_LTS_1999_2019/c_gls_NDVI-LTS_1999-2019-"
filee="1_GLOBE_VGT-PROBAV_V3.0.1.nc"
urls= [ fileb + f'{m:02}' + f'{p:01}' +filee   for m in range(1,13) for p in range(0,3)]
#urls
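
As a quick check of the comprehension above, this sketch rebuilds the list and confirms that it yields 36 URLs, three per month (days 01, 11 and 21). It only repeats the string logic from the cell, with renamed variables `base` and `tail`, and does not touch the network.

```python
base = ("https://object-store.cloud.muni.cz/swift/v1/foss4g-data/"
        "CGLS_LTS_1999_2019/c_gls_NDVI-LTS_1999-2019-")
tail = "1_GLOBE_VGT-PROBAV_V3.0.1.nc"

# The month is zero-padded to two digits; p (0, 1, 2) followed by the
# leading "1" of `tail` gives the day of month: 01, 11, 21.
urls = [f"{base}{m:02}{p}{tail}" for m in range(1, 13) for p in range(3)]

print(len(urls))
print(urls[0].rsplit("/", 1)[-1])   # first file name
print(urls[-1].rsplit("/", 1)[-1])  # last file name
```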

We extract the chunk information from each file using `kerchunk.hdf`:

In [2]:
import kerchunk.hdf
singles = []
for u in urls:
    with fsspec.open(u) as inf:
        h5chunks = kerchunk.hdf.SingleHdf5ToZarr(inf, u, inline_threshold=100)
        singles.append(h5chunks.translate())
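
Each entry of `singles` is a plain dictionary of references. The sketch below mimics the layout of kerchunk's version 1 reference format, as far as I understand it, with a tiny hand-written example. It shows how a chunk reference of the form `[url, offset, length]` maps to an HTTP byte-range request. The offsets, the lengths and the helper `range_header` are invented for illustration; real values come from `translate()`.

```python
# A hand-written stand-in for one translated file (byte values are made up).
url = ("https://object-store.cloud.muni.cz/swift/v1/foss4g-data/"
       "CGLS_LTS_1999_2019/c_gls_NDVI-LTS_1999-2019-0101_GLOBE_VGT-PROBAV_V3.0.1.nc")
reference = {
    "version": 1,
    "refs": {
        # Small metadata is stored inline as a string.
        ".zgroup": '{"zarr_format": 2}',
        # A chunk is stored as [url, byte offset, byte length].
        "max/0.0": [url, 40000, 11898],
        "max/0.1": [url, 51898, 14182],
    },
}

def range_header(ref):
    """Build the HTTP Range header value that fetches one referenced chunk."""
    _, offset, length = ref
    # HTTP byte ranges are inclusive on both ends.
    return f"bytes={offset}-{offset + length - 1}"

for key, ref in reference["refs"].items():
    if isinstance(ref, list):
        print(key, range_header(ref))
```

This is why no data is copied when the references are built: the dictionary only records where each chunk lives inside the original NetCDF file.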

Let's see how each file has been translated.  You can already see the 'chunks' of a single file.

- [ ] Compare the 'chunk' output from `ds` and `ds_single`  

In [5]:
import xarray as xr
ds_single = xr.open_mfdataset(
    "reference://", engine="zarr",
    backend_kwargs={
        "storage_options": {
            "fo": singles[3],
        },
        "consolidated": False
    }
)
ds_single


|       | Array          | Chunk         |
|-------|----------------|---------------|
| Bytes | 2.36 GiB       | 14.28 MiB     |
| Shape | (15680, 40320) | (1207, 3102)  |
| Count | 170 Tasks      | 169 Chunks    |
| Type  | float32        | numpy.ndarray |

(The output repeats this same dask array summary for every data variable.)


Now we will combine all 36 files into one kerchunk catalogue and open it as an xarray dataset.

In [44]:
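import re
# Captures the four-digit month-day code (MMDD) from a file name.
# Note: this pattern is not used by the call below, which indexes time with 'INDEX'.
pattern = re.compile(r'.*c_gls_NDVI-LTS_1999-2019-([0-9]{4})_GLOBE_VGT-PROBAV_V3\.0\.1\.nc')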


from kerchunk.combine import MultiZarrToZarr
mzz = MultiZarrToZarr(
    singles,
    coo_map={'time': 'INDEX'},
    identical_dims=['crs'],
    #remote_protocol="https",
    #remote_options={'anon': True},
    concat_dims=["time"]
)

out = mzz.translate()
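
Conceptually, concatenating along a new `time` dimension prepends a time index to every chunk key, so that chunk `max/0.0` of the fourth file becomes `max/3.0.0`. The toy function below illustrates that renaming on hand-written dictionaries. It is a simplified sketch under that assumption and not kerchunk's actual implementation, which also rewrites the array metadata and the coordinates; the function name and the file names `a.nc` and `b.nc` are hypothetical.

```python
def combine_along_new_dim(per_file_refs):
    """Merge per-file chunk references, adding a leading index per file."""
    combined = {}
    for i, refs in enumerate(per_file_refs):
        for key, ref in refs.items():
            var, chunk = key.split("/")
            # The file position becomes the index along the new dimension.
            combined[f"{var}/{i}.{chunk}"] = ref
    return combined

# Two hypothetical files with one variable and two chunks each.
file_a = {"max/0.0": ["a.nc", 0, 10], "max/0.1": ["a.nc", 10, 10]}
file_b = {"max/0.0": ["b.nc", 0, 10], "max/0.1": ["b.nc", 10, 10]}

combined = combine_along_new_dim([file_a, file_b])
print(sorted(combined))
```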

In [11]:
import xarray as xr
ds = xr.open_mfdataset(
    "reference://", engine="zarr",
    backend_kwargs={
        "storage_options": {
            "fo": out,
        },
        "consolidated": False
    }
)
ds


|       | Array              | Chunk           |
|-------|--------------------|-----------------|
| Bytes | 84.79 GiB          | 14.28 MiB       |
| Shape | (36, 15680, 40320) | (1, 1207, 3102) |
| Count | 6085 Tasks         | 6084 Chunks     |
| Type  | float32            | numpy.ndarray   |

(The output repeats this same dask array summary for every data variable.)
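
The numbers in this summary can be checked by hand from the shape and the chunk shape shown above. Here is a short sketch in plain Python, assuming 4 bytes per `float32` value.

```python
import math

shape = (36, 15680, 40320)   # combined dataset: 36 files of 15680 x 40320
chunks = (1, 1207, 3102)     # chunk shape reported by dask
itemsize = 4                 # bytes per float32 value

# Chunks per dimension, rounding up for the partial chunk at the edge.
grid = [math.ceil(n / c) for n, c in zip(shape, chunks)]
n_chunks = math.prod(grid)

total_gib = math.prod(shape) * itemsize / 2**30
chunk_mib = math.prod(chunks) * itemsize / 2**20

print(grid, n_chunks)                            # chunks per dimension, total
print(round(total_gib, 2), round(chunk_mib, 2))  # array GiB, chunk MiB
```

Each file contributes 13 x 13 = 169 chunks, so the 36 files together give the 6084 chunks reported for the combined dataset.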


We can save the catalogue to a JSON file and then load the dataset from that file in the local directory.

In [15]:
import json
jsonfile='c_gls_NDVI-LTS_1999-2019.json'
with open(jsonfile, mode='w') as f :
    json.dump(out, f)
    

In [14]:
!ls *json

c_gls_NDVI-LTS_1999-2019.json
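
Because the catalogue is ordinary JSON, it can be inspected without xarray. The sketch below writes a tiny hand-made reference dictionary to a temporary file, reads it back, and counts the references per variable. The dictionary contents and the helper `count_refs` are illustrative assumptions; with the real file you would load `c_gls_NDVI-LTS_1999-2019.json` instead.

```python
import json
import os
import tempfile
from collections import Counter

def count_refs(catalogue):
    """Count reference keys per variable (the part of the key before '/')."""
    return Counter(key.split("/")[0] for key in catalogue["refs"])

# Hand-made stand-in for `out`; a real catalogue holds thousands of keys.
toy = {"version": 1,
       "refs": {".zgroup": '{"zarr_format": 2}',
                "max/.zarray": "{}",
                "max/0.0.0": ["a.nc", 0, 10],
                "min/0.0.0": ["a.nc", 10, 10]}}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.json")
    with open(path, "w") as f:
        json.dump(toy, f)
    with open(path) as f:
        loaded = json.load(f)

counts = count_refs(loaded)
print(counts)
```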


In [16]:
import xarray as xr
ds = xr.open_mfdataset(
    "reference://", engine="zarr",
    backend_kwargs={
        "storage_options": {
            "fo":'./c_gls_NDVI-LTS_1999-2019.json',
        },
        "consolidated": False
    }
)
ds

|       | Array              | Chunk           |
|-------|--------------------|-----------------|
| Bytes | 84.79 GiB          | 14.28 MiB       |
| Shape | (36, 15680, 40320) | (1, 1207, 3102) |
| Count | 6085 Tasks         | 6084 Chunks     |
| Type  | float32            | numpy.ndarray   |

(The output repeats this same dask array summary for every data variable.)


The catalogue can also be shared in the cloud and loaded from there.  

In [47]:
filepath="https://object-store.cloud.muni.cz/swift/v1/foss4g-catalogue/"

with fsspec.open(filepath, "r") as f:
    print(f.read())

c_gls_NDVI-LTS_1999-2019.json


In [51]:
catalogue="https://object-store.cloud.muni.cz/swift/v1/foss4g-catalogue/c_gls_NDVI-LTS_1999-2019.json"
ds = xr.open_mfdataset(
    "reference://", engine="zarr",
    backend_kwargs={
        "storage_options": {
            "fo":catalogue
                    },
        "consolidated": False
    }
)
ds

|       | Array              | Chunk           |
|-------|--------------------|-----------------|
| Bytes | 84.79 GiB          | 14.28 MiB       |
| Shape | (36, 15680, 40320) | (1, 1207, 3102) |
| Count | 6085 Tasks         | 6084 Chunks     |
| Type  | float32            | numpy.ndarray   |

(The output repeats this same dask array summary for every data variable.)


Now both the catalogue and the original data are stored in the cloud, so the data are accessible even from Dask workers that do not have access to your local NFS disk space.  
**Now you are ready to work on your data with Dask workers from Dask Gateway!**