# Open the NWM 1 km dataset as a ReferenceFileSystem
Create a `ReferenceFileSystem` object by reading references from a 9.8 GB combined JSON file.

Opening the dataset in Xarray takes more than 10 minutes, mostly because of the time spent decoding the very large JSON file. It also requires more than 50 GB of RAM, well beyond the 8 GB or 16 GB typically available to users.
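
To give a feel for why the combined JSON grows so large, here is a minimal sketch of a kerchunk-style reference set (version 1 layout), in which every Zarr chunk key maps to a `[url, offset, length]` triple. The bucket name, file names, chunk counts and byte offsets are hypothetical and chosen only for illustration; they are not taken from the NWM dataset.

```python
import json


def build_refs(n_times, n_chunks_per_time, var="TRAD"):
    """Build a toy reference dict; all URLs and byte ranges are made up."""
    refs = {".zgroup": json.dumps({"zarr_format": 2})}
    for t in range(n_times):
        for c in range(n_chunks_per_time):
            # key: Zarr chunk id, value: [url, byte offset, byte length]
            refs[f"{var}/{t}.{c}.0"] = [
                f"s3://example-bucket/file_{t:06d}.nc",  # hypothetical URL
                4096 + c * 100_000,                      # hypothetical offset
                100_000,                                 # hypothetical length
            ]
    return {"version": 1, "refs": refs}


small = build_refs(n_times=10, n_chunks_per_time=8)
n_chunk_refs = len(small["refs"]) - 1  # exclude the .zgroup entry
bytes_per_ref = len(json.dumps(small)) / n_chunk_refs
print(f"{n_chunk_refs} chunk references, ~{bytes_per_ref:.0f} bytes of JSON each")
```

Because every chunk of every variable at every time step needs its own entry, the JSON size scales roughly linearly with the total number of chunks, and the whole file has to be parsed before the first byte of data can be read.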

In [None]:
import fsspec
import xarray as xr
from fsspec.implementations.reference import ReferenceFileSystem

In [None]:
# Anonymous S3 filesystem on the OSN endpoint, used here only to inspect the JSON file
fs = fsspec.filesystem('s3', anon=True, 
                        client_kwargs={'endpoint_url':'https://ncsa.osn.xsede.org'})

In [None]:
url = 's3://esip/noaa/nwm/grid1km/LDAS_combined.json'

In [None]:
fs.size(url)/1e9  # JSON size in GB

In [None]:
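%%time
# s_opts: how to read the reference JSON itself (anonymous access on the OSN endpoint)
s_opts = {'anon':True, 'client_kwargs':{'endpoint_url':'https://ncsa.osn.xsede.org'}}
# r_opts: how to read the data chunks that the references point to
r_opts = {'anon':True}
# Note: this rebinds `fs`, replacing the plain S3 filesystem created above
fs = ReferenceFileSystem(url, ref_storage_args=s_opts,
                       remote_protocol='s3', remote_options=r_opts)
m = fs.get_mapper("")
# chunks={} gives lazy Dask arrays that follow the chunking stored in the references
ds = xr.open_dataset(m, engine="zarr", chunks={}, backend_kwargs=dict(consolidated=False))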

In [None]:
ds

Examine a specific variable:

In [None]:
ds.TRAD

Compute the uncompressed size of this dataset in TB:

In [None]:
ds.nbytes/1e12   
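
As a rough cross-check of what `nbytes` reports, the uncompressed size of a single array is the product of its dimension lengths times the bytes per element. The sketch below uses a hypothetical shape and dtype size (not the actual NWM dimensions) purely to show the arithmetic.

```python
import math


def uncompressed_nbytes(shape, itemsize):
    """Bytes needed to hold an array of this shape in memory, uncompressed."""
    return math.prod(shape) * itemsize


# Hypothetical example: 1000 time steps on a 4000 x 5000 grid of 4-byte floats
shape = (1000, 4000, 5000)
size_tb = uncompressed_nbytes(shape, itemsize=4) / 1e12
print(f"{size_tb:.2f} TB")
```

For a whole dataset, the total is the sum of this quantity over all of its variables.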

Loading data for a particular time step is fast because the references are already in memory:

In [None]:
%%time
da = ds.TRAD.sel(time='1990-01-01 00:00').load()

Loading data for another time step takes about the same amount of time:

In [None]:
%%time
da = ds.TRAD.sel(time='2015-01-01 00:00').load()
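
The `%%time` magic only works inside IPython or Jupyter. When running the same comparison from a plain Python script, one possible substitute is a small wall-clock timer built on `time.perf_counter`. The `timed` helper and the stand-in workloads below are hypothetical; in practice the body of each `with` block would be the corresponding `.load()` call.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results=None):
    """Print the wall-clock time of the enclosed block and optionally record it."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        if results is not None:
            results[label] = elapsed
        print(f"{label}: {elapsed:.3f} s")


timings = {}
with timed("first time step", timings):
    sum(range(100_000))  # stand-in for ds.TRAD.sel(time=...).load()
with timed("second time step", timings):
    sum(range(100_000))  # stand-in for loading a different time step
```

Unlike `%%time`, this reports wall time only, which is usually the number of interest when most of the cost is waiting on remote reads.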

Compute the mean of the loaded time step over the whole domain:

In [None]:
da.mean().data