# Intake Catalog Demo

This short notebook shows how to access nested catalogs inside an `intake` catalog.

In [1]:
import intake
import xarray
import pandas as pd

Open the HyTEST intake catalog and list its entries.

In [2]:
cat = intake.open_catalog("https://raw.githubusercontent.com/hytest-org/hytest/main/dataset_catalog/hytest_intake_catalog.yml")
list(cat)

['conus404-drb-eval-tutorial-catalog',
 'conus404-hourly-onprem',
 'conus404-hourly-cloud',
 'conus404-daily-onprem',
 'conus404-daily-diagnostic-onprem',
 'conus404-daily-cloud',
 'conus404-daily-diagnostic-cloud',
 'conus404-monthly-onprem',
 'conus404-monthly-cloud',
 'nwis-streamflow-usgs-gages-onprem',
 'nwis-streamflow-usgs-gages-cloud',
 'nwm21-streamflow-usgs-gages-onprem',
 'nwm21-streamflow-usgs-gages-cloud',
 'nwm21-streamflow-cloud',
 'nwm21-scores',
 'lcmap-cloud',
 'conus404-hourly-cloud-dev',
 'nhm-v1.0-daymet-byHRU-onprem',
 'nhm-v1.0-daymet-byHW-musk-onprem',
 'nhm-v1.0-daymet-byHW-musk-obs-onprem',
 'nhm-v1.0-daymet-byHW-noroute-onprem',
 'nhm-v1.0-daymet-byHW-noroute_obs-onprem',
 'nhm-v1.1-gridmet-byHRU-onprem',
 'nhm-v1.1-gridmet-byHW-onprem',
 'nhm-v1.1-gridmet-byHWobs-onprem',
 'rechunking-tutorial-cloud']

This catalog contains many datasets and one nested catalog, 'conus404-drb-eval-tutorial-catalog'. To list the nested catalog's contents, index the parent catalog with the nested catalog's name and call `list()` on the result, just as for the parent.

In [3]:
conus404_drb_cat = cat["conus404-drb-eval-tutorial-catalog"]
list(conus404_drb_cat)

['conus404-drb-OSN',
 'prism-drb-OSN',
 'ceres-drb-OSN',
 'crn-drb-OSN',
 'hcn-drb-OSN']
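The two-step listing above generalizes to any nesting depth. The following is a minimal sketch of that idea, not part of intake itself: `list_nested` and its `is_catalog` predicate are hypothetical names, and plain dictionaries with placeholder values stand in for the real catalog so the example runs offline.

```python
from collections.abc import Mapping


def list_nested(cat, is_catalog=None, prefix=""):
    """Return 'parent/child' names for every entry, descending into nested catalogs.

    `cat` only needs to support iteration over names and `cat[name]`,
    which is how the parent and nested catalogs are used in this notebook.
    """
    if is_catalog is None:
        # Default for the dict stand-in below; a real intake catalog
        # would need its own predicate (see the note after the block).
        is_catalog = lambda entry: isinstance(entry, Mapping)

    names = []
    for name in cat:
        entry = cat[name]
        path = f"{prefix}{name}"
        if is_catalog(entry):
            # Recurse, using the nested catalog's name as a path prefix.
            names.extend(list_nested(entry, is_catalog, prefix=path + "/"))
        else:
            names.append(path)
    return names


# Stand-in data: entry names are taken from the listings above,
# the values are placeholders rather than real data sources.
demo = {
    "conus404-drb-eval-tutorial-catalog": {
        "conus404-drb-OSN": "placeholder",
        "crn-drb-OSN": "placeholder",
    },
    "nwm21-scores": "placeholder",
}

for name in list_nested(demo):
    print(name)
```

With a real catalog, the predicate would have to recognize nested catalog objects, for example by testing `isinstance(entry, intake.catalog.Catalog)`. Treat that check as an assumption about the installed intake version rather than a guaranteed API.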

Examine one of the nested catalog's datasets and note that its read parameters have already been set in the nested catalog.

In [4]:
conus404_drb_cat['conus404-drb-OSN']

conus404-drb-OSN:
  args:
    storage_options:
      anon: true
      client_kwargs:
        endpoint_url: https://renc.osn.xsede.org
      requester_pays: false
    urlpath: s3://rsignellbucket2/hytest/tutorials/conus404_model_evaluation/c404_drb.nc
    xarray_kwargs:
      decode_coords: all
  description: CONUS404 Delaware River Basin subset, 40 years of monthly data for
    CONUS404 model evaluation
  driver: intake_xarray.netcdf.NetCDFSource
  metadata:
    catalog_dir: https://raw.githubusercontent.com/hytest-org/hytest/main/dataset_catalog/subcatalogs


These datasets can be read directly through the nested catalog.

First, read a parquet file into a `pandas` DataFrame.

In [5]:
crn_drb = conus404_drb_cat['crn-drb-OSN'].read()
crn_drb.head()

ValueError: No plugins loaded for this entry: parquet
A listing of installable plugins can be found at https://intake.readthedocs.io/en/latest/plugin-directory.html .
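The `ValueError` above indicates that no parquet driver is registered in this environment. One way to catch this before reading is to check whether the modules behind each driver can be imported. The sketch below makes two assumptions: the helper `missing_driver_modules` is hypothetical, and the parquet driver is assumed to come from the `intake-parquet` package (module `intake_parquet`). The `intake_xarray` module name is taken from the driver path shown in the entry printed earlier.

```python
import importlib.util

# Driver name -> module expected to provide it.
DRIVER_MODULES = {
    # From the entry printed earlier: intake_xarray.netcdf.NetCDFSource
    "netcdf": "intake_xarray",
    # Assumption: the parquet driver is provided by the intake-parquet package
    "parquet": "intake_parquet",
}


def missing_driver_modules(drivers, driver_modules=DRIVER_MODULES):
    """Return {driver: module} for each driver whose module is not importable."""
    missing = {}
    for driver in drivers:
        module = driver_modules[driver]
        # find_spec returns None when a top-level module cannot be found.
        if importlib.util.find_spec(module) is None:
            missing[driver] = module
    return missing


# Report which plugin modules are absent from the current environment.
print(missing_driver_modules(["parquet", "netcdf"]))
```

If a driver is reported as missing, installing the corresponding plugin package and restarting the kernel may let the failing cell run. The plugin directory linked in the error message lists the available packages.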

Second, open a netCDF file lazily with `to_dask()`, which for this xarray-based driver should return a `dask`-backed `xarray` Dataset.

In [None]:
c404_drb = conus404_drb_cat['conus404-drb-OSN'].to_dask()
c404_drb