# Getting Started

This notebook demonstrates the data catalog functionality provided by `intake-cesm`. Let's begin by importing `intake`:

In [1]:
import intake

## Open a Collection

So far, `intake-cesm` supports data sets from three CESM collections:

- `cesm_dple`: DPLE | Decadal Prediction Large Ensemble Project
- `cesm2_runs`: CESM runs
- `cesm1_le`: LE | Large Ensemble Community Project




To use `intake-cesm`, we instantiate the `cesm_cat` class with the name of the collection we want to use.

Because the class lives at the top level of the package (i.e., in `__init__.py`) and the package name starts with `intake_`, the package is scanned when Intake is imported. The plugin then appears automatically in the set of known plugins in the Intake registry, and an associated `intake.open_cesm_cat` function is created at import time.

In [2]:
cat = intake.open_cesm_cat('cesm_dple')

Active collection: cesm_dple
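The name-based discovery described above can be sketched with the standard library alone. This is only an illustration of the idea (filtering importable modules by the `intake_` prefix), not Intake's actual implementation; the helper name `find_intake_plugins` is hypothetical.

```python
import pkgutil


def find_intake_plugins(module_names, prefix="intake_"):
    """Return the sorted module names that start with ``prefix``.

    Hypothetical helper: it mimics prefix-based plugin discovery and is
    not part of Intake or intake-cesm.
    """
    return sorted(name for name in module_names if name.startswith(prefix))


# Names of all top-level modules importable in the current environment.
installed = [module.name for module in pkgutil.iter_modules()]

# Prints the candidate plugin packages, e.g. a list containing
# 'intake_cesm' if that package is installed.
print(find_intake_plugins(installed))
```

Note that a module named exactly `intake` is not matched, since the prefix includes the trailing underscore.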


In [3]:
cat.df.head()

Unnamed: 0,case,component,date_range,ensemble,experiment,file_basename,files,grid,sequence_order,stream,variable,year_offset,ctrl_branch_year,has_ocean_bgc
0,g.e11_LENS.GECOIAF.T62_g16.009,ocn,"['024901', '031612']",0,hindcast_sigma_coord,g.e11_LENS.GECOIAF.T62_g16.009.pop.h.sigma.NO3...,/glade/p/cgd/oce/projects/DPLE_O2/sigma_coord/...,POP_gx1v6,0,pop.h.sigma,NO3,1699,,
1,g.e11_LENS.GECOIAF.T62_g16.009,ocn,"['024901', '031612']",0,hindcast_sigma_coord,g.e11_LENS.GECOIAF.T62_g16.009.pop.h.sigma.O2....,/glade/p/cgd/oce/projects/DPLE_O2/sigma_coord/...,POP_gx1v6,0,pop.h.sigma,O2,1699,,
2,g.e11_LENS.GECOIAF.T62_g16.009,ocn,"['024901', '031612']",0,hindcast_sigma_coord,g.e11_LENS.GECOIAF.T62_g16.009.pop.h.sigma.SAL...,/glade/p/cgd/oce/projects/DPLE_O2/sigma_coord/...,POP_gx1v6,0,pop.h.sigma,SALT,1699,,
3,g.e11_LENS.GECOIAF.T62_g16.009,ocn,"['024901', '031612']",0,hindcast_sigma_coord,g.e11_LENS.GECOIAF.T62_g16.009.pop.h.sigma.TEM...,/glade/p/cgd/oce/projects/DPLE_O2/sigma_coord/...,POP_gx1v6,0,pop.h.sigma,TEMP,1699,,
4,g.e11_LENS.GECOIAF.T62_g16.009,ice,"['024901', '031612']",0,hindcast,g.e11_LENS.GECOIAF.T62_g16.009.cice.h.FYarea_n...,/glade/p/cesm/community/CESM-DPLE/CESM-DPLE_PO...,,0,cice.h,FYarea_nh,1699,,


In [4]:
len(cat.df)

583

## Set active collection

`intake-cesm` allows the user to switch the active collection by calling `set_collection(collection_name)`:

In [5]:
cat.set_collection('cesm1_le')

Active collection: cesm1_le


In [6]:
cat.df.head()

Unnamed: 0,case,component,date_range,ensemble,experiment,file_basename,files,freq,grid,has_ocean_bgc,sequence_order,variable,year_offset,ctrl_branch_year
0,b.e11.BRCP85C5CNBDRD.f09_g16.105,ice,"['200601', '210012']",105,RCP85,b.e11.BRCP85C5CNBDRD.f09_g16.105.cice.h.hisnap...,/glade/collections/cdg/data/cesmLE/CESM-CAM5-B...,month_1,POP_gx1v6,True,1,hisnap_nh,,
1,b.e11.BRCP85C5CNBDRD.f09_g16.105,ice,"['200601', '210012']",105,RCP85,b.e11.BRCP85C5CNBDRD.f09_g16.105.cice.h.hisnap...,/glade/collections/cdg/data/cesmLE/CESM-CAM5-B...,month_1,POP_gx1v6,True,1,hisnap_sh,,
2,b.e11.BRCP85C5CNBDRD.f09_g16.105,ice,"['200601', '210012']",105,RCP85,b.e11.BRCP85C5CNBDRD.f09_g16.105.cice.h.strair...,/glade/collections/cdg/data/cesmLE/CESM-CAM5-B...,month_1,POP_gx1v6,True,1,strairy_nh,,
3,b.e11.BRCP85C5CNBDRD.f09_g16.105,ice,"['200601', '210012']",105,RCP85,b.e11.BRCP85C5CNBDRD.f09_g16.105.cice.h.strair...,/glade/collections/cdg/data/cesmLE/CESM-CAM5-B...,month_1,POP_gx1v6,True,1,strairy_sh,,
4,b.e11.BRCP85C5CNBDRD.f09_g16.105,ice,"['200601', '210012']",105,RCP85,b.e11.BRCP85C5CNBDRD.f09_g16.105.cice.h.strcor...,/glade/collections/cdg/data/cesmLE/CESM-CAM5-B...,month_1,POP_gx1v6,True,1,strcory_nh,,


In [7]:
len(cat.df)

116275

## Search entries matching query

`intake-cesm` supports querying the collection through the `search` method. A query is specified with keyword arguments, and `search` returns the subset of the collection containing all the entries that match it.

In [8]:
results = cat.search(experiment=['20C', 'RCP85'], component='ocn', ensemble=1, variable='FG_CO2')
results

Unnamed: 0,case,component,date_range,ensemble,experiment,file_basename,files,freq,grid,has_ocean_bgc,sequence_order,variable,year_offset,ctrl_branch_year
64401,b.e11.BRCP85C5CNBDRD.f09_g16.001,ocn,"['200601', '208012']",1,RCP85,b.e11.BRCP85C5CNBDRD.f09_g16.001.pop.h.FG_CO2....,/glade/collections/cdg/data/cesmLE/CESM-CAM5-B...,month_1,POP_gx1v6,True,1,FG_CO2,,
64402,b.e11.BRCP85C5CNBDRD.f09_g16.001,ocn,"['208101', '210012']",1,RCP85,b.e11.BRCP85C5CNBDRD.f09_g16.001.pop.h.FG_CO2....,/glade/collections/cdg/data/cesmLE/CESM-CAM5-B...,month_1,POP_gx1v6,True,1,FG_CO2,,
100755,b.e11.B20TRC5CNBDRD.f09_g16.001,ocn,"['185001', '200512']",1,20C,b.e11.B20TRC5CNBDRD.f09_g16.001.pop.h.FG_CO2.1...,/glade/collections/cdg/data/cesmLE/CESM-CAM5-B...,month_1,POP_gx1v6,True,0,FG_CO2,,
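The result above suggests how the keyword arguments combine: a list value such as `experiment=['20C', 'RCP85']` appears to match any of its members, while the separate keywords must all hold at once. The sketch below reproduces that behaviour on a few toy records in plain Python. It is an assumption about the semantics inferred from the output, not the `intake-cesm` implementation; the standalone `search` function and the sample records are made up for illustration and only borrow column names from the table.

```python
def search(records, **query):
    """Return the records that satisfy every keyword in ``query``.

    Assumed semantics: a list, tuple, or set value matches any of its
    members; a scalar value must be equal; keywords are combined with AND.
    """
    def matches(record):
        for key, wanted in query.items():
            allowed = wanted if isinstance(wanted, (list, tuple, set)) else [wanted]
            if record.get(key) not in allowed:
                return False
        return True

    return [record for record in records if matches(record)]


# Toy records (illustrative only, not real catalog entries).
records = [
    {"experiment": "20C", "component": "ocn", "ensemble": 1, "variable": "FG_CO2"},
    {"experiment": "RCP85", "component": "ocn", "ensemble": 1, "variable": "FG_CO2"},
    {"experiment": "RCP85", "component": "ice", "ensemble": 1, "variable": "hisnap_nh"},
    {"experiment": "RCP85", "component": "ocn", "ensemble": 2, "variable": "FG_CO2"},
]

hits = search(
    records,
    experiment=["20C", "RCP85"],
    component="ocn",
    ensemble=1,
    variable="FG_CO2",
)
print(len(hits))  # prints 2: one 20C record and one RCP85 record
```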


In [9]:
results = cat.search(experiment='RCP85', component='ice', ensemble=1)
results.head()

Unnamed: 0,case,component,date_range,ensemble,experiment,file_basename,files,freq,grid,has_ocean_bgc,sequence_order,variable,year_offset,ctrl_branch_year
63153,b.e11.BRCP85C5CNBDRD.f09_g16.001,ice,"['200601', '208012']",1,RCP85,b.e11.BRCP85C5CNBDRD.f09_g16.001.cice.h.hisnap...,/glade/collections/cdg/data/cesmLE/CESM-CAM5-B...,month_1,POP_gx1v6,True,1,hisnap_nh,,
63154,b.e11.BRCP85C5CNBDRD.f09_g16.001,ice,"['208101', '210012']",1,RCP85,b.e11.BRCP85C5CNBDRD.f09_g16.001.cice.h.hisnap...,/glade/collections/cdg/data/cesmLE/CESM-CAM5-B...,month_1,POP_gx1v6,True,1,hisnap_nh,,
63155,b.e11.BRCP85C5CNBDRD.f09_g16.001,ice,"['200601', '208012']",1,RCP85,b.e11.BRCP85C5CNBDRD.f09_g16.001.cice.h.hisnap...,/glade/collections/cdg/data/cesmLE/CESM-CAM5-B...,month_1,POP_gx1v6,True,1,hisnap_sh,,
63156,b.e11.BRCP85C5CNBDRD.f09_g16.001,ice,"['208101', '210012']",1,RCP85,b.e11.BRCP85C5CNBDRD.f09_g16.001.cice.h.hisnap...,/glade/collections/cdg/data/cesmLE/CESM-CAM5-B...,month_1,POP_gx1v6,True,1,hisnap_sh,,
63157,b.e11.BRCP85C5CNBDRD.f09_g16.001,ice,"['200601', '208012']",1,RCP85,b.e11.BRCP85C5CNBDRD.f09_g16.001.cice.h.strair...,/glade/collections/cdg/data/cesmLE/CESM-CAM5-B...,month_1,POP_gx1v6,True,1,strairy_nh,,


In [10]:
len(results)

592

In [11]:
%load_ext watermark

In [12]:
%watermark --iversion -g -h -m -v -u -d

intake    0.3.0
last updated: 2019-01-26 

CPython 3.6.7
IPython 7.2.0

compiler   : GCC 7.3.0
system     : Linux
release    : 3.12.62-60.64.8-default
machine    : x86_64
processor  : x86_64
CPU cores  : 72
interpreter: 64bit
host name  : r9i1n13
Git hash   : 829f413c7781ef29ee863a36c5961354cc9af4b7
