# ESGF Search demo

The esgf-search package provides a simple interface to the [ESGF search API](https://earthsystemcog.org/projects/cog/esgf_search_restful_api).

It also ships presets for commonly used facets, such as the CMIP5 and CMIP6 projects.

First let's create a generic instance to explore the facets.

In [1]:
import esgf_search

esgf = esgf_search.ESGF()

### Listing facets

We'll print only the first two facets here; remove `[:2]` to view the entire list.

In [2]:
esgf.facets[:2]

['project', 'product']

### Listing values for a specific facet

Again we'll print only the first two values; remove `[:2]` to view the full list.

In [3]:
esgf.facet_values('project')[:2]

['ACME', 'BioClim']
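As a rough illustration of where such a list could come from, the sketch below pulls facet values out of a Solr-style JSON payload. It assumes the search service returns facet counts as a flat `[value, count, value, count, ...]` list under `facet_counts.facet_fields`; the function name `facet_values_from_response` and the sample payload are hypothetical and are not part of esgf-search.

```python
def facet_values_from_response(response, facet):
    """Return the values of one facet from a Solr-style JSON response.

    Hypothetical helper for illustration; not the esgf-search implementation.
    """
    flat = response["facet_counts"]["facet_fields"].get(facet, [])
    # Assumed layout: [value, count, value, count, ...], so the values
    # sit at the even indices.
    return flat[0::2]


# Hand-written sample payload (not real ESGF output).
sample_response = {
    "facet_counts": {
        "facet_fields": {
            "project": ["ACME", 12, "BioClim", 3, "CMIP5", 250],
        }
    }
}

project_values = facet_values_from_response(sample_response, "project")
print(project_values[:2])
```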

### Create an instance to search the CMIP5 project

Here we'll create an instance that will search the CMIP5 project for datasets and limit the results to 10 items per page.

Alternatively, the same result can be achieved with `esgf = esgf_search.CMIP5(type='Dataset', limit=10)`. Presets exist for CMIP5 and CMIP6.

In [4]:
esgf = esgf_search.ESGF(facet_query={'project': 'CMIP5'}, type='Dataset', limit=10)
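To make the defaults above more concrete, here is a minimal sketch of how they might translate into a request against the ESGF search REST API. It assumes the facet query and keyword arguments are sent as plain query parameters; the endpoint in `BASE_URL` and the helper `build_search_url` are assumptions for illustration, not something esgf-search exposes.

```python
from urllib.parse import urlencode

# Assumed search endpoint; substitute the index node you actually use.
BASE_URL = "https://esgf-node.llnl.gov/esg-search/search"


def build_search_url(facet_query, **params):
    """Combine facet constraints and search options into one URL (hypothetical helper)."""
    query = dict(facet_query)
    query.update(params)
    # Ask for JSON unless the caller chose another format.
    query.setdefault("format", "application/solr+json")
    return BASE_URL + "?" + urlencode(query)


url = build_search_url({"project": "CMIP5"}, type="Dataset", limit=10)
print(url)
```

Keeping the defaults in one dictionary means later, more specific searches only need to add their own facets on top.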

Now that we've set up some default search criteria, we can run more specific searches.

For this example we'll look for datasets containing the variable `pr` in the `amip` experiment.

The results will be scrubbed and returned in a pandas [DataFrame](https://pandas.pydata.org/pandas-docs/stable/reference/frame.html).

**Remove `head(2)` to view the entire page of results.**

In [5]:
result = esgf.search(variable='pr', experiment='amip')
result.head(2)

| | id | version | access | cf_standard_name | cmor_table | data_node | dataset_id_template_ | datetime_start | datetime_stop | description | ... | variable_units | west_degrees | _version_ | retracted | _timestamp | score | HTTPServer | GridFTP | Globus | OPENDAP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | cmip5.output1.BCC.bcc-csm1-1.amip.mon.atmos.Am... | 1 | [HTTPServer, GridFTP, OPENDAP, Globus, LAS] | [air_pressure_at_convective_cloud_base, air_pr... | Amon | aims3.llnl.gov | cmip5.output1.%(valid_institute)s.%(model)s.%(... | 1979-01-16T12:00:00Z | 2008-12-16T12:00:00Z | bcc-csm1-1 model output prepared for CMIP5 AMIP | ... | [Pa, Pa, 1e-9, 1, %, 1, kg m-2, %, 1, kg m-2, ... | 0.0 | 1636912851766476800 | False | 2019-06-21T01:49:04.925Z | 1.0 | | | | |
| 1 | cmip5.output1.BCC.bcc-csm1-1.amip.mon.atmos.Am... | 1 | [HTTPServer, GridFTP, OPENDAP, Globus, LAS] | [air_pressure_at_convective_cloud_base, air_pr... | Amon | aims3.llnl.gov | cmip5.output1.%(valid_institute)s.%(model)s.%(... | 1979-01-16T12:00:00Z | 2008-12-16T12:00:00Z | bcc-csm1-1 model output prepared for CMIP5 AMIP | ... | [Pa, Pa, 1e-9, 1, %, 1, kg m-2, %, 1, kg m-2, ... | 0.0 | 1636912858316931072 | False | 2019-06-21T01:49:11.172Z | 1.0 | | | | |
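The source does not spell out what "scrubbed" involves. One plausible reading is sketched below: turn a list of result documents into a DataFrame and unwrap single-element lists so scalar fields are easier to filter on. The helper `docs_to_frame` and the sample documents are invented for illustration and may differ from what esgf-search actually does.

```python
import pandas as pd


def docs_to_frame(docs):
    """Build a DataFrame from result documents, unwrapping one-item lists.

    Hypothetical helper; the real clean-up in esgf-search may differ.
    """
    cleaned = []
    for doc in docs:
        row = {}
        for key, value in doc.items():
            if isinstance(value, list) and len(value) == 1:
                # A one-item list carries a single scalar value.
                row[key] = value[0]
            else:
                row[key] = value
        cleaned.append(row)
    return pd.DataFrame(cleaned)


# Invented sample documents, loosely shaped like the columns shown above.
sample_docs = [
    {"id": "dataset-a", "time_frequency": ["mon"], "access": ["HTTPServer", "OPENDAP"]},
    {"id": "dataset-b", "time_frequency": ["3hr"], "access": ["HTTPServer", "GridFTP"]},
]

frame = docs_to_frame(sample_docs)
print(frame)
```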


We can further investigate and filter the results using the [DataFrame](https://pandas.pydata.org/pandas-docs/stable/reference/frame.html).

Let's see the distribution for `time_frequency` and then filter by a specific value.

In [6]:
for name, group in result.groupby('time_frequency'):
    # Each item is a (key, sub-frame) pair; count the rows in the sub-frame.
    print(name, len(group))
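When only the per-group row counts are needed, pandas can compute them without an explicit loop. The sketch below uses a small invented frame in place of `result`; the column name matches the one used above, but the values are made up.

```python
import pandas as pd

# Invented stand-in for the search results.
toy = pd.DataFrame(
    {
        "id": ["a", "b", "c", "d", "e", "f"],
        "time_frequency": ["mon", "mon", "day", "3hr", "3hr", "3hr"],
    }
)

# size() returns the number of rows in each group, indexed by the group key.
counts = toy.groupby("time_frequency").size()
print(counts)
```

`toy["time_frequency"].value_counts()` gives the same counts, ordered by frequency rather than by key.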


In [7]:
filter_time_frequency = result[result.time_frequency == '3hr']
filter_time_frequency

| | id | version | access | cf_standard_name | cmor_table | data_node | dataset_id_template_ | datetime_start | datetime_stop | description | ... | variable_units | west_degrees | _version_ | retracted | _timestamp | score | HTTPServer | GridFTP | Globus | OPENDAP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 7 | cmip5.output1.BCC.bcc-csm1-1.amip.3hr.atmos.3h... | 1 | [HTTPServer, GridFTP, OPENDAP, Globus, LAS] | [cloud_area_fraction, surface_upward_latent_he... | 3hr | aims3.llnl.gov | cmip5.output1.%(valid_institute)s.%(model)s.%(... | 1979-01-01T00:00:00Z | 2008-12-31T22:30:00Z | bcc-csm1-1 model output prepared for CMIP5 AMIP | ... | [%, W m-2, W m-2, kg m-2 s-1, kg m-2 s-1, kg m... | 0.0 | 1636912758700113920 | False | 2019-06-21T01:47:36.170Z | 1.0 | | | | |
| 8 | cmip5.output1.BCC.bcc-csm1-1.amip.3hr.atmos.3h... | 1 | [HTTPServer, GridFTP, OPENDAP, Globus, LAS] | [cloud_area_fraction, surface_upward_latent_he... | 3hr | aims3.llnl.gov | cmip5.output1.%(valid_institute)s.%(model)s.%(... | 1979-01-01T00:00:00Z | 2008-12-31T22:30:00Z | bcc-csm1-1 model output prepared for CMIP5 AMIP | ... | [%, W m-2, W m-2, kg m-2 s-1, kg m-2 s-1, kg m... | 0.0 | 1636912762329235456 | False | 2019-06-21T01:47:39.631Z | 1.0 | | | | |
| 9 | cmip5.output1.BCC.bcc-csm1-1.amip.3hr.atmos.3h... | 1 | [HTTPServer, GridFTP, OPENDAP, Globus, LAS] | [cloud_area_fraction, surface_upward_latent_he... | 3hr | aims3.llnl.gov | cmip5.output1.%(valid_institute)s.%(model)s.%(... | 1979-01-01T00:00:00Z | 2008-12-31T22:30:00Z | bcc-csm1-1 model output prepared for CMIP5 AMIP | ... | [%, W m-2, W m-2, kg m-2 s-1, kg m-2 s-1, kg m... | 0.0 | 1636912765217013760 | False | 2019-06-21T01:47:42.384Z | 1.0 | | | | |
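The same boolean-mask approach extends to several conditions at once. The sketch below runs on an invented frame; `time_frequency` and `retracted` mirror columns in the results above, while the rows themselves are made up.

```python
import pandas as pd

# Invented stand-in for the search results.
toy = pd.DataFrame(
    {
        "id": ["a", "b", "c", "d"],
        "time_frequency": ["mon", "3hr", "day", "3hr"],
        "retracted": [False, False, False, True],
    }
)

# Combine masks with & (and), | (or) and ~ (not); wrap each comparison
# in parentheses because these operators bind tighter than ==.
three_hourly_active = toy[(toy["time_frequency"] == "3hr") & ~toy["retracted"]]

# isin() keeps rows whose value matches any entry in the list.
daily_or_monthly = toy[toy["time_frequency"].isin(["day", "mon"])]

print(three_hourly_active)
print(daily_or_monthly)
```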


Because we limited the results to 10 items per page, the full result set is split across several pages that we can step through.

Here we'll view the first two entries on the next page.

**Remove `head(2)` to view the entire page of results.**

In [8]:
esgf.next().head(2)

| | id | version | access | cf_standard_name | cmor_table | data_node | dataset_id_template_ | datetime_start | datetime_stop | description | ... | variable_units | west_degrees | _version_ | retracted | _timestamp | score | HTTPServer | GridFTP | Globus | OPENDAP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | cmip5.output1.IPSL.IPSL-CM5A-LR.amip.mon.atmos... | 20110427 | [HTTPServer, GridFTP, OPENDAP, Globus, LAS] | [air_pressure_at_convective_cloud_base, air_pr... | Amon | aims3.llnl.gov | cmip5.output1.%(valid_institute)s.%(model)s.%(... | 1979-01-16T12:00:00Z | 2009-12-16T12:00:00Z | IPSL-CM5A-LR model output prepared for CMIP5 AMIP | ... | [Pa, Pa, %, 1, kg m-2, %, 1, kg m-2, kg m-2 s-... | 0.0 | 1636888982825467904 | False | 2019-06-20T19:29:41.728Z | 1.0 | | | | |
| 1 | cmip5.output1.IPSL.IPSL-CM5A-LR.amip.mon.atmos... | 20110427 | [HTTPServer, GridFTP, OPENDAP, Globus, LAS] | [air_pressure_at_convective_cloud_base, air_pr... | Amon | aims3.llnl.gov | cmip5.output1.%(valid_institute)s.%(model)s.%(... | | | IPSL-CM5A-LR model output prepared for CMIP5 AMIP | ... | [Pa, Pa, %, 1, kg m-2, %, 1, kg m-2, kg m-2 s-... | | 1636888988517138432 | False | 2019-06-20T19:29:47.157Z | 1.0 | | | | |


In [9]:
esgf.previous().head(2)

| | id | version | access | cf_standard_name | cmor_table | data_node | dataset_id_template_ | datetime_start | datetime_stop | description | ... | variable_units | west_degrees | _version_ | retracted | _timestamp | score | HTTPServer | GridFTP | Globus | OPENDAP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | cmip5.output1.MIROC.MIROC5.amip.mon.atmos.Amon... | 20120710 | [HTTPServer, GridFTP, OPENDAP, Globus, LAS] | [air_pressure_at_convective_cloud_base, air_pr... | Amon | aims3.llnl.gov | cmip5.output1.%(valid_institute)s.%(model)s.%(... | 1979-01-16T12:00:00Z | 2008-12-16T12:00:00Z | MIROC5 model output prepared for CMIP5 AMIP | ... | [Pa, Pa, 1e-12, 1e-12, 1e-12, 1e-9, %, 1, kg m... | 0.0 | 1636878732152012800 | False | 2019-06-20T16:46:45.924Z | 1.0 | | | | |
| 1 | cmip5.output1.MIROC.MIROC5.amip.mon.atmos.Amon... | 20120710 | [HTTPServer, GridFTP, OPENDAP, Globus, LAS] | [air_pressure_at_convective_cloud_base, air_pr... | Amon | aims3.llnl.gov | cmip5.output1.%(valid_institute)s.%(model)s.%(... | 1979-01-16T12:00:00Z | 2008-12-16T12:00:00Z | MIROC5 model output prepared for CMIP5 AMIP | ... | [Pa, Pa, 1e-12, 1e-12, 1e-12, 1e-9, %, 1, kg m... | 0.0 | 1636878722531328000 | False | 2019-06-20T16:46:36.749Z | 1.0 | | | | |
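Paging of this kind is commonly expressed through `offset` and `limit` query parameters, which the ESGF search API accepts. The sketch below shows one way the bookkeeping could work; the `Pager` class is hypothetical and is not how esgf-search is known to implement `next()` and `previous()`.

```python
class Pager:
    """Track the offset of the current page (hypothetical helper)."""

    def __init__(self, limit):
        self.limit = limit
        self.offset = 0

    def params(self):
        # Query parameters describing the current page.
        return {"offset": self.offset, "limit": self.limit}

    def next(self):
        # Advance by one page.
        self.offset += self.limit
        return self.params()

    def previous(self):
        # Step back one page, never going below the first page.
        self.offset = max(0, self.offset - self.limit)
        return self.params()


pager = Pager(limit=10)
print(pager.params())
print(pager.next())
print(pager.previous())
```

Clamping at zero keeps repeated `previous()` calls on the first page harmless. A fuller version would also stop `next()` once the total number of results is reached.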
