# PAVICS catalog search

To find files that meet specific constraints, PAVICS offers a process called `pavicsearch` that searches a catalog for files matching user-defined criteria. The metadata for each file is scraped from the attributes of the netCDF file itself.

In [1]:
from birdy import WPSClient
# Connect to the PAVICS catalog WPS service.
url = "https://pavics.ouranos.ca/twitcher/ows/proxy/catalog/wps"
wps = WPSClient(url)
# Display the documentation of the search process.
help(wps.pavicsearch)

Help on method pavicsearch in module birdy.client.base:

pavicsearch(facets='None', shards='*', offset=0, limit=0, fields='*', format='application/solr+json', query='*', distrib=False, type='Dataset', constraints='None', esgf=False, list_type='opendap_url') method of birdy.client.base.WPSClient instance
    Search the PAVICS database and return a catalogue of matches.
    
    Parameters
    ----------
    facets : string
        Comma separated list of facets; facets are searchable indexing terms in the database.
    shards : string
        Shards to be queried
    offset : integer
        Where to start in the document count of the database search.
    limit : integer
        Maximum number of documents to return.
    fields : string
        Comme separated list of fields to return.
    format : string
        Output format.
    query : string
        Direct query to the database.
    distrib : boolean
        Distributed query
    type : string
        One of Dataset, File, Aggregat

Potential search constraints are:
- project
- experiment
- model
- frequency
- variable
- variable_long_name
- units
- institute

Note that the *rip* label (realization, initialization, physics), e.g. r5i1p1, is not available as a search facet.
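
The `constraints` argument is a single string of comma-separated `facet:value` pairs, as in the search call further down. As a minimal sketch, a small helper can assemble that string from keyword arguments. The name `build_constraints` is hypothetical and not part of birdy or PAVICS, and the helper does not check that a facet actually exists in the catalog.

```python
def build_constraints(**facets):
    """Join facet/value pairs into the 'facet:value,facet:value' form.

    Hypothetical helper for illustration only; pairs whose value is None
    are skipped so that optional facets can be left unset.
    """
    return ",".join(
        f"{name}:{value}" for name, value in facets.items() if value is not None
    )


constraints = build_constraints(
    variable="tasmin", project="CMIP5", experiment="rcp85", frequency="day"
)
print(constraints)
# variable:tasmin,project:CMIP5,experiment:rcp85,frequency:day
```

The resulting string could then be passed as `constraints=constraints` to `wps.pavicsearch`.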

The process returns an output dictionary storing the search facets of each file found, as well as a simple list of links to those files.
It is important to specify `type="File"`; otherwise the process looks for datasets, i.e. file aggregations. At the moment, very few aggregations are available on the PAVICS data server.


In [2]:
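# Search for daily tasmin files from CMIP5 rcp85 runs, returning at most 10 matches.
resp = wps.pavicsearch(constraints="variable:tasmin,project:CMIP5,experiment:rcp85,frequency:day", limit=10, type="File")
# Unpack the two outputs: the search result dictionary and the list of file links.
result, files = resp.get(asobj=True)
files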

['https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/dodsC/birdhouse/CCCMA/CanESM2/rcp85/day/atmos/r5i1p1/tasmin/tasmin_day_CanESM2_rcp85_r5i1p1_20060101-21001231.nc',
 'https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/dodsC/birdhouse/CCCMA/CanESM2/rcp85/day/atmos/r2i1p1/tasmin/tasmin_day_CanESM2_rcp85_r2i1p1_20060101-21001231.nc',
 'https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/dodsC/birdhouse/CCCMA/CanESM2/rcp85/day/atmos/r3i1p1/tasmin/tasmin_day_CanESM2_rcp85_r3i1p1_20060101-21001231.nc',
 'https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/dodsC/birdhouse/CCCMA/CanESM2/rcp85/day/atmos/r4i1p1/tasmin/tasmin_day_CanESM2_rcp85_r4i1p1_20060101-21001231.nc',
 'https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/dodsC/birdhouse/CCCMA/CanESM2/rcp85/day/atmos/r1i1p1/tasmin/tasmin_day_CanESM2_rcp85_r1i1p1_20060101-21001231.nc']
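
Because the rip label is not a search facet, one possible workaround is to filter the returned links on the client side. The sketch below assumes the label appears as its own path segment in each URL, as it does in the list above. The helpers `rip_of` and `filter_by_rip` are hypothetical names, and `sample_files` copies two of the links shown above so the example runs without contacting the server.

```python
import re

# Matches a rip label such as r5i1p1 when it forms a whole path segment.
RIP_SEGMENT = re.compile(r"/(r\d+i\d+p\d+)/")


def rip_of(url):
    """Return the rip label found in the URL path, or None if there is none."""
    match = RIP_SEGMENT.search(url)
    return match.group(1) if match else None


def filter_by_rip(urls, rip):
    """Keep only the URLs whose path contains the requested rip label."""
    return [url for url in urls if rip_of(url) == rip]


# Two links copied from the search output above.
sample_files = [
    "https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/dodsC/birdhouse/CCCMA/CanESM2/rcp85/day/atmos/r5i1p1/tasmin/tasmin_day_CanESM2_rcp85_r5i1p1_20060101-21001231.nc",
    "https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/dodsC/birdhouse/CCCMA/CanESM2/rcp85/day/atmos/r2i1p1/tasmin/tasmin_day_CanESM2_rcp85_r2i1p1_20060101-21001231.nc",
]

selected = filter_by_rip(sample_files, "r5i1p1")
print(len(selected))
# 1
```

In a live session, `filter_by_rip(files, "r5i1p1")` would apply the same filter to the list returned by the search.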

In [5]:
result['response']['docs'][0]

{'cf_standard_name': ['air_temperature'],
 'abstract': 'birdhouse/CCCMA/CanESM2/rcp85/day/atmos/r5i1p1/tasmin/tasmin_day_CanESM2_rcp85_r5i1p1_20060101-21001231.nc',
 'replica': False,
 'wms_url': 'https://pavics.ouranos.ca/twitcher/ows/proxy/ncWMS2/wms?SERVICE=WMS&REQUEST=GetCapabilities&VERSION=1.3.0&DATASET=outputs/CCCMA/CanESM2/rcp85/day/atmos/r5i1p1/tasmin/tasmin_day_CanESM2_rcp85_r5i1p1_20060101-21001231.nc',
 'keywords': ['air_temperature',
  'day',
  'application/netcdf',
  'tasmin',
  'thredds',
  'CMIP5',
  'rcp85',
  'CanESM2',
  'CCCma'],
 'dataset_id': 'CCCMA.CanESM2.rcp85.day.atmos.r5i1p1.tasmin',
 'datetime_max': '2100-12-31T12:00:00Z',
 'id': '29186a2db2230376',
 'subject': 'Birdhouse Thredds Catalog',
 'category': 'thredds',
 'opendap_url': 'https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/dodsC/birdhouse/CCCMA/CanESM2/rcp85/day/atmos/r5i1p1/tasmin/tasmin_day_CanESM2_rcp85_r5i1p1_20060101-21001231.nc',
 'title': 'tasmin_day_CanESM2_rcp85_r5i1p1_20060101-21001231.nc'