# Exploring the THREDDS catalog with Siphon


This notebook shows how to list the data collections, sub-collections, datasets and files available from NCI's THREDDS service portal.

* Siphon
* Extract the catalog
* List data collections
* List datasets
* List files

---


- Authors: NCI Virtual Research Environment Team
- Keywords: THREDDS, Siphon, datasets, data collection
- Create Date: 2020-Jul

---


### Siphon

[Siphon](http://siphon.readthedocs.io/en/latest/) is a Python module for accessing data hosted on a THREDDS Data Server (TDS). It works by parsing the catalog XML and exposing its contents through higher-level functions.

In this notebook we explore the data available through NCI's THREDDS data access portal.

The cell below extracts the catalog information.

In [1]:
from siphon.catalog import TDSCatalog

catalog = TDSCatalog("http://dapds00.nci.org.au/thredds/catalog.xml")


info = """
Catalog information
-------------------

Base THREDDS URL: {}
Catalog name: {}
Catalog URL: {}
Metadata: {}
""".format(
    catalog.base_tds_url, catalog.catalog_name, catalog.catalog_url, catalog.metadata
)

print(info)


Catalog information
-------------------

Base THREDDS URL: http://dapds00.nci.org.au
Catalog name: THREDDS Master Catalog
Catalog URL: http://dapds00.nci.org.au/thredds/catalog.xml
Metadata: {}
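
As a rough illustration of what Siphon does behind the scenes, the sketch below parses a small, made-up catalog document with Python's standard library. The sample XML, the collection names and the `summarise_catalog` helper are assumptions for illustration only; just the element names and namespaces follow the THREDDS catalog format. Siphon's own parsing is considerably more thorough.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down catalog; a real server returns far more detail.
SAMPLE_CATALOG = """
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
         xmlns:xlink="http://www.w3.org/1999/xlink"
         name="Example Catalog">
  <service name="all" serviceType="Compound" base="">
    <service name="http" serviceType="HTTPServer" base="/thredds/fileServer/"/>
  </service>
  <catalogRef xlink:href="collection_a/catalog.xml" xlink:title="Collection A" name=""/>
  <catalogRef xlink:href="collection_b/catalog.xml" xlink:title="Collection B" name=""/>
  <dataset name="example.nc" urlPath="collection_a/example.nc"/>
</catalog>
""".strip()

# Namespaces used by THREDDS catalog documents.
NS = {"t": "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"}
XLINK_TITLE = "{http://www.w3.org/1999/xlink}title"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def summarise_catalog(xml_text):
    """Return the catalog name, top-level services, catalog refs and datasets."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.get("name"),
        # Only direct children, so nested services inside a compound are skipped.
        "services": [s.get("name") for s in root.findall("t:service", NS)],
        # Map each reference title to the relative URL of its sub-catalog.
        "catalog_refs": {
            ref.get(XLINK_TITLE): ref.get(XLINK_HREF)
            for ref in root.findall(".//t:catalogRef", NS)
        },
        "datasets": [d.get("name") for d in root.findall(".//t:dataset", NS)],
    }


summary = summarise_catalog(SAMPLE_CATALOG)
for key, value in summary.items():
    print(f"{key}: {value}")
```

The keys of this dictionary loosely mirror the `catalog_name`, `services`, `catalog_refs` and `datasets` attributes that `TDSCatalog` exposes.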



List the services offered by the catalog:

In [2]:
for service in catalog.services:
    print(service.name)

all
licenses


Which data collections are available?

In [3]:
print("\n".join(catalog.catalog_refs.keys()))

License and README files
3D Geological models of Australia
ANU Water and Landscape Dynamics
ARC Centre of Excellence - CLEX and ARCCSS Publication Data
ASTER maps of Australia
Aus400 Weather Simulations
Australian Bathymetry Reference Data
Australian Climate Observations Reference Network
Australian Marine Video and Imagery Data
Australian Natural Hazards Data archive
Australian Regional Copernicus Data Hub - Sentinel-1,2,3
Australian Research Radar Archive
Bureau of Meteorology - CAWCR - POAMA Data Catalog using a Legacy Structure
Bureau of Meteorology Observations Data
Bureau of Meteorology Ocean-Marine Reference Data
Bureau of Meteorology Seasonal Prediction Data
CMIP5/NRM
COSIMA Model Output
Decadal Forecast Project
eMAST TERN
eReefs GBR Model Data
ESGF Australian Data - CMIP5, GeoMIP, PMIP3, CORDEX
ESGF Australian Data - CMIP6
GA Earth Observations - Data
GA Earth Observations - Derived
Geoscience Australia Geophysics Reference Data Collection
Geoscience Australia Landsat Analysis

### Look into the detailed THREDDS catalog of the CMIP6 sub-collection ACCESS-ESM1.5

List the sub-catalogs (here, CMIP6 experiments) under this sub-collection:

In [4]:
cat = TDSCatalog("http://dapds00.nci.org.au/thredds/catalog/fs38/publications/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/catalog.xml")
print("\n".join(cat.catalog_refs.keys()))

1pctCO2
abrupt-4xCO2
amip
esm-hist
esm-piControl
historical
piControl


### Look into a CMIP6 dataset 

Note: when drilling down to the file level, use `datasets` instead of `catalog_refs`.

In [5]:
cat = TDSCatalog("http://dapds00.nci.org.au/thredds/catalog/fs38/publications/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/historical/r1i1p1f1/3hr/pr/gn/latest/catalog.xml")
print("\n".join(cat.datasets.keys()))

pr_3hr_ACCESS-ESM1-5_historical_r1i1p1f1_gn_196001010130-196912312230.nc
pr_3hr_ACCESS-ESM1-5_historical_r1i1p1f1_gn_197001010130-197912312230.nc
pr_3hr_ACCESS-ESM1-5_historical_r1i1p1f1_gn_198001010130-198912312230.nc
pr_3hr_ACCESS-ESM1-5_historical_r1i1p1f1_gn_199001010130-199912312230.nc
pr_3hr_ACCESS-ESM1-5_historical_r1i1p1f1_gn_200001010130-200912312230.nc
pr_3hr_ACCESS-ESM1-5_historical_r1i1p1f1_gn_201001010130-201412312230.nc
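
Rather than hard-coding the URL of every level, the two attributes can be combined into a small recursive walker: follow each entry of `catalog_refs` and collect the entries of `datasets` along the way. The sketch below is one possible approach, not part of Siphon. `walk_catalog`, `FakeCatalog` and `FakeRef` are hypothetical names, and the stand-in classes exist only so the example runs without network access. It relies on each catalog reference offering a `follow()` method, as Siphon's catalog references do.

```python
def walk_catalog(cat, max_depth=2, _path=()):
    """Yield (path, dataset_name) pairs found at or below `cat`.

    `cat` is any object with `datasets` and `catalog_refs` mappings, such as
    a siphon TDSCatalog. `max_depth` limits how many levels of references are
    followed, because each followed reference costs one HTTP request.
    """
    for name in cat.datasets:
        yield _path, name
    if max_depth <= 0:
        return
    for title, ref in cat.catalog_refs.items():
        yield from walk_catalog(ref.follow(), max_depth - 1, _path + (title,))


# Offline stand-ins (assumptions for illustration, not Siphon classes).
class FakeRef:
    def __init__(self, target):
        self._target = target

    def follow(self):
        return self._target


class FakeCatalog:
    def __init__(self, datasets=(), refs=None):
        self.datasets = {name: None for name in datasets}
        self.catalog_refs = {
            title: FakeRef(target) for title, target in (refs or {}).items()
        }


leaf = FakeCatalog(datasets=["a.nc", "b.nc"])
root = FakeCatalog(refs={"historical": FakeCatalog(refs={"r1i1p1f1": leaf})})

for path, name in walk_catalog(root, max_depth=2):
    print("/".join(path), name)
```

Passing a real `TDSCatalog` in place of `root` should work the same way, but keep `max_depth` small: large collections such as CMIP6 contain many nested catalogs, and each one is fetched separately.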


<div class="alert alert-info">
<b>Note: </b> Listing files this way is especially useful when you want to access or download the data programmatically.
</div>
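
As an example of programmatic use, the file names listed above end with a start and end timestamp, so a list of names can be narrowed to a period of interest before anything is downloaded. The sketch below is a minimal illustration. `files_overlapping` is a hypothetical helper, and the pattern assumes the `_<start>-<end>.nc` suffix seen in the listing, where each timestamp begins with a four-digit year.

```python
import re

# File names copied from the listing above.
FILE_NAMES = [
    "pr_3hr_ACCESS-ESM1-5_historical_r1i1p1f1_gn_196001010130-196912312230.nc",
    "pr_3hr_ACCESS-ESM1-5_historical_r1i1p1f1_gn_197001010130-197912312230.nc",
    "pr_3hr_ACCESS-ESM1-5_historical_r1i1p1f1_gn_198001010130-198912312230.nc",
    "pr_3hr_ACCESS-ESM1-5_historical_r1i1p1f1_gn_199001010130-199912312230.nc",
    "pr_3hr_ACCESS-ESM1-5_historical_r1i1p1f1_gn_200001010130-200912312230.nc",
    "pr_3hr_ACCESS-ESM1-5_historical_r1i1p1f1_gn_201001010130-201412312230.nc",
]

# Capture the leading four-digit year of the start and end timestamps.
TIME_RANGE = re.compile(r"_(\d{4})\d*-(\d{4})\d*\.nc$")


def files_overlapping(names, first_year, last_year):
    """Return the names whose time range overlaps [first_year, last_year]."""
    selected = []
    for name in names:
        match = TIME_RANGE.search(name)
        if match is None:
            continue  # skip names that do not follow the assumed pattern
        start, end = int(match.group(1)), int(match.group(2))
        if start <= last_year and end >= first_year:
            selected.append(name)
    return selected


for name in files_overlapping(FILE_NAMES, 1975, 1985):
    print(name)
```

With Siphon, the same function could be given `cat.datasets.keys()`. Each selected dataset then exposes an `access_urls` dictionary that maps service names to URLs; which services appear depends on how the server is configured.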

### Summary

This notebook demonstrated how to use Siphon to list the data collections, datasets and files available from NCI's THREDDS service portal.