<a name="top"></a>
<div style="width:1000px">

<div style="float:right; width:98px; height:98px;">
<img src="https://raw.githubusercontent.com/Unidata/MetPy/master/metpy/plots/_static/unidata_150x150.png" alt="Unidata Logo" style="height: 98px;">
</div>

<h1>Siphon Overview</h1>
<h3>Unidata Python Workshop</h3>

<div style="clear:both"></div>
</div>

<hr style="height:2px;">

<div style="float:right; width:250px"><img src="https://unidata.github.io/siphon/latest/_static/siphon_150x150.png" alt="TDS" style="height: 200px;"></div>

## Overview:

* **Teaching:** 15 minutes
* **Exercises:** 15 minutes

### Questions
1. What is a THREDDS Data Server (TDS)?
1. How can I use Siphon to access a TDS?

### Objectives
1. <a href="#threddsintro">Use Siphon to access a THREDDS catalog</a>
1. <a href="#filtering">Find the data we want within the catalog</a>
1. <a href="#dataaccess">Use Siphon to perform remote data access</a>

<a name="threddsintro"></a>
## 1. What is THREDDS?

 * Server for providing remote access to datasets
 * Variety of services for accessing data:
   - HTTP Download
   - Web Mapping/Coverage Service (WMS/WCS)
   - OPeNDAP
   - NetCDF Subset Service
   - CDMRemote
 * Provides a more uniform way to access different types/formats of data

## THREDDS Demo
http://thredds.ucar.edu

### THREDDS Catalogs
- XML descriptions of data and metadata
- Access methods
- Easily handled with `siphon.catalog.TDSCatalog`

In [None]:
from datetime import datetime, timedelta
from siphon.catalog import TDSCatalog
date = datetime.utcnow() - timedelta(days=1)
cat = TDSCatalog('http://thredds.ucar.edu/thredds/catalog/nexrad/level3/'
                 f'N0Q/LRX/{date:%Y%m%d}/catalog.xml')
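
To make "XML descriptions of data and metadata" concrete, here is a minimal sketch that parses a tiny, hand-written catalog document with the standard library. The XML snippet, dataset names, and paths are made up for illustration; only the namespace follows the THREDDS InvCatalog 1.0 convention. `TDSCatalog` does this work (and much more) for you.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed catalog document (not a real server response).
CATALOG_XML = """\
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0">
  <service name="all" serviceType="Compound" base="">
    <service name="odap" serviceType="OPENDAP" base="/thredds/dodsC/"/>
    <service name="http" serviceType="HTTPServer" base="/thredds/fileServer/"/>
  </service>
  <dataset name="example-collection">
    <dataset name="Example_20200101_1800.nids" urlPath="example/Example_20200101_1800.nids"/>
    <dataset name="Example_20200101_1830.nids" urlPath="example/Example_20200101_1830.nids"/>
  </dataset>
</catalog>
"""

NS = {'t': 'http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0'}
root = ET.fromstring(CATALOG_XML)

# Leaf datasets are the ones that carry a urlPath attribute.
dataset_paths = {el.get('name'): el.get('urlPath')
                 for el in root.iterfind('.//t:dataset[@urlPath]', NS)}

# Each non-compound service contributes one access method (type -> base path).
service_bases = {el.get('serviceType'): el.get('base')
                 for el in root.iterfind('.//t:service', NS)
                 if el.get('serviceType') != 'Compound'}

print(sorted(dataset_paths))
print(service_bases)
```

Joining a service base with a dataset's `urlPath` gives the access URL for that service, which is roughly the bookkeeping a catalog client has to do.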

<a href="#top">Top</a>
<hr style="height:2px;">

<a name="filtering"></a>
## 2. Filtering data

We *could* work out by hand which dataset we want and build its name (or index) ourselves. Siphon provides helpers that simplify this, provided the dataset names follow a pattern that includes a timestamp:

In [None]:
request_time = date.replace(hour=18, minute=30, second=0, microsecond=0)
ds = cat.datasets.filter_time_nearest(request_time)
ds

We can also find the list of datasets within a time range:

In [None]:
datasets = cat.datasets.filter_time_range(request_time, request_time + timedelta(hours=1))
print(datasets)
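
The idea behind both helpers can be sketched without a server: pull a timestamp out of each dataset name, then compare it to the request. The names and the regular expression below are assumptions made for illustration, not necessarily the pattern Siphon uses internally, and the inclusive range is a choice made for this sketch.

```python
import re
from datetime import datetime, timedelta

# Hypothetical dataset names with a YYYYMMDD_HHMM timestamp embedded in them.
names = [
    'Example_20200101_1755.nids',
    'Example_20200101_1827.nids',
    'Example_20200101_1859.nids',
    'Example_20200101_1931.nids',
]

# Illustrative pattern; an assumption, not Siphon's own regex.
STAMP = re.compile(r'(\d{8}_\d{4})')


def name_time(name):
    """Parse the timestamp embedded in a dataset name."""
    return datetime.strptime(STAMP.search(name).group(1), '%Y%m%d_%H%M')


def nearest(names, when):
    """Return the name whose timestamp is closest to `when`."""
    return min(names, key=lambda n: abs(name_time(n) - when))


def in_range(names, start, end):
    """Return names whose timestamps fall within [start, end], inclusive."""
    return [n for n in names if start <= name_time(n) <= end]


request = datetime(2020, 1, 1, 18, 30)
print(nearest(names, request))
print(in_range(names, request, request + timedelta(hours=1)))
```

If the names in a catalog do not carry a timestamp at all, this approach has nothing to match on, which is why the pattern requirement matters.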

### Exercise
* Starting from http://thredds.ucar.edu/thredds/catalog/satellite/SFC-T/SUPER-NATIONAL_1km/catalog.html, find the composites for the previous day.
* Grab the URL and create a TDSCatalog instance.
* Using Siphon, find the data available in the catalog between 12Z and 18Z on the previous day.

In [None]:
# YOUR CODE GOES HERE


#### Solution

In [None]:
# %load solutions/datasets.py


<a href="#top">Top</a>
<hr style="height:2px;">

<a name="dataaccess"></a>
## 3. Accessing data

Accessing catalogs is only part of the story; Siphon is much more useful if you're trying to access/download datasets.

For instance, take the first dataset from the list we just retrieved:

In [None]:
ds = datasets[0]

We can ask Siphon to download the file locally:

In [None]:
ds.download()

In [None]:
import os
os.listdir()

Or better yet, get a file-like object that lets us `read` from the file as if it were local:

In [None]:
fobj = ds.remote_open()
data = fobj.read()
print(len(data))

This is handy if you have Python code to read a particular format.
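
As a sketch of why a file-like object is convenient: a reader written against `read` does not care where the bytes come from. The 8-byte header layout and the `read_header` helper below are invented for illustration, and `io.BytesIO` stands in for the object returned by `remote_open()`.

```python
import io
import struct


def read_header(fobj):
    """Read a made-up header: 4-byte magic, then a big-endian uint32 count."""
    magic, count = struct.unpack('>4sI', fobj.read(8))
    return magic.decode('ascii'), count


# In-memory stand-in for a remote file; any object with .read() would work.
fake_remote = io.BytesIO(b'DEMO' + struct.pack('>I', 3) + b'payload')

header = read_header(fake_remote)
print(header)
```

The same `read_header` call would accept a file opened with `open(path, 'rb')`, so format-reading code does not need a separate remote code path.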

It's also possible to access the remote file through services that offer a netCDF4-like interface. With this kind of access you download only the variables of interest, or (index-based) subsets of them, rather than the whole file:

In [None]:
nc = ds.remote_access()

By default this uses CDMRemote (if available), but it's also possible to ask for OPeNDAP (using netCDF4-python).

In [None]:
print(list(nc.variables))
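
A hedged sketch of asking for a specific service: Siphon datasets expose an `access_urls` mapping, and `remote_access` accepts a `service` argument, so one option is to check what the server offers before choosing. The `pick_service` helper, the preference order, and the example URLs are illustrative assumptions; check the actual service names against `ds.access_urls` for your dataset.

```python
def pick_service(available, preferred=('CdmRemote', 'OPENDAP')):
    """Return the first preferred service name found in `available`, else None.

    Hypothetical helper: `available` is any iterable of service names,
    such as the keys of a dataset's `access_urls` mapping.
    """
    available = set(available)
    for name in preferred:
        if name in available:
            return name
    return None


# Stand-in for ds.access_urls; the keys and URLs here are made up.
example_urls = {
    'HTTPServer': 'https://example.invalid/fileServer/file.nids',
    'OPENDAP': 'https://example.invalid/dodsC/file.nids',
}

service = pick_service(example_urls)
print(service)

# With a real dataset this might look like (requires network access):
#   service = pick_service(ds.access_urls)
#   nc = ds.remote_access(service=service)
```

Requesting OPeNDAP this way relies on netCDF4-python being installed, as noted above.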

<a href="#top">Top</a>
<hr style="height:2px;">