# Building ERDDAP Datasets

This notebook documents the process of creating XML fragments
for nowcast system run results files
for inclusion in `/opt/tomcat/content/erddap/datasets.xml`
on the `skookum` ERDDAP server instance.

The contents are a combination of:

* instructions for using the
`GenerateDatasetsXml.sh` and `DasDds.sh` tools found in the
`/opt/tomcat/webapps/erddap/WEB-INF/` directory
* instructions for forcing the server to update the datasets collection
via the `/results/erddap/flags/` directory
* code and metadata to transform the output of `GenerateDatasetsXml.sh`
into XML fragments that are ready for inclusion in `/opt/tomcat/content/erddap/datasets.xml`
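
The flag mechanism mentioned in the second bullet can be sketched as follows. ERDDAP reloads a dataset when it finds a file named after the dataset's `datasetID` in its flag directory. The helper name `request_reload` is hypothetical, and the default directory is simply the path quoted above; this is a minimal sketch, not the notebook's own code.

```python
from pathlib import Path


def request_reload(dataset_id, flags_dir='/results/erddap/flags'):
    """Create an empty flag file named after the dataset ID.

    Hypothetical helper: the default flags_dir is the path mentioned
    in this notebook and is assumed to be writable by the caller.
    """
    flag_file = Path(flags_dir) / dataset_id
    # The file's contents do not matter; only its name is used.
    flag_file.touch()
    return flag_file
```

Calling `request_reload('ubcSSn3dwVelocity1h')` after editing `datasets.xml` would then ask the server to reload that one dataset, assuming the flag directory is reachable from where the notebook runs.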

In [3]:
from lxml import etree

**NOTE**

The next cell mounts the `/results` filesystem on `skookum` locally.
It is intended for use when this notebook is run on a laptop
or other non-Waterhole machine that has `sshfs` installed
and a mount point for `/results` available in its root filesystem.

Don't execute the cell if that doesn't describe your situation.

In [4]:
!sshfs skookum:/results /results
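
One way to avoid running the mount cell twice is to check first whether something is already mounted at `/results`. The function name `results_is_mounted` is hypothetical. This is only a sketch, and it assumes that a mounted filesystem at that path is the `skookum` one.

```python
import os


def results_is_mounted(mount_point='/results'):
    """Return True if a filesystem is mounted at mount_point.

    Hypothetical helper: it cannot tell which filesystem is mounted,
    only that the path is a mount point.
    """
    return os.path.ismount(mount_point)
```

If it returns `False` on a laptop, the `sshfs` cell above still needs to be run.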

In [2]:
metadata = {
    # For all datasets:
    'coverage_content_type': 'modelResult',
    'infoUrl': 'http://salishsea-meopar-tools.readthedocs.org/en/latest/results_server/index.html#salish-sea-model-results',
    'institution': 'UBC EOAS',
    'institution_fullname': 'Earth, Ocean &amp; Atmospheric Sciences, University of British Columbia',
    'license': '''The Salish Sea MEOPAR NEMO model results are copyright 2013-2016
by the Salish Sea MEOPAR Project Contributors and The University of British Columbia.

They are licensed under the Apache License, Version 2.0. http://www.apache.org/licenses/LICENSE-2.0''',
    'project': 'Salish Sea MEOPAR NEMO Model',
    'creator_name': 'Salish Sea MEOPAR Project Contributors',
    'creator_email': 'sallen@eos.ubc.ca',
    'creator_url': 'http://salishsea-meopar-docs.readthedocs.org/',
    'acknowledgement': 'MEOPAR, ONC, Compute Canada',
    'drawLandMask': 'over',
    # Details of individual datasets:
    # (keys are datasetIDs)
    'ubcSSnPointAtkinsonSSH15m': {
        
    },
    
    'ubcSSn3duVelocity1h': {
        'title': 'Nowcast, Salish Sea, 3d u Velocity Field, Hourly',
        'summary': '''3d zonal (u) component velocity field values averaged over 1 hour intervals
from Salish Sea NEMO model nowcast runs. The values are calculated for the entire model grid
that includes the Strait of Juan de Fuca, the Strait of Georgia, Puget Sound,
and Johnstone Strait on the coasts of Washington State and British Columbia.''',
    },
    
    'ubcSSn3dvVelocity1h': {
        'title': 'Nowcast, Salish Sea, 3d v Velocity Field, Hourly',
        'summary': '''3d meridional (v) component velocity field values averaged over 1 hour intervals
from Salish Sea NEMO model nowcast runs. The values are calculated for the entire model grid
that includes the Strait of Juan de Fuca, the Strait of Georgia, Puget Sound,
and Johnstone Strait on the coasts of Washington State and British Columbia.''',
    },
    
    'ubcSSn3dwVelocity1h': {
        'title': 'Nowcast, Salish Sea, 3d w Velocity Field, Hourly',
        'summary': '''3d vertical (w) component velocity field values averaged over 1 hour intervals
from Salish Sea NEMO model nowcast runs. The values are calculated for the entire model grid
that includes the Strait of Juan de Fuca, the Strait of Georgia, Puget Sound,
and Johnstone Strait on the coasts of Washington State and British Columbia.''',
    },
}
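
The `metadata` dict mixes attributes shared by all datasets (string values) with per-dataset attributes (nested dicts keyed by `datasetID`). A minimal sketch of how the two levels might be combined for one dataset is shown here; the function name `dataset_attributes` and the small `example` dict are illustrative assumptions, not part of the notebook.

```python
def dataset_attributes(metadata, dataset_id):
    """Combine the shared attributes with those of one dataset.

    Top-level string values are treated as shared; nested dicts are
    treated as per-dataset entries. Per-dataset values win on conflict.
    """
    shared = {
        key: value for key, value in metadata.items()
        if not isinstance(value, dict)
    }
    return {**shared, **metadata[dataset_id]}


# Cut-down stand-in for the full metadata dict above.
example = {
    'institution': 'UBC EOAS',
    'ubcSSn3dwVelocity1h': {
        'title': 'Nowcast, Salish Sea, 3d w Velocity Field, Hourly',
    },
}
attrs = dataset_attributes(example, 'ubcSSn3dwVelocity1h')
```

Here `attrs` holds both `institution` and `title`, which is the flat set of attributes a single dataset fragment would need.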

In [24]:
tree = etree.parse('/results/erddap/logs/GenerateDatasetsXml.out')
root = tree.getroot()

In [25]:
def print_tree(root):
    print(etree.tostring(root, pretty_print=True).decode('ascii'))

In [26]:
print_tree(root)

<dataset type="EDDGridFromNcFiles" datasetID="nowcast_faa8_4044_5dbf" active="true">
    <reloadEveryNMinutes>60</reloadEveryNMinutes>
    <updateEveryNMillis>10000</updateEveryNMillis>
    <fileDir>/results/SalishSea/nowcast/</fileDir>
    <recursive>true</recursive>
    <fileNameRegex>.*SalishSea_1h_\d{8}_\d{8}_grid_W\.nc</fileNameRegex>
    <metadataFrom>last</metadataFrom>
    <matchAxisNDigits>20</matchAxisNDigits>
    <fileTableInMemory>false</fileTableInMemory>
    <accessibleViaFiles>false</accessibleViaFiles>
    <!-- sourceAttributes>
        <att name="Conventions">CF-1.1</att>
        <att name="file_name">SalishSea_1h_20160121_20160121_grid_W.nc</att>
        <att name="history">Thu Jan 21 14:33:12 2016: ncks -4 -L4 -O SalishSea_1h_20160121_20160121_grid_W.nc SalishSea_1h_20160121_20160121_grid_W.nc</att>
        <att name="NCO">4.4.2</att>
        <att name="production">An IPSL model</att>
        <att name="TimeStamp">21/01/2016 14:20:02 -0800</att>
    </sourceAttribute

In [27]:
print(root.tag)
print(root.get('datasetID'))
datasetID = 'ubcSSn3dwVelocity1h'
root.attrib['datasetID'] = datasetID
print(root.get('datasetID'))

dataset
nowcast_faa8_4044_5dbf
ubcSSn3dwVelocity1h
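
Dataset-level metadata such as `title` and `summary` lives in `<att name="...">` children of the dataset's top-level `<addAttributes>` element. The sketch below shows one way such an attribute might be set or replaced. The function name `set_global_att` is hypothetical, and the code is written against the standard library's `xml.etree.ElementTree` so that it runs without extra installs. To use it on the `lxml` tree parsed above, the import would need to be swapped for `from lxml import etree as ET`, since `lxml.etree` offers the same `find`, `findall`, and `SubElement` calls.

```python
import xml.etree.ElementTree as ET


def set_global_att(dataset, name, value):
    """Set <att name="name"> in the dataset's top-level <addAttributes>.

    An existing <att> with that name is updated; otherwise a new one
    is appended. <addAttributes> is created if it is missing.
    """
    # find() only looks at direct children, so the <addAttributes>
    # elements nested inside variables are left alone.
    add_attrs = dataset.find('addAttributes')
    if add_attrs is None:
        add_attrs = ET.SubElement(dataset, 'addAttributes')
    for att in add_attrs.findall('att'):
        if att.get('name') == name:
            att.text = value
            return att
    att = ET.SubElement(add_attrs, 'att')
    att.set('name', name)
    att.text = value
    return att
```

Looping over a flat dict of attributes and calling `set_global_att(root, name, value)` for each item would then apply the metadata to the fragment.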
