# Generate Metadata

In [42]:
import os
import sys

sys.path.insert(0, os.path.abspath(os.path.join('..', '..', 'src')))

import config

sys.path.remove(os.path.abspath(os.path.join('..', '..', 'src')))

## Section Boilerplate

### Header

In [43]:
header = '''** Metadata for {time_series_fn}
** Contact: Jessica D. Lundquist (jdlund@uw.edu), University of Washington
** Updated: {update_date}
** Title: {verbose_name} stage, temperature, and discharge
** PI:  Jessica D. Lundquist (jdlund@uw.edu), University of Washington
** Funding:  National Science Foundation, CBET-0729830, and NASA Grant-NNX15AB29G'''

### Summary

In [44]:
summary = '''** Summary:
Half-hourly recordings of raw pressure (in cm of water), water pressure (in cm of water, with 
atmospheric pressure removed), estimated discharge (in cubic meters per second), and water
temperature (in degrees Celsius), from instruments {site_location_description}, for water years
{start_year} to {end_year}. These measurements were taken by Solinst Leveloggers of various models.
{noteworthy}'''

### Location

In [45]:
location = '''** Location:
Tuolumne River Watershed, Yosemite National Park
Approximate Coordinates: {approx_coords}
Near Lee Vining, California, USA (Tuolumne County)
Elevation: {site_elevation_m} m
See the Data Citation for more information about each site.'''

### Site History: Important Changes in Site Setup and Missing Data

In [46]:
site_history = '''** Site History: Important Changes in Site Setup and Missing Data
{site_history}
'''

### Variables and Units

In [47]:
variables = '''** Variables and Units:
Note: raw_pressure can reflect either vented or unvented Levelogger data (depending on data
availability and quality). When available, vented data is preferred to unvented data. To determine
which sensor was used at a particular time step, compare the 'raw_pressure(cm)' column to the
'barocorrected_pressure(cm)' column. If the two values are the same, the data is from a vented Levelogger.
If they differ, the data is from an unvented Levelogger.

date_time(UTC:PDT+7)       - time stamp in UTC, in format "yyyy/mm/dd HH:MM:SS", midnight = 00:00:00
raw_pressure(cm)           - raw data from the logger (either vented or unvented) (cm of water)
barocorrected_pressure(cm) - raw Levelogger data minus atmospheric pressure, same as raw when vented
                             logger is selected for period (cm of water)
adjusted_stage(cm)         - vented or barocorrected pressure plus offset used to align with manual
                             measurements/previous record (cm of water)
estimated_discharge(cms)   - calculated using rating curve defined below (cubic-meters-per-second, cms)
water_temperature(deg_C)   - raw stream temperature recorded by the Levelogger (degrees Celsius)
discharge_flag             - QC flag defined below
NaN = no data'''
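
The column comparison described in the note above can be sketched in plain Python. This is a minimal illustration, not part of the processing pipeline: the helper name `logger_type`, the tolerance `tol_cm`, and the example rows are all hypothetical, and real files may need a tolerance suited to their rounding.

```python
import math


def logger_type(raw_cm, barocorrected_cm, tol_cm=1e-6):
    """Classify one time step as 'vented', 'unvented', or 'missing'.

    Hypothetical helper: equal raw and barocorrected pressure implies a
    vented logger; any difference implies an unvented logger.
    """
    if math.isnan(raw_cm) or math.isnan(barocorrected_cm):
        return 'missing'
    if math.isclose(raw_cm, barocorrected_cm, abs_tol=tol_cm):
        return 'vented'
    return 'unvented'


# Made-up rows: (raw_pressure(cm), barocorrected_pressure(cm))
rows = [(45.2, 45.2), (1065.8, 44.9), (float('nan'), float('nan'))]
labels = [logger_type(raw, baro) for raw, baro in rows]
```

Here `labels` holds one classification per row, which could be stored alongside the time series if a per-step sensor column is wanted.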

### Discharge Flags

In [48]:
discharge_flags = '''** Discharge Flags:
0 = No detected anomaly, ice jam, or malfunction
1 = Anomalous single timestep readings ("blips"). Corrected to the average of the previous and subsequent recordings.
2 = High stage recordings created by ice blocking river flow.
3 = Sensor malfunction'''
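
As an illustration of the flag 1 correction, the sketch below replaces a single anomalous reading with the average of its two neighbours. The function name `correct_blip` and the data are made up for this example; how blips are detected is not covered here.

```python
def correct_blip(values, index):
    """Return a copy of values with values[index] set to the mean of its neighbours."""
    if index <= 0 or index >= len(values) - 1:
        raise ValueError('a blip needs a recording on both sides')
    corrected = list(values)
    corrected[index] = (values[index - 1] + values[index + 1]) / 2
    return corrected


stage_cm = [40.0, 40.5, 95.0, 41.5, 42.0]  # made-up data; index 2 is the blip
flags = [0, 0, 1, 0, 0]                    # flag 1 marks the corrected time step
fixed = correct_blip(stage_cm, 2)
```

The original list is left unchanged, so the raw record stays available next to the corrected one.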

### Instruments

In [49]:
instruments = '''** Instrument Models:
The following instruments are used for vented pressure collection:
{vented_instruments}

Unvented pressure data is collected using Solinst Leveloggers of several models: Levelogger Edge, Levelogger Gold
and Levelogger Junior. See the official documentation for more details:
https://www.solinst.com/products/dataloggers-and-telemetry/3001-levelogger-series/operating-instructions/previous-instruction-versions/3001-user-guide-v4-4-0.pdf'''

### Rating Curve

In [50]:
rating_curve = '''** Rating Curves:
The rating curve equation used to estimate discharge is
{rating_curve}

Note: The greatest uncertainties occur at very high flows (due to the small number of manual measurements in that
range) and at very low flows (due to uncertainty in the gauge datum and the exact water depth above the sensor
when discharge stops).'''
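
A piecewise power-law rating curve of the form Q = a*(H-h0)^b can be evaluated as in the sketch below. The coefficients are copied from the Tuolumne River at 120 site specification later in this notebook. The function name, the segment layout, the zero-flow rule at or below the offset h0, the cm-to-m conversion, and the treatment of a stage exactly at the breakpoint (assigned to the upper segment) are assumptions made for illustration.

```python
import math


def rating_curve_discharge(stage_m, segments):
    """Evaluate a piecewise power-law rating curve Q = a * (H - h0) ** b.

    segments: list of (upper_limit_m, a, h0, b), ordered by upper_limit_m;
    use float('inf') as the limit of the last segment.
    """
    if math.isnan(stage_m):
        return float('nan')  # no stage, no discharge estimate
    for upper_limit_m, a, h0, b in segments:
        if stage_m < upper_limit_m:
            # Assumption: no flow when the stage is at or below the offset h0.
            return a * (stage_m - h0) ** b if stage_m > h0 else 0.0
    raise ValueError('stage is not covered by any segment')


# Coefficients from the Tuolumne River at 120 rating curve (H in metres).
tuolumne_at_120 = [
    (0.8011, 34.3522, 0.7639, 1.465),
    (float('inf'), 17.0873, 0.7076, 1.7401),
]

adjusted_stage_cm = 100.0  # made-up stage value
q_cms = rating_curve_discharge(adjusted_stage_cm / 100.0, tuolumne_at_120)
```

Evaluating both segments at the breakpoint is a quick way to check that a transcribed curve is continuous.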

### Post Processing and Quality Control

In [51]:
postprocessing_qc = '''** Data Postprocessing and Quality Control Summary:
This file includes raw, corrected, and estimated data. The 'raw_pressure(cm)' data is a temporally-continuous 
record made up of a number of different loggers deployed through time (generally changed once per summer). 
The corrected data includes both the 'barocorrected_pressure(cm)' and 'adjusted_stage(cm)' time series, 
which apply an atmospheric pressure correction and datum correction, respectively. The 
'estimated_discharge(cms)' time series is an estimate of discharge computed using the 'adjusted_stage(cm)' time series
and site rating curve (which is built using manual measurements collected over several summers). 
The postprocessing includes all of the steps needed to translate the 'raw_pressure(cm)' time series into
the 'estimated_discharge(cms)' time series. The steps are summarized below. For a complete view of the
postprocessing, visit the following URL, where all steps are documented in Jupyter notebooks:
{postprocessing_url}

For each site, the following operations were completed:
1. Both vented and unvented Levelogger time series were inspected, trimmed (to the segment when the sensor
   was actively deployed), converted to appropriate units, and examined for sensor malfunctions, ice jams, 
   and sensor shifts.
2. A barometric pressure correction was applied to the unvented Levelogger time series using colocated Barologger 
   time series when available and NCAR Reanalysis surface pressure data otherwise. When NCAR Reanalysis data was 
   used, a hypsometric elevation correction was applied before performing the barometric correction. 
3. An offset was applied to both vented and barometrically corrected unvented Levelogger time series to convert 
   to the stage reference datum. Offsets are typically estimated by comparing the 'barocorrected_pressure(cm)' to 
   previous corrected time series and/or available manual stage measurements. 
4. For each water year, either the vented or corrected unvented Levelogger time series was selected 
   (based on data availability and quality) as the stage measurement ('adjusted_stage(cm)').
5. Stage time series from consecutive years were concatenated.
6. Discharge was estimated using the stage time series and documented rating curve.'''
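
Steps 2 and 3 above can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's implementation: it assumes raw pressure is already in cm of water, uses the standard hypsometric equation with a single layer-mean temperature, and takes the conventional value of 98.0665 Pa per cm of water. All function names, constants, and example numbers are chosen for this example.

```python
import math

G = 9.80665              # gravitational acceleration, m/s^2
R_DRY_AIR = 287.05       # specific gas constant for dry air, J/(kg K)
PA_PER_CM_H2O = 98.0665  # conventional pascals per centimetre of water


def pressure_at_site(p_ref_pa, z_ref_m, z_site_m, mean_temp_k):
    """Hypsometric adjustment of surface pressure from z_ref_m to z_site_m."""
    return p_ref_pa * math.exp(-G * (z_site_m - z_ref_m) / (R_DRY_AIR * mean_temp_k))


def adjusted_stage_cm(raw_cm, atmos_pa, offset_cm):
    """Subtract atmospheric pressure (step 2), then add the datum offset (step 3)."""
    barocorrected_cm = raw_cm - atmos_pa / PA_PER_CM_H2O
    return barocorrected_cm + offset_cm


# Made-up inputs: reanalysis pressure at 2800 m moved to a site at 2947 m.
p_site_pa = pressure_at_site(72000.0, 2800.0, 2947.0, 280.0)
stage_cm = adjusted_stage_cm(765.0, p_site_pa, 12.0)
```

When a colocated Barologger record is available, its pressure would replace `p_site_pa` directly and the elevation adjustment would be skipped.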

### Data Citation

In [52]:
data_citation = '''** Data Citation:
Lundquist, J. D., J. W. Roche, H. Forrester, C. Moore, E. Keenan, G. Perry, N. Cristea, B. Henn,
K. Lapo, B. McGurk, D. R. Cayan, and M. D. Dettinger (2016), Yosemite hydroclimate network: Distributed
stream and atmospheric data for the Tuolumne River watershed and surroundings. Water Resour. Res.,
doi:10.1002/2016WR019261'''

## Join Sections

In [53]:
sections = [header, 
            summary, 
            location, 
            site_history, 
            variables, 
            discharge_flags, 
            instruments, 
            rating_curve, 
            postprocessing_qc,
            data_citation]

meta_data = '\n\n\n'.join(sections)

## Generate Metadata for each output dataset

### Global Variables

In [54]:
POSTPROCESSING_URL = ''
UPDATE_DATE = 'August 24, 2022'

### Lyell Below Maclure

In [55]:
LyellBlwMaclure_specs = {'site_code' : 0,
                         'time_series_fn' : 'LyellBlwMaclure_timeseries_stage_Q_T_2004_2021.csv',
                         'verbose_name' : 'Lyell Fork of Tuolumne River below confluence with Maclure Creek',
                         'site_location_description' : 'located downstream of the footbridge crossing the\nLyell Fork of the Tuolumne River, downstream of the confluence with Maclure Creek,\nin Yosemite National Park',
                         'start_year' : '2004',
                         'end_year' : '2021',
                         'noteworthy' : 'Water year 2008 is missing due to a failed instrument.',
                         'approx_coords' : '37.77778 N, 119.26139 W (datum: NAD 83)',
                         'site_elevation_m' : '2947',
                         'site_history' : '''- Water year 2008 is missing due to a failed instrument.
- Starting late in water year 2012, a vented pressure transducer was also installed at the site.
- The channel cross-section is not believed to have changed during the period of record.
- Our corrected pressure, with associated offsets, is referenced to the 10 minus tape down from the 2015
  bolt (most recent operational practice), in units of ft. (long history before that)''',
                         'vented_instruments' : '''(as of 2015)
Campbell Scientific CR1000 Datalogger  SN:7519 HIF # 130889-Stock #5203013
Campbell Scientific CS450-L P.T. SN:70010535 25ft cable
Campbell Scientific CS547A-L Sp. Cond. SN:6181 30’ cable Kc:1.411''',
                         'rating_curve' : '''for H < 2.29893m, Q = 20.114*(H-2.1167)^1.70046
for H > 2.29893m, Q = 5.08516*(H-1.93825)^1.49088'''}

### Lyell Above Twin Bridges

In [56]:
LyellAbvTB_specs = {'site_code' : 1,
                    'time_series_fn' : 'LyellAbvTB_timeseries_stage_Q_T_2005_2021.csv',
                    'verbose_name' : 'Lyell Fork of the Tuolumne River near the twin bridges on the Pacific Crest Trail near Tuolumne Meadows',
                    'site_location_description' : 'located on the Lyell Fork of the Tuolumne River both upstream\nand downstream of the twin bridges on the Pacific Crest Trail near Tuolumne Meadows,\nYosemite National Park',
                    'start_year' : '2005',
                    'end_year' : '2021',
                    'noteworthy' : '',
                    'approx_coords' : '37.86948 N, 119.33110 W (datum: NAD 83)',
                    'site_elevation_m' : '',
                    'site_history' : '''- Previously, both a downstream and upstream site were maintained. This dataset only includes
  data from the upstream site; however, further data for the downstream site is available (if desired)
- Previously, no quality staff plate or datum existed for this location, so the barocorrected instrument-recorded
  stage was considered the stage of record and manual measurements of stage were not used as part of gaugings
  to determine rating curves. 
- During summer 2015, a benchmark bolt was installed. Following the installation of the new benchmark, the top
  of the stilling tube was used for measuring stage.
- Unvented data for 2019 and 2020 was barometrically corrected using NCAR Reanalysis surface pressure
  since no adequate Barologger record was present.
- The channel cross-section is not believed to have changed between the start of the record and 2018. 
  In the subsequent years channel-bank erosion has been observed.''',
                    'vented_instruments' : '',
                    'rating_curve' : '''for H < 0.545617m, Q = 13.0679*(H-0.189547)^1.53746
for 0.545617m < H < 0.786625m, Q = 32.3266*(H-0.3171)^1.68957
for 0.786625m < H, Q = 34.0175*(H-0.3297)^1.69602'''}

### Dana Fork at Bug Camp

In [57]:
DanaFkBug_specs = {'site_code' : 2,
                   'time_series_fn' : 'DanaFk@BugCamp_timeseries_stage_Q_T_2005_2021.csv',
                   'verbose_name' : 'Dana Fork of the Tuolumne River near Bug Camp and the Dog Lake Parking Lot near Tuolumne Meadows',
                   'site_location_description' : 'located on the Dana Fork of the Tuolumne River near the\nPacific Crest Trail near Tuolumne Meadows,\nYosemite National Park',
                   'start_year' : '2005',
                   'end_year' : '2021',
                   'noteworthy' : '',
                   'approx_coords' : '37.877 N, 119.338 W (datum: NAD 83)',
                   'site_elevation_m' : '',
                   'site_history' : '''- Staff plate gage installed 2012
- CR1000 & CS450 P.T install: 6/26/2014
- Bolt benchmark install: 7/16/2015
- In 2019 staff plate washed away; used tapedown bolt instead
- Staff plate replaced at end of 2019, used for all subsequent measurements
- Unvented data for 2019 was barometrically corrected using NCAR Reanalysis surface pressure
  since no adequate Barologger record was present
- Water year 2020 is missing (the vented sensor malfunctioned and the unvented data could not be retrieved)
- The channel cross-section is not believed to have changed during the period of record.
''',
                   'vented_instruments' : '''(starting in 2014)
Campbell Scientific CR1000 Datalogger
Campbell Scientific CS450-L P.T.
(starting in 2018)
Campbell Scientific CR800 Datalogger''',
                   'rating_curve' : '''for H < 0.354118m, Q = 9.81848*(H-0.218727)^1.48825
for 0.354118m < H < 0.904318m, Q = 34.8425*(H-0.2816)^1.61522
for 0.904318m < H, Q = 66.8344*(H-0.4724)^1.68704'''}

### Tuolumne River at 120

In [58]:
TuolumneAt120 = {'site_code' : 3,
                 'time_series_fn' : 'Tuolumne@120_timeseries_stage_Q_T_2001_2021.csv',
                 'verbose_name' : 'Tuolumne River where it passes under Highway 120',
                 'site_location_description' : 'located on Tuolumne River where it passes under\nHighway 120 in Tuolumne Meadows,\nYosemite National Park',
                 'start_year' : '2001',
                 'end_year' : '2021',
                 'noteworthy' : '',
                 'approx_coords' : '37.87629 N, 119.35475 W (datum: NAD 83)',
                 'site_elevation_m' : '',
                 'site_history' : '''- Staff plate installed in 2006
- CR1000 & CS450 P.T installed 10/10/2012
- On 6/25/19, the bottom portion of the staff plate was observed to have ripped off; the staff plate was replaced and resurveyed
- The record is believed to be stable through the present. It is expected the site will be temporarily uninstalled (and reinstalled) during late summer 2022 for construction on the Tioga Bridge
''',
                 'vented_instruments' : '''(starting in 2012)
Campbell Scientific CR1000 Datalogger
Campbell Scientific CS450-L P.T.''',
                 'rating_curve' : '''for H < 0.8011m, Q = 34.3522*(H-0.7639)^1.465
for H > 0.8011m, Q = 17.0873*(H-0.7076)^1.7401'''}

### Create metadata files

In [59]:
def write_metadata(site_spec):
    metadata_text = meta_data.format(time_series_fn=site_spec['time_series_fn'],
                                     update_date=UPDATE_DATE, 
                                     verbose_name=site_spec['verbose_name'],
                                     site_location_description=site_spec['site_location_description'], 
                                     start_year=site_spec['start_year'], 
                                     end_year=site_spec['end_year'],
                                     noteworthy=site_spec['noteworthy'], 
                                     approx_coords=site_spec['approx_coords'], 
                                     site_elevation_m=site_spec['site_elevation_m'],
                                     site_history=site_spec['site_history'], 
                                     postprocessing_url=POSTPROCESSING_URL,
                                     vented_instruments=site_spec['vented_instruments'],
                                     rating_curve=site_spec['rating_curve'])
    
    fn = config.FINAL_OUTPUT_METADATA_FN.format(site=config.SITE_SHORTNAME[site_spec['site_code']],
                                                start=site_spec['start_year'], 
                                                end=site_spec['end_year'])
    path = os.path.join('..', 'metadata', fn)
    
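    # Write as UTF-8 so non-ASCII characters in the site specs (e.g. ’) are preserved;
    # the context manager closes the file even if the write fails.
    with open(path, 'w', encoding='utf-8') as f:
        f.write(metadata_text)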

In [60]:
site_specs = [LyellBlwMaclure_specs, LyellAbvTB_specs, DanaFkBug_specs, TuolumneAt120]

for site_spec in site_specs:
    write_metadata(site_spec)
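
Because every placeholder in the template must be supplied at format time, it can help to check a site specification before writing. The sketch below lists placeholders that a dictionary does not provide, using the standard library's `string.Formatter`. The helper name `missing_fields`, the shortened template, and the incomplete spec are hypothetical stand-ins, not objects from this notebook.

```python
import string


def missing_fields(template, values):
    """Return the placeholder names in template that values does not supply."""
    names = {field for _, field, _, _ in string.Formatter().parse(template) if field}
    return sorted(names - set(values))


# Shortened stand-in for the joined metadata template.
template = '** Metadata for {time_series_fn}\n** Updated: {update_date}'
spec = {'time_series_fn': 'example.csv'}  # hypothetical, incomplete spec

missing = missing_fields(template, spec)
```

An empty result means the template can be formatted without a `KeyError`; it does not catch values that are present but blank, such as an empty elevation.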