## Watchdog data prep and visualization
This notebook manages the meteorological data collected by the network of Watchdog 2000 series loggers deployed across the JFSP 2015 experimental gradient. It builds a programmatic way to read in overlapping or discontinuous met records from a single station, generate unique timestamp information for each record, associate each logger with metadata, perform cursory QA/QC steps, and concatenate the data into a single met record.

Ultimately, a portion of the steps developed here will be packaged into executables and run each time a field technician downloads the data, ideally helping the technician perform on-site QA/QC before leaving the field.
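
As a rough illustration of the concatenation goal described above, the sketch below stacks two downloads from one station, drops repeated timestamps, and sorts the result. The function name `combine_downloads`, the sample values, and the rule of keeping the later download where records overlap are all assumptions made for illustration, not part of the notebook's code.

```python
import pandas as pd

def combine_downloads(frames):
    """Stack several downloads from one station into a single record.

    Assumes every frame is indexed by timestamp. Where downloads overlap,
    the row from the frame that appears later in the list is kept.
    """
    combined = pd.concat(frames)
    # Drop repeated timestamps, keeping the most recently supplied row
    combined = combined[~combined.index.duplicated(keep='last')]
    # Sort so discontinuous pieces end up in chronological order
    return combined.sort_index()

# Two hypothetical downloads: they overlap at 02:00 and leave a gap before 05:00
first = pd.DataFrame(
    {'TMP': [1.0, 2.0, 3.0]},
    index=pd.to_datetime(['2015-06-01 00:00', '2015-06-01 01:00', '2015-06-01 02:00']))
second = pd.DataFrame(
    {'TMP': [3.5, 4.0]},
    index=pd.to_datetime(['2015-06-01 02:00', '2015-06-01 05:00']))

record = combine_downloads([first, second])
```

Gaps are left as they are; filling or flagging them would be a separate QA/QC decision.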

### Load notebook dependencies
and configure notebook aesthetic preferences

In [26]:
# ------- Notebook config
%matplotlib inline
import matplotlib.colors
import matplotlib.pyplot as plt

# ------- Load dependencies
import pandas as pd
import numpy as np
import seaborn as sns
import os

# ------- Plot environment aesthetics
sns.set_style('ticks')
sns.set_context('notebook', font_scale=2)

dataDir = 'Z:/JFSP_2015/Weather Stations/Data/Exports/'
outDir = 'Z:/JFSP_2015/Weather Stations/Data/Vis/Diagnostics/'


### Processing steps:
#### Generate a list of files in the 'Exports' directory
Then parse the names of the exported .txt files to extract the station ID, the station locale, and, if it is needed later, the download date.
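
A minimal sketch of that parsing step follows. It assumes file names of the form `<locale>_<stationID>_<YYYYMMDD>.txt`; the date field, its position, and its format are hypothetical, as are the function name `parse_met_filename` and the example file name.

```python
import os
from datetime import datetime

def parse_met_filename(fname):
    """Split an export file name into locale, station ID, and download date.

    Assumes names look like '<locale>_<stationID>_<YYYYMMDD>.txt'. The date
    field is a hypothetical convention and is treated as optional.
    """
    stem, _ext = os.path.splitext(os.path.basename(fname))
    parts = stem.split('_')
    info = {'Locale': parts[0], 'LoggerID': int(parts[1]), 'DownloadDate': None}
    if len(parts) > 2:
        try:
            info['DownloadDate'] = datetime.strptime(parts[2], '%Y%m%d').date()
        except ValueError:
            pass  # third field is not a date in the assumed format
    return info

example = parse_met_filename('SiteA_12_20150601.txt')
```

Stripping the extension before splitting means a name without a date field, such as `SiteA_12.txt`, still yields a usable station ID.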

#### Fix up the timestamps
This step covers naming and the generation of additional columns. Rename the initial timestamp column, then extract day of year, month, year, and hour for easy resampling and averaging later on. Having these columns also makes it much easier to adjust timestamps for incorrect clocks or offsets.
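
To show why the derived columns help, here is a small sketch that corrects a clock offset and then averages by day of year. The function `shift_clock`, the one-hour offset, and the sample temperatures are assumptions for illustration only; the column names `doy`, `month`, `year`, and `hour` match those used in this notebook.

```python
import pandas as pd

def shift_clock(df, offset_hours):
    """Return a copy of df with its timestamp index shifted and time columns rebuilt.

    offset_hours is a hypothetical correction for a logger clock that was
    set incorrectly (positive values move records later).
    """
    out = df.copy()
    out.index = out.index + pd.Timedelta(hours=offset_hours)
    # Rebuild the derived columns so they agree with the corrected index
    out['doy'] = out.index.dayofyear
    out['month'] = out.index.month
    out['year'] = out.index.year
    out['hour'] = out.index.hour
    return out

# Hypothetical hourly air temperatures spanning midnight
raw = pd.DataFrame(
    {'TMP': [10.0, 12.0, 14.0, 16.0]},
    index=pd.to_datetime(['2015-06-01 22:00', '2015-06-01 23:00',
                          '2015-06-02 00:00', '2015-06-02 01:00']))

fixed = shift_clock(raw, offset_hours=1)

# Mean temperature for each day of year, using the derived column
daily_mean = fixed.groupby('doy')['TMP'].mean()
```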

#### Create a quick panel of the variables of interest
Step through each column of the met record that contains data and plot it. This is a crude, first-pass plot.
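
One way to step through only the populated columns is sketched below. The helper names `columns_with_data` and `quick_panel`, the figure size, and the sample data are hypothetical; this is a generic alternative to a fixed panel layout, not the plotting function defined later in this notebook.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; omit this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

def columns_with_data(df):
    """Names of numeric columns that hold at least one non-missing value."""
    numeric = df.select_dtypes(include='number')
    return [col for col in numeric.columns if numeric[col].notna().any()]

def quick_panel(df):
    """Plot every populated numeric column in its own stacked subplot."""
    cols = columns_with_data(df)
    if not cols:
        raise ValueError('no numeric columns with data to plot')
    fig, axes = plt.subplots(len(cols), 1, figsize=(10, 3 * len(cols)),
                             sharex=True, squeeze=False)
    for ax, col in zip(axes[:, 0], cols):
        ax.plot(df.index, df[col], color='black')
        ax.set_ylabel(col)
    return fig, cols

# Hypothetical record in which the VWCC sensor returned no data
demo = pd.DataFrame(
    {'TMP': [10.0, 11.0, 12.0],
     'HMD': [40.0, 42.0, 41.0],
     'VWCC': [float('nan')] * 3},
    index=pd.to_datetime(['2015-06-01 00:00', '2015-06-01 01:00', '2015-06-01 02:00']))

fig, plotted = quick_panel(demo)
```

Any derived time columns present in the frame are numeric and would be plotted too, so they may need to be dropped first.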

#### Create a variable by variable QA/QC framework
The Watchdogs make two types of measurements: core and ancillary. The core measurements are air temperature, relative humidity, anemometer measurements, rainfall, and some calculated variables derived from those measurements. Ancillary measurements come from sensors plugged into the Watchdog's logger. Currently, we record two soil temperature and two soil moisture measurements at each logger (one pair 5 cm under shrubs, and one pair 5 cm deep in the open).
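
A variable-by-variable check could start as a simple range test, sketched below. The column names come from the Watchdog exports used in this notebook, but the `LIMITS` table, the function names, and the sample values are assumptions: the humidity and wind direction bounds are physical limits, while the temperature and soil moisture bounds are placeholders that would need to be set for these sites and sensors.

```python
import pandas as pd

# Plausible range for each variable, as (low, high).
# HMD and WND bounds are physical limits; the rest are placeholder assumptions.
LIMITS = {
    'TMP':  (-40.0, 50.0),   # air temperature, deg C (assumed)
    'HMD':  (0.0, 100.0),    # relative humidity, %
    'WND':  (0.0, 360.0),    # wind direction, deg
    'TMPA': (-30.0, 60.0),   # soil temperature, port A, deg C (assumed)
    'TMPB': (-30.0, 60.0),   # soil temperature, port B, deg C (assumed)
    'VWCC': (0.0, 60.0),     # soil moisture, port C, % (assumed)
    'VWCD': (0.0, 60.0),     # soil moisture, port D, % (assumed)
}

def flag_out_of_range(df, limits=LIMITS):
    """Return a boolean frame that is True where a value falls outside its limits."""
    flags = {}
    for col, (low, high) in limits.items():
        if col in df.columns:
            flags[col] = (df[col] < low) | (df[col] > high)
    return pd.DataFrame(flags, index=df.index)

def mask_flagged(df, flags):
    """Return a copy of df with flagged values replaced by NaN."""
    out = df.copy()
    for col in flags.columns:
        out[col] = out[col].where(~flags[col])
    return out

# Hypothetical observations with one bad temperature and one bad humidity value
obs = pd.DataFrame({'TMP': [12.0, 99.0, 15.0], 'HMD': [45.0, 50.0, 120.0]})
flags = flag_out_of_range(obs)
clean = mask_flagged(obs, flags)
```

Missing values compare as False against both bounds, so they are never flagged here; core and ancillary variables could later be given separate limit tables.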

In [48]:
# Quickly list all the files in the data directory
fileList = next(os.walk(dataDir))[2]

# parseAndReadMetData:
# Summary: Parses the file name to gather metadata, reads the file into a pandas
# dataframe, and appends the metadata as new columns.
#
# Inputs  - fname (file name of the met data file, relative to dataDir)
# Returns - df    (pandas dataframe with logger location and station ID)
def parseAndReadMetData(fname):
    location       = fname.split('_')[0]
    stationNum     = fname.split('_')[1]
    df             = pd.read_csv(dataDir + fname, sep = '\t', skiprows=3)
    df['Locale']   = location
    df['LoggerID'] = int(stationNum)
    return df

# prepareTimeStamps:
# Summary: Renames the time column, builds a pandas datetime index from it, and adds
# day of year, month, year, and hour columns.
#
# Inputs  - df (pandas dataframe with logger location and station ID)
# Returns - df (the same dataframe with appended time stamp information)
def prepareTimeStamps(df):
    df.rename(columns = {'Date and Time   ':'Timestamp'}, inplace = True)
    df.index    = pd.to_datetime(df['Timestamp'])
    df['doy']   = df.index.dayofyear
    df['month'] = df.index.month
    df['year']  = df.index.year
    df['hour']  = df.index.hour
    return df

# rawSummaryPlots:
# Summary: Creates seven subplots for the main variables output by the Watchdog 2000
# series loggers. Saves the plot with a site and station ID specific file name.
#
# Inputs  - df (pandas dataframe with complete timestamps)
#         - outDir (/path/where/output/will/be/saved/)
# Returns - None
def rawSummaryPlots(df, outDir):
        
    # Setup plot axes
    f, ((ax1, ax2), (ax3, ax4), (ax5, ax6), (ax7, ax8)) = plt.subplots(4,2, figsize = (20,22))
    f.delaxes(ax8)

    # Soil temperature (sensor ports A and B)
    ax1.plot(df.index, df['TMPA'], lw = 3, color = 'gray', label = 'Temp A')
    ax1.plot(df.index, df['TMPB'], lw = 3, color = 'green', label = 'Temp B')
    # Use .iloc[0] because the index holds timestamps, not integer positions
    ax1.set_title(df['Locale'].iloc[0] + ' ' + str(df['LoggerID'].iloc[0]))
    ax1.set_ylim([-5, 20])
    ax1.set_ylabel('Soil Temperature (deg C)')
    ax1.set_xticklabels([])
    ax1.legend()

    # Soil VWC (sensor ports C and D)
    ax2.plot(df.index, df['VWCC'], lw = 3, color = 'gray', 
             label = 'VWC C')
    ax2.plot(df.index, df['VWCD'], lw = 3, color = 'green', 
             label = 'VWC D')
    ax2.set_ylim([0, 40])
    ax2.set_ylabel('Volumetric Water Content (%)')
    ax2.set_xticklabels([])
    ax2.legend()

    # rH 
    ax3.plot(df.index, df['HMD'], lw = 2, color = 'black')
    ax3.set_ylabel('Relative Humidity (%)')
    ax3.set_xticklabels([])

    # TA 
    ax4.plot(df.index, df['TMP'], lw = 2, color = 'black')
    ax4.set_ylabel('Air Temperature (deg C)')
    ax4.set_xticklabels([])

    # Wind velocity (the legend distinguishes gusts from wind speed)
    ax5.plot(df.index, df['WNG'], lw = 1, color = 'gray', 
             label = 'Gusts', alpha = 0.5)
    ax5.plot(df.index, df['WNS'], lw = 1, color = 'black', 
             label = 'Wind Speed')
    ax5.set_ylabel('Wind Speed (km h$^{-1}$)')
    ax5.set_xticklabels([])
    ax5.legend()

    # Wind direction
    ax6.plot(df.index, df['WND'], lw = 2, color = 'black')
    ax6.set_ylabel('Wind Direction (deg)')
    ax6.set_ylim([0,360])
    plt.setp(ax6.get_xticklabels(), rotation = 45)

    # Precip
    ax7.plot(df.index, df['RNF'], lw = 2, color = 'black')
    ax7.set_ylabel('Precipitation (mm)')
    plt.setp(ax7.get_xticklabels(), rotation = 45)

    sns.despine()

    # Create the file name and save the figure
    plotStationName = df['Locale'].iloc[0] + '_' + str(df['LoggerID'].iloc[0]) + '_'
    plt.savefig(outDir + plotStationName + 'RawSummary.tif')

In [None]:
# Diagnostic plot creation
# Usage: Step through the three functions defined above, in a loop where the 
#        loop iterator is the file name in the list of met station data files.
#        The result will be the production of a list of .tif files, one for each
#        met station.


# for metfile in fileList:
#    metdf = parseAndReadMetData(metfile)
#    metdf_a = prepareTimeStamps(metdf)
#    rawSummaryPlots(metdf_a, outDir)