# Project Progress Report

**Name**: Nathan Hein<br/>
**Semester**: Spring 2019 <br/>
**Project area**: Agronomy

## Objective:

Organize temperature and relative humidity (RH) data, then calculate the hourly temperature average, difference, minimum, and maximum, along with the hourly RH average and vapor pressure deficit (VPD).

## Rationale:

During our lab's experiments we use HOBO data loggers to monitor temperature and humidity. Our stress periods can run for over a month, which results in more than ten thousand data points. Typically we would then go through the data manually in Excel to organize it and extract the different components, which would take hours.

## Project Diagram

<img src="ProjectFigure.jpg" width="1000"/>

## Current Progress:

In [None]:
# Import Modules
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from functools import reduce
%matplotlib inline

# Set variables for the data directory, results directory, and file name. Import the CSV file, skip the top row, drop unnecessary columns, and rename the columns.
datadirname = '/Users/nhein/Desktop/Coding/Project/Data/'
resultsdirname = '/Users/nhein/Desktop/Coding/Project/Results/'
datafile = 'Ch5-36C.csv'

df = pd.read_csv(datadirname + datafile, skiprows=1)
df.drop(df.columns[[0, 4, 5, 6]], axis=1, inplace= True)
df.columns = ['Full_Date', 'Temp_C', 'RH']

# Change dates to datetime and create new columns with date data needed for calculations
df.Full_Date = pd.to_datetime(df.Full_Date)
df.insert(1,'Date', df.Full_Date.dt.date)
df.insert(2,'Time', df.Full_Date.dt.time)
df.insert(3, 'Hour', df.Full_Date.dt.hour)

# Drop rows with NaNs
df = df.dropna()

<img src="originalcsvimport.png" width="800"/>
<img src="dfnonan.png" width="400"/>
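
The import and cleaning steps above can be tried without the original logger file. The following is a minimal sketch on an in-memory CSV; the header names, the readings, and the `%m/%d/%Y %H:%M:%S` timestamp format are assumptions made for illustration and may not match a real HOBO export. Passing an explicit `format` to `pd.to_datetime` is one way to avoid ambiguous day/month parsing.

```python
import io
import pandas as pd

# Hypothetical logger export: a title row, a header row, then readings.
# All names and values below are invented for illustration.
raw = io.StringIO(
    'Plot Title: example\n'
    '#,Date Time,Temp (C),RH (%),Coupler Detached,Host Connected,End Of File\n'
    '1,03/01/2019 00:00:00,24.5,60.1,,,\n'
    '2,03/01/2019 00:05:00,24.7,59.8,,,\n'
    '3,03/01/2019 00:10:00,,,Logged,,\n'
)

demo = pd.read_csv(raw, skiprows=1)                      # skip the title row
demo = demo.drop(columns=demo.columns[[0, 4, 5, 6]])     # keep date, temperature, RH
demo.columns = ['Full_Date', 'Temp_C', 'RH']

# Assumed timestamp layout; adjust the format string to the actual export.
demo['Full_Date'] = pd.to_datetime(demo['Full_Date'], format='%m/%d/%Y %H:%M:%S')
demo['Hour'] = demo['Full_Date'].dt.hour

# The third row has no temperature or RH reading, so it is dropped here.
demo = demo.dropna()
print(demo)
```

The row that only records a logger event (no temperature or RH) disappears after `dropna()`, which is the same effect the cleaning cell relies on.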

## Calculate VPD for each entry and store the results in a new dataframe column:

In [None]:
# Define the VPD function
def vpdfun(T_Grid, RH_Grid):
    '''
    Uses the inputs of Temperature (Celsius) and relative humidity (%RH) and outputs the VPD (kPa).
    Inputs may be scalars, numpy arrays, or pandas Series; T in Celsius and RH in %RH must have the same dimensions.
    Author: Nathan Hein
    Date: 03/03/2019
    '''
    esat = .611 * (np.exp((T_Grid * 17.5) / (T_Grid + 241))) # Sat. Vapor Pressure
    eact = esat * (RH_Grid / 100)                            # Actual Vapor Pressure
    vpd = esat - eact                                        # Vapor Pressure Deficit
    vpd = np.round(vpd, 2)                                   # Round to 2 decimal places
    return vpd

# Defining Temp and RH variables
T_Grid = df.Temp_C
RH_Grid = df.RH

# Calling function and creating new column in dataframe with VPD.
VPD = vpdfun(T_Grid, RH_Grid)
df.insert(6,'VPD', VPD)

<img src="dfwithvpd.png" width="400"/>
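
As a quick sanity check of the equations in `vpdfun`, the sketch below evaluates them for a single reading. The helper name `vpd_kpa` and the input values (25 °C, 50 %RH) are hypothetical and chosen only for illustration; the constants are the ones used in the function above.

```python
import numpy as np

def vpd_kpa(temp_c, rh_percent):
    """Same equations as vpdfun, without the rounding step."""
    esat = 0.611 * np.exp((17.5 * temp_c) / (temp_c + 241))  # saturation vapor pressure (kPa)
    eact = esat * (rh_percent / 100)                         # actual vapor pressure (kPa)
    return esat - eact                                       # vapor pressure deficit (kPa)

# Illustrative reading: 25 C at 50 %RH
example = vpd_kpa(25.0, 50.0)
print(round(float(example), 2))   # about 1.58 kPa

# Saturated air (100 %RH) has no deficit
print(float(vpd_kpa(25.0, 100.0)))
```

At 0 °C the exponential term equals 1, so the saturation vapor pressure reduces to the leading constant, 0.611 kPa.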

## Calculate the average Temperature, VPD, and RH for each hour of the day over the stress period:

In [None]:
# Define a function that averages the VPD values for each hour of the day over the entire data set.
def monthlyvpd():
    '''
    Takes the calculated VPD and averages all values that were measured during each hour of the day throughout the data period.
    Records the results in a new dataframe.
    Inputs: VPD as floats from the dataframe df created from a csv file; results are in the same units as the inputs (normally kPa).
    Author: Nathan Hein
    Date: 03/11/19
    '''
    cols = ['Hour', 'VPD']
    rows = []
    for x in range(0, 24):
        # All VPD readings taken during hour x, on any day
        vpd = df[df['Hour'] == x]['VPD']
        rows.append({'Hour': x, 'VPD': vpd.mean()})
    # Build the result once; DataFrame.append was removed in pandas 2.0
    xdf = pd.DataFrame(rows, columns=cols)
    return xdf

MonthlyHourlyVPD = monthlyvpd()

# Define a function that averages the temperature values for each hour of the day over the entire data set.
def monthlytemp():
    '''
    Calculates the average temperature for each hour of the day over the entire data set.
    Inputs: Temperature as floats in the dataframe df created from a csv file.
    Outputs: A new dataframe with the hour and the average temperature values (floats).
    Author: Nathan Hein
    Date: 03/11/19
    '''
    cols = ['Hour', 'Temp']
    rows = []
    for x in range(0, 24):
        # All temperature readings taken during hour x, on any day
        temp = df[df['Hour'] == x]['Temp_C']
        rows.append({'Hour': x, 'Temp': temp.mean()})
    # Build the result once; DataFrame.append was removed in pandas 2.0
    ydf = pd.DataFrame(rows, columns=cols)
    return ydf

MonthlyHourlyTemp = monthlytemp()

# Define a function that averages the RH values for each hour of the day over the entire data set.
def monthlyrh():
    '''
    Calculates the average RH value for each hour of the day over the entire data set.
    Input: RH from the dataframe df created from the imported CSV file.
    Output: A new dataframe with the hour and the hourly average RH.
    Author: Nathan Hein
    Date: 03/11/19
    '''
    cols = ['Hour', 'RH']
    rows = []
    for x in range(0, 24):
        # All RH readings taken during hour x, on any day
        rh = df[df['Hour'] == x]['RH']
        rows.append({'Hour': x, 'RH': rh.mean()})
    # Build the result once; DataFrame.append was removed in pandas 2.0
    zdf = pd.DataFrame(rows, columns=cols)
    return zdf

MonthlyHourlyRH = monthlyrh()

<img src="HourlyVPD.png" width="100"/>
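
The three per-variable functions above share the same pattern, so a `groupby` on the `Hour` column is one possible way to get all of the hourly averages in a single step. This is a sketch on synthetic readings (the values are invented), not a replacement that has been tested on the lab's data; the names `demo` and `hourly` are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic readings (invented values) using the same column names as df.
demo = pd.DataFrame({
    'Hour':   [0, 0, 1, 1, 2],
    'Temp_C': [20.0, 22.0, 24.0, 26.0, 30.0],
    'RH':     [60.0, 70.0, 50.0, 40.0, 30.0],
    'VPD':    [0.9, 0.8, 1.5, 2.0, 3.0],
})

hourly = (
    demo.groupby('Hour')[['Temp_C', 'RH', 'VPD']]
        .mean()                  # mean of every reading within each hour of the day
        .reindex(range(24))      # keep all 24 hours; hours with no data become NaN
        .rename_axis('Hour')
        .reset_index()
)
print(hourly.head())
```

Because all three averages come out in one dataframe keyed by `Hour`, this approach would not need a separate merge step.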

## Merge the Data Frames:

In [None]:
# Create a list of the result data frames (one per data type) and merge them into a single data frame.
data_frames = [MonthlyHourlyTemp, MonthlyHourlyRH, MonthlyHourlyVPD]
MHA_Merged = reduce(lambda left,right: pd.merge(left,right,on=['Hour'],how='outer'), data_frames)

<img src="HourlyMerged.png" width="250"/>
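
To show what the `reduce` and `pd.merge` combination does, here is a small sketch on three toy frames with invented values. The RH frame deliberately lacks hour 2 to illustrate why `how='outer'` matters: the hour is kept and the missing value is filled with NaN instead of the row being dropped.

```python
from functools import reduce
import pandas as pd

# Toy hourly results (invented values); the RH frame has no row for hour 2.
temp = pd.DataFrame({'Hour': [0, 1, 2], 'Temp': [21.0, 25.0, 30.0]})
rh = pd.DataFrame({'Hour': [0, 1], 'RH': [65.0, 45.0]})
vpd = pd.DataFrame({'Hour': [0, 1, 2], 'VPD': [0.85, 1.75, 3.0]})

# reduce applies the merge pairwise: (temp merged with rh) merged with vpd.
merged = reduce(
    lambda left, right: pd.merge(left, right, on='Hour', how='outer'),
    [temp, rh, vpd],
)
print(merged)
```

With `how='inner'` the same call would return only hours 0 and 1, since hour 2 is absent from one of the inputs.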

## Roadblocks

* How to trim the original dataframe so that it only includes data from the period between the user-supplied start and end dates.
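
One possible approach to this roadblock is a boolean mask on the `Full_Date` column. The sketch below is an assumption about how the trimming could work, not something already in the project: the function name `trim_to_period`, the half-open interval convention, and the synthetic data are all hypothetical.

```python
import pandas as pd

# Synthetic daily timestamps (invented) standing in for the Full_Date column.
demo = pd.DataFrame({
    'Full_Date': pd.date_range('2019-03-01', periods=6, freq='D'),
    'Temp_C': [24.0, 25.0, 36.0, 36.5, 35.8, 24.2],
})

def trim_to_period(frame, start, end, date_col='Full_Date'):
    """Return the rows whose timestamp is on or after start and before end."""
    start = pd.Timestamp(start)
    end = pd.Timestamp(end)
    mask = (frame[date_col] >= start) & (frame[date_col] < end)
    return frame.loc[mask].copy()

# Keep 3 March through 5 March; the end date itself is excluded.
stress = trim_to_period(demo, '2019-03-03', '2019-03-06')
print(stress)
```

The end bound is exclusive here, so to include every reading from the final day of an experiment, the day after it would be passed as `end`.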

## Future Work

* Adjust original dataframe to start and end date of experiment
* Ability to add second CSV file to make comparisons between control and stress
* Calculate the difference between control and stress temperature, RH, and VPD
* Create graphs of the hourly average Temp, RH, and VPD, plus comparison graphs between control and stress for both the stress and non-stress periods of the day.
* Save all output files so they can be used by lab members who do not code.

<img src="HourlyDiffexample.png" width="1000"/>
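
The control versus stress comparison listed under future work could be sketched as a merge on `Hour` followed by column-wise subtraction. Everything below is hypothetical: the two frames hold invented values, and the `_control`, `_stress`, and `_diff` column names are only one possible naming scheme.

```python
import pandas as pd

# Invented hourly averages for a control chamber and a stress chamber.
control = pd.DataFrame({
    'Hour': [0, 1, 2],
    'Temp': [24.0, 24.5, 25.0],
    'RH':   [60.0, 58.0, 55.0],
    'VPD':  [1.25, 1.5, 1.75],
})
stress = pd.DataFrame({
    'Hour': [0, 1, 2],
    'Temp': [30.0, 33.0, 36.0],
    'RH':   [50.0, 46.0, 40.0],
    'VPD':  [2.0, 2.5, 3.5],
})

# Line the two treatments up by hour, then subtract control from stress.
both = control.merge(stress, on='Hour', suffixes=('_control', '_stress'))
for col in ['Temp', 'RH', 'VPD']:
    both[col + '_diff'] = both[col + '_stress'] - both[col + '_control']

# With no path, to_csv returns the CSV text; passing a file path writes a file
# that can be opened in Excel.
csv_text = both.to_csv(index=False)
print(csv_text)
```

Writing the result with `to_csv` gives a plain file that lab members can open in a spreadsheet without running any code.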