# Eddy Covariance Method for calculating fluxes

In lecture 3, you learnt how vertical fluxes can be expressed as the sum of the transport by individual eddies.

For example, the sensible heat flux can be written as:

$Q_H = \rho c_p \overline{w' \theta'}$

The latent heat flux can be written as:

$Q_E = \rho L_v \overline{w' q'}$
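
Both expressions rest on Reynolds decomposition. As a brief reminder (this is the standard textbook argument, not copied from the lecture slides), each variable is split into its mean over the averaging period and a fluctuation about that mean:

$w = \overline{w} + w', \qquad \theta = \overline{\theta} + \theta', \qquad \overline{w'} = \overline{\theta'} = 0$

Multiplying and averaging then gives:

$\overline{w \theta} = \overline{w}\,\overline{\theta} + \overline{w' \theta'}$

The first term is transport by the mean flow and the second is the transport by eddies, which is the covariance that appears in $Q_H$. The same steps with $q$ in place of $\theta$ give the covariance in $Q_E$.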

This assignment has two parts:
1. You're going to calculate $Q_H$ and $Q_E$ for a 24-hour period, using turbulent flux measurements. The measurements are taken from a tower that was formerly installed in the Wombat State Forest (northwest of Melbourne).

2. You will calculate the surface energy balance terms for a year of measurements of $Q_H$, $Q_E$, $Q_G$, $LW \downarrow$, $SW \downarrow$, $LW \uparrow$ and $SW \uparrow$.

# Part 1: Eddy Covariance Calculation

Question 1: Follow the steps in the Jupyter notebook to calculate $Q_H$ and $Q_E$. Include all the requested plots in your report. Present your results as a graph with $Q_H$ and $Q_E$ on the same axes. [4 marks]

Question 2: Describe what you think the clouds were like on this day and sketch what you think the incoming short-wave radiation might have looked like. [2 marks]

# Part 2: Surface Energy Balance

Question 1: Write down the equation for the surface energy balance [1 mark]

Question 2: Follow the steps in the Jupyter notebook to calculate the seven terms of the average surface energy balance. Find a meaningful way to present your results and test the hypothesis that the sum of the radiative fluxes is balanced by the sum of the sensible, latent and ground heat fluxes. [3 marks]

Question 3: Do the terms balance? Why / Why not? [1 mark]

Question 4: These fluxes were measured at a height of 30 m above the ground, so that the measurement mast extends out of the forest canopy. How might the results have changed if they had been measured at the forest floor?  [1 mark]

Question 5: Choose one day from the year, and plot the surface energy balance for that specific day. Does it balance when you look at a single day? What might cause errors in the balance? [2 marks]

In [1]:
# Just run this! This loads some packages that we're going to use
import pandas
import numpy as np
import datetime as datetime
from matplotlib import pyplot as plt

In [2]:
# Here we load the data
df = pandas.read_csv('/home/workbooks/fluxes_week8-9/Fluxes2024.csv', parse_dates=['TIMESTAMP'])

# Print the dataframe to have a look at which variables it contains:
df

Unnamed: 0.1,Unnamed: 0,TIMESTAMP,Vertical_velocity,Virtual_Temp,Pressure,Vapour_Pressure
0,0,2019-03-08 00:00:00.000,-0.23175,15.40259,93.37927,0.442335
1,1,2019-03-08 00:00:00.200,-0.63725,15.52631,93.43839,0.445633
2,2,2019-03-08 00:00:00.300,-0.53350,15.58398,93.40527,0.445690
3,3,2019-03-08 00:00:00.400,-0.58875,15.41278,93.41237,0.443708
4,4,2019-03-08 00:00:00.500,-0.07800,15.25351,93.41237,0.447943
...,...,...,...,...,...,...
820270,820270,2019-03-08 23:59:59.500,0.12575,14.98248,93.43839,0.757567
820271,820271,2019-03-08 23:59:59.600,0.00425,14.99432,93.43839,0.754179
820272,820272,2019-03-08 23:59:59.700,-0.07825,14.99600,93.43839,0.755957
820273,820273,2019-03-08 23:59:59.800,-0.09575,15.06714,93.46440,0.755958


In [3]:
# First, you need to calculate the mixing ratio and the virtual potential temperature. 
# Note that Pressure and Vapour pressure are both in kPa, so you have to multiply by 1000 to get Pa. 
# For the mixing ratio, see lecture 2, slide 36
mixing_ratio = 
theta = 

# Add them to the dataframe:
df['Mixing_ratio'] = mixing_ratio
df['Theta'] = theta

SyntaxError: invalid syntax (<ipython-input-3-445ffea344f3>, line 4)
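
The cell above is left blank for you to fill in from the lecture notes. For orientation only, here is a minimal sketch of one common textbook form of the two formulas, tried on made-up numbers. The constants, the function names, and the assumption that the temperature column is in degrees Celsius are mine, not taken from the lecture, so check them against slide 36 before relying on them.

```python
import numpy as np

EPSILON = 0.622     # ratio of gas constants R_d / R_v (assumed standard value)
KAPPA = 0.286       # R_d / c_p for dry air (assumed standard value)
P_REF = 100000.0    # reference pressure in Pa (assumed 1000 hPa)


def mixing_ratio_from_pressures(e_kpa, p_kpa):
    """Mixing ratio (kg/kg) from vapour pressure and total pressure, both in kPa."""
    e = np.asarray(e_kpa, dtype=float) * 1000.0   # kPa -> Pa
    p = np.asarray(p_kpa, dtype=float) * 1000.0   # kPa -> Pa
    return EPSILON * e / (p - e)


def potential_temperature(temp_c, p_kpa):
    """Potential temperature (K) from temperature in deg C and pressure in kPa.

    Passing a virtual temperature gives a virtual potential temperature.
    """
    temp_k = np.asarray(temp_c, dtype=float) + 273.15   # deg C -> K
    p = np.asarray(p_kpa, dtype=float) * 1000.0         # kPa -> Pa
    return temp_k * (P_REF / p) ** KAPPA


# Made-up values, chosen so the results are easy to check by hand
r_example = mixing_ratio_from_pressures(1.0, 101.0)
theta_example = potential_temperature(15.0, 100.0)
```

Both functions accept whole columns as well as single numbers, so they could be applied to `df['Vapour_Pressure']`, `df['Pressure']` and `df['Virtual_Temp']` directly.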

## Data length and missing data
You'll notice that your data is slightly shorter than the expected length of 864000 (24 hours of samples at 10 per second). That's because of missing datapoints, which is an annoying thing that almost always happens with 'real world' data. We're going to deal with that in the analysis.
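
If you want to see where the gaps are, one possible approach is to look for time steps longer than the nominal sampling interval. The sketch below uses a made-up one-minute record rather than the real file, and the names `demo`, `n_missing` and `gaps` are illustrative only.

```python
import pandas as pd

# Synthetic 10 Hz timestamps for one minute (600 samples), with three dropped
full = pd.date_range('2019-03-08 00:00:00', periods=600, freq='100ms')
demo = pd.DataFrame({'TIMESTAMP': full})
demo = demo.drop(index=[1, 250, 251]).reset_index(drop=True)

# How many samples are missing compared with a complete record?
expected = 600
n_missing = expected - len(demo)

# A gap shows up as a step longer than the nominal 0.1 s
step = demo['TIMESTAMP'].diff()
gaps = demo.loc[step > pd.Timedelta(milliseconds=100), 'TIMESTAMP']
```

Here `gaps` holds the timestamp of the first sample after each gap. On the real data, `expected` would be 864000.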


## Half-hour averaging function

We're going to have to do this a few times in this analysis, so let's make a function. Study this function carefully, because the structure of this will help you in the other steps of the analysis. I've done this for you, to get you started. 

Running this cell just builds your half-hour averaging 'machine', but it won't actually do anything until you put some data into it.

In [4]:
def do_half_hour_avs(data, var): # <- don't forget the : after your function definition
    '''
    Half hour averaging function: 
    Inputs: The data frame, and the variable that we would like to return the half-hourly values of
    Outputs: The half-hour averages.
    '''
    
    # Define empty lists to store our half-hour averages and their timestamps in

    halfhourvar = []
    halfhourts = []

    # Let's start at the start of the dataset
    tstart = data.TIMESTAMP[0]
    
    # Get the 'TIMESTAMP' and var columns from the dataframe.
    data = data[['TIMESTAMP', var]]

    # Now, we're going to loop through each half hour of the 24-hour period. 
    for nseg in np.arange(0,48):
        
        # The end of the half-hour period is 30 minutes after the start of the half-hour period
        tend = tstart + pandas.Timedelta(minutes=30)
        #print('Calculating averages for half hour between', str(tstart), ' and ', str(tend))

        # Make a mask for the 30 minute period between tstart and tend
        mask = (data['TIMESTAMP'] >= tstart) & (data['TIMESTAMP'] < tend)
        
        # Get the data over the masked period
        tinds = data.loc[mask].index
    
        # Now we're going to average all the values of var that match with the timestep indices that we've just found
        # The nanmean function means that any missing data ('NaNs, or Not-a-Numbers') are ignored in the average.
        halfhourvar.append(np.nanmean(data[var][np.squeeze(tinds)]))
        halfhourts.append(tstart + pandas.Timedelta(minutes=15))
        #print('half-hour average: ', np.nanmean(data[var][np.squeeze(tinds)]))
        tstart = tend
    
    # Swap a '_CSAT' suffix for 'hh' (e.g. 'Ux_CSAT' -> 'Uxhh'); names without '_CSAT', such as 'Theta', are kept unchanged
    new_name = var.replace('_CSAT', 'hh')
    # Return our half-hour averages in a pandas dataframe
    return pandas.DataFrame({'TIMESTAMP':halfhourts, new_name:halfhourvar})
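
As a cross-check on the loop above, pandas can also form block averages directly with `resample`. This is a sketch on synthetic data, not a replacement for the function in this notebook; the frame `demo` and the column `w` are made up for illustration.

```python
import numpy as np
import pandas as pd

# One hour of synthetic 10 Hz data: 1.0 in the first half hour, 3.0 in the second
ts = pd.date_range('2019-03-08 00:00:00', periods=36000, freq='100ms')
values = np.where(np.arange(36000) < 18000, 1.0, 3.0)
demo = pd.DataFrame({'TIMESTAMP': ts, 'w': values})
demo.loc[5, 'w'] = np.nan   # one missing value, to show that it is skipped

# Group the samples into 30-minute bins and average each bin (NaNs are ignored)
hh = demo.set_index('TIMESTAMP')['w'].resample('30min').mean()
```

One difference to keep in mind: `resample` labels each bin with its start time, whereas `do_half_hour_avs` labels each average with the mid-point of the half hour.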

Now it's time to try your half-hour averaging machine. We're going to give it the dataframe, and the variable that we want to create half-hour averages of.

In [17]:
Vertical_velocity_hh = do_half_hour_avs(df, var = 'Vertical_velocity')
Theta_hh = do_half_hour_avs(df, var = 'Theta')
Mixing_ratio_hh = do_half_hour_avs(df, var = 'Mixing_ratio')

# Make a plot of the original data and the half-hour averages, on the same axes, for ONE of the variables (your choice which one)



In [107]:
def calculate_anomalies(data, var, varhh): # <- don't forget the : after your function definition
    '''
    Anomaly calculation function: 
    Inputs: The data frame, the variable that we would like the anomalies of, and the half-hour averages of that variable
    Outputs: The anomalies.
    '''
    
    # Define an array to store the anomalies in, filled with NaN so that any
    # sample not covered by the loop below stays marked as missing

    anom = np.full(len(data[var]), np.nan)

    # Let's start at the start of the dataset
    tstart = data.TIMESTAMP[0]
    
    # Get the 'TIMESTAMP' and var columns from the dataframe.
    data = data[['TIMESTAMP', var]]

    # Now, we're going to loop through each half hour of the 24-hour period. 
    for nseg in np.arange(0,48):
        
        # The end of the half-hour period is 30 minutes after the start of the half-hour period
        tend = tstart + pandas.Timedelta(minutes=30)
        #print('Calculating averages for half hour between', str(tstart), ' and ', str(tend))

        # Make a mask for the 30 minute period between tstart and tend
        mask = (data['TIMESTAMP'] >= tstart) & (data['TIMESTAMP'] < tend)
        
        # Get the data over the masked period
        tinds = data.loc[mask].index
    
        # Now we're going to calculate the anomalies
        anom[tinds] = data[var][tinds] - varhh[nseg]
        tstart = tend
    
    
    # Return the anomalies
    return anom
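
The same anomalies can be obtained without an explicit loop by subtracting each half-hour block's mean with `groupby(...).transform('mean')`. This is a sketch under the assumption that the data are held in a time-indexed Series; the synthetic series `s` and the name `block` are illustrative, not part of this notebook.

```python
import numpy as np
import pandas as pd

# One hour of synthetic 10 Hz data with a different mean in each half hour
ts = pd.date_range('2019-03-08 00:00:00', periods=36000, freq='100ms')
rng = np.random.default_rng(0)
s = pd.Series(rng.normal(size=36000), index=ts)
s.iloc[18000:] += 5.0

# Label every sample with the start of the half hour it falls in
block = s.index.floor('30min')

# Anomaly = value minus the mean of its own half-hour block
anom = s - s.groupby(block).transform('mean')
```

By construction, the anomalies within each half hour average to zero, which is a useful sanity check for the loop version too.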

In [None]:
Theta_anom = calculate_anomalies(df, 'Theta', Theta_hh.Theta)
Mixing_ratio_anom = calculate_anomalies(df, 'Mixing_ratio', Mixing_ratio_hh.Mixing_ratio)
Vertical_velocity_anom = calculate_anomalies(df, 'Vertical_velocity', Vertical_velocity_hh.Vertical_velocity)

# Make a plot of the anomalies for one of the variables (your choice which one)

# Now it's time to calculate some fluxes! 

First, you have to multiply two of your anomaly terms together at every timestep:

$w' \theta'$ and $w' q'$

Next, you have to calculate the half-hour average of this. Add it to the original pandas data frame, and then use the half-hour averaging function. 

Finally, you have to apply a couple of constants to find the sensible and latent heat fluxes. It's OK to assume that density is a constant, $\rho = 1.25$ kg m$^{-3}$.

$Q_H = \rho c_p \overline{w' \theta'}$

$Q_E = \rho L_v \overline{w' q'}$


## Don't over-think this step. It's just a few lines of code. If you find yourself trying to do anything too complicated, you've probably missed the point. Ask for help!
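
To show how little is involved, here is the arithmetic on four made-up anomaly values standing in for a single averaging period. Only the density comes from the text above; the values used for $c_p$ and $L_v$ are typical ones that I have assumed, so use whatever the lectures specify.

```python
import numpy as np

RHO = 1.25      # air density in kg/m^3 (the constant given in the text)
C_P = 1005.0    # specific heat of air in J/(kg K), assumed typical value
L_V = 2.5e6     # latent heat of vaporisation in J/kg, assumed typical value

# Made-up anomalies for one averaging period
w_anom = np.array([0.1, -0.1, 0.1, -0.1])            # m/s
theta_anom = np.array([0.5, -0.5, 0.5, -0.5])        # K
q_anom = np.array([2e-4, -2e-4, 2e-4, -2e-4])        # kg/kg

# Step 1 and 2: multiply the anomalies, then average the products
w_theta = np.nanmean(w_anom * theta_anom)
w_q = np.nanmean(w_anom * q_anom)

# Step 3: apply the constants to get fluxes in W/m^2
Q_H = RHO * C_P * w_theta
Q_E = RHO * L_V * w_q
```

With the real data, the averaging in the middle step is done separately for each half hour, which is what the half-hour averaging function is for.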

In [134]:
# Now plot your sensible and latent heat flux, and answer the questions. 

# Part 2: Surface Energy Balance

In [21]:
# Load the fluxes. Each one loads as an array of size 365 x 48. That is one row for every day of the year, and one
# column for every half-hour period of the day.

SWD = pandas.read_csv('/home/workbooks/fluxes_week8-9/SWD.csv', header=None).values
LWD = pandas.read_csv('/home/workbooks/fluxes_week8-9/LWD.csv', header=None).values
SWU = pandas.read_csv('/home/workbooks/fluxes_week8-9/SWU.csv', header=None).values
LWU = pandas.read_csv('/home/workbooks/fluxes_week8-9/LWU.csv', header=None).values
QH = pandas.read_csv('/home/workbooks/fluxes_week8-9/QH.csv', header=None).values
QE = pandas.read_csv('/home/workbooks/fluxes_week8-9/QE.csv', header=None).values
QG = pandas.read_csv('/home/workbooks/fluxes_week8-9/QG.csv', header=None).values
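
With arrays of this shape, the average daily cycle and a single day are both one line of indexing. The sketch below uses a random stand-in array rather than the real files; `flux`, `hours`, `mean_day` and `one_day` are illustrative names.

```python
import numpy as np

# Random stand-in with the same shape as the loaded arrays (365 days x 48 half hours)
rng = np.random.default_rng(1)
flux = rng.normal(loc=100.0, scale=20.0, size=(365, 48))

# Time axis in hours, at the mid-point of each half-hour period
hours = (np.arange(48) + 0.5) * 0.5

# Average over all days (axis 0) to get the mean daily cycle, ignoring NaNs
mean_day = np.nanmean(flux, axis=0)

# Pick out a single day: row index 66 is the 67th row of the array
one_day = flux[66, :]
```

Which calendar date a given row corresponds to depends on how the files were built, so check that before naming the day you chose.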

In [None]:
# Plot all seven terms of the surface energy balance on one plot. 

In [None]:
# Now calculate Q* from (a) the sum of the radiative fluxes and (b) the sum of the sensible, latent and ground heat fluxes. 
# Plot the two calculations of Q* on the same axes. Careful with signs! 
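
A sketch of the two calculations on constant, made-up arrays is given below. It assumes that all four radiation terms are stored as positive magnitudes and that $Q_H$, $Q_E$ and $Q_G$ are positive when directed away from the surface. That sign convention is my assumption, so check it against the data before copying the signs. The numbers were picked so that the budget closes exactly, which real measurements will generally not do.

```python
import numpy as np

shape = (365, 48)

# Made-up constant values in W/m^2, standing in for the loaded arrays
SWD_demo = np.full(shape, 400.0)
SWU_demo = np.full(shape, 60.0)
LWD_demo = np.full(shape, 330.0)
LWU_demo = np.full(shape, 400.0)
QH_demo = np.full(shape, 120.0)
QE_demo = np.full(shape, 130.0)
QG_demo = np.full(shape, 20.0)

# (a) Net radiation from the radiative terms: incoming minus outgoing
qstar_rad = (SWD_demo - SWU_demo) + (LWD_demo - LWU_demo)

# (b) Net radiation implied by the sensible, latent and ground heat fluxes
qstar_turb = QH_demo + QE_demo + QG_demo

# Mean imbalance for each half hour of the day, averaged over the year
residual = np.nanmean(qstar_rad - qstar_turb, axis=0)
```

Plotting the year-mean of `qstar_rad` and `qstar_turb` against time of day, together with `residual`, is one way to judge how well the budget closes.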