# Analysis of observed evapotranspiration Hupsel 2011 (step 1)

## Intro
This exercise is a first step in a series of four. The final objective is to determine the actual evapotranspiration of the Hupsel catchment in the first weeks of May 2022.

Since we do not have direct observations of the current surface fluxes (from eddy covariance, lysimeters or scintillometry), we split the process into two major steps (left and right in the figure below):
a. understand how the ET in the Hupsel catchment responds to external forcings, based on historical data from Hupsel and elsewhere;
b. use that process understanding to make the best possible estimate of the ET of the Hupsel catchment in the past weeks.

The current land use in the Hupsel catchment can be simplified as a mixture of grass and bare soil (the maize is just emerging). Today we focus on understanding the response of grass and bare soil to external forcings (steps 1 and 2). In the second practical we will finish steps 3 and 4.

The concept of reference evapotranspiration is dealt with extensively in the [book used for Atmosphere Vegetation Soil Interactions](https://www-cambridge-org.ezproxy.library.wur.nl/core/books/transport-in-the-atmospherevegetationsoil-continuum/5944F8B7ADAC6409AD4575642431B2DC) (chapters 7 and 8). A summary of the [most essential concepts](reference_ET_concept.pdf) is also available.

Collect your answers in the [answer sheet](Actual_ET_1-answer-sheet.docx).

<img src="analysis-overview.png" width="80%">

## The method
The logic of this practical is that reference ET is supposed to contain all main meteorological drivers of evapotranspiration. By comparing the reference ET with the observed actual ET (based on eddy-covariance measurements) we can find out if there are additional external factors that need to be taken into account. 

If all relevant information were contained in reference ET, the crop factor (ET<sub>act</sub> / ET<sub>ref</sub>) would be constant. If it varies in time, that is an indication that additional factors play a role.

In the four steps sketched above we will focus on daily mean data (i.e. data that have been averaged over 24 hours).

## The data
To understand the response of grass ET to external forcings we will make use of flux observations obtained in April and May 2011. We choose this year because its spring was about as dry as that of the current year, as you can see in the graphs below (the red dot indicates where we are now).
We will make use of two datasets:
* the standard data obtained routinely by KNMI 
* the additional flux observations made by the WUR chairgroup MAQ in the context of the Hupsel practical

<img src="precip_deficit_2011_2022.png" width="80%">

## Initialize Python stuff and read the data
Please run the cell below by selecting it and pressing Shift+Enter, or by pressing the Run button in the toolbar at the top of the screen (with the right-pointing triangle).

In [None]:
# Load some necessary Python modules
import pandas as pd # Pandas is a library for data analysis
import numpy as np # Numpy is a library for processing multi-dimensional datasets
from hupsel_helper import myplot, myreadfile
from hupsel_helper import check_Lv, check_esat, check_s, check_gamma, check_makkink

Now read the data from the Excel file.

In [None]:
# File name
fname='Hupsel2011_MeteoData.xlsx'

# Get the data
df = myreadfile(fname)

## Explore the data
### Information available in the dataframe
Before you start making computations with the data it is wise to first explore the data. 

To show the names of the available variables, type `df.keys()` in the cell below (and run, or press Shift+Enter). Based on the names you can also make the distinction between the two parts of the dataset:
* KNMI data: the first range of variables (without a `_m` in the name) 
* MAQ data: variables of which the name ends with `_m`. For the current exercise we will only use `LvE_m`.

The dataframe also contains information about the units of the variables: type `df.attrs['units']` in the cell below. You can also access the units of an individual variable as follows: `df.attrs['units']['u_10']` should give `[m/s]`. Finally, the dataframe also contains a more complete description of the variables: `df.attrs['description']`.

There are a number of ways to inspect the data:
* print the full dataframe in a cell (simply type `df` and run the cell)
* print a single variable from the dataframe (type for instance `df['K_in']` to show the values of global radiation)
* plot the data with the plot command `myplot`

More details about the plotting command can be found in the *Step-0* Python notebook.
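
The inspection commands listed above can be tried out on a small stand-in dataframe. The sketch below is only an illustration: the values, the unit strings and the date index are made up, and the assumption is that the dataframe returned by `myreadfile` stores its metadata in `attrs` in the same way.

```python
import pandas as pd

# Stand-in for the dataframe returned by myreadfile (all values are made up)
demo = pd.DataFrame(
    {"K_in": [210.0, 185.0, 250.0], "T_1_5": [12.3, 14.1, 16.8]},
    index=pd.date_range("2011-04-20", periods=3, freq="D"),
)
# Hypothetical metadata, mimicking df.attrs['units'] and df.attrs['description']
demo.attrs["units"] = {"K_in": "[W/m2]", "T_1_5": "[degC]"}
demo.attrs["description"] = {
    "K_in": "global radiation",
    "T_1_5": "air temperature at 1.5 m",
}

# Names of the available variables
print(list(demo.keys()))

# Units and description of a single variable
print(demo.attrs["units"]["K_in"], demo.attrs["description"]["K_in"])

# Quick statistical summary of one variable (count, mean, min, max, ...)
print(demo["T_1_5"].describe())
```

A summary such as `describe()` is a quick way to spot values that cannot be right for the stated unit before you start plotting.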

### <span style='background:lightblue'>Question 1</span>
Characterize the weather conditions during the period in which the data were gathered. Do this in very broad terms: do not study individual days, but describe periods, e.g. 'in the first 5 days the weather was sunny and dry'.
Use some of the commands in the cell below to get an idea of the conditions during the 3 weeks of observations.

## Determine the reference evapotranspiration
For reasons of simplicity and robustness, we will use the Makkink equation to determine the reference ET. The essential equations can be found in the [Formularium of Atmosphere-Vegetation-Soil Interactions](Forumularium_AVSI_2021.pdf). 

###  Define some functions
In order to determine the reference ET in mm/day a number of ingredients are needed:
* L<sub>v</sub> is needed to convert the latent heat flux from energy flux units (W/m<sup>2</sup>) into a water flux in mm/day
* s is the slope of the saturated vapour pressure as a function of temperature
* gamma: the psychrometer constant

For each of these ingredients you need to construct a function. Below we provide the skeleton for these functions. The name of the function always starts with `f_` just to indicate that it is a function, not a variable (this is an arbitrary choice that I made, not a necessity).

Some basics about how functions work are given in the *Step-0* notebook.

### <span style='background:lightblue'>Question 2</span>
Complete the function skeletons below and check whether they are correct:
* edit the function such that the values of the constants and the structure of the equation (following 'result') are correct
* check your function with the appropriate check function. For the function `f_Lv` that would be `check_Lv`. As an argument you pass your own function. So in the cell below the function definition you type `check_Lv(f_Lv)` and press Shift+Enter. Note that this check only looks for programming errors; it **does not check for errors in units**.
* check that the function produces reasonable results if you feed it with your (reasonable) observations (e.g. by making a plot or printing the number). In particular: **check that the units of the data** that you supply to the function are correct!
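
Why the unit check matters can be illustrated with the complete example function `f_Lv`. The sketch below repeats that function so that it runs on its own and feeds it one made-up temperature of 20 degrees Celsius, once converted to Kelvin and once by mistake in Celsius.

```python
def f_Lv(T):
    """Latent heat of vaporization (J/kg) for a temperature T in Kelvin."""
    c1 = 2501000
    c2 = 0.00095
    c3 = 273.15
    return c1 * (1 - c2 * (T - c3))

T_celsius = 20.0  # made-up test value

# Correct use: convert to Kelvin first
Lv_right = f_Lv(T_celsius + 273.15)

# Incorrect use: the function silently accepts Celsius and returns a wrong number
Lv_wrong = f_Lv(T_celsius)

print(f"Input in Kelvin : {Lv_right:.3e} J/kg")
print(f"Input in Celsius: {Lv_wrong:.3e} J/kg")
```

The first number is close to 2.45 MJ/kg, the second is above 3 MJ/kg. Neither call raises an error, so only a plausibility check of the result reveals the mistake.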

In [None]:
# Function to compute latent heat of vaporization
# Input
#    T     : temperature (Kelvin)
# Output
#    Lv    : latent heat of vaporization (J/kg)
#
# This function is complete and functioning as an example
# See section 7.1 of the AVSI formularium or table B.3 in Moene & van Dam (2014)
def f_Lv(T):
    # Define constants
    c1 = 2501000
    c2 = 0.00095
    c3 = 273.15
    
    # Compute the result
    result =  c1*(1 - c2*(T - c3))
    
    return result    

In [None]:
# Check that the formulation of your function is correct
check_Lv(f_Lv)
# Check that the function produces sensible numbers when you feed it with your observations
# In this case we need to make sure that the temperature that we supply is in Kelvin (the data have temperature in Celsius)
print('Units of temperature are ', df.attrs['units']['T_1_5'])
f_Lv(df['T_1_5']+273.15)

In [None]:
# Function to compute saturated vapour pressure (over water)
# Input
#    T     : temperature (Kelvin)
# Output
#    esat  : saturated vapour pressure (Pa)
#
# See section 7.1 of the AVSI formularium or table B.3 in Moene & van Dam (2014)
def f_esat(T):
    # Define constants (check the values, the zeros are certainly wrong)
    c1 = 611.2
    c2 = 0
    c3 = 0
    c4 = 0
      
    # Compute the result (the structure of the equation is correct)
    result = c1*np.exp((c2*(T-c3))/(-c4+T))
    
    return result

In [None]:
# Do not forget to check that the formulation of your function is correct
check_esat(f_esat)
# Do not forget to check the resulting values (mind the units)!


In [None]:
# Function to compute slope of the saturated vapour pressure in Pa/K
# Input
#    T     : temperature (Kelvin)
# Output
#    s     : slope of saturated vapour pressure versus temperature (d esat / dT)(Pa/K)
#
# See section 7.1 of the AVSI formularium or table B.3 in Moene & van Dam (2014)
def f_s(T):
    # Define constants (check the values, the zeros are certainly wrong)
    c1 = 0
    c2 = 0

    # Compute the result (complete the formula)
    # Note that taking the exponent in Python is done with ** (not ^, as in Excel)
    # (so, e.g. x squared is computed as x**2)
    result = f_esat(T)*0
    
    return result

In [None]:
# Do not forget to check that the formulation of your function is correct

# Do not forget to check the resulting values (mind the units)!


In [None]:
# Function to compute the psychrometer constant
# Input
#    T     : temperature (Kelvin)
#    p     : pressure (Pa)
#    q     : specific humidity (kg/kg)
# Output
#    gamma : psychrometer constant (Pa/K)
#
# See section 7.1 of the AVSI formularium or table B.3 in Moene & van Dam (2014)
def f_gamma(T, p, q):
    # Define constants (check the values, the zeros are certainly wrong)
    c1 = 65.5
    c2 = 0
    c3 = 0
    c4 = 0
    c5 = 0

    # Compute the result (complete the formula)
    # An alternative to implementing the equation given in the formularium would be to implement the 
    # definition of gamma as given in equation B.23 in Moene & van Dam (2014)
    result = c1*0

    return result   

In [None]:
# Do not forget to check that the formulation of your function is correct

# Do not forget to check the resulting values (mind the units, in particular pressure)!


In [None]:
# Function to compute reference evapotranspiration according to Makkink
# Input
#    K_in  : global radiation (W/m2)
#    T     : temperature (Kelvin)
#    p     : pressure (Pa)
#    q     : specific humidity (kg/kg)
# Output
#    LvEref: reference evapotranspiration according to Makkink (W/m2)
#
# See section 7.7 of the AVSI formularium or chapter 7 in Moene & van Dam (2014)
# Please note what is the unit of the resulting number !
def f_makkink(K_in, T, p, q):
    # First supply the commands that compute s and gamma from the data
    gamma = f_gamma(T, p, q)
    s = f_s(T)
    
    # Now construct the Makkink equation (i.e. replace the '0' by the correct equation)
    # What is the unit?
    result  = 0
    
    return result

In [None]:
# Do not forget to check that the formulation of your function is correct
check_makkink(f_makkink)
# Do not forget to check the resulting values (mind the units, in particular pressure)!


## Compute the reference evapotranspiration for the data set
Now that you have the basic equations ready, it is time to compute the reference evapotranspiration for this data set. For this you can use the `f_makkink` function that you constructed above. The data that you need to feed to that function are contained in the dataframe `df` that we read from the data file. You can obtain each variable from the dataframe by its name, e.g. `df['K_in']`.
You can program the computation of the reference ET in two ways (it is up to you to fill in the dots):

* Supply the input to the function directly from the data frame:

    `LvEref = f_makkink(df['K_in'], ....)` 


* Alternatively, you could first make separate variables for K_in, T etc. and use those in the function:

    `K_in = df['K_in']`

    `T = df['T_1_5']`

    ....

    `LvEref = f_makkink(K_in, T, ....)`

Once you have made this computation, you have a variable `LvEref` that contains the outcomes of the computation made by the function `f_makkink`. If you like, you can add these data to the dataframe `df` to have everything in one place: `df['LvEref'] = LvEref` (note that you could also have called this variable `LvEmakkink`: `df['LvEmakkink'] = LvEref`).
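
The two calling styles and the step of storing the result can be tried out with the complete example function `f_Lv`, since `f_makkink` still has to be filled in. This is a minimal sketch on a made-up stand-in dataframe. The name `demo` and its temperature values are assumptions for illustration only.

```python
import pandas as pd

def f_Lv(T):
    """Latent heat of vaporization (J/kg) for a temperature T in Kelvin."""
    return 2501000 * (1 - 0.00095 * (T - 273.15))

# Stand-in dataframe with made-up temperatures in degrees Celsius
demo = pd.DataFrame({"T_1_5": [10.0, 15.0, 20.0]})

# Style 1: supply the input directly from the dataframe
Lv_direct = f_Lv(demo["T_1_5"] + 273.15)

# Style 2: first make a separate variable, then call the function
T = demo["T_1_5"] + 273.15
Lv_via_variable = f_Lv(T)

# Both styles give the same result; store it as a new column
demo["Lv"] = Lv_direct
print(demo)
```

Because the function works on a whole column at once, the result is again a column with one value per row, which is why it can be stored in the dataframe directly.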

### <span style='background:lightblue'>Question 3</span>
Compute the reference evapotranspiration in mm/day based on the current data set (check the unit of the flux you computed with your `f_makkink` function).

## Determine the actual evapotranspiration
The actual evapotranspiration has been measured using the eddy-covariance technique. It is available in the data set in the variable named `LvE_m`. First check what the units of this quantity are (use the units attribute for this). 

### <span style='background:lightblue'>Question 4</span>
Compute the actual evapotranspiration in mm/day based on the eddy-covariance fluxes. Make sure that you really understand the unit conversion that is needed for this (check the [book used for Atmosphere Vegetation Soil Interactions](https://www-cambridge-org.ezproxy.library.wur.nl/core/books/transport-in-the-atmospherevegetationsoil-continuum/5944F8B7ADAC6409AD4575642431B2DC), section 8.1.1). Please do not use any short-cuts here: the density of liquid water is *not* 1000 kg m<sup>-3</sup> (check for instance this [data page on Wikipedia](https://en.wikipedia.org/wiki/Water_(data_page))).
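
The chain of unit conversions can be sketched as follows. This is an illustration under stated assumptions, not the prescribed solution: the function name `flux_to_mm_per_day` is hypothetical, and the numbers in the usage example (a flux of 100 W/m<sup>2</sup>, L<sub>v</sub> of 2.45 MJ/kg, a water density of 998.2 kg m<sup>-3</sup> as found near 20 °C) are illustrative values that you should replace by values appropriate for your data.

```python
def flux_to_mm_per_day(LvE, Lv, rho_w):
    """Convert a latent heat flux to a water depth per day.

    LvE   : latent heat flux (W/m2 = J s-1 m-2)
    Lv    : latent heat of vaporization (J/kg)
    rho_w : density of liquid water (kg/m3), to be supplied by the user
    """
    E_mass = LvE / Lv          # mass flux of water (kg m-2 s-1)
    E_depth = E_mass / rho_w   # depth of liquid water per unit time (m/s)
    mm_per_m = 1000.0
    s_per_day = 86400.0
    return E_depth * mm_per_m * s_per_day

# Illustrative numbers only (not Hupsel observations)
et_demo = flux_to_mm_per_day(LvE=100.0, Lv=2.45e6, rho_w=998.2)
print(f"{et_demo:.2f} mm/day")
```

Keeping the density as an explicit argument makes it visible in the code which value was used, instead of hiding a rounded 1000 kg m<sup>-3</sup> in the conversion.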

## Compare actual and reference evapotranspiration
Now that you have both actual and reference evapotranspiration available, it is time to compare them. The first step would be to plot both in one graph.

### <span style='background:lightblue'>Question 5</span>
How do actual and reference evapotranspiration compare? Are they identical, is there a fixed offset, or does the difference vary over time? If so, can you relate those differences to specific conditions?

Part of the variability in the actual evapotranspiration is related to variations in meteorological conditions. Those variations are supposed to be captured by the reference evapotranspiration. A straightforward way to see to what extent the reference evapotranspiration indeed captures the variability of the actual evapotranspiration is to compute the crop factor (ET<sub>act</sub> / ET<sub>ref</sub>). If the resulting crop factor is *not* constant in time, apparently other things are happening as well:
* perhaps not *all* relevant meteorological variation is captured in the reference ET method you used
* perhaps ET is limited not only by the energy supply, but by other things as well (most notably the water in the soil, and the availability of a route to get water from the soil into the atmosphere, i.e. plants)
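
How a crop factor series is obtained from two ET series can be sketched with a few made-up numbers. The dataframe `et`, its column names and its values are assumptions for illustration and have nothing to do with the Hupsel observations.

```python
import pandas as pd

# Made-up daily values in mm/day, for illustration only
et = pd.DataFrame(
    {"ET_act": [2.0, 2.4, 1.8, 1.5], "ET_ref": [2.5, 3.0, 3.0, 3.0]},
    index=pd.date_range("2011-05-01", periods=4, freq="D"),
)

# Crop factor: ratio of actual to reference evapotranspiration, per day
et["crop_factor"] = et["ET_act"] / et["ET_ref"]

print(et)
print("mean crop factor:", et["crop_factor"].mean())
print("range           :", et["crop_factor"].min(), "to", et["crop_factor"].max())
```

In this made-up example the reference ET stays the same over the last three days while the actual ET drops, so the crop factor decreases. A pattern like that would point to a limitation that is not contained in the reference ET.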

### <span style='background:lightblue'>Question 6</span>
Compute the crop factor for the current data. What is the overall magnitude of the crop factor? Is the crop factor constant over time, and if not, can you explain the variations (or at least bring forward a hypothesis)? What we need, in the end, is some sort of look-up table that provides you with a value for the crop factor, given certain conditions.

## Conclusion
You have taken the first step towards estimating the actual evapotranspiration of the Hupsel catchment in May 2022: you now know how, for a year like this, the grass in the catchment reacts to the external meteorological forcings (expressed in the reference evapotranspiration).

## Up to the next exercise
Apart from the new insights you obtained, you also developed a number of functions that you need to compute the reference evapotranspiration. For the next exercise we will make those functions available for you.