# Determination of actual evapotranspiration Hupsel 2022 (step 3+4)

## Intro
The first exercise of today completes the steps you made before: here you are going to determine the actual evapotranspiration of the Hupsel catchment in the first weeks of May 2022. A subset of this result will be used in the determination of the Hupsel water balance (for May 1-19).

Today you will use the  process understanding that you obtained in step 1 and 2 to make the best possible estimate of the ET of the Hupsel catchment.

Collect your answers in the <a href="Actual_ET_3-answer-sheet.docx" download>answer sheet</a>.


<img src="analysis-overview.png" width="80%">

## The method
In the first two steps you have investigated how the actual evapo(transpi)ration of grass and bare soil reacts to the weather. In doing so, we condensed 'the weather' into a single variable: the reference evapotranspiration according to the Makkink method.

Your conclusion was that for some weather conditions and surface types ET<sub>act</sub> was larger than ET<sub>ref</sub>, for other moments it was lower. So a fixed 'crop factor' for a given surface is not enough to come to a good estimate of ET<sub>act</sub>. But based on the work you did before, you should be able to translate your knowledge about the variation of the crop factor into a new dataset: the data of May 2022.

The steps to take are:
* Determine ET<sub>ref</sub> for May 2022
* Determine appropriate day-to-day 'crop factors' (based on your insights from step 1 and 2) for
  * grass
  * bare soil
* Apply those crop factors to determine a time series of ET<sub>act</sub> for the dominant land-use types in the Hupsel catchment for May 2022
  * grass
  * bare soil
* Determine the day-to-day values for the catchment-mean ET<sub>act</sub>

As before, in these final steps we will focus on daily mean data (i.e. data that have been averaged over 24 hours).
As a final reminder, [supporting documentation on reference ET methods](reference_ET_concept.pdf) is available.

## The data
The 2022 dataset contains data similar to the 2011 data from Hupsel that you analyzed in step 1. However, the MAQ observations (radiation and turbulent fluxes) are missing.

## Initialize Python stuff and read the data
Please run the cell below by selecting it and pressing Shift+Enter, or press the Run button in the toolbar at the top of the screen (the right-pointing triangle).

In [None]:
# Load some necessary Python modules
import pandas as pd # Pandas is a library for data analysis
pd.set_option("mode.chained_assignment", None)
import numpy as np # Numpy is a library for processing multi-dimensional datasets
from hupsel_helper import myplot, myreadfile
from hupsel_helper import f_Lv, f_esat, f_s, f_gamma, f_makkink, check_crop_factor, check_ET, f_days_since_rain

Now read the data for the current year from the Excel file.

In [None]:
# File name: this is a different file than the one you worked on before
fname='Hupsel2022_MeteoData.xlsx'

# Get the data
df = myreadfile(fname)

## Explore the data
### Information available in the dataframe
Before you start making computations with the data it is wise to first explore the data.

Just as in the previous notebooks, you can obtain additional information:
* `df.keys()` gives the available variables
* `df.attrs['units']` gives information about the units
* `df.attrs['description']` gives a more complete description of the variables
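To make these three commands concrete, here is a minimal sketch on a small stand-in dataframe. The values and the contents of `attrs` are made up for illustration, and it is an assumption that units and descriptions are stored as dictionaries keyed by variable name; the real contents come from `myreadfile`.

```python
import pandas as pd

# Stand-in dataframe (made-up values, not the Hupsel data)
demo = pd.DataFrame({"K_in": [210.0, 95.0], "T_1_5": [14.2, 11.8]})

# Assumption: units and descriptions are dictionaries keyed by variable name
demo.attrs["units"] = {"K_in": "W/m2", "T_1_5": "degC"}
demo.attrs["description"] = {
    "K_in": "global radiation",
    "T_1_5": "air temperature at 1.5 m",
}

print(list(demo.keys()))              # available variables
for name in demo.keys():              # one line of information per variable
    print(name, "|", demo.attrs["units"][name], "|", demo.attrs["description"][name])
```

Checking the units in this way before any computation helps to avoid unit errors later on.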

### Inspect the data
Just as in the previous notebooks, there are a number of ways to inspect the data:
* print the full dataframe in a cell (simply type `df` and run the cell)
* print a single variable from the dataframe (type for instance `df['K_in']` to show the values of global radiation)
* plot a combination of variables with the plot command `myplot` (for documentation of the function, type `help(myplot)` or see the previous notebook)

### <span style='background:lightblue'>Question 1</span>
Characterize the weather conditions during the period in which the data were gathered. Now do this on a day-to-day basis. Your analysis should be sufficiently detailed so that you will be able to assign an appropriate 'crop factor' for grass and bare soil for each day.

## Determine the reference evapotranspiration
By now, this step should be relatively straightforward. Just as in the previous steps, and for consistency, we use the Makkink equation to determine the reference ET. The essential equations can be found in the [Formularium of Atmosphere-Vegetation-Soil Interactions](Forumularium_AVSI_2021.pdf).

In the practical for step 1 you developed a number of functions. Those are now directly available to you:
* `f_Lv(T)`: compute latent heat of vaporization from temperature (in K)
* `f_esat(T)`: compute saturated vapour pressure from temperature (in K)
* `f_s(T)`: compute the slope of the saturated vapour pressure as a function of temperature (in K)
* `f_gamma(T, p, q)`: compute the psychrometer constant from temperature (K), pressure (Pa) and specific humidity (kg/kg)
* `f_makkink(Kin, T, p, q)`: compute reference evapotranspiration according to the Makkink equation.
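As a reminder of how these pieces fit together, the sketch below strings them into one stand-alone function. It is not the code of `f_makkink`: the name `makkink_sketch` is hypothetical, the fits for latent heat and saturated vapour pressure are common textbook approximations that may differ from those in the Formularium, and the humidity dependence of the psychrometer constant is neglected (so there is no `q` argument).

```python
import numpy as np

def makkink_sketch(Kin, T, p):
    """Reference ET (mm/day) from daily mean Kin (W/m2), T (K) and p (Pa)."""
    Tc = T - 273.15                                     # temperature in degree Celsius
    Lv = 2.501e6 - 2375.0 * Tc                          # latent heat of vaporization (J/kg), assumed linear fit
    esat = 610.8 * np.exp(17.27 * Tc / (Tc + 237.3))    # saturated vapour pressure (Pa), Tetens-type fit
    s = 4098.0 * esat / (Tc + 237.3) ** 2               # slope of esat(T) (Pa/K)
    gamma = 1004.0 * p / (0.622 * Lv)                   # psychrometer constant (Pa/K), humidity effect neglected
    LvE = 0.65 * s / (s + gamma) * Kin                  # Makkink latent heat flux (W/m2)
    return LvE / Lv * 86400.0                           # kg/m2/s -> mm/day

print(makkink_sketch(Kin=200.0, T=288.15, p=101325.0))
```

For a day with a mean global radiation of 200 W/m2 and a temperature of 15 degrees Celsius this gives roughly 2.8 mm/day. The last line of the function is the unit conversion: dividing the flux by the latent heat gives kg/m2/s, and multiplying by the number of seconds in a day gives mm/day.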

### <span style='background:lightblue'>Question 2</span>
Determine the reference evapotranspiration for the **2022** dataset in mm/day (as always: check units of your data). Explore its variation with time, and link that to the variations you observe in the meteorological conditions (discussed in question 1). But start with checking if your values are reasonable using the `check_ET` function (give it the time series of your ETref as an argument, e.g. `check_ET(ETref)`).

In [None]:
# Check your ET values
# check_ET( )

## Determine appropriate day-to-day 'crop factors'  for grass and bare soil

The next step is to define time series for the 'crop factors' for grass and bare soil. Depending on how you construct those time series, it can be handy to start with an empty array. To ensure that it has the same length as our data set, we can use:
```
k_soil = np.empty(len(df), dtype=float)
k_soil[:] = np.nan
```
which makes an empty array that is as long as our data frame `df` and can store floats (real values, as opposed to integer values). Subsequently we store not-a-number values in all elements of the array (indicated with `[:]`). In the cell below, create the crop factor arrays for grass and bare soil.

In [None]:
k_soil = np.empty(len(df), dtype=float)
k_soil[:] = np.nan
k_grass = np.empty(len(df), dtype=float)
k_grass[:] = np.nan
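As an aside, the two-step pattern (make an empty array, then fill it with not-a-number) can also be written as a single call to `np.full`. The sketch below compares both on a stand-in length `n`, which here takes the place of `len(df)`.

```python
import numpy as np

n = 5  # stand-in for len(df)

# Two-step version: allocate, then fill with not-a-number
a = np.empty(n, dtype=float)
a[:] = np.nan

# One-step version
b = np.full(n, np.nan)

# equal_nan=True makes not-a-number compare equal to not-a-number
print(np.array_equal(a, b, equal_nan=True))
```

Both versions give the same result, so which one you use is a matter of taste.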

### Python: fill the new variables, possibly based on existing ones
There are a number of ways in which you can fill the newly constructed columns for the 'crop factor'. For this it may be helpful to know that there is a short-hand way to address a column in a dataframe. You were used to write `df['K_in']` to get the global radiation from the data frame. But `df.K_in` also works. In the examples below we will mostly use that notation.
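The short-hand notation has two limitations that are worth knowing. The sketch below shows them on a made-up dataframe; the column name `T` and the new column `k_demo` are chosen only for this illustration.

```python
import pandas as pd

demo = pd.DataFrame({"K_in": [210.0, 95.0, 150.0], "T": [14.2, 11.8, 13.0]})

# Both notations return the same column
same = demo["K_in"].equals(demo.K_in)

# Limitation 1: a column named 'T' is hidden by the dataframe attribute .T (the transpose)
transposed = demo.T        # a transposed dataframe, not the temperature column
temperature = demo["T"]    # the bracket notation always gives the column

# Limitation 2: a new column has to be created with the bracket notation
demo["k_demo"] = 1.0

print(same, transposed.shape, list(demo.columns))
```

In short: the dot notation is convenient for reading existing columns, while the bracket notation is the safe choice for creating columns and for names that clash with dataframe attributes.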

#### Assign the values day-by-day
You can access a certain element in the variable directly by selecting it with `[row_number]`, e.g.:
```
k_soil[2] = 1.2
```
will put a value of `1.2` in the 3rd element of `k_soil` (note that Python starts counting at zero).
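Besides single elements, you can also assign a value to a range of days at once with a slice. A minimal sketch on a stand-in array of six days (the values are arbitrary placeholders):

```python
import numpy as np

k_demo = np.full(6, np.nan)   # stand-in for a crop-factor array of 6 days

k_demo[0] = 1.0               # first day
k_demo[1:4] = 0.8             # days with index 1, 2 and 3 (the end index 4 is excluded)
k_demo[-1] = 0.5              # last day

print(k_demo)
```

After these assignments only the element with index 4 is still not-a-number, which is an easy way to see which days you have not yet treated.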

#### Assign values based on conditions on other variables
Suppose that you want to make your 'crop factor' dependent on the air temperature. You could then fill it with the `np.where` function. The logic of `where` is: `where(condition, if_true, if_false)` returns a new array filled with `if_true` at those locations where the `condition` is true, and with `if_false` where it is false. The following expression fills `var1` with the value `1.0` for days on which the temperature is above 10 degrees Celsius, and with `0.8` otherwise.
```
var1 = np.where( (df.T_1_5 > 10.0), 1.0, 0.8)
```
You can also combine conditions:
* if both conditions should be true: `condition1 & condition2`
* if at least one of the conditions should be true: `condition1 | condition2`

E.g. you could require that the temperature should be above 10 degrees Celsius, and the relative humidity should be below 80%:
```
var3 = np.where( ( (df.T_1_5 > 10.0) & (df.RH_1_5 < 80.0) ), 1.0, 0.3)
```
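To see what the two ways of combining conditions do, here is a small sketch on four made-up days. The dataframe `demo` and its values are invented for illustration; only the column names follow the data set.

```python
import numpy as np
import pandas as pd

demo = pd.DataFrame({
    "T_1_5":  [8.0, 12.0, 15.0, 9.0],
    "RH_1_5": [70.0, 90.0, 60.0, 85.0],
})

warm = demo.T_1_5 > 10.0     # False, True, True, False
dry = demo.RH_1_5 < 80.0     # True, False, True, False

both = np.where(warm & dry, 1.0, 0.3)      # warm AND dry: only the third day
either = np.where(warm | dry, 1.0, 0.3)    # warm OR dry: all but the last day

print(both)
print(either)
```

Note that each comparison in the examples above is wrapped in its own pair of parentheses. This is needed because `&` and `|` bind more tightly than `>` and `<` in Python; storing the conditions in separate variables, as done here, avoids the issue.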
In the cell below you can test some conditions (use dummy variables for this so that you do not cause any damage to your real crop factor variables).

In [None]:
# Test with filling the array point-by-point
var1 = np.empty(len(df), dtype=float)
var1[:] = np.nan
var1[1] = 1.2

# Test with filling the array using a condition
# Conditions can be:
# a > b: a is larger than b
# a < b: a is smaller than b
# a == b: a is equal to b (NB: note the double '=')
# a != b: a is not equal to b
#
## One condition (np.where returns a new array, so no pre-allocation is needed)
var2 = np.where( (df.T_1_5 > 10.0), 9.0, 10.0)
## Two conditions
var3 = np.where( ( (df.T_1_5 > 10.0) & (df.RH_1_5 < 80.0) ), 1.0, 0.3)
## One condition, and if false, keep the old value (so var4 must exist beforehand)
var4 = np.empty(len(df), dtype=float)
var4[:] = np.nan
var4 = np.where( (df.T_1_5 > 10.0), 9.0, var4)

#print (var1, var2, var3, var4)
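If you want to distinguish more than two classes, nesting several `np.where` calls quickly becomes hard to read. One alternative is NumPy's `np.select`, sketched below on made-up temperatures. The thresholds and values are placeholders, not recommended crop factors.

```python
import numpy as np
import pandas as pd

demo = pd.DataFrame({"T_1_5": [6.0, 12.0, 18.0, 9.0]})

# The conditions are checked in this order; the first one that holds wins
conditions = [demo.T_1_5 < 8.0, demo.T_1_5 < 15.0]
values = [0.6, 0.8]                                   # placeholder values

# default is used for days on which none of the conditions holds
var5 = np.select(conditions, values, default=1.0)

print(var5)
```

Because the order matters, the second condition effectively means 'between 8 and 15 degrees': colder days have already been captured by the first condition.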

### Back to the exercise: construct the day-to-day crop factor for grass
Now construct a time series of the 'crop factor' for grass for the time period of the current dataset. You should base that on the work you've done in Step 1 (last week) and your insights on the weather in the current dataset (based on your exploration at the start of this notebook). 
As rainfall may be an important parameter, we have made an additional function available: `f_days_since_rain`. The following command:
```
ndays_since_rain = f_days_since_rain(precipitation, threshold = 0.1)
```
returns an array that gives, for each day, the number of days since it last rained more than `threshold` mm/day. On rainy days the value is zero. With the value of `threshold` you decide what you consider a 'rainy' day. You will likely take the values for `precipitation` from the data frame: `df['prec']`.
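To illustrate the kind of output to expect, the sketch below implements the same idea in a stand-alone function. It is not the code of `f_days_since_rain`: the name `days_since_rain_sketch` is hypothetical, and the choice to return not-a-number for the days before the first rainy day is an assumption (the helper may handle those days differently).

```python
import numpy as np

def days_since_rain_sketch(precipitation, threshold=0.1):
    """Number of days since the last day with more than `threshold` mm/day."""
    prec = np.asarray(precipitation, dtype=float)
    result = np.full(len(prec), np.nan)
    counter = np.nan                    # assumption: unknown until the first rainy day
    for i, p in enumerate(prec):
        if p > threshold:
            counter = 0.0               # rainy day: reset the counter
        else:
            counter = counter + 1.0     # dry day: one more day since rain (nan stays nan)
        result[i] = counter
    return result

prec_demo = [0.0, 3.2, 0.0, 0.05, 0.0, 1.1, 0.0]   # made-up precipitation (mm/day)
ndays_demo = days_since_rain_sketch(prec_demo)
print(ndays_demo)
```

Note that the day with 0.05 mm counts as a dry day here, because it stays below the threshold of 0.1 mm/day. With a lower threshold that day would reset the counter.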

### <span style='background:lightblue'>Question 3</span>
Construct a time series of the 'crop factor' for grass. Important weather variables to look at might be rainfall, temperature and humidity.

Check that the numbers that you constructed make sense. With the function `check_crop_factor(cf)` you can check your constructed crop factor for common errors (replace `cf` by your variable containing the crop factor, probably `k_grass` or `k_soil`). This function only checks for the most obvious errors that could be due to coding errors (remaining not-a-number values, negative values, excessively high values). It does *not* check whether your values are correct for the given data set. That assessment is up to you as an expert.
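The three kinds of errors mentioned above are easy to test for yourself as well. The sketch below shows one way to do that; it is not the code of `check_crop_factor`, the name `check_sketch` is hypothetical, and the upper limit of 2.0 is an arbitrary placeholder.

```python
import numpy as np

def check_sketch(cf, upper=2.0):
    """Return a list of problems found in a crop-factor array (empty list: none found)."""
    cf = np.asarray(cf, dtype=float)
    problems = []
    if np.isnan(cf).any():
        problems.append(f"{np.isnan(cf).sum()} not-a-number value(s)")
    if (cf < 0).any():
        problems.append("negative value(s)")
    if (cf > upper).any():              # `upper` is an arbitrary placeholder limit
        problems.append(f"value(s) above {upper}")
    return problems

print(check_sketch([1.0, np.nan, -0.2, 0.9]))   # two problems: not-a-number and negative
print(check_sketch([1.0, 0.8, 0.9]))            # no problems found
```

As with the provided function, passing such a check only means that there are no obvious coding errors. It says nothing about whether the values suit the weather of that day.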

In [None]:
# Use this cell to construct your crop factor


In [None]:
# Check your crop factor
# check_crop_factor(k_grass)

### Construct the day-to-day crop factor for bare soil

Next, construct a time series of the 'crop factor' for bare soil for the current dataset. For this, use your work of Step 2 (last week) and your insights on the weather in the current dataset (based on your exploration at the start of this notebook).

### <span style='background:lightblue'>Question 4</span>
Construct a time series of the 'crop factor' for bare soil. Important weather variables to take into account might be rainfall and days since last rain.

Check that the numbers that you constructed make sense.

In [None]:
# Use this cell to construct your crop factor


In [None]:
# Check your crop factor
# check_crop_factor(k_soil)

By now your output dataframe should at least contain columns for the date, the reference evapotranspiration and two 'crop factors'.

## Apply the 'crop factors' to obtain estimates of ET<sub>act</sub>
Now combine the reference evapotranspiration with the two crop factors to obtain estimates of the actual evapotranspiration of grass, and the actual evaporation from bare soil. 
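The combination itself is an element-wise multiplication, day by day. A minimal sketch with made-up numbers (the dataframe `demo`, the column names and all values are placeholders for illustration):

```python
import numpy as np
import pandas as pd

demo = pd.DataFrame({"ETref": [2.0, 3.0, 1.5]})   # mm/day, made-up values
k_grass_demo = np.array([1.0, 0.9, 1.1])          # placeholder crop factors
k_soil_demo = np.array([0.8, 0.5, 0.3])

# Element-wise product; the result is stored as a new column (bracket notation)
demo["ET_grass"] = k_grass_demo * demo["ETref"]
demo["ET_soil"] = k_soil_demo * demo["ETref"]

print(demo)
```

This only works if the crop factor arrays have exactly the same length as the dataframe, which is why they were created with `len(df)` elements.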

### <span style='background:lightblue'>Question 5</span>
Construct a time series of actual evapotranspiration of grass and bare soil and store both series in your dataframe. Check whether the computed ET values are reasonable.

In [None]:
# Use this cell to construct your ET_act for grass and bare soil


In [None]:
# Check your ET values
# check_ET( )

## Determine the actual evapotranspiration for the full catchment
Now that you have estimates of the actual evapotranspiration of both dominant land-use types, you can combine them into an estimate for the entire catchment. For this, use your knowledge about the catchment in terms of the relative contribution of the various land-use types.
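One way to combine the two is an area-weighted mean. The sketch below shows the arithmetic on made-up numbers; the fraction of 0.7 is a placeholder and not the actual land-use fraction of the Hupsel catchment, which you have to decide on yourself.

```python
import numpy as np
import pandas as pd

demo = pd.DataFrame({"ET_grass": [2.0, 2.7, 1.65], "ET_soil": [1.6, 1.5, 0.45]})

frac_grass = 0.7               # placeholder, NOT the actual Hupsel land-use fraction
frac_soil = 1.0 - frac_grass   # the fractions must add up to one

# Area-weighted mean of the two land-use types, for each day
demo["ET_catchment"] = frac_grass * demo["ET_grass"] + frac_soil * demo["ET_soil"]

print(demo)
```

A quick sanity check is that the catchment value of each day lies between the values for grass and bare soil of that day.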

### <span style='background:lightblue'>Question 6</span>
Compute the actual evapotranspiration in mm/day for the entire catchment, for each day in the dataset. Store this as a new variable in your dataframe.

In [None]:
# Use this cell to construct your ET_act for the entire catchment


In [None]:
# Check your ET values
# check_ET( )

## Conclusion
With this, you've come to the end of the 4-step process to determine the actual evapotranspiration of the Hupsel catchment for May 2022.

### <span style='background:lightblue'>Question 7</span>
Now that you have your final results, there are two things to do:
* save the values for May 5 until May 19 (inclusive) so that you can use them as part of the overall water balance you're going to make for the catchment
* copy your values for the full data series to this [Excel file](result_FIRSTNAME_LASTNAME.xlsx) and rename the file (i.e. replace FIRSTNAME and LASTNAME by your name). At the end of this practical session, you will submit it to Brightspace, together with the answer sheets of step 3 and 4.
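Selecting a range of dates can be done with `.loc`, provided that the dataframe is indexed by date. Whether that holds for the dataframe returned by `myreadfile` is an assumption you should check first. The sketch below uses a made-up dataframe with a constant placeholder value.

```python
import pandas as pd

dates = pd.date_range("2022-05-01", "2022-05-25", freq="D")
demo = pd.DataFrame({"ET_catchment": 2.0}, index=dates)   # made-up constant values

# Slicing on date labels includes both the start and the end date
subset = demo.loc["2022-05-05":"2022-05-19"]

# Without a file name, to_csv returns the text, which is handy for copying values
csv_text = subset.to_csv()

print(len(subset), subset.index[0].date(), subset.index[-1].date())
```

Unlike slicing with row numbers, slicing with date labels includes the end date, so the subset above contains 15 days.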


## Up to the next exercise
In the next exercise for this practical session you are going to extend your expertise on reference evapotranspiration methods to the methods of Penman-Monteith and Priestley-Taylor. That exercise will have a separate answer sheet. After completing step 4, upload both answer sheets to Brightspace.