# Actual evapotranspiration and CO<sub>2</sub> uptake (Step 6)

## Intro

In this practical we will focus on actual evapotranspiration and photosynthesis.

In principle, the Penman-Monteith method is a sound physical model of transpiration. If we would like to use the Penman-Monteith method to estimate *actual evapotranspiration* rather than reference evapotranspiration (as you did in Practical 5) we need to make sure that the parameters used in the Penman-Monteith equation are representative of the current conditions. Here we focus on 
* albedo
* roughness length
* canopy resistance 

The loss of water through transpiration is intimately coupled to the *uptake of carbon-dioxide* related to photosynthesis. Therefore we will also look at a number of aspects of carbon fluxes:
* light response curve
* light use efficiency 
* water use efficiency

**Note**: in some parts of the exercise there are different <span style='background:lightgreen'>options</span> to choose from. In that case it is sufficient if you work on one of the options only. For those parts the questions have an *a* and a *b* part (e.g. <span style='background:lightblue'>Question 3a</span> and <span style='background:lightblue'>Question 3b</span>).


As usual, this practical comes with an [answer sheet](Actual_ET_6-answer-sheet.docx).

## Initialize Python stuff 

Please run the cell below by selecting it and pressing Shift+Enter, or press the Run button in the toolbar at the top of the screen (the right-pointing triangle).

In [None]:
# Load some necessary Python modules
import pandas as pd # Pandas is a library for data analysis
pd.set_option("mode.chained_assignment", None)
import numpy as np # Numpy is a library for processing multi-dimensional datasets
from hupsel_helper import myplot, myreadfile
from hupsel_helper import f_Lv, f_esat, f_s, f_gamma, f_cp, f_cos_zenith, f_atm_transmissivity, f_ra, check_z0, check_rc

With the commands above, the following functions have become available:
* `f_Lv(T)`: compute latent heat of vapourization from temperature (in K)
* `f_esat(T)`: compute saturated vapour pressure from temperature (in K)
* `f_s(T)`: compute the slope of the saturated vapour pressure as a function of temperature (in K)
* `f_gamma(T, p, q)`: compute the psychrometer constant from temperature (K), pressure (Pa) and specific humidity (kg/kg)
* `f_cp(q)`: compute the specific heat of air (in J/kg/K) using specific humidity (in kg/kg)

In this practical you may need some of the more advanced plotting capabilities of `myplot`, so check those in Step 0.

## Read the data

Since evapotranspiration is an instantaneous process with a strong diurnal cycle, we need data with an averaging interval that is short enough to resolve that diurnal cycle. Therefore this practical will be based on 30-minute average fluxes.

The data that you will use come from the same dataset as used in Practical-1, but with the difference that we now use 30-minute averages.

Now read the 30-minute data for 2014 from the Excel file.

In [None]:
# File name: this is a different file than the one you worked on before
fname='Hupsel2014_MeteoData.xlsx'

# Get the data
df = myreadfile(fname, type='30min')

## Albedo

When modelling net radiation (as is done in the FAO method) an essential parameter is the albedo. The FAO-method assumes a value of 0.23. But if you want to use the Penman-Monteith equation to determine actual ET for an actual surface, you need to know the real albedo. 

From the measurements we did in the field we know that the albedo can be variable between land-use types and within fields. But it can also vary with time.

### <span style='background:lightblue'>Question 1</span>
Compute the albedo and investigate how it varies with time:
* throughout the experiment (plot as a function of `df['Date']`)
* throughout the day (taking all days together: plot as a function of `df['Time']` (plot with *dots* (i.e. `'o'`), not lines))

You may need to zoom in a bit to ignore extreme values.
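
As a minimal sketch of the computation (not the worked answer for this practical): the albedo is the ratio of reflected to incoming shortwave radiation. The column name `K_out` and the threshold of 20 W/m² are assumptions for illustration, so check the actual column names in your data frame. The small synthetic data frame only stands in for `df`.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for df (the column name 'K_out' is an assumption)
demo = pd.DataFrame({
    "K_in":  [0.0, 5.0, 200.0, 600.0],   # incoming shortwave radiation (W/m2)
    "K_out": [0.0, 3.0,  50.0, 132.0],   # reflected shortwave radiation (W/m2)
})

# Ratio of reflected to incoming shortwave radiation
albedo = demo["K_out"] / demo["K_in"]

# At night and around sunrise/sunset K_in is close to zero and the ratio
# becomes meaningless; mask those points (the 20 W/m2 threshold is arbitrary)
albedo = albedo.where(demo["K_in"] > 20.0)

print(albedo.round(2).tolist())
```

Masking low-light points is one way to get rid of many of the extreme values; zooming in with axis limits achieves a similar effect in the plot.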

In [None]:
# Plot albedo over the entire experiment

# Plot albedo as a function of time-of-day


### Albedo is variable: solar zenith angle and diffuse radiation

We see that the albedo varies during the day. This variation can be related to (amongst others):
* the angle between the solar beam and the Earth's surface (which varies over the day: hence the variation of albedo with time)
* the fraction of the radiation that is diffuse

This is illustrated in the figure below (after figure 2.10 in Moene & Van Dam (2014)). 
<img src="albedo_cloudy.jpg" width="40%">
Now continue with either question **2a** or **2b** (or both, if you like).


### <span style='background:lightgreen'>Option a: solar zenith angle</span>
To quantify the dependence of albedo on the direction of solar radiation, we need information on the location of the Sun relative to the Earth's surface. Fortunately this is a nicely predictable quantity if we know date and time, as well as location.  We quantify the position of the sun with the cosine of the solar zenith angle ( $\cos (\theta_z)$ ). 

The cosine of the zenith angle can be obtained with the function `f_cos_zenith`:
```
cos_zenith_angle = f_cos_zenith(date_time, latitude, longitude)

```
where `date_time` is an array with time stamps (simply use `df['Date']` for that) and `latitude` and `longitude` are the coordinates in degrees. The Hupsel KNMI station is located at latitude = 52.0675 $^o$ and longitude = 6.6567 $^o$.
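
To get a feeling for the numbers, the sketch below uses the textbook relation $\cos(\theta_z) = \sin\phi \sin\delta + \cos\phi \cos\delta \cos h$, with latitude $\phi$, solar declination $\delta$ and hour angle $h$. It is an illustration only: how `f_cos_zenith` is implemented is not shown in this notebook, and the function name `cos_zenith_sketch`, the use of Cooper's approximation for the declination, and the use of local solar time (ignoring longitude and the equation of time) are all assumptions made here.

```python
import numpy as np

def cos_zenith_sketch(doy, solar_hour, latitude_deg):
    """Textbook approximation of cos(theta_z); not the course's f_cos_zenith."""
    lat = np.radians(latitude_deg)
    # Solar declination (Cooper's approximation), in radians
    decl = np.radians(23.45) * np.sin(2 * np.pi * (284 + doy) / 365)
    # Hour angle: zero at local solar noon, 15 degrees per hour
    hour_angle = np.radians(15.0 * (solar_hour - 12.0))
    return (np.sin(lat) * np.sin(decl)
            + np.cos(lat) * np.cos(decl) * np.cos(hour_angle))

# Around the summer solstice (day 172) at the latitude of Hupsel
cz_noon = cos_zenith_sketch(172, 12.0, 52.0675)
cz_midnight = cos_zenith_sketch(172, 0.0, 52.0675)
print(cz_noon, cz_midnight)
```

Negative values indicate that the sun is below the horizon.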

### <span style='background:lightblue'>Question 2a</span>
Determine the $\cos (\theta_z)$ for your data using the function `f_cos_zenith` described above. 
* What is the range of values you expect for $\cos (\theta_z)$? 
* Plot the albedo (y-axis) as a function of $\cos (\theta_z)$. 
* Explain/interpret the relationship that you see.

In [None]:
# Location of Hupsel weather station
latitude = 52.0675
longitude = 6.6567

# Determine the solar zenith angle with f_cos_zenith


# Plot albedo versus cosine of the zenith angle (you computed the albedo in question 1)



### <span style='background:lightgreen'>Option b: diffuse radiation</span>
The dependence of the reflectivity of a surface on the direction of the solar radiation (or on time) only has an effect on the overall albedo if the radiation comes mainly from one direction. So when the radiation is diffuse (under cloudy conditions), one would expect the albedo to be mostly *independent* of $\cos ( \theta_z )$ (see the figure from  Moene & Van Dam (2014) above). 

To test this, we would need information on the amount of diffuse radiation. This information is not available directly, but there are two variables that could be helpful here:
* the sunshine duration within the 30 minute interval (`df['sun_dur']`)
* the transmissivity of the atmosphere $\tau_b = \frac{K^\downarrow}{K_0}$ where $K_0$ is the radiation at the top of the atmosphere.

The transmissivity of the atmosphere can be computed with the function `f_atm_transmissivity`:
```
trans = f_atm_transmissivity(date_time, latitude, longitude, K_in)
```
where `date_time` is an array with time stamps (`df['Date']`), `latitude` and `longitude` are the coordinates in degrees and `K_in` is global radiation.  The Hupsel KNMI station is located at latitude = 52.0675 $^o$ and longitude = 6.6567 $^o$.
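
The definition $\tau_b = K^\downarrow / K_0$ can be sketched as follows. This is not the implementation of `f_atm_transmissivity`: the function name `transmissivity_sketch`, the rounded solar constant and the neglect of the varying Earth-Sun distance are assumptions made for illustration.

```python
import numpy as np

SOLAR_CONSTANT = 1361.0  # W/m2, approximate value

def transmissivity_sketch(K_in, cos_zenith):
    """tau_b = K_in / K_0 with K_0 = solar constant * cos(theta_z)."""
    K_in = np.asarray(K_in, dtype=float)
    cos_zenith = np.asarray(cos_zenith, dtype=float)
    # Radiation at the top of the atmosphere on a horizontal plane;
    # undefined (NaN) when the sun is at or below the horizon
    K_0 = np.where(cos_zenith > 0, SOLAR_CONSTANT * cos_zenith, np.nan)
    return K_in / K_0

# Same solar position, first a bright and then a dull moment, then night
tau = transmissivity_sketch([700.0, 150.0, 0.0], [0.8, 0.8, -0.2])
print(tau)
```

For the same position of the sun, a lower global radiation gives a lower transmissivity.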

### <span style='background:lightblue'>Question 2b</span>
Determine one or both of the indicators for diffuse radiation (sunshine duration or atmospheric transmissivity).
* What values for sunshine duration or atmospheric transmissivity do you expect for cloudy versus sunny conditions?
* Now plot the albedo as a function of time (or date). Color the dots by one of the variables that would help to distinguish between mostly diffuse conditions and conditions with mainly direct radiation. Use the additional keyword `color_by` in the plot command: `myplot([.. , .., 'o'], color_by = df['sun_dur'])`.
* Does the dependence of albedo on time-of-day differ between cloudy and sunny conditions? If so, how?

Notes: 
* Rather than using time on the horizontal axis, you could also use the cosine of the zenith angle as an indicator of the direction of the solar beam (see question 2a for how to compute $\cos(\theta_z)$)
* An alternative way to do the analysis is to plot albedo as a function of `df['Date']` (the full experiment). Plot in the same graph the relative sunshine duration (`df['sun_dur']/30`: the fraction of the interval in which it was sunny). You now very quickly see which days were sunny and which were cloudy (based on the relative sunshine duration). Now zoom in on a period that contains both sunny and cloudy days. You will now clearly see how the variability of albedo differs between sunny and cloudy days.

In [None]:
# Location of Hupsel weather station
latitude = 52.0675
longitude = 6.6567

# Determine the indicator you want to use for the cloudiness (sunshine duration from the data or 
# transmissivity computed with the f_atm_transmissivity function)


# Plot albedo versus time (df['Time']) and color the points by one of the indicators for 
# diffuse radiation (you computed the albedo in question 1)


# Alternative: plot albedo versus date-time (df['Date']) and plot in the same graph relative sunshine duration 
# (see note above)



## Roughness length

To derive the roughness lengths for momentum ($z_0$) and heat ($z_{0h}$) from observations we need to consider the effect of stability on the wind profiles (equation (3.42) in the AVSI book):
$$
 \overline{u}(z_u) = \frac{u_*}{\kappa} \left[ \ln\left(\frac{z_u}{z_0}\right) - 
                                               \Psi_m\left(\frac{z_u}{L}\right) + 
                                               \Psi_m\left(\frac{z_0}{L}\right) \right]
$$

However, to obtain $z_0$ from the expressions for the profiles would require quite some programming. An easier way is to start with a two-step method for $z_0$:

* First compute the roughness length for momentum from observed $\overline{u}$ and $u_*$  for each data point, assuming neutral conditions (i.e. $\frac{z}{L} \approx 0$). Then, the computation of the roughness length is a matter of rewriting the expression for the logarithmic wind profile (remember that the Python functions that you might need are `np.log(x)` and `np.exp(x)`).
* Then filter the data such that only the most neutral data are retained (options a and b below describe two ways to do this).
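
The first step can be sketched as follows. Under neutral conditions the profile reduces to $\overline{u}(z_u) = \frac{u_*}{\kappa} \ln\left(\frac{z_u}{z_0}\right)$, which can be rewritten as $z_0 = z_u \exp\left(-\kappa \overline{u} / u_*\right)$. The function name `z0_neutral` and all numbers below are made up for illustration; with real data you would pass the observed wind speed and friction velocity.

```python
import numpy as np

KAPPA = 0.4  # Von Karman constant

def z0_neutral(u, ustar, z_u):
    """Roughness length from the neutral logarithmic wind profile."""
    return z_u * np.exp(-KAPPA * np.asarray(u) / np.asarray(ustar))

# Round-trip check with made-up numbers: build a wind speed from a known z0
z0_true = 0.03   # m
z_u = 10.0       # observation height of the wind speed (m)
ustar = 0.35     # friction velocity (m/s)
u = ustar / KAPPA * np.log(z_u / z0_true)
z0_back = z0_neutral(u, ustar, z_u)
print(z0_back)
```

Because $z_0$ depends exponentially on $\overline{u}/u_*$, small errors in either quantity (or a violation of the neutral assumption) lead to large scatter in the computed values.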

As the KNMI station was surrounded by grass you can assume that the roughness length you derive is representative for a grass meadow. However, for some wind directions and wind speeds the farm to the west might be located in the footprint of the eddy-covariance station (see figure below). In that case, the derived value for $z_0$ might be incorrect. 
<img src="surrounding_KNMI_station.png" width="80%">


The question to be answered is: what is the value of $z_0$ for the grass and is the value you find for $z_0$ very different from the values used in the FAO method?  


### <span style='background:lightblue'>Question 3</span>
Determine the roughness length for momentum from the current dataset: compute the value assuming neutral condition. This should yield a time series of $z_0$ values (determining one correct value we will do in the next step).

You can check your values with the function `check_z0`, which you use in the following way: `check_z0(df, your_z0)` where `df` is the data frame with data and `your_z0` is the variable in which you stored your computed values.

In [None]:
# Use this cell to compute the roughness length for all data 
# You can check your values with the function check_z0(df, your_z0) 
# For wind speed you can use the 10 meter wind of the KNMI station
# or the 3.05 meter wind of the eddy covariance system





If you list or plot your computed $z_0$ you will see that the values vary wildly. This is due to the fact that the assumption of neutral conditions is generally not valid. Next, you have to determine which of the values you just computed could be correct. Option a and b use different methods. Choose one of the two.

<img src="roughness_methods.png" width="80%">

### <span style='background:lightgreen'>Option a: Select neutral conditions based on wind speed</span>
The simplest method to select the most neutral data is to use high wind speed as an indicator of neutral conditions. 
You can implement this by plotting $z_0$ versus wind speed to see where 
the neutral data are (see figure above, left panel). From that part of the graph you can *estimate* the roughness length (within an order of magnitude, but that is enough). Usually it helps to use a log-scale for the $z_0$–axis because the spread in values is quite large.

The plot you get will at first seem quite chaotic. But you will get a reasonable view on the values when you zoom in to a range between $10^{-4}$ and $10^{1}$ m for $z_0$. You can best do that by using the `ylim` keyword in `myplot`, e.g. `ylim=[1e-4, 1e1]`.

### <span style='background:lightblue'>Question 4a</span>
Determine the roughness length for momentum with the method described above. Based on all of these data points you should come up with a single value: your best estimate.
* Is the value that you get  a reasonable value? 
* How does it compare to the value that is assumed in the FAO method for reference ET? 
* Are some of the values affected by upstream conditions (see map above)? To test this, color the dots with the wind direction (add the keyword `color_by` to the plot command: `color_by=df['u_dir']`).

In [None]:
# Method 1: plot z0 against wind speed, look at the z0 values at high wind speeds.
# To detect effects of upstream conditions, color the dots with df['u_dir']: color_by=df['u_dir']
# A useful range for the y-axis is: ylim=[1e-4, 1e1]




### <span style='background:lightgreen'>Option b: Select neutral conditions based on $\frac{z}{L}$</span>
The more complex method is also more exact. You can use the stability indicator $\frac{z}{L}$ ($L$ is the Obukhov length) as an indication for neutral conditions: plot $z_0$ versus $\frac{z}{L}$ and zoom in on the neutral part (see figure above, right panel).

What we call 'filtering' above can simply be done by plotting: plot all $z_0$ values and then search for that part of the plot where you expect neutral conditions. From that part of the graph you can *estimate* the roughness length (within an order of magnitude, but that is enough). Usually it helps to use a log-scale for the $z_0$–axis because the spread in values is quite large.

The plot you get will at first seem quite chaotic. But you will get a reasonable view on the values when you zoom in to a range between $10^{-4}$ and $10^{1}$ m for $z_0$. You can best do that by using the `ylim` keyword in `myplot`, e.g. `ylim=[1e-4, 1e1]`. To zoom into neutral conditions, take a subregion around $\frac{z}{L} = 0$. Start with a wide region, e.g. between -1 and 1 (use `xlim = [-1,1]` in the plot command). Subsequently narrow down to $\frac{z}{L}$ between -0.05 and 0.05.
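
A minimal sketch of computing $\frac{z}{L}$, using the Obukhov length without the humidity effect on buoyancy: $L = -\frac{\rho c_p T u_*^3}{\kappa g H}$. The function name `z_over_L`, the rounded constants for $\rho$ and $c_p$, and the example numbers are assumptions for illustration; with real data you would use the observed sensible heat flux, friction velocity and temperature.

```python
import numpy as np

KAPPA = 0.4    # Von Karman constant
G = 9.81       # gravitational acceleration (m/s2)

def z_over_L(z, ustar, H, T, rho=1.2, cp=1005.0):
    """Stability parameter z/L, ignoring the effect of humidity on buoyancy.

    z in m, ustar in m/s, H (sensible heat flux) in W/m2, T in K.
    rho and cp are rounded constants, assumed here for simplicity.
    """
    ustar = np.asarray(ustar, dtype=float)
    H = np.asarray(H, dtype=float)
    L = -rho * cp * T * ustar**3 / (KAPPA * G * H)
    return z / L

# Three made-up half hours: unstable, slightly stable, windy and near-neutral
zeta = z_over_L(z=3.05, ustar=[0.3, 0.3, 0.6], H=[150.0, -30.0, 20.0], T=290.0)
near_neutral = np.abs(zeta) < 0.05
print(zeta, near_neutral)
```

A boolean mask like `near_neutral` does the same job as zooming in with `xlim`: it selects the part of the data around $\frac{z}{L} = 0$.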

### <span style='background:lightblue'>Question 4b</span>
Determine the roughness length for momentum with the method described above.  Based on all of these data points you should come up with a single value: your best estimate. 
* Is the value that you get  a reasonable value? 
* How does it compare to the value that is assumed in the FAO method for reference ET? 
* Are some of the values affected by upstream conditions (see map above)? To test this, color the dots with the wind direction (add the keyword `color_by` to the plot command: `color_by=df['u_dir']`).

In [None]:
# Method 2: plot against z/L, look at the z0 values at neutral conditions (z/L = 0)
# First compute z/L. Assume that you can ignore the effect of humidity on buoyancy 
# (i.e. you can use normal temperature rather than virtual temperature)



# Plot z0 (y-axis) as a function of z/L
# Use a log-axis for the y-axis and choose proper limits for that axis, e.g. ylim=[1e-3,1e0]
# To detect effects of upstream conditions, color the dots with df['u_dir']: color_by=df['u_dir']




## Canopy resistance

If the actual evapotranspiration is measured, as well as all input variables for the Penman-Monteith equation ($Q^*$, $G$, $T$, $e$, $r_a$), then the canopy resistance can be obtained. Inversion of the Penman-Monteith equation yields the following explicit expression for $r_c$:  
$$
r_c=r_a \left[ \frac{s(Q^*-G)+\frac{\rho c_p}{r_a} \left(e_s (T_a)-e_a \right)}{\gamma L_v E}-\frac{s}{\gamma}-1 \right]
$$
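
The inversion can be checked with a round trip: compute $L_v E$ from the forward Penman-Monteith equation with a known $r_c$, then recover that $r_c$ with the expression above. The sketch below does this with made-up but plausible numbers; the function names are hypothetical and not part of `hupsel_helper`. Vapour pressures are in Pa, $s$ and $\gamma$ in Pa/K.

```python
def penman_monteith_LvE(Qnet, G, s, gamma, rho, cp, es, ea, ra, rc):
    """Forward Penman-Monteith equation for the latent heat flux (W/m2)."""
    numerator = s * (Qnet - G) + rho * cp / ra * (es - ea)
    return numerator / (s + gamma * (1.0 + rc / ra))

def canopy_resistance(Qnet, G, s, gamma, rho, cp, es, ea, ra, LvE):
    """Inverted Penman-Monteith equation for the canopy resistance (s/m)."""
    numerator = s * (Qnet - G) + rho * cp / ra * (es - ea)
    return ra * (numerator / (gamma * LvE) - s / gamma - 1.0)

# Round-trip check with made-up (but plausible) numbers
inputs = dict(Qnet=400.0, G=40.0, s=145.0, gamma=66.0, rho=1.2, cp=1005.0,
              es=2340.0, ea=1400.0, ra=50.0)
LvE = penman_monteith_LvE(rc=70.0, **inputs)
rc_back = canopy_resistance(LvE=LvE, **inputs)
print(LvE, rc_back)
```

Splitting the expression into a numerator and a few separate terms, as done here, makes it easier to spot mistakes in brackets and units.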

### <span style='background:lightblue'>Question 5</span>
Compute the canopy resistance for each data point. In your analysis focus on:
* The diurnal cycle (how does $r_c$ vary through the day): why does the $r_c$ vary with time in this way?
* The development over time of the midday value of the canopy resistance (are there periods of significantly higher or lower values, perhaps linked to periods of soil moisture stress, wet canopy etc.)
* Compare the values you find to those prescribed by the FAO.

Notes: 
* The data may be quite noisy, so it might help to use a logarithmic axis for $r_c$. If the plot does not auto-scale well, use the `ylim` keyword in the plot command, using limits of $10^{-1}$ and $10^{4}$.
* The canopy resistance may depend on a range of external factors (e.g. VPD, RH, temperature) or on the conditions of the surface (e.g. before and after mowing). To discover such dependencies, use the `color_by` keyword (see the documentation of `myplot`).
* In the computation you will need the aerodynamic resistance. There are two routes for this:
  * use function `f_ra` which uses wind speed and two roughness lengths. You should realize that the function is based on the assumption that conditions are neutral
  * use a simplified version of equation (3.44) in Moene & Van Dam (2014): $r_a = \frac{1}{\kappa u_*} \ln \left(\frac{z_T}{z_{0h}} \right)$. In this expression you circumvent the assumption of neutral conditions at least for the part related to $u_*$
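
The second route can be sketched as follows. The function name `ra_from_ustar`, the temperature observation height of 1.5 m and the value used for $z_{0h}$ are assumptions for illustration; replace them with the values appropriate for your data.

```python
import numpy as np

KAPPA = 0.4  # Von Karman constant

def ra_from_ustar(ustar, z_T, z0h):
    """Aerodynamic resistance (s/m) from friction velocity, without stability terms."""
    return np.log(z_T / z0h) / (KAPPA * np.asarray(ustar, dtype=float))

# Made-up friction velocities (m/s); z_T and z0h in m
ra = ra_from_ustar(ustar=[0.15, 0.30, 0.60], z_T=1.5, z0h=0.003)
print(ra)
```

The resistance is inversely proportional to $u_*$, so calm periods with small $u_*$ give very large values of $r_a$.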


In [None]:
# Compute the canopy resistance 

# First determine the aerodynamic resistance with the function f_ra 
# or with the equation given above (based on 3.44 in the AVSI book)



# Next collect the required other variables (temperature, vapour pressure, net radiation, ....
# Note that the LvE used in the equation above is the *actual* latent heat flux (i.e. the 
# eddy-covariance flux, available here as df['LvE_m'])


# Now compute the canopy resistance. To prevent errors it can be helpful to split the 
# horrible equation in a number of handy chunks.


# You can check your values with the function check_rc(df, your_rc) 
# check_rc(df, rc)

# Plot your rc as a function of time of day (df['Time'] on the x-axis)
# Hints:
# * Use dots (not lines)
# * You may need to use the ylim keyword to get a reasonable range in the vertical
# * Alternatively, you could use a logarithmic axis for the y-axis so that  
# even outliers can be easily plotted (use the keyword y_axis_type='log')
# * It may be helpful to color the dots by e.g. day number (color_by=df['DOY'])


# To have a clearer view on the variation of rc over the experiment, you could
# focus on the midday values. In order to plot only those, you need to select 
# part of the data. You can do that as follows:
# select = (df['Hour']==12)   # select those half hours that have 12 hours as their full hour
# tmp_date = df['Date']
# x = tmp_date[select]
# y = rc[select]              # assuming that your rc values are in a variable named 'rc'

# Now you can plot x versus y which will show you only the midday values of rc.



## Photosynthesis

The data set also contains information about the exchange of CO2 between the plants and the atmosphere. The eddy-covariance system directly measures the net ecosystem exchange (NEE): the net effect of uptake by photosynthesis (GPP, gross primary production) and release due to respiration. The partitioning of NEE over respiration (TER) and GPP cannot be measured but has been estimated. There are a number of aspects you can look at (only choose one):
* light response curve
* light use efficiency 
* water use efficiency

### <span style='background:lightblue'>Question 6</span>
Before you dive into one of the three topics, first explore the NEE flux (variable `FCO2_m`). In particular we focus on its diurnal cycle. How does it vary as a function of time of day (variable `Time`)?

In [None]:
# Make a plot of 'FCO2_m' (y-axis) versus 'Time' (x-axis)
# Since conditions may have changed during the experiment, color the points with the day number (color_by=df['DOY'])
# To make the plot less noisy, use a proper range for the y-axis (e.g. ylim=[-1e-6,1e-6])



Since we are primarily interested in the role of plant transpiration, we focus here on the part of the CO2 flux that is most closely related to photosynthesis: gross primary production (GPP). 

### <span style='background:lightgreen'>Option a: Light response curve and light-use efficiency</span>

Photosynthesis is related to light interception. An important concept is the light response curve (LRC: how much CO2 uptake takes place at which light level). 

### <span style='background:lightblue'>Question 7a</span>
Construct a light response curve by making a scatter plot of GPP (y-axis) versus global radiation (as a proxy for the photosynthetically active radiation). You may need to tweak the axes to reduce the effect of outliers.
* Focus on the general shape (initial slope at low light levels and maximum assimilation at high light levels).
* Conditions may have changed during the experiment: can you detect those in the light response curve?
* During the AVSI course we saw that plants are more efficient in taking up CO2 under cloudy conditions (due to a higher proportion of diffuse radiation). Check this.

Estimate the level of the plateau in the light response curve (the maximum assimilation) for about 10 days (note them down) and try to find a relationship between that maximum value and conditions during those days.
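
Besides reading the shape from a scatter plot, one possible way to summarise a light response curve is to take the median GPP per class of radiation. The sketch below uses synthetic data (a saturating curve plus noise) purely for illustration; the saturating function, the class width of 100 W/m² and the name `K_class` are assumptions, not part of the practical.

```python
import numpy as np
import pandas as pd

# Synthetic light response: a saturating curve plus noise (illustration only)
rng = np.random.default_rng(0)
K_in = rng.uniform(0.0, 1000.0, size=500)             # global radiation (W/m2)
GPP = 1.5e-6 * K_in / (K_in + 200.0) + rng.normal(0.0, 5e-8, size=500)
demo = pd.DataFrame({"K_in": K_in, "GPP": GPP})

# Median GPP per radiation class of 100 W/m2
bins = np.arange(0, 1100, 100)
demo["K_class"] = pd.cut(demo["K_in"], bins=bins)
lrc = demo.groupby("K_class", observed=True)["GPP"].median()
print(lrc)
```

The steep increase between the lowest classes reflects the initial slope, and the nearly constant values in the highest classes give an estimate of the plateau.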

In [None]:
# Plot a light response curve
# Useful axis limits are:
#   xlim=[0,1000]
#   ylim=[-2e-7,2e-6]
# Plot with dots ('o')
# Since conditions may have changed during the experiment (e.g. mowing), color the dots with the day number:
#   color_by=df['DOY']




# To check if plants are more efficient in taking up CO2 under cloudy conditions (due to a higher proportion 
# of diffuse radiation) you can use the variable sunshine duration as an indication of sunny conditions
# (little diffuse radiation). Use: color_by=df['sun_dur']





Another way of quantifying the relation between supplied light energy and resulting carbon uptake is the light use efficiency (LUE): the ratio of GPP over radiation input:
$$
LUE = \frac{GPP}{PAR}
$$
where PAR is the amount of photosynthetically active radiation. Here we use the global radiation as a proxy: it has the same variation, but is about twice as large as PAR.
Note that the LUE is evaluated for a given amount of radiation: you do not look at the LRC as a curve, but at an individual point.

### <span style='background:lightblue'>Question 8a</span>
Compute for each data point the light use efficiency. 
* Plot the LUE as a function of time of day to see the average diurnal cycle. What does it look like, and could you explain it?
* Does the LUE vary with meteorological variables (e.g. relative humidity, temperature ….)? You can check that by coloring the plot with those variables.

In [None]:
# Compute the light-use efficiency
LUE = df['GPP']/df['K_in']

# Plot LUE as a function of time of day (df['Time']). Reasonable axis limits are xlim=[0,24] and 
# ylim=[0,1e-8] . Try various variables to color the dots with:
# * 'DOY': how does the LUE vary through the experiment?
# * 'RH_1_5': how does a larger atmospheric demand for water vapour influence LUE?
# * 'T_1_5': how does a higher temperature have an impact (is it through the biology, or via
#            vapour pressure deficit?)




### <span style='background:lightgreen'>Option b: Water use response curve and water use efficiency</span>
The uptake of CO2 and the transpiration are closely coupled via the stomata. In that respect an important variable is the water use efficiency (WUE): amount of CO2 uptake for a given amount of water loss (less water use per carbon uptake means higher efficiency). To simplify the analysis we will use the latent heat flux as a proxy for the amount of water lost. 

### <span style='background:lightblue'>Question 7b</span>
First we analyse the WUE in the form of a ‘$L_v E$ response curve’: GPP as a function of $L_v E$.  Such a curve could answer the question: does additional evapotranspiration lead to additional CO2 uptake or does it level off?
* Focus on the general shape (initial slope at low transpiration levels and maximum assimilation at high levels).
* Conditions may have changed during the experiment: can you detect those in the $L_v E$ response curve?
* During the AVSI course we saw that plants are more efficient in taking up CO2 under cloudy conditions (due to a higher proportion of diffuse radiation). Does this also translate into a higher water use efficiency during cloudy conditions?

In [None]:
# Water use efficiency
# Plot a transpiration response curve
# Useful axis limits are:
#    xlim=[0,400]
#   ylim=[-2e-7,2e-6]
# Plot with dots ('o')
# Since conditions may have changed during the experiment (e.g. mowing), color the dots with the day number:
#   color_by=df['DOY']
  
    
    

# To check if plants are more efficient in taking up CO2 under cloudy conditions (due to a higher proportion 
# of diffuse radiation) you can use the variable sunshine duration as an indication of sunny conditions
# (little diffuse radiation). Use: color_by=df['sun_dur']






Another way of quantifying the relation between water use and resulting carbon uptake is the water use efficiency (WUE): the ratio of GPP over transpiration:
$$
WUE = \frac{GPP}{T}
$$
where $T$ is transpiration. Here we will use the latent heat flux as a proxy for transpiration: it has the same variation as transpiration but it differs in magnitude and some details. So our working definition for WUE is:
$$
WUE = \frac{GPP}{L_v E}
$$
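
A minimal sketch of this working definition, using a small synthetic data frame as a stand-in for `df`. The numbers are made up, and the threshold of 10 W/m² used to mask small or negative latent heat fluxes is an arbitrary choice for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for df, using the column names 'GPP' and 'LvE_m'
demo = pd.DataFrame({
    "GPP":   [0.0, 2.0e-7, 9.0e-7, 1.2e-6],
    "LvE_m": [-5.0, 8.0, 150.0, 300.0],      # latent heat flux (W/m2)
})

# Working definition: GPP divided by the latent heat flux
WUE = demo["GPP"] / demo["LvE_m"]

# For small or negative LvE (night, dew formation) the ratio is not meaningful;
# the threshold of 10 W/m2 is an arbitrary choice for this sketch
WUE = WUE.where(demo["LvE_m"] > 10.0)
print(WUE.tolist())
```

Without such a mask, the division by values of $L_v E$ close to zero produces extreme values that dominate the plot.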


### <span style='background:lightblue'>Question 8b</span>
Compute for each data point the water use efficiency. 
* Plot the WUE as a function of time of day to see the average diurnal cycle. What does it look like, and could you explain it?
* Does the WUE vary with meteorological variables (e.g. relative humidity, temperature ….)? You can check that by coloring the plot with those variables.

In [None]:
# Compute the water use efficiency (based on LvE and GPP)



# Plot WUE as a function of time of day (df['Time']). Reasonable axis limits are xlim=[0,24] and 
# ylim=[-2e-9,2e-8] . Try various variables to color the dots with:
# * 'DOY': how does the WUE vary through the experiment?
# * 'RH_1_5': how does a larger atmospheric demand for water vapour influence WUE?
# * 'T_1_5': how does a higher temperature have an impact (is it through the biology, or via
#            vapour pressure deficit?)



## Conclusion
This was it, folks. Thanks for your hard work. Hopefully it was a rewarding process.

## Final report
Now the final step is to transfer some of your results and findings from this notebook into the final report document that you will upload to Brightspace. 