# Evaporation 

## Objectives

- Apply the water balance equation to estimate long-term average catchment evaporation

- Apply the Budyko equation to explore the fundamental limits on catchment water availability

- Visualize the range of aridity and evaporative fraction values in UK catchments 


We will start by loading the CAMELS-GB data that we used previously and work through the analysis for a single, arbitrarily chosen catchment:

In [7]:
import os
DATADIR = os.path.join('data', '8344e4f3-d2ea-44f5-8afa-86d2987543a9', 'data')


In [8]:
import pandas as pd
id = '97002'
data = pd.read_csv(os.path.join(DATADIR, 'timeseries', f'CAMELS_GB_hydromet_timeseries_{id}_19701001-20150930.csv'), parse_dates=[0])
data['id'] = id
data.head()

Unnamed: 0,date,precipitation,pet,temperature,discharge_spec,discharge_vol,peti,humidity,shortwave_rad,longwave_rad,windspeed,id
0,1970-10-01,9.93,1.02,8.91,,,1.31,6.01,61.76,325.33,7.65,97002
1,1970-10-02,4.01,1.41,7.66,,,1.76,5.11,93.56,294.2,10.03,97002
2,1970-10-03,7.27,1.17,8.77,,,1.4,5.41,61.95,321.14,5.41,97002
3,1970-10-04,3.77,0.06,9.74,,,0.23,7.76,42.83,341.28,7.27,97002
4,1970-10-05,1.19,1.56,9.46,,,1.86,5.49,92.13,299.08,7.9,97002


In the first week we calculated the annual actual evaporation by applying the catchment water balance equation. You should have written a loop to do this for many catchments. 
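For reference, the long-term water balance behind this calculation can be written as below. This assumes that the change in catchment storage is negligible when averaged over many years, and the symbol $\overline{Q}$ (long-term mean specific discharge, the `discharge_spec` column) is introduced here for illustration:

$\overline{E} = \overline{P} - \overline{Q}$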

In [9]:
# Water years run from 1 October to 30 September and are labelled by the year in which they end
data['water_year'] = data['date'].dt.to_period('Y-SEP')
data.head()

Unnamed: 0,date,precipitation,pet,temperature,discharge_spec,discharge_vol,peti,humidity,shortwave_rad,longwave_rad,windspeed,id,water_year
0,1970-10-01,9.93,1.02,8.91,,,1.31,6.01,61.76,325.33,7.65,97002,1971
1,1970-10-02,4.01,1.41,7.66,,,1.76,5.11,93.56,294.2,10.03,97002,1971
2,1970-10-03,7.27,1.17,8.77,,,1.4,5.41,61.95,321.14,5.41,97002,1971
3,1970-10-04,3.77,0.06,9.74,,,0.23,7.76,42.83,341.28,7.27,97002,1971
4,1970-10-05,1.19,1.56,9.46,,,1.86,5.49,92.13,299.08,7.9,97002,1971
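The water-year labels in the output above can be checked with a plain-Python rule. This is only a sketch for illustration: the helper name `water_year` is hypothetical, and it reproduces the labelling by hand rather than calling pandas.

```python
from datetime import date

def water_year(day):
    """Label a date by the water year (1 October to 30 September) in which it falls."""
    # October to December belong to the water year that ends the following September
    return day.year + 1 if day.month >= 10 else day.year

print(water_year(date(1970, 10, 1)))  # 1971
print(water_year(date(1971, 9, 30)))  # 1971
print(water_year(date(1971, 10, 1)))  # 1972
```

The first date matches the first row of the table, which is assigned to water year 1971.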


In [10]:
data = data.dropna(subset=['precipitation', 'pet', 'peti', 'discharge_spec'])
data = data.groupby(['id', 'water_year']).agg(
    precipitation=('precipitation', 'sum'),
    peti=('peti', 'sum'),
    pet=('pet', 'sum'),
    discharge_spec=('discharge_spec', 'sum'),
    valid_count=('id', 'count')  # Days per water year remaining after dropna (all variables present)
).reset_index()
data = data[data['valid_count'] > (365 * 0.95)]
data['aet'] = data['precipitation'] - data['discharge_spec']
data = data.groupby(['id'])[['precipitation', 'peti', 'pet', 'discharge_spec', 'aet']].sum()

Now let's compute the evaporative fraction (actual evaporation divided by precipitation) and the aridity index (potential evaporation, here taken from the `peti` column, divided by precipitation):

In [11]:
data['evaporative_fraction'] = data['aet'] / data['precipitation']
data['aridity_index'] = data['peti'] / data['precipitation']
data.head()

id,precipitation,peti,pet,discharge_spec,aet,evaporative_fraction,aridity_index
97002,47092.45,20068.04,17697.72,29943.8,17148.65,0.364149,0.426141


Is the catchment energy limited or water limited? 
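As a rough guide, the question can be framed in terms of the aridity index: when potential evaporation is smaller than precipitation, evaporation is constrained by the available energy, otherwise by the available water. The sketch below encodes that rule. The function name `limiting_factor` is hypothetical, the example values are made up, and treating an index of exactly 1 as water limited is an arbitrary convention.

```python
def limiting_factor(aridity_index):
    """Classify a catchment from its long-term aridity index (Ep / P)."""
    if aridity_index < 1:
        return 'energy limited'  # potential evaporation is less than precipitation
    return 'water limited'       # potential evaporation meets or exceeds precipitation

for example_aridity in (0.5, 2.0):  # hypothetical values, not taken from the dataset
    print(example_aridity, limiting_factor(example_aridity))
```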

Now I would like you to plot the Budyko curve using every catchment in the CAMELS-GB dataset. However, I would like you to implement some data quality procedures: 

1. Exclude water years with less than 95% data availability; 
2. Exclude catchments with fewer than 20 years that meet criterion (1). 

To get you started I have read in one of the metadata files to retrieve the list of catchment IDs, and started off the loop. 

In [12]:
metadata = pd.read_csv(os.path.join(DATADIR, 'CAMELS_GB_topographic_attributes.csv'))
metadata['gauge_id'] = metadata['gauge_id'].astype(str)
catchment_ids = metadata['gauge_id'].to_list()

results = []
for id in catchment_ids: 
    data = pd.read_csv(os.path.join(DATADIR, 'timeseries', f'CAMELS_GB_hydromet_timeseries_{id}_19701001-20150930.csv'), parse_dates=[0])
    data['id'] = id

    # Do something here 

    if data.shape[0] > 1: 
        raise NotImplementedError()

    results.append(data)

data = pd.concat(results).reset_index(drop=True)
data.head()
    

NotImplementedError: 
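One possible way to fill in the placeholder is sketched below for a single catchment. It is not the only approach. The function name `summarise_catchment` and its parameters `min_coverage` and `min_years` are hypothetical, the water year is computed arithmetically instead of with `to_period`, and the demonstration uses synthetic data so that it runs without the CAMELS-GB files.

```python
import numpy as np
import pandas as pd

def summarise_catchment(data, min_coverage=0.95, min_years=20):
    """Return long-term ratios for one catchment, or None if it fails the checks."""
    cols = ['precipitation', 'peti', 'discharge_spec']
    data = data.dropna(subset=cols).copy()
    # October to December belong to the water year ending the following September
    data['water_year'] = data['date'].dt.year + (data['date'].dt.month >= 10).astype(int)
    annual = data.groupby('water_year')[cols].sum()
    annual['valid_count'] = data.groupby('water_year').size()
    # Check 1: keep water years with at least 95% of days present
    annual = annual[annual['valid_count'] >= 365 * min_coverage]
    # Check 2: reject catchments with too few valid water years
    if len(annual) < min_years:
        return None
    totals = annual[cols].sum()
    aet = totals['precipitation'] - totals['discharge_spec']
    return {
        'n_years': int(len(annual)),
        'evaporative_fraction': aet / totals['precipitation'],
        'aridity_index': totals['peti'] / totals['precipitation'],
    }

# Synthetic record: 22 water years of constant values, with one year of missing discharge
dates = pd.date_range('1990-10-01', '2012-09-30', freq='D')
synthetic = pd.DataFrame({'date': dates, 'precipitation': 3.0, 'peti': 1.5, 'discharge_spec': 2.0})
gap = (synthetic['date'] >= '1994-10-01') & (synthetic['date'] <= '1995-09-30')
synthetic.loc[gap, 'discharge_spec'] = np.nan

result = summarise_catchment(synthetic)
short_record = summarise_catchment(synthetic[synthetic['date'] < '1996-10-01'])
print(result)
print(short_record)
```

Inside the loop, a non-`None` result could be stored together with the catchment `id`, and catchments returning `None` skipped.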

Now that you have calculated the evaporative fraction and aridity index for every catchment, you can make the plot. Your plot should have: 
1. Aridity index on x-axis
2. Evaporative fraction on y-axis
3. A line showing the theoretical Budyko curve
4. Two straight lines showing the theoretical energy and water limits 
5. A point for every catchment (computed above). 
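The two limit lines in item 4 can be computed before plotting. The sketch below is one way to do it. The variable names `energy_limit`, `water_limit` and `upper_bound` are hypothetical, and the x-axis range of 0 to 5 is an assumption.

```python
import numpy as np

aridity_vals = np.arange(0, 5.1, 0.1)

# Energy limit: evaporation cannot exceed potential evaporation, so E/P <= Ep/P (the 1:1 line)
energy_limit = aridity_vals
# Water limit: evaporation cannot exceed precipitation, so E/P <= 1 (a horizontal line)
water_limit = np.ones_like(aridity_vals)
# Physically plausible catchments lie beneath both lines
upper_bound = np.minimum(energy_limit, water_limit)
```

The energy limit is usually drawn only for aridity values between 0 and 1, where it is the binding constraint. Each array can then be passed, together with `aridity_vals`, to a plotting call.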

For reference, the Budyko curve can be approximated by the following expression (the form attributed to Schreiber): 

$\frac{\overline{E}}{\overline{P}} = \left\{1 - \exp{\left({-\frac{\overline{E_p}}{\overline{P}}}\right)}\right\}$

To create the theoretical Budyko curve, create a vector of numbers between 0 and 5 representing $\frac{\overline{E_p}}{\overline{P}}$:



In [14]:
import numpy as np
aridity_vals = np.arange(0, 5.1, 0.1)

Now supply these values to the above equation to get the predicted evaporative fraction. 
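A minimal sketch of this step is shown below, using the equation exactly as written above. The function name `budyko_curve` and the variable `predicted_ef` are hypothetical.

```python
import numpy as np

def budyko_curve(aridity):
    """Predicted evaporative fraction E/P = 1 - exp(-Ep/P) for a given aridity index."""
    aridity = np.asarray(aridity, dtype=float)
    return 1.0 - np.exp(-aridity)

aridity_vals = np.arange(0, 5.1, 0.1)
predicted_ef = budyko_curve(aridity_vals)
```

The curve starts at zero, rises monotonically, and approaches but never reaches 1, so it stays beneath both the energy limit and the water limit.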