# Lesson: degree day model

Today we are going to use the [degree day model](http://www.antarcticglaciers.org/glaciers-and-climate/numerical-ice-sheet-models/modelling-glacier-melt/) (DDM) as a pretext to learn a little bit more about logical tools in numpy/pandas.


**Take some time to read the webpage linked above, which explains how the simple degree day model works.**

We will use the AWS data at Zhadang, but in better shape this time (with variable names and corrected time). It is on OLAT: data_Zhadang_localtime.csv

In [None]:
# imports and defaults
import pandas as pd  
%matplotlib inline 
import matplotlib.pyplot as plt
import numpy as np
pd.options.display.max_rows = 14
import seaborn as sns
sns.set_style('ticks')
sns.set_context('talk')

In [None]:
# read the data
df = pd.read_csv('data/data_Zhadang_localtime.csv', index_col=0, parse_dates=True)

The DDM works best with daily (or sometimes even monthly) time steps. So we resample:

In [None]:
# resample 
df = df.resample('D').mean()
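
To see what the resampling step does, here is a minimal sketch on made-up hourly values (the `toy` dataframe and its numbers are assumptions for illustration, not the Zhadang data):

```python
import numpy as np
import pandas as pd

# Made-up hourly record covering two full days (not the Zhadang data)
times = pd.Timestamp('2012-05-15') + pd.to_timedelta(np.arange(48), unit='h')
toy = pd.DataFrame({'TEMP': np.repeat([1.0, 3.0], 24)}, index=times)

# One row per calendar day, each holding the mean of that day's 24 values
daily = toy.resample('D').mean()
print(daily)
```

The 48 hourly rows collapse into two daily rows, one per calendar day.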

To test the model, we are concentrating on the melting season 2012, which I define as follows:

In [None]:
# from the middle of May onwards
df = df.loc['2012-05-15':]
# how can we see if ablation is happening?
df['SR50'].plot(title='Surface height in m (0 = ice)');

## 1. Simplest DDM

For the simplest DDM, we are going to use a single factor for the entire melting period. The DDM formulation is very simple:

$$Melt = f \cdot PDD$$

where PDD (positive degree days) is the sum of the daily average temperatures above 0°C over a given time period and $f$ is a melt factor. Let's express the melt in meters of snow/ice melted away. Determine the unit that $f$ should have in that (quite clumsy) case.
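
As a small worked example of the formula, assume five made-up daily mean temperatures and an arbitrary factor (both are illustrative values only, not calibrated to anything):

```python
import numpy as np

# Made-up daily mean temperatures in °C (not the Zhadang data)
temps = np.array([-2.0, 1.0, 3.0, -1.0, 4.0])

# Only the days above 0 °C contribute: 1 + 3 + 4 = 8
pdd = temps[temps > 0].sum()

# Arbitrary factor, chosen only to make the arithmetic easy
f = 0.5
melt = f * pdd
print(pdd, melt)
```

The two days below 0°C do not reduce the sum; they simply contribute nothing.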

OK, so now we need to compute the PDD sum over that period. One method that you already learned is the following:

In [None]:
# select all days with temp above 0
seltemp = df.TEMP.loc[df.TEMP > 0]
# sum it
seltemp.sum()

For the purpose of our modelling, however, it is easier to define a new variable (PDD), which is equal to the daily average temperature when it is above 0°C, and to zero otherwise. For this we use the numpy function [np.where](http://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.where.html):

In [None]:
df['PDD'] = np.where(df.TEMP > 0, df.TEMP, 0)
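
A minimal sketch of `np.where` on a small made-up array (the values and the `'warm'`/`'cold'` labels are only for illustration):

```python
import numpy as np

a = np.array([-1.5, 0.0, 2.0, 3.5])

# Where the condition is True take the value from `a`, elsewhere take 0
out = np.where(a > 0, a, 0)

# The second and third arguments do not have to be numbers
labels = np.where(a > 0, 'warm', 'cold')

print(out)
print(labels)
```

The result always has the same shape as the condition: each element is picked from the second argument where the condition holds and from the third otherwise.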

**Q: Read the documentation for np.where. Try it out with simple data. Can you understand what it does? Verify that the sum of this new PDD variable corresponds to our computation above.**

In [None]:
# your answer here

Now we calibrate our model, i.e. we compute our factor $f$. The total snow/ice melt (in m) during this period, expressed as a change in surface height (so negative if the surface has lowered), is:

In [None]:
obs_melt = df.SR50.iloc[-1] - df.SR50.iloc[0]  # what is iloc[] by the way? Ask the notebook!
obs_melt
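
In short, `iloc[]` selects by integer position, while `loc[]` selects by index label. A minimal sketch on a made-up three-day series (the values are assumptions, not the SR50 data):

```python
import pandas as pd

dates = pd.to_datetime(['2012-05-15', '2012-05-16', '2012-05-17'])
s = pd.Series([5.0, 4.0, 2.5], index=dates)

first = s.iloc[0]    # position 0, whatever its label is
last = s.iloc[-1]    # negative positions count from the end
same_as_first = s.loc[pd.Timestamp('2012-05-15')]  # selection by label

print(first, last, same_as_first)
```

Using positions is convenient here because we want the first and last values without having to know their dates.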

Which gives us a melt factor:

In [None]:
melt_factor = obs_melt / df['PDD'].sum()
melt_factor

It is now very easy to define a variable (MELT1), which is the daily melt due to this factor:

In [None]:
df['MELT1'] = df['PDD'] * melt_factor

**Q: Plot this new variable. What are we looking at?**

In [None]:
# answer here

In order to compare this melt with observations, we should compute its cumulative sum: 

In [None]:
df['MELT1'] = (df['PDD'] * melt_factor).cumsum()
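
To see what `cumsum()` does, here is a minimal sketch with made-up daily height changes (illustrative numbers only):

```python
import pandas as pd

# Made-up daily surface height changes in m (negative = surface lowering)
daily_change = pd.Series([0.0, -0.5, -0.25, 0.0, -1.0])

# Running total: each element is the sum of all elements up to and including it
total_change = daily_change.cumsum()
print(total_change.tolist())
```

Days without melt leave the running total flat, and the last element equals the sum over the whole period.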

If you plot this variable again, you will see that something is still missing. Therefore, we now add the starting snow depth to this timeseries. We select the first element of the observations array with `iloc[]` and add it to ours:

In [None]:
df['MELT1'] = (df['PDD'] * melt_factor).cumsum() + df.SR50.iloc[0]

Done! Let's plot the result of our modeling approach:

In [None]:
df[['MELT1', 'SR50']].plot();

**Q: Discuss the performance of our model. Where is it performing well? Where is it performing less well? Can you tell why?**

## 2. A more reasonable DDM

It is more reasonable to distinguish between snow and ice in our model. Fortunately, the person who provided the data nicely set the 0 level to the original ice surface before the ablation season.

**Q: Add a new variable IS_SNOW to the dataframe, which is equal to True when the surface is above 0 and to False otherwise.**

In [None]:
# your answer here

In [None]:
# the comparison alone already yields booleans, so this is equivalent to: df.SR50 > 0
df['IS_SNOW'] = np.where(df.SR50 > 0, True, False)

We are now computing the melt factor for the snowmelt period:

In [None]:
# first, compute the PDD sum during the snowmelt period:
pdd_snow = df.PDD.loc[df['IS_SNOW']].sum()
# Then, compute the observed melt during this period. It is simply:
melt_snow = - df.SR50.iloc[0]
# Finally, compute the factor:
fac_snow = melt_snow / pdd_snow
fac_snow

For the ice surface, we have to introduce a new operator, "~". Applied to a boolean array, it is the element-wise logical "not":

In [None]:
print(~ np.array([True, False, True]))
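
The same operator works on a boolean pandas Series, which is what we need to split a variable into two complementary periods. A minimal sketch on made-up data (the names `height`, `pdd` and `is_snow` are assumptions for this toy example):

```python
import pandas as pd

# Made-up surface heights (m) and daily PDD values
height = pd.Series([0.5, 0.25, -0.25, -1.0])
pdd = pd.Series([1.0, 2.0, 3.0, 4.0])

is_snow = height > 0                    # True, True, False, False
pdd_snow = pdd.loc[is_snow].sum()       # days where the mask is True
pdd_not_snow = pdd.loc[~is_snow].sum()  # days where the mask is False

print(pdd_snow, pdd_not_snow)
```

Note that the Python keyword `not` cannot be used here: `not is_snow` raises a `ValueError`, because the truth value of a whole Series is ambiguous.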

Once we have this, computing the PDD sum during the ice period is easy:

In [None]:
# first, compute the PDD sum during the ice melt period:
pdd_ice = df.PDD.loc[~ df['IS_SNOW']].sum()  # note the ~
# Then, compute the observed melt during this period. It is simply:
melt_ice = df.SR50.iloc[-1]
# Finally, compute the factor:
fac_ice = melt_ice / pdd_ice
fac_ice

**Q: Compare the two factors. Discuss their relative value in light of the physical properties of snow and ice. Does it make sense to you?**

**Q: Define a new variable (MIXED_FAC) in the dataframe, which is equal to fac_snow during the snow period and to fac_ice otherwise. Using the same approach as before, compute a new variable MELT2 which is the cumulative melt during that period. Plot it together with the SR50 observations.**

In [None]:
# your answer here

In [None]:
df['MIXED_FAC'] = np.where(df['IS_SNOW'], fac_snow, fac_ice)
df['MELT2'] = (df['PDD'] * df['MIXED_FAC']).cumsum() + df.SR50.iloc[0]
df[['MELT2', 'SR50']].plot();

**Q: Discuss the performance of our new model. Is it performing better than before? Can you tell why?**
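
If you want to complement the visual comparison with a number, one possible choice is the root-mean-square error between a modelled and an observed series. The sketch below uses made-up series, and `rmse` is a hypothetical helper, not something defined earlier in this lesson:

```python
import numpy as np
import pandas as pd

def rmse(model, obs):
    """Root-mean-square difference between two series of equal length."""
    return float(np.sqrt(((model - obs) ** 2).mean()))

# Made-up surface heights in m (not the Zhadang data)
obs = pd.Series([1.0, 0.5, 0.0, -1.0])
model_a = pd.Series([1.0, 0.0, -0.5, -1.0])  # deviates in the middle
model_b = pd.Series([1.0, 0.5, 0.0, -1.0])   # matches the observations exactly

print(rmse(model_a, obs), rmse(model_b, obs))
```

A lower value means the model stays closer to the observations on average, but a single number does not tell you when or why the model deviates, so it does not replace looking at the plot.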

## 3. An even more reasonable DDM

In our previous models, we completely neglected snowfall, which of course is bad. If you are ambitious, you can try to propose solutions to this problem.
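
As one possible starting point (an assumption, not the intended solution), you could look for days on which the surface height increases and treat those increases as accumulation. A minimal sketch on made-up heights:

```python
import pandas as pd

# Made-up daily surface heights in m (not the Zhadang data)
height = pd.Series([1.0, 0.75, 1.25, 1.0, 0.5])

# Day-to-day change; the first value is NaN because there is no previous day
change = height.diff()

# Keep only the increases and set everything else (including the NaN) to 0
accumulation = change.clip(lower=0).fillna(0)
print(accumulation.tolist())
```

Keep in mind that an increase in measured surface height is not necessarily snowfall (it could be measurement noise or drifting snow), and that deriving the accumulation from the observations makes the model depend on the very data it is compared with.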

Now you can continue with the next exercise (on OLAT as usual).