# Group Assignment Q1 Week 1: Breaking the Ice

*[CEGM1000 MUDE](http://mude.citg.tudelft.nl/)*

*Written by: Jialei Ding, based on work by Robert Lanzafame*

*Due: Friday 5th September, 2025. This assignment does not need to be turned in*

## Part 1: Data and Models

### Introduction

<p>
The Nenana Ice Classic is an annual guessing game and fundraiser held in Nenana, Alaska.
Each year you can bet on the day and time that the ice on the Tanana River will break apart along the waterfront of the town of Nenana. (The town is called Nenana because it lies just upstream of the confluence of the Nenana River, a tributary, with the Tanana River.) Below you can see its position, with the Netherlands for scale!
<br />
<br />

<img src="images/nenana_map.png" width="600"/>

<br />
<br />
A tripod is constructed on the ice during the first weekend in March. This tripod is connected to a clock via a pulley system on the shore. When the ice starts moving and the tripod drifts 100 feet downriver, the line tightens and triggers the clock to stop, marking the official break-up time. The record earliest break-up was in 2019 (April 14). The latest was in 1964 (May 20).

You can buy a ticket for $3 and place a bet between Feb 1 and April 5. Our goal is to create a model to predict the ice break-up and WIN!!!

Visit the [website](https://nenanaakiceclassic.com) for more information, or to view a live webcam of the river.
In addition to the official website, you can read more about the competition here:
- [Blogpost from a local](https://rivahman.blogspot.com/2010/04/nenana-ice-classic.html)
- [More on the clock mechanism](https://www.adn.com/opinions/2017/05/01/this-antique-engineering-marvel-records-spring-breakup-in-alaska-like-clockwork/)

<img src="images/tripod.png" width="600"/>

[image source](https://en.wikipedia.org/wiki/Nenana_Ice_Classic)
<p/>

### System Components
The setup of the tripod and clock directly determines the definition of ice break-up: a rope is attached from the tripod to a clock on the riverbank. When the rope is pulled tight it stops the clock; this time is used to determine the winning guess of the breakup day and time (down to the minute). As such, these components could be interesting to include in a model. A diagram of the setup is shown below.

![setup](images/setup.png)

*Diagram by Robert Lanzafame*

### Available data and scale
When building a model to predict the break-up time, it is essential to consider the scale of the data available. As a general rule, larger-scale data shows overall trends but may be too generic, while smaller-scale data shows more variation because it is not smoothed by averaging.
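
The smoothing effect of averaging can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data is synthetic, and the number of sites, the shared trend and the noise level are hypothetical values that have nothing to do with the Nenana record.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is reproducible

n_years, n_sites = 100, 50
# Hypothetical shared signal: a slow trend that every site experiences
trend = np.linspace(0.0, 1.0, n_years)
# Each site adds its own independent noise (standard deviation 1.0)
local_series = trend[:, None] + rng.normal(0.0, 1.0, size=(n_years, n_sites))

# "Large-scale" series: the average over all sites for each year
regional_series = local_series.mean(axis=1)

# Spread around the shared trend at one site versus in the average
local_spread = np.std(local_series[:, 0] - trend)
regional_spread = np.std(regional_series - trend)
print(f'Spread at a single site: {local_spread:.2f}')
print(f'Spread of the average:   {regional_spread:.2f}')
```

The averaged series follows the shared trend much more closely than any single site does, which is why large-scale data looks smoother but also carries less site-specific information.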

The **local scale** concerns itself with the data collected in Nenana itself or immediately around the Tanana River at the town. It's the most fine-grained and directly related to the tripod and ice. Examples include the local river discharge and ice thickness (although this is extremely difficult to measure).

Next, the **watershed scale** includes data pertaining to the Tanana River watershed, including upstream areas that feed into the Tanana. This includes regions like the Alaska Range, tributaries (including the Nenana River!), and snowfields. An example is the annual snowpack and precipitation rates in the Alaska Range.

The **regional scale** covers a larger area than the local or watershed scale, for example Interior Alaska or the entire state. It captures trends affecting multiple towns, rivers, and ecosystems. We can, for instance, look at the sun hours of the region to estimate the amount of solar radiation.

The **global scale** refers to planet-wide systems that can affect Alaska indirectly, including the little town of Nenana. Average global temperature is perhaps the best-known measure at this scale. We have indeed seen a direct correlation between the average global temperature and the ice break-up date!

Below is a table showing some examples of data at the scales discussed.

![data](images/data.png "data")

Data Sources: [USGS data](https://waterdata.usgs.gov/nwis/dvstat), [NOAA data](https://www.ncdc.noaa.gov), [CRU data](http://www.cru.uea.ac.uk/data/), [BerkeleyEarth](https://berkeleyearth.org/data/)

When building a model, you should consider using different scales. Long-term global and regional trends give you the “big picture”, while the watershed and local scales give you information that is more precisely related to the ice break-up.

<p>

### Example Models
These models are arranged from large to small scale. Each can be considered one of many tools that could inform your prediction for the break-up day and time in the Nenana Ice Classic. Each model may have more than one "sub-model" that you can consider.

<table>
    <tr>
        <td>Model 1</td>
        <td>From a global climate model that considers ENSO (sea-surface temperatures in the tropical Pacific Ocean) and Alaska temperatures, (a) determine the heat exchange of the system, which leads to (b) precipitation, (c) river discharge and (d) ice melting rate.</td>
    </tr>
    <tr>
        <td>Model 2</td>
        <td>Consider the heat of the sun, cloud coverage and the snow/ice cover at a regional scale (all of Alaska, for instance) to (a) determine the heat absorption of the ground and/or snow/ice. Then (b) determine the rate of ice melting in the river from the river temperature and discharge due to snowmelt and rainfall.</td>
    </tr>
    <tr>
        <td>Model 3</td>
        <td>Consider river water discharge and river water temperature to (a) determine the ice melting rate in the river within 1 km upstream of Nenana, which is used to (b) predict deformation and movement of ice downstream and (c) the tension in the rope until it reaches the point at which the clock is stopped.</td>
    </tr>
    <tr>
        <td>Model 4</td>
        <td>Given a velocity of ice moving (slowly) downstream as it melts and deforms (as a plastic continuum), where (a) the velocity is derived from past measurements that are represented with a probability distribution, (b) calculate the rope tension as in Model 3 above.</td>
    </tr>
</table>

<p/>

<div style="background-color:#AABAB2; color: black; width:90%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<p>

$\text{Task 1.1:}$
    
Answer the questions in the report under **Part 1** with reference to the example models. You don't need to make changes to the notebook for this task.
    
</p>
</div>

## Part 2: Model vs Model
In this part of the assignment we will fit two models to observations of ice break-up date and reflect on their performance.

### Exploring the data

The data we are working with represents the number of days since the new year that it took for the ice in the river to break apart. The record goes from 1917 to 2025, which is in total 103 years of measurements. The data in this notebook has been pre-prepared for you; the original data can be downloaded [here](https://daacdata.apps.nsidc.org/pub/DATASETS/nsidc0064_nenana_ice_classic_v2/).
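
To get a feeling for what "days since the new year" means, here is a minimal sketch that converts the two record break-up dates from the introduction into a day number. The helper `day_of_year` is a hypothetical name, and the convention that January 1 counts as day 1 is an assumption; the prepared dataset may use a different offset or include the time of day as a fraction.

```python
from datetime import date

def day_of_year(year, month, day):
    """Return the day number within the year (January 1 is day 1)."""
    return date(year, month, day).timetuple().tm_yday

# Record break-up dates mentioned in the introduction
earliest = day_of_year(2019, 4, 14)
latest = day_of_year(1964, 5, 20)
print(f'Earliest break-up (April 14, 2019): day {earliest}')
print(f'Latest break-up (May 20, 1964): day {latest}')
```

Note that 1964 was a leap year, so the same calendar date maps to a day number that is one higher than in a non-leap year.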

Before starting, run the cell below to import the required packages.


In [1]:
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as sci
import scipy.optimize as opt

<div style="background-color:#AABAB2; color: black; width:90%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<p>

$\text{Task 2.1:}$
Import the data by running the cell below. To get more used to reading code, try to explain to each other what is happening in the code. 
    
</p>
</div>

In [None]:
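# Load the data: column 0 is the year, column 1 is the break-up day.
# skiprows=1 skips the header row; values are read as floats by default.
data = np.loadtxt('data/days.csv', delimiter=',', skiprows=1)
print(data[0:10])

# More information about the data
shape_of_data = np.shape(data)
print('Shape of data:', shape_of_data)

# Summary statistics of the second column (the variable of interest)
mean = np.mean(data[:, 1])
std = np.std(data[:, 1])
print(f'Mean: {mean:.3f}\nStandard deviation: {std:.3f}')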

<div style="background-color:#FAE99E; color: black; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px; width: 90%">

$\text{Solution 2.1:}$

We import the dataset using the NumPy function *loadtxt*. Inside the function we specify the following:
- We read the file named *data/days.csv*.
- We set the delimiter between columns with *delimiter=','*.
- We skip the first row (which contains the column headers) with *skiprows=1*.

We also print the first 10 elements of the array, so you can see how it actually looks.

Then, we find the shape of the data. The result is (103, 2), i.e., a matrix with 103 rows and 2 columns. The first column contains the year of record, while the second one contains the measured data.

We can also compute the mean and the standard deviation of the variable of interest (second column) to get a sense of how the variable behaves.

</p></div>
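
If you want to experiment with `np.loadtxt` without touching the data file, the sketch below reads a small CSV from a string instead. The header names and the three rows are made up for illustration and are not taken from *data/days.csv*.

```python
import io
import numpy as np

# Hypothetical file contents: a header row followed by "year,day" rows
csv_text = "year,day\n2001,128.5\n2002,127.9\n2003,119.8\n"

# io.StringIO lets np.loadtxt read the string as if it were a file
example = np.loadtxt(io.StringIO(csv_text), delimiter=',', skiprows=1)

print(example.shape)          # (rows, columns)
print(example[:, 0])          # first column: years
print(example[:, 1].mean())   # mean of the second column
```

Try removing `skiprows=1` to see what happens when the header row is parsed as numbers.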

<div style="background-color:#AABAB2; color: black; width:90%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<p>

$\text{Task 2.2:}$

Plot the data in a scatterplot using the code below. Take note of any interesting trends you might see.
    
</p>
</div>

In [None]:
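# Plot the data (convert to float in case the columns were loaded as text)
years = data[:, 0].astype(float)
days = data[:, 1].astype(float)
plt.scatter(years, days, label='Measured data')
plt.xlabel('Year [-]')
plt.ylabel('Break-up day [days since new year]')
plt.title(f'Ice break-up day per year, {years[0]:.0f}-{years[-1]:.0f}')
plt.legend()
plt.grid()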

<div style="background-color:#FAE99E; color: black; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px; width: 90%">

$\text{Solution 2.2:}$

The break-up usually falls between 110 and 140 days after the new year.
There seems to be a slight downward trend in the number of days it takes for the ice to break up.

</p></div>

### Model 1

We are going to create a model that allows us to predict the number of days until the ice breaks as a function of the year. For that, we are going to assume a linear relationship between the variables (a linear model) and fit it using linear regression. That is, we will fit a regression model $\text{days} = m \cdot \text{year} + q$, where $m$ represents the slope of the line and $q$ is the intercept.

We will do it using functions that are already coded for us: the `scipy.stats` library contains the `linregress` function. For more information, see the SciPy documentation for `linregress`.
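
As a rough idea of what such a fit looks like, here is a minimal sketch using `linregress` on synthetic data. The year range, slope, intercept and noise level are hypothetical values chosen for illustration; this is not the Nenana record and not necessarily the code used in the notebook cell below.

```python
import numpy as np
import scipy.stats as sci

rng = np.random.default_rng(seed=1)

# Synthetic stand-in for the real record (NOT the Nenana data)
years = np.arange(1920, 2020)
true_slope, true_intercept = -0.07, 260.0
days = true_slope * years + true_intercept + rng.normal(0.0, 6.0, size=years.size)

# Fit days = m * year + q by ordinary least squares
result = sci.linregress(years, days)
m, q = result.slope, result.intercept
print(f'slope m = {m:.3f} days/year, intercept q = {q:.1f} days')
print(f'R^2 = {result.rvalue**2:.3f}')

# Evaluate the fitted line at every year in the record
days_fit_linear = m * years + q
```

The slope $m$ tells you how many days earlier (negative) or later (positive) the break-up occurs per year on average, according to the fitted line.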


In [None]:
# regression here

One way of assessing the uncertainty around the predictions of a model is to use confidence intervals. They give us insight into the precision of the predictions by expressing them in terms of probabilities. In short, the 95% confidence interval (significance $\alpha=0.05$) shows the range of values within which an observation would fall with a probability of 95%. Here, we want you to focus on their interpretation. In the following weeks (1.3), you will learn more about how to compute them.

**Note:**
**The confidence intervals as computed here are based on a simplification. Later in the course, you will see how to compute the confidence intervals correctly. This will also result in a different 'shape' (curved intervals).**
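
One possible simplification, sketched below, is to assume that the scatter around the fitted line is normally distributed with a single constant standard deviation, which gives a band of constant width around the line. This is an assumption made for illustration and may differ from the simplification used in this notebook; the data is again synthetic.

```python
import numpy as np
import scipy.stats as sci

rng = np.random.default_rng(seed=2)

# Synthetic stand-in for the real record (NOT the Nenana data)
years = np.arange(1920, 2020)
days = -0.07 * years + 260.0 + rng.normal(0.0, 6.0, size=years.size)

fit = sci.linregress(years, days)
days_fit = fit.slope * years + fit.intercept

# Simplification (an assumption of this sketch): the scatter around the line
# is normal with one constant standard deviation, estimated from the residuals
residual_std = np.std(days - days_fit)

alpha = 0.05
k = sci.norm.ppf(1 - alpha / 2)   # about 1.96 for a 95% interval

lower = days_fit - k * residual_std
upper = days_fit + k * residual_std

# Fraction of observations that fall inside the band
inside = np.mean((days >= lower) & (days <= upper))
print(f'k = {k:.2f}, band half-width = {k * residual_std:.1f} days')
print(f'Fraction of observations inside the band: {inside:.2f}')
```

If the assumption is reasonable, roughly 95% of the observations should fall between `lower` and `upper`.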


In [None]:
#intervals here

### Model 2



As we have seen, the data-driven linear model is not really a good choice for representing the data we have. Let's try one that is slightly more complicated: a non-linear model.

In this section, we will analyze the fit of a quadratic model, $\text{days} = A \cdot \text{year}^2 + B \cdot \text{year} + C$. The steps are the same as in the previous section, so we will move quickly through the code to focus on the interpretation and comparison between the two models.

You do not need to worry about this right now, but in case you are curious: we will make use of the `scipy.optimize` library, which contains the `curve_fit` function. For further information, see the SciPy documentation for `curve_fit`.
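
For the curious, a minimal sketch of a quadratic fit with `curve_fit` is shown below. It uses synthetic data with hypothetical parameter values, and it measures time in years since the first record rather than in calendar years. That shift is an assumption of this sketch to keep the numbers small; it changes the values of $A$, $B$ and $C$ but not the fitted curve.

```python
import numpy as np
import scipy.optimize as opt

def quadratic(x, a, b, c):
    """Quadratic model: a * x**2 + b * x + c."""
    return a * x**2 + b * x + c

rng = np.random.default_rng(seed=3)

# Synthetic stand-in for the real record (NOT the Nenana data)
years = np.arange(1920, 2020)
# Assumption of this sketch: measure time from the first year so that x**2
# stays small, which keeps the fit numerically well behaved
x = (years - years[0]).astype(float)
true_params = (-0.0005, -0.02, 126.0)
days = quadratic(x, *true_params) + rng.normal(0.0, 6.0, size=x.size)

# popt holds the fitted (A, B, C); pcov is their estimated covariance matrix
popt, pcov = opt.curve_fit(quadratic, x, days)
A, B, C = popt
print(f'A = {A:.5f}, B = {B:.4f}, C = {C:.2f}')

# Evaluate the fitted curve at every year in the record
days_fit_quadratic = quadratic(x, *popt)
```

The first argument of `curve_fit` is the model function, whose first parameter is the independent variable and whose remaining parameters are the coefficients to be fitted.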


In [None]:
# regression and plot here

<div style="background-color:#AABAB2; color: black; width:90%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<p>

$\text{Task 2.3:}$
    
After running the two models above, answer **Part 2** of the report.
    
</p>
</div>

<div style="background-color:#FAE99E; color: black; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px; width: 90%">

$\text{Solution 2.3:}$

See report solutions

</p></div>

## Part 3: More Advanced Models

Add info from Robert slides

Add links to papers shared by Stuart, Kieran and Robert

<div style="background-color:#AABAB2; color: black; width:90%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<p>

$\text{Task 3.1:}$
    
Using the information given and linked below, answer **Part 3** of the report. You may want to split the readings among your group members and then come together to discuss.
    
</p>
</div>

<div style="background-color:#FAE99E; color: black; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px; width: 90%">

$\text{Solution 3.1:}$

See report solutions

</p></div>

<div style="margin-top: 50px; padding-top: 20px; border-top: 1px solid #ccc;">
  <div style="display: flex; justify-content: flex-end; gap: 20px; align-items: center;">
    <a rel="MUDE" href="http://mude.citg.tudelft.nl/">
      <img alt="MUDE" style="width:100px; height:auto;" src="https://gitlab.tudelft.nl/mude/public/-/raw/main/mude-logo/MUDE_Logo-small.png" />
    </a>
    <a rel="TU Delft" href="https://www.tudelft.nl/en/ceg">
      <img alt="TU Delft" style="width:100px; height:auto;" src="https://gitlab.tudelft.nl/mude/public/-/raw/main/tu-logo/TU_P1_full-color.png" />
    </a>
    <a rel="license" href="http://creativecommons.org/licenses/by/4.0/">
      <img alt="Creative Commons License" style="width:88px; height:auto;" src="https://i.creativecommons.org/l/by/4.0/88x31.png" />
    </a>
  </div>
  <div style="font-size: 75%; margin-top: 10px; text-align: right;">
    &copy; Copyright 2025 <a rel="MUDE" href="http://mude.citg.tudelft.nl/">MUDE</a> TU Delft. 
    This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by/4.0/">CC BY 4.0 License</a>.
  </div>
</div>