# Trends in atmospheric carbon dioxide concentration (30 points)

The carbon dioxide data measured on Mauna Loa in Hawaii constitute the longest record of direct measurements of CO$_2$ in the atmosphere. They were started by C. David Keeling of the Scripps Institution of Oceanography in March of 1958 at a facility of the National Oceanic and Atmospheric Administration [Keeling, 1976]. NOAA started its own CO$_2$ measurements in May of 1974, and they have run in parallel with those made by Scripps since then [Thoning, 1989].

In [2]:
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import pandas as pd
from scipy import stats

In [4]:
# Monthly Average Carbon Dioxide Concentration in parts per million (ppm) 
# measured at Mauna Loa Observatory, Hawaii
# source: https://scripps.ucsd.edu/programs/keelingcurve/permissions-and-data-sources/
data = pd.read_csv('Data/monthly_in_situ_co2_mlo.csv')
data.tail()

Unnamed: 0,1,Year,Month,Date,CO2_ppm
716,723,2018,4,2018.2877,410.31
717,724,2018,5,2018.3699,411.31
718,725,2018,6,2018.4548,410.88
719,726,2018,7,2018.537,408.9
720,727,2018,8,2018.6219,407.1


## Plot the data (3 points)

**Plot the data, CO$_{2}$ concentration (`CO2_ppm`) as a function of `Date`. Don't forget to label the axes.**
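
As a starting point, here is a minimal sketch of a labeled line plot. It uses a made-up series (`t` and `y` are hypothetical stand-ins, not the Mauna Loa record), so you will need to swap in the `Date` and `CO2_ppm` columns of `data` yourself.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical stand-in data (NOT the real record):
# a slow rise plus a small wiggle, sampled 12 times per year
t = np.linspace(2000.0, 2010.0, 121)   # decimal years
y = 370.0 + 2.0 * (t - 2000.0) + 3.0 * np.sin(2 * np.pi * t)

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(t, y, color="k", linewidth=1)
ax.set_xlabel("Date (decimal year)")          # always label the axes
ax.set_ylabel("CO$_2$ concentration (ppm)")
ax.set_title("Synthetic example series")
```

In a notebook the figure is drawn when the cell finishes; in a plain script you would also call `plt.show()`.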

## Fit a linear model to the data (8 points)

- **Use `np.polyfit()` to determine the best-fit linear (i.e. degree 1) model to the CO$_{2}$ data.**

- **Use `np.polyval()` to find the model CO$_2$ concentration ($y$ values) for the dates ($x$ values).**

- **Plot the data and the best-fit line.**

- **Calculate and plot the residual.**
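
The steps above can be sketched on a tiny made-up data set. The arrays `x` and `y` below are hypothetical, chosen so the result is easy to check by eye; the same calls should apply to the `Date` and `CO2_ppm` columns.

```python
import numpy as np

# Hypothetical data: the line y = 2x + 1 with small known perturbations
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0 + np.array([0.1, -0.1, 0.0, 0.1, -0.1])

# Degree-1 fit; coefficients come back highest power first: [slope, intercept]
coeffs = np.polyfit(x, y, 1)

# Evaluate the fitted line at the same x values
y_model = np.polyval(coeffs, x)

# Residual = data minus model
residual = y - y_model
print(coeffs, residual)
```

For a least-squares line with an intercept, the residuals sum to zero, so any structure left in them (a bow shape, a regular wiggle) is signal the line did not capture.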

## Fit a quadratic model to the data (8 points)

- **Use `np.polyfit()` to determine the best-fit quadratic (i.e. degree 2) model to the CO$_{2}$ data.**

- **Use `np.polyval()` to find the model CO$_2$ concentration ($y$ values) for the dates ($x$ values).**

- **Plot the data and the best-fit curve.**

- **Calculate and plot the residual.**
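
The only change from the linear case is the degree passed to `np.polyfit()`. A minimal sketch, using hypothetical data generated from a known quadratic so the recovered coefficients can be checked:

```python
import numpy as np

# Hypothetical data drawn exactly from y = 0.5 x^2 - x + 3
x = np.linspace(0.0, 10.0, 21)
y = 0.5 * x**2 - 1.0 * x + 3.0

# Degree-2 fit returns three coefficients, highest power first
a, b, c = np.polyfit(x, y, 2)

# Evaluate the fitted curve and form the residual
y_model = np.polyval([a, b, c], x)
residual = y - y_model
print(a, b, c)
```

Because the made-up data contain no noise, the residual here is zero to within floating-point error; real data will not behave this neatly.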

## Which fit is better and what does that tell us? (3 points)

**Is the quadratic (degree 2) curve fit better or worse than the linear (degree 1) model? What does that tell you about the rate of increase in atmospheric CO$_2$ concentration with time?**
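
One way to put a number on "better" is the root-mean-square (RMS) residual of each fit. The helper `rms_residual` below is a hypothetical name introduced for illustration, and the data are made up; this is a sketch of one possible metric, not the required method.

```python
import numpy as np

def rms_residual(x, y, degree):
    """RMS misfit of a least-squares polynomial of the given degree."""
    coeffs = np.polyfit(x, y, degree)
    return np.sqrt(np.mean((y - np.polyval(coeffs, x)) ** 2))

# Hypothetical series whose rate of increase grows with x
x = np.linspace(0.0, 10.0, 50)
y = 300.0 + 1.0 * x + 0.2 * x**2

rms_linear = rms_residual(x, y, 1)
rms_quadratic = rms_residual(x, y, 2)
print(rms_linear, rms_quadratic)
```

A higher-degree polynomial can never fit the same data worse in the least-squares sense, so a smaller RMS on its own is weak evidence. Also check whether the residual plot loses its systematic shape.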

*write your answer here*

## What will CO$_2$ concentrations be in 2050 (if the same trend continues)?

Using your preferred fit, calculate what the atmospheric CO$_2$ level would be in the year 2050 if the data continue to follow this model. `np.polyval()` will allow you to do this. (2 points)
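
`np.polyval()` accepts any $x$ value, not only the ones used in the fit, which is what makes extrapolation possible. A minimal sketch with hypothetical numbers (`x_future` is an illustrative name):

```python
import numpy as np

# Hypothetical data lying exactly on y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 5.0, 7.0, 9.0])

coeffs = np.polyfit(x, y, 1)

# Evaluate the fitted model at a point beyond the observed range
x_future = 10.0
y_future = np.polyval(coeffs, x_future)
print(y_future)   # approximately 21 for this exact line
```

Keep in mind that an extrapolation is only as trustworthy as the assumption that the fitted trend continues unchanged.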

## Why is there a periodic oscillation in the data?

**You should see that there is periodic structure in the data and in the residual plot. What is the time-scale of this periodic variation? Do some research on your own and write up a paragraph describing why there is such up and down variability in addition to the long-term trend.** (6 points)
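
If you would like to estimate a period numerically rather than read it off the plot, one option is to look for the strongest peak in the Fourier spectrum of the residual. The sketch below assumes evenly spaced samples and uses a made-up signal with a 2-year cycle; the variable names and the signal are hypothetical.

```python
import numpy as np

# Hypothetical detrended signal: 20 years, 12 samples per year,
# containing a single oscillation with a 2-year period
samples_per_year = 12
t = np.arange(20 * samples_per_year) / samples_per_year
r = 3.0 * np.sin(2 * np.pi * t / 2.0)

# Amplitude spectrum and matching frequencies (cycles per year)
spectrum = np.abs(np.fft.rfft(r))
freqs = np.fft.rfftfreq(r.size, d=1.0 / samples_per_year)

# Skip the zero-frequency bin, then pick the strongest peak
k = np.argmax(spectrum[1:]) + 1
period_years = 1.0 / freqs[k]
print(period_years)
```

Real monthly records can have gaps or uneven spacing, in which case this simple approach needs care.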

*write your answer here*