# Fitting Functions to Real Data

In this notebook, we download measurements of atmospheric carbon dioxide from 1958 to 2019. Your task is to fit a function to this data in order to predict what the atmospheric $CO_2$ concentration will be in 2050. Our first step is to download the data, which we do below.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import numpy as np
from plotter import plot_fit
%matplotlib inline

# Here is the url source to a file which contains the measurements
url = "ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_mm_mlo.txt"

# Here we download the file, skip the header lines, and specify that the data is separated by whitespace
# (a raw string keeps Python from treating \s as an invalid escape sequence)
df = pd.read_csv(url, skiprows=72, sep=r'\s+').reset_index(drop=True)
# Here we rename the columns to something more convenient 
df.columns = ["year", "month", "decimal_year", "co2_av", "co2_interp", "co2_trend", "ignore"]
# Missing measurements are recorded as -99.99, so we drop those rows
df = df[~df['co2_av'].isin([-99.99])]
# Removing a column we don't need 
del df['ignore']
# Here we view the first five entries of the table 
df.head()


In the table above, the only columns we need to be concerned with are `decimal_year` and `co2_av`. Let's plot them and see what they look like.

In [None]:
ax = df.plot(x='decimal_year', y='co2_av', kind='scatter', figsize=(12, 8), s=4)
ax.set_xlabel("Year", size=16)
ax.set_ylabel("$CO_2$ (PPM)", size=16)
plt.show()


From the plot above, you will probably notice an increasing trend and some periodic behavior. Your task is to fit a function to this data. Feel free to apply _any_ transformation to the data and to fit _any_ function you wish. Feel free to try several. Be creative!
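
As one possible starting point (not the intended or only answer), the sketch below fits a quadratic trend plus a one-year sinusoid with `curve_fit`. It runs on synthetic stand-in data rather than the measurements loaded earlier; the function name `trend_plus_cycle`, the made-up parameter values, and the choice to measure time in years since the first sample are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def trend_plus_cycle(t, a, b, c, amp, phase):
    """Quadratic trend plus a sinusoid with a one-year period.

    t is time in years measured from a reference year, which keeps the
    polynomial terms small and the fit well conditioned.
    """
    return a + b * t + c * t**2 + amp * np.sin(2 * np.pi * t + phase)

# Synthetic stand-in data (hypothetical values, NOT the real measurements)
rng = np.random.default_rng(0)
t = np.linspace(0, 60, 720)
true_params = (315.0, 0.8, 0.012, 3.0, 0.5)
y = trend_plus_cycle(t, *true_params) + rng.normal(0, 0.3, t.size)

# A rough initial guess helps the optimizer with the nonlinear phase term
p0 = [y.mean(), 1.0, 0.0, 1.0, 0.0]
params, cov = curve_fit(trend_plus_cycle, t, y, p0=p0)

print("fitted parameters:", params)
```

To try something similar on the real data, one option is to pass `df['decimal_year'] - 1958` as the time variable, so that the parameters stay on a comparable scale.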


## Getting Started

Below we set up some placeholder code to get you started on this task. Good luck!

In [None]:
# Here we extract only the data we want from the table.
x_data = df['decimal_year']
y_data = df['co2_av']

# Feel free to transform this data as you see fit! 
# For example, if you wanted to take the square root of the entire column, you could type
# x_data_sqrt = np.sqrt(x_data)

# Create your function to fit here!
def co2_fit(x, YOUR_OTHER_ARGUMENTS_HERE):
    return # Your function here



# curve_fit returns the best-fit parameters and their estimated covariance matrix
values, fit_quality = curve_fit(co2_fit, x_data, y_data)
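
Once `curve_fit` has run, its first return value holds the best-fit parameters and its second holds their estimated covariance matrix. The sketch below shows one way to turn these into parameter uncertainties and a 2050 prediction. It uses a hypothetical straight-line model (`linear_model`) and synthetic data so that it runs on its own; the names ending in `_demo` and all numeric values are assumptions, not results from the real measurements.

```python
import numpy as np
from scipy.optimize import curve_fit

def linear_model(x, slope, intercept):
    """Hypothetical straight-line model, with x shifted so 1958 maps to zero."""
    return slope * (x - 1958.0) + intercept

# Synthetic stand-in data (made-up slope, intercept, and noise level)
rng = np.random.default_rng(1)
x_demo = np.linspace(1958, 2019, 200)
y_demo = linear_model(x_demo, 1.5, 310.0) + rng.normal(0, 0.5, x_demo.size)

values_demo, covariance_demo = curve_fit(linear_model, x_demo, y_demo)

# One-standard-deviation uncertainties are the square roots of the diagonal
param_errors = np.sqrt(np.diag(covariance_demo))

# Extrapolate by evaluating the model at the target year with the fitted parameters
prediction_2050 = linear_model(2050.0, *values_demo)

print("parameters:", values_demo)
print("uncertainties:", param_errors)
print("prediction for 2050:", prediction_2050)
```

With your own model, the equivalent call would be `co2_fit(2050, *values)`, applying to `2050` whatever transformation you applied to `x_data`.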
