# Fitting CO$_2$ data from Mauna Loa Observatory: Reprise

Data taken from [NOAA ESRL](https://www.esrl.noaa.gov/gmd/ccgg/trends/)


Now that we have a toolbox of matrix tools, we can compute polynomial fits of arbitrary order directly, without deriving a separate formula for each coefficient. Recall that, given data points $\{x_i,y_i\}$ with uncertainties $\{\sigma_i\}$, we can define the weighted data vector

$$
b_i = y_i / \sigma_i
$$

and then construct the solution for the coefficients $a_j$ as:


$$
\vec{a} = \left(A^TA\right)^{-1}A^T\vec{b}
$$

Here $A$ is the design matrix, with one row per data point and one column per power of $x$:

$$
A=\left[\begin{array}{cccc}{1 / \sigma_{0}} & {x_{0} / \sigma_{0}} & {x_{0}^{2} / \sigma_{0}} & {\ldots} \\ {1 / \sigma_{1}} & {x_{1} / \sigma_{1}} & {x_{1}^{2} / \sigma_{1}} & {\ldots} \\ {\vdots} & {\vdots} & {\vdots} & {\ddots} \end{array}\right]
$$


The uncertainties on the coefficients, $\sigma_{a_j} = \sqrt{C_{jj}}$, are the square roots of the diagonal elements of the covariance matrix

$$
C = \left(A^TA\right)^{-1}
$$
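
As a quick illustration of the formulas above, here is a minimal sketch that builds the design matrix and solves the normal equations for a synthetic quadratic. The function name `weighted_polyfit` and the test values are made up for this example; this is not the `polyfit` module imported below, whose implementation is not shown here. It assumes `np.vander` with `increasing=True` to generate the columns $x^0, x^1, x^2, \ldots$

```python
import numpy as np

def weighted_polyfit(x, y, sigma, order):
    """Weighted least-squares polynomial fit via the normal equations.

    Hypothetical helper for illustration only.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # Design matrix: column j holds x**j / sigma
    A = np.vander(x, order + 1, increasing=True) / sigma[:, None]
    b = y / sigma                   # weighted data vector
    C = np.linalg.inv(A.T @ A)      # covariance matrix
    a = C @ A.T @ b                 # coefficients a_0 ... a_order
    sig_a = np.sqrt(np.diag(C))     # 1-sigma uncertainties
    return a, sig_a, C

# Synthetic, noise-free quadratic (made-up values)
x = np.linspace(0.0, 10.0, 50)
y = 1.0 + 2.0 * x + 0.5 * x**2
sigma = np.full_like(x, 0.1)

a, sig_a, C = weighted_polyfit(x, y, sigma, 2)
print(a)
print(sig_a)
```

Because the synthetic data contain no noise, the fitted coefficients should reproduce the input values 1.0, 2.0 and 0.5 up to round-off.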




In [None]:
import matplotlib.pyplot as plt
import numpy as np
#import numpy.polynomial as poly
from polyfit import polyfit
from read_co2 import read_co2

In [None]:
# Make the plots a bit bigger to see
# NOTE: Must be done in a separate cell
plt.rcParams['figure.dpi'] = 100

## First try the polyfit function

In [None]:
# Get the CO2 data
dates,data,err = read_co2('co2_mm_mlo.txt')
sel = (data > 0) & (err > 0)
dates = dates[sel]
data = data[sel]
err = err[sel]

# Fit to polynomial
a, siga, C, chi2, yfit = polyfit(dates, data, err, 2)

plt.plot( dates, data, label="Data")
plt.plot(dates, yfit, label="Fit")
plt.legend()
plt.show()

# Reduced chi-square and coefficients (a quadratic has 3 fit parameters)
ndof = len(data) - 3
if ndof > 0:
    print(' chi-square/d.o.f. = ', chi2 / ndof)
    for i in range(3):
        print(' a[%3d] = %+6.2e +- %+6.2e' % (i, a[i], siga[i]))



## Explicitly try with matrix solution

In [None]:
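M = 3                # number of coefficients (quadratic fit)
x = dates
y = data
i = np.arange(M)     # polynomial powers 0, 1, ..., M-1
b = y / err          # weighted data vector b_i = y_i / sigma_i

# Tile x and err into (M, N) grids; the index grid itself is not needed
xv, _ = np.meshgrid(x, i)
ev, _ = np.meshgrid(err, i)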

In [None]:
# Design matrix, shape (N, M): A[n, j] = x_n**j / sigma_n
A = xv.T**i / ev.T

In [None]:
# Covariance matrix C = (A^T A)^-1
C = np.linalg.inv( np.matmul(A.T, A) )

In [None]:
# Coefficients a = C A^T b
a = np.matmul( np.matmul( C, A.T ), b )
# Uncertainties are the square roots of the diagonal of C
siga = np.sqrt(np.diag(C) )

In [None]:
# Loop over j rather than i so the array of powers defined above is not overwritten
if len(data) - M > 0 :
    for j in range(M):
        print ( ' a[%3d] = %+6.2e +- %+6.2e' % (j, a[j], siga[j]) )
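
The matrix solution can also give the chi-square directly, since the weighted residual vector is $\vec{b} - A\vec{a}$. The following is a minimal sketch on synthetic data (the noise level, seed and true coefficients are made up for illustration), not the calculation done inside the `polyfit` module.

```python
import numpy as np

# Synthetic noisy quadratic (made-up values for illustration)
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
sigma = np.full_like(x, 0.5)
y = 1.0 + 2.0 * x + 0.5 * x**2 + rng.normal(0.0, sigma)

M = 3                                              # number of coefficients
A = np.vander(x, M, increasing=True) / sigma[:, None]
b = y / sigma
C = np.linalg.inv(A.T @ A)
a = C @ A.T @ b

# Weighted residuals are b - A a, so chi-square is their squared norm
residuals = b - A @ a
chi2 = np.sum(residuals**2)
dof = len(x) - M                                   # degrees of freedom
chi2_per_dof = chi2 / dof
print(chi2_per_dof)
```

When the quoted uncertainties match the true scatter, as they do by construction here, the reduced chi-square should come out close to 1.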

## Finally, try with numpy

In [None]:
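# np.polyfit returns the highest power first; reverse so index i matches a[i] above
poly_coeffs, poly_cov = np.polyfit(dates, data, 2, w=1./err, cov="unscaled")
poly_coeffs = poly_coeffs[::-1]
poly_cov = poly_cov[::-1, ::-1]
for i in range(3):
    print(f"a{i} = {poly_coeffs[i]:.4e} +/- {poly_cov[i,i]**0.5:.2e}")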
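
One detail worth checking when comparing the approaches is coefficient ordering: `np.polyfit` returns the highest power first, while the matrix solution above is ordered $a_0, a_1, a_2$. The sketch below compares the two on synthetic data (the values are made up for illustration), assuming weights `w = 1/sigma` and `cov="unscaled"` as in the cell above.

```python
import numpy as np

# Synthetic, noise-free quadratic (made-up values)
x = np.linspace(0.0, 10.0, 50)
sigma = np.full_like(x, 0.1)
y = 1.0 + 2.0 * x + 0.5 * x**2

# Matrix solution, coefficients ordered a_0, a_1, a_2
A = np.vander(x, 3, increasing=True) / sigma[:, None]
C = np.linalg.inv(A.T @ A)
a = C @ A.T @ (y / sigma)

# np.polyfit orders the highest power first, so reverse both outputs
p, p_cov = np.polyfit(x, y, 2, w=1.0 / sigma, cov="unscaled")
a_np = p[::-1]
C_np = p_cov[::-1, ::-1]

coeffs_agree = np.allclose(a, a_np)
cov_agree = np.allclose(C, C_np, rtol=1e-6, atol=1e-12)
print(coeffs_agree, cov_agree)
```

After the reversal, both the coefficients and the unscaled covariance matrix should agree with the matrix solution to within round-off.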
