This notebook reviews how to find the line that best describes a set of data points by minimizing the $\chi^2$ between the data and a linear model.  As an example, we use measurements of stars' masses and radii, and try to find the relationship between them.

The "%matplotlib inline" line tells the notebook to display each plot directly below the cell that creates it, so we can generate plots as we go along.

In [None]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

**Exercise 1**

We read in a table which has columns of mass (in solar masses), radius (in solar radii), and the errors in those two quantities.  In this file, comments are denoted by semicolons and the values are separated by commas.

In [None]:
data = np.loadtxt('Lopez-Morales07_table1.txt', comments=';', delimiter=',')
Mstar = data[:,0]
Merr = data[:,1]
Rstar = data[:,2]
Rerr = data[:,3]
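
To see how the comments and delimiter arguments behave, here is a minimal sketch that parses a small in-memory table instead of the real file.  The two rows below are invented for illustration and are not taken from Lopez-Morales07_table1.txt; the column order (mass, mass error, radius, radius error) is assumed to match the indexing used above.

```python
import io
import numpy as np

# Made-up table in the same layout: ';' starts a comment, ',' separates values.
# These numbers are illustrative only, not real measurements.
fake_table = io.StringIO(
    "; mass,mass_err,radius,radius_err\n"
    "0.60,0.01,0.62,0.02\n"
    "0.43,0.02,0.45,0.01\n"
)

demo = np.loadtxt(fake_table, comments=';', delimiter=',')

print(demo.shape)   # one row per star, one column per quantity
print(demo[:, 0])   # first column: the masses
```

The comment line is skipped entirely, so the resulting array has one row per data line.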

Let's plot the data to see what it looks like!  Note that matplotlib can render LaTeX notation in labels, hooray!

In [None]:
# Raw strings (r'...') keep the LaTeX backslashes from being read as Python escape sequences
plt.xlabel(r'Mass (M$_\odot$)')
plt.ylabel(r'Radius (R$_\odot$)')
plt.errorbar(Mstar, Rstar, xerr=Merr, yerr=Rerr, fmt='o')

This data looks like it might be well-described by a straight line, y = a0 + a1 x.  We want to find the values of a0 and a1 that fit the data best.

One way to describe the difference between a model and data is by $\chi^2$ ("chi-squared"), defined as

$\chi^2 = \sum_i \frac{(\mathrm{model}(x_i) - \mathrm{data}(x_i))^2}{\sigma_i^2}$

where $\sigma_i$ is the uncertainty (one standard deviation) on the $i$-th data value.

The skeleton of a function to calculate $\chi^2$ is below.  **To do:** Implement this function.  

(Hints: 
- If you use the function np.sum which sums over all the elements in an array, it is only one line!  
- Recall that exponentiation in Python uses the operator \*\*, not ^.  
- For now we will only consider the errors in the y-values - we'll figure out how to deal with 2d errors later!)
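
As a quick numerical check of the definition (and of the hints), here is a minimal sketch on three made-up points.  The arrays x_toy, d_toy, and sig_toy are invented for illustration.  Only the last point misses the line y = 1 + 2x, so it alone contributes to $\chi^2$.  Try the exercise yourself before peeking!

```python
import numpy as np

# Three invented data points with invented uncertainties
x_toy = np.array([0.0, 1.0, 2.0])
d_toy = np.array([1.0, 3.0, 5.5])
sig_toy = np.array([0.5, 0.5, 1.0])

# Model values from the line y = 1 + 2x
model_toy = 1.0 + 2.0 * x_toy

# Sum of squared residuals, each scaled by its squared uncertainty
chi2_toy = np.sum((model_toy - d_toy)**2 / sig_toy**2)
print(chi2_toy)  # residuals are 0, 0, -0.5, so chi^2 = 0.25 / 1.0**2 = 0.25
```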

In [None]:
def chi2(A0, A1, X, D, SIG): 
    ''' Calculates chi^2 between the data (X, D) and the line described by y = A0 + A1*x
    
        Input
        =====
        A0 : y-intercept of the model
        A1 : slope of the model
        X : array of x-values from the data
        D : array of y-values from the data
        SIG : the uncertainty (standard deviation, not variance) of the y-values
            (could be a single value, or an array of the same length as X and D)
    '''
    # YOUR CODE HERE (store the result in a local variable named chi2, which is returned below)
    
    return chi2 

Now we can use $\chi^2$ to quantify the goodness-of-fit between the model and data.  The code below defines a set of x-data, a slope, and an intercept, and calculates the corresponding y-values. 

In [None]:
xplot = np.linspace(0,1,20)
a0 = 0.4    # y-intercept
a1 = 0.0    # slope
yplot = a0 + a1*xplot

This code plots the line we just defined on top of the data.  It also calculates $\chi^2$ using the function you just defined, and displays the value in the top left of the plot.

**To do:** Try playing around with the values of a0 and a1, and see how small a $\chi^2$ you can get!

In [None]:
# Raw strings (r'...') keep the LaTeX backslashes from being read as Python escape sequences
plt.xlabel(r'Mass (M$_\odot$)')
plt.ylabel(r'Radius (R$_\odot$)')
plt.plot(Mstar, Rstar, 'o', color='blue')
plt.plot(xplot,yplot,color='red')
plt.text(0.1, 0.9, r"$\chi^2$ = " + "{:1.1f}".format(chi2(a0,a1,Mstar,Rstar,Rerr)))

**Exercise 2**

Obviously, manually playing around with a0 and a1 is not the most efficient way to find the best values.  Instead, let's define a grid of a0 and a1 values, and see which has the smallest $\chi^2$.

The code below initializes arrays of a0 and a1, as well as an empty array for the corresponding $\chi^2$ values.  
**To do:** Fill in the $\chi^2$ array by iterating over the a0, a1 values.

In [None]:
a0grid = np.linspace(-0.5, 0.5, 100)
a1grid = np.linspace(0.5, 1.5, 100)
chi2grid = np.zeros((100,100))

# YOUR CODE HERE
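
One possible way to fill such a grid is sketched below on synthetic data, so that it runs on its own.  Everything with a _demo suffix is a hypothetical name, the data points are generated from a known line (intercept 0.1, slope 0.9) rather than read from the table, and the grid is coarser (21x21) than the one above.  Give the exercise a go before reading it!

```python
import numpy as np

def chi2_demo(a0, a1, x, d, sig):
    """chi^2 between data (x, d) with uncertainties sig and the line a0 + a1*x."""
    return np.sum((a0 + a1 * x - d)**2 / sig**2)

# Synthetic, noise-free data on the line y = 0.1 + 0.9 x (not the stellar data)
x_demo = np.linspace(0.2, 1.0, 9)
d_demo = 0.1 + 0.9 * x_demo
sig_demo = np.full_like(x_demo, 0.05)

# Coarse grids chosen so that the true values 0.1 and 0.9 lie on grid points
a0_demo = np.linspace(-0.5, 0.5, 21)
a1_demo = np.linspace(0.5, 1.5, 21)
chi2_demo_grid = np.zeros((a0_demo.size, a1_demo.size))

# First index runs over a0, second index over a1
for i, a0 in enumerate(a0_demo):
    for j, a1 in enumerate(a1_demo):
        chi2_demo_grid[i, j] = chi2_demo(a0, a1, x_demo, d_demo, sig_demo)

print(chi2_demo_grid.shape, chi2_demo_grid.min())
```

Keeping a0 on the first index and a1 on the second matches the layout that the plotting code for chi2grid expects.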

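Execute the code below to produce an image of your $\chi^2$ grid.  (The transpose and flip are there to make the image display in the proper orientation: by default, imshow draws the first index of the array along the vertical axis, starting from the top of the image, whereas we want a0 along the horizontal axis and a1 increasing upward.)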

How close was your eyeball estimate from earlier to the minimum $\chi^2$?

In [None]:
plt.imshow(np.flipud(np.log10(chi2grid).T), extent=[-0.5, 0.5, 0.5, 1.5])
plt.xlabel('a0')
plt.ylabel('a1')
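
To see what the transpose and flip do, here is a small sketch on a made-up 2x3 array (hypothetical name tiny_grid), laid out like chi2grid with the first index running over a0 and the second over a1.  It relies on the fact that imshow, by default, draws row 0 of the array at the top of the image.

```python
import numpy as np

# tiny_grid[i, j]: i indexes two a0 values, j indexes three a1 values
tiny_grid = np.array([[0, 1, 2],
                      [10, 11, 12]])

# Transpose so a1 runs down the rows, then flip so the largest a1 is on top
image = np.flipud(tiny_grid.T)
print(image)
# [[ 2 12]
#  [ 1 11]
#  [ 0 10]]
```

Each column of image now holds a single a0 value, and the largest a1 index sits in the top row, so a1 increases upward when plotted.  Passing origin='lower' to imshow together with the transposed array should give the same picture without the flip.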

The np.argmin() function gives you the index of the minimum value of $\chi^2$ within the array.  Unfortunately, this is a "flattened" index, which means that rather than an (i,j) pair for the (row, column), it gives you a single number k that counts through all rows\*columns elements.  However, since our chi2grid array has a shape of 100x100, this conveniently means that when k is written with four digits (padding with leading zeros if it is below 1000), the first two digits are the row and the last two are the column.  (Think about why this is so!)

In [None]:
print(np.argmin(chi2grid))
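
If you would rather not read the digits off by hand, np.unravel_index converts a flattened index back into (row, column) form.  A short sketch, using made-up indices rather than the result from your own grid:

```python
import numpy as np

k = 1234
row, col = np.unravel_index(k, (100, 100))
print(row, col)          # 12 34
print(divmod(k, 100))    # (12, 34), because k = 100*row + col

# An index below 1000 needs a leading zero to be read digit by digit: 0507
k_small = 507
row_small, col_small = np.unravel_index(k_small, (100, 100))
print(row_small, col_small)  # 5 7
```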

**To do:** Find the values of a0 and a1 that correspond to this location.  

In [None]:
# FINISH THESE LINES
a0min = 
a1min = 
print(a0min, a1min)

Now let's plot this location on top of the image from earlier.  If everything has come out right, we should get a red dot right on top of the $\chi^2$ minimum!

In [None]:
plt.axis([-0.5, 0.5, 0.5, 1.5])
plt.imshow(np.flipud(np.log10(chi2grid).T), extent=[-0.5, 0.5, 0.5, 1.5])
plt.xlabel('a0')
plt.ylabel('a1')
plt.plot([a0min], [a1min], 'o', color='red')

Finally, let's check out how our model looks in comparison to the data.  
**To do:** Define a line using your best-fit values, and plot it over the data below!

In [None]:
# Raw strings (r'...') keep the LaTeX backslashes from being read as Python escape sequences
plt.xlabel(r'Mass (M$_\odot$)')
plt.ylabel(r'Radius (R$_\odot$)')
plt.errorbar(Mstar, Rstar, yerr=Rerr, fmt='o', color='blue')
plt.text(0.1, 0.9, r"$\chi^2$ = " + "{:1.1f}".format(chi2(a0min,a1min,Mstar,Rstar,Rerr)))

# YOUR CODE HERE

**Exercise 3**

Let's use linear algebra to find the exact solution!  We've just figured out that 

$$ 
\begin{bmatrix}
a_0 \\
a_1 
\end{bmatrix}
= 
\frac{1}{\Delta}
\begin{bmatrix}
\sum x_i^2/\sigma_i^2 & - \sum x_i/\sigma_i^2 \\
-\sum x_i/\sigma_i^2 & \sum 1/\sigma_i^2 
\end{bmatrix}
\begin{bmatrix}
\sum y_i/\sigma_i^2 \\
\sum x_i y_i/\sigma_i^2 
\end{bmatrix}
$$

where $\Delta = (\sum x_i^2/\sigma_i^2)(\sum 1/\sigma_i^2) - (\sum x_i/\sigma_i^2)^2$ is the determinant of the square matrix.
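
For reference, here is a sketch of where a solution of this kind comes from, assuming the straight-line model $y = a_0 + a_1 x$ and uncertainties only in $y$.  Writing $\chi^2 = \sum_i (a_0 + a_1 x_i - y_i)^2/\sigma_i^2$ and setting its derivatives with respect to $a_0$ and $a_1$ to zero gives

$$
\frac{\partial \chi^2}{\partial a_0} = 2 \sum_i \frac{a_0 + a_1 x_i - y_i}{\sigma_i^2} = 0,
\qquad
\frac{\partial \chi^2}{\partial a_1} = 2 \sum_i \frac{x_i (a_0 + a_1 x_i - y_i)}{\sigma_i^2} = 0
$$

Collecting the terms in $a_0$ and $a_1$ turns these into a pair of linear equations:

$$
\begin{bmatrix}
\sum 1/\sigma_i^2 & \sum x_i/\sigma_i^2 \\
\sum x_i/\sigma_i^2 & \sum x_i^2/\sigma_i^2
\end{bmatrix}
\begin{bmatrix}
a_0 \\
a_1
\end{bmatrix}
=
\begin{bmatrix}
\sum y_i/\sigma_i^2 \\
\sum x_i y_i/\sigma_i^2
\end{bmatrix}
$$

Multiplying both sides by the inverse of the 2x2 matrix on the left yields a closed-form expression for $a_0$ and $a_1$.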

**To do:** Compute all of the necessary sums for our data.  I've done the first two for you.

In [None]:
nothingsum = np.sum(1./Rerr**2)
xsum = np.sum(Mstar/Rerr**2)
# FINISH THESE LINES
x2sum = 
ysum = 
xysum = 

**To do:** Compute the determinant $\Delta$.

In [None]:
# FINISH THIS LINE
determinant = 

**To do:** Compute $a_0$ and $a_1$!  How do they compare to your earlier values?

In [None]:
# FINISH THESE LINES
a0lin = 
a1lin = 

In [None]:
print(a0lin, a0min)

In [None]:
print(a1lin, a1min)
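
As an independent cross-check, np.polyfit can perform the same weighted straight-line fit.  The sketch below runs on synthetic, noise-free points from a known line (intercept 0.1, slope 0.9) so that it is self-contained; the _check names and the uncertainties are made up for illustration.

```python
import numpy as np

# Synthetic, noise-free points on a known line (not the stellar data)
x_check = np.linspace(0.2, 1.0, 9)
sig_check = np.linspace(0.01, 0.05, 9)   # made-up, unequal uncertainties
y_check = 0.1 + 0.9 * x_check

# np.polyfit returns coefficients from the highest power down: [slope, intercept].
# For Gaussian uncertainties the weights are 1/sigma (not 1/sigma**2).
a1_check, a0_check = np.polyfit(x_check, y_check, 1, w=1.0 / sig_check)
print(a0_check, a1_check)  # close to 0.1 and 0.9
```

With the real data, the analogous call would be np.polyfit(Mstar, Rstar, 1, w=1/Rerr), which should agree with a0lin and a1lin to within rounding.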