<div style="border:2px solid black; padding:10px">
    
# <font color="blue">Objective: </font>Describe the process used to calculate <code>theta_best</code> with the Normal Equation.
</div>

# Import Dependencies

In [1]:
# Python ≥3.5 is required
import sys
assert sys.version_info >= (3, 5)

# Scikit-Learn ≥0.20 is required
import sklearn
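# Compare numerically: a plain string comparison would misorder e.g. "0.9" and "0.20"
assert tuple(int(part) for part in sklearn.__version__.split(".")[:2]) >= (0, 20)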

import pandas as pd
import numpy as np

# Load data
%store -r X
%store -r y

<hr style="border-top: 2px solid black;">

# Linear regression using the Normal Equation

 - Linear Regression fits a model by minimizing a cost function.
 - The cost function measures the distance between the linear model's predictions and the training examples.
 - In other words, it measures how bad the model is at predicting: the lower the cost, the better the fit.
 - Training the linear model means finding the parameters that minimize this cost.
 - The Normal Equation is an analytical (closed-form) approach to Linear Regression with a least-squares cost function.
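
To make the idea of a cost function concrete, here is a minimal sketch that computes the mean squared error (MSE) of two candidate parameter vectors. It assumes MSE as the least-squares cost; the names `mse_cost`, `X_toy`, and `y_toy` are hypothetical, and the data is made up rather than taken from this notebook.

```python
import numpy as np

def mse_cost(theta, X_b, y):
    """Mean squared error of the linear model X_b.dot(theta) against y."""
    residuals = X_b.dot(theta) - y
    return float(np.mean(residuals ** 2))

# Hypothetical toy data lying exactly on the line y = 1 + 2x
X_toy = np.array([[0.0], [1.0], [2.0], [3.0]])
y_toy = 1.0 + 2.0 * X_toy
X_toy_b = np.c_[np.ones((len(X_toy), 1)), X_toy]  # add the bias column x0 = 1

good_theta = np.array([[1.0], [2.0]])  # the parameters that generated the data
bad_theta = np.array([[0.0], [0.0]])   # a model that predicts 0 everywhere

cost_good = mse_cost(good_theta, X_toy_b, y_toy)
cost_bad = mse_cost(bad_theta, X_toy_b, y_toy)
print(cost_good, cost_bad)
```

The parameters that generated the data give the lower cost, which is what "fits best" means here.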

# Normal Equation
<img src="static/images/normal_equation.png" alt="html" class="background-pic" width="25%" height="10%">

 - $\hat{\theta}$ (<code>theta_best</code>) is the value of the parameter vector $\theta$ that minimizes the cost function
 - y is the vector of target values we want to predict
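
For reference, the Normal Equation is commonly written as shown here. This assumes the image above shows the standard form; $X$ denotes the feature matrix with a leading column of 1s, which the code in this notebook calls <code>X_b</code>.

$\hat{\theta} = (X^T X)^{-1} \, X^T \, y$

Read from the inside out: $X^T X$ is the matrix product of the transpose with the original matrix, $(X^T X)^{-1}$ is its inverse, and the result is multiplied by $X^T$ and then by $y$.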

<hr style="border-top: 2px solid black;">

# The theta_best can be calculated in two steps

The following code explains these steps more carefully and shows what the data should look like at each step. This will help you troubleshoot your own projects using this model.

 - Step 1: <code>X_b = np.c_[np.ones((len(X), 1)), X]</code>
 - Step 2: <code>theta_best = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y)</code>

# Explain the theta_best calculation of The Normal Equation in 5 Steps

 - Step 1: Create a 2D Array  <code>X_b</code> from <code>X</code>
  - <code>X_b</code> will contain two values per instance: <code>1</code> and the GDP per capita
 - Step 2: Create a new Matrix <code>X_b_transposed</code>
  - <code>X_b_transposed</code> is the Transposed Matrix of <code>X_b</code>
 - Step 3: Create a new Matrix <code>mm</code>
  - mm is obtained by multiplying <code>X_b_transposed</code> by <code>X_b</code>
 - Step 4: Create a new Matrix <code>inv_matrix</code>
  - This matrix is the inverse of the <code>mm</code> matrix
 - Step 5: Calculate the optimal parameter vector theta_best $\hat{\theta}$
  - theta_best = <code>inv_matrix.dot(X_b.T).dot(y)</code>
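
The five steps above can be sketched on a tiny made-up dataset where the answer is known in advance. All names ending in `_toy` are hypothetical and chosen so they do not overwrite the notebook's own `X`, `y`, or `X_b`; the data follows y = 4 + 3x exactly, so the result should be an intercept of 4 and a coefficient of 3.

```python
import numpy as np

# Hypothetical toy data: y = 4 + 3x exactly
X_toy = np.array([[1.0], [2.0], [3.0], [4.0]])
y_toy = 4.0 + 3.0 * X_toy

# Step 1: add the bias column x0 = 1 to every instance -> shape (4, 2)
X_b_toy = np.c_[np.ones((len(X_toy), 1)), X_toy]
# Step 2: transpose -> shape (2, 4)
X_b_transposed_toy = X_b_toy.T
# Step 3: multiply the transpose by the original -> shape (2, 2)
mm_toy = X_b_transposed_toy.dot(X_b_toy)
# Step 4: invert the 2 x 2 matrix
inv_matrix_toy = np.linalg.inv(mm_toy)
# Step 5: multiply by the transpose, then by the targets -> shape (2, 1)
theta_toy = inv_matrix_toy.dot(X_b_transposed_toy).dot(y_toy)

print(theta_toy)
```

Checking the shape after each step against the comments is a quick way to locate a mistake when adapting this to other data.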

# Reminder: <CODE>X</CODE> is a column array with the GDP per capita for each country

 - This is our 'independent variable'

In [2]:
X

array([[ 50961.87],
       [ 43724.03],
       [ 40106.63],
       [ 43331.96],
       [ 17256.92],
       [ 52114.17],
       [ 41973.99],
       [ 37675.01],
       [ 40996.51],
       [ 18064.29],
       [ 12239.89],
       [ 50854.58],
       [ 51350.74],
       [ 29866.58],
       [ 32485.55],
       [ 27195.2 ],
       [101994.09],
       [  9009.28],
       [ 43603.12],
       [ 37044.89],
       [ 74822.11],
       [ 12495.33],
       [ 19121.59],
       [ 15991.74],
       [ 25864.72],
       [ 49866.27],
       [ 80675.31],
       [  9437.37],
       [ 43770.69],
       [ 55805.2 ],
       [  8670.  ],
       [ 13340.91],
       [ 17288.08],
       [ 35343.34],
       [ 13618.57],
       [  9054.91],
       [ 20732.48],
       [  5694.57],
       [  6083.51],
       [ 14210.28]])

In [3]:
X.shape

(40, 1)

<div style="border:1px solid black; padding:10px">
    
<font color="blue">Note:</font><br>
 - 40 is the number of instances (one GDP per capita value per country)
 - 1 is the number of columns (features) in this X array

</div>

<hr style="border-top: 2px solid black;">

# Step 1: Create a 2D Array <code>X_b</code> that contains <code>1</code> and <code>X</code> per instance. X contains the GDP per capita for each country

Use numpy's np.c_ and np.ones()
 - np.c_ : Translates slice objects to concatenation along the second axis.
 - Documentation as of 5/15/20:
  - https://numpy.org/devdocs/reference/generated/numpy.c_.html
 - np.ones(): Return a new array of given shape and type, filled with ones.
 - Documentation as of 5/15/20:
  - https://numpy.org/devdocs/reference/generated/numpy.ones.html
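
A small illustration of how these two pieces fit together, using three made-up values instead of the GDP data (the names `ones_column`, `values`, and `stacked` are hypothetical):

```python
import numpy as np

ones_column = np.ones((3, 1))                # 3 rows, 1 column, every entry 1.0
values = np.array([[10.0], [20.0], [30.0]])  # hypothetical feature column

# np.c_ places the two columns side by side
stacked = np.c_[ones_column, values]
print(stacked)
print(stacked.shape)
```

Both inputs must have the same number of rows; the result has one column for the 1s and one for the feature.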

In [4]:
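# Add x0 = 1 (the bias term) to each instance, followed by the GDP value from X.
# np.c_ stacks the two columns side by side into a 2D array.
# len(X) is used instead of a hard-coded 40 so the cell works for any number of instances.
X_b = np.c_[np.ones((len(X), 1)), X]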
X_b

array([[1.0000000e+00, 5.0961870e+04],
       [1.0000000e+00, 4.3724030e+04],
       [1.0000000e+00, 4.0106630e+04],
       [1.0000000e+00, 4.3331960e+04],
       [1.0000000e+00, 1.7256920e+04],
       [1.0000000e+00, 5.2114170e+04],
       [1.0000000e+00, 4.1973990e+04],
       [1.0000000e+00, 3.7675010e+04],
       [1.0000000e+00, 4.0996510e+04],
       [1.0000000e+00, 1.8064290e+04],
       [1.0000000e+00, 1.2239890e+04],
       [1.0000000e+00, 5.0854580e+04],
       [1.0000000e+00, 5.1350740e+04],
       [1.0000000e+00, 2.9866580e+04],
       [1.0000000e+00, 3.2485550e+04],
       [1.0000000e+00, 2.7195200e+04],
       [1.0000000e+00, 1.0199409e+05],
       [1.0000000e+00, 9.0092800e+03],
       [1.0000000e+00, 4.3603120e+04],
       [1.0000000e+00, 3.7044890e+04],
       [1.0000000e+00, 7.4822110e+04],
       [1.0000000e+00, 1.2495330e+04],
       [1.0000000e+00, 1.9121590e+04],
       [1.0000000e+00, 1.5991740e+04],
       [1.0000000e+00, 2.5864720e+04],
       [1.0000000e+00, 4.

In [5]:
X_b.shape

(40, 2)

<div style="border:1px solid black; padding:10px">
    
<font color="blue">Note:</font><br>
 - 40 is the number of instances; this did not change
 - 2 is the number of columns in the X_b 2D array: the first column holds the 1s we added using np.ones(), and the second column holds the GDP values

</div>

<hr style="border-top: 2px solid black;">

# Step 2: Create a new Matrix <code>X_b_transposed</code> by transposing <code>X_b</code>

## Matrix transpose using numpy .T attribute.

 - Documentation as of 5/15/20: https://numpy.org/devdocs/reference/generated/numpy.ndarray.T.html
 
## Matrix transpose Example
The transpose of a matrix $M$ is a matrix denoted $M^T$ such that the $i^{th}$ row in $M^T$ is equal to the $i^{th}$ column in $M$:

$ X_b^T =
\begin{bmatrix}
  1 & 5.09 \\
  1 & 4.37 
\end{bmatrix}^T =
\begin{bmatrix}
  1 & 1 \\
  5.09 & 4.37
\end{bmatrix}$
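
The same two-row example can be reproduced in NumPy. This is only a sketch with the illustrative values from the formula above; the names `M` and `M_T` are hypothetical.

```python
import numpy as np

M = np.array([[1.0, 5.09],
              [1.0, 4.37]])
M_T = M.T  # rows become columns

print(M_T)

# A non-square example makes the shape change visible: (3, 2) becomes (2, 3)
tall = np.ones((3, 2))
print(tall.shape, tall.T.shape)
```

Note that `.T` returns a view of the same data rather than a copy, so it is cheap even for large arrays.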

In [6]:
X_b_transposed = X_b.T

### Inspect transposition

In [7]:
X_b_transposed

array([[1.0000000e+00, 1.0000000e+00, 1.0000000e+00, 1.0000000e+00,
        1.0000000e+00, 1.0000000e+00, 1.0000000e+00, 1.0000000e+00,
        1.0000000e+00, 1.0000000e+00, 1.0000000e+00, 1.0000000e+00,
        1.0000000e+00, 1.0000000e+00, 1.0000000e+00, 1.0000000e+00,
        1.0000000e+00, 1.0000000e+00, 1.0000000e+00, 1.0000000e+00,
        1.0000000e+00, 1.0000000e+00, 1.0000000e+00, 1.0000000e+00,
        1.0000000e+00, 1.0000000e+00, 1.0000000e+00, 1.0000000e+00,
        1.0000000e+00, 1.0000000e+00, 1.0000000e+00, 1.0000000e+00,
        1.0000000e+00, 1.0000000e+00, 1.0000000e+00, 1.0000000e+00,
        1.0000000e+00, 1.0000000e+00, 1.0000000e+00, 1.0000000e+00],
       [5.0961870e+04, 4.3724030e+04, 4.0106630e+04, 4.3331960e+04,
        1.7256920e+04, 5.2114170e+04, 4.1973990e+04, 3.7675010e+04,
        4.0996510e+04, 1.8064290e+04, 1.2239890e+04, 5.0854580e+04,
        5.1350740e+04, 2.9866580e+04, 3.2485550e+04, 2.7195200e+04,
        1.0199409e+05, 9.0092800e+03, 4.3603120

In [8]:
X_b_transposed.shape

(2, 40)

<div style="border:1px solid black; padding:10px">
    
<font color="blue">Note:</font><br>
 - <code>X_b</code> had a shape of (40, 2)
 - The transposed <code>X_b_transposed</code> has a shape of (2, 40), so there are now 2 rows and 40 columns (one column per instance)
 - The 1s are now in the first row and the GDP data is in the second row of this transposed 2D array

</div>

<hr style="border-top: 2px solid black;">

# Step 3: Create a new Matrix <code>mm</code> by multiplying the transposed matrix <code>X_b_transposed</code> by <code>X_b</code>

 - Use the <code>.dot()</code> method: for two 2D arrays it performs matrix multiplication (for two 1D vectors it would compute the scalar product)
 - Documentation as of 5/15/20: https://numpy.org/devdocs/reference/generated/numpy.dot.html

In [9]:
# Matrix multiplication (mm)
mm = X_b_transposed.dot(X_b)

In [10]:
# Print values of mm
mm

array([[4.00000000e+01, 1.31373628e+06],
       [1.31373628e+06, 6.19433521e+10]])

In [11]:
mm.shape

(2, 2)

<div style="border:1px solid black; padding:10px">
    
<font color="blue">Note:</font><br>
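 - <code>mm</code> is now 2 by 2
 - Multiplying a (2, 40) matrix by a (40, 2) matrix gives a (2, 2) matrix: the inner dimensions (40) must match, and the result takes the outer dimensions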

</div>
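
It can help to see what the entries of this 2 by 2 matrix actually are. With one feature plus the column of 1s, the product works out to the number of instances, the sum of the feature, and the sum of its squares. The sketch below checks this on made-up data (the `_toy` names are hypothetical):

```python
import numpy as np

X_toy = np.array([[1.0], [2.0], [3.0], [4.0]])
X_b_toy = np.c_[np.ones((len(X_toy), 1)), X_toy]  # shape (4, 2)

mm_toy = X_b_toy.T.dot(X_b_toy)  # (2, 4) times (4, 2) -> (2, 2)

# Expected layout: [[m, sum(x)], [sum(x), sum(x^2)]]
expected = np.array([[len(X_toy), X_toy.sum()],
                     [X_toy.sum(), (X_toy ** 2).sum()]])

print(mm_toy)
print(np.allclose(mm_toy, expected))
```

This matches the output above, where the top-left entry of <code>mm</code> is 40, the number of instances.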

<hr style="border-top: 2px solid black;">

# Step 4: Compute the inverse of the matrix using <code>np.linalg.inv(a)</code>

 - Compute the (multiplicative) inverse of a matrix.
 - Documentation as of 5/15/20: https://numpy.org/doc/1.18/reference/generated/numpy.linalg.inv.html

In [12]:
inv_matrix = np.linalg.inv(mm)

In [13]:
# Print output
inv_matrix

array([[ 8.23899871e-02, -1.74738227e-06],
       [-1.74738227e-06,  5.32034411e-11]])
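
One way to sanity-check an inverse is to multiply it by the original matrix and compare the result with the identity matrix. The sketch below does this on a small made-up matrix and also shows what happens when no inverse exists. The names `mm_toy`, `inv_toy`, and `singular` are hypothetical.

```python
import numpy as np

# Hypothetical small matrix of the same form as mm
mm_toy = np.array([[4.0, 10.0],
                   [10.0, 30.0]])
inv_toy = np.linalg.inv(mm_toy)

# A matrix times its inverse should give the identity, up to rounding
is_identity = np.allclose(mm_toy.dot(inv_toy), np.eye(2))
print(is_identity)

# If the feature column were constant (identical to the column of 1s),
# X_b.T.dot(X_b) would have two identical rows and no inverse.
singular = np.array([[3.0, 3.0],
                     [3.0, 3.0]])
try:
    np.linalg.inv(singular)
    raised = False
except np.linalg.LinAlgError:
    raised = True
print(raised)
```

A `LinAlgError` at this step usually points to redundant columns in the data rather than a coding mistake.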

<hr style="border-top: 2px solid black;">

# Step 5: Calculate theta_best by multiplying the inverse matrix by <code>X_b.T</code> and then by the target vector <code>y</code> using the <code>.dot()</code> method

In [14]:
theta_best = inv_matrix.dot(X_b.T).dot(y)

In [15]:
theta_best

array([[5.72408174e+00],
       [2.46904428e-05]])

In [16]:
intercept, line_coef = theta_best[0], theta_best[1]
print('The intercept for the normal equation is {}, \nand the coefficient is: {} '.format(intercept, line_coef))

The intercept for the normal equation is [5.72408174], 
and the coefficient is: [2.46904428e-05] 
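
As a cross-check, the result of the Normal Equation can be compared with NumPy's least-squares solver, `np.linalg.lstsq`, and then used to make predictions. The sketch below uses randomly generated data rather than this notebook's `X` and `y`; the seed, the true parameters (4 and 3), and all `_toy` and `_new` names are assumptions made for illustration.

```python
import numpy as np

# Hypothetical noisy data around the line y = 4 + 3x
rng = np.random.default_rng(42)
X_toy = rng.uniform(0, 10, size=(50, 1))
y_toy = 4.0 + 3.0 * X_toy + rng.normal(0, 1, size=(50, 1))
X_b_toy = np.c_[np.ones((len(X_toy), 1)), X_toy]

# Normal Equation, as in the steps above
theta_normal = np.linalg.inv(X_b_toy.T.dot(X_b_toy)).dot(X_b_toy.T).dot(y_toy)

# Least-squares solver: should agree up to rounding
theta_lstsq, residuals, rank, singular_values = np.linalg.lstsq(X_b_toy, y_toy, rcond=None)
print(np.allclose(theta_normal, theta_lstsq))

# Predict for new inputs: add the bias column, then multiply by theta
X_new = np.array([[0.0], [10.0]])
X_new_b = np.c_[np.ones((len(X_new), 1)), X_new]
y_pred = X_new_b.dot(theta_normal)
print(y_pred)
```

The same prediction step applies to this notebook's result: prepend a 1 to a GDP per capita value and multiply by `theta_best`.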


<hr style="border-top: 2px solid black;">