<b> Chapter 4 is a technical chapter that covers the mathematics behind some of the machine learning models used so far. I focus on the linear algebra definitions and applications because they express the theory and calculations most succinctly. </b>

## Linear Regression

<b> Simply put, linear regression predicts a target as a weighted sum of the features plus a scalar bias. The weights and bias are chosen to minimize a cost function. The most common cost function for linear regression is Mean Squared Error (MSE); its square root (RMSE) is usually what gets reported when comparing models, since it is in the same units as the target. MSE is minimized rather than RMSE because its derivative is simpler to compute and needs less computation. Taking the square root does not move the minimum: the square root is a strictly increasing function, so the parameters that minimize MSE also minimize RMSE. The same holds for any strictly increasing transformation of a cost or reward function, so it is reasonable to optimize whichever form requires the least computational effort. </b>
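
As a minimal sketch of the MSE versus RMSE point, the snippet below evaluates both costs for a one-weight model over a grid of candidate weights and checks that they are minimized at the same weight. The synthetic data, the seed, the grid, and the names `candidates`, `mse`, and `rmse` are assumptions made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

# Synthetic data (assumed for illustration): y = 3x + Gaussian noise
x = 2 * rng.random(100)
y = 3 * x + rng.standard_normal(100)

# Candidate weights for a bias-free model y_hat = w * x
candidates = np.linspace(0.0, 6.0, 601)

# Residuals for every candidate weight at once, shape (601, 100)
residuals = candidates[:, None] * x[None, :] - y[None, :]

mse = np.mean(residuals ** 2, axis=1)
rmse = np.sqrt(mse)

# The square root is strictly increasing, so both curves share a minimizer
best_w_mse = candidates[np.argmin(mse)]
best_w_rmse = candidates[np.argmin(rmse)]
print(best_w_mse, best_w_rmse)
```

Both printed weights should agree, which is why the cheaper MSE can stand in for RMSE during optimization.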

### The Normal Equation

$\hat{\theta} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y}$

In this equation:

${\hat{\theta}}$ is the value of $\theta$ that minimizes the cost function
<br>
$\mathbf{y}$ is the vector of target values
<br>
$\mathbf{X}$ is the matrix of features with a leading column of ones for the bias term
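
For readers who want to see where this expression comes from, here is a short derivation sketch. It assumes the cost is MSE, that $m$ (a symbol introduced here) is the number of training examples, and that $\mathbf{X}^T \mathbf{X}$ is invertible.

$$
\begin{aligned}
\mathrm{MSE}(\theta) &= \frac{1}{m} (\mathbf{X}\theta - \mathbf{y})^T (\mathbf{X}\theta - \mathbf{y}) \\
\nabla_{\theta}\, \mathrm{MSE}(\theta) &= \frac{2}{m} \mathbf{X}^T (\mathbf{X}\theta - \mathbf{y}) = 0 \\
\mathbf{X}^T \mathbf{X}\, \theta &= \mathbf{X}^T \mathbf{y} \\
\hat{\theta} &= (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y}
\end{aligned}
$$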

In [4]:
# Generate some data to validate the above expression
import numpy as np

X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

# Add a column of ones so the bias is learned as the first parameter
X_b = np.c_[np.ones((100, 1)), X]

# Apply the normal equation
theta_best = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y)
theta_best

array([[4.09113012],
       [2.96427959]])

<b> The data were generated with a bias of 4 and a weight of 3. The Gaussian noise makes it impossible to recover these values exactly, but the estimates (about 4.09 and 2.96) are close. </b>
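
Explicitly inverting $\mathbf{X}^T \mathbf{X}$ can be numerically fragile when features are nearly collinear. The sketch below compares the explicit inverse with two standard NumPy alternatives, `np.linalg.lstsq` and `np.linalg.pinv`. It regenerates data from the same process with an arbitrarily chosen seed, so its estimates will differ slightly from the ones shown above. The names `theta_inv`, `theta_lstsq`, and `theta_pinv` are introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)  # seed chosen arbitrarily for reproducibility

# Same data-generating process as above: y = 4 + 3x + Gaussian noise
X = 2 * rng.random((100, 1))
y = 4 + 3 * X + rng.standard_normal((100, 1))
X_b = np.c_[np.ones((100, 1)), X]

# Normal equation with an explicit inverse
theta_inv = np.linalg.inv(X_b.T @ X_b) @ X_b.T @ y

# Least-squares solver; does not form the inverse of X^T X
theta_lstsq, residuals, rank, singular_values = np.linalg.lstsq(X_b, y, rcond=None)

# Moore-Penrose pseudoinverse; also defined when X^T X is singular
theta_pinv = np.linalg.pinv(X_b) @ y

# All three approaches should agree up to floating-point tolerance
print(np.allclose(theta_inv, theta_lstsq), np.allclose(theta_inv, theta_pinv))
```

On a well-conditioned problem like this one the three results should match; the alternatives matter mostly when the feature matrix is ill-conditioned.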

In [5]:
# Make some predictions
X_new = np.array([[0], [2]])
X_new_b = np.c_[np.ones((2, 1)), X_new]

y_predict = X_new_b.dot(theta_best)
y_predict

array([[ 4.09113012],
       [10.0196893 ]])
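
As a quick check on these predictions, each one is the fitted bias plus the fitted weight times the input, using the values of `theta_best` printed earlier:

$$
\begin{aligned}
\hat{y}(0) &= 4.09113012 + 2.96427959 \times 0 = 4.09113012 \\
\hat{y}(2) &= 4.09113012 + 2.96427959 \times 2 = 10.0196893
\end{aligned}
$$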