# Normal Equation

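Datasets used in this notebook:
1. **test.csv** : dataset used to fit (train) the model, despite its name
2. **train.csv** : dataset used to check the fitted model on new data
3. **mlr09.csv** : dataset with several variables, used in section 6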

## 1. Mathematical Background

* Let's say that we have the following equations for some weight values $w_0$ and $w_1$:

$y^{(1)} = w_0 + w_1x^{(1)}$<br>
$y^{(2)} = w_0 + w_1x^{(2)}$<br>
$y^{(3)} = w_0 + w_1x^{(3)}$<br>
$y^{(4)} = w_0 + w_1x^{(4)}$<br>
$y^{(5)} = w_0 + w_1x^{(5)}$<br>
<br>
* We can express these equations in matrix form as $\mathbf{y} = \mathbf{X}\mathbf{w}$, where:

$\mathbf{y}$ = $
\begin{bmatrix}
\ y^{(1)} \\
\ y^{(2)} \\
\ y^{(3)} \\
\ y^{(4)} \\
\ y^{(5)} \\
\end{bmatrix}$ &ensp; &ensp; &ensp; $\mathbf{X}$ = $
\begin{bmatrix}
\ 1 & x^{(1)} \\
\ 1 & x^{(2)} \\
\ 1 & x^{(3)} \\
\ 1 & x^{(4)} \\
\ 1 & x^{(5)} \\
\end{bmatrix}$ &ensp; &ensp; &ensp; $\mathbf{w}$ = $
\begin{bmatrix}
\ w_0 \\
\ w_1 \\
\end{bmatrix}$<br><br>

* The normal equation gives the weight values that minimize the sum of squared residuals, provided $\mathbf{X}^{T}\mathbf{X}$ is invertible:<br><br>
$\large \hat{\mathbf{w}} = (\mathbf{X^{T}}\mathbf{X})^{-1}\mathbf{X^{T}}\mathbf{y}$
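
As a minimal sketch of the formula above, the following builds $\mathbf{X}$ for five made-up points that lie exactly on $y = 1 + 2x$ and evaluates the normal equation directly. The data values are illustrative assumptions and are not taken from the datasets used later.

```python
import numpy as np

# Five illustrative (x, y) pairs lying exactly on y = 1 + 2x (made-up values)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x

# Design matrix: a column of ones (for w_0) next to the x values (for w_1)
X = np.column_stack([np.ones_like(x), x])

# Normal equation: w_hat = (X^T X)^{-1} X^T y
w_hat = np.linalg.inv(X.T @ X) @ X.T @ y

print(w_hat)  # approximately w_0 = 1 and w_1 = 2, up to floating-point error
```

Because the points are noise-free, the recovered weights match the line that generated them; with noisy data the same formula returns the least-squares fit instead.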

## 2. Linear Regression Algorithm

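A `LinearRegression` class based on the normal equation. Calling `fit` computes the weight values (intercept and coefficients) for a given matrix X and vector y and stores them; they are then available through the `intercept` and `coef` properties.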

In [8]:
import pandas as pd
import numpy as np

class LinearRegression(object):
    def __init__(self, fit_intercept=True, copy_X=True):
        """
        If fit_intercept is True (the default), a column of ones is prepended to matrix X
        so that the first weight is the intercept. X can hold one or more variables.

        If fit_intercept is False, X is used as given and is assumed to already contain
        the column of ones as its first column.
        """
        self.fit_intercept = fit_intercept
        self.copy_X = copy_X

        self._coef = None
        self._intercept = None
        self._new_X = None

    def fit(self, X, y):
        """
        input: matrix X containing values of X, vector y containing values of y
        output: weight values (intercept and coefficients)
        """
        #keep a reference to the original X (before the column of ones is added)
        self._new_X = X

        #prepend a column of ones to X so that the first weight is the intercept
        if self.fit_intercept:
            ones = np.ones((X.shape[0], 1))
            X = np.concatenate([ones, X], axis=1)

        #normal equation: w = (X^T X)^(-1) X^T y
        normal_equa = np.dot(np.dot(np.linalg.inv(np.dot(X.T, X)), X.T), y)
        #the first weight is the intercept, the rest are the coefficients
        self._intercept = normal_equa[0]
        self._coef = normal_equa[1:]
        

    def predict(self, X):
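        #prepend the column of ones, mirroring what fit does
        if self.fit_intercept:
            ones = np.ones((X.shape[0], 1))
            X = np.concatenate([ones, X], axis=1)

        #stack intercept and coefficients into one flat weight vector;
        #np.array([intercept, coef]) would build a ragged array instead
        weights = np.concatenate([[self._intercept], self._coef])
        return np.dot(X, weights)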
        

    @property
    def coef(self):
        return self._coef

    @property
    def intercept(self):
        return self._intercept
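
The class above inverts $\mathbf{X}^{T}\mathbf{X}$ explicitly. A commonly used alternative, shown here only as a sketch and not as part of the class, is to solve the linear system $(\mathbf{X}^{T}\mathbf{X})\mathbf{w} = \mathbf{X}^{T}\mathbf{y}$ with `np.linalg.solve`, which avoids forming the inverse. The helper name `fit_normal_equation` and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def fit_normal_equation(X, y, fit_intercept=True):
    """Hypothetical helper: return the weight vector w for y ~ Xw."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if fit_intercept:
        # Prepend the column of ones so that w[0] is the intercept
        X = np.column_stack([np.ones(X.shape[0]), X])
    # Solve (X^T X) w = X^T y without computing the inverse explicitly
    return np.linalg.solve(X.T @ X, X.T @ y)

# Noise-free synthetic data: y = 3 + 1.5 * x1 - 2 * x2
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 2))
y_demo = 3.0 + X_demo @ np.array([1.5, -2.0])

w_demo = fit_normal_equation(X_demo, y_demo)
print(w_demo)  # intercept first, then one coefficient per column
```

Both approaches give the same weights when $\mathbf{X}^{T}\mathbf{X}$ is well conditioned; the system solve tends to be the safer choice when it is not.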

## 3. Import Dataset

In [3]:
df = pd.read_csv("./test.csv")
df.head()

|   | x | y |
|---|---|---|
| 0 | 77 | 79.775152 |
| 1 | 21 | 23.177279 |
| 2 | 22 | 25.609262 |
| 3 | 20 | 17.857388 |
| 4 | 36 | 41.849864 |


In [4]:
X = df["x"].values.reshape(-1,1)
y = df["y"].values

In [5]:
print("Shape of X : " + str(X.shape))
print("Shape of y : " + str(y.shape)) #1-dimensional array

Shape of X : (300, 1)
Shape of y : (300,)
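
The `reshape(-1, 1)` call matters because the class concatenates a column of ones with X along `axis=1`, which requires X to be two-dimensional. A small sketch using the first five x values shown above:

```python
import numpy as np

x_values = np.array([77, 21, 22, 20, 36])  # first five x values from df.head()
X_col = x_values.reshape(-1, 1)            # one column, as many rows as needed

# The same concatenation that fit performs when fit_intercept=True
ones = np.ones((X_col.shape[0], 1))
X_design = np.concatenate([ones, X_col], axis=1)

print(x_values.shape, X_col.shape, X_design.shape)  # (5,) (5, 1) (5, 2)
```

Passing the one-dimensional `x_values` directly would make the concatenation fail, since a 1-D array has no second axis to join along.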


## 4. Modeling

In [10]:
#create a linear regression object with fit_intercept=True
lr = LinearRegression(fit_intercept=True)

In [11]:
#fit the model using matrix X and vector y
lr.fit(X, y)

In [15]:
#read the fitted intercept from the intercept property
lr.intercept

-0.461810773661153

In [13]:
#read the fitted coefficient from the coef property
lr.coef

array([1.01433536])

In [16]:
#predict the y values using the model
lr.predict(X)[:10]

array([77.64201157, 20.83923168, 21.85356704, 19.82489633, 36.05426201,
       14.75321955, 62.42698124, 95.90004796, 19.82489633,  4.609866  ])

Compare the predicted values of y with the actual values of y:

In [19]:
np.array(df['y'][:10])

array([79.77515201, 23.17727887, 25.60926156, 17.85738813, 41.84986439,
        9.80523488, 58.87465933, 97.61793701, 18.39512747,  8.74674765])
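
Comparing two arrays by eye only goes so far. One way to summarize the gap, sketched here with the ten predicted and actual values copied from the outputs above, is the root mean squared error (RMSE) and the coefficient of determination ($R^2$). The helper names `rmse` and `r_squared` are assumptions introduced for this example.

```python
import numpy as np

# First ten predictions and actual values, copied from the outputs above
y_pred = np.array([77.64201157, 20.83923168, 21.85356704, 19.82489633, 36.05426201,
                   14.75321955, 62.42698124, 95.90004796, 19.82489633, 4.609866])
y_true = np.array([79.77515201, 23.17727887, 25.60926156, 17.85738813, 41.84986439,
                   9.80523488, 58.87465933, 97.61793701, 18.39512747, 8.74674765])

def rmse(y_true, y_pred):
    """Root mean squared error: typical size of a residual, in units of y."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def r_squared(y_true, y_pred):
    """Share of the variance in y_true that the predictions account for."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

print(rmse(y_true, y_pred), r_squared(y_true, y_pred))
```

These figures describe only the ten rows shown, not the full dataset.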

## 5. Data Validation

In [20]:
from sklearn import linear_model
#the normalize argument was removed in scikit-learn 1.2; the default already leaves X unscaled
sk_lr = linear_model.LinearRegression()
sk_lr.fit(X, y)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None,
         normalize=False)

In [27]:
#find the intercept using sklearn
print("Intercept from normal equation algorithm : " + str(lr.intercept))
print("Intercept from sklearn : " + str(sk_lr.intercept_))

Intercept from normal equation algorithm : -0.461810773661153
Intercept from sklearn : -0.4618107736611776


In [28]:
#find the coefficient using sklearn
print("Coefficient from normal equation algorithm : " + str(lr.coef))
print("Coefficient from sklearn : " + str(sk_lr.coef_))

Coefficient from normal equation algorithm : [1.01433536]
Coefficient from sklearn : [1.01433536]


In [24]:
#check if the two values are close
np.isclose(lr.coef, sk_lr.coef_)

array([ True])

In [29]:
#predicted values using sklearn
sk_lr.predict(X)[:10]

array([77.64201157, 20.83923168, 21.85356704, 19.82489633, 36.05426201,
       14.75321955, 62.42698124, 95.90004796, 19.82489633,  4.609866  ])
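
The same kind of cross-check can be automated with `np.allclose` instead of reading the arrays side by side. The sketch below uses synthetic data and `np.polyfit` as an independent reference so that it runs with NumPy alone; both choices are assumptions made for illustration and stand in for the CSV data and the sklearn model used above.

```python
import numpy as np

# Synthetic data loosely resembling the dataset above (made-up, not the CSV)
rng = np.random.default_rng(42)
x = rng.uniform(0, 100, size=300)
y = -0.5 + 1.0 * x + rng.normal(scale=3.0, size=300)

# Weights from the normal equation
X = np.column_stack([np.ones_like(x), x])
w_normal = np.linalg.inv(X.T @ X) @ X.T @ y

# Reference fit; np.polyfit returns the highest-degree term first: [slope, intercept]
slope, intercept = np.polyfit(x, y, deg=1)

# Compare weights and predictions within floating-point tolerance
same_weights = np.allclose(w_normal, [intercept, slope])
same_predictions = np.allclose(X @ w_normal, intercept + slope * x)
print(same_weights, same_predictions)
```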

### 5.1 Data Validation Continued

In [30]:
df_test = pd.read_csv("./train.csv")
df_test.head()

|   | x | y |
|---|---|---|
| 0 | 24 | 21.549452 |
| 1 | 50 | 47.464463 |
| 2 | 15 | 17.218656 |
| 3 | 38 | 36.586398 |
| 4 | 87 | 87.288984 |


In [33]:
X_test = df_test["x"].values.reshape(-1,1)

In [34]:
#predict y via sklearn using x values from df_test
sk_lr.predict(X_test)[:5]

array([23.88223775, 50.25495698, 14.75321955, 38.08293272, 87.78536512])

In [35]:
#predict y via the normal equation algorithm using x values from df_test
lr.predict(X_test)[:5]

array([23.88223775, 50.25495698, 14.75321955, 38.08293272, 87.78536512])

## 6. Multiple Variables

In [36]:
df = pd.read_csv("./mlr09.csv")
df.head()

|   | height_in_feet | weight_in_pounds | successful_field_goals | percent_of_successful_free_throws | average_points_scored |
|---|---|---|---|---|---|
| 0 | 6.8 | 225 | 0.442 | 0.672 | 9.2 |
| 1 | 6.3 | 180 | 0.435 | 0.797 | 11.7 |
| 2 | 6.4 | 190 | 0.456 | 0.761 | 15.8 |
| 3 | 6.2 | 180 | 0.416 | 0.651 | 8.6 |
| 4 | 6.9 | 205 | 0.449 | 0.9 | 23.2 |


In [37]:
#isolate the target values
y = df["average_points_scored"].values

In [38]:
#isolate the variables
df.iloc[:,:-1].head()

|   | height_in_feet | weight_in_pounds | successful_field_goals | percent_of_successful_free_throws |
|---|---|---|---|---|
| 0 | 6.8 | 225 | 0.442 | 0.672 |
| 1 | 6.3 | 180 | 0.435 | 0.797 |
| 2 | 6.4 | 190 | 0.456 | 0.761 |
| 3 | 6.2 | 180 | 0.416 | 0.651 |
| 4 | 6.9 | 205 | 0.449 | 0.9 |


In [39]:
X = df.iloc[:,:-1].values

In [40]:
X[:5]

array([[  6.8  , 225.   ,   0.442,   0.672],
       [  6.3  , 180.   ,   0.435,   0.797],
       [  6.4  , 190.   ,   0.456,   0.761],
       [  6.2  , 180.   ,   0.416,   0.651],
       [  6.9  , 205.   ,   0.449,   0.9  ]])

### 6.1 Rescaling

In [41]:
#standardize each column to zero mean and unit standard deviation
mu_X = np.mean(X, axis=0)
std_X = np.std(X, axis=0)

rescaled_X = (X - mu_X) / std_X

In [42]:
rescaled_X[:5]

array([[ 0.46843663,  0.50336336, -0.12692668, -0.70404955],
       [-0.63137111, -0.99746237, -0.25187012,  0.55584824],
       [-0.41140956, -0.66394554,  0.12296022,  0.19299768],
       [-0.85133266, -0.99746237, -0.59100234, -0.91571238],
       [ 0.68839818, -0.1636703 , -0.00198323,  1.59400403]])
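
When the fitted model is later applied to new rows, those rows need to be rescaled with the mean and standard deviation of the data used for fitting, not with their own. A minimal sketch of that idea follows; the helper name `standardize` is an assumption, and only the five rows printed above are used, so the numbers will differ from `rescaled_X`, which was computed over the full dataset.

```python
import numpy as np

def standardize(X, mu=None, sigma=None):
    """Hypothetical helper: rescale columns, reusing mu and sigma when given."""
    X = np.asarray(X, dtype=float)
    if mu is None:
        mu = X.mean(axis=0)
    if sigma is None:
        sigma = X.std(axis=0)
    return (X - mu) / sigma, mu, sigma

# The five rows of X printed above
X_rows = np.array([[6.8, 225.0, 0.442, 0.672],
                   [6.3, 180.0, 0.435, 0.797],
                   [6.4, 190.0, 0.456, 0.761],
                   [6.2, 180.0, 0.416, 0.651],
                   [6.9, 205.0, 0.449, 0.900]])

# Learn mu and sigma from these rows, then reuse them for "new" rows
X_scaled, mu, sigma = standardize(X_rows)
X_new_scaled, _, _ = standardize(X_rows[:2], mu, sigma)

print(X_scaled.mean(axis=0))  # each column mean is zero up to rounding
print(X_scaled.std(axis=0))   # each column standard deviation is one
```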

### 6.2 Multiple Variable Validation

In [43]:
#fit the weight values using normal equation algorithm
lr.fit(rescaled_X, y)

In [45]:
#read the fitted coefficients from the coef property
lr.coef

array([-1.67779283,  0.28359762,  2.68586629,  1.12816882])

In [44]:
#read the fitted intercept from the intercept property
lr.intercept

11.790740740740738

In [46]:
#use sklearn to fit weight values
sk_lr.fit(rescaled_X, y)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None,
         normalize=False)

In [47]:
#coefficient from sklearn
sk_lr.coef_

array([-1.67779283,  0.28359762,  2.68586629,  1.12816882])

In [48]:
#intercept from sklearn
sk_lr.intercept_

11.790740740740736
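
Two properties of a fit on standardized variables are worth noting. Because every rescaled column has zero mean, the least-squares intercept equals the mean of y, so the intercept above should coincide with the mean of `average_points_scored`. The coefficients are also expressed per standard deviation rather than per original unit, and can be mapped back. The sketch below checks both on synthetic data; the data and all variable names are assumptions made for illustration.

```python
import numpy as np

# Synthetic two-variable data on very different scales (made-up values)
rng = np.random.default_rng(1)
X = rng.normal(loc=[6.5, 200.0], scale=[0.3, 20.0], size=(50, 2))
y = 2.0 + 1.5 * X[:, 0] + 0.01 * X[:, 1] + rng.normal(scale=0.5, size=50)

# Standardize, then fit with the normal equation
mu, sigma = X.mean(axis=0), X.std(axis=0)
Z = (X - mu) / sigma
A = np.column_stack([np.ones(len(Z)), Z])
w = np.linalg.solve(A.T @ A, A.T @ y)
intercept_std, coef_std = w[0], w[1:]

# Map the weights back to the original units of X
coef_orig = coef_std / sigma
intercept_orig = intercept_std - np.sum(coef_std * mu / sigma)

# Reference: least-squares fit on the unscaled data
B = np.column_stack([np.ones(len(X)), X])
w_raw = np.linalg.lstsq(B, y, rcond=None)[0]

print(np.isclose(intercept_std, y.mean()))
print(np.allclose(np.concatenate([[intercept_orig], coef_orig]), w_raw))
```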