In [181]:
import numpy as np
import matplotlib.pyplot as plt

# Simple linear regression

In [182]:
age = np.array([5, 6, 7, 8, 9])
height = np.array([100, 105, 108, 112, 115])

Calculate $\bar{x}$ (Mean Age)  
Calculate $\bar{y}$ (Mean Height)

In [183]:
age.mean()
height.mean()

np.float64(7.0)

np.float64(108.0)

$x - \bar{x}$  
$y - \bar{y}$

In [184]:
age - age.mean()
height - height.mean()

array([-2., -1.,  0.,  1.,  2.])

array([-8., -3.,  0.,  4.,  7.])

$(x - \bar{x})^2$

In [185]:
(age - age.mean()) ** 2

array([4., 1., 0., 1., 4.])

$(x - \bar{x})(y - \bar{y})$

In [186]:
(age - age.mean()) * (height - height.mean())

array([16.,  3.,  0.,  4., 14.])

Sum the $(x - \bar{x})^2$ column. This is your Denominator ($S_{xx}$).

In [187]:
denominator = ((age - age.mean()) ** 2).sum()
denominator

np.float64(10.0)

Sum the $(x - \bar{x})(y - \bar{y})$ column. This is your Numerator ($S_{xy}$).

In [188]:
numerator = ((age - age.mean()) * (height - height.mean())).sum()
numerator

np.float64(37.0)

Calculate Slope $m = \frac{\text{Numerator}}{\text{Denominator}}$

In [189]:
m = numerator / denominator
m

np.float64(3.7)

Calculate Intercept $b = \bar{y} - m\bar{x}$

In [190]:
b = height.mean() - (m * age.mean())
b

np.float64(82.1)

Final Equation: $$\text{Height} = 3.7 \cdot \text{Age} + 82.1$$
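As a sanity check on the hand computation, the same line can be fitted with `np.polyfit`. This is only a cross-check sketch: the names `s_xx`, `s_xy`, `m_np`, and `b_np` are introduced here for illustration and do not appear in the cells above.

```python
import numpy as np

age = np.array([5, 6, 7, 8, 9])
height = np.array([100, 105, 108, 112, 115])

# Closed-form least-squares estimates, following the steps above
s_xx = ((age - age.mean()) ** 2).sum()
s_xy = ((age - age.mean()) * (height - height.mean())).sum()
m = s_xy / s_xx
b = height.mean() - m * age.mean()

# For deg=1, np.polyfit returns the coefficients highest power first:
# [slope, intercept]
m_np, b_np = np.polyfit(age, height, deg=1)

print(np.isclose(m, m_np), np.isclose(b, b_np))
```

Both comparisons use `np.isclose` rather than `==` because the two routes accumulate floating-point error differently.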

Prediction  
For Age 10: $y = 3.7(10) + 82.1 = 119.1 \text{ cm}$

In [191]:
x = 10
y = m * x + b
y

np.float64(119.1)

Let's do some more predictions

In [192]:
for x in age:
  y = m * x + b
  print(y)

100.6
104.3
108.0
111.69999999999999
115.4


How far off are we from the actual heights?

In [193]:
for x, y in zip(age, height):
  y_ = m * x + b
  print(y, y_)

100 100.6
105 104.3
108 108.0
112 111.69999999999999
115 115.4


No need to iterate over the samples manually.  
NumPy broadcasting does it for us 😏

In [194]:
height_pred = m * age + b
height_pred

array([100.6, 104.3, 108. , 111.7, 115.4])
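To make the equivalence concrete, here is a small sketch comparing the loop with the broadcast expression. The values of `m` and `b` are hard-coded from the results above, and the names `looped` and `vectorized` are illustrative.

```python
import numpy as np

age = np.array([5, 6, 7, 8, 9])
m, b = 3.7, 82.1  # slope and intercept computed earlier

# Loop version: one prediction per sample
looped = np.array([m * x + b for x in age])

# Broadcast version: the scalars m and b are applied to every element of age
vectorized = m * age + b

print(np.allclose(looped, vectorized))
```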

Calculate residuals

In [195]:
height - height_pred

array([-0.6,  0.7,  0. ,  0.3, -0.4])

Sum of Squared Residuals (SSR)

In [196]:
((height - height_pred) ** 2).sum()

np.float64(1.1000000000000085)

Mean Squared Error (MSE)

In [197]:
((height - height_pred) ** 2).sum() / len(age)

np.float64(0.2200000000000017)

Root Mean Squared Error (RMSE)

In [198]:
np.sqrt(((height - height_pred) ** 2).sum() / len(age))

np.float64(0.46904157598234475)
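The three error metrics build on one another, so they can be computed in a single pass. The sketch below assumes the predictions from above (hard-coded here so the block runs on its own) and uses `np.mean`, which is equivalent to dividing the sum by `len(age)`.

```python
import numpy as np

height = np.array([100, 105, 108, 112, 115])
height_pred = np.array([100.6, 104.3, 108.0, 111.7, 115.4])

residuals = height - height_pred
ssr = (residuals ** 2).sum()    # sum of squared residuals
mse = np.mean(residuals ** 2)   # same as ssr / len(height)
rmse = np.sqrt(mse)             # back in the units of height (cm)

print(ssr, mse, rmse)
```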

$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$$

Where:  
$SS_{res}$ (Residual Sum of Squares): $\sum (y_{true} - \hat{y})^2$, the error our model makes.  
$SS_{tot}$ (Total Sum of Squares): $\sum (y_{true} - \bar{y})^2$, the variation in the data itself.

In [199]:
ss_res = ((height - height_pred) ** 2).sum()
ss_res

ss_tot = ((height - height.mean()) ** 2).sum()
ss_tot

r2 = 1 - ss_res / ss_tot
r2

np.float64(1.1000000000000085)

np.float64(138.0)

np.float64(0.9920289855072463)
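For simple linear regression fitted by least squares with an intercept, $R^2$ equals the square of the Pearson correlation between $x$ and $y$. The sketch below checks that on this dataset using `np.corrcoef`. The variable `r` is introduced here for illustration.

```python
import numpy as np

age = np.array([5, 6, 7, 8, 9])
height = np.array([100, 105, 108, 112, 115])

# Refit the line so the block is self-contained
m = ((age - age.mean()) * (height - height.mean())).sum() / ((age - age.mean()) ** 2).sum()
b = height.mean() - m * age.mean()
height_pred = m * age + b

ss_res = ((height - height_pred) ** 2).sum()
ss_tot = ((height - height.mean()) ** 2).sum()
r2 = 1 - ss_res / ss_tot

# np.corrcoef returns a 2x2 correlation matrix; the off-diagonal entry is r
r = np.corrcoef(age, height)[0, 1]

print(np.isclose(r2, r ** 2))
```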

# Multiple linear regression (with just 1 feature)

**`X` should be a matrix (two dimensions)**

1. Either expand the dimensions

In [200]:
X = np.expand_dims(age, axis=1)
X
X.shape  # 5 rows, 1 col

array([[5],
       [6],
       [7],
       [8],
       [9]])

(5, 1)

2. Or reshape (preferred)

In [201]:
X = age.reshape(-1, 1)  # start from the 1-D array; -1 lets NumPy infer the row count
X
X.shape  # 5 rows, 1 col

array([[5],
       [6],
       [7],
       [8],
       [9]])

(5, 1)
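Both approaches produce the same column vector. A third common spelling is indexing with `np.newaxis`. The sketch below puts the three side by side. The names `via_expand`, `via_reshape`, and `via_newaxis` are illustrative only.

```python
import numpy as np

age = np.array([5, 6, 7, 8, 9])

via_expand = np.expand_dims(age, axis=1)   # insert a new axis at position 1
via_reshape = age.reshape(-1, 1)           # -1 means "infer this dimension"
via_newaxis = age[:, np.newaxis]           # same effect through indexing

print(via_expand.shape, via_reshape.shape, via_newaxis.shape)
```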

In [202]:
bias_col = np.ones(len(age))
bias_col

array([1., 1., 1., 1., 1.])

Prepend the bias column so that `X` ends up as:  
Shape: $(5 \times 2)$  
Column 1: The Bias (all 1s).  
Column 2: The Age values (5, 6, 7, 8, 9).  

In [203]:
X = np.c_[bias_col, X]
X
X.shape

array([[1., 5.],
       [1., 6.],
       [1., 7.],
       [1., 8.],
       [1., 9.]])

(5, 2)
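The design matrix can also be built in one step from the 1-D arrays. This sketch uses `np.column_stack`, which treats each 1-D input as a column, as one alternative to `np.c_`. It is not a claim about which is preferable.

```python
import numpy as np

age = np.array([5, 6, 7, 8, 9])

# Column 0: bias (all ones), column 1: the age feature
X = np.column_stack([np.ones(len(age)), age])

print(X.shape)
```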

Create $y$. These are just the Height values.  
Shape: $(5 \times 1)$

In [204]:
y = height.reshape(-1, 1)
y
y.shape

array([[100],
       [105],
       [108],
       [112],
       [115]])

(5, 1)

Part A: The Transpose ($X^T$)  
It should be shape $(2 \times 5)$.  

In [205]:
X.T
X.T.shape

array([[1., 1., 1., 1., 1.],
       [5., 6., 7., 8., 9.]])

(2, 5)

Part B: The Gram Matrix ($X^T X$)

In [206]:
X.T.shape, X.shape

((2, 5), (5, 2))

In [207]:
X.T  # each row of X.T is dotted with...
X  # ...each column of X

array([[1., 1., 1., 1., 1.],
       [5., 6., 7., 8., 9.]])

array([[1., 5.],
       [1., 6.],
       [1., 7.],
       [1., 8.],
       [1., 9.]])

**Matrix Multiplication**

For matrices $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{n \times p}$, the elements of the product $C = AB$ are given by: $$C_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj}$$

**Explanation:**

$A$ is an $m \times n$ matrix (rows $\times$ columns).

$B$ is an $n \times p$ matrix.

The resulting matrix $C$ is $m \times p$.

To find the value at row $i$, column $j$ of the result, you perform a dot product of the $i$-th row of $A$ and the $j$-th column of $B$.

In [208]:
C = np.zeros(shape=(2, 2))

for i in range(2):
  for j in range(2):
    C[i, j] = np.dot(X.T[i], X[:, j])

C

array([[  5.,  35.],
       [ 35., 255.]])
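The summation formula can also be written out element by element, with an explicit loop over $k$ in place of `np.dot`. The helper `matmul_naive` below is a hypothetical name for a teaching sketch. In practice `@` should be used.

```python
import numpy as np

def matmul_naive(A, B):
    """Compute C = AB via C[i, j] = sum_k A[i, k] * B[k, j]."""
    m, n = A.shape
    n_b, p = B.shape
    if n != n_b:
        raise ValueError("inner dimensions must match")
    C = np.zeros((m, p))
    for i in range(m):          # rows of A
        for j in range(p):      # columns of B
            for k in range(n):  # shared inner dimension
                C[i, j] += A[i, k] * B[k, j]
    return C

X = np.array([[1., 5.], [1., 6.], [1., 7.], [1., 8.], [1., 9.]])
C = matmul_naive(X.T, X)

print(np.allclose(C, X.T @ X))
```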

With a bias column and a single feature $x$, the Gram matrix has this form, where $n$ is the number of samples:
$$\begin{bmatrix} n & \sum x \\ \sum x & \sum x^2 \end{bmatrix}$$

In [209]:
gram_matrix = X.T @ X
gram_matrix
gram_matrix.shape

array([[  5.,  35.],
       [ 35., 255.]])

(2, 2)
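The entries of the Gram matrix can be checked directly against the sums they are supposed to equal. In this sketch `expected` is an illustrative name, built from `len(age)`, `age.sum()`, and `(age ** 2).sum()`.

```python
import numpy as np

age = np.array([5, 6, 7, 8, 9])
X = np.column_stack([np.ones(len(age)), age])  # bias column, then age

gram_matrix = X.T @ X

# [[n, sum(x)], [sum(x), sum(x^2)]]
expected = np.array([
    [len(age), age.sum()],
    [age.sum(), (age ** 2).sum()],
])

print(np.allclose(gram_matrix, expected))
```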