### What is linear regression?

$ y = wx + b $

* <var>x</var>: input features
* <var>w</var>: weight (slope)
* <var>b</var>: bias (intercept)
* <var>y</var>: predicted value

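For example, suppose the data follow:

$ y = x^2 + 1 $

This relationship is not linear, so a straight line can only approximate it. Example dataset: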

| x | y |
|---|---|
| 1 | 2 |
| 2 | 5 |
| 3 | 10 |

$
Y = \begin{pmatrix}
y_1 \\
y_2 \\
y_3 \\
\vdots \\
y_n
\end{pmatrix}
$

$
\begin{pmatrix}
x_1 \\
x_2 \\
x_3 \\
\vdots \\
x_n
\end{pmatrix}
\cdot
\begin{pmatrix}
w_1
\end{pmatrix} + b
= \begin{pmatrix}
x_1w_1 + b \\
x_2w_1 + b \\
x_3w_1 + b \\
\vdots \\
x_nw_1 + b \\
\end{pmatrix}
= \hat{Y}
$

That is,

$
\hat{Y} = \begin{pmatrix}
\hat{y}_1 \\
\hat{y}_2 \\
\hat{y}_3 \\
\vdots \\
\hat{y}_n
\end{pmatrix}
$

$
errors = \hat{Y} - Y
$
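The matrix form above can be sketched in NumPy on the small dataset from the table. The weight and bias used here (`W = 3`, `b = -1`) are arbitrary values chosen by hand for illustration, not fitted parameters.

```python
import numpy as np

# Dataset from the table, stored as column vectors of shape (n, 1)
X = np.array([[1.0], [2.0], [3.0]])
Y = np.array([[2.0], [5.0], [10.0]])

# Hypothetical parameters, picked by hand for illustration only
W = np.array([[3.0]])  # shape (n_features, 1)
b = -1.0

Y_hat = X @ W + b      # predictions: row i is x_i * w_1 + b
errors = Y_hat - Y     # signed difference between prediction and target

print(Y_hat.ravel())
print(errors.ravel())
```

With these hand-picked values the predictions are 2, 5 and 8, so the errors are 0, 0 and -2: the line matches the first two points and undershoots the third.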

##### Gradient descent
* $ \hat{Y} = XW + b $
* $ errors = \hat{Y} - Y = (XW + b) - Y $
* $ MSE = \frac{1}{n} \sum_{i=1}^{n} errors_i^2 $

$$ \frac{\partial MSE}{\partial W} = \frac{\partial MSE}{\partial errors} \cdot \frac{\partial errors}{\partial W} $$

$ \frac{\partial MSE}{\partial W} = \frac{2X^T errors}{n}$

The constant factor 2 only rescales the step, so it can be absorbed into the learning rate. The direction used in the update is then

$ \frac{X^T errors}{n}$

$$ \frac{\partial MSE}{\partial b} = \frac{\partial MSE}{\partial errors} \cdot \frac{\partial errors}{\partial b} $$

$ \frac{\partial MSE}{\partial b} = \frac{2\sum_{i=1}^{n} errors_i}{n}$

Absorbing the factor 2 into the learning rate again, the direction used in the update is

$ \frac{\sum_{i=1}^{n} errors_i}{n}$
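One way to sanity-check the derivation is to compare the analytic gradients with central finite differences. The sketch below assumes the convention `errors = Y_hat - Y` and keeps the factor 2; the helper names `mse` and `analytic_gradients`, the starting point and `eps` are illustrative choices.

```python
import numpy as np

def mse(X, Y, W, b):
    """Mean squared error of the linear model X @ W + b."""
    errors = X @ W + b - Y
    return float(np.mean(errors ** 2))

def analytic_gradients(X, Y, W, b):
    """Gradients of the MSE with respect to W and b, factor 2 included."""
    n = X.shape[0]
    errors = X @ W + b - Y
    grad_W = 2.0 / n * (X.T @ errors)
    grad_b = 2.0 / n * float(np.sum(errors))
    return grad_W, grad_b

X = np.array([[1.0], [2.0], [3.0]])
Y = np.array([[2.0], [5.0], [10.0]])
W = np.array([[0.5]])  # arbitrary starting point (assumption)
b = 0.1

grad_W, grad_b = analytic_gradients(X, Y, W, b)

# Central differences: perturb one parameter at a time
eps = 1e-6
num_grad_W = (mse(X, Y, W + eps, b) - mse(X, Y, W - eps, b)) / (2 * eps)
num_grad_b = (mse(X, Y, W, b + eps) - mse(X, Y, W, b - eps)) / (2 * eps)

print(grad_W.item(), num_grad_W)
print(grad_b, num_grad_b)
```

Both pairs should agree to several decimal places, and both gradients are negative at this starting point, which means increasing `W` and `b` lowers the loss.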

##### Update weights and bias

$ W_{new} = W_{old} - \alpha \cdot \frac{\partial MSE}{\partial W} $

$ b_{new} = b_{old} - \alpha \cdot \frac{\partial MSE}{\partial b} $
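As a rough illustration of the update rule, the sketch below applies a few steps to the three-point dataset and records the loss at each step. The learning rate `alpha = 0.05`, the zero initialization and the step count are assumptions made for this example.

```python
import numpy as np

X = np.array([[1.0], [2.0], [3.0]])
Y = np.array([[2.0], [5.0], [10.0]])

alpha = 0.05           # learning rate (assumed value)
W = np.zeros((1, 1))   # start from zero weight and bias
b = 0.0
n = X.shape[0]

losses = []
for step in range(5):
    errors = X @ W + b - Y
    losses.append(float(np.mean(errors ** 2)))  # MSE before this update

    # Gradient directions with the factor 2 absorbed into alpha
    grad_W = X.T @ errors / n
    grad_b = np.sum(errors) / n

    # new = old - alpha * gradient
    W = W - alpha * grad_W
    b = b - alpha * grad_b

print(losses)
```

With a step size this small, each recorded loss is lower than the previous one. A much larger `alpha` can make the loss grow instead.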

In [1]:
import numpy as np
import math

In [2]:
class LinearRegression:
  """Linear regression trained with batch gradient descent."""

  def __init__(self, learning_rate=0.01, n_iterations=1000):
    self.learning_rate = learning_rate
    self.n_iterations = n_iterations
    self.weights = None
    self.bias = None

  def fit(self, X, y):
    # X has shape (n_samples, n_features); y is treated as a column vector
    n_samples, n_features = X.shape
    y = np.asarray(y).reshape(-1, 1)  # avoids (n, 1) - (n,) broadcasting to (n, n)
    print(f"n_samples: {n_samples}, n_features: {n_features}")
    # small random initial weights, shape (n_features, 1)
    self.weights = np.random.normal(0, 0.1, size=(n_features, 1))
    self.bias = 0

    for _ in range(self.n_iterations):
      # make predictions with the linear model: y = Xw + b
      y_predicteds = np.dot(X, self.weights) + self.bias
      # calculate the errors (prediction minus target)
      errors = y_predicteds - y

      # update weights and bias; the factor 2 of the MSE gradient is absorbed into the learning rate
      self.weights -= self.learning_rate * (1/n_samples) * np.dot(X.T, errors)
      self.bias -= self.learning_rate * (1/n_samples) * np.sum(errors)

  def predict(self, X):
    y_predicteds = np.dot(X, self.weights) + self.bias
    return y_predicteds

In [3]:
X = np.array([[1], [2], [3], [4], [5]])
Y = np.array([[2], [5], [10], [17], [26]])

In [4]:
learning_rate = 0.01
n_iterations = 10000

linear_model = LinearRegression(learning_rate, n_iterations)
linear_model.fit(X, Y)
predictions = linear_model.predict(X) # for simplicity we predict on the training data, so this does not measure generalization

print(predictions[:10])

n_samples: 5, n_features: 1
[[2.30558229e-07]
 [6.00000014e+00]
 [1.20000001e+01]
 [1.80000000e+01]
 [2.39999999e+01]]
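The predictions above lie on a straight line even though the targets follow x² + 1. As a cross-check, the same data can be fitted in closed form. This sketch uses `np.linalg.lstsq` instead of gradient descent, with a column of ones appended so the bias is estimated together with the weight.

```python
import numpy as np

X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
Y = np.array([[2.0], [5.0], [10.0], [17.0], [26.0]])

# Design matrix [x, 1]: the second column carries the bias term
A = np.hstack([X, np.ones_like(X)])

# Least-squares solution of A @ [w, b]^T ≈ Y
solution, *_ = np.linalg.lstsq(A, Y, rcond=None)
w, b = solution.ravel()

print(f"w = {w:.4f}, b = {b:.4f}")
print((A @ solution).ravel())
```

The least-squares line is approximately y = 6x - 6, which gives 0, 6, 12, 18 and 24 at x = 1..5. Gradient descent has therefore converged to the best linear fit, and the remaining error comes from the data not being linear.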


In [8]:
# Create a dummy dataset
X = np.random.normal(loc=0.0, scale=1, size=(1000, 1)) # features
Y = X*2 + 3 + np.random.normal(0.0, 0.1, size=(1000, 1)) # labels

n = int(len(X) * 0.8)

x_train, x_test = X[:n], X[n:]
y_train, y_test = Y[:n], Y[n:]

learning_rate = 0.01
n_iterations = 1000

linear_model = LinearRegression(learning_rate, n_iterations)
linear_model.fit(x_train, y_train)
predictions_test = linear_model.predict(x_test)
# print(f"predictions: {predictions_test[:10]},\n\n expecteds: {y_test[:10]}")

# Count a prediction as a hit when it is within 10% (relative tolerance) of the target
score_card = []
for i in range(predictions_test.shape[0]):
  if math.isclose(predictions_test[i][0], y_test[i][0], rel_tol=0.1):
    score_card.append(1)
  else:
    score_card.append(0)

print(score_card[:10])
acc = np.sum(score_card) / len(score_card) * 100 # share of predictions within tolerance; not a standard regression metric
print(f"Accuracy: {acc:.2f}%")

n_samples: 800, n_features: 1
[1, 1, 1, 1, 1, 0, 0, 1, 1, 1]
Accuracy: 91.50%
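The tolerance-based accuracy above depends on the chosen `rel_tol` and is strict for targets close to zero. More common regression metrics are the mean squared error and the coefficient of determination R². The sketch below computes both on a held-out split. So that it runs on its own, it regenerates similar data with a fixed seed and fits the line with `np.linalg.lstsq` in place of the `LinearRegression` class; both are assumptions of this example.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (assumption) for reproducibility
X = rng.normal(0.0, 1.0, size=(1000, 1))              # features
Y = 2 * X + 3 + rng.normal(0.0, 0.1, size=(1000, 1))  # labels with noise

n_train = int(len(X) * 0.8)
x_train, x_test = X[:n_train], X[n_train:]
y_train, y_test = Y[:n_train], Y[n_train:]

# Closed-form fit on the training split (stand-in for gradient descent)
A_train = np.hstack([x_train, np.ones_like(x_train)])
params, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
y_pred = np.hstack([x_test, np.ones_like(x_test)]) @ params

# Mean squared error and its square root on the test split
mse = float(np.mean((y_pred - y_test) ** 2))
rmse = mse ** 0.5

# R^2 = 1 - residual sum of squares / total sum of squares
ss_res = float(np.sum((y_test - y_pred) ** 2))
ss_tot = float(np.sum((y_test - y_test.mean()) ** 2))
r2 = 1.0 - ss_res / ss_tot

print(f"MSE: {mse:.4f}, RMSE: {rmse:.4f}, R^2: {r2:.4f}")
```

Since the noise has standard deviation 0.1, an RMSE near 0.1 and an R² close to 1 would indicate that the fit has recovered the underlying line.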
