<a href="https://colab.research.google.com/github/ravi18kumar2021/numpy-to-viz/blob/main/numpy/simple-linear-regression.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## 🎯 Goal:
### Fit a line $y = mx + b$ to a dataset using only NumPy formulas, without scikit-learn.

### 🧠 Things to learn
* Line fitting using least squares
* Working with 1D NumPy arrays
* Implementing regression manually

Equation of linear regression

*General Form:*

$
y = mx + b
$

where:
- $y$ is the dependent variable (target)
- $x$ is the independent variable (feature)
- $m$ is the slope (coefficient)
- $b$ is the intercept

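### Least Squares Method Formula

For $n$ data points, the slope $m$ and intercept $b$ that minimize the sum of squared errors are:

$m = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}$, $b = \frac{\sum y \sum x^2 - \sum x \sum xy}{n\sum x^2 - (\sum x)^2}$

Once $m$ is known, the intercept can equivalently be computed as $b = \frac{\sum y - m\sum x}{n}$, which is the form used in the code below.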
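
A brief sketch of where the slope and intercept formulas come from, writing $m$ for the slope and $b$ for the intercept of $y = mx + b$. Least squares chooses the values that minimize the sum of squared errors:

$$
S(m, b) = \sum_{i=1}^{n} (y_i - m x_i - b)^2
$$

Setting both partial derivatives to zero gives two linear equations in $m$ and $b$ (the normal equations):

$$
\frac{\partial S}{\partial m} = -2\sum x_i (y_i - m x_i - b) = 0 \;\Rightarrow\; m\sum x^2 + b\sum x = \sum xy
$$

$$
\frac{\partial S}{\partial b} = -2\sum (y_i - m x_i - b) = 0 \;\Rightarrow\; m\sum x + nb = \sum y
$$

The second equation gives $b = \frac{\sum y - m\sum x}{n}$. Substituting that into the first and solving for $m$ gives the slope formula with denominator $n\sum x^2 - (\sum x)^2$. That denominator is zero only when all $x$ values are identical, in which case no unique line exists.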

In [60]:
import numpy as np

In [61]:
x = np.array([3, 9, 5, 3])
y = np.array([8, 6, 4, 2])

In [62]:
sum_x = np.sum(x)
sum_y = np.sum(y)

In [63]:
sum_x2 = np.sum(x**2)  # x**2 also works on NumPy versions older than 2.0, where np.pow is unavailable
sum_xy = np.sum(x*y)

In [64]:
n = len(x)

In [65]:
# finding slope m
m_num = n * sum_xy - sum_x * sum_y
m_den = n * sum_x2 - sum_x**2

In [66]:
m = m_num / m_den

In [67]:
m

np.float64(0.16666666666666666)

In [68]:
# finding intercept b
b = (sum_y - m * sum_x) / n
b

np.float64(4.166666666666667)

In [69]:
# equation in general form
print(f'y = {m:.3f}x + {b:.3f}')

y = 0.167x + 4.167
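
As a sanity check, the hand-computed coefficients can be compared with NumPy's built-in `np.polyfit`, which fits a polynomial by least squares and returns the coefficients from highest degree to lowest (slope first, then intercept, for degree 1). This is a minimal sketch that repeats the data and formulas from above so it runs on its own; the names `m_manual`, `b_manual`, `m_np`, and `b_np` are illustrative only.

```python
import numpy as np

x = np.array([3, 9, 5, 3])
y = np.array([8, 6, 4, 2])
n = len(x)

# closed-form least squares, same formulas as in the cells above
m_manual = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x**2) - np.sum(x)**2)
b_manual = (np.sum(y) - m_manual * np.sum(x)) / n

# degree-1 polynomial fit: returns [slope, intercept]
m_np, b_np = np.polyfit(x, y, 1)

# True when both methods agree within floating-point tolerance
print(np.allclose([m_manual, b_manual], [m_np, b_np]))
```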


In [70]:
y_pred = m * x + b
y_pred

array([4.66666667, 5.66666667, 5.        , 4.66666667])

In [71]:
# finding mean squared error (mse)
np.mean((y - y_pred)**2)

np.float64(4.833333333333333)
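
The least squares line is, by construction, the line with the smallest MSE on this data. One way to see that is to nudge the slope or the intercept and watch the error grow. A minimal sketch on the same data; the helper `mse` and the list `nudges` are hypothetical names introduced for this illustration.

```python
import numpy as np

x = np.array([3, 9, 5, 3])
y = np.array([8, 6, 4, 2])
n = len(x)

def mse(m, b):
  """Mean squared error of the line y = m*x + b on (x, y)."""
  return np.mean((y - (m * x + b))**2)

# least squares slope and intercept (closed form)
m = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x**2) - np.sum(x)**2)
b = (np.sum(y) - m * np.sum(x)) / n

best = mse(m, b)
print(f'least squares fit: MSE = {best:.3f}')

# small changes to the slope or intercept, each of which should increase the MSE
nudges = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.5), (0.0, -0.5)]
for dm, db in nudges:
  print(f'm{dm:+.1f}, b{db:+.1f}: MSE = {mse(m + dm, b + db):.3f}')
```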

In [72]:
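def simple_linear_regression(x, y):
  """Return slope m and intercept b of the least squares line y = mx + b."""
  n = len(x)
  sum_x = np.sum(x)
  sum_y = np.sum(y)
  sum_x2 = np.sum(x**2)
  sum_xy = np.sum(x*y)
  # slope: (n*sum(xy) - sum(x)*sum(y)) / (n*sum(x^2) - sum(x)^2)
  m_num = n * sum_xy - sum_x * sum_y
  m_den = n * sum_x2 - sum_x**2
  m = m_num / m_den
  # intercept: (sum(y) - m*sum(x)) / n
  b = (sum_y - m * sum_x) / n
  return m, b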
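
The same slope and intercept can also be written in terms of deviations from the means: $m = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$ and $b = \bar{y} - m\bar{x}$. This form is algebraically equivalent to the sums-based formula and tends to be less sensitive to round-off when the values are large. The sketch below is an alternative, not the notebook's method; the function name `simple_linear_regression_centered` and the `ValueError` guard for identical x values are assumptions added for illustration.

```python
import numpy as np

def simple_linear_regression_centered(x, y):
  """Least squares slope and intercept using deviations from the means."""
  x = np.asarray(x, dtype=float)
  y = np.asarray(y, dtype=float)
  dx = x - x.mean()
  dy = y - y.mean()
  den = np.sum(dx**2)
  if den == 0:
    # all x values are identical, so the slope is undefined
    raise ValueError('x values must not all be identical')
  m = np.sum(dx * dy) / den
  b = y.mean() - m * x.mean()
  return m, b

m, b = simple_linear_regression_centered([1, 2, 3, 4, 5], [3, 4, 2, 5, 6])
print(f'y = {m:.3f}x + {b:.3f}')  # y = 0.700x + 1.900
```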

In [73]:
def display_info(m, b, x, y):
  """Print the fitted equation, the predictions for x, and the MSE against y."""
  print(f'General Equation : y = {m:.3f}x', end=' ')
  if b < 0:
    print(f'- {-b:.3f}')
  else:
    print(f'+ {b:.3f}')
  # x and y are passed in explicitly instead of being read from global variables
  y_pred = m * x + b
  print('Predicted values :', y_pred)
  mse = np.mean((y - y_pred)**2)
  print(f'Mean Squared Error : {mse:.3f}')

In [74]:
X = np.array([1, 2, 3, 4, 5])
y = np.array([3, 4, 2, 5, 6])

In [75]:
m, b = simple_linear_regression(X, y)

In [76]:
display_info(m, b, X, y)

General Equation : y = 0.700x + 1.900
Predicted values : [2.6 3.3 4.  4.7 5.4]
Mean Squared Error : 1.020
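
Once $m$ and $b$ are known, the fitted line can also estimate $y$ for an $x$ that was not in the data. A minimal sketch using the rounded coefficients printed above; the helper name `predict` and the inputs 6 and 7 are illustrative choices, not part of the original notebook. Both inputs lie outside the range of the fitted $x$ values, so the results are extrapolations and should be read with caution.

```python
import numpy as np

# coefficients from the fit above (y = 0.700x + 1.900)
m, b = 0.7, 1.9

def predict(x_new, m, b):
  """Evaluate the fitted line y = m*x + b at x_new (scalar or array-like)."""
  return m * np.asarray(x_new) + b

print(f'{predict(6, m, b):.2f}')
print(np.round(predict([6, 7], m, b), 2))
```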
