# Deriving the Least Squares Solution for Linear Regression

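We begin with the cost function for simple linear regression, which fits a line y = a * x + b to N data points (x_i, y_i). The cost E is the sum of squared differences between the actual values y_i and the predicted values a * x_i + b, and the goal is to find the a and b that minimize it.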

In [None]:
# Cost function: sum of squared residuals over the data points held in x and y
E = sum((y_i - (a * x_i + b))**2 for x_i, y_i in zip(x, y))
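
To make the cost function concrete, the sketch below evaluates E on a small made-up data set. The function name `squared_error` and the sample values are illustrative assumptions, not part of the derivation.

```python
def squared_error(x, y, a, b):
    """Sum of squared residuals for the line y = a * x + b."""
    return sum((y_i - (a * x_i + b)) ** 2 for x_i, y_i in zip(x, y))


# Hypothetical sample data lying exactly on y = 2x + 1
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]

E_exact = squared_error(x, y, a=2.0, b=1.0)  # the generating line: every residual is zero
E_off = squared_error(x, y, a=2.0, b=0.0)    # intercept off by 1: every residual is 1

print(E_exact, E_off)
```

With the generating line the cost is zero. Shifting the intercept by 1 leaves a residual of 1 at each of the four points, so the cost rises to 4.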

## Step 1: Compute the Partial Derivatives
We compute the partial derivatives of E with respect to both parameters a and b.

In [None]:
# Chain rule applied to each squared residual; x and y hold the data points
dE_da = -2 * sum(x_i * (y_i - (a * x_i + b)) for x_i, y_i in zip(x, y))
dE_db = -2 * sum((y_i - (a * x_i + b)) for x_i, y_i in zip(x, y))
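
One way to gain confidence in these derivatives is to compare them with a numerical approximation. The sketch below does this with central differences. The helper names `cost` and `gradient`, the data, and the trial point are all illustrative assumptions.

```python
def cost(x, y, a, b):
    """Sum of squared residuals for the line y = a * x + b."""
    return sum((y_i - (a * x_i + b)) ** 2 for x_i, y_i in zip(x, y))


def gradient(x, y, a, b):
    """Analytic partial derivatives (dE/da, dE/db) of the cost."""
    dE_da = -2 * sum(x_i * (y_i - (a * x_i + b)) for x_i, y_i in zip(x, y))
    dE_db = -2 * sum(y_i - (a * x_i + b) for x_i, y_i in zip(x, y))
    return dE_da, dE_db


# Hypothetical data and an arbitrary trial point (a, b)
x = [0.0, 1.0, 2.0, 3.0]
y = [1.1, 2.9, 5.2, 6.8]
a, b, h = 0.5, -1.0, 1e-6

# Central differences approximate each partial derivative numerically
num_da = (cost(x, y, a + h, b) - cost(x, y, a - h, b)) / (2 * h)
num_db = (cost(x, y, a, b + h) - cost(x, y, a, b - h)) / (2 * h)

dE_da, dE_db = gradient(x, y, a, b)
print(abs(num_da - dE_da), abs(num_db - dE_db))
```

Because E is quadratic in a and b, the central difference should agree with the analytic value up to floating-point rounding.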

## Step 2: Set the derivatives to zero to find the minimum
To minimize the error, we set both derivatives equal to zero. Because E is a convex quadratic in a and b, a point where both derivatives vanish is a minimum.

In [None]:
# Setting both derivatives to zero and dividing by -2 gives:
# sum(x_i * (y_i - a * x_i - b)) = 0
# sum(y_i - a * x_i - b) = 0

## Step 3: Use summation shorthand
To simplify, we use the following notation:
- S_x  = sum x_i
- S_y  = sum y_i
- S_xx = sum x_i^2
- S_xy = sum x_i * y_i
- N    = number of data points

In [None]:
# Rewriting equations using shorthand
# Equation 1: S_xy - a * S_xx - b * S_x = 0
# Equation 2: S_y - a * S_x - b * N = 0
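
As a rough check of the shorthand, the sketch below computes the five sums for a made-up data set and evaluates the left-hand sides of Equations 1 and 2 at the line that generated the data. The helper name `sums` and the sample values are assumptions for illustration.

```python
def sums(x, y):
    """Return (N, S_x, S_y, S_xx, S_xy) for paired data."""
    N = len(x)
    S_x = sum(x)
    S_y = sum(y)
    S_xx = sum(x_i ** 2 for x_i in x)
    S_xy = sum(x_i * y_i for x_i, y_i in zip(x, y))
    return N, S_x, S_y, S_xx, S_xy


# Hypothetical data lying exactly on y = 2x + 1
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
N, S_x, S_y, S_xx, S_xy = sums(x, y)

a, b = 2.0, 1.0  # the line that generated the data
eq1 = S_xy - a * S_xx - b * S_x  # left-hand side of Equation 1
eq2 = S_y - a * S_x - b * N      # left-hand side of Equation 2

print(eq1, eq2)
```

Both left-hand sides come out to zero here, as expected when the data lie exactly on the line.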

## Step 4: Solve for 'a' using Equations 1 and 2
First, solve Equation 2 for b. Then substitute that expression into Equation 1 and solve for a.

In [None]:
b = (S_y - a * S_x) / N

In [None]:
a = (N * S_xy - S_x * S_y) / (N * S_xx - S_x**2)
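
The substitution that leads to this expression for a can be written out step by step. The derivation below uses only Equations 1 and 2 and the shorthand sums already defined.

```latex
\begin{align*}
S_{xy} - a\,S_{xx} - \frac{S_y - a\,S_x}{N}\,S_x &= 0
  && \text{(substitute $b$ into Equation 1)} \\
N\,S_{xy} - a\,N\,S_{xx} - S_x S_y + a\,S_x^2 &= 0
  && \text{(multiply through by $N$)} \\
a\,\bigl(N\,S_{xx} - S_x^2\bigr) &= N\,S_{xy} - S_x S_y
  && \text{(collect the terms in $a$)} \\
a &= \frac{N\,S_{xy} - S_x S_y}{N\,S_{xx} - S_x^2}
\end{align*}
```

The last step requires N * S_xx - S_x**2 to be nonzero, which holds whenever the x_i are not all equal.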

## Step 5: Solve for 'b' using the solved value of 'a'

In [None]:
b = (S_y - a * S_x) / N

## Final Closed-form Expressions for Linear Regression Coefficients

In [None]:
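# Closed-form slope and intercept; each sum runs over the data points held in x and y
a = (N * sum(x_i * y_i for x_i, y_i in zip(x, y)) - sum(x) * sum(y)) / (N * sum(x_i**2 for x_i in x) - sum(x)**2)
b = (sum(y) - a * sum(x)) / N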
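
The closed-form expressions can be collected into a single function. The following is a minimal sketch, not a definitive implementation: the name `fit_line`, the guard against a zero denominator, and the sample data are assumptions added for illustration.

```python
def fit_line(x, y):
    """Least-squares slope a and intercept b for the line y = a * x + b."""
    N = len(x)
    S_x, S_y = sum(x), sum(y)
    S_xx = sum(x_i ** 2 for x_i in x)
    S_xy = sum(x_i * y_i for x_i, y_i in zip(x, y))

    denom = N * S_xx - S_x ** 2
    if denom == 0:
        # All x values coincide (or there is no data), so the slope is undefined
        raise ValueError("x values must not all be equal")

    a = (N * S_xy - S_x * S_y) / denom
    b = (S_y - a * S_x) / N
    return a, b


# Hypothetical data lying exactly on y = 2x + 1
a, b = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(a, b)
```

For this data the function recovers the slope 2 and the intercept 1. The exact `denom == 0` test is a simplification. With noisy floating-point inputs a small tolerance may be a better choice.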