## Linear Regression

## Mathematical Formulation

Linear regression is a statistical method for modeling the relationship between a dependent variable and one or more independent variables. Simple linear regression uses a single independent variable, and its model equation is:

$$
y = \beta_0 + \beta_1 x + \epsilon
$$

where:
- $y$ is the dependent variable.
- $x$ is the independent variable.
- $\beta_0$ is the y-intercept.
- $\beta_1$ is the slope of the line.
- $\epsilon$ is the error term.
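
To make the role of each term concrete, the sketch below generates data from this model. The function name `simulate_linear`, the Gaussian form of the error term, and the parameter values are assumptions chosen for illustration; the model equation itself only says that $\epsilon$ is an error term.

```python
import random

def simulate_linear(xs, beta0, beta1, sigma, seed=0):
    """Draw y = beta0 + beta1 * x + eps for each x, with eps ~ Normal(0, sigma).

    Gaussian noise is an illustrative assumption, not part of the model equation.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]

xs = [1, 2, 3, 4, 5]
noiseless = simulate_linear(xs, beta0=2.0, beta1=0.5, sigma=0.0)
noisy = simulate_linear(xs, beta0=2.0, beta1=0.5, sigma=1.0)
print(noiseless)  # points lying exactly on the line
print(noisy)      # the same line with random scatter added
```

With `sigma=0.0` the error term vanishes and every point falls on the line; increasing `sigma` spreads the points around it.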

## Assumptions

Linear regression makes several key assumptions:
1. Linearity: The relationship between the independent and dependent variables is linear.
2. Independence: The residuals (errors) are independent.
3. Homoscedasticity: The residuals have constant variance at every level of $x$.
4. Normality: For any fixed value of $x$, the residuals are normally distributed.
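
These assumptions are usually examined through the residuals. The sketch below computes them for a given line and compares the residual variance in the lower and upper halves of the $x$ range, which is only a crude look at homoscedasticity. The helper names and the split-in-half heuristic are assumptions made for illustration; the data are the five sample points from the example later in this document, and the parameters 0.6 and 0.8 are the least-squares intercept and slope for those points.

```python
from statistics import mean, pvariance

def residuals(xs, ys, beta0, beta1):
    """Observed value minus predicted value for each point."""
    return [y - (beta0 + beta1 * x) for x, y in zip(xs, ys)]

def residual_variance_by_half(xs, res):
    """Crude homoscedasticity check: residual variance for low vs. high x."""
    ordered = [r for _, r in sorted(zip(xs, res))]  # residuals sorted by x
    mid = len(ordered) // 2
    return pvariance(ordered[:mid]), pvariance(ordered[mid:])

xs = [1, 2, 3, 4, 5]
ys = [1, 3, 2, 5, 4]
res = residuals(xs, ys, beta0=0.6, beta1=0.8)
low_var, high_var = residual_variance_by_half(xs, res)
print(mean(res))          # close to zero for a least-squares line
print(low_var, high_var)  # similar values would be consistent with constant variance
```

With only five points these numbers are not a meaningful test; in practice a residual plot over a larger sample is more informative.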

## Goal

The objective is to find the values of $\beta_0$ and $\beta_1$ that minimize the sum of squared errors between the predicted values and the actual values.

## Cost Function

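The cost function is the mean squared error, that is, the sum of squared errors divided by the number of observations $n$. Because $n$ is constant, both quantities are minimized by the same parameters: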

$$
J(\beta_0, \beta_1) = \frac{1}{n} \sum_{i=1}^n \left( y_i - (\beta_0 + \beta_1 x_i) \right)^2
$$
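
As a minimal sketch, the cost function can be written directly from this formula. The function name `cost` is an assumption; the data are the five sample points used in the example later in this document.

```python
def cost(beta0, beta1, xs, ys):
    """Mean squared error J(beta0, beta1) over paired observations."""
    n = len(xs)
    return sum((y - (beta0 + beta1 * x)) ** 2 for x, y in zip(xs, ys)) / n

xs = [1, 2, 3, 4, 5]
ys = [1, 3, 2, 5, 4]
print(cost(0.0, 0.0, xs, ys))  # all-zero parameters: the mean of y_i squared
print(cost(0.6, 0.8, xs, ys))  # a line that follows the points far more closely
```

A lower value means the line passes closer to the observed points on average; a line through every point would give a cost of zero.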

## Gradient Descent

To derive the gradient descent algorithm for linear regression, we start with the cost function:

$$
J(\beta_0, \beta_1) = \frac{1}{n} \sum_{i=1}^n \left( y_i - (\beta_0 + \beta_1 x_i) \right)^2
$$

We need to find the partial derivatives of the cost function with respect to $\beta_0$ and $\beta_1$:

$$
\frac{\partial J}{\partial \beta_0} = -\frac{2}{n} \sum_{i=1}^n \left( y_i - (\beta_0 + \beta_1 x_i) \right)
$$

$$
\frac{\partial J}{\partial \beta_1} = -\frac{2}{n} \sum_{i=1}^n \left( y_i - (\beta_0 + \beta_1 x_i) \right) x_i
$$
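
One way to gain confidence in these derivatives is to compare them with a numerical approximation. The sketch below implements both formulas and checks them against central finite differences; the function names and the step size `h` are assumptions made for illustration.

```python
def cost(beta0, beta1, xs, ys):
    """Mean squared error J(beta0, beta1)."""
    n = len(xs)
    return sum((y - (beta0 + beta1 * x)) ** 2 for x, y in zip(xs, ys)) / n

def gradient(beta0, beta1, xs, ys):
    """Analytic partial derivatives (dJ/dbeta0, dJ/dbeta1)."""
    n = len(xs)
    residuals = [y - (beta0 + beta1 * x) for x, y in zip(xs, ys)]
    d_beta0 = -2.0 / n * sum(residuals)
    d_beta1 = -2.0 / n * sum(r * x for r, x in zip(residuals, xs))
    return d_beta0, d_beta1

def numeric_gradient(beta0, beta1, xs, ys, h=1e-6):
    """Central finite differences, used only as a sanity check."""
    d0 = (cost(beta0 + h, beta1, xs, ys) - cost(beta0 - h, beta1, xs, ys)) / (2 * h)
    d1 = (cost(beta0, beta1 + h, xs, ys) - cost(beta0, beta1 - h, xs, ys)) / (2 * h)
    return d0, d1

xs = [1, 2, 3, 4, 5]
ys = [1, 3, 2, 5, 4]
analytic = gradient(0.0, 0.0, xs, ys)
numeric = numeric_gradient(0.0, 0.0, xs, ys)
print(analytic)  # from the formulas above
print(numeric)   # should agree to several decimal places
```

Both derivatives are negative at the starting point $(0, 0)$, which means that increasing either parameter lowers the cost there.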

Using these partial derivatives, we update the parameters $\beta_0$ and $\beta_1$ iteratively, where $\alpha > 0$ is the learning rate that controls the step size:

$$
\begin{align*}
\beta_0 &:= \beta_0 - \alpha \frac{\partial J}{\partial \beta_0} \\
\beta_1 &:= \beta_1 - \alpha \frac{\partial J}{\partial \beta_1}
\end{align*}
$$

Substituting the partial derivatives, we get:

$$
\begin{align*}
\beta_0 &:= \beta_0 + \alpha \frac{2}{n} \sum_{i=1}^n \left( y_i - (\beta_0 + \beta_1 x_i) \right) \\
\beta_1 &:= \beta_1 + \alpha \frac{2}{n} \sum_{i=1}^n \left( y_i - (\beta_0 + \beta_1 x_i) \right) x_i
\end{align*}
$$
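
A single application of this update can be sketched as follows. The function name `gradient_step`, the starting point $(0, 0)$, and the learning rate $\alpha = 0.05$ are assumptions chosen for illustration; the data are the five sample points from the example later in this document.

```python
def cost(beta0, beta1, xs, ys):
    """Mean squared error J(beta0, beta1)."""
    n = len(xs)
    return sum((y - (beta0 + beta1 * x)) ** 2 for x, y in zip(xs, ys)) / n

def gradient_step(beta0, beta1, xs, ys, alpha):
    """Apply one gradient descent update and return the new parameters."""
    n = len(xs)
    residuals = [y - (beta0 + beta1 * x) for x, y in zip(xs, ys)]
    # Both updates are computed from the old parameter values.
    new_beta0 = beta0 + alpha * (2.0 / n) * sum(residuals)
    new_beta1 = beta1 + alpha * (2.0 / n) * sum(r * x for r, x in zip(residuals, xs))
    return new_beta0, new_beta1

xs = [1, 2, 3, 4, 5]
ys = [1, 3, 2, 5, 4]
before = cost(0.0, 0.0, xs, ys)
beta0, beta1 = gradient_step(0.0, 0.0, xs, ys, alpha=0.05)
after = cost(beta0, beta1, xs, ys)
print(beta0, beta1)   # roughly 0.3 and 1.06 after one step
print(before, after)  # the cost drops after the update
```

If $\alpha$ is too large the update can overshoot and the cost can increase instead, so the learning rate usually needs some tuning.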

This is the gradient descent algorithm for linear regression. Once the parameters have converged, the fitted model is the line:

$$
y = \beta_0 + \beta_1 x
$$

After training, we might find:

$$
\begin{align*}
\beta_0 &\approx 0.6667 \\
\beta_1 &\approx 1.3333
\end{align*}
$$

With these values the fitted line has a slope of about 1.33 and an intercept of about 0.67: the prediction is about 0.67 at $x = 0$ and rises by about 1.33 for each one-unit increase in $x$.

## Algorithm
1. Initialize Parameters: Start with initial guesses for $\beta_0$ and $\beta_1$.
2. Compute Predictions: Use the current parameters to compute the predicted values $\hat{y}$.
3. Calculate Error: Compute the difference between the predicted values $\hat{y}$ and the actual values $y$.
4. Update Parameters: Adjust the parameters to minimize the error using a method such as gradient descent.
5. Repeat: Iterate over steps 2–4 until the parameters converge to values that minimize the error.
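
The steps above can be sketched as a short loop. The function name `fit_gradient_descent`, the zero initial guesses, the learning rate, the tolerance, and the iteration cap are all assumptions made for illustration, and the stopping rule (stop when the largest parameter change falls below `tol`) is only one of several reasonable convergence criteria.

```python
def fit_gradient_descent(xs, ys, alpha=0.05, tol=1e-9, max_iters=10_000):
    """Fit y = beta0 + beta1 * x by batch gradient descent on the mean squared error."""
    n = len(xs)
    beta0, beta1 = 0.0, 0.0                          # step 1: initial guesses
    for _ in range(max_iters):
        preds = [beta0 + beta1 * x for x in xs]      # step 2: predictions
        errors = [y - p for y, p in zip(ys, preds)]  # step 3: actual minus predicted
        # step 4: gradient descent update
        step0 = alpha * (2.0 / n) * sum(errors)
        step1 = alpha * (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        beta0 += step0
        beta1 += step1
        # step 5: stop once the parameters have effectively stopped changing
        if max(abs(step0), abs(step1)) < tol:
            break
    return beta0, beta1

xs = [1, 2, 3, 4, 5]
ys = [1, 3, 2, 5, 4]
beta0, beta1 = fit_gradient_descent(xs, ys)
print(round(beta0, 4), round(beta1, 4))  # approximately 0.6 and 0.8
```

For these five points the loop settles near an intercept of 0.6 and a slope of 0.8, which is the least-squares line for this data.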


## Example

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Sample data: scikit-learn expects X as a 2-D array of shape (n_samples, n_features)
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([1, 3, 2, 5, 4])

# Create and train the model
model = LinearRegression()
model.fit(X, y)

# Inspect the fitted parameters (for this data, approximately 0.6 and 0.8)
print("intercept:", model.intercept_, "slope:", model.coef_[0])

# Make predictions
y_pred = model.predict(X)

# Plot the results
plt.scatter(X, y, color='blue', label='Actual')
plt.plot(X, y_pred, color='red', label='Predicted')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.show()
```