**Simple Linear Regression** is a supervised machine learning algorithm that predicts a continuous output from a single input feature.
It finds a straight-line relationship between an independent variable (X) and a dependent variable (Y).

#### What it does
- You give the model one input value (X)
- The model predicts one output value (Y)
- The prediction is done using a straight line

Example: 
Input (X): Study hours 
Output (Y): Marks obtained 

### Mathematical Equation
The straight-line equation used is:  
y = mx + c

Where:
- y → Predicted output
- x → Input feature
- m → Slope of the line (how much y changes when x increases by 1)
- c → Intercept (value of y when x = 0)
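As a minimal sketch of the equation above, the Python function below evaluates the line for a given input. The function name `predict` and the slope and intercept values are hypothetical, chosen only for illustration and not fitted to any real data.

```python
def predict(x, m, c):
    """Return the predicted output y for input x on the line y = m*x + c."""
    return m * x + c


# Hypothetical values, for illustration only:
# each extra study hour adds 5 marks, and 0 hours of study gives 40 marks.
slope = 5.0
intercept = 40.0

for hours in [0, 2, 6]:
    print(hours, "study hours ->", predict(hours, slope, intercept), "marks")
```

Setting `hours` to 0 returns the intercept, which matches the definition of c as the value of y when x = 0.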

#### Key Components
- Independent Variable (X): Input feature
- Dependent Variable (Y): Output to predict
- Regression Line: Best-fit straight line through data points
- **Error (Residual)**: Difference between actual value and predicted value

### Objective
The goal of simple linear regression is to find the values of m and c that minimize the prediction error, usually measured by the Mean Squared Error (MSE).

🔹 *Real-World Examples*
- Predicting house price based on area
- Predicting salary based on years of experience
- Predicting marks based on study hours

🔹 *When to use Simple Linear Regression*
- Only one input feature
- Relationship between X and Y is linear
- Output is continuous (not categories)

## 1️⃣ Model / Hypothesis Equation
In Simple Linear Regression, we assume a **linear relationship** between input `x` and output `y`.

The model (hypothesis) is:

h(x) = θ₀ + θ₁x

Where:
- h(x) → predicted output
- x → input feature
- θ₀ → intercept (value of y when x = 0)
- θ₁ → slope (change in y for 1 unit change in x)

This is similar to the straight-line equation:

y = mx + c

### **Predicted points: ŷ = h(x)**

### **Error (residual) = y − ŷ**

## 2️⃣ Predicted vs Actual Value

For each data point:
- Actual value → y⁽ⁱ⁾
- Predicted value → h(x⁽ⁱ⁾)

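Prediction error for the i-th data point, with the sign convention used in the cost function below:

Error = h(x⁽ⁱ⁾) − y⁽ⁱ⁾

This is the negative of the residual y − ŷ defined above; the sign disappears once the error is squared.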
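The per-point error can be sketched in a few lines of Python. The data values and the parameters `theta0` and `theta1` below are hypothetical and not fitted; they only show how each prediction is compared with its actual value.

```python
def h(x, theta0, theta1):
    """Hypothesis: predicted output for input x."""
    return theta0 + theta1 * x


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0]     # study hours
ys = [45.0, 52.0, 53.0]  # marks obtained

# Arbitrary (not fitted) parameters.
theta0, theta1 = 40.0, 5.0

# Error for each data point: predicted value minus actual value.
errors = [h(x, theta0, theta1) - y for x, y in zip(xs, ys)]

for x, y, e in zip(xs, ys, errors):
    print("x =", x, "actual =", y, "predicted =", h(x, theta0, theta1), "error =", e)
```

A positive error means the line predicts too high for that point, and a negative error means it predicts too low.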


## 3️⃣ Cost Function (Mean Squared Error)

To measure how good the line fits the data, we use a **cost function**.

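Cost function J(θ₀, θ₁):

J(θ₀, θ₁) = (1 / (2m)) Σᵢ₌₁ᵐ ( h(x⁽ⁱ⁾) − y⁽ⁱ⁾ )²

Where:
- m → number of training examples (not the slope m from y = mx + c)
- Σ → summation over all training examples
- Squaring makes every error term non-negative, so positive and negative errors cannot cancel out
- The factor 1/2 cancels the 2 that appears when the square is differentiated, and dividing by m averages over the examples; J is therefore half the Mean Squared Error and has the same minimizer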
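The cost function can be sketched directly from its definition. The function name `cost` and the sample data are assumptions made for illustration; the arithmetic follows the formula for J above.

```python
def cost(theta0, theta1, xs, ys):
    """Return J(theta0, theta1): the sum of squared errors divided by 2m."""
    m = len(xs)  # number of training examples
    total = 0.0
    for x, y in zip(xs, ys):
        error = (theta0 + theta1 * x) - y  # h(x) - y
        total += error ** 2
    return total / (2 * m)


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0]
ys = [45.0, 52.0, 53.0]

# A line close to the data has a much lower cost than a line far from it.
print("cost of h(x) = 40 + 5x:", cost(40.0, 5.0, xs, ys))
print("cost of h(x) = 0:      ", cost(0.0, 0.0, xs, ys))
```

For the first line the errors are 0, −2 and 2, so J = (0 + 4 + 4) / (2 · 3) = 4/3.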


## 4️⃣ Objective of Training

The goal of Simple Linear Regression is:

Minimize J(θ₀, θ₁)

That means:
- Find the best θ₀ and θ₁
- Such that the prediction error is as small as possible



## 5️⃣ Gradient Descent (Optimization Algorithm)

Gradient Descent is used to minimize the cost function.

General update rule:

θⱼ := θⱼ − α ( ∂J / ∂θⱼ )

Where:
- α (alpha) → learning rate (small positive value)
- ∂J / ∂θⱼ → partial derivative of the cost function with respect to θⱼ
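To see the update rule in isolation, the sketch below applies it to a deliberately simple one-parameter function, J(θ) = (θ − 3)², whose derivative is 2(θ − 3). This toy function is an assumption made for illustration and is not the regression cost; it only shows why the learning rate should be small.

```python
def descend(alpha, steps, start=0.0):
    """Apply theta := theta - alpha * dJ/dtheta for J(theta) = (theta - 3)**2."""
    theta = start
    for _ in range(steps):
        gradient = 2.0 * (theta - 3.0)  # dJ/dtheta for the toy function
        theta = theta - alpha * gradient
    return theta


# A small learning rate moves theta steadily toward the minimum at theta = 3.
small_step = descend(alpha=0.1, steps=50)

# A learning rate that is too large overshoots further on every step.
large_step = descend(alpha=1.1, steps=50)

print("alpha = 0.1 ->", small_step)
print("alpha = 1.1 ->", large_step)
```

With α = 0.1 the distance to the minimum shrinks by a factor of 0.8 per step, while with α = 1.1 it grows by a factor of 1.2 per step, so the second run diverges.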


## 6️⃣ Partial Derivative for θ₀

∂J / ∂θ₀ = (1 / m) Σᵢ₌₁ᵐ ( h(x⁽ⁱ⁾) − y⁽ⁱ⁾ )

Update rule:

θ₀ := θ₀ − α (1 / m) Σᵢ₌₁ᵐ ( h(x⁽ⁱ⁾) − y⁽ⁱ⁾ )


## 7️⃣ Partial Derivative for θ₁

∂J / ∂θ₁ = (1 / m) Σᵢ₌₁ᵐ ( h(x⁽ⁱ⁾) − y⁽ⁱ⁾ ) x⁽ⁱ⁾

Update rule:

θ₁ := θ₁ − α (1 / m) Σᵢ₌₁ᵐ ( h(x⁽ⁱ⁾) − y⁽ⁱ⁾ ) x⁽ⁱ⁾
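The two partial derivatives can be checked numerically. The sketch below computes them from the formulas above and compares each with a central finite-difference estimate of the cost. The function names, the data, and the step size `eps` are assumptions made for illustration.

```python
def cost(theta0, theta1, xs, ys):
    """J(theta0, theta1) = (1 / (2m)) * sum of squared errors."""
    m = len(xs)
    return sum(((theta0 + theta1 * x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradients(theta0, theta1, xs, ys):
    """Return (dJ/dtheta0, dJ/dtheta1) using the analytic formulas."""
    m = len(xs)
    errors = [(theta0 + theta1 * x) - y for x, y in zip(xs, ys)]
    d_theta0 = sum(errors) / m
    d_theta1 = sum(e * x for e, x in zip(errors, xs)) / m
    return d_theta0, d_theta1


# Hypothetical data and parameters, for illustration only.
xs = [1.0, 2.0, 3.0]
ys = [45.0, 52.0, 53.0]
theta0, theta1 = 40.0, 5.0

g0, g1 = gradients(theta0, theta1, xs, ys)

# Central finite differences as an independent check of the formulas.
eps = 1e-6
numeric0 = (cost(theta0 + eps, theta1, xs, ys) - cost(theta0 - eps, theta1, xs, ys)) / (2 * eps)
numeric1 = (cost(theta0, theta1 + eps, xs, ys) - cost(theta0, theta1 - eps, xs, ys)) / (2 * eps)

print("analytic:", g0, g1)
print("numeric: ", numeric0, numeric1)
```

The analytic and numeric values should agree to several decimal places, which is a quick way to catch a sign or indexing mistake in a hand-written gradient.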


## 8️⃣ Gradient Descent Algorithm (Final Form)

Repeat until convergence, updating θ₀ and θ₁ simultaneously (compute both right-hand sides from the current values before assigning either):

θ₀ := θ₀ − α (1 / m) Σᵢ₌₁ᵐ ( h(x⁽ⁱ⁾) − y⁽ⁱ⁾ )

θ₁ := θ₁ − α (1 / m) Σᵢ₌₁ᵐ ( h(x⁽ⁱ⁾) − y⁽ⁱ⁾ ) x⁽ⁱ⁾
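The two update rules can be combined into a short training loop. This is a minimal sketch under stated assumptions: the function name `gradient_descent`, the starting point of zero, the learning rate, the fixed iteration count, and the data (generated from the line y = 40 + 5x) are all chosen for illustration.

```python
def gradient_descent(xs, ys, alpha=0.05, iterations=5000):
    """Fit theta0 and theta1 by repeating the two update rules."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0  # assumed starting point
    for _ in range(iterations):
        errors = [(theta0 + theta1 * x) - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Both gradients were computed from the old parameters,
        # so this tuple assignment is a simultaneous update.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Hypothetical data lying exactly on y = 40 + 5x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [45.0, 50.0, 55.0, 60.0, 65.0]

theta0, theta1 = gradient_descent(xs, ys)
print("theta0 =", theta0, "theta1 =", theta1)
```

Because the data lie exactly on a line, the fitted parameters should come out very close to the intercept 40 and the slope 5. With noisy data or unscaled features, a different learning rate or iteration count may be needed.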


## 9️⃣ Convergence

Convergence means:
- Cost function J reaches (or gets very close to) its minimum
- θ₀ and θ₁ stop changing significantly
- The regression line becomes the **best-fit line**
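One way to turn "stops changing significantly" into code is to stop when the cost decreases by less than a small tolerance between iterations. This is a sketch of one possible stopping rule, not the only one; the function name, the tolerance `tol`, the iteration cap, and the data are assumptions made for illustration.

```python
def fit_until_converged(xs, ys, alpha=0.05, tol=1e-12, max_iters=100_000):
    """Run gradient descent until the cost improves by less than tol."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    previous_cost = float("inf")
    for iteration in range(1, max_iters + 1):
        errors = [(theta0 + theta1 * x) - y for x, y in zip(xs, ys)]
        current_cost = sum(e * e for e in errors) / (2 * m)
        # Converged: the cost barely changed since the previous iteration.
        if previous_cost - current_cost < tol:
            return theta0, theta1, iteration
        previous_cost = current_cost
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1, max_iters  # cap reached without meeting tol


# Hypothetical data lying exactly on y = 40 + 5x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [45.0, 50.0, 55.0, 60.0, 65.0]

theta0, theta1, steps = fit_until_converged(xs, ys)
print("stopped after", steps, "iterations:", theta0, theta1)
```

A looser tolerance stops earlier with less precise parameters, and a tighter one runs longer. The iteration cap guards against a learning rate that never lets the cost settle.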
