# Supervised Learning: Regression Models and Performance Metrics


### Question 1: What is Simple Linear Regression (SLR)? Explain its purpose.
Simple Linear Regression (SLR) is a statistical technique that models the relationship between one independent variable (X) and one dependent variable (Y) by fitting a straight line. Its main purpose is to predict Y based on X and understand how changes in X affect Y.

**Example:** Predicting a person's weight based on height.

### Question 2: What are the key assumptions of Simple Linear Regression?
1. **Linearity:** The relationship between X and Y is linear.
2. **Independence:** Observations are independent.
3. **Homoscedasticity:** Equal variance of residuals across all X values.
4. **Normality of errors:** Residuals are normally distributed.
5. **No multicollinearity:** This assumption belongs to multiple regression; with only one X in SLR it is satisfied trivially.

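**Example:** A plot of residuals against fitted values that shows random scatter around zero, with no curve or funnel shape, supports the linearity and homoscedasticity assumptions. Normality is checked separately, for example with a histogram or Q-Q plot of the residuals.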
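The residual check described above can be sketched in a few lines. This is a minimal illustration, not a full diagnostic: it reuses the toy data from Question 9, fits the line with `np.polyfit` (chosen here only for brevity), and prints the fitted value and residual for each point, which are the pairs a residual plot would display.

```python
import numpy as np

# Toy data (the same values used in Question 9).
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 4, 5, 4, 5], dtype=float)

# Fit a straight line; for degree 1, np.polyfit returns [slope, intercept].
slope, intercept = np.polyfit(x, y, 1)

fitted = intercept + slope * x
residuals = y - fitted

# Each (fitted, residual) pair is one point on a residuals-vs-fitted plot.
for f, r in zip(fitted, residuals):
    print(f"fitted={f:.2f}  residual={r:+.2f}")
```

With only five observations the output is illustrative at best; on real data one would look for curvature (a sign of non-linearity) or a spread that grows with the fitted value (a sign of heteroscedasticity).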

### Question 3: Write the mathematical equation for a simple linear regression model and explain each term.
**Equation:**  
\( Y = \beta_0 + \beta_1 X + \varepsilon \)

**Where:**  
- Y = Dependent variable  
- X = Independent variable  
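- β₀ = Intercept (expected value of Y when X = 0)  
- β₁ = Slope (change in Y for a one-unit change in X)  
- ε = Error term (the part of Y that the line does not capture; in a fitted model it is estimated by the residual, the difference between actual and predicted Y)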

**Example:** If Y = 5 + 2X, then Y increases by 2 units for every 1-unit increase in X.

### Question 4: Provide a real-world example where simple linear regression can be applied.
**Example:** Predicting sales revenue based on advertising spend.  
It helps businesses estimate how changes in ad budget impact revenue.

### Question 5: What is the method of least squares in linear regression?
The method of least squares finds the line that minimizes the sum of squared differences between observed and predicted values.

**Mathematically:**  
\( \text{Minimize } \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 \)

**Example:** If the actual values are [100, 150, 200] and the predicted values are [110, 140, 190], the sum of squared errors is (-10)² + (10)² + (10)² = 300.
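For a single predictor, the minimization has a well-known closed-form solution: the slope is the sum of cross-deviations divided by the sum of squared deviations of X, and the intercept follows from the means. The sketch below shows this with NumPy; the function names `least_squares_line` and `sum_squared_errors` are illustrative choices, not part of any library.

```python
import numpy as np

def least_squares_line(x, y):
    """Return (intercept, slope) of the line minimizing the sum of squared residuals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # slope = sum((x - x_mean)(y - y_mean)) / sum((x - x_mean)^2)
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # The fitted line always passes through the point (x_mean, y_mean).
    intercept = y_mean - slope * x_mean
    return intercept, slope

def sum_squared_errors(y_true, y_pred):
    """Sum of squared differences between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sum((y_true - y_pred) ** 2))

# The example values from the text.
print(sum_squared_errors([100, 150, 200], [110, 140, 190]))  # 300.0
```

Calling `least_squares_line` on any (x, y) data returns the intercept and slope that give the smallest possible value of `sum_squared_errors` among all straight lines.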

### Question 6: What is Logistic Regression? How does it differ from Linear Regression?
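**Logistic Regression** models the probability of a categorical outcome (typically binary, 0/1) by passing a linear combination of the inputs through the logistic (sigmoid) function. The predicted class is obtained by thresholding that probability, commonly at 0.5.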

| Aspect | Linear Regression | Logistic Regression |
|---------|------------------|--------------------|
| Output Type | Continuous value | Probability of a class, thresholded to a category (0/1) |
| Output Range | (-∞, +∞) | (0, 1) for the predicted probability |
| Example | Predicting house price | Predicting if customer buys or not |
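The sigmoid step can be illustrated with a small sketch. The coefficients `b0` and `b1` below are hypothetical values picked for illustration, not fitted from data, and the 0.5 threshold is a common default rather than a requirement.

```python
import numpy as np

def sigmoid(z):
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical coefficients (assumed for illustration, not estimated).
b0, b1 = -4.0, 2.0

x = np.array([0.0, 2.0, 4.0])

# The linear part b0 + b1*x can be any real number;
# the sigmoid turns it into a probability.
prob = sigmoid(b0 + b1 * x)

# Threshold the probability to get a 0/1 class label.
label = (prob >= 0.5).astype(int)

print(prob)
print(label)
```

The linear term is the same as in linear regression; the difference is that its output is squashed into (0, 1) and read as a probability instead of being used directly as the prediction.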

### Question 7: Name and briefly describe three common evaluation metrics for regression models.
1. **MAE (Mean Absolute Error):** Average of absolute errors |Y - Ŷ|.  
2. **MSE (Mean Squared Error):** Average of squared errors (Y - Ŷ)².  
3. **RMSE (Root Mean Squared Error):** Square root of MSE, in same units as Y.  

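**Example:** RMSE = 5 means the typical prediction error is about 5 units of Y. Because errors are squared before averaging, large errors weigh more heavily in RMSE than in MAE.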
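As a quick worked sketch, the three metrics can be computed directly with NumPy. The values below are the actual and predicted values from the least squares example in Question 5; the variable names are illustrative.

```python
import numpy as np

y_true = np.array([100, 150, 200], dtype=float)
y_pred = np.array([110, 140, 190], dtype=float)

errors = y_true - y_pred          # [-10, 10, 10]

mae = np.mean(np.abs(errors))     # mean absolute error
mse = np.mean(errors ** 2)        # mean squared error
rmse = np.sqrt(mse)               # back in the units of y

print(f"MAE={mae:.1f}  MSE={mse:.1f}  RMSE={rmse:.1f}")
# MAE=10.0  MSE=100.0  RMSE=10.0
```

MAE and RMSE coincide here only because every error has the same magnitude; when errors vary in size, RMSE is larger than MAE.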

### Question 8: What is the purpose of the R-squared metric in regression analysis?
R² measures the proportion of the variance in Y that is explained by the model (in SLR, by the single predictor X).

**Formula:**  
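\( R^2 = 1 - \frac{SS_{res}}{SS_{tot}} \)

where \( SS_{res} = \sum (Y_i - \hat{Y}_i)^2 \) is the residual sum of squares and \( SS_{tot} = \sum (Y_i - \bar{Y})^2 \) is the total sum of squares around the mean of Y.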

**Example:** R² = 0.9 means 90% of the variation in Y is explained by X.
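The formula can be checked by hand on a small example. The sketch below reuses the actual and predicted values from Question 5 and computes both sums of squares explicitly; the variable names are illustrative.

```python
import numpy as np

y_true = np.array([100, 150, 200], dtype=float)
y_pred = np.array([110, 140, 190], dtype=float)

# Residual sum of squares: what the model leaves unexplained.
ss_res = np.sum((y_true - y_pred) ** 2)           # 300

# Total sum of squares: variation of y around its own mean (150).
ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # 5000

r2 = 1 - ss_res / ss_tot
print(f"R^2 = {r2:.2f}")  # R^2 = 0.94
```

Here the predictions leave 300 of the 5000 units of total variation unexplained, so about 94% is accounted for.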

### Question 9: Write Python code to fit a simple linear regression model using scikit-learn and print the slope and intercept.

```python
from sklearn.linear_model import LinearRegression
import numpy as np

# Sample data (X must be 2-D for scikit-learn: one column per feature)
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([2, 4, 5, 4, 5])

# Create and fit the model
model = LinearRegression()
model.fit(X, y)

# Print coefficients
print('Slope (β₁):', model.coef_[0])
print('Intercept (β₀):', model.intercept_)
```

### Question 10: How do you interpret the coefficients in a simple linear regression model?
- **Intercept (β₀):** Expected Y when X=0. It has a practical meaning only if X=0 lies within, or close to, the range of the observed data.  
- **Slope (β₁):** Change in Y for each one-unit increase in X.  

**Example:** If Y = 50 + 10X, then when X increases by 1, Y increases by 10 units.
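The interpretation above can be verified numerically. The short sketch below hard-codes the example line Y = 50 + 10X in a hypothetical helper named `predict` and shows that the prediction at X = 0 is the intercept and that a one-unit step in X changes the prediction by the slope.

```python
def predict(x, intercept=50.0, slope=10.0):
    """Evaluate the example line Y = 50 + 10X."""
    return intercept + slope * x

# At X = 0 the prediction equals the intercept.
print(predict(0))               # 50.0

# A one-unit increase in X changes the prediction by the slope,
# regardless of the starting point.
print(predict(4) - predict(3))  # 10.0
```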