
---

### **Q1. Difference Between Simple and Multiple Linear Regression + Examples**

**Simple Linear Regression**:  
This involves one independent variable (predictor) and one dependent variable (response). It models the relationship between them using a straight line.

**Formula**:  
\[
y = \beta_0 + \beta_1 x + \varepsilon
\]  
Where:
- \( y \) = dependent variable  
- \( x \) = independent variable  
- \( \beta_0 \) = intercept  
- \( \beta_1 \) = slope  
- \( \varepsilon \) = error term

**Example**:  
Predicting house price based on size:  
> Price = \( \beta_0 \) + \( \beta_1 \times \text{Size (sq ft)} \)
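
A minimal sketch of how the intercept and slope could be estimated by ordinary least squares. The helper name `fit_simple_linear_regression` and the house data are made up for illustration; the prices are generated from a known line so the result is easy to check.

```python
def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = sum of cross-deviations / sum of squared deviations of x
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: sizes in sq ft, prices lying exactly on a line
sizes = [1000, 1500, 2000, 2500]
prices = [50_000 + 100 * s for s in sizes]

b0, b1 = fit_simple_linear_regression(sizes, prices)
print(b0, b1)
```

Because the toy prices contain no noise, the fit recovers the intercept (50,000) and the slope (100 per sq ft) used to generate them.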

---

**Multiple Linear Regression**:  
This involves **two or more** independent variables predicting one dependent variable.

**Formula**:  
\[
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n + \varepsilon
\]

**Example**:  
Predicting house price based on size, number of bedrooms, and a location score:  
> Price = \( \beta_0 \) + \( \beta_1 \times \text{Size} \) + \( \beta_2 \times \text{Bedrooms} \) + \( \beta_3 \times \text{Location Score} \)
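
The same idea extends to several predictors. The sketch below assumes NumPy is available and uses invented, noise-free data; the coefficient values in `true_beta` are arbitrary choices for illustration.

```python
import numpy as np

# Hypothetical data: columns are size (sq ft), bedrooms, location score
X = np.array([
    [1000, 2, 5],
    [1500, 3, 7],
    [2000, 3, 6],
    [2500, 4, 9],
    [1800, 2, 8],
], dtype=float)
true_beta = np.array([20_000.0, 100.0, 5_000.0, 2_000.0])  # b0, b1, b2, b3

# Prepend a column of ones so the intercept is estimated with the slopes
X_design = np.column_stack([np.ones(len(X)), X])
y = X_design @ true_beta  # prices generated without noise

# Least-squares solution of X_design @ beta = y
beta_hat, *_ = np.linalg.lstsq(X_design, y, rcond=None)
print(beta_hat)
```

With real, noisy data the estimates would only approximate the underlying coefficients; here they match `true_beta` up to floating-point error.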

---

### **Q2. Assumptions of Linear Regression and How to Check Them**

1. **Linearity**:  
   Relationship between independent and dependent variables is linear.  
   🔍 *Check*: Use scatter plots or residual vs fitted plots.

2. **Independence of errors**:  
   Residuals should not be correlated.  
   🔍 *Check*: Use the Durbin-Watson test; a statistic close to 2 suggests no first-order autocorrelation.

3. **Homoscedasticity**:  
   Constant variance of residuals across all levels of independent variables.  
   🔍 *Check*: Plot residuals vs predicted values; a "funnel" shape (spread that widens or narrows) signals a violation.

4. **Normality of errors**:  
   Residuals should be normally distributed.  
   🔍 *Check*: Histogram or Q-Q plot of residuals; Shapiro-Wilk test.

5. **No multicollinearity** (for multiple regression):  
   Predictors shouldn’t be highly correlated with one another.  
   🔍 *Check*: Use VIF (Variance Inflation Factor).
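
As one example of these checks, the Durbin-Watson statistic can be computed directly from the residuals. This is a sketch with a hypothetical helper `durbin_watson` and invented residuals. The statistic ranges from 0 to 4: values near 2 suggest little first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Invented residuals that drift slowly: neighbours are similar (positive autocorrelation)
drifting = [-3, -2, -1, 1, 2, 3]
# Invented residuals that flip sign every step (negative autocorrelation)
alternating = [1, -1, 1, -1, 1, -1]

dw_drifting = durbin_watson(drifting)
dw_alternating = durbin_watson(alternating)
print(dw_drifting, dw_alternating)
```

The drifting series gives a value well below 2 and the alternating series a value well above 2, which matches the interpretation above.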

---

### **Q3. Interpreting Slope and Intercept in Linear Regression + Real-World Example**

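**Slope (\( \beta_1 \))**: Expected change in the dependent variable for a one-unit increase in the independent variable (holding any other predictors constant).

**Intercept (\( \beta_0 \))**: Predicted value of the dependent variable when all independent variables are 0. It is only meaningful if 0 is a plausible value for the predictors.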

**Example**:  
Let’s say we model:  
> Salary = 30,000 + 2,000 × Years of Experience

- **Intercept (30,000)**: Predicted salary with 0 years of experience.  
- **Slope (2,000)**: For each additional year of experience, the predicted salary increases by $2,000.
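
The interpretation can be checked with a few lines of arithmetic. The function name `predict_salary` is made up for this sketch; the numbers come from the example model above.

```python
INTERCEPT = 30_000  # predicted salary at 0 years of experience
SLOPE = 2_000       # change in predicted salary per additional year

def predict_salary(years_of_experience):
    """Evaluate the example model: Salary = 30,000 + 2,000 * years."""
    return INTERCEPT + SLOPE * years_of_experience

print(predict_salary(0))                      # equals the intercept
print(predict_salary(5))                      # 30,000 + 5 * 2,000
print(predict_salary(6) - predict_salary(5))  # equals the slope
```

Whatever starting point is chosen, moving one year forward changes the prediction by exactly the slope.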

---

### **Q4. What is Gradient Descent and Its Role in Machine Learning?**

**Gradient Descent** is an optimization algorithm used to minimize a function — commonly the **loss function** in machine learning.

**Concept**:
- Imagine a valley (loss function graph).
- You want to get to the bottom (minimum loss).
- You take steps in the direction of the steepest descent (negative gradient).

**Steps**:
1. Start with random values of parameters (like slope and intercept).
2. Compute the loss (e.g., MSE).
3. Calculate gradients (partial derivatives of the loss with respect to each parameter).
4. Update parameters:  
   \[
   \theta := \theta - \alpha \frac{\partial L}{\partial \theta}
   \]
   Where \( \alpha \) is the **learning rate**, which controls the step size.
5. Repeat steps 2 to 4 until the loss stops decreasing meaningfully.
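
The steps above can be sketched for a simple linear regression with an MSE loss. This is an illustrative sketch, not a production implementation: the function name `gradient_descent`, the learning rate, and the number of steps are assumptions chosen to suit this toy data, and the parameters start at zero rather than at random values so the run is reproducible.

```python
def gradient_descent(x, y, learning_rate=0.05, n_steps=2000):
    """Fit y = b0 + b1 * x by repeatedly stepping against the MSE gradient."""
    b0, b1 = 0.0, 0.0  # fixed start (assumption; the steps above use random values)
    n = len(x)
    for _ in range(n_steps):
        # Prediction errors for the current parameters
        errors = [(b0 + b1 * xi) - yi for xi, yi in zip(x, y)]
        # Partial derivatives of MSE = (1/n) * sum(error^2)
        grad_b0 = (2 / n) * sum(errors)
        grad_b1 = (2 / n) * sum(e * xi for e, xi in zip(errors, x))
        # Update rule: theta := theta - alpha * dL/dtheta
        b0 -= learning_rate * grad_b0
        b1 -= learning_rate * grad_b1
    return b0, b1

# Invented noise-free data on the line y = 1 + 2x
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0 + 2.0 * xi for xi in x]

b0, b1 = gradient_descent(x, y)
print(b0, b1)
```

On this data the parameters approach the intercept 1 and slope 2 that generated it. A learning rate that is too large would make the updates diverge instead, so the value usually has to be tuned to the problem.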

**Use in Machine Learning**:
- Used to train models such as linear regression, logistic regression, and neural networks.
- Finds the best-fit line or decision boundary by iteratively reducing the error.

---



### **Q5. Describe the Multiple Linear Regression Model. How Is It Different from Simple Linear Regression?**

**Multiple Linear Regression (MLR)** models the relationship between a dependent variable and **two or more** independent variables.

**Equation**:  
\[
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n + \varepsilon
\]

Where:
- \( y \): Dependent variable  
- \( x_1, x_2, \dots, x_n \): Independent variables  
- \( \beta_0 \): Intercept  
- \( \beta_1, \dots, \beta_n \): Coefficients (slopes)  
- \( \varepsilon \): Error term

**Difference from Simple Linear Regression (SLR)**:
- **SLR**: 1 independent variable (e.g., predicting salary from experience).
- **MLR**: 2+ independent variables (e.g., predicting salary from experience, education, and age).

---

### **Q6. Explain Multicollinearity in Multiple Linear Regression. How to Detect and Address It?**

**Multicollinearity** occurs when two or more independent variables are **highly correlated**, so they carry largely redundant information.

#### ❗ Why It’s a Problem:
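- Makes individual coefficients hard to interpret, because the effect of one predictor cannot be separated from the others.
- Inflates the standard errors of the coefficients, so the estimates are imprecise.
- Makes the estimates unstable: small changes in the data can change their size or even their sign.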

#### 🔍 How to Detect:
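1. **Correlation matrix**: Check for high pairwise correlations (e.g., absolute value > 0.8).
2. **Variance Inflation Factor (VIF)**: Measures how much the variance of a coefficient is inflated by correlation with the other predictors.  
   - A VIF above 5 (or 10, by a looser rule of thumb) is usually a red flag.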
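
The VIF of predictor \( j \) is \( 1 / (1 - R_j^2) \), where \( R_j^2 \) comes from regressing predictor \( j \) on all the other predictors. The sketch below assumes NumPy; the helper name `vif` and the data are hypothetical, hand-built so that `x2` nearly duplicates `x1` while `x3` is uncorrelated with both.

```python
import numpy as np

def vif(X):
    """Return the VIF of each column of X (rows are observations)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    result = []
    for j in range(p):
        target = X[:, j]
        # Regress column j on an intercept plus all other columns
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        residuals = target - others @ coef
        r2 = 1.0 - (residuals @ residuals) / np.sum((target - target.mean()) ** 2)
        result.append(1.0 / (1.0 - r2))
    return np.array(result)

x1 = np.arange(1.0, 9.0)
x2 = x1 + 0.1 * np.array([1, -1, -1, 1, 1, -1, -1, 1])  # x1 plus a tiny wiggle
x3 = np.array([4.0, 4.0, 2.0, 2.0, 2.0, 2.0, 4.0, 4.0])  # unrelated to x1 and x2

vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs)
```

Here the first two VIFs come out far above 10, flagging `x1` and `x2` as redundant, while the VIF of `x3` is about 1, the smallest value a VIF can take.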

#### 🛠️ How to Fix:
- Remove or combine correlated predictors.
- Use dimensionality reduction (e.g., **PCA**).
- Use regularization: **Ridge Regression** shrinks and stabilizes the coefficients, while **Lasso** can set some of them to exactly zero.
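
To illustrate the regularization option, here is a sketch of ridge regression in its closed form, \( \hat{\beta} = (X^\top X + \lambda I)^{-1} X^\top y \), applied to centered data so the intercept is not penalized. It assumes NumPy; the helper `ridge_fit`, the data, and the penalty value `lam=1.0` are illustrative assumptions, and in practice predictors are usually standardized first.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge coefficients on centered data (intercept unpenalized)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    p = Xc.shape[1]
    # Adding lam to the diagonal stabilizes the solve when columns are correlated
    return np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ yc)

# Invented data: x2 is almost a copy of x1
x1 = np.arange(1.0, 9.0)
x2 = x1 + 0.1 * np.array([1, -1, -1, 1, 1, -1, -1, 1])
X = np.column_stack([x1, x2])
y = 3.0 * x1 + np.array([0.5, -0.3, 0.2, -0.4, 0.1, 0.3, -0.2, -0.2])

beta_ols = ridge_fit(X, y, lam=0.0)    # lam = 0 reduces to ordinary least squares
beta_ridge = ridge_fit(X, y, lam=1.0)  # lam > 0 shrinks the coefficients
print(beta_ols, beta_ridge)
```

The ridge coefficient vector is always shorter (in Euclidean norm) than the least-squares one when the penalty is positive, which is what tames the unstable estimates caused by correlated predictors.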

---

### **Q7. What Is Polynomial Regression? How Is It Different from Linear Regression?**

**Polynomial Regression** is a type of linear regression where the relationship between the independent and dependent variable is modeled as an **nth-degree polynomial**.

**Equation (for degree 2)**:  
\[
y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon
\]

#### 🆚 Difference from Linear Regression:
- **Linear Regression**: Models straight-line relationships.
- **Polynomial Regression**: Can model curves (non-linear relationships).

Although the fitted curve is non-linear in \( x \), the model is still **linear in the coefficients**, so it can be fitted with ordinary linear regression techniques by treating \( x, x^2, \dots \) as separate predictors.
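
A short sketch of that point, assuming NumPy and invented noise-free data: the degree-2 model is fitted with the same least-squares call used for multiple linear regression, just with \( x^2 \) added as an extra column.

```python
import numpy as np

# Invented data from a known quadratic: y = 1 + 2x + 0.5x^2
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x + 0.5 * x**2

# Design matrix with columns [1, x, x^2]; the model is linear in these columns
X_design = np.column_stack([np.ones_like(x), x, x**2])

# Ordinary least squares, exactly as for multiple linear regression
beta_hat, *_ = np.linalg.lstsq(X_design, y, rcond=None)
print(beta_hat)
```

The estimated coefficients match the values used to generate the data (1, 2, and 0.5) up to floating-point error.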

---

### **Q8. Advantages and Disadvantages of Polynomial Regression**

#### ✅ Advantages:
- Can model non-linear relationships effectively.
- More flexible than simple linear models.

#### ❌ Disadvantages:
- Prone to **overfitting**, especially with high-degree polynomials.
- **Extrapolation** (predictions outside the data range) becomes unreliable.
- Less interpretable than linear models.
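
The overfitting and extrapolation risks can be seen on a small invented data set: a linear trend with alternating noise, fitted once with a straight line and once with a degree-7 polynomial. The sketch assumes NumPy; the helper `fit_and_report` and the data are hypothetical.

```python
import numpy as np

# Invented data: underlying trend y = 2x plus alternating noise of +/- 0.5
x = np.arange(8, dtype=float)
noise = 0.5 * (-1.0) ** np.arange(8)
y = 2.0 * x + noise

def fit_and_report(degree, x_new=8.0):
    """Fit a polynomial of the given degree; return (training MSE, prediction at x_new)."""
    coeffs = np.polyfit(x, y, degree)
    train_mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
    return train_mse, np.polyval(coeffs, x_new)

mse_1, pred_1 = fit_and_report(1)  # straight line
mse_7, pred_7 = fit_and_report(7)  # passes through all 8 training points

true_value = 2.0 * 8.0  # value of the underlying trend just outside the data
print(f"degree 1: train MSE = {mse_1:.3f}, error at x=8 = {abs(pred_1 - true_value):.2f}")
print(f"degree 7: train MSE = {mse_7:.3f}, error at x=8 = {abs(pred_7 - true_value):.2f}")
```

The degree-7 fit drives the training error to essentially zero by chasing the noise, yet one step outside the data its prediction is off by more than 100, while the straight line stays within 1 of the underlying trend.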

#### 🔄 When to Use Polynomial Regression:
- When data clearly shows a **non-linear trend** that a straight line can’t capture.
- Example: Modeling temperature over the course of a day, which rises to a peak and then declines.

---
