---

### **1. What is Simple Linear Regression?**  
Simple Linear Regression is a statistical method used to model the relationship between one independent variable (X) and one dependent variable (Y) by fitting a straight line:  
\[
Y = mX + c
\]

---

### **2. What are the Key Assumptions of Simple Linear Regression?**  
- Linearity between X and Y  
- Independence of observations  
- Homoscedasticity (constant variance of errors)  
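- Normal distribution of residuals (needed for valid t-tests and confidence intervals, not for fitting the line itself)  
- Multicollinearity is not a concern here, because there is only one predictor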

---

### **3. What Does the Coefficient *m* Represent in the Equation Y = mX + c?**  
*m* is the **slope**: the change in the predicted value of Y for a one-unit increase in X.

---

### **4. What Does the Intercept *c* Represent in the Equation Y = mX + c?**  
*c* is the value of Y when X = 0. It is the point where the line intersects the Y-axis.

---

### **5. How Do We Calculate the Slope *m* in Simple Linear Regression?**  
\[
m = \frac{n(\sum XY) - (\sum X)(\sum Y)}{n(\sum X^2) - (\sum X)^2}
\]  
Or:  
\[
m = \frac{\text{Cov}(X, Y)}{\text{Var}(X)}
\]
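
As a minimal sketch, both slope formulas can be checked against each other in plain Python. The function names and the five data points are made up for illustration; the intercept uses the standard identity c = mean(Y) - m * mean(X), which follows from the fitted line passing through the point of means.

```python
def slope_from_sums(xs, ys):
    """Slope via the raw-sums formula."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    return (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)


def slope_from_cov(xs, ys):
    """Slope via Cov(X, Y) / Var(X); the common 1/n factor cancels."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov_xy / var_x


def intercept(xs, ys, m):
    """Intercept c = mean(Y) - m * mean(X)."""
    return sum(ys) / len(ys) - m * sum(xs) / len(xs)


# Illustrative data, invented for this example
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

m = slope_from_sums(xs, ys)   # 0.6
c = intercept(xs, ys, m)      # 2.2
```

Both functions return the same slope because the second formula is the first one with numerator and denominator divided by n².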

---

### **6. What is the Purpose of the Least Squares Method in Simple Linear Regression?**  
To minimize the sum of the squared differences between actual and predicted values (residuals), resulting in the best-fitting line.

---

### **7. How is the Coefficient of Determination (R²) Interpreted in Simple Linear Regression?**  
R² indicates the proportion of variance in Y that is explained by X.  
- R² = 0 → No explanatory power  
- R² = 1 → Perfect prediction
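
A small sketch of the calculation, assuming the usual definition R² = 1 - SS_res / SS_tot. The helper name `r_squared` and the data are illustrative; the line Y = 0.6X + 2.2 is the least-squares fit for these five points.

```python
def r_squared(ys, y_pred):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Illustrative data, invented for this example
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Predictions from the least-squares line for this data
y_pred = [0.6 * x + 2.2 for x in xs]
r2 = r_squared(ys, y_pred)  # about 0.6: X explains roughly 60% of the variance in Y

# The two boundary cases from the list above
r2_perfect = r_squared(ys, ys)                       # predictions equal the data
r2_mean_only = r_squared(ys, [sum(ys) / len(ys)] * 5)  # always predict the mean
```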

---

### **8. What is Multiple Linear Regression?**  
A method to model the relationship between one dependent variable and two or more independent variables.  
\[
Y = b_0 + b_1X_1 + b_2X_2 + \ldots + b_nX_n
\]
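
One way to fit this equation, sketched with NumPy's least-squares solver. The data is synthetic and noise-free, with coefficients chosen for illustration (b_0 = 1, b_1 = 2, b_2 = -3), so the fit should recover them.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))  # two synthetic predictors

# Noise-free target with known coefficients b0=1, b1=2, b2=-3
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]

# Prepend a column of ones so the first coefficient is the intercept b0
X_design = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(X_design, y, rcond=None)
```

Here `coef` holds the estimates in the order `[b0, b1, b2]`.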

---

### **9. What is the Main Difference Between Simple and Multiple Linear Regression?**  
- Simple Linear Regression uses one predictor.  
- Multiple Linear Regression uses two or more predictors.

---

### **10. What are the Key Assumptions of Multiple Linear Regression?**  
- Linearity  
- Independence of errors  
- Homoscedasticity  
- Normality of residuals  
- No multicollinearity  
- No autocorrelation (especially for time series data)

---

### **11. What is Heteroscedasticity, and How Does It Affect Multiple Linear Regression?**  
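Heteroscedasticity occurs when the residuals have non-constant variance across the range of the predictors.  
It affects the model by:
- Making the least-squares coefficient estimates inefficient (they remain unbiased but no longer have the smallest possible variance)  
- Biasing the standard errors, which invalidates t-tests, F-tests, and confidence intervals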

---

### **12. How Can You Improve a Multiple Linear Regression Model with High Multicollinearity?**  
- Remove or combine highly correlated variables  
- Use PCA (Principal Component Analysis)  
- Apply Ridge or Lasso Regression
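
To illustrate the Ridge option, here is a rough sketch using the closed-form ridge solution on centered data. The helper `ridge_fit`, the synthetic predictors, and the penalty value `alpha=1.0` are all assumptions made for this example, not recommended settings.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge coefficients for centered X and y (no intercept)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(p), X.T @ y)

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.01 * rng.normal(size=200)   # almost a copy of x1
X = np.column_stack([x1, x2])
y = 3.0 * x1 + rng.normal(size=200)

# Center the data so the intercept can be left out of the penalty
Xc = X - X.mean(axis=0)
yc = y - y.mean()

ols_coef = ridge_fit(Xc, yc, alpha=0.0)    # alpha = 0 is ordinary least squares
ridge_coef = ridge_fit(Xc, yc, alpha=1.0)  # the penalty shrinks the unstable direction
```

With nearly collinear predictors, the least-squares coefficients can be large and of opposite sign. The ridge coefficients are pulled toward each other while their sum stays close to the combined effect of 3.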

---

### **13. What Are Common Techniques for Transforming Categorical Variables for Use in Regression Models?**  
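- One-hot encoding: one binary column per category  
- Dummy variables: one-hot encoding with one category dropped as the baseline, which avoids perfect collinearity with the intercept  
- Label (ordinal) encoding: integer codes, appropriate only when the categories have a natural order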
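
A minimal sketch of the difference between full one-hot columns and dummy variables with a dropped baseline. The helper `dummy_encode` and the color values are hypothetical; in practice a library encoder would normally be used.

```python
import numpy as np

def dummy_encode(values, drop_first=True):
    """Return a 0/1 matrix and the category name behind each column."""
    levels = sorted(set(values))
    kept = levels[1:] if drop_first else levels  # first level becomes the baseline
    matrix = np.array([[1.0 if v == level else 0.0 for level in kept]
                       for v in values])
    return matrix, kept

colors = ["red", "green", "blue", "green"]

# One column per category: blue, green, red
one_hot, one_hot_cols = dummy_encode(colors, drop_first=False)

# "blue" is dropped and becomes the all-zeros baseline row
dummies, dummy_cols = dummy_encode(colors, drop_first=True)
```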

---

### **14. What is the Role of Interaction Terms in Multiple Linear Regression?**  
Interaction terms capture the combined effect of two or more variables on the dependent variable. They help model non-additive effects.
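
As a sketch, an interaction term is just the product of two predictors added as an extra column. The data below is synthetic and noise-free, with an assumed interaction coefficient of 0.5, so the fit should recover all four coefficients.

```python
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.normal(size=100)
x2 = rng.normal(size=100)

# Noise-free target: the effect of x1 on y depends on the value of x2
y = 1.0 + 2.0 * x1 + 3.0 * x2 + 0.5 * x1 * x2

# Columns: intercept, x1, x2, and the interaction x1 * x2
design = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
```

With the interaction included, a one-unit increase in `x1` changes `y` by `2.0 + 0.5 * x2` rather than by a fixed amount.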

---

### **15. How Can the Interpretation of Intercept Differ Between Simple and Multiple Linear Regression?**  
- In Simple Linear Regression, the intercept is the value of Y when X = 0.  
- In Multiple Linear Regression, it's the value of Y when all X variables = 0 (which may not always make practical sense).

---

### **16. What is the Significance of the Slope in Regression Analysis, and How Does It Affect Predictions?**  
The slope indicates how the dependent variable changes with a one-unit change in the independent variable. It directly affects the predictions made by the model.

---

### **17. What Are the Limitations of Using R² as a Sole Measure of Model Performance?**  
- R² never decreases when predictors are added, even if they are not useful  
- It says nothing about whether the model is correctly specified or whether X causes Y  
- Computed on the training data, it does not reveal overfitting or biased predictions

---

### **18. How Would You Interpret a Large Standard Error for a Regression Coefficient?**  
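A large standard error, relative to the size of the coefficient, means the estimate is imprecise: its confidence interval is wide and the coefficient may not be statistically significant. Common causes are a small sample, noisy data, or multicollinearity.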

---

### **19. What is Polynomial Regression?**  
A form of regression where the relationship between the independent variable and the dependent variable is modeled as an nth-degree polynomial.

---

### **20. When is Polynomial Regression Used?**  
When data shows a **non-linear** relationship that cannot be captured by a straight line.

---

### **21. How Does the Intercept in a Regression Model Provide Context for the Relationship Between Variables?**  
It gives the expected value of the dependent variable when all independent variables are zero, helping to interpret the model's baseline.

---

### **22. How Can Heteroscedasticity Be Identified in Residual Plots, and Why is It Important to Address It?**  
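In a plot of residuals against fitted values, heteroscedasticity appears as a funnel or fan shape: the spread of the residuals widens or narrows as the fitted values change.  
Addressing it matters because it violates the constant-variance assumption, which makes standard errors, confidence intervals, and p-values unreliable.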
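
A rough numeric stand-in for reading the plot, assuming synthetic data whose noise grows with X: fit a line, then compare the residual spread in the lower and upper halves of X. This is an informal check for illustration, not a formal test.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(1, 10, 400)

# Synthetic data: the noise standard deviation is proportional to x
y = 2.0 * x + 1.0 + rng.normal(scale=0.5 * x)

# np.polyfit returns coefficients from the highest degree down
slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (slope * x + intercept)

# x is sorted, so the first half holds the small-x observations
half = len(x) // 2
spread_low = residuals[:half].std()
spread_high = residuals[half:].std()
```

A clearly larger `spread_high` than `spread_low` corresponds to the funnel opening to the right in a residual plot.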

---

### **23. What Does It Mean If a Multiple Linear Regression Model Has a High R² But Low Adjusted R²?**  
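It suggests **overfitting**: the model contains predictors that add little explanatory power. Adjusted R² penalizes each additional predictor, so a large gap between the two values means the extra predictors are not earning their place.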
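
The gap can be sketched with the standard adjusted R² formula, 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of observations and p the number of predictors. The R² value and sample sizes below are made-up numbers for illustration.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Same R^2 of 0.90 on 30 observations, with different numbers of predictors
few = adjusted_r2(0.90, n_samples=30, n_predictors=2)    # about 0.893
many = adjusted_r2(0.90, n_samples=30, n_predictors=20)  # about 0.678
```

The same R² looks much less impressive once 20 predictors have been spent on only 30 observations.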

---

### **24. Why is It Important to Scale Variables in Multiple Linear Regression?**  
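- Puts coefficients on a comparable scale, so their magnitudes can be compared across predictors  
- Necessary for regularization methods (Ridge, Lasso), whose penalties depend on coefficient size  
- Improves numerical stability and the convergence of gradient-based solvers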
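
One point worth checking by hand: for an unregularized least-squares fit, standardizing the predictors rescales the coefficients but leaves the predictions unchanged. The sketch below assumes synthetic data with two predictors on very different scales; `fit_predict` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(0)
# Second predictor is on a scale roughly 1000 times larger than the first
X = np.column_stack([rng.normal(size=100), 1000.0 * rng.normal(size=100)])
y = 1.0 + 2.0 * X[:, 0] + 0.003 * X[:, 1] + rng.normal(scale=0.1, size=100)

def fit_predict(features, target):
    """Least-squares fit with an intercept; returns coefficients and predictions."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef, design @ coef

# Standardize each column to mean 0 and standard deviation 1
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

coef_raw, pred_raw = fit_predict(X, y)
coef_scaled, pred_scaled = fit_predict(X_scaled, y)
```

On the raw scale the second coefficient is tiny (near 0.003) even though that predictor matters about as much as the first. After scaling, the two slopes are directly comparable.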

---

### **25. How Does Polynomial Regression Differ from Linear Regression?**  
- Linear regression fits a straight line  
- Polynomial regression fits a curve by adding powers of the independent variable(s) as extra features; the model is still linear in its coefficients, so it is fitted the same way

---

### **26. What is the General Equation for Polynomial Regression?**  
\[
Y = b_0 + b_1X + b_2X^2 + \ldots + b_nX^n
\]

---

### **27. Can Polynomial Regression Be Applied to Multiple Variables?**  
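Yes. With several predictors, the model includes powers of each variable and, typically, their cross-products (for example X_1X_2). This is often called **Multivariate Polynomial Regression**.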

---

### **28. What Are the Limitations of Polynomial Regression?**  
- Overfitting (especially with high degrees)  
- Poor generalization  
- Sensitive to outliers  
- Harder to interpret

---

### **29. What Methods Can Be Used to Evaluate Model Fit When Selecting the Degree of a Polynomial?**  
- Cross-validation  
- Adjusted R²  
- AIC/BIC (information criteria)  
- Residual analysis
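
As a sketch of the cross-validation option, the snippet below scores polynomial degrees 1 to 5 with 5-fold cross-validation using NumPy only. The helper `cv_mse`, the synthetic quadratic data, and the fold count are assumptions made for illustration.

```python
import numpy as np

def cv_mse(x, y, degree, k=5, seed=0):
    """Mean squared error of a polynomial of the given degree under k-fold CV."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), k)
    errors = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        # Fit on the training folds, score on the held-out fold
        coefs = np.polyfit(x[train_idx], y[train_idx], deg=degree)
        pred = np.polyval(coefs, x[test_idx])
        errors.append(np.mean((y[test_idx] - pred) ** 2))
    return float(np.mean(errors))

# Synthetic data: a quadratic trend plus noise
rng = np.random.default_rng(42)
x = np.linspace(-3, 3, 60)
y = 1.0 + 2.0 * x + 1.5 * x**2 + rng.normal(scale=0.5, size=x.size)

scores = {d: cv_mse(x, y, d) for d in range(1, 6)}
best_degree = min(scores, key=scores.get)
```

The straight line (degree 1) should score far worse than degree 2 here. Degrees above 2 usually give little or no further improvement, which is the signal to stop adding terms.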

---

### **30. Why is Visualization Important in Polynomial Regression?**  
It helps:
- Understand the model’s fit  
- Identify overfitting or underfitting  
- Interpret complex relationships in data

---

### **31. How is Polynomial Regression Implemented in Python?**

```python
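import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Illustrative data (not from a real dataset): a noisy quadratic
rng = np.random.default_rng(0)
X = np.linspace(-3, 3, 50).reshape(-1, 1)  # scikit-learn expects a 2-D feature array
y = 1 + 2 * X.ravel() + 0.5 * X.ravel() ** 2 + rng.normal(scale=0.3, size=50)

# Degree-2 polynomial regression: expand X to [1, X, X^2], then fit by least squares
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)
y_pred = model.predict(X)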
```

---