**1) What is Simple Linear Regression?**
Simple Linear Regression is a statistical method used to model the relationship between a **dependent variable (Y)** and a **single independent variable (X)** using a straight line:
**Y = mX + c**

---

**2) What are the key assumptions of Simple Linear Regression?**

* Linearity: The relationship between X and Y is linear.
* Independence: Observations are independent.
* Homoscedasticity: Constant variance of residuals.
* Normality: Residuals are normally distributed.
* No multicollinearity: listed for completeness only. With a single predictor it cannot occur; it matters in Multiple Linear Regression.

---

**3) What does the coefficient `m` represent in the equation Y = mX + c?**
The coefficient `m` is the **slope** of the line, representing the **change in Y for a one-unit increase in X**.

---

**4) What does the intercept `c` represent in the equation Y = mX + c?**
The intercept `c` is the value of Y when X = 0. It represents the **starting point** or **baseline** of the model.

---

**5) How do we calculate the slope `m` in Simple Linear Regression?**
Using the formula:
**m = Σ((X - X̄)(Y - Ȳ)) / Σ((X - X̄)²)**
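
As a minimal sketch of this formula, the snippet below computes the slope on a small made-up dataset and cross-checks it against `numpy.polyfit`. The intercept line uses `c = Ȳ - mX̄`, the standard least-squares intercept, which is an addition not stated above; the data values are purely illustrative.

```python
import numpy as np

# Illustrative data (made up for this example)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

x_mean, y_mean = X.mean(), Y.mean()

# Slope: sum of cross-deviations divided by sum of squared X deviations
m = np.sum((X - x_mean) * (Y - y_mean)) / np.sum((X - x_mean) ** 2)
# Intercept: the fitted line passes through the point (x_mean, y_mean)
c = y_mean - m * x_mean

# Cross-check against NumPy's degree-1 least-squares fit
m_np, c_np = np.polyfit(X, Y, deg=1)
print(m, c)
print(m_np, c_np)
```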

---

**6) What is the purpose of the least squares method in Simple Linear Regression?**
To **minimize the sum of squared errors (residuals)** between the predicted values and actual data points.
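
To make the idea concrete, here is a small sketch (with made-up data and a hypothetical helper `sse`) that evaluates the sum of squared residuals at the fitted line and at slightly perturbed lines. It is only an illustration of the minimization property, not a derivation.

```python
import numpy as np

def sse(m, c, X, Y):
    """Sum of squared residuals for the line Y = m*X + c."""
    residuals = Y - (m * X + c)
    return float(np.sum(residuals ** 2))

# Illustrative data (made up for this example)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Least-squares slope and intercept
m_hat, c_hat = np.polyfit(X, Y, deg=1)
best = sse(m_hat, c_hat, X, Y)

# Any small change to the slope or intercept increases the SSE
perturbed = [
    sse(m_hat + dm, c_hat + dc, X, Y)
    for dm in (-0.1, 0.0, 0.1)
    for dc in (-0.1, 0.0, 0.1)
    if (dm, dc) != (0.0, 0.0)
]
print(best, min(perturbed))
```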

---

**7) How is the coefficient of determination (R²) interpreted in Simple Linear Regression?**
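R² is the **proportion of the variance in Y that the model explains**. For a least-squares fit with an intercept it lies between 0 and 1.

* R² = 1 → Perfect fit (every point lies on the line)
* R² = 0 → No explanatory power (the model does no better than predicting the mean of Y)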
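
A short sketch of how R² can be computed by hand, assuming the usual definition `R² = 1 - SS_res / SS_tot` and a small made-up dataset:

```python
import numpy as np

# Illustrative data (made up for this example)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

m, c = np.polyfit(X, Y, deg=1)
Y_pred = m * X + c

ss_res = np.sum((Y - Y_pred) ** 2)     # variation left unexplained by the line
ss_tot = np.sum((Y - Y.mean()) ** 2)   # total variation around the mean of Y
r_squared = 1 - ss_res / ss_tot

# In simple linear regression, R^2 equals the squared Pearson correlation
r_pearson = np.corrcoef(X, Y)[0, 1]
print(r_squared, r_pearson ** 2)
```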

---

### 📙 **Multiple Linear Regression**

**8) What is Multiple Linear Regression?**
A regression model that uses **two or more independent variables** to predict a dependent variable.
**Y = b₀ + b₁X₁ + b₂X₂ + ... + bₙXₙ**
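
A minimal sketch of fitting this equation with scikit-learn, using synthetic noise-free data so that the recovered coefficients can be compared with the chosen values (1.5, 2.0, and -3.0 are arbitrary illustrative choices):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))              # two predictors, X1 and X2
y = 1.5 + 2.0 * X[:, 0] - 3.0 * X[:, 1]    # noise-free target, so the fit is exact

model = LinearRegression().fit(X, y)
print(model.intercept_)  # estimate of b0
print(model.coef_)       # estimates of b1 and b2
```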

---

**9) What is the main difference between Simple and Multiple Linear Regression?**

* Simple Linear Regression: 1 independent variable
* Multiple Linear Regression: 2 or more independent variables

---

**10) What are the key assumptions of Multiple Linear Regression?**

* Linearity
* Independence of errors
* Homoscedasticity
* Normality of residuals
* No multicollinearity among predictors

---

**11) What is heteroscedasticity, and how does it affect the results of a Multiple Linear Regression model?**
Heteroscedasticity occurs when the **variance of residuals is not constant** across all levels of the independent variables.
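The OLS coefficient estimates remain unbiased but become **inefficient** (no longer minimum-variance), and the usual **standard errors are incorrect**, which distorts confidence intervals and hypothesis tests.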

---

**12) How can you improve a Multiple Linear Regression model with high multicollinearity?**

* Remove highly correlated predictors
* Use **Principal Component Analysis (PCA)**
* Apply **Ridge or Lasso Regression**
* Combine correlated variables
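
The sketch below illustrates two of these options on synthetic data: spotting a redundant predictor through its pairwise correlation, and fitting Ridge regression instead of plain OLS. The data, the near-duplicate predictor, and `alpha=1.0` are all illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.01, size=200)   # nearly a copy of x1
X = np.column_stack([x1, x2])
y = 3.0 * x1 + rng.normal(scale=0.5, size=200)

# A pairwise correlation near 1 flags the redundant predictor
corr = np.corrcoef(X, rowvar=False)[0, 1]

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)   # alpha=1.0 is an arbitrary illustrative choice

print("correlation:", corr)
print("OLS coefficients:  ", ols.coef_)    # can be large and offsetting
print("Ridge coefficients:", ridge.coef_)  # shrunk by the L2 penalty
```

The Ridge coefficient vector is never longer (in Euclidean norm) than the OLS one, which is what stabilizes the estimates when predictors are nearly collinear.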

---

**13) What are some common techniques for transforming categorical variables for use in regression models?**

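* **One-Hot Encoding**: one 0/1 column per category (drop one column to avoid perfect collinearity with the intercept)
* **Label Encoding**: an arbitrary integer per category; in a linear model this implies an order and spacing that may not exist
* **Ordinal Encoding**: integers that follow a genuine order (e.g., small < medium < large)
* **Binary Encoding**: the category index written in binary digits across a few 0/1 columns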
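
As a rough illustration of two of these encodings, the snippet below uses scikit-learn's `OneHotEncoder` and `OrdinalEncoder` on hypothetical `city` and `size` columns (both invented for this example).

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Hypothetical categorical columns, one value per row
city = np.array([["Paris"], ["Tokyo"], ["Lima"], ["Tokyo"]])
size = np.array([["small"], ["large"], ["medium"], ["small"]])

# One-hot: one 0/1 column per category; drop="first" removes the first
# (alphabetically sorted) category so the columns are not perfectly collinear
one_hot = OneHotEncoder(drop="first").fit_transform(city).toarray()

# Ordinal: integer codes that follow an explicit, meaningful order
ordinal = OrdinalEncoder(categories=[["small", "medium", "large"]]).fit_transform(size)

print(one_hot)           # columns correspond to Paris and Tokyo; Lima is the baseline
print(ordinal.ravel())
```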

---

**14) What is the role of interaction terms in Multiple Linear Regression?**
Interaction terms (e.g., `X1*X2`) capture **combined effects**: they let the effect of one variable on Y depend on the value of another, which the individual terms alone cannot express.
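
A small sketch of an interaction term in practice, assuming synthetic noise-free data with arbitrarily chosen coefficients. The product `x1 * x2` is simply added as an extra feature column.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
# Noise-free target in which the effect of x1 depends on x2
y = 1.0 + 2.0 * x1 + 0.5 * x2 + 4.0 * x1 * x2

# Add the product x1*x2 as a third feature
X_inter = np.column_stack([x1, x2, x1 * x2])
model = LinearRegression().fit(X_inter, y)
print(model.coef_)  # estimates for x1, x2, and x1*x2

# With the interaction, the slope of y with respect to x1 is b1 + b3 * x2
b1, b2, b3 = model.coef_
slope_x1_when_x2_is_1 = b1 + b3 * 1.0
print(slope_x1_when_x2_is_1)
```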

---

**15) How can the interpretation of intercept differ between Simple and Multiple Linear Regression?**

* Simple: `c` is the expected Y when X = 0.
* Multiple: Intercept represents Y when **all independent variables = 0**, which may or may not be meaningful.

---

**16) What is the significance of the slope in regression analysis, and how does it affect predictions?**
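The slope shows the **magnitude and direction** of the relationship: a positive slope means Y increases with X, a negative slope means it decreases. Each one-unit change in X shifts the predicted Y by the slope (in Multiple Linear Regression, with the other predictors held constant).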

---

**17) How does the intercept in a regression model provide context for the relationship between variables?**
It provides a **reference point** for prediction: the expected value of Y when all Xs are zero.

---

**18) What are the limitations of using R² as a sole measure of model performance?**

* It **never decreases when predictors are added**, so it can be misleading with many predictors
* It says nothing about **model bias or overfitting**
* It can be **artificially high** in complex models

Use **Adjusted R², RMSE, or cross-validation** for better assessment.

---

**19) How would you interpret a large standard error for a regression coefficient?**
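It means the **coefficient estimate is imprecise**: it would vary substantially from sample to sample, so the relationship with the target variable is uncertain. Common causes are a small sample, noisy data, or multicollinearity.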
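
For reference, a sketch of how coefficient standard errors are obtained in ordinary least squares, assuming the textbook formula `Var(β̂) = σ² (XᵀX)⁻¹` and synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)
y = 2.0 + 3.0 * x + rng.normal(scale=1.0, size=n)

X = np.column_stack([np.ones(n), x])           # design matrix with an intercept column
beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # OLS estimates (intercept, slope)
residuals = y - X @ beta

p = X.shape[1]                                 # number of estimated parameters
sigma2 = residuals @ residuals / (n - p)       # unbiased estimate of the residual variance
cov_beta = sigma2 * np.linalg.inv(X.T @ X)     # covariance matrix of the estimates
std_err = np.sqrt(np.diag(cov_beta))

t_stats = beta / std_err                       # estimate divided by its standard error
print(beta, std_err, t_stats)
```

A large standard error relative to the estimate gives a small t-statistic, which is why such coefficients are often not statistically distinguishable from zero.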

---

**20) How can heteroscedasticity be identified in residual plots, and why is it important to address it?**

* Identified when residuals **fan out or funnel** in a plot of residuals against fitted values, i.e., their spread grows or shrinks across the range.
* It violates model assumptions and affects **validity of confidence intervals** and p-values.
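
A numeric stand-in for the visual check, sketched on synthetic data whose noise grows with `x`. Comparing the residual spread in the lower and upper halves of the fitted values is an informal heuristic chosen for this example, not a formal test.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(1, 10, size=500)
# Noise standard deviation grows with x, so the residuals fan out
y = 2.0 + 3.0 * x + rng.normal(scale=0.5 * x)

m, c = np.polyfit(x, y, deg=1)
fitted = m * x + c
residuals = y - fitted

# Compare residual spread in the lower and upper halves of the fitted values
order = np.argsort(fitted)
low, high = residuals[order[:250]], residuals[order[250:]]
print("residual std (low fitted values): ", low.std())
print("residual std (high fitted values):", high.std())
```

Plotting `residuals` against `fitted` as a scatter plot would show the same pattern as a widening funnel.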

---

**21) What does it mean if a Multiple Linear Regression model has a high R² but low adjusted R²?**
It means **some predictors are not useful**, and the model may be **overfitting**. Adjusted R² penalizes extra, unnecessary predictors.
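
The effect can be reproduced with a short sketch, assuming the usual formula `Adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1)` and synthetic data in which 15 pure-noise predictors are appended to one useful predictor. The helper name `adjusted_r2` is hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)

rng = np.random.default_rng(0)
n = 30
x = rng.normal(size=(n, 1))
y = 2.0 * x[:, 0] + rng.normal(scale=2.0, size=n)

# Append 15 pure-noise columns that carry no information about y
X_noisy = np.hstack([x, rng.normal(size=(n, 15))])

r2_small = LinearRegression().fit(x, y).score(x, y)
r2_big = LinearRegression().fit(X_noisy, y).score(X_noisy, y)
adj_small = adjusted_r2(r2_small, n, 1)
adj_big = adjusted_r2(r2_big, n, 16)

print("1 predictor:   R2 =", r2_small, " adjusted =", adj_small)
print("16 predictors: R2 =", r2_big, " adjusted =", adj_big)
```

In-sample R² cannot go down when the noise columns are added, while the adjusted value is pulled below R² by the penalty for the extra predictors.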

---

**22) Why is it important to scale variables in Multiple Linear Regression?**

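* Prevents **large-valued features** from dominating the penalty term or the optimizer
* Important for **regularization techniques** (Ridge, Lasso), whose penalties depend on coefficient size
* Makes coefficient magnitudes comparable across predictors

For plain OLS, scaling changes the coefficients but not the predictions.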
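
A sketch of scaling in practice, assuming two hypothetical features (`income` and `age`) on very different scales. It standardizes inside a pipeline before Ridge and also checks how plain OLS predictions behave with and without scaling.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
income = rng.normal(50_000, 15_000, size=200)   # large-valued feature (hypothetical)
age = rng.normal(40, 10, size=200)              # small-valued feature (hypothetical)
X = np.column_stack([income, age])
y = 0.0001 * income + 0.5 * age + rng.normal(scale=1.0, size=200)

# Plain OLS fitted on raw and on standardized features
ols_raw = LinearRegression().fit(X, y)
ols_scaled = make_pipeline(StandardScaler(), LinearRegression()).fit(X, y)
same_predictions = np.allclose(ols_raw.predict(X), ols_scaled.predict(X))

# Ridge: the penalty depends on coefficient size, so standardize first
ridge = make_pipeline(StandardScaler(), Ridge(alpha=1.0)).fit(X, y)

print("OLS predictions identical:", same_predictions)
print("Ridge coefficients per standard deviation:", ridge[-1].coef_)
```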

---

### 📗 **Polynomial Regression**

**23) What is Polynomial Regression?**
Polynomial regression models **nonlinear relationships** by adding powers of the independent variable:
**Y = b₀ + b₁X + b₂X² + ... + bₙXⁿ**

---

**24) How does polynomial regression differ from linear regression?**

* Linear Regression fits a **straight line**
* Polynomial Regression fits a **curved line** by including powers of X; it is still linear in the coefficients, so it is fitted with the same least-squares machinery
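
The difference shows up clearly on curved data. The sketch below fits both models to synthetic quadratic data (coefficients chosen arbitrarily) and compares their in-sample R².

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = 1.0 + 0.5 * X[:, 0] + 2.0 * X[:, 0] ** 2 + rng.normal(scale=0.5, size=200)

# Straight line versus degree-2 polynomial on the same data
linear = LinearRegression().fit(X, y)
quadratic = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X, y)

r2_linear = linear.score(X, y)
r2_quadratic = quadratic.score(X, y)
print("linear R^2:   ", r2_linear)
print("quadratic R^2:", r2_quadratic)
```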

---

**25) When is polynomial regression used?**
When data shows a **non-linear** pattern that cannot be captured with a straight line.

---

**26) What is the general equation for polynomial regression?**
**Y = b₀ + b₁X + b₂X² + b₃X³ + ... + bₙXⁿ**

---

**27) Can polynomial regression be applied to multiple variables?**
Yes, it becomes **Multivariate Polynomial Regression**, using polynomial terms of multiple variables and their interactions.
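
To see which terms are generated, here is a small sketch using scikit-learn's `PolynomialFeatures` on a single sample with two variables (the values 2 and 3 are arbitrary):

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[2.0, 3.0]])   # one sample with two variables, x1 = 2 and x2 = 3

poly = PolynomialFeatures(degree=2, include_bias=True)
X_poly = poly.fit_transform(X)

# Columns are: 1, x1, x2, x1^2, x1*x2, x2^2
print(X_poly)
```

The `x1*x2` column is the interaction term; higher degrees add more powers and cross-products, so the number of features grows quickly.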

---

**28) What are the limitations of polynomial regression?**

* Can **overfit** easily
* Becomes **complex** with higher degrees
* Sensitive to **outliers**
* Harder to interpret

---

**29) What methods can be used to evaluate model fit when selecting the degree of a polynomial?**

* **Cross-validation**
* **Adjusted R²**
* **AIC/BIC** (information criteria)
* **Residual plots** and held-out test error to check for overfitting
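
As one possible way to apply cross-validation here, the sketch below scores degrees 1 through 8 on synthetic cubic data with 5-fold cross-validation. The data, the degree range, and the fold count are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = X[:, 0] ** 3 - 2.0 * X[:, 0] + rng.normal(scale=1.0, size=200)  # true degree is 3

cv_mse = {}
for degree in range(1, 9):
    model = make_pipeline(PolynomialFeatures(degree=degree), LinearRegression())
    # scikit-learn reports negated MSE so that higher is always better
    scores = cross_val_score(model, X, y, cv=5, scoring="neg_mean_squared_error")
    cv_mse[degree] = -scores.mean()

best_degree = min(cv_mse, key=cv_mse.get)
print(cv_mse)
print("degree with lowest cross-validated MSE:", best_degree)
```

When several degrees give nearly the same error, preferring the lowest of them is a common way to keep the model simple.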

---

**30) Why is visualization important in polynomial regression?**
It helps in **understanding curvature**, detecting **overfitting** (e.g., wild oscillations between data points), and checking that the fitted curve follows the data sensibly.

---

**31) How is polynomial regression implemented in Python?**

```python
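import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic cubic data for illustration only; replace with your own X and y
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = X[:, 0] ** 3 - 2 * X[:, 0] + rng.normal(scale=1.0, size=100)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Polynomial regression of degree 3: expand the features, then fit least squares
model = make_pipeline(PolynomialFeatures(degree=3), LinearRegression())
model.fit(X_train, y_train)
y_pred = model.predict(X_test)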
```

