### Q1. Explain the concept of R-squared in linear regression models. How is it calculated, and what does it represent?

**Answer:**
- **Concept:** R-squared measures the proportion of the variance in the response variable that is explained by the predictor variables.
- **Calculation:**
  \[
  R^2 = 1 - \frac{\text{SS}_{\text{res}}}{\text{SS}_{\text{tot}}}
  \]
  where:
  - \( \text{SS}_{\text{res}} \) is the sum of squared residuals.
  - \( \text{SS}_{\text{tot}} \) is the total sum of squares.
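- **Representation:** For an ordinary least squares fit with an intercept, evaluated on the training data, R-squared lies between 0 and 1, with higher values indicating that the model explains more of the variance. It can be negative when the model fits worse than simply predicting the mean, for example on held-out data or in a model without an intercept.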
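The formula above can be sketched in a few lines of plain Python. The data values and the helper name `r_squared` are hypothetical, chosen only to show one good fit and one fit that does worse than predicting the mean.

```python
def r_squared(y_true, y_pred):
    """Return 1 - SS_res / SS_tot for paired observations and predictions."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)             # total sum of squares
    return 1 - ss_res / ss_tot


# Hypothetical observations and two sets of predictions.
y_true = [3.0, 5.0, 7.0, 9.0]
good_pred = [2.5, 5.5, 6.5, 9.5]   # close to the observations
bad_pred = [9.0, 7.0, 5.0, 3.0]    # worse than always predicting the mean

r2_good = r_squared(y_true, good_pred)
r2_bad = r_squared(y_true, bad_pred)

print(f"good fit: {r2_good:.2f}")  # good fit: 0.95
print(f"bad fit:  {r2_bad:.2f}")   # bad fit:  -3.00
```

The second result shows that the value is not bounded below by 0 once the predictions are worse than the mean of the observations.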

### Q2. Define adjusted R-squared and explain how it differs from the regular R-squared.

**Answer:**
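- **Definition:** Adjusted R-squared adjusts R-squared for the number of predictors \( p \) relative to the number of observations \( n \): \( R^2_{\text{adj}} = 1 - (1 - R^2)\frac{n - 1}{n - p - 1} \).
- **Difference:** On the training data, R-squared never decreases when a predictor is added, even an irrelevant one. Adjusted R-squared decreases if the added predictor does not improve the fit enough to offset the lost degree of freedom.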
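As a rough illustration, the standard adjustment \( 1 - (1 - R^2)\frac{n - 1}{n - p - 1} \) can be applied to two hypothetical models fitted to the same 30 observations. The R-squared values and predictor counts below are made up for the example.

```python
def adjusted_r_squared(r2, n_samples, n_predictors):
    """Apply the degrees-of-freedom correction to a plain R-squared value."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Hypothetical models: the larger one has a slightly higher R-squared
# but uses twice as many predictors.
adj_small = adjusted_r_squared(0.900, n_samples=30, n_predictors=3)
adj_large = adjusted_r_squared(0.902, n_samples=30, n_predictors=6)

print(f"3 predictors: {adj_small:.3f}")  # 3 predictors: 0.888
print(f"6 predictors: {adj_large:.3f}")  # 6 predictors: 0.876
```

Here the larger model wins on plain R-squared (0.902 vs 0.900) but loses on adjusted R-squared, because three extra predictors bought almost no additional explained variance.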

### Q3. When is it more appropriate to use adjusted R-squared?

**Answer:**
- **Use:** When comparing models with different numbers of predictors.
- **Reason:** It accounts for the number of predictors, preventing the inflation of R-squared due to excessive predictors.

### Q4. What are RMSE, MSE, and MAE in the context of regression analysis? How are these metrics calculated, and what do they represent?

**Answer:**
- **RMSE (Root Mean Squared Error):**
  - **Calculation:**
    \[
    \text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
    \]
  - **Represents:** Typical magnitude of prediction errors, in the same units as the response. Squaring before averaging weights large errors more heavily.

- **MSE (Mean Squared Error):**
  - **Calculation:**
    \[
    \text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
    \]
  - **Represents:** Average squared prediction errors.

- **MAE (Mean Absolute Error):**
  - **Calculation:**
    \[
    \text{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|
    \]
  - **Represents:** Average absolute prediction errors.
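The three formulas translate directly into code. This is a minimal sketch using only the standard library; the function names and the sample values are illustrative assumptions.

```python
import math


def mse(y_true, y_pred):
    """Mean of the squared prediction errors."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Square root of the MSE, back in the units of the response."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean of the absolute prediction errors."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical observations and predictions; the errors are -1, 0, 1, -4.
y_true = [10.0, 12.0, 14.0, 16.0]
y_pred = [11.0, 12.0, 13.0, 20.0]

mse_value = mse(y_true, y_pred)
rmse_value = rmse(y_true, y_pred)
mae_value = mae(y_true, y_pred)

print(f"MSE  = {mse_value:.2f}")   # MSE  = 4.50
print(f"RMSE = {rmse_value:.2f}")  # RMSE = 2.12
print(f"MAE  = {mae_value:.2f}")   # MAE  = 1.50
```

The single error of 4 pulls RMSE above MAE, which is the weighting effect described above.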

### Q5. Discuss the advantages and disadvantages of using RMSE, MSE, and MAE as evaluation metrics in regression analysis.

**Answer:**
- **RMSE:**
  - **Advantage:** Sensitive to large errors; useful when large errors are particularly undesirable.
  - **Disadvantage:** Not robust to outliers; a few extreme errors can dominate the score.

- **MSE:**
  - **Advantage:** Smooth and differentiable, which makes it convenient as an optimization objective; sensitive to large errors.
  - **Disadvantage:** Difficult to interpret due to squared units; sensitive to outliers.

- **MAE:**
  - **Advantage:** More robust to outliers; easy to interpret.
  - **Disadvantage:** Does not penalize large errors as strongly as RMSE.
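To make the outlier trade-off concrete, the sketch below compares the two metrics on a hypothetical set of residuals before and after one of them is replaced by a large outlier. The residual values are invented for the illustration.

```python
import math


def rmse_from_residuals(residuals):
    """Root mean squared error computed directly from residuals."""
    return math.sqrt(sum(r ** 2 for r in residuals) / len(residuals))


def mae_from_residuals(residuals):
    """Mean absolute error computed directly from residuals."""
    return sum(abs(r) for r in residuals) / len(residuals)


clean = [1.0, -1.0, 1.0, -1.0, 1.0]          # hypothetical well-behaved residuals
with_outlier = [1.0, -1.0, 1.0, -1.0, 10.0]  # same, but the last residual is an outlier

rmse_clean = rmse_from_residuals(clean)
mae_clean = mae_from_residuals(clean)
rmse_outlier = rmse_from_residuals(with_outlier)
mae_outlier = mae_from_residuals(with_outlier)

print(f"clean:        RMSE = {rmse_clean:.2f}, MAE = {mae_clean:.2f}")      # RMSE = 1.00, MAE = 1.00
print(f"with outlier: RMSE = {rmse_outlier:.2f}, MAE = {mae_outlier:.2f}")  # RMSE = 4.56, MAE = 2.80
```

One outlier multiplies RMSE by about 4.6 but MAE by only 2.8, which is why MAE is described as the more robust of the two.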

### Q6. Explain the concept of Lasso regularization. How does it differ from Ridge regularization, and when is it more appropriate to use?

**Answer:**
- **Lasso Regularization:** Adds a penalty proportional to the absolute values of the coefficients (\( \lambda \sum |\beta_j| \)).
- **Difference from Ridge:** Lasso can shrink some coefficients to zero, performing variable selection. Ridge adds a penalty proportional to the squared values of the coefficients.
- **Use:** Lasso is preferred when variable selection is desired, or when there are many predictors and only a few are expected to be relevant.
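The difference in behavior is easiest to see in a simplified special case. Assuming an orthonormal design (\( X^\top X = I \)) and the objectives \( \tfrac{1}{2}\lVert y - X\beta \rVert^2 + \lambda \sum |\beta_j| \) for Lasso and \( \tfrac{1}{2}\lVert y - X\beta \rVert^2 + \tfrac{\lambda}{2} \sum \beta_j^2 \) for Ridge, both solutions have closed forms in terms of the least squares coefficients. The sketch below uses hypothetical coefficient values; real data rarely has an orthonormal design, so this is an illustration rather than a general solver.

```python
def lasso_orthonormal(beta_ols, lam):
    """Soft-thresholding: move each coefficient toward zero by lam, stopping at zero."""
    shrunk = []
    for b in beta_ols:
        magnitude = max(abs(b) - lam, 0.0)
        shrunk.append(magnitude if b > 0 else -magnitude if magnitude > 0 else 0.0)
    return shrunk


def ridge_orthonormal(beta_ols, lam):
    """Proportional shrinkage: every coefficient is scaled by 1 / (1 + lam)."""
    return [b / (1 + lam) for b in beta_ols]


beta_ols = [3.0, -0.3, 0.1, -2.0]  # hypothetical least squares coefficients
lam = 0.5

lasso_coefs = lasso_orthonormal(beta_ols, lam)
ridge_coefs = ridge_orthonormal(beta_ols, lam)

print("lasso:", lasso_coefs)                         # lasso: [2.5, 0.0, 0.0, -1.5]
print("ridge:", [round(b, 3) for b in ridge_coefs])  # ridge: [2.0, -0.2, 0.067, -1.333]
```

Lasso sets the two small coefficients exactly to zero, while Ridge shrinks all four but leaves none at zero.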

### Q7. How do regularized linear models help to prevent overfitting in machine learning? Provide an example to illustrate.

**Answer:**
- **Prevention of Overfitting:** Regularized models add a penalty to the magnitude of coefficients, discouraging complex models and reducing variance.
- **Example:** In a polynomial regression model, Ridge regularization can prevent overfitting by limiting the size of the polynomial coefficients.
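The polynomial example can be sketched without external libraries by solving the Ridge normal equations \( (X^\top X + \lambda I)\beta = X^\top y \) directly. The data, the noise values, and the helper names are hypothetical, and for brevity the sketch also penalizes the intercept, which practical implementations usually avoid.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ridge_poly_fit(xs, ys, degree, lam):
    """Minimize ||y - X b||^2 + lam * ||b||^2 with features X[i][j] = xs[i] ** j."""
    k = degree + 1
    X = [[x ** j for j in range(k)] for x in xs]
    gram = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
             for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(k)]
    return solve(gram, xty)


def l2_norm(coefs):
    return math.sqrt(sum(c * c for c in coefs))


# Hypothetical data: a straight line y = 2x plus fixed "noise" values.
xs = [-1.0, -0.75, -0.5, -0.25, 0.0, 0.25, 0.5, 0.75, 1.0]
noise = [0.3, -0.2, 0.25, -0.3, 0.2, -0.25, 0.3, -0.2, 0.25]
ys = [2 * x + e for x, e in zip(xs, noise)]

unregularized = ridge_poly_fit(xs, ys, degree=6, lam=0.0)
regularized = ridge_poly_fit(xs, ys, degree=6, lam=1.0)

print("coefficient norm, lam=0:", l2_norm(unregularized))
print("coefficient norm, lam=1:", l2_norm(regularized))
```

With `lam=0.0` the degree-6 polynomial is free to chase the noise; with `lam=1.0` the coefficient vector has a smaller norm, so the fitted curve is smoother. In practice the value of `lam` would be chosen by validation rather than fixed by hand.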

### Q8. Discuss the limitations of regularized linear models and explain why they may not always be the best choice for regression analysis.

**Answer:**
- **Limitations:**
  - **Bias:** Regularization adds bias, which can lead to underfitting.
  - **Complexity:** The regularization parameter must be tuned, typically by cross-validation, and a poor choice degrades performance.
  - **Not Suitable for All Types of Data:** May not perform well with non-linear relationships without transformation.

### Q9. You are comparing the performance of two regression models using different evaluation metrics. Model A has an RMSE of 10, while Model B has an MAE of 8. Which model would you choose as the better performer, and why? Are there any limitations to your choice of metric?

**Answer:**
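- **Choice:** Neither model can be declared better from these numbers, because RMSE and MAE are different metrics and are not directly comparable. For any set of errors, RMSE is at least as large as MAE, so Model A's MAE could well be below 8. Evaluate both models with the same metric on the same test data before choosing.
- **Limitations:** RMSE penalizes large errors more heavily, while MAE weights all errors linearly and is less sensitive to outliers. Which metric to use depends on the context and on how costly large errors are.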
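One caveat worth checking numerically: on the same residuals, RMSE is never smaller than MAE, so an RMSE of 10 and an MAE of 8 from different models say little about which is better. The residuals below are hypothetical, constructed so that Model A has an RMSE of 10 and Model B has an MAE of 8, as in the question.

```python
import math


def rmse(residuals):
    return math.sqrt(sum(r ** 2 for r in residuals) / len(residuals))


def mae(residuals):
    return sum(abs(r) for r in residuals) / len(residuals)


# Hypothetical residuals consistent with the reported scores.
residuals_a = [0.0, 0.0, 0.0, 20.0]   # Model A: mostly exact, one large miss
residuals_b = [8.0, -8.0, 8.0, -8.0]  # Model B: consistently off by 8

rmse_a, mae_a = rmse(residuals_a), mae(residuals_a)
rmse_b, mae_b = rmse(residuals_b), mae(residuals_b)

print(f"Model A: RMSE = {rmse_a:.1f}, MAE = {mae_a:.1f}")  # Model A: RMSE = 10.0, MAE = 5.0
print(f"Model B: RMSE = {rmse_b:.1f}, MAE = {mae_b:.1f}")  # Model B: RMSE = 8.0, MAE = 8.0
```

With these residuals Model A is better on MAE (5 vs 8) and Model B is better on RMSE (8 vs 10), so the ranking depends entirely on which metric is applied to both models.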

### Q10. You are comparing the performance of two regularized linear models using different types of regularization. Model A uses Ridge regularization with a regularization parameter of 0.1, while Model B uses Lasso regularization with a regularization parameter of 0.5. Which model would you choose as the better performer, and why? Are there any trade-offs or limitations to your choice of regularization method?

**Answer:**
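- **Choice:** The regularization parameters alone do not determine which model performs better, and 0.1 for Ridge is not comparable with 0.5 for Lasso because the two penalties are on different scales. Compare the models on held-out data (for example, by cross-validation), then weigh the goals:
  - **Ridge Regularization (Model A):** Better if you want to retain all predictors and stabilize coefficient estimates under multicollinearity.
  - **Lasso Regularization (Model B):** Better if you want to perform variable selection and simplify the model.
- **Trade-offs:** Ridge does not perform variable selection, so every predictor stays in the model. Lasso can drop useful predictors and tends to pick arbitrarily among highly correlated ones. Both introduce bias through shrinkage, so choose based on whether model simplicity or prediction accuracy is more critical.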
