### Q1. Explain the concept of R-squared in linear regression models. How is it calculated, and what does it represent?

A1. **R-squared (R²)**:
- Represents the proportion of the variance in the dependent variable that is predictable from the independent variable(s).
- Indicates how well the regression model fits the data.

**Calculation**:
$$ R^2 = 1 - \frac{SS_{res}}{SS_{tot}} $$
- \(SS_{res} = \sum_{i} (y_i - \hat{y}_i)^2\): Sum of squared residuals (squared differences between observed and predicted values).
- \(SS_{tot} = \sum_{i} (y_i - \bar{y})^2\): Total sum of squares (squared differences between observed values and their mean).

**Interpretation**:
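- Ranges from 0 to 1 for a least-squares model with an intercept, evaluated on the data it was fitted to.
  - \(R^2 = 1\): Perfect fit; the model explains all the variability.
  - \(R^2 = 0\): The model explains none of the variability (it does no better than predicting the mean).
  - \(R^2 < 0\) is possible on held-out data or without an intercept, when predictions are worse than predicting the mean.
- A higher \(R^2\) indicates a closer fit to the data it was computed on.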
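
The formula can be checked by hand on a small example. The sketch below is a plain-Python illustration, not a library implementation; the function name `r_squared` and the sample values are made up for this example. The third case shows that the formula itself drops below 0 when the predictions are worse than simply predicting the mean.

```python
def r_squared(y_true, y_pred):
    """Return R^2 = 1 - SS_res / SS_tot for two equal-length sequences."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot


# Illustrative values only (not from any real dataset).
y_true = [3.0, 5.0, 7.0, 9.0]
close_pred = [2.5, 5.5, 6.5, 9.5]   # small residuals
mean_pred = [6.0, 6.0, 6.0, 6.0]    # always predict the mean of y_true
poor_pred = [9.0, 7.0, 5.0, 3.0]    # worse than predicting the mean

r2_close = r_squared(y_true, close_pred)  # about 0.95
r2_mean = r_squared(y_true, mean_pred)    # 0.0
r2_poor = r_squared(y_true, poor_pred)    # -3.0
print(r2_close, r2_mean, r2_poor)
```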

### Q2. Define adjusted R-squared and explain how it differs from the regular R-squared.

A2. **Adjusted R-squared** adjusts the R² value for the number of predictors in the model, preventing overestimation of the goodness of fit.

**Formula**:
$$ \text{Adjusted } R^2 = 1 - \left( \frac{(1 - R^2)(n - 1)}{n - p - 1} \right) $$
- n: Number of observations.
- p: Number of predictors.

- Adjusted R² accounts for the number of predictors and penalizes the addition of unnecessary ones; it can decrease when a predictor adds little explanatory power.
- Regular R² never decreases when a predictor is added to a least-squares model, even if the predictor is irrelevant.
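
To see the penalty at work, the sketch below plugs two hypothetical models into the formula above. The function name `adjusted_r_squared` and the numbers are assumptions chosen for illustration: the second model has a slightly higher R² but uses seven more predictors, and its adjusted R² comes out lower.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 for n observations and p predictors."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Hypothetical models fitted on the same 50 observations.
small_model = adjusted_r_squared(r2=0.80, n=50, p=3)   # about 0.787
large_model = adjusted_r_squared(r2=0.81, n=50, p=10)  # about 0.761
print(small_model, large_model)
```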

### Q3. When is it more appropriate to use adjusted R-squared?

A3. **Use Adjusted R-squared**:
- When comparing models with different numbers of predictors.
- To avoid overfitting and ensure the model's additional predictors are meaningful.
- When the goal is to select the best model among several with different numbers of predictors.

### Q4. What are RMSE, MSE, and MAE in the context of regression analysis? How are these metrics calculated, and what do they represent?

A4. **RMSE (Root Mean Squared Error)**:
$$ RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} $$
- Represents the square root of the average squared differences between predicted and observed values.
- Sensitive to outliers.

**MSE (Mean Squared Error)**:
$$ MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 $$
- Average squared differences between predicted and observed values.
- Also sensitive to outliers.

**MAE (Mean Absolute Error)**:
$$ MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i| $$
- Average absolute differences between predicted and observed values.
- Less sensitive to outliers compared to RMSE and MSE.
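
The three formulas translate directly into code. The following is a minimal plain-Python sketch; the function names and sample values are illustrative assumptions, and in practice a library routine would normally be used instead.

```python
import math


def mse(y_true, y_pred):
    """Mean of squared differences."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Square root of the MSE, in the same units as y."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean of absolute differences."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative values only; the errors are -1, 1, -1, -2.
y_true = [10.0, 12.0, 14.0, 16.0]
y_pred = [11.0, 11.0, 15.0, 18.0]

print(mse(y_true, y_pred))   # 1.75
print(rmse(y_true, y_pred))  # about 1.32
print(mae(y_true, y_pred))   # 1.25
```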

### Q5. Discuss the advantages and disadvantages of using RMSE, MSE, and MAE as evaluation metrics in regression analysis.

A5. 
1. **RMSE**:
- **Advantages**: Penalizes large errors more heavily than small ones and is expressed in the same units as the target variable.
- **Disadvantages**: Sensitive to outliers; a few large errors can dominate the score.

2. **MSE**:
- **Advantages**: Like RMSE, penalizes large errors heavily; as a smooth average of squared errors, it is convenient for optimization.
- **Disadvantages**: Also sensitive to outliers, and expressed in squared units of the original data, which makes it harder to interpret.

3. **MAE**:
- **Advantages**: Less sensitive to outliers; provides a straightforward measure of average error in the units of the target variable.
- **Disadvantages**: Weights all errors in proportion to their size, so it can understate occasional large errors when those matter.
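
The difference in outlier sensitivity can be made concrete with a small experiment. The sketch below works on residuals directly; the helper names and the residual values are assumptions for illustration. Adding a single large residual inflates RMSE by a larger factor than MAE.

```python
import math


def mae_from_errors(errors):
    return sum(abs(e) for e in errors) / len(errors)


def rmse_from_errors(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical residuals, then the same residuals plus one outlier.
clean = [1.0, -1.0, 2.0, -2.0]
with_outlier = clean + [20.0]

mae_growth = mae_from_errors(with_outlier) / mae_from_errors(clean)
rmse_growth = rmse_from_errors(with_outlier) / rmse_from_errors(clean)
print(mae_growth, rmse_growth)  # RMSE grows by the larger factor
```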

### Q6. Explain the concept of Lasso regularization. How does it differ from Ridge regularization, and when is it more appropriate to use?

A6. **Lasso Regularization**:
- Adds a penalty proportional to the sum of the absolute values of the coefficients (the L1 norm) to the cost function.
$$ \text{Cost Function} = \text{RSS} + \lambda \sum_{j=1}^{p} |\beta_j| $$
- Can shrink some coefficients to zero, effectively performing feature selection.

**Difference from Ridge Regularization**:
- Ridge adds a penalty proportional to the sum of the squared coefficients (the squared L2 norm).
$$ \text{Cost Function} = \text{RSS} + \lambda \sum_{j=1}^{p} \beta_j^2 $$
- Ridge shrinks coefficients toward zero but does not set them exactly to zero.

**When to Use Lasso**:
- When feature selection is desired.
- When dealing with high-dimensional data where some predictors may be irrelevant.
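
The contrast between the two penalties is easiest to see in a special case. The sketch below assumes orthonormal predictors (\(X^\top X = I\)), under which minimizing the two cost functions above has a closed form per coefficient: Lasso soft-thresholds the least-squares coefficient at \(\lambda / 2\), while Ridge divides it by \(1 + \lambda\). The function names and coefficient values are hypothetical, and the formulas do not hold for correlated predictors.

```python
def lasso_shrink(b, lam):
    """Lasso solution for one coefficient, ASSUMING orthonormal predictors."""
    magnitude = max(abs(b) - lam / 2, 0.0)
    if magnitude == 0.0:
        return 0.0
    return magnitude if b > 0 else -magnitude


def ridge_shrink(b, lam):
    """Ridge solution for one coefficient, ASSUMING orthonormal predictors."""
    return b / (1 + lam)


# Hypothetical least-squares coefficients.
ols = [3.0, -0.4, 0.2, -2.0]
lam = 1.0

lasso = [lasso_shrink(b, lam) for b in ols]  # small coefficients become exactly 0
ridge = [ridge_shrink(b, lam) for b in ols]  # all coefficients shrink, none reach 0
print(lasso)
print(ridge)
```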

### Q7. How do regularized linear models help to prevent overfitting in machine learning? Provide an example to illustrate.

A7. **Regularized Linear Models**:
- Add a penalty to the cost function, discouraging overly complex models.
- Prevent overfitting by shrinking coefficients, reducing model complexity.

**Example**:
- Without regularization: A model with high variance fits the training data too closely.
- With regularization (Ridge or Lasso): The model's coefficients are penalized, leading to a simpler model that generalizes better to unseen data.
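
The shrinkage mechanism can be sketched with the ridge closed form \((X^\top X + \lambda I)\beta = X^\top y\) for two predictors and no intercept. The function name `ridge_2d` and the data are assumptions made up for this illustration (two nearly collinear predictors and a noisy target); the code shows the coefficients shrinking as \(\lambda\) grows, not generalization itself, which would need a held-out set.

```python
def ridge_2d(x1, x2, y, lam):
    """Solve (X^T X + lam * I) beta = X^T y for two predictors, no intercept."""
    a = sum(v * v for v in x1) + lam
    b = sum(u * v for u, v in zip(x1, x2))
    d = sum(v * v for v in x2) + lam
    r1 = sum(u * v for u, v in zip(x1, y))
    r2 = sum(u * v for u, v in zip(x2, y))
    det = a * d - b * b
    # Cramer's rule for the 2x2 system.
    return ((d * r1 - b * r2) / det, (a * r2 - b * r1) / det)


def squared_norm(beta):
    return sum(c * c for c in beta)


# Hypothetical data: x2 is almost a copy of x1.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.1, 1.9, 3.2, 3.8]
y = [2.4, 3.6, 5.9, 8.1]

for lam in (0.0, 1.0, 10.0):
    beta = ridge_2d(x1, x2, y, lam)
    print(lam, beta, squared_norm(beta))  # the norm decreases as lam grows
```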

### Q8. Discuss the limitations of regularized linear models and explain why they may not always be the best choice for regression analysis.

A8. **Limitations of regularized linear models**:
- May not perform well if the underlying relationship between predictors and the outcome is nonlinear.
- Choosing the appropriate regularization parameter (\(\lambda\)) can be challenging.
- Lasso may discard important variables if \(\lambda\) is too high.
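- Ridge keeps every variable in the model, including irrelevant ones, because it never sets coefficients exactly to zero; if \(\lambda\) is too low, it also offers little protection against overfitting.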

**Not Always Best Choice**:
- When the relationship between predictors and the outcome is complex and nonlinear.
- In such cases, other models like decision trees, random forests, or neural networks may be more suitable.

### Q9. You are comparing the performance of two regression models using different evaluation metrics. Model A has an RMSE of 10, while Model B has an MAE of 8. Which model would you choose as the better performer, and why? Are there any limitations to your choice of metric?

A9. 
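- Neither model can be chosen from these numbers alone: RMSE and MAE are different metrics, and for any set of errors RMSE is at least as large as MAE.
- RMSE penalizes larger errors more than MAE, so the gap between a model's RMSE and its MAE reflects how unevenly its errors are distributed.

**Limitations**:
- Without the same metric computed for both models on the same test data, no choice is justified.
- Model B's MAE of 8 only implies that its RMSE is at least 8, and Model A's RMSE of 10 only implies that its MAE is at most 10, so either model could be better on either metric.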
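
A small numeric example shows why an MAE of 8 says little about RMSE. In the sketch below, the two residual sets are made-up values with the same MAE of 8; one has an RMSE below Model A's 10 and the other well above it.

```python
import math


def mae_of(errors):
    return sum(abs(e) for e in errors) / len(errors)


def rmse_of(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Two hypothetical residual sets, both with MAE = 8.
even_errors = [8.0, 8.0, 8.0, 8.0]     # every prediction is off by 8
uneven_errors = [0.0, 0.0, 0.0, 32.0]  # three perfect predictions, one large miss

print(mae_of(even_errors), rmse_of(even_errors))      # 8.0 8.0
print(mae_of(uneven_errors), rmse_of(uneven_errors))  # 8.0 16.0
```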

### Q10. You are comparing the performance of two regularized linear models using different types of regularization. Model A uses Ridge regularization with a regularization parameter of 0.1, while Model B uses Lasso regularization with a regularization parameter of 0.5. Which model would you choose as the better performer, and why? Are there any trade-offs or limitations to your choice of regularization method?

A10. 
- It depends on the context and goals.
  - Ridge (Model A) is preferred if all predictors are potentially relevant.
  - Lasso (Model B) is preferred if feature selection is needed.

**Trade-offs or Limitations**:
- Ridge keeps all predictors, potentially leading to a less interpretable model.
- Lasso may discard relevant predictors if the regularization parameter is too high, and a large \(\lambda\) biases the remaining estimates toward zero.
- Choosing the appropriate regularization parameter for each method is crucial; the values 0.1 and 0.5 are not directly comparable, because the L1 and L2 penalties are on different scales.
- Ridge doesn't perform feature selection, which can be a disadvantage in high-dimensional settings.
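
One common way to choose the regularization parameter is to compare candidate values on held-out data. The sketch below does this for a one-predictor ridge model without an intercept, where the fit has the closed form \(\beta = \sum x_i y_i / (\sum x_i^2 + \lambda)\). The function names, the data, and the candidate grid are all assumptions for illustration; the same idea applies to comparing a Ridge and a Lasso model on a shared validation set.

```python
def fit_ridge_1d(x, y, lam):
    """Closed-form ridge slope for one predictor, no intercept."""
    return sum(a * b for a, b in zip(x, y)) / (sum(a * a for a in x) + lam)


def validation_mse(beta, x, y):
    return sum((b - beta * a) ** 2 for a, b in zip(x, y)) / len(x)


def pick_lambda(x_train, y_train, x_val, y_val, grid):
    """Return the candidate with the lowest validation MSE."""
    return min(
        grid,
        key=lambda lam: validation_mse(
            fit_ridge_1d(x_train, y_train, lam), x_val, y_val
        ),
    )


# Hypothetical data: noisy training targets, validation targets with slope 2.
x_train, y_train = [1.0, 2.0, 3.0], [1.5, 3.9, 6.6]
x_val, y_val = [4.0, 5.0], [8.0, 10.0]
grid = [0.0, 0.1, 0.5, 1.0, 5.0]

best = pick_lambda(x_train, y_train, x_val, y_val, grid)
print(best)  # 0.5 for this data
```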