# Assignment Questions - Regression 2

## Q1. Explain the concept of R-squared in linear regression models. How is it calculated, and what does it represent?
**R-squared (R²)**, also known as the coefficient of determination, is a statistical measure that represents the proportion of the variance for a dependent variable that's explained by the independent variable(s) in a regression model.

It is calculated as:
\[
R^2 = 1 - \frac{SS_{res}}{SS_{tot}}
\]
Where:
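- \( SS_{res} = \sum_i (y_i - \hat{y}_i)^2 \) is the sum of squared residuals (the squared differences between observed and predicted values),
- \( SS_{tot} = \sum_i (y_i - \bar{y})^2 \) is the total sum of squares (the squared deviations of the observed values from their mean, which is proportional to the variance of the dependent variable).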

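For a least squares model with an intercept, evaluated on its training data, R² ranges from 0 to 1 (on held-out data, or for a model without an intercept, it can be negative), where: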
- 0 indicates that the model does not explain any of the variance.
- 1 indicates that the model perfectly explains all the variance.
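
As a minimal sketch of the formula above, R² can be computed directly from the two sums of squares. The function name `r_squared` and the sample values are illustrative assumptions, not part of the assignment.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes y_true is not constant, otherwise SS_tot is zero.
    """
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical observations and predictions, for illustration only.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.5, 8.5]

r2 = r_squared(y_true, y_pred)
print(round(r2, 2))  # SS_res = 1.0, SS_tot = 20.0, so R^2 = 0.95
```

Predicting the mean of `y_true` for every observation makes `SS_res` equal to `SS_tot`, which gives an R² of 0.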

---

## Q2. Define adjusted R-squared and explain how it differs from the regular R-squared.
**Adjusted R-squared** adjusts the R² value to account for the number of independent variables in the model. While R² never decreases when a variable is added, even an irrelevant one, adjusted R-squared increases only if the new variable improves the fit by more than would be expected by chance.

The formula for adjusted R² is:
\[
\text{Adjusted } R^2 = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}
\]
Where:
- \( n \) is the number of observations,
- \( k \) is the number of independent variables.

Adjusted R² never exceeds R² and can even be negative; it discourages overfitting by penalizing the inclusion of irrelevant variables.
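
The penalty can be seen in a short sketch that applies the formula to the same R² with different numbers of predictors. The function name `adjusted_r_squared` and the values of `n` and `k` are assumptions chosen for illustration.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 for n observations and k independent variables."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than k + 1")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Same R^2 of 0.90 on 50 observations, with 3 versus 20 predictors.
adj_small = adjusted_r_squared(0.90, n=50, k=3)
adj_large = adjusted_r_squared(0.90, n=50, k=20)

print(round(adj_small, 4))  # about 0.8935
print(round(adj_large, 4))  # about 0.8310
```

With R² held fixed, the model with more predictors gets the lower adjusted value.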

---

## Q3. When is it more appropriate to use adjusted R-squared?
Adjusted R-squared is more appropriate when:
- You are comparing models with different numbers of independent variables.
- You want an added variable to count as an improvement only if it raises the fit by more than chance alone would.
  
It's a better metric for evaluating models with multiple predictors.

---

## Q4. What are RMSE, MSE, and MAE in the context of regression analysis? How are these metrics calculated, and what do they represent?
- **RMSE (Root Mean Squared Error):** It is the square root of the average of the squared differences between actual and predicted values.
\[
RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y_i})^2}
\]
- **MSE (Mean Squared Error):** It is the average of the squared differences between actual and predicted values.
\[
MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y_i})^2
\]
- **MAE (Mean Absolute Error):** It is the average of the absolute differences between actual and predicted values.
\[
MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y_i}|
\]

These metrics represent the error between predicted and actual values:
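- **MSE and RMSE** square the errors, so they are more sensitive to large errors (outliers). MSE is expressed in squared units of the target, while RMSE is in the same units as the target.
- **MAE** weights all errors linearly and is in the same units as the target, which makes it the most direct measure of the typical error size.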
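
The three formulas translate almost line for line into code. This is a minimal sketch using only the standard library; the function names and the sample data are illustrative assumptions.

```python
import math


def mse(y_true, y_pred):
    """Mean of the squared differences."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Square root of the MSE, in the units of the target."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean of the absolute differences."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical values: the errors are -1, 1, -1 and 4.
y_true = [10.0, 12.0, 14.0, 16.0]
y_pred = [11.0, 11.0, 15.0, 12.0]

print(mse(y_true, y_pred))   # 4.75
print(rmse(y_true, y_pred))  # square root of 4.75
print(mae(y_true, y_pred))   # 1.75
```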

---

## Q5. Discuss the advantages and disadvantages of using RMSE, MSE, and MAE as evaluation metrics in regression analysis.
### Advantages:
- **RMSE/MSE:**
  - Penalizes large errors more due to squaring, which can be useful when large errors are particularly undesirable.
  - MSE decomposes into the variance of the prediction errors plus their squared mean (the bias), so it reflects both spread and systematic offset.
- **MAE:**
  - Simpler interpretation as it represents the average magnitude of errors in units of the target variable.
  - Less sensitive to outliers compared to RMSE/MSE.

### Disadvantages:
- **RMSE/MSE:**
  - Sensitive to outliers: a few large errors can dominate the score and distort the evaluation of the model.
  - MSE is in squared units of the target, which makes it harder to interpret; RMSE restores the original units.
- **MAE:**
  - Does not penalize larger errors as strongly as RMSE, making it less suitable when large errors are critical.
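
The difference in outlier sensitivity can be illustrated with a small sketch that replaces one error with a much larger one. The error values are hypothetical, and the helper functions take residuals directly as a simplifying assumption.

```python
import math


def rmse(errors):
    """Root mean squared error computed from a list of residuals."""
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))


def mae(errors):
    """Mean absolute error computed from a list of residuals."""
    return sum(abs(e) for e in errors) / len(errors)


clean = [1.0, -1.0, 1.0, -1.0]
with_outlier = [1.0, -1.0, 1.0, -10.0]  # one large error

print(mae(clean), rmse(clean))                # 1.0 1.0
print(mae(with_outlier), rmse(with_outlier))  # 3.25 and roughly 5.07
```

A single large error raises MAE from 1.0 to 3.25 but raises RMSE from 1.0 to about 5.07, which is the behaviour the lists above describe.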

---

## Q6. Explain the concept of Lasso regularization. How does it differ from Ridge regularization, and when is it more appropriate to use?
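**Lasso (Least Absolute Shrinkage and Selection Operator)** regularization adds a penalty proportional to the sum of the absolute values of the coefficients (an L1 penalty) to the loss function:
\[
\text{Loss} = \text{RSS} + \lambda \sum_{j=1}^{p} |b_j|
\]
Where RSS is the residual sum of squares, \( b_j \) are the \( p \) model coefficients, and \( \lambda \ge 0 \) is the regularization parameter that controls the strength of the penalty.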

**Differences from Ridge Regularization:**
- Lasso tends to shrink some coefficients to exactly zero, effectively performing feature selection.
- Ridge regularization adds a penalty proportional to the sum of the squared coefficients (an L2 penalty); it shrinks coefficients toward zero but does not set any of them exactly to zero.
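
For comparison, the Ridge objective can be written in the same notation as the Lasso loss, assuming the same RSS term and coefficients \( b_j \):
\[
\text{Loss}_{ridge} = \text{RSS} + \lambda \sum_{j=1}^{p} b_j^2
\]
The only change is that each coefficient enters the penalty squared rather than in absolute value.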

**When to Use Lasso:**
- When you suspect that only a subset of features is important for prediction.
- When you want to automatically perform feature selection in your model.
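
Why Lasso produces exact zeros while Ridge does not can be sketched in a special case. The code below assumes orthonormal features, where minimizing RSS plus the penalty has a closed form for each coefficient: Lasso soft-thresholds the unregularized estimate by \( \lambda / 2 \), and Ridge divides it by \( 1 + \lambda \). The function names and the coefficient values are illustrative assumptions, and general problems need an iterative solver instead.

```python
def lasso_orthonormal(b_ols, lam):
    """Lasso solution for one coefficient under orthonormal features.

    Minimizes (b - b_ols)^2 + lam * |b|, i.e. soft-thresholding at lam / 2.
    """
    magnitude = abs(b_ols) - lam / 2
    if magnitude <= 0:
        return 0.0
    return magnitude if b_ols > 0 else -magnitude


def ridge_orthonormal(b_ols, lam):
    """Ridge solution for one coefficient under orthonormal features.

    Minimizes (b - b_ols)^2 + lam * b^2, i.e. proportional shrinkage.
    """
    return b_ols / (1 + lam)


# Hypothetical unregularized coefficients: two large, two small.
ols = [3.0, -1.5, 0.2, -0.1]
lam = 1.0

lasso = [lasso_orthonormal(b, lam) for b in ols]
ridge = [ridge_orthonormal(b, lam) for b in ols]

print(lasso)  # [2.5, -1.0, 0.0, 0.0]
print(ridge)  # [1.5, -0.75, 0.1, -0.05]
```

The two small coefficients become exactly zero under Lasso, while Ridge only scales them down.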

---

## Q7. How do regularized linear models help to prevent overfitting in machine learning? Provide an example to illustrate.
Regularized linear models, such as **Ridge** and **Lasso**, add a penalty to the size of the coefficients to prevent them from becoming too large, which helps in reducing overfitting.

**Example:** Suppose a model has too many features, some of which are not relevant. Without regularization, the model may fit noise in the training data (overfitting). Regularization discourages large coefficients for irrelevant features, leading to a simpler and more generalizable model.
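
The shrinking effect on an irrelevant feature can be sketched with the simplest possible case: a single feature and no intercept, where Ridge has the closed form \( b = \sum x_i y_i / (\sum x_i^2 + \lambda) \). The data below is made up so that the feature has only a small spurious correlation with the target; the names are assumptions for illustration.

```python
def ridge_coefficient(x, y, lam):
    """Closed-form Ridge coefficient for one feature without an intercept."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi ** 2 for xi in x)
    return sxy / (sxx + lam)


# Hypothetical irrelevant feature and a target that is pure noise.
x_irrelevant = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
y_noise = [0.5, -0.2, 0.1, 0.3, -0.4, -0.6]

coefs = {
    lam: ridge_coefficient(x_irrelevant, y_noise, lam)
    for lam in (0.0, 6.0, 60.0)
}

for lam, b in coefs.items():
    print(lam, round(b, 4))
```

With `lam = 0.0` the fit assigns the noise feature a coefficient of about 0.117; larger values of `lam` pull it steadily toward zero, so the model relies less on a pattern that exists only in the training sample.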

---

## Q8. Discuss the limitations of regularized linear models and explain why they may not always be the best choice for regression analysis.
### Limitations:
- Regularized models may oversimplify the relationship between variables by shrinking coefficients too much, potentially underfitting the data.
- Lasso may exclude important variables by shrinking coefficients to zero if the regularization parameter is too large.
- These models assume that the true relationship between variables is linear, which may not always be the case.

In cases where the data has complex interactions or non-linear relationships, other models (like decision trees or neural networks) may perform better.

---

## Q9. You are comparing the performance of two regression models using different evaluation metrics. Model A has an RMSE of 10, while Model B has an MAE of 8. Which model would you choose as the better performer, and why? Are there any limitations to your choice of metric?
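The two figures are not directly comparable because they are different metrics. For any set of errors, RMSE is at least as large as MAE, so Model A's RMSE of 10 does not imply that its MAE is above 8. Model B's **MAE of 8** is the smaller number, but that alone does not show that Model B is the better performer; a fair comparison needs the same metric computed for both models on the same data.

**Limitations** of choosing on these numbers include:
- RMSE penalizes large errors more heavily, so Model A's RMSE may be inflated by a few large errors even if its typical error is small.
- MAE treats all errors linearly, so it can hide occasional large errors that matter in some applications.

The choice of metric depends on the specific problem and its tolerance for large errors.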
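
A small sketch shows why an RMSE of 10 and an MAE of 8 cannot be ranked against each other. The two error lists are hypothetical and were chosen so that Model A has an RMSE of exactly 10 and Model B an MAE of exactly 8.

```python
import math


def rmse(errors):
    """Root mean squared error computed from a list of residuals."""
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))


def mae(errors):
    """Mean absolute error computed from a list of residuals."""
    return sum(abs(e) for e in errors) / len(errors)


# Hypothetical residuals consistent with the figures in the question.
errors_a = [0.0, 0.0, 0.0, 20.0]  # mostly exact, one large miss
errors_b = [8.0, 8.0, 8.0, 8.0]   # consistently off by 8

print(rmse(errors_a), mae(errors_a))  # 10.0 5.0
print(rmse(errors_b), mae(errors_b))  # 8.0 8.0
```

In this constructed case Model A has the lower MAE and Model B has the lower RMSE, so the ranking depends entirely on which metric is applied to both models.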

---

## Q10. You are comparing the performance of two regularized linear models using different types of regularization. Model A uses Ridge regularization with a regularization parameter of 0.1, while Model B uses Lasso regularization with a regularization parameter of 0.5. Which model would you choose as the better performer, and why? Are there any trade-offs or limitations to your choice of regularization method?
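The regularization parameters (0.1 and 0.5) are not directly comparable, because the two penalties are on different scales, so neither value alone identifies the better performer; the models should be compared on held-out data. Beyond that, the choice between **Ridge** (Model A) and **Lasso** (Model B) depends on the specific scenario:
- **Model B (Lasso)**: If you believe that only a subset of features is important, Lasso's feature selection might give a better, simpler model.
- **Model A (Ridge)**: If all features are relevant but need regularization to avoid overfitting, Ridge might perform better since it does not set any coefficients exactly to zero.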

**Trade-offs**:
- Lasso can lead to sparse models but may discard relevant features if the regularization parameter is too high.
- Ridge keeps all variables but may not be as effective if there are many irrelevant features.
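
One way to compare the two models in practice is to fit both and score them with the same metric on held-out data. The sketch below does this for a toy single-feature, no-intercept problem where both penalties have closed-form solutions; the data, the function names, and the train/validation split are assumptions for illustration, and a real comparison would typically use cross-validation with a library solver.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t, returning 0.0 inside [-t, t]."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def fit_ridge(x, y, lam):
    """Minimize sum((y - b*x)^2) + lam * b^2 for a single coefficient b."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi ** 2 for xi in x)
    return sxy / (sxx + lam)


def fit_lasso(x, y, lam):
    """Minimize sum((y - b*x)^2) + lam * |b| for a single coefficient b."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi ** 2 for xi in x)
    return soft_threshold(sxy, lam / 2) / sxx


def mse(x, y, b):
    """Mean squared error of the predictions b * x."""
    return sum((yi - b * xi) ** 2 for xi, yi in zip(x, y)) / len(x)


# Hypothetical training and validation data.
x_train, y_train = [1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8]
x_val, y_val = [5.0, 6.0], [10.1, 11.9]

coefs = {
    "ridge": fit_ridge(x_train, y_train, 0.1),  # Model A
    "lasso": fit_lasso(x_train, y_train, 0.5),  # Model B
}
val_mse = {name: mse(x_val, y_val, b) for name, b in coefs.items()}
best = min(val_mse, key=val_mse.get)

print(coefs)
print(val_mse)
print(best)
```

The selected model is whichever has the lower validation error on this particular data; the result says nothing general about Ridge versus Lasso.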

---

