

### Q1. Explain the concept of R-squared in linear regression models. How is it calculated, and what does it represent?
- R-squared (or the coefficient of determination) measures the proportion of the variance in the dependent variable that is predictable from the independent variables. It is calculated as:
\[ R^2 = 1 - \frac{SS_{res}}{SS_{tot}} \]
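Where \(SS_{res} = \sum_i (y_i - \hat{y}_i)^2\) is the sum of the squared residuals and \(SS_{tot} = \sum_i (y_i - \bar{y})^2\) is the total sum of squares. It represents the goodness of fit of the model. For a least-squares fit with an intercept, evaluated on the training data, it lies between 0 and 1, and a higher value indicates a better fit. On held-out data, or for a model without an intercept, it can be negative, which means the model predicts worse than the mean of the observed values.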
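The formula can be checked by hand on a few numbers. The sketch below is a plain-Python illustration; the data values and the helper name `r_squared` are made up for this example.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot

y_true = [3.0, 5.0, 7.0, 9.0]   # made-up observations
y_good = [2.5, 5.5, 6.5, 9.5]   # made-up predictions close to the data
y_bad = [9.0, 7.0, 5.0, 3.0]    # made-up predictions worse than the mean

r2_good = r_squared(y_true, y_good)  # SS_res = 1, SS_tot = 20, so about 0.95
r2_bad = r_squared(y_true, y_bad)    # SS_res = 80, SS_tot = 20, so -3.0
```

The second call shows that the same formula yields a negative value when the predictions are worse than simply predicting the mean.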

### Q2. Define adjusted R-squared and explain how it differs from the regular R-squared.
- While R-squared never decreases when more predictors are added to a least-squares model (even useless ones), adjusted R-squared increases only if the new term improves the model's fit more than expected by chance, and it can decrease otherwise. It adjusts for the number of predictors in the model and is defined as:
\[ Adjusted\:R^2 = 1 - \frac{(1-R^2)(n-1)}{n-k-1} \]
Where \(n\) is the sample size and \(k\) is the number of predictors.
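A small numeric sketch of the formula, using hypothetical values for \(R^2\), \(n\), and \(k\): adding five predictors raises R-squared slightly, yet adjusted R-squared goes down.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 for sample size n and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical models fitted on the same 50 observations.
small_model = adjusted_r_squared(r2=0.80, n=50, k=3)  # about 0.787
large_model = adjusted_r_squared(r2=0.81, n=50, k=8)  # about 0.773
```

Here the larger model has the higher R-squared (0.81 versus 0.80) but the lower adjusted R-squared, so the extra predictors do not pay for themselves.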

### Q3. When is it more appropriate to use adjusted R-squared?
- Adjusted R-squared should be used when comparing models with a different number of predictors to penalize the addition of unnecessary terms.

### Q4. What are RMSE, MSE, and MAE in the context of regression analysis? How are these metrics calculated, and what do they represent?
- **RMSE (Root Mean Squared Error)**: It is the square root of the average of the squared differences between the observed and predicted values.
\[ RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2} \]
- **MSE (Mean Squared Error)**: It is the average of the squared differences between the observed and predicted values.
\[ MSE = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2 \]
- **MAE (Mean Absolute Error)**: It is the average of the absolute differences between the observed and predicted values.
\[ MAE = \frac{1}{n}\sum_{i=1}^{n}|y_i - \hat{y}_i| \]
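The three formulas translate directly into code. This is a minimal plain-Python sketch; the function names and the sample values are illustrative assumptions.

```python
import math

def mse(y_true, y_pred):
    """Mean of the squared errors."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Square root of the MSE, in the same units as the target."""
    return math.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    """Mean of the absolute errors."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)

y_true = [3.0, 5.0, 7.0, 9.0]   # made-up observations
y_pred = [2.0, 5.0, 9.0, 9.0]   # made-up predictions (errors: 1, 0, -2, 0)

mse_value = mse(y_true, y_pred)    # (1 + 0 + 4 + 0) / 4 = 1.25
rmse_value = rmse(y_true, y_pred)  # sqrt(1.25), about 1.118
mae_value = mae(y_true, y_pred)    # (1 + 0 + 2 + 0) / 4 = 0.75
```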

### Q5. Discuss the advantages and disadvantages of using RMSE, MSE, and MAE as evaluation metrics in regression analysis.
- **RMSE**:
  *Advantages*: Penalizes large errors and is expressed in the same units as the target variable.
  *Disadvantages*: Sensitive to outliers.
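- **MSE**:
  *Advantages*: Smooth and differentiable everywhere, so gradients are easy to compute for optimization.
  *Disadvantages*: Sensitive to outliers, and expressed in squared units of the target, which makes it harder to interpret.
- **MAE**:
  *Advantages*: Provides a linear penalty for errors, is more robust to outliers, and is expressed in the same units as the target.
  *Disadvantages*: Not differentiable where the error is 0, which complicates gradient-based optimization.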
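To illustrate the outlier sensitivity mentioned above, the sketch below computes the three metrics on two made-up sets of residuals that differ only in one large error. The residual values are assumptions chosen for illustration.

```python
import math

def metrics(residuals):
    """Return (MSE, RMSE, MAE) for a list of residuals y - y_hat."""
    n = len(residuals)
    mse = sum(r ** 2 for r in residuals) / n
    mae = sum(abs(r) for r in residuals) / n
    return mse, math.sqrt(mse), mae

clean = [1.0, -1.0, 1.0, -1.0]          # made-up residuals, no outlier
with_outlier = [1.0, -1.0, 1.0, -10.0]  # same, but one large error

mse_clean, rmse_clean, mae_clean = metrics(clean)      # 1.0, 1.0, 1.0
mse_out, rmse_out, mae_out = metrics(with_outlier)     # 25.75, about 5.07, 3.25
```

A single outlier multiplies the MSE by about 26 and the RMSE by about 5, while the MAE grows by a factor of 3.25.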

### Q6. Explain the concept of Lasso regularization. How does it differ from Ridge regularization, and when is it more appropriate to use?
- Lasso (L1 regularization) adds the sum of the absolute values of the coefficients, scaled by a regularization parameter, as a penalty term to the loss function. It can reduce some coefficients to exactly zero, effectively performing variable selection. Ridge (L2 regularization), on the other hand, adds the sum of the squared coefficients; it shrinks coefficients toward zero but does not set them exactly to zero. Lasso is more appropriate when we believe many features are irrelevant or redundant.
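The difference is easiest to see in a simplified setting. The sketch below assumes orthonormal features, so each coefficient can be penalized separately starting from its unpenalized estimate `b`. The objective scaling, the function names, and the numbers are assumptions made for this illustration, not any library's convention.

```python
def lasso_1d(b, lam):
    """Minimizer of 0.5*(w - b)**2 + lam*abs(w): soft-thresholding."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0  # small estimates are set exactly to zero

def ridge_1d(b, lam):
    """Minimizer of 0.5*(w - b)**2 + 0.5*lam*w**2: proportional shrinkage."""
    return b / (1 + lam)

unpenalized = [3.0, 0.4, -0.2, -2.5]  # hypothetical unpenalized estimates
lam = 0.5

lasso_coefs = [lasso_1d(b, lam) for b in unpenalized]  # [2.5, 0.0, 0.0, -2.0]
ridge_coefs = [ridge_1d(b, lam) for b in unpenalized]  # all shrunk, none zero
```

Lasso removes the two small coefficients entirely, while Ridge keeps all four and scales each one down by the same factor.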

### Q7. How do regularized linear models help to prevent overfitting in machine learning? Provide an example to illustrate.
- Regularized linear models prevent overfitting by adding penalty terms to the loss function, which constrain the magnitude of coefficients. This helps in preventing the model from fitting the noise in the data. 
*Example*: In a dataset with many correlated features, without regularization, the model might assign arbitrarily large positive and negative weights to these features. Regularization prevents this by penalizing large coefficients.
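The example can be reproduced on a tiny made-up dataset with two almost identical features. The sketch below solves the two-weight least-squares problem in closed form, with and without an L2 penalty; the data, the penalty value, and the helper names are assumptions for illustration.

```python
def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def fit_two_weights(x1, x2, y, lam):
    """Minimize sum((y - w1*x1 - w2*x2)**2) + lam*(w1**2 + w2**2), no intercept.

    Solves the 2x2 normal equations (X^T X + lam*I) w = X^T y by Cramer's rule.
    """
    a, b, d = dot(x1, x1) + lam, dot(x1, x2), dot(x2, x2) + lam
    r1, r2 = dot(x1, y), dot(x2, y)
    det = a * d - b * b
    return (r1 * d - b * r2) / det, (a * r2 - b * r1) / det

x1 = [1.0, 2.0, 3.0, 4.0]
wiggle = [0.01, -0.01, 0.01, -0.01]               # tiny difference between features
x2 = [a + w for a, w in zip(x1, wiggle)]          # nearly a copy of x1
y = [2 * a + 10 * w for a, w in zip(x1, wiggle)]  # roughly 2*x1 plus small noise

w_ols = fit_two_weights(x1, x2, y, lam=0.0)    # about (-8, 10)
w_ridge = fit_two_weights(x1, x2, y, lam=1.0)  # about (0.98, 0.98)
```

Without a penalty, the fit explains the small noise with large weights of opposite sign. With the penalty, the weight is split almost evenly between the two near-duplicate features and sums to roughly 2.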

### Q8. Discuss the limitations of regularized linear models and explain why they may not always be the best choice for regression analysis.
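- They assume a linear relationship between predictors and outcome, so they cannot capture complex nonlinear relationships or interactions unless such terms are engineered by hand.
- The penalty shrinks coefficients and therefore introduces bias; this only pays off if the reduction in variance is larger.
- The regularization strength must be tuned (typically by cross-validation); a poor choice leads to underfitting or overfitting.
- The penalty depends on the scale of the features, so predictors usually need to be standardized first.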
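The linearity limitation can be shown with a deliberately simple, made-up case: a purely quadratic target on inputs symmetric around zero. The sketch fits an ordinary least-squares line; since the unpenalized slope is already zero, no amount of shrinkage could improve the fit.

```python
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [v ** 2 for v in x]  # made-up nonlinear target: y = x^2

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Simple least-squares line y = intercept + slope * x.
slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
         / sum((a - mean_x) ** 2 for a in x))
intercept = mean_y - slope * mean_x

pred = [intercept + slope * a for a in x]
ss_res = sum((b - p) ** 2 for b, p in zip(y, pred))
ss_tot = sum((b - mean_y) ** 2 for b in y)
r2 = 1 - ss_res / ss_tot  # 0.0: the line explains none of the variance
```

The best line is flat at the mean of `y`, even though `y` is a deterministic function of `x`. Adding `x**2` as an engineered feature would fix this, but the model cannot discover that on its own.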

### Q9. You are comparing the performance of two regression models using different evaluation metrics. Model A has an RMSE of 10, while Model B has an MAE of 8. Which model would you choose as the better performer, and why? Are there any limitations to your choice of metric?
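- Lower values of either metric indicate better performance, but an RMSE cannot be compared directly with an MAE: they aggregate errors differently, and for any set of predictions the MAE is never larger than the RMSE. Model A's MAE is therefore at most 10 and could well be below 8, so these two numbers alone do not show which model is better. A fair comparison evaluates both models with the same metric on the same held-out data. Which metric to use depends on the cost of errors: RMSE penalizes large errors more heavily and is more sensitive to outliers, whereas MAE weights all errors linearly and is more robust to outliers.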
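A short sketch with made-up residuals shows why the two reported numbers cannot be ranked against each other: a single set of errors can have an RMSE of 10 and an MAE of 8 at the same time.

```python
import math

def rmse(residuals):
    return math.sqrt(sum(r ** 2 for r in residuals) / len(residuals))

def mae(residuals):
    return sum(abs(r) for r in residuals) / len(residuals)

# Hypothetical residuals for one model: two large errors, two small ones.
residuals = [14.0, -14.0, 2.0, -2.0]

rmse_value = rmse(residuals)  # sqrt((196 + 196 + 4 + 4) / 4) = 10.0
mae_value = mae(residuals)    # (14 + 14 + 2 + 2) / 4 = 8.0
```

With these residuals, "RMSE of 10" and "MAE of 8" describe the same model, so the figures in the question are consistent with Model A being better than, worse than, or equal to Model B.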

### Q10. You are comparing the performance of two regularized linear models using different types of regularization. Model A uses Ridge regularization with a regularization parameter of 0.1, while Model B uses Lasso regularization with a regularization parameter of 0.5. Which model would you choose as the better performer, and why? Are there any trade-offs or limitations to your choice of regularization method?
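- The choice between Ridge and Lasso depends on the dataset and the problem, not on the penalty type or the parameter values alone. Lasso might be preferred if we believe many features are irrelevant, as it performs feature selection; Ridge tends to work better when most features carry some signal or are strongly correlated. A higher regularization parameter (alpha) means more regularization, but 0.1 for Ridge and 0.5 for Lasso are not directly comparable, because the two penalties are on different scales. Choosing between these models requires comparing them by cross-validation (or on a held-out set) with the same error metric. Trade-offs include the bias-variance trade-off and interpretability: Lasso yields sparser, easier-to-read models but may arbitrarily keep only one of several correlated features, while Ridge keeps all features.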
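The cross-validation comparison could be sketched as follows for a toy one-feature problem without an intercept. Everything here is an assumption made for illustration: the data are made up, the closed-form fits use the objective scalings written in the docstrings (which differ from those of common libraries), and the fold assignment is a simple interleaving.

```python
import math

def fit_ridge(xs, ys, alpha):
    """Minimize 0.5*sum((y - w*x)**2) + 0.5*alpha*w**2 for a single weight w."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + alpha)

def fit_lasso(xs, ys, alpha):
    """Minimize 0.5*sum((y - w*x)**2) + alpha*abs(w) for a single weight w."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    shrunk = max(abs(sxy) - alpha, 0.0)  # soft-thresholding
    return math.copysign(shrunk, sxy) / sxx

def cv_mse(fit, xs, ys, alpha, k=4):
    """Mean squared error over k interleaved cross-validation folds."""
    n = len(xs)
    total = 0.0
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        w = fit(train_x, train_y, alpha)
        total += sum((ys[i] - w * xs[i]) ** 2 for i in test_idx)
    return total / n

# Made-up data: y is roughly 2*x plus small fixed "noise" values.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3]
ys = [2 * x + e for x, e in zip(xs, noise)]

scores = {
    "ridge, alpha=0.1": cv_mse(fit_ridge, xs, ys, alpha=0.1),
    "lasso, alpha=0.5": cv_mse(fit_lasso, xs, ys, alpha=0.5),
}
best = min(scores, key=scores.get)  # model with the lower cross-validated MSE
```

The point of the sketch is the procedure, not the winner: both candidates are scored with the same metric on the same folds, and the decision follows from those scores rather than from the alpha values.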

