### Q1: **R-squared in Linear Regression Models**
R-squared (R²) is a statistical measure that represents the proportion of the variance in the dependent variable (target) that is explained by the independent variables (features) in a linear regression model. It is calculated using the formula:

\[
R^2 = 1 - \frac{SS_{res}}{SS_{tot}}
\]

Where:
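- \(SS_{res} = \sum_i (y_i - \hat{y}_i)^2\) is the sum of squared residuals (the squared differences between actual and predicted values),
- \(SS_{tot} = \sum_i (y_i - \bar{y})^2\) is the total sum of squares (the squared differences between the actual values and their mean, which is proportional to the variance of the actual values).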

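For a least-squares model with an intercept, evaluated on the data it was fitted to, R-squared ranges from 0 to 1, where:
- \(R^2 = 1\) means the model perfectly predicts all the variance in the target variable,
- \(R^2 = 0\) means the model explains none of the variance, i.e. it does no better than always predicting the mean of the target,
- \(R^2 < 0\) is possible on held-out data or for a model without an intercept, and means the model predicts worse than the mean.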
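
As a minimal sketch of the formula above, the following plain-Python function computes R-squared from two lists. The function name `r_squared` and the sample values are illustrative assumptions, not part of the original question.

```python
def r_squared(y_true, y_pred):
    """Return 1 - SS_res / SS_tot for two equal-length sequences."""
    mean_y = sum(y_true) / len(y_true)
    # Sum of squared residuals: actual minus predicted.
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))
    # Total sum of squares: actual minus mean of actual.
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot


# Made-up values for illustration only.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 9.0]

score = r_squared(y_true, y_pred)           # SS_res = 0.5, SS_tot = 20
baseline = r_squared(y_true, [6.0] * 4)     # always predicting the mean
print(score, baseline)
```

Here `score` comes out at 0.975, while the constant mean prediction gives 0, which matches the interpretation of the two endpoints.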

### Q2: **Adjusted R-squared**
Adjusted R-squared is a modified version of R-squared that adjusts for the number of predictors in the model. On the training data, R-squared never decreases when a feature is added, even if that feature is uninformative. Adjusted R-squared accounts for model complexity and penalizes the addition of unnecessary features, so it can decrease when a new feature does not improve the fit enough. It is calculated as:

\[
\text{Adjusted } R^2 = 1 - \left( \frac{(1 - R^2)(n - 1)}{n - p - 1} \right)
\]

Where:
- \(n\) is the number of observations,
- \(p\) is the number of predictors.
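
The adjustment can be sketched in a few lines of Python. The function name `adjusted_r_squared` and the numbers used below are hypothetical and only meant to show how the penalty grows with \(p\).

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 = 1 - (1 - R^2)(n - 1) / (n - p - 1)."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Same R^2 and sample size, different numbers of predictors (made-up values).
few_features = adjusted_r_squared(0.90, n=50, p=5)
many_features = adjusted_r_squared(0.90, n=50, p=20)
print(few_features, many_features)
```

With the same R-squared of 0.90, the model with 20 predictors receives a lower adjusted value than the model with 5, because the denominator \(n - p - 1\) shrinks as \(p\) grows.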

### Q3: **When to Use Adjusted R-squared**
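Adjusted R-squared is more appropriate when comparing models with different numbers of features. Plain R-squared would always favor the model with more features on the training data, whereas Adjusted R-squared only rewards an extra feature if it improves the fit by more than would be expected by chance.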

### Q4: **RMSE, MSE, and MAE**
These are metrics used to evaluate the performance of regression models:

- **RMSE (Root Mean Squared Error)**: The square root of the average of squared errors between actual and predicted values.
  \[
  RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y_i})^2}
  \]
- **MSE (Mean Squared Error)**: The average of the squared errors.
  \[
  MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y_i})^2
  \]
- **MAE (Mean Absolute Error)**: The average of the absolute differences between actual and predicted values.
  \[
  MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y_i}|
  \]

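All three metrics measure the typical size of the prediction error, so lower values indicate a better fit. MSE and RMSE square the errors and therefore penalize large errors disproportionately, while MAE weights each error in proportion to its magnitude. RMSE and MAE are expressed in the units of the target variable; MSE is expressed in squared units.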
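
A minimal sketch of the three formulas in plain Python follows. The function names and the sample data are assumptions made for illustration.

```python
import math


def mse(y_true, y_pred):
    """Mean of squared errors."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Square root of the MSE, in the units of the target."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean of absolute errors."""
    return sum(abs(y - y_hat) for y, y_hat in zip(y_true, y_pred)) / len(y_true)


# Made-up values: three small errors and one large error of size 4.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 7.0, 13.0]

print(mse(y_true, y_pred), rmse(y_true, y_pred), mae(y_true, y_pred))
```

For this data the MSE is 4.25 and the MAE is 1.25. The single error of size 4 contributes 16 of the 17 units of squared error but only 4 of the 5 units of absolute error.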

### Q5: **Advantages and Disadvantages of RMSE, MSE, and MAE**
- **RMSE**: 
  - *Advantage*: Heavily penalizes large errors, making it useful for applications where large errors are particularly undesirable.
  - *Disadvantage*: Sensitive to outliers, as squaring the error magnifies their effect.
  
- **MSE**:
  - *Advantage*: Simple to calculate and differentiable, which is useful for gradient-based optimization.
  - *Disadvantage*: Like RMSE, it is sensitive to outliers, and because it is expressed in squared units of the target it is harder to interpret directly.

- **MAE**:
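  - *Advantage*: Less sensitive to outliers, because errors are not squared and each one contributes in proportion to its size. It is also in the same units as the target.
  - *Disadvantage*: Does not penalize large errors as strongly as RMSE or MSE, and it is not differentiable at zero, which can complicate gradient-based optimization.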

### Q6: **Lasso Regularization**
Lasso (Least Absolute Shrinkage and Selection Operator) is a type of linear regression that adds a penalty on the absolute values of the coefficients (an L1 penalty) to the objective function. The resulting regularized loss is:

\[
L(\theta) = \text{MSE} + \lambda \sum_{j=1}^{p} |\theta_j|
\]

Lasso can shrink some coefficients to exactly zero, effectively performing feature selection. This contrasts with **Ridge regularization**, which adds a penalty on the squared values of the coefficients (an L2 penalty). Ridge shrinks coefficients toward zero but generally does not set them exactly to zero:

\[
L(\theta) = \text{MSE} + \lambda \sum_{j=1}^{p} \theta_j^2
\]
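
The difference between the two penalties is easiest to see in a simplified setting. The sketch below assumes a single feature, no intercept, and a feature scaled so that \(\frac{1}{n}\sum_i x_i^2 = 1\). Under these assumptions both objectives above have closed-form minimizers in terms of the unpenalized least-squares coefficient `b`. The function names are hypothetical.

```python
def lasso_1d(b, lam):
    """Minimize theta^2 - 2*b*theta + lam*|theta| (soft-thresholding)."""
    shrunk = abs(b) - lam / 2
    if shrunk <= 0:
        return 0.0  # small coefficients are set exactly to zero
    return shrunk if b > 0 else -shrunk


def ridge_1d(b, lam):
    """Minimize theta^2 - 2*b*theta + lam*theta^2 (proportional shrinkage)."""
    return b / (1 + lam)


lam = 0.5
for b in (0.2, 1.0, -3.0):  # made-up unpenalized coefficients
    print(b, lasso_1d(b, lam), ridge_1d(b, lam))
```

In this setting Lasso subtracts a fixed amount from the magnitude of each coefficient and clips at zero, so the small coefficient 0.2 becomes exactly 0. Ridge divides every coefficient by the same factor, so no nonzero coefficient ever reaches zero.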

### Q7: **Regularized Linear Models and Overfitting**
Regularized models, like Lasso and Ridge, penalize large coefficients, which discourages the model from fitting noise in the training data (overfitting). The penalty trades a small increase in bias for a reduction in variance. For example, a Lasso model can reduce the coefficients of irrelevant features to zero, simplifying the model and reducing overfitting.

#### Example:
In a high-dimensional dataset, Ridge regression helps avoid overfitting by shrinking coefficients, making the model generalize better to unseen data.

### Q8: **Limitations of Regularized Linear Models**
- **Not suited for all problems**: Regularized linear models still assume a linear relationship between features and target, which may not be appropriate for non-linear datasets.
- **Feature selection bias**: Lasso tends to keep one feature from a group of correlated features and zero out the others, and which one it keeps can be arbitrary and unstable across samples.
- **Hyperparameter tuning**: Regularization requires careful tuning of the regularization parameter \(\lambda\), typically by cross-validation, which can be computationally expensive.
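
The tuning step can be sketched as a simple search over candidate values of \(\lambda\), scored on held-out data. The example below is a deliberately small illustration and not a full cross-validation procedure: it assumes one feature, no intercept, the Ridge objective \(\text{MSE} + \lambda \theta^2\), and made-up data and candidate values.

```python
def fit_ridge_1d(x, y, lam):
    """Closed-form minimizer of MSE + lam * theta^2 for one feature."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + len(x) * lam)


def mse(x, y, theta):
    """Mean squared error of the prediction theta * x."""
    return sum((yi - theta * xi) ** 2 for xi, yi in zip(x, y)) / len(x)


# Made-up training and validation splits.
x_train, y_train = [1.0, 2.0, 3.0, 4.0], [1.2, 1.9, 3.2, 3.8]
x_val, y_val = [5.0, 6.0], [5.1, 5.9]

candidates = [0.0, 0.01, 0.1, 1.0, 10.0]
# Fit once per candidate on the training split, score on the validation split.
scores = {
    lam: mse(x_val, y_val, fit_ridge_1d(x_train, y_train, lam))
    for lam in candidates
}
best_lam = min(scores, key=scores.get)
print(best_lam, scores[best_lam])
```

The cost grows with the number of candidates because the model is refitted for each one. With k-fold cross-validation it is multiplied again by the number of folds.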

### Q9: **Choosing Between Models A and B (RMSE vs. MAE)**
- **Model A** has an RMSE of 10, and **Model B** has an MAE of 8.
- MAE is more robust to outliers, while RMSE penalizes larger errors more heavily.
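- These two numbers cannot be compared directly, because they are different metrics. For any set of predictions, RMSE is at least as large as MAE, so Model B's RMSE could be either below or above 10, and Model A's MAE could be below 8.
- A fair comparison computes the same metric for both models on the same data. If large errors are especially costly in the application, compare the models on RMSE; if robustness to outliers matters more, compare them on MAE.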
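
To see why an RMSE and an MAE should not be compared across models, consider two hypothetical sets of errors with the same MAE. The values below are made up for illustration.

```python
import math


def mae(errors):
    """Mean absolute error from a list of residuals."""
    return sum(abs(e) for e in errors) / len(errors)


def rmse(errors):
    """Root mean squared error from a list of residuals."""
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))


steady = [8.0, 8.0, 8.0, 8.0]   # every prediction is off by 8
spiky = [0.0, 0.0, 0.0, 32.0]   # three perfect predictions, one large miss

print(mae(steady), rmse(steady))
print(mae(spiky), rmse(spiky))
```

Both error sets have an MAE of 8, but the RMSE is 8 for the steady errors and 16 for the spiky ones. Knowing only that a model has an MAE of 8 therefore does not tell you whether its RMSE is below or above 10.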

### Q10: **Choosing Between Ridge and Lasso Regularization**
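- **Model A (Ridge, \(\lambda = 0.1\))** and **Model B (Lasso, \(\lambda = 0.5\))** offer different trade-offs. The two \(\lambda\) values are not directly comparable, because they scale different penalties (squared versus absolute coefficients), so a larger \(\lambda\) for Lasso does not by itself mean stronger regularization.
- **Ridge** retains all features but shrinks coefficients, while **Lasso** can perform feature selection by setting some coefficients to zero.
- If feature selection is important (i.e., you want a simpler model), **Lasso** may be preferable. However, if you are concerned about multicollinearity or don't want to discard features, **Ridge** is a better choice.
- Ideally, \(\lambda\) is tuned separately for each model and the two are compared on the same held-out data using the same metric.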

In practice, the choice between the two depends on whether the dataset benefits from feature selection or coefficient shrinkage.