Q1. Explain the concept of R-squared in linear regression models. How is it calculated, and what does it
represent?

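Ans. R-squared (the coefficient of determination) measures the proportion of the variability in the dependent variable that a linear regression model explains. For a least-squares model with an intercept, evaluated on its training data, it ranges from 0 to 1, where 0 means the model explains none of the variability and 1 means it explains all of it. On held-out data, or for a model fitted without an intercept, it can be negative, which means the model predicts worse than simply using the mean.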

Calculation:

$R^2 = 1 - \dfrac{SS_{res}}{SS_{tot}}$

$SS_{res} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$: Sum of squared differences between observed and predicted values.
$SS_{tot} = \sum_{i=1}^{n} (y_i - \bar{y})^2$: Sum of squared differences between observed values and their mean.

What it Represents:

Goodness of Fit: Higher R-squared indicates a better fit.
Limitations: R-squared can be misleading; a higher value doesn't always mean the model is appropriate, especially if overfitting is a concern.
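
The calculation above can be sketched in a few lines of plain Python. The function name `r_squared` and the sample values are made up for illustration; the last two calls show that predicting the mean everywhere gives 0, and that predictions worse than the mean give a negative value.

```python
def r_squared(y_true, y_pred):
    """Return 1 - SS_res / SS_tot for two equal-length lists of numbers."""
    mean_y = sum(y_true) / len(y_true)
    # Sum of squared differences between observed and predicted values.
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # Sum of squared differences between observed values and their mean.
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot


# Illustrative values only (not taken from any real dataset).
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 6.5, 9.5]

r2_good = r_squared(y_true, y_pred)                  # SS_res = 1, SS_tot = 20
r2_mean = r_squared(y_true, [6.0, 6.0, 6.0, 6.0])    # always predict the mean
r2_bad = r_squared(y_true, [9.0, 7.0, 5.0, 3.0])     # predictions in reverse order

print(round(r2_good, 3), r2_mean, r2_bad)
```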

Q2. Define adjusted R-squared and explain how it differs from the regular R-squared.

Ans. Adjusted R-squared is a version of R-squared that accounts for the number of predictors in a model. Regular R-squared never decreases when a predictor is added, even an irrelevant one. Adjusted R-squared increases only if the new predictor improves the fit by more than would be expected by chance, and it decreases otherwise.

Key Difference:

- Regular R-squared: Measures fit but can be misleading with many predictors.
- Adjusted R-squared: Penalizes for unnecessary predictors, offering a more accurate measure of model performance.
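
As a minimal sketch, adjusted R-squared is usually computed as 1 - (1 - R^2)(n - 1)/(n - p - 1), where n is the number of observations and p is the number of predictors (excluding the intercept). The function name and the R-squared values below are hypothetical, chosen only to show the penalty at work: R-squared rises slightly after adding seven predictors, yet the adjusted value falls.

```python
def adjusted_r_squared(r2, n_samples, n_predictors):
    """Adjust R-squared for the number of predictors (intercept not counted)."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Hypothetical numbers: 50 observations, R^2 goes from 0.80 to 0.81
# when the model grows from 3 predictors to 10.
adj_small = adjusted_r_squared(0.80, n_samples=50, n_predictors=3)
adj_large = adjusted_r_squared(0.81, n_samples=50, n_predictors=10)

print(round(adj_small, 4), round(adj_large, 4))
```

The adjusted value is always below the unadjusted one when p is at least 1 and R-squared is below 1, and the gap widens as p approaches n.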

Q3. When is it more appropriate to use adjusted R-squared?

Ans. Adjusted R-squared is more appropriate for multiple regression models, where the number of predictors can vary. It is particularly useful in the following situations:

- Model Selection: When comparing models with a different number of predictors, adjusted R-squared helps determine which model has a better fit without overfitting.

- Avoiding Overfitting: In models with many predictors, adjusted R-squared penalizes the addition of irrelevant variables, providing a more realistic assessment of the model's performance.

- Iterating on Models: When you are experimenting with adding or removing variables, adjusted R-squared indicates whether each change improves the fit enough to justify the extra predictor. The adjustment matters most when the number of predictors is large relative to the number of observations.

Q4. What are RMSE, MSE, and MAE in the context of regression analysis? How are these metrics
calculated, and what do they represent?

Ans. MSE (Mean Square Error): Measures the average squared difference between actual and predicted values. It penalizes larger errors more.

Formula: $\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2$

RMSE (Root Mean Square Error): The square root of MSE, bringing the error metric back to the original units. It can be read as a typical error size in which larger errors carry more weight.

Formula: $\mathrm{RMSE} = \sqrt{\mathrm{MSE}}$

MAE (Mean Absolute Error): Measures the average absolute difference between actual and predicted values. It weights each error in proportion to its size rather than its square, so it is less sensitive to outliers than MSE or RMSE.

Formula: $\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} |y_i - \hat{y}_i|$
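
The three formulas can be checked by hand on a small example. This is a sketch in plain Python; the function names and the four data points are illustrative assumptions, not part of the question.

```python
import math


def mse(y_true, y_pred):
    """Mean of the squared differences."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Square root of MSE, in the same units as y."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean of the absolute differences."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative values: the errors are -1, 0, 2 and -4.
y_true = [10.0, 12.0, 15.0, 20.0]
y_pred = [11.0, 12.0, 13.0, 24.0]

mse_value = mse(y_true, y_pred)    # (1 + 0 + 4 + 16) / 4 = 5.25
rmse_value = rmse(y_true, y_pred)  # sqrt(5.25), roughly 2.29
mae_value = mae(y_true, y_pred)    # (1 + 0 + 2 + 4) / 4 = 1.75

print(mse_value, round(rmse_value, 2), mae_value)
```

The single error of 4 accounts for 16 of the 21 squared units, which is why RMSE ends up above MAE here.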



Q5. Discuss the advantages and disadvantages of using RMSE, MSE, and MAE as evaluation metrics in
regression analysis.

Ans. MSE (Mean Square Error)

Advantages:

- Penalizes Large Errors: Squaring the errors emphasizes larger discrepancies between actual and predicted values, which can be useful if large errors are particularly undesirable.
- Differentiable: MSE is differentiable, making it useful for optimization algorithms in training models.

Disadvantages:

- Sensitive to Outliers: Since it squares the errors, MSE can be heavily influenced by outliers, which may not always be desirable.
- Less Intuitive: The units of MSE are the square of the original units, making it less interpretable in the context of the data.

RMSE (Root Mean Square Error)

Advantages:

- Interpretable Units: RMSE is in the same units as the original data, making it easier to understand and interpret.
- Penalizes Large Errors: Like MSE, RMSE also emphasizes larger errors due to the squaring but provides results in a more interpretable scale.

Disadvantages:

- Sensitive to Outliers: Similar to MSE, RMSE can be disproportionately affected by large errors.
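- Harder to Read as a Typical Error: Because of the squaring, RMSE is always at least as large as MAE, and the gap grows with the spread of the errors, so it should not be read as the average error size.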

MAE (Mean Absolute Error)

Advantages:

- Robust to Outliers: MAE is less sensitive to outliers because each error contributes in proportion to its absolute size rather than its square.
- Intuitive Interpretation: The units are the same as the original data, making MAE easy to understand.

Disadvantages:

- Does Not Penalize Large Errors: MAE does not emphasize larger errors as much as MSE or RMSE, which might be a drawback if large errors are particularly problematic.
- Non-Differentiable: MAE is not differentiable where the error is zero, which can make it less suitable for certain gradient-based optimization algorithms.
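
The differentiability point can be made concrete by looking at the per-sample gradient with respect to the residual. This is a sketch with made-up function names: the squared error has gradient 2r, which grows with the error, while the absolute error has slope +1 or -1 regardless of size and no unique slope at r = 0 (returning 0 there is one common subgradient choice, assumed here).

```python
def squared_error_grad(residual):
    """Derivative of residual**2 with respect to the residual."""
    return 2.0 * residual


def absolute_error_subgrad(residual):
    """A subgradient of abs(residual); 0.0 is one valid choice at residual == 0."""
    if residual > 0:
        return 1.0
    if residual < 0:
        return -1.0
    return 0.0


residuals = [-4.0, -0.5, 0.0, 0.5, 4.0]
mse_grads = [squared_error_grad(r) for r in residuals]
mae_grads = [absolute_error_subgrad(r) for r in residuals]

print(mse_grads)  # [-8.0, -1.0, 0.0, 1.0, 8.0]
print(mae_grads)  # [-1.0, -1.0, 0.0, 1.0, 1.0]
```

The same contrast explains both trade-offs: the MSE gradient lets large errors dominate the fit, while the MAE gradient ignores how large an error is and jumps abruptly as the residual crosses zero.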

Q6. Explain the concept of Lasso regularization. How does it differ from Ridge regularization, and when is
it more appropriate to use?

Ans. 
Lasso Regularization (L1):

- Concept: Adds a penalty proportional to the absolute value of coefficients, which can shrink some coefficients to zero (feature selection).
- Use When: You suspect only a few predictors are important and want to perform feature selection.

Ridge Regularization (L2):

- Concept: Adds a penalty proportional to the square of coefficients, shrinking all coefficients but not setting any to zero.
- Use When: You believe all predictors contribute to the outcome and want to prevent overfitting without excluding features.

Key Difference: Lasso can set some coefficients to zero (sparse solutions), while Ridge shrinks coefficients but keeps all predictors in the model.
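
The difference is easiest to see in a simplified setting. The sketch below assumes orthonormal feature columns, so each coefficient can be treated separately, and assumes the objectives (1/2)||y - Xb||^2 + lam * sum|b_j| for Lasso and (1/2)||y - Xb||^2 + (lam/2) * sum b_j^2 for Ridge. Under those assumptions Lasso soft-thresholds each least-squares coefficient and Ridge divides it by (1 + lam). The function names and coefficient values are hypothetical.

```python
def lasso_shrink(ols_coef, lam):
    """Soft-thresholding: move the coefficient toward zero by lam, stopping at zero."""
    if abs(ols_coef) <= lam:
        return 0.0
    return ols_coef - lam if ols_coef > 0 else ols_coef + lam


def ridge_shrink(ols_coef, lam):
    """Proportional shrinkage: scale the coefficient by 1 / (1 + lam)."""
    return ols_coef / (1.0 + lam)


# Hypothetical least-squares coefficients and penalty strength.
ols_coefs = [3.0, -0.4, 0.2, -1.5]
lam = 0.5

lasso_coefs = [lasso_shrink(b, lam) for b in ols_coefs]
ridge_coefs = [ridge_shrink(b, lam) for b in ols_coefs]

print(lasso_coefs)  # [2.5, 0.0, 0.0, -1.0]
print([round(b, 3) for b in ridge_coefs])
```

The two small coefficients are set exactly to zero by the Lasso rule, while the Ridge rule makes every coefficient smaller but leaves all of them nonzero. With correlated features there is no such closed form, and the coefficients are found numerically.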

Q7. How do regularized linear models help to prevent overfitting in machine learning? Provide an
example to illustrate.

Ans. Regularized linear models prevent overfitting by adding a penalty to the size of the model’s coefficients, reducing complexity and improving generalization.

Examples:
Lasso Regularization: Encourages the model to set some coefficients to zero, selecting only the most relevant features and simplifying the model.

Ridge Regularization: Shrinks all coefficients, reducing their magnitude and stabilizing the model, especially in the presence of multicollinearity.

In both cases, regularization helps the model avoid fitting noise in the training data, leading to better performance on new, unseen data.
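
A toy illustration, under stated assumptions: a one-feature Ridge model without an intercept, where minimizing sum (y - b*x)^2 + lam * b^2 gives b = sum(x*y) / (sum(x*x) + lam). The numbers are invented so that noise in three training points inflates the slope above the true value of 1. They show the mechanism and are not evidence that regularization always helps.

```python
def ridge_slope(xs, ys, lam):
    """Closed-form Ridge slope for a single feature and no intercept."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def mse(xs, ys, slope):
    """Mean squared error of the predictions slope * x."""
    return sum((y - slope * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Invented data: the true relationship is y = x, and the training targets are noisy.
x_train, y_train = [1.0, 2.0, 3.0], [1.5, 1.5, 4.0]
x_test, y_test = [4.0, 5.0, 6.0], [4.0, 5.0, 6.0]

results = {}
for lam in (0.0, 2.0, 50.0):
    slope = ridge_slope(x_train, y_train, lam)
    results[lam] = (slope, mse(x_train, y_train, slope), mse(x_test, y_test, slope))
    print(lam, [round(v, 3) for v in results[lam]])
```

With lam = 0 (ordinary least squares) the training error is lowest but the slope overshoots. A moderate penalty (lam = 2) raises the training error slightly and lowers the test error. A very large penalty (lam = 50) shrinks the slope so far that the model underfits and the test error becomes much worse.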

Q8. Discuss the limitations of regularized linear models and explain why they may not always be the best
choice for regression analysis.

Ans. Limitations of Regularized Linear Models:

- Linear Assumption: They assume a linear relationship, which may not fit non-linear data well.
- Feature Scaling: They require features to be scaled; otherwise, regularization may unfairly penalize some features.
- Limited Feature Selection: Lasso performs feature selection but may not always choose the best subset; Ridge keeps all features, which may not be ideal.
- Interactions: They do not capture interactions between features unless interaction terms are added explicitly.
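- Multicollinearity: Ridge helps but doesn’t fully solve issues with highly correlated features, and Lasso tends to keep one feature from a correlated group and drop the others somewhat arbitrarily.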
- Over-regularization Risk: Too much regularization can lead to underfitting.

In cases where the data is non-linear or features interact in complex ways, other models might be more effective.

Q9. You are comparing the performance of two regression models using different evaluation metrics.
Model A has an RMSE of 10, while Model B has an MAE of 8. Which model would you choose as the better
performer, and why? Are there any limitations to your choice of metric?

Ans. When comparing Model A and Model B, it's essential to consider the context of your evaluation metrics:

- Model A (RMSE = 10): RMSE measures the average magnitude of errors but gives more weight to larger errors due to squaring. It's sensitive to outliers.

- Model B (MAE = 8): MAE measures the average absolute error and weights errors in proportion to their size, making it less sensitive to outliers than RMSE.

Choosing the Better Model:

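Context: The two numbers cannot be compared directly, because they come from different metrics. For any set of errors, RMSE is at least as large as MAE, so Model A's MAE is at most 10 and could be below 8, while Model B's RMSE is at least 8 and could be above 10. A fair comparison requires computing the same metric for both models on the same test data. Use RMSE if large errors are particularly costly; use MAE if errors should count in proportion to their size and robustness to outliers matters.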

Limitations of Metrics:

- RMSE: Can be disproportionately affected by outliers and might not reflect the model's overall performance if outliers are not a significant concern.
- MAE: Can hide occasional large errors, since a model that is accurate on most points but badly wrong on a few can still have a low MAE.

Model Selection: Evaluate both models with the same metric on the same data, and choose that metric by the cost of large errors: RMSE if they are especially harmful, MAE for a measure that is more robust to outliers. From an RMSE of 10 and an MAE of 8 alone, neither model can be declared the better performer.
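
A small sketch shows why the two reported numbers do not settle the question. The residuals below are hypothetical, constructed only so that Model A has an RMSE of 10 and Model B has an MAE of 8 as in the question. Once both metrics are computed for both models, each metric picks a different winner.

```python
import math


def rmse(residuals):
    """Root mean square of a list of prediction errors."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def mae(residuals):
    """Mean absolute value of a list of prediction errors."""
    return sum(abs(r) for r in residuals) / len(residuals)


# Hypothetical residuals consistent with the question's numbers.
residuals_a = [0.0, 0.0, 0.0, 20.0]  # mostly exact, one large miss
residuals_b = [8.0, 8.0, 8.0, 8.0]   # consistently off by 8

rmse_a, mae_a = rmse(residuals_a), mae(residuals_a)
rmse_b, mae_b = rmse(residuals_b), mae(residuals_b)

print("Model A: RMSE =", rmse_a, "MAE =", mae_a)  # RMSE = 10.0, MAE = 5.0
print("Model B: RMSE =", rmse_b, "MAE =", mae_b)  # RMSE = 8.0, MAE = 8.0
```

In this constructed case Model A wins on MAE (5 versus 8) and Model B wins on RMSE (8 versus 10), so the choice depends on which kind of error matters more.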

Q10. You are comparing the performance of two regularized linear models using different types of
regularization. Model A uses Ridge regularization with a regularization parameter of 0.1, while Model B
uses Lasso regularization with a regularization parameter of 0.5. Which model would you choose as the
better performer, and why? Are there any trade-offs or limitations to your choice of regularization
method?

Ans. When comparing Model A (Ridge regularization) and Model B (Lasso regularization), consider the following factors:

Model Comparison:

Model A (Ridge Regularization, λ=0.1):

- Characteristics: Ridge regularization penalizes the square of the coefficients, which shrinks them but keeps all features in the model.
- Use Case: Suitable for cases where you believe most features are relevant but need to reduce their impact. It helps handle multicollinearity.

Model B (Lasso Regularization, λ=0.5):

- Characteristics: Lasso regularization penalizes the absolute value of the coefficients, potentially setting some coefficients to zero, which performs feature selection.
- Use Case: Ideal when you suspect that only a subset of features are significant and want to simplify the model by excluding less relevant features.

Choosing the Better Performer:

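- Feature Importance: If feature selection is important and you need a simpler model with fewer predictors, Model B (Lasso) may be better.
- All Features Relevant: If you believe all features contribute to the outcome and you need to reduce the impact of each feature without excluding any, Model A (Ridge) may be more suitable.
- Measured Performance: The information given does not settle the question. The two λ values are not comparable, because they scale different penalties, and neither says how well the model predicts. Compare both models with the same metric on held-out data, ideally after tuning λ for each by cross-validation, before choosing.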

Trade-Offs and Limitations:

Ridge Regularization (Model A):

- Pros: Keeps all features, useful for handling multicollinearity.
- Cons: Does not perform feature selection; all features remain in the model.

Lasso Regularization (Model B):

- Pros: Performs feature selection, resulting in a simpler and potentially more interpretable model.
- Cons: May remove useful features; less effective if all features contribute to the outcome.

