<a href="https://colab.research.google.com/github/golu628/assignment/blob/main/lasso_regression.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Q1. R-squared (R²) Explained

Concept: R-squared is a statistical measure used in linear regression to assess how well the regression line fits the data points. It represents the proportion of the variance in the dependent variable (Y) that can be explained by the independent variable(s) (X).

Calculation: R-squared is calculated as 1 minus the ratio of the residual sum of squares (the unexplained variation) to the total sum of squares of the dependent variable around its mean: R² = 1 - SS_res / SS_tot.

Interpretation:

A value closer to 1 indicates a better fit, meaning the model explains a higher proportion of the variance in Y.
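A value of 0 means the model explains none of the variance, which is what you get by always predicting the mean of Y. A negative value indicates a worse fit than predicting the mean of Y; this cannot happen for ordinary least squares with an intercept scored on its own training data, but it can happen on held-out data or for a model fitted without an intercept.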
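
As a minimal sketch of the calculation and interpretation above, the snippet below computes R² from sums of squares in plain Python. The function name `r_squared` and the sample values are illustrative assumptions, not taken from the notebook.

```python
def r_squared(y_true, y_pred):
    """R² = 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Unexplained variation: squared residuals of the model
    ss_res = sum((y - yhat) ** 2 for y, yhat in zip(y_true, y_pred))
    # Total variation of Y around its mean
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot

y_true = [1.0, 2.0, 3.0, 4.0]        # made-up observations
perfect_pred = [1.0, 2.0, 3.0, 4.0]  # reproduces Y exactly
mean_pred = [2.5, 2.5, 2.5, 2.5]     # always predicts the mean of Y
bad_pred = [4.0, 3.0, 2.0, 1.0]      # trend reversed

r2_perfect = r_squared(y_true, perfect_pred)  # 1.0
r2_mean = r_squared(y_true, mean_pred)        # 0.0
r2_bad = r_squared(y_true, bad_pred)          # -3.0, worse than the mean
```

The three cases correspond to the three interpretations: a perfect fit, a fit no better than the mean, and a fit worse than the mean.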

Q2. Adjusted R-squared (Adjusted R²)

Definition: Adjusted R-squared is a modification of R-squared that penalizes for the number of features (independent variables) in the model. It accounts for the model's complexity and helps to compare models with different numbers of features.

Difference from R-squared: R-squared can increase simply by adding more features, even if they are not truly explanatory. Adjusted R-squared addresses this by adjusting the R-squared value based on the sample size and number of features.
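
The adjustment is commonly written as Adjusted R² = 1 - (1 - R²)(n - 1) / (n - p - 1), where n is the sample size and p is the number of features. The sketch below applies that formula; the function name and the R² values are hypothetical numbers chosen only to illustrate the penalty.

```python
def adjusted_r_squared(r2, n_samples, n_features):
    """Adjusted R² = 1 - (1 - R²) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_features - 1)

# Hypothetical comparison: the larger model has a slightly higher R²,
# but it uses 10 features instead of 3 on the same 50 samples.
adj_small = adjusted_r_squared(0.80, n_samples=50, n_features=3)
adj_large = adjusted_r_squared(0.81, n_samples=50, n_features=10)

print(round(adj_small, 4), round(adj_large, 4))
```

With these numbers the larger model wins on plain R² but loses on adjusted R², because its extra features do not add enough explanatory power to offset the penalty.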

Q3. When to Use Adjusted R-squared

Adjusted R-squared is more appropriate when comparing models with different numbers of features. It helps to identify the model that explains the data better while considering its complexity. For models with the same number of features, regular R-squared can be sufficient.

Q4. Error Metrics: RMSE, MSE, MAE

These metrics quantify the difference between the predicted values (Yhat) from the model and the actual values (Y) in the data.

Root Mean Squared Error (RMSE):

Calculation: Square the residuals (Y - Yhat) for each data point, calculate the average of those squared values, and then take the square root.
Represents: The typical size of the errors, in the same units as Y. Because the errors are squared before averaging, large errors count for more than small ones.

Mean Squared Error (MSE):

Calculation: Similar to RMSE, but without taking the square root of the final average.
Represents: The average squared difference between predicted and actual values, in squared units of Y. It is harder to read directly than RMSE but convenient in calculations involving variances.

Mean Absolute Error (MAE):

Calculation: Take the absolute difference between predicted and actual values for each data point, and then calculate the average.
Represents: The average magnitude of the errors, focusing on the absolute differences rather than squares. Less sensitive to outliers than RMSE or MSE.
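
The three calculations described above can be sketched in a few lines of plain Python. The function names and the sample data are assumptions made for illustration.

```python
import math

def mse(y_true, y_pred):
    # Average of the squared residuals (Y - Yhat)
    return sum((y - yhat) ** 2 for y, yhat in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    # Square root of MSE, which brings the value back to the units of Y
    return math.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    # Average of the absolute residuals
    return sum(abs(y - yhat) for y, yhat in zip(y_true, y_pred)) / len(y_true)

y_true = [3.0, 5.0, 7.0, 9.0]    # made-up actual values
y_pred = [2.0, 5.0, 8.0, 13.0]   # made-up predictions; residuals are 1, 0, -1, -4

mse_value = mse(y_true, y_pred)    # 4.5
rmse_value = rmse(y_true, y_pred)  # about 2.12
mae_value = mae(y_true, y_pred)    # 1.5
```

The single residual of -4 pulls RMSE well above MAE, which previews the outlier sensitivity of the squared metrics.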

Q5. Advantages and Disadvantages of Error Metrics

RMSE:

Advantages: Expressed in the same units as Y, and penalizes large errors more heavily than small ones, which is useful when large misses are especially costly.
Disadvantages: Sensitive to outliers, as squaring large errors amplifies their impact.

MSE:

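Advantages: Mathematically convenient: it is differentiable everywhere and fits naturally into variance-based calculations, which is why it is the usual training objective for least squares.
Disadvantages: Shares RMSE's sensitivity to outliers, and is expressed in squared units of Y, which makes it harder to interpret directly.

MAE: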

Advantages: Less sensitive to outliers than RMSE or MSE, and easy to interpret as the average magnitude of the errors in the units of Y.
Disadvantages: Treats all errors in proportion to their size, so it does not flag occasional large errors the way RMSE does. It is also not differentiable at zero, which can complicate gradient-based optimization.

Choosing the Right Metric:

The best metric depends on your specific needs. If outliers should not dominate the score, use MAE. If large errors are especially costly and should be penalized more heavily, consider RMSE. MSE is often used for mathematical convenience in calculations.
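
To make the outlier sensitivity concrete, the sketch below compares MAE and RMSE on two made-up sets of residuals that differ in a single value. The helper names and the data are assumptions for illustration only.

```python
import math

def mae_from_residuals(residuals):
    return sum(abs(r) for r in residuals) / len(residuals)

def rmse_from_residuals(residuals):
    return math.sqrt(sum(r ** 2 for r in residuals) / len(residuals))

clean = [1.0, -1.0, 1.0, -1.0]         # all errors have magnitude 1
with_outlier = [1.0, -1.0, 1.0, -9.0]  # one error replaced by an outlier

mae_clean = mae_from_residuals(clean)            # 1.0
rmse_clean = rmse_from_residuals(clean)          # 1.0
mae_out = mae_from_residuals(with_outlier)       # 3.0
rmse_out = rmse_from_residuals(with_outlier)     # about 4.58
```

One outlier triples MAE but raises RMSE by a factor of roughly 4.6, so RMSE reacts more strongly to the same change.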

Q6. Lasso vs. Ridge Regularization

These techniques address overfitting in linear regression by penalizing the model's complexity.

Lasso Regularization:

Adds a penalty term to the cost function that is the sum of the absolute values of the regression coefficients.
Can shrink some coefficients exactly to zero, effectively removing those features from the model.
Useful for feature selection when some features might be irrelevant or redundant.

Ridge Regularization:

Adds a penalty term to the cost function that is the sum of the squared values of the regression coefficients.
Shrinks all coefficients towards zero, but generally not exactly to zero.
Useful when most features are expected to be relevant, or when features are correlated with each other and their coefficients need to be stabilized.
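
The difference between the two penalties can be illustrated with a small coordinate-descent sketch in plain Python. It assumes no intercept, the unnormalized objective 0.5 * sum of squared residuals plus the penalty, and a tiny made-up dataset with two orthogonal features where the second feature is nearly irrelevant. The function names `soft_threshold` and `fit` are hypothetical, and library implementations may scale the objective differently.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; return exactly 0.0 when |rho| <= lam."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def fit(X, y, lam, penalty, n_iter=100):
    """Coordinate descent for 0.5 * sum(residual^2) + penalty (no intercept)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # rho: how strongly feature j correlates with the residual
            # computed from all the other features
            rho = 0.0
            for i in range(n):
                partial = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
            z = sum(X[i][j] ** 2 for i in range(n))
            if penalty == "lasso":   # penalty: lam * sum(|w_j|)
                w[j] = soft_threshold(rho, lam) / z
            else:                    # ridge penalty: 0.5 * lam * sum(w_j^2)
                w[j] = rho / (z + lam)
    return w

# Made-up data: y = 3 * x1 + 0.1 * x2, with orthogonal feature columns
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [3.1, -2.9, 2.9, -3.1]

lasso_w = fit(X, y, lam=1.0, penalty="lasso")
ridge_w = fit(X, y, lam=1.0, penalty="ridge")
print("lasso:", lasso_w)
print("ridge:", ridge_w)
```

On this data Lasso sets the second coefficient exactly to zero, while Ridge shrinks both coefficients but keeps the second one small and nonzero.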

Q7. Regularization to Prevent Overfitting (Example)

Imagine a model fitting a random squiggly line through all the data points in the training set. This might have a high R-squared on the training data, but when you use it to predict for new unseen data points, it will likely perform poorly because it's too focused on fitting the noise in the training data rather than the underlying trend.

Regularization techniques like Lasso or Ridge add a penalty for large coefficients, and Lasso can also drop features entirely. This discourages the model from fitting the random noise and encourages it to capture the general trend, leading to better generalization on unseen data.

Q8. Limitations of Regularized Linear Models

While regularized models help prevent overfitting, they have limitations:

Bias-Variance Trade-off: Regularization reduces variance (overfitting) but can introduce bias. By shrinking coefficients, the model might underfit the data, leading to systematic errors in predictions. Finding the right balance between reducing variance and avoiding bias is crucial.
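Feature Selection Issues: Lasso performs feature selection by setting some coefficients to zero. This can be helpful, but it might also remove relevant features if they are weakly correlated with the target variable. When several features are strongly correlated with each other, Lasso also tends to keep one and drop the rest somewhat arbitrarily.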
Not a Cure-All: Regularization doesn't guarantee perfect model performance. Choosing the right features and model complexity is still important.

Q9. Choosing Between RMSE and MAE

Consider two models that were evaluated with different metrics:

Model A with RMSE of 10: The typical error between predicted and actual values is around 10 units on the scale of Y, with large errors weighted more heavily. RMSE is therefore sensitive to outliers.

Model B with MAE of 8: This suggests that the average absolute difference between predicted and actual values is 8 units. This is less affected by outliers.

Choosing the Model:

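Without additional information, it's difficult to definitively say which model is better. The two numbers are not directly comparable: for the same set of errors, RMSE is always at least as large as MAE, so Model A's RMSE of 10 could well correspond to an MAE of 8 or lower. A fair comparison evaluates both models with the same metric on the same data. Which metric to use depends on the application:

If outliers are a concern: Prefer MAE, as it's less sensitive to them.
If large errors are especially costly: Prefer RMSE, as it penalizes them more heavily.

Limitations: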

Both metrics only provide a single measure of error. Visualizing the distribution of errors (e.g., histograms) can provide more insights.
The choice of metric depends on your specific application and what aspect of error is most relevant.
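
A short check shows why an RMSE of 10 and an MAE of 8 cannot be ranked against each other. The residuals below are hypothetical and chosen so that a single model produces both numbers at once.

```python
import math

# Hypothetical residuals of ONE model
errors = [2.0, 14.0]

mae_value = sum(abs(e) for e in errors) / len(errors)                # 8.0
rmse_value = math.sqrt(sum(e ** 2 for e in errors) / len(errors))    # 10.0
```

Since the same errors give MAE = 8 and RMSE = 10, the reported figures for Model A and Model B are compatible with the two models performing identically.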

Q10. Choosing Between Regularization Techniques

Selecting between Ridge and Lasso regularization depends on your data and goals:

Model A (Ridge) with regularization parameter 0.1: Ridge shrinks all coefficients towards zero but doesn't necessarily eliminate any features.

Model B (Lasso) with regularization parameter 0.5: Lasso might set some coefficients to zero, effectively removing those features from the model.

Choosing the Model:

If feature selection is important: Consider Model B (Lasso) if you suspect some features are irrelevant or redundant.
If you want to retain all features but reduce their impact: Model A (Ridge) might be preferable.

Trade-offs and Limitations:

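The choice of regularization parameter (0.1 for Ridge, 0.5 for Lasso) also plays a role. A higher value leads to stronger regularization and potentially more bias. Because the two penalties are on different scales, 0.1 and 0.5 cannot be compared directly across methods; each parameter is best tuned separately, for example by cross-validation.
There's no one-size-fits-all solution. Experiment with different regularization techniques and parameters to see what works best for your data and model.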
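
The effect of the regularization parameter can be sketched for a single feature without an intercept, where both penalties have a closed-form solution. The objective scaling (0.5 * sum of squared residuals plus the penalty), the function names, and the data are assumptions for illustration; libraries may scale the parameter differently.

```python
import math

def ridge_coef(x, y, lam):
    # Minimizes 0.5 * sum((y - w*x)^2) + 0.5 * lam * w^2 for one feature
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / (sxx + lam)

def lasso_coef(x, y, lam):
    # Minimizes 0.5 * sum((y - w*x)^2) + lam * |w| for one feature
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    shrunk = max(abs(sxy) - lam, 0.0)   # soft-thresholding
    return math.copysign(shrunk, sxy) / sxx

x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]   # exactly y = 2x, so the unpenalized slope is 2.0

lambdas = [0.0, 0.1, 0.5, 14.0, 28.0]
ridge_path = [ridge_coef(x, y, lam) for lam in lambdas]
lasso_path = [lasso_coef(x, y, lam) for lam in lambdas]

for lam, r, l in zip(lambdas, ridge_path, lasso_path):
    print(f"lambda={lam:5.1f}  ridge={r:.4f}  lasso={l:.4f}")
```

Both coefficients shrink as the parameter grows, but the same parameter value gives different amounts of shrinkage under the two penalties, and only Lasso reaches exactly zero.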