In [1]:
# Q1.

# R-squared measures how well a linear regression model fits the data.
# It indicates the proportion of the variance in the dependent variable explained by the independent variables.

# Formula: R-squared = 1 - (sum((yi - yhat_i)^2) / sum((yi - mean(y))^2))
# In this formula, both sums run over all n data points, and
# yi = actual observed value of the dependent variable
# yhat_i = value of the dependent variable predicted by the regression model
# mean(y) = mean of the observed values.

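# For a least-squares model with an intercept, evaluated on its training data, R-squared ranges from 0 to 1, with higher values indicating better fit.
# On other data, or for a model without an intercept, it can be negative, meaning the model predicts worse than the mean of the observed values.
# It's a common metric for assessing model goodness of fit.
# However, a high R-squared does not guarantee a good model, and a low R-squared does not necessarily mean a bad model.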
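
# The formula above can be sketched in plain Python. This is a minimal illustration:
# the function name r_squared and all data values are hypothetical, not part of the original answer.

```python
def r_squared(y_true, y_pred):
    """R-squared = 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: squared gaps between observed and predicted values
    ss_res = sum((y - yhat) ** 2 for y, yhat in zip(y_true, y_pred))
    # Total sum of squares: squared gaps between observed values and their mean
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical observations and two sets of predictions
y_true = [3.0, 5.0, 7.0, 9.0]
y_good = [2.5, 5.5, 7.5, 8.5]   # close to the observations
y_bad = [9.0, 7.0, 5.0, 3.0]    # worse than predicting the mean

r2_good = r_squared(y_true, y_good)
r2_bad = r_squared(y_true, y_bad)
print(r2_good, r2_bad)
```

# The second set of predictions gives a negative value, because its squared error exceeds that of always predicting the mean.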

In [2]:
# Q2.

# Adjusted R-squared addresses some of the limitations of R-squared:

# Regular R-squared:
# Measures the proportion of variance in the dependent variable explained by the independent variables.
# Never decreases as more predictors are added, even if they don't improve the model.
# Doesn't account for overfitting.

# Adjusted R-squared:
# Penalizes excessive predictors by adjusting for model complexity.
# Increases only if an additional predictor improves the fit by more than would be expected by chance.
# Provides a more accurate measure of model fit, especially when dealing with multiple predictors.
# It is always lower than or equal to the regular R-squared, and it can be negative.

# Formula:

# Adjusted R-squared = 1 - ((1 - R^2) * (n-1)/ (n-p-1))

# n is the number of data points
# p is the number of independent variables.
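
# A minimal sketch of the adjusted R-squared formula above. The function name adjusted_r_squared
# and the R-squared, n and p values are hypothetical, chosen only to show the penalty at work.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R-squared = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n - p - 1 <= 0:
        raise ValueError("need more data points than predictors plus one")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Two hypothetical models fitted on the same n = 30 data points.
# The larger model has a slightly higher R-squared but uses twice as many predictors.
adj_small = adjusted_r_squared(0.800, n=30, p=3)
adj_large = adjusted_r_squared(0.805, n=30, p=6)
print(round(adj_small, 4), round(adj_large, 4))
```

# With these values the larger model has the higher R-squared but the lower adjusted R-squared,
# so the three extra predictors do not pay for themselves.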

In [3]:
# Q3.

# Adjusted R-squared is more appropriate in the following situations:

# Multiple Predictors: When our regression model includes multiple predictors (independent variables).
# Model Selection: To compare models with different numbers of predictors.
# Avoiding Overfitting: When we want to penalize the inclusion of unnecessary predictors.
# Model Assessment: For a more accurate evaluation of the model's goodness of fit.

In [5]:
# Q4.

# RMSE:
# Formula: sqrt(sum((actual - predicted)^2) / n)
# Measures the square root of the average squared differences between actual and predicted values.
# Penalizes larger errors more than MAE, and is expressed in the same units as the target.
# Sensitive to outliers.

# MSE:
# Formula: sum((actual - predicted)^2) / n
# Measures the average squared differences between actual and predicted values.
# Emphasizes larger errors.
# Sensitive to outliers.

# MAE:
# Formula: sum(|actual - predicted|) / n
# Measures the average absolute differences between actual and predicted values.
# Treats all errors equally.
# Less sensitive to outliers compared to RMSE and MSE.

# These metrics quantify the accuracy and goodness of fit of regression models, with RMSE and MSE giving more weight to larger errors, while MAE treats all errors equally.
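
# The three formulas above can be sketched in plain Python. The function names and the
# actual/predicted values below are hypothetical and serve only as an illustration.

```python
import math

def mse(actual, predicted):
    """Mean of the squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Square root of the MSE, in the same units as the target."""
    return math.sqrt(mse(actual, predicted))

def mae(actual, predicted):
    """Mean of the absolute differences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical data: three small errors and one larger error of 4
actual = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 11.0, 15.0, 20.0]

mse_value = mse(actual, predicted)
rmse_value = rmse(actual, predicted)
mae_value = mae(actual, predicted)
print(mse_value, rmse_value, mae_value)
```

# The single larger error pulls RMSE above MAE, which is the extra weight on large errors described above.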

In [6]:
# Q5.

# Advantages:

# RMSE:
# Sensitive to large errors, making it useful for identifying significant deviations.
# Is expressed in the same units as the target variable, so the error magnitude is easy to read.

# MSE:
# Provides a continuous and differentiable loss function, making it suitable for optimization algorithms.
# Penalizes larger errors more than smaller ones.
# For linear regression it is convex in the coefficients, so any local minimum is also the global minimum.

# MAE:
# Robust to outliers
# Easier to understand since it's the average absolute error.

# Disadvantages:

# RMSE:
# Sensitive to outliers and large errors, which can dominate the evaluation.

# MSE:
# Emphasizes larger errors and may not be suitable if smaller errors are more important.

# MAE:
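# Weights all errors linearly, so it does not single out large errors the way the squared term in MSE and RMSE does.
# Is not differentiable at zero error, which makes gradient-based optimization less convenient.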

In [7]:
# Q6.

# Lasso Regularization:
# Concept: Lasso adds an L1 penalty to the linear/multiple regression objective, encouraging sparsity.
# Difference from Ridge: Lasso penalizes the absolute values of the coefficients and can set some of them exactly to zero. Ridge penalizes the squared values of the coefficients, which shrinks them towards zero without excluding any.
# When to use Lasso: For feature selection and simplifying the model by excluding irrelevant features.
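
# The difference between the two penalties can be sketched in a simplified special case.
# Assumption: the features are orthonormal, so each penalized coefficient has a closed form
# in terms of the ordinary least-squares coefficient b. With objective
# 0.5 * ||y - Xw||^2 + lam * ||w||_1 the Lasso solution is a soft threshold of b, and with
# 0.5 * ||y - Xw||^2 + 0.5 * lam * ||w||^2 the Ridge solution is b / (1 + lam).
# The function names and coefficient values are hypothetical.

```python
def soft_threshold(b, lam):
    """Lasso coefficient under the orthonormal-features assumption."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0  # small coefficients are set exactly to zero

def ridge_shrink(b, lam):
    """Ridge coefficient under the orthonormal-features assumption."""
    return b / (1.0 + lam)

# Hypothetical least-squares coefficients: two strong features, two weak ones
ols_coefs = [3.0, -0.4, 0.2, -2.5]
lam = 0.5

lasso_coefs = [soft_threshold(b, lam) for b in ols_coefs]
ridge_coefs = [ridge_shrink(b, lam) for b in ols_coefs]
print(lasso_coefs)
print(ridge_coefs)
```

# In this sketch Lasso drops the two weak features entirely, while Ridge keeps all four with smaller values.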

In [8]:
# Q7.

# Regularized Linear Models for Overfitting:

# Concept: Regularization techniques such as Lasso or Ridge add penalty terms to the linear/multiple regression objective to prevent overfitting.
# Example: In a polynomial regression, when the degree is high, the model can overfit the training data. Regularization helps reduce the complexity of the model by shrinking coefficients, preventing extreme curve-fitting.
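
# The shrinking effect can be sketched in the simplest possible setting rather than a full
# polynomial fit. Assumption: one feature, no intercept, and the Ridge objective
# sum((y - w * x)^2) + lam * w^2, whose minimizer is w = sum(x * y) / (sum(x^2) + lam).
# The function name ridge_slope and the data values are hypothetical.

```python
def ridge_slope(x, y, lam):
    """Closed-form Ridge slope for a single feature without an intercept."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)

# Hypothetical data lying close to y = 2 * x
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

# lam = 0 is ordinary least squares; larger lam means a stronger penalty
lams = (0.0, 1.0, 10.0, 100.0)
slopes = [ridge_slope(x, y, lam) for lam in lams]
for lam, w in zip(lams, slopes):
    print(lam, round(w, 4))
```

# The slope moves steadily towards zero as lam grows. In a high-degree polynomial model the same
# pressure acts on every coefficient, which is what damps extreme curve-fitting.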

In [9]:
# Q8.

# Limitations of Regularized Linear Models:

# Sensitivity to Hyperparameters: The choice of regularization strength (e.g. alpha value in Ridge/Lasso) can be critical and is often data-dependent.
# Feature Selection: Lasso regularization encourages feature selection, which may not be suitable when all features are informative.
# Assumption of Linearity: Regularized linear models assume a linear relationship between features and target, which may not hold in some real-world scenarios.
# Limited for Nonlinear Data: They may not capture complex nonlinear relationships between features and the target variable.
# Interpretability: Regularization shrinks coefficient values towards zero, so they become biased estimates of each feature's effect, which can make interpretation more challenging.

In [10]:
# Q9.

# Choice of Better Model: An RMSE of 10 for Model A and an MAE of 8 for Model B are not directly comparable, because they are different metrics. For any set of errors RMSE is greater than or equal to MAE, so Model A's MAE could be lower than 8. Model B would be chosen as the better performer only if it also has the lower error when both models are scored with the same metric.

# Advantages of MAE: MAE is robust to outliers and gives equal weight to all errors, making it easier to interpret.

# Limitations of Metric Choice: MAE weights all errors linearly, so it is less sensitive to occasional large errors than RMSE. If large errors are especially costly, RMSE is the better basis for comparison. The choice of metric should align with the specific objectives and characteristics of the problem.
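
# RMSE and MAE computed on the same errors generally differ (RMSE is never smaller than MAE),
# so it can help to score both models with both metrics. A minimal sketch follows. The residuals
# are hypothetical, chosen so that Model A has an RMSE of 10 and Model B has an MAE of 8;
# the actual residuals of the two models in the question are not known.

```python
import math

def rmse(errors):
    """Root mean squared error of a list of residuals."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def mae(errors):
    """Mean absolute error of a list of residuals."""
    return sum(abs(e) for e in errors) / len(errors)

# Hypothetical residuals
errors_a = [0.0, 0.0, 0.0, 20.0]   # mostly exact, one large miss
errors_b = [8.0, 8.0, 8.0, 8.0]    # consistently off by 8

rmse_a, mae_a = rmse(errors_a), mae(errors_a)
rmse_b, mae_b = rmse(errors_b), mae(errors_b)
print("Model A:", rmse_a, mae_a)
print("Model B:", rmse_b, mae_b)
```

# With these residuals Model A wins on MAE and Model B wins on RMSE, so the ranking depends on
# which metric is applied to both models.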

In [11]:
# Q10.

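# Choice of Better Model: The choice depends on the specific goals and characteristics of the problem. The two lambda values apply to different penalties, so they cannot be compared directly; the error of each model on held-out data is a fairer basis for comparison.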

# Model A (Ridge Regularization, lambda=0.1): Suitable when multicollinearity is an issue, and we want to shrink coefficients. It doesn't perform feature selection.

# Model B (Lasso Regularization, lambda=0.5): Useful when we want feature selection as it tends to set some coefficients to zero, effectively performing feature selection.

# Trade-offs: Ridge is better at handling multicollinearity but doesn't perform feature selection. Lasso can perform feature selection, but among strongly correlated features it tends to keep one and drop the others somewhat arbitrarily.

# Limitations: The choice of regularization depends on the balance between feature selection, handling multicollinearity, and the problem's specifics.

In [12]:
# End