# Q1. R-squared (R²) is a statistical measure of the goodness of fit of a linear regression model. It represents the proportion of the variance in the dependent variable (target) that is explained by the independent variables (features) in the model. In other words, it measures how well the model fits the observed data points.

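# For a least-squares model with an intercept, evaluated on its training data, R² ranges from 0 to 1, where:
# - R² = 0 indicates that the model explains none of the variability of the response data around its mean (poor fit).
# - R² = 1 indicates that the model explains all the variability of the response data around its mean (perfect fit).
# On held-out data, or for a model fitted without an intercept, R² can be negative, which means the model predicts worse than simply using the mean of the target.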
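
# A minimal sketch of the definition above, computing R² as 1 - SS_res / SS_tot in plain Python. The function name r_squared and the toy numbers are illustrative assumptions, not part of the original answer. The third case shows that predictions worse than the mean give a value below 0.

```python
def r_squared(actual, predicted):
    """R² = 1 - SS_res / SS_tot for two equal-length sequences of numbers."""
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: variation the predictions fail to explain.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: variation of the target around its mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


actual = [3.0, 5.0, 7.0, 9.0]
perfect = [3.0, 5.0, 7.0, 9.0]          # reproduces every observation
mean_only = [6.0, 6.0, 6.0, 6.0]        # always predicts the mean of actual
worse_than_mean = [9.0, 7.0, 5.0, 3.0]  # trend is reversed

print(r_squared(actual, perfect))          # 1.0
print(r_squared(actual, mean_only))        # 0.0
print(r_squared(actual, worse_than_mean))  # -3.0
```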

# Q2. Adjusted R-squared is an extension of the regular R-squared that penalizes each additional predictor, so it rises only when a new variable improves the fit by more than would be expected by chance. Because it takes the number of independent variables into account, it is better suited to comparing models with different numbers of predictors.

# The formula for adjusted R-squared is:
# Adjusted R² = 1 - [(1 - R²) * (n - 1) / (n - p - 1)]

# Where:
# - R² is the regular R-squared value.
# - n is the number of data points (observations).
# - p is the number of independent variables (predictors) in the model.
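
# The formula above translates directly into a small function. This is a sketch; the name adjusted_r_squared and the example values (R² = 0.80, n = 50) are assumptions chosen only to show how the penalty grows with p.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R² = 1 - (1 - R²) * (n - 1) / (n - p - 1)."""
    if n - p - 1 <= 0:
        # The formula is undefined unless there are more than p + 1 observations.
        raise ValueError("need n > p + 1 observations")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Same R² and sample size, different numbers of predictors.
few_predictors = adjusted_r_squared(0.80, n=50, p=3)
many_predictors = adjusted_r_squared(0.80, n=50, p=20)

print(round(few_predictors, 4))   # 0.787
print(round(many_predictors, 4))  # 0.6621
```

# With the same raw R², the model that needs 20 predictors gets the lower adjusted value.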

# Q3. Adjusted R-squared is more appropriate when comparing multiple regression models with different numbers of predictors. On the training data, regular R-squared never decreases when a variable is added, regardless of its significance, which can reward overfitting. Adjusted R-squared penalizes variables that add little, giving a fairer picture of the model's explanatory power and making it easier to compare different models.

# Q4. In the context of regression analysis:

# - RMSE (Root Mean Squared Error) is the square root of the average of the squared differences between the predicted values and the actual values. It measures the typical magnitude of the errors made by the model, in the same units as the target. The formula is:
#   RMSE = sqrt(1/n * Σ(predicted - actual)²)

# - MSE (Mean Squared Error) is the average of the squared differences between the predicted values and the actual values. Because the errors are squared, it is expressed in squared units of the target and weights large errors heavily. The formula is:
#   MSE = 1/n * Σ(predicted - actual)²

# - MAE (Mean Absolute Error) is the average of the absolute differences between the predicted values and the actual values. It measures the average magnitude of the errors made by the model, but unlike MSE and RMSE, it does not penalize large errors as much. The formula is:
#   MAE = 1/n * Σ|predicted - actual|
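
# The three formulas above can be sketched in plain Python as follows. The function names and the toy data are illustrative assumptions.

```python
import math


def mse(actual, predicted):
    """Mean of the squared errors."""
    return sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Square root of the MSE, back in the units of the target."""
    return math.sqrt(mse(actual, predicted))


def mae(actual, predicted):
    """Mean of the absolute errors."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


actual = [10.0, 12.0, 15.0, 20.0]
predicted = [11.0, 11.0, 17.0, 16.0]  # errors: 1, -1, 2, -4

print(mse(actual, predicted))             # 5.5
print(round(rmse(actual, predicted), 4))  # 2.3452
print(mae(actual, predicted))             # 2.0
```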

# Q5. Advantages and disadvantages of using RMSE, MSE, and MAE in regression analysis:

# Advantages:
# - RMSE and MSE give higher weights to large errors, making them suitable for applications where large errors are more critical.
# - RMSE and MSE are differentiable, which is important for many optimization algorithms used in model training.
# - MAE is less sensitive to outliers compared to RMSE and MSE, making it more robust in the presence of extreme values.

# Disadvantages:
# - MSE is expressed in squared units of the target variable, so it is not directly interpretable on the original scale. RMSE restores the original units by taking the square root.
# - RMSE and MSE are more sensitive to outliers, as the squared error term amplifies their impact.
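# - MAE weights every error linearly, so it can hide the difference between a model with uniformly moderate errors and one with occasional very large errors. It is also not differentiable at zero error, which complicates gradient-based optimization.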
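
# To make the outlier sensitivity concrete, here is a small sketch that adds one large error to an otherwise uniform set of residuals. The error values are invented for illustration.

```python
import math


def rmse(errors):
    """Root mean squared error computed directly from a list of residuals."""
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))


def mae(errors):
    """Mean absolute error computed directly from a list of residuals."""
    return sum(abs(e) for e in errors) / len(errors)


clean = [1.0, -1.0, 1.0, -1.0, 1.0]
with_outlier = [1.0, -1.0, 1.0, -1.0, 21.0]  # last residual replaced by an outlier

print(mae(clean), rmse(clean))               # 1.0 1.0
print(mae(with_outlier))                     # 5.0
print(round(rmse(with_outlier), 3))          # 9.434
```

# A single outlier multiplies MAE by 5 here but RMSE by more than 9, which is the amplification described in the lists above.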

# Q6. Lasso regularization, also known as L1 regularization, is a technique used in linear regression to prevent overfitting and to perform feature selection. It adds a penalty term to the loss function that is proportional to the sum of the absolute values of the regression coefficients.

# The main difference between Lasso and Ridge regularization (L2 regularization) is the penalty term:
# - Lasso adds the absolute values of the coefficients: λ * Σ|coefficients|
# - Ridge adds the squared values of the coefficients: λ * Σ(coefficients²)

# Lasso can shrink some coefficients to exactly zero, effectively performing feature selection by eliminating irrelevant variables from the model. This makes it particularly useful for datasets with many features, as it automatically selects the most relevant ones.
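
# The difference between the two penalties is easiest to see in a special case. The sketch below ASSUMES an orthonormal design (XᵀX = I), where both problems have closed-form solutions in terms of the ordinary least-squares coefficients: minimising RSS + λ * Σ|b| soft-thresholds each coefficient at λ/2, and minimising RSS + λ * Σ(b²) divides each coefficient by (1 + λ). General solvers do not work this way; the function names and numbers are illustrative only.

```python
def soft_threshold(b, t):
    """Move b toward zero by t, and return exactly 0.0 if |b| <= t."""
    if b > t:
        return b - t
    if b < -t:
        return b + t
    return 0.0


def lasso_orthonormal(ols_coefs, lam):
    """Lasso solution under the orthonormal-design assumption."""
    return [soft_threshold(b, lam / 2) for b in ols_coefs]


def ridge_orthonormal(ols_coefs, lam):
    """Ridge solution under the orthonormal-design assumption."""
    return [b / (1 + lam) for b in ols_coefs]


ols = [4.0, -2.0, 0.5, -0.25]  # hypothetical unregularized coefficients
lam = 1.0

print(lasso_orthonormal(ols, lam))  # [3.5, -1.5, 0.0, 0.0]
print(ridge_orthonormal(ols, lam))  # [2.0, -1.0, 0.25, -0.125]
```

# Lasso removes the two small coefficients entirely, while Ridge shrinks all four but keeps them nonzero.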

# Q7. Regularized linear models help prevent overfitting by adding penalty terms to the loss function. These penalty terms control the size of the coefficients of the independent variables, effectively limiting their impact on the model.

# Example:
# Let's consider a linear regression model with a large number of features. Without regularization, the model might try to fit all the features, including noise, leading to overfitting. However, by applying Lasso or Ridge regularization, the model will shrink some coefficients toward zero, effectively reducing the influence of less relevant features and preventing overfitting.
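
# As a hedged illustration of this shrinkage, consider the simplest possible case: one feature, no intercept, and a Ridge penalty. Minimising Σ(y - b*x)² + λ * b² gives b = Σxy / (Σx² + λ). The function name ridge_slope and the data are assumptions made for this sketch.

```python
def ridge_slope(x, y, lam):
    """Slope minimising sum((y - b*x)^2) + lam * b^2 for a no-intercept model."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)


x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]  # roughly y = 2x plus noise

# lam = 0.0 is ordinary least squares; larger values pull the slope toward zero.
slopes = [ridge_slope(x, y, lam) for lam in (0.0, 1.0, 10.0, 100.0)]
for lam, slope in zip((0.0, 1.0, 10.0, 100.0), slopes):
    print(lam, round(slope, 4))
```

# The slope falls steadily as λ grows, which is the mechanism that limits how strongly any single feature can drive the predictions.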

# Q8. Limitations of regularized linear models:

# - Choosing the right regularization strength (λ) can be challenging. If λ is too small, the model may still overfit, while if it is too large, the model may underfit by overly penalizing important coefficients.

# - Lasso regularization performs feature selection, but if two or more features are highly correlated, Lasso may arbitrarily choose one of them and set the others to zero. This can result in instability in feature selection.

# - Regularized linear models may not perform well when the relationship between the target and features is highly nonlinear. In such cases, more flexible models like decision trees or neural networks might be more appropriate.
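
# The first limitation, picking λ, is usually handled by trying several values and scoring each on data the model was not fitted on. The sketch below does this with a single hold-out split and a one-feature, no-intercept Ridge model. The grid, the helper names, and the deliberately contrived toy data are all assumptions.

```python
def ridge_slope(x, y, lam):
    """Slope minimising sum((y - b*x)^2) + lam * b^2 for a no-intercept model."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)


def validation_mse(slope, x, y):
    """Mean squared error of the fitted slope on held-out points."""
    return sum((yi - slope * xi) ** 2 for xi, yi in zip(x, y)) / len(x)


x_train, y_train = [1.0, 2.0, 3.0, 4.0], [2.5, 3.5, 6.8, 7.6]
x_valid, y_valid = [5.0, 6.0], [9.6, 11.5]

lambda_grid = [0.0, 0.1, 1.0, 10.0]
scores = {
    lam: validation_mse(ridge_slope(x_train, y_train, lam), x_valid, y_valid)
    for lam in lambda_grid
}
# Keep the candidate with the lowest hold-out error.
best_lam = min(scores, key=scores.get)

print(best_lam)  # 1.0
```

# On this toy split, λ = 0 fits the noisy training points too closely and λ = 10 shrinks the slope too far. A single split is noisy, so k-fold cross-validation is the more common choice in practice.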

# Q9. To determine which model is the better performer, you need to consider the specific context and requirements of the problem. RMSE and MAE measure different aspects of the model's performance:

# - RMSE penalizes large errors more than MAE, which can be important if large errors are particularly undesirable in the problem domain.

# - MAE is less sensitive to outliers, which can be an advantage if the dataset contains extreme values that could heavily impact the model's performance.

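# In this case, Model A reports an RMSE of 10 and Model B an MAE of 8, but these two numbers are not directly comparable. For any set of errors, RMSE is at least as large as MAE, so a model with an MAE of 8 could have an RMSE either below or above 10. A fair comparison evaluates both models with the same metric on the same data. If that is not possible and large errors are not a particular concern, Model B's lower figure is at best weak evidence in its favor.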

# The choice of metric should follow from the specific requirements and constraints of the problem. For instance, if the cost of large errors significantly outweighs that of small errors, RMSE is the more appropriate metric, and both models should be scored with it.
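
# The sketch below shows why an RMSE of 10 and an MAE of 8 cannot be ranked against each other. The residuals are HYPOTHETICAL, constructed only so that Model A has RMSE = 10 and Model B has MAE = 8, matching the figures in the question. Nothing is known about the real errors.

```python
import math


def rmse(errors):
    """Root mean squared error from a list of residuals."""
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))


def mae(errors):
    """Mean absolute error from a list of residuals."""
    return sum(abs(e) for e in errors) / len(errors)


model_a_errors = [10.0, -10.0, 10.0, -10.0]  # uniform errors: RMSE = MAE = 10
model_b_errors = [1.0, 1.0, 1.0, 29.0]       # mostly small, one large: MAE = 8

print(rmse(model_a_errors), mae(model_a_errors))  # 10.0 10.0
print(mae(model_b_errors))                        # 8.0
print(round(rmse(model_b_errors), 3))             # 14.526
```

# With these residuals Model B wins on MAE but loses on RMSE, so the answer depends entirely on which metric reflects the cost of errors in the application.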

# Q10. To determine which regularized linear model is the better performer, compare the same performance metric (e.g., R-squared or RMSE) for both models on a validation dataset or through cross-validation.

# Regarding the choice of regularization method:

# - Ridge regularization (L2) tends to perform well when there are many moderately important features and you want to shrink their coefficients while keeping all of them in the model. It does not perform feature selection.

# - Lasso regularization (L1) is more suitable when you suspect that only a subset of features is truly relevant, as it can set some coefficients to exactly zero, effectively performing feature selection.

# Trade-offs and limitations:

# - Lasso's feature selection capability can be an advantage, but it can also lead to instability when features are highly correlated. In such cases, Ridge regularization may be preferred.

# - The choice between Lasso and Ridge regularization depends on the specific dataset and the underlying assumptions about the importance of features. It's important to experiment and tune the regularization parameters to find the best fit for the data.
