# Regression

#### Q1. Explain the concept of R-squared in linear regression models. How is it calculated, and what does it represent?

### R-squared in Linear Regression

R-squared is a statistical measure that represents the proportion of the variance in the dependent variable (the variable we're trying to predict) that is explained by the independent variables (the predictors) in a linear regression model. It provides an indication of how well the independent variables explain the variability in the dependent variable.

R-squared is calculated as:

R-squared = 1 - SSR/SST

where:

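- SSR = sum of squared residuals (squared differences between actual and predicted values)
- SST = total sum of squares (squared differences between actual values and their mean)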
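
As a minimal sketch of the formula above, the function below computes R-squared from paired actual and predicted values using only the standard library. The function name `r_squared` and the sample numbers are made up for illustration.

```python
def r_squared(y_true, y_pred):
    """Return R-squared = 1 - SSR/SST for paired actual and predicted values."""
    mean_y = sum(y_true) / len(y_true)
    # SSR: squared distance between each actual value and its prediction.
    ssr = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # SST: squared distance between each actual value and the mean.
    sst = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ssr / sst


# Made-up values for illustration only.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.0, 9.0]
print(r_squared(actual, predicted))
```

A perfect fit gives 1, and a model that always predicts the mean of the actual values gives 0.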

.

#### Q2. Define adjusted R-squared and explain how it differs from the regular R-squared.

Adjusted R-squared is a modified version of the R-squared statistic that adjusts for the number of predictors in the model. It is designed to provide a more accurate measure of how well the model explains the variability in the dependent variable, especially when multiple predictors are involved.

R-squared measures the proportion of variance explained by the model, but it never decreases when a predictor is added, regardless of the predictor's relevance.

Adjusted R-squared also measures the proportion of variance explained but adjusts for the number of predictors, penalizing the inclusion of unnecessary variables, and thus better reflects model quality.
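
The notes above do not state the formula, so the sketch below uses the commonly quoted one: adjusted R-squared = 1 - (1 - R-squared)(n - 1)/(n - p - 1), where n is the number of observations and p the number of predictors. The function name and the scenario numbers are hypothetical.

```python
def adjusted_r_squared(r2, n_samples, n_predictors):
    """Adjusted R-squared = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more samples than predictors + 1")
    return 1 - (1 - r2) * (n_samples - 1) / dof


# Hypothetical scenario: five extra predictors lift R-squared only
# from 0.80 to 0.81 on the same 30 observations.
base = adjusted_r_squared(0.80, n_samples=30, n_predictors=3)
bigger = adjusted_r_squared(0.81, n_samples=30, n_predictors=8)
print(base, bigger)
```

In this made-up case the larger model has the higher R-squared but the lower adjusted R-squared, which is the penalty for extra predictors at work.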

.

#### Q3. When is it more appropriate to use adjusted R-squared?

Use adjusted R-squared when the model has multiple predictors or when comparing models with different numbers of predictors. Because it penalizes unnecessary variables, it helps flag overfitting and gives a fairer basis for comparison than regular R-squared.

.

#### Q4. What are RMSE, MSE, and MAE in the context of regression analysis? How are these metrics calculated, and what do they represent?

MSE (Mean Squared Error): Measures the average of the squared differences between actual and predicted values. It penalizes larger errors more heavily.

RMSE (Root Mean Squared Error): The square root of MSE, giving an error metric in the same units as the target variable. It's sensitive to large errors.

MAE (Mean Absolute Error): Measures the average of the absolute differences between actual and predicted values. It's less sensitive to large errors compared to MSE and RMSE.
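
The three definitions can be written out directly. This is a small standard-library sketch; the function names and sample values are illustrative only.

```python
import math


def mse(y_true, y_pred):
    """Mean of the squared differences between actual and predicted values."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Square root of MSE, in the same units as the target."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean of the absolute differences between actual and predicted values."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


# Made-up values; the errors are -2, 2, -3 and 5.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 35.0]
print(mse(actual, predicted), rmse(actual, predicted), mae(actual, predicted))
```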

.

#### Q5. Discuss the advantages and disadvantages of using RMSE, MSE, and MAE as evaluation metrics in regression analysis.

Mean Squared Error (MSE):
- Advantages:
  - Penalizes larger errors more heavily, which can be useful if we want to focus on reducing big errors.
- Disadvantages:
  - Sensitive to outliers because errors are squared, which can disproportionately affect the metric.
  - Less interpretable since it’s in squared units of the target variable.

Root Mean Squared Error (RMSE):
- Advantages:
  - Provides an error metric in the same units as the target variable, making it easier to understand and communicate.
  - Like MSE, it penalizes large errors, which can be beneficial if large errors are particularly problematic.
- Disadvantages:
  - Still emphasizes large errors, which might not always be desirable depending on the context.

Mean Absolute Error (MAE):
- Advantages:
  - Less affected by outliers since it doesn’t square the errors, providing a more balanced view of typical errors.
  - Easy to interpret as it is in the same units as the target variable.
- Disadvantages:
  - Gives large errors no extra weight, which can be a limitation if large errors need special attention.
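
To make the outlier trade-off concrete, the sketch below compares RMSE and MAE on two made-up sets of prediction errors that differ in a single value. The helper names are hypothetical and take the errors directly rather than actual and predicted values.

```python
import math


def rmse_from_errors(errors):
    """Root mean squared error computed from a list of residuals."""
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))


def mae_from_errors(errors):
    """Mean absolute error computed from a list of residuals."""
    return sum(abs(e) for e in errors) / len(errors)


# Made-up residuals: the second list replaces one error of 2 with 20.
typical = [2.0, -2.0, 2.0, -2.0, 2.0]
with_outlier = [2.0, -2.0, 2.0, -2.0, 20.0]

mae_growth = mae_from_errors(with_outlier) / mae_from_errors(typical)
rmse_growth = rmse_from_errors(with_outlier) / rmse_from_errors(typical)
print(mae_growth, rmse_growth)
```

Both metrics rise when the outlier is introduced, but RMSE rises by a larger factor because the single large error is squared before averaging.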

.

#### Q6. Explain the concept of Lasso regularization. How does it differ from Ridge regularization, and when is it more appropriate to use?

Lasso (Least Absolute Shrinkage and Selection Operator) is a regularization technique for regression that adds a penalty proportional to the sum of the absolute values of the coefficients (an L1 penalty).
Lasso can shrink some coefficients to exactly zero, effectively selecting a simpler model by including only the most relevant features. This is particularly useful in high-dimensional datasets where some features may be irrelevant.

Ridge regularization instead penalizes the sum of the squares of the coefficients (an L2 penalty).

Differences:

Lasso:
- Uses the absolute values of the coefficients
- Can shrink some coefficients to exactly zero, leading to sparse models.
- Effective for feature selection, especially when dealing with high-dimensional data with irrelevant features.

Ridge:
- Uses the squares of the coefficients
- Shrinks coefficients toward zero but generally not exactly to zero, so all features are retained with smaller coefficients.
- Does not perform feature selection, because no coefficient is removed from the model.
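
The difference in shrinkage behaviour is easiest to see in a simplified special case. The sketch below assumes orthonormal features, with Lasso minimizing ½‖y − Xw‖² + λ‖w‖₁ and Ridge minimizing ½‖y − Xw‖² + (λ/2)‖w‖². Under those assumptions each coefficient can be obtained from its ordinary least squares value in closed form. The function names and starting coefficients are made up, and real datasets rarely satisfy the orthonormality assumption.

```python
def lasso_shrink(w_ols, lam):
    """Soft-thresholding: the Lasso coefficient under orthonormal features."""
    if abs(w_ols) <= lam:
        return 0.0  # small coefficients are set exactly to zero
    return w_ols - lam if w_ols > 0 else w_ols + lam


def ridge_shrink(w_ols, lam):
    """Proportional shrinkage: the Ridge coefficient under orthonormal features."""
    return w_ols / (1.0 + lam)


# Made-up least squares coefficients, two large and two small.
ols_coefficients = [3.0, -1.5, 0.4, -0.2]
lam = 0.5

lasso = [lasso_shrink(w, lam) for w in ols_coefficients]
ridge = [ridge_shrink(w, lam) for w in ols_coefficients]
print("lasso:", lasso)
print("ridge:", ridge)
```

With these numbers Lasso zeroes out the two small coefficients, while Ridge scales every coefficient down by the same factor and leaves all of them non-zero.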

.

#### Q7. How do regularized linear models help to prevent overfitting in machine learning? Provide an example to illustrate.

Overfitting occurs when a machine learning model learns not only the underlying patterns in the training data but also the noise and fluctuations, leading to poor generalization to unseen data. Regularized linear models help prevent overfitting by adding a penalty term to the loss function, which discourages the model from learning overly complex or extreme weights for the features.
For Ridge and Lasso, this penalty grows with the magnitude of the coefficients.

Example: Consider a dataset with a polynomial relationship between the features and the target variable.

Without Regularization:
Suppose we fit a high-degree polynomial regression model to the data. The model may perfectly fit the training data, capturing every small fluctuation, leading to a complex curve with very high coefficients.
This model might perform very poorly on new data because it has overfit the noise in the training data.

With Regularization:
If we apply Ridge or Lasso regularization to the same high-degree polynomial model, the penalty term will discourage the model from assigning large coefficients to the polynomial terms.
As a result, the model will be simpler, with smaller coefficients, and will better capture the underlying pattern rather than the noise.
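
The scenario above can be sketched in plain Python. The code fits a degree-5 polynomial to eight made-up noisy points by solving (XᵀX + λI)w = Xᵀy, once with λ = 0 (ordinary least squares) and once with λ = 1 (Ridge). The data, the degree, the value of λ, and the helper names are all assumptions for illustration. For brevity the intercept is penalized too, which many library implementations avoid.

```python
def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - known) / M[i][i]
    return w


def fit_polynomial(xs, ys, degree, lam):
    """Solve (X^T X + lam * I) w = X^T y; lam = 0 is plain least squares."""
    X = [[x ** d for d in range(degree + 1)] for x in xs]
    p = degree + 1
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(p)]
    return solve(A, b)


def squared_norm(w):
    """Sum of squared coefficients, the quantity Ridge penalizes."""
    return sum(c * c for c in w)


# Made-up data: roughly y = 2x + 1 with hand-picked noise.
xs = [-1.0, -0.7, -0.4, -0.1, 0.2, 0.5, 0.8, 1.0]
ys = [-1.2, -0.1, 0.4, 0.6, 1.7, 1.8, 2.9, 2.8]

ols = fit_polynomial(xs, ys, degree=5, lam=0.0)
ridge = fit_polynomial(xs, ys, degree=5, lam=1.0)
print(squared_norm(ols), squared_norm(ridge))
```

The Ridge fit has the smaller coefficient norm, which is the mechanism described above. Whether it also predicts better on new data depends on the choice of λ, which is normally tuned on held-out data.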

.

#### Q8. Discuss the limitations of regularized linear models and explain why they may not always be the best choice for regression analysis.

Linear relationship assumption: These models assume the target is a linear function of the features, so they miss complex, non-linear patterns unless suitably transformed features are supplied.

Bias introduction: Regularization introduces bias, which can lead to underfitting when the penalty is too strong for the data.

Limited feature selection (Ridge): Ridge regression doesn't zero out coefficients, so all features are retained, including irrelevant ones, which can reduce interpretability.

Sensitivity to scaling: The penalty depends on coefficient magnitudes, which in turn depend on feature units, so features usually need to be standardized before fitting.
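
The scaling point can be illustrated with a deliberately tiny case: one feature, no intercept, where the Ridge slope has the closed form Σxy / (Σx² + λ). The sketch below fits the same made-up data with the feature expressed in kilometres and in metres. The function name and numbers are hypothetical.

```python
def fit_ridge_1d(xs, ys, lam):
    """Ridge slope for one feature and no intercept; lam = 0 is least squares."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


# Made-up data: the same distances in kilometres and in metres.
km = [1.0, 2.0, 3.0]
metres = [x * 1000.0 for x in km]
target = [2.0, 4.0, 6.0]
lam = 1.0

# Prediction for the first observation under each choice of units.
pred_km = fit_ridge_1d(km, target, lam) * km[0]
pred_m = fit_ridge_1d(metres, target, lam) * metres[0]
pred_km_ols = fit_ridge_1d(km, target, 0.0) * km[0]
pred_m_ols = fit_ridge_1d(metres, target, 0.0) * metres[0]

print("ridge:", pred_km, pred_m)
print("ols:  ", pred_km_ols, pred_m_ols)
```

Without a penalty the prediction is the same in either unit. With the same λ, the fit in kilometres is shrunk noticeably while the fit in metres is almost untouched, so the amount of regularization depends on an arbitrary choice of units unless the features are standardized first.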

.

#### Q9. You are comparing the performance of two regression models using different evaluation metrics. Model A has an RMSE of 10, while Model B has an MAE of 8. Which model would you choose as the better performer, and why? Are there any limitations to your choice of metric?

RMSE (Root Mean Squared Error): Measures the square root of the average squared differences between predicted and actual values. It gives more weight to larger errors, making it sensitive to outliers.

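MAE (Mean Absolute Error): Measures the average of the absolute differences between predicted and actual values. It weights each error in proportion to its size, which makes it less sensitive to outliers and straightforward to interpret.

The two numbers cannot be compared directly, because they come from different metrics. For any set of predictions, RMSE is at least as large as MAE, so Model B's MAE of 8 is compatible with an RMSE either below or above Model A's 10. A fair choice requires computing the same metric for both models on the same test data: RMSE if large errors are especially costly, MAE if the typical error size matters more. The limitation of relying on a single metric is that RMSE can be dominated by a few outliers, while MAE can hide occasional large errors.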
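
One caveat is worth checking numerically: for the same set of errors, RMSE is never smaller than MAE, so an RMSE of 10 and an MAE of 8 are not on a common scale. The sketch below uses two made-up error patterns that both give an MAE of 8 but very different RMSE values. The helper names are hypothetical.

```python
import math


def rmse(errors):
    """Root mean squared error computed from a list of residuals."""
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))


def mae(errors):
    """Mean absolute error computed from a list of residuals."""
    return sum(abs(e) for e in errors) / len(errors)


# Two made-up residual patterns for a model with MAE = 8.
even_errors = [8.0, -8.0, 8.0, -8.0]   # every prediction is off by 8
spiky_errors = [0.0, 0.0, 0.0, 32.0]   # three exact hits, one large miss

for name, errors in [("even", even_errors), ("spiky", spiky_errors)]:
    print(name, "MAE =", mae(errors), "RMSE =", rmse(errors))
```

In the first pattern Model B's RMSE would be 8, below Model A's 10. In the second it would be 16, above it. Knowing only that Model B has an MAE of 8 is therefore not enough to rank the two models.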

.