#### Q1. Explain the concept of R-squared in linear regression models. How is it calculated, and what does it represent?

- R-squared (R², or the coefficient of determination) is a statistical measure of the proportion of variance in the dependent variable that is explained by the independent variable(s) in a regression model.
- For an ordinary least squares model with an intercept, evaluated on its training data, R-squared ranges from 0 to 1, with higher values indicating a better fit. On held-out data it can be negative when the model predicts worse than simply using the mean.
- An R-squared value of 1 means that all of the variation in the dependent variable is explained by the independent variable(s), while an R-squared value of 0 means that none of the variation is explained.
#
- To determine the R-squared value, we first calculate the total sum of squares (TSS), which reflects the overall variation of the dependent variable around its mean.
- Then we compute the residual sum of squares (RSS), which measures the variation in the dependent variable that the model leaves unexplained. Finally, we obtain the R-squared value as one minus the ratio of RSS to TSS:
#
$$R^2 = 1 - \frac{RSS}{TSS} = 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2}$$
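
The calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a library implementation; the function name `r_squared` and the sample values are made up for the example.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination computed as 1 - RSS / TSS."""
    mean_y = sum(y_true) / len(y_true)
    # Total sum of squares: variation of the target around its mean.
    tss = sum((y - mean_y) ** 2 for y in y_true)
    # Residual sum of squares: variation left unexplained by the predictions.
    rss = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1 - rss / tss


# Hypothetical actual and predicted values, for illustration only.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 9.0]
r2 = r_squared(y_true, y_pred)
print(f"R-squared: {r2:.3f}")
```

Predicting the mean of `y_true` for every observation makes RSS equal to TSS, so the function returns 0; perfect predictions give 1.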

#### Q2. Define adjusted R-squared and explain how it differs from the regular R-squared.

- Adjusted R-squared is a modified version of R-squared that accounts for the number of independent variables in a linear regression model.
- Regular R-squared never decreases when a variable is added, even an irrelevant one. Adjusted R-squared penalizes each additional variable, so it increases only when the new variable improves the fit by more than would be expected by chance. This discourages adding irrelevant variables that do not meaningfully contribute to the model.
#
- Formula:
#
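$$R^2_{adj} = 1 - \frac{(1 - R^2)(n - 1)}{n - p - 1}$$

where $n$ is the number of observations and $p$ is the number of independent variables.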
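
As a rough sketch of how the penalty behaves, the snippet below evaluates the standard adjusted R-squared expression for the same R-squared value with different numbers of predictors. The function name and the numbers (R² of 0.80, 50 observations) are assumptions chosen only for illustration.

```python
def adjusted_r_squared(r2, n_samples, n_features):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_features - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r2) * (n_samples - 1) / dof


# Same raw R-squared, different model sizes (hypothetical numbers).
few_features = adjusted_r_squared(0.80, n_samples=50, n_features=3)
many_features = adjusted_r_squared(0.80, n_samples=50, n_features=20)
print(f"3 predictors:  {few_features:.3f}")
print(f"20 predictors: {many_features:.3f}")
```

With the raw R-squared held fixed, the model with more predictors receives the lower adjusted value.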

#### Q3. When is it more appropriate to use adjusted R-squared?

- Adjusted R-squared is more appropriate when comparing linear regression models that have different numbers of independent variables. Regular R-squared would favor the larger model simply because it has more variables, while adjusted R-squared corrects for this.

#### Q4. What are RMSE, MSE, and MAE in the context of regression analysis? How are these metrics calculated, and what do they represent?

- Regression models are evaluated with error metrics that quantify how far the predicted values are from the actual values.
- The most commonly used metrics are:
1. Mean Absolute Error (MAE)
2. Mean Squared Error (MSE), and
3. Root Mean Squared Error (RMSE)
#
1. Mean Absolute Error (MAE)
    - It represents the average absolute difference between the predicted values and the actual values.
- Formula
$$\text{MAE} = \frac{1}{n}\sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert$$
#
2. Mean Squared Error (MSE)
    - It represents the average squared difference between the predicted values and the actual values.
- Formula
$$\text{MSE} = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

#
3. Root Mean Squared Error (RMSE)
    - It is the square root of MSE.
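    - It measures the typical magnitude of the residuals (i.e., the errors) between the predicted values and the actual values. When the residuals have zero mean, it equals their standard deviation.
- Formula
    - $$\text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$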
#
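
The three definitions above can be written out directly. The following is a minimal sketch in plain Python; the function names and sample values are illustrative assumptions rather than any particular library's API.

```python
import math


def mae(y_true, y_pred):
    """Mean of the absolute differences between actual and predicted values."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean of the squared differences between actual and predicted values."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Square root of the MSE, expressed in the units of the target."""
    return math.sqrt(mse(y_true, y_pred))


# Hypothetical values: the errors are 1, 0, -1 and -2.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 11.0]
print("MAE :", mae(y_true, y_pred))
print("MSE :", mse(y_true, y_pred))
print("RMSE:", rmse(y_true, y_pred))
```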

#### Q5. Discuss the advantages and disadvantages of using RMSE, MSE, and MAE as evaluation metrics in regression analysis.

1. RMSE : Root Mean Squared Error
- Advantages:
    - Has the same unit as the original data, making it easier to interpret.
    - It gives more weight to large errors than small errors, which can be useful in situations where large errors are particularly costly or important.
- Disadvantages:
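    - It is sensitive to outliers, because errors are squared before they are averaged.
    - It is not the average error magnitude, so it is less intuitive to explain than MAE even though it shares the units of the original data.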
#
2. MSE : Mean Squared Error
- Advantages:
    - It penalizes larger errors more heavily, which is useful when large errors are especially undesirable.
    - It has a mathematical relationship with other quantities, such as variance and standard deviation, which can provide additional insights into the distribution of the errors.
- Disadvantages:
    - It is sensitive to outliers.
    - It is harder to interpret because it is expressed in squared units of the original data.
#
3. MAE : Mean Absolute Error
- Advantages:
    - Easy to interpret and understand.
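    - Less sensitive to outliers than RMSE and MSE, because every error contributes in proportion to its size.
- Disadvantages:
    - Like MSE and RMSE, it does not take into account the direction of the error (over-prediction versus under-prediction).
    - Because it does not emphasize large errors, it can understate the problem when a few large errors are especially costly.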
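
To make the outlier sensitivity concrete, the sketch below compares the three metrics on two made-up sets of residuals that differ only in one large error. The residual values and the helper name `summarize` are assumptions for illustration.

```python
import math


def summarize(errors):
    """Return MAE, MSE and RMSE for a list of residuals (actual - predicted)."""
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e ** 2 for e in errors) / n
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse)}


# Hypothetical residuals; the second list replaces one error with an outlier.
clean = summarize([1.0, -1.0, 2.0, -2.0, 1.0])
with_outlier = summarize([1.0, -1.0, 2.0, -2.0, 20.0])

for name in ("MAE", "RMSE", "MSE"):
    growth = with_outlier[name] / clean[name]
    print(f"{name}: {clean[name]:.2f} -> {with_outlier[name]:.2f} ({growth:.1f}x)")
```

In this example a single large error inflates MSE the most, RMSE less, and MAE the least, which matches the trade-offs listed above.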

#### Q6. Explain the concept of Lasso regularization. How does it differ from Ridge regularization, and when is it more appropriate to use?

- Lasso regularization
    - It stands for Least Absolute Shrinkage and Selection Operator regularization.
    - It reduces model complexity and prevents overfitting by adding a penalty term to the cost function.
    - The penalty term is proportional to the sum of the absolute values of the model coefficients (the L1 norm).
    - This penalty can drive some of the coefficients to exactly zero, effectively performing feature selection.
    - It is more appropriate when there are many features but only a few are expected to be relevant.
#
- Ridge regularization
    - It adds a penalty term proportional to the sum of the squared coefficients (the squared L2 norm).
    - It shrinks coefficients towards zero but, in general, does not set them to exactly zero.
    - Ridge regularization is more appropriate when all features are expected to be relevant.
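
The difference between the two penalties is easiest to see in a simplified setting. The sketch below assumes a single coefficient with an orthonormal design, where both problems have closed-form solutions: Lasso minimizes ½(b − β)² + λ|b| and Ridge minimizes ½(b − β)² + ½λb², with β the unregularized least-squares coefficient. The function names and the values are illustrative assumptions; real models with correlated features need an iterative solver.

```python
def lasso_shrink(beta_ols, lam):
    """Soft-thresholding: subtract lam from the magnitude, clipping at zero."""
    if beta_ols > lam:
        return beta_ols - lam
    if beta_ols < -lam:
        return beta_ols + lam
    return 0.0


def ridge_shrink(beta_ols, lam):
    """Proportional shrinkage: scale the coefficient by 1 / (1 + lam)."""
    return beta_ols / (1 + lam)


# Hypothetical unregularized coefficients and penalty strength.
ols_coefs = [3.0, -0.4, 0.2, -1.5]
lam = 0.5

lasso_coefs = [lasso_shrink(b, lam) for b in ols_coefs]
ridge_coefs = [ridge_shrink(b, lam) for b in ols_coefs]
print("Lasso:", lasso_coefs)
print("Ridge:", ridge_coefs)
```

Under these assumptions, Lasso zeroes out the two small coefficients while Ridge keeps all four, only smaller.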


#### Q7. How do regularized linear models help to prevent overfitting in machine learning? Provide an example to illustrate.

- Regularized linear models help prevent overfitting in machine learning.
- Regularization adds a penalty term to the cost function that the model is trying to minimize.
- The penalty term encourages the model to learn simpler patterns that generalize better to new data.
- L2 regularization, also known as ridge regression, is a popular form of regularization in linear models.
- Ridge regression adds a penalty term proportional to the squared magnitude of the model coefficients.
- This reduces the impact of the coefficients on the final predictions, making the model less sensitive to noise and fluctuations in the training data.
#
- As an example, consider a dataset with 1000 observations and 10 features, split into a training set of 800 observations and a test set of 200 observations. We fit a linear regression model without regularization and a ridge regression model with a regularization parameter of 0.1, then compare their errors on the training and test sets. If the unregularized model has a much lower training error but a higher test error than the ridge model, it is overfitting, and the ridge penalty is helping the model generalize.
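
A hedged sketch of that comparison is shown below, using NumPy and the closed-form ridge solution. The data is synthetic (random Gaussian features, random true coefficients, unit noise), there is no intercept term, and the helper names `fit_linear` and `mse` are made up for this example, so the numbers it prints say nothing about any real dataset.

```python
import numpy as np

# Synthetic data: 1000 observations, 10 features (assumed for illustration).
rng = np.random.default_rng(0)
n_samples, n_features = 1000, 10
X = rng.normal(size=(n_samples, n_features))
true_coef = rng.normal(size=n_features)
y = X @ true_coef + rng.normal(scale=1.0, size=n_samples)

# 800 / 200 train-test split.
X_train, X_test = X[:800], X[800:]
y_train, y_test = y[:800], y[800:]


def fit_linear(X, y, alpha=0.0):
    """Solve (X^T X + alpha * I) w = X^T y; alpha = 0 is ordinary least squares."""
    penalty = alpha * np.eye(X.shape[1])
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)


def mse(X, y, coef):
    """Mean squared error of the linear predictions X @ coef."""
    return float(np.mean((y - X @ coef) ** 2))


coef_ols = fit_linear(X_train, y_train)
coef_ridge = fit_linear(X_train, y_train, alpha=0.1)

for name, coef in (("OLS", coef_ols), ("Ridge", coef_ridge)):
    print(f"{name}: train MSE {mse(X_train, y_train, coef):.4f}, "
          f"test MSE {mse(X_test, y_test, coef):.4f}, "
          f"coef norm {np.linalg.norm(coef):.4f}")
```

With 800 training rows and only 10 features, the two models are likely to perform almost identically; the benefit of the penalty usually becomes visible when features are numerous, correlated, or the training set is small.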

#### Q8. Discuss the limitations of regularized linear models and explain why they may not always be the best choice for regression analysis.

- Regularized linear models have certain limitations that need to be taken into account.
    #
    - Flexibility
    - They have limited flexibility as they are based on linear assumptions and may not be able to capture complex non-linear relationships between features and the target variable. This means that their ability to accurately model data with high degrees of non-linearity can be restricted.
    #
    - Interpretability
    - While regularized linear models are easy to interpret and explain, the regularization penalty can make it harder to understand the specific impact of individual features on the model's predictions. This can hinder gaining a deeper understanding of underlying data relationships.
    #
    - Sensitivity to feature scaling
    - The penalty is applied to the size of the coefficients, and coefficient size depends on the units of each feature. If input features are not scaled to comparable ranges, the penalty affects them unevenly and model performance can suffer.
    #
    - Difficulty in choosing the regularization parameter
    - The regularization parameter can have a significant impact on model performance, and finding a good value usually requires a search over candidate values, typically with cross-validation.
    #
    - Time-consuming training
    - These can be computationally expensive for large datasets or high-dimensional feature spaces which makes training time-consuming.
#
- Given these limitations, regularized linear models might not always be suitable for regression analysis. For instance, if the data contains strong non-linearity or complex interactions between features, more flexible models such as decision trees or neural networks might be preferred. Similarly, an unregularized model such as simple linear regression might be preferable if interpretability is the priority. Ultimately, the choice of model depends on the specific requirements of the problem at hand and the characteristics of the data being analyzed.
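
The feature-scaling limitation can be illustrated with a one-feature ridge model without an intercept, whose coefficient has the closed form Σxy / (Σx² + α). In the sketch below the same feature is expressed in two hypothetical units (metres and kilometres); the function name and data are assumptions for the example.

```python
def ridge_coef_1d(x, y, alpha):
    """Closed-form ridge coefficient for one feature and no intercept."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / (sxx + alpha)


# Hypothetical data where y = 2 * x exactly when x is measured in metres.
x_metres = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
x_kilometres = [v / 1000 for v in x_metres]
alpha = 1.0

coef_m = ridge_coef_1d(x_metres, y, alpha)
coef_km = ridge_coef_1d(x_kilometres, y, alpha)

# Predictions for the same physical input (4 metres) in each unit system.
print("prediction, metres input:    ", coef_m * 4.0)
print("prediction, kilometres input:", coef_km * 0.004)
```

The same penalty strength barely affects the fit when the feature is in metres, but almost erases it when the feature is in kilometres, because the unregularized coefficient would have to be 1000 times larger. Standardizing features before fitting avoids this dependence on units.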

#### Q9. You are comparing the performance of two regression models using different evaluation metrics. Model A has an RMSE of 10, while Model B has an MAE of 8. Which model would you choose as the better performer, and why? Are there any limitations to your choice of metric?

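- From Model A's RMSE score of 10 and Model B's MAE of 8, we can conclude:
    - Model B's predictions differ from the observed data by 8 units on average.
    - Model A's typical error is about 10 units, with large errors weighted more heavily than small ones.
    - The two numbers are not directly comparable: for any single model, RMSE is always at least as large as MAE, so Model A's MAE could be below, equal to, or above 8.
#
- Based on the numbers available, I would tentatively choose Model B, since its reported average absolute error is smaller. A fair comparison, however, requires computing the same metric for both models on the same test set.
#
- The main limitation of this choice is that MAE does not reveal occasional large errors. If outliers exist in the data and large errors are costly, Model B's MAE could hide them, and Model A might turn out to be the better model once both are compared on RMSE.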
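
As a small illustration of how RMSE and MAE relate, the sketch below builds two made-up sets of residuals that both have an RMSE of exactly 10 but very different MAE values. The residuals are hypothetical and are not taken from Model A or Model B.

```python
import math


def mae(errors):
    """Mean absolute error of a list of residuals."""
    return sum(abs(e) for e in errors) / len(errors)


def rmse(errors):
    """Root mean squared error of a list of residuals."""
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))


# Both hypothetical residual sets have a mean squared error of 100.
spiky_errors = [0.0, 0.0, 0.0, 20.0]       # one large miss, otherwise exact
uniform_errors = [10.0, -10.0, 10.0, -10.0]  # every prediction off by 10

for name, errors in (("spiky", spiky_errors), ("uniform", uniform_errors)):
    print(f"{name}: RMSE = {rmse(errors):.1f}, MAE = {mae(errors):.1f}")
```

An RMSE of 10 is therefore compatible with an MAE of 5 as well as an MAE of 10, which is why an RMSE for one model and an MAE for another cannot settle which model is better.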

#### Q10. You are comparing the performance of two regularized linear models using different types of regularization. Model A uses Ridge regularization with a regularization parameter of 0.1, while Model B uses Lasso regularization with a regularization parameter of 0.5. Which model would you choose as the better performer, and why? Are there any trade-offs or limitations to your choice of regularization method?

- When it comes to selecting a regularization method for your model, there are two main options: Ridge and Lasso. Ridge regularization is effective when all features in the model are pertinent and impact the response variable similarly. In this case, Ridge regularization can reduce coefficient variance and enhance model stability. However, if there are irrelevant features in the model, Ridge regularization may not perform as well since it will still retain them with small coefficients.
#
- Lasso regularization is beneficial when some of the features in the model are insignificant or have minimal impact on the response variable. By removing these features from the model, Lasso regularization can improve its interpretability. However, if all features are relevant and have a similar effect on the response variable, Lasso may remove some of them from the model even if they are important.
#
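- Ultimately, the choice between Ridge and Lasso regularization depends on the specific problem and the goals of the analysis, and it cannot be made from the regularization parameters alone: a value of 0.1 for a Ridge penalty and 0.5 for a Lasso penalty are not on a comparable scale. To pick the better performer, both models should be evaluated with the same metric on the same held-out data, for example through cross-validation, while weighing the trade-offs described above.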