## Q1. Explain the concept of R-squared in linear regression models. How is it calculated, and what does it represent?


## Ans:- 

R-squared is a statistical measure that represents the proportion of variation in the dependent variable (the variable being predicted) that is explained by the independent variables (the variables used to make predictions) in a linear regression model. In other words, it indicates how well the model fits the data.
R-squared is calculated as one minus the ratio of the residual sum of squares (the squared differences between the actual and predicted values of the dependent variable) to the total sum of squares (the squared differences between the actual values and the mean of the dependent variable). Mathematically, it can be expressed as follows:

$$ R^2 = 1 - \frac{SS_{res}}{SS_{tot}} $$

where $SS_{res}$ is the sum of squared residuals (the differences between the actual and predicted values) and $SS_{tot}$ is the total sum of squares (the squared differences between the actual values and the mean of the dependent variable).

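For an ordinary least squares model with an intercept, evaluated on its training data, R-squared ranges from 0 to 1, with higher values indicating a better fit of the model to the data. An R-squared value of 1 indicates that all of the variation in the dependent variable is explained by the independent variables in the model, while an R-squared value of 0 indicates that the model does not explain any of the variation in the dependent variable. On held-out data, or for a model fitted without an intercept, R-squared can be negative, which means the model predicts worse than simply using the mean of the dependent variable.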

It is important to note that while a high R-squared value suggests a good fit of the model to the data, it does not necessarily mean that the model is the best possible model for the data or that the independent variables are causing the dependent variable. Other factors, such as multicollinearity, omitted variables, or outliers, can also affect the accuracy and validity of the model.

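$$ R^2 = 1 - \frac{SS_{res}}{SS_{tot}} = 1 - \frac{\sum_i (y_i - \hat y_i)^2}{\sum_i (y_i - \bar y)^2} $$

where $y_i$ is the actual value, $\hat y_i$ is the predicted value, and $\bar y$ is the mean of the actual values.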
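The definition above can be checked by hand on a tiny dataset. The following is a minimal sketch in plain Python; the function name `r_squared` and the sample values are made up for illustration and do not come from any particular library.

```python
def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot (illustrative helper, not a library function)."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: actual vs. predicted
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # Total sum of squares: actual vs. mean of the actual values
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical actual and predicted values
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 9.5]

# SS_res = 0.75 and SS_tot = 20, so R^2 = 1 - 0.75 / 20 = 0.9625
print(r_squared(y_true, y_pred))

# Predicting the mean everywhere explains none of the variation: R^2 = 0
print(r_squared(y_true, [6.0, 6.0, 6.0, 6.0]))
```

If scikit-learn is available, `sklearn.metrics.r2_score` should give the same value for the same inputs.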

---
---

## Q2. Define adjusted R-squared and explain how it differs from the regular R-squared.


## Ans:- 

Adjusted R-squared is a modified version of R-squared that has been adjusted for the number of predictors in the model. It increases when a new term improves the model more than would be expected by chance, and it decreases when a predictor improves the model by less than expected. Unlike R-squared, it can be negative when the model fits poorly relative to the number of predictors it uses. For a model with at least one predictor, it is never higher than R-squared.

Adding more independent variables or predictors to a regression model never decreases the R-squared value on the training data, which tempts model builders to keep adding variables. This can lead to overfitting and an unjustifiably high R-squared value. Adjusted R-squared helps show how much of an increase in R-squared reflects genuine explanatory power rather than simply the addition of more independent variables.

The most obvious difference between adjusted R-squared and R-squared is simply that adjusted R-squared accounts for the number of independent variables in the model, and R-squared does not.

R-squared also has other limitations. One of the most important is that it cannot be used to determine whether the coefficient estimates and predictions are biased. Furthermore, in multiple linear regression, R-squared cannot tell us which regression variable is more important than another.

$$ R^2_{adj} = 1 - \frac{(1-R^2)(n-1)}{n-k-1} $$

where $n$ is the number of observations and $k$ is the number of predictors.
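The adjustment is easy to compute once R-squared is known. The sketch below assumes `n` is the number of observations and `k` is the number of predictors; the function name `adjusted_r_squared` and the input values are hypothetical.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1).

    r2: ordinary R-squared, n: observations, k: predictors (assumed meanings).
    """
    if n - k - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical model: R^2 = 0.90 with 21 observations and 4 predictors
print(adjusted_r_squared(0.90, n=21, k=4))   # about 0.875, below the raw 0.90

# A weak fit with many predictors relative to the data can go negative
print(adjusted_r_squared(0.05, n=10, k=3))   # about -0.425
```

The second call illustrates that the adjusted value is not confined to the range from 0 to 1.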

---
---

## Q3. When is it more appropriate to use adjusted R-squared?


## Ans:- 

Adjusted R-squared is a modified version of the R-squared statistic that takes into account the number of predictors in a model. It is more appropriate to use adjusted R-squared when comparing models with different numbers of predictors or when the number of predictors is relatively large.

The adjusted R-squared penalizes the addition of unnecessary predictors to the model, whereas the standard R-squared can artificially increase as more predictors are added, even if they do not improve the model's predictive power. Therefore, the adjusted R-squared provides a more accurate assessment of a model's goodness of fit.

In general, it is advisable to use adjusted R-squared when evaluating regression models with multiple predictors, especially if the number of predictors is large or if some predictors are correlated with each other. However, it is important to keep in mind that the adjusted R-squared has its limitations and should be used in conjunction with other measures of model fit and interpretability.
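As a worked illustration with made-up numbers, suppose two models are fitted to the same $n = 50$ observations. Model A uses $k = 3$ predictors and reaches $R^2 = 0.800$; Model B uses $k = 10$ predictors and reaches $R^2 = 0.805$. Applying the adjusted R-squared formula to each:

$$ R^2_{adj}(A) = 1 - \frac{(1-0.800)(50-1)}{50-3-1} = 1 - \frac{9.8}{46} \approx 0.787 $$

$$ R^2_{adj}(B) = 1 - \frac{(1-0.805)(50-1)}{50-10-1} = 1 - \frac{9.555}{39} = 0.755 $$

Model B has the higher R-squared, but Model A has the higher adjusted R-squared, which suggests that the seven extra predictors in this hypothetical case do not add enough explanatory power to justify their inclusion.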

---
---

## Q4. What are RMSE, MSE, and MAE in the context of regression analysis? How are these metrics calculated, and what do they represent?


## Ans:- 

### Mean Absolute Error (MAE) :
1. MAE is the average of the absolute differences between the actual and predicted values in the dataset. It measures the average magnitude of the residuals, ignoring their sign.
2. Formula for MAE :

$$ MAE = \frac{1}{n} \sum_{i=1}^n |y_i-\hat y_i| $$
 
where  $y_i$ is actual value and $\hat y_i$ is predicted value

### Mean Squared Error (MSE) :
1. MSE is the average of the squared differences between the actual and predicted values in the dataset. When the residuals have zero mean, it equals the variance of the residuals.
2. Formula for MSE : 

$$ MSE =  \frac{1}{n} \sum_{i=1}^n (y_i-\hat y_i)^2 $$
 
where  $y_i$ is actual value and $\hat y_i$ is predicted value

### Root Mean Squared Error (RMSE) :
1. RMSE is the square root of the mean squared error. When the residuals have zero mean, it equals the standard deviation of the residuals, and it is expressed in the same units as the dependent variable.
2. Formula for RMSE : 

$$ RMSE = \sqrt{MSE} = \sqrt{\frac{1}{n} \sum_{i=1}^n (y_i-\hat y_i)^2} $$

where  $y_i$ is actual value and $\hat y_i$ is predicted value
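The three formulas can be computed side by side on the same predictions. This is a minimal sketch in plain Python; the helper names `mae`, `mse`, and `rmse` and the sample values are chosen here for illustration.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average of |y_i - y_hat_i|."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error: average of (y_i - y_hat_i)^2."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


# Hypothetical actual and predicted values; the absolute errors are 0.5, 0.5, 0, 2
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 11.0]

print(mae(y_true, y_pred))   # 0.75
print(mse(y_true, y_pred))   # 1.125
print(rmse(y_true, y_pred))  # about 1.06
```

Note that the RMSE is larger than the MAE here because the single error of 2 is weighted more heavily after squaring.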

---
---

## Q5. Discuss the advantages and disadvantages of using RMSE, MSE, and MAE as evaluation metrics in regression analysis.


## Ans:- 

Regression analysis is a statistical technique that aims to estimate the relationship between a dependent variable and one or more independent variables. To evaluate the performance of a regression model, it is common to use metrics such as Root Mean Squared Error (RMSE), Mean Squared Error (MSE), and Mean Absolute Error (MAE). Each of these metrics has its own advantages and disadvantages, which we will discuss below:

### Advantages of RMSE:
* RMSE is a popular evaluation metric because it has a clear interpretation in the same units as the dependent variable. This means that it is easy to understand the magnitude of the error in the context of the problem being solved.
* RMSE gives more weight to large errors than small errors because the errors are squared before they are averaged. This can be useful if we want to penalize larger errors more heavily.
* RMSE is differentiable, which means it can be used in optimization algorithms to tune the parameters of a regression model.

### Disadvantages of RMSE:
* RMSE can be sensitive to outliers because the errors are squared before they are averaged. This means that a single large error can have a significant impact on the metric.
* RMSE has a lower bound of 0 but no upper bound, and its scale depends on the units of the dependent variable. This means that it can be difficult to interpret the absolute goodness of fit of a model based on RMSE alone.

### Advantages of MSE:
* MSE is a well-known metric that is widely used in many areas of statistics.
* MSE penalizes errors according to their magnitude regardless of their direction, and it is differentiable everywhere, which makes it a convenient loss function for optimization.
* MSE is relatively easy to calculate.

### Disadvantages of MSE:
* MSE is expressed in the squared units of the dependent variable, which makes its value harder to interpret directly than RMSE or MAE.
* MSE is sensitive to outliers.

### Advantages of MAE:
* MAE is less sensitive to outliers than RMSE and MSE.
* MAE is easier to interpret than RMSE and MSE because it is based on absolute errors.

### Disadvantages of MAE:
* MAE does not take into account the direction of errors, which can be a limitation in situations where the sign of the error is important.
* MAE puts equal weight on all errors, which can make it less useful in situations where large errors are more important than small errors.

Overall, the choice of which metric to use for evaluating regression models depends on the specific context of the problem being studied. For example, if the consequences of large errors are significant, then RMSE may be the best metric to use. If the data contains outliers, then MAE may be more appropriate. It is important to consider the advantages and disadvantages of each metric when selecting an appropriate evaluation metric.
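The difference in outlier sensitivity can be seen on a small set of residuals. The sketch below uses made-up residuals and two illustrative helpers that work directly on the errors rather than on actual and predicted values.

```python
import math


def mae_from_residuals(residuals):
    """Average absolute residual."""
    return sum(abs(r) for r in residuals) / len(residuals)


def rmse_from_residuals(residuals):
    """Square root of the average squared residual."""
    return math.sqrt(sum(r ** 2 for r in residuals) / len(residuals))


# Hypothetical residuals: every prediction is off by exactly 1
clean = [1.0, -1.0, 1.0, -1.0, 1.0]
# Same residuals, but the last prediction is off by 11 (one outlier)
with_outlier = [1.0, -1.0, 1.0, -1.0, 11.0]

print(mae_from_residuals(clean), rmse_from_residuals(clean))                # 1.0 1.0
print(mae_from_residuals(with_outlier), rmse_from_residuals(with_outlier))  # 3.0 5.0
```

A single outlier triples the MAE but multiplies the RMSE by five, which is why RMSE suits problems where large errors are especially costly and MAE suits data with outliers that should not dominate the score.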

---
---

## Q6. Explain the concept of Lasso regularization. How does it differ from Ridge regularization, and when is it more appropriate to use?


## Ans:- 

Lasso regularization is a technique used in regression analysis to reduce overfitting by adding a penalty term to the cost function. The penalty term is the L1 norm of the coefficients, which shrinks the less important coefficients towards zero and leads to sparse models.

In contrast, Ridge regularization uses the L2 norm of the coefficients to add a penalty term to the cost function, which shrinks all coefficients towards zero. Ridge regularization typically results in models with all non-zero coefficients, whereas Lasso regularization can result in models with some zero coefficients.

The main difference between Lasso and Ridge regularization is the type of penalty term used. Lasso regularization can be more appropriate when the underlying relationship is sparse, i.e., when there are many features but only a few of them are relevant to the response variable. In this case, Lasso regularization can effectively identify and remove the irrelevant features, resulting in a more interpretable and efficient model.

On the other hand, Ridge regularization can be more appropriate when all the features are potentially relevant to the response variable. In this case, Ridge regularization can shrink all coefficients towards zero, resulting in a more stable and generalizable model.

Overall, the choice between Lasso and Ridge regularization depends on the specific context of the problem being studied, and both techniques can be useful for reducing overfitting and improving the accuracy of regression models.

### Equation for Ridge Regression :
$$ L_{ridge}(w) = \sum_{i=1}^n (y_i - w^T x_i)^2 + \lambda \sum_{j=1}^p w_j^2 $$
### Equation for Lasso Regression :
$$ L_{lasso}(w) = \sum_{i=1}^n (y_i - w^T x_i)^2 + \lambda \sum_{j=1}^p |w_j| $$
#### where 
* w is the weight vector, 
* $x_i$ is the feature vector for the i-th data point, 
* $y_i$ is the true label for the i-th data point, 
* $\lambda $ is the regularization strength, and 
* p is the number of features.
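To see why the L1 penalty can produce exact zeros while the L2 penalty only shrinks, it helps to look at the simplest possible case. The sketch below assumes a single feature and no intercept, where minimizing the two losses above has a closed form: Ridge divides by a larger denominator, while Lasso subtracts a fixed amount (soft-thresholding) and clips at zero. The function names and the data are hypothetical.

```python
import math


def ridge_1d(x, y, lam):
    """Minimizer of sum((y_i - w*x_i)^2) + lam * w^2 for one feature."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / (sxx + lam)


def lasso_1d(x, y, lam):
    """Minimizer of sum((y_i - w*x_i)^2) + lam * |w| for one feature.

    Soft-thresholding: shrink |sum(x*y)| by lam / 2 and clip at zero.
    """
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    shrunk = max(abs(sxy) - lam / 2.0, 0.0)
    return math.copysign(shrunk, sxy) / sxx


# Hypothetical weak feature: the unregularized slope is about 0.1
x = [1.0, 2.0, 3.0]
y = [0.1, 0.2, 0.3]

for lam in (0.0, 1.0, 4.0):
    print(lam, ridge_1d(x, y, lam), lasso_1d(x, y, lam))
```

With `lam = 4.0`, the Lasso coefficient is exactly zero, whereas the Ridge coefficient is smaller than the unregularized slope but still non-zero. With several correlated features there is no such simple closed form for Lasso, so this is only an illustration of the mechanism.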

---
---

## Q7. How do regularized linear models help to prevent overfitting in machine learning? Provide an example to illustrate.

## Ans:- 

Regularized linear models, such as Ridge regression and Lasso regression, help prevent overfitting in machine learning by adding a penalty term to the cost function that limits the magnitude of the model coefficients. This penalty term discourages the model from assigning too much importance to any one feature, thereby reducing the model's sensitivity to noise and improving its generalization performance.

For example, let's consider a regression problem where we want to predict the sale price of houses based on a set of features, such as the number of bedrooms, the size of the lot, and the location of the house. We have a training set of 1000 houses, and we want to build a model that can accurately predict the sale price of new houses.

Without regularization, we might train a linear regression model that fits the training set very closely, with a high coefficient assigned to each feature. However, this model is likely to overfit to the training set, meaning it will perform poorly on new data that it has not seen before.
To prevent overfitting, we can add a penalty term to the cost function that limits the magnitude of the model coefficients. For example, we can use Ridge regression or Lasso regression to add an L2 or L1 penalty term, respectively, to the cost function.

With regularization, the model is encouraged to assign lower coefficients to less important features, which helps to prevent overfitting and improve its generalization performance. For example, in Lasso regression, the L1 penalty term can force some coefficients to be exactly zero, effectively removing those features from the model.
In summary, regularized linear models help prevent overfitting by adding a penalty term to the cost function that limits the magnitude of the model coefficients. By encouraging the model to assign lower coefficients to less important features, these models can improve the generalization performance of the model and reduce its sensitivity to noise.
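A small numerical experiment can make the effect of the penalty concrete. The sketch below is an assumption-laden toy, not the housing example itself: it uses two nearly identical made-up features (think of the same quantity measured twice with slight noise), omits the intercept for brevity, and solves the two-feature Ridge problem $(X^T X + \lambda I) w = X^T y$ directly. The function name `fit_ridge_2d` is hypothetical.

```python
def fit_ridge_2d(x1, x2, y, lam):
    """Solve (X^T X + lam * I) w = X^T y for two features, no intercept.

    lam = 0 gives ordinary least squares; lam > 0 gives Ridge regression.
    """
    a = sum(v * v for v in x1) + lam
    b = sum(u * v for u, v in zip(x1, x2))
    d = sum(v * v for v in x2) + lam
    c1 = sum(u * t for u, t in zip(x1, y))
    c2 = sum(u * t for u, t in zip(x2, y))
    det = a * d - b * b  # determinant of the 2x2 system
    w1 = (d * c1 - b * c2) / det
    w2 = (a * c2 - b * c1) / det
    return w1, w2


# Hypothetical data: x2 is almost a copy of x1, and y is roughly 2 * x1
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.01, 1.99, 3.02, 3.98]
y = [2.1, 3.9, 6.2, 7.8]

ols = fit_ridge_2d(x1, x2, y, lam=0.0)
ridge = fit_ridge_2d(x1, x2, y, lam=1.0)

print("no penalty:", ols)    # roughly (-8, 10): large, offsetting coefficients
print("ridge     :", ridge)  # both close to 1: the effect is shared evenly
```

Without the penalty, the model fits the noise by assigning large coefficients of opposite sign to the two near-duplicate features, which would generalize poorly if the features diverged slightly on new data. The L2 penalty makes such large coefficients expensive, so the fit settles on two moderate values whose sum is close to 2.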

---
---

## Q8. Discuss the limitations of regularized linear models and explain why they may not always be the best choice for regression analysis.


## Ans:- 

#### While regularized linear models can be effective in preventing overfitting and improving generalization performance, they have several limitations and may not always be the best choice for regression analysis.
__1. Biased predictions for rare or important features:__ Regularization deliberately shrinks coefficients towards zero, which introduces bias into the coefficient estimates. A feature that is genuinely important but rarely active in the data, or that is measured on a small scale, can be shrunk too aggressively, so its effect is underestimated. This can be problematic in applications where rare or important features are critical to the prediction task.

__2. Difficulty in choosing the regularization parameter:__ The regularization parameter, which determines the strength of the penalty term, must be carefully chosen to balance the trade-off between fitting the training data and generalizing to new data. Choosing an inappropriate value for the regularization parameter can result in underfitting or overfitting.

__3. Nonlinear relationships:__ Regularized linear models can only capture linear relationships between the features and the target variable. In cases where the relationship is nonlinear, other models, such as decision trees or neural networks, may be more appropriate.

__4. Outliers:__ Regularized linear models are sensitive to outliers in the data, as these can have a large effect on the model coefficients. Outliers can also affect the choice of the regularization parameter, as they can cause the model to underfit or overfit.

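__5. Interpretability:__ Regularized linear models may be harder to interpret than non-regularized linear models. The shrunken coefficients are biased, their size depends on how the features are scaled, and the usual standard errors and p-values no longer apply directly. With Lasso, when several features are strongly correlated, which of them is kept and which is set to exactly zero can be somewhat arbitrary. This can make it difficult to understand the importance of individual features in the prediction task.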

#### In summary, regularized linear models have several limitations that must be carefully considered when choosing an appropriate regression model. While they can be effective in preventing overfitting and improving generalization performance, they may not always be the best choice for regression analysis, particularly in cases where rare or important features are critical to the prediction task or where the relationship between the features and the target variable is nonlinear.

---
---

## Q9. You are comparing the performance of two regression models using different evaluation metrics. Model A has an RMSE of 10, while Model B has an MAE of 8. Which model would you choose as the better performer, and why? Are there any limitations to your choice of metric?


## Ans:- 

---
---

## Q10. You are comparing the performance of two regularized linear models using different types of regularization. Model A uses Ridge regularization with a regularization parameter of 0.1, while Model B uses Lasso regularization with a regularization parameter of 0.5. Which model would you choose as the better performer, and why? Are there any trade-offs or limitations to your choice of regularization method?

## Ans:- 

---
---