## Q1: Define overfitting and underfitting in machine learning. What are the consequences of each, and how can they be mitigated?

Overfitting and underfitting are the two main problems that degrade the performance of machine learning models.

**Overfitting** - Overfitting occurs when a machine learning model fits the training data too closely, trying to pass through every data point instead of following the general trend. As a result, the model also learns the noise and outliers present in the dataset, which reduces its accuracy on new data. **An overfitted model has low bias (training accuracy is high) and high variance (testing accuracy is low).**

![image.png](attachment:image.png)

**Underfitting** - Underfitting occurs when a machine learning model is not able to capture the underlying trend of the data. One common cause is stopping training too early in an attempt to avoid overfitting, so the model does not learn enough from the training data. As a result, it fails to find the best fit of the dominant trend, which reduces accuracy and produces unreliable predictions. **An underfitted model has high bias (training accuracy is low) and low variance (testing accuracy is similarly low, so the gap between the two is small).**

![image-2.png](attachment:image-2.png)
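
The two failure modes can be reproduced with a small experiment. The sketch below is only an illustration under assumed settings: the sine-shaped data, the noise level, and the k-nearest-neighbours regressor are not part of the answer above. With `k=1` the model memorises the training set (overfitting); with `k` equal to the training-set size it predicts one global average (underfitting).

```python
import math
import random


def make_data(n, seed, offset=0.0):
    """Noisy samples of y = sin(x) on an evenly spaced grid over [0, 2*pi)."""
    rng = random.Random(seed)
    xs = [2 * math.pi * (i + offset) / n for i in range(n)]
    ys = [math.sin(x) + rng.gauss(0, 0.3) for x in xs]
    return xs, ys


def knn_predict(train_x, train_y, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k


def mse(train_x, train_y, xs, ys, k):
    """Mean squared error of the k-NN model on the points (xs, ys)."""
    errors = [(knn_predict(train_x, train_y, x, k) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


train_x, train_y = make_data(40, seed=0)
# Test inputs lie halfway between training inputs, so they are unseen points.
test_x, test_y = make_data(40, seed=1, offset=0.5)

results = {}
for k in (1, 5, 40):
    results[k] = (
        mse(train_x, train_y, train_x, train_y, k),  # training error
        mse(train_x, train_y, test_x, test_y, k),    # test error
    )
    print(f"k={k:2d}  train MSE={results[k][0]:.3f}  test MSE={results[k][1]:.3f}")
```

A training error of zero paired with a clearly higher test error is the overfitting signature; a high error on both sets is the underfitting signature.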

**Techniques to Reduce Underfitting**:

1. Increase model complexity.
2. Increase the number of features by performing feature engineering.
3. Remove noise from the data.
4. Increase the number of epochs or increase the duration of training to get better results.

**Techniques to Reduce Overfitting**:

1. Increase training data.
2. Reduce model complexity.
3. Early stopping during the training phase (monitor the validation loss during training and stop as soon as it begins to increase).
4. **Ridge Regularization** and **Lasso Regularization.**
5. Use dropout for neural networks to tackle overfitting.

## Q2: How can we reduce overfitting? Explain in brief.

**Techniques to Reduce Overfitting**:

1. Increase training data - As the number of data points increases, the model tends to generalize rather than fit each individual data point.
2. Reduce model complexity - Using fewer features (for example through feature selection), a simpler model, or fewer parameters leaves the model less capacity to fit noise and outliers, and hence reduces overfitting.
3. Early stopping during the training phase (monitor the validation loss during training and stop as soon as it begins to increase) - This avoids over-training the model on the dataset and hence reduces overfitting.
4. **Ridge Regularization** - It adds an L2 penalty to the loss. The L2 penalty is the sum of the squares of the beta coefficients.  
    **Lasso Regularization** - It stands for Least Absolute Shrinkage and Selection Operator. It adds an L1 penalty to the loss. The L1 penalty is the sum of the absolute values of the beta coefficients.
5. Use dropout for neural networks - During training we randomly drop some neurons on each pass so that the network cannot rely too heavily on any single neuron, which reduces overfitting.
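
Early stopping from the list above can be sketched as a small loop over recorded validation losses. This is a minimal illustration, not a framework API: the function name `early_stopping_epoch`, the `patience` parameter, and the loss values are all assumptions made for the example.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the index of the epoch whose weights would be kept.

    Training stops once the validation loss has failed to improve for
    `patience` consecutive epochs; the best epoch seen so far is returned.
    """
    best_epoch = 0
    best_loss = float("inf")
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss = epoch, loss
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop training
    return best_epoch


# Hypothetical validation-loss curve: it improves, then rises as the model overfits.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61, 0.67]
best = early_stopping_epoch(val_losses, patience=3)
print(f"keep the weights from epoch {best}")
```

A patience greater than one tolerates small fluctuations in the validation loss, so training is not cut short by a single noisy epoch.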

## Q3: Explain underfitting. List scenarios where underfitting can occur in ML.

**Underfitting** - Underfitting occurs when a machine learning model is not able to capture the underlying trend of the data. One common cause is stopping training too early in an attempt to avoid overfitting, so the model does not learn enough from the training data. As a result, it fails to find the best fit of the dominant trend, which reduces accuracy and produces unreliable predictions. **An underfitted model has high bias (training accuracy is low) and low variance (testing accuracy is similarly low, so the gap between the two is small).**

#### Scenarios where underfitting can occur in Machine Learning are as follows:

1. When we use a linear model to fit a dataset that has a non-linear relationship between the input and output variables. In this case, the linear model is too simple to capture the non-linear patterns in the data and will underfit the training data.

2. When the training set has far fewer observations than variables and the model is therefore kept very simple or heavily constrained, the result may be an underfit, high-bias model. In such cases the model cannot find the relationship between the input data and the response variable because it is not complex enough to model the data. (With a flexible, unconstrained model, the same shortage of data usually shows up as overfitting instead.)

3. When the machine learning algorithm cannot find any pattern linking the input variables to the target, which may happen in a high-dimensional dataset with a large number of input variables. This could be due to insufficient model complexity, too few training observations to learn patterns from, or limited computing power that restricts the algorithm's ability to search for patterns in a high-dimensional space.
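
The first scenario is easy to reproduce. The sketch below fits a straight line to a purely quadratic target; the data and the helper names `fit_line` and `mse` are assumptions chosen for illustration. Because the inputs are symmetric around zero, the best line is flat and simply predicts the mean, while the same linear fit on an engineered `x**2` feature matches the target.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mse(xs, ys, slope, intercept):
    """Mean squared error of the fitted line on (xs, ys)."""
    return sum((intercept + slope * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = [float(x) for x in range(-5, 6)]
ys = [x ** 2 for x in xs]  # purely non-linear relationship

# Linear model on the raw input: too simple for this data, so it underfits.
slope, intercept = fit_line(xs, ys)
linear_mse = mse(xs, ys, slope, intercept)

# Same linear model on an engineered feature x**2: the trend is captured.
squared = [x ** 2 for x in xs]
slope_sq, intercept_sq = fit_line(squared, ys)
squared_mse = mse(squared, ys, slope_sq, intercept_sq)

print(f"raw feature:     slope={slope:.2f}  training MSE={linear_mse:.2f}")
print(f"squared feature: slope={slope_sq:.2f}  training MSE={squared_mse:.2f}")
```

The high error appears on the training data itself, which is what distinguishes underfitting from overfitting.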

## Q4: Explain the bias-variance tradeoff in machine learning. What is the relationship between bias and variance, and how do they affect model performance?

**Bias** - Bias is the difference between the model's average prediction and the correct value. High bias gives a large error on both training and testing data. An algorithm should therefore have low bias to avoid the problem of underfitting.

**Variance** - Variance is the variability of the model's prediction for a given data point when the model is trained on different training sets; it tells us how spread out the predictions are. A model with high variance fits the training data very closely and so does not predict accurately on data it has not seen before. As a result, such models perform very well on training data but have high error rates on test data.

#### Bias-Variance Tradeoff

If the algorithm is too simple (for example, a hypothesis with a linear equation), it may have high bias and low variance and thus be error-prone. If the algorithm is too complex (for example, a hypothesis with a high-degree equation), it may have high variance and low bias, and it will not perform well on new entries. The balance between these two conditions is known as the bias-variance tradeoff. It arises from model complexity: an algorithm cannot be more complex and less complex at the same time, so lowering one source of error tends to raise the other. On a graph, the ideal tradeoff looks like this.

![image.png](attachment:image.png)

We try to optimize the value of the total error for the model by using the Bias-Variance Tradeoff.

$$\text{Total Error} = \text{Bias}^2 + \text{Variance} + \text{Irreducible Error}$$

![image-2.png](attachment:image-2.png)
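
The decomposition can be checked numerically by training the same model on many resampled training sets and looking at its predictions at one fixed point. The sketch below is a simulation under assumed settings (a sine target, Gaussian noise, and two deliberately extreme models); none of these choices come from the answer above. Predictions are compared with the noise-free target, so the irreducible error term is left out and the remaining error should equal bias squared plus variance.

```python
import math
import random


def true_f(x):
    """Noise-free target function assumed for this simulation."""
    return math.sin(x)


def sample_training_set(rng, n=20, noise_sd=0.3):
    """Draw a fresh noisy training set of n points."""
    xs = [rng.uniform(0, 2 * math.pi) for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0, noise_sd) for x in xs]
    return xs, ys


def predict_mean(xs, ys, x0):
    """Very simple model: ignore x0 and predict the average target."""
    return sum(ys) / len(ys)


def predict_1nn(xs, ys, x0):
    """Very flexible model: copy the target of the nearest training point."""
    nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x0))
    return ys[nearest]


def bias_variance(predict, x0, n_sets=2000, seed=0):
    """Estimate bias^2, variance, and squared error to the truth at x0."""
    rng = random.Random(seed)
    preds = [predict(*sample_training_set(rng), x0) for _ in range(n_sets)]
    mean_pred = sum(preds) / n_sets
    bias_sq = (mean_pred - true_f(x0)) ** 2
    variance = sum((p - mean_pred) ** 2 for p in preds) / n_sets
    mse_to_truth = sum((p - true_f(x0)) ** 2 for p in preds) / n_sets
    return bias_sq, variance, mse_to_truth


x0 = math.pi / 2  # sin(x0) = 1, far from the overall average of the targets
results = {}
for name, model in [("mean", predict_mean), ("1-NN", predict_1nn)]:
    results[name] = bias_variance(model, x0)
    b, v, m = results[name]
    print(f"{name:5s} bias^2={b:.3f}  variance={v:.3f}  bias^2+variance={b + v:.3f}  MSE={m:.3f}")
```

The mean predictor should show the larger bias and the nearest-neighbour predictor the larger variance, mirroring the simple-versus-complex contrast described above.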



## Q5: Discuss some common methods for detecting overfitting and underfitting in machine learning models. How can you determine whether your model is overfitting or underfitting?

**Methods for detecting overfitting and underfitting are as follows:**

1. Testing the machine learning model on more data with a comprehensive representation of possible input values and types.
2. **K-fold cross validation**. The steps are as follows:
    1.    Split the dataset into K subsets (folds) of roughly equal size.
    2.    Keep one subset as the validation data and train the machine learning model on the remaining K-1 subsets.
    3.    Observe how the model performs on the validation subset and record its score.
    4.    Repeat so that each subset serves as the validation data once, then average the K scores.
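
The cross-validation procedure can be sketched in plain Python. This is a minimal version under stated assumptions: the helper names are hypothetical, the folds are contiguous (in practice the data is usually shuffled first), and the baseline "model" just predicts the mean of the training targets.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(xs, ys, k, fit, predict):
    """Return the validation MSE obtained on each of the k folds."""
    scores = []
    for val_idx in k_fold_indices(len(xs), k):
        held_out = set(val_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        model = fit(train_x, train_y)  # train on the other K-1 folds
        errors = [(predict(model, xs[i]) - ys[i]) ** 2 for i in val_idx]
        scores.append(sum(errors) / len(errors))
    return scores


# Hypothetical baseline model: always predict the mean of the training targets.
def fit_mean(train_x, train_y):
    return sum(train_y) / len(train_y)


def predict_mean(model, x):
    return model


xs = list(range(10))
ys = [2.0 * x for x in xs]
scores = cross_validate(xs, ys, k=5, fit=fit_mean, predict=predict_mean)
mean_score = sum(scores) / len(scores)
print("per-fold MSE:", [round(s, 2) for s in scores])
print("mean MSE:", round(mean_score, 2))
```

Comparing the averaged validation score with the training score gives the same diagnostic as a single hold-out split, but with less dependence on which points happened to be held out.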

## Determine whether the model is overfitting or underfitting

We can determine whether a predictive model is underfitting or overfitting the training data by comparing its prediction error on the training data with its prediction error on the evaluation data.

![image.png](attachment:image.png)

The model is underfitting the training data when the model performs poorly on the training data. This is because the model is unable to capture the relationship between the input examples (often called X) and the target values (often called Y). The model is overfitting your training data when you see that the model performs well on the training data but does not perform well on the evaluation data. This is because the model is memorizing the data it has seen and is unable to generalize to unseen examples.
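
This rule of thumb can be written as a small helper. The function name `diagnose_fit` and both thresholds are illustrative assumptions, not standard values; sensible cut-offs depend on the task and on the error metric.

```python
def diagnose_fit(train_error, val_error, acceptable_error=0.10, max_gap=0.05):
    """Label a model from its training and evaluation error rates.

    Both thresholds are illustrative assumptions, not standard values.
    """
    if train_error > acceptable_error:
        return "underfitting"  # poor even on data the model has seen
    if val_error - train_error > max_gap:
        return "overfitting"  # good on seen data, much worse on unseen data
    return "good fit"


print(diagnose_fit(0.30, 0.32))
print(diagnose_fit(0.02, 0.25))
print(diagnose_fit(0.05, 0.07))
```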

## Q6: Compare and contrast bias and variance in machine learning. What are some examples of high bias and high variance models, and how do they differ in terms of their performance?

|Points|Bias|Variance|
|---|---|---|
|**Definition**|Bias is the error introduced when the algorithm makes assumptions that are too simple, so the model does not fit the data properly.|Variance is the amount by which the estimate of the target function would change if different training data were used.|
|**What it measures**|The difference between predicted values and actual values.|How much the model's predictions deviate from their expected value.|
|**Learning**|The model cannot find patterns in the training dataset and fails for both seen and unseen data.|The model finds most patterns in the dataset and even learns from the unnecessary data or the noise. It fails with the unseen data points.|


**Examples of High Bias and High Variance models**:

High bias is equivalent to aiming in the wrong place. High variance is equivalent to having an unsteady aim. The image below depicts this analogy.

![image-2.png](attachment:image-2.png)

**Examples of High Bias Models** - Linear regression applied to non-linear data, simple models with too little training  
**Examples of High Variance Models** - Neural networks with too many neurons, decision trees with high depth

## Q7: What is regularization in machine learning, and how can it be used to prevent overfitting? Describe some common regularization techniques and how they work.

**Regularization** - Regularization is a technique that reduces generalization error by discouraging the model from fitting the training set too closely, thereby avoiding overfitting. It works by adding a penalty term to the loss function, so that training minimizes the adjusted loss. It mainly shrinks the coefficients of the features toward zero. In simple words, "In regularization, we reduce the magnitude of the feature coefficients while keeping the same number of features."

#### Techniques of Regularization
There are mainly two types of regularization techniques, which are given below:

**1. Ridge Regression** - Ridge regression is a type of linear regression in which a small amount of bias is introduced so that we get better long-term predictions. It is a regularization technique used to reduce the complexity of the model, and it is also called **L2 regularization.** In this technique, the cost function is altered by adding a penalty term to it. The amount of bias added to the model is called the ridge regression penalty. We calculate it by multiplying lambda by the sum of the squared weights of the individual features.

![image.png](attachment:image.png)
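
For a single feature with no intercept, the ridge cost has a closed-form minimiser, which makes the shrinking effect of lambda easy to see. The sketch below assumes that simplified one-feature setting and uses made-up data; `ridge_slope` and `ridge_cost` are hypothetical helper names.

```python
def ridge_slope(xs, ys, lam):
    """Minimise sum((y - b*x)**2) + lam * b**2 (one feature, no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def ridge_cost(xs, ys, b, lam):
    """Squared-error loss plus the L2 penalty lam * b**2."""
    residual = sum((y - b * x) ** 2 for x, y in zip(xs, ys))
    return residual + lam * b ** 2


# Made-up data that roughly follows y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

slopes = {lam: ridge_slope(xs, ys, lam) for lam in (0.0, 1.0, 10.0, 100.0)}
for lam, b in slopes.items():
    print(f"lambda={lam:6.1f}  slope={b:.4f}  cost={ridge_cost(xs, ys, b, lam):.4f}")
```

With `lam = 0` this reduces to ordinary least squares; larger values pull the slope toward zero without ever reaching it.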

**2. Lasso Regression** - Lasso regression is another regularization technique for reducing the complexity of the model. It stands for Least Absolute Shrinkage and Selection Operator. It is similar to ridge regression except that the penalty term contains the absolute values of the weights instead of their squares. It is also called L1 regularization. Because it uses absolute values, it can shrink a slope to exactly 0, whereas ridge regression can only shrink it close to 0.

![image-2.png](attachment:image-2.png)
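
The difference between shrinking to exactly zero and shrinking close to zero can be seen in the one-feature, no-intercept case, where both penalties have closed-form solutions. The sketch below assumes that simplified setting and uses a made-up, weakly informative feature; the helper names are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold, clipping at exactly zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_slope(xs, ys, lam):
    """Minimise sum((y - b*x)**2) + lam * abs(b) (one feature, no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return soft_threshold(sxy, lam / 2) / sxx


def ridge_slope(xs, ys, lam):
    """Minimise sum((y - b*x)**2) + lam * b**2 (one feature, no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


# Made-up feature that is only weakly related to the target.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [0.3, -0.1, 0.2, 0.1]

for lam in (0.0, 1.0, 4.0):
    print(f"lambda={lam:3.1f}  lasso={lasso_slope(xs, ys, lam):.4f}  ridge={ridge_slope(xs, ys, lam):.4f}")
```

Once lambda is large enough, the lasso slope is exactly zero, which effectively removes the feature, while the ridge slope stays small but non-zero.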