In machine learning, model performance is evaluated against two important criteria:

1. Accuracy: how well the model predicts the correct target value.

2. Generalisation: how well the model performs on unseen data as well as on the data it was trained on. The model is fit on a training set and evaluated on a separate test set.
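As a minimal sketch of the two criteria above, the snippet below fits a straight line on a training split and reports the error on both the training and the held-out test split. The synthetic data, the 75/25 split, and the helper name `mse` are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): y = 2x + 1 plus Gaussian noise.
x = rng.uniform(0, 10, size=200)
y = 2 * x + 1 + rng.normal(0, 1, size=200)

# Hold out 25% of the samples as a test set the model never sees while fitting.
idx = rng.permutation(len(x))
train_idx, test_idx = idx[:150], idx[150:]

# Fit a straight line on the training portion only.
coeffs = np.polyfit(x[train_idx], y[train_idx], deg=1)

def mse(x_part, y_part):
    """Mean squared error of the fitted line on the given samples."""
    return float(np.mean((np.polyval(coeffs, x_part) - y_part) ** 2))

train_mse = mse(x[train_idx], y[train_idx])
test_mse = mse(x[test_idx], y[test_idx])
print(f"train MSE: {train_mse:.3f}, test MSE: {test_mse:.3f}")
```

When the two errors are both low and close to each other, as should be the case here because the model family matches the data, the model is both accurate and generalising.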

# Underfitting (High Bias and Low Variance)

Underfitting means the model scores poorly on both the training data and the test data.

- It happens with models that are too simple to learn the complex relationships in the training data.

- A high-bias model tends to have high error on the training data as well as on the test data.


# Overfitting (High Variance and Low Bias)

Overfitting means the model scores well on the training data but poorly on the test data. This implies that the model does not generalise well.

- It happens with complex models that over-learn the patterns in the training data and end up modelling its noise.

Example of underfitting and overfitting:

Applying linear regression to non-linear data underfits, while an overly flexible model fitted to the same data overfits.

![image.png](attachment:image.png)
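The same contrast can be reproduced numerically. The sketch below fits polynomials of degree 1, 3 and 15 to noisy samples of a sine curve and compares training and test error. The sine ground truth, the noise level, and the chosen degrees are assumptions picked for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(42)

def true_fn(x):
    # Hypothetical non-linear ground truth, chosen only for illustration.
    return np.sin(2 * np.pi * x)

# Small noisy training set and a larger test set from the same distribution.
x_train = rng.uniform(0, 1, size=20)
y_train = true_fn(x_train) + rng.normal(0, 0.3, size=20)
x_test = rng.uniform(0, 1, size=200)
y_test = true_fn(x_test) + rng.normal(0, 0.3, size=200)

train_mse, test_mse = {}, {}
for degree in (1, 3, 15):
    # Least-squares polynomial fit of the given degree on the training set.
    model = Polynomial.fit(x_train, y_train, deg=degree)
    train_mse[degree] = float(np.mean((model(x_train) - y_train) ** 2))
    test_mse[degree] = float(np.mean((model(x_test) - y_test) ** 2))
    print(f"degree {degree:2d}: train MSE {train_mse[degree]:.3f}, "
          f"test MSE {test_mse[degree]:.3f}")
```

The degree 1 model should show high error on both sets (underfitting), while the degree 15 model should show a very low training error and a clearly higher test error (overfitting).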



A right-fit model is neither underfit nor overfit. It is a generalised model that performs well on both seen and unseen data: a **low bias and low variance model**.


A general trend can be seen in the examples above:

- Linear machine learning algorithms tend to underfit when the underlying relationship is non-linear. Examples: Linear Regression, Logistic Regression


- Nonlinear machine learning algorithms tend to overfit when left unconstrained. Examples: Decision Tree, SVM, Neural Networks



## Bias

**Bias is the error between the model's average prediction and the true value, caused by the simplifying assumptions the model makes.**

- Low Bias: A low bias value means fewer assumptions are made about the form of the target function. In this case, the model will closely match the training dataset.


- High Bias: A high bias value means more assumptions are made about the form of the target function. In this case, the model will not match the training dataset closely. For example, a linear regression model may have high bias if the data has a non-linear relationship.

$$
\text{Bias}(\hat{f}(x)) = \mathbb{E}[\hat{f}(x)] - f(x)
$$

where,

- $\hat{f}(x)$ represents the model's prediction for input $x$.
- $\mathbb{E}[\hat{f}(x)]$ denotes the expected value (average) of the model's predictions over different training sets.
- $f(x)$ is the true function we're trying to approximate.
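The expectation in the bias formula can be approximated by simulation: draw many training sets, fit the model on each, and average the predictions at a fixed point. The sketch below does this for a straight line fitted to a quadratic. The true function `true_fn`, the point `x0`, and all sample sizes are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    # Hypothetical true function f(x): a quadratic that a straight line cannot represent.
    return x ** 2

x0 = 0.0            # point at which the bias is estimated
n_datasets = 500    # number of independent training sets
n_samples = 30      # size of each training set

predictions = np.empty(n_datasets)
for i in range(n_datasets):
    x = rng.uniform(-1, 1, size=n_samples)
    y = true_fn(x) + rng.normal(0, 0.1, size=n_samples)
    slope, intercept = np.polyfit(x, y, deg=1)   # fit f_hat on this training set
    predictions[i] = slope * x0 + intercept      # f_hat(x0)

# Bias = E[f_hat(x0)] - f(x0), with the expectation replaced by a sample mean.
bias = predictions.mean() - true_fn(x0)
print(f"estimated bias at x0 = {x0}: {bias:.3f}")
```

The estimate should land roughly near 1/3, because the best straight-line fit to $x^2$ on $[-1, 1]$ is approximately the constant 1/3 while the true value at $x_0 = 0$ is 0. No amount of extra training sets removes this gap, which is what makes it bias rather than noise.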

## Variance

In statistics, variance measures how far values spread from their mean. In machine learning, variance is the amount by which **a model's predictions change when it is trained on different subsets of the training data.** In other words, it measures how sensitive the model is to the particular training set it was given.


- Low variance: **Low variance means that the model is less sensitive to changes in the training data** and produces consistent estimates of the target function across different subsets of data from the same distribution. Combined with high bias, this is the case of underfitting, where the model performs poorly on both training and test data.


- High variance: High variance means that the model is very sensitive to changes in the training data, so its estimate of the target function changes significantly when it is trained on different subsets of data from the same distribution. This is the case of overfitting, where the model performs well on the training data but poorly on new, unseen test data. It fits the training data so closely that it fails on new data.

$$
\text{Var}(\hat{f}(x)) = \mathbb{E}[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2]
$$
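The variance formula can be approximated the same way: refit the model on many training sets and measure how much the prediction at a fixed point moves. The sketch below compares a simple model (degree 1) with a flexible one (degree 9). The sine ground truth, the point `x0`, and the degrees are assumptions for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(1)

def true_fn(x):
    # Hypothetical true function, chosen only for illustration.
    return np.sin(2 * np.pi * x)

x0 = 0.95           # point at which the variance is estimated
n_datasets = 300    # number of independent training sets
n_samples = 30      # size of each training set

preds = {1: [], 9: []}   # polynomial degree -> predictions f_hat(x0)
for _ in range(n_datasets):
    x = rng.uniform(0, 1, size=n_samples)
    y = true_fn(x) + rng.normal(0, 0.3, size=n_samples)
    for degree in preds:
        model = Polynomial.fit(x, y, deg=degree)
        preds[degree].append(model(x0))

# Var(f_hat(x0)) = E[(f_hat(x0) - E[f_hat(x0)])^2], estimated from the samples.
variance = {degree: float(np.var(values)) for degree, values in preds.items()}
for degree, value in variance.items():
    print(f"degree {degree}: estimated variance at x0 = {value:.4f}")
```

The flexible model's prediction should vary far more from one training set to the next than the straight line's does.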

## Dealing with Underfitting / Lowering High Bias

1. Increase the number of features: adding more informative features increases the complexity of the model and lets it capture more of the signal.


2. Use non-linear algorithms, for example Polynomial Regression or a kernel function in an SVM. For a deep neural network, make the model more complex by increasing the number of hidden layers. More specialised models can also help, such as a CNN for image processing or an RNN for sequence learning.


3. Use non-parametric algorithms.

4. Increase the size of the training data: more examples give the model more to learn from. Note that a model that is too simple usually stays biased however much data it sees, so this step works best together with the ones above.
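The first point in the list above can be sketched in a few lines: adding an $x^2$ feature to a linear model lets it capture curvature that the plain linear model misses. The quadratic data-generating process and the helper `fit_and_score` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption): a quadratic relationship with a little noise.
x = rng.uniform(-2, 2, size=200)
y = 1 + 2 * x + 3 * x ** 2 + rng.normal(0, 0.2, size=200)

def fit_and_score(features):
    """Least-squares fit on the given feature matrix; returns the training MSE."""
    weights, *_ = np.linalg.lstsq(features, y, rcond=None)
    return float(np.mean((features @ weights - y) ** 2))

ones = np.ones_like(x)
linear_features = np.column_stack([ones, x])             # intercept + x
quadratic_features = np.column_stack([ones, x, x ** 2])  # adds the x^2 feature

mse_linear = fit_and_score(linear_features)
mse_quadratic = fit_and_score(quadratic_features)
print(f"training MSE without x^2: {mse_linear:.3f}")
print(f"training MSE with x^2:    {mse_quadratic:.3f}")
```

The high training error of the linear model is the signature of high bias, and it should drop sharply once the missing feature is added.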

## Dealing with Overfitting / Lowering High Variance

1. Use more training data so that the model learns the underlying patterns rather than the noise and becomes more generalised.


2. Use regularisation techniques, for example L1, L2, Dropout, Early Stopping (in the case of neural networks), etc.


3. Tune hyperparameters to avoid overfitting, for example a higher value of K in KNN, C and gamma for an SVM, or the maximum depth of a decision tree.


4. Use fewer features, selected manually, with feature selection algorithms, or automatically through regularisation (L1 can shrink coefficients to exactly zero, while L2 only shrinks them towards zero).


5. Reduce the complexity of the model, for example by lowering the polynomial degree in polynomial or logistic regression, or by reducing the number of parameters in a neural network.


6. Use validation techniques such as cross-validation and stratified cross-validation to detect overfitting and to choose model settings that generalise.

7. Ensemble methods: these combine multiple models to improve generalisation performance. Bagging, boosting, and stacking are common ensemble methods. Bagging in particular reduces variance by averaging many models trained on resampled data.
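As one illustration of the regularisation point in the list above, the sketch below fits a degree 9 polynomial with and without an L2 (ridge) penalty on many training sets and compares how much the prediction at a fixed point varies. The closed-form solver `fit_ridge`, the penalty strength `lam = 1.0`, and the data-generating process are assumptions for illustration, not a recommended setting.

```python
import numpy as np

rng = np.random.default_rng(2)

def true_fn(x):
    # Hypothetical true function, chosen only for illustration.
    return np.sin(np.pi * x)

def design(x, degree=9):
    """Polynomial feature matrix with columns 1, x, x^2, ..., x^degree."""
    return np.vander(np.atleast_1d(x), degree + 1, increasing=True)

def fit_ridge(x, y, lam):
    """Closed-form ridge fit; lam = 0 gives ordinary least squares."""
    X = design(x)
    penalty = lam * np.eye(X.shape[1])
    penalty[0, 0] = 0.0  # leave the intercept unpenalised
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)

x0 = 0.9
preds = {0.0: [], 1.0: []}   # penalty strength -> predictions at x0
for _ in range(200):
    x = rng.uniform(-1, 1, size=20)
    y = true_fn(x) + rng.normal(0, 0.3, size=20)
    for lam in preds:
        weights = fit_ridge(x, y, lam)
        preds[lam].append((design(x0) @ weights).item())

variance = {lam: float(np.var(values)) for lam, values in preds.items()}
for lam, value in variance.items():
    print(f"lambda = {lam}: variance of prediction at x0 = {value:.4f}")
```

The penalised model should be noticeably more stable across training sets. The price is some added bias, so the penalty strength is itself a hyperparameter to tune.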

## Bias Variance Tradeoff


**To build a good model, we need to find a good balance between bias and variance such that it minimizes the total error.**

If a model is simple and has few features, it may have high bias and low variance. In contrast, if a model has a huge number of features, it may have low bias and high variance. So, as model complexity increases, bias tends to decrease and variance tends to increase, and vice versa. Ideally we want a model with both low bias and low variance, but because lowering one usually raises the other, we have to trade them off.
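The "total error" being minimised can be written out for squared-error loss. Assuming the observations follow $y = f(x) + \varepsilon$, where $\varepsilon$ is zero-mean noise with variance $\sigma^2$ (the symbols $\varepsilon$ and $\sigma^2$ are introduced here and are not defined elsewhere in these notes), the expected prediction error at a point $x$ decomposes as:

$$
\mathbb{E}\left[(y - \hat{f}(x))^2\right] = \left(\text{Bias}(\hat{f}(x))\right)^2 + \text{Var}(\hat{f}(x)) + \sigma^2
$$

The first two terms depend on the model and move in opposite directions as complexity changes. The last term is irreducible noise that no model can remove.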

This tradeoff is often visualized as a curve, known as the validation error curve:

<img src="https://cdn.analyticsvidhya.com/wp-content/uploads/2020/08/eba93f5a75070f0fbb9d86bec8a009e9.png"  width="400"/> 
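A curve of this kind can be computed by sweeping model complexity and measuring validation error at each setting. The sketch below uses 5-fold cross-validation over polynomial degrees 1 to 10 and picks the degree with the lowest average validation error. The sine data, the number of folds, and the helper `cv_mse` are assumptions for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(3)

# Synthetic data (an assumption): noisy samples of one period of a sine curve.
x = rng.uniform(0, 1, size=60)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, size=60)

# Split the shuffled sample indices into 5 folds.
folds = np.array_split(rng.permutation(len(x)), 5)

def cv_mse(degree):
    """Average validation MSE of a polynomial of the given degree over the folds."""
    scores = []
    for k, val_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != k])
        model = Polynomial.fit(x[train_idx], y[train_idx], deg=degree)
        scores.append(np.mean((model(x[val_idx]) - y[val_idx]) ** 2))
    return float(np.mean(scores))

degrees = list(range(1, 11))
cv_errors = {d: cv_mse(d) for d in degrees}
best_degree = min(cv_errors, key=cv_errors.get)

for d in degrees:
    print(f"degree {d:2d}: CV MSE {cv_errors[d]:.3f}")
print(f"selected degree: {best_degree}")
```

Low degrees should show high validation error from bias, and the selected degree marks the balance point for this particular dataset.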


#### References

1. https://medium.com/@itbodhi/overfitting-and-underfitting-in-machine-learning-models-76cb60dbdaf6
2. https://www.geeksforgeeks.org/bias-vs-variance-in-machine-learning/
3. https://www.cs.toronto.edu/~lczhang/321/notes/notes09.pdf
4. https://medium.com/@sarita_68521/understanding-the-bias-variance-tradeoff-in-machine-learning-examples-and-solutions-5de459ddeabd
5. https://www.analyticsvidhya.com/blog/2020/08/bias-and-variance-tradeoff-machine-learning/