# **Bias and Variance in Machine Learning**


We use the terms bias, variance, and bias-variance trade-off to describe the performance of a machine learning model. In this article, I will introduce you to the concept of bias and variance in machine learning.

When training a machine learning model, it is very important to understand the bias and variance of your model’s predictions. Doing so helps you analyze prediction errors, which in turn helps you train more accurate models. I’ll also walk you through how to calculate bias and variance using Python.

In machine learning, you have probably heard a model described as having **high variance or high bias**.

**Bias**

To understand what bias and variance are, suppose we have a point estimator of a parameter or function. Then, **the bias is usually defined as the difference between the expected value of the estimator and the parameter we want to estimate.**

High bias is associated with **underfitting**.

For a machine learning model, bias is the difference between the model’s average prediction and the true value it is trying to predict. A model with low bias captures the underlying pattern well on average, although low bias alone does not make a model perfect. A model with high bias is expected to have a high error rate on both the training and test sets.

If the bias is greater than zero, we say that the estimator is positively biased. If the bias is less than zero, the estimator is negatively biased. If the bias is exactly zero, the estimator is unbiased.
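
To make the definition concrete, here is a minimal simulation sketch. The setup is my own assumption and not part of the original example: we estimate the variance of a normal population from small samples using two estimators, one that divides by n and one that divides by n - 1, and measure each estimator's bias as its average value minus the true parameter.

```python
import numpy as np

rng = np.random.default_rng(0)
true_var = 4.0          # variance of the population we sample from (assumed)
n, trials = 5, 100_000  # small samples make the bias easy to see

# Each row is one independent sample of size n.
samples = rng.normal(loc=0.0, scale=np.sqrt(true_var), size=(trials, n))

# Two estimators of the population variance, applied to every sample.
var_div_n = samples.var(axis=1, ddof=0)          # divides by n
var_div_n_minus_1 = samples.var(axis=1, ddof=1)  # divides by n - 1

# Bias = expected value of the estimator minus the true parameter.
bias_div_n = var_div_n.mean() - true_var
bias_div_n_minus_1 = var_div_n_minus_1.mean() - true_var

print(f"bias when dividing by n:     {bias_div_n:+.3f}")
print(f"bias when dividing by n - 1: {bias_div_n_minus_1:+.3f}")
```

With these settings, the first estimator should come out negatively biased (its theoretical bias is -true_var / n), while the second should be close to unbiased.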

**Variance**

Variance is defined as the expected value of the squared estimator minus the square of the estimator’s expected value, that is, Var(θ̂) = E[θ̂²] − (E[θ̂])². A machine learning model with high variance may work well on the data it was trained on, but it will not generalize well to data it has never seen before.

In general, one could say that high variance is associated with **overfitting**.
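
As a quick numerical check of this definition, the sketch below computes the variance of a set of simulated estimates in two ways: as the mean of the squares minus the square of the mean, and as the mean squared deviation from the mean. The simulated values are only a stand-in (an assumption) for estimates you would get by refitting a model on different datasets.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for an estimator evaluated on many different datasets (assumed).
estimates = rng.normal(loc=2.0, scale=3.0, size=200_000)

# Mean of the squares minus the square of the mean.
var_from_identity = np.mean(estimates ** 2) - np.mean(estimates) ** 2

# Mean squared deviation from the mean.
var_from_deviations = np.mean((estimates - estimates.mean()) ** 2)

print(var_from_identity, var_from_deviations)
```

The two numbers should agree up to floating-point error, and both should be close to 9, the square of the scale used in the simulation.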
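
The link between bias, variance, underfitting, and overfitting can be illustrated with a small experiment. This is a sketch under my own assumptions: the data is synthetic (a noisy sine curve), and the polynomial degrees 1, 4, and 15 are arbitrary choices standing in for a too-simple, a reasonable, and a too-flexible model.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(42)

def make_data(n):
    """Noisy samples of a sine curve (synthetic data, for illustration only)."""
    x = rng.uniform(0, 1, size=(n, 1))
    y = np.sin(2 * np.pi * x[:, 0]) + rng.normal(scale=0.3, size=n)
    return x, y

x_train, y_train = make_data(20)
x_test, y_test = make_data(500)

errors = {}
for degree in (1, 4, 15):
    # Polynomial regression: expand x into powers, then fit a linear model.
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(x_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(x_train))
    test_mse = mean_squared_error(y_test, model.predict(x_test))
    errors[degree] = (train_mse, test_mse)
    print(f"degree {degree:2d}: train MSE = {train_mse:.3f}, test MSE = {test_mse:.3f}")
```

The straight line (degree 1) cannot follow the curve, so its error stays high on both sets, which is the high-bias, underfitting case. The training error keeps falling as the degree grows, but for a very flexible model the test error typically rises again, which is the high-variance, overfitting case.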

**Bias and Variance using Python**


You are probably using the scikit-learn library in Python to implement most of your machine learning algorithms, but it does not have a function to calculate the bias and variance of a trained model. To calculate the bias and variance of your model using Python, you need to install another library known as mlxtend. You can easily install it on your system by using the pip command:

In [1]:
!pip install mlxtend --upgrade



Now let’s train a machine learning model and then we will see how we can calculate its bias and variance using Python:





In [2]:
from mlxtend.evaluate import bias_variance_decomp
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Load the student performance data and keep only the columns used here
data = pd.read_csv("https://raw.githubusercontent.com/amankharwal/Website-data/master/student-mat.csv")
data = data[["G1", "G2", "G3", "studytime", "failures", "absences"]]

# G3 is the target; the remaining columns are the features
predict = "G3"
x = np.array(data.drop(columns=[predict]))  # the positional axis argument fails on recent pandas
y = np.array(data[predict])

# Hold out 20% of the rows for testing
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.2)

# Fit a linear regression model and predict on the test set
linear_regression = LinearRegression()
linear_regression.fit(xtrain, ytrain)
y_pred = linear_regression.predict(xtest)

So far, we have trained a machine learning model using the linear regression algorithm. Below is how we can calculate its bias and variance using Python:

In [3]:
mse, bias, variance = bias_variance_decomp(linear_regression, xtrain, ytrain, xtest, ytest, 
                                           loss='mse', num_rounds=200, random_seed=123)
print("Average Bias : ", bias)
print("Average Variance : ", variance)

Average Bias :  4.910302451198915
Average Variance :  0.05685635558630853
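
To see what kind of computation sits behind these numbers, here is a hand-rolled sketch of a bootstrap-based decomposition for the squared-error loss. It is an illustration under my own assumptions, not mlxtend's source code: the data is synthetic, and the variable names are hypothetical. The idea is to refit the model on many bootstrap samples of the training set, average the predictions for each test point, and then split the average squared error into a squared-bias part and a variance part.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(123)

# Synthetic regression data (an assumption, used only for illustration).
X = rng.normal(size=(300, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=1.0, size=300)
X_train, y_train = X[:240], y[:240]
X_test, y_test = X[240:], y[240:]

num_rounds = 200
all_pred = np.zeros((num_rounds, y_test.size))

for i in range(num_rounds):
    # Bootstrap sample: draw training rows with replacement.
    idx = rng.integers(0, len(X_train), size=len(X_train))
    model = LinearRegression().fit(X_train[idx], y_train[idx])
    all_pred[i] = model.predict(X_test)

# Average prediction for each test point across all rounds.
main_pred = all_pred.mean(axis=0)

avg_expected_loss = np.mean((all_pred - y_test) ** 2)  # average MSE over rounds
avg_bias_sq = np.mean((main_pred - y_test) ** 2)       # squared bias term
avg_variance = np.mean((all_pred - main_pred) ** 2)    # spread around the average

print("Average expected loss :", avg_expected_loss)
print("Average squared bias  :", avg_bias_sq)
print("Average variance      :", avg_variance)
```

In this sketch the expected loss equals the squared bias plus the variance. Note that the bias term is measured against the noisy test targets, so it also absorbs the noise in the data; this is one reason a bias value can be much larger than the variance even for a reasonable model.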


To summarize, bias is the difference between your model’s average prediction and the true value, and variance is the variability of your model’s predictions across different sets of training data. I hope you liked this article on how to calculate the bias and variance of a machine learning model. Feel free to ask your valuable questions in the comments section below.

[Bias and Variance using Python](https://thecleverprogrammer.com/2021/05/20/bias-and-variance-using-python/)

[Bias and Variance in Machine Learning](https://thecleverprogrammer.com/2020/12/28/bias-and-variance-in-machine-learning/)

[Overfitting and Underfitting in Machine Learning](https://thecleverprogrammer.com/2020/09/04/overfitting-and-underfitting-in-machine-learning/)