<pre>
Bias = Error that comes from the model's simplifying assumptions, i.e. how far its average prediction is from the underlying predictive structure. Low bias means the model can capture that structure well; high bias means it cannot (underfitting).

Variance = Sensitivity of the fitted model to the particular training data. High variance means the predictions change a lot when the training set changes (overfitting); low variance means they stay stable.

An ideal model has low bias and low variance: it uncovers the underlying structure of the data and produces results that depend only weakly on the particular training sample. This is often not attainable. We try to minimize both, but there is usually a trade-off rather than a strict inverse proportionality: making a model more flexible tends to lower bias and raise variance, and vice versa.
</pre>
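To make the two definitions concrete, here is a minimal pure-Python sketch of a bootstrap-based decomposition for squared-error loss. It is an illustration under stated assumptions, not the code used in the cells below: the synthetic sine data and the helper names `decompose`, `fit_mean` and `fit_1nn` are hypothetical. A constant mean predictor stands in for a rigid (high-bias) model and a 1-nearest-neighbour predictor for a flexible (high-variance) one.

```python
import math
import random

def decompose(fit, x_train, y_train, x_test, y_test, num_rounds=200, seed=1):
    """Estimate (average loss, squared bias, variance) over bootstrap rounds."""
    rng = random.Random(seed)
    n, m = len(x_train), len(x_test)
    all_preds = []
    for _ in range(num_rounds):
        # Draw a bootstrap sample (with replacement) and refit the model on it.
        idx = [rng.randrange(n) for _ in range(n)]
        predict = fit([x_train[i] for i in idx], [y_train[i] for i in idx])
        all_preds.append([predict(x) for x in x_test])
    # "Main" prediction: the average prediction for each test point.
    main = [sum(p[j] for p in all_preds) / num_rounds for j in range(m)]
    loss = sum((p[j] - y_test[j]) ** 2 for p in all_preds for j in range(m)) / (num_rounds * m)
    bias = sum((main[j] - y_test[j]) ** 2 for j in range(m)) / m
    var = sum((p[j] - main[j]) ** 2 for p in all_preds for j in range(m)) / (num_rounds * m)
    return loss, bias, var

def fit_mean(xs, ys):
    # Rigid model: ignores x and always predicts the training mean.
    mean = sum(ys) / len(ys)
    return lambda x: mean

def fit_1nn(xs, ys):
    # Flexible model: copies the target of the closest training point.
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def sample(rng, n):
    # Hypothetical data: a noisy sine curve on [0, 2*pi].
    xs = [rng.uniform(0, 2 * math.pi) for _ in range(n)]
    ys = [math.sin(x) + rng.gauss(0, 0.3) for x in xs]
    return xs, ys

data_rng = random.Random(0)
x_tr, y_tr = sample(data_rng, 60)
x_te, y_te = sample(data_rng, 30)

results = {name: decompose(fit, x_tr, y_tr, x_te, y_te)
           for name, fit in [("mean", fit_mean), ("1-NN", fit_1nn)]}
for name, (loss, bias, var) in results.items():
    print('%s: MSE: %.3f, Bias^2: %.3f, Variance: %.3f' % (name, loss, bias, var))
```

With this estimator the average loss equals the squared-bias term plus the variance term by construction, so the two terms can be compared directly across models.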

In [1]:
from pandas import read_csv
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import BaggingRegressor
from mlxtend.evaluate import bias_variance_decomp

In [2]:
# Load the dataset. This is the Boston housing dataset; the CSV has no header row.
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/housing.csv'
dataframe = read_csv(url, header=None)

In [3]:
# separate into inputs and outputs
data = dataframe.values
X, y = data[:, :-1], data[:, -1]
# Alternative: load a bundled dataset from mlxtend instead of the URL above.
# from mlxtend.data import boston_housing_data, iris_data
# X, y = boston_housing_data() # Same data as the URL above.
# X, y = iris_data() # Any other dataset works too.

In [4]:
# split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.33, random_state=1)

In [5]:
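# define the models
modelLR = LinearRegression()
modelDTR = DecisionTreeRegressor(random_state=5)
# The base estimator is passed positionally because its keyword was renamed
# from base_estimator to estimator in scikit-learn 1.2.
modelBR = BaggingRegressor(modelDTR, n_estimators=100, random_state=245)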

In [6]:
# Estimate bias and variance. Each call refits the model on 200 bootstrap samples of the training set.
# With loss='mse', the returned bias term is the squared bias, so MSE = bias + variance.
# Linear regression is a simple model. It should have high bias but low variance.
mseLR, biasLR, varLR = bias_variance_decomp(modelLR, X_train, y_train, X_test, y_test, loss='mse', num_rounds=200, random_seed=1)
# A fully grown decision tree is a flexible model. It should have low bias but high variance.
mseDTR, biasDTR, varDTR = bias_variance_decomp(modelDTR, X_train, y_train, X_test, y_test, loss='mse', num_rounds=200, random_seed=5)
# Bagging should have lower variance than a single tree because it trains the base regressor
# on many bootstrap samples of the training data and averages their predictions.
mseBR, biasBR, varBR = bias_variance_decomp(modelBR, X_train, y_train, X_test, y_test, loss='mse', num_rounds=200, random_seed=245)


In [7]:
# summarize results
print('For Linear Regression: MSE: %.3f, Bias: %.3f, Variance: %.3f' %(mseLR, biasLR, varLR))
print('For Decision Tree Regressor: MSE: %.3f, Bias: %.3f, Variance: %.3f' %(mseDTR, biasDTR, varDTR))
print('For Bagging Regressor: MSE: %.3f, Bias: %.3f, Variance: %.3f' %(mseBR, biasBR, varBR))

For Linear Regression: MSE: 22.487, Bias: 20.726, Variance: 1.761
For Decision Tree Regressor: MSE: 26.458, Bias: 9.774, Variance: 16.684
For Bagging Regressor: MSE: 13.302, Bias: 10.242, Variance: 3.059
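
As a quick sanity check on the printed results, the sketch below copies the reported numbers by hand and confirms that each MSE equals bias plus variance up to rounding. It also compares the tree and the bagged ensemble. The dictionary names `reported` and `gaps` are introduced here only for illustration.

```python
# Values copied from the output above: (MSE, bias, variance).
reported = {
    "Linear Regression": (22.487, 20.726, 1.761),
    "Decision Tree Regressor": (26.458, 9.774, 16.684),
    "Bagging Regressor": (13.302, 10.242, 3.059),
}

# Each printed value is rounded to 3 decimals, so allow a small tolerance.
gaps = {name: abs(mse - (bias + var)) for name, (mse, bias, var) in reported.items()}
for name, gap in gaps.items():
    print('%s: |MSE - (bias + variance)| = %.4f' % (name, gap))

# How much of the single tree's variance bagging removes, and what it costs in bias.
_, bias_tree, var_tree = reported["Decision Tree Regressor"]
_, bias_bag, var_bag = reported["Bagging Regressor"]
variance_reduction = 1 - var_bag / var_tree
bias_increase = bias_bag - bias_tree
print('Variance reduction from bagging: %.1f%%' % (100 * variance_reduction))
print('Bias increase from bagging: %.3f' % bias_increase)
```

The numbers match the expectations stated in the comments: linear regression has the highest bias and the lowest variance, the single tree has the highest variance, and bagging removes most of the tree's variance while leaving its bias almost unchanged.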
