# Bias/variance trade-off

Classifiers make two general types of error: bias error and variance error.

* Bias error is the difference between the model's expected (average) prediction and the true value. A model with high bias is too simple and underfits.

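* Variance error describes how much the predictions for a given point vary between models trained on different samples of the data. A model with high variance is too sensitive to its training set and overfits.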
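A small simulation can make the two definitions concrete. The sketch below is a toy and not part of the Agaricus example: it assumes a regression target `true_f(x) = x**2` with Gaussian noise instead of a classifier, and the helper names (`sample_training_set`, `predict_constant`, `predict_nearest`, `bias_variance`) are made up for illustration. It refits two models on many random training sets and measures, at one query point, the squared bias and the variance of their predictions.

```python
import random
import statistics


def true_f(x):
    # Assumed ground-truth function for this toy example.
    return x * x


def sample_training_set(rng, n=20, noise=0.3):
    # Draw n noisy observations of true_f on [0, 1).
    xs = [rng.random() for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def predict_constant(xs, ys, x0):
    # Very rigid model: ignores x0 and always predicts the mean label.
    return statistics.fmean(ys)


def predict_nearest(xs, ys, x0):
    # Very flexible model: copies the label of the nearest training point.
    i = min(range(len(xs)), key=lambda j: abs(xs[j] - x0))
    return ys[i]


def bias_variance(predict, x0=0.8, n_sets=2000, seed=123):
    # Refit the model on many training sets and collect its prediction at x0.
    rng = random.Random(seed)
    preds = [predict(*sample_training_set(rng), x0) for _ in range(n_sets)]
    bias_sq = (statistics.fmean(preds) - true_f(x0)) ** 2
    variance = statistics.pvariance(preds)
    return bias_sq, variance


bias_const, var_const = bias_variance(predict_constant)
bias_nn, var_nn = bias_variance(predict_nearest)
print("constant model:   bias^2={:.4f} variance={:.4f}".format(bias_const, var_const))
print("nearest neighbor: bias^2={:.4f} variance={:.4f}".format(bias_nn, var_nn))
```

Under these assumptions the constant model should show the larger squared bias and the nearest-neighbor model the larger variance, which is the trade-off described by the two bullets above.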

<img src="https://github.com/ParrotPrediction/docker-course-xgboost/raw/49f8de97cbc1695dcbeb09391e2662dbedf30ee1/notebooks/images/bias-variance.png" />

In [30]:
import numpy as np
import xgboost as xgb

from pprint import pprint

# reproducibility
seed = 123
np.random.seed(seed)

In [31]:
# load Agaricus data
dtrain = xgb.DMatrix('data/agaricus.txt.train')
dtest = xgb.DMatrix('data/agaricus.txt.test')

[00:58:46] 6513x127 matrix with 143286 entries loaded from data/agaricus.txt.train
[00:58:46] 1611x127 matrix with 35442 entries loaded from data/agaricus.txt.test


In [33]:
# specify general training parameters
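params = {
    'objective': 'binary:logistic',  # binary classification, outputs probabilities
    'max_depth': 1,                  # each tree is a single-split stump
    'silent': 1,                     # suppress training messages
    'eta': 0.5                       # learning rate (shrinkage per boosting round)
}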

num_rounds = 5

In [34]:
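# evaluation sets reported each round; the last entry ('train') is the one early stopping monitors
watchlist = [(dtest, 'test'), (dtrain, 'train')]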

In [35]:
bst = xgb.train(params, dtrain, num_rounds, watchlist)

[0]	test-error:0.11049	train-error:0.113926
[1]	test-error:0.11049	train-error:0.113926
[2]	test-error:0.03352	train-error:0.030401
[3]	test-error:0.027312	train-error:0.021495
[4]	test-error:0.031037	train-error:0.025487


In [36]:
params['eval_metric'] = 'logloss'
bst = xgb.train(params, dtrain, num_rounds, watchlist)

[0]	test-logloss:0.457887	train-logloss:0.460106
[1]	test-logloss:0.383911	train-logloss:0.378728
[2]	test-logloss:0.312678	train-logloss:0.308061
[3]	test-logloss:0.269119	train-logloss:0.26139
[4]	test-logloss:0.239746	train-logloss:0.232174
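For reference, binary log loss is the negative mean log-likelihood of the true labels under the predicted probabilities. The function below is a plain-Python sketch of that textbook definition, not XGBoost's own implementation; the name `log_loss` and the clipping constant `eps` are choices made for this example.

```python
import math


def log_loss(labels, probs, eps=1e-15):
    # Mean negative log-likelihood of 0/1 labels given predicted probabilities.
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(labels)


# Predicting 0.5 for every example gives ln(2), roughly 0.693.
uninformed = log_loss([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5])
# Confident and correct predictions give a lower loss.
confident = log_loss([1, 0, 1, 0], [0.9, 0.1, 0.8, 0.2])
print(uninformed, confident)
```

The round 0 value of about 0.458 in the log above is already below the 0.693 baseline of a constant 0.5 prediction.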


In [37]:
params['eval_metric'] = ['logloss', 'auc']
bst = xgb.train(params, dtrain, num_rounds, watchlist)

[0]	test-logloss:0.457887	test-auc:0.892138	train-logloss:0.460106	train-auc:0.888997
[1]	test-logloss:0.383911	test-auc:0.938901	train-logloss:0.378728	train-auc:0.942881
[2]	test-logloss:0.312678	test-auc:0.976157	train-logloss:0.308061	train-auc:0.981415
[3]	test-logloss:0.269119	test-auc:0.979685	train-logloss:0.26139	train-auc:0.985158
[4]	test-logloss:0.239746	test-auc:0.9785	train-logloss:0.232174	train-auc:0.983744
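AUC can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, so unlike `error` it does not depend on a 0.5 threshold. The brute-force sketch below illustrates that definition on a hand-made example; the function name `auc` is local to this snippet and the pairwise loop is for clarity, not an account of how XGBoost computes the metric.

```python
def auc(labels, scores):
    # Fraction of (positive, negative) pairs ranked correctly; ties count as half.
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Four examples: three of the four positive/negative pairs are ordered correctly.
example = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(example)
```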


In [38]:
# custom evaluation metric
def misclassified(pred_probs, dtrain):
    labels = dtrain.get_label() # obtain true labels
    preds = pred_probs > 0.5 # obtain predicted values
    return 'misclassified', np.sum(labels != preds)

In [39]:
bst = xgb.train(params, dtrain, num_rounds, watchlist, feval=misclassified, maximize=False)

[0]	test-logloss:0.457887	test-auc:0.892138	train-logloss:0.460106	train-auc:0.888997	test-misclassified:178	train-misclassified:742
[1]	test-logloss:0.383911	test-auc:0.938901	train-logloss:0.378728	train-auc:0.942881	test-misclassified:178	train-misclassified:742
[2]	test-logloss:0.312678	test-auc:0.976157	train-logloss:0.308061	train-auc:0.981415	test-misclassified:54	train-misclassified:198
[3]	test-logloss:0.269119	test-auc:0.979685	train-logloss:0.26139	train-auc:0.985158	test-misclassified:44	train-misclassified:140
[4]	test-logloss:0.239746	test-auc:0.9785	train-logloss:0.232174	train-auc:0.983744	test-misclassified:50	train-misclassified:166


In [40]:
evals_result = {}
bst = xgb.train(params, dtrain, num_rounds, watchlist, feval=misclassified, maximize=False, evals_result=evals_result)

[0]	test-logloss:0.457887	test-auc:0.892138	train-logloss:0.460106	train-auc:0.888997	test-misclassified:178	train-misclassified:742
[1]	test-logloss:0.383911	test-auc:0.938901	train-logloss:0.378728	train-auc:0.942881	test-misclassified:178	train-misclassified:742
[2]	test-logloss:0.312678	test-auc:0.976157	train-logloss:0.308061	train-auc:0.981415	test-misclassified:54	train-misclassified:198
[3]	test-logloss:0.269119	test-auc:0.979685	train-logloss:0.26139	train-auc:0.985158	test-misclassified:44	train-misclassified:140
[4]	test-logloss:0.239746	test-auc:0.9785	train-logloss:0.232174	train-auc:0.983744	test-misclassified:50	train-misclassified:166


In [41]:
pprint(evals_result)

{'test': {'auc': [0.892138, 0.938901, 0.976157, 0.979685, 0.9785],
          'logloss': [0.457887, 0.383911, 0.312678, 0.269119, 0.239746],
          'misclassified': [178.0, 178.0, 54.0, 44.0, 50.0]},
 'train': {'auc': [0.888997, 0.942881, 0.981415, 0.985158, 0.983744],
           'logloss': [0.460106, 0.378728, 0.308061, 0.26139, 0.232174],
           'misclassified': [742.0, 742.0, 198.0, 140.0, 166.0]}}
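The custom `misclassified` counts can be cross-checked against the built-in `error` metric logged earlier. The sketch below assumes each `DMatrix` row is one example (1611 test rows and 6513 train rows, taken from the load messages) and that `error` uses the same 0.5 threshold as the custom function; it divides the counts by the row counts and compares the result with the logged error rates.

```python
# Row counts reported when the DMatrix files were loaded.
n_rows = {'test': 1611, 'train': 6513}

# Counts from evals_result and the error rates from the first training log.
misclassified_counts = {
    'test': [178.0, 178.0, 54.0, 44.0, 50.0],
    'train': [742.0, 742.0, 198.0, 140.0, 166.0],
}
logged_error = {
    'test': [0.11049, 0.11049, 0.03352, 0.027312, 0.031037],
    'train': [0.113926, 0.113926, 0.030401, 0.021495, 0.025487],
}

# Convert counts to rates.
error_rate = {
    name: [count / n_rows[name] for count in counts]
    for name, counts in misclassified_counts.items()
}

# Largest disagreement between the derived and the logged rates.
max_gap = max(
    abs(rate - logged)
    for name in error_rate
    for rate, logged in zip(error_rate[name], logged_error[name])
)
print(error_rate)
print(max_gap)
```

The two agree up to the rounding of the printed log values, so the custom metric reports the same information as `error`, only as a count instead of a fraction.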


In [42]:
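params['eval_metric'] = 'error'
num_rounds = 1500  # upper bound; early stopping is expected to end training sooner

# stop once the monitored metric has not improved for 10 consecutive rounds
bst = xgb.train(params, dtrain, num_rounds, watchlist, early_stopping_rounds=10)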

[0]	test-error:0.11049	train-error:0.113926
Multiple eval metrics have been passed: 'train-error' will be used for early stopping.

Will train until train-error hasn't improved in 10 rounds.
[1]	test-error:0.11049	train-error:0.113926
[2]	test-error:0.03352	train-error:0.030401
[3]	test-error:0.027312	train-error:0.021495
[4]	test-error:0.031037	train-error:0.025487
[5]	test-error:0.019243	train-error:0.01735
[6]	test-error:0.019243	train-error:0.01735
[7]	test-error:0.015518	train-error:0.013358
[8]	test-error:0.015518	train-error:0.013358
[9]	test-error:0.009311	train-error:0.007523
[10]	test-error:0.015518	train-error:0.013358
[11]	test-error:0.019243	train-error:0.01735
[12]	test-error:0.009311	train-error:0.007523
[13]	test-error:0.001862	train-error:0.001996
[14]	test-error:0.005587	train-error:0.005988
[15]	test-error:0.005587	train-error:0.005988
[16]	test-error:0.005587	train-error:0.005988
[17]	test-error:0.005587	train-error:0.005988
[18]	test-error:0.005587	train-error:0.00

In [43]:
print("Booster best train score: {}".format(bst.best_score))
print("Booster best iteration: {}".format(bst.best_iteration))
print("Booster best number of trees limit: {}".format(bst.best_ntree_limit))

Booster best train score: 0.001996
Booster best iteration: 13
Booster best number of trees limit: 14
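The stopping rule itself is simple to state: remember the best score seen so far and stop once it has not improved for a given number of rounds. The function below is a simplified sketch of that rule for a metric that is minimized, not XGBoost's actual code; `early_stop` and `patience` are names chosen for this example. It is applied to the `train-error` values from rounds 0 to 17 of the log above (round 18 is left out because its value is cut off).

```python
def early_stop(scores, patience):
    # Returns (best_iteration, best_score, stopped_at); stopped_at is None
    # if the rule never fires within the given scores.
    best_score = float("inf")
    best_iteration = -1
    for i, score in enumerate(scores):
        if score < best_score:
            best_score, best_iteration = score, i
        elif i - best_iteration >= patience:
            return best_iteration, best_score, i
    return best_iteration, best_score, None


# train-error for rounds 0..17, copied from the training log.
train_error = [
    0.113926, 0.113926, 0.030401, 0.021495, 0.025487, 0.01735,
    0.01735, 0.013358, 0.013358, 0.007523, 0.013358, 0.01735,
    0.007523, 0.001996, 0.005988, 0.005988, 0.005988, 0.005988,
]

print(early_stop(train_error, patience=10))  # rule does not fire within 18 rounds
print(early_stop(train_error, patience=4))   # fires at round 17, best round is 13
print(early_stop(train_error, patience=3))   # fires at round 12, best round is 9
```

With a patience of 3 the rule would have stopped before the improvement at round 13, which shows why the patience value matters. The best round of 13 matches `bst.best_iteration`, and the reported tree limit of 14 is that index plus one.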


In [44]:
num_rounds = 10 # how many estimators
hist = xgb.cv(params, dtrain, num_rounds, nfold=10, metrics={'error'}, seed=seed)
hist


| | train-error-mean | train-error-std | test-error-mean | test-error-std |
|---|---|---|---|---|
| 0 | 0.113926 | 0.001479 | 0.113924 | 0.013314 |
| 1 | 0.113926 | 0.001479 | 0.113924 | 0.013314 |
| 2 | 0.030401 | 0.000633 | 0.030401 | 0.0057 |
| 3 | 0.021496 | 0.000586 | 0.021495 | 0.005275 |
| 4 | 0.025488 | 0.000606 | 0.025487 | 0.005459 |
| 5 | 0.019687 | 0.00349 | 0.020728 | 0.007625 |
| 6 | 0.01735 | 0.000374 | 0.01735 | 0.003366 |
| 7 | 0.014467 | 0.001922 | 0.015353 | 0.003696 |
| 8 | 0.013358 | 0.000418 | 0.013358 | 0.003764 |
| 9 | 0.011737 | 0.003818 | 0.01259 | 0.004698 |
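One way to use the cross-validation history is to pick the number of boosting rounds from it. The sketch below copies the `test-error-mean` and `test-error-std` columns into plain tuples and selects two candidates: the round with the lowest mean test error, and the earliest round whose mean is within one standard deviation of that minimum. The second is a common heuristic for preferring a smaller model, included here as an assumption rather than something the notebook itself applies.

```python
# (round, test-error-mean, test-error-std) copied from the xgb.cv output.
cv_rows = [
    (0, 0.113924, 0.013314),
    (1, 0.113924, 0.013314),
    (2, 0.030401, 0.0057),
    (3, 0.021495, 0.005275),
    (4, 0.025487, 0.005459),
    (5, 0.020728, 0.007625),
    (6, 0.01735, 0.003366),
    (7, 0.015353, 0.003696),
    (8, 0.013358, 0.003764),
    (9, 0.01259, 0.004698),
]

# Round with the lowest mean test error across folds.
best_round, best_mean, best_std = min(cv_rows, key=lambda row: row[1])

# Earliest round whose mean is within one std of the best mean.
threshold = best_mean + best_std
simplest_round = next(r for r, mean, _ in cv_rows if mean <= threshold)

print(best_round, best_mean)
print(simplest_round)
```

Because rounds are zero-indexed, a chosen round `r` corresponds to training `r + 1` trees.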
