# Resampling

Common over-sampling and under-sampling techniques are available in the [imbalanced-learn](https://imbalanced-learn.readthedocs.io/) library. imbalanced-learn is a Python package offering a number of re-sampling techniques commonly used on datasets showing strong between-class imbalance. It is compatible with scikit-learn and is part of the scikit-learn-contrib projects.

Resampling methods are easy to add to the training of your algorithms through pipelines; see the examples in this tutorial: [Pipeline Object](https://imbalanced-learn.readthedocs.io/en/stable/auto_examples/pipeline/plot_pipeline_classification.html#sphx-glr-auto-examples-pipeline-plot-pipeline-classification-py).
**For the pipeline to work, you should import `make_pipeline` using**:

`from imblearn.pipeline import make_pipeline`

and not the one from `sklearn.pipeline`, because only the imbalanced-learn pipeline accepts samplers as steps and applies them during fitting only.

Some techniques include `imblearn.over_sampling.RandomOverSampler`, `imblearn.under_sampling.RandomUnderSampler`, and `imblearn.over_sampling.SMOTE`. These samplers expose a parameter that lets the user change the sampling ratio (`sampling_strategy` in recent imbalanced-learn releases, `ratio` in older ones).

For example, in SMOTE you can change the ratio by passing a dictionary that maps each class label to the number of samples wanted after resampling; since SMOTE is an over-sampling technique, each value must be greater than or equal to the current number of samples in that class. In my experience SMOTE has been a better fit for model performance than RandomOverSampler, probably because RandomOverSampler duplicates rows, which means the model can start to memorize the data rather than generalize to new data. SMOTE instead uses the K-nearest-neighbors algorithm to create new data points "similar" to the under-represented ones, interpolating between a minority sample and one of its nearest minority-class neighbors.
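To make the nearest-neighbor idea concrete, here is a simplified NumPy sketch of SMOTE-style interpolation. It is not imbalanced-learn's implementation; the function name `smote_like_samples` and the random toy data are hypothetical.

```python
import numpy as np

def smote_like_samples(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    # Pairwise distances between minority samples only.
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)                  # a point is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]    # indices of the k nearest neighbours
    base = rng.integers(0, len(X_min), size=n_new)   # random minority sample for each new point
    picked = neighbours[base, rng.integers(0, k, size=n_new)]  # one of its k neighbours
    gap = rng.random((n_new, 1))                     # position along the segment, in [0, 1)
    return X_min[base] + gap * (X_min[picked] - X_min[base])

# Hypothetical minority-class samples (20 points, 3 features).
X_minority = np.random.default_rng(1).normal(size=(20, 3))
X_synth = smote_like_samples(X_minority, n_new=50, k=5)
print(X_synth.shape)
```

Each synthetic point lies on the segment between two existing minority samples, so no row is an exact duplicate, unlike random over-sampling.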

It is not always good practice to use SMOTE blindly with the ratio at its default (even class balance), because the model may overfit one or more of the minority classes, even though SMOTE uses nearest neighbors to make "similar" observations. In the same way that you tune the hyperparameters of an ML model, you should tune the hyperparameters of the SMOTE algorithm, such as the ratio and/or the number of nearest neighbors (`k_neighbors`). Below is a working example of how to properly use SMOTE.

**NOTE:** It is vital that you do not use resampling methods on the full data set. You MUST apply them ONLY to the training set, and then evaluate on the untouched validation and test sets to see whether your SMOTE model outperformed your other model(s). If you do not do this there will be data leakage, and the scores you obtain will be meaningless.
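The order of operations in the note can be sketched as follows: split first, then resample the training part only. Plain random over-sampling in NumPy is used here as a stand-in for SMOTE, and the data and variable names are hypothetical.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 950 negatives and 50 positives.
rng = np.random.RandomState(0)
X_all = rng.normal(size=(1000, 4))
y_all = np.array([0] * 950 + [1] * 50)

# 1. Split first, keeping the class ratio in both parts.
X_tr, X_val, y_tr, y_val = train_test_split(
    X_all, y_all, test_size=0.2, stratify=y_all, random_state=0
)

# 2. Resample the training part only (random over-sampling as a stand-in for SMOTE).
minority_idx = np.flatnonzero(y_tr == 1)
n_extra = (y_tr == 0).sum() - minority_idx.size
extra_idx = rng.choice(minority_idx, size=n_extra, replace=True)
X_tr_res = np.vstack([X_tr, X_tr[extra_idx]])
y_tr_res = np.concatenate([y_tr, y_tr[extra_idx]])

# 3. The validation part keeps its original, imbalanced distribution.
print(np.bincount(y_tr_res), np.bincount(y_val))
```

Resampling before the split would let copies (or interpolations) of the same minority rows land on both sides of the split, which is the leakage the note warns about.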

In [40]:
%load_ext autoreload
%autoreload 2

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [82]:
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import NearMiss, EditedNearestNeighbours,RepeatedEditedNearestNeighbours
from imblearn.combine import SMOTEENN, SMOTETomek
from imblearn.pipeline import make_pipeline

In this case, if we want to use a keras NN in our Voting Ensemble, we cannot use the native sklearn function. We need to build the ensemble by hand.

In [41]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import os
import sklearn

# plt.style.use('fivethirtyeight')
sns.set_style("whitegrid")
sns.set_context("notebook")
DATA_PATH = '../data/'

VAL_SPLITS = 4

In [42]:
# Seed value
# Apparently you may use different seed values at each stage
seed_value= 0

# 1. Set the `PYTHONHASHSEED` environment variable at a fixed value
import os
os.environ['PYTHONHASHSEED']=str(seed_value)

# 2. Set the `python` built-in pseudo-random generator at a fixed value
import random
random.seed(seed_value)

# 3. Set the `numpy` pseudo-random generator at a fixed value
import numpy as np
np.random.seed(seed_value)

In [62]:
from plot_utils import plot_confusion_matrix
from cv_utils import run_cv_f1
from cv_utils import plot_cv_roc
from cv_utils import plot_cv_roc_prc
from cv_utils import print_scores_cv
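The `cv_utils` module is local to this project and its source is not shown here. Judging only from the printed outputs further down, `print_scores_cv` appears to report the mean and standard deviation of each metric; a hypothetical stand-in, under that assumption, could look like this (the name `format_scores_cv` and the example values are made up):

```python
import numpy as np

def format_scores_cv(scores):
    """Return 'name: mean +- std' lines for the metric entries of a cross_validate dict."""
    lines = []
    for name, values in scores.items():
        if name.startswith(("test_", "train_")):  # skip fit_time / score_time
            lines.append(f"{name}: {np.mean(values):.2f} +- {np.std(values):.2f}")
    return lines

# Made-up example values in the shape returned by sklearn's cross_validate.
example_scores = {
    "fit_time": np.array([0.1, 0.2]),
    "test_f1": np.array([0.80, 0.84]),
    "train_f1": np.array([1.0, 1.0]),
}
print("\n".join(format_scores_cv(example_scores)))
```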

In [47]:
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.neural_network import MLPClassifier
# Experimental: Based on LightGBM https://github.com/Microsoft/LightGBM
from sklearn.experimental import enable_hist_gradient_boosting
from sklearn.ensemble import HistGradientBoostingClassifier, BaggingClassifier

import xgboost as xgb
from sklearn.pipeline import Pipeline  # used by the ensemble pipelines below
from sklearn.model_selection import cross_validate
from sklearn.metrics import f1_score, accuracy_score, precision_score

In [48]:
from imblearn.ensemble import BalancedBaggingClassifier
from imblearn.ensemble import BalancedRandomForestClassifier
from imblearn.ensemble import EasyEnsembleClassifier
from imblearn.ensemble import RUSBoostClassifier

In [93]:
from sklearn.preprocessing import FunctionTransformer
from sklearn_utils import FeatureSelectorDic
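`FeatureSelectorDic` comes from the project's own `sklearn_utils` module, whose source is not shown. From the way it is called later (a list of feature names plus a name-to-index dictionary), it presumably selects columns of the NumPy array by name. A hypothetical equivalent, under that assumption, might be:

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class ColumnSelectorByName(BaseEstimator, TransformerMixin):
    """Keep only the columns of a NumPy array whose names are listed."""

    def __init__(self, feature_names, name_to_idx):
        self.feature_names = feature_names
        self.name_to_idx = name_to_idx

    def fit(self, X, y=None):
        return self  # nothing to learn

    def transform(self, X):
        cols = [self.name_to_idx[name] for name in self.feature_names]
        return np.asarray(X)[:, cols]

# Made-up mapping and data to show the column selection.
demo_map = {"V1": 0, "V2": 1, "V3": 2}
X_small = np.arange(12).reshape(4, 3)
selected = ColumnSelectorByName(["V3", "V1"], demo_map).fit_transform(X_small)
print(selected[0])
```

Wrapping the selection in a transformer lets it sit as the first step of a pipeline, so the same columns are picked inside every cross-validation fold.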

For this part of the project, we will only work with the training set, which we will split again into training and validation sets to perform the hyperparameter tuning.

We will save the test set for the final part, when we have already tuned our hyperparameters.
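As a rough illustration of the repeated train/validation split, the sketch below uses `StratifiedShuffleSplit` with 4 splits and a 15% validation share on hypothetical labels, and records the class balance of each part. The label vector and the names `y_frac`, `X_frac` and `split_stats` are assumptions for the example.

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

y_frac = np.array([0] * 980 + [1] * 20)  # hypothetical labels, 2% positives
X_frac = np.zeros((y_frac.size, 1))      # features are irrelevant for the split itself

splitter = StratifiedShuffleSplit(n_splits=4, test_size=0.15, random_state=0)
split_stats = []
for train_idx, val_idx in splitter.split(X_frac, y_frac):
    # (train size, validation size, positives in train, positives in validation)
    split_stats.append(
        (len(train_idx), len(val_idx), int(y_frac[train_idx].sum()), int(y_frac[val_idx].sum()))
    )
print(split_stats)
```

Stratification matters with this much imbalance: without it, a validation split could end up with very few positives or none at all.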

In [111]:
df = pd.read_csv(os.path.join(DATA_PATH,'df_train.csv'))
df.drop(columns= df.columns[0:2],inplace=True)
X = df.drop(columns='Class').to_numpy()
y = df['Class'].to_numpy()
idx_to_feat = dict(enumerate([feat for feat in df.drop(columns='Class').columns ]))
feat_to_idx = {feat : idx for idx,feat in idx_to_feat.items()}
del(idx_to_feat)
cv = StratifiedShuffleSplit(n_splits=VAL_SPLITS,test_size=0.15,random_state=0)
df.head()

| | V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 | V9 | V10 | ... | V24 | V25 | V26 | V27 | V28 | Class | TimeScaled | TimeSin | TimeCos | AmountBC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -0.829392 | 1.118573 | 0.926038 | 1.163686 | 0.009824 | 0.527347 | 0.17337 | 0.723997 | -0.638939 | -0.162923 | ... | -0.298908 | -0.060301 | -0.217935 | 0.291312 | 0.120779 | 0 | 0.460069 | -0.480989 | 0.876727 | 3.195062 |
| 1 | -2.814527 | 1.613321 | 0.654307 | 0.581821 | 0.399491 | 0.73004 | 0.456233 | -2.464347 | 0.654797 | 2.248682 | ... | -0.329526 | -0.307374 | -0.440007 | -2.135657 | 0.011041 | 0 | 0.266395 | -0.204567 | -0.978853 | 3.125269 |
| 2 | 2.105028 | -0.7004 | -1.338043 | -0.596395 | -0.395217 | -0.75505 | -0.276951 | -0.291562 | -0.965418 | 1.107179 | ... | -0.278137 | -0.040685 | 0.789267 | -0.066054 | -0.069956 | 0 | 0.762303 | -0.153992 | -0.988072 | 3.421235 |
| 3 | 2.205839 | -1.023897 | -1.270137 | -0.950174 | -0.868712 | -0.975492 | -0.475464 | -0.280564 | 0.503713 | 0.448173 | ... | -0.041177 | 0.089158 | 1.105794 | -0.066285 | -0.079881 | 0 | 0.87974 | -0.998227 | 0.059524 | 1.072145 |
| 4 | 2.02709 | -0.778666 | -1.552755 | -0.558679 | 0.020939 | -0.026071 | -0.20781 | -0.124288 | -0.635953 | 0.817757 | ... | 0.033477 | -0.157992 | -0.606327 | -0.003931 | -0.039868 | 0 | 0.821649 | -0.783558 | -0.621319 | 3.97149 |


## Comparison of sampling methods

In [105]:
scaler = StandardScaler()
over_sampler = SMOTE(random_state=0, n_jobs=-1)
clf_ = xgb.sklearn.XGBClassifier(n_jobs=-1,verbosity=0, 
                                 learning_rate=0.1, 
                                 random_state=0)
clf = make_pipeline(scaler,over_sampler,clf_)

scores = cross_validate(clf,X,y,cv=cv,scoring=['f1','average_precision','roc_auc','precision','recall'],n_jobs=-1, return_train_score=True)
print_scores_cv(scores)

test_f1: 0.20 +- 0.01
train_f1: 0.22 +- 0.01
test_average_precision: 0.73 +- 0.08
train_average_precision: 0.78 +- 0.02
test_roc_auc: 0.98 +- 0.00
train_roc_auc: 1.00 +- 0.00
test_precision: 0.11 +- 0.00
train_precision: 0.13 +- 0.01
test_recall: 0.87 +- 0.04
train_recall: 0.97 +- 0.00


In [94]:
list_features = ['V9','V14','V16']
feat_select = FeatureSelectorDic(list_features, feat_to_idx)
scaler = StandardScaler()
clf_ = ExtraTreesClassifier(n_estimators=50,n_jobs=-1,random_state=0)
clf = make_pipeline(feat_select,scaler,clf_)

scores = cross_validate(clf,X,y,cv=cv,scoring=['f1','average_precision','roc_auc','precision','recall'],n_jobs=-1, return_train_score=True)
print_scores_cv(scores)

test_f1: 0.82 +- 0.06
train_f1: 1.00 +- 0.00
test_average_precision: 0.78 +- 0.07
train_average_precision: 1.00 +- 0.00
test_roc_auc: 0.92 +- 0.02
train_roc_auc: 1.00 +- 0.00
test_precision: 0.92 +- 0.03
train_precision: 1.00 +- 0.00
test_recall: 0.74 +- 0.08
train_recall: 1.00 +- 0.00


In [83]:
list_features = ['V9','V14','V16']
feat_select = FeatureSelectorDic(list_features, feat_to_idx)
scaler = StandardScaler()
over_sampler = SMOTEENN(random_state=0)
clf_ = ExtraTreesClassifier(n_estimators=50,n_jobs=-1,random_state=0)
clf = make_pipeline(feat_select,scaler,over_sampler,clf_)

scores = cross_validate(clf,X,y,cv=cv,scoring=['f1','average_precision','roc_auc','precision','recall'],n_jobs=-1, return_train_score=True)
print_scores_cv(scores)

test_f1: 0.22 +- 0.01
train_f1: 0.30 +- 0.01
test_average_precision: 0.68 +- 0.05
train_average_precision: 0.94 +- 0.02
test_roc_auc: 0.93 +- 0.02
train_roc_auc: 1.00 +- 0.00
test_precision: 0.13 +- 0.01
train_precision: 0.18 +- 0.01
test_recall: 0.81 +- 0.04
train_recall: 1.00 +- 0.00


In [84]:
list_features = ['V9','V14','V16']
feat_select = FeatureSelectorDic(list_features, feat_to_idx)
scaler = StandardScaler()
over_sampler = SMOTE(random_state=0, n_jobs=-1)
clf_ = ExtraTreesClassifier(n_estimators=50,n_jobs=-1,random_state=0)
clf = make_pipeline(feat_select,scaler,over_sampler,clf_)

scores = cross_validate(clf,X,y,cv=cv,scoring=['f1','average_precision','roc_auc','precision','recall'],n_jobs=-1, return_train_score=True)
print_scores_cv(scores)

test_f1: 0.31 +- 0.01
train_f1: 1.00 +- 0.00
test_average_precision: 0.69 +- 0.04
train_average_precision: 1.00 +- 0.00
test_roc_auc: 0.93 +- 0.02
train_roc_auc: 1.00 +- 0.00
test_precision: 0.19 +- 0.01
train_precision: 1.00 +- 0.00
test_recall: 0.80 +- 0.06
train_recall: 1.00 +- 0.00


In [85]:
list_features = ['V9','V14','V16']
feat_select = FeatureSelectorDic(list_features, feat_to_idx)
scaler = StandardScaler()
over_sampler = SMOTETomek(random_state=0)
clf_ = ExtraTreesClassifier(n_estimators=50,n_jobs=-1,random_state=0)
clf = make_pipeline(feat_select,scaler,over_sampler,clf_)

scores = cross_validate(clf,X,y,cv=cv,scoring=['f1','average_precision','roc_auc','precision','recall'],n_jobs=-1, return_train_score=True)
print_scores_cv(scores)

test_f1: 0.30 +- 0.02
train_f1: 0.96 +- 0.01
test_average_precision: 0.69 +- 0.05
train_average_precision: 1.00 +- 0.00
test_roc_auc: 0.93 +- 0.03
train_roc_auc: 1.00 +- 0.00
test_precision: 0.19 +- 0.01
train_precision: 0.93 +- 0.02
test_recall: 0.79 +- 0.06
train_recall: 1.00 +- 0.00


In [86]:
list_features = ['V3','V4','V12','V14','V16','V17']
feat_select = FeatureSelectorDic(list_features, feat_to_idx)
scaler = StandardScaler()
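over_sampler = NearMiss(n_jobs=-1)  # NearMiss is deterministic, so no random_state is needed
clf_ = ExtraTreesClassifier(n_estimators=50,n_jobs=-1,random_state=0)
# NOTE: `clf` is not rebuilt in this cell, so the scores below are those of the
# previous SMOTETomek pipeline. To evaluate NearMiss, build the pipeline first:
# clf = make_pipeline(feat_select,scaler,over_sampler,clf_)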

scores = cross_validate(clf,X,y,cv=cv,scoring=['f1','average_precision','roc_auc','precision','recall'],n_jobs=-1, return_train_score=True)
print_scores_cv(scores)

test_f1: 0.30 +- 0.02
train_f1: 0.96 +- 0.01
test_average_precision: 0.69 +- 0.05
train_average_precision: 1.00 +- 0.00
test_roc_auc: 0.93 +- 0.03
train_roc_auc: 1.00 +- 0.00
test_precision: 0.19 +- 0.01
train_precision: 0.93 +- 0.02
test_recall: 0.79 +- 0.06
train_recall: 1.00 +- 0.00


In [87]:
list_features = ['V9','V14','V16']
feat_select = FeatureSelectorDic(list_features, feat_to_idx)
scaler = StandardScaler()
over_sampler = EditedNearestNeighbours(n_jobs=-1)  # ENN is deterministic, so no random_state is needed
clf_ = ExtraTreesClassifier(n_estimators=50,n_jobs=-1,random_state=0)
clf = make_pipeline(feat_select,scaler,over_sampler,clf_)

scores = cross_validate(clf,X,y,cv=cv,scoring=['f1','average_precision','roc_auc','precision','recall'],n_jobs=-1, return_train_score=True)
print_scores_cv(scores)

test_f1: 0.82 +- 0.05
train_f1: 0.96 +- 0.01
test_average_precision: 0.76 +- 0.07
train_average_precision: 0.99 +- 0.01
test_roc_auc: 0.92 +- 0.03
train_roc_auc: 1.00 +- 0.00
test_precision: 0.89 +- 0.04
train_precision: 0.93 +- 0.01
test_recall: 0.77 +- 0.06
train_recall: 1.00 +- 0.00


In [88]:
list_features = ['V9','V14','V16']
feat_select = FeatureSelectorDic(list_features, feat_to_idx)
scaler = StandardScaler()

# Create the samplers
enn = EditedNearestNeighbours()
renn = RepeatedEditedNearestNeighbours()

clf_ = ExtraTreesClassifier(n_estimators=50,n_jobs=-1,random_state=0)
clf = make_pipeline(feat_select,scaler,enn,renn,clf_)

scores = cross_validate(clf,X,y,cv=cv,scoring=['f1','average_precision','roc_auc','precision','recall'],n_jobs=-1, return_train_score=True)
print_scores_cv(scores)

test_f1: 0.81 +- 0.04
train_f1: 0.95 +- 0.00
test_average_precision: 0.74 +- 0.09
train_average_precision: 0.96 +- 0.02
test_roc_auc: 0.92 +- 0.03
train_roc_auc: 1.00 +- 0.00
test_precision: 0.85 +- 0.04
train_precision: 0.90 +- 0.01
test_recall: 0.78 +- 0.06
train_recall: 1.00 +- 0.00


In [89]:
list_features = ['V9','V14','V16']
feat_select = FeatureSelectorDic(list_features, feat_to_idx)
scaler = StandardScaler()
# Create the samplers
enn = EditedNearestNeighbours()
renn = RepeatedEditedNearestNeighbours()
clf_ = ExtraTreesClassifier(n_estimators=50,n_jobs=-1,random_state=0)
clf = make_pipeline(feat_select,enn,renn,scaler,clf_)

scores = cross_validate(clf,X,y,cv=cv,scoring=['f1','average_precision','roc_auc','precision','recall'],n_jobs=-1, return_train_score=True)
print_scores_cv(scores)

test_f1: 0.80 +- 0.04
train_f1: 0.95 +- 0.01
test_average_precision: 0.73 +- 0.10
train_average_precision: 0.96 +- 0.02
test_roc_auc: 0.92 +- 0.03
train_roc_auc: 1.00 +- 0.00
test_precision: 0.85 +- 0.02
train_precision: 0.90 +- 0.01
test_recall: 0.76 +- 0.06
train_recall: 1.00 +- 0.00


In [95]:
list_features = ['V9','V14','V16']
feat_select = FeatureSelectorDic(list_features, feat_to_idx)
scaler = StandardScaler()
# Note: no sampler is included in this pipeline, so this cell reproduces the baseline above.
clf_ = ExtraTreesClassifier(n_estimators=50,n_jobs=-1,random_state=0)
clf = make_pipeline(feat_select,scaler,clf_)

scores = cross_validate(clf,X,y,cv=cv,scoring=['f1','average_precision','roc_auc','precision','recall'],n_jobs=-1, return_train_score=True)
print_scores_cv(scores)

test_f1: 0.82 +- 0.06
train_f1: 1.00 +- 0.00
test_average_precision: 0.78 +- 0.07
train_average_precision: 1.00 +- 0.00
test_roc_auc: 0.92 +- 0.02
train_roc_auc: 1.00 +- 0.00
test_precision: 0.92 +- 0.03
train_precision: 1.00 +- 0.00
test_recall: 0.74 +- 0.08
train_recall: 1.00 +- 0.00


## Comparison of ensembling classifiers with undersampling
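These classifiers are balanced versions of ensemble algorithms. For example, `BalancedBaggingClassifier` is a bagging classifier that randomly under-samples each bootstrap sample to balance it before fitting the base estimator.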
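The idea of balancing each bootstrap sample can be sketched in plain NumPy as below. This is a simplified illustration rather than imbalanced-learn's implementation; the function name `balanced_bootstrap_indices` and the toy labels are hypothetical.

```python
import numpy as np

def balanced_bootstrap_indices(y, rng):
    """Draw a bootstrap sample, then under-sample every class down to the rarest one."""
    boot = rng.choice(len(y), size=len(y), replace=True)  # ordinary bootstrap
    classes, counts = np.unique(y[boot], return_counts=True)
    n_keep = counts.min()                                  # size of the rarest class in the sample
    kept = [rng.choice(boot[y[boot] == c], size=n_keep, replace=False) for c in classes]
    return np.concatenate(kept)

# Hypothetical labels: 950 negatives and 50 positives.
rng_bag = np.random.default_rng(0)
y_bag = np.array([0] * 950 + [1] * 50)
idx_bag = balanced_bootstrap_indices(y_bag, rng_bag)
print(np.bincount(y_bag[idx_bag]))
```

Each base estimator would then be fitted on such a balanced subset, which is consistent with the pattern in the scores below: recall goes up while precision drops sharply.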

#### Bagging of logistic regressions

In [102]:
scaler = StandardScaler()
base_estimator=LogisticRegression()
clf_ = BaggingClassifier(base_estimator=base_estimator,random_state=0,n_jobs=-1)
clf = Pipeline([('scaler',scaler),('clf_',clf_)])

scores = cross_validate(clf,X,y,cv=cv,scoring=['f1','average_precision','roc_auc','precision','recall'],n_jobs=-1, return_train_score=True)
print_scores_cv(scores)

test_f1: 0.71 +- 0.04
train_f1: 0.70 +- 0.01
test_average_precision: 0.74 +- 0.05
train_average_precision: 0.74 +- 0.01
test_roc_auc: 0.96 +- 0.01
train_roc_auc: 0.98 +- 0.00
test_precision: 0.88 +- 0.04
train_precision: 0.86 +- 0.01
test_recall: 0.60 +- 0.06
train_recall: 0.59 +- 0.01


In [101]:
scaler = StandardScaler()
base_estimator=LogisticRegression()
clf_ = BalancedBaggingClassifier(base_estimator=base_estimator,random_state=0,n_jobs=-1)
clf = Pipeline([('scaler',scaler),('clf_',clf_)])

scores = cross_validate(clf,X,y,cv=cv,scoring=['f1','average_precision','roc_auc','precision','recall'],n_jobs=-1, return_train_score=True)
print_scores_cv(scores)

test_f1: 0.09 +- 0.01
train_f1: 0.10 +- 0.02
test_average_precision: 0.62 +- 0.04
train_average_precision: 0.60 +- 0.07
test_roc_auc: 0.97 +- 0.01
train_roc_auc: 0.99 +- 0.00
test_precision: 0.05 +- 0.00
train_precision: 0.05 +- 0.01
test_recall: 0.89 +- 0.04
train_recall: 0.92 +- 0.00


#### Bagging of decision trees

In [103]:
scaler = StandardScaler()
clf_ = BaggingClassifier(random_state=0,n_jobs=-1)
clf = Pipeline([('scaler',scaler),('clf_',clf_)])

scores = cross_validate(clf,X,y,cv=cv,scoring=['f1','average_precision','roc_auc','precision','recall'],n_jobs=-1, return_train_score=True)
print_scores_cv(scores)

test_f1: 0.82 +- 0.04
train_f1: 0.97 +- 0.01
test_average_precision: 0.77 +- 0.08
train_average_precision: 1.00 +- 0.00
test_roc_auc: 0.91 +- 0.03
train_roc_auc: 1.00 +- 0.00
test_precision: 0.90 +- 0.05
train_precision: 0.99 +- 0.01
test_recall: 0.76 +- 0.04
train_recall: 0.95 +- 0.01


In [100]:
scaler = StandardScaler()
clf_ = BalancedBaggingClassifier(random_state=0,n_jobs=-1)
clf = Pipeline([('scaler',scaler),('clf_',clf_)])

scores = cross_validate(clf,X,y,cv=cv,scoring=['f1','average_precision','roc_auc','precision','recall'],n_jobs=-1, return_train_score=True)
print_scores_cv(scores)

test_f1: 0.12 +- 0.02
train_f1: 0.13 +- 0.02
test_average_precision: 0.52 +- 0.02
train_average_precision: 0.57 +- 0.06
test_roc_auc: 0.96 +- 0.02
train_roc_auc: 1.00 +- 0.00
test_precision: 0.06 +- 0.01
train_precision: 0.07 +- 0.01
test_recall: 0.88 +- 0.02
train_recall: 0.99 +- 0.01


#### Random Forest Classifier

In [99]:
scaler = StandardScaler()
clf_ = RandomForestClassifier(random_state=0,n_jobs=-1)
clf = Pipeline([('scaler',scaler),('clf_',clf_)])

scores = cross_validate(clf,X,y,cv=cv,scoring=['f1','average_precision','roc_auc','precision','recall'],n_jobs=-1, return_train_score=True)
print_scores_cv(scores)

test_f1: 0.81 +- 0.05
train_f1: 0.97 +- 0.01
test_average_precision: 0.76 +- 0.08
train_average_precision: 1.00 +- 0.00
test_roc_auc: 0.92 +- 0.03
train_roc_auc: 1.00 +- 0.00
test_precision: 0.89 +- 0.02
train_precision: 1.00 +- 0.00
test_recall: 0.74 +- 0.07
train_recall: 0.94 +- 0.01


In [98]:
scaler = StandardScaler()
clf_ = BalancedRandomForestClassifier(random_state=0,n_jobs=-1)
clf = Pipeline([('scaler',scaler),('clf_',clf_)])

scores = cross_validate(clf,X,y,cv=cv,scoring=['f1','average_precision','roc_auc','precision','recall'],n_jobs=-1, return_train_score=True)
print_scores_cv(scores)

test_f1: 0.11 +- 0.01
train_f1: 0.13 +- 0.01
test_average_precision: 0.72 +- 0.06
train_average_precision: 0.80 +- 0.01
test_roc_auc: 0.97 +- 0.01
train_roc_auc: 1.00 +- 0.00
test_precision: 0.06 +- 0.00
train_precision: 0.07 +- 0.01
test_recall: 0.88 +- 0.03
train_recall: 0.99 +- 0.00


#### Adaboost

In [104]:
scaler = StandardScaler()
clf_ = AdaBoostClassifier(random_state=0)
clf = Pipeline([('scaler',scaler),('clf_',clf_)])

scores = cross_validate(clf,X,y,cv=cv,scoring=['f1','average_precision','roc_auc','precision','recall'],n_jobs=-1, return_train_score=True)
print_scores_cv(scores)

test_f1: 0.72 +- 0.04
train_f1: 0.74 +- 0.00
test_average_precision: 0.73 +- 0.08
train_average_precision: 0.80 +- 0.01
test_roc_auc: 0.97 +- 0.01
train_roc_auc: 1.00 +- 0.00
test_precision: 0.81 +- 0.03
train_precision: 0.82 +- 0.02
test_recall: 0.65 +- 0.05
train_recall: 0.68 +- 0.02


In [97]:
scaler = StandardScaler()
base_estimator = AdaBoostClassifier(n_estimators=10)
clf_ = EasyEnsembleClassifier(base_estimator=base_estimator,random_state=0, n_jobs=-1)
clf = Pipeline([('scaler',scaler),('clf_',clf_)])

scores = cross_validate(clf,X,y,cv=cv,scoring=['f1','average_precision','roc_auc','precision','recall'],n_jobs=-1, return_train_score=True)
print_scores_cv(scores)

test_f1: 0.07 +- 0.01
train_f1: 0.08 +- 0.01
test_average_precision: 0.70 +- 0.05
train_average_precision: 0.71 +- 0.03
test_roc_auc: 0.97 +- 0.01
train_roc_auc: 0.99 +- 0.00
test_precision: 0.04 +- 0.00
train_precision: 0.04 +- 0.01
test_recall: 0.89 +- 0.04
train_recall: 0.93 +- 0.01


In [96]:
scaler = StandardScaler()
base_estimator = AdaBoostClassifier(n_estimators=10)
clf_ = RUSBoostClassifier(n_estimators=10,
                              base_estimator=base_estimator)
clf = Pipeline([('scaler',scaler),('clf_',clf_)])

scores = cross_validate(clf,X,y,cv=cv,scoring=['f1','average_precision','roc_auc','precision','recall'],n_jobs=-1, return_train_score=True)
print_scores_cv(scores)

test_f1: 0.09 +- 0.00
train_f1: 0.09 +- 0.01
test_average_precision: 0.69 +- 0.06
train_average_precision: 0.70 +- 0.02
test_roc_auc: 0.97 +- 0.01
train_roc_auc: 1.00 +- 0.00
test_precision: 0.05 +- 0.00
train_precision: 0.05 +- 0.00
test_recall: 0.90 +- 0.02
train_recall: 0.94 +- 0.01
