## Pipeline: Evaluate results on validation set

This notebook uses the Titanic dataset from [this Kaggle competition](https://www.kaggle.com/c/titanic/overview).

In this section, we use what we learned in the previous section to fit the best few models on the full training set and then evaluate each of them on the validation set.

### Read in data

![Eval on Validation](../../img/evaluate_on_validation.png)

In [1]:
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score

tr_features = pd.read_csv('../../../train_features.csv')
tr_labels = pd.read_csv('../../../train_labels.csv', header=None, skiprows=1)

val_features = pd.read_csv('../../../val_features.csv')
val_labels = pd.read_csv('../../../val_labels.csv', header=None, skiprows=1)

te_features = pd.read_csv('../../../test_features.csv')
te_labels = pd.read_csv('../../../test_labels.csv', header=None, skiprows=1)

### Fit best models on full training set

Results from the previous section (the arrows mark the three highest scores):
```
0.76 (+/-0.116) for {'max_depth': 2, 'n_estimators': 5}
0.796 (+/-0.119) for {'max_depth': 2, 'n_estimators': 50}
0.803 (+/-0.117) for {'max_depth': 2, 'n_estimators': 100}
--> 0.828 (+/-0.074) for {'max_depth': 10, 'n_estimators': 5}
0.816 (+/-0.028) for {'max_depth': 10, 'n_estimators': 50}
--> 0.826 (+/-0.046) for {'max_depth': 10, 'n_estimators': 100}
0.785 (+/-0.106) for {'max_depth': 20, 'n_estimators': 5}
0.813 (+/-0.027) for {'max_depth': 20, 'n_estimators': 50}
0.809 (+/-0.029) for {'max_depth': 20, 'n_estimators': 100}
0.794 (+/-0.04) for {'max_depth': None, 'n_estimators': 5}
0.809 (+/-0.037) for {'max_depth': None, 'n_estimators': 50}
--> 0.818 (+/-0.035) for {'max_depth': None, 'n_estimators': 100}
```
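
The arrows in the listing above can be reproduced by sorting the rows by score. The following is a minimal sketch, assuming the scores are copied by hand into a list of tuples; the `results` list and the `top_k` helper are illustrative names, not part of the original notebook.

```python
# Scores copied from the listing above: (mean score, max_depth, n_estimators).
results = [
    (0.760, 2, 5), (0.796, 2, 50), (0.803, 2, 100),
    (0.828, 10, 5), (0.816, 10, 50), (0.826, 10, 100),
    (0.785, 20, 5), (0.813, 20, 50), (0.809, 20, 100),
    (0.794, None, 5), (0.809, None, 50), (0.818, None, 100),
]


def top_k(rows, k=3):
    """Return the k rows with the highest score, best first (hypothetical helper)."""
    # Sort on the score only, so max_depth=None never gets compared to an int.
    return sorted(rows, key=lambda row: row[0], reverse=True)[:k]


best = top_k(results)
for score, max_depth, n_estimators in best:
    print(f"{score:.3f} for max_depth={max_depth}, n_estimators={n_estimators}")
```

The three rows returned are the same hyperparameter settings that are fit as `rf1`, `rf2`, and `rf3` in the cells that follow.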

In [6]:
# Fit the three best models from the previous section on the full training set

rf1 = RandomForestClassifier(n_estimators=5, max_depth=10)
rf1.fit(tr_features, tr_labels.values.ravel())

RandomForestClassifier(bootstrap=True, ccp_alpha=0.0, class_weight=None,
                       criterion='gini', max_depth=10, max_features='auto',
                       max_leaf_nodes=None, max_samples=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=1, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, n_estimators=5,
                       n_jobs=None, oob_score=False, random_state=None,
                       verbose=0, warm_start=False)

In [7]:
rf2 = RandomForestClassifier(n_estimators=100, max_depth=10)
rf2.fit(tr_features, tr_labels.values.ravel())

rf3 = RandomForestClassifier(n_estimators=100, max_depth=None)
rf3.fit(tr_features, tr_labels.values.ravel())

RandomForestClassifier(bootstrap=True, ccp_alpha=0.0, class_weight=None,
                       criterion='gini', max_depth=None, max_features='auto',
                       max_leaf_nodes=None, max_samples=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=1, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, n_estimators=100,
                       n_jobs=None, oob_score=False, random_state=None,
                       verbose=0, warm_start=False)
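
The fitted models above report `random_state=None`, so repeated runs can build slightly different forests and produce slightly different scores. The sketch below shows one way to make a fit repeatable by passing a fixed seed. It assumes scikit-learn and NumPy are installed and uses synthetic data from `make_classification` rather than the Titanic files; the seed value `42` and the `fit_predict` helper are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data (an assumption; not the Titanic features).
X, y = make_classification(n_samples=200, n_features=6, random_state=0)


def fit_predict(seed):
    """Fit a small forest with a fixed seed and return its class probabilities."""
    model = RandomForestClassifier(n_estimators=5, max_depth=10, random_state=seed)
    model.fit(X, y)
    return model.predict_proba(X)


# Two fits with the same seed should give identical probabilities.
same_seed_match = np.array_equal(fit_predict(42), fit_predict(42))
print(same_seed_match)
```

Whether to fix the seed is a design choice: it makes the printed metrics reproducible, at the cost of hiding the run-to-run variation of the model.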

### Evaluate models on validation set

![Evaluation Metrics](../../img/eval_metrics.png)

In [8]:
for mdl in [rf1, rf2, rf3]:
    y_pred = mdl.predict(val_features)
    accuracy = round(accuracy_score(val_labels, y_pred), 3)
    precision = round(precision_score(val_labels, y_pred), 3)
    recall = round(recall_score(val_labels, y_pred), 3)
    print('MAX DEPTH: {} / # OF EST: {} -- A: {} / P: {} / R: {}'.format(mdl.max_depth,
                                                                         mdl.n_estimators,
                                                                         accuracy,
                                                                         precision,
                                                                         recall))

MAX DEPTH: 10 / # OF EST: 5 -- A: 0.821 / P: 0.855 / R: 0.697
MAX DEPTH: 10 / # OF EST: 100 -- A: 0.827 / P: 0.846 / R: 0.724
MAX DEPTH: None / # OF EST: 100 -- A: 0.827 / P: 0.826 / R: 0.75
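
To make the three reported metrics concrete, the sketch below computes accuracy, precision, and recall by hand from true-positive, false-positive, false-negative, and true-negative counts. The labels are made up for illustration (they are not Titanic predictions), and `classification_metrics` is a hypothetical helper, not a scikit-learn function. Returning `0.0` when a denominator is zero is a choice made here for simplicity.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # predicted 1, actually 1
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # predicted 1, actually 0
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # predicted 0, actually 1
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # predicted 0, actually 0

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Made-up example: 5 actual positives, 4 predicted positives, 3 of them correct.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

metrics = classification_metrics(y_true, y_pred)
print("A: {} / P: {} / R: {}".format(*metrics))
```

Precision is the fraction of predicted positives that are correct, and recall is the fraction of actual positives that the model finds, which is why two models with the same accuracy can differ on both.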


### Evaluate the best model on the test set

![Final Model](../../img/final_model_selection.png)

On the validation set, `rf2` (`max_depth=10`, `n_estimators=100`) ties with `rf3` for the highest accuracy (0.827) and has the higher precision of the two, so `rf2` is the model evaluated on the test set here.

In [9]:
y_pred = rf2.predict(te_features)
accuracy = round(accuracy_score(te_labels, y_pred), 3)
precision = round(precision_score(te_labels, y_pred), 3)
recall = round(recall_score(te_labels, y_pred), 3)
print('MAX DEPTH: {} / # OF EST: {} -- A: {} / P: {} / R: {}'.format(rf2.max_depth,
                                                                     rf2.n_estimators,
                                                                     accuracy,
                                                                     precision,
                                                                     recall))

MAX DEPTH: 10 / # OF EST: 100 -- A: 0.792 / P: 0.741 / R: 0.662
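
As a small worked comparison, the sketch below computes the difference between the test scores and the validation scores for the `max_depth=10`, `n_estimators=100` model. The numbers are copied by hand from the outputs printed in this notebook; the dictionary names are illustrative.

```python
# Scores copied from the printed outputs for MAX DEPTH: 10 / # OF EST: 100.
validation = {"accuracy": 0.827, "precision": 0.846, "recall": 0.724}
test = {"accuracy": 0.792, "precision": 0.741, "recall": 0.662}

# Negative values mean the test score is lower than the validation score.
gaps = {name: round(test[name] - validation[name], 3) for name in validation}

for name, gap in gaps.items():
    print(f"{name}: {validation[name]} -> {test[name]} ({gap:+.3f})")
```

All three metrics are lower on the test set than on the validation set, with precision showing the largest drop.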
