## Pipeline: Evaluate results on validation set

This section uses the Titanic dataset from [this Kaggle competition](https://www.kaggle.com/c/titanic/overview).

In this section, we will use what we learned in the last section to fit the best few models on the full training set and then evaluate those models on the validation set.

### Read in data & create train/validation/test sets

In [8]:
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split, GridSearchCV

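# Load the cleaned Titanic data
titanic = pd.read_csv('../titanic_cleaned.csv')

# Separate the features from the target column
features = titanic.drop('Survived', axis=1)
labels = titanic['Survived']

# Hold out 40% of the rows, then halve the hold-out: 60% train / 20% validation / 20% test
X_train, X_holdout, y_train, y_holdout = train_test_split(features, labels, test_size=0.4, random_state=42)
X_test, X_val, y_test, y_val = train_test_split(X_holdout, y_holdout, test_size=0.5, random_state=42)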
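As a quick sanity check on the two-step split, the sketch below repeats it on a small synthetic frame and confirms that it yields a 60/20/20 partition with no row shared between sets. The synthetic columns and the `demo`, `Xd_*`, and `yd_*` names are assumptions made for illustration; they are not part of the cleaned Titanic file.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the cleaned data (hypothetical columns, 100 rows)
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    'Age': rng.integers(1, 80, size=100),
    'Fare': rng.uniform(5, 100, size=100),
    'Survived': rng.integers(0, 2, size=100),
})
demo_features = demo.drop('Survived', axis=1)
demo_labels = demo['Survived']

# Step 1: keep 60% for training, hold out 40%
Xd_train, Xd_hold, yd_train, yd_hold = train_test_split(
    demo_features, demo_labels, test_size=0.4, random_state=42)
# Step 2: halve the hold-out into test and validation sets
Xd_test, Xd_val, yd_test, yd_val = train_test_split(
    Xd_hold, yd_hold, test_size=0.5, random_state=42)

# Count the rows in each set and express them as fractions of the whole
split_sizes = {'train': len(yd_train), 'val': len(yd_val), 'test': len(yd_test)}
split_fractions = {name: n / len(demo_labels) for name, n in split_sizes.items()}
print(split_fractions)

# The three sets should not share any row index
overlap = (set(Xd_train.index) & set(Xd_val.index)) | \
          (set(Xd_train.index) & set(Xd_test.index)) | \
          (set(Xd_val.index) & set(Xd_test.index))
```

With 100 rows this gives 60 training rows and 20 rows each for validation and test.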

### Fit best models on full training set

Results from the last section (arrows mark the three parameter combinations with the highest scores, which we refit below):
```
0.813 (+/-0.112) for {'max_depth': 2, 'n_estimators': 5}
0.8 (+/-0.124) for {'max_depth': 2, 'n_estimators': 50}
0.801 (+/-0.117) for {'max_depth': 2, 'n_estimators': 100}
0.792 (+/-0.037) for {'max_depth': 10, 'n_estimators': 5}
--> 0.82 (+/-0.052) for {'max_depth': 10, 'n_estimators': 50}
--> 0.826 (+/-0.048) for {'max_depth': 10, 'n_estimators': 100}
0.803 (+/-0.043) for {'max_depth': 20, 'n_estimators': 5}
--> 0.822 (+/-0.054) for {'max_depth': 20, 'n_estimators': 50}
0.811 (+/-0.051) for {'max_depth': 20, 'n_estimators': 100}
0.798 (+/-0.051) for {'max_depth': None, 'n_estimators': 5}
0.811 (+/-0.06) for {'max_depth': None, 'n_estimators': 50}
0.818 (+/-0.04) for {'max_depth': None, 'n_estimators': 100}
```
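A listing like the one above can be ranked programmatically rather than by eye. The following is a minimal sketch, assuming the scores came from a fitted `GridSearchCV` object and that the `+/-` figure is two standard deviations of the fold scores (an assumption, since the notebook does not show how it was computed). The synthetic data, the reduced grid, and the `search` and `top_params` names are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic binary classification data standing in for the training set
X_demo, y_demo = make_classification(n_samples=200, n_features=6, random_state=0)

# A reduced, illustrative grid (not the grid used in the last section)
param_grid = {'n_estimators': [5, 50], 'max_depth': [2, 10, None]}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)
search.fit(X_demo, y_demo)

means = search.cv_results_['mean_test_score']
stds = search.cv_results_['std_test_score']
params = search.cv_results_['params']

# Indices of the top_k parameter sets, ordered by descending mean CV score
top_k = 3
best_idx = np.argsort(means)[::-1][:top_k]
top_params = [params[i] for i in best_idx]

for i in best_idx:
    # Assumed display convention: mean score, then +/- two standard deviations
    print('{} (+/-{}) for {}'.format(round(means[i], 3), round(stds[i] * 2, 3), params[i]))
```

Each entry of `top_params` is a dictionary that can be unpacked straight into the estimator, for example `RandomForestClassifier(**top_params[0])`.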

In [19]:
# The three best parameter combinations from the results above
rf1 = RandomForestClassifier(n_estimators=50, max_depth=10)
rf2 = RandomForestClassifier(n_estimators=100, max_depth=10)
rf3 = RandomForestClassifier(n_estimators=50, max_depth=20)

# Refit each candidate on the full training set
rf1.fit(X_train, y_train)
rf2.fit(X_train, y_train)
rf3.fit(X_train, y_train)

RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
            max_depth=20, max_features='auto', max_leaf_nodes=None,
            min_impurity_decrease=0.0, min_impurity_split=None,
            min_samples_leaf=1, min_samples_split=2,
            min_weight_fraction_leaf=0.0, n_estimators=50, n_jobs=None,
            oob_score=False, random_state=None, verbose=0,
            warm_start=False)
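Because the forests above are created with `random_state=None`, refitting them may give slightly different trees, and therefore slightly different metrics, on each run. A minimal sketch of one way to make the fits repeatable is shown below; it assumes synthetic data, and `candidate_params` and `fit_candidates` are hypothetical helper names, not part of the original notebook.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic binary classification data standing in for the training set
X_demo, y_demo = make_classification(n_samples=300, n_features=6, random_state=0)

# The same three parameter combinations as the candidate models
candidate_params = [
    {'n_estimators': 50, 'max_depth': 10},
    {'n_estimators': 100, 'max_depth': 10},
    {'n_estimators': 50, 'max_depth': 20},
]

def fit_candidates(seed):
    """Fit one forest per parameter set, all sharing the same random seed."""
    return [RandomForestClassifier(random_state=seed, **p).fit(X_demo, y_demo)
            for p in candidate_params]

# Two independent runs with the same seed should produce identical predictions
run_a = fit_candidates(seed=42)
run_b = fit_candidates(seed=42)
same_predictions = all(
    np.array_equal(a.predict(X_demo), b.predict(X_demo))
    for a, b in zip(run_a, run_b)
)
print(same_predictions)
```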

### Evaluate models on validation set

![Evaluation Metrics](img/eval_metrics.png)
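To make the metrics concrete, here is a minimal sketch that computes accuracy, precision, and recall directly from the confusion-matrix counts, using the standard definitions. The `classification_metrics` helper and the toy label lists are assumptions made for illustration; the notebook itself relies on the scikit-learn functions imported earlier.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision and recall for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)                   # share of all predictions that are correct
    precision = tp / (tp + fp) if (tp + fp) else 0.0    # share of predicted positives that are correct
    recall = tp / (tp + fn) if (tp + fn) else 0.0       # share of actual positives that are found
    return {'accuracy': accuracy, 'precision': precision, 'recall': recall}

# Toy example: 4 actual positives, 6 actual negatives
y_true_demo = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred_demo = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics_demo = classification_metrics(y_true_demo, y_pred_demo)
print(metrics_demo)
```

In the toy example there are 3 true positives, 1 false negative, 1 false positive, and 5 true negatives, so accuracy is 0.8 and both precision and recall are 0.75.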

In [18]:
for model in [rf1, rf2, rf3]:
    # Predict hard class labels for the validation set
    y_pred = model.predict(X_val)
    accuracy = round(accuracy_score(y_val, y_pred), 3)
    precision = round(precision_score(y_val, y_pred), 3)
    recall = round(recall_score(y_val, y_pred), 3)
    # Note: AUC is computed here from hard class labels, not predicted probabilities
    auc = round(roc_auc_score(y_val, y_pred), 3)
    print('MAX DEPTH: {} / # OF EST: {} -- A: {} / P: {} / R: {} / AUC: {}'.format(
        model.max_depth, model.n_estimators, accuracy, precision, recall, auc))

MAX DEPTH: 10 / # OF EST: 50 -- A: 0.81 / P: 0.818 / R: 0.711 / AUC: 0.797
MAX DEPTH: 10 / # OF EST: 100 -- A: 0.816 / P: 0.831 / R: 0.711 / AUC: 0.802
MAX DEPTH: 20 / # OF EST: 50 -- A: 0.832 / P: 0.848 / R: 0.737 / AUC: 0.82
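One caveat about the AUC column: it is computed from hard 0/1 predictions, and for a binary problem that value equals the balanced accuracy (the mean of recall and specificity). ROC AUC is more commonly computed from the predicted probability of the positive class. The sketch below contrasts the two on synthetic data; the data, the `demo_rf` model, and the `Xd_*`/`yd_*` names are assumptions for illustration, and the numbers it prints are unrelated to the Titanic results above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data standing in for the Titanic features
X_demo, y_demo = make_classification(n_samples=400, n_features=8, random_state=0)
Xd_train, Xd_val, yd_train, yd_val = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0)

demo_rf = RandomForestClassifier(n_estimators=50, max_depth=10, random_state=0)
demo_rf.fit(Xd_train, yd_train)

# Hard 0/1 labels versus the predicted probability of the positive class
hard_labels = demo_rf.predict(Xd_val)
positive_proba = demo_rf.predict_proba(Xd_val)[:, 1]

auc_from_labels = roc_auc_score(yd_val, hard_labels)      # what the loop above computes
auc_from_proba = roc_auc_score(yd_val, positive_proba)    # threshold-independent ROC AUC
balanced_acc = balanced_accuracy_score(yd_val, hard_labels)

print('AUC from labels: {:.3f}'.format(auc_from_labels))
print('Balanced accuracy: {:.3f}'.format(balanced_acc))
print('AUC from probabilities: {:.3f}'.format(auc_from_proba))
```

The first two printed values coincide, which is why the label-based AUC adds little beyond accuracy, precision, and recall. The probability-based version summarizes ranking quality across all decision thresholds.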
