## Build Models To Compare Features: Compare And Evaluate All Models

In this section, we will do the following:
1. Evaluate all of our saved models on the validation set
2. Select the best model based on performance on the validation set
3. Use that model to generate predictions for the holdout test set and save them to a CSV file
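
The cells that follow load a single saved model, so the comparison in steps 1 and 2 is not shown here. As a rough sketch of what that comparison could look like, the code below scores each candidate on a validation split with the metrics imported later (`accuracy_score`, `precision_score`, `recall_score`) and times the prediction call. Everything in it is an assumption for illustration: `evaluate_model` is a hypothetical helper, the two random forests stand in for the real saved models, and the data is synthetic.

```python
from time import time

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split


def evaluate_model(name, model, features, labels):
    """Hypothetical helper: score one fitted model on a labelled set."""
    start = time()
    pred = model.predict(features)
    latency_ms = (time() - start) * 1000
    return {
        "model": name,
        "accuracy": accuracy_score(labels, pred),
        "precision": precision_score(labels, pred, zero_division=0),
        "recall": recall_score(labels, pred, zero_division=0),
        "latency_ms": latency_ms,
    }


# Synthetic stand-in data; the real notebook would use its own validation split.
X, y = make_classification(n_samples=300, n_features=7, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

# Stand-ins for the saved models that would normally be loaded with joblib.
candidates = {
    "rf_small": RandomForestClassifier(n_estimators=5, random_state=0).fit(X_train, y_train),
    "rf_large": RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train),
}

results = [evaluate_model(name, model, X_val, y_val) for name, model in candidates.items()]
best = max(results, key=lambda r: r["accuracy"])
print(best["model"], round(best["accuracy"], 3))
```

Selecting on accuracy alone is one possible choice; precision, recall, or latency could serve as the selection criterion or as tie-breakers, depending on what matters for the problem.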

### Read In Data

In [16]:
# Read in data
import joblib
import pandas as pd
from sklearn.metrics import accuracy_score, precision_score, recall_score
from time import time
%matplotlib inline


test_features = pd.read_csv('taitanic_test_cleaned_req.csv')

test_features.head()

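| | PassengerId | Sex | Age | Fare | Cabin_ind | Family_cnt | 2 | 3 |
|---|---|---|---|---|---|---|---|---|
| 0 | 892 | 1 | 34.5 | 1.509188 | 0 | 0 | 0 | 1 |
| 1 | 893 | 0 | 47.0 | 1.475773 | 0 | 1 | 0 | 1 |
| 2 | 894 | 1 | 62.0 | 1.574861 | 0 | 0 | 1 | 0 |
| 3 | 895 | 1 | 27.0 | 1.540028 | 0 | 0 | 0 | 1 |
| 4 | 896 | 0 | 22.0 | 1.651554 | 0 | 2 | 0 | 1 |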


### Read In Models

In [17]:
# Read in models
mdl = joblib.load('mdl_rf_tit_std.pkl')
mdl_sc = joblib.load('scaler.pkl')

In [None]:
# Scale the test features with the saved scaler (PassengerId is left untouched)
features = test_features.columns.drop('PassengerId')
test_features[features] = mdl_sc.transform(test_features[features])
test_features.head()

### Predicting On The Test Set

In [18]:
# Drop the ID column so the model only sees the feature columns
test = test_features.drop('PassengerId', axis=1)
pred = mdl.predict(test)
pred

array([0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1,
       0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0,
       0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0,
       0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
       0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1,
       0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
       0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
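
A long array of 0s and 1s is hard to judge by eye, so a quick class count is a useful sanity check. The sketch below uses only the first eight predictions from the output above as a small sample; on the full `pred` array the counts will differ. It assumes that 1 means "survived", matching the `Survived` column name used later.

```python
import numpy as np

# First eight predictions copied from the output above, used as a small sample.
sample_pred = np.array([0, 1, 0, 0, 1, 0, 0, 0])

# np.bincount counts how often each non-negative integer label occurs.
counts = np.bincount(sample_pred, minlength=2)
survival_rate = counts[1] / counts.sum()

print(f"not survived: {counts[0]}, survived: {counts[1]}, rate: {survival_rate:.2f}")
# prints: not survived: 6, survived: 2, rate: 0.25
```

A predicted survival rate far from the rate seen in the training data can be a hint that the test features were preprocessed differently from the training features.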

In [19]:
# Wrap the predictions in a one-column DataFrame
df = pd.DataFrame({'Survived': pred})
df.head()

| | Survived |
|---|---|
| 0 | 0 |
| 1 | 1 |
| 2 | 0 |
| 3 | 0 |
| 4 | 1 |


In [20]:
# Put passenger IDs and predictions side by side (rows are matched on the index)
test_ans = pd.concat([test_features['PassengerId'], df], axis=1)
test_ans.head()

| | PassengerId | Survived |
|---|---|---|
| 0 | 892 | 0 |
| 1 | 893 | 1 |
| 2 | 894 | 0 |
| 3 | 895 | 0 |
| 4 | 896 | 1 |
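
`pd.concat` with `axis=1` matches rows by index label, not by position. That works here because both objects carry the default 0-based index, but it silently produces extra rows filled with `NaN` if one side has been filtered or re-indexed. The sketch below illustrates the difference on toy values; the shifted index is a made-up scenario, not something that happens in this notebook.

```python
import numpy as np
import pandas as pd

ids = pd.Series([892, 893, 894], name="PassengerId")
pred = np.array([0, 1, 0])

# Same default RangeIndex on both sides: rows line up as expected.
aligned = pd.concat([ids, pd.DataFrame({"Survived": pred})], axis=1)

# Hypothetical case: the IDs come from a frame whose index no longer starts at 0.
shifted_ids = ids.copy()
shifted_ids.index = [10, 11, 12]
misaligned = pd.concat([shifted_ids, pd.DataFrame({"Survived": pred})], axis=1)

# Building the frame from raw values sidesteps index alignment entirely.
direct = pd.DataFrame({"PassengerId": shifted_ids.to_numpy(), "Survived": pred})

print(aligned.shape, misaligned.shape, direct.shape)
# prints: (3, 2) (6, 2) (3, 2)
```

Checking the shape of the result, as the next cell does, is a simple guard against this kind of misalignment.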


In [21]:
test_ans.shape

(418, 2)

In [22]:
# Write the predictions to disk without the index column
test_ans.to_csv('test_ans.csv', index=False)
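
Before a predictions file is used anywhere else, it can be worth reading it back and checking its structure. The sketch below does this on a toy frame written to a temporary directory, so it does not touch the real `test_ans.csv`. The specific checks (two expected columns, unique IDs, binary labels, no missing values) are assumptions about what a well-formed file should look like, not requirements stated in this notebook.

```python
import os
import tempfile

import pandas as pd

# Toy submission with the same two columns as test_ans; values are illustrative.
submission = pd.DataFrame({"PassengerId": [892, 893, 894], "Survived": [0, 1, 0]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test_ans.csv")
    submission.to_csv(path, index=False)
    check = pd.read_csv(path)  # read back what was actually written

problems = []
if list(check.columns) != ["PassengerId", "Survived"]:
    problems.append("unexpected columns")
if check["PassengerId"].duplicated().any():
    problems.append("duplicate passenger IDs")
if not check["Survived"].isin([0, 1]).all():
    problems.append("predictions outside {0, 1}")
if check.isna().any().any():
    problems.append("missing values")

print(problems if problems else "submission looks well-formed")
```

Writing with `index=False` matters for the first check: without it, the row index is saved as an extra unnamed column and shows up as `Unnamed: 0` when the file is read back.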