<h1>Model Comparison</h1>

<h3>Comparing Different Models</h3>

<p>Evaluation techniques are essential for deciding between multiple model alternatives for a given dataset.</p>

<p>We use these evaluation techniques to compare candidate models. In this case we have the following three models:</p>
<ul>
    <li>A logistic regression model using all of the features in our dataset</li>
    <li>A logistic regression model using just the Pclass, Age, and Sex columns</li>
    <li>A logistic regression model using just the Fare and Age columns</li>
</ul>

<p>We wouldn’t expect the second or third model to do better, since each has less information than the first. But we might find that using just those two or three columns yields performance comparable to using all the columns.</p>

<h3>Building the Models with Scikit-learn</h3>

<p>In this section, we use k-fold cross validation to calculate the accuracy, precision, recall and F1 score for the three models so that we can compare them.</p>

<p>First, we import the necessary modules and prepare the data.</p>

In [1]:
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
import pandas as pd
import numpy as np

df = pd.read_csv('../data/titanic.csv')
df['male'] = df['Sex'] == 'male'

<p>
    Now we can build the KFold object using the standard 5 splits.
</p>

<strong>
     Note that we want to create a single KFold object that all of the models will use.<br/> It would be unfair if different models got a different split of the data.
</strong>

In [2]:
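# A fixed random_state (0 is an arbitrary choice) makes every kf.split() call
# yield the same folds, so all three models are scored on identical splits.
kf = KFold(n_splits=5, shuffle=True, random_state=0)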
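
<p>
    To make the idea of a shared split concrete, here is a minimal plain-Python sketch of shuffled k-fold splitting. It does not use scikit-learn; the helper name k_fold_indices and its seed parameter are hypothetical and exist only for this illustration. The point it shows is that the folds are computed once and then reused, so every model sees the same train and test rows.
</p>

```python
import random

def k_fold_indices(n_samples, n_splits=5, seed=0):
    """Return (train, test) index lists for a shuffled k-fold split.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded, so the shuffle is repeatable
    # Spread any remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_samples, n_splits)
    folds, start = [], 0
    for i in range(n_splits):
        size = base + (1 if i < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds

# Compute the folds once, then hand the same list to every model being compared.
folds = k_fold_indices(n_samples=12, n_splits=5, seed=0)
for train, test in folds:
    print(len(train), len(test))
```

<p>
    Every row lands in exactly one test fold, and calling the helper again with the same seed reproduces the same folds.
</p>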

<p>Now we’ll create three different feature matrices X1, X2 and X3. All will have the same target y.</p>

In [3]:
X1 = df[['Pclass', 'male', 'Age', 'Siblings/Spouses', 'Parents/Children', 'Fare']].values
X2 = df[['Pclass', 'male', 'Age']].values
X3 = df[['Fare', 'Age']].values
y = df['Survived'].values

<p>
    The function below uses the KFold object to calculate the accuracy, precision, recall and F1 score for a Logistic Regression model with the given feature matrix X and target array y.
</p>

In [4]:
def score_model(X, y, kf):
    """Print the mean accuracy, precision, recall and F1 score across the folds of kf."""
    accuracy_scores = []
    precision_scores = []
    recall_scores = []
    f1_scores = []
    for train_index, test_index in kf.split(X):
        # Split the rows into this fold's training and test sets.
        X_train, X_test = X[train_index], X[test_index]
        y_train, y_test = y[train_index], y[test_index]
        # Fit a fresh model on the training rows only.
        model = LogisticRegression()
        model.fit(X_train, y_train)
        # Score the predictions on the held-out test rows.
        y_pred = model.predict(X_test)
        accuracy_scores.append(accuracy_score(y_test, y_pred))
        precision_scores.append(precision_score(y_test, y_pred))
        recall_scores.append(recall_score(y_test, y_pred))
        f1_scores.append(f1_score(y_test, y_pred))
    # Report each metric averaged over all folds.
    print("accuracy:", np.mean(accuracy_scores))
    print("precision:", np.mean(precision_scores))
    print("recall:", np.mean(recall_scores))
    print("f1 score:", np.mean(f1_scores), "\n")
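
<p>
    As a reminder of what the four scores measure, the sketch below computes them by hand for a single fold, using a small made-up set of labels and predictions (not Titanic data). The function name binary_scores is hypothetical, and returning 0.0 when a denominator is zero is a choice made for this sketch.
</p>

```python
def binary_scores(y_true, y_pred):
    """Return (accuracy, precision, recall, f1) for 0/1 labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

# Made-up labels: 3 true positives, 1 false positive, 1 false negative, 3 true negatives.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
accuracy, precision, recall, f1 = binary_scores(y_true, y_pred)
print(accuracy, precision, recall, f1)
```

<p>
    With these toy labels all four scores come out to 0.75. A model that rarely predicts the positive class can keep a reasonable precision while its recall, and therefore its F1 score, collapses.
</p>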

<p>
    Now we can call score_model once for each of our three feature matrices and see the results.
</p>

In [5]:
print("Logistic Regression with all features")
score_model(X1, y, kf)
print("Logistic Regression with Pclass, Sex & Age features")
score_model(X2, y, kf)
print("Logistic Regression with Fare & Age features")
score_model(X3, y, kf)

Logistic Regression with all features
accuracy: 0.7992890243128293
precision: 0.7627698895650215
recall: 0.6953830544418059
f1 score: 0.7273676901051221 

Logistic Regression with Pclass, Sex & Age features
accuracy: 0.7936202628070843
precision: 0.7489761384525269
recall: 0.6983683940215222
f1 score: 0.7223602309887129 

Logistic Regression with Fare & Age features
accuracy: 0.6607376372754397
precision: 0.6829195314489432
recall: 0.2432043656165157
f1 score: 0.3557699038225296 



<h3>Choosing a Best Model</h3>

<p>
    The evaluation results show that the first two models are much better options than the third: the third model’s recall (about 0.24) and F1 score (about 0.36) are far below those of the other two.
    <br/>
    This matches our intuition. The third model doesn’t have access to the sex of the passenger, and since women were more likely to survive, sex is a very valuable predictor.
</p>

<p>
    Since the first two models have nearly equivalent results (every score differs by less than 0.02), it makes sense to choose the simpler model, the one that uses the Pclass, Sex and Age features.
</p>
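
<p>
    One way to make this comparison explicit is to compute the per-metric gap between the full model and each smaller model. The sketch below uses the mean scores printed above; the 0.02 tolerance is an arbitrary threshold assumed for illustration, not a standard cutoff, and the helper name metric_gaps is hypothetical.
</p>

```python
# Mean cross-validation scores copied from the output above.
results = {
    "all features": {
        "accuracy": 0.7992890243128293, "precision": 0.7627698895650215,
        "recall": 0.6953830544418059, "f1": 0.7273676901051221,
    },
    "Pclass, Sex, Age": {
        "accuracy": 0.7936202628070843, "precision": 0.7489761384525269,
        "recall": 0.6983683940215222, "f1": 0.7223602309887129,
    },
    "Fare, Age": {
        "accuracy": 0.6607376372754397, "precision": 0.6829195314489432,
        "recall": 0.2432043656165157, "f1": 0.3557699038225296,
    },
}

TOLERANCE = 0.02  # assumed threshold for "comparable"; pick one that suits the problem

def metric_gaps(baseline, candidate):
    """Return baseline minus candidate for every metric."""
    return {metric: baseline[metric] - candidate[metric] for metric in baseline}

comparable = {}
for name in ("Pclass, Sex, Age", "Fare, Age"):
    gaps = metric_gaps(results["all features"], results[name])
    comparable[name] = all(abs(gap) <= TOLERANCE for gap in gaps.values())
    print(name, {m: round(g, 4) for m, g in gaps.items()}, comparable[name])
```

<p>
    Under this tolerance the Pclass, Sex and Age model counts as comparable to the full model, while the Fare and Age model does not, mainly because of its much lower recall.
</p>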

<p>
    Below we build the model with Pclass, Sex & Age features.
</p>

In [6]:
model = LogisticRegression()
model.fit(X2, y)

<p>
    Now we can make a prediction for a 25-year-old female passenger in third class using this model.
</p>

In [7]:
print(model.predict([[3, False, 25]]))

[1]


<p>
    The model predicts that this passenger survives!
</p>
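
<p>
    For intuition about where a 0 or 1 prediction comes from, the sketch below shows roughly how a logistic regression turns a feature row into a class: a weighted sum of the features is passed through the sigmoid function to get a probability, which is then thresholded at 0.5. The intercept and weights here are made-up placeholders chosen for illustration, not the coefficients of the model fitted above, and the function names are hypothetical.
</p>

```python
import math

# Made-up placeholder coefficients, NOT the values learned by the fitted model.
INTERCEPT = 4.5
WEIGHTS = {"Pclass": -1.0, "male": -2.5, "Age": -0.03}

def survival_probability(pclass, male, age):
    """Sigmoid of the weighted feature sum, interpreted as P(survived)."""
    z = (INTERCEPT
         + WEIGHTS["Pclass"] * pclass
         + WEIGHTS["male"] * int(male)
         + WEIGHTS["Age"] * age)
    return 1.0 / (1.0 + math.exp(-z))

def predict(pclass, male, age, threshold=0.5):
    """Return 1 (survived) if the probability exceeds the threshold, else 0."""
    return int(survival_probability(pclass, male, age) > threshold)

# Same feature order as X2: Pclass, male, Age.
print(predict(3, False, 25))  # 25-year-old woman in third class
print(predict(3, True, 25))   # 25-year-old man in third class
```

<p>
    With these placeholder weights the woman is predicted to survive and the man is not, because the negative weight on male pushes the weighted sum below zero.
</p>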

<strong>
Note that we have only tried three different combinations of features. It’s possible that a different combination would perform just as well or better.
</strong>
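
<p>
    If we wanted to search more systematically, one option is to enumerate every non-empty subset of the six feature columns. The sketch below only generates the candidate column lists; the helper name feature_subsets is hypothetical, and scoring each subset is left to the cross-validation function defined earlier.
</p>

```python
from itertools import combinations

# The six feature columns used for the full model.
columns = ['Pclass', 'male', 'Age', 'Siblings/Spouses', 'Parents/Children', 'Fare']

def feature_subsets(cols):
    """Yield every non-empty subset of cols as a list, smallest subsets first."""
    for size in range(1, len(cols) + 1):
        for subset in combinations(cols, size):
            yield list(subset)

subsets = list(feature_subsets(columns))
print(len(subsets))  # 2**6 - 1 = 63 candidate feature sets
```

<p>
    Each subset could then be turned into a feature matrix with df[subset].values and scored with the same KFold object, keeping in mind that trying many combinations increases the chance of picking one that looks good only by luck.
</p>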