#What is Model Evaluation?


Model evaluation is the process of using different evaluation metrics to understand a machine learning model’s performance, as well as its strengths and weaknesses. Model evaluation is important to assess the efficacy of a model during initial research phases, and it also plays a role in model monitoring.

To understand whether your model works well on new data, you can use a number of evaluation metrics.


##Classification

The most popular metrics for measuring classification performance include accuracy, precision, confusion matrix, log-loss, and AUC (area under the ROC curve).
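Log-loss and AUC differ from the other metrics in this list because they score predicted probabilities rather than hard class labels. The sketch below computes both by hand on a hypothetical four-sample example (the labels and probabilities are made up for illustration); in practice `sklearn.metrics.log_loss` and `sklearn.metrics.roc_auc_score` would normally be used instead of these helper functions.

```python
import math

# Hypothetical toy data: true labels and predicted probabilities of class 1.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]


def log_loss_manual(labels, probs):
    """Average negative log-likelihood of the true labels."""
    total = 0.0
    for y, p in zip(labels, probs):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)


def auc_manual(labels, probs):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    score = 0.0
    for p_pos in pos:
        for p_neg in neg:
            if p_pos > p_neg:
                score += 1.0
            elif p_pos == p_neg:
                score += 0.5
    return score / (len(pos) * len(neg))


loss = log_loss_manual(y_true, y_prob)
auc = auc_manual(y_true, y_prob)
print("Log-loss:", round(loss, 4))
print("AUC:", auc)
```

Lower log-loss is better, while AUC ranges from 0 to 1, with 0.5 corresponding to random ranking.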


**Accuracy** measures how often the classifier predicts correctly: it is the ratio of the number of correct predictions to the total number of predictions.


**Precision** measures the proportion of predicted Positives that are truly Positive. Precision is a good choice of evaluation metric when you want to be very sure of your prediction. For example, if you are building a system to predict whether to decrease the credit limit on a particular account, you want to be very sure about the prediction, because a wrong one may result in customer dissatisfaction.


**A confusion matrix** (or confusion table) shows a more detailed breakdown of correct and incorrect classifications for each class. It is useful when you want to understand how well the model distinguishes between classes, particularly when the cost of misclassification differs between the two classes, or when you have much more test data for one class than the other. For example, the consequences of a false positive and a false negative in a cancer diagnosis are very different.
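To make the three definitions above concrete, here is a minimal sketch that tallies a 2x2 confusion matrix from label pairs and derives accuracy and precision from it. The six labels are hypothetical, and the layout (rows are actual classes, columns are predicted classes) follows the convention scikit-learn uses.

```python
# Hypothetical labels for six samples (1 = positive, 0 = negative).
actual = [1, 0, 1, 1, 0, 0]
predicted = [1, 0, 0, 1, 1, 0]

# matrix[a][p] counts samples whose actual class is a and predicted class is p.
matrix = [[0, 0], [0, 0]]
for a, p in zip(actual, predicted):
    matrix[a][p] += 1

tn, fp = matrix[0]  # actual negatives: correctly rejected, wrongly flagged
fn, tp = matrix[1]  # actual positives: missed, correctly flagged

accuracy = (tp + tn) / len(actual)  # correct predictions / all predictions
precision = tp / (tp + fp)          # true positives / predicted positives

print("Confusion matrix:", matrix)
print("Accuracy:", round(accuracy, 3))
print("Precision:", round(precision, 3))
```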

In [71]:
#Importing the libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

In [24]:
#Loading the titanic dataset

path = "https://web.stanford.edu/class/archive/cs/cs109/cs109.1166/stuff/titanic.csv"

# Read the CSV file into a DataFrame
df = pd.read_csv(path)

# Display the DataFrame
df.head()



Unnamed: 0,Survived,Pclass,Name,Sex,Age,Siblings/Spouses Aboard,Parents/Children Aboard,Fare
0,0,3,Mr. Owen Harris Braund,male,22.0,1,0,7.25
1,1,1,Mrs. John Bradley (Florence Briggs Thayer) Cum...,female,38.0,1,0,71.2833
2,1,3,Miss. Laina Heikkinen,female,26.0,0,0,7.925
3,1,1,Mrs. Jacques Heath (Lily May Peel) Futrelle,female,35.0,1,0,53.1
4,0,3,Mr. William Henry Allen,male,35.0,0,0,8.05


In [25]:
#Shape of the data

df.shape

(887, 8)

In [26]:
#Check for null values

df.isnull().sum()

Survived                   0
Pclass                     0
Name                       0
Sex                        0
Age                        0
Siblings/Spouses Aboard    0
Parents/Children Aboard    0
Fare                       0
dtype: int64

In [51]:
import seaborn as sns

# Load the Titanic dataset from seaborn
titanic_data = sns.load_dataset('titanic')

# Select the specified columns
selected_columns = ['age', 'fare', 'sex', 'sibsp', 'parch', 'pclass', 'embarked', 'survived']
titanic_subset = titanic_data[selected_columns]

# Rename the 'pclass' column to 'Pclass' and 'survived' to 'x2survived'
titanic_subset = titanic_subset.rename(columns={'pclass': 'Pclass', 'survived': 'x2survived'})

# Display the subset of the dataset
print(titanic_subset)

      age     fare     sex  sibsp  parch  Pclass embarked  x2survived
0    22.0   7.2500    male      1      0       3        S           0
1    38.0  71.2833  female      1      0       1        C           1
2    26.0   7.9250  female      0      0       3        S           1
3    35.0  53.1000  female      1      0       1        S           1
4    35.0   8.0500    male      0      0       3        S           0
..    ...      ...     ...    ...    ...     ...      ...         ...
886  27.0  13.0000    male      0      0       2        S           0
887  19.0  30.0000  female      0      0       1        S           1
888   NaN  23.4500  female      1      2       3        S           0
889  26.0  30.0000    male      0      0       1        C           1
890  32.0   7.7500    male      0      0       3        Q           0

[891 rows x 8 columns]


In [52]:
#Copy the subset so later edits do not modify titanic_subset

t_df = titanic_subset.copy()

t_df.head()

Unnamed: 0,age,fare,sex,sibsp,parch,Pclass,embarked,x2survived
0,22.0,7.25,male,1,0,3,S,0
1,38.0,71.2833,female,1,0,1,C,1
2,26.0,7.925,female,0,0,3,S,1
3,35.0,53.1,female,1,0,1,S,1
4,35.0,8.05,male,0,0,3,S,0


In [53]:
t_df.isnull().sum()

age           177
fare            0
sex             0
sibsp           0
parch           0
Pclass          0
embarked        2
x2survived      0
dtype: int64

In [39]:
#Feature Engineering


#Combine sibsp and parch into one column called Company
#(1 if the passenger travelled with siblings/spouses or parents/children, 0 otherwise)

t_df.loc[(t_df.parch + t_df.sibsp < 1), 'Company'] = 0
t_df.loc[(t_df.parch + t_df.sibsp > 0), 'Company'] = 1

t_df.head()

Unnamed: 0,age,fare,sex,sibsp,parch,Pclass,embarked,x2survived,Company
0,22.0,7.25,male,1,0,3,S,0,1.0
1,38.0,71.2833,female,1,0,1,C,1,1.0
2,26.0,7.925,female,0,0,3,S,1,0.0
3,35.0,53.1,female,1,0,1,S,1,1.0
4,35.0,8.05,male,0,0,3,S,0,0.0
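The two `.loc` assignments above create the column with missing values first, which is why `Company` shows up as a float (1.0 / 0.0). A possible alternative, sketched here on a hypothetical four-row frame that only borrows the `sibsp` and `parch` column names, is to build the flag in one vectorized step and cast it to an integer.

```python
import pandas as pd

# Hypothetical mini-frame with the same column names as t_df.
demo = pd.DataFrame({"sibsp": [1, 0, 0, 2], "parch": [0, 0, 3, 1]})

# True where the passenger has at least one relative aboard, then cast to 0/1.
demo["Company"] = ((demo["sibsp"] + demo["parch"]) > 0).astype(int)

print(demo)
```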


In [None]:
#Group fare and age into fare and age groups

# Define the age groups
bins = [0, 18, 35, 55, 75, float('inf')]
labels = [1, 2, 3, 4, 5]

# Create the 'Age_group' column based on the specified age groups
t_df['Age_group'] = pd.cut(titanic_subset['age'], bins=bins, labels=labels, right=False)

t_df.head()
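Because `right=False` is passed, each bin includes its left edge and excludes its right edge, so an age of exactly 18 lands in group 2 and an age of exactly 35 lands in group 3. The following sketch checks this on a few hypothetical ages (the `ages` values are illustrative, not taken from the dataset) and shows that a missing age stays missing.

```python
import pandas as pd

# Hypothetical ages, including boundary values and one missing entry.
ages = pd.Series([17.9, 18, 35, 54, 80, None])

bins = [0, 18, 35, 55, 75, float("inf")]
labels = [1, 2, 3, 4, 5]

# Intervals are [0, 18), [18, 35), [35, 55), [55, 75), [75, inf).
groups = pd.cut(ages, bins=bins, labels=labels, right=False)

print(groups.tolist())
```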

In [31]:
#Same for fare

# Get the range of the 'fare' column
fare_range = t_df['fare'].min(), t_df['fare'].max()

# Display the fare range
print("Fare Range:", fare_range)

Fare Range: (0.0, 512.3292)


In [41]:
# Define the fare groups
bins = [0, 100, 300,  float('inf')]
labels = [1, 2, 3]

# Create the 'Fare_group' column based on the specified fare groups
t_df['Fare_group'] = pd.cut(titanic_subset['fare'], bins=bins, labels=labels, right=False)

t_df.head()

Unnamed: 0,age,fare,sex,sibsp,parch,Pclass,embarked,x2survived,Company,Age_group,Fare_group
0,22.0,7.25,male,1,0,3,S,0,1.0,2,1
1,38.0,71.2833,female,1,0,1,C,1,1.0,3,1
2,26.0,7.925,female,0,0,3,S,1,0.0,2,1
3,35.0,53.1,female,1,0,1,S,1,1.0,3,1
4,35.0,8.05,male,0,0,3,S,0,0.0,3,1


In [42]:
t_df.dtypes

age            float64
fare           float64
sex             object
sibsp            int64
parch            int64
Pclass           int64
embarked        object
x2survived       int64
Company        float64
Age_group     category
Fare_group    category
dtype: object

In [43]:
# Convert 'sex' column to 0 and 1
t_df['sex'] = t_df['sex'].map({'male': 0, 'female': 1})
t_df.head()

Unnamed: 0,age,fare,sex,sibsp,parch,Pclass,embarked,x2survived,Company,Age_group,Fare_group
0,22.0,7.25,0,1,0,3,S,0,1.0,2,1
1,38.0,71.2833,1,1,0,1,C,1,1.0,3,1
2,26.0,7.925,1,0,0,3,S,1,0.0,2,1
3,35.0,53.1,1,1,0,1,S,1,1.0,3,1
4,35.0,8.05,0,0,0,3,S,0,0.0,3,1


In [44]:
# Drop specified columns ('age', 'fare', 'sibsp', 'parch')
t_df = t_df.drop(['age', 'fare', 'sibsp', 'parch'], axis=1)

t_df.head()


Unnamed: 0,sex,Pclass,embarked,x2survived,Company,Age_group,Fare_group
0,0,3,S,0,1.0,2,1
1,1,1,C,1,1.0,3,1
2,1,3,S,1,0.0,2,1
3,1,1,S,1,1.0,3,1
4,0,3,S,0,0.0,3,1


In [45]:
# Convert 'embarked' column to 0, 1 and 2 (missing values stay NaN)
t_df['embarked'] = t_df['embarked'].map({'S': 2, 'C': 0, 'Q': 1})
t_df.head()

Unnamed: 0,sex,Pclass,embarked,x2survived,Company,Age_group,Fare_group
0,0,3,2.0,0,1.0,2,1
1,1,1,0.0,1,1.0,3,1
2,1,3,2.0,1,0.0,2,1
3,1,1,2.0,1,1.0,3,1
4,0,3,2.0,0,0.0,3,1
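The mapped column prints as `2.0` rather than `2` because `embarked` contains missing values: `map` leaves them as NaN, and a column holding NaN is stored as floats. A small sketch on a hypothetical four-value series illustrates the behaviour.

```python
import pandas as pd

# Hypothetical ports of embarkation with one missing entry.
embarked = pd.Series(["S", "C", None, "Q"])

# Keys absent from the mapping (including missing values) become NaN.
codes = embarked.map({"S": 2, "C": 0, "Q": 1})

print(codes.tolist())
print(codes.isna().sum(), "missing value(s) after mapping")
```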


#Training the models

In [46]:
#Reset index
dft = t_df.reset_index()



In [48]:
dft.shape

(891, 8)

In [50]:
dft.isnull().sum()

index           0
sex             0
Pclass          0
embarked        2
x2survived      0
Company         0
Age_group     177
Fare_group      0
dtype: int64

In [55]:
#Drop null
dft = dft.dropna()

dft.isnull().sum()


index         0
sex           0
Pclass        0
embarked      0
x2survived    0
Company       0
Age_group     0
Fare_group    0
dtype: int64

##Split the data

In [58]:
#Separate the features (x) from the target (y)

x = dft.drop(['index','x2survived', 'embarked','Company'], axis = 1)
y = dft['x2survived']

In [59]:
y.head()

0    0
1    1
2    1
3    1
4    0
Name: x2survived, dtype: int64

In [60]:
x.head()

Unnamed: 0,sex,Pclass,Age_group,Fare_group
0,0,3,2,1
1,1,1,3,1
2,1,3,2,1
3,1,1,3,1
4,0,3,3,1


In [67]:
#Split the data into training (70%) and test (30%) sets

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=0)

#Classifier (classify as survived or not survived)

clf = LogisticRegression()

# Train the classifier on the training data
clf.fit(x_train, y_train)

In [72]:
# Make predictions on the testing data
y_pred = clf.predict(x_test)

# Evaluate the model: accuracy, confusion matrix, and per-class report
accuracy = accuracy_score(y_test, y_pred)
con_matrix = confusion_matrix(y_test, y_pred)
clas_report = classification_report(y_test, y_pred)


print("Accuracy:", accuracy)

print("Confusion Matrix:", con_matrix)

print("Classification Report:", clas_report)




Accuracy: 0.7850467289719626
Confusion Matrix: [[102  23]
 [ 23  66]]
Classification Report:               precision    recall  f1-score   support

           0       0.82      0.82      0.82       125
           1       0.74      0.74      0.74        89

    accuracy                           0.79       214
   macro avg       0.78      0.78      0.78       214
weighted avg       0.79      0.79      0.79       214
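When individual values from the report are needed in code, `classification_report` accepts `output_dict=True` and returns a nested dictionary instead of a formatted string. The sketch below uses hypothetical toy labels rather than the Titanic predictions, so its numbers are unrelated to the report above.

```python
from sklearn.metrics import classification_report

# Hypothetical toy labels, not the Titanic test set.
y_true_demo = [0, 0, 0, 1, 1]
y_pred_demo = [0, 0, 1, 1, 0]

# Keys of the returned dict are the class labels as strings, plus
# 'accuracy', 'macro avg' and 'weighted avg'.
report = classification_report(y_true_demo, y_pred_demo, output_dict=True)

print("Precision for class 1:", report["1"]["precision"])
print("Support for class 0:", report["0"]["support"])
print("Accuracy:", report["accuracy"])
```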



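The confusion matrix shows the counts of true positive (TP), true negative (TN), false positive (FP), and false negative (FN) predictions. scikit-learn puts actual classes in rows and predicted classes in columns, so with class 1 (survived) as the positive class the matrix reads [[TN, FP], [FN, TP]]. In this case: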

True Positive (TP): 66

True Negative (TN): 102

False Positive (FP): 23

False Negative (FN): 23
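As a check, the headline numbers in the report can be recomputed from these four counts alone. The snippet below is a minimal sketch that copies the printed matrix into a plain nested list (rows are actual classes, columns are predicted classes) instead of reusing the `con_matrix` array.

```python
# Confusion matrix as printed above: rows = actual class, columns = predicted class.
counts = [[102, 23],
          [23, 66]]

(tn, fp), (fn, tp) = counts
total = tn + fp + fn + tp

accuracy = (tp + tn) / total
precision_1 = tp / (tp + fp)  # survived predictions that were right
recall_1 = tp / (tp + fn)     # actual survivors that were found
precision_0 = tn / (tn + fn)  # not-survived predictions that were right
recall_0 = tn / (tn + fp)     # actual non-survivors that were found

print("Accuracy:", round(accuracy, 2))
print("Class 1 precision / recall:", round(precision_1, 2), round(recall_1, 2))
print("Class 0 precision / recall:", round(precision_0, 2), round(recall_0, 2))
```

The rounded values agree with the classification report: 0.79 accuracy, 0.74 for class 1, and 0.82 for class 0.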





**Precision:**

Precision for class 0 (not survived): 0.82

Precision for class 1 (survived): 0.74

Precision is the ratio of correctly predicted positive observations to the total predicted positives. High precision means that few of the predicted positives are false positives.


**Recall (Sensitivity or True Positive Rate):**

Recall for class 0: 0.82

Recall for class 1: 0.74

Recall is the ratio of correctly predicted positive observations to the total actual positives. A high recall indicates a low false negative rate.


**F1-Score:**

F1-score for class 0: 0.82

F1-score for class 1: 0.74

The F1-score is the harmonic mean of precision and recall. It balances the two, providing a single metric that accounts for both false positives and false negatives.
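The F1-score is computed as the harmonic mean, 2 * P * R / (P + R), of precision P and recall R. The sketch below applies that formula to the class 1 values from the confusion matrix (66 / 89 for both) and then to a hypothetical unbalanced pair to show how it differs from a plain arithmetic mean. The helper name `f1_manual` is illustrative only.

```python
def f1_manual(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Class 1 in the report: precision and recall are both 66 / 89.
f1_class_1 = f1_manual(66 / 89, 66 / 89)
print("F1 for class 1:", round(f1_class_1, 2))

# Hypothetical unbalanced case: high precision, poor recall.
f1_unbalanced = f1_manual(0.9, 0.1)
arithmetic_mean = (0.9 + 0.1) / 2
print("F1:", round(f1_unbalanced, 2), "vs arithmetic mean:", arithmetic_mean)
```

The harmonic mean is pulled toward the smaller of the two values, so a model cannot reach a high F1-score by excelling at only one of precision or recall.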

**Support:**

Support for class 0: 125

Support for class 1: 89

Support is the number of actual occurrences of the class in the specified dataset.

**Accuracy:**

Overall accuracy: 0.79

Accuracy is the ratio of correctly predicted observations to the total observations. It provides a general measure of model performance.

**Macro Avg:**

Macro-averaged precision, recall, and F1-score consider each class equally, providing an unweighted average across classes.


**Weighted Avg:**

Weighted-averaged precision, recall, and F1-score consider each class with weight proportional to its support (number of occurrences), providing an average that considers the imbalance in class distribution.
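The difference between the two averages can be seen by recomputing the precision row of the report by hand. This is a minimal sketch using the per-class precision values implied by the confusion matrix (102 / 125 and 66 / 89) and the supports from the report.

```python
# Per-class precision and support taken from the results above.
precision = {0: 102 / 125, 1: 66 / 89}
support = {0: 125, 1: 89}

# Macro average: every class counts equally.
macro_avg = sum(precision.values()) / len(precision)

# Weighted average: each class counts in proportion to its support.
weighted_avg = sum(precision[c] * support[c] for c in precision) / sum(support.values())

print("Macro avg precision:", round(macro_avg, 2))
print("Weighted avg precision:", round(weighted_avg, 2))
```

Class 0 has more samples and the higher precision, so the weighted average (0.79) sits slightly above the macro average (0.78).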


In summary, this classification report offers a comprehensive view of how well your logistic regression model is performing for each class and overall. It's a valuable tool for understanding the strengths and weaknesses of your model in different aspects of classification.
