# Model Training
<center>
<img src="https://www.pngkit.com/png/full/182-1820449_whitepages-pro-machine-learning-machine-learning-model-icon.png" alt="drawing" width="240"/>
</center>

This notebook trains a machine learning model on the cleaned dataset. The goal is to build, evaluate, and use the model.

## Table of Contents
1. [Introduction](#introduction)
2. [Importing Libraries](#importing-libraries)
3. [Loading Data](#loading-data)
4. [Splitting Dataset](#splitting-dataset)
5. [Model Training](#model-training)
6. [Model Evaluation](#model-evaluation)
7. [Use Case Example](#use-case-example)
8. [Exporting the Model](#exporting-the-model)
9. [Others](#others)
---



## Introduction

- This notebook trains our model on a preprocessed dataset. It covers:
  - **<font color='red'>Model Training </font>**
  - **<font color='red'>Performance Evaluation</font>**
  - **<font color='red'>Exporting the Model</font>**
---

## Importing Libraries

In [6]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns

## Loading Data

In [7]:
# Load the cleaned training data into a DataFrame.
titanic_data = pd.read_csv("../data/titanic_train.csv")

## Splitting Dataset

In [8]:
# Separate the target column (Survived) from the feature columns.
y_data = titanic_data['Survived']
x_data = titanic_data.drop('Survived', axis=1)

In [9]:
from sklearn.model_selection import train_test_split
# Hold out 20% of the rows for testing. No random_state is set, so the split differs on every run.
x_training_data, x_test_data, y_training_data, y_test_data = train_test_split(x_data, y_data, test_size=0.2)
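
Because the split above is unseeded, the test rows (and therefore the reported scores) can change between runs. A minimal sketch of a seeded, stratified split is shown here on a small synthetic frame; the names `demo_x` and `demo_y`, the seed `42`, and the data itself are illustrative assumptions and not part of this notebook.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the cleaned features and the Survived labels.
rng = np.random.default_rng(0)
demo_x = pd.DataFrame({
    "Age": rng.uniform(1, 80, size=100),
    "Fare": rng.uniform(5, 100, size=100),
})
demo_y = pd.Series([0] * 60 + [1] * 40, name="Survived")

# random_state fixes the shuffle; stratify keeps the class ratio in both parts.
x_train, x_test, y_train, y_test = train_test_split(
    demo_x, demo_y, test_size=0.2, random_state=42, stratify=demo_y
)

# Repeating the call with the same seed selects the same rows.
x_train_again, _, _, _ = train_test_split(
    demo_x, demo_y, test_size=0.2, random_state=42, stratify=demo_y
)
same_split = x_train.index.equals(x_train_again.index)
test_positive_rate = y_test.mean()

print("Same rows on repeat:", same_split)
print("Share of positives in the test set:", test_positive_rate)
```

Whether to stratify is a judgment call; it mainly helps when one class is noticeably smaller than the other.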

## Model Training

In [10]:
# Logistic regression was selected because this is a binary classification problem (Survived is 0 or 1).
from sklearn.linear_model import LogisticRegression
model = LogisticRegression(max_iter=1000)
model.fit(x_training_data, y_training_data)
predictions = model.predict(x_test_data)


## Model Evaluation

In [11]:
from sklearn.metrics import classification_report
print(classification_report(y_test_data, predictions))


              precision    recall  f1-score   support

           0       0.81      0.84      0.82       110
           1       0.72      0.68      0.70        68

    accuracy                           0.78       178
   macro avg       0.76      0.76      0.76       178
weighted avg       0.77      0.78      0.77       178



In [12]:
# Confusion matrix (rows: true labels, columns: predicted labels)
from sklearn.metrics import confusion_matrix
print(confusion_matrix(y_test_data, predictions))

[[92 18]
 [22 46]]
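
The classification report and the confusion matrix describe the same predictions, so the report's numbers for class 1 (survived) can be recomputed by hand. The sketch below copies the matrix printed above and assumes scikit-learn's layout of true labels in rows and predicted labels in columns.

```python
import numpy as np

# Confusion matrix copied from the output above.
cm = np.array([[92, 18],
               [22, 46]])

# For a binary problem, ravel() yields TN, FP, FN, TP in that order.
tn, fp, fn, tp = cm.ravel()

precision_survived = tp / (tp + fp)   # 46 / 64
recall_survived = tp / (tp + fn)      # 46 / 68
accuracy = (tp + tn) / cm.sum()       # 138 / 178

print(f"precision={precision_survived:.2f} "
      f"recall={recall_survived:.2f} "
      f"accuracy={accuracy:.2f}")
# prints: precision=0.72 recall=0.68 accuracy=0.78
```

These values match the row for class 1 and the accuracy line of the report.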


## Use Case Example

In [13]:
# A single hypothetical passenger, using the same columns as the training features.
new_input_data = pd.DataFrame({
    'PassengerId': [3434],
    'Pclass': [2],
    'Age': [1.0],
    'SibSp': [1],
    'Parch': [0],
    'Fare': [7.25],
    'male': [1],
    'Q': [0],
    'S': [1]
})

# predict() returns one label per input row; take the single result.
new_prediction = model.predict(new_input_data)[0]

if new_prediction == 0:
    message = "This Passenger did not survive"
else:
    message = "This Passenger did survive"

print("Prediction:", message)


Prediction: This Passenger did survive


## Exporting the Model

In [14]:
import joblib

joblib_file = "../models/titanic-prediction-model.joblib"
joblib.dump(model, joblib_file)


['../models/titanic-prediction-model.joblib']
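
To use the exported file later, the model can be read back with `joblib.load`. The sketch below runs a full round trip on a toy model in a temporary directory; `toy_x`, `toy_y`, `toy_model`, and the file name `toy-model.joblib` are hypothetical stand-ins for the notebook's data, `model`, and path.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data standing in for the Titanic features.
rng = np.random.default_rng(0)
toy_x = rng.normal(size=(50, 3))
toy_y = (toy_x[:, 0] > 0).astype(int)

toy_model = LogisticRegression(max_iter=1000).fit(toy_x, toy_y)

# Save to a temporary folder, then load the model back from disk.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy-model.joblib")
    joblib.dump(toy_model, path)
    restored_model = joblib.load(path)

# The restored model should reproduce the original predictions.
predictions_match = np.array_equal(
    toy_model.predict(toy_x), restored_model.predict(toy_x)
)
print("Predictions identical after reload:", predictions_match)
```

A saved model is generally best loaded with the same scikit-learn version that produced it, and joblib files should only be loaded from trusted sources.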