# Machine Learning Model Evaluation

## Data Loading

We load the Iris dataset from a CSV file stored on Google Drive.

## Data Preprocessing

We preprocess the data by dropping the `Id` column, checking for missing values, and separating the feature columns from the `Species` target.

In [1]:

from google.colab import drive
drive.mount('/content/drive')

import pandas as pd
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC


Mounted at /content/drive


In [4]:

# Load the Iris dataset from Google Drive
dfo = pd.read_csv('/content/drive/MyDrive/Collab-Learning/PITP_Data Science/Datasets/Iris.csv')
dfo.head(3)

Unnamed: 0,Id,SepalLengthCm,SepalWidthCm,PetalLengthCm,PetalWidthCm,Species
0,1,5.1,3.5,1.4,0.2,Iris-setosa
1,2,4.9,3.0,1.4,0.2,Iris-setosa
2,3,4.7,3.2,1.3,0.2,Iris-setosa
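For readers without access to the Drive file, a stand-in with the same column names can be built from the copy of Iris bundled with scikit-learn. This is only a sketch: the helper name `load_iris_stand_in` is hypothetical, the `Id` column is regenerated rather than read from the CSV, and a few measurements in the bundled copy may differ from those in `Iris.csv`.

```python
import pandas as pd
from sklearn.datasets import load_iris


def load_iris_stand_in() -> pd.DataFrame:
    """Build a DataFrame with the column names shown in the preview (hypothetical helper)."""
    iris = load_iris(as_frame=True)
    df = iris.frame.copy()
    # Rename scikit-learn's columns to the names used in the CSV.
    df.columns = ["SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm", "target"]
    # Map integer class codes to the "Iris-<name>" labels seen in the CSV.
    label_map = {code: f"Iris-{name}" for code, name in enumerate(iris.target_names)}
    df["Species"] = df.pop("target").map(label_map)
    # Add a 1-based Id column so a later drop of 'Id' still applies.
    df.insert(0, "Id", range(1, len(df) + 1))
    return df


dfo = load_iris_stand_in()
print(dfo.head(3))
```

Assigning the result to `dfo` keeps the rest of the notebook unchanged, since the later cells only rely on the `Id` and `Species` column names.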


## Preprocess Dataset

In [7]:
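# Preprocess the data
df = dfo.drop(['Id'], axis=1)    # 'Id' is a row identifier, not a feature
print(df.isna().sum())           # report missing values per column
X = df.drop('Species', axis=1)   # feature columns
y = df['Species']                # target labels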

In [11]:

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize models
models = {
    'Logistic Regression': LogisticRegression(max_iter=500),
    'Decision Tree': DecisionTreeClassifier(),
    'Random Forest': RandomForestClassifier(),
    'SVM': SVC()
}

# Evaluate models
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        'Accuracy': accuracy_score(y_test, y_pred),
        # Multiclass target: average the per-class scores, weighted by class support
        'Precision': precision_score(y_test, y_pred, average='weighted'),
        'Recall': recall_score(y_test, y_pred, average='weighted'),
        'F1-Score': f1_score(y_test, y_pred, average='weighted'),
        'Confusion Matrix': confusion_matrix(y_test, y_pred).tolist()
    }

# Display results
for model_name, metrics in results.items():
    print(f"Model: {model_name}")
    for metric, value in metrics.items():
        print(f"{metric}: {value}")
    print('-' * 50)


Model: Logistic Regression
Accuracy: 1.0
Precision: 1.0
Recall: 1.0
F1-Score: 1.0
Confusion Matrix: [[10, 0, 0], [0, 9, 0], [0, 0, 11]]
--------------------------------------------------
Model: Decision Tree
Accuracy: 1.0
Precision: 1.0
Recall: 1.0
F1-Score: 1.0
Confusion Matrix: [[10, 0, 0], [0, 9, 0], [0, 0, 11]]
--------------------------------------------------
Model: Random Forest
Accuracy: 1.0
Precision: 1.0
Recall: 1.0
F1-Score: 1.0
Confusion Matrix: [[10, 0, 0], [0, 9, 0], [0, 0, 11]]
--------------------------------------------------
Model: SVM
Accuracy: 1.0
Precision: 1.0
Recall: 1.0
F1-Score: 1.0
Confusion Matrix: [[10, 0, 0], [0, 9, 0], [0, 0, 11]]
--------------------------------------------------
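All four models score 1.0, but the test split holds only 30 samples, so a single split gives a coarse estimate. One way to get a steadier figure is k-fold cross-validation. The sketch below makes two assumptions that are not part of the notebook: it uses scikit-learn's bundled Iris data as a stand-in for the Drive CSV, and it uses five stratified folds.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Assumption: bundled Iris data stands in for the notebook's CSV.
X, y = load_iris(return_X_y=True)

# Five stratified folds keep the class proportions equal in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
model = LogisticRegression(max_iter=500)

# Each entry is the accuracy on one held-out fold.
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
print(f"Fold accuracies: {scores.round(3)}")
print(f"Mean accuracy: {scores.mean():.3f} (std {scores.std():.3f})")
```

The same call works for the other three models by swapping the estimator passed to `cross_val_score`.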


## Model: Logistic Regression

We train and evaluate a Logistic Regression model (`max_iter=500`) on the Iris dataset.
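A standalone version of this step might look like the sketch below, which prints a per-class breakdown instead of the weighted averages. It assumes scikit-learn's bundled Iris data as a stand-in for the notebook's CSV and reuses the notebook's split settings (`test_size=0.2`, `random_state=42`).

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Assumption: bundled Iris data stands in for the notebook's CSV.
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

clf = LogisticRegression(max_iter=500)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Precision, recall, and F1 reported separately for each species.
report = classification_report(y_test, y_pred, target_names=iris.target_names)
print(report)
```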

## Model: Decision Tree

We train and evaluate a Decision Tree classifier with default hyperparameters on the Iris dataset.
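A decision tree can also be inspected after fitting. The sketch below prints the learned split rules with `export_text`. It assumes scikit-learn's bundled Iris data as a stand-in for the notebook's CSV, and it fixes `random_state=42` on the tree for repeatability, which the notebook does not do.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Assumption: bundled Iris data stands in for the notebook's CSV.
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

# Assumption: a fixed seed, so the printed tree is the same on every run.
tree = DecisionTreeClassifier(random_state=42)
tree.fit(X_train, y_train)

# Text rendering of the fitted tree, one line per split or leaf.
rules = export_text(tree, feature_names=list(iris.feature_names))
print(rules)
print(f"Test accuracy: {tree.score(X_test, y_test):.3f}")
```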

## Model: Random Forest

We train and evaluate a Random Forest classifier with default hyperparameters on the Iris dataset.
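As an optional follow-up, a fitted random forest exposes impurity-based feature importances, which give a rough view of which measurements drive the predictions. The sketch below assumes scikit-learn's bundled Iris data as a stand-in for the notebook's CSV and adds `random_state=42` to the forest, which the notebook leaves unset.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Assumption: bundled Iris data stands in for the notebook's CSV.
iris = load_iris(as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

# Assumption: a fixed seed, so the importances are repeatable.
forest = RandomForestClassifier(random_state=42)
forest.fit(X_train, y_train)

# Impurity-based importances, one per feature, summing to 1.
importances = pd.Series(
    forest.feature_importances_, index=X_train.columns
).sort_values(ascending=False)
print(importances)
print(f"Test accuracy: {forest.score(X_test, y_test):.3f}")
```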

## Model: SVM

We train and evaluate a Support Vector Machine (SVM) classifier with default hyperparameters on the Iris dataset.
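SVMs are sensitive to feature scale, so a common variant standardizes the features before fitting. The notebook fits `SVC()` on the raw measurements; the scaling step in the sketch below is an added assumption, as is the use of scikit-learn's bundled Iris data in place of the notebook's CSV.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: bundled Iris data stands in for the notebook's CSV.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The pipeline learns the scaling on the training data only,
# then applies the same transformation at prediction time.
svm = make_pipeline(StandardScaler(), SVC())
svm.fit(X_train, y_train)

accuracy = svm.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.3f}")
```

Wrapping the scaler and classifier in one pipeline avoids leaking test-set statistics into the scaling step.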