# Titanic Survival Classification

## Objective
The objective of this task is to build a machine learning classification model
to predict whether a passenger survived the Titanic disaster based on features
such as age, gender, passenger class, and fare.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# **Dataset Loading**

In [None]:
data = pd.read_csv('/content/Titanic-Dataset (1).csv')
data.head()

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S


# **Data Cleaning**

In [None]:
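# Fill missing ages with the column mean
data['Age'] = data['Age'].fillna(data['Age'].mean())

# Encode Sex as a binary feature: male -> 0, female -> 1
data['Sex'] = data['Sex'].map({'male': 0, 'female': 1})

# Fill missing embarkation ports with the most frequent port, then one-hot encode;
# drop_first=True omits the first category (C), leaving Embarked_Q and Embarked_S
data['Embarked'] = data['Embarked'].fillna(data['Embarked'].mode()[0])
data = pd.get_dummies(data, columns=['Embarked'], drop_first=True)

# Drop text columns that are not used as features;
# errors='ignore' skips any listed column that is not present
data = data.drop(['Name', 'Ticket', 'Cabin'], axis=1, errors='ignore')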

In [None]:
data.head()

Unnamed: 0,PassengerId,Survived,Pclass,Sex,Age,SibSp,Parch,Fare,Embarked_Q,Embarked_S
0,1,0,3,0,22.0,1,0,7.25,False,True
1,2,1,1,1,38.0,1,0,71.2833,False,False
2,3,1,3,1,26.0,0,0,7.925,False,True
3,4,1,1,1,35.0,1,0,53.1,False,True
4,5,0,3,0,35.0,0,0,8.05,False,True
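
The encoded frame above has `Embarked_Q` and `Embarked_S` but no `Embarked_C` column, which follows from `drop_first=True`. A minimal sketch on a four-row toy frame (made-up rows, not the Titanic file) shows the effect; the names `toy` and `encoded` are illustrative only.

```python
import pandas as pd

# Toy frame with the three port codes that appear in the Embarked column
toy = pd.DataFrame({'Embarked': ['S', 'C', 'Q', 'S']})

# Categories are sorted (C, Q, S) and drop_first=True removes the first one
encoded = pd.get_dummies(toy, columns=['Embarked'], drop_first=True)

print(list(encoded.columns))
print(encoded)
```

A passenger who embarked at C is represented by both indicator columns being false, so no information is lost while one redundant column is avoided.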


# **Train-Test Split**

In [None]:
from sklearn.model_selection import train_test_split

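# Features: every remaining column except the target (PassengerId is still included)
X = data.drop('Survived', axis=1)
# Target: 1 = survived, 0 = did not survive
y = data['Survived']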

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
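
One optional variation, not used in this notebook, is to pass `stratify=y` so that both subsets keep the same share of survivors. The sketch below uses a synthetic label vector (40 positives out of 100, chosen only for illustration) rather than the Titanic data; `X_toy`, `y_toy`, and the split variable names are hypothetical.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic data: 100 rows, 40 of them labelled 1 (illustrative only)
X_toy = np.arange(100).reshape(-1, 1)
y_toy = np.array([1] * 40 + [0] * 60)

# stratify=y_toy preserves the 40% positive rate in both subsets
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=42, stratify=y_toy
)

print("train positive rate:", y_tr.mean())
print("test positive rate:", y_te.mean())
```

Without stratification the class proportions in a small test set can drift from those of the full dataset, which makes per-class metrics harder to compare across runs.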

## Model Training
The dataset was split into training (80%) and testing (20%) sets with a fixed random seed.
A Logistic Regression model was selected because the target, `Survived`, is binary.
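
For context, logistic regression scores each passenger with a weighted sum of the features and passes that score through the sigmoid function to get a survival probability; a probability above 0.5 is predicted as class 1. The sketch below illustrates that mechanism only. The weights, intercept, and feature rows are made up and are not the fitted coefficients of this model.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients for [Sex, Pclass]; not taken from the fitted model
weights = [2.5, -1.0]
intercept = 0.5

def predict_proba(features):
    """Probability of class 1 for one row of feature values."""
    score = intercept + sum(w * x for w, x in zip(weights, features))
    return sigmoid(score)

# Two illustrative rows: (Sex=1, Pclass=1) and (Sex=0, Pclass=3)
for row in ([1, 1], [0, 3]):
    p = predict_proba(row)
    print(row, round(p, 3), "-> predicted class", int(p > 0.5))
```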

In [None]:
from sklearn.linear_model import LogisticRegression

# Reuse X_train and y_train from the split above.
# max_iter is raised above the default of 100 to give the solver more iterations to converge.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# **Model Evaluation**

In [None]:
from sklearn.metrics import accuracy_score, classification_report

y_pred = model.predict(X_test)

print("Model Accuracy:", accuracy_score(y_test, y_pred))
print("\nClassification Report:\n", classification_report(y_test, y_pred))

Model Accuracy: 0.8044692737430168

Classification Report:
               precision    recall  f1-score   support

           0       0.82      0.85      0.84       105
           1       0.77      0.74      0.76        74

    accuracy                           0.80       179
   macro avg       0.80      0.80      0.80       179
weighted avg       0.80      0.80      0.80       179
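
The figures in the report can be reproduced by hand from confusion-matrix counts. The notebook does not print a confusion matrix, so the four counts below are an assumption: they were inferred as the values consistent with the reported accuracy, the class supports, and the rounded per-class scores.

```python
# Assumed counts, inferred from the report above (not printed by the notebook)
tn, fp = 89, 16   # actual class 0: 105 passengers
fn, tp = 19, 55   # actual class 1: 74 passengers

total = tn + fp + fn + tp
accuracy = (tp + tn) / total

# Metrics for class 1 (survived)
precision_1 = tp / (tp + fp)
recall_1 = tp / (tp + fn)
f1_1 = 2 * precision_1 * recall_1 / (precision_1 + recall_1)

# Metrics for class 0 (did not survive): swap the roles of the two classes
precision_0 = tn / (tn + fn)
recall_0 = tn / (tn + fp)
f1_0 = 2 * precision_0 * recall_0 / (precision_0 + recall_0)

print(f"accuracy: {accuracy:.4f}")
print(f"class 0: precision={precision_0:.2f} recall={recall_0:.2f} f1={f1_0:.2f}")
print(f"class 1: precision={precision_1:.2f} recall={recall_1:.2f} f1={f1_1:.2f}")
```

In the actual notebook, `sklearn.metrics.confusion_matrix(y_test, y_pred)` would give the true counts directly.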



## Result and Conclusion
The Logistic Regression model was trained on 80% of the data and evaluated on the remaining 179 passengers.
It reached an accuracy of about 0.80 on this test set, with precision 0.77 and recall 0.74 for the survivor class.
This task demonstrates basic data preprocessing and binary classification
using machine learning techniques.