# Titanic Survival Prediction

This project explores the Titanic dataset and builds a machine learning model to predict passenger survival.

**Author:** Aikerim Turgynbek
**Dataset:** Titanic - Machine Learning from Disaster (Kaggle)

## 1. Import Libraries

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report

plt.style.use('default')

## 2. Load Dataset

In [None]:
train_df = pd.read_csv('../data/train.csv')
test_df = pd.read_csv('../data/test.csv')

train_df.head()

## 3. Exploratory Data Analysis (EDA)

We start with column types, non-null counts, and summary statistics for the training data.

In [None]:
train_df.info()

train_df.describe()
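
The summaries above describe each column on its own; they do not show how a feature relates to survival. One simple way to look at that is to group by a feature and take the mean of `Survived`. The sketch below uses a tiny hand-made frame (`toy_df`, with invented values, not the Kaggle data) so that it runs on its own; in the notebook the same calls would presumably be applied to `train_df`.

```python
import pandas as pd

# Hypothetical toy data using the same column names as the Kaggle file.
toy_df = pd.DataFrame({
    'Survived': [1, 0, 1, 0, 0, 1],
    'Sex': ['female', 'male', 'female', 'male', 'male', 'female'],
    'Pclass': [1, 3, 2, 3, 1, 3],
})

# The mean of a 0/1 column is the share of survivors within each group.
survival_by_sex = toy_df.groupby('Sex')['Survived'].mean()
survival_by_class = toy_df.groupby('Pclass')['Survived'].mean()

print(survival_by_sex)
print(survival_by_class)
```

Since `seaborn` is already imported, the same comparison could also be drawn with `sns.barplot(data=train_df, x='Sex', y='Survived')`, which plots the group means.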

## 4. Data Preprocessing

In [None]:
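# Fill missing values (assign the result back; chained inplace fillna is
# deprecated in recent pandas and may not modify train_df)
train_df['Age'] = train_df['Age'].fillna(train_df['Age'].median())
train_df['Embarked'] = train_df['Embarked'].fillna(train_df['Embarked'].mode()[0])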

# Encode categorical variables
train_df = pd.get_dummies(train_df, columns=['Sex', 'Embarked'], drop_first=True)

train_df.head()
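
`test_df` is loaded in Section 2 but is not preprocessed here. If it is used later, it needs the same steps, with the fill values taken from the training data and the same dummy columns. The following is a minimal sketch of one way to do that; `preprocess`, `SEX_LEVELS`, `EMBARKED_LEVELS`, and the toy frames are hypothetical names and values introduced for illustration, not part of the original notebook.

```python
import pandas as pd

# Assumed category values for the Kaggle columns.
SEX_LEVELS = ['female', 'male']
EMBARKED_LEVELS = ['C', 'Q', 'S']

def preprocess(df, age_median, embarked_mode):
    """Impute Age/Embarked with the given values, then one-hot encode."""
    out = df.copy()
    out['Age'] = out['Age'].fillna(age_median)
    out['Embarked'] = out['Embarked'].fillna(embarked_mode)
    # Fixed category lists keep the dummy columns identical across splits,
    # even when a split happens to lack one of the values.
    out['Sex'] = pd.Categorical(out['Sex'], categories=SEX_LEVELS)
    out['Embarked'] = pd.Categorical(out['Embarked'], categories=EMBARKED_LEVELS)
    return pd.get_dummies(out, columns=['Sex', 'Embarked'], drop_first=True)

# Invented miniature stand-ins for train_df and test_df.
toy_train = pd.DataFrame({
    'Age': [22.0, None, 38.0, 30.0],
    'Sex': ['male', 'female', 'female', 'male'],
    'Embarked': ['S', 'C', None, 'S'],
})
toy_test = pd.DataFrame({
    'Age': [None, 40.0],
    'Sex': ['male', 'male'],
    'Embarked': ['S', 'S'],
})

# Fill values are computed on the training data only.
age_median = toy_train['Age'].median()
embarked_mode = toy_train['Embarked'].mode()[0]

train_prepared = preprocess(toy_train, age_median, embarked_mode)
test_prepared = preprocess(toy_test, age_median, embarked_mode)

print(train_prepared.columns.tolist())
print(test_prepared.columns.tolist())
```

Without the fixed category lists, `drop_first=True` on a split that contains only `'male'` would drop the only `Sex` column, and the two frames would no longer line up.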

## 5. Model Training

In [None]:
# Drop the target, the row identifier, and free-text columns not used as features
X = train_df.drop(['Survived', 'PassengerId', 'Name', 'Ticket', 'Cabin'], axis=1)
y = train_df['Survived']

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_val_scaled = scaler.transform(X_val)

model = LogisticRegression(max_iter=1000)
model.fit(X_train_scaled, y_train)
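
In the cell above, the scaler is fitted on the training split only and then reused on the validation split, so no validation statistics influence training. As a rough illustration of what `StandardScaler` computes with its default settings (per-column mean and population standard deviation), here is the same arithmetic in NumPy on made-up numbers; the `*_toy` arrays are hypothetical.

```python
import numpy as np

# Made-up feature matrices (rows = passengers, columns = two numeric features).
X_train_toy = np.array([[20.0, 10.0], [30.0, 20.0], [40.0, 30.0]])
X_val_toy = np.array([[30.0, 40.0]])

# "fit": the statistics come from the training rows only.
mean = X_train_toy.mean(axis=0)
std = X_train_toy.std(axis=0)  # ddof=0, the population standard deviation

# "transform": the same statistics are applied to both splits.
X_train_toy_scaled = (X_train_toy - mean) / std
X_val_toy_scaled = (X_val_toy - mean) / std

print(X_train_toy_scaled.mean(axis=0))  # close to 0 in every column
print(X_val_toy_scaled)
```

The validation row is not guaranteed to end up with zero mean or unit variance, which is expected: it is expressed in units of the training distribution.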

## 6. Evaluation

In [None]:
y_pred = model.predict(X_val_scaled)

print('Accuracy:', accuracy_score(y_val, y_pred))
print(classification_report(y_val, y_pred))
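
To make the report easier to read, here is a small sketch of how accuracy, precision, recall, and F1 for the positive class (`Survived = 1`) follow from the counts of correct and incorrect predictions. The label lists are invented for illustration and are not results from this model.

```python
# Invented labels: 1 = survived, 0 = did not survive.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_hat = [1, 0, 0, 1, 0, 1, 1, 0]

pairs = list(zip(y_true, y_hat))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # survivors predicted as survivors
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-survivors predicted as such
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed survivors

accuracy = (tp + tn) / len(pairs)
precision = tp / (tp + fp)  # of predicted survivors, the share that is correct
recall = tp / (tp + fn)     # of actual survivors, the share that is found
f1 = 2 * precision * recall / (precision + recall)

print(accuracy, precision, recall, f1)
```

`classification_report` prints these per-class values for both classes, along with the number of true samples of each class in the `support` column.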

## 7. Conclusion

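This notebook walked through a basic ML pipeline: loading the data, imputing missing values, one-hot encoding categorical features, scaling, training a logistic regression model, and evaluating it on a held-out 20% validation split. Future work may include feature engineering and more advanced models.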