
# 🛳 Titanic Survival Predictor - Part 1

This mini-project uses logistic regression from scikit-learn to predict passenger survival on the Titanic dataset that ships with seaborn.

---

## ✅ Part 1: Data Loading & Setup

```python
import seaborn as sns
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

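# load_dataset fetches the data online on first use, then reads it from a local cache
df = sns.load_dataset('titanic')
print(df.head())  # print so the preview also shows when run as a script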
```

---

## 🔍 Part 2: Exploratory Data Analysis (EDA)

```python
# Basic info
print("Shape of dataset:", df.shape)
print("Missing values:\n", df.isnull().sum())

# Survival count
sns.countplot(x='survived', data=df)
plt.title("Survival Count")
plt.show()

# Age distribution
sns.histplot(df['age'].dropna(), kde=True)
plt.title("Age Distribution")
plt.show()

# Class vs Survival
sns.countplot(x='pclass', hue='survived', data=df)
plt.title("Passenger Class vs Survival")
plt.show()
```

---

## 🧹 Part 3: Data Cleaning

```python
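# Columns used as model inputs
features = ['age', 'fare', 'sex', 'pclass']

# Keep only rows where every feature and the target are present
df_cleaned = df.dropna(subset=features + ['survived']).copy()

# Encode the categorical 'sex' column as 0/1 so the model can use it
df_cleaned['sex'] = df_cleaned['sex'].map({'male': 0, 'female': 1})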
```
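
To make the effect of the cleaning step concrete, here is a minimal sketch that applies the same `dropna` and `map` calls to a tiny hand-made frame. The `toy` rows are invented for illustration and are not taken from the Titanic dataset.

```python
import numpy as np
import pandas as pd

# Toy rows invented for illustration; not taken from the Titanic dataset.
toy = pd.DataFrame({
    "survived": [0, 1, 1, 0],
    "age": [22.0, np.nan, 26.0, 35.0],
    "fare": [7.25, 71.28, 7.93, 8.05],
    "sex": ["male", "female", "female", "male"],
    "pclass": [3, 1, 3, 3],
})

features = ["age", "fare", "sex", "pclass"]

# Same two steps as above: drop incomplete rows, then encode 'sex' as 0/1.
toy_cleaned = toy.dropna(subset=features + ["survived"]).copy()
toy_cleaned["sex"] = toy_cleaned["sex"].map({"male": 0, "female": 1})

rows_dropped = len(toy) - len(toy_cleaned)
print("Rows dropped:", rows_dropped)
print(toy_cleaned)
```

Comparing `len(df)` with `len(df_cleaned)` in the same way shows how many passengers the real cleaning step discards.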

---

## 🧠 Part 4: Logistic Regression Model

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X = df_cleaned[features]
y = df_cleaned['survived']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# A higher max_iter gives the solver room to converge on the unscaled features
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
```
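
Accuracy alone does not show which kind of mistake the model makes. A confusion matrix is one way to break it down; the sketch below uses synthetic data from `make_classification` as a stand-in so that it runs on its own (the `X_demo`, `y_demo`, and `clf` names are illustrative, not part of the project code).

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; use X and y from the section above for the real model.
X_demo, y_demo = make_classification(
    n_samples=200, n_features=4, n_informative=3, n_redundant=1, random_state=42
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
pred = clf.predict(X_te)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_te, pred, labels=[0, 1])
acc = accuracy_score(y_te, pred)

print(cm)
print("Accuracy:", acc)
```

The diagonal of the matrix counts correct predictions, so accuracy equals the diagonal sum divided by the total. The off-diagonal cells separate the two kinds of misclassification.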

---

## 📊 Part 5: Feature Importance

```python
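# Each coefficient is the change in log-odds of survival per one-unit increase
# in that feature. The features are on different scales, so the bar lengths
# are only a rough guide to relative importance.
importance = pd.Series(model.coef_[0], index=X.columns)
importance.plot(kind='barh')
plt.title("Feature Importance")
plt.xlabel("Coefficient Value")
plt.show()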
```
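
Because the raw coefficients above depend on each feature's units, one common way to make them more comparable is to standardize the features before fitting. The following is a sketch of that idea under stated assumptions, not the method used in this project: the data is synthetic, and the rule that generates the labels is hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (invented, not the Titanic dataset), with columns
# on very different scales, like the real features.
rng = np.random.default_rng(42)
n = 500
X_demo = pd.DataFrame({
    "age": rng.uniform(1, 80, n),
    "fare": rng.uniform(0, 500, n),
    "sex": rng.integers(0, 2, n),
    "pclass": rng.integers(1, 4, n),
})

# Hypothetical rule used only to generate labels for this demo.
logit = (2.0 * X_demo["sex"] - 0.8 * X_demo["pclass"]
         - 0.02 * X_demo["age"] + 0.002 * X_demo["fare"] + 1.0)
y_demo = (rng.uniform(size=n) < 1 / (1 + np.exp(-logit))).astype(int)

# Standardize each column to zero mean and unit variance, then fit.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
pipe.fit(X_demo, y_demo)

scaled_coefs = pd.Series(
    pipe.named_steps["logisticregression"].coef_[0], index=X_demo.columns
)

# Rank features by absolute coefficient size, largest first.
ranked = scaled_coefs.reindex(
    scaled_coefs.abs().sort_values(ascending=False).index
)
print(ranked)
```

After scaling, each coefficient is the change in log-odds per one standard deviation of the feature, which puts `fare` and `sex` on a comparable footing.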

---

## 📝 Part 6: Summary

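**Goal:** Predict passenger survival on the Titanic using logistic regression.  
**Features Used:** `age`, `fare`, `sex`, `pclass`  
**Model Accuracy:** Depends on the train/test split; usually around 75% on the 20% test set.  
**Key Insight:** Gender (`sex`) has the largest coefficient in this simple model, making it the strongest predictor of survival here. Because the features are not scaled, coefficient magnitudes are only a rough guide to importance.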

---
