<div style="background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); padding: 20px; border-radius: 10px; color: white;">
    <h1 style="color: white; text-align: center;">🚢 Titanic Survival Prediction Analysis</h1>
    <p style="text-align: center; font-size: 18px;"><strong>From EDA to Machine Learning - A Professional Data Science Workflow</strong></p>
</div>

## 📋 Project Overview
This analysis takes the Titanic dataset through a complete EDA-to-ML workflow. Within my data science portfolio, it bridges exploratory analysis and predictive modeling.

**🎯 Key Objectives:**
- Perform comprehensive exploratory data analysis (EDA)
- Engineer meaningful features from raw data
- Build and evaluate multiple machine learning models
- Demonstrate professional workflow and documentation

## 1. 📥 Import & Setup

We begin by importing the required libraries and configuring the environment, so the tools for data manipulation, visualization, and machine learning are in place for the rest of the project.

### 1a. Importing Essential Libraries

We import the core data science libraries used in this analysis. Each covers one part of the workflow: data manipulation, visualization, or preprocessing for machine learning.

In [5]:
# Data manipulation core libraries
import pandas as pd  # Primary data structure (DataFrame) and analysis tools
import numpy as np   # Numerical computing and array operations

# Data visualization libraries  
import matplotlib.pyplot as plt  # Core plotting library that seaborn builds on
import seaborn as sns            # Enhanced statistical visualizations

# Scikit-learn preprocessing modules
from sklearn.preprocessing import StandardScaler  # Standardizes numeric features (mean=0, std=1)
from sklearn.preprocessing import LabelEncoder    # Encodes labels as integers 0..n_classes-1 (designed for target values)
from sklearn.impute import SimpleImputer          # Systematically fills missing values
from sklearn.model_selection import train_test_split  # Creates training/test splits for ML

# System and utility libraries
import warnings  # Manages warning messages during execution
from datetime import datetime  # Handles date/time for analysis timestamping

print("✅ Libraries imported successfully!")

✅ Libraries imported successfully!
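
Results can vary with library versions, so it may help to record which versions are installed alongside the imports. The following is a minimal sketch using only the standard library's `importlib.metadata`. The helper name `installed_versions` and the `REQUIRED` list are hypothetical additions for illustration, not part of the original notebook; the list simply mirrors the imports above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None  # Not installed in this environment
    return found


# Distribution names for the libraries imported above (assumed list)
REQUIRED = ["pandas", "numpy", "matplotlib", "seaborn", "scikit-learn"]

for name, ver in installed_versions(REQUIRED).items():
    print(f"{name}: {ver if ver is not None else 'not installed'}")
```

Note that the distribution name can differ from the import name: scikit-learn is imported as `sklearn` but installed as `scikit-learn`.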


### 1b. Configuration & Settings

We set global options for visualizations and data display. Consistent plot styling keeps figures uniform across the notebook, and wider pandas display limits prevent truncated DataFrame output.

In [6]:
# Configure matplotlib for professional visualizations
plt.style.use('seaborn-v0_8-whitegrid')  # Matplotlib's bundled seaborn-style whitegrid theme (this style name requires matplotlib >= 3.6)
sns.set_palette("husl")  # Set color palette to "husl": evenly spaced, distinct hues

# Set default figure size for all plots
plt.rcParams['figure.figsize'] = (10, 6)  # Width: 10 inches, Height: 6 inches
plt.rcParams['font.size'] = 12  # Base font size for all text elements in plots

# Configure pandas display options for better data inspection
pd.set_option('display.max_columns', 50)  # Show up to 50 columns when displaying DataFrames
pd.set_option('display.max_rows', 100)    # Show up to 100 rows when displaying DataFrames
pd.set_option('display.float_format', '{:.2f}'.format)  # Format floats to 2 decimal places

# Suppress warnings for cleaner output (use with caution: this hides every warning, including ones that matter)
warnings.filterwarnings('ignore')  # Applies globally for the rest of the session

print("✅ Environment configured successfully!")
print(f"📅 Analysis timestamp: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")

✅ Environment configured successfully!
📅 Analysis timestamp: 2025-11-24 09:44:47
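
Because `warnings.filterwarnings('ignore')` silences warnings for the rest of the session, one alternative is to suppress them only around a specific noisy call. Below is a minimal sketch using the standard library's `warnings.catch_warnings()`. Both `noisy_step` and `run_quietly` are hypothetical names introduced for illustration; `noisy_step` stands in for any library call that emits a warning and is not something from this notebook.

```python
import warnings


def noisy_step():
    """Hypothetical stand-in for a library call that emits a warning."""
    warnings.warn("this API is deprecated", DeprecationWarning)
    return 42


def run_quietly(func):
    """Call func with warnings suppressed only for the duration of the call."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # Filter is restored when the block exits
        return func()


result = run_quietly(noisy_step)
print(result)
```

With this approach, warnings raised elsewhere in the notebook remain visible, which can make version or convergence problems easier to spot.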
