## Feature Engineering

Feature engineering adds new columns, derived from domain knowledge and passenger
characteristics, with the aim of improving the dataset's predictive power.


In [13]:
import pandas as pd

df = pd.read_csv("../ML/titanic_cleaned_training_data.csv")

### FamilySize Feature

**Why:**  
The number of family members traveling with a passenger may affect that passenger's chance of survival.  
Large families may have had more difficulty evacuating together, while passengers traveling alone or in small families may show different survival rates.

**What we do:**  
Sum the `SibSp` (siblings/spouses) and `Parch` (parents/children) columns and add 1 to include the passenger themselves.

In [14]:
df['FamilySize'] = df['SibSp'] + df['Parch'] + 1
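
One way to check whether `FamilySize` carries any signal is to compare survival rates across its values. The sketch below is illustrative only: the `toy` frame and its rows are made up for the example (they are not taken from the Titanic file), and the names `toy` and `survival_by_size` are assumptions. On the real data the same `groupby` would run on `df`.

```python
import pandas as pd

# Toy rows for illustration only; these values are made up, not real passengers.
toy = pd.DataFrame({
    "SibSp":    [0, 1, 1, 4, 0, 3],
    "Parch":    [0, 0, 2, 2, 0, 1],
    "Survived": [0, 1, 1, 0, 1, 0],
})

# Same construction as in the notebook: relatives aboard plus the passenger.
toy["FamilySize"] = toy["SibSp"] + toy["Parch"] + 1

# The mean of a 0/1 column is the fraction of survivors in each group;
# the count shows how many passengers each rate is based on.
survival_by_size = (
    toy.groupby("FamilySize")["Survived"]
       .agg(survival_rate="mean", passengers="count")
)
print(survival_by_size)
```

Reporting the group size next to the rate matters here, because very large families are rare and their rates rest on only a few passengers.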



### AgeGroup Feature

The `AgeGroup` feature was created by binning passenger ages into meaningful life-stage categories. This helps capture non-linear survival patterns that are not easily learned from raw age values, improves interpretability, and provides a consistent way to handle missing age information.


In [15]:
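def assign_age_group(age):
    """Map a numeric age to a life-stage label; missing ages become 'unknown'."""
    # Upper bounds are inclusive, so age 12 is 'child' and age 19 is 'teen'.
    if pd.isna(age):
        return 'unknown'
    elif age <= 2:
        return 'infant'
    elif age <= 12:
        return 'child'
    elif age <= 19:
        return 'teen'
    elif age <= 29:
        return 'young_adult'
    elif age <= 59:
        return 'adult'
    else:
        return 'senior'

# Apply the mapping to every passenger's age.
df['AgeGroup'] = df['Age'].apply(assign_age_group)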
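
The same binning can also be expressed with `pd.cut`, which is vectorized and keeps the bin edges in one place. This is a sketch of an alternative, not the method used in this notebook; the names `age_bins`, `age_labels`, and `ages` are assumptions, and the sample ages are chosen only to exercise the bin boundaries.

```python
import numpy as np
import pandas as pd

# Bin edges mirror the thresholds in assign_age_group; with right=True each
# interval includes its upper edge, matching the `<=` comparisons.
age_bins = [-np.inf, 2, 12, 19, 29, 59, np.inf]
age_labels = ["infant", "child", "teen", "young_adult", "adult", "senior"]

# Illustrative ages picked to sit on and around the boundaries.
ages = pd.Series([0.5, 2, 2.5, 12, 19, 19.5, 29, 30, 59, 60, np.nan])

binned = pd.cut(ages, bins=age_bins, labels=age_labels, right=True)

# pd.cut leaves missing ages as NaN, so add an explicit 'unknown' category
# before filling, then convert to plain strings.
age_group = binned.cat.add_categories("unknown").fillna("unknown").astype(str)
print(age_group.tolist())
```

Keeping the result as a categorical (by dropping the final `astype(str)`) would preserve the ordering of the life stages, which can be useful for plotting or ordinal encoding.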


In [16]:
df.head()

| Unnamed: 0 | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Fare | TitleGroup | Embarked | FamilySize | AgeGroup |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | 0 | 22 | 1 | 0 | 7.25 | Mr | 0 | 2 | young_adult |
| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | 1 | 38 | 1 | 0 | 71.2833 | Mrs | 1 | 2 | adult |
| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | 1 | 26 | 0 | 0 | 7.925 | Miss | 0 | 1 | young_adult |
| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | 1 | 35 | 1 | 0 | 53.1 | Mrs | 0 | 2 | adult |
| 4 | 5 | 0 | 3 | Allen, Mr. William Henry | 0 | 35 | 0 | 0 | 8.05 | Mr | 0 | 1 | adult |
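
Before saving, it can be worth checking the engineered columns programmatically rather than by eye. The helper below, `check_engineered_features`, is a hypothetical sketch and not part of the original notebook. The `sample` frame reuses the first three rows of the preview, restricted to the columns the checks need; on the real data the call would be `check_engineered_features(df)`.

```python
import pandas as pd

# Labels that assign_age_group can return in this notebook.
EXPECTED_AGE_GROUPS = {
    "unknown", "infant", "child", "teen", "young_adult", "adult", "senior",
}

def check_engineered_features(frame):
    """Return a list of human-readable problems (empty if none). Hypothetical helper."""
    problems = []

    # FamilySize must equal SibSp + Parch + 1 on every row.
    expected_size = frame["SibSp"] + frame["Parch"] + 1
    if (frame["FamilySize"] != expected_size).any():
        problems.append("FamilySize does not match SibSp + Parch + 1")

    # AgeGroup may only contain the known labels.
    unexpected = set(frame["AgeGroup"].unique()) - EXPECTED_AGE_GROUPS
    if unexpected:
        problems.append(f"unexpected AgeGroup labels: {sorted(unexpected)}")

    # 'unknown' should appear exactly where Age is missing.
    if (frame["Age"].isna() != (frame["AgeGroup"] == "unknown")).any():
        problems.append("'unknown' AgeGroup does not line up with missing Age")

    return problems

# First three rows of the preview, limited to the relevant columns.
sample = pd.DataFrame({
    "Age":        [22, 38, 26],
    "SibSp":      [1, 1, 0],
    "Parch":      [0, 0, 0],
    "FamilySize": [2, 2, 1],
    "AgeGroup":   ["young_adult", "adult", "young_adult"],
})
print(check_engineered_features(sample))
```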


In [17]:
df.to_csv("../ML/titanic_cleaned_training_data_FE.csv", index=False)