# Clayton Seabaugh: Building a Classifier
**Author:** Clayton Seabaugh  
**Date:** 3-28-2025  
**Objective:** Build and evaluate different models for machine learning classification

## Section 1. Import and Inspect the Data
Load the Titanic dataset directly from the seaborn library.

In [1]:
# Imports

import seaborn as sns
import pandas as pd

In [2]:
# Load Titanic dataset
titanic = sns.load_dataset('titanic')
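
A quick inspection usually follows the load. The sketch below is only an illustration: it uses a tiny stand-in DataFrame named `sample` (hypothetical values, with a few column names borrowed from the Titanic data) so that it runs without downloading anything. With the real data, the same calls would be made on `titanic`.

```python
import pandas as pd

# Hypothetical stand-in rows; the real notebook would inspect `titanic` instead.
sample = pd.DataFrame({
    "survived": [0, 1, 1, 0],
    "age": [22.0, None, 26.0, 35.0],
    "embark_town": ["Southampton", "Cherbourg", None, "Southampton"],
})

# Shape, first rows, and per-column missing-value counts
print(sample.shape)
print(sample.head())
missing_counts = sample.isnull().sum()
print(missing_counts)
```

The missing-value counts are what motivate the imputation choices in Section 2.1.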

## Section 2. Data Exploration and Preparation
 
### 2.1 Handle Missing Values and Clean Data

In [10]:
# Impute missing values for age using the median
# (assign the result back; fillna(..., inplace=True) on a selected column
# raises a FutureWarning and will stop working in pandas 3.0)
titanic['age'] = titanic['age'].fillna(titanic['age'].median())

# Fill in missing values for embark_town using the mode
titanic['embark_town'] = titanic['embark_town'].fillna(titanic['embark_town'].mode()[0])
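
As a minimal sketch of the imputation pattern, the example below fills a numeric column with its median and a categorical column with its mode on a small stand-in DataFrame (`df` and its values are hypothetical, not the Titanic data). It assigns the result back to the column, which is the form pandas recommends over calling `fillna(..., inplace=True)` on a selected column.

```python
import pandas as pd

# Hypothetical stand-in data with one missing value per column
df = pd.DataFrame({
    "age": [22.0, None, 26.0, 35.0],
    "embark_town": ["Southampton", "Cherbourg", None, "Southampton"],
})

# Assign the filled Series back to the column instead of using inplace=True
df["age"] = df["age"].fillna(df["age"].median())
df["embark_town"] = df["embark_town"].fillna(df["embark_town"].mode()[0])

# Confirm that no missing values remain
remaining = df.isnull().sum()
print(remaining)
```

Checking `isnull().sum()` after imputing is a cheap way to confirm that every column intended for modeling is complete.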


### 2.2 Feature Engineering


In [11]:
# Create new features

# Add family_size - number of family members on board
titanic['family_size'] = titanic['sibsp'] + titanic['parch'] + 1
# Convert categorical "sex" to numeric
titanic['sex'] = titanic['sex'].map({'male': 0, 'female': 1})
# Convert categorical "embarked" to numeric
# (values not in the mapping, including missing ones, become NaN)
titanic['embarked'] = titanic['embarked'].map({'C': 0, 'Q': 1, 'S': 2})
# Binary feature - convert "alone" to numeric
titanic['alone'] = titanic['alone'].astype(int)
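
The same transformations can be checked on a small stand-in DataFrame. This is a sketch under assumptions: `df` and its values are hypothetical, and one `embarked` entry is deliberately left missing to show how `Series.map` treats it. Section 2.1 as shown fills `embark_town` rather than `embarked`, so any missing `embarked` values would pass through the mapping as NaN.

```python
import pandas as pd

# Hypothetical stand-in rows; one "embarked" value is missing on purpose
df = pd.DataFrame({
    "sibsp": [1, 0, 0],
    "parch": [0, 0, 2],
    "sex": ["male", "female", "female"],
    "embarked": ["S", "C", None],
    "alone": [False, True, False],
})

# Same feature engineering steps as in the cell above
df["family_size"] = df["sibsp"] + df["parch"] + 1
df["sex"] = df["sex"].map({"male": 0, "female": 1})
df["embarked"] = df["embarked"].map({"C": 0, "Q": 1, "S": 2})
df["alone"] = df["alone"].astype(int)

# Count values that the mappings left as NaN
unmapped = df[["sex", "embarked"]].isnull().sum()
print(unmapped)
```

If the count for a mapped column is nonzero, one option is to impute that column before mapping it, in the same way as `age` and `embark_town`.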