### Required Codio Assignment 21.2: Full Network Example

**Expected Time = 60 minutes**

**Total Points = 30**

This activity uses the full Titanic dataset and the remaining features with a neural network that has a single hidden layer.  In the previous example, which used only two features, the network reached an accuracy of roughly 65%.  With more features and a similar network architecture, you will see that the model improves.

#### Index 

- [Problem 1](#-Problem-1)
- [Problem 2](#-Problem-2)
- [Problem 3](#-Problem-3)

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf

from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.compose import make_column_transformer
from sklearn.pipeline import Pipeline
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
import warnings
warnings.filterwarnings('ignore')

### The Data

Below, the Titanic data is loaded again.  For this exercise you will use the columns `['pclass', 'sex', 'age', 'sibsp', 'parch', 'fare', 'embarked', 'class']` as the features for predicting the column `survived`.

In [2]:
titanic = sns.load_dataset('titanic')

In [3]:
X = titanic.loc[:, ['pclass', 'sex', 'age', 'sibsp', 'parch', 'fare', 'embarked', 'class']]
y = titanic['survived']

In [4]:
X.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 8 columns):
 #   Column    Non-Null Count  Dtype   
---  ------    --------------  -----   
 0   pclass    891 non-null    int64   
 1   sex       891 non-null    object  
 2   age       714 non-null    float64 
 3   sibsp     891 non-null    int64   
 4   parch     891 non-null    int64   
 5   fare      891 non-null    float64 
 6   embarked  889 non-null    object  
 7   class     891 non-null    category
dtypes: category(1), float64(2), int64(3), object(2)
memory usage: 49.9+ KB


In [5]:
y.head()

0    0
1    1
2    1
3    1
4    0
Name: survived, dtype: int64

[Back to top](#-Index)

### Problem 1

#### Preparing the features

**10 points**

Begin by filling the missing values in the `age` column with the mean of that column.

Next, use the `make_column_transformer` function to prepare the features.  Use `OneHotEncoder` with `drop = 'if_binary'` on all categorical features. Use the `StandardScaler` on the remaining features. Assign the transformation to `t` below.

Finally, call `fit_transform` on `t` with `X` as the argument to transform the data, and assign the result to `X_t` below.

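As a minimal sketch of the two numerical steps described above, the cell below applies mean imputation and standardization by hand to a small made-up `ages` series (the values are hypothetical, not taken from the Titanic data). It assumes that `StandardScaler` standardizes with the population standard deviation (`ddof=0`), which is why that argument is used here.

```python
import numpy as np
import pandas as pd

# Hypothetical ages with two missing entries (illustrative values only).
ages = pd.Series([22.0, np.nan, 38.0, 26.0, np.nan, 34.0])

# Series.mean() skips NaN, so the fill value is the mean of the observed ages.
filled = ages.fillna(ages.mean())

# Standardize by hand: subtract the mean, divide by the population std (ddof=0).
z = (filled - filled.mean()) / filled.std(ddof=0)

print(filled.tolist())
print(z.round(3).tolist())
```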
In [6]:
### GRADED
X_t = ''



# YOUR CODE HERE
# Fill missing age values with mean
X['age'] = X['age'].fillna(X['age'].mean())

# Create column transformer
categorical_features = ['sex', 'embarked', 'class']
numerical_features = ['pclass', 'age', 'sibsp', 'parch', 'fare']

t = make_column_transformer(
    (OneHotEncoder(drop='if_binary'), categorical_features),
    (StandardScaler(), numerical_features)
)

# Transform the data
X_t = t.fit_transform(X)

### ANSWER CHECK
X_t.shape

(891, 13)

[Back to top](#-Index)

### Problem 2

#### The Network Architecture

**10 points**

Below, construct a sequential network named `model` with one hidden `Dense` layer of 100 units and a `relu` activation function.  The output layer of this model should have one node and use the `sigmoid` activation function.

In [7]:
### GRADED
tf.random.set_seed(42)
model = ''


# YOUR CODE HERE
tf.random.set_seed(42)
model = Sequential([
    Dense(100, activation='relu', input_shape=(X_t.shape[1],)),
    Dense(1, activation='sigmoid')
])

### ANSWER CHECK
model

<keras.src.engine.sequential.Sequential at 0x105c8c160>

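To make the architecture concrete, here is a rough NumPy sketch of what a 13-input network with one 100-unit ReLU layer and a single sigmoid output computes. The weights are random placeholders rather than the trained Keras weights, and the helper name `forward` is hypothetical. It also counts the trainable parameters that such a layout would have.

```python
import numpy as np

def forward(X, W1, b1, W2, b2):
    """Forward pass: dense + ReLU hidden layer, then dense + sigmoid output."""
    hidden = np.maximum(0.0, X @ W1 + b1)      # ReLU activation
    logits = hidden @ W2 + b2
    return 1.0 / (1.0 + np.exp(-logits))       # sigmoid squashes to (0, 1)

rng = np.random.default_rng(0)
n_features, n_hidden = 13, 100                 # 13 columns after preprocessing

# Random placeholder weights (not the values Keras would learn).
W1 = rng.normal(scale=0.1, size=(n_features, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, 1))
b2 = np.zeros(1)

X_demo = rng.normal(size=(5, n_features))      # five made-up input rows
probs = forward(X_demo, W1, b1, W2, b2)

# 13*100 weights + 100 biases + 100*1 weights + 1 bias
n_params = W1.size + b1.size + W2.size + b2.size
print(probs.shape, n_params)
```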
[Back to top](#-Index)

### Problem 3

#### Train and Evaluate the Network

**10 points**

Finally, train and evaluate the network using the following compilation and training settings.  Assign the result of fitting the model to `history` below.

- `optimizer = 'rmsprop'`
- `loss = 'bce'`
- `metrics = ['accuracy']`
- `epochs = 20`
- `batch_size = 10`
- `verbose = 0`

Also, be sure to leave the `tf.random.set_seed(42)` call in place for proper grading.

In [8]:
### GRADED
history = ''


# YOUR CODE HERE
tf.random.set_seed(42)
model.compile(optimizer='rmsprop',
             loss='binary_crossentropy',
             metrics=['accuracy'])

history = model.fit(X_t, y,
                   epochs=20,
                   batch_size=10,
                   verbose=0)

### ANSWER CHECK
history.history['accuracy'][-1]

0.8406285047531128

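The loss and metric used above can be sketched by hand. The helpers `binary_cross_entropy` and `accuracy` below are hypothetical names, the predicted probabilities are made up, and the clipping constant `eps` is an assumption chosen to avoid `log(0)`; this illustrates the formulas and is not a copy of Keras's implementation.

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)], with p clipped away from 0 and 1."""
    p = np.clip(y_prob, eps, 1 - eps)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of rows where the thresholded probability matches the label."""
    return float(np.mean((y_prob > threshold).astype(int) == y_true))

y_true = np.array([0, 1, 1, 1, 0])             # the five labels shown by y.head()
y_prob = np.array([0.2, 0.9, 0.6, 0.4, 0.1])   # made-up predicted probabilities

print(binary_cross_entropy(y_true, y_prob))
print(accuracy(y_true, y_prob))
```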
### Summary of Exercises

This notebook demonstrated building a neural network for the Titanic survival prediction problem using multiple features. The exercise was broken down into three main parts:

1. **Data Preparation**
   - Used more features than the previous example: `pclass`, `sex`, `age`, `sibsp`, `parch`, `fare`, `embarked`, `class`
   - Handled missing values in the age column using mean imputation
   - Applied feature preprocessing using column transformer:
     - OneHotEncoder for categorical features
     - StandardScaler for numerical features

2. **Network Architecture**
   - Built a simple sequential model with:
     - One hidden layer (100 units with ReLU activation)
     - Output layer (1 unit with sigmoid activation)
   - Input shape set to match the 13 columns produced by preprocessing

3. **Model Training**
   - Compiled model with RMSprop optimizer and binary cross-entropy loss
   - Trained for 20 epochs with batch size of 10
   - Reached approximately 84% training accuracy (the model was fit and scored on the full dataset, with no held-out test set)

### Key Takeaways

- Using more features improved the model's training accuracy compared to the previous example (from roughly 65% to 84%)
- Proper feature preprocessing is crucial:
  - Categorical features need to be one-hot encoded
  - Numerical features benefit from standardization
- Even a network with a single hidden layer can fit this data well once the features are encoded and scaled
- Because accuracy was measured on the training data, it does not show how well the model generalizes to unseen passengers