### Full Network Example

This activity uses the full dataset and the remaining features with a neural network that has a single hidden layer. In the previous example, which used only two features, the model reached an accuracy of roughly 65%. Here you will use more features with a similar network architecture and see whether the model improves.

#### Index 

- [Problem 1](#-Problem-1)
- [Problem 2](#-Problem-2)
- [Problem 3](#-Problem-3)

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

import warnings

warnings.filterwarnings("ignore")

from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.compose import make_column_transformer
from sklearn.pipeline import Pipeline

import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential


### The Data

Below, the Titanic data is loaded again. For this exercise, we use the columns `['pclass', 'sex', 'age', 'sibsp', 'parch', 'fare', 'embarked', 'class']` as features to predict `survived`.

In [2]:
titanic = sns.load_dataset("titanic")

In [3]:
X = titanic.loc[
    :, ["pclass", "sex", "age", "sibsp", "parch", "fare", "embarked", "class"]
]
y = titanic["survived"]

In [4]:
X.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 8 columns):
 #   Column    Non-Null Count  Dtype   
---  ------    --------------  -----   
 0   pclass    891 non-null    int64   
 1   sex       891 non-null    object  
 2   age       714 non-null    float64 
 3   sibsp     891 non-null    int64   
 4   parch     891 non-null    int64   
 5   fare      891 non-null    float64 
 6   embarked  889 non-null    object  
 7   class     891 non-null    category
dtypes: category(1), float64(2), int64(3), object(2)
memory usage: 49.9+ KB


In [5]:
y.head()

0    0
1    1
2    1
3    1
4    0
Name: survived, dtype: int64

[Back to top](#-Index)

### Problem 1

#### Preparing the features

**10 points**

Below, use `make_column_transformer` to prepare the features. Apply `OneHotEncoder` with `drop = 'if_binary'` to all categorical features, and apply `StandardScaler` to the remaining features.

Be sure to first fill the missing values in `age` with the mean of the column.

Assign the transformed array to `X_t` below.

In [6]:
### GRADED
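# Fill missing ages by assignment; an in-place fillna on a selected column
# may not update X in recent pandas versions.
X = X.assign(age=X["age"].fillna(X["age"].mean()))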
X_t = make_column_transformer(
    (
        OneHotEncoder(drop="if_binary"),
        ["sex", "embarked", "class"],
    ),
    remainder=StandardScaler(),
).fit_transform(X)


### ANSWER CHECK
X_t

array([[ 1.        ,  0.        ,  0.        , ...,  0.43279337,
        -0.47367361, -0.50244517],
       [ 0.        ,  1.        ,  0.        , ...,  0.43279337,
        -0.47367361,  0.78684529],
       [ 0.        ,  0.        ,  0.        , ..., -0.4745452 ,
        -0.47367361, -0.48885426],
       ...,
       [ 0.        ,  0.        ,  0.        , ...,  0.43279337,
         2.00893337, -0.17626324],
       [ 1.        ,  1.        ,  0.        , ..., -0.4745452 ,
        -0.47367361, -0.04438104],
       [ 1.        ,  0.        ,  1.        , ..., -0.4745452 ,
        -0.47367361, -0.49237783]])
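
To make the preprocessing steps above more concrete, here is a minimal standard-library sketch of what mean imputation, standard scaling, and one-hot encoding with a binary drop do on toy values. The helper names (`impute_mean`, `standardize`, `one_hot_if_binary`) and the toy inputs are hypothetical, and this is an illustration of the idea rather than scikit-learn's implementation.

```python
from statistics import fmean, pstdev


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = fmean(observed)
    return [mean if v is None else v for v in values]


def standardize(values):
    """Subtract the mean and divide by the population standard deviation."""
    mean = fmean(values)
    std = pstdev(values)  # population std (ddof=0)
    return [(v - mean) / std for v in values]


def one_hot_if_binary(values):
    """One 0/1 column per category; a two-category feature keeps one column."""
    categories = sorted(set(values))
    if len(categories) == 2:
        categories = categories[1:]  # drop the first category
    rows = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return categories, rows


# Toy values (hypothetical, not taken from the dataset)
ages = [22.0, None, 26.0, 36.0]
filled = impute_mean(ages)    # the None becomes the mean of 22, 26, 36
scaled = standardize(filled)  # mean 0, unit population variance

sex_cols, sex_rows = one_hot_if_binary(["male", "female", "female", "male"])
emb_cols, emb_rows = one_hot_if_binary(["S", "C", "Q", "S"])

print(filled)
print(sex_cols, sex_rows)
print(emb_cols, emb_rows)
```

In this sketch the binary feature collapses to a single column while the three-category feature keeps all three, which matches the pattern visible in the first columns of `X_t`.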

[Back to top](#-Index)

### Problem 2

#### The Network Architecture

**10 points**

Below, construct a network named `model` that has one hidden `Dense` layer with 100 units and the `relu` activation function. Use an output layer with one node and the `sigmoid` activation function.

In [7]:
### GRADED
tf.random.set_seed(42)
model = tf.keras.Sequential(
    [
        Dense(100, activation="relu"),
        Dense(1, activation="sigmoid"),
    ]
)

### ANSWER CHECK
model

<Sequential name=sequential, built=False>
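
The `built=False` in the output reflects that the model has not yet seen an input shape, so its weights do not exist until it is fit or called on data. As a rough sketch of how large the network becomes once the input width is known, the helper below counts weights and biases for a stack of dense layers. The function names and the feature count of 10 are hypothetical placeholders; substitute `X_t.shape[1]` for the real width.

```python
def dense_params(n_inputs, n_units):
    """A dense layer has one weight per input-unit pair plus one bias per unit."""
    return n_inputs * n_units + n_units


def count_params(n_features, layer_units=(100, 1)):
    """Total trainable parameters for dense layers stacked in sequence."""
    total = 0
    n_inputs = n_features
    for units in layer_units:
        total += dense_params(n_inputs, units)
        n_inputs = units  # this layer's output feeds the next layer
    return total


n_features = 10  # hypothetical width; use X_t.shape[1] in the notebook
hidden = dense_params(n_features, 100)
output = dense_params(100, 1)
print(hidden, output, count_params(n_features))
```

After training, `model.summary()` should report the corresponding counts for the actual input width.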

[Back to top](#-Index)

### Problem 3

#### Train and Evaluate the Network

**10 points**

Finally, train and evaluate the network using the following compilation and fitting settings. Assign the result of `model.fit` to `history` below.

- `optimizer = 'rmsprop'`
- `loss = 'bce'` (binary cross-entropy; the solution cell spells it out as `'binary_crossentropy'`)
- `metrics = ['accuracy']`
- `epochs = 20`
- `batch_size = 10`
- `verbose = 0`

Also, be sure to leave the `tf.random.set_seed(42)` call in place for proper grading.
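
To illustrate what these settings imply, the following standard-library sketch computes binary cross-entropy by hand and works out how many weight updates 20 epochs with a batch size of 10 amount to on the 891 rows reported by `X.info()`. The function names and the clipping constant `eps` are assumptions made for illustration, not Keras internals.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean negative log-likelihood of 0/1 labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0); eps is an assumption
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def steps_per_epoch(n_samples, batch_size):
    """Number of batches needed to pass over the data once."""
    return math.ceil(n_samples / batch_size)


n_samples = 891  # rows reported by X.info()
steps = steps_per_epoch(n_samples, batch_size=10)
total_updates = steps * 20  # epochs = 20

# Hypothetical probabilities: confident and right vs. confident and wrong
good = binary_cross_entropy([1, 0], [0.9, 0.1])
bad = binary_cross_entropy([1, 0], [0.1, 0.9])
print(steps, total_updates, good, bad)
```

Because 891 is not a multiple of 10, the last batch of each epoch holds a single row. The loss is small when confident predictions are correct and large when they are wrong, which is the quantity the optimizer drives down.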

In [8]:
0.8383838534355164

0.8383838534355164

In [9]:
### GRADED
tf.random.set_seed(42)
model.compile(optimizer="rmsprop", loss="binary_crossentropy", metrics=["accuracy"])
history = model.fit(x=X_t, y=y, epochs=20, batch_size=10, verbose=0)

### ANSWER CHECK
history.history["accuracy"][-1]

0.8451178669929504
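
Note that this value is accuracy on the same rows the model was fit on, since `model.fit` received all of `X_t` with no validation data. As a small sketch of how such an accuracy figure is derived from sigmoid outputs, the snippet below thresholds probabilities at 0.5 and compares them with the labels. The labels are the five values shown by `y.head()`; the probabilities and the helper name are hypothetical.

```python
def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of rows where the thresholded probability matches the label."""
    preds = [1 if p > threshold else 0 for p in y_prob]
    correct = sum(int(pred == y) for pred, y in zip(preds, y_true))
    return correct / len(y_true)


y_true = [0, 1, 1, 1, 0]            # the values shown by y.head()
y_prob = [0.2, 0.9, 0.4, 0.7, 0.1]  # hypothetical sigmoid outputs

acc = binary_accuracy(y_true, y_prob)
print(acc)
```

In this toy case four of the five thresholded predictions match the labels.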