# 2023 Super Bowl LVII-Deep Learning

The business team has given you a CSV containing more than 320 data points from the NFL. With your knowledge of machine learning and neural networks, you decide to use the features in the provided dataset to create a binary classifier model that will predict whether a team is a Super Bowl winner. The CSV file contains a variety of regular-season information about these teams for the 2011 through 2021 seasons; the Super Bowl winner flag used as the target is added during data preparation.

## Instructions:

The steps for this challenge are broken out into the following sections:

* Prepare the data for use on a neural network model.

* Compile and evaluate a binary classification model using a neural network.

* Optimize the neural network model.

### Prepare the Data for Use on a Neural Network Model 

Using your knowledge of Pandas and scikit-learn’s `StandardScaler()`, preprocess the dataset so that you can use it to compile and evaluate the neural network model later.

Open the starter code file, and complete the following data preparation steps:

1. Read the regular-season CSV file into a Pandas DataFrame. Review the DataFrame, looking for categorical variables that will need to be encoded, as well as columns that could eventually define your features and target variables.

2. Drop the columns that are not relevant to the binary classification model from the DataFrame.
 
3. Encode the dataset’s categorical variables using `OneHotEncoder`, and then place the encoded variables into a new DataFrame.

4. Add the original DataFrame’s numerical variables to the DataFrame containing the encoded variables.

> **Note** To complete this step, you will employ the Pandas `concat()` function that was introduced earlier in this course. 

5. Using the preprocessed data, create the features (`X`) and target (`y`) datasets. The target dataset should be defined by the preprocessed DataFrame column “Previous Super Bowl Winner”. The remaining columns should define the features dataset.

6. Split the features and target sets into training and testing datasets.

7. Use scikit-learn's `StandardScaler` to scale the features data.

### Compile and Evaluate a Binary Classification Model Using a Neural Network

Use your knowledge of TensorFlow to design a binary classification deep neural network model. This model should use the dataset’s features to predict whether a team is a previous Super Bowl winner. Consider the number of inputs before determining the number of layers that your model will contain or the number of neurons on each layer. Then, compile and fit your model. Finally, evaluate your binary classification model to calculate the model’s loss and accuracy.
 
To do so, complete the following steps:

1. Create a deep neural network by assigning the number of input features, the number of layers, and the number of neurons on each layer using TensorFlow’s Keras.

> **Hint** You can start with a two-layer deep neural network model that uses the `relu` activation function for both layers.

2. Compile and fit the model using the `binary_crossentropy` loss function, the `adam` optimizer, and the `accuracy` evaluation metric.

> **Hint** When fitting the model, start with a small number of epochs, such as 20, 50, or 100.

3. Evaluate the model using the test data to determine the model’s loss and accuracy.

4. Save and export your model to an HDF5 file, and name the file `SuperBowl.h5`. 

### Optimize the Neural Network Model

Using your knowledge of TensorFlow and Keras, optimize your model to improve the model's accuracy. Even if you do not successfully achieve a better accuracy, you'll need to demonstrate at least two attempts to optimize the model. You can include these attempts in your existing notebook. Or, you can make copies of the starter notebook in the same folder, rename them, and code each model optimization in a new notebook. 

> **Note** You will not lose points if your model does not achieve a high accuracy, as long as you make at least two attempts to optimize the model.

To do so, complete the following steps:

1. Define at least two new deep neural network models (the original plus two optimization attempts). With each, try to improve on your first model’s predictive accuracy.

> **Rewind** Recall that perfect accuracy has a value of 1, so accuracy improves as its value moves closer to 1. To optimize your model for a predictive accuracy as close to 1 as possible, you can use any or all of the following techniques:
>
> * Adjust the input data by dropping different features columns to ensure that no variables or outliers confuse the model.
>
> * Add more neurons (nodes) to a hidden layer.
>
> * Add more hidden layers.
>
> * Use different activation functions for the hidden layers.
>
> * Add to or reduce the number of epochs in the training regimen.

2. After finishing your models, display the accuracy scores achieved by each model, and compare the results.

3. Save each of your models as an HDF5 file.


In [1]:
# Imports
import numpy as np
import h5py
import pandas as pd
from pathlib import Path
import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler,OneHotEncoder

---

## Prepare the data to be used on a neural network model

### Step 1: Read the regular-season CSV file into a Pandas DataFrame. Review the DataFrame, looking for categorical variables that will need to be encoded, as well as columns that could eventually define your features and target variables.


In [2]:
# Load the data into a Pandas DataFrame
df_Football_Season = pd.read_csv(
    Path("./Resources/2021_2011_regular_ season_1.csv")
)

# Drop the Date column
df_Football_Season = df_Football_Season.drop(columns=["Date"])

# Display sample data
df_Football_Season.head(10)

Unnamed: 0,Years,Teams,W,L,T,PCT,PF,PA,Net Pts,Home,Road,Div,Pct,Conf,Pct.1,Non-Conf,Strk,Last 5
0,2021,Bills,11,6,0,0.647,483,289,194,6 - 3 - 0,5 - 3 - 0,5 - 1 - 0,0.833,7 - 5 - 0,0.583,4 - 1 - 0,4W,4 - 1 - 0
1,2021,Dolphins,9,8,0,0.529,341,373,-32,6 - 3 - 0,3 - 5 - 0,4 - 2 - 0,0.667,6 - 6 - 0,0.5,3 - 2 - 0,1W,4 - 1 - 0
2,2021,Patriots,10,7,0,0.588,462,303,159,4 - 5 - 0,6 - 2 - 0,3 - 3 - 0,0.5,8 - 4 - 0,0.667,2 - 3 - 0,1L,2 - 3 - 0
3,2021,Jets,4,13,0,0.235,310,504,-194,3 - 6 - 0,1 - 7 - 0,0 - 6 - 0,0.0,4 - 8 - 0,0.333,0 - 5 - 0,2L,1 - 4 - 0
4,2021,Bengals,10,7,0,0.588,460,376,84,5 - 4 - 0,5 - 3 - 0,4 - 2 - 0,0.667,8 - 4 - 0,0.667,2 - 3 - 0,1L,3 - 2 - 0
5,2021,Steelers,9,7,1,0.559,343,398,-55,6 - 2 - 1,3 - 5 - 0,4 - 2 - 0,0.667,7 - 5 - 0,0.583,2 - 2 - 1,2W,3 - 2 - 0
6,2021,Browns,8,9,0,0.471,349,371,-22,6 - 3 - 0,2 - 6 - 0,3 - 3 - 0,0.5,5 - 7 - 0,0.417,3 - 2 - 0,1W,2 - 3 - 0
7,2021,Ravens,8,9,0,0.471,387,392,-5,5 - 4 - 0,3 - 5 - 0,1 - 5 - 0,0.167,5 - 7 - 0,0.417,3 - 2 - 0,6L,0 - 5 - 0
8,2021,Titans,12,5,0,0.706,419,354,65,7 - 2 - 0,5 - 3 - 0,5 - 1 - 0,0.833,8 - 4 - 0,0.667,4 - 1 - 0,3W,4 - 1 - 0
9,2021,Colts,9,8,0,0.529,451,365,86,4 - 5 - 0,5 - 3 - 0,3 - 3 - 0,0.5,7 - 5 - 0,0.583,2 - 3 - 0,2L,3 - 2 - 0


In [3]:
# Display sample data
df_Football_Season.tail(10)

Unnamed: 0,Years,Teams,W,L,T,PCT,PF,PA,Net Pts,Home,Road,Div,Pct,Conf,Pct.1,Non-Conf,Strk,Last 5
342,2011,Bears,8,8,0,0.5,353,341,12,5 - 3 - 0,3 - 5 - 0,3 - 3 - 0,0.5,7 - 5 - 0,0.583,1 - 3 - 0,1W,1 - 4 - 0
343,2011,Vikings,3,13,0,0.188,340,449,-109,1 - 7 - 0,2 - 6 - 0,0 - 6 - 0,0.0,3 - 9 - 0,0.25,0 - 4 - 0,1L,1 - 4 - 0
344,2011,Saints,13,3,0,0.813,547,339,208,8 - 0 - 0,5 - 3 - 0,5 - 1 - 0,0.833,9 - 3 - 0,0.75,4 - 0 - 0,8W,5 - 0 - 0
345,2011,Falcons,10,6,0,0.625,402,350,52,6 - 2 - 0,4 - 4 - 0,3 - 3 - 0,0.5,7 - 5 - 0,0.583,3 - 1 - 0,1W,3 - 2 - 0
346,2011,Panthers,6,10,0,0.375,406,429,-23,3 - 5 - 0,3 - 5 - 0,2 - 4 - 0,0.333,3 - 9 - 0,0.25,3 - 1 - 0,1L,3 - 2 - 0
347,2011,Buccaneers,4,12,0,0.25,287,494,-207,3 - 5 - 0,1 - 7 - 0,2 - 4 - 0,0.333,3 - 9 - 0,0.25,1 - 3 - 0,10L,0 - 5 - 0
348,2011,49ers,13,3,0,0.813,380,229,151,7 - 1 - 0,6 - 2 - 0,5 - 1 - 0,0.833,10 - 2 - 0,0.833,3 - 1 - 0,3W,4 - 1 - 0
349,2011,Cardinals,8,8,0,0.5,312,348,-36,6 - 2 - 0,2 - 6 - 0,4 - 2 - 0,0.667,7 - 5 - 0,0.583,1 - 3 - 0,1W,4 - 1 - 0
350,2011,Seahawks,7,9,0,0.438,321,315,6,4 - 4 - 0,3 - 5 - 0,3 - 3 - 0,0.5,6 - 6 - 0,0.5,1 - 3 - 0,2L,3 - 2 - 0
351,2011,Rams,2,14,0,0.125,193,407,-214,1 - 7 - 0,1 - 7 - 0,0 - 6 - 0,0.0,1 - 11 - 0,0.083,1 - 3 - 0,7L,0 - 5 - 0


In [4]:
# Review the data types associated with the columns
df_Football_Season.dtypes

Years         int64
Teams        object
W             int64
L             int64
T             int64
PCT         float64
PF            int64
PA            int64
Net Pts       int64
Home         object
Road         object
Div          object
Pct         float64
Conf         object
Pct.1       float64
Non-Conf     object
Strk         object
Last 5       object
dtype: object

In [5]:
# Display the column names
df_Football_Season.columns

Index(['Years', 'Teams', 'W', 'L', 'T', 'PCT', 'PF', 'PA', 'Net Pts', 'Home',
       'Road', 'Div', 'Pct', 'Conf', 'Pct.1', 'Non-Conf', 'Strk', 'Last 5'],
      dtype='object')
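
The `Home`, `Road`, `Div`, `Conf`, `Non-Conf`, and `Last 5` columns are stored as `object` because they hold strings such as `6 - 3 - 0`. As a minimal sketch (the helper name `parse_record` is hypothetical and not part of the notebook), such a string could be split into numeric wins, losses, and ties if you wanted to use these columns as features:

```python
from typing import Tuple


def parse_record(record: str) -> Tuple[int, int, int]:
    """Split a 'wins - losses - ties' string into three integers."""
    # int() tolerates the spaces left around each number after splitting
    wins, losses, ties = (int(part) for part in record.split("-"))
    return wins, losses, ties


# 'Home' value for the 2021 Bills row shown above
home = parse_record("6 - 3 - 0")
home_win_share = home[0] / sum(home)
print(home, round(home_win_share, 3))
```

The notebook itself does not use these columns, so this is only one option for the review step.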

### Step 2: Drop the columns that are not relevant to the binary classification model by building a per-team summary DataFrame.

In [6]:
# Define a dictionary containing per-team summary data
data = {'Teams': ['49ers', 'Bears', 'Bengals', 'Bills', 'Broncos', 'Browns', 'Buccaneers','Cardinals','Chargers','Chiefs','Colts','Cowboys','Dolphins','Eagles','Falcons',
        'Giants','Jaguars','Jets','Lions','Packers','Panthers','Patriots','Raiders','Rams','Ravens','Redskins','Saints','Seahawks','Steelers','Texans','Titans',
        'Vikings'],
        'W': [90,79,87,91,97,56,73,89,84,112,92,98,82,90,87,70,47,63,74,118,86,128,73,88,105,70,110,112,111,84,86,90],
        'Net Pts': [127,-258,72,106,284,-933,-347,-66,133,643,-18,380,-492,208,21,-556,-1203,-1039,-342,708,-32,1599,-1030,-43,793,-676,813,946,574,-193,-284,105],      
        'Unique Identifier': [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32]
}

# Convert the dictionary into DataFrame
df_SuperBowlWinners_Deep = pd.DataFrame.from_dict(data)

# Declare a list that is to be converted into a column
SuperBowlWins = [0,0,0,0,1,0,1,0,0,1,0,0,0,1,0,1,0,0,0,1,0,1,0,1,1,0,0,1,0,0,0,0]

# Using 'Previous Super Bowl Winner' as the column name
# and equating it to the list
df_SuperBowlWinners_Deep['Previous Super Bowl Winner'] = SuperBowlWins

# Reset the index (assign the result, because reset_index returns a new DataFrame)
df_SuperBowlWinners_Deep = df_SuperBowlWinners_Deep.reset_index(drop=True)


# Observe the result
df_SuperBowlWinners_Deep


Unnamed: 0,Teams,W,Net Pts,Unique Identifier,Previous Super Bowl Winner
0,49ers,90,127,1,0
1,Bears,79,-258,2,0
2,Bengals,87,72,3,0
3,Bills,91,106,4,0
4,Broncos,97,284,5,1
5,Browns,56,-933,6,0
6,Buccaneers,73,-347,7,1
7,Cardinals,89,-66,8,0
8,Chargers,84,133,9,0
9,Chiefs,112,643,10,1


### Step 3: Encode the dataset’s categorical variables using `OneHotEncoder`, and then place the encoded variables into a new DataFrame.

In [7]:
# Create a list of categorical variables 
categorical_variables = ['Unique Identifier', 'Teams']

# Display the categorical variables list
categorical_variables


['Unique Identifier', 'Teams']

In [8]:
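# Create a OneHotEncoder instance (converts each category into its own 0/1 indicator column)
# Note: scikit-learn 1.2 and later name this argument `sparse_output` instead of `sparse`
enc =  OneHotEncoder(sparse=False)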

In [9]:
# Encode the categorical variables using OneHotEncoder
encoded_data = enc.fit_transform(df_SuperBowlWinners_Deep[categorical_variables])
encoded_data

array([[1., 0., 0., ..., 0., 0., 0.],
       [0., 1., 0., ..., 0., 0., 0.],
       [0., 0., 1., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 1., 0., 0.],
       [0., 0., 0., ..., 0., 1., 0.],
       [0., 0., 0., ..., 0., 0., 1.]])

In [10]:
# Create a DataFrame with the encoded variables
encoded_df = pd.DataFrame(
    encoded_data,
    columns = enc.get_feature_names_out(categorical_variables)
)
# Review the DataFrame
encoded_df


Unnamed: 0,Unique Identifier_1,Unique Identifier_2,Unique Identifier_3,Unique Identifier_4,Unique Identifier_5,Unique Identifier_6,Unique Identifier_7,Unique Identifier_8,Unique Identifier_9,Unique Identifier_10,...,Teams_Raiders,Teams_Rams,Teams_Ravens,Teams_Redskins,Teams_Saints,Teams_Seahawks,Teams_Steelers,Teams_Texans,Teams_Titans,Teams_Vikings
0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
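
To make the encoding step concrete, here is a minimal pure-Python sketch of one-hot encoding (the function `one_hot` is a hypothetical stand-in, not scikit-learn's implementation). It also shows the arithmetic behind the feature count used later: 32 `Teams` columns plus 32 `Unique Identifier` columns plus the two numeric columns.

```python
def one_hot(values):
    """Return the sorted categories and one 0/1 row per input value."""
    categories = sorted(set(values))
    position = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0.0] * len(categories)
        row[position[value]] = 1.0
        rows.append(row)
    return categories, rows


# A small sample with one repeated team to show shared columns
categories, encoded = one_hot(["49ers", "Bears", "Bengals", "Bears"])
print(categories)
print(encoded)

# Feature count for this notebook's DataFrame
n_teams = 32
n_numeric = 2  # 'W' and 'Net Pts'
n_features = n_teams + n_teams + n_numeric
print(n_features)
```

Because every team and every identifier appears exactly once in this DataFrame, each row gets its own pair of indicator columns, so those indicators carry no information about teams outside the training split.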


### Step 4: Add the original DataFrame’s numerical variables to the DataFrame containing the encoded variables.

> **Note** To complete this step, you will employ the Pandas `concat()` function that was introduced earlier in this course. 

In [11]:
# Add the numerical variables from the original DataFrame to the one-hot encoding DataFrame
numerical_variables_df = df_SuperBowlWinners_Deep.drop(columns = categorical_variables)

# Review the DataFrame
numerical_variables_df.head()

Unnamed: 0,W,Net Pts,Previous Super Bowl Winner
0,90,127,0
1,79,-258,0
2,87,72,0
3,91,106,0
4,97,284,1


In [12]:
# Using the Pandas concat function, combine the DataFrames that contain the encoded categorical data and the numerical data
encoded2_df = pd.concat(
    [
        numerical_variables_df,
        encoded_df
    ],
    axis=1
)

# Review the DataFrame
encoded2_df.head()

Unnamed: 0,W,Net Pts,Previous Super Bowl Winner,Unique Identifier_1,Unique Identifier_2,Unique Identifier_3,Unique Identifier_4,Unique Identifier_5,Unique Identifier_6,Unique Identifier_7,...,Teams_Raiders,Teams_Rams,Teams_Ravens,Teams_Redskins,Teams_Saints,Teams_Seahawks,Teams_Steelers,Teams_Texans,Teams_Titans,Teams_Vikings
0,90,127,0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,79,-258,0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,87,72,0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,91,106,0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,97,284,1,0.0,0.0,0.0,0.0,1.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


### Step 5: Using the preprocessed data, create the features (`X`) and target (`y`) datasets. The target dataset should be defined by the preprocessed DataFrame column “Previous Super Bowl Winner”. The remaining columns should define the features dataset. 



In [13]:
# Define the target set y using the 'Previous Super Bowl Winner' column
y = encoded2_df["Previous Super Bowl Winner"]

# Display a sample of y
y[:10]


0    0
1    0
2    0
3    0
4    1
5    0
6    1
7    0
8    0
9    1
Name: Previous Super Bowl Winner, dtype: int64

In [14]:
# Define features set X by selecting all columns but 'Previous Super Bowl Winner'
X = encoded2_df.drop(columns=["Previous Super Bowl Winner"])

# Review the features DataFrame
X.head()


Unnamed: 0,W,Net Pts,Unique Identifier_1,Unique Identifier_2,Unique Identifier_3,Unique Identifier_4,Unique Identifier_5,Unique Identifier_6,Unique Identifier_7,Unique Identifier_8,...,Teams_Raiders,Teams_Rams,Teams_Ravens,Teams_Redskins,Teams_Saints,Teams_Seahawks,Teams_Steelers,Teams_Texans,Teams_Titans,Teams_Vikings
0,90,127,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,79,-258,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,87,72,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,91,106,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,97,284,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


### Step 6: Split the features and target sets into training and testing datasets.


In [15]:
# Split the preprocessed data into a training and testing dataset
# Assign the function a random_state equal to 1
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
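
With only 32 rows, the size of each split matters. The sketch below works through the arithmetic, assuming scikit-learn's default of holding out 25% of the rows when no size is given (the variable names are illustrative):

```python
import math

n_samples = 32          # one row per team
test_fraction = 0.25    # assumed default when test_size and train_size are omitted

# The test count is rounded up; the remainder is used for training
n_test = math.ceil(test_fraction * n_samples)
n_train = n_samples - n_test

# Accuracy on the test set can only take these values
possible_accuracies = [k / n_test for k in range(n_test + 1)]

print(n_train, n_test)
print(possible_accuracies)
```

Each test prediction therefore moves the reported accuracy by 0.125, which is worth keeping in mind when comparing models later.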

### Step 7: Use scikit-learn's `StandardScaler` to scale the features data.

In [16]:
# Create a StandardScaler instance
scaler = StandardScaler()

# Fit the scaler to the features training dataset
X_scaler = scaler.fit(X_train)

# Scale the features training and testing datasets using the fitted scaler
X_train_scaled = X_scaler.transform(X_train)
X_test_scaled = X_scaler.transform(X_test)
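
As a rough illustration of what the scaling step does to a single column, the sketch below standardizes values by hand. It assumes the usual z-score definition with the population standard deviation; the helper names and the train/test assignment of the sample values are hypothetical.

```python
from statistics import fmean, pstdev


def fit_column(column):
    """Learn the mean and population standard deviation from training values."""
    return fmean(column), pstdev(column)


def transform_column(column, mean, std):
    """Apply z = (x - mean) / std using statistics learned from training data."""
    return [(x - mean) / std for x in column]


# 'W' values from the first rows of the summary table; the split is illustrative
train_wins = [90, 79, 87, 91]
test_wins = [97]

mean, std = fit_column(train_wins)
train_scaled = transform_column(train_wins, mean, std)
test_scaled = transform_column(test_wins, mean, std)  # reuses training statistics

print(train_scaled)
print(test_scaled)
```

Fitting on the training rows only, and reusing those statistics for the test rows, keeps information from the test set out of the preprocessing.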


---

## Compile and Evaluate a Binary Classification Model Using a Neural Network

### Step 1: Create a deep neural network by assigning the number of input features, the number of layers, and the number of neurons on each layer using TensorFlow’s Keras.

> **Hint** You can start with a two-layer deep neural network model that uses the `relu` activation function for both layers.


In [17]:
# Define the number of inputs (features) to the model
number_input_features = len(X_train.iloc[0])

# Review the number of features
number_input_features

66

In [18]:
# Define the number of neurons in the output layer (a binary classifier needs a single output)
number_output_neurons = 1

In [19]:
# Define the number of hidden nodes for the first hidden layer
hidden_nodes_layer1 =  (number_input_features + 1) // 2 

# Review the number of hidden nodes in the first layer
hidden_nodes_layer1

33

In [20]:
# Define the number of hidden nodes for the second hidden layer
hidden_nodes_layer2 = (hidden_nodes_layer1 + 1) // 2

# Review the number of hidden nodes in the second layer
hidden_nodes_layer2

17

In [21]:
# Create the Sequential model instance
nn = Sequential()

In [22]:
# Add the first hidden layer
nn.add(Dense(units=hidden_nodes_layer1, input_dim=number_input_features, activation="relu"))

In [23]:
# Add the second hidden layer
nn.add(Dense(units=hidden_nodes_layer2, activation="relu"))

In [24]:
# Add the output layer to the model specifying the number of output neurons and activation function
nn.add(Dense(units=1, activation="sigmoid"))

In [25]:
# Display the Sequential model summary
nn.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 33)                2211      
                                                                 
 dense_1 (Dense)             (None, 17)                578       
                                                                 
 dense_2 (Dense)             (None, 1)                 18        
                                                                 
Total params: 2,807
Trainable params: 2,807
Non-trainable params: 0
_________________________________________________________________
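
The parameter counts in the summary can be checked by hand: a fully connected layer has one weight per input per unit, plus one bias per unit. A short sketch (the helper name `dense_params` is illustrative):

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return (n_inputs + 1) * n_units


# Inputs, first hidden layer, second hidden layer, output layer
layer_sizes = [66, 33, 17, 1]

per_layer = [
    dense_params(n_in, n_out)
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]
total_params = sum(per_layer)

print(per_layer)
print(total_params)
```

The model has far more trainable parameters than the 32 rows available, so overfitting is a realistic risk here.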


### Step 2: Compile and fit the model using the `binary_crossentropy` loss function, the `adam` optimizer, and the `accuracy` evaluation metric.


In [26]:
# Compile the Sequential model
nn.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

In [27]:
# Fit the model using 50 epochs and the training data
fit_model = nn.fit(X_train_scaled, y_train, epochs=50)


Epoch 1/50
...
Epoch 50/50


### Step 3: Evaluate the model using the test data to determine the model’s loss and accuracy.


In [28]:
# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn.evaluate(X_test_scaled,y_test,verbose=2)

# Display the model loss and accuracy results
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

1/1 - 0s - loss: 0.6074 - accuracy: 0.5000 - 123ms/epoch - 123ms/step
Loss: 0.6074205040931702, Accuracy: 0.5
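
To put the reported loss in context, the sketch below computes binary cross-entropy by hand for a hypothetical set of eight labels (the labels and probabilities are made up, and the clipping constant is an assumption chosen to avoid `log(0)`). A model that outputs 0.5 for every sample scores ln 2, about 0.693, so a loss near 0.61 is only modestly better than an uninformed guess.

```python
import math


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -(y*log(p) + (1-y)*log(1-p)) with probabilities clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Hypothetical labels for an 8-row test set
y_true = [0, 1, 0, 0, 1, 0, 1, 0]

uninformed = [0.5] * len(y_true)                        # always predicts 0.5
confident = [0.1, 0.9, 0.1, 0.1, 0.9, 0.1, 0.9, 0.1]    # mostly right and confident

loss_uninformed = binary_crossentropy(y_true, uninformed)
loss_confident = binary_crossentropy(y_true, confident)

print(loss_uninformed, loss_confident)
```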


### Step 4: Save and export your model to an HDF5 file, and name the file `SuperBowl.h5`. 


In [29]:
# Set the model's file path
file_path = Path("Resources/SuperBowl_1.h5")

# Export your model to an HDF5 file
nn.save(file_path)

---

## Optimize the neural network model


### Step 1: Define at least two new deep neural network models (resulting in the original plus two optimization attempts). With each, try to improve on your first model’s predictive accuracy.

> **Rewind** Recall that perfect accuracy has a value of 1, so accuracy improves as its value moves closer to 1. To optimize your model for a predictive accuracy as close to 1 as possible, you can use any or all of the following techniques:
>
> * Adjust the input data by dropping different features columns to ensure that no variables or outliers confuse the model.
>
> * Add more neurons (nodes) to a hidden layer.
>
> * Add more hidden layers.
>
> * Use different activation functions for the hidden layers.
>
> * Add to or reduce the number of epochs in the training regimen.


### Alternative Model 1

In [30]:
# Define the number of inputs (features) to the model
number_input_features = len(X_train.iloc[0])

# Review the number of features
number_input_features

66

In [31]:
# Define the number of neurons in the output layer (a binary classifier needs a single output)
number_output_neurons_A1 = 1

In [32]:
# Define the number of hidden nodes for the first hidden layer
hidden_nodes_layer1_A1 = (number_input_features + 1) // 2 

# Review the number of hidden nodes in the first layer
hidden_nodes_layer1_A1 

33

In [33]:
# Create the Sequential model instance
nn_A1 = Sequential()

In [34]:
# First (and only) hidden layer
nn_A1.add(Dense(units=hidden_nodes_layer1_A1, input_dim=number_input_features, activation="relu"))


# Output layer
nn_A1.add(Dense(units=1, activation="sigmoid"))


# Check the structure of the model
nn_A1.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_3 (Dense)             (None, 33)                2211      
                                                                 
 dense_4 (Dense)             (None, 1)                 34        
                                                                 
Total params: 2,245
Trainable params: 2,245
Non-trainable params: 0
_________________________________________________________________


In [35]:
# Compile the Sequential model
nn_A1.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

In [36]:
# Fit the model using 50 epochs and the training data
fit_model_A1 = nn_A1.fit(X_train_scaled, y_train, epochs=50)

Epoch 1/50
...
Epoch 50/50


### Alternative Model 2

In [37]:
# Define the number of inputs (features) to the model
number_input_features = len(X_train.iloc[0])

# Review the number of features
number_input_features

66

In [38]:
# Define the number of neurons in the output layer (a binary classifier needs a single output)
number_output_neurons_A2 = 1

In [39]:
# Define the number of hidden nodes for the first hidden layer
hidden_nodes_layer1_A2 = (number_input_features + 1) // 2 

# Review the number of hidden nodes in the first layer
hidden_nodes_layer1_A2

33

In [40]:
# Create the Sequential model instance
nn_A2 = Sequential()

In [41]:
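# First (and only) hidden layer
nn_A2.add(Dense(units=hidden_nodes_layer1_A2, input_dim=number_input_features, activation="relu"))


# Output layer [Changed activation from sigmoid to softmax]
# Caution: softmax over a single unit always outputs 1.0, so this model
# predicts the positive class for every sample regardless of its weights.
nn_A2.add(Dense(units=1, activation="softmax"))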

# Check the structure of the model
nn_A2.summary()


Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_5 (Dense)             (None, 33)                2211      
                                                                 
 dense_6 (Dense)             (None, 1)                 34        
                                                                 
Total params: 2,245
Trainable params: 2,245
Non-trainable params: 0
_________________________________________________________________
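
This model's output layer applies `softmax` to a single unit. Softmax normalizes a vector so its entries sum to 1, so with only one entry the result is always 1.0, whatever the input. The short sketch below (plain Python, not Keras) contrasts that with `sigmoid`:

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


sample_logits = [-3.0, 0.0, 2.5]
single_unit_softmax = [softmax([z])[0] for z in sample_logits]
single_unit_sigmoid = [sigmoid(z) for z in sample_logits]

print(single_unit_softmax)   # identical for every logit
print(single_unit_sigmoid)   # varies with the logit
```

For a one-unit binary output, `sigmoid` is the activation that lets the prediction depend on the input; `softmax` would need two output units and a matching loss.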


In [42]:
# Compile the model
nn_A2.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

In [43]:
# Fit the model
fit_model_A2 = nn_A2.fit(X_train_scaled, y_train, epochs=50)


Epoch 1/50
...
Epoch 50/50


### Step 2: After finishing your models, display the accuracy scores achieved by each model, and compare the results.

In [44]:
print("Original Model Results")

# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn.evaluate(X_test_scaled,y_test,verbose=2)

# Display the model loss and accuracy results
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

Original Model Results
1/1 - 0s - loss: 0.6074 - accuracy: 0.5000 - 17ms/epoch - 17ms/step
Loss: 0.6074205040931702, Accuracy: 0.5


In [45]:
print("Alternative Model 1 Results")

# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn_A1.evaluate(X_test_scaled,y_test,verbose=2)

# Display the model loss and accuracy results
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

Alternative Model 1 Results
1/1 - 0s - loss: 0.5584 - accuracy: 0.8750 - 104ms/epoch - 104ms/step
Loss: 0.55844646692276, Accuracy: 0.875


In [46]:
print("Alternative Model 2 Results")


# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn_A2.evaluate(X_test_scaled,y_test,verbose=2)

# Display the model loss and accuracy results
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

Alternative Model 2 Results
1/1 - 0s - loss: 0.5979 - accuracy: 0.3750 - 93ms/epoch - 93ms/step
Loss: 0.597945511341095, Accuracy: 0.375
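
To compare the three runs side by side, the recorded losses and accuracies can be collected and ranked. The sketch below uses the values printed above, rounded to four decimals; the test-set size of 8 is an assumption based on 32 rows and a 25% hold-out.

```python
# (loss, accuracy) as reported by evaluate() for each model
results = {
    "Original": (0.6074, 0.5),
    "Alternative 1": (0.5584, 0.875),
    "Alternative 2": (0.5979, 0.375),
}

n_test = 8  # assumed size of the test split

# Rank models from highest to lowest accuracy
ranked = sorted(results.items(), key=lambda item: item[1][1], reverse=True)
correct_counts = {name: round(acc * n_test) for name, (_, acc) in results.items()}

for name, (loss, acc) in ranked:
    print(f"{name:<14} loss={loss:.4f} accuracy={acc:.3f} "
          f"({correct_counts[name]}/{n_test} correct)")

best_model = ranked[0][0]
```

Under that assumption the gap between the best and worst model is four test predictions, so the ranking could easily change with a different `random_state`.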


### Step 3: Save each of your alternative models as an HDF5 file.


In [47]:
# Set the file path for the first alternative model
file_path = Path("SuperBowl_2.h5")

# Export your model to an HDF5 file
nn_A1.save(file_path)


In [48]:
# Set the file path for the second alternative model
file_path = Path("SuperBowl_3.h5")

# Export your model to an HDF5 file
nn_A2.save(file_path)