# Venture Funding with Deep Learning

You work as a risk management associate at Alphabet Soup, a venture capital firm. Alphabet Soup’s business team receives many funding applications from startups every day. This team has asked you to help them create a model that predicts whether applicants will be successful if funded by Alphabet Soup.

The business team has given you a CSV containing more than 34,000 organizations that have received funding from Alphabet Soup over the years. With your knowledge of machine learning and neural networks, you decide to use the features in the provided dataset to create a binary classifier model that will predict whether an applicant will become a successful business. The CSV file contains a variety of information about these businesses, including whether or not they ultimately became successful.

## Instructions:

The steps for this challenge are broken out into the following sections:

* Prepare the data for use on a neural network model.

* Compile and evaluate a binary classification model using a neural network.

* Optimize the neural network model.

### Prepare the Data for Use on a Neural Network Model 

Using your knowledge of Pandas and scikit-learn’s `StandardScaler()`, preprocess the dataset so that you can use it to compile and evaluate the neural network model later.

Open the starter code file, and complete the following data preparation steps:

1. Read the `applicants_data.csv` file into a Pandas DataFrame. Review the DataFrame, looking for categorical variables that will need to be encoded, as well as columns that could eventually define your features and target variables.   

2. Drop the “EIN” (Employer Identification Number) and “NAME” columns from the DataFrame, because they are not relevant to the binary classification model.
 
3. Encode the dataset’s categorical variables using `OneHotEncoder`, and then place the encoded variables into a new DataFrame.

4. Add the original DataFrame’s numerical variables to the DataFrame containing the encoded variables.

> **Note** To complete this step, you will employ the Pandas `concat()` function that was introduced earlier in this course. 

5. Using the preprocessed data, create the features (`X`) and target (`y`) datasets. The target dataset should be defined by the preprocessed DataFrame column “IS_SUCCESSFUL”. The remaining columns should define the features dataset. 

6. Split the features and target sets into training and testing datasets.

7. Use scikit-learn's `StandardScaler` to scale the features data.

### Compile and Evaluate a Binary Classification Model Using a Neural Network

Use your knowledge of TensorFlow to design a binary classification deep neural network model. This model should use the dataset’s features to predict whether an Alphabet Soup–funded startup will be successful. Consider the number of inputs before determining the number of layers that your model will contain or the number of neurons on each layer. Then, compile and fit your model. Finally, evaluate your binary classification model to calculate the model’s loss and accuracy.
 
To do so, complete the following steps:

1. Create a deep neural network by assigning the number of input features, the number of layers, and the number of neurons on each layer using TensorFlow’s Keras.

> **Hint** You can start with a two-layer deep neural network model that uses the `relu` activation function for both layers.

2. Compile and fit the model using the `binary_crossentropy` loss function, the `adam` optimizer, and the `accuracy` evaluation metric.

> **Hint** When fitting the model, start with a small number of epochs, such as 20, 50, or 100.

3. Evaluate the model using the test data to determine the model’s loss and accuracy.

4. Save and export your model to an HDF5 file, and name the file `AlphabetSoup.h5`. 

### Optimize the Neural Network Model

Using your knowledge of TensorFlow and Keras, optimize your model to improve the model's accuracy. Even if you do not successfully achieve a better accuracy, you'll need to demonstrate at least two attempts to optimize the model. You can include these attempts in your existing notebook. Or, you can make copies of the starter notebook in the same folder, rename them, and code each model optimization in a new notebook. 

> **Note** You will not lose points if your model does not achieve a high accuracy, as long as you make at least two attempts to optimize the model.

To do so, complete the following steps:

1. Define at least two new deep neural network models (resulting in the original plus 2 optimization attempts). With each, try to improve on your first model’s predictive accuracy.

> **Rewind** Recall that perfect accuracy has a value of 1, so accuracy improves as its value moves closer to 1. To optimize your model for a predictive accuracy as close to 1 as possible, you can use any or all of the following techniques:
>
> * Adjust the input data by dropping different features columns to ensure that no variables or outliers confuse the model.
>
> * Add more neurons (nodes) to a hidden layer.
>
> * Add more hidden layers.
>
> * Use different activation functions for the hidden layers.
>
> * Add to or reduce the number of epochs in the training regimen.

2. After finishing your models, display the accuracy scores achieved by each model, and compare the results.

3. Save each of your models as an HDF5 file.


In [1]:
# Imports
import pandas as pd
import numpy as np
from pathlib import Path
import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler,OneHotEncoder




In [2]:
# Filter out warnings to improve readability
import warnings
warnings.filterwarnings('ignore')

---

## Prepare the data to be used on a neural network model

### Step 1: Read the `applicants_data.csv` file into a Pandas DataFrame. Review the DataFrame, looking for categorical variables that will need to be encoded, as well as columns that could eventually define your features and target variables.  


In [3]:
# Read the applicants_data.csv file from the Resources folder into a Pandas DataFrame
applicant_data_df = pd.read_csv(
    Path("./Resources/applicants_data.csv")
)

# Review the DataFrame
applicant_data_df.head()


| | EIN | NAME | APPLICATION_TYPE | AFFILIATION | CLASSIFICATION | USE_CASE | ORGANIZATION | STATUS | INCOME_AMT | SPECIAL_CONSIDERATIONS | ASK_AMT | IS_SUCCESSFUL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 10520599 | BLUE KNIGHTS MOTORCYCLE CLUB | T10 | Independent | C1000 | ProductDev | Association | 1 | 0 | N | 5000 | 1 |
| 1 | 10531628 | AMERICAN CHESAPEAKE CLUB CHARITABLE TR | T3 | Independent | C2000 | Preservation | Co-operative | 1 | 1-9999 | N | 108590 | 1 |
| 2 | 10547893 | ST CLOUD PROFESSIONAL FIREFIGHTERS | T5 | CompanySponsored | C3000 | ProductDev | Association | 1 | 0 | N | 5000 | 0 |
| 3 | 10553066 | SOUTHSIDE ATHLETIC ASSOCIATION | T3 | CompanySponsored | C2000 | Preservation | Trust | 1 | 10000-24999 | N | 6692 | 1 |
| 4 | 10556103 | GENETIC RESEARCH INSTITUTE OF THE DESERT | T3 | Independent | C1000 | Heathcare | Trust | 1 | 100000-499999 | N | 142590 | 1 |


In [4]:
# Review the data types associated with the columns
applicant_data_df.dtypes


EIN                        int64
NAME                      object
APPLICATION_TYPE          object
AFFILIATION               object
CLASSIFICATION            object
USE_CASE                  object
ORGANIZATION              object
STATUS                     int64
INCOME_AMT                object
SPECIAL_CONSIDERATIONS    object
ASK_AMT                    int64
IS_SUCCESSFUL              int64
dtype: object

### Step 2: Drop the “EIN” (Employer Identification Number) and “NAME” columns from the DataFrame, because they are not relevant to the binary classification model.

In [5]:
# Drop the 'EIN' and 'NAME' columns from the DataFrame
applicant_data_df = applicant_data_df.drop(columns=['EIN','NAME'])

# Review the DataFrame
applicant_data_df.head()


| | APPLICATION_TYPE | AFFILIATION | CLASSIFICATION | USE_CASE | ORGANIZATION | STATUS | INCOME_AMT | SPECIAL_CONSIDERATIONS | ASK_AMT | IS_SUCCESSFUL |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | T10 | Independent | C1000 | ProductDev | Association | 1 | 0 | N | 5000 | 1 |
| 1 | T3 | Independent | C2000 | Preservation | Co-operative | 1 | 1-9999 | N | 108590 | 1 |
| 2 | T5 | CompanySponsored | C3000 | ProductDev | Association | 1 | 0 | N | 5000 | 0 |
| 3 | T3 | CompanySponsored | C2000 | Preservation | Trust | 1 | 10000-24999 | N | 6692 | 1 |
| 4 | T3 | Independent | C1000 | Heathcare | Trust | 1 | 100000-499999 | N | 142590 | 1 |


### Step 3: Encode the dataset’s categorical variables using `OneHotEncoder`, and then place the encoded variables into a new DataFrame.

In [6]:
# Create a list of categorical variables 
categorical_variables = list(applicant_data_df.dtypes[applicant_data_df.dtypes == "object"].index)

# Display the categorical variables list
categorical_variables


['APPLICATION_TYPE',
 'AFFILIATION',
 'CLASSIFICATION',
 'USE_CASE',
 'ORGANIZATION',
 'INCOME_AMT',
 'SPECIAL_CONSIDERATIONS']

In [7]:
# Create a OneHotEncoder instance
# Note: scikit-learn 1.2 renamed this argument to sparse_output; use that name on newer versions
enc = OneHotEncoder(sparse=False)

In [8]:
# Encode the categorical variables using OneHotEncoder
encoded_data = enc.fit_transform(applicant_data_df[categorical_variables])


In [9]:
# Create a DataFrame with the encoded variables
encoded_df = pd.DataFrame(
    encoded_data,
    columns = enc.get_feature_names_out(categorical_variables)
)

# Review the DataFrame
encoded_df.head()

Unnamed: 0,APPLICATION_TYPE_T10,APPLICATION_TYPE_T12,APPLICATION_TYPE_T13,APPLICATION_TYPE_T14,APPLICATION_TYPE_T15,APPLICATION_TYPE_T17,APPLICATION_TYPE_T19,APPLICATION_TYPE_T2,APPLICATION_TYPE_T25,APPLICATION_TYPE_T29,...,INCOME_AMT_1-9999,INCOME_AMT_10000-24999,INCOME_AMT_100000-499999,INCOME_AMT_10M-50M,INCOME_AMT_1M-5M,INCOME_AMT_25000-99999,INCOME_AMT_50M+,INCOME_AMT_5M-10M,SPECIAL_CONSIDERATIONS_N,SPECIAL_CONSIDERATIONS_Y
0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0


### Step 4: Add the original DataFrame’s numerical variables to the DataFrame containing the encoded variables.

> **Note** To complete this step, you will employ the Pandas `concat()` function that was introduced earlier in this course. 

In [10]:
# Add the numerical variables from the original DataFrame to the one-hot encoding DataFrame
encoded_df = pd.concat(
    [
        applicant_data_df.drop(columns=categorical_variables),
        encoded_df
    ],
    axis=1
)

# Review the Dataframe
encoded_df

Unnamed: 0,STATUS,ASK_AMT,IS_SUCCESSFUL,APPLICATION_TYPE_T10,APPLICATION_TYPE_T12,APPLICATION_TYPE_T13,APPLICATION_TYPE_T14,APPLICATION_TYPE_T15,APPLICATION_TYPE_T17,APPLICATION_TYPE_T19,...,INCOME_AMT_1-9999,INCOME_AMT_10000-24999,INCOME_AMT_100000-499999,INCOME_AMT_10M-50M,INCOME_AMT_1M-5M,INCOME_AMT_25000-99999,INCOME_AMT_50M+,INCOME_AMT_5M-10M,SPECIAL_CONSIDERATIONS_N,SPECIAL_CONSIDERATIONS_Y
0,1,5000,1,1.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
1,1,108590,1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
2,1,5000,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
3,1,6692,1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
4,1,142590,1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
34294,1,5000,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
34295,1,5000,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
34296,1,5000,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
34297,1,5000,1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
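`pd.concat` with `axis=1` pairs rows by index label, not by position. In this notebook both DataFrames carry the default 0-based index, so the rows line up. The sketch below, which uses hypothetical toy frames (`numeric_df`, `dummy_df`), illustrates what could go wrong if rows had been dropped before encoding and how resetting the index avoids it.

```python
import pandas as pd

# Hypothetical numeric columns whose index has gaps (e.g., after dropping rows)
numeric_df = pd.DataFrame({"ASK_AMT": [5000, 108590, 6692]}, index=[0, 2, 5])

# An encoder returns a plain array, so a DataFrame built from it gets a fresh 0..n-1 index
dummy_df = pd.DataFrame({"ORGANIZATION_Trust": [1.0, 0.0, 1.0]})

# The indexes differ, so concat produces the union of labels and fills gaps with NaN
misaligned = pd.concat([numeric_df, dummy_df], axis=1)

# Resetting the index first keeps the rows paired one-to-one
aligned = pd.concat([numeric_df.reset_index(drop=True), dummy_df], axis=1)

print(misaligned.shape, aligned.shape)
```

Checking the shape and the `NaN` count of the concatenated result is a cheap way to confirm that no rows were mismatched.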


### Step 5: Using the preprocessed data, create the features (`X`) and target (`y`) datasets. The target dataset should be defined by the preprocessed DataFrame column “IS_SUCCESSFUL”. The remaining columns should define the features dataset. 



In [11]:
# Define the target set y using the IS_SUCCESSFUL column
y = encoded_df['IS_SUCCESSFUL']

# Display a sample of y
y[:5]


0    1
1    1
2    0
3    1
4    1
Name: IS_SUCCESSFUL, dtype: int64

In [12]:
# Define features set X by selecting all columns but IS_SUCCESSFUL
X = encoded_df.drop(columns=['IS_SUCCESSFUL'])

# Review the features DataFrame
X.head()


Unnamed: 0,STATUS,ASK_AMT,APPLICATION_TYPE_T10,APPLICATION_TYPE_T12,APPLICATION_TYPE_T13,APPLICATION_TYPE_T14,APPLICATION_TYPE_T15,APPLICATION_TYPE_T17,APPLICATION_TYPE_T19,APPLICATION_TYPE_T2,...,INCOME_AMT_1-9999,INCOME_AMT_10000-24999,INCOME_AMT_100000-499999,INCOME_AMT_10M-50M,INCOME_AMT_1M-5M,INCOME_AMT_25000-99999,INCOME_AMT_50M+,INCOME_AMT_5M-10M,SPECIAL_CONSIDERATIONS_N,SPECIAL_CONSIDERATIONS_Y
0,1,5000,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
1,1,108590,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
2,1,5000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
3,1,6692,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
4,1,142590,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0


### Step 6: Split the features and target sets into training and testing datasets.


In [13]:
# Split the preprocessed data into a training and testing dataset
# Assign the function a random_state equal to 1
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
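Because no `test_size` is passed, `train_test_split` falls back to its default of holding out 25% of the rows. As a rough check, the sketch below reproduces the split sizes with placeholder arrays (the names `placeholder_X` and `placeholder_y` are hypothetical), assuming the 34,299 rows reported for this dataset later in the notebook and the Keras default batch size of 32.

```python
import math
import numpy as np
from sklearn.model_selection import train_test_split

n_rows = 34299  # total applications reported for this dataset

# Placeholder arrays: only the row count matters for this check
placeholder_X = np.zeros((n_rows, 1))
placeholder_y = np.zeros(n_rows)

X_tr, X_te, y_tr, y_te = train_test_split(placeholder_X, placeholder_y, random_state=1)

# With no test_size given, scikit-learn holds out 25% of the rows
print(len(X_tr), len(X_te))

# Number of batches needed to evaluate the test set at 32 rows per batch
print(math.ceil(len(X_te) / 32))
```

The batch count should match the `268/268` progress counter that appears when the model is evaluated on the test data.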


### Step 7: Use scikit-learn's `StandardScaler` to scale the features data.

In [14]:
# Create a StandardScaler instance
scaler = StandardScaler()

# Fit the scaler to the features training dataset
X_scaler = scaler.fit(X_train)

# Scale the features training and testing datasets using the fitted scaler
X_train_scaled = X_scaler.transform(X_train)
X_test_scaled = X_scaler.transform(X_test)
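The scaler is fitted on the training rows only and then reused for the test rows, so no information from the test set leaks into preprocessing. A small sketch with synthetic data (all `toy_*` names are hypothetical) shows the consequence: the scaled training data has mean 0 and standard deviation 1 by construction, while the scaled test data is only approximately standardized.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Hypothetical feature on a large scale, similar in spirit to ASK_AMT
toy_train = rng.normal(loc=50_000, scale=10_000, size=(200, 1))
toy_test = rng.normal(loc=50_000, scale=10_000, size=(50, 1))

# The mean and standard deviation come from the training rows only
toy_scaler = StandardScaler().fit(toy_train)
toy_train_scaled = toy_scaler.transform(toy_train)
toy_test_scaled = toy_scaler.transform(toy_test)

# Training data is centered at 0 with unit variance by construction
print(toy_train_scaled.mean(), toy_train_scaled.std())

# Test data reuses the training statistics, so it is close to, but not exactly, 0 and 1
print(toy_test_scaled.mean(), toy_test_scaled.std())
```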


---

## Compile and Evaluate a Binary Classification Model Using a Neural Network

### Step 1: Create a deep neural network by assigning the number of input features, the number of layers, and the number of neurons on each layer using TensorFlow’s Keras.

> **Hint** You can start with a two-layer deep neural network model that uses the `relu` activation function for both layers.


In [15]:
# Define the number of inputs (features) to the model
# The number of features in the dataset is the length of any row
number_input_features = len(X_train.iloc[0])

# Review the number of features
number_input_features

116

In [16]:
# Define the number of neurons in the output layer
number_output_neurons = 1

In [17]:
# Define the number of hidden nodes for the first hidden layer
# Use the mean of the number of input features plus the number of output neurons
# Use the Python floor division (//) to return the quotient
hidden_nodes_layer1 = (number_input_features + number_output_neurons) // 2 

# Review the number of hidden nodes in the first layer
hidden_nodes_layer1

58

In [18]:
# Define the number of hidden nodes for the second hidden layer
# Use the mean of the number of hidden nodes in the first hidden layer plus the number of output neurons
# Use the Python floor division (//) to return the quotient
hidden_nodes_layer2 =  (hidden_nodes_layer1 + number_output_neurons) // 2 

# Review the number of hidden nodes in the second layer
hidden_nodes_layer2

29

In [19]:
# Create the Sequential model instance
nn = Sequential()




In [20]:
# Add the first hidden layer
# Specify the number of inputs, the number of hidden nodes, and the activation function
nn.add(Dense(units=hidden_nodes_layer1, input_dim=number_input_features, activation="relu"))

In [21]:
# Add the second hidden layer
# Specify the number of hidden nodes and the activation function
nn.add(Dense(units=hidden_nodes_layer2, activation="relu"))

In [22]:
# Add the output layer to the model specifying the number of output neurons and activation function
nn.add(Dense(units=number_output_neurons, activation="sigmoid"))

In [23]:
# Display the Sequential model summary
nn.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 58)                6786      
                                                                 
 dense_1 (Dense)             (None, 29)                1711      
                                                                 
 dense_2 (Dense)             (None, 1)                 30        
                                                                 
Total params: 8527 (33.31 KB)
Trainable params: 8527 (33.31 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
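The parameter counts in the summary can be verified by hand: a dense layer has one weight per input-unit pair plus one bias per unit. The helper below (`dense_param_count`, a hypothetical name) is a quick sketch of that arithmetic for the layer sizes used here.

```python
def dense_param_count(n_inputs: int, n_units: int) -> int:
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


# Inputs, first hidden layer, second hidden layer, output layer
layer_sizes = [116, 58, 29, 1]

# Pair each layer with the one that feeds it
per_layer = [dense_param_count(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(per_layer, sum(per_layer))  # [6786, 1711, 30] 8527
```

Most of the parameters sit in the first hidden layer, because it connects to all 116 input features.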


### Step 2: Compile and fit the model using the `binary_crossentropy` loss function, the `adam` optimizer, and the `accuracy` evaluation metric.


In [24]:
# Compile the Sequential model
nn.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])




In [25]:
# Fit the model using 50 epochs and the training data
fit_model = nn.fit(X_train_scaled, y_train, epochs=50)

Epoch 1/50
...
Epoch 50/50


### Step 3: Evaluate the model using the test data to determine the model’s loss and accuracy.


In [26]:
# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn.evaluate(X_test_scaled,y_test,verbose=2)

# Display the model loss and accuracy results
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

268/268 - 0s - loss: 0.5538 - accuracy: 0.7280 - 427ms/epoch - 2ms/step
Loss: 0.5538214445114136, Accuracy: 0.7280466556549072
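To make the two reported numbers concrete, here is a minimal sketch that computes binary cross-entropy and accuracy by hand with NumPy. The labels and probabilities are made up for illustration, and the calculation ignores the small numerical clipping that Keras applies internally, so treat it as an approximation of what `evaluate` reports rather than its exact implementation.

```python
import numpy as np

# Hypothetical true labels and sigmoid outputs for four applicants
y_true = np.array([1, 0, 1, 1])
y_prob = np.array([0.9, 0.2, 0.6, 0.4])

# Binary cross-entropy: average negative log-likelihood of the true labels
bce = -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

# Accuracy: share of rows where the probability thresholded at 0.5 matches the label
accuracy = np.mean((y_prob > 0.5).astype(int) == y_true)

print(f"Loss: {bce:.4f}, Accuracy: {accuracy:.2f}")
```

Note that the last applicant is misclassified (probability 0.4 for a true label of 1), which lowers accuracy to 0.75 and also contributes the largest term to the loss.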


### Step 4: Save and export your model to an HDF5 file, and name the file `AlphabetSoup.h5`. 


In [27]:
# Set the model's file path
file_path = Path("./Model_Files/AlphabetSoup.h5")

# Export your model to an HDF5 file
# Use save() rather than save_weights() so that the architecture is stored along with the weights
nn.save(file_path)

---

## Optimize the neural network model


### Define at least two new deep neural network models (resulting in the original plus 2 optimization attempts). With each, try to improve on your first model’s predictive accuracy.

> **Rewind** Recall that perfect accuracy has a value of 1, so accuracy improves as its value moves closer to 1. To optimize your model for a predictive accuracy as close to 1 as possible, you can use any or all of the following techniques:
>
> * Adjust the input data by dropping different features columns to ensure that no variables or outliers confuse the model.
>
> * Add more neurons (nodes) to a hidden layer.
>
> * Add more hidden layers.
>
> * Use different activation functions for the hidden layers.
>
> * Add to or reduce the number of epochs in the training regimen.


## Attempt 1: Adjusting the input data

The first attempt at improving the model will focus on adjusting the input data, without changing anything about the structure of the neural network. In the previous model, all of the non-binary numeric variables were treated as continuous variables and all of the categorical variables were treated as discrete variables.

It's possible that the model could improve with some slight changes to the variables.

For example, not all numeric variables should be evaluated on a continuous scale, because their relationship with the target variable may not depend on the magnitude of the difference between two values. A different approach is to bin the numeric values into groups and treat them as categorical variables.

As another example, categorical variables with extremely low class counts (e.g., a class that appears only once in the dataset) aren't likely to have a meaningful relationship with the target variable, because the class is too small to establish one. A different approach could be to group low-class-size categories together or to oversample the low classes to improve the model.

### Exploring the data

In [28]:
# Data exploration:
for column in applicant_data_df.columns:
    print(applicant_data_df[column].value_counts())
    print()

APPLICATION_TYPE
T3     27037
T4      1542
T6      1216
T5      1173
T19     1065
T8       737
T7       725
T10      528
T9       156
T13       66
T12       27
T2        16
T25        3
T14        3
T29        2
T15        2
T17        1
Name: count, dtype: int64

AFFILIATION
Independent         18480
CompanySponsored    15705
Family/Parent          64
National               33
Regional               13
Other                   4
Name: count, dtype: int64

CLASSIFICATION
C1000    17326
C2000     6074
C1200     4837
C3000     1918
C2100     1883
         ...  
C4120        1
C8210        1
C2561        1
C4500        1
C2150        1
Name: count, Length: 71, dtype: int64

USE_CASE
Preservation     28095
ProductDev        5671
CommunityServ      384
Heathcare          146
Other                3
Name: count, dtype: int64

ORGANIZATION
Trust           23515
Association     10255
Co-operative      486
Corporation        43
Name: count, dtype: int64

STATUS
1    34294
0        5
Name: count, 

### Transforming the ASK_AMT column

The ASK_AMT column is continuous, but I'm going to try binning the values into 20 groups.

In [29]:
# Check the ASK_AMT column to see whether any value is below 5000
print("Lowest ASK_AMTs:")
display(applicant_data_df.sort_values(by=['ASK_AMT'], ascending=True).head())
print()
print("Highest ASK_AMTs:")
display(applicant_data_df.sort_values(by=['ASK_AMT'], ascending=False).head())

Lowest ASK_AMTs:


| | APPLICATION_TYPE | AFFILIATION | CLASSIFICATION | USE_CASE | ORGANIZATION | STATUS | INCOME_AMT | SPECIAL_CONSIDERATIONS | ASK_AMT | IS_SUCCESSFUL |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | T10 | Independent | C1000 | ProductDev | Association | 1 | 0 | N | 5000 | 1 |
| 20358 | T5 | CompanySponsored | C3000 | ProductDev | Association | 1 | 0 | N | 5000 | 0 |
| 20356 | T6 | CompanySponsored | C2000 | ProductDev | Association | 1 | 0 | N | 5000 | 1 |
| 20355 | T3 | CompanySponsored | C1000 | Preservation | Trust | 1 | 0 | N | 5000 | 1 |
| 20354 | T3 | CompanySponsored | C1200 | Preservation | Association | 1 | 0 | N | 5000 | 0 |



Highest ASK_AMTs:


| | APPLICATION_TYPE | AFFILIATION | CLASSIFICATION | USE_CASE | ORGANIZATION | STATUS | INCOME_AMT | SPECIAL_CONSIDERATIONS | ASK_AMT | IS_SUCCESSFUL |
|---|---|---|---|---|---|---|---|---|---|---|
| 33175 | T3 | Independent | C1000 | Heathcare | Trust | 1 | 50M+ | N | 8597806340 | 1 |
| 34222 | T3 | Independent | C2000 | Preservation | Co-operative | 1 | 50M+ | N | 8556638692 | 0 |
| 33678 | T3 | Independent | C1200 | Heathcare | Trust | 1 | 50M+ | N | 5591584994 | 0 |
| 24795 | T3 | Independent | C2000 | Preservation | Trust | 1 | 50M+ | N | 4653011914 | 0 |
| 31337 | T3 | CompanySponsored | C1000 | Preservation | Trust | 1 | 50M+ | N | 3391919220 | 1 |


Note that the minimum ask amount is 5000, which explains why 5000 is requested far more often than any other amount.

In [30]:
# The biggest ASK_AMT bin is 5000 and it has 25,398 entries out of 34299 in the data set (74% of the applications ask for exactly $5K)
# Create 19 other bins of equal size from the remaining ~9K applications
# 20 total bins is arbitrary, but I also tested 5 bins and 10 bins
# 20 bins was selected because neural networks should be able to handle high amounts of complexity
number_of_extra_ask_amt_bins = 19

# Print the cutoff values
print(pd.qcut(applicant_data_df[applicant_data_df.ASK_AMT != 5000].ASK_AMT, q=number_of_extra_ask_amt_bins).value_counts())

# Save the upper boundary cutoff values
ask_amt_cutoff_values = pd.qcut(applicant_data_df[applicant_data_df.ASK_AMT != 5000].ASK_AMT, q=number_of_extra_ask_amt_bins).cat.categories.right

ASK_AMT
(5000.999, 9249.789]           469
(58254.421, 73901.105]         469
(1256226.0, 2630415.789]       469
(455419.579, 721749.947]       469
(157856.474, 217128.158]       469
(92757.316, 120183.895]        469
(28017.316, 36371.263]         469
(15038.684, 20860.368]         469
(7583539.474, 8597806340.0]    469
(46796.316, 58254.421]         468
(36371.263, 46796.316]         468
(73901.105, 92757.316]         468
(9249.789, 15038.684]          468
(120183.895, 157856.474]       468
(217128.158, 305688.158]       468
(305688.158, 455419.579]       468
(20860.368, 28017.316]         468
(721749.947, 1256226.0]        468
(2630415.789, 7583539.474]     468
Name: count, dtype: int64
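`pd.qcut` chooses its cutoffs from quantiles, which is why every bin above holds nearly the same number of applications. The following sketch shows the same behavior on a hypothetical series of the integers 1 through 100 (`toy_amounts`), including how the upper boundaries can be read from the resulting categories.

```python
import pandas as pd

# Hypothetical values 1..100 standing in for the ask amounts
toy_amounts = pd.Series(range(1, 101))

# Split into four quantile-based bins
toy_bins = pd.qcut(toy_amounts, q=4)

# Each bin holds the same number of values
print(toy_bins.value_counts().sort_index())

# The right edge of each interval is the bin's upper cutoff
toy_right_edges = toy_bins.cat.categories.right
print(list(toy_right_edges))
```

The last right edge is always the maximum of the data, so every value falls inside some bin when those edges are reused with `pd.cut`.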


In [31]:
# Create a DataFrame and map the bins to it
# Copy the column so that adding ASK_AMT_BIN does not trigger a SettingWithCopyWarning
ask_amt_df = applicant_data_df[['ASK_AMT']].copy()

# Initialize the cutoffs with the minimum bucket
ask_amt_cutoffs = [0, 5000.5]

# Append the rest of the cutoff values to the list
ask_amt_cutoffs.extend(ask_amt_cutoff_values)

# Create one label per bin (there is one fewer bin than there are cutoffs)
labels_array = [f"Group{i}" for i in range(1, len(ask_amt_cutoffs))]

# Create the bin column, cutting the data by the cutoff values listed in ask_amt_cutoffs, using the labels in labels_array
ask_amt_df['ASK_AMT_BIN'] = pd.cut(applicant_data_df['ASK_AMT'], ask_amt_cutoffs, labels=labels_array)

# Review the dataframe with the bins
ask_amt_df.head()

| | ASK_AMT | ASK_AMT_BIN |
|---|---|---|
| 0 | 5000 | Group1 |
| 1 | 108590 | Group11 |
| 2 | 5000 | Group1 |
| 3 | 6692 | Group2 |
| 4 | 142590 | Group12 |
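`pd.cut` assigns each value to a right-inclusive interval between consecutive cutoffs. The sketch below illustrates this with the first three upper cutoffs from the `qcut` output and a few hypothetical ask amounts (`toy_ask`). It also shows that a value above the last cutoff becomes `NaN`, which is why the final cutoff needs to be at least the column maximum.

```python
import pandas as pd

# Hypothetical ask amounts; the last one exceeds the final cutoff on purpose
toy_ask = pd.Series([5000, 6692, 9300, 15038, 20000])

# The minimum bucket plus the first two quantile cutoffs
toy_edges = [0, 5000.5, 9249.789, 15038.684]
toy_labels = [f"Group{i}" for i in range(1, len(toy_edges))]

# Intervals are right-inclusive by default: (0, 5000.5], (5000.5, 9249.789], ...
toy_groups = pd.cut(toy_ask, bins=toy_edges, labels=toy_labels)

print(toy_groups.tolist())
```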


### Transforming the CLASSIFICATION column

The CLASSIFICATION column is categorical, but many of the classifications have small class sizes.  I'm going to try binning the classifications with low class sizes.

In [32]:
# Enable the Jupyter notebook to display all of the rows without truncating the result
pd.set_option('display.max_rows', None)

# Check the Classification column for the number of entries with each classification
print(f"Classification classes:")
classification_bin_df = applicant_data_df[['CLASSIFICATION']].groupby(by=['CLASSIFICATION']).value_counts().sort_values(ascending=False)

# Review the number of entries for each classification
classification_bin_df

Classification classes:


CLASSIFICATION
C1000    17326
C2000     6074
C1200     4837
C3000     1918
C2100     1883
C7000      777
C1700      287
C4000      194
C5000      116
C1270      114
C2700      104
C2800       95
C7100       75
C1300       58
C1280       50
C1230       36
C1400       34
C2300       32
C7200       32
C1240       30
C8000       20
C7120       18
C1500       16
C1800       15
C6000       15
C1250       14
C8200       11
C1278       10
C1238       10
C1235        9
C1237        9
C7210        7
C1720        6
C4100        6
C2400        6
C1600        5
C1257        5
C2710        3
C1260        3
C0           3
C1267        2
C1246        2
C1256        2
C3200        2
C1234        2
C4500        1
C4120        1
C5200        1
C3700        1
C6100        1
C4200        1
C1900        1
C2600        1
C1728        1
C1236        1
C1245        1
C1248        1
C1283        1
C1370        1
C1570        1
C1580        1
C1732        1
C2570        1
C1820        1
C2150        1
C2170     

In [33]:
# Reset the display so the results are truncated again
pd.reset_option('display.max_rows')

In [34]:
# Notice that some of the classifications have very small class sizes
# Transform all classifications whose class sizes are less than 10 into a consolidated bin
# The threshold of 10 is arbitrary, but I also tried 30 and 50
# As before, preserving more variation in the column is assumed to be desirable because neural networks are able to handle significant amounts of complexity

# Create the dataframe from the count results above
classification_bin_df = pd.DataFrame(classification_bin_df)

# Initialize a column for binned results
classification_bin_df['CLASSIFICATION_BIN'] = 0

# Create the bin
# All classifications with class sizes of 10 or more retain their values and everything smaller gets grouped into "Other"
classification_bin_df['CLASSIFICATION_BIN'] = np.where((classification_bin_df['count']>=10), classification_bin_df.index, "Other")

# Review the dataframe
classification_bin_df

| CLASSIFICATION | count | CLASSIFICATION_BIN |
|---|---|---|
| C1000 | 17326 | C1000 |
| C2000 | 6074 | C2000 |
| C1200 | 4837 | C1200 |
| C3000 | 1918 | C3000 |
| C2100 | 1883 | C2100 |
| ... | ... | ... |
| C2190 | 1 | Other |
| C2380 | 1 | Other |
| C2500 | 1 | Other |
| C2561 | 1 | Other |


In [35]:
# Review the counts in each bin, including "Other"
classification_bin_df.groupby(by=['CLASSIFICATION_BIN']).sum().sort_values(by=['count'], ascending=False)

| CLASSIFICATION_BIN | count |
|---|---|
| C1000 | 17326 |
| C2000 | 6074 |
| C1200 | 4837 |
| C3000 | 1918 |
| C2100 | 1883 |
| C7000 | 777 |
| C1700 | 287 |
| C4000 | 194 |
| C5000 | 116 |
| C1270 | 114 |


In [36]:
# Create a mapping dictionary for creating the bin values from the classifications
# E.g., create a dictionary like {classification_1 : classification_bin, classification_2 : classification_bin}
# If the classification class size is >=10, then the classification_bin value retains the classification name
# If the classification class size is <10, then the classification_bin value becomes "Other" in the mapping dictionary
classification_map = dict(zip(classification_bin_df.index, classification_bin_df['CLASSIFICATION_BIN']))

In [37]:
# Review the classification mapping dictionary
classification_map

{'C1000': 'C1000',
 'C2000': 'C2000',
 'C1200': 'C1200',
 'C3000': 'C3000',
 'C2100': 'C2100',
 'C7000': 'C7000',
 'C1700': 'C1700',
 'C4000': 'C4000',
 'C5000': 'C5000',
 'C1270': 'C1270',
 'C2700': 'C2700',
 'C2800': 'C2800',
 'C7100': 'C7100',
 'C1300': 'C1300',
 'C1280': 'C1280',
 'C1230': 'C1230',
 'C1400': 'C1400',
 'C2300': 'C2300',
 'C7200': 'C7200',
 'C1240': 'C1240',
 'C8000': 'C8000',
 'C7120': 'C7120',
 'C1500': 'C1500',
 'C1800': 'C1800',
 'C6000': 'C6000',
 'C1250': 'C1250',
 'C8200': 'C8200',
 'C1278': 'C1278',
 'C1238': 'C1238',
 'C1235': 'Other',
 'C1237': 'Other',
 'C7210': 'Other',
 'C1720': 'Other',
 'C4100': 'Other',
 'C2400': 'Other',
 'C1600': 'Other',
 'C1257': 'Other',
 'C2710': 'Other',
 'C1260': 'Other',
 'C0': 'Other',
 'C1267': 'Other',
 'C1246': 'Other',
 'C1256': 'Other',
 'C3200': 'Other',
 'C1234': 'Other',
 'C4500': 'Other',
 'C4120': 'Other',
 'C5200': 'Other',
 'C3700': 'Other',
 'C6100': 'Other',
 'C4200': 'Other',
 'C1900': 'Other',
 'C2600': 'Othe

In [38]:
# Create a dataframe for the classification values
# Copy the column so that adding CLASSIFICATION_BIN does not trigger a SettingWithCopyWarning
classification_df = applicant_data_df[['CLASSIFICATION']].copy()

# Create a bin column that maps the classification value based on the mapping dictionary above
classification_df['CLASSIFICATION_BIN'] = classification_df['CLASSIFICATION'].map(classification_map)

# Review the dataframe
classification_df.head()

| | CLASSIFICATION | CLASSIFICATION_BIN |
|---|---|---|
| 0 | C1000 | C1000 |
| 1 | C2000 | C2000 |
| 2 | C3000 | C3000 |
| 3 | C2000 | C2000 |
| 4 | C1000 | C1000 |
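The same grouping of small classes can be expressed more compactly without building an intermediate mapping dictionary. This is an alternative sketch, not the approach used in the notebook; it runs on a hypothetical toy series (`toy_classes`) and uses the same threshold of 10.

```python
import pandas as pd

# Hypothetical classifications: two large classes and two small ones
toy_classes = pd.Series(["C1000"] * 12 + ["C2000"] * 10 + ["C1235"] * 9 + ["C4500"])
min_class_size = 10  # same threshold as the notebook

# Look up each row's class size, then keep the label only when the class is large enough
class_sizes = toy_classes.map(toy_classes.value_counts())
toy_binned = toy_classes.where(class_sizes >= min_class_size, "Other")

print(toy_binned.value_counts())
```

Keeping the threshold in a named variable makes it easier to rerun the experiment with other cutoffs, such as the 30 and 50 mentioned earlier.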


### Create and Scale the New Dataframes

In [39]:
# Create a copy of the applicant_data dataframe, so that the original data can be used later if the alternate model doesn't have improved performance
applicant_data_df_A1 = applicant_data_df.copy()

In [40]:
# Transform the SPECIAL_CONSIDERATIONS column into an int
# It was an "object" before, so the OneHotEncoder turned it into two binary variables even though Y/N options are already a binary variable

# Create the new column, mapping the "N" and "Y" values as 0s and 1s, respectively
applicant_data_df_A1['SPECIAL_CONSIDERATIONS_INT'] = applicant_data_df_A1['SPECIAL_CONSIDERATIONS'].map({"N":0,"Y":1})

# Cast the column values as integers
applicant_data_df_A1.SPECIAL_CONSIDERATIONS_INT = applicant_data_df_A1.SPECIAL_CONSIDERATIONS_INT.astype('int')

# Drop the old column
applicant_data_df_A1.drop(columns=['SPECIAL_CONSIDERATIONS'], inplace=True)
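
The mapping step above relies on every value in the column being either "N" or "Y". The sketch below, which uses hypothetical toy data, shows the intended result and also how a label missing from the dictionary turns into `NaN`, which is worth checking for before casting to an integer type.

```python
import pandas as pd

# Toy stand-in for the SPECIAL_CONSIDERATIONS column (hypothetical values)
special = pd.Series(["N", "Y", "N", "N"])

# Map the two labels to a single 0/1 column
special_int = special.map({"N": 0, "Y": 1})
print(special_int.tolist())

# A label that is not in the dictionary (here the made-up value "Unknown")
# becomes NaN, so count the unmapped values before casting to int
unexpected = pd.Series(["N", "Y", "Unknown"]).map({"N": 0, "Y": 1})
n_unmapped = int(unexpected.isna().sum())
print(n_unmapped)
```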

In [41]:
# Change the values from ASK_AMT to the values in ASK_AMT_BIN
applicant_data_df_A1['ASK_AMT'] = ask_amt_df['ASK_AMT_BIN']

# Change the values from CLASSIFICATION to the values in CLASSIFICATION_BIN
applicant_data_df_A1['CLASSIFICATION'] = classification_df['CLASSIFICATION_BIN']

# Rename the columns, because only the data was changed above
applicant_data_df_A1.rename(columns = {
    'ASK_AMT':'ASK_AMT_BIN',
    'CLASSIFICATION':'CLASSIFICATION_BIN'
}, inplace=True)

# Cast the values in ASK_AMT_BIN as objects
# Previously, the values in ASK_AMT were integers (a continuous variable), but they're now category labels (a discrete variable)
applicant_data_df_A1.ASK_AMT_BIN = applicant_data_df_A1.ASK_AMT_BIN.astype('object')

# Review the new dataframe
applicant_data_df_A1.head(10)

| | APPLICATION_TYPE | AFFILIATION | CLASSIFICATION_BIN | USE_CASE | ORGANIZATION | STATUS | INCOME_AMT | ASK_AMT_BIN | IS_SUCCESSFUL | SPECIAL_CONSIDERATIONS_INT |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | T10 | Independent | C1000 | ProductDev | Association | 1 | 0 | Group1 | 1 | 0 |
| 1 | T3 | Independent | C2000 | Preservation | Co-operative | 1 | 1-9999 | Group11 | 1 | 0 |
| 2 | T5 | CompanySponsored | C3000 | ProductDev | Association | 1 | 0 | Group1 | 0 | 0 |
| 3 | T3 | CompanySponsored | C2000 | Preservation | Trust | 1 | 10000-24999 | Group2 | 1 | 0 |
| 4 | T3 | Independent | C1000 | Heathcare | Trust | 1 | 100000-499999 | Group12 | 1 | 0 |
| 5 | T3 | Independent | C1200 | Preservation | Trust | 1 | 0 | Group1 | 1 | 0 |
| 6 | T3 | Independent | C1000 | Preservation | Trust | 1 | 100000-499999 | Group6 | 1 | 0 |
| 7 | T3 | Independent | C2000 | Preservation | Trust | 1 | 10M-50M | Group19 | 1 | 0 |
| 8 | T7 | Independent | C1000 | ProductDev | Trust | 1 | 1-9999 | Group11 | 1 | 0 |
| 9 | T5 | CompanySponsored | C3000 | ProductDev | Association | 1 | 0 | Group1 | 0 | 0 |


In [42]:
# Check the data types
applicant_data_df_A1.dtypes

APPLICATION_TYPE              object
AFFILIATION                   object
CLASSIFICATION_BIN            object
USE_CASE                      object
ORGANIZATION                  object
STATUS                         int64
INCOME_AMT                    object
ASK_AMT_BIN                   object
IS_SUCCESSFUL                  int64
SPECIAL_CONSIDERATIONS_INT     int32
dtype: object

In [43]:
# Create a list of categorical variables for the alternate model
categorical_variables_A1 = list(applicant_data_df_A1.dtypes[applicant_data_df_A1.dtypes == "object"].index)

# Display the list of categorical variables
categorical_variables_A1

['APPLICATION_TYPE',
 'AFFILIATION',
 'CLASSIFICATION_BIN',
 'USE_CASE',
 'ORGANIZATION',
 'INCOME_AMT',
 'ASK_AMT_BIN']

In [44]:
# Create a OneHotEncoder instance
# Note: scikit-learn 1.2 renamed this parameter to sparse_output, and version 1.4 removed sparse
enc = OneHotEncoder(sparse=False)

In [45]:
# Encode the categorical variables using OneHotEncoder
encoded_data_A1 = enc.fit_transform(applicant_data_df_A1[categorical_variables_A1])

In [46]:
# Create a DataFrame with the encoded variables
encoded_df_A1 = pd.DataFrame(
    encoded_data_A1,
    columns = enc.get_feature_names_out(categorical_variables_A1)
)

# Review the DataFrame
encoded_df_A1.head()

Unnamed: 0,APPLICATION_TYPE_T10,APPLICATION_TYPE_T12,APPLICATION_TYPE_T13,APPLICATION_TYPE_T14,APPLICATION_TYPE_T15,APPLICATION_TYPE_T17,APPLICATION_TYPE_T19,APPLICATION_TYPE_T2,APPLICATION_TYPE_T25,APPLICATION_TYPE_T29,...,ASK_AMT_BIN_Group19,ASK_AMT_BIN_Group2,ASK_AMT_BIN_Group20,ASK_AMT_BIN_Group3,ASK_AMT_BIN_Group4,ASK_AMT_BIN_Group5,ASK_AMT_BIN_Group6,ASK_AMT_BIN_Group7,ASK_AMT_BIN_Group8,ASK_AMT_BIN_Group9
0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


### Add the original DataFrame’s numerical variables to the DataFrame containing the encoded variables.

In [47]:
# Add the numerical variables from the original DataFrame to the one-hot encoding DataFrame
encoded_df_A1 = pd.concat(
    [
        applicant_data_df_A1.drop(columns=categorical_variables_A1),
        encoded_df_A1
    ],
    axis=1
)

# Review the Dataframe
encoded_df_A1

Unnamed: 0,STATUS,IS_SUCCESSFUL,SPECIAL_CONSIDERATIONS_INT,APPLICATION_TYPE_T10,APPLICATION_TYPE_T12,APPLICATION_TYPE_T13,APPLICATION_TYPE_T14,APPLICATION_TYPE_T15,APPLICATION_TYPE_T17,APPLICATION_TYPE_T19,...,ASK_AMT_BIN_Group19,ASK_AMT_BIN_Group2,ASK_AMT_BIN_Group20,ASK_AMT_BIN_Group3,ASK_AMT_BIN_Group4,ASK_AMT_BIN_Group5,ASK_AMT_BIN_Group6,ASK_AMT_BIN_Group7,ASK_AMT_BIN_Group8,ASK_AMT_BIN_Group9
0,1,1,0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,1,1,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,1,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,1,1,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,1,1,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
34294,1,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
34295,1,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
34296,1,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
34297,1,1,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [48]:
# Review the columns to make sure the bins are accurate
print(encoded_df_A1.columns)

Index(['STATUS', 'IS_SUCCESSFUL', 'SPECIAL_CONSIDERATIONS_INT',
       'APPLICATION_TYPE_T10', 'APPLICATION_TYPE_T12', 'APPLICATION_TYPE_T13',
       'APPLICATION_TYPE_T14', 'APPLICATION_TYPE_T15', 'APPLICATION_TYPE_T17',
       'APPLICATION_TYPE_T19', 'APPLICATION_TYPE_T2', 'APPLICATION_TYPE_T25',
       'APPLICATION_TYPE_T29', 'APPLICATION_TYPE_T3', 'APPLICATION_TYPE_T4',
       'APPLICATION_TYPE_T5', 'APPLICATION_TYPE_T6', 'APPLICATION_TYPE_T7',
       'APPLICATION_TYPE_T8', 'APPLICATION_TYPE_T9',
       'AFFILIATION_CompanySponsored', 'AFFILIATION_Family/Parent',
       'AFFILIATION_Independent', 'AFFILIATION_National', 'AFFILIATION_Other',
       'AFFILIATION_Regional', 'CLASSIFICATION_BIN_C1000',
       'CLASSIFICATION_BIN_C1200', 'CLASSIFICATION_BIN_C1230',
       'CLASSIFICATION_BIN_C1238', 'CLASSIFICATION_BIN_C1240',
       'CLASSIFICATION_BIN_C1250', 'CLASSIFICATION_BIN_C1270',
       'CLASSIFICATION_BIN_C1278', 'CLASSIFICATION_BIN_C1280',
       'CLASSIFICATION_BIN_C1300', '

### Using the preprocessed data, create the features (`X`) and target (`y`) datasets. The target dataset should be defined by the preprocessed DataFrame column “IS_SUCCESSFUL”. The remaining columns should define the features dataset. 



In [49]:
# Define the target set y using the IS_SUCCESSFUL column
y_A1 = encoded_df_A1['IS_SUCCESSFUL']

# Display a sample of y
y_A1[:5]


0    1
1    1
2    0
3    1
4    1
Name: IS_SUCCESSFUL, dtype: int64

In [50]:
# Define features set X by selecting all columns but IS_SUCCESSFUL
X_A1 = encoded_df_A1.drop(columns=['IS_SUCCESSFUL'])

# Review the features DataFrame
X_A1.head()


Unnamed: 0,STATUS,SPECIAL_CONSIDERATIONS_INT,APPLICATION_TYPE_T10,APPLICATION_TYPE_T12,APPLICATION_TYPE_T13,APPLICATION_TYPE_T14,APPLICATION_TYPE_T15,APPLICATION_TYPE_T17,APPLICATION_TYPE_T19,APPLICATION_TYPE_T2,...,ASK_AMT_BIN_Group19,ASK_AMT_BIN_Group2,ASK_AMT_BIN_Group20,ASK_AMT_BIN_Group3,ASK_AMT_BIN_Group4,ASK_AMT_BIN_Group5,ASK_AMT_BIN_Group6,ASK_AMT_BIN_Group7,ASK_AMT_BIN_Group8,ASK_AMT_BIN_Group9
0,1,0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,1,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,1,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,1,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,1,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


### Split the features and target sets into training and testing datasets.


In [51]:
# Split the preprocessed data into a training and testing dataset
# Assign the function a random_state equal to 1
X_train_A1, X_test_A1, y_train_A1, y_test_A1 = train_test_split(X_A1, y_A1, random_state=1)
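
As a rough check on the split sizes, the arithmetic can be sketched as follows. The row count is an assumption (the displayed index suggests roughly 34,299 rows), as are the defaults: a 25% test share for `train_test_split` when `test_size` is not passed, and a Keras batch size of 32 during evaluation.

```python
import math

# Assumption: 34,299 rows in the preprocessed DataFrame
n_samples = 34299

# Assumption: no test_size is passed, so scikit-learn's default of 0.25
# applies and the test count is rounded up
test_fraction = 0.25
n_test = math.ceil(test_fraction * n_samples)
n_train = n_samples - n_test

# Assumption: Keras evaluates in batches of 32 (its default batch_size)
batch_size = 32
n_eval_steps = math.ceil(n_test / batch_size)

print(n_train, n_test, n_eval_steps)
```

Under these assumptions the number of evaluation steps agrees with the `268/268` progress counter shown in the evaluation output.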


### Use scikit-learn's `StandardScaler` to scale the features data.

In [52]:
# Create a StandardScaler instance
scaler = StandardScaler()

# Fit the scaler to the features training dataset
X_scaler_A1 = scaler.fit(X_train_A1)

# Scale the features training and testing datasets using the fitted scaler
X_train_scaled_A1 = X_scaler_A1.transform(X_train_A1)
X_test_scaled_A1 = X_scaler_A1.transform(X_test_A1)
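
The fit-then-transform pattern above can be illustrated by hand. This is a minimal sketch with hypothetical numbers, using only the standard library rather than scikit-learn: the mean and population standard deviation are learned from the training values only, and the same statistics are then applied to both splits.

```python
import statistics

# Hypothetical single feature column, already split into train and test
train_col = [5000.0, 5000.0, 10000.0, 20000.0]
test_col = [5000.0, 40000.0]

# "Fit": learn the mean and population standard deviation from the training data only
mean = statistics.fmean(train_col)
std = statistics.pstdev(train_col)

# "Transform": apply the training statistics to both splits
train_scaled = [(x - mean) / std for x in train_col]
test_scaled = [(x - mean) / std for x in test_col]

# The scaled training column has mean 0 and standard deviation 1 (up to rounding);
# the scaled test column generally does not, because it reuses the training statistics
print(round(statistics.fmean(train_scaled), 6), round(statistics.pstdev(train_scaled), 6))
print([round(x, 3) for x in test_scaled])
```

Fitting on the training split alone keeps information about the test data from leaking into the preprocessing step.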


### Create Alternative Model 1

In [53]:
# Define the number of inputs (features) to the model
# The number of features in the dataset is the length of any row
number_input_features_A1 = len(X_train_A1.iloc[0])

# Review the number of features
number_input_features_A1

93

In [54]:
# Define the number of neurons in the output layer
number_output_neurons_A1 = 1

In [55]:
# Define the number of hidden nodes for the first hidden layer
# Use the mean of the number of input features plus the number of output neurons
# Use the Python floor division (//) to return the quotient 
hidden_nodes_layer1_A1 = (number_input_features_A1 + number_output_neurons_A1) // 2

# Review the number of hidden nodes in the first layer
hidden_nodes_layer1_A1

47

In [56]:
# Define the number of hidden nodes for the second hidden layer
# Use the mean of the number of hidden nodes in the first hidden layer plus the number of output neurons
# Use the Python floor division (//) to return the quotient 
hidden_nodes_layer2_A1 = (hidden_nodes_layer1_A1 + number_output_neurons_A1) // 2

# Review the number of hidden nodes in the second layer
hidden_nodes_layer2_A1

24
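
The layer-size rule used in the cells above can be written as a small helper. This is a sketch only; the function name `halving_layer_sizes` is hypothetical and does not appear in the notebook.

```python
def halving_layer_sizes(n_inputs, n_outputs, n_layers):
    """Return hidden-layer sizes where each size is the floor of the mean of
    the previous layer's size and the number of output neurons."""
    sizes = []
    previous = n_inputs
    for _ in range(n_layers):
        previous = (previous + n_outputs) // 2
        sizes.append(previous)
    return sizes


# 93 input features, 1 output neuron, 2 hidden layers
print(halving_layer_sizes(93, 1, 2))   # [47, 24]

# 116 input features, 1 output neuron, 5 hidden layers
print(halving_layer_sizes(116, 1, 5))  # [58, 29, 15, 8, 4]
```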

In [57]:
# Create the Sequential model instance
nn_A1 = Sequential()

In [58]:
# First hidden layer
# Specify the number of inputs, the number of hidden nodes, and the activation function
nn_A1.add(Dense(units=hidden_nodes_layer1_A1, input_dim=number_input_features_A1, activation="relu"))

# Second hidden layer
# Specify the number of hidden nodes and the activation function
nn_A1.add(Dense(units=hidden_nodes_layer2_A1, activation="relu"))

# Output layer
nn_A1.add(Dense(units=1, activation="sigmoid"))

# Check the structure of the model
nn_A1.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_3 (Dense)             (None, 47)                4418      
                                                                 
 dense_4 (Dense)             (None, 24)                1152      
                                                                 
 dense_5 (Dense)             (None, 1)                 25        
                                                                 
Total params: 5595 (21.86 KB)
Trainable params: 5595 (21.86 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
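
The parameter counts in the summary can be verified by hand: a fully connected layer has one weight per input-unit pair plus one bias per unit. The helper below is a sketch with a hypothetical name, not part of Keras.

```python
def dense_param_counts(n_inputs, layer_units):
    """Return the number of trainable parameters in each fully connected layer."""
    counts = []
    fan_in = n_inputs
    for units in layer_units:
        # weights (fan_in * units) plus biases (units)
        counts.append(fan_in * units + units)
        fan_in = units
    return counts


counts = dense_param_counts(93, [47, 24, 1])
print(counts, sum(counts))  # [4418, 1152, 25] 5595
```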


In [59]:
# Compile the Sequential model
nn_A1.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

In [60]:
# Fit the model using 50 epochs and the training data
fit_model_A1 = nn_A1.fit(X_train_scaled_A1, y_train_A1, epochs=50)

Epoch 1/50
...
Epoch 50/50


### Evaluate Alternative Model 1 using the test data to determine the model’s loss and accuracy.

In [61]:
# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn_A1.evaluate(X_test_scaled_A1,y_test_A1,verbose=2)

# Display the model loss and accuracy results
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

268/268 - 0s - loss: 0.5700 - accuracy: 0.7241 - 431ms/epoch - 2ms/step
Loss: 0.5700469017028809, Accuracy: 0.7240816354751587
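
To make the two reported metrics concrete, the sketch below computes mean binary cross-entropy and accuracy by hand on made-up labels and predicted probabilities. It follows the standard definitions and is not Keras's internal implementation; the clipping constant `eps` and the 0.5 threshold are illustrative assumptions.

```python
import math


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def accuracy(y_true, y_prob, threshold=0.5):
    """Share of samples whose thresholded prediction matches the label."""
    hits = sum(int(p > threshold) == y for y, p in zip(y_true, y_prob))
    return hits / len(y_true)


# Made-up labels and predicted probabilities
y_true = [1, 1, 0, 1, 0]
y_prob = [0.9, 0.6, 0.4, 0.3, 0.2]

loss = binary_crossentropy(y_true, y_prob)
acc = accuracy(y_true, y_prob)
print(round(loss, 4), acc)
```

Loss reflects how confident the predictions are, while accuracy only counts which side of the threshold they fall on, so the two metrics can move independently between models.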


---

## Attempt 2: Increasing the number of layers

The second attempt at improving the model will keep the original data set (and test/training split), but it will add three more hidden layers for a total of five.  The general structure of the neural network will remain the same, and the number of neurons will continue to be halved each layer.
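
Because each attempt repeats the same construction steps with different layer sizes, the model definition could be wrapped in a helper. The following is a sketch under assumptions: the name `build_classifier` is hypothetical, the `tf.keras` import path is assumed, and an explicit `Input` layer is swapped in for the `input_dim` argument used in the cells.

```python
def build_classifier(n_inputs, layer_sizes, activation="relu"):
    """Build and compile a binary classifier with the given hidden-layer sizes."""
    # Imported inside the function so this sketch can be read and loaded
    # without TensorFlow installed; calling it does require TensorFlow
    import tensorflow as tf

    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(n_inputs,)))
    for units in layer_sizes:
        model.add(tf.keras.layers.Dense(units=units, activation=activation))
    # Single sigmoid output neuron for binary classification
    model.add(tf.keras.layers.Dense(units=1, activation="sigmoid"))
    model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
    return model


# Hidden-layer sizes for this attempt, as computed in the cells that follow
sizes_A2 = [58, 29, 15, 8, 4]
print(sizes_A2)

# Example call (requires TensorFlow):
# nn_A2 = build_classifier(116, sizes_A2)
```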

In [62]:
# Check the original data to make sure it wasn't accidentally changed
print(f"X_train:")
display(X_train.head())
print()
print(f"X_test:")
display(X_test.head())
print()
print(f"y_train:")
display(y_train.head())
print()
print(f"y_test:")
display(y_test.head())
print()
print(f"X_train_scaled:")
display(pd.DataFrame(X_train_scaled).head())
print()
print(f"X_test_scaled:")
display(pd.DataFrame(X_test_scaled).head())

X_train:


Unnamed: 0,STATUS,ASK_AMT,APPLICATION_TYPE_T10,APPLICATION_TYPE_T12,APPLICATION_TYPE_T13,APPLICATION_TYPE_T14,APPLICATION_TYPE_T15,APPLICATION_TYPE_T17,APPLICATION_TYPE_T19,APPLICATION_TYPE_T2,...,INCOME_AMT_1-9999,INCOME_AMT_10000-24999,INCOME_AMT_100000-499999,INCOME_AMT_10M-50M,INCOME_AMT_1M-5M,INCOME_AMT_25000-99999,INCOME_AMT_50M+,INCOME_AMT_5M-10M,SPECIAL_CONSIDERATIONS_N,SPECIAL_CONSIDERATIONS_Y
10679,1,5000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
5052,1,5000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
9990,1,5000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
25173,1,5000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
11405,1,5000,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0



X_test:


Unnamed: 0,STATUS,ASK_AMT,APPLICATION_TYPE_T10,APPLICATION_TYPE_T12,APPLICATION_TYPE_T13,APPLICATION_TYPE_T14,APPLICATION_TYPE_T15,APPLICATION_TYPE_T17,APPLICATION_TYPE_T19,APPLICATION_TYPE_T2,...,INCOME_AMT_1-9999,INCOME_AMT_10000-24999,INCOME_AMT_100000-499999,INCOME_AMT_10M-50M,INCOME_AMT_1M-5M,INCOME_AMT_25000-99999,INCOME_AMT_50M+,INCOME_AMT_5M-10M,SPECIAL_CONSIDERATIONS_N,SPECIAL_CONSIDERATIONS_Y
19398,1,5000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
14725,1,5000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
14209,1,22544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0
25213,1,5000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
5886,1,10674,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0



y_train:


10679    0
5052     1
9990     1
25173    1
11405    1
Name: IS_SUCCESSFUL, dtype: int64


y_test:


19398    1
14725    0
14209    1
25213    1
5886     0
Name: IS_SUCCESSFUL, dtype: int64


X_train_scaled:


Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,106,107,108,109,110,111,112,113,114,115
0,0.0108,-0.029571,-0.123588,-0.029915,-0.045438,-0.008818,-0.006235,-0.006235,-0.178463,-0.020683,...,-0.150669,-0.124397,-0.331791,-0.082045,-0.16762,-0.348999,-0.06402,-0.075291,0.029257,-0.029257
1,0.0108,-0.029571,-0.123588,-0.029915,-0.045438,-0.008818,-0.006235,-0.006235,-0.178463,-0.020683,...,-0.150669,-0.124397,-0.331791,-0.082045,-0.16762,-0.348999,-0.06402,-0.075291,0.029257,-0.029257
2,0.0108,-0.029571,-0.123588,-0.029915,-0.045438,-0.008818,-0.006235,-0.006235,-0.178463,-0.020683,...,-0.150669,-0.124397,-0.331791,-0.082045,-0.16762,-0.348999,-0.06402,-0.075291,0.029257,-0.029257
3,0.0108,-0.029571,-0.123588,-0.029915,-0.045438,-0.008818,-0.006235,-0.006235,-0.178463,-0.020683,...,-0.150669,-0.124397,-0.331791,-0.082045,-0.16762,-0.348999,-0.06402,-0.075291,0.029257,-0.029257
4,0.0108,-0.029571,8.091371,-0.029915,-0.045438,-0.008818,-0.006235,-0.006235,-0.178463,-0.020683,...,-0.150669,-0.124397,-0.331791,-0.082045,-0.16762,-0.348999,-0.06402,-0.075291,0.029257,-0.029257



X_test_scaled:


Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,106,107,108,109,110,111,112,113,114,115
0,0.0108,-0.029571,-0.123588,-0.029915,-0.045438,-0.008818,-0.006235,-0.006235,-0.178463,-0.020683,...,-0.150669,-0.124397,-0.331791,-0.082045,-0.16762,-0.348999,-0.06402,-0.075291,0.029257,-0.029257
1,0.0108,-0.029571,-0.123588,-0.029915,-0.045438,-0.008818,-0.006235,-0.006235,-0.178463,-0.020683,...,-0.150669,-0.124397,-0.331791,-0.082045,-0.16762,-0.348999,-0.06402,-0.075291,0.029257,-0.029257
2,0.0108,-0.029378,-0.123588,-0.029915,-0.045438,-0.008818,-0.006235,-0.006235,-0.178463,-0.020683,...,-0.150669,-0.124397,-0.331791,-0.082045,-0.16762,2.865339,-0.06402,-0.075291,0.029257,-0.029257
3,0.0108,-0.029571,-0.123588,-0.029915,-0.045438,-0.008818,-0.006235,-0.006235,-0.178463,-0.020683,...,-0.150669,-0.124397,-0.331791,-0.082045,-0.16762,-0.348999,-0.06402,-0.075291,0.029257,-0.029257
4,0.0108,-0.029508,-0.123588,-0.029915,-0.045438,-0.008818,-0.006235,-0.006235,-0.178463,-0.020683,...,-0.150669,-0.124397,3.013949,-0.082045,-0.16762,-0.348999,-0.06402,-0.075291,0.029257,-0.029257


In [63]:
# Define the number of inputs (features) to the model
number_input_features_A2 = len(X_train.iloc[0])

# Review the number of features
number_input_features_A2

116

In [64]:
# Define the number of neurons in the output layer
number_output_neurons_A2 = 1

In [65]:
# Define the number of hidden nodes for the first hidden layer
# Use the mean of the number of input features plus the number of output neurons
# Use Python floor division (//) to return the quotient
hidden_nodes_layer1_A2 = (number_input_features_A2 + number_output_neurons_A2) // 2

# Review the number of hidden nodes in the first layer
hidden_nodes_layer1_A2

58

In [66]:
# Define the number of hidden nodes for the second hidden layer
# Use the mean of the number of hidden nodes in the previous hidden layer plus the number of output neurons
# Use Python floor division (//) to return the quotient
hidden_nodes_layer2_A2 = (hidden_nodes_layer1_A2 + number_output_neurons_A2) // 2

# Review the number of hidden nodes in the second layer
hidden_nodes_layer2_A2

29

In [67]:
# Define the number of hidden nodes for the third hidden layer
# Use the mean of the number of hidden nodes in the previous hidden layer plus the number of output neurons
# Use Python floor division (//) to return the quotient
hidden_nodes_layer3_A2 = (hidden_nodes_layer2_A2 + number_output_neurons_A2) // 2

# Review the number of hidden nodes in the third layer
hidden_nodes_layer3_A2

15

In [68]:
# Define the number of hidden nodes for the fourth hidden layer
# Use the mean of the number of hidden nodes in the previous hidden layer plus the number of output neurons
# Use Python floor division (//) to return the quotient
hidden_nodes_layer4_A2 = (hidden_nodes_layer3_A2 + number_output_neurons_A2) // 2

# Review the number of hidden nodes in the fourth layer
hidden_nodes_layer4_A2

8

In [69]:
# Define the number of hidden nodes for the fifth hidden layer
# Use the mean of the number of hidden nodes in the previous hidden layer plus the number of output neurons
# Use Python floor division (//) to return the quotient
hidden_nodes_layer5_A2 = (hidden_nodes_layer4_A2 + number_output_neurons_A2) // 2

# Review the number of hidden nodes in the fifth layer
hidden_nodes_layer5_A2

4

In [70]:
# Create the Sequential model instance
nn_A2 = Sequential()

In [71]:
# Add the first hidden layer
# Specify the number of inputs, the number of hidden nodes, and the activation function
nn_A2.add(Dense(units=hidden_nodes_layer1_A2, input_dim=number_input_features_A2, activation="relu"))

# Add the second hidden layer
# Specify the number of hidden nodes and the activation function
nn_A2.add(Dense(units=hidden_nodes_layer2_A2, activation="relu"))

# Add the third hidden layer
# Specify the number of hidden nodes and the activation function
nn_A2.add(Dense(units=hidden_nodes_layer3_A2, activation="relu"))

# Add the fourth hidden layer
# Specify the number of hidden nodes and the activation function
nn_A2.add(Dense(units=hidden_nodes_layer4_A2, activation="relu"))

# Add the fifth hidden layer
# Specify the number of hidden nodes and the activation function
nn_A2.add(Dense(units=hidden_nodes_layer5_A2, activation="relu"))

# Output layer
nn_A2.add(Dense(units=1, activation="sigmoid"))

# Check the structure of the model
nn_A2.summary()

Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_6 (Dense)             (None, 58)                6786      
                                                                 
 dense_7 (Dense)             (None, 29)                1711      
                                                                 
 dense_8 (Dense)             (None, 15)                450       
                                                                 
 dense_9 (Dense)             (None, 8)                 128       
                                                                 
 dense_10 (Dense)            (None, 4)                 36        
                                                                 
 dense_11 (Dense)            (None, 1)                 5         
                                                                 
Total params: 9116 (35.61 KB)
Trainable params: 9116 (

In [72]:
# Compile the model
nn_A2.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])


In [73]:
# Fit the model
fit_model_A2 = nn_A2.fit(X_train_scaled, y_train, epochs=50)

Epoch 1/50
...
Epoch 50/50


In [74]:
# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn_A2.evaluate(X_test_scaled,y_test,verbose=2)

# Display the model loss and accuracy results
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

268/268 - 0s - loss: 0.5569 - accuracy: 0.7305 - 435ms/epoch - 2ms/step
Loss: 0.5568841099739075, Accuracy: 0.7304956316947937


---

## Attempt 3: Flattening the Neural Network

The third attempt at improving the model will keep the original data set (and test/training split), but it will flatten the structure of the neural network.  The model will have two hidden layers, like the original model, but instead of decreasing in size, the second hidden layer will have the same number of neurons as the first.

In [75]:
# Define the number of inputs (features) to the model
number_input_features_A3 = len(X_train.iloc[0])

# Review the number of features
number_input_features_A3

116

In [76]:
# Define the number of neurons in the output layer
number_output_neurons_A3 = 1

In [77]:
# Define the number of hidden nodes for the first hidden layer
# Use the mean of the number of input features plus the number of output neurons
# Use Python floor division (//) to return the quotient
hidden_nodes_layer1_A3 = (number_input_features_A3 + number_output_neurons_A3) // 2

# Review the number of hidden nodes in the first layer
hidden_nodes_layer1_A3

58

In [78]:
# Define the number of hidden nodes for the second hidden layer
# Use the same number of neurons as the previous hidden layer
hidden_nodes_layer2_A3 = hidden_nodes_layer1_A3

# Review the number of hidden nodes in the second layer
hidden_nodes_layer2_A3

58

In [79]:
# Create the Sequential model instance
nn_A3 = Sequential()

In [80]:
# Add the first hidden layer
# Specify the number of inputs, the number of hidden nodes, and the activation function
nn_A3.add(Dense(units=hidden_nodes_layer1_A3, input_dim=number_input_features_A3, activation="relu"))

# Add the second hidden layer
# Specify the number of hidden nodes and the activation function
nn_A3.add(Dense(units=hidden_nodes_layer2_A3, activation="relu"))

# Output layer
nn_A3.add(Dense(units=1, activation="sigmoid"))

# Check the structure of the model
nn_A3.summary()

Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_12 (Dense)            (None, 58)                6786      
                                                                 
 dense_13 (Dense)            (None, 58)                3422      
                                                                 
 dense_14 (Dense)            (None, 1)                 59        
                                                                 
Total params: 10267 (40.11 KB)
Trainable params: 10267 (40.11 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________


In [81]:
# Compile the model
nn_A3.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

In [82]:
# Fit the model
fit_model_A3 = nn_A3.fit(X_train_scaled, y_train, epochs=50)

Epoch 1/50
...
Epoch 50/50


In [83]:
# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn_A3.evaluate(X_test_scaled,y_test,verbose=2)

# Display the model loss and accuracy results
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

268/268 - 0s - loss: 0.5596 - accuracy: 0.7298 - 427ms/epoch - 2ms/step
Loss: 0.5596227049827576, Accuracy: 0.7297959327697754


---

## Attempt 4: Flat Network with More Layers

The fourth attempt at improving the model will keep the original data set (and test/training split), but it will flatten the structure of the neural network and add more layers.  This model will have five hidden layers instead of two, and every hidden layer will have the same size of 58 neurons.

In [84]:
# Define the number of inputs (features) to the model
number_input_features_A4 = len(X_train.iloc[0])

# Review the number of features
number_input_features_A4

116

In [85]:
# Define the number of neurons in the output layer
number_output_neurons_A4 = 1

In [86]:
# Define the number of hidden nodes for the first hidden layer
# Use the mean of the number of input features plus the number of output neurons
# Use Python floor division (//) to return the quotient
hidden_nodes_layer1_A4 = (number_input_features_A4 + number_output_neurons_A4) // 2

# Review the number of hidden nodes in the first layer
hidden_nodes_layer1_A4

58

In [87]:
# Define the number of hidden nodes for the second hidden layer
# Use the same number of neurons as the previous hidden layer
hidden_nodes_layer2_A4 = hidden_nodes_layer1_A4

# Review the number of hidden nodes in the second layer
hidden_nodes_layer2_A4

58

In [88]:
# Define the number of hidden nodes for the third hidden layer
# Use the same number of neurons as the previous hidden layer
hidden_nodes_layer3_A4 = hidden_nodes_layer2_A4

# Review the number of hidden nodes in the third layer
hidden_nodes_layer3_A4

58

In [89]:
# Define the number of hidden nodes for the fourth hidden layer
# Use the same number of neurons as the previous hidden layer
hidden_nodes_layer4_A4 = hidden_nodes_layer3_A4

# Review the number of hidden nodes in the fourth layer
hidden_nodes_layer4_A4

58

In [90]:
# Define the number of hidden nodes for the fifth hidden layer
# Use the same number of neurons as the previous hidden layer
hidden_nodes_layer5_A4 = hidden_nodes_layer4_A4

# Review the number of hidden nodes in the fifth layer
hidden_nodes_layer5_A4

58

In [91]:
# Create the Sequential model instance
nn_A4 = Sequential()

In [92]:
# Add the first hidden layer
# Specify the number of inputs, the number of hidden nodes, and the activation function
nn_A4.add(Dense(units=hidden_nodes_layer1_A4, input_dim=number_input_features_A4, activation="relu"))

# Add the second hidden layer
# Specify the number of hidden nodes and the activation function
nn_A4.add(Dense(units=hidden_nodes_layer2_A4, activation="relu"))

# Add the third hidden layer
# Specify the number of hidden nodes and the activation function
nn_A4.add(Dense(units=hidden_nodes_layer3_A4, activation="relu"))

# Add the fourth hidden layer
# Specify the number of hidden nodes and the activation function
nn_A4.add(Dense(units=hidden_nodes_layer4_A4, activation="relu"))

# Add the fifth hidden layer
# Specify the number of hidden nodes and the activation function
nn_A4.add(Dense(units=hidden_nodes_layer5_A4, activation="relu"))

# Output layer
nn_A4.add(Dense(units=1, activation="sigmoid"))

# Check the structure of the model
nn_A4.summary()

Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_15 (Dense)            (None, 58)                6786      
                                                                 
 dense_16 (Dense)            (None, 58)                3422      
                                                                 
 dense_17 (Dense)            (None, 58)                3422      
                                                                 
 dense_18 (Dense)            (None, 58)                3422      
                                                                 
 dense_19 (Dense)            (None, 58)                3422      
                                                                 
 dense_20 (Dense)            (None, 1)                 59        
                                                                 
Total params: 20533 (80.21 KB)
Trainable params: 20533

In [93]:
# Compile the model
nn_A4.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

In [94]:
# Fit the model
fit_model_A4 = nn_A4.fit(X_train_scaled, y_train, epochs=50)

Epoch 1/50
...
Epoch 50/50


In [95]:
# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn_A4.evaluate(X_test_scaled,y_test,verbose=2)

# Display the model loss and accuracy results
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

268/268 - 1s - loss: 0.5558 - accuracy: 0.7311 - 563ms/epoch - 2ms/step
Loss: 0.5558373332023621, Accuracy: 0.7310787439346313


---

## Attempt 5: Changing the Activation Function to Sigmoid

The fifth attempt at improving the model will keep the original data set (and test/training split), but it will change the activation functions of the hidden layers to sigmoid functions.  Rectified Linear Unit (ReLU or relu) is a popular activation function, but it can sometimes suffer from what's known as the "dying ReLU problem", usually when significant portions of the (scaled) training data are negative (e.g., when the mean is negative).  For more information, see:

[An Overview of Activation Functions in Deep Learning](https://www.theaidream.com/post/an-overview-of-activation-functions-in-deep-learning)

[What is the Dying ReLU Problem in Neural Networks?](https://www.quora.com/What-is-the-dying-ReLU-problem-in-neural-networks)
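To make the difference concrete, here is a minimal pure-Python sketch (not part of the original notebook) comparing ReLU and sigmoid and their slopes at a few sample pre-activation values. The function names and the sample values are illustrative assumptions; Keras applies the same formulas element-wise to whole tensors.

```python
import math

def relu(x):
    """ReLU: passes positive inputs through and outputs 0 otherwise."""
    return max(0.0, x)

def relu_grad(x):
    """Slope of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

def sigmoid(x):
    """Sigmoid: squashes any real input into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Slope of sigmoid: small for large |x|, but never exactly 0."""
    s = sigmoid(x)
    return s * (1.0 - s)

# Illustrative pre-activation values (assumed, not taken from the model)
for z in (-3.0, -0.5, 0.5, 3.0):
    print(
        f"z={z:+.1f}  relu={relu(z):.3f}  relu_grad={relu_grad(z):.1f}  "
        f"sigmoid={sigmoid(z):.3f}  sigmoid_grad={sigmoid_grad(z):.3f}"
    )
```

A neuron whose pre-activation is negative for every training example receives a zero gradient under ReLU and stops updating, whereas the sigmoid slope stays positive (although it can become very small).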

The training data has been scaled using standard scaling, so each column in the training data should have a distribution with mean = 0 and standard deviation = 1.  Nevertheless, it is worth checking to see if changing the activation function affects the performance of the model.
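The scaling claim above can be checked with a small pure-Python sketch. The column values below are made up for illustration and are not taken from `applicants_data.csv`; the helper `standard_scale` is a hypothetical stand-in that mimics what `StandardScaler` does to a single non-constant column (it uses the population standard deviation).

```python
import statistics

def standard_scale(column):
    """Subtract the column mean and divide by the population standard deviation."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)  # population std (ddof=0)
    return [(value - mean) / std for value in column]

# Hypothetical funding amounts, for illustration only
ask_amounts = [5000.0, 10000.0, 25000.0, 108590.0, 6692.0]

scaled = standard_scale(ask_amounts)
scaled_mean = statistics.fmean(scaled)
scaled_std = statistics.pstdev(scaled)
negative_share = sum(value < 0 for value in scaled) / len(scaled)

print(f"mean={scaled_mean:.6f}, std={scaled_std:.6f}, negative share={negative_share:.2f}")
```

Because a standardized column is centered on zero, some of its values are necessarily negative, which is why the choice of activation function is still worth testing.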

Note that the model will have the same two hidden layers as the original model.

In [96]:
# Define the number of inputs (features) to the model
number_input_features_A5 = len(X_train.iloc[0])

# Review the number of features
number_input_features_A5

116

In [97]:
# Define the number of neurons in the output layer
number_output_neurons_A5 = 1

In [98]:
# Define the number of hidden nodes for the first hidden layer
# Use the mean of the number of input features plus the number of output neurons
# Use Python floor division (//) to return a whole number
hidden_nodes_layer1_A5 = (number_input_features_A5 + number_output_neurons_A5) // 2

# Review the number of hidden nodes in the first layer
hidden_nodes_layer1_A5

58

In [99]:
# Define the number of hidden nodes for the second hidden layer
# Use the mean of the number of first-layer hidden nodes plus the number of output neurons
# Use Python floor division (//) to return a whole number
hidden_nodes_layer2_A5 = (hidden_nodes_layer1_A5 + number_output_neurons_A5) // 2

# Review the number of hidden nodes in the second layer
hidden_nodes_layer2_A5

29

In [100]:
# Create the Sequential model instance
nn_A5 = Sequential()

In [101]:
# Add the first hidden layer
# Specify the number of inputs, the number of hidden nodes, and the activation function
nn_A5.add(Dense(units=hidden_nodes_layer1_A5, input_dim=number_input_features_A5, activation="sigmoid"))

# Add the second hidden layer
# Specify the number of hidden nodes and the activation function
nn_A5.add(Dense(units=hidden_nodes_layer2_A5, activation="sigmoid"))

# Output layer
nn_A5.add(Dense(units=1, activation="sigmoid"))

# Check the structure of the model
nn_A5.summary()

Model: "sequential_5"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_21 (Dense)            (None, 58)                6786      
                                                                 
 dense_22 (Dense)            (None, 29)                1711      
                                                                 
 dense_23 (Dense)            (None, 1)                 30        
                                                                 
Total params: 8527 (33.31 KB)
Trainable params: 8527 (33.31 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
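The parameter counts in the summary above can be reproduced by hand: a fully connected layer has one weight per input-unit pair plus one bias per unit. The short sketch below does that arithmetic; the helper name `dense_param_count` is hypothetical and is not a Keras function.

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Input features -> hidden layer 1 -> hidden layer 2 -> output
layer_sizes = [116, 58, 29, 1]

per_layer = [
    dense_param_count(n_in, n_out)
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]
total = sum(per_layer)

print(per_layer)  # [6786, 1711, 30]
print(total)      # 8527
```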


In [102]:
# Compile the model
nn_A5.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

In [103]:
# Fit the model
fit_model_A5 = nn_A5.fit(X_train_scaled, y_train, epochs=50)

Epoch 1/50
...
Epoch 50/50


In [104]:
# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn_A5.evaluate(X_test_scaled,y_test,verbose=2)

# Display the model loss and accuracy results
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

268/268 - 0s - loss: 0.5508 - accuracy: 0.7320 - 398ms/epoch - 1ms/step
Loss: 0.5507833957672119, Accuracy: 0.7320116758346558


---

## Attempt 6: Changing the Activation Function to Leaky ReLU

The sixth attempt at improving the model will keep the original data set (and test/training split), but it will change the activation functions of the hidden layers to leaky ReLU functions.  Leaky ReLU functions multiply negative inputs by a small, predefined slope, known as alpha, instead of setting them to zero.  This is another way of addressing the "dying ReLU problem", because the neurons always pass a nonzero gradient, even if they're not always influential.  For more information, see:

[Details about Alpha](https://stackoverflow.com/questions/64735352/details-about-alpha-in-tf-nn-leaky-relu-features-alpha-0-2-name-none)
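As a rough illustration of what alpha does, here is a pure-Python sketch of leaky ReLU and its slope, using the same alpha = 0.2 that this attempt passes to the Keras layer. The function names are illustrative assumptions, not TensorFlow APIs.

```python
def leaky_relu(x, alpha=0.2):
    """Leaky ReLU: identity for positive inputs, alpha * x otherwise."""
    return x if x > 0 else alpha * x

def leaky_relu_grad(x, alpha=0.2):
    """Slope of leaky ReLU: 1 for positive inputs, alpha otherwise."""
    return 1.0 if x > 0 else alpha

# Illustrative pre-activation values (assumed, not taken from the model)
for z in (-3.0, -0.5, 0.5, 3.0):
    print(f"z={z:+.1f}  leaky_relu={leaky_relu(z):+.2f}  slope={leaky_relu_grad(z):.1f}")
```

Unlike plain ReLU, the slope for negative inputs is alpha rather than 0, so a neuron that only ever sees negative pre-activations can still be updated during training.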

Note that the model will have the same two hidden layers as the original model.

In [105]:
# Define the number of inputs (features) to the model
number_input_features_A6 = len(X_train.iloc[0])

# Review the number of features
number_input_features_A6

116

In [106]:
# Define the number of neurons in the output layer
number_output_neurons_A6 = 1

In [107]:
# Define the number of hidden nodes for the first hidden layer
# Use the mean of the number of input features plus the number of output neurons
# Use Python floor division (//) to return a whole number
hidden_nodes_layer1_A6 = (number_input_features_A6 + number_output_neurons_A6) // 2

# Review the number of hidden nodes in the first layer
hidden_nodes_layer1_A6

58

In [108]:
# Define the number of hidden nodes for the second hidden layer
# Use the mean of the number of first-layer hidden nodes plus the number of output neurons
# Use Python floor division (//) to return a whole number
hidden_nodes_layer2_A6 = (hidden_nodes_layer1_A6 + number_output_neurons_A6) // 2

# Review the number of hidden nodes in the second layer
hidden_nodes_layer2_A6

29

In [109]:
# Create the Sequential model instance
nn_A6 = Sequential()

In [110]:
# Add the first hidden layer
# Specify the number of inputs, the number of hidden nodes, and the activation function
nn_A6.add(Dense(units=hidden_nodes_layer1_A6, input_dim=number_input_features_A6, activation=tf.keras.layers.LeakyReLU(alpha=0.2)))

# Add the second hidden layer
# Specify the number of hidden nodes and the activation function
nn_A6.add(Dense(units=hidden_nodes_layer2_A6, activation=tf.keras.layers.LeakyReLU(alpha=0.2)))

# Output layer
nn_A6.add(Dense(units=1, activation="sigmoid"))

# Check the structure of the model
nn_A6.summary()

Model: "sequential_6"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_24 (Dense)            (None, 58)                6786      
                                                                 
 dense_25 (Dense)            (None, 29)                1711      
                                                                 
 dense_26 (Dense)            (None, 1)                 30        
                                                                 
Total params: 8527 (33.31 KB)
Trainable params: 8527 (33.31 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________


In [111]:
# Compile the model
nn_A6.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

In [112]:
# Fit the model
fit_model_A6 = nn_A6.fit(X_train_scaled, y_train, epochs=50)

Epoch 1/50
...
Epoch 50/50


In [113]:
# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn_A6.evaluate(X_test_scaled,y_test,verbose=2)

# Display the model loss and accuracy results
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

268/268 - 0s - loss: 0.5596 - accuracy: 0.7296 - 444ms/epoch - 2ms/step
Loss: 0.5595617890357971, Accuracy: 0.7295626997947693


---

## Summary of the Models:
- Original model
    - 2 layers
    - Neurons decrease by factor of 2 in each layer (58 -> 29)
- Alternative model 1: adjusted input values
    - Binning the ASK_AMT column into 20 groups
    - Grouping all CLASSIFICATIONs having a class size of 10 or fewer
    - Transforming SPECIAL_CONSIDERATIONS into an integer (boolean) from a string
- Alternative model 2: increased number of layers
    - 5 layers
    - Neurons decrease by factor of 2 in each layer (58 -> 29 -> 15 -> 8 -> 4)
- Alternative model 3: flat model structure
    - 2 layers
    - 58 neurons in each layer
- Alternative model 4: flat model structure with additional layers
    - 5 layers
    - 58 neurons in each layer
- Alternative model 5: Original model using sigmoid instead of ReLU activation functions
    - 2 layers
    - Neurons decrease by factor of 2 in each layer (58 -> 29)
    - Hidden layers use sigmoid activation functions instead of ReLU ones
- Alternative model 6: Original model using Leaky ReLU instead of ReLU activation functions
    - 2 layers
    - Neurons decrease by factor of 2 in each layer (58 -> 29)
    - Hidden layers use Leaky ReLU activation functions instead of ReLU ones
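The "decrease by a factor of 2" widths listed above are consistent with repeatedly applying the floor-division rule used in the layer definitions. The sketch below assumes the same rule was applied for every layer of the five-layer model; the helper `halving_widths` is hypothetical.

```python
def halving_widths(n_inputs, n_outputs, n_layers):
    """Return hidden-layer widths obtained by repeated (previous + outputs) // 2."""
    widths = []
    previous = n_inputs
    for _ in range(n_layers):
        previous = (previous + n_outputs) // 2
        widths.append(previous)
    return widths

print(halving_widths(116, 1, 2))  # [58, 29]
print(halving_widths(116, 1, 5))  # [58, 29, 15, 8, 4]
```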

After finishing your models, display the accuracy scores achieved by each model, and compare the results.

In [114]:
print("Original Model:\n\nResults:\n")

# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn.evaluate(X_test_scaled,y_test,verbose=2)

# Display the model loss and accuracy results
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

Original Model:

Results:

268/268 - 0s - loss: 0.5538 - accuracy: 0.7280 - 299ms/epoch - 1ms/step
Loss: 0.5538214445114136, Accuracy: 0.7280466556549072


In [115]:
print("Alternative Model 1 (Adjusted Input Values):\n\nResults:\n")

# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn_A1.evaluate(X_test_scaled_A1,y_test_A1,verbose=2)

# Display the model loss and accuracy results
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

Alternative Model 1 (Adjusted Input Values):

Results:

268/268 - 0s - loss: 0.5700 - accuracy: 0.7241 - 302ms/epoch - 1ms/step
Loss: 0.5700469017028809, Accuracy: 0.7240816354751587


In [116]:
print("Alternative Model 2 (Increased Number of Layers from 2 to 5):\n\nResults:\n")

# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn_A2.evaluate(X_test_scaled,y_test,verbose=2)

# Display the model loss and accuracy results
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

Alternative Model 2 (Increased Number of Layers from 2 to 5):

Results:

268/268 - 0s - loss: 0.5569 - accuracy: 0.7305 - 318ms/epoch - 1ms/step
Loss: 0.5568841099739075, Accuracy: 0.7304956316947937


In [117]:
print("Alternative Model 3 (Flat Network Structure):\n\nResults:\n")

# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn_A3.evaluate(X_test_scaled,y_test,verbose=2)

# Display the model loss and accuracy results
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

Alternative Model 3 (Flat Network Structure):

Results:

268/268 - 0s - loss: 0.5596 - accuracy: 0.7298 - 287ms/epoch - 1ms/step
Loss: 0.5596227049827576, Accuracy: 0.7297959327697754


In [118]:
print("Alternative Model 4 (Flat Network Structure with 5 Layers):\n\nResults:\n")

# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn_A4.evaluate(X_test_scaled,y_test,verbose=2)

# Display the model loss and accuracy results
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

Alternative Model 4 (Flat Network Structure with 5 Layers):

Results:

268/268 - 0s - loss: 0.5558 - accuracy: 0.7311 - 283ms/epoch - 1ms/step
Loss: 0.5558373332023621, Accuracy: 0.7310787439346313


In [119]:
print("Alternative Model 5 (Sigmoid Activation Functions):\n\nResults:\n")

# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn_A5.evaluate(X_test_scaled,y_test,verbose=2)

# Display the model loss and accuracy results
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

Alternative Model 5 (Sigmoid Activation Functions):

Results:

268/268 - 0s - loss: 0.5508 - accuracy: 0.7320 - 346ms/epoch - 1ms/step
Loss: 0.5507833957672119, Accuracy: 0.7320116758346558


In [120]:
print("Alternative Model 6 (Leaky ReLU Activation Functions):\n\nResults:\n")

# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn_A6.evaluate(X_test_scaled,y_test,verbose=2)

# Display the model loss and accuracy results
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

Alternative Model 6 (Leaky ReLU Activation Functions):

Results:

268/268 - 0s - loss: 0.5596 - accuracy: 0.7296 - 304ms/epoch - 1ms/step
Loss: 0.5595617890357971, Accuracy: 0.7295626997947693
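To compare the models side by side, the loss and accuracy values printed above can be collected into one ranked list. This sketch hard-codes the rounded figures from the output cells rather than re-evaluating the models, and the dictionary labels are shorthand chosen here.

```python
# (loss, accuracy) pairs copied from the evaluation output above, rounded to 4 places
results = {
    "Original": (0.5538, 0.7280),
    "Alternative 1 (adjusted inputs)": (0.5700, 0.7241),
    "Alternative 2 (5 halving layers)": (0.5569, 0.7305),
    "Alternative 3 (flat, 2 layers)": (0.5596, 0.7298),
    "Alternative 4 (flat, 5 layers)": (0.5558, 0.7311),
    "Alternative 5 (sigmoid)": (0.5508, 0.7320),
    "Alternative 6 (leaky ReLU)": (0.5596, 0.7296),
}

# Sort from highest to lowest test accuracy
ranked = sorted(results.items(), key=lambda item: item[1][1], reverse=True)

for name, (loss, accuracy) in ranked:
    print(f"{name:<34} loss={loss:.4f}  accuracy={accuracy:.4f}")

# Gap between the best and worst test accuracy
spread = ranked[0][1][1] - ranked[-1][1][1]
print(f"Accuracy spread: {spread:.4f}")
```

All seven models land within one percentage point of each other, so differences of this size may simply reflect run-to-run variation rather than a real improvement.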


---

## Save each of your alternative models as an HDF5 file.

In [121]:
# Set the file path for the first alternative model
file_path = Path("./Model_Files/Alternative_Model_1.h5")

# Export the model's weights to an HDF5 file
nn_A1.save_weights(file_path)

In [122]:
# Set the file path for the second alternative model
file_path = Path("./Model_Files/Alternative_Model_2.h5")

# Export the model's weights to an HDF5 file
nn_A2.save_weights(file_path)

In [123]:
# Set the file path for the third alternative model
file_path = Path("./Model_Files/Alternative_Model_3.h5")

# Export the model's weights to an HDF5 file
nn_A3.save_weights(file_path)

In [124]:
# Set the file path for the fourth alternative model
file_path = Path("./Model_Files/Alternative_Model_4.h5")

# Export the model's weights to an HDF5 file
nn_A4.save_weights(file_path)

In [125]:
# Set the file path for the fifth alternative model
file_path = Path("./Model_Files/Alternative_Model_5.h5")

# Export the model's weights to an HDF5 file
nn_A5.save_weights(file_path)

In [126]:
# Set the file path for the sixth alternative model
file_path = Path("./Model_Files/Alternative_Model_6.h5")

# Export the model's weights to an HDF5 file
nn_A6.save_weights(file_path)
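The six save cells above follow one pattern, and they assume that the `Model_Files` directory already exists. As a minimal sketch, the hypothetical helper below builds the same file names and creates the directory first; it is demonstrated against a temporary directory so that it has no side effects.

```python
import tempfile
from pathlib import Path

def alternative_model_paths(base_dir, n_models=6):
    """Create base_dir if needed and return one .h5 path per alternative model."""
    base = Path(base_dir)
    base.mkdir(parents=True, exist_ok=True)
    return [base / f"Alternative_Model_{i}.h5" for i in range(1, n_models + 1)]

# Demonstration in a throwaway directory (the notebook uses ./Model_Files)
with tempfile.TemporaryDirectory() as tmp:
    paths = alternative_model_paths(Path(tmp) / "Model_Files")
    names = [path.name for path in paths]
    directory_exists = paths[0].parent.is_dir()

print(names[0], names[-1], directory_exists)
```

Note that `save_weights` stores only the trained weights. To store the architecture as well, Keras models also provide a `save` method, which could be called with the same paths.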