# Venture Funding with Deep Learning

You work as a risk management associate at Alphabet Soup, a venture capital firm. Alphabet Soup’s business team receives many funding applications from startups every day. This team has asked you to help them create a model that predicts whether applicants will be successful if funded by Alphabet Soup.

The business team has given you a CSV containing more than 34,000 organizations that have received funding from Alphabet Soup over the years. With your knowledge of machine learning and neural networks, you decide to use the features in the provided dataset to create a binary classifier model that will predict whether an applicant will become a successful business. The CSV file contains a variety of information about these businesses, including whether or not they ultimately became successful.

## Instructions:

The steps for this challenge are broken out into the following sections:

* Prepare the data for use on a neural network model.

* Compile and evaluate a binary classification model using a neural network.

* Optimize the neural network model.

### Prepare the Data for Use on a Neural Network Model 

Using your knowledge of Pandas and scikit-learn’s `StandardScaler()`, preprocess the dataset so that you can use it to compile and evaluate the neural network model later.

Open the starter code file, and complete the following data preparation steps:

1. Read the `applicants_data.csv` file into a Pandas DataFrame. Review the DataFrame, looking for categorical variables that will need to be encoded, as well as columns that could eventually define your features and target variables.   

2. Drop the “EIN” (Employer Identification Number) and “NAME” columns from the DataFrame, because they are not relevant to the binary classification model.
 
3. Encode the dataset’s categorical variables using `OneHotEncoder`, and then place the encoded variables into a new DataFrame.

4. Add the original DataFrame’s numerical variables to the DataFrame containing the encoded variables.

> **Note** To complete this step, you will employ the Pandas `concat()` function that was introduced earlier in this course. 

5. Using the preprocessed data, create the features (`X`) and target (`y`) datasets. The target dataset should be defined by the preprocessed DataFrame column “IS_SUCCESSFUL”. The remaining columns should define the features dataset. 

6. Split the features and target sets into training and testing datasets.

7. Use scikit-learn's `StandardScaler` to scale the features data.

### Compile and Evaluate a Binary Classification Model Using a Neural Network

Use your knowledge of TensorFlow to design a binary classification deep neural network model. This model should use the dataset’s features to predict whether an Alphabet Soup–funded startup will be successful. Consider the number of inputs before determining the number of layers that your model will contain or the number of neurons on each layer. Then, compile and fit your model. Finally, evaluate your binary classification model to calculate the model’s loss and accuracy.
 
To do so, complete the following steps:

1. Create a deep neural network by assigning the number of input features, the number of layers, and the number of neurons on each layer using TensorFlow’s Keras.

> **Hint** You can start with a two-layer deep neural network model that uses the `relu` activation function for both layers.

2. Compile and fit the model using the `binary_crossentropy` loss function, the `adam` optimizer, and the `accuracy` evaluation metric.

> **Hint** When fitting the model, start with a small number of epochs, such as 20, 50, or 100.

3. Evaluate the model using the test data to determine the model’s loss and accuracy.

4. Save and export your model to an HDF5 file, and name the file `AlphabetSoup.h5`. 

### Optimize the Neural Network Model

Using your knowledge of TensorFlow and Keras, optimize your model to improve the model's accuracy. Even if you do not successfully achieve a better accuracy, you'll need to demonstrate at least two attempts to optimize the model. You can include these attempts in your existing notebook. Or, you can make copies of the starter notebook in the same folder, rename them, and code each model optimization in a new notebook. 

> **Note** You will not lose points if your model does not achieve a high accuracy, as long as you make at least two attempts to optimize the model.

To do so, complete the following steps:

1. Define at least three deep neural network models in total (the original plus two optimization attempts). With each, try to improve on your first model’s predictive accuracy.

> **Rewind** Recall that perfect accuracy has a value of 1, so accuracy improves as its value moves closer to 1. To optimize your model for a predictive accuracy as close to 1 as possible, you can use any or all of the following techniques:
>
> * Adjust the input data by dropping different features columns to ensure that no variables or outliers confuse the model.
>
> * Add more neurons (nodes) to a hidden layer.
>
> * Add more hidden layers.
>
> * Use different activation functions for the hidden layers.
>
> * Add to or reduce the number of epochs in the training regimen.

2. After finishing your models, display the accuracy scores achieved by each model, and compare the results.

3. Save each of your models as an HDF5 file.


In [163]:
# Imports
import pandas as pd
from pathlib import Path
import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, OneHotEncoder

---

## Prepare the data to be used on a neural network model

### Step 1: Read the `applicants_data.csv` file into a Pandas DataFrame. Review the DataFrame, looking for categorical variables that will need to be encoded, as well as columns that could eventually define your features and target variables.  


In [164]:
# Read the applicants_data.csv file from the Resources folder into a Pandas DataFrame
applicant_data_df = pd.read_csv("./Resources/applicants_data.csv")

# Review the DataFrame
applicant_data_df.head()


Unnamed: 0,EIN,NAME,APPLICATION_TYPE,AFFILIATION,CLASSIFICATION,USE_CASE,ORGANIZATION,STATUS,INCOME_AMT,SPECIAL_CONSIDERATIONS,ASK_AMT,IS_SUCCESSFUL
0,10520599,BLUE KNIGHTS MOTORCYCLE CLUB,T10,Independent,C1000,ProductDev,Association,1,0,N,5000,1
1,10531628,AMERICAN CHESAPEAKE CLUB CHARITABLE TR,T3,Independent,C2000,Preservation,Co-operative,1,1-9999,N,108590,1
2,10547893,ST CLOUD PROFESSIONAL FIREFIGHTERS,T5,CompanySponsored,C3000,ProductDev,Association,1,0,N,5000,0
3,10553066,SOUTHSIDE ATHLETIC ASSOCIATION,T3,CompanySponsored,C2000,Preservation,Trust,1,10000-24999,N,6692,1
4,10556103,GENETIC RESEARCH INSTITUTE OF THE DESERT,T3,Independent,C1000,Heathcare,Trust,1,100000-499999,N,142590,1


All of these variables are self-explanatory except for `STATUS`. What might it represent, and what values can it take? Is it just a column of all 1s?

In [168]:
applicant_data_df["STATUS"].unique()

array([1, 0])

Looks like status has both 1 and 0 as potential values. So we'll keep it.

In [169]:
# Review the data types associated with the columns
applicant_data_df.dtypes

EIN                        int64
NAME                      object
APPLICATION_TYPE          object
AFFILIATION               object
CLASSIFICATION            object
USE_CASE                  object
ORGANIZATION              object
STATUS                     int64
INCOME_AMT                object
SPECIAL_CONSIDERATIONS    object
ASK_AMT                    int64
IS_SUCCESSFUL              int64
dtype: object

We will need to encode all of the "object" variables.

### Step 2: Drop the “EIN” (Employer Identification Number) and “NAME” columns from the DataFrame, because they are not relevant to the binary classification model.

In [123]:
# Drop the 'EIN' and 'NAME' columns from the DataFrame
applicant_data_df = applicant_data_df.drop(columns=["EIN", "NAME"])

# Review the DataFrame
applicant_data_df.head()


Unnamed: 0,APPLICATION_TYPE,AFFILIATION,CLASSIFICATION,USE_CASE,ORGANIZATION,STATUS,INCOME_AMT,SPECIAL_CONSIDERATIONS,ASK_AMT,IS_SUCCESSFUL
0,T10,Independent,C1000,ProductDev,Association,1,0,N,5000,1
1,T3,Independent,C2000,Preservation,Co-operative,1,1-9999,N,108590,1
2,T5,CompanySponsored,C3000,ProductDev,Association,1,0,N,5000,0
3,T3,CompanySponsored,C2000,Preservation,Trust,1,10000-24999,N,6692,1
4,T3,Independent,C1000,Heathcare,Trust,1,100000-499999,N,142590,1


### Step 3: Encode the dataset’s categorical variables using `OneHotEncoder`, and then place the encoded variables into a new DataFrame.

In [124]:
# Create a list of categorical variables 
categorical_variables_list = list(applicant_data_df.dtypes[applicant_data_df.dtypes == "object"].index)


# Display the categorical variables list
categorical_variables_list


['APPLICATION_TYPE',
 'AFFILIATION',
 'CLASSIFICATION',
 'USE_CASE',
 'ORGANIZATION',
 'INCOME_AMT',
 'SPECIAL_CONSIDERATIONS']

In [171]:
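# Create a OneHotEncoder instance
# sparse_output=False returns a dense array (this parameter was named `sparse` before scikit-learn 1.2)
enc = OneHotEncoder(sparse_output=False)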

In [172]:
# Encode the categorical variables using OneHotEncoder
encoded_data = enc.fit_transform(applicant_data_df[categorical_variables_list])



In [173]:
# Create a DataFrame with the encoded variables
encoded_df = pd.DataFrame(
    encoded_data,
    columns = enc.get_feature_names_out(categorical_variables_list)
)

# Review the DataFrame
encoded_df.head()


Unnamed: 0,APPLICATION_TYPE_T10,APPLICATION_TYPE_T12,APPLICATION_TYPE_T13,APPLICATION_TYPE_T14,APPLICATION_TYPE_T15,APPLICATION_TYPE_T17,APPLICATION_TYPE_T19,APPLICATION_TYPE_T2,APPLICATION_TYPE_T25,APPLICATION_TYPE_T29,...,INCOME_AMT_1-9999,INCOME_AMT_10000-24999,INCOME_AMT_100000-499999,INCOME_AMT_10M-50M,INCOME_AMT_1M-5M,INCOME_AMT_25000-99999,INCOME_AMT_50M+,INCOME_AMT_5M-10M,SPECIAL_CONSIDERATIONS_N,SPECIAL_CONSIDERATIONS_Y
0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
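
To make the encoding step concrete, here is a minimal hand-rolled sketch of one-hot encoding for a single column. It is not how `OneHotEncoder` is implemented; the helper name `one_hot` is hypothetical, and the toy values are taken from the `APPLICATION_TYPE` preview above. Sorting the categories as strings also suggests why `APPLICATION_TYPE_T10` appears before `APPLICATION_TYPE_T2` in the encoded columns.

```python
# Hand-rolled one-hot encoding for a single column (illustrative only).
def one_hot(column_name, values):
    # Sort the distinct categories as strings, i.e. in lexicographic order.
    categories = sorted(set(values))
    # Name each new column "<original column>_<category>".
    names = [f"{column_name}_{cat}" for cat in categories]
    # One row per input value, with a single 1.0 in the matching category.
    rows = [[1.0 if value == cat else 0.0 for cat in categories] for value in values]
    return names, rows


# Toy values taken from the APPLICATION_TYPE column shown earlier.
names, rows = one_hot("APPLICATION_TYPE", ["T10", "T3", "T5", "T3"])
print(names)
for row in rows:
    print(row)
```

Each row contains exactly one 1.0, so the seven categorical columns expand into many binary columns without implying any ordering between categories.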


### Step 4: Add the original DataFrame’s numerical variables to the DataFrame containing the encoded variables.

> **Note** To complete this step, you will employ the Pandas `concat()` function that was introduced earlier in this course. 

In [128]:
# Add the numerical variables from the original DataFrame to the one-hot encoding DataFrame

# First, get numerical variables into list
numerical_variables_list = list(applicant_data_df.dtypes[applicant_data_df.dtypes == "int64"].index)

In [129]:
# Create DataFrame with only numerical variables
numerical_variables_df = applicant_data_df[numerical_variables_list]

# Preview the Dataframe
numerical_variables_df.head()

Unnamed: 0,STATUS,ASK_AMT,IS_SUCCESSFUL
0,1,5000,1
1,1,108590,1
2,1,5000,0
3,1,6692,1
4,1,142590,1


In [174]:
# Join the encoded and numerical DataFrames using .concat()
encoded_df = pd.concat([encoded_df, numerical_variables_df], axis=1)

# Review the Dataframe
encoded_df.head()

Unnamed: 0,APPLICATION_TYPE_T10,APPLICATION_TYPE_T12,APPLICATION_TYPE_T13,APPLICATION_TYPE_T14,APPLICATION_TYPE_T15,APPLICATION_TYPE_T17,APPLICATION_TYPE_T19,APPLICATION_TYPE_T2,APPLICATION_TYPE_T25,APPLICATION_TYPE_T29,...,INCOME_AMT_10M-50M,INCOME_AMT_1M-5M,INCOME_AMT_25000-99999,INCOME_AMT_50M+,INCOME_AMT_5M-10M,SPECIAL_CONSIDERATIONS_N,SPECIAL_CONSIDERATIONS_Y,STATUS,ASK_AMT,IS_SUCCESSFUL
0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1,5000,1
1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1,108590,1
2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1,5000,0
3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1,6692,1
4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1,142590,1


### Step 5: Using the preprocessed data, create the features (`X`) and target (`y`) datasets. The target dataset should be defined by the preprocessed DataFrame column “IS_SUCCESSFUL”. The remaining columns should define the features dataset. 



In [175]:
# Define the target set y using the IS_SUCCESSFUL column
y = encoded_df["IS_SUCCESSFUL"]

# Display a sample of y
y[:5]


0    1
1    1
2    0
3    1
4    1
Name: IS_SUCCESSFUL, dtype: int64

In [176]:
# Define features set X by selecting all columns but IS_SUCCESSFUL
X = encoded_df.drop("IS_SUCCESSFUL", axis=1)

# Review the features DataFrame
X.head()


Unnamed: 0,APPLICATION_TYPE_T10,APPLICATION_TYPE_T12,APPLICATION_TYPE_T13,APPLICATION_TYPE_T14,APPLICATION_TYPE_T15,APPLICATION_TYPE_T17,APPLICATION_TYPE_T19,APPLICATION_TYPE_T2,APPLICATION_TYPE_T25,APPLICATION_TYPE_T29,...,INCOME_AMT_100000-499999,INCOME_AMT_10M-50M,INCOME_AMT_1M-5M,INCOME_AMT_25000-99999,INCOME_AMT_50M+,INCOME_AMT_5M-10M,SPECIAL_CONSIDERATIONS_N,SPECIAL_CONSIDERATIONS_Y,STATUS,ASK_AMT
0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1,5000
1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1,108590
2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1,5000
3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1,6692
4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1,142590


### Step 6: Split the features and target sets into training and testing datasets.


In [177]:
# Split the preprocessed data into a training and testing dataset
# Assign the function a random_state equal to 1
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
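
As a rough sanity check on the split sizes, the arithmetic below assumes a row count of 34,299 (the source only says "more than 34,000", so this number is an assumption), the default 25% test share of `train_test_split`, and the Keras default batch size of 32.

```python
import math

n_rows = 34_299        # assumed row count; the source only says "more than 34,000"
test_fraction = 0.25   # train_test_split's default when test_size is not given
batch_size = 32        # Keras default batch size for fit() and evaluate()

# The test set size is rounded up; the training set gets the remaining rows.
n_test = math.ceil(n_rows * test_fraction)
n_train = n_rows - n_test

# Number of batches (steps) per epoch for training and for evaluation.
train_steps = math.ceil(n_train / batch_size)
test_steps = math.ceil(n_test / batch_size)

print(n_train, n_test)
print(train_steps, test_steps)
```

Under these assumptions the step counts come out to 804 and 268, which agrees with the `804/804` and `268/268` counters in the training and evaluation output later in this notebook.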


### Step 7: Use scikit-learn's `StandardScaler` to scale the features data.

In [134]:
# Create a StandardScaler instance
# Standardizing rescales each feature to zero mean and unit variance (it does not make the data Gaussian).
scaler = StandardScaler()

# Fit the scaler to the features training dataset
# We don't want to scale the y variable "is_successful", since it is simply a binary variable
# 'fit' calculates mean and standard deviation for variables in training set, which we will use for scaling later
X_scaler = scaler.fit(X_train)

# Now, 'transform' uses the calculated means/SDs from above to standardize the training data AND the testing data. 
X_train_scaled = X_scaler.transform(X_train)
X_test_scaled = X_scaler.transform(X_test)
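
The fit-then-transform pattern above can be sketched in plain Python. This is only an illustration of the idea, not scikit-learn's implementation: the helper names `fit_scaler` and `transform` are hypothetical, and the toy numbers are the `ASK_AMT` values from the preview, arbitrarily divided into "training" and "test" rows.

```python
import statistics


def fit_scaler(values):
    # Learn the mean and the population standard deviation (ddof=0),
    # which is the convention StandardScaler uses.
    return statistics.fmean(values), statistics.pstdev(values)


def transform(values, mean, std):
    # Apply z = (x - mean) / std using previously learned statistics.
    return [(v - mean) / std for v in values]


# Toy ASK_AMT values from the preview: four "training" rows, one "test" row.
train = [5000, 108590, 5000, 6692]
test = [142590]

mean, std = fit_scaler(train)                 # statistics come from training data only
train_scaled = transform(train, mean, std)
test_scaled = transform(test, mean, std)      # test data reuses the training statistics

print(statistics.fmean(train_scaled), statistics.pstdev(train_scaled))
print(test_scaled)
```

Fitting on the training rows only means no information about the test rows leaks into the scaling, which is why the notebook calls `fit` on `X_train` and then `transform` on both sets.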


---

## Compile and Evaluate a Binary Classification Model Using a Neural Network

### Step 1: Create a deep neural network by assigning the number of input features, the number of layers, and the number of neurons on each layer using TensorFlow’s Keras.

> **Hint** You can start with a two-layer deep neural network model that uses the `relu` activation function for both layers.


In [135]:
# Define the number of inputs (features) to the model
number_input_features = len(X_train.iloc[0])

# Review the number of features
number_input_features


116

116 agrees with the number of columns in `encoded_df` above, minus the target column.

In [136]:
# Define the number of neurons in the output layer
# 1 output because we are seeking a binary response
number_output_neurons = 1

In [137]:
# Define the number of hidden nodes for the first hidden layer
hidden_nodes_layer1 =  2

# Review the number of hidden nodes in the first layer
hidden_nodes_layer1


2

In [138]:
# Define the number of hidden nodes for the second hidden layer
hidden_nodes_layer2 =  2

# Review the number of hidden nodes in the second layer
hidden_nodes_layer2


2

In [139]:
# Create the Sequential model instance
nn = Sequential()


In [140]:
# Add the first hidden layer
nn.add(Dense(units=hidden_nodes_layer1, input_dim=number_input_features, activation="relu"))




In [141]:
# Add the second hidden layer (input_dim is only needed on the first layer)
nn.add(Dense(units=hidden_nodes_layer2, activation="relu"))


In [142]:
# Add the output layer to the model specifying the number of output neurons and activation function
# Note: "relu" leaves the output unbounded; "sigmoid" is the usual choice for a binary classifier
nn.add(Dense(units=number_output_neurons, activation="relu"))

In [143]:
# Display the Sequential model summary
nn.summary()
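
The summary output is not reproduced here, but the parameter counts a stack of `Dense` layers should report can be worked out by hand. The sketch below assumes each layer keeps its bias term (the `Dense` default); the helper name `dense_param_count` is hypothetical.

```python
def dense_param_count(n_inputs, units):
    # Each unit has one weight per input plus one bias.
    return n_inputs * units + units


layer_units = [2, 2, 1]   # two hidden layers and the output layer, as defined above
n_inputs = 116            # number_input_features

counts = []
for units in layer_units:
    counts.append(dense_param_count(n_inputs, units))
    n_inputs = units      # this layer's outputs feed the next layer

print(counts, sum(counts))
```

Nearly all of the parameters sit in the first layer, because it connects every one of the 116 inputs to each hidden node.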


### Step 2: Compile and fit the model using the `binary_crossentropy` loss function, the `adam` optimizer, and the `accuracy` evaluation metric.


In [144]:
# Compile the Sequential model
nn.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

In [145]:
# Fit the model using 50 epochs and the training data
fit_model = nn.fit(X_train_scaled, y_train, epochs=50)

Epoch 1/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 1s 445us/step - accuracy: 0.4683 - loss: 8.5692
Epoch 2/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 0s 369us/step - accuracy: 0.4647 - loss: 8.6287
Epoch 3/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 0s 313us/step - accuracy: 0.4654 - loss: 8.6165
Epoch 4/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 0s 314us/step - accuracy: 0.4716 - loss: 8.5171
Epoch 5/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 0s 315us/step - accuracy: 0.4646 - loss: 8.6298
Epoch 6/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 0s 316us/step - accuracy: 0.4656 - loss: 8.6141
Epoch 7/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 0s 311us/step - accuracy: 0.4667 - loss: 8.5953
Epoch 8/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 0s 314us/step - accuracy: 0.4604 - loss: 8.6967
Epoch 9/50
...

### Step 3: Evaluate the model using the test data to determine the model’s loss and accuracy.


In [146]:
# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn.evaluate(X_test_scaled, y_test, verbose=2)

# Display the model loss and accuracy results
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

268/268 - 0s - 442us/step - accuracy: 0.4708 - loss: 8.5300
Loss: 8.529995918273926, Accuracy: 0.4707871675491333


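Pretty terrible results: an accuracy of about 0.47 is worse than a coin flip. A likely culprit is the `relu` activation on the output layer, since `binary_crossentropy` expects a probability between 0 and 1, which `sigmoid` would provide. Let's save this model and try to improve its performance.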
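
To see how a loss as large as 8.53 can arise, here is a plain-Python sketch of binary cross-entropy for a single prediction. It is an illustration, not the Keras source: the clipping constant `EPSILON = 1e-7` is assumed to match the Keras default, and the share of positive rows is inferred as 1 - 0.4708 from the reported accuracy, which is also an assumption.

```python
import math

EPSILON = 1e-7  # assumed clipping constant (the Keras default backend epsilon)


def binary_crossentropy(y_true, y_pred):
    # Clip predictions away from 0 and 1 so the logarithms stay finite.
    p = min(max(y_pred, EPSILON), 1.0 - EPSILON)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


print(binary_crossentropy(1, 0.9))   # confident and correct: small loss
print(binary_crossentropy(1, 0.0))   # predicts 0 for a positive row: about 16.1
print(binary_crossentropy(0, 0.0))   # predicts 0 for a negative row: about 0

# If a model output 0 for every row and 52.92% of rows were positive
# (an inference from the reported accuracy), the mean loss would be:
collapsed_loss = 0.5292 * binary_crossentropy(1, 0.0)
print(collapsed_loss)
```

The last value is roughly 8.53, which is consistent with (though it does not prove) the original network having collapsed to a constant output of 0.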

### Step 4: Save and export your model to an HDF5 file, and name the file `AlphabetSoup.h5`. 


In [147]:
# Set the model's file path
file_path = "AlphabetSoup.h5"

# Export the model to an HDF5 file
# (`save` returns None, so the path is kept in its own variable)
nn.save(file_path)




---

## Optimize the neural network model


### Step 1: Define at least three new deep neural network models (resulting in the original plus 3 optimization attempts). With each, try to improve on your first model’s predictive accuracy.

> **Rewind** Recall that perfect accuracy has a value of 1, so accuracy improves as its value moves closer to 1. To optimize your model for a predictive accuracy as close to 1 as possible, you can use any or all of the following techniques:
>
> * Adjust the input data by dropping different features columns to ensure that no variables or outliers confuse the model.
>
> * Add more neurons (nodes) to a hidden layer.
>
> * Add more hidden layers.
>
> * Use different activation functions for the hidden layers.
>
> * Add to or reduce the number of epochs in the training regimen.


### Alternative Model 1

In [148]:
# Define the number of inputs (features) to the model
number_input_features = len(X_train.iloc[0])

# Review the number of features
number_input_features

116

In [149]:
# Define the number of neurons in the output layer
# Binary outcome.
number_output_neurons_A1 = 1

In [150]:
# Define the number of hidden nodes for the hidden layers
# The number of layers and the number of nodes per layer have both been doubled from before
hidden_nodes_layer1_A1 = 4
hidden_nodes_layer2_A1 = 4
hidden_nodes_layer3_A1 = 4
hidden_nodes_layer4_A1 = 4



# Review the number of hidden nodes in the first layer
hidden_nodes_layer1_A1 

4

In [151]:
# Create the Sequential model instance
nn_A1 = Sequential()


In [152]:
# Add hidden layers

# Let's throw in some different activation functions as well
# Only the first layer needs input_dim; later layers infer their input size

nn_A1.add(Dense(units=hidden_nodes_layer1_A1, input_dim=number_input_features, activation="relu"))
nn_A1.add(Dense(units=hidden_nodes_layer2_A1, activation="sigmoid"))
nn_A1.add(Dense(units=hidden_nodes_layer3_A1, activation="tanh"))
nn_A1.add(Dense(units=hidden_nodes_layer4_A1, activation="relu"))


# Output layer

# Tried "relu" here, but it gave about 0.73 accuracy again; let's try sigmoid
nn_A1.add(Dense(units=number_output_neurons_A1, activation="sigmoid"))


# Check the structure of the model
nn_A1.summary()

We can also try changing the optimizer to see whether that leads to any improvement.

* `adam` (Adaptive Moment Estimation): combines ideas from RMSprop and momentum.

* Stochastic gradient descent (SGD): the basic optimizer, which updates the weights using the gradient of the loss.

Copied from ChatGPT:

> * Adagrad: an adaptive learning rate optimization algorithm. It adjusts the learning rate individually for each parameter based on historical gradient information.
>
> * Adadelta: an extension of Adagrad that addresses its tendency to decrease learning rates too aggressively.
>
> * Nadam: the Adam optimizer with Nesterov momentum.

Let's try Adadelta, and then Nadam, if adding more neurons while keeping the `relu` activation does not improve the results.
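
To illustrate how Adam combines momentum with RMSprop-style scaling, here is a minimal single-parameter sketch on a toy loss. It is not the Keras optimizer: the function name `adam_minimize` and the toy loss f(w) = (w - 3)^2 are assumptions for illustration, and the hyperparameters are the commonly cited defaults apart from a larger learning rate.

```python
import math


def adam_minimize(grad, w, steps=1000, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    m = 0.0  # running mean of gradients (the momentum idea)
    v = 0.0  # running mean of squared gradients (the RMSprop idea)
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)   # bias correction for starting at zero
        v_hat = v / (1 - beta2 ** t)
        # Step in the smoothed gradient direction, scaled by recent gradient size.
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


# Toy loss f(w) = (w - 3)^2, whose gradient is 2 * (w - 3) and minimum is at w = 3.
w_final = adam_minimize(lambda w: 2 * (w - 3), w=0.0)
print(w_final)
```

Dividing by the square root of `v_hat` gives each parameter its own effective step size, which is the per-parameter adaptation that Adagrad and Adadelta are also built around.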


In [153]:
# Compile the Sequential model
nn_A1.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

In [154]:
# Fit the model using 50 epochs and the training data
fit_model_A1 = nn_A1.fit(X_train_scaled, y_train, epochs=50)


Epoch 1/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 1s 421us/step - accuracy: 0.5776 - loss: 0.6713
Epoch 2/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 0s 386us/step - accuracy: 0.7252 - loss: 0.5880
Epoch 3/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 0s 507us/step - accuracy: 0.7226 - loss: 0.5768
Epoch 4/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 0s 385us/step - accuracy: 0.7271 - loss: 0.5719
Epoch 5/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 0s 399us/step - accuracy: 0.7298 - loss: 0.5629
Epoch 6/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 0s 422us/step - accuracy: 0.7309 - loss: 0.5644
Epoch 7/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 0s 364us/step - accuracy: 0.7322 - loss: 0.5606
Epoch 8/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 0s 368us/step - accuracy: 0.7330 - loss: 0.5605
Epoch 9/50
...

Still not good, though far better than the original model. This model has 4 hidden layers, each with 4 neurons, using the activations relu, sigmoid, tanh, and relu, plus a sigmoid output neuron.

The loss is much lower (about 0.56, down from about 8.5), and training accuracy levels off at around 0.73.


According to ChatGPT4, we should typically keep the activation function the same across all hidden layers.

Let's keep this idea in mind for our next iteration.


### Alternative Model 2

In [196]:
# Define the number of inputs (features) to the model
number_input_features = len(X_train.iloc[0])

# Review the number of features
number_input_features

116

In [197]:
# Define the number of neurons in the output layer
number_output_neurons_A2 = 1

In [198]:
# Define the number of hidden nodes for each of the 16 hidden layers
hidden_nodes_layer1_A2 = 16
hidden_nodes_layer2_A2 = 16
hidden_nodes_layer3_A2 = 16
hidden_nodes_layer4_A2 = 16
hidden_nodes_layer5_A2 = 16
hidden_nodes_layer6_A2 = 16
hidden_nodes_layer7_A2 = 16
hidden_nodes_layer8_A2 = 16
hidden_nodes_layer9_A2 = 16
hidden_nodes_layer10_A2 = 16
hidden_nodes_layer11_A2 = 16
hidden_nodes_layer12_A2 = 16
hidden_nodes_layer13_A2 = 16
hidden_nodes_layer14_A2 = 16
hidden_nodes_layer15_A2 = 16
hidden_nodes_layer16_A2 = 16

# Review the number of hidden nodes in the first layer
hidden_nodes_layer1_A2

16

In [199]:
# Create the Sequential model instance
nn_A2 = Sequential()

In [200]:
# Add hidden layers

# Let's use all 'relu'
# Only the first layer needs input_dim; later layers infer their input size

nn_A2.add(Dense(units=hidden_nodes_layer1_A2, input_dim=number_input_features, activation="relu"))
nn_A2.add(Dense(units=hidden_nodes_layer2_A2, activation="relu"))
nn_A2.add(Dense(units=hidden_nodes_layer3_A2, activation="relu"))
nn_A2.add(Dense(units=hidden_nodes_layer4_A2, activation="relu"))
nn_A2.add(Dense(units=hidden_nodes_layer5_A2, activation="relu"))
nn_A2.add(Dense(units=hidden_nodes_layer6_A2, activation="relu"))
nn_A2.add(Dense(units=hidden_nodes_layer7_A2, activation="relu"))
nn_A2.add(Dense(units=hidden_nodes_layer8_A2, activation="relu"))
nn_A2.add(Dense(units=hidden_nodes_layer9_A2, activation="relu"))
nn_A2.add(Dense(units=hidden_nodes_layer10_A2, activation="relu"))
nn_A2.add(Dense(units=hidden_nodes_layer11_A2, activation="relu"))
nn_A2.add(Dense(units=hidden_nodes_layer12_A2, activation="relu"))
nn_A2.add(Dense(units=hidden_nodes_layer13_A2, activation="relu"))
nn_A2.add(Dense(units=hidden_nodes_layer14_A2, activation="relu"))
nn_A2.add(Dense(units=hidden_nodes_layer15_A2, activation="relu"))
nn_A2.add(Dense(units=hidden_nodes_layer16_A2, activation="relu"))

# Note: to experiment with many more layers or neurons, we could write a loop that adjusts the number each time.

# Output layer
# Since we are seeking a binary output, let's use the sigmoid function in the OUTPUT layer.

nn_A2.add(Dense(units=number_output_neurons_A2, activation="sigmoid"))


#Check the structure of the model
nn_A2.summary()


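
The idea of building the hidden layers in a loop can be sketched as follows. This is one possible arrangement rather than the notebook's method: `hidden_layer_specs` and `build_classifier` are hypothetical helper names, and the TensorFlow imports sit inside the builder so that the spec helper can be tried without TensorFlow installed.

```python
def hidden_layer_specs(n_layers, units, activation="relu"):
    # One (units, activation) pair per hidden layer.
    return [(units, activation) for _ in range(n_layers)]


def build_classifier(n_inputs, specs):
    # Imported here so the spec helper above works without TensorFlow installed.
    from tensorflow.keras import Input
    from tensorflow.keras.layers import Dense
    from tensorflow.keras.models import Sequential

    model = Sequential()
    model.add(Input(shape=(n_inputs,)))               # input shape is declared once
    for units, activation in specs:
        model.add(Dense(units=units, activation=activation))
    model.add(Dense(units=1, activation="sigmoid"))   # single binary output
    return model


specs = hidden_layer_specs(n_layers=16, units=16)
print(len(specs), specs[0])

# Where TensorFlow is available, the model could then be built with:
# model = build_classifier(116, specs)
# model.summary()
```

Changing `n_layers` or `units` in one place then replaces the sixteen separate variables and `add` calls, which makes it easier to try several architectures in a row.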


In [205]:
# Compile the model

#try with 'adam' and then again with 'adadelta'
nn_A2.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])


In [206]:
# Fit the model using 50 epochs and the training data
fit_model_A2 = nn_A2.fit(X_train_scaled, y_train, epochs=50)


Epoch 1/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 2s 728us/step - accuracy: 0.7364 - loss: 0.5415
Epoch 2/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 1s 705us/step - accuracy: 0.7336 - loss: 0.5417
Epoch 3/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 1s 754us/step - accuracy: 0.7362 - loss: 0.5424
Epoch 4/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 1s 724us/step - accuracy: 0.7371 - loss: 0.5417
Epoch 5/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 1s 778us/step - accuracy: 0.7384 - loss: 0.5379
Epoch 6/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 1s 744us/step - accuracy: 0.7371 - loss: 0.5391
Epoch 7/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 1s 724us/step - accuracy: 0.7352 - loss: 0.5432
Epoch 8/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 1s 748us/step - accuracy: 0.7379 - loss: 0.5372
Epoch 9/50
...

In [207]:
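# Same as above but now using the 'adadelta' optimizer
# Note: recompiling does not reset the weights, so this continues training
# the model that was already fitted with 'adam' above.

nn_A2.compile(loss="binary_crossentropy", optimizer='adadelta', metrics=["accuracy"])
nn_A2.fit(X_train_scaled, y_train, epochs=50)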


Epoch 1/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 2s 727us/step - accuracy: 0.7484 - loss: 0.5240
Epoch 2/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 1s 743us/step - accuracy: 0.7383 - loss: 0.5352
Epoch 3/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 1s 806us/step - accuracy: 0.7438 - loss: 0.5299
Epoch 4/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 1s 725us/step - accuracy: 0.7412 - loss: 0.5328
Epoch 5/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 1s 768us/step - accuracy: 0.7388 - loss: 0.5327
Epoch 6/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 1s 724us/step - accuracy: 0.7419 - loss: 0.5306
Epoch 7/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 1s 700us/step - accuracy: 0.7416 - loss: 0.5324
Epoch 8/50
804/804 ━━━━━━━━━━━━━━━━━━━━ 1s 744us/step - accuracy: 0.7421 - loss: 0.5339
Epoch 9/50
...

<keras.src.callbacks.history.History at 0x2e200fdd0>

Not happy with these results. There is not much difference between the 'adadelta' and 'adam' optimizers, although the comparison is not a clean one: the 'adadelta' run started from weights that had already been trained with 'adam'. An accuracy of about 0.73 is too low for real-world deployment, so we should continue tuning the model.

Next, I would try hyperparameter tuning, such as adding dropout and changing the learning rate.

I would also increase the number of hidden nodes in each layer.

### Step 2: After finishing your models, display the accuracy scores achieved by each model, and compare the results.

In [208]:
print("Original Model Results")

# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn.evaluate(X_test_scaled, y_test, verbose=2)

# Display the model loss and accuracy results
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

Original Model Results
268/268 - 0s - 307us/step - accuracy: 0.4708 - loss: 8.5300
Loss: 8.529995918273926, Accuracy: 0.4707871675491333


In [209]:
print("Alternative Model 1 Results")

# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn_A1.evaluate(X_test_scaled, y_test, verbose=2)


# Display the model loss and accuracy results
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

Alternative Model 1 Results
268/268 - 0s - 499us/step - accuracy: 0.7255 - loss: 0.5595
Loss: 0.5594751834869385, Accuracy: 0.7254810333251953


In [210]:
print("Alternative Model 2 Results")

# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn_A2.evaluate(X_test_scaled, y_test, verbose=2)

# Display the model loss and accuracy results
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

Alternative Model 2 Results
268/268 - 0s - 682us/step - accuracy: 0.7301 - loss: 0.5548
Loss: 0.5547966361045837, Accuracy: 0.7301457524299622
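
To compare the three models side by side, the test-set metrics printed above can be collected into one small table. This is a minimal sketch using the rounded values from the evaluation output; the dictionary layout and column labels are illustrative choices.

```python
# Test-set metrics copied (rounded) from the evaluation output above.
results = {
    "Original": {"loss": 8.5300, "accuracy": 0.4708},
    "Alternative 1": {"loss": 0.5595, "accuracy": 0.7255},
    "Alternative 2": {"loss": 0.5548, "accuracy": 0.7301},
}

baseline = results["Original"]["accuracy"]

# Rank the models from highest to lowest test accuracy.
ranked = sorted(results.items(), key=lambda item: item[1]["accuracy"], reverse=True)

print(f"{'Model':<15}{'Loss':>8}{'Accuracy':>10}{'vs. original':>14}")
for name, metrics in ranked:
    gain = metrics["accuracy"] - baseline
    print(f"{name:<15}{metrics['loss']:>8.4f}{metrics['accuracy']:>10.4f}{gain:>+14.4f}")
```

Both alternatives improve on the original by roughly 0.25 in accuracy, while the gap between the two alternatives is under 0.01.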


### Step 3: Save each of your alternative models as an HDF5 file.


In [211]:
# Set the file path for the first alternative model
file_path = "AltModel1.h5"

# Export the model to an HDF5 file
nn_A1.save(file_path)




In [212]:
# Set the file path for the second alternative model
file_path = "AltModel2.h5"

# Export the model to an HDF5 file
nn_A2.save(file_path)



