<a href="https://colab.research.google.com/github/Bag0niku/Neural_Network_Charity_Analysis/blob/main/Charity_Funding_Neural_Network_Optimization.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Set up the Environment

In [1]:
# %matplotlib
# Import our dependencies
!pip install keras-tuner
import keras_tuner as kt  # the older "kerastuner" import name is deprecated

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import metrics
from sklearn.preprocessing import StandardScaler, LabelEncoder, MinMaxScaler
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow.keras.callbacks import ModelCheckpoint

import os

filepath = "https://nn-charity-analysis.s3.us-west-2.amazonaws.com/charity_data.csv"

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting keras-tuner
  Downloading keras_tuner-1.1.3-py3-none-any.whl (135 kB)
Collecting kt-legacy
  Downloading kt_legacy-1.0.4-py3-none-any.whl (9.6 kB)
Collecting jedi>=0.10
  Downloading jedi-0.18.1-py2.py3-none-any.whl (1.6 MB)
Installing collected packages: jedi, kt-legacy, keras-tuner
Successfully installed jedi-0.18.1 keras-tuner-1.1.3 kt-legacy-1.0.4


# Import and clean the data for use in the Neural Network Model

The training and testing data need to be numeric and scaled.

In [2]:
# Import the data into a dataframe
df = pd.read_csv(filepath)
df

Unnamed: 0,EIN,NAME,APPLICATION_TYPE,AFFILIATION,CLASSIFICATION,USE_CASE,ORGANIZATION,STATUS,INCOME_AMT,SPECIAL_CONSIDERATIONS,ASK_AMT,IS_SUCCESSFUL
0,10520599,BLUE KNIGHTS MOTORCYCLE CLUB,T10,Independent,C1000,ProductDev,Association,1,0,N,5000,1
1,10531628,AMERICAN CHESAPEAKE CLUB CHARITABLE TR,T3,Independent,C2000,Preservation,Co-operative,1,1-9999,N,108590,1
2,10547893,ST CLOUD PROFESSIONAL FIREFIGHTERS,T5,CompanySponsored,C3000,ProductDev,Association,1,0,N,5000,0
3,10553066,SOUTHSIDE ATHLETIC ASSOCIATION,T3,CompanySponsored,C2000,Preservation,Trust,1,10000-24999,N,6692,1
4,10556103,GENETIC RESEARCH INSTITUTE OF THE DESERT,T3,Independent,C1000,Heathcare,Trust,1,100000-499999,N,142590,1
...,...,...,...,...,...,...,...,...,...,...,...,...
34294,996009318,THE LIONS CLUB OF HONOLULU KAMEHAMEHA,T4,Independent,C1000,ProductDev,Association,1,0,N,5000,0
34295,996010315,INTERNATIONAL ASSOCIATION OF LIONS CLUBS,T4,CompanySponsored,C3000,ProductDev,Association,1,0,N,5000,0
34296,996012607,PTA HAWAII CONGRESS,T3,CompanySponsored,C2000,Preservation,Association,1,0,N,5000,0
34297,996015768,AMERICAN FEDERATION OF GOVERNMENT EMPLOYEES LO...,T5,Independent,C3000,ProductDev,Association,1,0,N,5000,1


In [3]:
# Look for null values and incorrect datatypes
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 34299 entries, 0 to 34298
Data columns (total 12 columns):
 #   Column                  Non-Null Count  Dtype 
---  ------                  --------------  ----- 
 0   EIN                     34299 non-null  int64 
 1   NAME                    34299 non-null  object
 2   APPLICATION_TYPE        34299 non-null  object
 3   AFFILIATION             34299 non-null  object
 4   CLASSIFICATION          34299 non-null  object
 5   USE_CASE                34299 non-null  object
 6   ORGANIZATION            34299 non-null  object
 7   STATUS                  34299 non-null  int64 
 8   INCOME_AMT              34299 non-null  object
 9   SPECIAL_CONSIDERATIONS  34299 non-null  object
 10  ASK_AMT                 34299 non-null  int64 
 11  IS_SUCCESSFUL           34299 non-null  int64 
dtypes: int64(4), object(8)
memory usage: 3.1+ MB


In [4]:
# Count the number of unique values in each column.
df.nunique()

EIN                       34299
NAME                      19568
APPLICATION_TYPE             17
AFFILIATION                   6
CLASSIFICATION               71
USE_CASE                      5
ORGANIZATION                  4
STATUS                        2
INCOME_AMT                    9
SPECIAL_CONSIDERATIONS        2
ASK_AMT                    8747
IS_SUCCESSFUL                 2
dtype: int64

Currently have:

Features: 
*   APPLICATION_TYPE (Categorical string)
*   AFFILIATION      (Categorical string)
*   CLASSIFICATION   (Categorical string)
*   USE_CASE         (Categorical string)
*   ORGANIZATION     (Categorical string)
*   STATUS           (Numeric T/F)
*   INCOME_AMT       (Categorical string)
*   SPECIAL_CONSIDERATIONS (Categorical string, Y/N)
*   ASK_AMT          (Number)

Target: 
*   IS_SUCCESSFUL    (Numeric T/F)


We want the neural network to process all the features as numeric T/F columns, including the categorical strings. Turning each category into its own T/F column is known as One Hot Encoding.
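
As a minimal sketch of what one hot encoding does, the toy frame below (made up for illustration, with values borrowed from the USE_CASE column) is expanded with `pd.get_dummies`, the same pandas function used later in this notebook. The variable names `toy` and `encoded` are assumptions, not part of the analysis.

```python
import pandas as pd

# Toy frame with a single categorical column (illustrative values only).
toy = pd.DataFrame({"USE_CASE": ["ProductDev", "Preservation", "ProductDev"]})

# Each distinct category becomes its own 0/1 indicator column.
encoded = pd.get_dummies(toy, columns=["USE_CASE"], dtype=int)
print(encoded)
```

Each row has exactly one indicator set per original column, which is why a column with many rare categories would otherwise add many nearly empty features.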

In [5]:
# Start by reducing "CLASSIFICATION" from 71 categories to 10.
class_df = pd.DataFrame(df["CLASSIFICATION"].value_counts())
class_df["CLASSIFICATION"].sort_values(ascending=False).head(10)

C1000    17326
C2000     6074
C1200     4837
C3000     1918
C2100     1883
C7000      777
C1700      287
C4000      194
C5000      116
C1270      114
Name: CLASSIFICATION, dtype: int64

In [6]:
# keep the top 9 classifications and change the rest to "OTHER", totaling 10 categories
class_categories = class_df[class_df["CLASSIFICATION"] > 115].index.to_list()
class_changing = int(class_df[class_df['CLASSIFICATION'] <= 115]["CLASSIFICATION"].sum())
n_total = int(class_df['CLASSIFICATION'].sum())
print(f"CLASSIFICATION records being converted to 'OTHER': {class_changing} is {round((class_changing/n_total)*100, 2)}% of the total records")


CLASSIFICATION records being converted to 'OTHER': 887 is 2.59% of the total records


In [7]:
df["APPLICATION_TYPE"].value_counts().sort_values(ascending=False).head(10)

T3     27037
T4      1542
T6      1216
T5      1173
T19     1065
T8       737
T7       725
T10      528
T9       156
T13       66
Name: APPLICATION_TYPE, dtype: int64

In [8]:
# keep the top 9 application types and change the rest to "OTHER", totaling 10 categories
app_type_df = pd.DataFrame(df["APPLICATION_TYPE"].value_counts().sort_values(ascending=False))
app_types = app_type_df[app_type_df["APPLICATION_TYPE"] > 100].index.to_list()
app_changing = int(app_type_df[app_type_df["APPLICATION_TYPE"] <= 100]["APPLICATION_TYPE"].sum())
print(f"APPLICATION_TYPE records being converted to 'OTHER': {app_changing} is {round((app_changing/n_total)*100, 2)}% of the total records")


APPLICATION_TYPE records being converted to 'OTHER': 120 is 0.35% of the total records


In [9]:
# apply changes to the data using a new dataframe, so the original remains untouched
# if required to be used or modified in a different way
df2 = df.copy()
df2["APPLICATION_TYPE"] = df2["APPLICATION_TYPE"].apply(lambda x: x if x in app_types else "OTHER")
df2["CLASSIFICATION"] = df2["CLASSIFICATION"].apply(lambda x: x if x in class_categories else "OTHER")
df2["SPECIAL_CONSIDERATIONS"] = df2["SPECIAL_CONSIDERATIONS"] == 'Y'  ## converts Y/N to True/False, computer will see as 1/0
df2["STATUS"].value_counts()

1    34294
0        5
Name: STATUS, dtype: int64
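
The bucketing of rare CLASSIFICATION and APPLICATION_TYPE values into "OTHER" can also be written as one reusable helper. This is only a sketch on made-up data: the function name `collapse_rare` and its `min_count` parameter are hypothetical and do not appear elsewhere in this notebook.

```python
import pandas as pd

def collapse_rare(series, min_count, other="OTHER"):
    """Replace categories seen fewer than min_count times with `other`."""
    counts = series.value_counts()
    keep = counts[counts >= min_count].index
    # Keep values in the frequent set, overwrite everything else.
    return series.where(series.isin(keep), other)

toy = pd.Series(["T3", "T3", "T3", "T4", "T4", "T13"])
collapsed = collapse_rare(toy, min_count=2)
print(collapsed.value_counts())
```

Using a single threshold for both the kept and the replaced groups avoids the risk of a category falling between two separate comparisons.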

In [10]:
# Transform the string categories into T/F numerical columns representing each category:
# "APPLICATION_TYPE", "AFFILIATION", "CLASSIFICATION", "USE_CASE", "ORGANIZATION", "INCOME_AMT"
# (SPECIAL_CONSIDERATIONS was already converted to True/False in the previous cell.)
one_hot_encoded_df = pd.get_dummies(df2, columns=["APPLICATION_TYPE", "AFFILIATION", "CLASSIFICATION", "USE_CASE", "ORGANIZATION", "INCOME_AMT"])

# NAME and EIN are identifiers and are removed because they will not help
# the model weigh the other features; IS_SUCCESSFUL is the target.
encoded_df = one_hot_encoded_df.drop(["EIN", "NAME", "IS_SUCCESSFUL"], axis=1)
encoded_df

Unnamed: 0,STATUS,SPECIAL_CONSIDERATIONS,ASK_AMT,APPLICATION_TYPE_OTHER,APPLICATION_TYPE_T10,APPLICATION_TYPE_T19,APPLICATION_TYPE_T3,APPLICATION_TYPE_T4,APPLICATION_TYPE_T5,APPLICATION_TYPE_T6,...,ORGANIZATION_Trust,INCOME_AMT_0,INCOME_AMT_1-9999,INCOME_AMT_10000-24999,INCOME_AMT_100000-499999,INCOME_AMT_10M-50M,INCOME_AMT_1M-5M,INCOME_AMT_25000-99999,INCOME_AMT_50M+,INCOME_AMT_5M-10M
0,1,False,5000,0,1,0,0,0,0,0,...,0,1,0,0,0,0,0,0,0,0
1,1,False,108590,0,0,0,1,0,0,0,...,0,0,1,0,0,0,0,0,0,0
2,1,False,5000,0,0,0,0,0,1,0,...,0,1,0,0,0,0,0,0,0,0
3,1,False,6692,0,0,0,1,0,0,0,...,1,0,0,1,0,0,0,0,0,0
4,1,False,142590,0,0,0,1,0,0,0,...,1,0,0,0,1,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
34294,1,False,5000,0,0,0,0,1,0,0,...,0,1,0,0,0,0,0,0,0,0
34295,1,False,5000,0,0,0,0,1,0,0,...,0,1,0,0,0,0,0,0,0,0
34296,1,False,5000,0,0,0,1,0,0,0,...,0,1,0,0,0,0,0,0,0,0
34297,1,False,5000,0,0,0,0,0,1,0,...,0,1,0,0,0,0,0,0,0,0


In [11]:
# Checking the status of the data
encoded_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 34299 entries, 0 to 34298
Data columns (total 47 columns):
 #   Column                        Non-Null Count  Dtype
---  ------                        --------------  -----
 0   STATUS                        34299 non-null  int64
 1   SPECIAL_CONSIDERATIONS        34299 non-null  bool 
 2   ASK_AMT                       34299 non-null  int64
 3   APPLICATION_TYPE_OTHER        34299 non-null  uint8
 4   APPLICATION_TYPE_T10          34299 non-null  uint8
 5   APPLICATION_TYPE_T19          34299 non-null  uint8
 6   APPLICATION_TYPE_T3           34299 non-null  uint8
 7   APPLICATION_TYPE_T4           34299 non-null  uint8
 8   APPLICATION_TYPE_T5           34299 non-null  uint8
 9   APPLICATION_TYPE_T6           34299 non-null  uint8
 10  APPLICATION_TYPE_T7           34299 non-null  uint8
 11  APPLICATION_TYPE_T8           34299 non-null  uint8
 12  APPLICATION_TYPE_T9           34299 non-null  uint8
 13  AFFILIATION_CompanySponsored  3

In [12]:
encoded_df = encoded_df.astype(float)
encoded_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 34299 entries, 0 to 34298
Data columns (total 47 columns):
 #   Column                        Non-Null Count  Dtype  
---  ------                        --------------  -----  
 0   STATUS                        34299 non-null  float64
 1   SPECIAL_CONSIDERATIONS        34299 non-null  float64
 2   ASK_AMT                       34299 non-null  float64
 3   APPLICATION_TYPE_OTHER        34299 non-null  float64
 4   APPLICATION_TYPE_T10          34299 non-null  float64
 5   APPLICATION_TYPE_T19          34299 non-null  float64
 6   APPLICATION_TYPE_T3           34299 non-null  float64
 7   APPLICATION_TYPE_T4           34299 non-null  float64
 8   APPLICATION_TYPE_T5           34299 non-null  float64
 9   APPLICATION_TYPE_T6           34299 non-null  float64
 10  APPLICATION_TYPE_T7           34299 non-null  float64
 11  APPLICATION_TYPE_T8           34299 non-null  float64
 12  APPLICATION_TYPE_T9           34299 non-null  float64
 13  A

In [13]:
# Does the data need to be scaled?
encoded_df.describe()

Unnamed: 0,STATUS,SPECIAL_CONSIDERATIONS,ASK_AMT,APPLICATION_TYPE_OTHER,APPLICATION_TYPE_T10,APPLICATION_TYPE_T19,APPLICATION_TYPE_T3,APPLICATION_TYPE_T4,APPLICATION_TYPE_T5,APPLICATION_TYPE_T6,...,ORGANIZATION_Trust,INCOME_AMT_0,INCOME_AMT_1-9999,INCOME_AMT_10000-24999,INCOME_AMT_100000-499999,INCOME_AMT_10M-50M,INCOME_AMT_1M-5M,INCOME_AMT_25000-99999,INCOME_AMT_50M+,INCOME_AMT_5M-10M
count,34299.0,34299.0,34299.0,34299.0,34299.0,34299.0,34299.0,34299.0,34299.0,34299.0,...,34299.0,34299.0,34299.0,34299.0,34299.0,34299.0,34299.0,34299.0,34299.0,34299.0
mean,0.999854,0.000787,2769199.0,0.003499,0.015394,0.03105,0.788274,0.044958,0.034199,0.035453,...,0.685589,0.711041,0.021225,0.015831,0.09837,0.006997,0.027843,0.109245,0.004053,0.005394
std,0.012073,0.028046,87130450.0,0.059047,0.123116,0.173457,0.408538,0.207214,0.181743,0.184924,...,0.464288,0.453285,0.144136,0.124825,0.297819,0.083358,0.164526,0.311951,0.063532,0.073245
min,0.0,0.0,5000.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,1.0,0.0,5000.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
50%,1.0,0.0,5000.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,...,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
75%,1.0,0.0,7742.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,...,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
max,1.0,1.0,8597806000.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0


In [14]:

# Delete the unneeded variables that were used for cleaning 
del class_df, class_categories, class_changing, n_total, app_type_df, app_types, app_changing, df2

In [14]:
## train_test_split the data
X_train, X_test, y_train, y_test = train_test_split(encoded_df, df["IS_SUCCESSFUL"])

## Scale after the split, not before, to avoid leaking test-set statistics into training.
## Fit the scaler on the training data only, then apply the same transform to both sets.
scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
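
The leakage concern in the comment above comes down to which rows the mean and standard deviation are computed from. The sketch below uses synthetic data and plain NumPy to show the arithmetic that fitting on the training rows and then transforming both sets amounts to. All names here (`X_train_toy`, `X_test_toy`, and so on) are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X_train_toy = rng.normal(loc=50.0, scale=10.0, size=(80, 3))
X_test_toy = rng.normal(loc=50.0, scale=10.0, size=(20, 3))

# Learn the statistics from the training rows only.
mean = X_train_toy.mean(axis=0)
std = X_train_toy.std(axis=0)  # population std (ddof=0), as StandardScaler uses

# Apply the same shift and scale to both sets.
X_train_scaled = (X_train_toy - mean) / std
X_test_scaled = (X_test_toy - mean) / std

print(X_train_scaled.mean(axis=0).round(6))
print(X_test_scaled.mean(axis=0).round(3))
```

The scaled test set will generally not have a mean of exactly zero, and that is expected: it is measured on the training set's scale, just as new data would be.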

In [15]:
# total number of input dimensions
input_dims = len(X_train[0])
input_dims

47

In [15]:
## directory for storing the checkpoints generated by the nn model
os.makedirs(os.path.join("checkpoints"), exist_ok=True)


In [16]:
## Create the checkpoint callback that saves the weights after every epoch
def checkpoint_callback(model_name):
    # The f-string fills in the model name now; the plain string keeps "{epoch:02d}"
    # as a literal placeholder for Keras to fill in at the end of each epoch.
    checkpoint_filepath = os.path.join("checkpoints", f"{model_name}-" + "weights.{epoch:02d}.hdf5")
    return ModelCheckpoint(filepath=checkpoint_filepath, verbose=1, save_weights_only=True, save_freq="epoch")
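
The checkpoint path is built in two parts so that the epoch placeholder survives until training time. The sketch below isolates just the string handling, with no Keras involved. The helper name `checkpoint_path` is hypothetical, and the `.format(epoch=...)` calls only imitate the substitution that happens when a checkpoint is written.

```python
import os

def checkpoint_path(model_name):
    """Build a path whose epoch placeholder is left unfilled."""
    # The f-string resolves the model name immediately; the plain string
    # keeps "{epoch:02d}" literal so it can be formatted once per epoch.
    return os.path.join("checkpoints", f"{model_name}-" + "weights.{epoch:02d}.hdf5")

template = checkpoint_path("top_model_0")
print(template)
print(template.format(epoch=5))
print(template.format(epoch=254))
```

The `02d` format pads single-digit epochs to two digits and leaves longer numbers untouched, which matches file names such as `top_model_0-weights.254.hdf5` in the training logs.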

# Find an Optimized version of the Neural Network Model.

The goal is an accuracy at or above 75%. The model at this point reaches 72.87% accuracy on the testing data.

*   The first attempt to optimize the model will be to use the Keras Tuner to find the best number of hidden layers and neurons.
   *    Minimum number of hidden layers: 2
   *    Maximum number of hidden layers: 6  
   *    Minimum number of neurons per layer: 1
   *    Maximum number of neurons per layer: input_dims * 1.75 
   *    Test the top 3 models the Keras Tuner finds and train them again with 500 epochs before evaluating them with the test data.
*   The second attempt will be to adjust the Keras Tuner for the ability to have multiple activation equations in the hidden layers.
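
To get a feel for the size of this search space, the layer widths the tuner can pick from can be enumerated directly. This is a rough sketch that assumes the integer hyperparameter walks up from its minimum in increments of `step`. It uses the 47 input features, the step of 5, and the 1.25 and 1.75 multipliers from the tuner functions in this notebook. The helper `unit_options` is a made-up name.

```python
input_dims = 47  # number of features after encoding

def unit_options(max_value, min_value=1, step=5):
    """List the layer widths reachable from min_value in increments of step."""
    return list(range(min_value, max_value + 1, step))

first_layer = unit_options(int(input_dims * 1.25))
later_layers = unit_options(int(input_dims * 1.75))

print(len(first_layer), first_layer[-1])
print(len(later_layers), later_layers[-1])
```

Under that assumption the effective maximum widths are slightly below the nominal caps, because the step does not land exactly on them.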




In [17]:
## create a function to quickly create nn_models for evaluation with the Keras Tuner
## Uses only one activation function in the hidden layers: a first layer plus 1 to 5 more (2 to 6 in total)
def create_model(hp):
    input_ = input_dims  # number of input dimensions, pulled from the global variable
    nn_model = tf.keras.models.Sequential()

    # Allow kerastuner to decide which activation function to use in hidden layers
    activation = hp.Choice('activation',['relu','tanh', 'sigmoid'])
    
    # Allow kerastuner to decide number of neurons in first layer
    nn_model.add(tf.keras.layers.Dense(units=hp.Int('first_units',
        min_value=1,
        max_value=int(input_*1.25),
        step=5), activation=activation, input_dim=input_))

    # Allow kerastuner to decide number of hidden layers and neurons in hidden layers
    for i in range(hp.Int('num_layers', 1, 5)):
        nn_model.add(tf.keras.layers.Dense(units=hp.Int('units_' + str(i),
            min_value=1,
            max_value=int(input_*1.75),
            step=5),
            activation=activation))
    
    nn_model.add(tf.keras.layers.Dense(units=1, activation="sigmoid"))

    # Compile the model
    nn_model.compile(loss="binary_crossentropy", optimizer='adam', metrics=["accuracy"])
    
    return nn_model


In [19]:
# uses a model that can have only one activation equation in the hidden layers
tuner = kt.Hyperband(
    create_model,
    objective="val_accuracy",
    factor=3,
    max_epochs=50,
    hyperband_iterations=2,
    overwrite=True)

In [20]:
## Run the Keras Tuner to find the best performing models
tuner.search(X_train,y_train,epochs=20,validation_data=(X_test,y_test))

Trial 180 Complete [00h 01m 11s]
val_accuracy: 0.7304956316947937

Best val_accuracy So Far: 0.732478141784668
Total elapsed time: 00h 53m 58s


In [21]:
# Evaluate the top three models that used only one activation equation in the hidden layers
model_reports = []
top_models = tuner.get_best_models(3)
for i, model in enumerate(top_models):
    print("=========================\n", f"Beginning Optimization for Top Model Number {i}", "\n=========================")
    model.fit(X_train, y_train, epochs=500, callbacks=[checkpoint_callback(f"top_model_{i}")], verbose=1)
    model_loss, model_accuracy = model.evaluate(X_test,y_test,verbose=2)
    model_reports.append(f"Top Model Number {i} =>  Loss: {model_loss}, Accuracy: {model_accuracy}")

for report in model_reports:
    print(report)

Streaming output truncated to the last 5000 lines.
Epoch 254/500
Epoch 254: saving model to checkpoints/top_model_0-weights.254.hdf5
Epoch 255/500
Epoch 255: saving model to checkpoints/top_model_0-weights.255.hdf5
Epoch 256/500
Epoch 256: saving model to checkpoints/top_model_0-weights.256.hdf5
Epoch 257/500
Epoch 257: saving model to checkpoints/top_model_0-weights.257.hdf5
Epoch 258/500
Epoch 258: saving model to checkpoints/top_model_0-weights.258.hdf5
Epoch 259/500
Epoch 259: saving model to checkpoints/top_model_0-weights.259.hdf5
Epoch 260/500
Epoch 260: saving model to checkpoints/top_model_0-weights.260.hdf5
Epoch 261/500
Epoch 261: saving model to checkpoints/top_model_0-weights.261.hdf5
Epoch 262/500
Epoch 262: saving model to checkpoints/top_model_0-weights.262.hdf5
Epoch 263/500
Epoch 263: saving model to checkpoints/top_model_0-weights.263.hdf5
Epoch 264/500
Epoch 264: saving model to checkpoints/top_model_0-weights.264.hdf5
Epoch 265/500
Epoch 265: saving m

### Optimizing update:

*   The first attempt boils down to changing the number of hidden layers and number of neurons within the model and using only one activation equation within the hidden layers. 
*   The Keras Tuner tested 180 different models. I chose the 3 best-performing ones and retrained them for 500 epochs. None of those 3 models met the required accuracy goal on the testing data.

Next Steps:
*   Run the same Keras Tuner with one change: allow it to pick an activation function for each hidden layer every time it builds a new model. Keras Tuner identifies hyperparameters by name, so choices registered under one shared name resolve to a single value and appear as a single entry in the tuner summary; each layer needs its own hyperparameter name for its activation to be tuned and reported separately.



In [22]:
## can use a different activation function in each hidden layer, up to 5 hidden layers after the first.
def create_multi_activation_model(hp):
    input_ = input_dims  # number of input dimensions
    nn_model = tf.keras.models.Sequential()

    # Allow kerastuner to decide which activation function to use in hidden layers
    activation = hp.Choice('activation',['relu','tanh', 'sigmoid'])
    
    # Allow kerastuner to decide number of neurons in first layer
    nn_model.add(tf.keras.layers.Dense(units=hp.Int('first_units',
        min_value=1,
        max_value=int(input_*1.25),
        step=5), activation=activation, input_dim=input_))

    # Allow kerastuner to decide number of hidden layers and neurons in hidden layers
    for i in range(hp.Int('num_layers', 1, 5)):
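        # Give each layer its own hyperparameter name; reusing 'activation' would return the same value every time
        activation = hp.Choice('activation_' + str(i), ['relu','tanh', 'sigmoid'])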
        nn_model.add(tf.keras.layers.Dense(units=hp.Int('units_' + str(i),
            min_value=1,
            max_value=int(input_*1.75),
            step=5),
            activation=activation))
    
    nn_model.add(tf.keras.layers.Dense(units=1, activation="sigmoid"))

    # Compile the model
    nn_model.compile(loss="binary_crossentropy", optimizer='adam', metrics=["accuracy"])
    
    return nn_model
    

In [23]:
# uses a model that can have multiple activation equations in the hidden layers
multi_tuner = kt.Hyperband(
    create_multi_activation_model,
    objective="val_accuracy",
    factor=3,
    max_epochs=50,
    hyperband_iterations=2,
    overwrite=True)

In [24]:
## Run the modified Keras Tuner to find the best performing models
multi_tuner.search(X_train,y_train,epochs=20,validation_data=(X_test,y_test))

Trial 180 Complete [00h 01m 23s]
val_accuracy: 0.7295626997947693

Best val_accuracy So Far: 0.7328279614448547
Total elapsed time: 00h 55m 44s


In [25]:
# Evaluate the top three models that could use multiple activation functions in the hidden layers
multi_model_reports = []
multi_top_models = multi_tuner.get_best_models(3)
for i, model in enumerate(multi_top_models):
    print("=========================\n", f"Beginning Optimization for Top Multi Model Number {i}", "\n=========================")
    model.fit(X_train, y_train, epochs=500, callbacks=[checkpoint_callback(f"top_multi_model_{i}")], verbose=1)
    model_loss, model_accuracy = model.evaluate(X_test,y_test,verbose=2)
    multi_model_reports.append(f"Top Multi Model Number {i} =>  Loss: {model_loss}, Accuracy: {model_accuracy}")

for report in multi_model_reports:
    print(report)    

Streaming output truncated to the last 5000 lines.
Epoch 254/500
Epoch 254: saving model to checkpoints/top_multi_model_0-weights.254.hdf5
Epoch 255/500
Epoch 255: saving model to checkpoints/top_multi_model_0-weights.255.hdf5
Epoch 256/500
Epoch 256: saving model to checkpoints/top_multi_model_0-weights.256.hdf5
Epoch 257/500
Epoch 257: saving model to checkpoints/top_multi_model_0-weights.257.hdf5
Epoch 258/500
Epoch 258: saving model to checkpoints/top_multi_model_0-weights.258.hdf5
Epoch 259/500
Epoch 259: saving model to checkpoints/top_multi_model_0-weights.259.hdf5
Epoch 260/500
Epoch 260: saving model to checkpoints/top_multi_model_0-weights.260.hdf5
Epoch 261/500
Epoch 261: saving model to checkpoints/top_multi_model_0-weights.261.hdf5
Epoch 262/500
Epoch 262: saving model to checkpoints/top_multi_model_0-weights.262.hdf5
Epoch 263/500
Epoch 263: saving model to checkpoints/top_multi_model_0-weights.263.hdf5
Epoch 264/500
Epoch 264: saving model to checkpoints/to

In [None]:
## Not anticipating to need these anymore, delete the current training/validating
## variables to make room for the next Optimizations.
del X_train, X_test, y_train, y_test

## Optimization Update:
Results so far:
*   Ending results for the top 3 models of both Keras Tuners were very similar.  Having multiple activation equations in the hidden layers did not seem to help improve the model.

Next tasks to attempt:
1.   Use encoded_df without scaling every column, because only 1 of the 47 columns (ASK_AMT) is not a true/false value. Keep the ASK_AMT column scaled with the standard scaler.
    *    X_encoded
2.   Add the EIN and NAME columns to the X features. I will add them separately because both serve as unique identifier columns, so there will be 2 data sets.
    *   X + EIN = X_ein
    *   X + NAME = X_name


In [26]:
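X_encoded_train, X_encoded_test, y_encoded_train, y_encoded_test = train_test_split(encoded_df, df["IS_SUCCESSFUL"])
X_encoded_train, X_encoded_test = X_encoded_train.copy(), X_encoded_test.copy()
# Scale only ASK_AMT; fit on the training rows, then apply the same transform to both sets
ask_scaler = StandardScaler().fit(X_encoded_train[["ASK_AMT"]])
X_encoded_train["ASK_AMT"] = ask_scaler.transform(X_encoded_train[["ASK_AMT"]]).ravel()
X_encoded_test["ASK_AMT"] = ask_scaler.transform(X_encoded_test[["ASK_AMT"]]).ravel()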

In [27]:
# uses a model that can have only one activation equation in the hidden layers
encoded_tuner = kt.Hyperband(
    create_model,
    objective="val_accuracy",
    factor=3,
    max_epochs=50,
    hyperband_iterations=2,
    overwrite=True)


In [28]:
encoded_tuner.search(X_encoded_train,y_encoded_train,epochs=20,validation_data=(X_encoded_test,y_encoded_test))

Trial 180 Complete [00h 01m 23s]
val_accuracy: 0.7355102300643921

Best val_accuracy So Far: 0.7379592061042786
Total elapsed time: 00h 58m 20s


In [29]:
# Evaluate the top three models that used only one activation equation in the hidden layers
encoded_model_reports = []
encoded_top_models = encoded_tuner.get_best_models(3)
for i, model in enumerate(encoded_top_models):
    print("=========================\n", f"Beginning Optimization for Top Encoded Model Number {i}", "\n=========================")
    model.fit(X_encoded_train, y_encoded_train, epochs=500, callbacks=[checkpoint_callback(f"top_encoded_model_{i}")], verbose=1)
    model_loss, model_accuracy = model.evaluate(X_encoded_test,y_encoded_test,verbose=2)
    encoded_model_reports.append(f"Top Encoded Model Number {i} =>  Loss: {model_loss}, Accuracy: {model_accuracy}")


print("=========================\n", "Final Reports for Top 3 Encoded Models", "\n=========================")
for report in encoded_model_reports:
    print(report)

Streaming output truncated to the last 5000 lines.
Epoch 255/500
Epoch 255: saving model to checkpoints/top_encoded_model_0-weights.255.hdf5
Epoch 256/500
Epoch 256: saving model to checkpoints/top_encoded_model_0-weights.256.hdf5
Epoch 257/500
Epoch 257: saving model to checkpoints/top_encoded_model_0-weights.257.hdf5
Epoch 258/500
Epoch 258: saving model to checkpoints/top_encoded_model_0-weights.258.hdf5
Epoch 259/500
Epoch 259: saving model to checkpoints/top_encoded_model_0-weights.259.hdf5
Epoch 260/500
Epoch 260: saving model to checkpoints/top_encoded_model_0-weights.260.hdf5
Epoch 261/500
Epoch 261: saving model to checkpoints/top_encoded_model_0-weights.261.hdf5
Epoch 262/500
Epoch 262: saving model to checkpoints/top_encoded_model_0-weights.262.hdf5
Epoch 263/500
Epoch 263: saving model to checkpoints/top_encoded_model_0-weights.263.hdf5
Epoch 264/500
Epoch 264: saving model to checkpoints/top_encoded_model_0-weights.264.hdf5
Epoch 265/500
Epoch 265: saving mod

## Optimization Update:
Results so far:
*   Ending results for the encoded True/False data were worse than for the fully scaled data: the top three models ranged from 63% to 68% accuracy, compared with roughly 72% in the previous 2 attempts.

Next tasks to attempt:

1.   While I am down this rabbit hole: 
    *    Redo the encoded_df run without scaling ASK_AMT to see if accuracy returns to the ballpark 72%.
        *    Still X_encoded

2.   Other rabbit holes: 
    *    Add the EIN and NAME columns to the X features. I will add them separately because both serve as unique identifier columns, so there will be 2 data sets.
        *   X + EIN = X_ein
        *   X + NAME = X_name
        

In [31]:

X_encoded_train, X_encoded_test, y_encoded_train, y_encoded_test = train_test_split(encoded_df, df["IS_SUCCESSFUL"])

In [32]:
## Reset the variable "encoded_tuner" so the search does not continue from the previous run
## uses a model that can have only one activation equation in the hidden layers
encoded_tuner = kt.Hyperband(
    create_model,
    objective="val_accuracy",
    factor=3,
    max_epochs=50,
    hyperband_iterations=2,
    overwrite=True)

In [33]:
encoded_tuner.search(X_encoded_train,y_encoded_train,epochs=20,validation_data=(X_encoded_test,y_encoded_test))

Trial 180 Complete [00h 01m 15s]
val_accuracy: 0.5336443185806274

Best val_accuracy So Far: 0.6858308911323547
Total elapsed time: 00h 56m 45s


In [34]:
# Evaluate the top three models that used only one activation equation in the hidden layers
encoded_model_reports = []
encoded_top_models = encoded_tuner.get_best_models(3)
for i, model in enumerate(encoded_top_models):
    print("=========================\n", f"Beginning Optimization for Top Encoded Model Number {i}", "\n=========================")
    model.fit(X_encoded_train, y_encoded_train, epochs=500, callbacks=[checkpoint_callback(f"top_encoded_model_{i}")], verbose=1)
    model_loss, model_accuracy = model.evaluate(X_encoded_test,y_encoded_test,verbose=2)
    encoded_model_reports.append(f"Top Encoded Model Number {i} =>  Loss: {model_loss}, Accuracy: {model_accuracy}")


print("=========================\n", "Final Reports for Top 3 Encoded Models", "\n=========================")
for report in encoded_model_reports:
    print(report)

Streaming output truncated to the last 5000 lines.
Epoch 255/500
Epoch 255: saving model to checkpoints/top_encoded_model_0-weights.255.hdf5
Epoch 256/500
Epoch 256: saving model to checkpoints/top_encoded_model_0-weights.256.hdf5
Epoch 257/500
Epoch 257: saving model to checkpoints/top_encoded_model_0-weights.257.hdf5
Epoch 258/500
Epoch 258: saving model to checkpoints/top_encoded_model_0-weights.258.hdf5
Epoch 259/500
Epoch 259: saving model to checkpoints/top_encoded_model_0-weights.259.hdf5
Epoch 260/500
Epoch 260: saving model to checkpoints/top_encoded_model_0-weights.260.hdf5
Epoch 261/500
Epoch 261: saving model to checkpoints/top_encoded_model_0-weights.261.hdf5
Epoch 262/500
Epoch 262: saving model to checkpoints/top_encoded_model_0-weights.262.hdf5
Epoch 263/500
Epoch 263: saving model to checkpoints/top_encoded_model_0-weights.263.hdf5
Epoch 264/500
Epoch 264: saving model to checkpoints/top_encoded_model_0-weights.264.hdf5
Epoch 265/500
Epoch 265: saving mod

In [None]:
## Not anticipating to need these anymore, delete the current training/validating
## variables to make room for the next Optimizations.
del X_encoded_train, X_encoded_test, y_encoded_train, y_encoded_test

## Optimization Update:
Results so far:
*   Ending results for the encoded True/False data are even worse when ASK_AMT is left unscaled. This did not work.
*   Scaling is required for this model and all future attempts.

Next tasks to attempt:

*    Add the EIN and NAME columns to the X features. I will add them separately because both serve as unique identifier columns, so there will be 2 data sets.
        *   X + EIN = X_ein
        *   X + NAME = X_name


In [18]:
encoded_ein_df = encoded_df.join(df["EIN"])
X_ein_train, X_ein_test, y_ein_train, y_ein_test = train_test_split(encoded_ein_df, df["IS_SUCCESSFUL"])
# Fit the scaler on the training data only, then apply the same transform to both sets
ein_scaler = StandardScaler().fit(X_ein_train)
X_ein_train = pd.DataFrame(ein_scaler.transform(X_ein_train), index=X_ein_train.index, columns=X_ein_train.columns)
X_ein_test = pd.DataFrame(ein_scaler.transform(X_ein_test), index=X_ein_test.index, columns=X_ein_test.columns)

In [19]:
## input dimensions have changed from 47 to 48.
input_dims = len(X_ein_train.columns)
input_dims



48

In [20]:
# uses a model that can have only one activation equation in the hidden layers
ein_tuner = kt.Hyperband(
    create_model,
    objective="val_accuracy",
    factor=3,
    max_epochs=50,
    hyperband_iterations=2,
    overwrite=True)


In [21]:
ein_tuner.search(X_ein_train,y_ein_train,epochs=20,validation_data=(X_ein_test,y_ein_test))

Trial 180 Complete [00h 02m 23s]
val_accuracy: 0.7378425598144531

Best val_accuracy So Far: 0.7379592061042786
Total elapsed time: 01h 00m 14s


In [22]:
# Evaluate the top three models that used only one activation equation in the hidden layers
ein_model_reports = []
ein_top_models = ein_tuner.get_best_models(3)
for i, model in enumerate(ein_top_models):
    print("=========================\n", f"Beginning Optimization for Top EIN Model Number {i}", "\n=========================")
    model.fit(X_ein_train, y_ein_train, epochs=500, callbacks=[checkpoint_callback(f"top_ein_model_{i}")], verbose=2)
    model_loss, model_accuracy = model.evaluate(X_ein_test,y_ein_test,verbose=2)
    ein_model_reports.append(f"Top EIN Model Number {i} =>  Loss: {model_loss}, Accuracy: {model_accuracy}")

for report in ein_model_reports:
    print(report)


Streaming output truncated to the last 5000 lines.
Epoch 254/500

Epoch 254: saving model to checkpoints/top_ein_model_0-weights.254.hdf5
804/804 - 1s - loss: 0.4757 - accuracy: 0.7683 - 1s/epoch - 1ms/step
Epoch 255/500

Epoch 255: saving model to checkpoints/top_ein_model_0-weights.255.hdf5
804/804 - 1s - loss: 0.4776 - accuracy: 0.7674 - 1s/epoch - 1ms/step
Epoch 256/500

Epoch 256: saving model to checkpoints/top_ein_model_0-weights.256.hdf5
804/804 - 1s - loss: 0.4760 - accuracy: 0.7687 - 1s/epoch - 1ms/step
Epoch 257/500

Epoch 257: saving model to checkpoints/top_ein_model_0-weights.257.hdf5
804/804 - 1s - loss: 0.4756 - accuracy: 0.7695 - 1s/epoch - 1ms/step
Epoch 258/500

Epoch 258: saving model to checkpoints/top_ein_model_0-weights.258.hdf5
804/804 - 1s - loss: 0.4759 - accuracy: 0.7672 - 1s/epoch - 1ms/step
Epoch 259/500

Epoch 259: saving model to checkpoints/top_ein_model_0-weights.259.hdf5
804/804 - 1s - loss: 0.4753 - accuracy: 0.7687 - 1s/epoch - 1ms/step
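
Because a checkpoint is written after every epoch, one option is to scan the training log for the epoch with the lowest loss and load that checkpoint's weights. The sketch below parses the six epochs shown above with a regular expression. The `best_epoch` helper is hypothetical, and the loss here is the training loss, so it says nothing definite about test performance.

```python
import re

# The six epochs shown in the log above (checkpoint messages omitted)
log = """\
Epoch 254/500
804/804 - 1s - loss: 0.4757 - accuracy: 0.7683 - 1s/epoch - 1ms/step
Epoch 255/500
804/804 - 1s - loss: 0.4776 - accuracy: 0.7674 - 1s/epoch - 1ms/step
Epoch 256/500
804/804 - 1s - loss: 0.4760 - accuracy: 0.7687 - 1s/epoch - 1ms/step
Epoch 257/500
804/804 - 1s - loss: 0.4756 - accuracy: 0.7695 - 1s/epoch - 1ms/step
Epoch 258/500
804/804 - 1s - loss: 0.4759 - accuracy: 0.7672 - 1s/epoch - 1ms/step
Epoch 259/500
804/804 - 1s - loss: 0.4753 - accuracy: 0.7687 - 1s/epoch - 1ms/step
"""

def best_epoch(text):
    """Return (epoch, loss) for the lowest training loss in a Keras verbose=2 log."""
    epochs = [int(n) for n in re.findall(r"^Epoch (\d+)/\d+", text, flags=re.MULTILINE)]
    losses = [float(v) for v in re.findall(r"loss: ([0-9.]+)", text)]
    # Assumes exactly one loss line per epoch header
    return min(zip(epochs, losses), key=lambda pair: pair[1])

epoch, loss = best_epoch(log)
print(f"checkpoints/top_ein_model_0-weights.{epoch}.hdf5", loss)
# prints: checkpoints/top_ein_model_0-weights.259.hdf5 0.4753
```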

In [23]:
## These splits are no longer needed; delete them to free memory
## before the next optimization attempt.
del X_ein_train, X_ein_test, y_ein_train, y_ein_test

## Optimization Update:
Results so far:
*   Ending results for X_ein are slightly better than the original Keras Tuner results.

Next tasks to attempt:
*   Add the EIN and NAME columns to the X features. They are added separately, because both serve the same purpose as unique identifier columns, which gives 2 data sets:
    *   Already completed: X + EIN = X_ein
    *   X + NAME = X_name

In [24]:
## Encode the NAME column as integers so the model can accept it
encoded_df["NAME"] = LabelEncoder().fit_transform(df["NAME"])

## Train test split
X_name_train, X_name_test, y_name_train, y_name_test = train_test_split(encoded_df, df["IS_SUCCESSFUL"])

## Scaling the data: fit on the training split only, then reuse the same scaler for the test split
name_scaler = StandardScaler().fit(X_name_train)
X_name_train = pd.DataFrame(name_scaler.transform(X_name_train), index=X_name_train.index, columns=X_name_train.columns)
X_name_test = pd.DataFrame(name_scaler.transform(X_name_test), index=X_name_test.index, columns=X_name_test.columns)
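
LabelEncoder assigns each distinct string an integer according to the sorted order of the unique values. Identical names therefore share a code, and the codes carry an alphabetical ordering that has no inherent meaning for funding success. A plain-Python sketch of that mapping follows, using the three names from the data preview; the repeated entry and the `label_encode` helper are hypothetical.

```python
def label_encode(values):
    """Map each distinct value to its index in the sorted list of unique values."""
    classes = sorted(set(values))
    lookup = {value: code for code, value in enumerate(classes)}
    return [lookup[value] for value in values], classes

names = [
    "BLUE KNIGHTS MOTORCYCLE CLUB",
    "AMERICAN CHESAPEAKE CLUB CHARITABLE TR",
    "ST CLOUD PROFESSIONAL FIREFIGHTERS",
    "BLUE KNIGHTS MOTORCYCLE CLUB",  # repeated on purpose for illustration
]
codes, classes = label_encode(names)
print(codes)  # [1, 0, 2, 1]
```

The repeated name receives the same code both times, which is the signal the NAME feature can add: organizations that appear more than once become recognizable to the model.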

In [25]:
encoded_df


Unnamed: 0,STATUS,SPECIAL_CONSIDERATIONS,ASK_AMT,APPLICATION_TYPE_OTHER,APPLICATION_TYPE_T10,APPLICATION_TYPE_T19,APPLICATION_TYPE_T3,APPLICATION_TYPE_T4,APPLICATION_TYPE_T5,APPLICATION_TYPE_T6,...,INCOME_AMT_0,INCOME_AMT_1-9999,INCOME_AMT_10000-24999,INCOME_AMT_100000-499999,INCOME_AMT_10M-50M,INCOME_AMT_1M-5M,INCOME_AMT_25000-99999,INCOME_AMT_50M+,INCOME_AMT_5M-10M,NAME
0,1.0,0.0,5000.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,...,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2237
1,1.0,0.0,108590.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,...,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,860
2,1.0,0.0,5000.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,...,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,16310
3,1.0,0.0,6692.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,...,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,16127
4,1.0,0.0,142590.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,...,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,6807
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
34294,1.0,0.0,5000.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,...,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,17335
34295,1.0,0.0,5000.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,...,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,8718
34296,1.0,0.0,5000.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,...,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,13952
34297,1.0,0.0,5000.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,...,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,890


In [26]:
# Uses a model builder that allows only one activation function across the hidden layers
name_tuner = kt.Hyperband(
    create_model,
    objective="val_accuracy",
    factor=3,
    max_epochs=50,
    hyperband_iterations=2,
    overwrite=True)

In [27]:
name_tuner.search(X_name_train,y_name_train,epochs=20,validation_data=(X_name_test,y_name_test))

Trial 180 Complete [00h 01m 22s]
val_accuracy: 0.756851315498352

Best val_accuracy So Far: 0.7651311755180359
Total elapsed time: 01h 01m 00s


In [29]:
# Retrain and evaluate the top three models that used only one activation function in the hidden layers
name_model_reports = []
name_top_models = name_tuner.get_best_models(3)
for i, model in enumerate(name_top_models):
    print("=========================\n", f"Beginning Optimization for Top Name Model Number {i}", "\n=========================")
    model.fit(X_name_train, y_name_train, epochs=500, callbacks=checkpoint_callback(f"top_name_model_{i}"), verbose=1)
    model_loss, model_accuracy = model.evaluate(X_name_test,y_name_test,verbose=2)
    name_model_reports.append(f"Top Name Model Number {i} =>  Loss: {model_loss}, Accuracy: {model_accuracy}")


print("=========================\n", "Final Reports for Top 3 Name Models", "\n=========================")
for report in name_model_reports:
    print(report)

 Beginning Optimization for Top Name Model Number 0 
Epoch 1/500

Streaming output truncated to the last 5000 lines.
Epoch 255/500
Epoch 255: saving model to checkpoints/top_name_model_0-weights.255.hdf5
Epoch 256/500
Epoch 256: saving model to checkpoints/top_name_model_0-weights.256.hdf5
Epoch 257/500
Epoch 257: saving model to checkpoints/top_name_model_0-weights.257.hdf5
Epoch 258/500
Epoch 258: saving model to checkpoints/top_name_model_0-weights.258.hdf5
Epoch 259/500
Epoch 259: saving model to checkpoints/top_name_model_0-weights.259.hdf5
Epoch 260/500
Epoch 260: saving model to checkpoints/top_name_model_0-weights.260.hdf5
Epoch 261/500
Epoch 261: saving model to checkpoints/top_name_model_0-weights.261.hdf5
Epoch 262/500
Epoch 262: saving model to checkpoints/top_name_model_0-weights.262.hdf5
Epoch 263/500
Epoch 263: saving model to checkpoints/top_name_model_0-weights.263.hdf5
Epoch 264/500
Epoch 264: saving model to checkpoints/top_name_model_0-weights.264.hdf5
Epoch 265/500
Epoch 265: saving model to checkpoints/top_name_mod

# Ending Results:

1. **Optimizing Attempt:** Use the Keras Tuner to find the best number of hidden layers and neurons.
    *   Minimum number of hidden layers: 2
    *   Maximum number of hidden layers: 6
    *   Minimum number of neurons per layer: 1
    *   Maximum number of neurons per layer: input_dims * 1.75
    *   Take the top 3 models the Keras Tuner finds and train them again for 500 epochs before evaluating them on the test data.

    *   **Results:** This attempt boils down to changing the number of hidden layers and the number of neurons per layer while using only one activation function in the hidden layers. The Keras Tuner tested 180 different models; I took the 3 best-performing ones and retrained them for 500 epochs. None of the 3 reached the required accuracy goal on the test data.

2. **Optimizing Attempt:** Run the same Keras Tuner with one change: allow it to pick a different activation function for each hidden layer every time it builds a new model. The Keras Tuner summary shows only one activation function per model, the one for the most recently created hidden layer, rather than all of the functions the model contains.

    *   **Results:** Ending results for the top 3 models of both Keras Tuners were very similar. Using multiple activation functions in the hidden layers did not appear to improve the model.

3. **Optimizing Attempt:** Use encoded_df instead of std_scaled_df, because only 1 of the 47 columns is not a true/false numerical value. Keep the ASK_AMT column scaled with the standard scaler, because it is not a true/false value.

    *   **Results:** The encoded true/false data did not do as well as the scaled version of the data: accuracy for the top three ranged from 63% to 68%, instead of the roughly 72% from the previous 2 attempts. The **Rabbit Hole** results, with ASK_AMT left unscaled, are even worse. This did not work.

4. **Optimizing Attempt:** Add the EIN and NAME columns to the X features. They are added separately, because both serve the same purpose as unique identifier columns. This gives 2 data sets, X_ein and X_name.
    *   **X_ein:** Ending results are very slightly better than the first 2 optimized models, 73% instead of the varying 72.X%.
    *   **X_name:** Ending results are the best of all the attempts, roughly 76% to 77% (see the rankings below).
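
The search bounds from attempt 1 and the activation change in attempt 2 can be sketched as a random draw in plain Python. This is a stand-in for illustration, not the notebook's `create_model` function or Keras Tuner itself. The `ACTIVATIONS` list, the `sample_architecture` helper, and rounding the neuron bound down are all assumptions.

```python
import math
import random

# Hypothetical candidate list; the notebook's actual choices are not shown here
ACTIVATIONS = ["relu", "tanh", "sigmoid"]

def sample_architecture(input_dims, rng, per_layer_activation=False):
    """Draw one random architecture inside the bounds listed in attempt 1."""
    max_neurons = math.floor(input_dims * 1.75)  # rounding down is an assumption
    n_layers = rng.randint(2, 6)                 # 2 to 6 hidden layers, inclusive
    shared = rng.choice(ACTIVATIONS)             # used when all layers share one function
    layers = []
    for _ in range(n_layers):
        activation = rng.choice(ACTIVATIONS) if per_layer_activation else shared
        layers.append({"units": rng.randint(1, max_neurons), "activation": activation})
    return layers

rng = random.Random(0)
single = sample_architecture(47, rng)                             # attempt 1 style
mixed = sample_architecture(47, rng, per_layer_activation=True)   # attempt 2 style

print(single)
print(mixed)
print(math.floor(47 * 1.75), math.floor(48 * 1.75))  # upper neuron bound: 82 84
```

Adding one input column (47 to 48) raises the upper neuron bound from 82 to 84 under this rounding assumption, so the X_ein and X_name searches cover a slightly larger space than the first two attempts.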

## Final Model Rankings: 
(ranked first by Accuracy then by Loss)

1. **Name Model Number 1** =>  Loss: 60.55%, Accuracy: 77.06%
2. Name Model Number 2 =>  Loss: 72.37%, Accuracy: 76.75%
3. Name Model Number 0 =>  Loss: 62.25%, Accuracy: 75.97%
4. EIN Model Number 0 =>  Loss: 82%, Accuracy: 73.41%
5. EIN Model Number 1 =>  Loss: 62.14%, Accuracy: 73.18%
6. First Model Number 2 =>  Loss: 64.70%, Accuracy: 72.92%
7. Original Model =>  Loss: 56.09%, Accuracy: 72.87%
8. Multi Model Number 0 =>  Loss: 62.98%, Accuracy: 72.80%
9. First Model Number 1 =>  Loss: 57.44%, Accuracy: 72.76%
10. EIN Model Number 2 =>  Loss: 85.59%, Accuracy: 72.69%
11. First Model Number 0 =>  Loss: 57.94%, Accuracy: 72.68%
12. Multi Model Number 2 =>  Loss: 70.04%, Accuracy: 72.66%
13. Multi Model Number 1 =>  Loss: 67.24%, Accuracy: 72.60%
14. Encoded_1 Model Number 2 =>  Loss: 60.15%, Accuracy: 71.40%
15. Encoded_1 Model Number 1 =>  Loss: 61.50%, Accuracy: 67.16%
16. Encoded_1 Model Number 0 =>  Loss: 63.48%, Accuracy: 65.37%
17. Encoded_2 Model Number 2 =>  Loss: 1615230.625%, Accuracy: 53.36%
18. Encoded_2 Model Number 1 =>  Loss: 161730.0%, Accuracy: 53.36%
19. Encoded_2 Model Number 0 =>  Loss: 118593.0%, Accuracy: 46.63%
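
The ordering rule used for these rankings (accuracy first, then loss) is a two-key sort: descending on accuracy, ascending on loss. A minimal sketch with five rows copied from the list; the tuple layout is an assumption made for the example.

```python
# (label, loss %, accuracy %) copied from the rankings
results = [
    ("EIN Model Number 1", 62.14, 73.18),
    ("Name Model Number 0", 62.25, 75.97),
    ("Name Model Number 1", 60.55, 77.06),
    ("EIN Model Number 0", 82.00, 73.41),
    ("Name Model Number 2", 72.37, 76.75),
]

# Higher accuracy first; lower loss breaks ties between equal accuracies
ranked = sorted(results, key=lambda row: (-row[2], row[1]))

for position, (label, loss, accuracy) in enumerate(ranked, start=1):
    print(f"{position}. {label} => Loss: {loss:.2f}%, Accuracy: {accuracy:.2f}%")
```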