# Venture Funding with Deep Learning

You work as a risk management associate at Alphabet Soup, a venture capital firm. Alphabet Soup’s business team receives many funding applications from startups every day. This team has asked you to help them create a model that predicts whether applicants will be successful if funded by Alphabet Soup.

The business team has given you a CSV containing more than 34,000 organizations that have received funding from Alphabet Soup over the years. With your knowledge of machine learning and neural networks, you decide to use the features in the provided dataset to create a binary classifier model that will predict whether an applicant will become a successful business. The CSV file contains a variety of information about these businesses, including whether or not they ultimately became successful.

## Instructions:

The steps for this challenge are broken out into the following sections:

* Prepare the data for use on a neural network model.

* Compile and evaluate a binary classification model using a neural network.

* Optimize the neural network model.

### Prepare the Data for Use on a Neural Network Model 

Using your knowledge of Pandas and scikit-learn’s `StandardScaler()`, preprocess the dataset so that you can use it to compile and evaluate the neural network model later.

Open the starter code file, and complete the following data preparation steps:

1. Read the `applicants_data.csv` file into a Pandas DataFrame. Review the DataFrame, looking for categorical variables that will need to be encoded, as well as columns that could eventually define your features and target variables.   

2. Drop the “EIN” (Employer Identification Number) and “NAME” columns from the DataFrame, because they are not relevant to the binary classification model.
 
3. Encode the dataset’s categorical variables using `OneHotEncoder`, and then place the encoded variables into a new DataFrame.

4. Add the original DataFrame’s numerical variables to the DataFrame containing the encoded variables.

> **Note** To complete this step, you will employ the Pandas `concat()` function that was introduced earlier in this course. 

5. Using the preprocessed data, create the features (`X`) and target (`y`) datasets. The target dataset should be defined by the preprocessed DataFrame column “IS_SUCCESSFUL”. The remaining columns should define the features dataset. 

6. Split the features and target sets into training and testing datasets.

7. Use scikit-learn's `StandardScaler` to scale the features data.

### Compile and Evaluate a Binary Classification Model Using a Neural Network

Use your knowledge of TensorFlow to design a binary classification deep neural network model. This model should use the dataset’s features to predict whether an Alphabet Soup–funded startup will be successful. Consider the number of inputs before determining the number of layers that your model will contain or the number of neurons on each layer. Then, compile and fit your model. Finally, evaluate your binary classification model to calculate the model’s loss and accuracy.
 
To do so, complete the following steps:

1. Create a deep neural network by assigning the number of input features, the number of layers, and the number of neurons on each layer using TensorFlow’s Keras.

> **Hint** You can start with a two-layer deep neural network model that uses the `relu` activation function for both layers.

2. Compile and fit the model using the `binary_crossentropy` loss function, the `adam` optimizer, and the `accuracy` evaluation metric.

> **Hint** When fitting the model, start with a small number of epochs, such as 20, 50, or 100.

3. Evaluate the model using the test data to determine the model’s loss and accuracy.

4. Save and export your model to an HDF5 file, and name the file `AlphabetSoup.h5`. 

### Optimize the Neural Network Model

Using your knowledge of TensorFlow and Keras, optimize your model to improve the model's accuracy. Even if you do not successfully achieve a better accuracy, you'll need to demonstrate at least two attempts to optimize the model. You can include these attempts in your existing notebook. Or, you can make copies of the starter notebook in the same folder, rename them, and code each model optimization in a new notebook. 

> **Note** You will not lose points if your model does not achieve a high accuracy, as long as you make at least two attempts to optimize the model.

To do so, complete the following steps:

1. Define at least three deep neural network models (the original plus two optimization attempts). With each, try to improve on your first model’s predictive accuracy.

> **Rewind** Recall that perfect accuracy has a value of 1, so accuracy improves as its value moves closer to 1. To optimize your model for a predictive accuracy as close to 1 as possible, you can use any or all of the following techniques:
>
> * Adjust the input data by dropping different features columns to ensure that no variables or outliers confuse the model.
>
> * Add more neurons (nodes) to a hidden layer.
>
> * Add more hidden layers.
>
> * Use different activation functions for the hidden layers.
>
> * Add to or reduce the number of epochs in the training regimen.

2. After finishing your models, display the accuracy scores achieved by each model, and compare the results.

3. Save each of your models as an HDF5 file.


In [None]:
# Imports
import pandas as pd
from pathlib import Path
import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler,OneHotEncoder

---

## Prepare the data to be used on a neural network model

### Step 1: Read the `applicants_data.csv` file into a Pandas DataFrame. Review the DataFrame, looking for categorical variables that will need to be encoded, as well as columns that could eventually define your features and target variables.  


In [None]:
# Read the applicants_data.csv file from the Resources folder into a Pandas DataFrame
applicant_data_df = pd.read_csv(                  # Read the specified csv file into the dataframe
    Path("../Resources/applicants_data.csv")   # using a normalized version of the path and filename for the operating system on which the code is running
    )

# Review the DataFrame
display( "*** applicant_data_df Head ***", applicant_data_df.head() );                        # Show the first few rows of the dataframe
display( "*** Null Count by Column ***", applicant_data_df.isna().sum() );  # Show the number of nulls in each column out of interest to determine the quality of the data

In [None]:
# Review the data types associated with the columns
display( "*** applicant_data_df Column Data Types ***", applicant_data_df.dtypes ) # Display each column's data type

### Step 2: Drop the “EIN” (Employer Identification Number) and “NAME” columns from the DataFrame, because they are not relevant to the binary classification model.

In [None]:
# Drop the 'EIN' and 'NAME' columns from the DataFrame
applicant_data_df.drop(columns=["EIN", "NAME"], inplace=True)  # Dropping the EIN and NAME columns as they are not relevant to this binary classification

# Review the DataFrame
display( "*** applicant_data_df Head ***", applicant_data_df.head() );                        # Show the first few rows of the dataframe

In [None]:
# Get a rough idea of the type (categories) of data in each column 
display( "*** applicant_data_df dataset categories by Column ***" )
for column in applicant_data_df:                           # Iterate through each column name of the data frame
    print(column, applicant_data_df[column].unique())      #    showing the column name and the list of unique values in the column

#### Initial Assessment of the Data

The STATUS and IS_SUCCESSFUL columns are already numeric (int64 data type) and contain binary data (0 or 1), so they are already suitable as input to the neural network. The range of the data also gives an indication of which activation functions might suit our model. For example, since the raw data contains no negative numbers, Leaky ReLU might not work any better than standard ReLU, although the features will include negative values once they are scaled with `StandardScaler`, so this is worth testing rather than assuming. There is also continuous numeric data (ASK_AMT), which will need to be scaled, and a range of categorical data, which will need to be encoded.
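
The ReLU comparison above can be checked by hand. The following is a minimal sketch, not part of the notebook's model: `relu` and `leaky_relu` are illustrative helpers, and the `negative_slope` of 0.01 is an assumed value. The two functions agree on non-negative inputs and differ only on negative ones.

```python
def relu(x: float) -> float:
    """Standard ReLU: pass positive values through, clamp negatives to zero."""
    return max(0.0, x)


def leaky_relu(x: float, negative_slope: float = 0.01) -> float:
    """Leaky ReLU: like ReLU, but negatives are scaled by a small slope (assumed 0.01)."""
    return x if x >= 0 else negative_slope * x


# Illustrative inputs only; they are not taken from applicants_data.csv
sample_inputs = [-2.0, 0.0, 1.0]
relu_outputs = [relu(x) for x in sample_inputs]
leaky_outputs = [leaky_relu(x) for x in sample_inputs]

for x, r, l in zip(sample_inputs, relu_outputs, leaky_outputs):
    print(f"x={x:5.1f}  relu={r:.2f}  leaky_relu={l:.2f}")
```

Because standardised features and hidden-layer pre-activations can be negative, the difference on the first input is the case that would matter in practice.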

### Step 3: Encode the dataset’s categorical variables using `OneHotEncoder`, and then place the encoded variables into a new DataFrame.
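
As a rough illustration of what one-hot encoding produces, here is a plain-Python sketch. The `one_hot` helper, the `GROUP` column name, and the sample values are all hypothetical. It only mimics the idea behind `OneHotEncoder` (one 0/1 column per category, named `<column>_<category>`) and is not the library's implementation.

```python
def one_hot(values, column_name):
    """Return (feature_names, rows) with one 0/1 column per distinct value."""
    categories = sorted(set(values))  # one output column per category, in sorted order
    feature_names = [f"{column_name}_{category}" for category in categories]
    rows = [
        [1.0 if value == category else 0.0 for category in categories]
        for value in values
    ]
    return feature_names, rows


# Hypothetical column name and values, used only for illustration
feature_names, rows = one_hot(["B", "A", "B", "C"], "GROUP")

print(feature_names)
for row in rows:
    print(row)
```

Each input row ends up with exactly one `1.0`, in the column that matches its category.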

In [None]:
# Create a list of categorical variables 
categorical_variables = list(applicant_data_df.dtypes[applicant_data_df.dtypes == "object"].index)  # Create a list of column names which have a data type of object 

# Display the categorical variables list
display( "*** categorical_variables ***", categorical_variables )       # Show the list of categorical variables

In [None]:
# Create a OneHotEncoder instance
enc = OneHotEncoder(sparse_output=False)   # Create a binary column for each category, returning a dense array (not a sparse matrix) that can be loaded directly into a DataFrame

In [None]:
# Encode the categorical variables using OneHotEncoder
encoded_data = enc.fit_transform(applicant_data_df[categorical_variables]) # Fit to data, then transform it using the OneHotEncoder fit_transform method

In [None]:
# Create a DataFrame with the encoded variables
encoded_df =  pd.DataFrame( encoded_data, columns=enc.get_feature_names_out(categorical_variables) ) # Create a new data frame with the transformed data and transformed column names

# Review the DataFrame
display( "*** encoded_df Head ***", encoded_df.head() );                        # Show the first few rows of the dataframe

### Step 4: Add the original DataFrame’s numerical variables to the DataFrame containing the encoded variables.

In [None]:
# Create a new data frame `preprocessed_df` combining the encoded variables and the numeric variables from the original data frame

# Get the numerical variables from the original DataFrame 
numerical_variables_df = applicant_data_df.drop( columns=categorical_variables ) # Dropping categorical columns from the original df gives us the numeric variables

# Create a new data frame with the transformed data and transformed column names
preprocessed_df = pd.concat( [numerical_variables_df, encoded_df], axis=1 )

# Review the DataFrame
display( "*** preprocessed_df Head ***", preprocessed_df.head() );                        # Show the first few rows of the dataframe

### Step 5: Using the preprocessed data, create the features (`X`) and target (`y`) datasets. The target dataset should be defined by the preprocessed DataFrame column “IS_SUCCESSFUL”. The remaining columns should define the features dataset. 



In [None]:
# Define the target set y using the IS_SUCCESSFUL column
y = preprocessed_df["IS_SUCCESSFUL"] # Use the IS_SUCCESSFUL column as the y (target) dataset

# Display a sample of y
display( "*** Sample of 'y' (Target dataset) ***", y[:5] );                        # Show the first few rows

In [None]:
# Define features set X by selecting all columns but IS_SUCCESSFUL
X = preprocessed_df.drop(columns=["IS_SUCCESSFUL"]) # Use the other columns as the X (dataset) dropping IS_SUCCESSFUL since it is used as the y (target)

# Review the features DataFrame
display( "*** Sample of 'X' (Features dataset) ***", X.head() );                        # Show the first few rows of the dataframe

### Step 6: Split the features and target sets into training and testing datasets.
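
When neither `test_size` nor `train_size` is given, `train_test_split` holds out 25% of the rows for testing. The sketch below imitates that idea with the standard library only. `split_indices` is a hypothetical helper, the rounding rule is an assumption, and its shuffle will not reproduce scikit-learn's exact row selection. It is only meant to show why a fixed seed makes the split repeatable.

```python
import math
import random


def split_indices(n_rows, test_fraction=0.25, seed=1):
    """Shuffle row indices with a fixed seed, then split them into (train, test)."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)        # same seed -> same shuffle every time
    n_test = math.ceil(n_rows * test_fraction)  # assumed rounding rule for the test size
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = split_indices(8)
print("train rows:", train_idx)
print("test rows: ", test_idx)
```

Calling `split_indices(8)` again returns the same two lists, which is the property that `random_state=1` provides in the notebook.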


In [None]:
# Split the preprocessed data into a training and testing dataset
# Assign the function a random_state equal to 1 to maintain reproducible results across multiple function calls
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1) # Split the data into training and test set, but maintain the results across multiple calls to the function

### Step 7: Use scikit-learn's `StandardScaler` to scale the features data.
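
Standardisation subtracts each column's mean and divides by its standard deviation. The sketch below shows the arithmetic for a single column using the standard library. `fit_scaler` and `transform` are hypothetical helpers and the numbers are made up. The key point it illustrates is that the mean and standard deviation are computed from the training data only and then reused on the test data.

```python
from statistics import fmean, pstdev


def fit_scaler(column):
    """Compute the mean and population standard deviation of one column."""
    return fmean(column), pstdev(column)


def transform(column, mean, std):
    """Centre and scale a column using previously fitted statistics."""
    return [(value - mean) / std for value in column]


# Made-up values for a single numeric feature
train_column = [2.0, 4.0, 6.0, 8.0]
test_column = [5.0, 10.0]

mean, std = fit_scaler(train_column)               # statistics come from training data only
train_scaled = transform(train_column, mean, std)
test_scaled = transform(test_column, mean, std)    # test data reuses the training statistics

print("mean:", mean, "std:", std)
print("train scaled:", train_scaled)
print("test scaled: ", test_scaled)
```

Fitting on the training set alone keeps information about the test set out of the preprocessing step.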

In [None]:
# Create a StandardScaler instance
scaler = StandardScaler() # Instantiate the StandardScaler

# Fit the scaler to the features training dataset
X_scaler = scaler.fit(X_train)   # Use the scaler to fit the X (Features) training (Compute the mean and std to be used for later scaling)

# Scale the features training and testing datasets using the fitted scaler
X_train_scaled = X_scaler.transform(X_train)  # Perform standardization by centering and scaling X (Features) training dataset to get a transformed (scaled) dataframe
X_test_scaled  = X_scaler.transform(X_test)   # Perform standardization by centering and scaling X (Features) test dataset to get a transformed (scaled) dataframe  

---

## Compile and Evaluate a Binary Classification Model Using a Neural Network

### Step 1: Create a deep neural network by assigning the number of input features, the number of layers, and the number of neurons on each layer using TensorFlow’s Keras.

> **Hint** You can start with a two-layer deep neural network model that uses the `relu` activation function for both layers.
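
To get a feel for how layer sizes translate into model size, the sketch below counts trainable parameters for a stack of fully connected layers: each layer has one weight per input-unit pair plus one bias per unit. It uses the sizing rule from the cells that follow (first hidden layer = features + 1, second hidden layer = the larger of 4 and the square root of the first). The `dense_params` helper and the feature count of 10 are hypothetical; the real count comes from the preprocessed data.

```python
def dense_params(n_inputs, n_units):
    """Trainable parameters in a fully connected layer: weights plus one bias per unit."""
    return n_inputs * n_units + n_units


n_features = 10                          # hypothetical feature count, for illustration only
layer1 = n_features + 1                  # first hidden layer: features + 1
layer2 = max(int(layer1 ** 0.5), 4)      # second hidden layer: sqrt of layer 1, but at least 4
layer_sizes = [n_features, layer1, layer2, 1]  # final 1 is the single output neuron

params_per_layer = [
    dense_params(n_in, n_out) for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]
total_params = sum(params_per_layer)

print("layer sizes:", layer_sizes)
print("parameters per layer:", params_per_layer)
print("total parameters:", total_params)
```

The per-layer figures should line up with the `Param #` column that `nn.summary()` prints for a model with the same sizes.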


In [None]:
# Define the number of inputs (features) to the model
number_input_features = X_train.shape[1]   # Get the count of columns in the training feature set

# Review the number of features
display( f"Number of Input Features is: {number_input_features}" )

In [None]:
# Define the number of neurons in the output layer
number_output_neurons = 1  # As we are looking for a binary outcome classification, 1 neuron is appropriate

In [None]:
# Define the number of hidden nodes (neurons) for the first hidden layer
number_hidden_nodes_layer1 = number_input_features + 1 # Use the number of features plus 1 as the number of neurons in the first hidden layer

# Review the number hidden nodes in the first layer
display( f"Number of Neurons (Hidden Nodes) in Layer 1 is: {number_hidden_nodes_layer1}" ) # Display a label and number_hidden_nodes_layer1 value

In [None]:
# Define the number of hidden nodes for the second hidden layer
number_hidden_nodes_layer2 = max(int(number_hidden_nodes_layer1**0.5), 4) # Set layer 2 Neurons to the square root of layer 1  OR 4 whichever is greater

# Review the number hidden nodes in the second layer

display( f"Number of Neurons (Hidden Nodes) in Layer 2 is: {number_hidden_nodes_layer2}" ) # Display a label and number_hidden_nodes_layer2 value

In [None]:
# Create the Sequential model instance
nn = Sequential()   # Instantiate the Sequential model to nn 

In [None]:
# Add the first hidden layer
nn.add(Dense(units=number_hidden_nodes_layer1, input_dim=number_input_features, activation="relu")) # Add the first hidden layer to the Sequential model, declaring the number of input features

In [None]:
# Add the second hidden layer
# Add the second Dense layer specifying the number of hidden nodes and the activation function
nn.add(Dense(units=number_hidden_nodes_layer2, activation="relu")) # Add layer 2 as a hidden layer which uses the ReLU activation function

In [None]:
# Add the output layer to the model specifying the number of output neurons and activation function of sigmoid
nn.add(Dense(number_output_neurons, activation="sigmoid")) # Add the final (output) layer with a sigmoid activation, which suits predicting a probability (that the applicant will be successful)

In [None]:
# Display the Sequential model summary
display( "*** Base Model Summary Information ***" )
nn.summary() # Display the summary of the Base Model

### Step 2: Compile and fit the model using the `binary_crossentropy` loss function, the `adam` optimizer, and the `accuracy` evaluation metric.
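
Binary cross-entropy penalises a prediction more heavily the further its probability is from the true label. The sketch below computes the mean loss by hand. `binary_crossentropy` here is an illustrative helper, not the Keras function, and the `eps` clipping value of 1e-7 is an assumption chosen to avoid `log(0)`.

```python
import math


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for label, prob in zip(y_true, y_prob):
        prob = min(max(prob, eps), 1.0 - eps)  # clip so log() never receives 0
        total += -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))
    return total / len(y_true)


labels = [1, 0]
uncertain_loss = binary_crossentropy(labels, [0.5, 0.5])   # model is unsure
confident_loss = binary_crossentropy(labels, [0.9, 0.1])   # confident and correct
wrong_loss = binary_crossentropy(labels, [0.1, 0.9])       # confident and wrong

print("uncertain:", uncertain_loss)
print("confident:", confident_loss)
print("wrong:    ", wrong_loss)
```

An unsure prediction of 0.5 for every row gives a loss of ln 2, so a trained model's loss should fall below that baseline.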


In [None]:
# Compile the Sequential model
nn.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])   # Using binary_crossentropy and the Adam optimiser 

In [None]:
# Fit the model using 50 epochs and the training data
fit_model = nn.fit(X_train_scaled, y_train, epochs=50) # Fit the model

### Step 3: Evaluate the model using the test data to determine the model’s loss and accuracy.
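
The accuracy reported for a binary classifier is the fraction of rows whose thresholded prediction matches the label. The sketch below assumes a threshold of 0.5 on the sigmoid output; `binary_accuracy` is a hypothetical helper and the labels and probabilities are made up.

```python
def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of predictions that match the labels after thresholding."""
    predictions = [1 if prob > threshold else 0 for prob in y_prob]
    correct = sum(1 for label, pred in zip(y_true, predictions) if label == pred)
    return correct / len(y_true)


# Made-up labels and predicted probabilities
y_true_sample = [1, 0, 1, 1]
y_prob_sample = [0.9, 0.2, 0.4, 0.7]

accuracy = binary_accuracy(y_true_sample, y_prob_sample)
print("accuracy:", accuracy)
```

Here the third row is predicted as 0 (0.4 is below the threshold) while its label is 1, so three of the four predictions are correct.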


In [None]:
# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn.evaluate(X_test_scaled, y_test, verbose=2) # Evaluate the performance of the Base Model

# Display the model loss and accuracy results
display("*** Base Model Evaluation Results ***" )
display(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

### Step 4: Save and export your model to an HDF5 file, and name the file `AlphabetSoup.h5`. 


In [None]:
# Set the model's file path
base_model_file_path = Path("../Resources/AlphabetSoup.h5")  # Set the save filename for the base model into the ../Resources folder

# Export the base model to a HDF5 file format
nn.save(base_model_file_path)  # Save the model to the specified path

---

## Optimize the neural network model


### Step 1: Define at least three deep neural network models (the original plus two optimization attempts). With each, try to improve on your first model’s predictive accuracy.

> **Rewind** Recall that perfect accuracy has a value of 1, so accuracy improves as its value moves closer to 1. To optimize your model for a predictive accuracy as close to 1 as possible, you can use any or all of the following techniques:
>
> * Adjust the input data by dropping different features columns to ensure that no variables or outliers confuse the model.
>
> * Add more neurons (nodes) to a hidden layer.
>
> * Add more hidden layers.
>
> * Use different activation functions for the hidden layers.
>
> * Add to or reduce the number of epochs in the training regimen.


#### Alternative Model 1

In [None]:
# Define the number of inputs (features) to the model
number_input_features_A1 = number_input_features  # Use the same number of input features as the base model

# Review the number of features
display( f"Number of Input Features (Alternative Model 1) is: {number_input_features_A1}" )

In [None]:
# Define the number of neurons in the output layer
number_output_neurons_A1 = number_output_neurons # Maintain the same number of output neurons as the base model

In [None]:
# Define the number of hidden nodes for the first hidden layer
number_hidden_nodes_layer1_A1 = number_hidden_nodes_layer1 # Use the same number of input nodes for layer 1 as the base model

# Review the number of hidden nodes in the first layer
display( f"Number of Neurons (Hidden Nodes) in Alternative Model 1 Layer 1 is: {number_hidden_nodes_layer1_A1}" ) # Display a label and the number of nodes

In [None]:
# Create the Sequential model instance
nn_A1 = Sequential()   # Instantiate the Sequential model to nn_A1 for the Alternative Model 

In [None]:
# First hidden layer
nn_A1.add(Dense(units=number_hidden_nodes_layer1_A1, input_dim=number_input_features, activation="gelu")) # Add the first hidden layer to the Sequential model, using GELU instead of ReLU

# Add the second hidden layer
# Set the number of nodes in the 2nd hidden layer
number_hidden_nodes_layer2_A1 = int((number_hidden_nodes_layer1_A1 + 1) / 2)   # For Layer 2 use half the number of nodes in Layer 1 (rounded up)
# Add the second Dense layer specifying the number of hidden nodes and the activation function
nn_A1.add(Dense(units=number_hidden_nodes_layer2_A1, activation="mish")) # Add layer 2 as a hidden layer using MISH

# Add the third hidden layer
# Set the number of nodes in the 3rd hidden layer
number_hidden_nodes_layer3_A1 = int((number_hidden_nodes_layer2_A1 + 1) / 2)   # For Layer 3 use half the number of nodes in Layer 2 (rounded up)
# Add the third Dense layer specifying the number of hidden nodes and the activation function
nn_A1.add(Dense(units=number_hidden_nodes_layer3_A1, activation="tanh")) # Add layer 3 as a hidden layer which uses Tanh instead of ReLU

# Output layer
# Add the output layer to the model specifying the number of output neurons and activation function of sigmoid
nn_A1.add(Dense(number_output_neurons_A1, activation="sigmoid")) # Add the final (output) layer with a sigmoid activation, which suits predicting a probability (that the applicant will be successful)

# Check the structure of the model
display( "*** Alternative Model 1 Summary Information ***" )
nn_A1.summary() # Display the summary of the Alternative Model 1

In [None]:
# Compile the Sequential model
nn_A1.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])   # Using binary_crossentropy and the Adam optimiser 

In [None]:
# Fit the model using 100 epochs and the training data
fit_model_A1 = nn_A1.fit(X_train_scaled, y_train, epochs=100) # Fit Alternative Model 1 (not the base model) for 100 epochs

#### Alternative Model 2

In [None]:
# Define the number of inputs (features) to the model
number_input_features_A2 = number_input_features # Use the same number of input features as the base model

# Review the number of features
display( f"Number of Input Features (Alternative Model 2) is: {number_input_features_A2}" )

In [None]:
# Define the number of neurons in the output layer
number_output_neurons_A2 = number_output_neurons # Maintain the same number of output neurons as the base model

In [None]:
# Define the number of hidden nodes for the first hidden layer
number_hidden_nodes_layer1_A2 = number_hidden_nodes_layer1 # Maintain the same number of neurons as the base model

# Review the number of hidden nodes in the first layer

display( f"Number of Neurons (Alternative Model 2) is: {number_hidden_nodes_layer1_A2}" )

In [None]:
# Create the Sequential model instance
nn_A2 = Sequential()   # Instantiate the Sequential model to nn_A2 for the Alternative Model 

In [None]:
# First hidden layer
nn_A2.add(Dense(units=number_hidden_nodes_layer1_A2, input_dim=number_input_features, activation="gelu")) # Add the first hidden layer to the Sequential model, using GELU instead of ReLU

# Add the second hidden layer
# Set the number of nodes in the 2nd hidden layer
number_hidden_nodes_layer2_A2 = int((number_hidden_nodes_layer1_A2 + 1) / 2)   # For Layer 2 use half the number of nodes in Layer 1 (rounded up)
# Add the second Dense layer specifying the number of hidden nodes and the activation function
nn_A2.add(Dense(units=number_hidden_nodes_layer2_A2, activation="gelu")) # Add layer 2 as a hidden layer using GELU

# Add the third hidden layer
# Set the number of nodes in the 3rd hidden layer
number_hidden_nodes_layer3_A2 = int((number_hidden_nodes_layer2_A2 + 1) / 2)   # For Layer 3 use half the number of nodes in Layer 2 (rounded up)
# Add the third Dense layer specifying the number of hidden nodes and the activation function
nn_A2.add(Dense(units=number_hidden_nodes_layer3_A2, activation="tanh")) # Add layer 3 as a hidden layer which uses Tanh instead of ReLU

# Output layer
# Add the output layer to the model specifying the number of output neurons and activation function of sigmoid
nn_A2.add(Dense(number_output_neurons_A2, activation="sigmoid")) # Add the final (output) layer with a sigmoid activation, which suits predicting a probability (that the applicant will be successful)

# Check the structure of the model
display( "*** Alternative Model 2 Summary Information ***" )
nn_A2.summary() # Display the summary of the Alternative Model 2

In [None]:
# Compile the model
nn_A2.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])   # Using binary_crossentropy and the Adam optimiser 

In [None]:
# Fit the model using 50 epochs and the training data
fit_model_A2 = nn_A2.fit(X_train_scaled, y_train, epochs=50) # Fit Alternative Model 2 for 50 epochs, keeping its history separate from the base model's

### Step 2: After finishing your models, display the accuracy scores achieved by each model, and compare the results.
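
One way to compare several models side by side is to collect each `(loss, accuracy)` pair in a dictionary and sort by accuracy. The sketch below is only an illustration: `rank_by_accuracy` is a hypothetical helper, and the model names and numbers are placeholders chosen to exercise it, not results from this notebook. In practice the values would come from each model's `evaluate` call.

```python
def rank_by_accuracy(results):
    """Return model names ordered from highest to lowest accuracy.

    `results` maps a model name to a (loss, accuracy) tuple.
    """
    return sorted(results, key=lambda name: results[name][1], reverse=True)


# Placeholder numbers purely to exercise the helper; NOT measured results
placeholder_results = {
    "model_a": (0.60, 0.70),
    "model_b": (0.58, 0.72),
    "model_c": (0.62, 0.69),
}

ranking = rank_by_accuracy(placeholder_results)
for name in ranking:
    loss, accuracy = placeholder_results[name]
    print(f"{name}: loss={loss:.2f}, accuracy={accuracy:.2f}")
```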

In [None]:
display("*** Original Model Results ***")

# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn.evaluate(X_test_scaled, y_test, verbose=2) # Evaluate the performance of the Base Model

# Display the model loss and accuracy results
display(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

In [None]:
display("*** Alternative Model 1 Results ***")

# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn_A1.evaluate(X_test_scaled, y_test, verbose=2) # Evaluate the performance of the Alternative Model 1

# Display the model loss and accuracy results
display(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

In [None]:
display("*** Alternative Model 2 Results ***")

# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn_A2.evaluate(X_test_scaled, y_test, verbose=2) # Evaluate the performance of the Alternative Model 2

# Display the model loss and accuracy results
display(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

### Step 3: Save each of your alternative models as an HDF5 file.


In [None]:
# Set the file path for the first alternative model
A1_model_file_path = Path("../Resources/AlphabetSoup_A1.h5")  # Set the save filename for alternative model 1 into the ../Resources folder

# Export alternative model 1 to a HDF5 file format
nn_A1.save(A1_model_file_path)  # Save the model to the specified path

In [None]:
# Set the file path for the second alternative model
A2_model_file_path = Path("../Resources/AlphabetSoup_A2.h5")  # Set the save filename for alternative model 2 into the ../Resources folder

# Export alternative model 2 to a HDF5 file format
nn_A2.save(A2_model_file_path)  # Save the model to the specified path