# Build Neural Network

This is a basic single-neuron, single-layer model using a test dataset.

As long as we do not change the structure or function of the neural network, this template can be used for both linear and nonlinear data.

The process of **model -> fit -> predict/transform** follows the same general steps across all of data science:

* Decide on a model
* Create a model instance
* Split into training and testing sets and preprocess the data
* Train/fit the model on the training data after creating and compiling the model ("train" and "fit" are used interchangeably in Python libraries as well as in the data field)
* Use the model for predictions and transformations
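
The steps above can be sketched with a toy model. This is a minimal illustration in plain Python, not taken from any library: the `ThresholdClassifier` class and the sample values are hypothetical and exist only to show the create, fit, and predict pattern.

```python
# Hypothetical toy model (not from any library) that follows the
# create -> fit -> predict convention described above.
class ThresholdClassifier:
    """Predicts 1 when a value exceeds the midpoint between the class means."""

    def __init__(self):
        self.threshold = None  # learned during fit

    def fit(self, X, y):
        zeros = [x for x, label in zip(X, y) if label == 0]
        ones = [x for x, label in zip(X, y) if label == 1]
        self.threshold = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
        return self

    def predict(self, X):
        return [1 if x > self.threshold else 0 for x in X]


# Decide on a model and create an instance
toy_model = ThresholdClassifier()

# Split into training and testing sets (made-up values)
toy_X_train, toy_y_train = [1.0, 2.0, 8.0, 9.0], [0, 0, 1, 1]
toy_X_test = [1.5, 8.5]

# Train/fit the model on the training data
toy_model.fit(toy_X_train, toy_y_train)

# Use the model for predictions
toy_predictions = toy_model.predict(toy_X_test)
print(toy_predictions)
```

Scikit-learn estimators and Keras models expose the same `fit` and `predict` method names, which is why the workflow carries over between libraries.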


In [None]:
# Load the Drive helper and mount
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
from google.colab import files

In [None]:
uploaded = files.upload()

In [None]:
# Import our dependencies
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler,OneHotEncoder
import sklearn as skl
import tensorflow as tf
import os
from tensorflow.keras.callbacks import ModelCheckpoint
from pyspark import SparkFiles

## Import Data Set

In [1]:
# Import and read our input dataset
df = pd.read_csv('file_name.csv')
df.head()

# OR read in data from an S3 bucket with Spark
# (requires an active SparkSession named `spark`)
from pyspark import SparkFiles
url = "https://s3.amazonaws.com/dataviz-curriculum/day_1/food.csv"
#url = "database-1.czpjmlarn3xk.us-east-2.rds.amazonaws.com"  # database endpoint, not a file URL

# Import into a DataFrame
spark.sparkContext.addFile(url)
df = spark.read.csv(SparkFiles.get("food.csv"), sep=",", header=True)

## Generate Dummy Data -- NOTE - This may not be necessary for our project 

* Create the dummy data using Scikit-learn's **make_blobs** function, which generates sample values and has many parameters that change the shape and values of the sample dataset.
* **n_samples** = number of data points to generate
* **centers** = number of clusters in the dataset; in this case there are two clusters, and each point has two features (the x- and y-axis values) that are linearly separable into two groups
* **random_state** = ensures reproducibility of this dataset: even though the numbers are generated pseudo-randomly, the same random_state value always produces the same dataset
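
The effect of a fixed seed can be sketched without Scikit-learn. The `toy_blobs` function below is a hypothetical stand-in, not the real `make_blobs` (which also chooses the cluster centers itself when `centers` is an integer); it only illustrates why the same seed reproduces the same data.

```python
import random


def toy_blobs(n_samples, centers, seed):
    """Hypothetical stand-in for make_blobs: Gaussian points around fixed centers."""
    rng = random.Random(seed)  # seeded generator, playing the role of random_state
    X, y = [], []
    for i in range(n_samples):
        label = i % len(centers)          # alternate between the clusters
        cx, cy = centers[label]
        X.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
        y.append(label)
    return X, y


blob_centers = [(-5.0, -5.0), (5.0, 5.0)]   # assumed cluster centers
X_a, y_a = toy_blobs(10, blob_centers, seed=78)
X_b, y_b = toy_blobs(10, blob_centers, seed=78)
X_c, y_c = toy_blobs(10, blob_centers, seed=79)

print(X_a == X_b)  # same seed: identical points
print(X_a == X_c)  # different seed: different points
```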



In [None]:
# Generate dummy dataset
X, y = make_blobs(n_samples=1000, centers=2, n_features=2, random_state=78)

# Creating a DataFrame with the dummy data
df = pd.DataFrame(X, columns=["Feature 1", "Feature 2"])
df["Target"] = y

# Plotting the dummy data
df.plot.scatter(x="Feature 1", y="Feature 2", c="Target", colormap="winter")

## Pre-Process Data

* Remove non-beneficial columns
* Check unique value counts
* Convert strings or categorical values to numerical values
* Encode columns
* Bin (bucket) categorical columns to reduce the number of unique categorical values in the dataset
* Check for large gaps between numerical values

### Drop non-beneficial columns

In [None]:
# Drop the non-beneficial columns
df = df.drop(columns=["", "N"])  # placeholder names: list the columns to drop
df.head()

### Determine the number of unique values in each column

In [None]:
# Determine the number of unique values in each column
cnt = df.nunique(axis=0)

### Look at value counts for binning

In [None]:
# Check the frequency of each unique value with the pandas value_counts method
application_counts = df.column_name.value_counts()
application_counts 

### Visualize density

Use a density plot to determine which values are uncommon enough to bucket into the "other" category. Identify where the value counts "fall off" and set the threshold within this region.

In [None]:
# Visualize the value counts of APPLICATION_TYPE
application_counts.plot.density()

### Bin categorical variables
* Collapse all of the infrequent and rare categorical values into a single "other" category.
* Create generalized categorical values
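
The binning idea can be sketched in plain Python before applying it with pandas. The category names and the cutoff below are made up for illustration; a suitable cutoff depends on the value counts of the real column.

```python
from collections import Counter

# Hypothetical category column; values and cutoff are illustrative only
raw_values = ["T3"] * 6 + ["T4"] * 4 + ["T9"] * 2 + ["T12"]
cutoff = 3  # categories seen fewer than `cutoff` times become "Other"

value_counts = Counter(raw_values)
rare = {category for category, n in value_counts.items() if n < cutoff}

# Collapse every rare category into a single "Other" bucket
binned_values = ["Other" if v in rare else v for v in raw_values]

print(Counter(binned_values))
```

After binning, the column has three unique values instead of four, which means fewer columns once the data is one-hot encoded.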

In [None]:
# Determine which values to replace if counts are less than ...?
replace_application = list(application_counts[application_counts < 500].index)

# Replace in dataframe
for app in replace_application:
    df.column_name = df.column_name.replace(app,"Other")

# Check to make sure binning was successful
df.column_name.value_counts()

# This reduces the number of unique values

In [None]:
# Generate our categorical variable lists
application_cat = df.dtypes[df.dtypes == "object"].index.tolist()
application_cat

# Generate categorical list prior to encoding all categorical data


### Encoding
After reducing the number of unique values in the categorical variables, convert each variable into numerical columns using one-hot encoding.
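
As a rough illustration of what one-hot encoding produces, here is a plain-Python sketch. It is not how `OneHotEncoder` is implemented, and the `colors` column is hypothetical.

```python
# Hypothetical categorical column
colors = ["red", "green", "blue", "green"]

# "Fit": learn the sorted list of unique categories
categories = sorted(set(colors))

# "Transform": one 0/1 column per category, with a single 1 in each row
encoded = [[1 if value == category else 0 for category in categories]
           for value in colors]

print(categories)
for value, row in zip(colors, encoded):
    print(value, row)
```

Each row contains exactly one 1, located in the column of its category, so no artificial ordering is implied between categories.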

In [None]:
# Create a OneHotEncoder instance
# (scikit-learn 1.2+ uses sparse_output; older versions use sparse=False)
from sklearn.preprocessing import OneHotEncoder
enc = OneHotEncoder(sparse_output=False)

# Fit and transform the OneHotEncoder using the categorical variable list
encode_df = pd.DataFrame(enc.fit_transform(df[application_cat]))

# Add the encoded variable names to the dataframe (rename encoded column)
# (get_feature_names_out replaces get_feature_names, removed in scikit-learn 1.2)
encode_df.columns = enc.get_feature_names_out(application_cat)
encode_df.head()

In [None]:
# Merge one-hot encoded features and drop the originals
merged_df = df.merge(encode_df,left_index=True,right_index=True).drop(columns=application_cat)
merged_df.head()

## Split Data Into Training and Test Datasets 

* Use Scikit-learn's train_test_split method
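
To make the split concrete, here is a plain-Python sketch of a shuffled split. `toy_train_test_split` is a hypothetical helper, not Scikit-learn's implementation; it assumes a 25% test share, which matches the `train_test_split` default when no size is given.

```python
import random


def toy_train_test_split(X, y, test_size=0.25, seed=None):
    """Hypothetical pure-Python stand-in for train_test_split."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)   # shuffle before splitting
    n_test = round(len(X) * test_size)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    # Features and targets are picked with the same indices so they stay aligned
    return ([X[i] for i in train_idx], [X[i] for i in test_idx],
            [y[i] for i in train_idx], [y[i] for i in test_idx])


toy_X = [[i, i * 2] for i in range(20)]   # 20 made-up samples, 2 features each
toy_y = [i % 2 for i in range(20)]        # made-up binary target

split_X_train, split_X_test, split_y_train, split_y_test = toy_train_test_split(
    toy_X, toy_y, seed=78)
print(len(split_X_train), len(split_X_test))
```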

In [None]:
# Split our preprocessed data into our features and target arrays
y = merged_df[""].values
X = merged_df.drop([""], axis=1).values
merged_df

# Split the preprocessed data into a training and testing dataset using sklearn to split dataset
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=78)

## Prepare Dataset For Our Neural Network Model
### Standardize Data

* First normalize or standardize numerical variables to ensure that our neural network does not focus on outliers and can apply proper weights to each input
* The more the input variables are normalized to the same scale, the more stable the neural network model is, and the better the neural network model will generalize.
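
Standardization rescales each feature to z = (x - mean) / standard deviation. The sketch below shows the arithmetic in plain Python with made-up numbers; the helper names are hypothetical. The statistics are learned from the training values only and then reused for the test values.

```python
from statistics import mean, pstdev


def fit_scaler(column):
    """Learn the mean and population standard deviation from training data."""
    return mean(column), pstdev(column)


def scale(column, mu, sigma):
    """Apply z = (x - mean) / std using previously learned statistics."""
    return [(x - mu) / sigma for x in column]


train_column = [10.0, 20.0, 30.0, 40.0]   # hypothetical feature values
test_column = [25.0, 50.0]

mu, sigma = fit_scaler(train_column)                  # fit on training data only
train_column_scaled = scale(train_column, mu, sigma)
test_column_scaled = scale(test_column, mu, sigma)    # reuse training statistics

print(round(mean(train_column_scaled), 6), round(pstdev(train_column_scaled), 6))
print(test_column_scaled)
```

Fitting on the training set alone keeps information about the test set from leaking into preprocessing.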


In [None]:
# Create scaler instance
X_scaler = skl.preprocessing.StandardScaler()

# Fit the scaler
X_scaler.fit(X_train)

# Scale the data
X_train_scaled = X_scaler.transform(X_train)
X_test_scaled = X_scaler.transform(X_test)

## Create Keras Sequential Neural Network Model
To create the neural network, we use two Keras classes:

* The **Sequential** class is a linear stack of neural network layers, where data flows from one layer to the next. 
* The generalized **Dense** class allows us to add layers within the neural network.

In [None]:
# Create the Keras Sequential model
nn_model = tf.keras.models.Sequential()

# Define the model - deep neural net, i.e., the number of input features and hidden nodes for each layer.
number_input_features = len(X_train[0])
hidden_nodes_layer1 =  80
hidden_nodes_layer2 = 30

nn = tf.keras.models.Sequential()

# First hidden layer
nn.add(
    tf.keras.layers.Dense(units=hidden_nodes_layer1, input_dim=number_input_features, activation="relu")
)

# Second hidden layer
nn.add(tf.keras.layers.Dense(units=hidden_nodes_layer2, activation="relu"))

# Output layer
nn.add(tf.keras.layers.Dense(units=1, activation="sigmoid"))



## Create The Layers
### First Dense Layer
Contains inputs and a hidden layer of neurons:
* The **units** parameter indicates how many neurons we want in the hidden layer
* The **input_dim** parameter indicates how many inputs will be in the model
* The **activation** parameter indicates which activation function to use
    * Use ReLU for nonlinear relationships
    * Use Sigmoid for binary classification output
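
What a Dense layer computes can be sketched in a few lines of plain Python. The weights and biases below are hypothetical values chosen by hand (Keras initializes and learns them), and the `dense` helper is illustrative, not the Keras implementation.

```python
import math


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """One Dense layer: each unit computes activation(w . x + b)."""
    return [activation(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
            for unit_weights, b in zip(weights, biases)]


sample = [0.5, -1.0]                        # input_dim = 2
hidden_w, hidden_b = [[2.0, 0.5]], [0.5]    # units = 1 (hypothetical weights)
output_w, output_b = [[3.0]], [0.0]         # units = 1 (hypothetical weights)

hidden_out = dense(sample, hidden_w, hidden_b, relu)          # hidden layer
probability = dense(hidden_out, output_w, output_b, sigmoid)  # output layer
print(hidden_out, probability)
```

The number of inner weight lists corresponds to `units`, and the length of each list corresponds to `input_dim`.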

### Output Layer
* **nn_model** object will store the entire architecture of our neural network model.


In [None]:
# This is for a basic model example
# Add our first Dense layer, including the input layer
nn_model.add(tf.keras.layers.Dense(units=1, activation="relu", input_dim= 2))

    # By default, Dense layer will look for linear relationships
    # Example has single neuron (units), two inputs (input_dim), relu (activation function)

# Add the output layer that uses a probability activation function
nn_model.add(tf.keras.layers.Dense(units=1, activation="sigmoid"))

    # Provide the number of output neurons

## Check Model Structure

In [None]:
# Check the structure of the Sequential model
nn_model.summary()
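
The "Param #" column in the summary can be checked by hand: a Dense layer has one weight per input per unit, plus one bias per unit. The sketch below applies that arithmetic to the two architectures in this notebook; the input count of the deeper model depends on the dataset, so the value used here is an assumption.

```python
def dense_param_count(n_inputs, units):
    """Weights (n_inputs * units) plus one bias per unit."""
    return n_inputs * units + units


# Basic model: 2 inputs -> 1 hidden unit -> 1 output unit
basic_params = dense_param_count(2, 1) + dense_param_count(1, 1)

# Deeper model: n inputs -> 80 -> 30 -> 1
assumed_inputs = 2  # assumption: replace with the real number of input features
deep_params = (dense_param_count(assumed_inputs, 80)
               + dense_param_count(80, 30)
               + dense_param_count(30, 1))

print(basic_params, deep_params)
```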

## Compile The Model

Inform the model how it should train using the input data. 

Depending on the function of the neural network, we'll have to compile the neural network using a specific optimization function and loss metric. 

* The **optimization function** shapes and molds a neural network model while it is being trained to ensure that it performs to the best of its ability.
* The **loss metric** is used to score the performance of the model through each iteration and epoch by evaluating the inaccuracy of a single input.
* The **adam optimizer** is used to enhance the performance of a classification neural network.
* The **binary_crossentropy** loss function is used to evaluate a binary classification model.
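
For intuition about the loss, binary cross-entropy can be written out in plain Python. This is a sketch of the formula with made-up labels and probabilities, not the Keras implementation; the `eps` clipping value is an assumption added to avoid taking log(0).

```python
import math


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip so log() never receives 0
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


labels = [1, 0, 1, 0]
mostly_right = [0.9, 0.1, 0.8, 0.2]   # confident and correct
mostly_wrong = [0.1, 0.9, 0.2, 0.8]   # confident and incorrect

loss_right = binary_crossentropy(labels, mostly_right)
loss_wrong = binary_crossentropy(labels, mostly_wrong)
print(round(loss_right, 4), round(loss_wrong, 4))
```

Confident wrong predictions are penalized far more heavily than confident correct ones, which is what pushes the model toward well-calibrated probabilities.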

In [None]:
# Compile the Sequential model together and customize metrics
nn_model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

There are two main types of evaluation metrics:

* Predictive **accuracy** - Use for classification models. The higher the accuracy, the better.
* **Mean squared error (MSE)** - Use for regression models. The closer the MSE is to zero, the better.
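
Both metrics are simple to compute by hand. The sketch below uses made-up labels and predictions, and the function names are illustrative rather than library calls.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Classification example: 3 of 4 labels predicted correctly
example_accuracy = accuracy([1, 0, 1, 1], [1, 0, 0, 1])

# Regression example with made-up values
example_mse = mean_squared_error([3.0, 5.0, 2.0], [2.5, 5.0, 4.0])

print(example_accuracy, round(example_mse, 4))
```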

## Train (Fit) The Model
Use the fit method and provide the x training values and y training values, as well as the number of epochs. Each epoch is a complete pass through the training data.
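
To show what an epoch means in practice, here is a plain-Python sketch that trains a single sigmoid neuron with per-sample gradient descent. The dataset, learning rate, and epoch count are made up, and Keras' fit method is considerably more sophisticated (batching, the adam optimizer, and so on).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical one-feature dataset: the label is 1 when x is positive
samples = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
targets = [0, 0, 0, 1, 1, 1]

w, b, learning_rate = 0.0, 0.0, 0.5
losses = []

for epoch in range(20):                  # each epoch is one full pass over the data
    epoch_loss = 0.0
    for x_i, y_i in zip(samples, targets):
        p = sigmoid(w * x_i + b)
        epoch_loss += -(y_i * math.log(p) + (1 - y_i) * math.log(1 - p))
        # Gradient of binary cross-entropy for a sigmoid neuron is (p - y) * input
        w -= learning_rate * (p - y_i) * x_i
        b -= learning_rate * (p - y_i)
    losses.append(epoch_loss / len(samples))

print(round(losses[0], 4), round(losses[-1], 4))
```

The average loss recorded per epoch is the same kind of value that Keras reports after each epoch during training.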

In [None]:
# Fit the model to the training data
fit_model = nn_model.fit(X_train_scaled, y_train, epochs=100)

# epochs can be adjusted

## Visualize The Model's Loss

The object returned by the fit method stores the loss and accuracy metrics across all epochs in its history attribute, which we can use to visualize the training progress.

In [None]:
# Create a DataFrame containing training history
history_df = pd.DataFrame(fit_model.history, index=range(1,len(fit_model.history["loss"])+1))

# Plot the loss
history_df.plot(y="loss")

## Plot Accuracy

In [None]:
# Plot the accuracy
history_df.plot(y="accuracy")

## Evaluate Model Performance Using The Test Data

In [None]:
# Evaluate the model using the test data
model_loss, model_accuracy = nn_model.evaluate(X_test_scaled,y_test,verbose=2)
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

## Predict Classification Using A New Dataset -- May Not Be Needed For Our Project
Now that we have a trained neural network model and we have verified its performance using a test dataset, we can apply this model to novel datasets and predict the classification of a data point. In our Sequential model, we can use the predict method to generate predictions on new data. 

In [None]:
# Predict the classification of a new set of blob data
new_X, new_Y = make_blobs(n_samples=10, centers=2, n_features=2, random_state=78)
new_X_scaled = X_scaler.transform(new_X)
(nn_model.predict(new_X_scaled) > 0.5).astype("int32")

## Create The Checkpoint And CallBack Object

In [None]:
# Import checkpoint dependencies
import os
from tensorflow.keras.callbacks import ModelCheckpoint

# Define the checkpoint path and filenames
os.makedirs("checkpoints/",exist_ok=True)
checkpoint_path = "checkpoints/weights.{epoch:02d}.hdf5"  # Keras 3 (TensorFlow 2.16+) expects a ".weights.h5" suffix here

In [None]:
# Create a callback that saves the model's weights after every epoch
# (to save every 5 epochs, pass save_freq as an integer: 5 * number of batches per epoch)
cp_callback = ModelCheckpoint(
    filepath=checkpoint_path,
    verbose=1,
    save_weights_only=True,
    save_freq='epoch')

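# Compile the model (it must be compiled before training), then train it
nn.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
fit_model = nn.fit(X_train_scaled,y_train,epochs=100,callbacks=[cp_callback])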
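
The checkpoint settings can be sanity-checked without TensorFlow. The sketch below shows how the `{epoch:02d}` placeholder in the path expands, and how an integer `save_freq` (which counts batches) could be derived for saving every 5 epochs. The sample count is hypothetical, and 32 is the default batch size of the Keras fit method.

```python
import math

path_template = "checkpoints/weights.{epoch:02d}.hdf5"

# The callback fills in {epoch:02d} with the zero-padded epoch number
first_name = path_template.format(epoch=5)
last_name = path_template.format(epoch=100)
print(first_name, last_name)

# An integer save_freq counts batches, so "every 5 epochs" needs the
# number of batches in one epoch
n_training_samples = 750   # hypothetical: use len(X_train_scaled) in practice
batch_size = 32            # default batch size used by fit
batches_per_epoch = math.ceil(n_training_samples / batch_size)
save_freq_every_5_epochs = 5 * batches_per_epoch
print(save_freq_every_5_epochs)
```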

In [None]:
# Save and export your results to an HDF5 file and name it AlphabetSoupCharity.h5.
nn.save("AlphabetSoupCharity.h5")

# Optimization Techniques

### References:
* Optimization Functions
https://www.tensorflow.org/api_docs/python/tf/keras/optimizers
* Loss Metrics
https://www.tensorflow.org/api_docs/python/tf/keras/losses
* Keras Documentation
https://www.tensorflow.org/guide/keras/sequential_model

### Techniques:
1. Adding more neurons to the hidden layer(s) of our neural network can help generate a well-performing model faster than a single-neuron, single-layer neural network:
    * The model can find optimal weights faster
    * Each neuron can focus on different features to identify nonlinear effects
    * The model is less likely to fixate on complex variables, which makes it more robust

  Limitations: 
  * Adding too many neurons can cause overfitting and requires more computation resources
  * A large number of neurons requires an equally large training dataset and more epochs

  A good rule of thumb for a basic neural network is to have two to three times as many neurons in the hidden layer as the number of inputs.

2. Check the input dataset and remove outliers.
3. Add additional hidden layers to allow neurons to train on activated input values, instead of looking at new training data. A neural network with multiple layers can identify nonlinear characteristics of the input data without requiring more input data.
4. Use a different activation function for the hidden layers. Depending on the shape and dimensionality of the input data, one activation function may focus on specific characteristics of the input values, while another activation function may focus on others.
5. Add additional epochs to the training 

### Activation Functions

* **sigmoid** function values are normalized to a probability between 0 and 1, which is ideal for binary classification.
* **tanh** function can be used for classification or regression, and it expands the output range to between -1 and 1.
* **ReLU** function is ideal for looking at positive nonlinear input data for classification or regression.
* **Leaky ReLU** function is a good alternative for nonlinear input data with many negative inputs.
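
The four functions can be compared side by side in plain Python. This is a sketch of the formulas only; the `alpha` slope for Leaky ReLU is an assumed value, not a library default.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))     # output between 0 and 1


def relu(z):
    return max(0.0, z)                    # negative inputs become 0


def leaky_relu(z, alpha=0.01):            # alpha: assumed slope for negative inputs
    return z if z > 0 else alpha * z      # negative inputs are shrunk, not zeroed


# math.tanh is used directly; its output lies between -1 and 1
for z in (-2.0, 0.0, 2.0):
    print(z, round(sigmoid(z), 4), round(math.tanh(z), 4), relu(z), leaky_relu(z))
```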

### Adding More Neurons Steps



In [None]:
# Generate our new Sequential model
new_model = tf.keras.models.Sequential()

In [None]:
# Add the input and hidden layer
number_inputs = 2
number_hidden_nodes = 6
    # Adding 6 neurons
    
new_model.add(tf.keras.layers.Dense(units=number_hidden_nodes, activation="relu", input_dim=number_inputs))

# Add the output layer that uses a probability activation function
new_model.add(tf.keras.layers.Dense(units=1, activation="sigmoid"))


In [None]:
# Compile the Sequential model together and customize metrics
new_model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

# Fit the model to the training data
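# (X_moon_train_scaled and y_moon_train are not defined in this notebook, so the scaled training data from above is used)
new_fit_model = new_model.fit(X_train_scaled, y_train, epochs=100, shuffle=True)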