# Deep Learning Example
**(c) Feb 2025 Julie Fleischer**

This file contains my code to implement the class example exercise from Deep Learning: Getting Started training by Kumaran Ponnambalam in LinkedIn Learning.

This training taught me the basics such that I could create, train, tune, and test a model using Keras.  The design decisions I made for the model are listed below.  They were based on what I learned in the training as well as additional internet searches.

I have created my own training set for this example using the structure from the class example.  The actual training set from the class is available directly from the training.

## 1 - Install libraries

In [5]:
# Install required libraries (if not already installed)
!pip install pandas
!pip install tensorflow
!pip install scikit-learn



## 2 - Load and pre-process input data

In this step, we load the data in from input file deep_learning_sample_data.csv.  This file contains the following columns:
1.  ID column with incrementing numerical ID.
2.  Seven columns with boolean values containing various independent variables (i.e. input variables) that we believe can be used to determine the dependent variable or target.  In the class example, they are error codes and computer states that influence the determined root cause of a failure.
3.  One column with strings containing the dependent variable or target data (i.e. the output).  In the class example, these are strings which contain the analyzed root cause of the failure.

We need to:
- Convert the target data column from a string to a numeric representation.
- Create a numpy array (since the Keras functions we will use to create our model utilize numpy arrays).
- From the numpy array, create one array with input data (X data), which will contain the seven boolean independent variables and one array with target data (Y data), which is the numerically encoded column with the one dependent variable.
- Convert the target data (Y data) from its current ordinal numeric representation into a binary representation using one hot encoding.
   - We convert to binary to avoid implying an ordinal relationship between the values in the target data column.  This video is helpful in understanding what the one hot encoding output looks like as well as why it is done: https://www.youtube.com/watch?v=Dz8zNQNW9RQ.
- Create our test data set using the typical best practice of 10% of the data set for test.  This is the data we will use when we have a fully trained and fine tuned model to test our results.  Note:  Keras will automatically create a validation data set, which is a set of data used to fine tune the model after initial training, so this does not need to be done.

In [6]:
# Load and pre-process input data
import pandas as pd

# Read in the file
root_cause_data = pd.read_csv('deep_learning_sample_data.csv')

print("\n ----------------- Input file has been loaded -----------------\n")
print(root_cause_data.head)

# Convert ROOT_CAUSE column (the target data column) from string to ordinal number

from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
root_cause_data['ROOT_CAUSE'] = le.fit_transform(root_cause_data['ROOT_CAUSE'])
print("\n ----------------- Converted root cause to numeric -----------------\n")
print(root_cause_data['ROOT_CAUSE'].head)

# Create a numpy array from root_cause_data for use with Keras functions

root_cause_data_np = root_cause_data.to_numpy()
print("\n ----------------- Data in numpy array -----------------\n")
print(root_cause_data_np[:5])

# Create our input data (X data) array from columns 2-8 (the seven boolean columns)
# and our target data (Y data) from the last column (the ROOT_CAUSE) column

X_data = root_cause_data_np[:, 1:8]
Y_data = root_cause_data_np[:, 8]
print("\n ----------------- X data and Y data extracted -----------------\n")
print("\n         --------- X data  ---------\n")
print(X_data[:5,:])
print("\n         --------- Y data  ---------\n")
print(Y_data[:5])

# Convert ROOT_CAUSE column (the target data column) from ordinal number to boolean matrix using one hot encoding

import tensorflow as tf
Y_data = tf.keras.utils.to_categorical(Y_data)
print("\n ----------------- Y data converted to binary matrix -----------------\n")
print("\n         --------- Y data  ---------\n")
print(Y_data[:5, :])

# Split data into training data and test data.  Use 10% of the data for test.
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X_data, Y_data, test_size=0.1)
print("\n ----------------- Data split into training and test data -----------------\n")
print("\n       ------------- Training data  -------------\n")
print("\n         --------- X data  ---------\n")
print(X_train[:5,:])
print("\n         --------- Y data  ---------\n")
print(Y_train[:5,:])
print("\n         ------ X and Y training data shapes  ------\n")
print(X_train.shape, Y_train.shape)
print("\n       ------------- Test data  -------------\n")
print("\n         --------- X data  ---------\n")
print(X_train[:5,:])
print("\n         --------- Y data  ---------\n")
print(Y_train[:5,:])
print("\n         ------ X and Y test data shapes  ------\n")
print(X_test.shape, Y_test.shape)



 ----------------- Input file has been loaded -----------------

<bound method NDFrame.head of        ID  Independent Variable 1  Independent Variable 2  \
0       1                       1                       0   
1       2                       1                       0   
2       3                       1                       0   
3       4                       0                       0   
4       5                       1                       1   
..    ...                     ...                     ...   
995   996                       1                       1   
996   997                       0                       1   
997   998                       0                       0   
998   999                       0                       1   
999  1000                       0                       0   

     Independent Variable 3  Independent Variable 4  Independent Variable 5  \
0                         0                       1                       0   
1            

## 3 - Create the Deep Learning Model

We now can set up our Deep Learning Model for this data.

Design choices I made for this model were:
- Number of hidden layers and number of nodes modified to best fit.
  - I started using the best practice to have a single hidden layer with a number of nodes equal to the average of the input and output layer sizes.
  - I modified these values incrementally to see if I could get an accuracy of >90%.  Ultimately, I couldn't get higher than ~85% accuracy during both training and testing using the class example data set.  The example solution shown the training course had an accuracy of 86% during training and 76% during testing, so mine appears to be in line with the example.
  - With my own data set, I was able to get 81% accuracy during training and 72% accuracy during testing.
- Sequential model
  - A linear stack of layers is sufficient for this exercise.
- Sigmoid activation function for input and hidden layer(s)
  - Because the input was binary, a sigmoid activation function (which outputs values 0 and 1) would mimic the binary data.
- Softmax activation function for the output layer
  - Because the output could be one of three classes, I chose to use softmax since it provides a vector of probabilities to predict the class.
- Categorical cross-entropy loss function
  - Because the output could be one of three classes, I chose to use categorical cross-entropy since it is used for multi-class classification


In [7]:
# Create the DL model
from tensorflow import keras
from tensorflow.keras import layers

HIDDEN_LAYER1_NODES = 7
HIDDEN_LAYER2_NODES = 5

# Create a simple sequential model that takes all 7 columns of input and delivers one of the three target output values
model = keras.Sequential([
    # Hidden layer with HIDDEN_LAYER1_NODES nodes that takes 7 nodes of input and uses sigmoid activation function
    layers.Dense(HIDDEN_LAYER1_NODES, input_shape=(7,), name='HiddenLayer1', activation='sigmoid'), 
    # Hidden layer with HIDDEN_LAYER2_NODES nodes and uses sigmoid activation function
    layers.Dense(HIDDEN_LAYER2_NODES, name='HiddenLayer2', activation='sigmoid'), 
    layers.Dense(3, name='OutputLayer', activation='softmax')  # Output layer with three values
])

# Compile the model with categorical cross-entropy where we monitor accuracy
model.compile(loss='categorical_crossentropy', metrics=['accuracy']) 

# Print a summary of the model architecture
model.summary() 

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 HiddenLayer1 (Dense)        (None, 7)                 56        
                                                                 
 HiddenLayer2 (Dense)        (None, 5)                 40        
                                                                 
 OutputLayer (Dense)         (None, 3)                 18        
                                                                 
Total params: 114 (456.00 Byte)
Trainable params: 114 (456.00 Byte)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________


## 4 - Train and Evaluate Model

We can now train and evaluate the model.  I used the following heuristics when choosing the hyperparameters for training:
- Batch size to be a power of 2
- Started with an epoch size of 10 and grew to 300 to increase accuracy.
- Validation split of 0.2 per best-practice to have roughly 20% of data to be validation data

In [8]:
# Train and evaluate the model

# Setting key hyperparameters as constants for easily modifying

BATCH_SIZE = 16 # power of 2
EPOCH_SIZE = 300 
VALIDATION_SPLIT = 0.2 # roughly 20% of data is validation data

# This is where training occurs.  For EPOCH_SIZE runs, we will run BATCH_SIZE data through the model to train.
# After that, we'll run VALIDATION_SPLIT percentage of the training data through to fine tune the model.

print("\n ----------------- Starting training -----------------\n")
model.fit(X_train, Y_train, epochs=EPOCH_SIZE, batch_size=BATCH_SIZE, verbose=1, validation_split=VALIDATION_SPLIT)
print("\n ----------------- Training finished -----------------\n")

# This is where we test our model and see how it did.

print("\n ----------------- Starting evaluation -----------------\n")
loss, accuracy = model.evaluate(X_test, Y_test, verbose=1)
print("\n ----------------- Evaluation results -----------------\n")
print(f"Delta between predicted and actual values for model (loss): {loss:.4f}")
print(f"Accuracy for model: {accuracy:.4f}")


 ----------------- Starting training -----------------

Epoch 1/300
Epoch 2/300
Epoch 3/300
Epoch 4/300
Epoch 5/300
Epoch 6/300
Epoch 7/300
Epoch 8/300
Epoch 9/300
Epoch 10/300
Epoch 11/300
Epoch 12/300
Epoch 13/300
Epoch 14/300
Epoch 15/300
Epoch 16/300
Epoch 17/300
Epoch 18/300
Epoch 19/300
Epoch 20/300
Epoch 21/300
Epoch 22/300
Epoch 23/300
Epoch 24/300
Epoch 25/300
Epoch 26/300
Epoch 27/300
Epoch 28/300
Epoch 29/300
Epoch 30/300
Epoch 31/300
Epoch 32/300
Epoch 33/300
Epoch 34/300
Epoch 35/300
Epoch 36/300
Epoch 37/300
Epoch 38/300
Epoch 39/300
Epoch 40/300
Epoch 41/300
Epoch 42/300
Epoch 43/300
Epoch 44/300
Epoch 45/300
Epoch 46/300
Epoch 47/300
Epoch 48/300
Epoch 49/300
Epoch 50/300
Epoch 51/300
Epoch 52/300
Epoch 53/300
Epoch 54/300
Epoch 55/300
Epoch 56/300
Epoch 57/300
Epoch 58/300
Epoch 59/300
Epoch 60/300
Epoch 61/300
Epoch 62/300
Epoch 63/300
Epoch 64/300
Epoch 65/300
Epoch 66/300
Epoch 67/300
Epoch 68/300
Epoch 69/300
Epoch 70/300
Epoch 71/300
Epoch 72/300
Epoch 73/300
Epo