# Deep Learning Approach

Our final approach will be an artificial neural network. This model will be a "black box" approach and will not allow us to determine feature importance, however, artificial neural networks are very powerful algorithms and should be able to compete with our random forest and xgboost classifiers. 

In [3]:
import pandas as pd
import numpy as np

import warnings
warnings.filterwarnings('ignore')

from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score, accuracy_score, f1_score
from sklearn.metrics import roc_curve, plot_roc_curve, auc, roc_auc_score
from sklearn.metrics import plot_confusion_matrix
from sklearn.metrics import confusion_matrix
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import RobustScaler
from sklearn.pipeline import Pipeline
from sklearn.model_selection import GridSearchCV
from sklearn.feature_selection import SelectFromModel

import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.wrappers.scikit_learn import KerasClassifier

# import helper functions
from helper import *

The following code has been repeated from the main notebook and will not be ellaborated on in this notebook. 

In [4]:
# import data
df = pd.read_csv('kddcup99_csv.csv')

# create target columns (1s and 0s)
df['target'] = df['label'].apply(lambda x: 0 if x=='normal' else 1)

# create modeling dataframes
target = df['target']
features = df.drop(['target', 'label'], axis=1)

# one hot encode
features = pd.get_dummies(features, drop_first=True)

# split dataset
X_train, X_val, y_train, y_val = train_test_split(features, target, test_size=0.25, random_state=42)

## Baseline Artificial Neural Network

To assist with convergence we will scale our training and test data. 

In [5]:
# initialize standardscaler
sc = StandardScaler()

# scale x train and x test separately to avoid data leakage
X_train_scaled = sc.fit_transform(X_train)
X_val_scaled = sc.fit_transform(X_val)

We will now initialize our neural network. A few notes on our initial parameters are below.

1. We choose sixty nodes for our hidden layer because, as a rule of thumb, a good place to start is to average the number of input nodes and output nodes.
2. We will use a uniform initial weight distribution to help prevent vanishing gradient problem. Could also use normally distributed weights. 
3. For our hidden layer we will use a relu activation function, again to prevent the vanishing gradient issues associated with the sigmoid activation function. 
4. For the output layer, we will use a sigmoid function as this is a binary classification problem. 
5. We will choose Adam optimization for its performance and speed. 
6. For our loss function we will use cross entropy as this is standard for binary classification problems. 
7. We will limit our epochs to 100 to minimize training time. 

In [6]:
# initialize ANN
ann_clf = Sequential()

# create input layer and hidden layer
ann_clf.add(Dense(output_dim = 60, init = 'uniform', activation = 'relu', input_dim = 115))

# we will add a second hidden layer; accuracy declined after about 10 epochs last time
ann_clf.add(Dense(output_dim = 60, init = 'uniform', activation = 'relu'))

# we will add one more hidden layer
ann_clf.add(Dense(output_dim = 60, init = 'uniform', activation = 'relu'))

# create the output layer
ann_clf.add(Dense(output_dim = 1, init = 'uniform', activation = 'sigmoid'))

In [7]:
# assemble the ANN
ann_clf.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

# Fitting the ANN to the Training set
ann_clf.fit(X_train_scaled, y_train, batch_size = 10, epochs = 10)

Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where

Epoch 1/1


<keras.callbacks.callbacks.History at 0x25403a61828>

In [9]:
binary_classification_evaluation(ann_clf, X_train_scaled, X_val_scaled, y_train, y_val, 
                                 name_of_estimator="Baseline Neural Network",
                                 cm_labels=['Normal', 'Attack'], is_ANN=True)

TRAINING SET METRICS
--------------------------------------------------------------------------------------
Baseline Neural Network Classifier Training Data Scores

Recall Score: 99.958%
Precision Score: 99.95%
Accuracy Score: 99.926%
F1 Score: 0.99954
ROC AUC Score: 0.99876

TESTING SET METRICS
--------------------------------------------------------------------------------------
Baseline Neural Network Classifier Testing Data Scores

Recall Score: 99.939%
Precision Score: 99.934%
Accuracy Score: 99.898%
F1 Score: 0.99937
ROC AUC Score: 0.99835

