# Classifier comparison

A comparison of several classifiers on the CICIDS2017 web attacks dataset.

Sources:

* CICIDS2017: https://www.unb.ca/cic/datasets/ids-2017.html
* Scikit-learn demo: https://scikit-learn.org/stable/auto_examples/classification/plot_classifier_comparison.html
* Overview of classification metrics: http://www.machinelearning.ru/wiki/images/d/de/Voron-ML-Quality-slides.pdf

## Reading and preparing data

Read the undersampled (balanced) and preprocessed data.

In [1]:
import pandas as pd
import numpy as np
df = pd.read_csv('web_attacks_balanced.csv')

The "Label" column is encoded as follows: "BENIGN" = 0, attack = 1.

In [2]:
df['Label'] = df['Label'].apply(lambda x: 0 if x == 'BENIGN' else 1)
y = df['Label'].values
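
The same encoding can also be written without `apply`. The following is a minimal sketch on a toy frame; the attack label strings are made up for illustration and are not the dataset's actual label values.

```python
import pandas as pd

# Toy frame standing in for the real data; the attack label strings are hypothetical.
toy = pd.DataFrame({'Label': ['BENIGN', 'Attack A', 'BENIGN', 'Attack B']})

# Vectorized equivalent of the apply/lambda encoding: BENIGN -> 0, anything else -> 1.
toy['Label'] = (toy['Label'] != 'BENIGN').astype(int)
encoded = toy['Label'].tolist()
print(encoded)
```

The boolean comparison runs on the whole column at once, which tends to be faster than a Python-level lambda on large frames.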

Select the features.

In [3]:
webattack_features = ['Average Packet Size', 'Flow Bytes/s', 'Max Packet Length', 'Packet Length Mean', 
                      'Fwd Packet Length Mean', 'Subflow Fwd Bytes', 'Fwd IAT Min', 'Avg Fwd Segment Size',
                      'Total Length of Fwd Packets', 'Fwd IAT Std', 'Fwd Packet Length Max', 'Flow IAT Mean',
                      'Fwd Header Length', 'Flow Duration', 'Flow Packets/s', 'Fwd IAT Mean',
                      'Fwd IAT Total', 'Fwd Packets/s', 'Flow IAT Std', 'Fwd IAT Max']

In [4]:
X = df[webattack_features]
print(X.shape, y.shape)

(7267, 20) (7267,)


In [5]:
from sklearn.model_selection import train_test_split
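# test_size=0 holds nothing out: every sample ends up (shuffled) in the training set,
# which is evaluated with cross-validation below. Recent scikit-learn releases may reject test_size=0.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0, random_state=42)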

unique, counts = np.unique(y_train, return_counts=True)
dict(zip(unique, counts))

{0: 5087, 1: 2180}
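
Since no samples are held out, the split above amounts to shuffling the data. A minimal sketch of that idea using `sklearn.utils.shuffle` on synthetic arrays follows. The names `X_demo`, `y_demo`, `X_shuf`, and `y_shuf` are illustrative, and the resulting order is not claimed to match the one produced by `train_test_split`.

```python
import numpy as np
from sklearn.utils import shuffle

# Synthetic stand-in: 10 samples, 2 features, 7 benign (0) and 3 attack (1).
X_demo = np.arange(20).reshape(10, 2)
y_demo = np.array([0] * 7 + [1] * 3)

# Shuffle features and labels together; no samples are held out.
X_shuf, y_shuf = shuffle(X_demo, y_demo, random_state=42)

unique, counts = np.unique(y_shuf, return_counts=True)
class_counts = {int(k): int(v) for k, v in zip(unique, counts)}
print(class_counts)
```

The class counts are unchanged by the shuffle, which mirrors the output above, where the training set contains all 7267 samples.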

In [6]:
X.head()

| | Average Packet Size | Flow Bytes/s | Max Packet Length | Packet Length Mean | Fwd Packet Length Mean | Subflow Fwd Bytes | Fwd IAT Min | Avg Fwd Segment Size | Total Length of Fwd Packets | Fwd IAT Std | Fwd Packet Length Max | Flow IAT Mean | Fwd Header Length | Flow Duration | Flow Packets/s | Fwd IAT Mean | Fwd IAT Total | Fwd Packets/s | Flow IAT Std | Fwd IAT Max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 80.75 | 3635.433 | 103.0 | 64.6 | 39.0 | 78.0 | 3.0 | 39.0 | 78.0 | 0.0 | 39.0 | 26040.0 | 64.0 | 78120.0 | 51.203277 | 3.0 | 3.0 | 25.601638 | 45096.54 | 3.0 |
| 1 | 50.666667 | 10.03516 | 48.0 | 48.0 | 48.0 | 432.0 | 1999848.0 | 48.0 | 432.0 | 20000000.0 | 48.0 | 5064547.0 | 204.0 | 86097296.0 | 0.209066 | 10700000.0 | 86000000.0 | 0.104533 | 14300000.0 | 59000000.0 |
| 2 | 48.0 | 909090.9 | 48.0 | 38.4 | 32.0 | 64.0 | 4.0 | 32.0 | 64.0 | 0.0 | 32.0 | 58.66667 | 64.0 | 176.0 | 22727.27273 | 4.0 | 4.0 | 11363.63636 | 95.55278 | 4.0 |
| 3 | 94.25 | 2000000.0 | 112.0 | 75.4 | 51.0 | 102.0 | 3.0 | 51.0 | 102.0 | 0.0 | 51.0 | 54.33333 | 64.0 | 163.0 | 24539.8773 | 3.0 | 3.0 | 12269.93865 | 55.36545 | 3.0 |
| 4 | 80.0 | 1792208.0 | 94.0 | 64.0 | 44.0 | 88.0 | 3.0 | 44.0 | 88.0 | 0.0 | 44.0 | 51.33333 | 64.0 | 154.0 | 25974.02597 | 3.0 | 3.0 | 12987.01299 | 82.85127 | 3.0 |


## Classifier comparison

Running the comparison can take a while: roughly 3-5 minutes, depending on the machine.

In [7]:
import time
import warnings
warnings.filterwarnings("ignore")

from sklearn import model_selection
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.neural_network import MLPClassifier

models = []
models.append(('KNN', KNeighborsClassifier()))
models.append(('SVM', SVC(gamma='auto')))
models.append(('CART', DecisionTreeClassifier(max_depth=5)))
models.append(('RF', RandomForestClassifier(max_depth=5, n_estimators=5, max_features=3)))    
models.append(('ABoost', AdaBoostClassifier()))
models.append(('LR', LogisticRegression(solver='lbfgs', max_iter=200)))
models.append(('NB', GaussianNB()))
models.append(('LDA', LinearDiscriminantAnalysis()))
models.append(('QDA', QuadraticDiscriminantAnalysis()))
models.append(('MLP', MLPClassifier()))

print('Model\tAcc\tPr\tRecall\tF1\tExecution')
      
for name, model in models:
    start_time = time.time()
    # Unshuffled 5-fold split; random_state only has an effect with shuffle=True, so it is omitted.
    kfold = model_selection.KFold(n_splits=5)

    accuracy = cross_val_score(model, X_train, y_train, cv=kfold, scoring='accuracy').mean()
    precision = cross_val_score(model, X_train, y_train, cv=kfold, scoring='precision').mean()
    recall = cross_val_score(model, X_train, y_train, cv=kfold, scoring='recall').mean()
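    # F1 is the class-weighted average ('f1_weighted') and is scored on the original, unshuffled X, y,
    # so it does not follow directly from the precision and recall columns.
    f1_score = cross_val_score(model, X, y, cv=kfold, scoring='f1_weighted').mean()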
    
    delta = time.time() - start_time
    print('{}\t{:.3f}\t{:.3f}\t{:.3f}\t{:.3f}\t{:.2f} secs'.format(name, accuracy, precision, recall, f1_score, delta))

Model	Acc	Pr	Recall	F1	Execution
KNN	0.971	0.942	0.961	0.969	4.57 secs
SVM	0.705	0.669	0.036	0.602	176.04 secs
CART	0.976	0.973	0.946	0.969	1.47 secs
RF	0.975	0.969	0.936	0.966	1.12 secs
ABoost	0.978	0.962	0.965	0.973	23.40 secs
LR	0.955	0.939	0.914	0.963	15.80 secs
NB	0.722	0.520	0.956	0.754	0.47 secs
LDA	0.939	0.921	0.872	0.941	2.23 secs
QDA	0.872	0.978	0.597	0.949	1.28 secs
MLP	0.904	0.921	0.912	0.776	93.83 secs
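
The loop above calls `cross_val_score` once per metric, so each model is refit for every metric. A possible alternative is `cross_validate`, which accepts several scorers and evaluates them on the same folds in a single pass. This is a minimal sketch on synthetic data from `make_classification`, not the web-attack dataset. The names `X_demo`, `y_demo`, and `mean_scores` are illustrative, and the scores it prints are unrelated to the table above.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold, cross_validate
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary data standing in for the web-attack features (not the real dataset).
X_demo, y_demo = make_classification(n_samples=300, n_features=20,
                                     weights=[0.7, 0.3], random_state=0)

# Shuffled folds are an assumption of this sketch; the loop above uses unshuffled folds.
kfold = KFold(n_splits=5, shuffle=True, random_state=24)
scoring = ['accuracy', 'precision', 'recall', 'f1', 'f1_weighted']

# One call fits the model once per fold and scores every metric on the same folds.
results = cross_validate(DecisionTreeClassifier(max_depth=5, random_state=0),
                         X_demo, y_demo, cv=kfold, scoring=scoring)

# Per-fold scores are stored under 'test_<metric name>'.
mean_scores = {name: results['test_' + name].mean() for name in scoring}
for name, value in mean_scores.items():
    print('{}\t{:.3f}'.format(name, value))
```

Including both `f1` (positive class only) and `f1_weighted` (average over both classes, weighted by support) makes it easier to see how the two F1 variants differ for the same model and folds.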
