# RNNTraffic

**RNNTraffic** is a Recurrent Neural Network based network traffic classification implementation. In this notebook, the whole process of data cleaning, training and testing will be explanied.

This implementation has two main features:
1. Classify where a network traffic in under OpenVPN.
2. Classify detailed type of the traffic.

And there are mainly three parts in this implementation:
1. Data prepare and process
2. Model tranining and tuning
3. Final result evaluation

## 1 Data prepare and process

For the first part, code is not include in this notebook. The detail will explanined here.

### 1.1 Dataset selection

The ISCXVPN2016 dataset which contains various types of traffic including regular traffic, TLS encryted traffic and OpenVPN encryted traffic of some common applications and services. And there are also already bunch of related work on this dataset. Therefore, [ISCXVPN2016](http://205.174.165.80/CICDataset/ISCX-VPN-NonVPN-2016/Dataset/) is used for `RNNTraffic`.

### 1.2 Data selection

Due to the time limitation of this final project and our hardware computing power. It is not pratical to modeling all the data offered by ISCXVPN2016.

Among the various traffic types, four each from VPN and non-VPN was selected.
The selected ones are "Chat", "Email", "P2P" and "Streaming". These four are typical modern network traffic.

### 1.3 Split `pcap` files to session

There, large `pcap` files are splitted by `SplitCap` tool to each session. It is kind of obvious that session contains better traffic fingerprint than single flow(packet).

Due to `SplitCap` not supporting `.pcapng`, all files are converted `.pcap` first. Secnondly, a `powershell` script can be called for splitting all `.pacp` files. The usage of script is `.\datasets_processed\packet2seesion.ps1` under PowerShell.

In this script, each `.pacp` file in `".\datasets_selected\"` are splitted to sessions and saved in `".\datasets_processed\"` recursively.

### 1.4 Convert sesssions to CSV files

Now there are many `.pcap` files which represents sessions. It essential to get more sufficient data from there files. To solve this problem, a naive approach is applied for `RNNTraffic`.

The conversion is done by the script `.datasets_processed\session2csv.py\`.

In this script, `.pcap` file of session is read as raw unsigned int8 datas to numpy array. Sessions with more than 1500 bytes are trimed, session with less than 300 bytes are deprecated and other session are repeated to fill 1500 bytes.

Each session is stored as one row in every dataframe, and the dataframe is stored as CSV files for future trainning and testing.

## 2 Model tranining and tuning

### 2.0 Import packages and test environment

In [1]:
import pandas as pd
import numpy as np

import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import seaborn as sns

from sklearn.metrics import precision_score
from sklearn.metrics import recall_score
from sklearn.preprocessing import LabelEncoder

import torch
import torch.nn.functional as F
from torch import nn
from torch.utils.data import DataLoader
from torchvision import transforms

import os, sys

import warnings
warnings.filterwarnings('ignore')

%matplotlib inline

print("pyTorch version: {}".format(torch.__version__))
print("GPU Avaliability: {}".format(torch.cuda.is_available()))
print("Current working directory: {}".format(os.getcwd()))

use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

pyTorch version: 1.4.0
GPU Avaliability: True
Current working directory: /home/tygao/py-repos/RNNTraffic


### 2.1 Load processed data

In [2]:
vpnDatasetDir = "./datasets_processed/VPN"
nonVpnDatasetDir = "./datasets_processed/non-VPN"

def getCsvFiles(rootdir):
    csvFiles = []
    for root, subdirs, files in os.walk(rootdir):
        for file in files:
            if os.path.splitext(file)[1] == ".csv":
                csvFiles.append(root + os.path.sep + file)
    return csvFiles

def readCsvFilesToDataframes(csvFiles):
    dfList = []
    for csvFile in csvFiles:
        df = pd.read_csv(csvFile, index_col=0)
        dfList.append(df)
    return dfList

vpnCsvFiles = getCsvFiles(vpnDatasetDir)
vpnDataframes = readCsvFilesToDataframes(vpnCsvFiles)

# Add indication on label for VPN data
for vpnDataframe in vpnDataframes:
    vpnDataframe['Label'] = "VPN-" + vpnDataframe['Label']
    
nonVpnCsvFiles = getCsvFiles(nonVpnDatasetDir)
nonVpnDataframes = readCsvFilesToDataframes(nonVpnCsvFiles)

vpnDf = pd.concat(vpnDataframes, ignore_index=True, sort=False)
nonVpnDf = pd.concat(nonVpnDataframes, ignore_index=True, sort=False)

print("Shape of VPN dataframe: {}, shape of non-VPN dataframe: {}".format(vpnDf.shape, nonVpnDf.shape))

vpnLabels = list(set(vpnDf['Label'].to_list()))
nonVpnLabels = list(set(nonVpnDf['Label'].to_list()))

print("Vpn labels: {}, non-VPN labels: {}".format(vpnLabels, nonVpnLabels))

Shape of VPN dataframe: (3984, 1501), shape of non-VPN dataframe: (3429, 1501)
Vpn labels: ['VPN-Email', 'VPN-P2P', 'VPN-Chat', 'VPN-Streaming'], non-VPN labels: ['P2P', 'Chat', 'Streaming', 'Email']


In [3]:
vpnDf.head(5)

Unnamed: 0,DataIndex_0,DataIndex_1,DataIndex_2,DataIndex_3,DataIndex_4,DataIndex_5,DataIndex_6,DataIndex_7,DataIndex_8,DataIndex_9,...,DataIndex_1491,DataIndex_1492,DataIndex_1493,DataIndex_1494,DataIndex_1495,DataIndex_1496,DataIndex_1497,DataIndex_1498,DataIndex_1499,Label
0,161,178,195,212,0,2,0,4,0,0,...,38,12,221,61,204,163,237,187,254,VPN-Chat
1,161,178,195,212,0,2,0,4,0,0,...,177,137,7,248,223,123,142,145,176,VPN-Chat
2,161,178,195,212,0,2,0,4,0,0,...,214,131,0,0,0,52,0,0,0,VPN-Chat
3,161,178,195,212,0,2,0,4,0,0,...,14,16,225,209,124,232,156,108,65,VPN-Chat
4,161,178,195,212,0,2,0,4,0,0,...,70,82,64,0,50,17,207,204,157,VPN-Chat


In [4]:
nonVpnDf.head(5)

Unnamed: 0,DataIndex_0,DataIndex_1,DataIndex_2,DataIndex_3,DataIndex_4,DataIndex_5,DataIndex_6,DataIndex_7,DataIndex_8,DataIndex_9,...,DataIndex_1491,DataIndex_1492,DataIndex_1493,DataIndex_1494,DataIndex_1495,DataIndex_1496,DataIndex_1497,DataIndex_1498,DataIndex_1499,Label
0,161,178,195,212,0,2,0,4,0,0,...,231,50,60,101,80,8,0,69,0,Chat
1,161,178,195,212,0,2,0,4,0,0,...,17,103,166,34,1,187,159,46,182,Chat
2,161,178,195,212,0,2,0,4,0,0,...,84,109,186,136,183,38,0,38,8,Chat
3,161,178,195,212,0,2,0,4,0,0,...,1,0,0,0,0,0,0,32,70,Chat
4,161,178,195,212,0,2,0,4,0,0,...,4,5,180,1,3,3,8,1,1,Chat


### 2.2 VPN/non-VPN binary classification - dataloader setup

In [5]:
nonVpnDf['Label'] = "0"
nonVpnDf['Label'] = nonVpnDf['Label'].astype('int64')
vpnDf['Label'] = "1"
vpnDf['Label'] = vpnDf['Label'].astype('int64')
df = pd.concat([nonVpnDf, vpnDf], ignore_index=True, sort=False)
trainDf, validateDf, testDf = np.split(df.sample(frac=1), [int(.8*len(df)), int(.9*len(df))])

In [6]:
trainDf.shape, validateDf.shape, testDf.shape

((5930, 1501), (741, 1501), (742, 1501))

In [7]:
class VpnBinaryDataset():

    def __init__(self, dataframe, transform=None, target_transform = None):
        self.df = dataframe
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self):
        return self.df.shape[0]

    def __getitem__(self, index):
        data = np.array([self.df.iloc[index, :-1]])
        data = torch.from_numpy(data).view(1, df.shape[1] - 1).float()
        target = self.df.iloc[index, -1]

        if self.transform is not None:
            data = self.transform(data)

        if self.target_transform is not None:
            target = self.target_transform(target)

        return data, target

In [8]:
trainDataset = VpnBinaryDataset(trainDf)
validateDataset = VpnBinaryDataset(validateDf)
testDataset = VpnBinaryDataset(testDf)
trainDataset[0][0].shape, trainDataset[0][0].dtype, trainDataset[0][0]

(torch.Size([1, 1500]),
 torch.float32,
 tensor([[161., 178., 195.,  ..., 178., 195., 212.]]))

In [9]:
loaderArgs = {'batch_size': 1000}

if use_cuda:
    loaderArgs.update({'num_workers': 1,
                       'pin_memory': True,
                       'shuffle': True}
                     )
else:
    loaderArgs.update({'shuffle': True})
    
trainLoader = torch.utils.data.DataLoader(trainDataset,**loaderArgs)
validateLoader = torch.utils.data.DataLoader(validateDataset, **loaderArgs)
testLoader = torch.utils.data.DataLoader(testDataset, **loaderArgs)

### 2.2 Define train, validate and test function

In [10]:
def train(args, model, device, train_loader, optimizer, epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % args['log_interval'] == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))
            if args['dry-run']:
                break
                
def validate(model, device, validate_loader):
    model.eval()
    validate_loss = 0
    correct = 0
    
    preds = []
    targets = []
    with torch.no_grad():
        for data, target in validate_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            validate_loss += F.nll_loss(output, target, reduction='sum').item()  # sum up batch loss
            pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()
            preds.append(pred.cpu())
            targets.append(target.view_as(pred).cpu())

    pred_all = torch.cat(preds).squeeze().tolist()
    target_all = torch.cat(targets).squeeze().tolist()
            
    rc = recall_score(pred_all, target_all, average='macro')
    pr = precision_score(pred_all, target_all, average='macro')
    
    validate_loss /= len(validate_loader.dataset)

    accuracy = 100. * correct / len(validate_loader.dataset)
    
    print('\nValidate set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n, Recall: {}, Precision: {}\n'.format(
        validate_loss, correct, len(validate_loader.dataset), accuracy, rc, pr))

    return validate_loader, accuracy

def test(model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    
    preds = []
    targets = []
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item()  # sum up batch loss
            pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()
            preds.append(pred.cpu())
            targets.append(target.view_as(pred).cpu())

    pred_all = torch.cat(preds).squeeze().tolist()
    target_all = torch.cat(targets).squeeze().tolist()
            
    rc = recall_score(pred_all, target_all, average='macro')
    pr = precision_score(pred_all, target_all, average='macro')
    
    test_loss /= len(test_loader.dataset)

    accuracy = 100. * correct / len(test_loader.dataset)
    
    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n, Recall: {}, Precision: {}\n'.format(
        test_loss, correct, len(test_loader.dataset), accuracy, rc, pr))

    return test_loader, accuracy

### 2.3 VPN/non-VPN binary classification - CNN model

In [11]:
class CNNBinary(nn.Module):
    def __init__(self):
        super(CNNBinary, self).__init__()
        self.conv1 = nn.Conv1d(1, 16, 7, 1)
        self.conv2 = nn.Conv1d(16, 32, 7, 1)

        self.fc1 = nn.Linear(11840, 512)
        self.fc2 = nn.Linear(512, 128)
        self.fc3 = nn.Linear(128, 2)

    def forward(self, x):
        # 1st Conv
        x = self.conv1(x)
        x = F.relu(x)
        x = F.max_pool1d(x, 2)
        
        # 2nd Conv
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool1d(x, 2)
        
        # Full Connect
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.fc2(x)
        x = F.relu(x)
        x = self.fc3(x)
        output = F.log_softmax(x, dim=1)
        return output

In [28]:
modelCnnBinary = CNNBinary().to(device)

args = {'lr': 0.5,
              'gamma': 0.7,
              'dry-run': False,
              'log_interval': 2,
              'epochs': 20
             }

optimizer = torch.optim.Adadelta(modelCnnBinary.parameters(), lr=args['lr'])
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=args['gamma'])

test_result = pd.DataFrame(columns=['Epoch','Loss','Accuracy'])

for epoch in range(1, args['epochs'] + 1):
    train(args, modelCnnBinary, device, trainLoader, optimizer, epoch)
    (loss, accuracy) = validate(modelCnnBinary, device, validateLoader)
    scheduler.step()


Validate set: Average loss: 3.7196, Accuracy: 400/741 (54%)
, Recall: 0.2699055330634278, Precision: 0.5


Validate set: Average loss: 2.6410, Accuracy: 400/741 (54%)
, Recall: 0.2699055330634278, Precision: 0.5


Validate set: Average loss: 0.3153, Accuracy: 633/741 (85%)
, Recall: 0.8937007874015748, Precision: 0.8416422287390029


Validate set: Average loss: 0.2052, Accuracy: 707/741 (95%)
, Recall: 0.9533933518005541, Precision: 0.9559860703812317


Validate set: Average loss: 0.1884, Accuracy: 707/741 (95%)
, Recall: 0.9537804718122076, Precision: 0.9566348973607038


Validate set: Average loss: 0.1749, Accuracy: 709/741 (96%)
, Recall: 0.9563246866802682, Precision: 0.9591348973607039


Validate set: Average loss: 0.1646, Accuracy: 709/741 (96%)
, Recall: 0.9563246866802682, Precision: 0.9591348973607039


Validate set: Average loss: 0.1566, Accuracy: 709/741 (96%)
, Recall: 0.9563246866802682, Precision: 0.9591348973607039


Validate set: Average loss: 0.1497, Accuracy: 716/741

In [30]:
(loss, accuracy) = test(modelCnnBinary, device, testLoader)


Test set: Average loss: 0.1207, Accuracy: 728/742 (98%)
, Recall: 0.9801530612244898, Precision: 0.9824316011482805



### 2.3 VPN/non-VPN binary classification - RNN model

In [14]:
class RNNBinary(nn.Module):
    def __init__(self):
        super(RNNBinary, self).__init__()
        self.lstm1 = nn.LSTM(1500, 512, 1, bidirectional=True)
        self.fc1 = nn.Linear(1024, 128)
        self.fc2 = nn.Linear(128, 2)

    def forward(self, x):
        x, _ = self.lstm1(x)
        # Full Connect
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output

In [15]:
modelRnnBinary = RNNBinary().to(device)

args = {'lr': 0.5,
              'gamma': 0.7,
              'dry-run': False,
              'log_interval': 2,
              'epochs': 20
             }

optimizer = torch.optim.Adadelta(modelRnnBinary.parameters(), lr=args['lr'])
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=args['gamma'])

test_result = pd.DataFrame(columns=['Epoch','Loss','Accuracy'])

for epoch in range(1, args['epochs'] + 1):
    train(args, modelRnnBinary, device, trainLoader, optimizer, epoch)
    (loss, accuracy) = validate(modelRnnBinary, device, validateLoader)
    scheduler.step()


Validate set: Average loss: 0.4442, Accuracy: 568/741 (77%)
, Recall: 0.8317120622568093, Precision: 0.78375


Validate set: Average loss: 0.4196, Accuracy: 571/741 (77%)
, Recall: 0.8336594911937378, Precision: 0.7875


Validate set: Average loss: 0.1787, Accuracy: 698/741 (94%)
, Recall: 0.9440104166666667, Precision: 0.94625


Validate set: Average loss: 0.1041, Accuracy: 723/741 (98%)
, Recall: 0.9749303621169916, Precision: 0.9775


Validate set: Average loss: 0.0857, Accuracy: 726/741 (98%)
, Recall: 0.9789325842696629, Precision: 0.98125


Validate set: Average loss: 0.0764, Accuracy: 728/741 (98%)
, Recall: 0.9816691984108437, Precision: 0.9835337243401759


Validate set: Average loss: 0.0726, Accuracy: 725/741 (98%)
, Recall: 0.977577902649055, Precision: 0.9797837243401759


Validate set: Average loss: 0.0676, Accuracy: 730/741 (99%)
, Recall: 0.984375, Precision: 0.9862500000000001


Validate set: Average loss: 0.0660, Accuracy: 730/741 (99%)
, Recall: 0.984375, Precision: 

In [29]:
(loss, accuracy) = test(modelRnnBinary, device, testLoader)


Test set: Average loss: 0.0634, Accuracy: 729/742 (98%)
, Recall: 0.9814814814814814, Precision: 0.9839108910891089



### 2.4 Detailed traffic multi classification - dataloader setup

In [17]:
vpnDf = pd.concat(vpnDataframes, ignore_index=True, sort=False)
nonVpnDf = pd.concat(nonVpnDataframes, ignore_index=True, sort=False)

vpnDfLabels = list(set(vpnDf['Label'].to_list()))
vpnDf['Label'] = vpnDf['Label'].apply(lambda x: vpnDfLabels.index(x))
vpnTrainDf, vpnValidateDf, vpnTestDf = np.split(vpnDf.sample(frac=1), [int(.8*len(vpnDf)), int(.9*len(vpnDf))])

nonVpnDfLabels = list(set(nonVpnDf['Label'].to_list()))
nonVpnDf['Label'] = nonVpnDf['Label'].apply(lambda x: nonVpnDfLabels.index(x))
nonVpnTrainDf, nonVpnValidateDf, nonVpnTestDf = np.split(nonVpnDf.sample(frac=1), [int(.8*len(nonVpnDf)), int(.9*len(nonVpnDf))])

In [18]:
vpnTrainDf.shape, vpnValidateDf.shape, vpnTestDf.shape, vpnDfLabels, nonVpnDfLabels

((3187, 1501),
 (398, 1501),
 (399, 1501),
 ['VPN-Email', 'VPN-P2P', 'VPN-Chat', 'VPN-Streaming'],
 ['P2P', 'Chat', 'Streaming', 'Email'])

In [19]:
vpnTrainDf

Unnamed: 0,DataIndex_0,DataIndex_1,DataIndex_2,DataIndex_3,DataIndex_4,DataIndex_5,DataIndex_6,DataIndex_7,DataIndex_8,DataIndex_9,...,DataIndex_1491,DataIndex_1492,DataIndex_1493,DataIndex_1494,DataIndex_1495,DataIndex_1496,DataIndex_1497,DataIndex_1498,DataIndex_1499,Label
1886,161,178,195,212,0,2,0,4,0,0,...,0,0,0,0,0,1,48,14,99,2
864,161,178,195,212,0,2,0,4,0,0,...,4,216,239,38,10,161,178,195,212,2
3869,161,178,195,212,0,2,0,4,0,0,...,111,103,108,101,46,105,116,130,11,3
3150,161,178,195,212,0,2,0,4,0,0,...,193,8,16,54,244,201,171,154,249,0
256,161,178,195,212,0,2,0,4,0,0,...,0,0,0,0,0,1,48,14,99,2
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
343,161,178,195,212,0,2,0,4,0,0,...,65,126,242,64,0,64,17,163,107,2
1691,161,178,195,212,0,2,0,4,0,0,...,4,216,239,38,10,161,178,195,212,2
2005,161,178,195,212,0,2,0,4,0,0,...,4,216,239,38,10,161,178,195,212,2
3819,161,178,195,212,0,2,0,4,0,0,...,199,187,104,5,217,227,41,70,77,3


In [36]:
class MultiDataset():

    def __init__(self, dataframe, transform=None, target_transform = None):
        self.df = dataframe
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self):
        return self.df.shape[0]

    def __getitem__(self, index):
        data = np.array([self.df.iloc[index, :-1]])
        data = torch.from_numpy(data).view(1, df.shape[1] - 1).float()
        target = self.df.iloc[index, -1]

        if self.transform is not None:
            data = self.transform(data)

        if self.target_transform is not None:
            target = self.target_transform(target)

        return data, target

In [80]:
vpnTrainDataset = MultiDataset(vpnTrainDf)
vpnValidateDataset = MultiDataset(vpnValidateDf)
vpnTestDataset = MultiDataset(vpnTestDf)

loaderArgs = {'batch_size': 200}

if use_cuda:
    loaderArgs.update({'num_workers': 1,
                       'pin_memory': True,
                       'shuffle': True}
                     )
else:
    loaderArgs.update({'shuffle': True})
    
vpnTrainLoader = torch.utils.data.DataLoader(vpnTrainDataset,**loaderArgs)
vpnValidateLoader = torch.utils.data.DataLoader(vpnValidateDataset, **loaderArgs)
vpnTestLoader = torch.utils.data.DataLoader(vpnTestDataset, **loaderArgs)

In [81]:
nonVpnTrainDataset = MultiDataset(nonVpnTrainDf)
nonVpnValidateDataset = MultiDataset(nonVpnValidateDf)
nonVpnTestDataset = MultiDataset(nonVpnTestDf)

loaderArgs = {'batch_size': 200}

if use_cuda:
    loaderArgs.update({'num_workers': 1,
                       'pin_memory': True,
                       'shuffle': True}
                     )
else:
    loaderArgs.update({'shuffle': True})
    
nonVpnTrainLoader = torch.utils.data.DataLoader(nonVpnTrainDataset,**loaderArgs)
nonVpnValidateLoader = torch.utils.data.DataLoader(nonVpnValidateDataset, **loaderArgs)
nonVpnTestLoader = torch.utils.data.DataLoader(nonVpnTestDataset, **loaderArgs)

### 2.5 Detailed traffic multi classification - RNN model - VPN

In [88]:
class RNNMulti(nn.Module):
    def __init__(self):
        super(RNNMulti, self).__init__()
        self.lstm1 = nn.LSTM(1500, 512, 1, bidirectional=True)
        self.fc1 = nn.Linear(1024, 128)
        self.fc2 = nn.Linear(128, 4)

    def forward(self, x):
        x, _ = self.lstm1(x)
        # Full Connect
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output

In [89]:
modelVpnRnnBinary = RNNMulti().to(device)

args = {'lr': 0.5,
              'gamma': 0.7,
              'dry-run': False,
              'log_interval': 2,
              'epochs': 20
             }

optimizer = torch.optim.Adadelta(modelVpnRnnBinary.parameters(), lr=args['lr'])
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=args['gamma'])

test_result = pd.DataFrame(columns=['Epoch','Loss','Accuracy'])

for epoch in range(1, args['epochs'] + 1):
    train(args, modelVpnRnnBinary, device, vpnTrainLoader, optimizer, epoch)
    (loss, accuracy) = test(modelVpnRnnBinary, device, vpnValidateLoader)
    scheduler.step()


Test set: Average loss: 0.4453, Accuracy: 342/398 (86%)
, Recall: 0.8061075762763448, Precision: 0.5953007335643992


Test set: Average loss: 0.3337, Accuracy: 346/398 (87%)
, Recall: 0.9641873278236914, Precision: 0.5887445887445888


Test set: Average loss: 0.2274, Accuracy: 372/398 (93%)
, Recall: 0.9228209688783489, Precision: 0.8060299828788574


Test set: Average loss: 0.1912, Accuracy: 376/398 (94%)
, Recall: 0.9363722697056032, Precision: 0.8406620175108921


Test set: Average loss: 0.1703, Accuracy: 376/398 (94%)
, Recall: 0.9336085311317509, Precision: 0.8295611141270305


Test set: Average loss: 0.1599, Accuracy: 378/398 (95%)
, Recall: 0.9493634259259259, Precision: 0.8490416336075499


Test set: Average loss: 0.1515, Accuracy: 380/398 (95%)
, Recall: 0.9365547489413188, Precision: 0.8709650478139224


Test set: Average loss: 0.1503, Accuracy: 380/398 (95%)
, Recall: 0.9365547489413188, Precision: 0.8709650478139224


Test set: Average loss: 0.1484, Accuracy: 378/398 (95%)


Test set: Average loss: 0.1412, Accuracy: 379/398 (95%)
, Recall: 0.9288309054029136, Precision: 0.8701611892930221


Test set: Average loss: 0.1383, Accuracy: 380/398 (95%)
, Recall: 0.93075069144377, Precision: 0.8888378502526413


Test set: Average loss: 0.1414, Accuracy: 381/398 (96%)
, Recall: 0.9421146044624746, Precision: 0.8828698097186842


Test set: Average loss: 0.1412, Accuracy: 379/398 (95%)
, Recall: 0.9288013318534961, Precision: 0.8769330883478794



In [90]:
(loss, accuracy) = test(modelVpnRnnBinary, device, vpnTestLoader)


Test set: Average loss: 0.1393, Accuracy: 381/399 (95%)
, Recall: 0.9201564334200991, Precision: 0.8910653974508463



### 2.5 Detailed traffic multi classification - RNN model - non-VPN

In [32]:
modelNonVpnRnnBinary = RNNMulti().to(device)

args = {'lr': 0.5,
              'gamma': 0.7,
              'dry-run': False,
              'log_interval': 2,
              'epochs': 20
             }

optimizer = torch.optim.Adadelta(modelNonVpnRnnBinary.parameters(), lr=args['lr'])
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=args['gamma'])

test_result = pd.DataFrame(columns=['Epoch','Loss','Accuracy'])

for epoch in range(1, args['epochs'] + 1):
    train(args, modelNonVpnRnnBinary, device, nonVpnTrainLoader, optimizer, epoch)
    (loss, accuracy) = test(modelNonVpnRnnBinary, device, nonVpnValidateLoader)
    scheduler.step()


Test set: Average loss: 1.2831, Accuracy: 124/343 (36%)
, Recall: 0.24102564102564103, Precision: 0.36650485436893204


Test set: Average loss: 1.0840, Accuracy: 156/343 (45%)
, Recall: 0.44008526850507984, Precision: 0.406576774360082


Test set: Average loss: 0.9324, Accuracy: 205/343 (60%)
, Recall: 0.7310259774975092, Precision: 0.6397339842700536


Test set: Average loss: 0.7707, Accuracy: 202/343 (59%)
, Recall: 0.7342936850275477, Precision: 0.557347409998142


Test set: Average loss: 0.5146, Accuracy: 256/343 (75%)
, Recall: 0.7674564581744217, Precision: 0.7703202127790896


Test set: Average loss: 0.4624, Accuracy: 276/343 (80%)
, Recall: 0.8526563505896023, Precision: 0.8243458095003491


Test set: Average loss: 0.4228, Accuracy: 272/343 (79%)
, Recall: 0.8118930063677798, Precision: 0.8128229634872485


Test set: Average loss: 0.3996, Accuracy: 281/343 (82%)
, Recall: 0.8483388765705838, Precision: 0.8367134948161908


Test set: Average loss: 0.3868, Accuracy: 283/343 (83%


Test set: Average loss: 0.3671, Accuracy: 288/343 (84%)
, Recall: 0.8563544325134778, Precision: 0.8542572692878422


Test set: Average loss: 0.3669, Accuracy: 286/343 (83%)
, Recall: 0.8507798876902616, Precision: 0.8491026301125845



In [34]:
(loss, accuracy) = test(modelNonVpnRnnBinary, device, nonVpnTestLoader)


Test set: Average loss: 0.3470, Accuracy: 298/343 (87%)
, Recall: 0.8782962803322586, Precision: 0.8695638307480413



## 3 Final result evaluation

We have done three expriment:
1. Binary classfication on VPN and non-VPN traffic
2. OpenVPN traffic classfication on 4 classes
3. Regular encryted traffic (TLS,HTTPS) classfication on 4 classes

### Experiment 1

In this experiment, we re-implemented previous work on 1-D CNN (Wang, Wei, et al. 2017) and compared to RNN(LSTM) model, there are no slight difference on performance.

### Experiment 2

In this experiment, RNN(LSTM) model is used to identify traffic under OpenVPN encryption, the performance on recall and precision is worse than 1-D CNN. However, the accuracy itself is acceptable.

### Experiment3 

In this experiment, RNN(LSTM) model is 