### GNNifyID

A pipeline that converts raw PCAP files into two outputs:

1. Extracted flow-based features, together with the packet-level information of each flow.
2. A data object for GNN models, in which the flow-based features and packet-level information of each flow are transformed into an individual graph.

This transformation can be used for graph-level prediction.

In [17]:
from utility.functions import *
import glob  # used below to list the compressed archives and the extracted feature files
import shutil
import tarfile

#### Extraction of Compressed PCAP Files of CIC-IoT2023 Dataset

In [None]:
# Provide path to the directory where Raw Pcap Files are downloaded
Directory = "F:\\CIC IoT Dataset 2023\\*.tar.gz"
# Path where you want the extracted PCAP files to be
Out_Directory = 'G:\\CIC_IOT\\Packet_Level_Data'

Compressed_files = glob.glob(Directory)

In [None]:
for archive_path in Compressed_files:
    # Close each archive as soon as its contents have been extracted
    with tarfile.open(archive_path) as archive:
        archive.extractall(Out_Directory)
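
A slightly more defensive variant of the extraction step could look like the sketch below. `extract_archives` is a hypothetical helper name (it is not part of `utility.functions`). It assumes the archives are gzip-compressed tarballs, and it only passes the `filter="data"` argument when the running Python version provides extraction filters.

```python
import glob
import os
import tarfile


def extract_archives(pattern, out_dir):
    """Extract every .tar.gz archive matching `pattern` into `out_dir`.

    Hypothetical helper for illustration; returns the archives it extracted.
    """
    os.makedirs(out_dir, exist_ok=True)
    extracted = []
    for archive_path in sorted(glob.glob(pattern)):
        # The context manager closes the archive even if extraction fails.
        with tarfile.open(archive_path, "r:gz") as archive:
            if hasattr(tarfile, "data_filter"):
                # Newer Python versions: reject unsafe members (e.g. paths
                # that would escape out_dir).
                archive.extractall(out_dir, filter="data")
            else:
                archive.extractall(out_dir)
        extracted.append(archive_path)
    return extracted
```

With the paths used in this notebook, the call would be `extract_archives(Directory, Out_Directory)`.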

#### Renaming the PCAP files

Rename the PCAP files so that the attack classes are easier to distinguish when transforming the data into graphs.

In [None]:
name_mapping = {
    'Benign': 'Benign',
    'DDoS-ACK_Fragmentation': 'DDos-AckFrg',
    'DDoS-UDP_Flood': 'DDos-UDPFlood',
    'DDos-SlowLoris': 'DDos-SlowLoris',
    'DDoS-ICMP_Flood': 'DDos-ICMPFlood',
    'DDoS-RSTFINFlood': 'DDos-RSTFIN',
    'DDoS-PSHACK_Flood': 'DDos-PSHACK',
    'DDoS-HTTP_Flood': 'DDos-HTTPFlood',
    'DDoS-UDP_Fragmentation': 'DDos-UDPFrg',
    'NaN': 'DDos-ICMPFrg',
    'DDoS-TCP_Flood': 'DDos-TCPFlood',
    'DDoS-SYN_Flood': 'DDos-SYNFlood',
    'DDoS-SynonymousIP_Flood': 'DDos-SynonymousIPFlood',
    'DoS-TCP_Flood': 'Dos-TCPFlood',
    'DoS-HTTP_Flood': 'Dos-HTTPFlood',
    'DoS-SYN_Flood': 'Dos-SYNFlood',
    'DoS-UDP_Flood': 'Dos-UDPFlood',
    'Recon-PingSweep': 'Recon-PingSweep',
    'Recon-OSScan': 'Recon-OSScan',
    'VulnerabilityScan': 'Recon-VulScan',
    'Recon-PortScan': 'Recon-PortScan',
    'Recon-HostDiscovery': 'Recon-HostDisc',
    'SqlInjection': 'WebBased-SqlInject',
    'CommandInjection': 'WebBased-CmmdInject',
    'Backdoor_Malware': 'WebBased-BckdoorMalware',
    'Uploading_Attack': 'WebBased-UploadAttack',
    'XSS': 'WebBased-XSS',
    'BrowserHijacking': 'Webbased-BrwserHijack',
    'DictionaryBruteForce': 'BruteForce-Dictionary',
    'MITM-ArpSpoofing': 'Spoofing-ARP',
    'DNS_Spoofing': 'Spoofing-DNS',
    'Mirai-greip_flood': 'Mirai-GREIP',
    'Mirai-greeth_flood': 'Mirai-Greeth',
    'Mirai-udpplain': 'Mirai-UDPPlain',
}

In [None]:
Out_Directory = Out_Directory+"\\"
rename_files(Out_Directory, name_mapping)
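
`rename_files` comes from `utility.functions` and its implementation is not shown here. As a rough illustration of what a mapping-driven rename could look like, the sketch below uses a hypothetical stand-in, `rename_by_mapping`. It assumes that each file's name without its extension is a key in the mapping. The real helper may behave differently.

```python
import os


def rename_by_mapping(directory, mapping):
    """Rename files whose stem (name without extension) is a key in `mapping`.

    Hypothetical stand-in for illustration only; it is not the notebook's
    `rename_files`. Returns a list of (old_name, new_name) pairs.
    """
    renamed = []
    for entry in sorted(os.listdir(directory)):
        old_path = os.path.join(directory, entry)
        if not os.path.isfile(old_path):
            continue
        stem, ext = os.path.splitext(entry)
        new_stem = mapping.get(stem)
        if new_stem is None or new_stem == stem:
            continue  # unmapped, or already carries its target name
        new_name = new_stem + ext
        new_path = os.path.join(directory, new_name)
        if os.path.exists(new_path):
            # Refuse to overwrite an existing file silently.
            raise FileExistsError(new_path)
        os.rename(old_path, new_path)
        renamed.append((entry, new_name))
    return renamed
```

Returning the list of renamed pairs makes it easy to confirm that every expected capture was matched by the mapping.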

#### Extracting Features from PCAP Files

Extraction of flow-level and packet-level features from the PCAP files.

In [None]:
# Run the feature extractor as a separate process, since it has issues when run inside the notebook
!python F:/Feature_extractor_flow_packet_combined.py
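
Where the `!` shell escape is not available (for example, when this step is moved into a plain script), the same idea can be sketched with `subprocess`. `run_script` is a hypothetical helper name. It launches the script with the interpreter that is running the notebook and raises an error if the script exits with a non-zero status.

```python
import subprocess
import sys


def run_script(script_path, *args):
    """Run a Python script in a child process and return its standard output.

    Hypothetical helper for illustration; uses the current interpreter so the
    child process sees the same installed packages.
    """
    result = subprocess.run(
        [sys.executable, script_path, *args],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        raise RuntimeError(f"{script_path} failed:\n{result.stderr}")
    return result.stdout
```

For this notebook, the call would be `run_script("F:/Feature_extractor_flow_packet_combined.py")`.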

#### Transformation of Extracted Features into Graph Data Objects

In [None]:
dir = "F:/GNN_Project/data/"
Files = glob.glob("F:/GNN_Project/data/raw/*")

In [None]:
# Dictionary mapping each class name to a unique integer class label

Dict_x = {
    'Benign': 0,
    'DDos-AckFrg': 1,
    'DDos-UDPFlood': 2,
    'DDos-SlowLoris': 3,
    'DDos-ICMPFlood': 4,
    'DDos-RSTFIN': 5,
    'DDos-PSHACK': 6,
    'DDos-HTTPFlood': 7,
    'DDos-UDPFrg': 8,
    'DDos-ICMPFrg': 9,
    'DDos-TCPFlood': 10,
    'DDos-SYNFlood': 11,
    'DDos-SynonymousIPFlood': 12,
    'Dos-TCPFlood': 13,
    'Dos-HTTPFlood': 14,
    'Dos-SYNFlood': 15,
    'Dos-UDPFlood': 16,
    'Recon-PingSweep': 17,
    'Recon-OSScan': 18,
    'Recon-VulScan': 19,
    'Recon-PortScan': 20,
    'Recon-HostDisc': 21,
    'WebBased-SqlInject': 22,
    'WebBased-CmmdInject': 23,
    'WebBased-BckdoorMalware': 24,
    'WebBased-UploadAttack': 25,
    'WebBased-XSS': 26,
    'Webbased-BrwserHijack': 27,
    'BruteForce-Dictionary': 28,
    'Spoofing-ARP': 29,
    'Spoofing-DNS': 30,
    'Mirai-GREIP': 31,
    'Mirai-Greeth': 32,
    'Mirai-UDPPlain': 33,
}
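
A hand-written label dictionary of this size is easy to get wrong, so a quick consistency check before building the dataset may be worthwhile. The sketch below assumes the intent is one distinct integer per class, numbered contiguously from 0. `find_label_problems` is a hypothetical helper, not part of `utility.functions`.

```python
from collections import defaultdict


def find_label_problems(label_dict):
    """Check a class-name -> integer-label mapping for inconsistencies.

    Returns (duplicates, missing), where
      duplicates: {label: [class names sharing that label]}
      missing:    sorted labels absent from the range 0 .. len(label_dict) - 1
    """
    names_by_label = defaultdict(list)
    for name, label in label_dict.items():
        names_by_label[label].append(name)
    duplicates = {
        label: names for label, names in names_by_label.items() if len(names) > 1
    }
    expected = set(range(len(label_dict)))
    missing = sorted(expected - set(names_by_label))
    return duplicates, missing
```

Calling `find_label_problems(Dict_x)` and checking that both returned values are empty confirms that no two classes share a label and that no label in the range is skipped.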


In [None]:
data = NIDSDataset(root=dir, label_dict=Dict_x, filename=Files, skip_processing=False)
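
To make the idea of "one graph per flow" concrete, the following is a library-free sketch of how a single flow might be laid out as a graph. Everything in it is an assumption made for illustration and is not taken from `NIDSDataset`: each packet becomes a node with features `[size, gap since previous packet]`, consecutive packets are joined by a directed edge, and the flow's class number is the graph-level label. `flow_to_graph` is a hypothetical name.

```python
def flow_to_graph(packets, label):
    """Describe one flow as a graph using plain Python containers.

    `packets` is a list of (timestamp_seconds, size_bytes) tuples in arrival
    order. Assumed layout (illustrative only): one node per packet, one
    directed edge between each pair of consecutive packets.
    """
    node_features = []
    previous_time = None
    for timestamp, size in packets:
        gap = 0.0 if previous_time is None else timestamp - previous_time
        node_features.append([float(size), gap])
        previous_time = timestamp
    # Edges stored as two parallel lists: sources[i] -> targets[i].
    sources = list(range(len(packets) - 1))
    targets = list(range(1, len(packets)))
    return {"x": node_features, "edge_index": [sources, targets], "y": label}


graph = flow_to_graph([(0.0, 60), (0.5, 1500), (2.0, 40)], label=2)
```

A graph-level model then predicts the single label `y` from the whole set of nodes and edges, rather than predicting one label per packet.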