# Evaluation of Python libraries for packet manipulation

This notebook compares the performance of three libraries:
- [PyShark](https://kiminewt.github.io/pyshark/)
- [dpkt](https://dpkt.readthedocs.io/en/latest/)
- [Scapy](https://scapy.net/)

The data used for the comparison is the Backdoor Malware PCAP file from the [CIC-IoT-2023](https://www.unb.ca/cic/datasets/iotdataset-2023.html) dataset, chosen mainly because of its size (11.1 MB).

The tests were conducted on a laptop with an AMD Ryzen 5 2500U 2.0 GHz processor and 12 GB of DDR4 2667 MHz RAM, running Fedora Linux (Silverblue 40.20241006.0).

In [None]:
import warnings
warnings.filterwarnings('ignore') # suppress PyShark warnings to keep the output clean

import time

N = 10 # total number of iterations for each algorithm
filepath = '../datasets/CIC-IoT-2023/PCAP/Backdoor_Malware/Backdoor_Malware.pcap'

## PyShark

### Keeping all the packets in memory

In [None]:
import pyshark

mean_time = 0

for i in range(N):
    start = time.time()

    cap = pyshark.FileCapture(filepath)
    total = 0

    # workaround for a runtime error, see
    # https://github.com/KimiNewt/pyshark/issues/360#issuecomment-520208294
    def add_length_pkt(pkt):
        global total
        total += int(pkt.length)
    await cap.packets_from_tshark(add_length_pkt)

    running_time = time.time() - start
    mean_time += running_time

    print(i+1, total, 'bytes', running_time, 'seconds')

print(mean_time/N, 'seconds per iteration on average')

1 10010158 bytes 138.86471796035767 seconds
2 10010158 bytes 140.4166224002838 seconds
3 10010158 bytes 140.6231734752655 seconds
4 10010158 bytes 141.73613381385803 seconds
5 10010158 bytes 140.59770107269287 seconds
6 10010158 bytes 142.73043489456177 seconds
7 10010158 bytes 142.49664521217346 seconds
8 10010158 bytes 142.88600778579712 seconds
9 10010158 bytes 142.71679711341858 seconds
10 10010158 bytes 142.1335346698761 seconds
141.52017683982848 seconds per iteration on average
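
The timing loop above is repeated in every benchmark in this notebook, so it could be factored into a helper. The following is a minimal sketch and not part of the original measurements: `benchmark` and its parameters are hypothetical names, and it uses `time.perf_counter()`, a clock intended for measuring intervals, instead of `time.time()`. It also reports the standard deviation, which requires at least two iterations. The PyShark cells rely on `await`, so they would need an asynchronous variant of this helper.

```python
import statistics
import time


def benchmark(func, iterations=10):
    """Run func() `iterations` times; return (mean, stdev) of the durations in seconds."""
    durations = []
    for i in range(iterations):
        start = time.perf_counter()
        result = func()  # expected to return the total number of bytes read
        elapsed = time.perf_counter() - start
        durations.append(elapsed)
        print(i + 1, result, 'bytes', elapsed, 'seconds')
    return statistics.mean(durations), statistics.stdev(durations)


# Stand-in workload (an assumption for illustration only); in this notebook it
# would be a function that reads the capture and returns the byte total.
mean_time, spread = benchmark(lambda: sum(range(100_000)), iterations=3)
print(mean_time, 'seconds per iteration on average, stdev', spread)
```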


### Without keeping the packets in memory

In [None]:
mean_time = 0

for i in range(N):
    start = time.time()

    cap = pyshark.FileCapture(filepath, keep_packets=False)
    total = 0

    # workaround for a runtime error, see
    # https://github.com/KimiNewt/pyshark/issues/360#issuecomment-520208294
    def add_length_pkt(pkt):
        global total
        total += int(pkt.length)
    await cap.packets_from_tshark(add_length_pkt)

    running_time = time.time() - start
    mean_time += running_time

    print(i+1, total, 'bytes', running_time, 'seconds')

print(mean_time/N, 'seconds per iteration on average')

1 10010158 bytes 143.27460551261902 seconds
2 10010158 bytes 143.85054421424866 seconds
3 10010158 bytes 143.54138255119324 seconds
4 10010158 bytes 143.4262135028839 seconds
5 10010158 bytes 143.9052414894104 seconds
6 10010158 bytes 143.80340147018433 seconds
7 10010158 bytes 143.6048903465271 seconds
8 10010158 bytes 143.39428973197937 seconds
9 10010158 bytes 143.9031219482422 seconds
10 10010158 bytes 144.0539631843567 seconds
143.6757653951645 seconds per iteration on average


## dpkt

The `dpkt.pcap.Reader` class used below cannot read PCAPNG files, so before running the comparison we need to convert the dataset file to the PCAP format.

Note: you may have noticed that the extension in the `filepath` variable is `.pcap`. However, the file starts with the byte prefix of the PCAPNG file format (https://pcapng.com/#SHB_BlockType), not that of the PCAP format (https://wiki.wireshark.org/Development/LibpcapFileFormat#global-header), so it is actually a PCAPNG file despite its extension.
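
The byte-prefix check described above can be sketched with the standard library alone. This is an illustrative helper, not something the notebook ran: `detect_capture_format` is a hypothetical name, and the magic numbers are the ones documented in the two format descriptions linked above (plus the nanosecond-resolution PCAP variant).

```python
# First four bytes of a PCAPNG Section Header Block; the value 0x0A0D0D0A
# is a palindrome, so it reads the same in either byte order.
PCAPNG_MAGIC = b"\x0a\x0d\x0d\x0a"

# Classic PCAP magic numbers: microsecond and nanosecond resolution,
# each in big-endian and little-endian byte order.
PCAP_MAGICS = {
    b"\xa1\xb2\xc3\xd4", b"\xd4\xc3\xb2\xa1",
    b"\xa1\xb2\x3c\x4d", b"\x4d\x3c\xb2\xa1",
}


def detect_capture_format(prefix: bytes) -> str:
    """Classify a capture file from its first four bytes."""
    magic = prefix[:4]
    if magic == PCAPNG_MAGIC:
        return "pcapng"
    if magic in PCAP_MAGICS:
        return "pcap"
    return "unknown"


# In-memory examples; for a real file, pass the result of open(path, 'rb').read(4).
print(detect_capture_format(b"\x0a\x0d\x0d\x0a\x1c\x00\x00\x00"))
print(detect_capture_format(b"\xd4\xc3\xb2\xa1\x02\x00\x04\x00"))
```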

In [None]:
import dpkt
import os
import subprocess

# convert the capture to the classic PCAP format
filepath_dpkt = '/tmp/converted.pcap'
subprocess.run(['tcpdump', '-r', filepath, '-w', filepath_dpkt], check=True)

# running the test
mean_time = 0

for i in range(N):
    start = time.time()

    with open(filepath_dpkt, 'rb') as f: # the file is closed at the end of each iteration
        cap = dpkt.pcap.Reader(f)
        total = 0

        for ts, buf in cap:
            total += len(buf)

    running_time = time.time() - start
    mean_time += running_time

    print(i+1, total, 'bytes', running_time, 'seconds')

print(mean_time/N, 'seconds per iteration on average')

# removing the temporary converted file
os.remove(filepath_dpkt)

reading from file ../datasets/CIC-IoT-2023/PCAP/Backdoor_Malware/Backdoor_Malware.pcap, link-type EN10MB (Ethernet), snapshot length 262144


1 10010158 bytes 0.13601422309875488 seconds
2 10010158 bytes 0.13094067573547363 seconds
3 10010158 bytes 0.11518335342407227 seconds
4 10010158 bytes 0.11715197563171387 seconds
5 10010158 bytes 0.11657929420471191 seconds
6 10010158 bytes 0.11336421966552734 seconds
7 10010158 bytes 0.1155400276184082 seconds
8 10010158 bytes 0.12699413299560547 seconds
9 10010158 bytes 0.12864136695861816 seconds
10 10010158 bytes 0.1307525634765625 seconds
0.12311618328094483 seconds per iteration on average
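
The dpkt loop above only needs the raw bytes of each record (`buf`) and never looks inside them. To show how little work that involves, here is a rough standard-library sketch of walking a classic PCAP file. It is not dpkt's implementation: `sum_captured_bytes` and `build_sample_pcap` are hypothetical helpers, and the layout assumed is the 24-byte global header followed by 16-byte record headers described in the libpcap file format page linked earlier.

```python
import io
import struct

PCAP_GLOBAL_HEADER_LEN = 24  # magic, version, timezone, sigfigs, snaplen, link type
PCAP_RECORD_HEADER_LEN = 16  # ts_sec, ts_frac, incl_len, orig_len


def sum_captured_bytes(stream):
    """Return the total captured length of all records in a classic PCAP stream."""
    header = stream.read(PCAP_GLOBAL_HEADER_LEN)
    if len(header) < PCAP_GLOBAL_HEADER_LEN:
        raise ValueError("truncated PCAP global header")
    magic = header[:4]
    if magic in (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1"):
        endian = "<"  # file written in little-endian byte order
    elif magic in (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d"):
        endian = ">"  # file written in big-endian byte order
    else:
        raise ValueError("not a classic PCAP file")

    total = 0
    while True:
        record = stream.read(PCAP_RECORD_HEADER_LEN)
        if len(record) < PCAP_RECORD_HEADER_LEN:
            break  # end of file (or a truncated trailing record)
        _sec, _frac, incl_len, _orig_len = struct.unpack(endian + "IIII", record)
        total += len(stream.read(incl_len))  # raw bytes only, no dissection
    return total


def build_sample_pcap(payloads):
    """Build a tiny little-endian PCAP file in memory (synthetic test data)."""
    data = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 262144, 1)
    for payload in payloads:
        data += struct.pack("<IIII", 0, 0, len(payload), len(payload)) + payload
    return data


sample = build_sample_pcap([b"abc", b"hello"])
print(sum_captured_bytes(io.BytesIO(sample)), 'bytes')
```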


## Scapy

### Reading all the packets at once

In [None]:
from scapy.all import rdpcap, sniff # sniff is used in the next cell

mean_time = 0

for i in range(N):
    start = time.time()

    cap = rdpcap(filepath)
    total = 0

    for pkt in cap:
        total += len(pkt)

    running_time = time.time() - start
    mean_time += running_time

    print(i+1, total, 'bytes', running_time, 'seconds')

print(mean_time/N, 'seconds per iteration on average')

1 10010158 bytes 12.906911134719849 seconds
2 10010158 bytes 13.645633935928345 seconds
3 10010158 bytes 14.013576030731201 seconds
4 10010158 bytes 14.000534534454346 seconds
5 10010158 bytes 13.923912286758423 seconds
6 10010158 bytes 14.505949974060059 seconds
7 10010158 bytes 13.885375499725342 seconds
8 10010158 bytes 14.434301376342773 seconds
9 10010158 bytes 13.907682418823242 seconds
10 10010158 bytes 13.881497859954834 seconds
13.910537505149842 seconds per iteration on average


### Without keeping the packets in memory

In [None]:
mean_time = 0

for i in range(N):
    start = time.time()

    total = 0

    def add_length_pkt(pkt):
        global total
        total += len(pkt)

    sniff(offline=filepath, prn=add_length_pkt, store=0)

    running_time = time.time() - start
    mean_time += running_time

    print(i+1, total, 'bytes', running_time, 'seconds')

print(mean_time/N, 'seconds per iteration on average')

1 10010158 bytes 12.10836124420166 seconds
2 10010158 bytes 12.1137855052948 seconds
3 10010158 bytes 12.099311590194702 seconds
4 10010158 bytes 12.091962814331055 seconds
5 10010158 bytes 12.148155689239502 seconds
6 10010158 bytes 12.092785120010376 seconds
7 10010158 bytes 13.581264972686768 seconds
8 10010158 bytes 12.167074203491211 seconds
9 10010158 bytes 12.188871383666992 seconds
10 10010158 bytes 12.222804069519043 seconds
12.281437659263611 seconds per iteration on average
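
To put the five averages side by side, the sketch below computes how many times slower each approach is than the fastest one. The numbers are copied from the outputs above and the labels are ad hoc names chosen here. Note that the dpkt figure does not include the one-off `tcpdump` conversion, which runs before the timed loop.

```python
# Mean seconds per iteration, copied from the outputs in this notebook.
mean_seconds = {
    "PyShark (packets kept)": 141.52017683982848,
    "PyShark (keep_packets=False)": 143.6757653951645,
    "dpkt": 0.12311618328094483,
    "Scapy (rdpcap)": 13.910537505149842,
    "Scapy (sniff, store=0)": 12.281437659263611,
}

# Use the fastest approach as the baseline and express the rest relative to it.
baseline = min(mean_seconds.values())
slowdown = {name: seconds / baseline for name, seconds in mean_seconds.items()}

for name, factor in sorted(slowdown.items(), key=lambda item: item[1]):
    print(f"{name}: {mean_seconds[name]:.3f} s, {factor:.1f}x the fastest")
```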
