Introduction: This post is a tutorial to get you familiar with dpkt, a powerful Python library for working with network packets (for example, those stored in .pcap files). It provides functionality for parsing, manipulating, and analyzing packet captures. With dpkt, we can easily read packet capture files, access packet headers, extract information, and even modify packet contents.
In the code below, I'll show a few essential functions of this library: reading a pcap file and extracting information from it, writing network packets to a pcap file, and analyzing HTTP traffic.

First, install dpkt.

In [1]:
pip install dpkt

Collecting dpkt
  Downloading dpkt-1.9.8-py3-none-any.whl (194 kB)
Installing collected packages: dpkt
Successfully installed dpkt-1.9.8
Note: you may need to restart the kernel to use updated packages.


In [15]:
import dpkt
import socket

First step: Read a packet capture file, then parse the protocol headers of each packet: Ethernet, IP, and then TCP, UDP, or ICMP.

In [27]:
with open('ipv4frags.pcap', 'rb') as f:
    pcap = dpkt.pcap.Reader(f)
    for timestamp, buf in pcap:
        # Process the packet
        eth = dpkt.ethernet.Ethernet(buf)

        # Check if it's an IP packet
        if isinstance(eth.data, dpkt.ip.IP):
            ip = eth.data
            
            # Accessing Ethernet header fields
            eth_src = eth.src
            eth_dst = eth.dst
            
            # Accessing IP header fields
            ip_src = socket.inet_ntoa(ip.src)
            ip_dst = socket.inet_ntoa(ip.dst)
            
            # Processing TCP packet
            if isinstance(ip.data, dpkt.tcp.TCP):
                tcp = ip.data
                payload = tcp.data
                src_port = tcp.sport
                dst_port = tcp.dport
                print('Source Port:', src_port)
                print('Destination Port:', dst_port)
                
            # Processing UDP packet
            elif isinstance(ip.data, dpkt.udp.UDP):
                udp = ip.data
                payload = udp.data
                src_port = udp.sport
                dst_port = udp.dport
                print('Source Port:', src_port)
                print('Destination Port:', dst_port)


            
            # Processing ICMP packet
            elif isinstance(ip.data, dpkt.icmp.ICMP):
                icmp = ip.data
                icmp_payload = icmp.data
                
                icmp_type = icmp.type
                icmp_code = icmp.code
                
                print("Source IP:", ip_src)
                print("Destination IP:", ip_dst)
                print("ICMP Type:", icmp_type)
                print("ICMP Code:", icmp_code)
                print("ICMP payload:", icmp_payload)
                print("-----")


Source IP: 2.1.1.2
Destination IP: 2.1.1.1
ICMP Type: 8
ICMP Code: 0
ICMP payload: b'\x13\xc2\x00\x01\x14+\xd2Y\x00\x00\x00\x00=*\x08\x00\x00\x00\x00\x00\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMN

As you can see, dpkt successfully extracted the source and destination IP addresses and identified the network protocol. Most importantly, dpkt identifies the type of packet we're analyzing (e.g., TCP, UDP, ICMP) from the protocol field of the IP header, and we can check the result with isinstance. We can then access the payload data through dpkt's API. Extracting data from the payload is helpful for examining network security because we can look for suspicious patterns, known malware signatures, or indicators of compromise.
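
To make the idea of looking for suspicious patterns concrete, here is a minimal sketch of a payload scan. It is only an illustration: the `SIGNATURES` table, its names, and its byte patterns are made up for this example, and real detection rules are far more involved. The function works on plain bytes, so it can be fed the payload extracted in the loop above (wrap the value in `bytes()` if it is a parsed dpkt object rather than raw bytes).

```python
# Hypothetical signature table: the names and byte patterns are illustrative only.
SIGNATURES = {
    "shell-invocation": b"/bin/sh",
    "powershell-download": b"DownloadString(",
}


def find_signatures(payload, signatures=SIGNATURES):
    """Return the sorted names of all signatures whose pattern occurs in payload."""
    return sorted(name for name, pattern in signatures.items() if pattern in payload)


# Made-up payloads; in the parsing loop you would pass tcp.data or udp.data instead.
clean = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"
suspicious = b"POST /upload HTTP/1.1\r\n\r\ncmd=/bin/sh -c id"

print(find_signatures(clean))       # []
print(find_signatures(suspicious))  # ['shell-invocation']
```

A plain substring test keeps the sketch short; it will miss patterns that are split across packets, so treat it as a starting point only.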

Second step: dpkt can create network packets from scratch. We can construct headers for different protocols, set field values, and generate custom packets.

In [30]:
with open('output.pcap', 'wb') as f:
    pcap_writer = dpkt.pcap.Writer(f)

    # Create a packet
    eth = dpkt.ethernet.Ethernet()
    eth.src = b'\x00\x11\x22\x33\x44\x55'  # Source MAC address
    eth.dst = b'\xaa\xbb\xcc\xdd\xee\xff'  # Destination MAC address

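    ip = dpkt.ip.IP()
    ip.src = b'\xc0\xa8\x01\x01'  # Source IP address as bytes (192.168.1.1)
    ip.dst = b'\xc0\xa8\x01\x02'  # Destination IP address as bytes (192.168.1.2)
    ip.p = dpkt.ip.IP_PROTO_TCP   # Protocol field defaults to 0, so mark the payload as TCP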

    tcp = dpkt.tcp.TCP()
    tcp.sport = 1234  # Source port
    tcp.dport = 80    # Destination port

    tcp.data = b'GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n'  # HTTP request payload as bytes

    ip.data = tcp
    eth.data = ip

    # Write the packet to the pcap file
    pcap_writer.writepkt(eth)

    # You can write more packets here if needed

# The pcap file is now written and saved as 'output.pcap'
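
The cell above spells out MAC and IP addresses as raw byte literals, which is easy to get wrong. As a small sketch, the helpers below convert between human-readable strings and the byte form that the dpkt fields hold. `mac_to_bytes` and `bytes_to_mac` are hypothetical helper names introduced here for illustration; `socket.inet_aton` and `socket.inet_ntoa` come from the Python standard library.

```python
import socket


def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    raw = bytes.fromhex(mac.replace(":", ""))
    if len(raw) != 6:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return raw


def bytes_to_mac(raw):
    """Convert 6 raw bytes (such as eth.src) into 'aa:bb:cc:dd:ee:ff'."""
    return ":".join(f"{b:02x}" for b in raw)


# The same addresses used in the packet-building cell
print(mac_to_bytes("00:11:22:33:44:55") == b"\x00\x11\x22\x33\x44\x55")  # True
print(bytes_to_mac(b"\xaa\xbb\xcc\xdd\xee\xff"))                         # aa:bb:cc:dd:ee:ff
print(socket.inet_aton("192.168.1.1") == b"\xc0\xa8\x01\x01")            # True
print(socket.inet_ntoa(b"\xc0\xa8\x01\x02"))                             # 192.168.1.2
```

With these, a line such as `eth.src = mac_to_bytes('00:11:22:33:44:55')` reads more clearly than the raw literal.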


Writing packets into a pcap file allows you to capture network traffic in one setting and then replay it in a different environment. This functionality proves valuable for tasks such as replicating network conditions, examining network behavior, and testing network applications or security systems within controlled setups. The ability to replay the captured packets provides assistance in troubleshooting issues, conducting performance analysis, and simulating network scenarios that resemble real-world conditions.
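
Replaying a capture faithfully usually means preserving the gaps between packets. The sketch below shows one way to derive those gaps from the timestamps that a pcap reader yields alongside each packet. `replay_delays` is a hypothetical helper written for this post, the timestamps are made up, and actually sending the packets is out of scope here.

```python
def replay_delays(timestamps):
    """Return how long to wait, in seconds, before sending each packet."""
    delays = []
    previous = None
    for ts in timestamps:
        # The first packet goes out immediately; negative gaps from
        # out-of-order captures are clamped to zero.
        delays.append(0.0 if previous is None else max(0.0, ts - previous))
        previous = ts
    return delays


# Made-up capture timestamps (seconds)
timestamps = [10.0, 10.5, 10.5, 12.0]
print(replay_delays(timestamps))  # [0.0, 0.5, 0.0, 1.5]
```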

Third step: dpkt can also analyze the HTTP traffic in a pcap file. Below is a demo:

In [38]:
import dpkt
from dpkt.http import Request

count = 0

with open('smallFlows.pcap', 'rb') as f:
    pcap = dpkt.pcap.Reader(f)

    for ts, buf in pcap:
        eth = dpkt.ethernet.Ethernet(buf)

        # Check if the packet contains an IP layer
        if isinstance(eth.data, dpkt.ip.IP):
            ip = eth.data

            # Check if the IP packet contains a TCP layer
            if isinstance(ip.data, dpkt.tcp.TCP):
                tcp = ip.data

                # Extract HTTP requests sent to port 80
                if tcp.dport == 80 and len(tcp.data) > 0:
                    try:
                        request = Request(tcp.data)
                    except dpkt.dpkt.UnpackError:
                        # Not the start of a complete HTTP request; skip this segment
                        continue
                    print('HTTP Method:', request.method)
                    print('URI:', request.uri)
                    print('Host:', request.headers.get('host', ''))
                    print('User-Agent:', request.headers.get('user-agent', ''))
                    print('')
                    count += 1
                    if count > 3:  # Show only the first 4 HTTP requests
                        break


HTTP Method: GET
URI: /complete/search?client=chrome&hl=en-US&q=cr
Host: clients1.google.ca
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.10 (KHTML, like Gecko) Chrome/8.0.552.237 Safari/534.10

HTTP Method: GET
URI: /complete/search?client=chrome&hl=en-US&q=msn
Host: clients1.google.ca
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.10 (KHTML, like Gecko) Chrome/8.0.552.237 Safari/534.10

HTTP Method: GET
URI: /
Host: msn.ca
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.10 (KHTML, like Gecko) Chrome/8.0.552.237 Safari/534.10

HTTP Method: GET
URI: /complete/search?client=chrome&hl=en-US&q=crai
Host: clients1.google.ca
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.10 (KHTML, like Gecko) Chrome/8.0.552.237 Safari/534.10
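
Not every TCP segment sent to port 80 starts a new HTTP request: large requests span several segments, and other protocols sometimes use that port too. As a rough sketch, a cheap prefix check can filter payloads before they are handed to the HTTP parser. `looks_like_http_request` is a hypothetical helper for this post, and the method list is a common subset, not an exhaustive one.

```python
# Common HTTP request methods, each followed by the space that precedes the URI.
HTTP_METHODS = (b"GET ", b"POST ", b"HEAD ", b"PUT ", b"DELETE ", b"OPTIONS ", b"PATCH ")


def looks_like_http_request(payload):
    """Return True if the payload begins with a known HTTP request method."""
    return payload.startswith(HTTP_METHODS)


print(looks_like_http_request(b"GET / HTTP/1.1\r\nHost: msn.ca\r\n\r\n"))  # True
print(looks_like_http_request(b"\x16\x03\x01\x02\x00"))                    # False
```

In the loop above, this check could be added to the `tcp.dport == 80` condition so that only plausible request starts reach the parser.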



Analyzing HTTP traffic plays a vital role in network security: it helps detect malware, identify vulnerabilities, inspect content, monitor network behavior, and support incident response.
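
As one small example of monitoring network behavior, the sketch below tallies how often each host is contacted. To stay self-contained, it uses the four (host, URI) pairs printed in the demo output above instead of re-reading the capture; in practice you would append to the list inside the parsing loop. The variable names are my own.

```python
from collections import Counter

# (host, uri) pairs copied from the four requests printed in the demo output
requests = [
    ("clients1.google.ca", "/complete/search?client=chrome&hl=en-US&q=cr"),
    ("clients1.google.ca", "/complete/search?client=chrome&hl=en-US&q=msn"),
    ("msn.ca", "/"),
    ("clients1.google.ca", "/complete/search?client=chrome&hl=en-US&q=crai"),
]

# Count requests per host and list the busiest hosts first
host_counts = Counter(host for host, _ in requests)
for host, n in host_counts.most_common():
    print(f"{host}: {n}")
# clients1.google.ca: 3
# msn.ca: 1
```

An unexpected host near the top of such a tally is often a useful first lead during incident response.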