For any technical and/or maintenance information, please kindly refer to the Official Documentation.
The pcapkit project is an open source Python program focused on PCAP parsing and analysis, which works as a stream PCAP file extractor. With the support of dictdumper, it can produce reports in multiple output formats.
Note that the whole project supports Python 3.4 or later.
pcapkit is an independent open source library, using only dictdumper as its formatted output dumper.
There is a project called jspcapy that works on pcapkit; it is a command line tool for PCAP extraction, but it is now DEPRECATED.
Unlike popular PCAP file extractors such as Scapy, dpkt, and pyshark, pcapkit uses a streaming strategy to read input files. That is, it reads a file frame by frame, which reduces memory usage and may also improve efficiency.
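To make the streaming idea concrete, here is a minimal sketch of frame-by-frame reading written with the standard library only. It is not pcapkit's implementation: the helper name `iter_frames` is hypothetical, and the code assumes a classic little-endian PCAP file with microsecond timestamps.

```python
import io
import struct

# Classic PCAP layout: a 24-byte global header, then 16-byte record headers.
GLOBAL_HEADER = struct.Struct('<IHHiIII')
RECORD_HEADER = struct.Struct('<IIII')


def iter_frames(stream):
    """Yield (timestamp, payload) pairs one frame at a time.

    Only one frame is held in memory at any moment, which is the
    point of a streaming reader.
    """
    header = stream.read(GLOBAL_HEADER.size)
    if len(header) < GLOBAL_HEADER.size:
        raise ValueError('truncated PCAP global header')
    if GLOBAL_HEADER.unpack(header)[0] != 0xA1B2C3D4:
        raise ValueError('unsupported PCAP magic number')
    while True:
        record = stream.read(RECORD_HEADER.size)
        if len(record) < RECORD_HEADER.size:
            return  # end of file (or a truncated trailing record)
        ts_sec, ts_usec, incl_len, _orig_len = RECORD_HEADER.unpack(record)
        yield ts_sec + ts_usec / 1e6, stream.read(incl_len)


# Build a tiny two-frame capture in memory and stream through it.
capture = (
    GLOBAL_HEADER.pack(0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
    + RECORD_HEADER.pack(1, 500000, 3, 3) + b'abc'
    + RECORD_HEADER.pack(2, 0, 2, 2) + b'xy'
)
for timestamp, payload in iter_frames(io.BytesIO(capture)):
    print(timestamp, payload)
```

Because `iter_frames` is a generator, a caller can stop early or process an arbitrarily large file without loading it whole.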
In pcapkit, all files can be described as the following eight parts:
- Interface (pcapkit.interface) -- user interface for the pcapkit library, which standardises and simplifies the usage of this library
- Foundation (pcapkit.foundation) -- synthesises file I/O and protocol analysis, and coordinates information exchange across all network layers
- Reassembly (pcapkit.reassembly) -- based on the algorithms described in RFC 815, implements datagram reassembly of IP and TCP packets
- Protocols (pcapkit.protocols) -- collection of all protocol families, with detailed implementations and methods as well as constructors
- Utilities (pcapkit.utilities) -- collection of four utility functions and classes
- CoreKit (pcapkit.corekit) -- core utilities for the pcapkit implementation
- ToolKit (pcapkit.toolkit) -- compatibility tools for the pcapkit implementation
- DumpKit (pcapkit.dumpkit) -- dump utilities for the pcapkit implementation
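The Reassembly part refers to RFC 815, whose core idea is a list of "hole descriptors" that shrinks as fragments arrive. The sketch below illustrates that idea in isolation; it is not the code in pcapkit.reassembly, and the function name `reassemble` and the `(offset, data, more_fragments)` tuple layout are assumptions made for this example.

```python
def reassemble(fragments):
    """Reassemble a datagram from (offset, data, more_fragments) tuples.

    Returns the full payload as bytes, or None while holes remain.
    """
    holes = [(0, float('inf'))]  # one hole covering the whole datagram
    buffer = bytearray()
    for offset, data, more_fragments in fragments:
        if not data:
            continue
        first, last = offset, offset + len(data) - 1
        remaining = []
        for hole_first, hole_last in holes:
            # The fragment does not touch this hole: keep it unchanged.
            if first > hole_last or last < hole_first:
                remaining.append((hole_first, hole_last))
                continue
            # Keep whatever part of the hole is left before the fragment.
            if first > hole_first:
                remaining.append((hole_first, first - 1))
            # Keep the part after it, unless this is the final fragment.
            if last < hole_last and more_fragments:
                remaining.append((last + 1, hole_last))
        holes = remaining
        if len(buffer) < last + 1:
            buffer.extend(bytes(last + 1 - len(buffer)))
        buffer[first:last + 1] = data
    return bytes(buffer) if not holes else None


# Fragments may arrive out of order.
print(reassemble([(4, b'EFGH', False), (0, b'ABCD', True)]))
```

The datagram is complete exactly when the hole list is empty, so no separate length bookkeeping is needed.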
Besides, due to the complexity of pcapkit, its extraction procedure takes around 0.0009 seconds per packet, which is not ideal. Thus, pcapkit introduced alternative extraction engines to accelerate this procedure. For now, pcapkit supports Scapy, DPKT, and PyShark. In addition, pcapkit supports two multiprocessing strategies (server and pipeline). For more information, please refer to the documentation.
PyPCAPKit finally boosts a bit up thanks to @59e5aaf4 with issue #29 🎉
| Key | Value |
|---|---|
| Operating System | macOS Mojave |
| Processor Name | Intel Core i7 |
| Processor Speed | 2.6 GHz |
| Total Number of Cores | 6 |
| Memory | 16 GB |
| Engine | Performance (seconds per packet) |
|---|---|
| dpkt | 0.00017389218012491862 |
| scapy | 0.00036091208457946774 |
| default | 0.0009537641207377116 |
| pipeline | 0.0009694552421569824 |
| server | 0.018088217973709107 |
| pyshark | 0.04200994372367859 |
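A per-packet figure like the ones in this table can be obtained by timing a full pass over a set of packets and dividing by the packet count. The helper below is a hypothetical sketch of such a measurement, not the benchmark script used for the numbers above; `seconds_per_packet` and the stand-in workload are assumptions for illustration.

```python
import time


def seconds_per_packet(extract, packets, repeat=3):
    """Return the best average wall-clock time per packet over `repeat` runs."""
    if not packets:
        raise ValueError('need at least one packet')
    best = float('inf')
    for _ in range(repeat):
        start = time.perf_counter()
        for packet in packets:
            extract(packet)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed / len(packets))
    return best


# Stand-in workload: "parsing" here is only slicing off a 14-byte header.
sample = [bytes(64)] * 1000
print(seconds_per_packet(lambda packet: packet[:14], sample))
```

Taking the best of several runs reduces noise from other processes; results still depend heavily on the machine and on the capture file.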
Note that pcapkit supports Python 3.4 and later.
Simply run the following to install the current version from PyPI:
pip install pypcapkit
Or install the latest version from the git repository:
git clone https://github.com/JarryShaw/PyPCAPKit.git
cd PyPCAPKit
pip install -e .
# and to update at any time
git pull
And since pcapkit supports various extraction engines and extensive plug-in functions, you may want to install the optional ones:
# for DPKT only
pip install pypcapkit[DPKT]
# for Scapy only
pip install pypcapkit[Scapy]
# for PyShark only
pip install pypcapkit[PyShark]
# and to install all the optional packages
pip install pypcapkit[all]
# or to do this explicitly
pip install pypcapkit dpkt scapy pyshark
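After installing, it can be handy to check which of the optional engine packages are importable in the current environment. The sketch below uses only the standard library; the helper name `available_engines` is hypothetical, and pcapkit itself may detect its engines differently.

```python
import importlib.util

# Import names of the optional packages listed above.
OPTIONAL_ENGINES = ('dpkt', 'scapy', 'pyshark')


def available_engines():
    """Map each optional engine package to whether it can be imported."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in OPTIONAL_ENGINES
    }


for name, installed in sorted(available_engines().items()):
    print('%-8s %s' % (name, 'installed' if installed else 'missing'))
```

`find_spec` only locates a package without importing it, so the check stays cheap even for heavy dependencies.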
| NAME | DESCRIPTION |
|---|---|
| extract | extract a PCAP file |
| analyse | analyse application layer packets |
| reassemble | reassemble fragmented datagrams |
| trace | trace TCP packet flows |
| NAME | DESCRIPTION |
|---|---|
| JSON | JavaScript Object Notation (JSON) format |
| PLIST | macOS Property List (PLIST) format |
| TREE | Tree-View text format |
| PCAP | PCAP format |
| NAME | DESCRIPTION |
|---|---|
| RAW | no specific layer |
| LINK | data-link layer |
| INET | internet layer |
| TRANS | transport layer |
| APP | application layer |
| NAME | DESCRIPTION |
|---|---|
| PCAPKit | the default engine |
| MPServer | the multiprocessing engine with server process strategy |
| MPPipeline | the multiprocessing engine with pipeline strategy |
| DPKT | the DPKT engine |
| Scapy | the Scapy engine |
| PyShark | the PyShark engine |
| NAME | DESCRIPTION |
|---|---|
| NoPayload | No-Payload |
| Raw | Raw Packet Data |
| ARP | Address Resolution Protocol |
| Ethernet | Ethernet Protocol |
| L2TP | Layer Two Tunnelling Protocol |
| OSPF | Open Shortest Path First |
| RARP | Reverse Address Resolution Protocol |
| VLAN | 802.1Q Customer VLAN Tag Type |
| AH | Authentication Header |
| HIP | Host Identity Protocol |
| HOPOPT | IPv6 Hop-by-Hop Options |
| IP | Internet Protocol |
| IPsec | Internet Protocol Security |
| IPv4 | Internet Protocol version 4 |
| IPv6 | Internet Protocol version 6 |
| IPv6_Frag | Fragment Header for IPv6 |
| IPv6_Opts | Destination Options for IPv6 |
| IPv6_Route | Routing Header for IPv6 |
| IPX | Internetwork Packet Exchange |
| MH | Mobility Header |
| TCP | Transmission Control Protocol |
| UDP | User Datagram Protocol |
| HTTP | Hypertext Transfer Protocol |
Documentation can be found in the submodules of pcapkit. You may also find usage samples in the test folder. For further information, please refer to the source code; the docstrings should help you :)
PS: the help function in Python should always help you out.
The following part was originally described in jspcapy, which is now deprecated and merged into this repository.
As it shows in the help manual, it is quite easy to use:
$ pcapkit-cli --help
usage: pcapkit-cli [-h] [-V] [-o file-name] [-f format] [-j] [-p] [-t] [-a]
                   [-v] [-F] [-E PKG] [-P PROTOCOL] [-L LAYER]
                   input-file-name

PCAP file extractor and formatted dumper

positional arguments:
  input-file-name       The name of input pcap file. If ".pcap" omits, it will
                        be automatically appended.

optional arguments:
  -h, --help            show this help message and exit
  -V, --version         show program's version number and exit
  -o file-name, --output file-name
                        The name of input pcap file. If format extension
                        omits, it will be automatically appended.
  -f format, --format format
                        Print a extraction report in the specified output
                        format. Available are all formats supported by
                        dictdumper, e.g.: json, plist, and tree.
  -j, --json            Display extraction report as json. This will yield
                        "raw" output that may be used by external tools. This
                        option overrides all other options.
  -p, --plist           Display extraction report as macOS Property List
                        (plist). This will yield "raw" output that may be used
                        by external tools. This option overrides all other
                        options.
  -t, --tree            Display extraction report as tree view text. This will
                        yield "raw" output that may be used by external tools.
                        This option overrides all other options.
  -a, --auto-extension  If output file extension omits, append automatically.
  -v, --verbose         Show more information.
  -F, --files           Split each frame into different files.
  -E PKG, --engine PKG  Indicate extraction engine. Note that except default
                        or pcapkit engine, all other engines need support of
                        corresponding packages.
  -P PROTOCOL, --protocol PROTOCOL
                        Indicate extraction stops after which protocol.
  -L LAYER, --layer LAYER
                        Indicate extract frames until which layer.
Under most circumstances, you should indicate the name of the input PCAP file (the extension may be omitted) and, at least, the output format (json, plist, or tree). If the format is unspecified, the name of the output file must have a proper extension (*.json, *.plist, or *.txt); otherwise, FormatError will be raised.
In verbose mode, detailed information is printed during extraction (as in the following examples). The auto-extension flag applies to the output file and indicates whether an extension should be appended.
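The rule for inferring the output format from the file extension can be sketched as follows. This is only an illustration of the behaviour described above: `resolve_format` is a hypothetical helper, and the `FormatError` class defined here is a local stand-in, not the exception class shipped with pcapkit.

```python
import os


class FormatError(ValueError):
    """Local stand-in for the error named in the text (not pcapkit's class)."""


# Extensions named in the text, mapped to the matching output format.
EXTENSIONS = {'.json': 'json', '.plist': 'plist', '.txt': 'tree'}


def resolve_format(output_name, fmt=None):
    """Return the output format, inferring it from the extension if needed."""
    if fmt is not None:
        return fmt.lower()
    extension = os.path.splitext(output_name)[1].lower()
    if extension not in EXTENSIONS:
        raise FormatError('cannot infer output format from %r' % output_name)
    return EXTENSIONS[extension]


print(resolve_format('out.json'))      # format inferred from the extension
print(resolve_format('out', 'tree'))   # explicit format, no extension needed
```

An explicit format always wins, so a bare output name such as `out` only fails when no format is given.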
As described in the test folder, pcapkit is quite easy to use, with simply three verbs as its main interface. Several scenarios are shown below.
- extract a PCAP file and dump the result to a specific file (with no reassembly)

    import pcapkit
    # dump to a PLIST file with no frame storage (property frame disabled)
    plist = pcapkit.extract(fin='in.pcap', fout='out.plist', format='plist', store=False)
    # dump to a JSON file with no extension auto-complete
    json = pcapkit.extract(fin='in.cap', fout='out.json', format='json', extension=False)
    # dump to a folder with each tree-view text file per frame
    tree = pcapkit.extract(fin='in.pcap', fout='out', format='tree', files=True)

- extract a PCAP file and fetch IP packet (both IPv4 and IPv6) from a frame (with no output file)

    >>> import pcapkit
    >>> extraction = pcapkit.extract(fin='in.pcap', nofile=True)
    >>> frame0 = extraction.frame[0]
    >>> # check if IP in this frame, otherwise ProtocolNotFound will be raised
    >>> flag = pcapkit.IP in frame0
    >>> ip = frame0[pcapkit.IP] if flag else None

- extract a PCAP file and reassemble TCP payload (with no output file nor frame storage)

    import pcapkit
    # set strict to make sure full reassembly
    extraction = pcapkit.extract(fin='in.pcap', store=False, nofile=True, tcp=True, strict=True)
    # print extracted packet if HTTP in reassembled payloads
    for packet in extraction.reassembly.tcp:
        for reassembly in packet.packets:
            if pcapkit.HTTP in reassembly.protochain:
                print(reassembly.info)
The CLI (command line interface) of pcapkit can be accessed in two different ways:
- through console scripts -- use the command name pcapkit [...] directly (as shown in the samples)
- through Python module -- python -m pypcapkit [...] works exactly the same as above
Here are some usage samples:
- export to a macOS Property List (Xcode has special support for this format)
$ pcapkit in --format plist --verbose
🚨Loading file 'in.pcap'
- Frame 1: Ethernet:IPv6:ICMPv6
- Frame 2: Ethernet:IPv6:ICMPv6
- Frame 3: Ethernet:IPv4:TCP
- Frame 4: Ethernet:IPv4:TCP
- Frame 5: Ethernet:IPv4:TCP
- Frame 6: Ethernet:IPv4:UDP
🍺Report file stored in 'out.plist'
- export to a JSON file (with no format specified)
$ pcapkit in --output out.json --verbose
🚨Loading file 'in.pcap'
- Frame 1: Ethernet:IPv6:ICMPv6
- Frame 2: Ethernet:IPv6:ICMPv6
- Frame 3: Ethernet:IPv4:TCP
- Frame 4: Ethernet:IPv4:TCP
- Frame 5: Ethernet:IPv4:TCP
- Frame 6: Ethernet:IPv4:UDP
🍺Report file stored in 'out.json'
- export to a text tree view file (without extension autocorrect)
$ pcapkit in --output out --format tree --verbose
🚨Loading file 'in.pcap'
- Frame 1: Ethernet:IPv6:ICMPv6
- Frame 2: Ethernet:IPv6:ICMPv6
- Frame 3: Ethernet:IPv4:TCP
- Frame 4: Ethernet:IPv4:TCP
- Frame 5: Ethernet:IPv4:TCP
- Frame 6: Ethernet:IPv4:UDP
🍺Report file stored in 'out'
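The verbose output shown in these samples lists one protocol chain per frame, which makes it easy to post-process. As a small illustration, the sketch below tallies the chains found in such a log; the helper `tally_protocol_chains` is hypothetical and relies only on the `- Frame N: A:B:C` line shape visible above.

```python
import re
from collections import Counter

# Matches lines such as " - Frame 3: Ethernet:IPv4:TCP".
FRAME_LINE = re.compile(r'^\s*-\s*Frame\s+(\d+):\s*(\S+)\s*$')


def tally_protocol_chains(log_text):
    """Count how often each protocol chain appears in verbose output."""
    chains = Counter()
    for line in log_text.splitlines():
        match = FRAME_LINE.match(line)
        if match:
            chains[match.group(2)] += 1
    return chains


log = """\
- Frame 1: Ethernet:IPv6:ICMPv6
- Frame 2: Ethernet:IPv6:ICMPv6
- Frame 3: Ethernet:IPv4:TCP
- Frame 4: Ethernet:IPv4:TCP
- Frame 5: Ethernet:IPv4:TCP
- Frame 6: Ethernet:IPv4:UDP
"""
for chain, count in tally_protocol_chains(log).most_common():
    print(count, chain)
```

Lines that do not match the frame pattern, such as the loading and report messages, are simply ignored.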
- specify Raw packet
- interface verbs
- review docstrings
- merge jspcapy
- write documentation
- implement IP and MAC address containers
- implement option list extractors
- implement more protocols
