This document explains everything about this project - from basic networking concepts to the complete code architecture. After reading this, you should understand exactly how packets flow through the system without needing to read the code.
- What is DPI?
- Networking Background
- Project Overview
- File Structure
- The Journey of a Packet
- Deep Dive: Each Component
- How SNI Extraction Works
- How Blocking Works
- Running the Application
- Understanding the Output
Deep Packet Inspection (DPI) is a technology used to examine the contents of network packets as they pass through a checkpoint. Unlike simple firewalls that only look at packet headers (source/destination IP addresses and ports), DPI looks inside the packet payload.
- ISPs: Throttle or block certain applications (e.g., BitTorrent)
- Enterprises: Block social media on office networks
- Parental Controls: Block inappropriate websites
- Security: Detect malware or intrusion attempts
User Traffic (PCAP) → [DPI Engine] → Filtered Traffic (PCAP)
↓
- Identifies apps (YouTube, Facebook, etc.)
- Blocks based on rules
- Generates reports
When you visit a website, data travels through multiple "layers":
┌─────────────────────────────────────────────────────────┐
│ Layer 7: Application │ HTTP, TLS, DNS │
├─────────────────────────────────────────────────────────┤
│ Layer 4: Transport │ TCP (reliable), UDP (fast) │
├─────────────────────────────────────────────────────────┤
│ Layer 3: Network │ IP addresses (routing) │
├─────────────────────────────────────────────────────────┤
│ Layer 2: Data Link │ MAC addresses (local network)│
└─────────────────────────────────────────────────────────┘
Every network packet is like a Russian nesting doll - headers wrapped inside headers:
┌──────────────────────────────────────────────────────────────────┐
│ Ethernet Header (14 bytes) │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ IP Header (20 bytes) │ │
│ │ ┌──────────────────────────────────────────────────────────┐ │ │
│ │ │ TCP Header (20 bytes) │ │ │
│ │ │ ┌──────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ Payload (Application Data) │ │ │ │
│ │ │ │ e.g., TLS Client Hello with SNI │ │ │ │
│ │ │ └──────────────────────────────────────────────────────┘ │ │ │
│ │ └──────────────────────────────────────────────────────────┘ │ │
│ └──────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────┘
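The 20-byte IP and TCP header sizes in the diagram are minimums: both headers can carry options, so the real lengths come from the IPv4 IHL field and the TCP data offset field. The sketch below shows one way to peel the layers and reach the payload. The function name `findTcpPayload` is hypothetical and not taken from the project; it assumes an untagged Ethernet II frame carrying IPv4 and TCP.

```javascript
// Hypothetical helper: locate the TCP payload inside a raw Ethernet frame.
// Assumes an untagged Ethernet II frame (no VLAN tag) carrying IPv4 + TCP.
function findTcpPayload(frame) {
  const ETH_LEN = 14;
  if (frame.length < ETH_LEN + 20) return null;
  if (frame.readUInt16BE(12) !== 0x0800) return null; // EtherType: IPv4 only

  const ipStart = ETH_LEN;
  const version = frame[ipStart] >> 4;
  const ipHeaderLen = (frame[ipStart] & 0x0f) * 4; // IHL counts 32-bit words
  if (version !== 4 || ipHeaderLen < 20) return null;
  if (frame[ipStart + 9] !== 6) return null; // protocol 6 = TCP

  // The IP total length excludes any Ethernet padding after the datagram.
  const end = Math.min(frame.length, ipStart + frame.readUInt16BE(ipStart + 2));

  const tcpStart = ipStart + ipHeaderLen;
  if (end < tcpStart + 20) return null;
  const tcpHeaderLen = (frame[tcpStart + 12] >> 4) * 4; // data offset, in words
  if (tcpHeaderLen < 20) return null;

  const payloadStart = tcpStart + tcpHeaderLen;
  if (payloadStart > end) return null;
  return frame.subarray(payloadStart, end); // may be empty (e.g., a bare ACK)
}
```

The returned slice shares memory with the original frame, so no bytes are copied.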
A connection (or "flow") is uniquely identified by 5 values:
| Field | Example | Purpose |
|---|---|---|
| Source IP | 192.168.1.100 | Who is sending |
| Destination IP | 172.217.14.206 | Where it's going |
| Source Port | 54321 | Sender's application identifier |
| Destination Port | 443 | Service being accessed (443 = HTTPS) |
| Protocol | TCP (6) | TCP or UDP |
Why is this important?
- All packets with the same 5-tuple belong to the same connection
- If we block one packet of a connection, we should block all of them
- This is how we "track" conversations between computers
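As an illustration of tracking by five-tuple, the sketch below keeps per-flow state in a `Map` keyed by a string built from the five values. The names `flowKey`, `flows`, and `recordPacket` are hypothetical and are not the project's `connection_tracker.js` API. Note that this key is directional: reply packets have source and destination swapped, so a tracker that wants both directions under one entry would need to normalize the key first.

```javascript
// Hypothetical helper: turn a five-tuple into a string usable as a Map key.
function flowKey({ srcIp, dstIp, srcPort, dstPort, protocol }) {
  return `${srcIp}:${srcPort}-${dstIp}:${dstPort}-${protocol}`;
}

// Per-flow state, keyed by the five-tuple string.
const flows = new Map();

// Look up (or create) the flow for a packet and count the packet.
function recordPacket(tuple) {
  const key = flowKey(tuple);
  let flow = flows.get(key);
  if (!flow) {
    flow = { packets: 0, blocked: false };
    flows.set(key, flow);
  }
  flow.packets += 1;
  return flow;
}
```

Because every packet of a connection maps to the same entry, setting `flow.blocked = true` once is enough for later packets of that flow to see the decision.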
Server Name Indication (SNI) is part of the TLS/HTTPS handshake. When you visit https://www.youtube.com:
- Your browser sends a "Client Hello" message
- This message includes the domain name in plaintext (not encrypted yet!)
- The server uses this to know which certificate to send
TLS Client Hello:
├── Version: TLS 1.2
├── Random: [32 bytes]
├── Cipher Suites: [list]
└── Extensions:
└── SNI Extension:
└── Server Name: "www.youtube.com" ← We extract THIS!
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Wireshark │ │ DPI Engine │ │ Output │
│ Capture │ ──► │ (Node.js) │ ──► │ PCAP │
│ (input.pcap)│ │ - Parse │ │ (filtered) │
└─────────────┘ │ - Classify │ └─────────────┘
│ - Block │
│ - Report │
└─────────────┘
- Asynchronous I/O: Uses Node.js `fs/promises` and Streams for efficient PCAP handling.
- Protocol Support: Ethernet, IPv4, TCP, UDP.
- DPI Support: TLS (SNI), HTTP (Host), DNS (Query).
- Rule Engine: Block by IP, Application Name, or Domain.
packet_analyzer/
├── src/ # Source files (JavaScript)
│ ├── pcap_io.js # PCAP file reading/writing (Async)
│ ├── packet_parser.js # Network protocol parsing
│ ├── dpi_utils.js # SNI/Host/DNS extraction
│ ├── types.js # Common structures and constants
│ ├── rule_manager.js # Blocking rules management
│ ├── connection_tracker.js # Flow tracking and reporting
│ ├── dpi_engine.js # Main orchestrator
│ ├── main.js # ★ Packet Summary Tool ★
│ └── main_dpi.js # ★ DPI Engine CLI ★
│
├── package.json # Project dependencies and scripts
├── generate_test_pcap.py # Python script to create test data
├── test_dpi.pcap # Sample capture for testing
└── README.md # This file!
Let's trace a single packet through the system:
const reader = new PcapReader();
await reader.open("capture.pcap");

What happens:
- Open the file using `fs.promises.open`.
- Read the 24-byte global header and verify magic numbers.
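For reference, the 24-byte global header can be decoded roughly as follows. This is a sketch of the classic PCAP layout, not the project's `PcapReader`; the function name `parseGlobalHeader` and the returned field names are made up for illustration, and support for both byte orders and for the nanosecond magic number is an assumption about what a reader might want.

```javascript
// Known magic numbers for the classic PCAP format.
const PCAP_MAGICS = new Map([
  [0xa1b2c3d4, { nanosecond: false }], // microsecond timestamps
  [0xa1b23c4d, { nanosecond: true }],  // nanosecond timestamps
]);

// Hypothetical helper: decode the 24-byte PCAP global header.
function parseGlobalHeader(buf) {
  if (buf.length < 24) throw new Error("PCAP global header is 24 bytes");

  // The magic number also tells us the byte order the file was written in.
  let littleEndian = true;
  let info = PCAP_MAGICS.get(buf.readUInt32LE(0));
  if (!info) {
    littleEndian = false;
    info = PCAP_MAGICS.get(buf.readUInt32BE(0));
  }
  if (!info) throw new Error("Not a PCAP file (bad magic number)");

  const u16 = (off) => (littleEndian ? buf.readUInt16LE(off) : buf.readUInt16BE(off));
  const u32 = (off) => (littleEndian ? buf.readUInt32LE(off) : buf.readUInt32BE(off));

  return {
    littleEndian,
    nanosecond: info.nanosecond,
    versionMajor: u16(4),
    versionMinor: u16(6),
    snapLen: u32(16),  // maximum bytes captured per packet
    linkType: u32(20), // 1 = Ethernet
  };
}
```

A reader would typically reject files whose `linkType` is not 1, since the rest of the pipeline assumes Ethernet framing.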
let raw;
while ((raw = await reader.readNextPacket())) {
// raw.data contains the packet bytes
// raw.header contains timestamp and length
}

const parsed = PacketParser.parse(raw);

What happens (in packet_parser.js): Extracts Ethernet, IP, and TCP/UDP fields into a structured object.
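A minimal sketch of what such a structured object could look like is shown below. It is not the actual `PacketParser.parse` implementation: the function `parseFrame` and its field names are hypothetical, and it only handles untagged Ethernet II frames carrying IPv4.

```javascript
const formatMac = (bytes) =>
  Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join(":");
const formatIp = (bytes) => Array.from(bytes).join(".");

// Hypothetical parser: decode Ethernet, IPv4, and TCP/UDP fields from raw bytes.
function parseFrame(data) {
  // 14-byte Ethernet header + at least a 20-byte IPv4 header.
  if (data.length < 34 || data.readUInt16BE(12) !== 0x0800) return null;

  const ipHeaderLen = (data[14] & 0x0f) * 4;
  if (ipHeaderLen < 20) return null;
  const protocol = data[23];

  const parsed = {
    dstMac: formatMac(data.subarray(0, 6)),
    srcMac: formatMac(data.subarray(6, 12)),
    srcIp: formatIp(data.subarray(26, 30)),
    dstIp: formatIp(data.subarray(30, 34)),
    protocol, // 6 = TCP, 17 = UDP
  };

  // TCP and UDP both start with source port, then destination port.
  const l4 = 14 + ipHeaderLen;
  if ((protocol === 6 || protocol === 17) && data.length >= l4 + 4) {
    parsed.srcPort = data.readUInt16BE(l4);
    parsed.dstPort = data.readUInt16BE(l4 + 2);
  }
  return parsed;
}
```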
// For HTTPS traffic (port 443)
const sni = SNIExtractor.extract(payload);
if (sni) {
const appType = sniToAppType(sni); // e.g., AppType.YOUTUBE
}

if (ruleManager.shouldBlock(ip, port, appType, sni)) {
// DROP: Don't write to output
} else {
// FORWARD: Write to output PCAP
await writer.writePacket(raw.header, raw.data);
}

Handles reading and writing PCAP files.
- PcapReader: Uses `fs.promises` for non-blocking reads.
- PcapWriter: Uses `fs.createWriteStream` for high-performance sequential writes.
Decodes raw bytes into protocol fields.
- Uses `Buffer` methods like `readUInt16BE` to handle network byte order.
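To make the byte-order point concrete, here is a small illustration using the example ports from the five-tuple table. Network protocols store multi-byte integers big-endian, so the `BE` variants give the right answer and the `LE` variants do not. The variable names are illustrative only.

```javascript
// First four bytes of a TCP header: source port, then destination port.
const tcpHeaderStart = Buffer.from([0xd4, 0x31, 0x01, 0xbb]);

const srcPort = tcpHeaderStart.readUInt16BE(0);    // 0xd431 = 54321
const dstPort = tcpHeaderStart.readUInt16BE(2);    // 0x01bb = 443
const wrongOrder = tcpHeaderStart.readUInt16LE(2); // 0xbb01 = 47873, not a valid reading

console.log({ srcPort, dstPort, wrongOrder });
```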
The "brain" of the DPI engine.
- SNIExtractor: Parses TLS handshake records.
- HTTPHostExtractor: Uses regex to find the `Host:` header in HTTP requests.
- DNSExtractor: Decodes DNS query labels.
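The two simpler extractors can be sketched as follows. These are illustrative stand-ins, not the code in `dpi_utils.js`: the function names `extractHttpHost` and `extractDnsQueryName` are hypothetical, the exact regex is an assumption, and the DNS decoder only reads the first question and gives up on compression pointers.

```javascript
// Hypothetical helper: find the Host header in a plaintext HTTP request.
function extractHttpHost(payload) {
  const text = payload.toString("latin1");
  // "m" lets ^ match at the start of any header line; "i" ignores header case.
  const match = /^Host:[ \t]*(\S+)/im.exec(text);
  return match ? match[1] : null;
}

// Hypothetical helper: decode the first query name from a DNS message.
// The name is a run of length-prefixed labels ending in a zero byte.
function extractDnsQueryName(payload) {
  const DNS_HEADER_LEN = 12;
  if (payload.length <= DNS_HEADER_LEN) return null;
  if (payload.readUInt16BE(4) === 0) return null; // QDCOUNT: no questions

  const labels = [];
  let pos = DNS_HEADER_LEN;
  while (pos < payload.length) {
    const len = payload[pos];
    if (len === 0) return labels.length > 0 ? labels.join(".") : null;
    // Labels are at most 63 bytes; larger values are compression pointers.
    if (len > 63 || pos + 1 + len > payload.length) return null;
    labels.push(payload.toString("ascii", pos + 1, pos + 1 + len));
    pos += 1 + len;
  }
  return null; // ran off the end without a terminating zero byte
}
```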
We extract the Server Name Indication from the TLS Client Hello packet.
- Verify TLS Record: Check for `0x16` (Handshake).
- Verify Handshake Type: Check for `0x01` (Client Hello).
- Skip Body: Skip version, random, session ID, cipher suites, and compression.
- Parse Extensions: Find extension type `0x0000` (SNI).
- Extract Hostname: Read the hostname string from the extension data.
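The steps above can be sketched in code as follows. This is a minimal illustration rather than the project's `SNIExtractor`: the function name `extractSni` is hypothetical, and it assumes the whole Client Hello arrives in a single TCP segment (no reassembly). It relies on the fact that `Buffer` read methods throw on out-of-range offsets, so a truncated record ends up in the `catch` branch.

```javascript
// Hypothetical helper: pull the SNI hostname out of a TLS Client Hello.
function extractSni(payload) {
  try {
    if (payload.readUInt8(0) !== 0x16) return null; // record type: Handshake
    if (payload.readUInt8(5) !== 0x01) return null; // handshake type: Client Hello

    let pos = 5 + 4;                        // record header + handshake type/length
    pos += 2 + 32;                          // client version + random
    pos += 1 + payload.readUInt8(pos);      // session ID (1-byte length prefix)
    pos += 2 + payload.readUInt16BE(pos);   // cipher suites (2-byte length prefix)
    pos += 1 + payload.readUInt8(pos);      // compression methods

    const extEnd = pos + 2 + payload.readUInt16BE(pos);
    pos += 2;
    while (pos + 4 <= extEnd) {
      const type = payload.readUInt16BE(pos);
      const len = payload.readUInt16BE(pos + 2);
      pos += 4;
      if (type === 0x0000) {
        // Layout: list length (2), name type (1), name length (2), name.
        if (payload.readUInt8(pos + 2) !== 0) return null; // 0 = host_name
        const nameLen = payload.readUInt16BE(pos + 3);
        const start = pos + 5;
        if (start + nameLen > payload.length) return null;
        return payload.toString("ascii", start, start + nameLen);
      }
      pos += len; // not SNI: skip this extension's data
    }
    return null; // Client Hello without an SNI extension
  } catch {
    return null; // truncated or malformed record
  }
}
```

Returning `null` for anything unexpected keeps the caller simple: a packet either yields a hostname or is treated as unclassified.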
The RuleManager checks packets against several criteria:
- IP Blacklist: Blocks all traffic from a specific source IP.
- App Blacklist: Blocks identified applications (e.g., `YOUTUBE`).
- Domain Blacklist: Blocks based on the extracted SNI or Host header (supports substring matching).
Once a connection is blocked, all subsequent packets in that flow (same 5-tuple) are automatically dropped.
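The interaction between the three rule types and per-flow blocking can be illustrated with a small sketch. The class `SimpleRuleManager` and its method signatures are hypothetical and deliberately differ from the project's `RuleManager`; the point is only to show how remembering a blocked flow lets later packets be dropped even when they carry no SNI or Host header.

```javascript
// Hypothetical rule manager: three blacklists plus a set of blocked flows.
class SimpleRuleManager {
  constructor() {
    this.blockedIps = new Set();
    this.blockedApps = new Set();
    this.blockedDomains = [];
    this.blockedFlows = new Set();
  }

  blockIp(ip) { this.blockedIps.add(ip); }
  blockApp(app) { this.blockedApps.add(app); }
  blockDomain(domain) { this.blockedDomains.push(domain.toLowerCase()); }

  // flowId is any stable per-connection key, e.g. a five-tuple string.
  shouldBlock(flowId, { srcIp, appType = null, domain = null }) {
    // A flow that was blocked once stays blocked.
    if (this.blockedFlows.has(flowId)) return true;

    const name = domain ? domain.toLowerCase() : null;
    const hit =
      this.blockedIps.has(srcIp) ||
      (appType !== null && this.blockedApps.has(appType)) ||
      // Substring match: "youtube.com" also matches "www.youtube.com".
      (name !== null && this.blockedDomains.some((d) => name.includes(d)));

    if (hit) this.blockedFlows.add(flowId);
    return hit;
  }
}
```

Substring matching is convenient but broad: a rule for `example.com` would also match `notexample.com`, so a stricter suffix check may be preferable in practice.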
Ensure you have Node.js installed.
# Clone the repository
git clone <repo-url>
cd packet-analyzer
# No external dependencies required for the JS port!

To view a summary of packets in a PCAP file:
node src/main.js test_dpi.pcap [max_packets]

To run the DPI engine and block specific traffic:
# Block YouTube and Facebook
node src/main_dpi.js input.pcap output.pcap --block-app YouTube --block-app Facebook
# Block a specific IP
node src/main_dpi.js input.pcap output.pcap --block-ip 192.168.1.100
# Block a domain
node src/main_dpi.js input.pcap output.pcap --block-domain example.com

The DPI engine prints a summary after processing:
- Total Packets: Count of all packets processed.
- App Classification: Breakdown of detected applications (YouTube, Google, etc.).
- Forwarded/Dropped: Number of packets written to the output file vs. blocked.
- Connection Report: List of all unique connections and their identified apps.
Created as a high-performance JavaScript port of the original C++ DPI Engine.