# Milestone 2 - Tool Comparison
## Name: Alexander James, Joshua Ludolf, & Matthew Trevino
``Date: 02-25-2025``
### Description of this file:
This Jupyter Notebook walks through using the `NFStream`, `PyShark`, and `Scapy` libraries for network traffic analysis and notes the strengths and weaknesses of each. The notebook includes the following sections:

1. **Installation of Requirements**: Installing the necessary `NFStream`, `PyShark`, and `Scapy` libraries.
2. **Importing Libraries**: Importing the required libraries for network traffic analysis.
3. **Strengths & Weaknesses**: Detailed information about the strengths and weaknesses of each library.

In [1]:
%pip install nfstream
%pip install scapy
%pip install pyshark

Note: you may need to restart the kernel to use updated packages.
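
Before moving on, it can help to confirm that each package is actually importable from the running kernel. The following is a minimal sketch using only the standard library; the helper name `check_packages` is our own and is not part of any of the three libraries.

```python
import importlib.util


def check_packages(names):
    """Return {module_name: bool} telling whether each top-level module can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Report the status of the three libraries used in this notebook.
for name, found in check_packages(["nfstream", "scapy", "pyshark"]).items():
    status = "available" if found else "missing (reinstall or restart the kernel)"
    print(f"{name}: {status}")
```

A `missing` result after a successful `%pip install` usually points to the kernel restart mentioned in the note above.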


In [2]:
import subprocess

# List the network adapters to find the capture interface name (ipconfig is Windows-only).
result = subprocess.run(['ipconfig', '/all'], capture_output=True, text=True)
print(result.stdout)


Windows IP Configuration

   Host Name . . . . . . . . . . . . : DESKTOP-92U6QUF
   Primary Dns Suffix  . . . . . . . : 
   Node Type . . . . . . . . . . . . : Hybrid
   IP Routing Enabled. . . . . . . . : No
   WINS Proxy Enabled. . . . . . . . : No
   DNS Suffix Search List. . . . . . : wirelessinternet

Ethernet adapter Ethernet 2:

   Media State . . . . . . . . . . . : Media disconnected
   Connection-specific DNS Suffix  . : 
   Description . . . . . . . . . . . : ExpressVPN TAP Adapter
   Physical Address. . . . . . . . . : 00-FF-68-7E-5D-82
   DHCP Enabled. . . . . . . . . . . : Yes
   Autoconfiguration Enabled . . . . : Yes

Ethernet adapter Ethernet 3:

   Connection-specific DNS Suffix  . : 
   Description . . . . . . . . . . . : VirtualBox Host-Only Ethernet Adapter
   Physical Address. . . . . . . . . : 0A-00-27-00-00-14
   DHCP Enabled. . . . . . . . . . . : No
   Autoconfiguration Enabled . . . . : Yes
   Link-local IPv6 Address . . . . . : fe80::3641:f60f:7e24:ac5c%20(
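
The interface identifier used for capture has to be picked out of this output by hand. Note that the NFStream and Scapy cells in this notebook use a description-style string (`"Intel(R) Wi-Fi 6 AX201 160MHz"`), while the PyShark cell uses the adapter name (`'Wi-Fi'`). As a rough sketch, assuming the English-language `ipconfig /all` layout shown above, both values can be extracted with a small parser. The function name `parse_adapters` is hypothetical.

```python
import re


def parse_adapters(ipconfig_text):
    """Map each adapter name (e.g. 'Ethernet 2') to its Description field."""
    adapters = {}
    current = None
    for line in ipconfig_text.splitlines():
        # Adapter headings are unindented and end with a colon,
        # e.g. "Ethernet adapter Ethernet 2:".
        heading = re.match(r"^(\S.*) adapter (.+):\s*$", line)
        if heading:
            current = heading.group(2)
            adapters[current] = None
            continue
        # Indented "Description . . . : value" lines belong to the current adapter.
        field = re.match(r"^\s+Description[ .]*:\s*(.*)$", line)
        if field and current is not None:
            adapters[current] = field.group(1).strip()
    return adapters


# Sample text copied from the output above.
sample = """\
Ethernet adapter Ethernet 2:

   Media State . . . . . . . . . . . : Media disconnected
   Description . . . . . . . . . . . : ExpressVPN TAP Adapter

Ethernet adapter Ethernet 3:

   Description . . . . . . . . . . . : VirtualBox Host-Only Ethernet Adapter
"""
print(parse_adapters(sample))
```

In the notebook, `parse_adapters(result.stdout)` would apply the same parsing to the live command output.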



## NFStream

### Strengths
- **Performance**: A native C engine (exposed to Python through CFFI) makes NFStream significantly faster for processing large pcap files or real-time traffic
- **Memory Efficiency**: Uses a streaming approach that minimizes memory footprint compared to Scapy and PyShark
- **Flow-based Analysis**: Automatically handles flow creation and tracking, providing higher-level abstractions
- **Built-in Feature Extraction**: Provides 100+ pre-built traffic features out of the box
- **ML Integration**: Designed to work well with machine learning pipelines

### Weaknesses
- **Less Flexibility**: Offers fewer packet manipulation capabilities compared to Scapy
- **Learning Curve**: Flow-based paradigm may require adjustment for those used to packet-level analysis
- **Limited Protocol Support**: Supports fewer application protocols than PyShark (which leverages Wireshark dissectors)
- **Python Version**: In our testing, it only ran on Python 3.11 or on 3.8 and below (builds for other versions failed on GitHub)
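
To make the flow-based paradigm concrete, here is a minimal sketch of what "flow creation and tracking" means: packets sharing the same endpoints and protocol are folded into one bidirectional record. This is an illustration in plain Python, not NFStream's implementation, and the helper names and sample packets are made up.

```python
from collections import defaultdict


def flow_key(src_ip, src_port, dst_ip, dst_port, protocol):
    """Build a direction-independent key so both directions map to one flow."""
    a, b = (src_ip, src_port), (dst_ip, dst_port)
    return (min(a, b), max(a, b), protocol)


def aggregate_flows(packets):
    """Group (src_ip, src_port, dst_ip, dst_port, protocol, size) tuples into flows."""
    flows = defaultdict(lambda: {"bidirectional_packets": 0, "bidirectional_bytes": 0})
    for src_ip, src_port, dst_ip, dst_port, protocol, size in packets:
        flow = flows[flow_key(src_ip, src_port, dst_ip, dst_port, protocol)]
        flow["bidirectional_packets"] += 1
        flow["bidirectional_bytes"] += size
    return dict(flows)


# Made-up packets: two directions of one TCP conversation plus one mDNS datagram.
packets = [
    ("192.168.1.199", 63069, "20.50.201.205", 443, 6, 66),
    ("20.50.201.205", 443, "192.168.1.199", 63069, 6, 66),
    ("192.168.1.158", 5353, "224.0.0.251", 5353, 17, 228),
]
for key, stats in aggregate_flows(packets).items():
    print(key, stats)
```

The three packets collapse into two flows, which is the adjustment in thinking that the "Learning Curve" item refers to: the unit of analysis is the conversation, not the individual packet.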

In [3]:
from nfstream import NFStreamer
import time
import pandas as pd

# Set Wi-Fi interface name.
wifi_interface = "Intel(R) Wi-Fi 6 AX201 160MHz"

# Create an NFStreamer object to capture live traffic.
# For live capture, set 'source' to the interface name.
# (Pass statistical_analysis=True to also compute per-flow statistical features.)
streamer = NFStreamer(source=wifi_interface)

# Collect flows for a fixed duration, then stop.
# Note: the elapsed-time check only runs when the streamer yields a flow,
# so the loop can run longer than 0.5 s if no flow is emitted in that window.
start_time = time.time()

flows = []

print(f"Capturing live traffic on interface {wifi_interface} for 0.5 second(s)...")
for flow in streamer:
    flows.append(flow)
    if time.time() - start_time > 0.5:
        break

# Build one dictionary per flow, keeping only the fields of interest
flow_dicts = [
    {
        'id': flow.id,
        'src_ip': flow.src_ip,
        'dst_ip': flow.dst_ip,
        'src_port': flow.src_port,
        'dst_port': flow.dst_port,
        'protocol': flow.protocol,
        'bidirectional_packets': flow.bidirectional_packets,
        'bidirectional_bytes': flow.bidirectional_bytes,
        'application_name': flow.application_name,
    }
    for flow in flows
]

# Convert the list of dictionaries to a pandas DataFrame indexed by flow 'id'
my_dataframe = pd.DataFrame(flow_dicts).set_index('id')
print(f"\n{my_dataframe}")  # Print the DataFrame

Capturing live traffic on interface Intel(R) Wi-Fi 6 AX201 160MHz for 0.5 second(s)...

           src_ip       dst_ip  src_port  dst_port  protocol  \
id                                                             
0   192.168.1.158  224.0.0.251      5353      5353        17   

    bidirectional_packets  bidirectional_bytes application_name  
id                                                               
0                       6                 1371             MDNS  
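
Once flows are available as dictionaries like `flow_dicts`, simple summaries need no extra tooling. As a small sketch, the snippet below totals bytes per protocol and application; the `PROTOCOL_NAMES` table is a hand-written subset of the IANA protocol numbers, and the function name is our own.

```python
from collections import Counter

# Hand-written subset of IANA protocol numbers; extend as needed.
PROTOCOL_NAMES = {1: "ICMP", 6: "TCP", 17: "UDP"}


def bytes_per_application(flow_records):
    """Sum bidirectional_bytes per (protocol name, application_name) pair."""
    totals = Counter()
    for record in flow_records:
        proto = PROTOCOL_NAMES.get(record["protocol"], str(record["protocol"]))
        totals[(proto, record["application_name"])] += record["bidirectional_bytes"]
    return totals


# Record mirroring the single flow printed above.
records = [
    {"protocol": 17, "application_name": "MDNS", "bidirectional_bytes": 1371},
]
print(bytes_per_application(records).most_common())
```

This also shows how to read the `protocol` column: the value 17 in the output above is UDP.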


## PyShark - See/execute shark.py for sample

### Strengths
- **Wireshark Integration**: Access to all Wireshark dissectors for comprehensive protocol support
- **Readable Output**: Human-friendly packet information similar to Wireshark GUI
- **Familiar Interface**: Easy transition for Wireshark users
- **Deep Packet Inspection**: Excellent for detailed protocol analysis

### Weaknesses
- **Performance**: Much slower than NFStream because it relies on tshark processes
- **Resource Intensive**: High memory usage when dealing with large captures
- **Dependency on Wireshark**: Requires Wireshark/tshark installation
- **Limited Packet Creation**: Not designed for packet crafting like Scapy
- **Jupyter Notebook Limitations**: Often encounters execution issues in Jupyter Notebook environments due to asynchronous processing requirements
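
One pattern for keeping a blocking capture out of the notebook's main thread is to run it in a worker thread and hand packets back through a queue. The sketch below shows only the threading pattern: `fake_capture` is a hypothetical stand-in that yields placeholder strings, and we make no claim that this resolves every PyShark issue in Jupyter.

```python
import queue
import threading


def run_capture_in_thread(capture_fn, packet_count):
    """Run a blocking capture function in a worker thread and collect its packets."""
    results = queue.Queue()

    def worker():
        # capture_fn is expected to yield packets one at a time.
        for packet in capture_fn(packet_count):
            results.put(packet)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread, results


def fake_capture(packet_count):
    """Hypothetical stand-in for a real capture; yields placeholder strings."""
    for index in range(packet_count):
        yield f"packet-{index}"


thread, results = run_capture_in_thread(fake_capture, 3)
thread.join(timeout=5)  # wait for the worker so the cell sees all packets
collected = []
while not results.empty():
    collected.append(results.get())
print(collected)
```

Collecting into a queue rather than printing from the worker keeps the results available to later cells.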

In [None]:
import pyshark
import threading

# Define the interface you want to capture from
interface = 'Wi-Fi'

# Create a live capture object
capture = pyshark.LiveCapture(interface=interface)

# Print packets as they arrive. sniff_continuously() yields each packet,
# which avoids reaching into the private capture._packets list.
def display_packets(capture, packet_count=10):
    print("Starting live capture on interface:", interface)
    for packet in capture.sniff_continuously(packet_count=packet_count):
        print(packet)

# Run the function in a separate thread
capture_thread = threading.Thread(target=display_packets, args=(capture, 10))
capture_thread.start()



Starting live capture on interface: Wi-Fi


Packet (Length: 54)
Layer ETH
:	Destination: 54:14:f3:bb:2c:8b
	.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
	.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
	Source: f8:ca:59:07:25:d5
	.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
	.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
	Type: IPv4 (0x0800)
	Stream index: 0
Layer IP
:	0100 .... = Version: 4
	.... 0101 = Header Length: 20 bytes (5)
	Differentiated Services Field: 0x58 (DSCP: AF23, ECN: Not-ECT)
	0101 10.. = Differentiated Services Codepoint: Assured Forwarding 23 (22)
	.... ..00 = Explicit Congestion Notification: Not ECN-Capable Transport (0)
	Total Length: 40
	Identification: 0x6fd2 (28626)
	000. .... = Flags: 0x0
	0... .... = Reserved bit: Not set
	.0.. .... = Don't fragment: Not set
	..0. .... = More fragments: Not set
	...0 0000 0000 0000 = Fragment Offset: 0
	Time to Live: 60
	Protocol: TCP (6)
	Header Check

## Scapy

### Strengths
- **Performance**: Adequate for small captures and targeted packet-level tasks, though slower than NFStream for large-scale analysis
- **Packet Creation/Manipulation**: Unmatched flexibility for crafting custom packets and protocols
- **Interactive Use**: Excellent for testing and experimenting with network protocols
- **Powerful Dissection**: Can decode a wide range of protocols with manual control
- **Scriptability**: Great for automating complex network tasks and penetration testing

### Weaknesses
- **Memory Usage**: Loads entire packet captures into memory when reading with `rdpcap()`
- **Steep Learning Curve**: Requires deep knowledge of protocol structures
- **Limited Scalability**: Not ideal for processing gigabytes of traffic data
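
The memory concern comes from holding every packet at once. To illustrate the alternative, the sketch below reads a classic pcap stream one record at a time using only the standard library. It is not Scapy's reader (Scapy ships its own `PcapReader` for lazy iteration); it assumes the classic pcap layout with microsecond timestamps and does not handle pcapng.

```python
import io
import struct

PCAP_MAGIC = 0xA1B2C3D4  # classic pcap, microsecond timestamps


def iter_pcap_records(stream):
    """Yield (timestamp_seconds, raw_bytes) one record at a time from a classic pcap stream."""
    header = stream.read(24)  # global header is 24 bytes
    if len(header) < 24:
        raise ValueError("truncated pcap global header")
    magic = struct.unpack("<I", header[:4])[0]
    if magic == PCAP_MAGIC:
        endian = "<"  # file was written little-endian
    elif magic == 0xD4C3B2A1:
        endian = ">"  # file was written big-endian
    else:
        raise ValueError("not a classic microsecond pcap file")
    while True:
        record_header = stream.read(16)  # ts_sec, ts_usec, incl_len, orig_len
        if len(record_header) < 16:
            return  # end of file
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(endian + "IIII", record_header)
        data = stream.read(incl_len)
        if len(data) < incl_len:
            return  # truncated final record
        yield ts_sec + ts_usec / 1_000_000, data


# Build a tiny in-memory capture with two made-up payloads.
payloads = [b"abc", b"defgh"]
buffer = io.BytesIO()
buffer.write(struct.pack("<IHHiIII", PCAP_MAGIC, 2, 4, 0, 0, 65535, 1))
for index, payload in enumerate(payloads):
    buffer.write(struct.pack("<IIII", index, 0, len(payload), len(payload)))
    buffer.write(payload)
buffer.seek(0)

total_bytes = sum(len(data) for _, data in iter_pcap_records(buffer))
print(total_bytes)
```

Because the generator holds only one record at a time, memory use stays flat regardless of file size.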

In [None]:
from scapy.all import IP, TCP, send, sniff

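# Craft a custom packet: a TCP segment to port 80 carrying an HTTP request as payload.
# No handshake is performed, so this only demonstrates packet construction.
packet = IP(dst="192.168.1.1")/TCP(dport=80)/"GET / HTTP/1.1\r\nHost: 192.168.1.1\r\n\r\n"

# Send the packet at layer 3 (typically requires administrator/root privileges)
send(packet)

# Print a one-line summary for each sniffed TCP packet
def packet_callback(pkt):
    if pkt.haslayer(TCP):
        print(f"Packet: {pkt.summary()}")

# Capture 10 packets of any protocol; only the TCP ones are printed by the callback
sniff(iface="Intel(R) Wi-Fi 6 AX201 160MHz", prn=packet_callback, count=10)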


Sent 1 packets.
Packet: Ether / IP / TCP 192.168.1.199:63069 > 20.50.201.205:https S
Packet: Ether / IP / TCP 20.50.201.205:https > 192.168.1.199:63069 SA
Packet: Ether / IP / TCP 192.168.1.199:63069 > 20.50.201.205:https A
Packet: Ether / IP / TCP 192.168.1.199:63069 > 20.50.201.205:https A / Raw
Packet: Ether / IP / TCP 192.168.1.199:63069 > 20.50.201.205:https PA / Raw
Packet: Ether / IP / TCP 20.50.201.205:https > 192.168.1.199:63069 A


<Sniffed: TCP:6 UDP:4 ICMP:0 Other:0>
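
Each summary line above ends with Scapy's shorthand for the TCP flags, and the first three lines (`S`, `SA`, `A`) are the three-way handshake. As a small reading aid, the sketch below expands that shorthand into flag names. The mapping table and the function name are our own.

```python
# One-letter TCP flag shorthand and the corresponding flag names.
TCP_FLAG_NAMES = {
    "F": "FIN", "S": "SYN", "R": "RST", "P": "PSH",
    "A": "ACK", "U": "URG", "E": "ECE", "C": "CWR",
}


def expand_tcp_flags(shorthand):
    """Expand a flag string such as 'SA' into ['SYN', 'ACK']."""
    return [TCP_FLAG_NAMES[letter] for letter in shorthand]


# Flag strings taken from the output above.
for shorthand in ["S", "SA", "A", "PA"]:
    print(f"{shorthand:>2} -> {'+'.join(expand_tcp_flags(shorthand))}")
```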