# Airodump Scan Results - CSV Analyzer Using pandas

We will use the CSV file produced by Airodump-NG to analyze the surrounding Wi-Fi networks.

The CSV has 2 sections:

    1. Access Point (AP) section
    2. Client section

1. Access Point analysis

We will use pandas to answer the following questions about our large CSV file:

    Which are the unique SSIDs?
    Are there any hidden SSID networks?
    How many APs of each SSID?
    Which channels are most occupied?
    Which manufacturers of Wi-Fi cards are most popular?

2. Client analysis

    What is the unique list of probed SSIDs?
    How many clients are connected vs roaming?
    
    
Let's begin!


In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

import netaddr

import seaborn as sns
sns.set_color_codes(palette='deep')


# Parsing the CSV file to separate the AP and Client sections

In [None]:
# Read the whole file as text; the context manager closes the file afterwards
with open('airodump.csv', 'r') as f:
    airodump_csv = f.read()

In [None]:
client_header = 'Station MAC, First time seen, Last time seen, Power, # packets, BSSID, Probed ESSIDs'

hdi = airodump_csv.index(client_header)

In [None]:
from io import StringIO  # Python 3 location (Python 2 used the StringIO module)

ap_csv = StringIO(airodump_csv[:hdi])

client_csv = StringIO(airodump_csv[hdi:])
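The split above can be tried on a tiny inline sample before running it on a real capture. This is a minimal sketch: the sample rows are made up and the header is shortened for readability, so it only illustrates the "find the client header, slice the text, parse each half" idea.

```python
from io import StringIO

import pandas as pd

# Made-up sample in the same two-section layout (not real scan data).
sample = (
    "BSSID, First time seen, channel, ESSID\n"
    "AA:BB:CC:00:00:01, 2016-01-01 10:00:00, 6, HomeNet\n"
    "AA:BB:CC:00:00:02, 2016-01-01 10:00:05, 11, CafeWifi\n"
    "\n"
    "Station MAC, First time seen, BSSID, Probed ESSIDs\n"
    "11:22:33:00:00:01, 2016-01-01 10:00:07, AA:BB:CC:00:00:01, HomeNet\n"
)

# Shortened header, assumed for this sample only.
header = "Station MAC, First time seen, BSSID, Probed ESSIDs"
split_at = sample.index(header)  # raises ValueError if the header is absent

# Everything before the header is the AP section, the rest is the client section.
ap_part = pd.read_csv(StringIO(sample[:split_at]), skipinitialspace=True)
client_part = pd.read_csv(StringIO(sample[split_at:]), skipinitialspace=True)

print(ap_part.shape)
print(client_part.shape)
```

Because `str.index` raises `ValueError` when the header text is missing, a capture with a different header layout fails loudly instead of being split in the wrong place.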


In [None]:
ap_df = pd.read_csv(ap_csv,
                    sep=',',
                    skipinitialspace=True,
                    parse_dates=['First time seen', 'Last time seen']
                   )

In [None]:
client_df = pd.read_csv(client_csv,
                        sep=', ',
                        skipinitialspace=True,
                        engine='python',
                        parse_dates = ['First time seen', 'Last time seen']
                       )

In [None]:
ap_df.head(1)

In [None]:
client_df.head(1)

# AP Analysis
Let's rename the columns to short, lowercase names, which makes them easier to work with.

In [None]:
ap_df.columns

In [None]:
ap_df.rename(columns={
    'BSSID': 'bssid',
    'First time seen': 'firstseen',
    'Last time seen': 'lastseen',
    'channel': 'channel',
    'Speed': 'speed',
    'Privacy': 'privacy',
    'Cipher': 'cipher',
    'Authentication': 'authentication',
    'Power': 'dbpower',
    '# beacons': 'beacons',
    '# IV': 'iv',
    'LAN IP': 'ip',
    'ID-length': 'idlen',
    'ESSID': 'essid',
    'Key': 'key'
}, inplace=True)

ap_df.head(3)
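One reason short names help: a column can only be accessed as an attribute (as in `ap_df.essid`) when its name is a valid Python identifier, so a name like `# beacons` would always need bracket syntax. A small sketch on a made-up frame, using the non-inplace form of `rename`, which returns a new DataFrame and leaves the original untouched:

```python
import pandas as pd

# Toy frame with made-up values, using a few of the original column names.
toy = pd.DataFrame({
    "BSSID": ["AA:BB:CC:00:00:01"],
    "# beacons": [12],
    "ESSID": ["HomeNet"],
})

# rename() without inplace=True returns a renamed copy.
renamed = toy.rename(columns={
    "BSSID": "bssid",
    "# beacons": "beacons",
    "ESSID": "essid",
})

print(list(renamed.columns))
print(renamed.beacons.sum())  # attribute access now works for this column
```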

# Which are the unique SSIDs?

In [None]:
set(ap_df.essid)

# Are there any Hidden SSID networks around?

In [None]:
# Find all rows whose ESSID is null, i.e. hidden SSIDs

ap_df[ap_df.essid.isnull()]

In [None]:
# Let's replace the NaNs with "Hidden SSID" 

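# Assign the result back; an in-place fillna on a column selection may not update ap_df in newer pandas
ap_df['essid'] = ap_df['essid'].fillna('Hidden SSID')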

ap_df.essid.hasnans

In [None]:
ap_df[ap_df.essid == 'Hidden SSID'].head(3)
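The same steps can be checked on a toy frame. This sketch uses made-up rows and counts the hidden networks before relabelling them, so the number is not lost once the NaNs are replaced:

```python
import numpy as np
import pandas as pd

# Made-up AP rows; the second one has no ESSID.
toy = pd.DataFrame({
    "bssid": ["AA:BB:CC:00:00:01", "AA:BB:CC:00:00:02", "AA:BB:CC:00:00:03"],
    "essid": ["HomeNet", np.nan, "CafeWifi"],
})

hidden_mask = toy["essid"].isnull()   # True where the ESSID is missing
n_hidden = int(hidden_mask.sum())     # number of hidden networks

# Replace the missing values by assigning the filled column back.
toy["essid"] = toy["essid"].fillna("Hidden SSID")

print(n_hidden)
print(toy["essid"].tolist())
```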

# How many APs of each SSID?

In [None]:
# Let us now get the ESSID counts

essid_stats = ap_df.essid.value_counts()

essid_stats

In [None]:
essid_stats.plot(kind='barh', figsize=(10,5))

# Which channels are most occupied?

In [None]:
ap_df.channel.value_counts()

In [None]:
ap_df.channel.value_counts().plot(kind='bar')
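Besides raw counts, `value_counts(normalize=True)` gives each channel's share of all APs, and `idxmax()` picks the busiest channel directly. A short sketch on made-up channel numbers:

```python
import pandas as pd

# Made-up channel column for six APs.
channels = pd.Series([1, 6, 6, 11, 6, 1], name="channel")

counts = channels.value_counts()                 # APs per channel, most common first
shares = channels.value_counts(normalize=True)   # fraction of APs per channel
busiest = counts.idxmax()                        # channel with the most APs

print(counts)
print(busiest)
```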

# Which manufacturers of Wi-Fi cards are most popular?

In [None]:
# An AP's vendor can be identified from the first 3 bytes (the OUI) of its MAC address.


manufacturer = ap_df.bssid.str.extract('(..:..:..)', expand=False)

manufacturer.head(10)

In [None]:
manufacturer.value_counts()
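If a capture ever mixes upper- and lowercase MAC addresses, the same OUI would be counted twice. A slightly stricter variant of the extraction, sketched here on made-up BSSIDs, uppercases first and only accepts hex digits (this normalization step is an addition, not part of the notebook above):

```python
import pandas as pd

# Made-up BSSIDs; the first two share an OUI but differ in case.
bssids = pd.Series([
    "aa:bb:cc:00:00:01",
    "AA:BB:CC:00:00:02",
    "10:8C:CF:00:00:03",
])

# Uppercase, then capture the first three hex byte pairs.
oui = bssids.str.upper().str.extract(
    r"^([0-9A-F]{2}:[0-9A-F]{2}:[0-9A-F]{2})", expand=False
)

oui_counts = oui.value_counts()
print(oui_counts)
```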

In [None]:
# https://pypi.python.org/pypi/netaddr

import netaddr

In [None]:
netaddr.OUI('10:8C:CF'.replace(':', '-')).registration().org

In [None]:
for x in manufacturer.value_counts().index[:10]:
    print(x)

In [None]:
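def manufac(oui):
    # Look up the vendor name for an OUI such as '10:8C:CF'
    try:
        return netaddr.OUI(oui.replace(':', '-')).registration().org
    except Exception:
        # Unregistered or malformed OUI, or a missing (NaN) value
        return "Unknown"

[manufac(oui) for oui in manufacturer.value_counts().index]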
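To answer the popularity question directly, the vendor names can be mapped onto every AP and then counted, which also merges different OUIs that belong to the same vendor. The sketch below uses a hypothetical lookup dictionary with placeholder vendor names instead of the netaddr database, so it runs without any extra packages:

```python
import pandas as pd

# Hypothetical OUI-to-vendor table with placeholder names (stand-in for a real OUI database).
OUI_TO_VENDOR = {
    "AA:BB:CC": "Vendor A",
    "AA:BB:CD": "Vendor A",
    "11:22:33": "Vendor B",
}

# Made-up OUIs, one per AP; the last one is not in the table.
oui = pd.Series(["AA:BB:CC", "AA:BB:CD", "11:22:33", "AA:BB:CC", "FF:FF:FF"])

# Map each OUI to a vendor; OUIs missing from the table become "Unknown".
vendor = oui.map(OUI_TO_VENDOR).fillna("Unknown")
vendor_counts = vendor.value_counts()

print(vendor_counts)
```

In the notebook itself, `manufacturer.map(manufac).value_counts()` should play the same role, with `manufac` doing the lookup.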

# Client Analysis

In [None]:
client_df.head(1)

In [None]:
client_df.columns = ['clientmac', 'firstseen', 'lastseen', 'power', 'numpkts', 'bssid', 'probedssids']

client_df.head(2)

In [None]:
client_df.bssid.head(10)

In [None]:
client_df['bssid'] = client_df.bssid.str.replace(',', '')

client_df.bssid.head(10)

# What is the unique list of SSIDs probed by all clients?

In [None]:
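all_probed_ssids_list = []

def createprobedlist(x):
    # Empty fields are read as NaN (a float), so only split real strings
    if isinstance(x, str):
        all_probed_ssids_list.extend(s.strip() for s in x.split(',') if s.strip())

client_df.probedssids.apply(createprobedlist)

all_probed_ssids_list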

In [None]:
set(all_probed_ssids_list)
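The same list can also be built without a helper function by using the pandas string methods together with `explode`, which turns each list element into its own row. A sketch on made-up probe strings, assuming a pandas version that has `Series.explode` (0.25 or later):

```python
import numpy as np
import pandas as pd

# Made-up probed-SSID column: comma-separated names, a missing value, and an empty string.
probed = pd.Series(["HomeNet,CafeWifi", np.nan, "CafeWifi", ""])

ssids = (
    probed.dropna()        # drop clients with no probe data
          .str.split(",")  # each entry becomes a list of SSIDs
          .explode()       # one row per SSID
          .str.strip()     # remove stray whitespace
)

# Drop empty names and de-duplicate.
unique_ssids = sorted(set(ssids[ssids != ""]))
print(unique_ssids)
```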

# How many clients are connected to an AP vs roaming?


In [None]:
client_df.count()

In [None]:
client_df.bssid.str.contains('not associated').value_counts()
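The True/False counts are easier to read with labels. This sketch uses made-up BSSID values and treats any client marked as not associated as roaming, following the wording of the question above:

```python
import pandas as pd

# Made-up client BSSID column: either an AP's MAC or the "not associated" marker.
bssid = pd.Series([
    "AA:BB:CC:00:00:01",
    "(not associated)",
    "(not associated)",
    "AA:BB:CC:00:00:02",
    "(not associated)",
])

# Plain substring match; missing values count as associated=False here (an assumption).
unassociated = bssid.str.contains("not associated", regex=False, na=False)

# Turn the boolean flags into readable labels and count them.
status = unassociated.map({True: "roaming", False: "connected"})
summary = status.value_counts()

print(summary)
```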