### About this notebook
This notebook is part 2 of Abnormal Distribution's code deliverable for TCC's Data Analysis Bootcamp. Here we implement Python code that extracts our previously ETL'd dataset from PostgreSQL and exports it as intrusion detection system (IDS) rules in <a href="https://suricata.io/">Suricata</a>'s format.

<b>Prerequisite:</b> Completion of "Project 3 Part 1 - ETL.ipynb."

<b>A note about terminology:</b> "Signatures" tell a security control how to interpret input, such as an attack pattern, and "rules" are the functional configuration of those signatures in the control (e.g., Suricata). Functionally, the terms rule and signature are used interchangeably here.

### On to the code

In [1]:
# Ensure suricataparser is available in the local Jupyter environment
# suricataparser is the library that will export IDS signatures from our ETL'd database in Suricata format
!pip install suricataparser



In [2]:
# Import pandas and SQLAlchemy for database access and suricataparser for rule generation
# psycopg2 is the PostgreSQL driver; csv is used later to write the patterns to a file
import csv
import pandas as pd
import psycopg2
import suricataparser
from sqlalchemy import create_engine

In [3]:
""" Connect to the iot_attack_traffic database
***Note*** hardcoding user credentials is extremely insecure:
anyone who has access to your notebook will have your creds.
Because this is non-production code we accept this risk.
Before the code moves to Production we would retrieve the credentials
at run time from a secure password store, such as the system keyring via the keyring library.
(https://theautomatic.net/2020/04/28/how-to-hide-a-password-in-a-python-script/)
"""
connString = 'postgresql://postgres:postgres@127.0.0.1/iot_attack_traffic'
db=create_engine(connString)
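
A minimal sketch of one step away from hardcoded credentials, assuming they are supplied through environment variables instead. The helper `build_conn_string` and the variable names `PGUSER` and `PGPASSWORD` are illustrative assumptions, not part of the original notebook; a keyring lookup could replace the mapping without changing the rest.

```python
import os
from urllib.parse import quote_plus

def build_conn_string(env, host="127.0.0.1", database="iot_attack_traffic"):
    """Build a PostgreSQL URL from a mapping of environment variables (hypothetical helper)."""
    user = env["PGUSER"]
    password = env["PGPASSWORD"]
    # quote_plus escapes characters such as '@' or ':' that would otherwise break the URL
    return f"postgresql://{quote_plus(user)}:{quote_plus(password)}@{host}/{database}"

# In the notebook this would be called as build_conn_string(os.environ);
# a plain dict with made-up values is used here so the sketch runs anywhere
example = build_conn_string({"PGUSER": "postgres", "PGPASSWORD": "p@ss:word"})
print(example)
```

The resulting string could then be passed to `create_engine` in place of the hardcoded `connString`.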

In [4]:
# Load the full traffic table, then define the list of attacks we know about ...
tfcDf=pd.read_sql('select * from all_traffic',connString)
attacksList=('ARP_poisioning','DDOS_Slowloris','DOS_SYN_Hping','Metasploit_Brute_Force_SSH','NMAP_FIN_SCAN','NMAP_OS_DETECTION','NMAP_TCP_scan','NMAP_UDP_SCAN','NMAP_XMAS_TREE_SCAN')
# Pull just the attack patterns. The SQL string literals need single quotes, so the IN list
# is built from attacksList and embedded in a double-quoted Python string
attacksIn=", ".join(f"'{a}'" for a in attacksList)
attacksDf=pd.read_sql_query(f"select * from all_traffic where traffic_pattern in ({attacksIn})",connString)
patternsDf=pd.read_sql('select * from traffic_patterns',connString)
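
The attack query needs every attack name quoted inside the SQL, which is where the quoting trouble comes from. A hedged sketch of an alternative is to let the database driver do the quoting through placeholders. The example below uses an in-memory SQLite table with made-up rows as a stand-in for `all_traffic`, purely so it runs without a PostgreSQL server; with psycopg2 the placeholder is `%s` rather than `?`.

```python
import sqlite3

attacks = ("DOS_SYN_Hping", "NMAP_UDP_SCAN")

# In-memory SQLite table standing in for all_traffic (illustrative sample rows)
conn = sqlite3.connect(":memory:")
conn.execute("create table all_traffic (response_port integer, traffic_pattern text)")
conn.executemany(
    "insert into all_traffic values (?, ?)",
    [(1883, "MQTT_Publish"), (80, "DOS_SYN_Hping"), (53, "NMAP_UDP_SCAN")],
)

# One "?" placeholder per attack name; the driver handles quoting and escaping
placeholders = ", ".join("?" for _ in attacks)
query = (
    "select * from all_traffic "
    f"where traffic_pattern in ({placeholders}) order by response_port"
)
rows = conn.execute(query, attacks).fetchall()
conn.close()
print(rows)
```

Because the names never become part of the SQL text, the same pattern stays safe if the attack list ever comes from user input.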

In [5]:
tfcDf.head()

Unnamed: 0,index,origin_port,response_port,proto,service,traffic_pattern
0,0,38667,1883,tcp,mqtt,MQTT_Publish
1,1,51143,1883,tcp,mqtt,MQTT_Publish
2,2,44761,1883,tcp,mqtt,MQTT_Publish
3,3,60893,1883,tcp,mqtt,MQTT_Publish
4,4,51087,1883,tcp,mqtt,MQTT_Publish


In [6]:
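# Next are the traffic stats (i.e., pull the traffic signatures that will become rules)
# Select only the five columns the rules need, in the order the later cells expect
tfcQuery="select origin_port,response_port,proto,service,traffic_pattern from all_traffic \
where traffic_pattern in ('ARP_poisioning','DDOS_Slowloris',\
'DOS_SYN_Hping','Metasploit_Brute_Force_SSH','NMAP_FIN_SCAN','NMAP_OS_DETECTION','NMAP_TCP_scan',\
'NMAP_UDP_SCAN','NMAP_XMAS_TREE_SCAN')"
# Open a psycopg2 connection and cursor, run the query and keep the rows as a list of tuples
conn=psycopg2.connect(connString)
cur=conn.cursor()
cur.execute(tfcQuery)
tfc=cur.fetchall()

In [7]:
# Close the cursor first, then the db connection
cur.close()
conn.close()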
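
Closing the cursor and connection by hand is easy to get wrong, for example when a name was never defined or a query raises first. A minimal sketch of one alternative, assuming the standard library's `contextlib.closing`, which calls `.close()` on exit. SQLite stands in for PostgreSQL here so the example runs without a server; psycopg2 connections and cursors also expose `.close()`, so the same pattern should carry over.

```python
import sqlite3
from contextlib import closing

# closing() calls .close() when the block exits, even if the query raises
with closing(sqlite3.connect(":memory:")) as conn:
    with closing(conn.cursor()) as cur:
        cur.execute("select 1 + 1")
        result = cur.fetchall()

print(result)
```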

In [None]:
# Use Pandas to make tfcDf and attacksDf
# Reading the query through pandas keeps the column names, so no rename is needed
tfcDf=pd.read_sql(tfcQuery,connString)
# attacksDf holds the attack names as a single named column
attacksDf=pd.DataFrame(attacksList,columns=['traffic_pattern'])
# info() prints its summary itself and returns None, so it is not wrapped in print()
tfcDf.info()
attacksDf.info()

In [None]:
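# Create an empty rules list, loop through the traffic and append a rule for each
# parse_rule (singular) returns one Rule; parse_rules returns a list, which would nest the results
rules = []
for sid, row in enumerate(tfcDf.itertuples(index=False), start=1000001):
    # Each rule needs its own sid; 1000000-1999999 is the range conventionally reserved for local rules
    rule=suricataparser.parse_rule(
        f'alert {row.proto} any {row.origin_port} -> any {row.response_port} '
        f'(msg:"{row.traffic_pattern}"; sid:{sid}; gid:1;)')
    rules.append(rule)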
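
To make the row-to-rule mapping concrete, here is a hedged sketch that formats rule text with plain string handling, without suricataparser. The helper `build_rule`, the sample rows, the `rev:1` option and the choice to drop duplicate rows are all assumptions for illustration, not the notebook's method.

```python
def build_rule(proto, src_port, dst_port, pattern, sid):
    """Format one traffic row as a Suricata-style alert rule string (hypothetical helper)."""
    # ICMP has no ports, so fall back to "any" for that protocol
    if proto == "icmp":
        src_port, dst_port = "any", "any"
    return (
        f"alert {proto} any {src_port} -> any {dst_port} "
        f'(msg:"{pattern}"; sid:{sid}; rev:1;)'
    )

# Made-up sample rows in the order origin_port, response_port, proto, service, pattern
rows = [
    (38667, 22, "tcp", "ssh", "Metasploit_Brute_Force_SSH"),
    (38667, 22, "tcp", "ssh", "Metasploit_Brute_Force_SSH"),  # duplicate row
    (0, 0, "icmp", "-", "NMAP_OS_DETECTION"),
]

# dict.fromkeys keeps the first occurrence of each row and preserves order
unique_rows = list(dict.fromkeys(rows))

rule_texts = [
    build_rule(proto, src, dst, pattern, sid)
    for sid, (src, dst, proto, _service, pattern) in enumerate(unique_rows, start=1000001)
]
for text in rule_texts:
    print(text)
```

Dropping duplicates first keeps identical traffic rows from producing many rules that differ only in their `sid`.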

In [None]:
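# Write the attack traffic patterns to a CSV file for review
# Note: Suricata loads rules from .rules files, not CSV, so this file is a
# human-readable record of the patterns rather than something Suricata can import
import csv
columns=['origin_port','response_port','proto','service','traffic_pattern']
# newline='' stops the csv module from writing blank lines between rows on Windows
with open('rules.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['src_port','dst_port','proto','service','pattern'])
    for event in tfcDf[columns].itertuples(index=False):
        writer.writerow(event)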
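
Since Suricata reads plain-text rule files with one rule per line, a rules file is the natural companion to the CSV. A minimal sketch under stated assumptions: the helper `write_rules_file`, the file name `local.rules` and the two sample rules are illustrative only. In the notebook, the strings would come from the `rules` list, assuming `str(rule)` yields the rule text for a suricataparser rule object.

```python
import os
import tempfile

def write_rules_file(rule_texts, path):
    """Write one rule per line to a plain-text rules file (hypothetical helper)."""
    with open(path, "w") as f:
        for text in rule_texts:
            f.write(text + "\n")
    return path

# Made-up sample rules for illustration
sample_rules = [
    'alert tcp any any -> any 22 (msg:"Metasploit_Brute_Force_SSH"; sid:1000001; rev:1;)',
    'alert udp any any -> any any (msg:"NMAP_UDP_SCAN"; sid:1000002; rev:1;)',
]

# A temporary directory keeps the sketch from touching the working folder
rules_path = write_rules_file(sample_rules, os.path.join(tempfile.mkdtemp(), "local.rules"))

# Read the file back to confirm each rule landed on its own line
with open(rules_path) as f:
    written = f.read().splitlines()
print(written)
```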