# Snort

Regular expressions are popular because they are extremely powerful. The industry standard for software NIDS (and Network Intrusion Prevention Systems (NIPS)) is [Snort](https://www.snort.org/). Snort uses roughly 3500 **rules** to filter malicious traffic. These rules are written in a regex-like way.

# Let's try to mimic a rule
An example of a Snort rule is:

> alert tcp 192.168.x.x any -> 172.16.x.x 49535 (msg:"We won't allow this socket"; sid:1000002; rev:1;)

In more human language, this could be (loosely) translated as

> all **TCP** traffic that comes from an IP address with **192** and **168** as its first two bytes, and that comes from **any given port**; AND that goes **to** an IP address that starts with **172.16**, on **port 49535**, should trigger an alert to the security engineer
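To make the loose translation concrete, here is a minimal sketch that splits the rule header into its seven fields and compares addresses octet by octet. The function names `parse_rule_header` and `address_matches` and the dictionary keys are made up for illustration. The `x` wildcard follows the notation of the example above rather than real Snort syntax, which writes address ranges in CIDR form such as `192.168.0.0/16`.

```python
# Header of the example rule (the options between parentheses are left out).
RULE_HEADER = 'alert tcp 192.168.x.x any -> 172.16.x.x 49535'


def parse_rule_header(header):
    """Split a rule header into named fields (names are illustrative)."""
    action, protocol, src, src_port, direction, dst, dst_port = header.split()
    return {
        'action': action,
        'protocol': protocol,
        'src': src,
        'src_port': src_port,
        'direction': direction,
        'dst': dst,
        'dst_port': dst_port,
    }


def address_matches(pattern, address):
    """Compare dotted addresses octet by octet; 'x' matches any octet."""
    return all(p == 'x' or p == a
               for p, a in zip(pattern.split('.'), address.split('.')))


rule = parse_rule_header(RULE_HEADER)
print(rule['action'], rule['protocol'], rule['dst_port'])
print(address_matches(rule['src'], '192.168.1.20'))  # source inside 192.168.x.x
print(address_matches(rule['dst'], '10.0.0.1'))      # destination outside 172.16.x.x
```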

The Snort software builds a set of regexes to match all its rules. Incoming traffic is then matched against every regex in the rule set.
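As an illustration of the idea only (this is not how Snort is implemented internally), the example rule can be written as a single regular expression over the raw bytes of a frame. The sketch assumes an untagged Ethernet II frame that carries IPv4 with a 20-byte header (no IP options), so every field sits at a fixed offset. The name `RULE_PATTERN` and the hand-built test frame are assumptions made for this example.

```python
import re

# Each '.{n}' skips n bytes; re.DOTALL lets '.' match every byte value.
RULE_PATTERN = re.compile(
    b'.{12}' + re.escape(b'\x08\x00')    # skip both MAC addresses; Ethertype = IPv4
    + b'.{9}' + re.escape(b'\x06')       # skip to the Protocol field; 6 = TCP
    + b'.{2}' + re.escape(b'\xc0\xa8')   # skip checksum; source starts with 192.168
    + b'.{2}' + re.escape(b'\xac\x10')   # destination starts with 172.16
    + b'.{4}' + re.escape(b'\xc1\x7f'),  # skip rest of address + source port; port 49535
    re.DOTALL,
)

# Hand-built frame: Ethernet header, 20-byte IPv4 header, first 4 bytes of TCP.
frame = (
    bytes(12) + b'\x08\x00'
    + b'\x45' + bytes(8) + b'\x06' + bytes(2)
    + bytes([192, 168, 1, 20]) + bytes([172, 16, 0, 5])
    + (1234).to_bytes(2, 'big') + (49535).to_bytes(2, 'big')
)
# Same frame, but with destination port 80.
other = frame[:36] + (80).to_bytes(2, 'big')

print(bool(RULE_PATTERN.match(frame)))
print(bool(RULE_PATTERN.match(other)))
```

Because all quantifiers have a fixed length and `match` anchors at the start of the frame, the pattern can only succeed when every field is at the expected offset.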

It is important to understand that **tcp** already implies a number of things. Because TCP sits at the transport layer, it implies that the network layer is IPv4 (or IPv6, which is left out of scope here).

One could be tempted to check only byte number 24 of a frame (counting from 1, so index 23 when counting from 0). This byte defines which transport layer protocol is used. Although that is a valid deduction, it is not complete. Byte number 24 only defines the transport layer protocol **IF** IPv4 is used as the network layer protocol **AND** Ethernet is used as the link layer protocol.
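The dependency between the two checks can be sketched as follows. The helper names `is_ipv4` and `is_tcp` are hypothetical, and the frames are assumed to be untagged Ethernet II frames held in `bytes` objects.

```python
ETHERTYPE_IPV4 = 0x0800  # value of the Ethertype field for IPv4
PROTOCOL_TCP = 6         # value of the IPv4 Protocol field for TCP


def is_ipv4(frame):
    """The Ethertype field occupies bytes 13 and 14 (indices 12 and 13)."""
    return len(frame) >= 14 and int.from_bytes(frame[12:14], 'big') == ETHERTYPE_IPV4


def is_tcp(frame):
    """Byte number 24 (index 23) is only meaningful once the frame is IPv4."""
    return is_ipv4(frame) and len(frame) >= 24 and frame[23] == PROTOCOL_TCP


# Both frames carry the value 6 in byte number 24, but only the first is IPv4.
ipv4_frame = bytes(12) + b'\x08\x00' + bytes(9) + b'\x06'
arp_frame = bytes(12) + b'\x08\x06' + bytes(9) + b'\x06'

print(is_tcp(ipv4_frame))
print(is_tcp(arp_frame))
```

The second frame uses Ethertype `0x0806` (ARP), so the value 6 at index 23 says nothing about TCP there.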

![Decision flow](images/11_flow.png)

The image above shows which evaluations have to take place and what effect the outcome of each decision has.

Now let's try to achieve this in Python. First we start by loading all required variables. Don't forget to run this block prior to the rest.

In [5]:
from lib.dataset import NIDSDataset

data_file = 'data/packets.npy'
labels_file = 'data/labels.npy'

dataset = NIDSDataset(data_file, labels_file)

The next step is to distinguish which frames are TCP. Running the code below will tell you how many frames are analysed and how many of them are IPv4 frames and/or TCP frames.

In [7]:
framecounter = 0
number_of_ipv4_frames = 0
number_of_tcp_frames = 0

# loop over all frames in the dataset
for d in dataset:

    wordcounter = 0

    # loop over all words
    for word in d:
        # examine Ethertype - in link layer header
        # if the Ethertype field is not 0x0800, the frame is allowed
                
        # examine Protocol - in network layer header
        # if the Protocol field is not 0x6, the frame is allowed

        # examine Source Address - in network layer header
        # if the first two source address bytes are not 192 and 168,
        #   the frame is allowed

        # examine Destination Address - in network layer header
        # if the first two destination address bytes are not 172 and
        #   16, the frame is allowed

        # examine Destination port - in transport layer header
        # if the destination port field is not 49535, the
        #   frame is allowed

        # print(word, end='')
        wordcounter += 1
    
    # end of iteration over words
    framecounter += 1
# end of iteration over frames

# print summary
print("\nWe've received %d frames" % framecounter)
print("\tIPv4: %d frames" % number_of_ipv4_frames)
print("\t\tTCP: %d frames" % number_of_tcp_frames)



We've received 130 frames
	IPv4: 0 frames
		TCP: 0 frames


In [None]:
framecounter = 0
number_of_ipv4_frames = 0
number_of_tcp_frames = 0

decision_pass = 0
decision_alert = 0

# loop over all frames in the dataset
for d in dataset:

    wordcounter = 0

    # loop over all words
    for word in d:
        # examine Ethertype - in link layer header
        # if the Ethertype field is not 0x0800, the frame is allowed
                
        # examine Protocol - in network layer header
        # if the Protocol field is not 0x6, the frame is allowed

        # examine Source Address - in network layer header
        # if the first two source address bytes are not 192 and 168,
        #   the frame is allowed

        # examine Destination Address - in network layer header
        # if the first two destination address bytes are not 172 and
        #   16, the frame is allowed

        # examine Destination port - in transport layer header
        # if the destination port field is not 49535, the
        #   frame is allowed

        # print(word, end='')
        wordcounter += 1
    
    # end of iteration over words
    framecounter += 1
# end of iteration over frames

# print summary
print("\nWe've received %d frames" % framecounter)
print("\tIPv4: %d frames" % number_of_ipv4_frames)
print("\t\tTCP: %d frames" % number_of_tcp_frames)
print("\nWe've decided")
print("\tok frames: %d frames" % decision_pass)
print("\talert frames: %d frames" % decision_alert)
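
The checks listed in the comments above could be filled in along the following lines. This is only a sketch: the layout of the frames returned by `NIDSDataset` is not shown here, so the example works on hand-built `bytes` objects instead, and `make_frame` and `decide` are hypothetical helpers. It assumes untagged Ethernet II frames and uses the IPv4 header length field to locate the TCP header.

```python
from collections import Counter


def make_frame(src, dst, dst_port, protocol=6, ethertype=b'\x08\x00'):
    """Build a minimal Ethernet II / IPv4 / TCP frame for testing."""
    ethernet = bytes(12) + ethertype
    ipv4 = (b'\x45' + bytes(8) + bytes([protocol]) + bytes(2)
            + bytes(src) + bytes(dst))
    tcp = (1234).to_bytes(2, 'big') + dst_port.to_bytes(2, 'big')
    return ethernet + ipv4 + tcp


def decide(frame):
    """Return 'alert' if the frame matches the example rule, else 'pass'."""
    if len(frame) < 34 or frame[12:14] != b'\x08\x00':
        return 'pass'  # too short or not IPv4
    if frame[23] != 6:
        return 'pass'  # not TCP
    if frame[26:28] != bytes([192, 168]):
        return 'pass'  # source address outside 192.168.x.x
    if frame[30:32] != bytes([172, 16]):
        return 'pass'  # destination address outside 172.16.x.x
    # The low nibble of the first IPv4 byte is the header length in 32-bit words.
    tcp_start = 14 + (frame[14] & 0x0F) * 4
    dst_port = int.from_bytes(frame[tcp_start + 2:tcp_start + 4], 'big')
    return 'alert' if dst_port == 49535 else 'pass'


frames = [
    make_frame([192, 168, 1, 20], [172, 16, 0, 5], 49535),               # matches
    make_frame([192, 168, 1, 20], [172, 16, 0, 5], 80),                  # other port
    make_frame([10, 0, 0, 1], [172, 16, 0, 5], 49535),                   # other source
    make_frame([192, 168, 1, 20], [172, 16, 0, 5], 49535, protocol=17),  # UDP
]

decisions = Counter(decide(f) for f in frames)
print("ok frames: %d frames" % decisions['pass'])
print("alert frames: %d frames" % decisions['alert'])
```

Returning early at the first failed check mirrors the "the frame is allowed" comments: a frame only reaches the alert once every earlier condition has held.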