# IT Security - Sheet 7 "Firewalls, Intrusion Detection, and Access Control"

**Total achievable points: 20**

**Released: 20.01.2025**

**Submission Deadline: 30.01.2025**

---
Group: 128

Names and matriculation numbers of **ALL** team members: Samuel Rode (445160), Nils Maasch (445796), Pau Azpeita Bergos (443428), Gereon Geuchen (445328), Ben-Jay Huckebrink (445219)

---

**Important Information**

The assignments have to be submitted by groups of 5 students. Even if you are registered in RWTHmoodle to a submission group, **please include the group number as well as the name and matriculation number of every group member in this notebook**. In case you are not part of a submission group and want to hand in assignments, please contact `ba-itsec@itsec.rwth-aachen.de`. We will then assign you to a submission group.

Enter your solutions for the tasks in the respective cells of this notebook. These cells are either marked by "YOUR ANSWER HERE" or `#YOUR CODE HERE`. In code cells, you have to remove `raise NotImplementedError()`. Please do not add any new cells or remove existing ones, especially do not copy cells. Cells marked with `###PLAYGROUND` can be used to test your implementation and generate output. Please do not add any other output or tests in the cell of the task, just implement the function with the header provided. If you want to test your implementation, use the `###PLAYGROUND` cells. They will be ignored during grading. **Do not change any other cells or add new ones.**

Please **do not import any further Python packages** except the default Python ones and the ones that are explicitly given by us or listed below.

**In this exercise additional packages are needed.** Please make sure that these packages are available in the environment the notebook is running in.
- You need `scapy` as in the last exercise.

Please make sure to install all the packages in the same environment as your jupyter notebook (`ipykernel`).

## Content of this Assignment

In this last assignment sheet for this semester, we deal with a small example of UNIX access control, namely the `chmod` command. We then have a look at firewalls and implement a few toy firewalls ourselves. Finally, we look at a few metrics that are used to assess the quality of an Intrusion Detection System and compute some of them.

## 1. Access Control (3.5 points)

### Task 1.1 (2 points)

You saw in the lecture that UNIX uses discretionary access control. A three-digit octal number specifies the access rights for three different classes of users, namely "owner", "group" and "others". To make these numbers easier to read, we usually also get a string representation of the rights per class.

Your task is to implement that helpful function `chmod_converter(num: int, is_directory: bool) -> str` yourself now. 
The function expects the following arguments:
- `num` of type `int` representing the file or directory access permissions in integer form. This is the number you would give to `chmod` to assign permissions to your chosen file/directory.
-  `is_directory` of type `bool` indicating whether the path in question is a directory (`True` if it is a directory, `False` otherwise).

Your function should return a `str` that shows the usual string representation of an `ls -l` output, i.e. in the form $$a{b_1}{b_2}{b_3}{c_1}{c_2}{c_3}{d_1}{d_2}{d_3}$$ where $a = $ `d` if the path is a directory and $a = $ `-` (a dash) otherwise, $b_1 \dots b_3$ represents the rights for the owner, $c_1, \dots c_3$ the rights for the group, and $d_1, \dots d_3$ the rights for others. The abbreviations for the access permissions are `r` for read access, `w` for write access and `x` for execution access. If in any of these a right is not present, use a dash (`-`) to indicate this. Additionally, make sure to keep the correct ordering of the rights. An example of such output will look like the following `drwxrwxrwx`. Furthermore, your function should check whether the given `num` is a valid representation of access rights and raise a `ValueError` otherwise.

In [None]:
def chmod_converter(num: int, is_directory: bool) -> str:
    # 'num' carries the three octal digits written as a decimal integer
    # (e.g. 754). Left-pad with zeros so that e.g. 7 is read as 007.
    octal: str = f"{num:03d}"

    # Reject negative numbers, more than three digits, and the digits 8 and 9
    if num < 0 or len(octal) != 3 or any(digit not in "01234567" for digit in octal):
        raise ValueError("Given 'chmod' number modifier is not a valid three-digit octal number!")

    # Leading character: 'd' for directories, '-' otherwise
    res: str = "d" if is_directory else "-"

    # Lookup table from octal digit to its 'rwx' triplet
    # (one could equally derive each triplet with bit-shifts)
    triplets = ["---", "--x", "-w-", "-wx", "r--", "r-x", "rw-", "rwx"]
    for digit in octal:
        res += triplets[int(digit)]

    return res
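
The bit-shift approach mentioned in the comment above can be sketched as follows. This is only an illustration, not the graded solution: the names `chmod_converter_bits` and `reference_string` are hypothetical, and the sketch assumes that `num` carries the octal digits written as a decimal integer (e.g. `754`). The standard library's `stat.filemode` is used as an independent cross-check.

```python
import stat


def chmod_converter_bits(num: int, is_directory: bool) -> str:
    """Hypothetical bit-mask variant of the converter (illustration only)."""
    # Assumption: `num` carries the octal digits written in decimal, e.g. 754.
    digits = f"{num:03d}"  # left-pad so that e.g. 7 is read as 007
    if num < 0 or len(digits) != 3 or any(d not in "01234567" for d in digits):
        raise ValueError(f"{num!r} is not a valid three-digit octal mode")

    res = "d" if is_directory else "-"
    for d in digits:
        value = int(d)
        # Bit 2 (value 4) is read, bit 1 (value 2) is write, bit 0 (value 1) is execute.
        for shift, flag in ((2, "r"), (1, "w"), (0, "x")):
            res += flag if (value >> shift) & 1 else "-"
    return res


def reference_string(num: int, is_directory: bool) -> str:
    """Cross-check: let the standard library render the same mode bits."""
    file_type = stat.S_IFDIR if is_directory else stat.S_IFREG
    return stat.filemode(file_type | int(str(num), 8))


print(chmod_converter_bits(754, True))  # drwxr-xr--
print(reference_string(754, True))      # drwxr-xr--
```

Reading each digit as three bits avoids the eight-way case distinction, at the cost of being slightly less obvious to read than a lookup table.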

In [2]:
### PLAYGROUND
# You can use this cell to test out your implementation. Everything in this cell will be ignored during grading.

print(chmod_converter(660, True))

drw-rw----


In [3]:
# This test just checks the output of your solution

res = chmod_converter(762, True)

assert type(res) == str, "Your function does not return a string!"
for i in res:
    assert i in 'drwx-', "Your function returns a string with invalid characters!"

In [None]:
# Even though this cell seems empty, it contains automatic tests. Please do not remove this cell and just ignore it.

### Task 1.2 (1.5 points)

**Describe** the effect of the `r` (read), `w` (write), and `x` (execute) permission for a directory. What permission is needed to access a file in the directory?

For a _directory_...
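- ...`r` means that you are allowed to **list the names of the entries** in the directory
- ...`w` means that you are allowed to **add, remove & rename entries** in the directory (this only takes effect together with `x`)
- ...`x` means that you are allowed to **traverse** the directory, i.e., `cd` into it and reach the entries inside it by name

For accessing a file in a directory, you require the `x` permission on that directory (and on every directory on the path leading to it), in addition to the appropriate permission on the file itself.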
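
As a rough illustration of these semantics, the following sketch maps a single `rwx` triplet to the directory operations it permits. It is a simplified model, not a description of any particular kernel implementation, and the function name `directory_operations` is made up for this example.

```python
def directory_operations(perms: str) -> dict[str, bool]:
    """Map one rwx triplet (e.g. 'r-x') to what it permits on a directory."""
    allowed_per_position = ("r-", "w-", "x-")
    if len(perms) != 3 or any(c not in ok for c, ok in zip(perms, allowed_per_position)):
        raise ValueError(f"{perms!r} is not a valid rwx triplet")

    r, w, x = (c != "-" for c in perms)
    return {
        "list entry names": r,
        "enter directory / reach entries by name": x,
        # Changing the directory's entries needs write AND execute permission.
        "create, delete, or rename entries": w and x,
    }


for triplet in ("r--", "--x", "-w-", "-wx", "rwx"):
    print(triplet, directory_operations(triplet))
```

The `-w-` case shows why write permission alone is of little use on a directory: without `x`, the entries cannot be reached in the first place.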

## 2. Firewalls (8 points)

Now, we deal with a few examples of firewalls. We want to implement our very own firewall step by step. We start with a very simple packet filter, continue with a bidirectional firewall, and end up with a stateful firewall. Make sure that your firewall policies are **comprehensive**, so that they filter exactly what is expected in the particular task. We will work on a pcap file and use scapy again to extract information from the pcap. The pcap file will be loaded in the following cell and a packet list `packets` is available.

In [4]:
from scapy.all import *

# read pcap
packets = rdpcap("./packets.pcapng")

### Task 2.1 (2 points)

In this first task we will implement a very simple packet filter. The filter should accept all DNS packets sent to the default DNS port using UDP (we will ignore DNS over TCP packets here).

Implement the function `toy_dns_firewall(packets: list[Packet]) -> tuple[list[Packet], list[Packet]]`. As argument, this function gets the packet list `packets` that was loaded prior. The function should return a tuple of two different packet lists `(accept_list, deny_list)` where `accept_list` contains a list of all packets accepted by your firewall using the specified rule above and `deny_list` contains a list of all denied packets.

The type `Packet` here and in the following is meant as `scapy.packet.Packet`.

In [9]:
def toy_dns_firewall(packets: list[Packet]) -> tuple[list[Packet], list[Packet]]:    
    accept_list: list[Packet] = list()
    deny_list: list[Packet] = list()

    for packet in packets:
        if not (UDP in packet):
            deny_list.append(packet)
        elif packet[UDP].dport != 53:
            deny_list.append(packet)
        else:
            accept_list.append(packet)
        
    return (accept_list, deny_list)

In [13]:
### PLAYGROUND
# You can use this cell to test out your implementation. Everything in this cell will be ignored during grading.
for i, packet in enumerate(packets[:20]):
    print(f"Packet {i}:")
    ls(packet)

    print("=" * 10)

Packet 0:
dst        : DestMACField                        = '00:19:e2:a1:f9:86' ('None')
src        : SourceMACField                      = '00:0c:29:9d:c9:d6' ('None')
type       : XShortEnumField                     = 2048            ('36864')
--
version    : BitField  (4 bits)                  = 4               ('4')
ihl        : BitField  (4 bits)                  = 5               ('None')
tos        : XByteField                          = 0               ('0')
len        : ShortField                          = 351             ('None')
id         : ShortField                          = 4175            ('1')
flags      : FlagsField                          = <Flag 2 (DF)>   ('<Flag 0 ()>')
frag       : BitField  (13 bits)                 = 0               ('0')
ttl        : ByteField                           = 128             ('64')
proto      : ByteEnumField                       = 6               ('0')
chksum     : XShortField                         = 0               ('None')


In [10]:
# This test just checks the output of your solution

res = toy_dns_firewall(packets)

assert type(res) == tuple, "Your function does not return a tuple!"
assert len(res) == 2, "The returned tuple does not contain exactly two elements!"
assert type(res[0]) == list, "The first element in your tuple is not a list!"
assert type(res[1]) == list, "The second element in your tuple is not a list!"
assert len(res[0]) > 100 and len (res[0]) < 1100, "Something is strange. Check if your rule works properly."
assert len(res[1]) > 100 and len (res[1]) < 1100, "Something is strange. Check if your rule works properly."
assert len(res[0]) + len(res[1]) == 1241, "You lost some of the packets during filtering!"

In [None]:
# Even though this cell seems empty, it contains automatic tests. Please do not remove this cell and just ignore it.

### Task 2.2 (2 points)

Now, we have a problem. We allow DNS messages sent through our firewall to the DNS server, but we do not allow the answers sent from this server to some client behind our firewall. We now want to accept all packets that have the default DNS port as destination or source port and are running over UDP.

Implement the function `toy_dns_bidirect_firewall(packets: list[Packet]) -> tuple[list[Packet], list[Packet]]`. The arguments and the return values are the same as in the task before, only your firewall policy should change. The function gets a packet list `packets` and returns a tuple `(accept_list, deny_list)` with the list of accepted or denied packets.

In [11]:
def toy_dns_bidirect_firewall(packets: list[Packet]) -> tuple[list[Packet], list[Packet]]:
    accept_list: list[Packet] = list()
    deny_list: list[Packet] = list()

    for packet in packets:
        if not (UDP in packet):
            deny_list.append(packet)
        elif packet[UDP].dport != 53 and packet[UDP].sport != 53:
            deny_list.append(packet)
        else:
            accept_list.append(packet)
        
    return (accept_list, deny_list)
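
Both DNS filters share the same shape: walk over the packets once and sort each one into one of two lists according to a rule. The sketch below separates that loop from the rule. To stay runnable without scapy or the pcap file, it uses a hypothetical stand-in type `FakeDatagram`; with real scapy packets, the rules would instead test `UDP in packet` and `packet[UDP].dport` / `packet[UDP].sport` as in the functions above.

```python
from typing import Callable, Iterable, NamedTuple, TypeVar

T = TypeVar("T")
DNS_PORT = 53


class FakeDatagram(NamedTuple):
    """Hypothetical stand-in for a packet (not a scapy type)."""
    proto: str
    sport: int
    dport: int


def partition(packets: Iterable[T], accept: Callable[[T], bool]) -> tuple[list[T], list[T]]:
    """Split packets into (accept_list, deny_list) according to a rule."""
    accept_list: list[T] = []
    deny_list: list[T] = []
    for packet in packets:
        (accept_list if accept(packet) else deny_list).append(packet)
    return accept_list, deny_list


def is_dns_request(p: FakeDatagram) -> bool:
    # Rule of Task 2.1: UDP and destination port 53
    return p.proto == "udp" and p.dport == DNS_PORT


def is_dns_either_direction(p: FakeDatagram) -> bool:
    # Rule of Task 2.2: UDP and port 53 as source or destination
    return p.proto == "udp" and DNS_PORT in (p.sport, p.dport)


sample = [
    FakeDatagram("udp", 40000, 53),   # DNS query
    FakeDatagram("udp", 53, 40000),   # DNS answer
    FakeDatagram("tcp", 40001, 53),   # DNS over TCP, ignored by both rules
    FakeDatagram("udp", 123, 123),    # unrelated UDP traffic
]

for rule in (is_dns_request, is_dns_either_direction):
    accepted, denied = partition(sample, rule)
    print(rule.__name__, len(accepted), len(denied))
```

Keeping the rule as a separate predicate makes it easy to see that the two tasks differ only in a single condition.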

In [None]:
### PLAYGROUND
# You can use this cell to test out your implementation. Everything in this cell will be ignored during grading.

In [12]:
# This test just checks the output of your solution

res = toy_dns_bidirect_firewall(packets)

assert type(res) == tuple, "Your function does not return a tuple!"
assert len(res) == 2, "The returned tuple does not contain exactly two elements!"
assert type(res[0]) == list, "The first element in your tuple is not a list!"
assert type(res[1]) == list, "The second element in your tuple is not a list!"
assert len(res[0]) > 100 and len (res[0]) < 1100, "Something is strange. Check if your rule works properly."
assert len(res[1]) > 100 and len (res[1]) < 1100, "Something is strange. Check if your rule works properly."
assert len(res[0]) + len(res[1]) == 1241, "You lost some of the packets during filtering!"

In [None]:
# Even though this cell seems empty, it contains automatic tests. Please do not remove this cell and just ignore it.

### Task 2.3 (4 points)

Even though we improved our firewall implementation in the last subtask, it is still lacking a crucial detail. Nowadays, firewalls usually have a stateful filter mechanism. This ensures that answers to packets are only allowed if there was an initial packet sent from us. To enable this functionality, we have to keep track of open connections.

The firewall policy is now to accept any packet that is a response to an earlier packet that came from the address `192.168.7.1` and deny any packet that is not a response to a packet sent from that IP address. A packet *R* is considered to be a response to a packet *P* if the source of *P* is now the destination of *R* and the destination of *P* is now the source of *R*. In our case, the source and destination of a packet each consist of the IP address and the port number. You do not need to deal with 'forgetting' connections. Keep in mind that any outgoing packet (i.e. sent from `192.168.7.1`) has to be accepted as well. **We also only care about TCP connections and deny all UDP packets.**

Implement the firewall policy in the function `toy_state_firewall(packets: list[Packet]) -> tuple[list[Packet], list[Packet]]` as described above. The arguments and the return values are the same as in the task before, only your firewall policy should change.

*Hint: you have to keep track of your initiated connections and the respective communication endpoints.*

In [14]:
def toy_state_firewall(packets: list[Packet]) -> tuple[list[Packet], list[Packet]]:
    accept_list: list[Packet] = list()
    deny_list: list[Packet] = list()

    # Set of tuples of the form (src_port, dest_ip, dest_port)
    connection_list: set[tuple[int, str, int]] = set()

    for packet in packets:
        if (TCP not in packet) or (IP not in packet):
            deny_list.append(packet)
            continue

        if packet[IP].src == '192.168.7.1':
            # (Potentially) new request from the desired source address
            # Add connection for track-keeping purposes
            connection_list.add((packet[TCP].sport, packet[IP].dst, packet[TCP].dport))
            accept_list.append(packet)
        else:
            # Packet comes from somewhere else than the desired sender
            # Thus, it is allowed in _if and only if_ it is a response to a packet
            # from the sender, i.e., its ports & IPs are found in the
            # 'connection_list' set
            src_ip = packet[IP].src
            dest_ip = packet[IP].dst
            src_port = packet[TCP].sport
            dest_port = packet[TCP].dport

            if dest_ip == '192.168.7.1' and \
                    (dest_port, src_ip, src_port) in connection_list:
                accept_list.append(packet)
            else:
                deny_list.append(packet)
        
    return (accept_list, deny_list)
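
The response rule is order-sensitive: a packet only counts as a response if the matching outgoing packet was seen earlier. The following sketch illustrates this on hand-made data. It does not use scapy; `Segment`, `stateful_filter`, and all addresses other than `192.168.7.1` are hypothetical and only serve the illustration.

```python
from typing import NamedTuple

INSIDE = "192.168.7.1"


class Segment(NamedTuple):
    """Hypothetical stand-in for a TCP packet (not a scapy type)."""
    src_ip: str
    sport: int
    dst_ip: str
    dport: int


def stateful_filter(segments: list[Segment]) -> tuple[list[Segment], list[Segment]]:
    # Remembered endpoints: (inside_port, remote_ip, remote_port)
    seen: set[tuple[int, str, int]] = set()
    accepted: list[Segment] = []
    denied: list[Segment] = []
    for seg in segments:
        if seg.src_ip == INSIDE:
            # Outgoing packet: always accepted, and its endpoints are remembered.
            seen.add((seg.sport, seg.dst_ip, seg.dport))
            accepted.append(seg)
        elif seg.dst_ip == INSIDE and (seg.dport, seg.src_ip, seg.sport) in seen:
            # Incoming packet whose endpoints mirror an earlier outgoing packet.
            accepted.append(seg)
        else:
            denied.append(seg)
    return accepted, denied


outgoing = Segment(INSIDE, 50000, "10.0.0.5", 443)
reply = Segment("10.0.0.5", 443, INSIDE, 50000)
unsolicited = Segment("10.0.0.9", 443, INSIDE, 50000)

# The same reply appears twice: once before and once after the outgoing packet.
accepted, denied = stateful_filter([reply, outgoing, reply, unsolicited])
print("accepted:", accepted)
print("denied:  ", denied)
```

The first copy of `reply` is denied because no state exists yet, while the second copy is accepted. `unsolicited` is denied even though it targets the same inside port, since its remote endpoint was never contacted.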

In [15]:
### PLAYGROUND
# You can use this cell to test out your implementation. Everything in this cell will be ignored during grading.

accept, deny = toy_state_firewall(packets)
print(len(accept))
print(len(deny))

28
1213


In [16]:
# This test just checks the output of your solution

res = toy_state_firewall(packets)

assert type(res) == tuple, "Your function does not return a tuple!"
assert len(res) == 2, "The returned tuple does not contain exactly two elements!"
assert type(res[0]) == list, "The first element in your tuple is not a list!"
assert type(res[1]) == list, "The second element in your tuple is not a list!"
assert len(res[0]) < 100, "Something is strange. Check if your rule works properly."
assert len(res[1]) > 100, "Something is strange. Check if your rule works properly."
assert len(res[0]) + len(res[1]) == 1241, "You lost some of the packets during filtering!"

In [None]:
# Even though this cell seems empty, it contains automatic tests. Please do not remove this cell and just ignore it.

## 3. Intrusion Detection Systems (7.5 points)

Apart from firewalls, you also learned about Intrusion Detection Systems (IDSs), which raise an alert once there is an intrusion in the network or on a host. A "good" IDS needs to have a high detection rate while also having a low false alarm rate. In the following, we have a look at these metrics and calculate them. Our intrusion detection system works on network packets and marks each of them as benign or as an intrusion.

### Task 3.1 (1.5 points)

Your first task is to write a function that calculates the detection rate of an IDS.

Implement the function `calc_detection_rate(tp: int, fn: int) -> float`. The function receives the following arguments:
- `tp` as `int`: the number of true positives (i.e. the number of detected intrusions)
- `fn` as `int`: the number of false negatives (i.e. the number of undetected intrusions)

The function should return a `float` (between 0 and 1) representing the detection rate ( $P(alarm|attack)$ ). If your computation runs into a `ZeroDivisionError`, return `-1.0` to indicate that this is not yet defined meaningfully.

*Hint: Take care that your function does not run into a `ZeroDivisionError`.*

In [17]:
def calc_detection_rate(tp: int, fn: int) -> float:
    # Detection rate = recall = TP/(TP + FN)
    #
    # Only case where we run into a 'ZeroDivisionError' is when
    # both 'tp' and 'fn' are 0
    if tp == 0 and fn == 0:
        return -1.0
    
    return tp/(tp + fn)

In [18]:
### PLAYGROUND
# You can use this cell to test out your implementation. Everything in this cell will be ignored during grading.

print(calc_detection_rate(95, 5))

0.95


In [19]:
# This test just checks the output of your solution

try:
    res0 = calc_detection_rate(0, 0)
except ZeroDivisionError:
    assert False, "You ran into a ZeroDivisionError! In this case, you have to return -1.0"
assert res0 == -1.0, "You have to return -1.0 in the case of a ZeroDivisionError!"

res = calc_detection_rate(95, 5)

assert type(res) == float, "The returned value is not a float"
assert (res <= 1.0 and res >= 0.0) or res == -1.0, "The returned value is not between 0.0 and 1.0 and is not -1.0"

In [None]:
# Even though this cell seems empty, it contains automatic tests. Please do not remove this cell and just ignore it.

### Task 3.2 (1.5 points)

We also want to know about the false alarm rate of an IDS now. Implement the function `calc_false_alarm_rate(tn: int, fp: int) -> float`. The arguments to this function are the following:
- `tn` as `int`: the number of true negatives (i.e. the number of benign packets that did not raise an alarm)
- `fp` as `int`: the number of false positives (i.e. the number of benign packets detected as intrusions)

The function should return a `float` (between 0 and 1) representing the false alarm rate ( $P(alarm|\neg attack)$ ). If your computation runs into a `ZeroDivisionError`, return `-1.0` to indicate that this is not yet defined meaningfully.

*Hint: Take care that your function does not run into a `ZeroDivisionError`.*

In [20]:
def calc_false_alarm_rate(tn: int, fp: int) -> float:
    # False alarm rate = FPR = fp/(fp + tn)
    #
    # Again: Only case where we run into a 'ZeroDivisionError'
    # is when both 'tn' and 'fp' are 0
    if tn == 0 and fp == 0:
        return -1.0
    
    return fp/(fp + tn)
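
To see where the four counts come from, the following sketch derives them from per-packet ground truth and IDS verdicts and then computes both rates. The data is invented for illustration, and the helpers `confusion_counts` and `safe_ratio` are hypothetical names that are not part of the assignment.

```python
def confusion_counts(actual: list[bool], flagged: list[bool]) -> tuple[int, int, int, int]:
    """Return (tp, fn, tn, fp); True means 'attack' in `actual` and 'alarm' in `flagged`."""
    if len(actual) != len(flagged):
        raise ValueError("both lists must describe the same packets")
    pairs = list(zip(actual, flagged))
    tp = sum(a and f for a, f in pairs)          # attack, alarm raised
    fn = sum(a and not f for a, f in pairs)      # attack, no alarm
    tn = sum(not a and not f for a, f in pairs)  # benign, no alarm
    fp = sum(not a and f for a, f in pairs)      # benign, alarm raised
    return tp, fn, tn, fp


def safe_ratio(part: int, rest: int) -> float:
    """part / (part + rest), or -1.0 if the ratio is undefined."""
    total = part + rest
    return part / total if total else -1.0


# Invented example: 3 attacks and 5 benign packets
actual = [True, True, True, False, False, False, False, False]
flagged = [True, True, False, True, False, False, False, False]

tp, fn, tn, fp = confusion_counts(actual, flagged)
print("detection rate:  ", safe_ratio(tp, fn))
print("false alarm rate:", safe_ratio(fp, tn))
```

Note that the detection rate is normalised by the number of attacks and the false alarm rate by the number of benign packets, so the two rates do not have to sum to 1.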

In [21]:
### PLAYGROUND
# You can use this cell to test out your implementation. Everything in this cell will be ignored during grading.

print(calc_false_alarm_rate(95, 5))

0.05


In [22]:
# This test just checks the output of your solution

try:
    res0 = calc_false_alarm_rate(0, 0)
except ZeroDivisionError:
    assert False, "You ran into a ZeroDivisionError! In this case, you have to return -1.0"
assert res0 == -1.0, "You have to return -1.0 in the case of a ZeroDivisionError!"

res = calc_false_alarm_rate(95, 5)

assert type(res) == float, "The returned value is not a float"
assert (res <= 1.0 and res >= 0.0) or res == -1.0, "The returned value is not between 0.0 and 1.0 and is not -1.0"

In [None]:
# Even though this cell seems empty, it contains automatic tests. Please do not remove this cell and just ignore it.

### Task 3.3 (2.5 points)

One thing that we should consider when looking at IDS alarms is the probability that an alarm generated by the IDS is a false alarm (i.e. the probability that, if an alarm is raised, there is only a benign packet).

Implement the function `calc_valid_prob(intrusions_rate: float, valid_traffic_rate: float, detection_rate: float, false_alarm_rate: float) -> float`. The function has the following arguments:
- `intrusions_rate` as `float`: rate of actual intrusions in the traffic ( $P(attack)$ )
- `valid_traffic_rate` as `float`: rate of actual benign traffic ( $P(\neg attack)$ )
- `detection_rate` as `float`: rate of detections  ( $P(alarm|attack)$ )
- `false_alarm_rate` as `float`: rate of false alarms  ( $P(alarm|\neg attack)$ )

The function should return a `float` representing the probability that a triggered alarm is a false alarm ( $P(\neg attack|alarm)$ ).

In [23]:
def calc_valid_prob(intrusions_rate: float, valid_traffic_rate: float, detection_rate: float, false_alarm_rate: float) -> float:
    # Use Bayes Theorem: P(A | B) = (P(B | A) * P(A))/P(B)
    # In this case: P(not attack | alarm) = (P(alarm | not attack) * P(not attack))/P(alarm)
    #                                     = (P(alarm | not attack) * P(not attack)) / (P(alarm | not attack) * P(not attack) + P(alarm | attack) * P(attack))
    return (false_alarm_rate * valid_traffic_rate)/(false_alarm_rate * valid_traffic_rate + detection_rate * intrusions_rate)

In [24]:
### PLAYGROUND
# You can use this cell to test out your implementation. Everything in this cell will be ignored during grading.

calc_valid_prob(0.03, 0.94, 0.94, 0.06)

0.6666666666666667

In [25]:
# This test just checks the output of your solution

res = calc_valid_prob(0.03, 0.94, 0.94, 0.06)

assert type(res) == float, "The returned value is not a float"
assert res <= 1.0 and res >= 0.0, "The returned value is not between 0.0 and 1.0"

In [None]:
# Even though this cell seems empty, it contains automatic tests. Please do not remove this cell and just ignore it.

### Task 3.4 (2 points)

Assume a new high-tech IDS with a high detection rate and quite a low false positive rate in an astonishingly secure facility that has multiple layers of defense before packets reach this IDS. Hence, it usually sees few intrusion attempts but a lot of legitimate use, since it is also used for the internal work network.
What happens to the probability of an alert raised by this IDS being an actual incident? What is that phenomenon called? Explain it in your own words!

We would observe that the probability of an alarm being an _actual incident_ (i.e., a _true_ alarm) would **be very low**. This phenomenon is known as the **base-rate fallacy**: If the behaviour that we are looking for (i.e., _actual_ attacks) is _very rare_, even a _very small_ false positive rate will result in _most of the alerts raised_ being false alarms.\
This is because the very high share of valid traffic _skews_ the conditional probability of a false alarm towards $1$: The probability in question is
$$ P(\neg \text{attack} | \text{alarm}) = \frac{P(\text{alarm} | \neg \text{attack}) \cdot P(\neg \text{attack})}{P(\text{alarm} | \neg \text{attack}) \cdot P(\neg \text{attack}) + P(\text{alarm} | \text{attack}) \cdot P(\text{attack})} $$
...and if $P(\text{attack})$ is _very low_, the term $P(\text{alarm} | \text{attack}) \cdot P(\text{attack})$ becomes small compared to $P(\text{alarm} | \neg \text{attack}) \cdot P(\neg \text{attack})$, so the above fraction gets closer & closer to $1$. Correspondingly, $P(\text{attack} | \text{alarm}) = 1 - P(\neg \text{attack} | \text{alarm})$ gets closer & closer to $0$.
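
A small numerical sweep makes the effect tangible. The sketch below applies Bayes' theorem for a fixed, hypothetical IDS (99% detection rate, 1% false alarm rate; both numbers are assumptions chosen for illustration) and varies only the base rate of attacks. The function name `prob_attack_given_alarm` is likewise made up for this example.

```python
def prob_attack_given_alarm(p_attack: float, detection_rate: float, false_alarm_rate: float) -> float:
    """P(attack | alarm) via Bayes' theorem and the law of total probability."""
    p_benign = 1.0 - p_attack
    p_alarm = detection_rate * p_attack + false_alarm_rate * p_benign
    if p_alarm == 0.0:
        raise ValueError("no alarms are ever raised, so the posterior is undefined")
    return detection_rate * p_attack / p_alarm


# Same IDS quality throughout; only the share of attacks in the traffic changes.
for p_attack in (0.1, 0.01, 0.001, 0.0001):
    posterior = prob_attack_given_alarm(p_attack, detection_rate=0.99, false_alarm_rate=0.01)
    print(f"P(attack) = {p_attack:<7} -> P(attack | alarm) = {posterior:.4f}")
```

With these assumed rates, a base rate of 1% already means that only about every second alarm corresponds to a real attack, and the share keeps shrinking as attacks become rarer.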

## 4. Exam Example Tasks (0.5 points)

> Note: This task is awarded with 0.5 points only if it has been seriously addressed, regardless of whether the answer is correct.

### Task 4.1

**Argue** whether or not it would be possible to create a HIDS (Host-based IDS) rule that is generally able to detect IP spoofing.

Whether or not we are able to design such a rule **depends on what is meant** by "detect IP spoofing":
- In the **sending case** (i.e., we "want to prevent _our host_ from IP spoofing"), designing such a rule **is possible _if_** our host is _directly connected_ to the internet (& not behind some router or similar), since in this case, our host (or rather, our network card) _knows_ the "real" IP that is assigned to our host\
  In _all other cases_, however (e.g., where our host is behind some NAT router or similar), it **is _not possible_**, since our host (or rather, again, our network card) does _not_ know the "real" IP for the "outside world"
- In the **receiving case** (i.e., we want to prevent _IP-spoofed packets_ from entering the system), designing such a rule **is not possible**, since we _cannot know_ where the packet originated from

### Task 4.2

**Name** the different components of an Intrusion Detection System introduced in the lecture and briefly **describe** the purpose of each of these.

As per the lecture, an IDS has **three components**:
1. **Sensors**: Monitors that _collect data_ from (part of) the system
2. **Analyzers**: _Receive & store_ the data collected by one or more sensors and use it to _try to determine_ whether an intrusion _has occurred_
3. **User Interface**: An "output" that enables the user to _view the outputs from all analyzers_ to see the status of the system

### Task 4.3

Consider the following Unix file access situation:

```
--w-r---w- david   groupB file1
-r--r----x bob     groupA file2
----r----- charlie groupD file3
drwxrwx--- alice   groupB dir1
-rw--w-r-- alice   groupF dir1/file4

groupA = {alice, david}
groupB = {bob, charlie}
groupC = {charlie}
groupD = {david}
groupE = {bob, david}
groupF = {alice, bob}
```

**List** all files which the user `bob` can write to directly or indirectly (after legal permission changes done by `bob`). For each file you list, also briefly comment on the reason why the user has write access to that file.

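`file2` & `dir1/file4`:
- `file2`: `bob` is the **owner**. The owner bits (`r--`) do not grant write access directly, but as the owner, `bob` may legally `chmod` the file to give himself write permission (indirect access)
- `dir1/file4`: `bob` is a member of `groupF`, whose bits (`-w-`) grant write access directly; the required `x` permission on `dir1` comes from `bob`'s membership in `groupB` (`rwx`)

If one counts directories as well, `bob` can also write to `dir1` itself (i.e., add & remove entries) via `groupB` (`rwx`).

`file1` is **not** writable by `bob`: UNIX evaluates only the _first matching_ class (owner, then group, then others). `bob` is in `groupB`, so the group bits (`r--`) apply, and the `-w-` for others is never consulted. `file3` is not writable either, since `bob` is neither the owner nor in `groupD`, and others have no rights.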
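
The situation can also be checked mechanically. The sketch below encodes the listing from the task and applies the usual UNIX rule that only the first matching class (owner, then group, then others) is evaluated. It is a simplified model under stated assumptions: "indirect" only covers the owner changing the mode of the entry itself, and the helper names are hypothetical.

```python
GROUPS = {
    "groupA": {"alice", "david"},
    "groupB": {"bob", "charlie"},
    "groupC": {"charlie"},
    "groupD": {"david"},
    "groupE": {"bob", "david"},
    "groupF": {"alice", "bob"},
}

# path -> (mode string as printed by `ls -l`, owner, group)
ENTRIES = {
    "file1": ("--w-r---w-", "david", "groupB"),
    "file2": ("-r--r----x", "bob", "groupA"),
    "file3": ("----r-----", "charlie", "groupD"),
    "dir1": ("drwxrwx---", "alice", "groupB"),
    "dir1/file4": ("-rw--w-r--", "alice", "groupF"),
}


def effective_triplet(user: str, path: str) -> str:
    """Return the rwx triplet of the first class that matches the user."""
    mode, owner, group = ENTRIES[path]
    if user == owner:
        return mode[1:4]
    if user in GROUPS[group]:
        return mode[4:7]
    return mode[7:10]


def can_reach(user: str, path: str) -> bool:
    """The user needs `x` on every parent directory of the path."""
    parts = path.split("/")
    parents = ["/".join(parts[:i]) for i in range(1, len(parts))]
    return all("x" in effective_triplet(user, parent) for parent in parents)


def write_access(user: str, path: str) -> str:
    if not can_reach(user, path):
        return "none"
    if "w" in effective_triplet(user, path):
        return "direct"
    if user == ENTRIES[path][1]:
        return "indirect"  # the owner may chmod the entry
    return "none"


for path in ENTRIES:
    print(f"{path:<11} {write_access('bob', path)}")
```

Running the loop for other users (e.g. `'david'`) shows how the first-match rule can leave an owner or group member with fewer rights than "others".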

## 5. Feedback (0.5 points)

> Note: This task is awarded with 0.5 points only if there is feedback in any form to this exercise. It is also okay to state what was especially nice or rewarding. Literally just writing "anything" is not enough. Any feedback will improve the assignments.

Finally, you completed all the assignments! Since we want to know how it went and how we might improve the exercises, we include the following task. Here, you can write constructive feedback! You even get 0.5 points for it if you write some feedback. But don't worry, we do not grade the content itself!

This time around, even if we had to work with `scapy` again, it was _way better_ than previous `scapy`-related exercises, since we did not have to search through the documentation for hours to find the _one magic function_ that does all the work asked from us in the exercise.\
Task 1 was _okay_ (we will remark that there is not much you can do regarding coding-related exercises for UNIX file system permissions), with the theory question being nice. For Task 2, see above. Task 3 was a nice repetition of the lecture contents as well as (again probably unintentionally) a welcome repetition of the "Evaluation of Supervised Learning" lectures from the "Elements of Machine Learning & Data Science" course.

Overall: A _very nice_ exercise!