# Jupyter Playbooks for Suricata

* Markus Kont
* Stamus Networks
* github.com/markuskont
* https://twitter.com/markuskont

## Introduction

* Introduce a tool
  * not for experienced data scientists
  * spark some ideas
* Focus on use-cases around Suricata
  * no iris dataset
* Might not have time to cover everything
  * presentation is meant to be a resource

### Fmt.presentation()

 * Presentation **IS** a notebook
 * it is public
 * code examples are live
 * all data is generated by the notebook
   * (from `malware-traffic-analysis.net`)

### whoami

* 2011: Server Administrator
  * Now fully recovered
* 2014: Cyber Security MSc, TalTech
  * 2015: PhD candidate
  * academia was not for me
* 2015: Technology Branch Researcher, NATO CCDCOE
  * trainings, exercises, research
  * met Eric Leblond in 2016
  * later met Peter Manev during SuriCon
* 2020: Developer & Threat Researcher, Stamus Networks
  * analytics, suricata rules, detection methods, devops, backend dev, ...
  * ...insert random fancy title here...
  * in short, resident hacker

## Hello Jupyter

 * Initially IPython Notebooks
   * interactive coding
   * instant feedback
 * Then rebranded to Jupyter
   * de facto tool for data scientists
 * Supports different *kernels*
   * R
   * nodejs
   * julia
   * Go
   * ...

### pip install jupyter

#### Basic concepts

 * Notebooks are organized into *cells*
 * A *cell* can be *code* or *markdown*
 * Code cells are executed by a *kernel*
 * JupyterLab works like an IDE
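
Under the hood, a notebook file is plain JSON. The sketch below builds a minimal two-cell notebook by hand, using only the standard library. The cell contents are made up for illustration, and in practice Jupyter itself (or the `nbformat` package) writes this structure for you.

```python
import json

# A minimal notebook in the version 4 file format: a list of cells plus
# metadata. The cell contents are illustrative only.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Hello Jupyter"],
        },
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,  # filled in once the kernel runs the cell
            "outputs": [],            # kernel output is stored here
            "source": ["print('hello from the kernel')"],
        },
    ],
}

# Serialize the way it would be stored in an .ipynb file
serialized = json.dumps(notebook, indent=1)

# Read it back and list the type of each cell
cell_types = [cell["cell_type"] for cell in json.loads(serialized)["cells"]]
print(cell_types)
```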

#### Installing

```
pip install jupyter jupyterlab
```

#### Starting it up

```
(general) ➜  suricata-analytics-1 git:(next-suricon-2022-10-28) ✗ jupyter lab
[I 2022-10-30 06:10:48.141 ServerApp] jupyterlab | extension was successfully linked.
[I 2022-10-30 06:10:48.150 ServerApp] nbclassic | extension was successfully linked.
[I 2022-10-30 06:10:48.170 LabApp] JupyterLab extension loaded from /home/markus/venvs/general/lib/python3.10/site-packages/jupyterlab
[I 2022-10-30 06:10:48.170 LabApp] JupyterLab application directory is /home/markus/venvs/general/share/jupyter/lab
[I 2022-10-30 06:10:48.173 ServerApp] jupyterlab | extension was successfully loaded.
[I 2022-10-30 06:10:48.177 ServerApp] nbclassic | extension was successfully loaded.
[I 2022-10-30 06:10:48.177 ServerApp] The port 8888 is already in use, trying another port.
[I 2022-10-30 06:10:48.178 ServerApp] Serving notebooks from local directory: /home/markus/Projects/SN/suricata-analytics-1
[I 2022-10-30 06:10:48.178 ServerApp] Jupyter Server 1.21.0 is running at:
[I 2022-10-30 06:10:48.178 ServerApp] http://localhost:8889/lab?token=b675c4daec9a6c2beb11b0a6cd38a314509ae62b1989b2e2
[I 2022-10-30 06:10:48.178 ServerApp]  or http://127.0.0.1:8889/lab?token=b675c4daec9a6c2beb11b0a6cd38a314509ae62b1989b2e2
[I 2022-10-30 06:10:48.178 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 2022-10-30 06:10:48.216 ServerApp]

    To access the server, open this file in a browser:
        file:///home/markus/.local/share/jupyter/runtime/jpserver-395207-open.html
    Or copy and paste one of these URLs:
        http://localhost:8889/lab?token=b675c4daec9a6c2beb11b0a6cd38a314509ae62b1989b2e2
     or http://127.0.0.1:8889/lab?token=b675c4daec9a6c2beb11b0a6cd38a314509ae62b1989b2e2
Opening in existing browser session.
```

#### Code

It is SuriCon, so let's start the demo by downloading a PCAP file with **pure Python**. The purpose is to:

* show simple Python code in a notebook;
* get initial input for the next *slides*.

Firstly, import supporting libraries.

In [83]:
import requests
from zipfile import ZipFile

Then define download link and output path as variables.

In [84]:
URL = "https://malware-traffic-analysis.net/2022/01/03/2022-01-01-thru-03-server-activity-with-log4j-attempts.pcap.zip"
OUTPUT = "/tmp/malware-pcap.zip"

Download and store the file. Notice the real-time output as the code gets evaluated.

In [85]:
response = requests.get(URL, stream=True)
if response.status_code == 200:
    print("Download good, writing %d KBytes to %s" % 
          (int(response.headers.get("Content-length")) / 1024,
           OUTPUT))
    with open(OUTPUT, 'wb') as f:
        f.write(response.raw.read())
    print("Done")
else:
    print("Demo effect has kicked in")

Download good, writing 1254 KBytes to /tmp/malware-pcap.zip
Done
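
The cell above reads the whole response into memory before writing it out. For larger captures a chunked write is usually preferable. The sketch below assumes a hypothetical helper, `save_chunks`, that accepts any iterable of byte chunks, such as the one returned by `response.iter_content(chunk_size=8192)` in `requests`. Here it is fed made-up in-memory data so that it runs offline.

```python
import os
import tempfile


def save_chunks(chunks, path):
    """Write an iterable of byte chunks to path and return the bytes written."""
    written = 0
    with open(path, "wb") as handle:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                handle.write(chunk)
                written += len(chunk)
    return written


# Stand-in for response.iter_content(chunk_size=8192); the data is made up
fake_chunks = [b"PK\x03\x04", b"", b"not a real archive"]

target = os.path.join(tempfile.mkdtemp(), "demo.bin")
total = save_chunks(fake_chunks, target)
print("wrote %d bytes to %s" % (total, target))
```

With a real download, the call would look like `save_chunks(response.iter_content(chunk_size=8192), OUTPUT)`.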


Then unzip the archive.

In [86]:
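file_name = OUTPUT
# Extract into /tmp; the archive is password protected.
# The handle is named "archive" to avoid shadowing the built-in zip().
with ZipFile(file_name, "r") as archive:
    archive.extractall(path="/tmp", pwd="infected".encode("utf-8"))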
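
Before extracting an archive from an untrusted source, it can help to look at what is inside. The following sketch builds a small throwaway archive so that it runs offline. Unlike the real sample it is not password protected, and the member name `demo.pcap` is made up.

```python
import os
import tempfile
from zipfile import ZipFile

workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "demo.zip")

# Build a throwaway archive with one made-up member
with ZipFile(archive_path, "w") as archive:
    archive.writestr("demo.pcap", b"placeholder bytes")

with ZipFile(archive_path, "r") as archive:
    # Inspect member names and sizes before extracting anything
    members = [(info.filename, info.file_size) for info in archive.infolist()]
    # For an encrypted archive, also pass pwd=b"..." to extractall()
    archive.extractall(path=workdir)

extracted = os.path.join(workdir, "demo.pcap")
print(members)
```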

Find the PCAP and store for later use.

In [87]:
import glob
FILES = glob.glob("/tmp/*.pcap")
FILES

['/tmp/2022-01-01-thru-03-server-activity-with-log4j-attempts.pcap']

In [88]:
print(FILES[0])

/tmp/2022-01-01-thru-03-server-activity-with-log4j-attempts.pcap


#### Invoking a Shell command

* Writing code to do some simple things can be a hassle
* Jupyter provides some helpers
    * `%` calls builtin magic commands
    * `!` invokes any shell command
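
Outside a notebook, `!` has no meaning. The closest plain-Python equivalent is the `subprocess` module. A minimal sketch, using the Python interpreter itself as the command so that it runs on any platform:

```python
import subprocess
import sys

# Roughly what `!python -c "print(...)"` does in a notebook cell,
# except that the output is captured instead of shown under the cell
result = subprocess.run(
    [sys.executable, "-c", "print('hello from a subprocess')"],
    capture_output=True,
    text=True,
    check=True,  # raise if the command exits with a non-zero status
)
output_lines = result.stdout.splitlines()
print(output_lines)
```

Within IPython, an assignment such as `lines = !some-command` captures the output into a list-like object in a similar way.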

For example, we need a Suricata ruleset to proceed with the presentation.

In [89]:
%pip install suricata-update

Note: you may need to restart the kernel to use updated packages.


In [90]:
!/home/jovyan/.local/bin/suricata-update

30/10/2022 -- 05:16:55 - <Info> -- Using data-directory /var/lib/suricata.
30/10/2022 -- 05:16:55 - <Info> -- Using Suricata configuration /etc/suricata/suricata.yaml
30/10/2022 -- 05:16:55 - <Info> -- Using /opt/suricata/share/suricata/rules for Suricata provided rules.
30/10/2022 -- 05:16:55 - <Info> -- Found Suricata version 7.0.0-beta1 at /opt/suricata/bin/suricata.
30/10/2022 -- 05:16:55 - <Info> -- Loading /etc/suricata/suricata.yaml
30/10/2022 -- 05:16:55 - <Info> -- Disabling rules for protocol pgsql
30/10/2022 -- 05:16:55 - <Info> -- Disabling rules for protocol modbus
30/10/2022 -- 05:16:55 - <Info> -- Disabling rules for protocol dnp3
30/10/2022 -- 05:16:55 - <Info> -- Disabling rules for protocol enip
30/10/2022 -- 05:16:55 - <Info> -- No sources configured, wil

In [91]:
!rm -rf /tmp/logs && mkdir /tmp/logs

In [92]:
!suricata -S /var/lib/suricata/rules/suricata.rules -l /tmp/logs -r /tmp/2022-01-01-thru-03-server-activity-with-log4j-attempts.pcap -v

30/10/2022 -- 05:16:59 - <Notice> - This is Suricata version 7.0.0-beta1 RELEASE running in USER mode
30/10/2022 -- 05:16:59 - <Info> - CPUs/cores online: 12
30/10/2022 -- 05:17:00 - <Info> - fast output device (regular) initialized: fast.log
30/10/2022 -- 05:17:00 - <Info> - eve-log output device (regular) initialized: eve.json
30/10/2022 -- 05:17:00 - <Info> - stats output device (regular) initialized: stats.log
30/10/2022 -- 05:17:04 - <Info> - 1 rule files processed. 28761 rules successfully loaded, 0 rules failed
30/10/2022 -- 05:17:04 - <Info> - Threshold config parsed: 0 rule(s) found
30/10/2022 -- 05:17:04 - <Info> - 28764 signatures processed. 1183 are IP-only rules, 5166 are inspecting packet payload, 22211 inspect application layer, 108 are decoder event only
30/10/2022 -- 05:17:12 - <

## Meerkat on Jupyter

### Import pandas as pd

* `pandas` is a Python library that provides *dataframes*
* more than a library, it is practically a language of its own
* think R and Julia
* forget what you know about for loops
  * but it's totally worth it!
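
To make the point about for loops concrete, here is a small sketch with made-up flow counters. Both computations yield the same totals, but the second one expresses the operation on whole columns at once, which is the idiomatic pandas style.

```python
import pandas as pd

# Made-up flow records, using EVE-style column names
flows = pd.DataFrame({
    "flow.bytes_toserver": [54, 120, 66168],
    "flow.bytes_toclient": [0, 80, 25717],
})

# Loop style: iterate row by row
totals_loop = []
for _, row in flows.iterrows():
    totals_loop.append(row["flow.bytes_toserver"] + row["flow.bytes_toclient"])

# Vectorized style: one expression over entire columns
flows["bytes_total"] = flows["flow.bytes_toserver"] + flows["flow.bytes_toclient"]

print(flows["bytes_total"].tolist())
```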

In [93]:
%pip install pandas

Note: you may need to restart the kernel to use updated packages.


In [94]:
import pandas as pd
import numpy as np

In [95]:
import json

with open("/tmp/logs/eve.json", "r") as handle:
    DATA = [json.loads(line) for line in handle]
    DF = pd.json_normalize(DATA)
DF

| | timestamp | flow_id | event_type | src_ip | src_port | dest_ip | dest_port | proto | flow.pkts_toserver | flow.pkts_toclient | ... | stats.app_layer.error.nfs_udp.internal | stats.app_layer.error.krb5_udp.alloc | stats.app_layer.error.krb5_udp.parser | stats.app_layer.error.krb5_udp.internal | stats.app_layer.expectations | stats.http.memuse | stats.http.memcap | stats.ftp.memuse | stats.ftp.memcap | stats.file_store.open_files |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2022-01-01T00:00:13.076985+0000 | 1.777836e+15 | flow | 178.175.173.166 | 43719.0 | 198.71.247.91 | 23.0 | TCP | 1.0 | 0.0 | ... | | | | | | | | | | |
| 1 | 2022-01-01T00:01:49.092097+0000 | 1.521455e+15 | dns | 209.141.58.15 | 35550.0 | 198.71.247.91 | 53.0 | UDP | | | ... | | | | | | | | | | |
| 2 | 2022-01-01T00:00:13.076985+0000 | 6.780573e+14 | flow | 54.83.160.152 | | 198.71.247.91 | | ICMP | 2.0 | 2.0 | ... | | | | | | | | | | |
| 3 | 2022-01-01T00:00:13.076985+0000 | 2.053504e+15 | flow | 178.175.173.166 | 43719.0 | 198.71.247.91 | 23.0 | TCP | 3.0 | 0.0 | ... | | | | | | | | | | |
| 4 | 2022-01-01T00:00:13.076985+0000 | 8.198441e+13 | flow | 3.81.214.180 | | 198.71.247.91 | | ICMP | 1.0 | 1.0 | ... | | | | | | | | | | |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 25887 | 2022-01-01T00:00:13.076985+0000 | 1.679968e+15 | flow | 185.220.205.106 | 53104.0 | 198.71.247.91 | 16010.0 | TCP | 1.0 | 0.0 | ... | | | | | | | | | | |
| 25888 | 2022-01-01T00:00:13.076985+0000 | 1.250100e+15 | flow | 167.94.146.23 | 2935.0 | 198.71.247.91 | 21093.0 | TCP | 1.0 | 0.0 | ... | | | | | | | | | | |
| 25889 | 2022-01-01T00:00:13.076985+0000 | 1.568845e+15 | flow | 104.140.188.6 | 63263.0 | 198.71.247.91 | 593.0 | TCP | 1.0 | 0.0 | ... | | | | | | | | | | |
| 25890 | 2022-01-01T00:00:13.076985+0000 | 1.354858e+15 | flow | 89.248.165.56 | 49405.0 | 198.71.247.91 | 3396.0 | TCP | 1.0 | 0.0 | ... | | | | | | | | | | |
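
The dotted column names in the output come from `pd.json_normalize`, which flattens nested JSON objects. A minimal sketch with two hand-written, heavily trimmed EVE-like records (the values are illustrative, not taken from the capture):

```python
import json
import pandas as pd

# Two trimmed, made-up EVE-style lines as they might appear in eve.json
lines = [
    '{"event_type": "flow", "proto": "TCP", "flow": {"pkts_toserver": 1}}',
    '{"event_type": "dns", "proto": "UDP", "dns": {"id": 4242}}',
]

records = [json.loads(line) for line in lines]
df = pd.json_normalize(records)

# Nested keys become dotted columns; fields missing from a record are NaN
columns = sorted(df.columns)
missing = df["flow.pkts_toserver"].isna().tolist()
print(columns)
print(missing)
```

This also explains why the real frame is so wide and sparse: every event type contributes its own set of columns, and each row only fills the ones that apply to it.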


In [96]:
len(DF)

25892

In [97]:
len(DF.columns.values)

557

In [98]:
len([c for c in list(DF.columns.values) if not c.startswith("stats")])

120
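
The same filter can be expressed on the column index directly, which also yields a frame without the `stats` columns. The sketch below uses a tiny made-up frame, and `event_df` is a hypothetical name.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for the flattened EVE frame; values are made up
df = pd.DataFrame({
    "event_type": ["flow", "stats"],
    "src_ip": ["192.0.2.1", np.nan],
    "stats.http.memuse": [np.nan, 0.0],
    "stats.ftp.memuse": [np.nan, 0.0],
})

# Boolean mask over column names instead of a list comprehension
is_stats = df.columns.str.startswith("stats")
event_df = df.loc[:, ~is_stats]

print(len(event_df.columns))
print(df["event_type"].value_counts().to_dict())
```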

In [99]:
DF.head(5)

| | timestamp | flow_id | event_type | src_ip | src_port | dest_ip | dest_port | proto | flow.pkts_toserver | flow.pkts_toclient | ... | stats.app_layer.error.nfs_udp.internal | stats.app_layer.error.krb5_udp.alloc | stats.app_layer.error.krb5_udp.parser | stats.app_layer.error.krb5_udp.internal | stats.app_layer.expectations | stats.http.memuse | stats.http.memcap | stats.ftp.memuse | stats.ftp.memcap | stats.file_store.open_files |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2022-01-01T00:00:13.076985+0000 | 1777836000000000.0 | flow | 178.175.173.166 | 43719.0 | 198.71.247.91 | 23.0 | TCP | 1.0 | 0.0 | ... | | | | | | | | | | |
| 1 | 2022-01-01T00:01:49.092097+0000 | 1521455000000000.0 | dns | 209.141.58.15 | 35550.0 | 198.71.247.91 | 53.0 | UDP | | | ... | | | | | | | | | | |
| 2 | 2022-01-01T00:00:13.076985+0000 | 678057300000000.0 | flow | 54.83.160.152 | | 198.71.247.91 | | ICMP | 2.0 | 2.0 | ... | | | | | | | | | | |
| 3 | 2022-01-01T00:00:13.076985+0000 | 2053504000000000.0 | flow | 178.175.173.166 | 43719.0 | 198.71.247.91 | 23.0 | TCP | 3.0 | 0.0 | ... | | | | | | | | | | |
| 4 | 2022-01-01T00:00:13.076985+0000 | 81984410000000.0 | flow | 3.81.214.180 | | 198.71.247.91 | | ICMP | 1.0 | 1.0 | ... | | | | | | | | | | |


In [100]:
DF.describe()

| | flow_id | src_port | dest_port | flow.pkts_toserver | flow.pkts_toclient | flow.bytes_toserver | flow.bytes_toclient | flow.age | pcap_cnt | dns.id | ... | stats.app_layer.error.nfs_udp.internal | stats.app_layer.error.krb5_udp.alloc | stats.app_layer.error.krb5_udp.parser | stats.app_layer.error.krb5_udp.internal | stats.app_layer.expectations | stats.http.memuse | stats.http.memcap | stats.ftp.memuse | stats.ftp.memcap | stats.file_store.open_files |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 25891.0 | 23527.0 | 23527.0 | 23239.0 | 23239.0 | 23239.0 | 23239.0 | 23092.0 | 2757.0 | 59.0 | ... | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| mean | 1137196000000000.0 | 42148.699834 | 14506.184766 | 1.3146 | 0.391368 | 111.0423 | 51.658978 | 4.376234 | 21663.409503 | 25804.694915 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| std | 646719900000000.0 | 17976.408575 | 17824.568247 | 1.841267 | 1.574928 | 710.331571 | 409.521395 | 37.486077 | 12314.156788 | 22754.924353 | ... | | | | | | | | | | |
| min | 1631468000.0 | 0.0 | 0.0 | 1.0 | 0.0 | 42.0 | 0.0 | 0.0 | 18.0 | 1.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 25% | 588955900000000.0 | 36116.5 | 1701.0 | 1.0 | 0.0 | 54.0 | 0.0 | 0.0 | 10651.0 | 5463.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 50% | 1131389000000000.0 | 48856.0 | 7000.0 | 1.0 | 0.0 | 54.0 | 0.0 | 0.0 | 23266.0 | 16765.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 75% | 1695444000000000.0 | 54820.0 | 22029.5 | 1.0 | 0.0 | 58.0 | 0.0 | 0.0 | 33542.0 | 43119.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| max | 2251779000000000.0 | 65531.0 | 65528.0 | 134.0 | 91.0 | 66168.0 | 25717.0 | 899.0 | 39193.0 | 64206.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |


### Threat hunting

### Advanced Analytics

### Ruleset analysis