As a preface, let's install and import the libraries we need.

`python3 -m pip install --user pandas numpy idstools`

In [1]:
import pandas as pd
import numpy as np

from idstools import rule

import glob
import json
import os
import re

# ET Open

Download ET Open ruleset. 
```
wget https://rules.emergingthreats.net/open/suricata-6.0.1/emerging.rules.tar.gz
```

And unpack.

```
mkdir /tmp/etopen
tar -xzf emerging.rules.tar.gz -C /tmp/etopen
```

In [2]:
!wget -q -O /tmp/etopen.tgz https://rules.emergingthreats.net/open/suricata-6.0.1/emerging.rules.tar.gz

In [3]:
!mkdir -p /tmp/etopen
!tar -xzf /tmp/etopen.tgz -C /tmp/etopen

Note that the directory under `/tmp` must match the path used in the following `glob` call, which builds a Python list of all rule files.

In [4]:
RULES_LIST_ET_OPEN = glob.glob("/tmp/etopen/rules/*.rules")
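One way to keep the unpack directory and the `glob` pattern from drifting apart is to derive both from a single constant. The following is only a minimal sketch: the names `ET_OPEN_DIR` and `list_rule_files` are hypothetical and not part of the original notebook.

```python
import glob
import os

# Hypothetical constant: the single place where the unpack directory is defined.
ET_OPEN_DIR = "/tmp/etopen"


def list_rule_files(base_dir: str) -> list:
    """Return a sorted list of *.rules files found under <base_dir>/rules."""
    pattern = os.path.join(base_dir, "rules", "*.rules")
    return sorted(glob.glob(pattern))


RULE_FILES = list_rule_files(ET_OPEN_DIR)
print(len(RULE_FILES), "rule files found")
```

The same constant could then be reused in the `mkdir` and `tar` steps, so changing the location only requires one edit.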

Then sort the list to get an organized view of the rule files.

In [5]:
sorted(RULES_LIST_ET_OPEN)

['/tmp/etopen/rules/3coresec.rules',
 '/tmp/etopen/rules/botcc.portgrouped.rules',
 '/tmp/etopen/rules/botcc.rules',
 '/tmp/etopen/rules/ciarmy.rules',
 '/tmp/etopen/rules/compromised.rules',
 '/tmp/etopen/rules/drop.rules',
 '/tmp/etopen/rules/dshield.rules',
 '/tmp/etopen/rules/emerging-activex.rules',
 '/tmp/etopen/rules/emerging-adware_pup.rules',
 '/tmp/etopen/rules/emerging-attack_response.rules',
 '/tmp/etopen/rules/emerging-chat.rules',
 '/tmp/etopen/rules/emerging-coinminer.rules',
 '/tmp/etopen/rules/emerging-current_events.rules',
 '/tmp/etopen/rules/emerging-deleted.rules',
 '/tmp/etopen/rules/emerging-dns.rules',
 '/tmp/etopen/rules/emerging-dos.rules',
 '/tmp/etopen/rules/emerging-exploit.rules',
 '/tmp/etopen/rules/emerging-exploit_kit.rules',
 '/tmp/etopen/rules/emerging-ftp.rules',
 '/tmp/etopen/rules/emerging-games.rules',
 '/tmp/etopen/rules/emerging-hunting.rules',
 '/tmp/etopen/rules/emerging-icmp.rules',
 '/tmp/etopen/rules/emerging-icmp_info.rules',
 '/tmp/etopen

Next, parse each rule file with `idstools` and build a Python dictionary where the keys are rule file paths and the values are lists of parsed rules.

In [6]:
%time PARSED_ET_OPEN = {k: rule.parse_file(k) for k in RULES_LIST_ET_OPEN}

CPU times: user 1.14 s, sys: 70.9 ms, total: 1.21 s
Wall time: 1.21 s


Consider the following parsed rule. Notice how much information can be extracted from it. The reader should already be familiar with the sequential option list.

In [7]:
print(
    json.dumps(
        PARSED_ET_OPEN["/tmp/etopen/rules/emerging-malware.rules"][0], 
        indent=2
    )
)

{
  "enabled": false,
  "action": "alert",
  "direction": "->",
  "group": null,
  "gid": 1,
  "sid": 2009172,
  "rev": 2,
  "msg": "ET MALWARE Psyb0t joining an IRC Channel",
  "flowbits": [
    "isset,is_proto_irc"
  ],
  "metadata": [
    "created_at 2010_07_30",
    "updated_at 2010_07_30"
  ],
  "references": [
    "url,www.adam.com.au/bogaurd/PSYB0T.pdf",
    "url,doc.emergingthreats.net/2009172"
  ],
  "classtype": "trojan-activity",
  "priority": 0,
  "options": [
    {
      "name": "msg",
      "value": "\"ET MALWARE Psyb0t joining an IRC Channel\""
    },
    {
      "name": "flow",
      "value": "established,to_server"
    },
    {
      "name": "flowbits",
      "value": "isset,is_proto_irc"
    },
    {
      "name": "content",
      "value": "\"JOIN #mipsel\""
    },
    {
      "name": "reference",
      "value": "url,www.adam.com.au/bogaurd/PSYB0T.pdf"
    },
    {
      "name": "reference",
      "value": "url,doc.emergingthreats.net/2009172"
    },
    {
      "name

## High level view

Raw data structures can be difficult for human eyes to grasp. At a small scale they are fine, but things get complex once you consider that ET Open contains over 31 **thousand** rules. Aggregations presented in row-column format can help us out here.

For that, we can use the `pandas` package, which implements **data frames** in Python and is great for data wrangling and exploration. The following block creates a new pandas data frame and initializes columns of counters per rule file. For now, we're only interested in the total number of rules, the number of enabled rules and the number of disabled rules.

In [8]:
DF_HIGH_LEVEL = pd.DataFrame()
DF_HIGH_LEVEL["file"] = list(PARSED_ET_OPEN.keys())
DF_HIGH_LEVEL["rules_total_count"] = list([len(v) for v in PARSED_ET_OPEN.values()])
DF_HIGH_LEVEL["rules_disabled_count"] = list([len([item for item in v if not item.enabled]) for v in PARSED_ET_OPEN.values()])
DF_HIGH_LEVEL["rules_enabled_count"] = list([len([item for item in v if item.enabled]) for v in PARSED_ET_OPEN.values()])
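As an alternative sketch, the same counters could be collected in a single pass over the parsed rules and handed to pandas as a list of records. The toy `parsed` dictionary below is a hypothetical stand-in for `PARSED_ET_OPEN`, used only so the block runs on its own.

```python
import pandas as pd

# Toy stand-in for PARSED_ET_OPEN: file name -> list of rule dicts.
# (Hypothetical data; the real values come from idstools.)
parsed = {
    "a.rules": [{"enabled": True}, {"enabled": False}, {"enabled": True}],
    "b.rules": [{"enabled": False}],
}

records = []
for filename, rules in parsed.items():
    # Count enabled rules once; disabled is simply the remainder.
    enabled = sum(1 for r in rules if r["enabled"])
    records.append({
        "file": filename,
        "rules_total_count": len(rules),
        "rules_disabled_count": len(rules) - enabled,
        "rules_enabled_count": enabled,
    })

df_high_level = pd.DataFrame(records)
print(df_high_level.sort_values(by="rules_enabled_count", ascending=False))
```

Building one record per file keeps the three counters together, which makes it harder for the columns to get out of step with each other.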

Then present the data frame sorted by the number of enabled rules per file.

In [9]:
DF_HIGH_LEVEL.sort_values(by=["rules_enabled_count"], ascending=False)

Unnamed: 0,file,rules_total_count,rules_disabled_count,rules_enabled_count
2,/tmp/etopen/rules/emerging-malware.rules,8853,2454,6399
15,/tmp/etopen/rules/emerging-web_specific_apps.r...,5577,748,4829
46,/tmp/etopen/rules/tor.rules,963,0,963
0,/tmp/etopen/rules/emerging-phishing.rules,1020,100,920
9,/tmp/etopen/rules/emerging-exploit.rules,1101,293,808
41,/tmp/etopen/rules/emerging-policy.rules,1092,324,768
42,/tmp/etopen/rules/emerging-web_server.rules,711,106,605
50,/tmp/etopen/rules/emerging-mobile_malware.rules,680,76,604
12,/tmp/etopen/rules/emerging-info.rules,622,45,577
4,/tmp/etopen/rules/emerging-adware_pup.rules,1118,584,534


Each column of counters is a vector that can be summed up for total counts.

In [10]:
print("Enabled: {} Disabled: {} Total: {}".format(
    DF_HIGH_LEVEL.rules_enabled_count.sum(),
    DF_HIGH_LEVEL.rules_disabled_count.sum(),
    DF_HIGH_LEVEL.rules_total_count.sum(),
))

Enabled: 21029 Disabled: 10611 Total: 31640


## Dig into specific rule files and threats

Okay, now let's try to get information about the rules themselves.

Before getting started, note that `idstools` parses some information that is not terribly useful (like `action` and `direction`) while leaving other, more useful pieces unparsed. Specifically, the `header` field holds the protocol, source network and destination network (`proto`, `src_net` and `dest_net` below). The following helper function extracts that information.

In [11]:
def extract_header(header: str) -> dict:
    split = header.split()
    return {
        "proto": split[1],
        "src_net": split[2],
        "src_port": split[3],
        "dest_net": split[5],
        "dest_port": split[6]
    }
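To make the positional indexing concrete, here is a small usage sketch. It restates the helper so the block runs on its own, and feeds it a header in the usual `action proto src_net src_port direction dest_net dest_port` layout.

```python
def extract_header(header: str) -> dict:
    # Same logic as the helper above: split on whitespace, pick fields by position.
    # Position 0 is the action and position 4 is the direction arrow.
    split = header.split()
    return {
        "proto": split[1],
        "src_net": split[2],
        "src_port": split[3],
        "dest_net": split[5],
        "dest_port": split[6],
    }


fields = extract_header("alert udp $HOME_NET any -> $EXTERNAL_NET 1434")
print(fields)
# {'proto': 'udp', 'src_net': '$HOME_NET', 'src_port': 'any',
#  'dest_net': '$EXTERNAL_NET', 'dest_port': '1434'}
```

Keep in mind that this relies on plain whitespace splitting. If a header were to contain spaces inside an address or port group, the positions would shift and the helper would return wrong fields, so a more defensive parser may be needed for rulesets that use that style.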

Then build a list of all rules, adding the cleaned-up filename and the parsed `header` fields to each rule dictionary.

In [12]:
ALL_ET_OPEN_RULES = []

for filename, rules in PARSED_ET_OPEN.items():
    for r in rules:
        r["file"] = os.path.basename(filename)
        r = {**r, **extract_header(r.get("header"))}
        ALL_ET_OPEN_RULES.append(r)

Rather than attempting to inspect a 31k-element list, we'll turn the whole thing into a data frame.

In [13]:
DF_ET_OPEN_ALL = pd.DataFrame(ALL_ET_OPEN_RULES)

Filter for enabled rules only. Rules are always commented out for a reason!
* false positives;
* bad performance;
* out of date and irrelevant.

In [14]:
DF_ET_OPEN_ALL = DF_ET_OPEN_ALL.loc[DF_ET_OPEN_ALL.enabled == True]

Now take a quick peek at the ruleset, just to see what we can work with. Clearly we need more filtering and a proper selection of columns. All those *sticky buffer* and *content modifier* columns are useless here: these keywords only apply to the `content` keyword and carry no values of their own, so all those vectors are empty.

In [15]:
DF_ET_OPEN_ALL.head(5)

Unnamed: 0,enabled,action,direction,group,gid,sid,rev,msg,flowbits,metadata,...,ssh.softwareversion,ipopts,http_host,sameip,detection_filter,asn1,dce_iface,ssl_state,tls_sni,tls.version
7,True,alert,->,,1,2018334,2,ET PHISHING Possible Phish - Saved Website Com...,[],"[created_at 2014_03_31, former_category INFO, ...",...,,,,,,,,,,
12,True,alert,->,,1,2020623,3,ET PHISHING Possible Tsukuba Banker Edwards Pa...,[],"[created_at 2015_03_05, updated_at 2015_03_05]",...,,,,,,,,,,
13,True,alert,->,,1,2025004,2,ET PHISHING Google Drive Phishing Landing Sept 3,[],"[attack_target Client_Endpoint, created_at 201...",...,,,,,,,,,,
14,True,alert,->,,1,2025692,2,ET PHISHING Chase Account Phish Landing Oct 22,[],"[created_at 2015_10_22, former_category CURREN...",...,,,,,,,,,,
17,True,alert,->,,1,2025656,3,ET PHISHING AES Crypto Observed in Javascript ...,[],"[attack_target Client_Endpoint, created_at 201...",...,,,,,,,,,,


So, we'll build a more concise data frame with only those columns we care about. The list is not exhaustive and is just my selection. **Decide what is relevant to you!**

In [16]:
DF_ET_OPEN_CONSISE = DF_ET_OPEN_ALL.loc[:, ["proto", "src_net", "dest_net", "sid", "rev", "msg", "file", "flowbits", "metadata", "references", "flow", "raw"] ]

Notice that our data frame peek was truncated. This is to avoid overwhelming your browser, as data frames can be very big. The following options disable that truncation to reveal more information. **But use them with care, and make sure you don't print 31k rows into your browser!**

In [17]:
pd.set_option('display.max_colwidth', None)
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.width', None)
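If lifting the limits globally feels too risky, pandas also offers `pd.option_context`, which applies display options only inside a `with` block and restores the previous values afterwards. The sketch below uses a hypothetical one-row frame purely for illustration.

```python
import pandas as pd

# Hypothetical frame with one very long cell, standing in for a rule's raw text.
df = pd.DataFrame({"sid": [1], "msg": ["x" * 200]})

# Options are changed only inside the with-block and restored on exit.
with pd.option_context("display.max_colwidth", None, "display.max_rows", 20):
    print(df.head(5))

# Back outside the block, the earlier setting is in effect again.
print(pd.get_option("display.max_colwidth"))
```

This keeps the wide printout limited to the cells where it is actually wanted, which lowers the chance of accidentally dumping the full ruleset.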

Some rule categories are small and can be shown as-is. Rather than creating separate data structures, we'll go the data science way and keep everything in one data frame. Remember, we are exploring, so we never know where that exploration will lead. It is better to keep everything within arm's reach and just filter when needed. Rely on intermediate data before reaching your goal.

So, to look into the `emerging-worm` category, we can simply filter for that file name. Furthermore, we can sort the values to make the information easier to grasp. Sorting by source and destination network, which reflects rule directionality, is already a good trick to visually group rules.

In [18]:
DF_ET_OPEN_CONSISE \
    .loc[DF_ET_OPEN_CONSISE.file.str.contains("emerging-worm.rules")] \
    .sort_values(by=["src_net", "dest_net"])

Unnamed: 0,proto,src_net,dest_net,sid,rev,msg,file,flowbits,metadata,references,flow,raw
24916,udp,$HOME_NET,$EXTERNAL_NET,2102004,8,GPL WORM Slammer Worm propagation attempt OUTBOUND,emerging-worm.rules,[],"[created_at 2010_09_23, updated_at 2010_09_23]","[bugtraq,5310, bugtraq,5311, cve,2002-0649, nessus,11214, url,vil.nai.com/vil/content/v_99992.htm]",,"alert udp $HOME_NET any -> $EXTERNAL_NET 1434 (msg:""GPL WORM Slammer Worm propagation attempt OUTBOUND""; content:""|04|""; depth:1; content:""|81 F1 03 01 04 9B 81 F1|""; content:""sock""; content:""send""; reference:bugtraq,5310; reference:bugtraq,5311; reference:cve,2002-0649; reference:nessus,11214; reference:url,vil.nai.com/vil/content/v_99992.htm; classtype:misc-attack; sid:2102004; rev:8; metadata:created_at 2010_09_23, updated_at 2010_09_23;)"
24919,tcp,$HOME_NET,$EXTERNAL_NET,2017404,3,ET WORM W32/Njw0rm CnC Beacon,emerging-worm.rules,[],"[created_at 2013_08_31, former_category WORM, updated_at 2013_08_31]","[url,www.fireeye.com/blog/technical/malware-research/2013/08/njw0rm-brother-from-the-same-mother.html, md5,4c60493b14c666c56db163203e819272, md5,b0e1d20accd9a2ed29cdacb803e4a89d]","established,to_server","alert tcp $HOME_NET any -> $EXTERNAL_NET any (msg:""ET WORM W32/Njw0rm CnC Beacon""; flow:established,to_server; content:""lv0njxq80""; depth:9; content:""njxq80""; distance:0; reference:url,www.fireeye.com/blog/technical/malware-research/2013/08/njw0rm-brother-from-the-same-mother.html; reference:md5,4c60493b14c666c56db163203e819272; reference:md5,b0e1d20accd9a2ed29cdacb803e4a89d; classtype:command-and-control; sid:2017404; rev:3; metadata:created_at 2013_08_31, former_category WORM, updated_at 2013_08_31;)"
24921,http,$HOME_NET,$EXTERNAL_NET,2014402,3,ET WORM W32/Rimecud wg.txt Checkin,emerging-worm.rules,[],"[created_at 2012_03_19, updated_at 2020_04_21]","[md5,a89f7289d5cce821a194542e90026082, md5,fd56ce176889d4fbe588760a1da6462b, url,www.microsoft.com/security/portal/Threat/Encyclopedia/Entry.aspx?Name=Worm%3AWin32%2FRimecud]","established,to_server","alert http $HOME_NET any -> $EXTERNAL_NET any (msg:""ET WORM W32/Rimecud wg.txt Checkin""; flow:established,to_server; http.uri; content:""/wg.txt""; reference:md5,a89f7289d5cce821a194542e90026082; reference:md5,fd56ce176889d4fbe588760a1da6462b; reference:url,www.microsoft.com/security/portal/Threat/Encyclopedia/Entry.aspx?Name=Worm%3AWin32%2FRimecud; classtype:trojan-activity; sid:2014402; rev:3; metadata:created_at 2012_03_19, updated_at 2020_04_21;)"
24924,http,$HOME_NET,$EXTERNAL_NET,2012201,5,ET WORM Possible Worm Sohanad.Z or Other Infection Request for setting.nql,emerging-worm.rules,[],"[created_at 2011_01_17, updated_at 2020_08_04]","[url,www.threatexpert.com/report.aspx?md5=a70aad8f27957702febfa162556dc5b5]","established,to_server","alert http $HOME_NET any -> $EXTERNAL_NET any (msg:""ET WORM Possible Worm Sohanad.Z or Other Infection Request for setting.nql""; flow:established,to_server; http.uri; content:""/setting.nql""; nocase; reference:url,www.threatexpert.com/report.aspx?md5=a70aad8f27957702febfa162556dc5b5; classtype:trojan-activity; sid:2012201; rev:5; metadata:created_at 2011_01_17, updated_at 2020_08_04;)"
24926,http,$HOME_NET,$EXTERNAL_NET,2008020,6,ET WORM Win32.Socks.s HTTP Post Checkin,emerging-worm.rules,[],"[created_at 2010_07_30, updated_at 2020_08_18]","[url,doc.emergingthreats.net/2008020]","established,to_server","alert http $HOME_NET any -> $EXTERNAL_NET any (msg:""ET WORM Win32.Socks.s HTTP Post Checkin""; flow:established,to_server; http.method; content:""POST""; http.uri; content:"".php""; http.request_body; content:""proc=[System Process]|0a|""; depth:22; reference:url,doc.emergingthreats.net/2008020; classtype:trojan-activity; sid:2008020; rev:6; metadata:created_at 2010_07_30, updated_at 2020_08_18;)"
24927,http,$HOME_NET,$EXTERNAL_NET,2012739,4,ET WORM Rimecud Worm checkin,emerging-worm.rules,[],"[created_at 2011_04_29, updated_at 2020_10_13]","[url,www.threatexpert.com/report.aspx?md5=9623efa133415d19c941ef92a4f921fc]","established,to_server","alert http $HOME_NET any -> $EXTERNAL_NET any (msg:""ET WORM Rimecud Worm checkin""; flow:established,to_server; http.method; content:""GET""; http.uri; content:""/taskx.txt""; fast_pattern; http.user_agent; content:""Mozilla/3.0 (compatible|3b 20|Indy Library)""; depth:38; reference:url,www.threatexpert.com/report.aspx?md5=9623efa133415d19c941ef92a4f921fc; classtype:trojan-activity; sid:2012739; rev:4; metadata:created_at 2011_04_29, updated_at 2020_10_13;)"
24922,http,any,$HOME_NET,2018132,5,ET WORM TheMoon.linksys.router 2,emerging-worm.rules,[],"[created_at 2014_02_13, updated_at 2020_07_07]","[url,isc.sans.edu/forums/diary/Linksys+Worm+Captured/17630, url,devttys0.com/2014/02/wrt120n-fprintf-stack-overflow/]","to_server,established","alert http any any -> $HOME_NET 8080 (msg:""ET WORM TheMoon.linksys.router 2""; flow:to_server,established; http.method; content:""POST""; http.uri; content:""/tmUnblock.cgi""; reference:url,isc.sans.edu/forums/diary/Linksys+Worm+Captured/17630; reference:url,devttys0.com/2014/02/wrt120n-fprintf-stack-overflow/; classtype:trojan-activity; sid:2018132; rev:5; metadata:created_at 2014_02_13, updated_at 2020_07_07;)"
24923,http,any,$HOME_NET,2018155,5,ET WORM TheMoon.linksys.router 3,emerging-worm.rules,[],"[created_at 2014_02_18, updated_at 2020_07_07]","[url,isc.sans.edu/forums/diary/Linksys+Worm+Captured/17630, url,exploit-db.com/exploits/31683/, url,devttys0.com/2014/02/wrt120n-fprintf-stack-overflow/]","to_server,established","alert http any any -> $HOME_NET 8080 (msg:""ET WORM TheMoon.linksys.router 3""; flow:to_server,established; http.method; content:""POST""; http.uri; content:""/hndUnblock.cgi""; reference:url,isc.sans.edu/forums/diary/Linksys+Worm+Captured/17630; reference:url,exploit-db.com/exploits/31683/; reference:url,devttys0.com/2014/02/wrt120n-fprintf-stack-overflow/; classtype:trojan-activity; sid:2018155; rev:5; metadata:created_at 2014_02_18, updated_at 2020_07_07;)"
24925,http,any,$HOME_NET,2018131,6,ET WORM TheMoon.linksys.router 1,emerging-worm.rules,[],"[created_at 2014_02_13, updated_at 2020_08_18]","[url,isc.sans.edu/forums/diary/Linksys+Worm+Captured/17630]",established,"alert http any any -> $HOME_NET 8080 (msg:""ET WORM TheMoon.linksys.router 1""; flow:established; urilen:7; http.method; content:""GET""; http.uri; content:""/HNAP1/""; http.host; pcre:""/^(?:[0-9]{1,3}\.){3}[0-9]{1,3}$/""; reference:url,isc.sans.edu/forums/diary/Linksys+Worm+Captured/17630; classtype:trojan-activity; sid:2018131; rev:6; metadata:created_at 2014_02_13, updated_at 2020_08_18;)"


**PS! Jupyter is a data science tool, and thus caters to that audience. This can lead to silly things like formatting a rule header as a mathematical formula, because the `$` signs in variable names are treated as math delimiters**.

However, the really good stuff is in the `malware` and `mobile_malware` categories, and those are big. Too big to explore with full dumps. So, let's limit the scope to a *recent hotness*.

In [19]:
RULES_SUNBURST = DF_ET_OPEN_CONSISE \
    .loc[DF_ET_OPEN_CONSISE.msg.str.contains("SUNBURST", flags=re.IGNORECASE)] \
    .sort_values(by=["proto", "src_net", "dest_net", "msg"]) \
    .drop(columns=["flowbits", "raw", "metadata", "flow"]) \
    .explode("references")

This is a bit more involved, but in many ways it is similar to a database query.
* First, we locate all rules whose `msg` contains the `SUNBURST` keyword. Sometimes this information is also in `tag` or `metadata`, but don't count on it, as it's not very consistent.
* Then we sort the values to make the frame visually easier to explore. Pandas even lets us sort by multiple columns. That's why I wanted to parse `proto`, `src_net` and `dest_net` from the rule header! With those fields, we get a much better organized view.
* Then drop some columns (from view) that are just noise:
  * `flowbits` are not that relevant for the current exploration; rule content should be listed separately anyway
  * likewise, the `raw` rule makes the data frame as a whole more difficult to assess, but it can always be added back if we need to check the content!
  * `metadata` does not hold much useful information and is a list, which again makes the frame messy
  * `flow` is a bit redundant with the sorted `src_net` and `dest_net` view. Good info, but we only have limited screen real estate
* Finally, `references` holds lists, but we can use the `explode()` method to unpack each reference into a separate row. **This duplicates the other row elements of the rule!** But that's not a big deal in this case.
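The duplication caused by `explode()` is easiest to see on a tiny frame. The following sketch uses made-up rules and references (all values are hypothetical) to show how one row per reference is produced.

```python
import pandas as pd

# Hypothetical mini ruleset: each rule carries a list of references.
toy = pd.DataFrame({
    "sid": [1, 2, 3],
    "msg": ["rule one", "rule two", "rule three"],
    "references": [
        ["url,example.com/a", "md5,abc"],  # two references -> two rows
        ["url,example.com/b"],             # one reference  -> one row
        [],                                # no references  -> one row with NaN
    ],
})

exploded = toy.explode("references")
print(exploded)

# The original index is repeated for every exploded row, so the number of
# distinct rules is still recoverable.
print(exploded["sid"].nunique(), "rules in", len(exploded), "rows")
```

Because the index and the other columns repeat, counting rows after `explode()` overstates the number of rules. Counting unique `sid` values is a safer way to see how many rules matched.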

In [20]:
RULES_SUNBURST

Unnamed: 0,proto,src_net,dest_net,sid,rev,msg,file,references
10283,dns,$HOME_NET,any,2031392,1,ET MALWARE Dark Halo/SUNBURST Related DNS Lookup to globalnetworkissues .com,emerging-malware.rules,"url,www.volexity.com/blog/2020/12/14/dark-halo-leverages-solarwinds-compromise-to-breach-organizations/"
10282,dns,$HOME_NET,any,2031391,1,ET MALWARE Dark Halo/SUNBURST Related DNS Lookup to kubecloud .com,emerging-malware.rules,"url,www.volexity.com/blog/2020/12/14/dark-halo-leverages-solarwinds-compromise-to-breach-organizations/"
10280,dns,$HOME_NET,any,2031389,1,ET MALWARE Dark Halo/SUNBURST Related DNS Lookup to lcomputers .com,emerging-malware.rules,"url,www.volexity.com/blog/2020/12/14/dark-halo-leverages-solarwinds-compromise-to-breach-organizations/"
10281,dns,$HOME_NET,any,2031390,1,ET MALWARE Dark Halo/SUNBURST Related DNS Lookup to seobundlekit .com,emerging-malware.rules,"url,www.volexity.com/blog/2020/12/14/dark-halo-leverages-solarwinds-compromise-to-breach-organizations/"
10278,dns,$HOME_NET,any,2031387,1,ET MALWARE Dark Halo/SUNBURST Related DNS Lookup to solartrackingsystem .net,emerging-malware.rules,"url,www.volexity.com/blog/2020/12/14/dark-halo-leverages-solarwinds-compromise-to-breach-organizations/"
10279,dns,$HOME_NET,any,2031388,1,ET MALWARE Dark Halo/SUNBURST Related DNS Lookup to webcodez .com,emerging-malware.rules,"url,www.volexity.com/blog/2020/12/14/dark-halo-leverages-solarwinds-compromise-to-breach-organizations/"
10391,dns,$HOME_NET,any,2031540,1,ET MALWARE [401TRG] SUNBURST Related DNS Lookup to bigtopweb .com,emerging-malware.rules,"url,symantec-enterprise-blogs.security.com/blogs/threat-intelligence/solarwinds-raindrop-malware"
10388,dns,$HOME_NET,any,2031537,1,ET MALWARE [401TRG] SUNBURST Related DNS Lookup to infinitysoftwares .com,emerging-malware.rules,"url,symantec-enterprise-blogs.security.com/blogs/threat-intelligence/solarwinds-raindrop-malware"
10307,dns,$HOME_NET,any,2031359,3,ET MALWARE [Fireeye] Observed SUNBURST DGA Request,emerging-malware.rules,"url,www.fireeye.com/blog/threat-research/2020/12/evasive-attacker-leverages-solarwinds-supply-chain-compromises-with-sunburst-backdoor.html"
10275,dns,$HOME_NET,any,2031324,3,ET MALWARE [Fireeye] SUNBURST Related DNS Lookup to avsvmcloud .com,emerging-malware.rules,"url,www.fireeye.com/blog/threat-research/2020/12/evasive-attacker-leverages-solarwinds-supply-chain-compromises-with-sunburst-backdoor.html"


The same exploration can be repeated for other relevant threats. For example, I bet many students are interested in `Cobalt Strike` rules.

In [21]:
RULES_COBALT_STRIKE = DF_ET_OPEN_CONSISE \
    .loc[DF_ET_OPEN_CONSISE \
    .msg.str.contains("Cobalt Strike|CobaltStrike", flags=re.IGNORECASE)] \
    .drop(columns=["metadata", "flowbits"]) \
    .explode("references") \
    .sort_values(by=["msg"]) \
    .drop(columns=["raw"])

In [22]:
RULES_COBALT_STRIKE

Unnamed: 0,proto,src_net,dest_net,sid,rev,msg,file,references,flow
25241,tls,$EXTERNAL_NET,$HOME_NET,2023629,4,ET HUNTING Suspicious Empty SSL Certificate - Observed in Cobalt Strike,emerging-hunting.rules,,"from_server,established"
12100,tls,$EXTERNAL_NET,$HOME_NET,2028832,1,ET JA3 Hash - Suspected Cobalt Strike Malleable C2 (ja3s) M1,emerging-ja3.rules,,"established,from_server"
12099,tls,$HOME_NET,$EXTERNAL_NET,2028831,1,ET JA3 Hash - Suspected Cobalt Strike Malleable C2 M1 (set),emerging-ja3.rules,,"established,to_server"
7680,http,$HOME_NET,$EXTERNAL_NET,2025636,3,ET MALWARE Cobalt Strike Exfiltration,emerging-malware.rules,,"established,to_server"
9959,http,$HOME_NET,$EXTERNAL_NET,2029744,2,ET MALWARE Cobalt Strike Malleable C2 (Adobe RTMP),emerging-malware.rules,"url,github.com/rsmudge/Malleable-C2-Profiles/blob/master/normal/rtmp.profile","established,to_server"
5406,http,$HOME_NET,$EXTERNAL_NET,2029978,1,ET MALWARE Cobalt Strike Malleable C2 (Custom),emerging-malware.rules,"md5,79bbe1365fb7532613823ce3e0cac499","established,to_server"
5406,http,$HOME_NET,$EXTERNAL_NET,2029978,1,ET MALWARE Cobalt Strike Malleable C2 (Custom),emerging-malware.rules,"url,twitter.com/CyberRaiju/status/1249272772963864576","established,to_server"
5404,http,$HOME_NET,$EXTERNAL_NET,2029977,2,ET MALWARE Cobalt Strike Malleable C2 (Custom),emerging-malware.rules,"url,twitter.com/CyberRaiju/status/1249272772963864576","established,to_server"
5404,http,$HOME_NET,$EXTERNAL_NET,2029977,2,ET MALWARE Cobalt Strike Malleable C2 (Custom),emerging-malware.rules,"md5,79bbe1365fb7532613823ce3e0cac499","established,to_server"
5327,http,$HOME_NET,$EXTERNAL_NET,2029740,1,ET MALWARE Cobalt Strike Malleable C2 (Havex APT),emerging-malware.rules,"url,github.com/rsmudge/Malleable-C2-Profiles/blob/master/APT/havex.profile","established,to_server"


Here we can see that many rules have multiple references. On that note, rules can point to a lot of interesting reading material! How about we build a reading list?

In [23]:
sorted(
    list(
        RULES_COBALT_STRIKE \
            .loc[RULES_COBALT_STRIKE.fillna("NA") \
                                    .references.str.contains("^url")] \
            .references.unique()
    )
)

['url,attack.mitre.org/groups/G0080/',
 'url,blog.cobaltstrike.com/2015/10/07/named-pipe-pivoting/',
 'url,blog.malwarebytes.com/threat-analysis/2020/06/multi-stage-apt-attack-drops-cobalt-strike-using-malleable-c2-feature',
 'url,blog.talosintelligence.com/2020/06/indigodrop-maldocs-cobalt-strike.html',
 'url,fireeye.com/blog/threat-research/2020/03/the-cycle-of-adversary-pursuit.html',
 'url,gist.github.com/aaronst/6aa7f61246f53a8dd4befea86e832456',
 'url,github.com//rsmudge/Malleable-C2-Profiles/blob/master/crimeware/magnitude.profile',
 'url,github.com/rsmudge/Malleable-C2-Profiles/blob/master/APT/havex.profile',
 'url,github.com/rsmudge/Malleable-C2-Profiles/blob/master/APT/meterpreter.profile',
 'url,github.com/rsmudge/Malleable-C2-Profiles/blob/master/normal/onedrive_getonly.profile',
 'url,github.com/rsmudge/Malleable-C2-Profiles/blob/master/normal/rtmp.profile',
 'url,github.com/rsmudge/Malleable-C2-Profiles/blob/master/normal/safebrowsing.profile',
 'url,github.com/xx0hcd/Mal

Note, however, that many of the links may be dead.
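Rule references are stored as `type,value` pairs without a URL scheme, so they are not directly clickable. A minimal sketch for turning `url` references into full links follows. The helper name `reference_to_url` and the choice of `https` as the default scheme are assumptions, and some older sites may only answer over plain `http`.

```python
def reference_to_url(reference: str, scheme: str = "https"):
    """Turn a 'url,host/path' reference into a full URL; return None for other types."""
    ref_type, _, value = reference.partition(",")
    if ref_type != "url" or not value:
        return None
    return "{}://{}".format(scheme, value)


# A few references in the format seen in the rule output above.
references = [
    "url,doc.emergingthreats.net/2009172",
    "md5,79bbe1365fb7532613823ce3e0cac499",
    "url,attack.mitre.org/groups/G0080/",
]

# Drop non-URL references (None) and sort the rest.
reading_list = sorted(u for u in map(reference_to_url, references) if u)
for url in reading_list:
    print(url)
```

Non-URL references such as `md5` hashes are skipped here, but they could be collected separately for lookups in a malware repository.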

In [26]:
RULES_PURPLE_FOX = DF_ET_OPEN_CONSISE \
    .loc[DF_ET_OPEN_CONSISE \
    .msg.str.contains("PurpleFox", flags=re.IGNORECASE)] \
    .drop(columns=["metadata", "flowbits"]) \
    .explode("references") \
    .sort_values(by=["msg"]) \
    .drop(columns=["raw"])

In [28]:
RULES_PURPLE_FOX

Unnamed: 0,proto,src_net,dest_net,sid,rev,msg,file,references,flow
27740,http,$HOME_NET,$EXTERNAL_NET,2028978,3,ET EXPLOIT_KIT Possible PurpleFox EK Framework Flash GET Request,emerging-exploit_kit.rules,,"established,to_server"
27739,http,$HOME_NET,$EXTERNAL_NET,2028977,3,ET EXPLOIT_KIT Possible PurpleFox EK Framework Flash HEAD Request,emerging-exploit_kit.rules,,"established,to_server"
27552,http,$EXTERNAL_NET,$HOME_NET,2028974,2,ET EXPLOIT_KIT Possible PurpleFox EK Framework Landing,emerging-exploit_kit.rules,,"established,to_client"
27553,http,$EXTERNAL_NET,$HOME_NET,2028975,2,ET EXPLOIT_KIT Possible PurpleFox EK Framework Landing - Various Exploits,emerging-exploit_kit.rules,,"established,to_client"
27736,http,$EXTERNAL_NET,$HOME_NET,2028982,2,ET EXPLOIT_KIT Possible PurpleFox EK Framework Payload,emerging-exploit_kit.rules,,"established,to_client"
27738,http,$EXTERNAL_NET,$HOME_NET,2028976,3,ET EXPLOIT_KIT Possible PurpleFox EK Framework Payload,emerging-exploit_kit.rules,,"established,to_client"
27741,http,$EXTERNAL_NET,$HOME_NET,2028981,3,ET EXPLOIT_KIT Possible PurpleFox EK Framework Payload,emerging-exploit_kit.rules,,"established,to_client"
27735,http,$HOME_NET,$EXTERNAL_NET,2028980,2,ET EXPLOIT_KIT Possible PurpleFox EK Framework URI Struct Flash Request,emerging-exploit_kit.rules,,"established,to_server"
27778,http,$HOME_NET,$EXTERNAL_NET,2031466,2,ET EXPLOIT_KIT Possible PurpleFox EK Framework URI Struct Jpg Request,emerging-exploit_kit.rules,,"established,to_server"
27734,http,$HOME_NET,$EXTERNAL_NET,2028979,2,ET EXPLOIT_KIT Possible PurpleFox EK Framework URI Struct Landing Request,emerging-exploit_kit.rules,,"established,to_server"


In [29]:
RULES_EMOTET = DF_ET_OPEN_CONSISE \
    .loc[DF_ET_OPEN_CONSISE \
    .msg.str.contains("Emotet", flags=re.IGNORECASE)] \
    .drop(columns=["metadata", "flowbits"]) \
    .explode("references") \
    .sort_values(by=["msg"]) \
    .drop(columns=["raw"])

In [30]:
RULES_EMOTET

Unnamed: 0,proto,src_net,dest_net,sid,rev,msg,file,references,flow
6095,http,$HOME_NET,$EXTERNAL_NET,2019693,6,ET MALWARE Emotet Checkin,emerging-malware.rules,"md5,3083b68cb5c2a345972a5f79e735c7b9","established,to_server"
9027,http,$HOME_NET,$EXTERNAL_NET,2019704,4,ET MALWARE Emotet CnC Beacon,emerging-malware.rules,"md5,e24831e3f808116b30d85731c545e3ee","established,to_server"
5212,http,$HOME_NET,$EXTERNAL_NET,2029398,2,ET MALWARE Emotet Wifi Bruter Module Checkin,emerging-malware.rules,"url,www.binarydefense.com/emotet-evolves-with-new-wi-fi-spreader","established,to_server"
6176,http,$HOME_NET,$EXTERNAL_NET,2020900,4,ET MALWARE Emotet v2 Exfiltrating Outlook information,emerging-malware.rules,"url,securelist.com/analysis/69560/the-banking-trojan-emotet-detailed-analysis/","established,to_server"
5863,http,$HOME_NET,$EXTERNAL_NET,2018224,5,ET MALWARE Likely Geodo/Emotet Downloading PE,emerging-malware.rules,,"established,to_server"
3050,udp,any,$HOME_NET,2019692,1,ET MALWARE Possible Emotet DGA NXDOMAIN Responses,emerging-malware.rules,"md5,3083b68cb5c2a345972a5f79e735c7b9",
9850,http,$HOME_NET,$EXTERNAL_NET,2024272,6,ET MALWARE W32.Geodo/Emotet Checkin,emerging-malware.rules,"md5,dacdcd451204265ad6f44ef99db1f371","established,to_server"
9849,http,$HOME_NET,$EXTERNAL_NET,2024274,4,ET MALWARE W32/Emotet CnC Beacon 1,emerging-malware.rules,"md5,21542133a586782e7c2fa4286d98fd73","established,to_server"
9849,http,$HOME_NET,$EXTERNAL_NET,2024274,4,ET MALWARE W32/Emotet CnC Beacon 1,emerging-malware.rules,"url,blogs.forcepoint.com/security-labs/new-variant-geodoemotet-banking-malware-targets-uk","established,to_server"
9849,http,$HOME_NET,$EXTERNAL_NET,2024274,4,ET MALWARE W32/Emotet CnC Beacon 1,emerging-malware.rules,"url,blog.fortinet.com/2017/05/03/deep-analysis-of-new-emotet-variant-part-1","established,to_server"
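Since the Cobalt Strike, PurpleFox and Emotet queries share the same shape, the pattern could be wrapped in a small helper. This is only a sketch: `rules_matching` is a hypothetical name and the toy frame below stands in for the real data frame. The regex flags are passed by keyword because the second positional parameter of `Series.str.contains` is `case`, not `flags`.

```python
import re

import pandas as pd


def rules_matching(df: pd.DataFrame, pattern: str) -> pd.DataFrame:
    """Return rules whose msg matches pattern (case-insensitive), one row per reference."""
    mask = df["msg"].str.contains(pattern, flags=re.IGNORECASE, regex=True)
    return (
        df.loc[mask]
        # errors="ignore" lets the helper work on frames lacking some of these columns.
        .drop(columns=["metadata", "flowbits", "raw"], errors="ignore")
        .explode("references")
        .sort_values(by=["msg"])
    )


# Hypothetical mini ruleset for demonstration.
toy = pd.DataFrame({
    "sid": [1, 2, 3],
    "msg": [
        "ET MALWARE Emotet Checkin",
        "ET MALWARE emotet CnC Beacon",
        "ET WORM Something else",
    ],
    "references": [["md5,aaa", "url,example.com/x"], [], ["url,example.com/y"]],
    "raw": ["...", "...", "..."],
})

result = rules_matching(toy, "Emotet")
print(result)
```

With such a helper, exploring another threat becomes a one-liner, for example `rules_matching(df, "Cobalt Strike|CobaltStrike")` on the full frame.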
