# ATT&CK DS Event Mappings Notebook

-----------------------------------

* **Author**: Jose Luis Rodriguez - [@Cyb3rPandaH](https://twitter.com/Cyb3rPandaH)
* **Organization**: [Open Threat Research (OTR)](https://github.com/OTRF)
* **References**:
  - [OSSEM - ATT&CK Event Mapping](https://github.com/OTRF/OSSEM-DM/blob/main/attack_event_mapping/_attack_data_sources_all.yaml)
  - [Defining ATT&CK Data Sources, Part I: Enhancing the Current State](https://medium.com/mitre-attack/defining-attack-data-sources-part-i-4c39e581454f)
  - [Defining ATT&CK Data Sources, Part II: Operationalizing the Methodology](https://medium.com/mitre-attack/defining-attack-data-sources-part-ii-1fc98738ba5b)
  - [ATT&CK - Data Sources Definition](https://github.com/mitre-attack/attack-datasources/blob/main/DataSourcesDefinition.ipynb)

## Importing Python Libraries

In [1]:
# Importing library to manipulate data
import pandas as pd
from pandas import json_normalize

# Importing library to manipulate yaml data
import yaml
import requests

## Importing Data Sources Mapping Yaml File

In [2]:
yamlUrl = 'https://raw.githubusercontent.com/OTRF/OSSEM-DM/main/attack_event_mapping/_attack_data_sources_all.yaml'
yamlContent = requests.get(yamlUrl)
yamlMapping = yaml.safe_load(yamlContent.text)
mapping = json_normalize(yamlMapping)
mapping.head()

,name,definition,collection_layers,platforms,contributors,data_components,references
0,Service,Information about software programs that run i...,[host],[Windows],[Jose Rodriguez @Cyb3rPandaH],"[{'name': 'service creation', 'type': 'activit...",[https://docs.microsoft.com/en-us/dotnet/frame...
1,Module,"Information about portable executable files, s...",[host],[Windows],[Jose Rodriguez @Cyb3rPandaH],"[{'name': 'module load', 'type': 'activity', '...",[https://docs.microsoft.com/en-us/windows/win3...
2,WMI object,Information about objects from the system clas...,[host],[Windows],[Jose Rodriguez @Cyb3rPandaH],"[{'name': 'wmi object context', 'type': 'infor...",[https://docs.microsoft.com/en-us/windows/win3...
3,File,Information about file objects that represent ...,[host],[Windows],[Jose Rodriguez @Cyb3rPandaH],"[{'name': 'file creation', 'type': 'activity',...",[https://docs.microsoft.com/en-us/windows/win3...
4,Named pipe,Information about mechanisms that allow inter-...,[host],[Windows],[Jose Rodriguez @Cyb3rPandaH],"[{'name': 'named pipe creation', 'relationship...",[https://docs.microsoft.com/en-us/windows/win3...
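
As a minimal offline sketch of what the cell above does, the snippet below parses a small inline YAML document shaped like the mapping file and flattens it with `json_normalize`. The YAML text is a made-up fragment for illustration only, not content copied from the OSSEM file, and the variable names are hypothetical.

```python
import pandas as pd
import yaml

# Hypothetical fragment shaped like one entry of the mapping file
yaml_text = """
- name: Service
  platforms: [Windows]
  data_components:
    - name: service creation
      type: activity
"""

# safe_load returns plain Python lists and dicts
records = yaml.safe_load(yaml_text)

# One row per top-level entry; nested lists stay as list-valued cells
frame = pd.json_normalize(records)
print(frame)
```

Because list-valued cells such as `data_components` are kept as Python lists, they still have to be unpacked before they can be filtered or joined.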


## Preparing Data Sources Mapping Dataframe

In [3]:
# Splitting rows for data_components list
dcListExp = mapping.explode('data_components').reset_index(drop=True).rename(columns={'name':'data_sources'})

# Splitting columns for data_components dict
dcDictExp = dcListExp['data_components'].apply(pd.Series).merge(dcListExp,left_index=True,right_index=True)\
.reset_index(drop=True).rename(columns={'name':'components'}).drop(['data_components'], axis = 1)

# Splitting rows for relationships list
drListExp = dcDictExp.explode('relationships').reset_index(drop=True)

# Splitting columns for relationships dict
drDictExp = drListExp['relationships'].apply(pd.Series).merge(drListExp,left_index=True,right_index=True)\
.reset_index(drop=True).rename(columns={'name':'data_relationships'}).drop(['relationships'], axis = 1)

# Splitting rows for telemetry list
tListExp = drDictExp.explode('telemetry').reset_index(drop=True)

# Splitting columns for telemetry dict
tDictExp = tListExp['telemetry'].apply(pd.Series).merge(tListExp,left_index=True,right_index=True)\
.reset_index(drop=True).drop(['telemetry'], axis = 1)

# Splitting rows for event_id list
dataSourcesMapping = tDictExp.explode('event_id').reset_index(drop=True)

dataSourcesMapping.head()

,event_provider,event_id,data_relationships,id,source_data_element,relationship,target_data_element,components,type,data_sources,definition,collection_layers,platforms,contributors,references
0,Microsoft-Windows-Security-Auditing,4697,User created Service,BB243122-F345-4ED6-97A7-FBA2A1AF7C38,user,created,service,service creation,activity,Service,Information about software programs that run i...,[host],[Windows],[Jose Rodriguez @Cyb3rPandaH],[https://docs.microsoft.com/en-us/dotnet/frame...
1,System,7045,User created Service,BB243122-F345-4ED6-97A7-FBA2A1AF7C38,user,created,service,service creation,activity,Service,Information about software programs that run i...,[host],[Windows],[Jose Rodriguez @Cyb3rPandaH],[https://docs.microsoft.com/en-us/dotnet/frame...
2,Microsoft-Windows-Sysmon/Operational,7,Process loaded Dll,109A870F-84A2-4CE4-948A-4773CD283F76,process,loaded,dll,module load,activity,Module,"Information about portable executable files, s...",[host],[Windows],[Jose Rodriguez @Cyb3rPandaH],[https://docs.microsoft.com/en-us/windows/win3...
3,Microsoft-Windows-Sysmon/Operational,7,Process loaded Executable,71F08C96-4ABE-47DE-A972-DEA470496FD4,process,loaded,executable,module load,activity,Module,"Information about portable executable files, s...",[host],[Windows],[Jose Rodriguez @Cyb3rPandaH],[https://docs.microsoft.com/en-us/windows/win3...
4,Microsoft-Windows-WMI-Activity/Operational,5861,Wmi subscription created,F3B6DD23-4DBE-4214-A2D7-10E7103D815F,wmi subscription,created,,wmi object context,information,WMI object,Information about objects from the system clas...,[host],[Windows],[Jose Rodriguez @Cyb3rPandaH],[https://docs.microsoft.com/en-us/windows/win3...
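
The cell above repeats the same two-step pattern for each nesting level: `explode` turns every list element into its own row, then `apply(pd.Series)` turns every dict key into its own column. A minimal sketch of one such round on toy data (the record below is invented for illustration, and `join` is used here in place of the index-based `merge`):

```python
import pandas as pd

# Toy record shaped like one entry of the mapping (values are made up)
toy = pd.DataFrame([{
    "name": "Service",
    "data_components": [
        {"name": "service creation", "type": "activity"},
        {"name": "service modification", "type": "activity"},
    ],
}])

# Step 1: one row per element of the data_components list
rows = (toy.explode("data_components")
           .reset_index(drop=True)
           .rename(columns={"name": "data_sources"}))

# Step 2: one column per dict key, joined back on the row index
cols = rows["data_components"].apply(pd.Series).rename(columns={"name": "components"})
flat = cols.join(rows.drop(columns=["data_components"]))
print(flat)
```

Renaming `name` before each join matters because every nesting level has its own `name` key, and the columns would otherwise collide.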


## Visualizing Relationships Among Data Elements: Network Graph

* Getting dataframe of relationships among data elements

In [4]:
relationships = dataSourcesMapping[['source_data_element','target_data_element','relationship']]\
      .drop_duplicates()\
      .dropna().reset_index(drop = True)\
      .rename(columns={'source_data_element':'source','target_data_element':'target'})\
      .replace(['windows registry key','windows registry key value'],['registry key','registry key value'])
relationships.head()

,source,target,relationship
0,user,service,created
1,process,dll,loaded
2,process,executable,loaded
3,user,wmi filter,created
4,user,wmi consumer,created


* Importing libraries

In [5]:
import networkx as nx

from bokeh.io import output_notebook, show
from bokeh.models import (BoxSelectTool,LassoSelectTool,PanTool,BoxZoomTool,ResetTool,TapTool,
                          WheelZoomTool,HoverTool,Circle, MultiLine,Plot, Range1d,ColumnDataSource,
                          PointDrawTool,LabelSet,EdgesAndLinkedNodes, NodesAndLinkedEdges)
from bokeh.palettes import Spectral5
from bokeh.plotting import from_networkx

output_notebook()

* Visualizing network graph

In [6]:
# Creating networkx graph object
G = nx.from_pandas_edgelist(relationships, source = 'source', target = 'target',edge_attr = 'relationship')

# Creating plot object
plot = Plot(plot_width=800, plot_height=500,
            x_range=Range1d(-1.1, 1.1), y_range=Range1d(-1.1, 1.1))
plot.title.text = "Relationships among Data Elements"
plot.title.align = "center"
plot.title.text_font_size = "30px"

# Creating renderer objects
graph_renderer = from_networkx(G, nx.spring_layout, scale=1, center=(0, 0))

graph_renderer.node_renderer.glyph = Circle(size=60, fill_color=Spectral5[0])
graph_renderer.node_renderer.selection_glyph = Circle(size=15, fill_color=Spectral5[3])
graph_renderer.node_renderer.hover_glyph = Circle(size=15, fill_color=Spectral5[1])

graph_renderer.edge_renderer.glyph = MultiLine(line_color="gray", line_alpha=0.9, line_width=1)
graph_renderer.edge_renderer.selection_glyph = MultiLine(line_color=Spectral5[3], line_width=4)
graph_renderer.edge_renderer.hover_glyph = MultiLine(line_color=Spectral5[1], line_width=4)

graph_renderer.selection_policy = NodesAndLinkedEdges()
graph_renderer.inspection_policy = NodesAndLinkedEdges()
plot.renderers.append(graph_renderer)

# Adding nodes labels
x,y = zip(*graph_renderer.layout_provider.graph_layout.values())

# Node names, in the same order as the layout coordinates above
names = list(G.nodes())

nodes_table = ColumnDataSource({'x': x, 'y': y,'name': names})
labels = LabelSet(x='x',y='y',text='name',render_mode='canvas',y_offset=-10,text_align='center',
                  text_font_size='12px',source = nodes_table, background_fill_color=None,text_color = 'black')

plot.renderers.append(labels)

# Adding tools
plot.add_tools(HoverTool(tooltips=None),WheelZoomTool(),TapTool(),BoxSelectTool(),BoxZoomTool(),
               PanTool(),LassoSelectTool(),ResetTool())

# Showing graph
show(plot)
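
The graph can also be inspected without rendering it. The sketch below builds a graph from three of the relationships listed earlier and checks a few properties. Note that `from_pandas_edgelist` creates an undirected graph by default, so the source/target direction is not preserved; passing `create_using=nx.DiGraph()` is one way to keep it. The variable names are illustrative.

```python
import pandas as pd
import networkx as nx

# Three relationships taken from the relationships table shown earlier
toy_relationships = pd.DataFrame({
    "source": ["user", "process", "process"],
    "target": ["service", "dll", "executable"],
    "relationship": ["created", "loaded", "loaded"],
})

# Undirected graph, as in the notebook cell
G_toy = nx.from_pandas_edgelist(toy_relationships, source="source",
                                target="target", edge_attr="relationship")

# Node with the most connections
degrees = dict(G_toy.degree())
most_connected = max(degrees, key=degrees.get)
print(most_connected, degrees[most_connected])

# Edge attributes remain available for lookups
print(G_toy.edges["process", "dll"]["relationship"])

# Directed variant that keeps the source -> target orientation
D_toy = nx.from_pandas_edgelist(toy_relationships, source="source",
                                target="target", edge_attr="relationship",
                                create_using=nx.DiGraph())
print(D_toy.has_edge("user", "service"), D_toy.has_edge("service", "user"))
```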

## Mapping (sub)techniques to data components & relationships & event logs

### Gathering ATT&CK content

* Importing libraries

In [7]:
# Importing library to interact with up to date ATT&CK content available in STIX format via public TAXII server
from attackcti import attack_client

* Gathering (sub)techniques

In [8]:
# Instantiating attack_client class
lift = attack_client()

# Collecting all techniques (revoked and not revoked) for the Windows platform within the enterprise matrix
attck = lift.get_techniques_by_platform(name = 'Windows', stix_format = False)

# Removing revoked techniques
attck = lift.remove_revoked(attck)

# Generating a dataframe with information collected
attck = json_normalize(attck)

# Selecting columns
attck = attck[['tactic','technique_id','technique','data_sources']]

# Showing information collected
attck.head()

,tactic,technique_id,technique,data_sources
0,"[credential-access, collection]",T1557.002,ARP Cache Poisoning,"[Packet capture, Netflow/Enclave netflow]"
1,"[persistence, privilege-escalation]",T1547.012,Print Processors,"[Process monitoring, Windows Registry, File mo..."
2,[defense-evasion],T1564.007,VBA Stomping,"[Process monitoring, File monitoring]"
3,[credential-access],T1558.004,AS-REP Roasting,"[Windows event logs, Authentication logs]"
4,[defense-evasion],T1218.012,Verclsid,"[Process use of network, Process command-line ..."


* Splitting data_sources field

In [9]:
attck = attck.explode('data_sources').reset_index(drop=True)
attck.head()

,tactic,technique_id,technique,data_sources
0,"[credential-access, collection]",T1557.002,ARP Cache Poisoning,Packet capture
1,"[credential-access, collection]",T1557.002,ARP Cache Poisoning,Netflow/Enclave netflow
2,"[persistence, privilege-escalation]",T1547.012,Print Processors,Process monitoring
3,"[persistence, privilege-escalation]",T1547.012,Print Processors,Windows Registry
4,"[persistence, privilege-escalation]",T1547.012,Print Processors,File monitoring


* Updating data_sources names

In [10]:
yamlNamesUrl = 'https://raw.githubusercontent.com/OTRF/OSSEM-DM/main/attack_event_mapping/_new_data_sources_names.yaml'
yamlNamesContent = requests.get(yamlNamesUrl)
names = yaml.safe_load(yamlNamesContent.text)
names = pd.DataFrame(list(names[0].items()), columns=['data_sources', 'new_names'])

attck = pd.merge(attck, names, on = 'data_sources', how = 'left')
namesUpdate = attck['new_names'].fillna(attck['data_sources'])
attck = attck.drop(columns = ['new_names']).assign(data_sources = namesUpdate)
attck = attck.drop_duplicates(subset = ['technique_id','data_sources'])\
        .sort_values(by = ['technique'])\
        .reset_index(drop = True)
attck.head()

,tactic,technique_id,technique,data_sources
0,"[credential-access, collection]",T1557.002,ARP Cache Poisoning,Packet capture
1,"[credential-access, collection]",T1557.002,ARP Cache Poisoning,Netflow/Enclave netflow
2,[credential-access],T1558.004,AS-REP Roasting,Authentication log
3,[credential-access],T1558.004,AS-REP Roasting,Windows event logs
4,"[privilege-escalation, defense-evasion]",T1548,Abuse Elevation Control Mechanism,File


* Mapping techniques to data components & relationships & event logs

In [11]:
techniques = pd.merge(attck, dataSourcesMapping, on = 'data_sources', how = 'left')\
             [['tactic','technique_id','technique','data_sources','components','data_relationships',\
               'event_provider','event_id']]

techniques.head()

,tactic,technique_id,technique,data_sources,components,data_relationships,event_provider,event_id
0,"[credential-access, collection]",T1557.002,ARP Cache Poisoning,Packet capture,,,,
1,"[credential-access, collection]",T1557.002,ARP Cache Poisoning,Netflow/Enclave netflow,,,,
2,[credential-access],T1558.004,AS-REP Roasting,Authentication log,authentication success,User authenticated Host,Microsoft-Windows-Security-Auditing,4624.0
3,[credential-access],T1558.004,AS-REP Roasting,Authentication log,authentication success,User authenticated Host,Microsoft-Windows-Security-Auditing,4778.0
4,[credential-access],T1558.004,AS-REP Roasting,Windows event logs,,,,
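
Rows with empty component and event columns, such as `Packet capture` above, are data sources that have no entry in the event mapping. One way to list them explicitly is the `indicator` option of `merge`, sketched here on toy frames (the technique ID is a placeholder and the frame names are illustrative):

```python
import pandas as pd

# Toy technique-to-data-source table
techniques_toy = pd.DataFrame({
    "technique_id": ["T0001", "T0001"],
    "data_sources": ["Packet capture", "Authentication log"],
})

# Toy event mapping that only covers one of the data sources
mapping_toy = pd.DataFrame({
    "data_sources": ["Authentication log"],
    "event_provider": ["Microsoft-Windows-Security-Auditing"],
    "event_id": [4624],
})

# indicator=True adds a _merge column: 'both' or 'left_only' for a left merge
joined = techniques_toy.merge(mapping_toy, on="data_sources",
                              how="left", indicator=True)

# Data sources recommended for a technique but missing from the mapping
unmapped = joined.loc[joined["_merge"] == "left_only", "data_sources"].unique().tolist()
print(unmapped)
```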


### Use Case: SMB/Windows Admin Shares (T1021.002)

* What are the recommended data sources?

In [12]:
attck[attck['technique_id']=='T1021.002']

,tactic,technique_id,technique,data_sources
746,[lateral-movement],T1021.002,SMB/Windows Admin Shares,Process
747,[lateral-movement],T1021.002,SMB/Windows Admin Shares,Authentication log


* What are the potential event logs that we could consider in our analysis?

In [13]:
techniques[techniques['technique_id']=='T1021.002']

,tactic,technique_id,technique,data_sources,components,data_relationships,event_provider,event_id
6344,[lateral-movement],T1021.002,SMB/Windows Admin Shares,Process,process creation,User created Process,Microsoft-Windows-Security-Auditing,4688
6345,[lateral-movement],T1021.002,SMB/Windows Admin Shares,Process,process creation,User created Process,Microsoft-Windows-Sysmon/Operational,1
6346,[lateral-movement],T1021.002,SMB/Windows Admin Shares,Process,process creation,Process created Process,Microsoft-Windows-Security-Auditing,4688
6347,[lateral-movement],T1021.002,SMB/Windows Admin Shares,Process,process creation,Process created Process,Microsoft-Windows-Sysmon/Operational,1
6348,[lateral-movement],T1021.002,SMB/Windows Admin Shares,Process,process modification,Process wrote to Process,Microsoft-Windows-Sysmon/Operational,8
6349,[lateral-movement],T1021.002,SMB/Windows Admin Shares,Process,process access,Process accessed Process,Microsoft-Windows-Security-Auditing,4663
6350,[lateral-movement],T1021.002,SMB/Windows Admin Shares,Process,process access,Process accessed Process,Microsoft-Windows-Sysmon/Operational,10
6351,[lateral-movement],T1021.002,SMB/Windows Admin Shares,Process,process access,Process requested access Process,Microsoft-Windows-Security-Auditing,4656
6352,[lateral-movement],T1021.002,SMB/Windows Admin Shares,Process,process network connection,Process connected to Port,Microsoft-Windows-Security-Auditing,5156
6353,[lateral-movement],T1021.002,SMB/Windows Admin Shares,Process,process network connection,Process connected to Port,Microsoft-Windows-Sysmon/Operational,3
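
For collection planning it can help to condense a result like this into one list of event IDs per provider. A minimal sketch, using four rows copied from the output above (the frame and variable names are illustrative):

```python
import pandas as pd

# Four rows copied from the T1021.002 output above
rows = pd.DataFrame({
    "components": ["process creation", "process creation",
                   "process access", "process network connection"],
    "event_provider": ["Microsoft-Windows-Security-Auditing",
                       "Microsoft-Windows-Sysmon/Operational",
                       "Microsoft-Windows-Sysmon/Operational",
                       "Microsoft-Windows-Security-Auditing"],
    "event_id": [4688, 1, 10, 5156],
})

# Unique, sorted event IDs per provider
summary = (rows.groupby("event_provider")["event_id"]
               .apply(lambda ids: [int(i) for i in sorted(set(ids))])
               .to_dict())
print(summary)
```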


* **Contribution opportunity identified after testing the framework!!**

The [Mordor](https://github.com/OTRF/mordor) project contains a dataset that represents a variation of the [SMB/Windows Admin Shares (T1021.002)](https://mordordatasets.com/notebooks/small/windows/08_lateral_movement/SDWIN-200806015757.html) sub-technique.

**Offensive tradecraft:** Adversaries leverage SMB to copy files over the network to either execute code remotely or exfiltrate data.

**Initial detection strategy:** Using the data sources recommended by the ATT&CK team, we were able to get some context related to the adversary behavior from a Process and User Authentication perspective. However, more context around the manipulation or creation of files might be helpful to define a better detection strategy.

**Contribution opportunity:** Security event logs such as **5140 (A network share object was accessed)** and **5145 (A network share object was checked)** from Windows Security Auditing, together with **11 (File created)** from Sysmon, can provide the context required to define a detection strategy for this specific sub-technique variation. Adding the **File data source object** (file share access & file access data components) to the recommended data sources for this sub-technique might therefore be a good idea.

In [14]:
# Rows of the File data source that map to event IDs 5140, 5145 or 11
dataSourcesMapping.loc[
    (dataSourcesMapping['data_sources'] == 'File') &
    dataSourcesMapping['event_id'].isin([5140, 5145, 11]),
    ['data_sources','components','data_relationships','event_provider','event_id']]

,data_sources,components,data_relationships,event_provider,event_id
11,File,file creation,Process created File,Microsoft-Windows-Sysmon/Operational,11
13,File,file share access,User accessed File Share,Microsoft-Windows-Security-Auditing,5140
14,File,file access,User accessed File,Microsoft-Windows-Security-Auditing,5145
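
The gap described above can be expressed as a set difference: the data sources behind the useful events, minus the data sources already recommended for the sub-technique. A minimal sketch, using the values shown in this use case (the variable names are illustrative):

```python
import pandas as pd

# Data sources recommended for T1021.002, as listed earlier in this use case
recommended = {"Process", "Authentication log"}

# Events found useful during testing, copied from the output above
observed = pd.DataFrame({
    "data_sources": ["File", "File", "File"],
    "components": ["file creation", "file share access", "file access"],
    "event_id": [11, 5140, 5145],
})

# Data sources that produced useful events but are not yet recommended
candidates = sorted(set(observed["data_sources"]) - recommended)
print(candidates)
```

Any name left in `candidates` is a possible addition to the recommended data sources for the sub-technique.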


## I hope you find this notebook helpful!! :D