## InfoSec Jupyterthon Day 2

---



# MSTICPy - Microsoft Threat Intelligence Center Jupyter & Python Security Tools

msticpy is a library for InfoSec investigation and hunting in Jupyter Notebooks. It includes functionality to:
- query log data from multiple sources
- enrich the data with Threat Intelligence, geolocations and Azure resource data
- extract Indicators of Activity (IoA) from logs and unpack encoded data
- perform sophisticated analysis such as anomalous session detection and time series decomposition
- visualize data using interactive timelines, process trees and multi-dimensional Morph Charts

It also includes some time-saving notebook tools such as widgets to set query time boundaries, select and display items from lists, and configure the notebook environment.

Source Code: https://github.com/microsoft/msticpy
Python Package: https://pypi.org/project/msticpy/
Docs: https://msticpy.readthedocs.io/en/latest/

Contents
- The basics 
- MSTICPy Widgets [Pete]
- Query providers revisited [Ian, Pete]
- Threat intelligence [Ian, Ashwin] 
- Other enrichment
- Pivot functions
- Query customization
- Notebooklets - notebook macros
- VirusTotal Detonation browsing - Preview Feature
- MSTICPy Hack month


---
## Basics - installing and configuring
To use any library in Python you first need to install the package and import it.
There are several ways to do this depending on how you want to access the library, but the simplest is to use pip. [Pip](https://pypi.org/project/pip/) is the package installer for Python and makes finding and installing Python packages simple.
You can use pip to install packages from the command line or, if you are using a notebook, directly in a notebook cell. Azure ML compute comes with pip installed already, but if you are running your notebook elsewhere you may need to install pip first.

To install from a notebook cell, use `%pip` followed by `install` and the package name, e.g.:
`%pip install msticpy`


MSTICPy is a library with a broad range of functionality, so installing the whole library can be more than is required for many uses. To address this, MSTICPy implements a series of Extras that let you install only certain parts of the library. These Extras are grouped around core technologies that you might want to use with MSTICPy.

| Extra      | Functionality                                               |
|------------|-------------------------------------------------------------|
| --none--   | Most functionality (approx 75%) Kqlmagic Jupyter basic      |
| keyvault   | Key Vault and keyring storage of settings secrets           |
| azure      | Azure API data retrieval, Azure storage APIs, Sentinel APIs |
| kql        | Kqlmagic Jupyter extended functionality                     |
| azsentinel | Combination of core install + "azure", "keyvault", "kql"    |
| ml         | Timeseries analysis, Event clustering, Outlier analysis     |
| splunk     | Splunk data queries                                         |
| vt3        | VirusTotal V3 graph API                                     |
| riskiq     | RiskIQ Illuminate threat intel provider & pivot functions   |
| all        | Includes all of above packages                              |
| dev        | Development tools plus "base"                               |
| test       | "dev" plus "all"                                            |

To install a specific Extra, use the following syntax:
`%pip install msticpy[extra]`

You can also install multiple extras at once:
`%pip install msticpy[extra1,extra2,...]`

In [None]:
%pip install msticpy[all]
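
If you build the install command programmatically (for example in a setup script), the Extras syntax is a comma-separated list in square brackets after the package name. The helper below is a hypothetical illustration and not part of MSTICPy; the Extra names used are taken from the table above.

```python
def requirement_with_extras(package, extras=()):
    """Return a pip requirement string such as 'msticpy[azure,ml]'."""
    # Drop blank entries and duplicates while keeping the order given
    cleaned = list(dict.fromkeys(e.strip() for e in extras if e.strip()))
    if not cleaned:
        return package
    return f"{package}[{','.join(cleaned)}]"


print(requirement_with_extras("msticpy", ["azure", "ml"]))
```

When running pip from a shell rather than a notebook cell, it is safest to quote the requirement (`pip install "msticpy[azure,ml]"`), because some shells treat square brackets specially.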

## Configuration

Once installed, MSTICPy can be imported in the same way as any other Python library. To make things a bit easier, we have created the `init_notebook` function, which automatically imports the library and configures it for use in a Jupyter Notebook.


<p style="border: solid; padding: 5pt; color: white; background-color: DarkOliveGreen"><b>Note:</b>
Passing `globals()` lets the init function import stuff into the notebook top-level namespace
</p>

In [None]:
import msticpy
msticpy.init_notebook(globals())

### MSTICPy's config file

MSTICPy can handle connections to a variety of data sources and services, including Azure Sentinel.

To make it easier to manage and re-use the configuration and credentials for these things, MSTICPy has its own config file that holds these items - `msticpyconfig.yaml`.

If you didn't have a `msticpyconfig.yaml` file in your workspace folder (which is likely if this is your first use of MSTICPy), the `init_notebook` function should have created one for you in the folder you are working in and populated it with the basic file structure.

You can populate msticpyconfig manually or you can use MSTICPy's settings editor to view and edit the settings stored there.


In [None]:
msticpy.MpConfigEdit()

There are different settings depending on which feature of MSTICPy you are configuring. This includes config settings for connecting to security data sources such as Microsoft Sentinel, Threat Intelligence providers such as VirusTotal, or Key Vault if you want to use that service for securely storing your MSTICPy settings.

Once you have completed these sections you can check that you have the correct settings by using the `msticpy.settings.show_settings()` function.

In [None]:
import yaml
print(yaml.safe_dump(msticpy.settings.settings)[:500])

More details on populating the config file can be found in the [MSTICPy Settings Editor Documentation](https://msticpy.readthedocs.io/en/latest/settings.html)
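
When troubleshooting, it can help to confirm which config file would be picked up. The sketch below is a simplified illustration that checks only the two locations named in MSTICPy's own error messages (the `MSTICPYCONFIG` environment variable, then the current directory); `find_msticpy_config` is a hypothetical helper, and the library's actual search logic may cover more cases.

```python
import os
from pathlib import Path


def find_msticpy_config(cwd="."):
    """Return the first msticpyconfig.yaml found, or None.

    Simplified lookup order (an assumption for illustration):
    1. the path in the MSTICPYCONFIG environment variable
    2. msticpyconfig.yaml in the given working directory
    """
    env_path = os.environ.get("MSTICPYCONFIG")
    if env_path and Path(env_path).is_file():
        return Path(env_path)
    local_path = Path(cwd) / "msticpyconfig.yaml"
    if local_path.is_file():
        return local_path
    return None


print(find_msticpy_config() or "No msticpyconfig.yaml found")
```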

---

# MSTICPy Widgets 

MSTICPy includes a series of notebook widgets to make interacting with data easier, especially for users without a programming background.

The widgets are designed to fulfil a number of common tasks where a user needs to interact with a notebook, such as selecting items from returned data or setting a timeframe for a query.

The widgets themselves are built with ipywidgets and are available in the `msticpy.nbtools.nbwidgets` module.

<p style="border: solid; padding: 5pt; color: white; background-color: Navy"><b>Note:</b>
Widgets are automatically imported by init_notebook</p>

The code below creates a Time Range widget that lets a user set a time range. We tell it to use days as its unit of measurement and set the maximum range to select from.

In [None]:
from msticpy.nbtools.nbwidgets import *

time_select = QueryTime(units="day", max_before=20, before=5, max_after=1)
time_select.display()

We can then read the `start` and `end` properties to get datetime objects based on the user's selection.

In [None]:
time_select.start
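
Conceptually, the widget turns an origin time plus "before" and "after" offsets into a start/end pair. The sketch below reproduces that arithmetic with the standard library only; `time_range` is a hypothetical helper, not part of MSTICPy, and the widget's own defaults may differ.

```python
from datetime import datetime, timedelta, timezone


def time_range(origin, before, after, unit="days"):
    """Return (start, end) datetimes around an origin time."""
    # `unit` must be a keyword accepted by timedelta, e.g. "days" or "hours"
    start = origin - timedelta(**{unit: before})
    end = origin + timedelta(**{unit: after})
    return start, end


origin = datetime(2021, 11, 17, 16, 0, tzinfo=timezone.utc)
start, end = time_range(origin, before=5, after=1)
print(start.isoformat(), end.isoformat())
```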

Other widgets allow for the selection of items from a list, along with a text filter option to help users find items:

In [None]:
items = ["item 1", "item 2", "item 3"]
selection = SelectItem(item_list=items, description="Select item", auto_display=True)

There are also security-specific widgets such as SelectAlert, which allows a user to select a specific alert from a list of alerts.
With this widget and others you can also specify an action: a follow-on function that is executed with the value selected in the widget.

In the cell below we use the `action` parameter to display the selected alert.

In [None]:
import pandas as pd
from msticpy.nbtools.nbdisplay import display_alert

# As discussed earlier, pandas read_* functions can read remote files as well as local ones.
alerts = pd.read_pickle("https://github.com/microsoft/msticpy/raw/main/tests/testdata/localdata/alerts_list.pkl")
alert_select = SelectAlert(alerts=alerts, action=display_alert)
alert_select.display()
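
The action pattern itself is a plain callback: the widget stores a function and calls it with whatever the user selects. The class below is a minimal stand-in written for illustration (it has no UI and is not how MSTICPy implements its widgets); all names in it are hypothetical.

```python
class SelectWithAction:
    """Minimal stand-in for a selection widget with an action callback."""

    def __init__(self, items, action=None):
        self.items = list(items)
        self.action = action
        self.value = None

    def select(self, item):
        """Record the selection and pass it to the action, if one was given."""
        if item not in self.items:
            raise ValueError(f"{item!r} is not in the list")
        self.value = item
        return self.action(item) if self.action else None


picker = SelectWithAction(["alert-1", "alert-2"], action=lambda a: f"Showing {a}")
print(picker.select("alert-2"))
```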

Some widgets are registered, meaning that they can be re-used later and will remember previously entered values.<br>
To do this, simply create a widget with the same parameters as before.

In [None]:
mem_text = GetText(prompt="Enter your name")
mem_text

In [None]:
mem_text = GetText(prompt="Enter your name")
mem_text
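
One way to picture "registered" widgets is a registry keyed by the widget's type and parameters, so that creating a widget with the same parameters returns the earlier instance and its stored value. This is a sketch of the idea only; the names are hypothetical and MSTICPy's internal mechanism may differ.

```python
_registry = {}


class TextInput:
    """Stand-in for a text widget that holds a prompt and a value."""

    def __init__(self, prompt):
        self.prompt = prompt
        self.value = ""


def get_text(prompt):
    """Return the TextInput registered for this prompt, creating it if needed."""
    key = ("TextInput", prompt)
    if key not in _registry:
        _registry[key] = TextInput(prompt)
    return _registry[key]


first = get_text(prompt="Enter your name")
first.value = "Alice"

# Same parameters, so the earlier instance (and its value) is returned
second = get_text(prompt="Enter your name")
print(second.value)
```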

Other MSTICPy widgets include:
- A simple datetime based lookback slider `Lookback`
- A text box to capture user input `GetText`
- A widget to capture and return an Environment Variable `GetEnvironmentKey`
- A widget to select a subset of items from a list `SelectSubset`
- A widget to show progress of a task `Progress`
- Multi option buttons with a wait function that pauses cell execution until a user selects an option `OptionButtons`


More details on MSTICPy's widgets can be found here: https://msticpy.readthedocs.io/en/latest/visualization/NotebookWidgets.html

---
# Query providers revisited [Ian]

## Supported providers
- Microsoft Sentinel
- Microsoft Defender/Defender for Endpoint
- Splunk
- Sumologic
- Microsoft Graph
- Local data
- Mordor/Security Datasets
- Kusto/Azure Data Explorer
- Azure Resource Graph

In [None]:
from msticpy.data import QueryProvider
import pandas as pd

# Load query providers (typically you'll be using just one)
qry_prov_az = QueryProvider("AzureSentinel")
qry_prov_sp = QueryProvider("Splunk")
qry_prov_mde = QueryProvider("MDE")
# Special provider that uses local data files
qry_prov_loc = QueryProvider("LocalData", data_paths=["./data"], query_paths=["./data"])


## list_queries and running a query


In [None]:
qry_prov_az.list_queries()[:10]

In [None]:
qry_prov_az.Azure.list_aad_signins_for_account?


## Query time ranges


In [None]:
qry_prov_az.query_time

## Parameters


---
# Enrichment - Threat intelligence

## Generic providers [Ashwin] 


In [None]:
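# Create the Pivot object so that pivot functions are attached to entities
pivot = Pivot(namespace=globals())

# `destip` is not defined in this cell; it is assumed to be an IpAddress
# entity created earlier in the session
ti_results = destip.tilookup_ipv4()
TILookup.browse_results(ti_results)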

---
# Enrichment - Other enrichment [Pete] 

Another key feature of MSTICPy is the ability to enrich your core security log data with additional data sources that help provide additional information and context to a security analyst.<br>
There are a number of data enrichments available including:<br>
- GeoIP data to locate an IP Address
- WhoIs data to provide information on a domain owner
- Azure API data to provide additional data on Azure resources.

### IP Tools
MSTICPy contains a number of IP-related enrichments that are grouped under the IpAddress entity type.
GeoIP is a useful feature for enriching your data with information about the location of an IP address, providing context about whether the IP address should be considered suspicious or not.<br>
MSTICPy supports getting GeoIP data from two key sources: MaxMind's GeoLite2 service and the IPStack Geo service.

<p style="border: solid; padding: 5pt; color: white; background-color: Navy"><b>Note:</b>
Both of these GeoIP services require an API key - more details can be found in the <a href="https://msticpy.readthedocs.io/en/latest/data_acquisition/GeoIPLookups.html">MSTICPy documentation</a></p>



In [None]:
from msticpy.datamodel.pivot import Pivot
Pivot(namespace=globals())

from msticpy.datamodel.entities import IpAddress
IpAddress.util.geoloc(value="103.125.190.248")

MSTICPy also has IP tools to get WhoIs information on an IP address:

In [None]:
IpAddress.util.whois(value="103.125.190.248")

And to do a reverse DNS lookup:

In [None]:
IpAddress.util.ip_rev_resolve(value="103.125.190.248")
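
For background, a reverse DNS lookup asks for the PTR record of a special name derived from the address. The sketch below uses only the standard library to show that derived name; it performs no network query and does not represent how MSTICPy implements `ip_rev_resolve`. The helper name `ptr_name` is hypothetical.

```python
import ipaddress


def ptr_name(ip):
    """Return the DNS name whose PTR record a reverse lookup would query."""
    return ipaddress.ip_address(ip).reverse_pointer


print(ptr_name("103.125.190.248"))
```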

### Domain Tools
Similar enrichments exist for other common entity types such as domains (under the Dns entity type):

In [None]:
from msticpy.datamodel.entities import Dns
Dns.util.dns_resolve("www.contoso.com")

In [None]:
Dns.util.dns_in_abuse_list("www.contoso.com")

We can also fetch a screenshot of a target URL in order to give the analyst a visual representation of the site being investigated.
<p style="border: solid; padding: 5pt; color: white; background-color: Navy"><b>Note:</b>
Screenshots are enabled by the <a href="https://browshot.com/">Browshot</a> service</p>

In [40]:
from msticpy.sectools.domain_utils import screenshot
from IPython.display import display, Image

sshot = screenshot("www.contoso.com")

with open('screenshot.png', 'wb') as f:
    f.write(sshot.content)

display(Image(filename='screenshot.png'))

MsticpyUserConfigError: ('Browshot configuration not found', 'No configuration found for Browshot', 'Please add a section to msticpyconfig.yaml:', 'DataProviders:', '  Browshot:', '    Args:', '      AuthKey: {your_auth_key}', 'Ensure that the path to your msticpyconfig.yaml is specified with the MSTICPYCONFIG environment variable.', 'Or ensure that a copy of this file is in the current directory.', ('Configuring msticpy', 'https://msticpy.readthedocs.io/en/latest/getting_started/msticpyconfig.html'), ('Get an API key for Browshot', 'https://api.browshot.com/'))

In [None]:
from msticpy.sectools.geoip import GeoLiteLookup

iplocation = GeoLiteLookup()
loc_result, ip_entity = iplocation.lookup_ip(ip_address='90.156.201.97')
print('Raw result')
display(loc_result)

print('IP Address Entity')
display(ip_entity[0])

We can also look up multiple IP addresses at once by passing in a list of IP addresses.

In [None]:
ips = ["103.125.190.248", "173.232.207.214", "52.200.40.111"]

_, ip_entities = iplocation.lookup_ip(ip_addr_list=ips)
ents = [ip_ent.properties for ip_ent in ip_entities]
pd.DataFrame(ents)
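
Log data often mixes private and public addresses, and geolocation is generally only meaningful for public ones. Before passing a list to a lookup you may want to filter it first. The helper below is a hypothetical, standard-library-only sketch of such a filter and is not part of MSTICPy.

```python
import ipaddress


def public_ips(candidates):
    """Return unique, valid, globally routable IPs, in input order."""
    seen = []
    for item in candidates:
        try:
            addr = ipaddress.ip_address(item.strip())
        except ValueError:
            continue  # skip strings that are not IP addresses
        if addr.is_global and str(addr) not in seen:
            seen.append(str(addr))
    return seen


raw = ["10.0.3.5", "103.125.190.248", "not-an-ip", "103.125.190.248", "52.200.40.111"]
print(public_ips(raw))
```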

MSTICPy also has tools to support getting information such as the WhoIs record for an IP address:

In [39]:
from msticpy.sectools.ip_utils import get_whois_info
get_whois_info("103.125.190.248")

('VNPT-AS-VN VIETNAM POSTS AND TELECOMMUNICATIONS GROUP, VN',
 {'nir': None,
  'asn_registry': 'apnic',
  'asn': '135905',
  'asn_cidr': '103.125.188.0/22',
  'asn_country_code': 'VN',
  'asn_date': '2018-11-21',
  'asn_description': 'VNPT-AS-VN VIETNAM POSTS AND TELECOMMUNICATIONS GROUP, VN',
  'query': '103.125.190.248',
  'nets': [{'cidr': '103.125.188.0/22',
    'name': 'HYPERNET-VN',
    'handle': 'NDM8-AP',
    'range': '103.125.188.0 - 103.125.191.255',
    'description': 'Hypernet Vietnam Technology Company Limited\nXa Khuc, Chu Phan, Me Linh, Hanoi',
    'country': 'VN',
    'state': None,
    'city': None,
    'address': 'Ha Noi, VietNam',
    'postal_code': None,
    'emails': ['hm-changed@vnnic.vn',
     'ducmanhepu1@gmail.com',
     'admin@vietserver.vn'],
    'created': None,
    'updated': None},
   {'cidr': '103.125.188.0/22',
    'name': None,
    'handle': None,
    'range': '103.125.188.0 - 103.125.191.255',
    'description': 'HYPERNET-VN',
    'country': None,
    

## Whois 



## Azure/Azure Resource Graph  


---
# Pivot functions [Ian] 

Pivot functions are methods of entities that provide:
- data queries related to an entity
- enrichment functions relevant to that entity

Pivot functions are dynamically attached to entities. We created this
framework to make it easier to find which functions you can use for which entity type.

### Motivation
- We had built a lot of functionality in MSTICPy for querying and enrichment
- A lot of the functions had inconsistent type/parameter signatures
- There was no easy discovery mechanism for these functions - you had to already know they existed
- Using entities as pivot points is a "natural" investigation pattern
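
The idea of dynamically attaching functions to entities can be sketched in plain Python with `setattr`. Everything below is a stand-in written for illustration: the `IpAddress` class, the placeholder `whois` and `geoloc` functions and the `attach_pivots` helper are hypothetical and are not the MSTICPy implementation.

```python
class IpAddress:
    """Stand-in entity class; not the MSTICPy implementation."""


def whois(value):
    # Placeholder enrichment: a real function would query a WhoIs service
    return {"query": value, "source": "whois"}


def geoloc(value):
    # Placeholder enrichment: a real function would query a GeoIP database
    return {"query": value, "source": "geoloc"}


def attach_pivots(entity_cls, funcs):
    """Attach each function to the entity class as a static method."""
    for func in funcs:
        setattr(entity_cls, func.__name__, staticmethod(func))


attach_pivots(IpAddress, [whois, geoloc])

# The functions are now discoverable from the entity itself
print([name for name in dir(IpAddress) if not name.startswith("_")])
print(IpAddress.whois("38.75.137.9"))
```

Attaching functions this way means tab completion on the entity lists everything that applies to it, which addresses the discovery problem described in the motivation.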

## Access functionality from entities


In [None]:
pivot = Pivot(namespace=globals())

In [None]:
from msticpy.datamodel import entities

display(entities.IpAddress.whois("38.75.137.9"))
display(entities.IpAddress.geoloc("38.75.137.9"))

In [None]:
pivot.browse()

In [None]:
%%ioc --out ip_list
	SourceIP	DestinationIP	TotalBytesSent	nir	asn_registry	asn	asn_cidr	asn_country_code	asn_date	asn_description	query	nets	raw	referral	raw_referral
0	10.0.3.5	40.124.45.19	621	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
1	10.16.12.1	40.124.45.19	1004	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
2	10.4.5.12	13.71.172.130	247	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
3	10.4.5.12	40.77.232.95	189	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
4	10.4.5.16	13.71.172.130	46	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
5	10.4.5.16	65.55.44.109	120	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
6	10.90.78.142	104.43.212.12	12	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
7	10.90.78.71	104.43.212.12	4	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
8	20.185.182.48	38.75.137.9	8328	NaN	arin	8075

In [None]:
entities.IpAddress.whois(ip_list["ipv4"]) #, join="left")


## Creating pipelines



In [None]:
list(ip_list["ipv4"])[:4]

In [None]:
(
    entities.IpAddress.whois(list(ip_list["ipv4"])[:4], join="left")
    .mp_pivot.run(entities.IpAddress.geoloc, input_col="ip_column", join="left")
    .mp_pivot.run(entities.IpAddress.tilookup_ipv4, input_col="ip_column", join="left")
)
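
The `join="left"` chaining above can be pictured as a series of left joins: each step looks something up for one column and merges the result onto the row, keeping the row even if nothing is found. The sketch below shows that idea with plain dictionaries; `run_step` and the two fake lookup functions are hypothetical, and their data is made up for the example.

```python
def run_step(rows, func, input_col):
    """Left-join the output of func(row[input_col]) onto each row."""
    joined = []
    for row in rows:
        extra = func(row[input_col]) or {}
        # Left join: keep the row even when the lookup returns nothing
        joined.append({**row, **extra})
    return joined


def fake_whois(ip):
    # Hypothetical lookup table standing in for a real WhoIs query
    return {"38.75.137.9": {"asn": "example-asn"}}.get(ip)


def fake_geoloc(ip):
    # Hypothetical lookup table standing in for a real GeoIP query
    return {"38.75.137.9": {"country": "example-country"}}.get(ip)


rows = [{"ip": "38.75.137.9"}, {"ip": "10.0.3.5"}]
for step in (fake_whois, fake_geoloc):
    rows = run_step(rows, step, input_col="ip")
print(rows)
```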

# Advanced queries [Ian] 

## Creating Custom queries

```yaml
metadata:
  version: 1
  description: Linux Syslog Host Activity Queries
  data_environments: [LogAnalytics]
  data_families: [LinuxSyslog]
  tags: ['linux', 'syslog']
defaults:
  metadata:
    data_source: 'linux_syslog'
    pivot:
      direct_func_entities:
        - Host
  parameters:
      table:
        description: Table name
        type: str
        default: 'Syslog'
      start:
        description: Query start time
        type: datetime
      end:
        description: Query end time
        type: datetime
      add_query_items:
        description: Additional query clauses
        type: str
        default: ''
      host_name:
        description: Hostname to query for
        type: str
        default: ''
sources:
  user_group_activity:
    description: All user/group additions, deletions, and modifications
    args:
      query: '
        {table}
        | where TimeGenerated >= datetime({start})
        | where TimeGenerated <= datetime({end})
        | where Computer has "{host_name}"
        | where Facility == "authpriv"
        | extend UserGroupAction = iif((ProcessName == "groupadd" or ProcessName == "useradd") and (SyslogMessage contains "new group" or SyslogMessage contains "new user"), "Add",
                                    iif((ProcessName == "groupdel" or ProcessName == "userdel") and (SyslogMessage contains "delete user" or SyslogMessage matches regex "(removed group|removed$)"), "Delete",
                                    iif(ProcessName == "usermod" or ProcessName == "gpasswd", "Modify", "")
                                        )
                                    )
        | extend User=extract("(user: name=|user '')([[:alnum:]]+)",2,SyslogMessage), Group=extract("(group: name=|group '')([[:alnum:]]+)",2,SyslogMessage), UID=extract("UID=([0-9]+)",1,SyslogMessage), GID=extract("GID=([0-9]+)",1,SyslogMessage)
        | where UserGroupAction != ""
        {add_query_items}'
    parameters:
  all_syslog:
    description: Returns all syslog activity for a host
    args:
      query: '
         {table}
        | where TimeGenerated >= datetime({start})
        | where TimeGenerated <= datetime({end})
        | where Computer has "{host_name}"
        {add_query_items}'
    parameters:
```
Splunk
```yaml
list_all_alerts:
    description: Retrieves all configured alerts
    metadata:
      data_families: [Alerts]
    args:
      query: '
      | rest/servicesNS/-/search/saved/searches
      | search alert.track=1
      | fields title description search disabled triggered_alert_count actions action.script.filename alert.severity cron_schedule'
    parameters:
```

OData
```yaml
list_alerts_for_user:
    description: Retrieves list of alerts for a user account
    metadata:
      data_source: 'graph_alert'
    args:
      query: '{path}?$filter=createdDateTime ge {start}
        and createdDateTime le {end}
        and (userStates/any(d:tolower(d/userPrincipalName) eq tolower("{user_principal_name}")
        or userStates/any(d:tolower(d/accountName) eq tolower("{account_name}"))
        {add_query_items}'
      uri: None
```
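
The `{name}` placeholders in the query definitions above are filled in from the query parameters, with `defaults` used where the caller supplies nothing. A minimal sketch of that substitution using plain `str.format` follows; the template is a shortened version of the `all_syslog` query, `render_query` is a hypothetical helper, and MSTICPy's own parameter handling (for example datetime formatting) is more involved.

```python
DEFAULTS = {"table": "Syslog", "add_query_items": "", "host_name": ""}

QUERY_TEMPLATE = (
    "{table}\n"
    "| where TimeGenerated >= datetime({start})\n"
    "| where TimeGenerated <= datetime({end})\n"
    '| where Computer has "{host_name}"\n'
    "{add_query_items}"
)


def render_query(template, defaults, **params):
    """Fill the template with defaults overridden by supplied parameters."""
    values = {**defaults, **params}
    # start and end have no default in the YAML, so they must be supplied
    missing = [name for name in ("start", "end") if name not in values]
    if missing:
        raise ValueError(f"Missing required parameters: {missing}")
    return template.format(**values).strip()


print(
    render_query(
        QUERY_TEMPLATE,
        DEFAULTS,
        start="2021-11-17 16:00",
        end="2021-11-17 16:20",
        host_name="WORKSTATION6",
    )
)
```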

---
# Notebooklets - "Macros" for Notebooks [Ian] 

We built notebooklets because life is too short to keep writing (or copy/pasting) the same code over and over again.

The Notebooklets (MSTICNB) package wraps multiple notebook cells for common investigation routines into simple functions.

<a style="font-family: consolas; font-size:15pt"
 href="https://github.com/microsoft/msticnb">Repo: https://github.com/microsoft/msticnb</a>
<br>
<a style="font-family: consolas; font-size:15pt"
 href="https://msticnb.readthedocs.io/en/latest/">Docs: https://msticnb.readthedocs.io/</a>

<p style="font-family: consolas; color:green; font-size:15pt">$ pip install msticnb</p>


In [None]:
# Import and initialize MSTIC Notebooklets - companion package
import msticnb as nb
from msticpy.common.wsconfig import WorkspaceConfig

# Connect the query provider first, then hand it to the notebooklets
qry_prov_az.connect(WorkspaceConfig(workspace="CyberSecuritySoc"))
nb.init(query_provider=qry_prov_az)

nb.browse()

In [None]:
host_time = nbwidgets.QueryTime()
host_time

In [None]:
host_summary = nb.nblts.azsent.host.HostSummary()

host_summary_rslt = host_summary.run(value="WORKSTATION6", timespan=host_time)#, options=["-bookmarks", "-azure_api"])

In [None]:
host_summary_rslt.browse_alerts()

In [None]:
host_summary_rslt.host_entity.qry_wevt_processes(start="2021-11-17 16:00", end="2021-11-17 16:20").mp_plot.timeline(group_by="Account")

In [None]:
host_summary_rslt.host_entity.qry_wevt_processes(start="2021-11-17 16:09", end="2021-11-17 16:10").mp_plot.process_tree(legend_col="Account")

---
# Preview feature - VirusTotal and MP Preview [Ian]




---
# MSTICPy Community Sprint - Jan 2022

MSTICPy is always open to contributions from the community, whether this be fixes to the current code base, feature additions, or just new ideas and suggestions.
However, we know that contributing to an Open Source project can be a bit daunting, especially if it’s not something you have done before.

To help people with this we are running a Community Sprint during January 2022. 

During this sprint we are encouraging people to engage with and contribute to MSTICPy. Contributions can take any form but in order to make this as easy as possible for people we will be offering support and guidance during the month to help people work out where and how to contribute.
We will provide:
- A set of contribution ideas tailored to differing skill levels and time commitments
- Office Hours where you can come and ask questions and get help from the MSTICPy team
- Additional contribution resources and guidance
- Some awesome swag for people who contribute

Want to get involved? Keep an eye on the MSTICPy GitHub page for updates on the Community Sprint - https://github.com/microsoft/msticpy

