***MDTI API Playground***

This notebook is designed to help new users of the MDTI API become familiar with the available calls and the rationale for each.  Followed from start to finish, it walks you through a data gathering/scaling/correlation process you can build on the API.  This notebook is not an official Microsoft release and, as of this publishing, does not use the Python SDK.  It is in no way intended to show best practices for Python coding; any suggestions from the community are welcome.

This notebook is pre-seeded with indicators and if followed from top to bottom will guide you through an info gathering process.  The process is broken down into this sequence:

* Token access
* Reputation lookup
* Article review from reputation result
* Listing indicators from the article
* Review of the actor group from the article
* Listing the indicators from the group profile
* Review of the web components of multiple hosts
* Analyzing an array of hosts for anything running a C2 (Command and Control server)
* Pulling PDNS information and running the various domains through the reputation endpoint
* Listing Trackers found on the array of hosts from the previous block

This notebook currently does not use the reverseDNS endpoint and does not include unreleased endpoints as the API is still in public preview with more to come.  My hope is that this will familiarize you with the output from the API and allow you to hit the ground running for scaled investigations/data stacking.

If you haven't yet set up your application, go here: https://techcommunity.microsoft.com/t5/microsoft-defender-threat/what-s-new-apis-in-microsoft-graph/ba-p/3780350

I would suggest adding the base site to favorites: https://techcommunity.microsoft.com/t5/microsoft-defender-threat/bg-p/DefenderThreatIntelligence

In [None]:
# This notebook needs pandas, msal, and requests, installed via pip (below) or from source if you prefer.
# The client_secret.txt file should be in the folder the notebook runs from; otherwise, use its full path in the open() call in the Token Access block.

%pip install pandas msal requests

***Token Access***

For those new to the Microsoft Graph API: you are going to work with tokens, each of which provides an hour of access at a time.  Below we generate the token from the client secret (in this case read from a text file) and the client and tenant IDs from your environment.

In [None]:

import pandas as pd
import os
import requests
import json

# Read the client secret from a text file
with open("client_secret.txt", "r") as f:
    client_secret = f.read().strip()

# Set the client secret as an environment variable
os.environ["CLIENT_SECRET"] = client_secret

from msal import ConfidentialClientApplication

# Azure AD application credentials
client_id = ""
# If you are not using a secure string or key vault, you will need to un-comment the line below and add the secret there.
# client_secret = ""
tenant_id = ""


# Create a ConfidentialClientApplication object
app = ConfidentialClientApplication(
    client_id=client_id,
    client_credential=client_secret,
    authority=f"https://login.microsoftonline.com/{tenant_id}",
)

# Get a token from Azure AD
result = None
scopes = ["https://graph.microsoft.com/.default"]
result = app.acquire_token_silent(scopes=scopes, account=None)

if not result:
    result = app.acquire_token_for_client(scopes=scopes)

# Get the access token
access_token = result["access_token"]

# Print the access token
print("Access Token:", access_token)
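
The token request above assumes success. As a minimal sketch, the helpers below check the dictionary MSAL returns before the token is used. They rely on the keys MSAL places in that dictionary: `access_token` and `expires_in` on success, `error` and `error_description` on failure. The function names `extract_access_token` and `token_lifetime_minutes` are hypothetical and not part of MSAL.

```python
def extract_access_token(result):
    """Return the access token from an MSAL result dict, or raise with the reported error."""
    if not result:
        raise RuntimeError("No token result returned")
    if "access_token" not in result:
        raise RuntimeError(
            f"Token request failed: {result.get('error')}: {result.get('error_description')}"
        )
    return result["access_token"]


def token_lifetime_minutes(result):
    """Token lifetime in whole minutes, from the 'expires_in' seconds value (0 if absent)."""
    return int(result.get("expires_in", 0)) // 60
```

In the cell above, `access_token = extract_access_token(result)` could replace the direct dictionary lookup, so a wrong secret or tenant ID surfaces as a readable error rather than a `KeyError`.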

***Reputation***

This block accepts domains/IPs in an array and passes them through to the reputation endpoint for analysis.  The output is built from two separate JSON normalizations.  Displaying the top-level response on its own leaves the rules as nested JSON, which does not display nicely, so the rules are normalized into a second dataframe and the two are shown together for an easier viewing experience.  The example below uses two IOCs which will provide some additional paths for us to take.

You can use domains, IPs, or a mix of both within the array, which keeps reputation lookups simple.

In [None]:
pd.set_option('display.max_colwidth', 0)

suspect_iocs = ["162.33.178.162", "vanguard.om"]
headers = {
    "Authorization": f"Bearer {access_token}",
    "Content-Type": "application/json"
}


for ip in suspect_iocs:
    services = f"https://graph.microsoft.com/beta/security/threatIntelligence/hosts/{ip}/reputation"
    responseS = requests.get(services, headers=headers)
    dataS = responseS.json()
    dataframeS = pd.json_normalize(dataS)
    dataframeSi = pd.json_normalize(dataS['rules'])
    display(dataframeS[['@odata.context','score', 'classification']], dataframeSi)
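
The two dataframes above can also be merged into one table. The sketch below uses only the fields the block above already reads (`score`, `classification`, `rules`) and copies the host-level values onto each rule. The function name `reputation_rows` and the sample payload are made up for illustration; the rule contents in the sample are placeholders, not a statement of what the API returns.

```python
def reputation_rows(host, payload):
    """Build one flat row per reputation rule, repeating the host-level fields on each row."""
    base = {
        "host": host,
        "score": payload.get("score"),
        "classification": payload.get("classification"),
    }
    rules = payload.get("rules") or []
    if not rules:
        # Keep a row for hosts with no rules so they are not lost from the table
        return [base]
    return [{**base, **rule} for rule in rules]


# Hypothetical payload shaped like the response used above
sample = {
    "score": 100,
    "classification": "malicious",
    "rules": [{"name": "rule-a"}, {"name": "rule-b"}],
}
rows = reputation_rows("198.51.100.7", sample)
```

Collecting the rows for every IOC into one list and passing it to `pd.DataFrame(...)` would give a single table to sort or filter, instead of one pair of dataframes per IOC.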



**Articles**

The reputation result above references an article written to provide more context around the IOC we're looking up.  The block below grabs the article content for faster review, and the block after it extracts the IOCs which are attributed to this campaign.

In [None]:
headers = {
    "Authorization": f"Bearer {access_token}",
    "Content-Type": "application/json"
}

article = '223703a9'

article_lookup = f"https://graph.microsoft.com/beta/security/threatIntelligence/articles/{article}"
responseS = requests.get(article_lookup, headers=headers)
dataA = responseS.json()
dataframeA = pd.json_normalize(dataA)
display(dataframeA)

**Article Indicators**

Following the thread for Volt Typhoon, we can now extract the indicators observed to be in use by the group.

This block pulls the indicators and their types from an article.  The result could be passed to an array or, for example, uploaded into Sentinel's TI blade for further investigation.  The return is clipped down to two columns; you can explore all of the columns by removing the [['artifact.id', 'artifact.kind']] on the last line.

In [None]:
headers = {
    "Authorization": f"Bearer {access_token}",
    "Content-Type": "application/json"
}

article = '223703a9'

article_IOC = f"https://graph.microsoft.com/beta/security/threatIntelligence/articles/{article}/indicators"
responseS = requests.get(article_IOC, headers=headers)
dataA = responseS.json()
dataframeA = pd.json_normalize(dataA['value'])
display(dataframeA[['artifact.id', 'artifact.kind']])
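
Before passing the indicators to an array, it can help to bucket them by type. The sketch below works on the raw `dataA['value']` list and reads a nested `artifact` object with `id` and `kind` keys, matching the `artifact.id` and `artifact.kind` columns displayed above. The function name `indicators_by_kind` and the sample values (including the kind names) are hypothetical.

```python
from collections import defaultdict


def indicators_by_kind(records):
    """Group artifact IDs by artifact kind, keeping first-seen order and dropping duplicates."""
    grouped = defaultdict(list)
    for record in records:
        artifact = record.get("artifact") or {}
        kind, value = artifact.get("kind"), artifact.get("id")
        if kind and value and value not in grouped[kind]:
            grouped[kind].append(value)
    return dict(grouped)


# Hypothetical records shaped like the entries in dataA['value']
sample = [
    {"artifact": {"id": "192.0.2.10", "kind": "ip"}},
    {"artifact": {"id": "example.com", "kind": "hostname"}},
    {"artifact": {"id": "192.0.2.10", "kind": "ip"}},
]
grouped = indicators_by_kind(sample)
```

Each list in the result can then be dropped straight into an array such as `suspect_iocs` in the later blocks.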

***Group Info***

Alongside the indicators in an article are indicators and information for the groups themselves.  The next two blocks will help pull that info for review.

In [None]:
pd.set_option('display.max_colwidth', 0)
group_name = "Volt Typhoon"

group_lookup = f"https://graph.microsoft.com/beta/security/threatIntelligence/intelProfiles?$search={group_name}"
responseGr = requests.get(group_lookup, headers=headers)
dataGr = responseGr.json()
dataframeGr = pd.json_normalize(dataGr['value'])
#dataframeGr = dataframeGr.drop(['id'], axis=1)
display(dataframeGr)

***Group IOCs***

This will allow you to list the IOCs from the group profile, which are not included in the group lookup above.

In [None]:
group_id = "8fe93ebfb3a03fb94a92ac80847790f1d6cfa08f57b2bcebfad328a5c3e762cb"

groupIOC_lookup = f"https://graph.microsoft.com/beta/security/threatIntelligence/intelProfiles/{group_id}/indicators"
responseGroupIOC = requests.get(groupIOC_lookup, headers=headers)
dataGroupIOC = responseGroupIOC.json()
dataframeGroupIOC = pd.json_normalize(dataGroupIOC['value'])
#dataframeGroupIOC = dataframeGroupIOC.drop(['id', 'artifact.@odata.type'], axis=1)     
display(dataframeGroupIOC)
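
Microsoft Graph list responses can be split across pages, with an `@odata.nextLink` URL pointing at the next page, so a single request may not return every indicator. A minimal sketch for following those links is below. The function name `collect_values` and its `fetch` parameter are hypothetical: `fetch` is any callable that takes a URL and returns the parsed JSON, which keeps the paging logic separate from `requests`.

```python
def collect_values(fetch, url, max_pages=50):
    """Follow @odata.nextLink from page to page and return every item found in 'value'."""
    items = []
    pages = 0
    while url and pages < max_pages:
        page = fetch(url)
        items.extend(page.get("value", []))
        # The next-page link is absent on the last page, which ends the loop
        url = page.get("@odata.nextLink")
        pages += 1
    return items
```

With the variables from the block above, this could be called as `collect_values(lambda u: requests.get(u, headers=headers).json(), groupIOC_lookup)` and the result passed to `pd.json_normalize`. The `max_pages` cap is only a guard against runaway loops.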

**Web Components**

Web Components are vital to investigations: they allow us to map the services in use on adversary infrastructure.  This block takes an array of domains/IPs and returns a cleaned-up list of the components observed on each.  The .pop call drops the id column from the returned information, which isn't relevant to your investigation.

In [None]:
pd.set_option('display.max_colwidth', 0)

suspect_iocs = ["64.52.80.63","5.252.177.180","193.149.189.224","64.190.113.172","64.52.80.209","65.109.31.190","45.128.156.46","84.252.94.184","192.240.116.106","5.255.100.206"]
headers = {
    "Authorization": f"Bearer {access_token}",
    "Content-Type": "application/json"
}
for ip in suspect_iocs:

    web_components = f"https://graph.microsoft.com/beta/security/threatIntelligence/hosts/{ip}/components"
    responseS = requests.get(web_components, headers=headers)
    dataP = responseS.json()
    dataframeP = pd.json_normalize(dataP['value'])
    dataframeP.pop('id')
    display(dataframeP)
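
The loop above indexes `dataP['value']` directly, which raises a `KeyError` when a host returns an error instead of results. Graph reports failures as a JSON body with an `error` object holding `code` and `message`. The sketch below separates the two cases; the function name `values_or_error` and the sample payloads are hypothetical.

```python
def values_or_error(payload):
    """Return (items, error_text); error_text is None when the call succeeded."""
    error = payload.get("error")
    if error:
        return [], f"{error.get('code')}: {error.get('message')}"
    return payload.get("value", []), None
```

Inside the loop, `items, err = values_or_error(dataP)` lets you print `err` and `continue` for a failed host. Skipping hosts with an empty `items` list also avoids the `KeyError` that `dataframeP.pop('id')` raises on an empty dataframe.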

***C2 Check***

This code block checks for the presence of the Web Component category "Command and Control Server".  It could easily be customized to look for anything of interest to the investigation; this is simply an example of a category-based search.

In [None]:
pd.set_option('display.max_colwidth', 0)

suspect_iocs = ["64.52.80.63","5.252.177.180","193.149.189.224","64.190.113.172","64.52.80.209","65.109.31.190","45.128.156.46","84.252.94.184","192.240.116.106","5.255.100.206"]
headers = {
    "Authorization": f"Bearer {access_token}",
    "Content-Type": "application/json"
}
for ip in suspect_iocs:

    web_components = f"https://graph.microsoft.com/beta/security/threatIntelligence/hosts/{ip}/components"
    responseS = requests.get(web_components, headers=headers)
    dataP = responseS.json()
    dataframeP = pd.json_normalize(dataP['value'])
    dataframeP.pop('id')
    cnc = (dataframeP[(dataframeP['category'] == 'Command and Control Server')])
    if not cnc.empty:
        display(cnc)
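
One way to customize the category search is to match against a list of categories rather than a single string. The sketch below filters the raw `dataP['value']` records using the same `category` field as the block above. The function name `components_in_categories` is hypothetical, and every sample value other than "Command and Control Server" is invented for illustration.

```python
def components_in_categories(records, categories):
    """Return component records whose 'category' matches any wanted category (case-insensitive)."""
    wanted = {c.lower() for c in categories}
    return [r for r in records if str(r.get("category", "")).lower() in wanted]


# Hypothetical component records shaped like the entries in dataP['value']
sample = [
    {"name": "example-c2", "category": "Command and Control Server"},
    {"name": "example-web", "category": "Server"},
    {"name": "example-other"},
]
hits = components_in_categories(sample, ["command and control server"])
```

Passing several categories at once lets a single pass over the hosts cover every category you care about.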
    



**PDNS Info** 

This snippet pulls PDNS hostnames for domains/IPs and runs them through the reputation endpoint to learn what MDTI knows about them.  The commented lines also retrieve and display the justification for the scoring.  You can uncomment them to add that information, but be warned that it could return quite a lot of data if your array is large or an IP has many associated domains.

In [None]:
pd.set_option('display.max_colwidth', 0)

pdns_lookup = ["64.52.80.63","5.252.177.180","193.149.189.224","64.190.113.172"]

headers = {
    "Authorization": f"Bearer {access_token}",
    "Content-Type": "application/json"
}
for ip in pdns_lookup:

    pdns = f"https://graph.microsoft.com/beta/security/threatIntelligence/hosts/{ip}/passivedns"
    responsePDNS = requests.get(pdns, headers=headers)
    dataP = responsePDNS.json()
    dataframeP = pd.json_normalize(dataP['value'])
    url = dataframeP['artifact.id'].values
    #url = dataframeP.drop(['id', 'firstSeenDateTime', 'lastSeenDateTime', 'collectedDateTime', 'recordType', 'parentHost.id', 'artifact.@odata.type'], axis=1).values
    

    for url1 in url:
        servicesURL = f"https://graph.microsoft.com/beta/security/threatIntelligence/hosts/{url1}/reputation"
        responseURL = requests.get(servicesURL, headers=headers)
        dataS = responseURL.json()
        dataframeS = pd.json_normalize(dataS)
        #dataframeSi = pd.json_normalize(dataS['rules'])
        display(dataframeS[['@odata.context','score', 'classification']])
        #display(dataframeS[['@odata.context','score', 'classification']], dataframeSi)
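
PDNS results can list the same hostname more than once, for one IP or across several, and every repeat costs another reputation call. The sketch below removes duplicates from the raw `dataP['value']` records before the inner loop runs. It reads the nested `artifact` object's `id`, matching the `artifact.id` column used above; the function name `unique_hostnames` and the sample hostnames are hypothetical.

```python
def unique_hostnames(records, seen=None):
    """Return artifact IDs not seen before, in first-seen order; 'seen' is updated in place."""
    seen = set() if seen is None else seen
    fresh = []
    for record in records:
        name = (record.get("artifact") or {}).get("id")
        if name and name not in seen:
            seen.add(name)
            fresh.append(name)
    return fresh


# Hypothetical PDNS records for two different IPs, sharing one 'seen' set
seen = set()
first = unique_hostnames(
    [{"artifact": {"id": "a.example"}}, {"artifact": {"id": "a.example"}}, {"artifact": {"id": "b.example"}}],
    seen,
)
second = unique_hostnames([{"artifact": {"id": "b.example"}}, {"artifact": {"id": "c.example"}}], seen)
```

Creating `seen` once before the outer loop and passing it in on every iteration means each hostname is only sent to the reputation endpoint the first time it appears.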

***Trackers***

Trackers are embedded in web pages and can be Google Analytics IDs, JARM hashes, Facebook IDs, etc.  An example of where this is helpful is determining if a site may have rogue JavaScript injected, or if a group of sites is using a custom JavaScript which can link the actor's infrastructure together.

In [None]:
pd.set_option('display.max_colwidth', 0)

for ip in pdns_lookup:
    trackers = f"https://graph.microsoft.com/beta/security/threatIntelligence/hosts/{ip}/trackers?count=true"
    responseT = requests.get(trackers, headers=headers)
    dataT = responseT.json()
    dataframeT = pd.DataFrame(dataT['value'])
    #dataframeT = dataframeT.drop(['id'], axis=1)     
    display(dataframeT)
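
To act on the idea of linking infrastructure, you can look for trackers that appear on more than one host. The sketch below assumes each tracker record carries `kind` and `value` fields; check that against the columns displayed above before relying on it. The function name `shared_trackers` and the sample data are hypothetical.

```python
from collections import defaultdict


def shared_trackers(trackers_by_host):
    """Map (kind, value) -> sorted hosts, keeping only trackers seen on two or more hosts."""
    hosts_for = defaultdict(set)
    for host, records in trackers_by_host.items():
        for record in records:
            key = (record.get("kind"), record.get("value"))
            if key[1] is not None:
                hosts_for[key].add(host)
    return {key: sorted(hosts) for key, hosts in hosts_for.items() if len(hosts) > 1}


# Hypothetical tracker records keyed by the host they were observed on
sample = {
    "192.0.2.1": [{"kind": "JarmHash", "value": "abc123"}, {"kind": "GoogleAnalyticsId", "value": "UA-1"}],
    "192.0.2.2": [{"kind": "JarmHash", "value": "abc123"}],
    "192.0.2.3": [],
}
linked = shared_trackers(sample)
```

Building the input is one extra line in the loop above, for example `trackers_by_host[ip] = dataT['value']` with `trackers_by_host = {}` defined before the loop.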