# Unusual Account Activity
 <details>
     <summary>&nbsp;<u>Notebook Details...</u></summary>

 **Notebook Version:** 2.0<br>
 **Python Version:** Python 3.8+<br>
 **Required Packages**: msticpy, msticnb<br>

 **Data Sources Required**:
 - Sentinel - SecurityAlert, SecurityEvent, HuntingBookmark, Syslog, AAD SigninLogs, AzureActivity, OfficeActivity, ThreatIndicator
 - (Optional) - VirusTotal, AlienVault OTX, IBM XForce, Open Page Rank, (all require accounts and API keys)
 </details>


# Notebook Overview

The notebook uses Azure Active Directory risk assignment to identify
potentially risky user sign-ins. 

For each high-risk sign-in that was not later marked as safe or mitigated,
additional data about that user account is collected and uploaded to an MS Sentinel
Dynamic Summary.

## Time ranges for the notebook

The investigation time range is the previous 2 days using the notebook
run time as the origin time.
This can be overridden by notebook parameters.
The default baseline period is the 28 days prior to the investigation
time range.
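
As a minimal sketch of how these defaults translate into concrete time windows, the snippet below derives the investigation and baseline boundaries from a run time. The helper name `investigation_windows` and its argument names are hypothetical and are not part of the notebook.

```python
from datetime import datetime, timedelta, timezone


def investigation_windows(run_time, days=2, baseline_days=28):
    """Return (baseline_start, start, end) for a given notebook run time."""
    end = run_time
    # Investigation period: the `days` before the run time
    start = end - timedelta(days=days)
    # Baseline period: the `baseline_days` before the investigation period
    baseline_start = start - timedelta(days=baseline_days)
    return baseline_start, start, end


run_time = datetime(2023, 3, 22, 12, 0, tzinfo=timezone.utc)
baseline_start, start, end = investigation_windows(run_time)
print(baseline_start.date(), start.date(), end.date())
```

The baseline window ends where the investigation window starts, so the two periods do not overlap.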

## Notebook Contents

1. Notebook initialization and Connection
2. Get risk-flagged sign-ins (for primary period)
3. Get login risk level for baseline period
4. Retrieve and Run UEBA hunting queries on risk-flagged users 
5. Get related alerts for users and user IPs
6. Get Threat Intelligence reports for sign-in IPs
7. Look for unusual Azure Audit entries
8. Look for unusual Office 365 activity
9. Look for unusual Azure activity
10. Summarize and upload data


## Output (dynamic summary):
- Dynamic Summary item for each user, with the additional data collected from step 3 onward in the contents list above.


## Notebook parameters

- `ws_name`: str - The MS Sentinel workspace name to query, default is "Default"
- `start`: datetime/datetime string - the start time of the investigation period
- `end`: datetime/datetime string - the end time of the investigation period
- `baseline_period`: int (days) - the number of days before `start` to
  use for a baseline (comparison of current with previous behavior)
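
Because `start` and `end` may arrive either as datetimes or as datetime strings, it can help to normalize them before use. The following is only a sketch: `to_utc_datetime` is a hypothetical helper, it assumes ISO 8601 strings, and it assumes that values without a time zone are meant as UTC.

```python
from datetime import datetime, timezone
from typing import Union


def to_utc_datetime(value: Union[str, datetime]) -> datetime:
    """Convert a datetime or ISO 8601 string to a timezone-aware UTC datetime."""
    if isinstance(value, str):
        # Note: on Python 3.8 fromisoformat does not accept a trailing "Z"
        value = datetime.fromisoformat(value)
    if value.tzinfo is None:
        # Assumption: naive values are treated as UTC
        value = value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc)


start = to_utc_datetime("2023-03-20T08:00:00")
end = to_utc_datetime("2023-03-22T08:00:00+02:00")
print(start.isoformat(), end.isoformat())
```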


---
# 1. Notebook initialization
This should complete without errors. If you encounter errors or warnings look at the following notebooks:

- <a href="https://github.com/Azure/Azure-Sentinel-Notebooks/blob/master/A%20Getting%20Started%20Guide%20For%20Azure%20Sentinel%20ML%20Notebooks.ipynb">Getting Started Notebook</a>
- [TroubleShootingNotebooks](https://github.com/Azure/Azure-Sentinel-Notebooks/blob/master/TroubleShootingNotebooks.ipynb)
- [ConfiguringNotebookEnvironment](https://github.com/Azure/Azure-Sentinel-Notebooks/blob/master/ConfiguringNotebookEnvironment.ipynb)

<details>
    <summary>&nbsp;<u>Details...</u></summary>
The next cell:
- Checks for the correct Python version
- Checks versions and optionally installs required packages
- Imports the required packages into the notebook
- Sets a number of configuration options.

If you are running in the Azure Sentinel Notebooks environment (Azure Notebooks or Azure ML) you can run live versions of these notebooks:
- [Getting Started](./A Getting Started Guide For Azure Sentinel ML Notebooks.ipynb)
- [Run TroubleShootingNotebooks](./TroubleShootingNotebooks.ipynb)
- [Run ConfiguringNotebookEnvironment](./ConfiguringNotebookEnvironment.ipynb)

You may also need to do some additional configuration to successfully use functions such as Threat Intelligence service lookup and Geo IP lookup. 
There are more details about this in the `ConfiguringNotebookEnvironment` notebook and in these documents:
- [msticpy configuration](https://msticpy.readthedocs.io/en/latest/getting_started/msticpyconfig.html)
- [Threat intelligence provider configuration](https://msticpy.readthedocs.io/en/latest/data_acquisition/TIProviders.html#configuration-file)
</details>
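
For illustration only, a Python version check of the kind described above could look like the sketch below. `python_version_ok` is a hypothetical helper and is not how `msticpy` implements its own check.

```python
import sys
from typing import Optional, Tuple


def python_version_ok(required: str, current: Optional[Tuple[int, ...]] = None) -> bool:
    """Return True if `current` (default: the running interpreter) meets `required`."""
    required_ver = tuple(int(part) for part in required.split("."))
    current_ver = tuple(current) if current is not None else tuple(sys.version_info[:3])
    # Compare only as many components as the requirement specifies
    return current_ver[: len(required_ver)] >= required_ver


print(python_version_ok("3.8"))
```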

In [1]:
from datetime import datetime, timedelta, timezone

REQ_PYTHON_VER = "3.8"
REQ_MSTICPY_VER = "2.3.0"

# %pip install --upgrade msticpy

import msticpy as mp
mp.init_notebook()


In [2]:
# papermill default parameters
ws_name = "Default"
end = datetime.now(timezone.utc)
start = end - timedelta(days=2)
baseline_period = 28
run_date = end

### Get Workspace and Authenticate

<details>
    <summary><u>Authentication help...</u></summary>
    If you want to use a workspace other than one you have defined in your<br>
msticpyconfig.yaml create a connection string with your AAD TENANT_ID and<br>
your WORKSPACE_ID (these should both be quoted UUID strings).

```python
  workspace_cs = "loganalytics://code().tenant('TENANT_ID').workspace('WORKSPACE_ID')"
```
e.g.
```python
  workspace_cs = "loganalytics://code().tenant('c3de0f06-dcb8-40fb-9d1a-b62faea29d9d').workspace('c62d3dc5-11e6-4e29-aa67-eac88d5e6cf6')"
```
Then in the Authentication cell replace
the call to `qry_prov.connect` with the following:
```python
  qry_prov.connect(connect_str=workspace_cs)
```
The cell should now look like this:

```python
...
  # Authentication
  qry_prov = QueryProvider(data_environment="MSSentinel")
  qry_prov.connect(connect_str=workspace_cs)
...
```

On successful authentication you should see a ```popup schema``` button.
To find your Workspace Id go to [Log Analytics](https://ms.portal.azure.com/#blade/HubsExtension/Resources/resourceType/Microsoft.OperationalInsights%2Fworkspaces). Look at the workspace properties to find the ID.
</details>
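
If you build the connection string in code, a small helper can catch malformed IDs early. This is a sketch under the assumption that both IDs are standard UUID strings; `build_workspace_cs` is a hypothetical name, and the IDs are the example values from the help text above.

```python
import uuid


def build_workspace_cs(tenant_id: str, workspace_id: str) -> str:
    """Build a Log Analytics connection string from tenant and workspace IDs."""
    # uuid.UUID raises ValueError if the value is not a valid UUID string
    tenant = str(uuid.UUID(tenant_id))
    workspace = str(uuid.UUID(workspace_id))
    return f"loganalytics://code().tenant('{tenant}').workspace('{workspace}')"


workspace_cs = build_workspace_cs(
    "c3de0f06-dcb8-40fb-9d1a-b62faea29d9d",
    "c62d3dc5-11e6-4e29-aa67-eac88d5e6cf6",
)
print(workspace_cs)
```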

In [3]:
import ipywidgets as widgets

# Workspaces defined in msticpyconfig.yaml (msticpy is imported as `mp` above)
configured_workspaces = list(mp.settings.get_config("AzureSentinel.Workspaces").keys())
print("Configured workspaces: ", ", ".join(configured_workspaces))
ws_param = widgets.Combobox(
    description="Workspace Name",
    value=ws_name,
    options=configured_workspaces,
)
ws_param

Configured workspaces:  ASIHuntOMSWorkspaceV4, CCIS, Centrica, CyberSecuritySoc, Default, GovCyberSecuritySOC, NationalGrid, RedmondSentinelDemoEnvironment


Combobox(value='Default', description='Workspace Name', options=('ASIHuntOMSWorkspaceV4', 'CCIS', 'Centrica', …

In [4]:
from msticpy.common.timespan import TimeSpan
from msticpy.context.azure import MicrosoftSentinel

# Authentication
qry_prov = mp.QueryProvider(data_environment="MSSentinel")
qry_prov.connect(workspace=ws_param.value)

sentinel = MicrosoftSentinel(workspace=ws_param.value, connect=True)

nb_timespan = TimeSpan(start, end)
qry_prov.query_time.timespan = nb_timespan
md("<hr>")
md("Confirm time range to search", "large, bold")
qry_prov.query_time

Please wait. Loading Kqlmagic extension...done
Connecting... 

connected


VBox(children=(HTML(value='<h4>Set query time boundaries</h4>'), HBox(children=(DatePicker(value=datetime.date…

### Function and class definitions used by the notebook

In [5]:
import re
import urllib.parse
from collections import namedtuple, defaultdict
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, NamedTuple, Optional

import httpx
import pandas as pd
import yaml
from tqdm.auto import tqdm

from msticpy.context.azure.sentinel_dynamic_summary import DynamicSummary, DynamicSummaryItem


# Summary report classes
class SummaryItem(NamedTuple):
    """Data report collection for summary."""
    key: str
    data: pd.DataFrame
    properties: Dict[str, Any]


class SummaryReport:
    """Class to hold summary reports during exec of notebook."""
    def __init__(self):
        self.summary_reports: Dict[str, Dict[str, SummaryItem]] = defaultdict(dict)

    def add_summary_data(self, data: pd.DataFrame, user_column: str, section: str, **kwargs):
        """Add data for users to the summary report"""
        for user, user_data in data.groupby(user_column):
            summary = SummaryItem(
                key=user,
                data=user_data,
                properties=kwargs
            )
            self.summary_reports[user.casefold()][section] = summary

    @property
    def users(self):
        return sorted(self.summary_reports)

    @property
    def report_types(self):
        return sorted({
            report for user_reports in self.summary_reports.values()
            for report in user_reports
        })


summary_report = SummaryReport()


# DF display function
def df_caption(data: pd.DataFrame, caption: str):
    """Display dataframe with a caption."""
    caption_css = "; ".join([
        "caption-side: top",
        "text-align: left",
        "font-size: 15pt",
        "font-weight: bold",
        "padding: 5pt",
    ])
    display(
        data.style.set_caption(f"{caption}").set_table_styles(
            [
                {
                    "selector": "caption",
                    "props": caption_css,
                }
            ]
        )
    )


def get_user_param(data: pd.DataFrame) -> str:
    """Return user names from DataFrame as comma-sep string."""
    return  ",".join([
        f"'{user}'" for user
        in data.UserPrincipalName.values
    ])


# update any changes to start/end datetimes
start = qry_prov.query_time.start
end = qry_prov.query_time.end

# 2. Get risk-flagged sign-ins

This query retrieves user sign-ins that Azure Identity Protection has flagged
as at risk. See [Azure Identity Protection](https://learn.microsoft.com/azure/active-directory/identity-protection/overview-identity-protection)
for more background.
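
The scoring and filtering idea can be sketched in pandas on a few made-up rows. The sample users and values below are invented for illustration; the score mapping (high = 5, medium = 3, low = 1, otherwise 0) and the mitigated states (`remediated`, `confirmedSafe`) mirror the ones used in the notebook's query cell.

```python
import pandas as pd

RISK_SCORES = {"high": 5, "medium": 3, "low": 1}
MITIGATED_STATES = ["remediated", "confirmedSafe"]

# Invented sample data standing in for SigninLogs rows
signins = pd.DataFrame({
    "UserPrincipalName": ["a@contoso.com", "a@contoso.com", "b@contoso.com"],
    "RiskLevelAggregated": ["high", "low", "medium"],
    "RiskState": ["atRisk", "atRisk", "remediated"],
})
# Map the risk level to a numeric score; unknown levels score 0
signins["AggRisk"] = signins["RiskLevelAggregated"].map(RISK_SCORES).fillna(0)

summary = (
    signins.groupby("UserPrincipalName")
    .agg(SignIns=("AggRisk", "count"), MeanAggRisk=("AggRisk", "mean"))
    .reset_index()
)

# Users with at least one sign-in in a mitigated state
safe_users = signins.loc[
    signins["RiskState"].isin(MITIGATED_STATES), "UserPrincipalName"
].unique()
unmitigated = summary[~summary["UserPrincipalName"].isin(safe_users)]
print(unmitigated)
```

A user counts as mitigated if any of their sign-ins is in a mitigated state, which matches the notebook's handling of the `RiskStates` set.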

In [6]:
signing_risk_query = """
SigninLogs
| where TimeGenerated between (datetime({start}) .. datetime({end}))
| where RiskState != "none"
| project UserPrincipalName, ResultDescription, RiskState, RiskDetail, RiskEventTypes,
  RiskEventTypes_V2, RiskLevelAggregated, RiskLevelDuringSignIn, IPAddress
| extend SigninRisk = case(
        RiskLevelDuringSignIn == "high", 5,
        RiskLevelDuringSignIn == "medium", 3,
        RiskLevelDuringSignIn == "low", 1,
        0
    ),
    AggRisk = case(
        RiskLevelAggregated == "high", 5,
        RiskLevelAggregated == "medium", 3,
        RiskLevelAggregated == "low", 1,
        0
    )
| extend RiskEventDyn = parse_json(RiskEventTypes), RiskEventV2Dyn = parse_json(RiskEventTypes_V2)
| mv-expand RiskEventDyn, RiskEventV2Dyn
| summarize SignIns=count(AggRisk), MeanAggRisk=avg(AggRisk), MeanSigninRisk=avg(SigninRisk), 
  RiskStates=make_set(RiskState), RiskEvents=make_set(RiskEventDyn), RiskEventsV2=make_set(RiskEventV2Dyn),
  SourceIPs=make_set(IPAddress)
  by UserPrincipalName
| order by MeanAggRisk, MeanSigninRisk asc nulls last
"""

# run the query
signin_risk_users_df = qry_prov.exec_query(
    signing_risk_query.format(start=start, end=end)
)
# expand RiskStates (list)
risk_states_df = signin_risk_users_df.explode("RiskStates")
# Extract list of users where risk was mitigated 
safe_users_df = risk_states_df[risk_states_df["RiskStates"].isin(["remediated", "confirmedSafe"])].UserPrincipalName.drop_duplicates()

# Separate unmitigated from mitigated risk users
risk_users_df = signin_risk_users_df[~signin_risk_users_df["UserPrincipalName"].isin(safe_users_df)]
mitigated_users_df = signin_risk_users_df[signin_risk_users_df["UserPrincipalName"].isin(safe_users_df)]

df_caption(risk_users_df[["UserPrincipalName"]], "Unmitigated risk users")
df_caption(mitigated_users_df[["UserPrincipalName"]], "Mitigated risk users")

Unnamed: 0,UserPrincipalName
0,jank@seccxpninja.onmicrosoft.com
2,aguruswamy@contosohotels.com
3,suzanac@contosohotels.com
4,asekstee@microsoft.com
5,takuyaot@microsoft.com
6,ragomeri@microsoft.com
7,elsherif@microsoft.com
8,aweinkopf@microsoft.com
9,rickkotlarz@microsoft.com
10,aauvinen@microsoft.com


Unnamed: 0,UserPrincipalName
1,pdemo@seccxpninja.onmicrosoft.com
13,adm_pwatkins@seccxpninja.onmicrosoft.com


# 3. Retrieve login risk level for baseline period

This is used to distinguish accounts that have a new "At Risk"
designation from accounts that have a history of risky sign-ins.

> Note: The period used is the `baseline_period` parameter for the notebook - default is 28 days
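
The new-versus-existing distinction comes down to a membership test against the baseline data. A minimal sketch with invented sample users (the `RiskHistory` column name matches the one used in the notebook):

```python
import numpy as np
import pandas as pd

# Invented sample data: users flagged in the investigation period
current = pd.DataFrame({"UserPrincipalName": ["a@contoso.com", "b@contoso.com"]})
# Invented sample data: risk-flagged sign-ins found in the baseline period
baseline = pd.DataFrame({"UserPrincipalName": ["b@contoso.com", "b@contoso.com"]})

# True where the user also had risk-flagged sign-ins during the baseline
has_history = current["UserPrincipalName"].isin(baseline["UserPrincipalName"].unique())
current["RiskHistory"] = np.where(has_history, "Existing", "New")
print(current)
```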

## Sign-in Summary for users with unmitigated risk

In [7]:
_AADSIL_DISPLAY_COLUMNS = [
    'TimeGenerated', 'ResultType', 'ResultDescription', 'UserPrincipalName', 'UserId',
    'Location', 'IPAddress', 'AppDisplayName', 'ClientAppUsed', 'AppId',
    'AuthenticationDetails', 'AuthenticationMethodsUsed',
    'RiskEventTypes', 'RiskEventTypes_V2', 'RiskLevelAggregated',
    'RiskLevelDuringSignIn', 'RiskState', 'ResourceDisplayName',
    'LocationDetails', 'MfaDetail', 'NetworkLocationDetails',
    'UserAgent', 'UserDisplayName', 'UserType', 'IPAddressFromResourceProvider',
    'ResourceTenantId', 'HomeTenantId', 'AutonomousSystemNumber', 'Type'
]


# Function to summarize the history data
def weekly_signin_summary(data) -> pd.DataFrame:
    """Create signin summary from historical data."""
    return (
        data
        [_AADSIL_DISPLAY_COLUMNS]
        .explode(["RiskEventTypes"])
        .groupby(["UserPrincipalName", pd.Grouper(key="TimeGenerated", freq="W")])
        .agg(
            LoginCount=pd.NamedAgg("ResultType", "count"),
            ResultTypes=pd.NamedAgg("ResultType", "unique"),
            RiskEventTypes=pd.NamedAgg("RiskEventTypes", "unique"),
            RiskLevels=pd.NamedAgg("RiskLevelAggregated", "unique"),
            RiskLevelSignins=pd.NamedAgg("RiskLevelDuringSignIn", "unique"),
            IPs=pd.NamedAgg("IPAddress", "nunique"),
            Locations=pd.NamedAgg("Location", "nunique"),
            Apps=pd.NamedAgg("AppDisplayName", "nunique"),
            UserAgents=pd.NamedAgg("UserAgent", "nunique"),
            StartDate=pd.NamedAgg("TimeGenerated", "min"),
            EndDate=pd.NamedAgg("TimeGenerated", "max"),
        )
        .sort_index()
    )


# Get historical risk level for previous {period} days
risk_hist_query = """
let q_end = datetime({start});
let q_start = datetime_add("day", -{baseline_period}, q_end);
SigninLogs
| where TimeGenerated between (q_start .. q_end)
| where RiskState != "none"
| where UserPrincipalName in ({users})
| extend RiskEventTypes = parse_json(RiskEventTypes),
  RiskEventTypes_V2 = parse_json(RiskEventTypes_V2)
"""

if risk_users_df.empty:
    raise LookupError(
        "No user logins with unmitigated risk flag found for period. "
        "Exiting notebook."
    )

# Unmitigated risk users
risk_user_hist_df = qry_prov.exec_query(
    risk_hist_query.format(
        users=get_user_param(risk_users_df),
        start=start,
        baseline_period=baseline_period,
    )
)

risk_users_history = weekly_signin_summary(risk_user_hist_df).reset_index()

# Isolate users that have no history of risk in previous period
users_with_past_risk_criteria = risk_users_df.UserPrincipalName.isin(risk_user_hist_df.UserPrincipalName.unique())
risk_users_df = risk_users_df.copy()
risk_users_df.loc[~users_with_past_risk_criteria, "RiskHistory"] = "New"
risk_users_df.loc[users_with_past_risk_criteria, "RiskHistory"] = "Existing"

summary_report.add_summary_data(
    data=risk_users_df,
    user_column="UserPrincipalName",
    section="Risk Users Summary",
)
summary_report.add_summary_data(
    data=risk_users_history,
    user_column="UserPrincipalName",
    section="Risk Users History",
)

df_caption(risk_users_df, "Sign-in risk summary - unmitigated")

Unnamed: 0,UserPrincipalName,SignIns,MeanAggRisk,MeanSigninRisk,RiskStates,RiskEvents,RiskEventsV2,SourceIPs,RiskHistory
0,jank@seccxpninja.onmicrosoft.com,1,3.0,0.0,['atRisk'],[],['newCountry'],['73.109.22.203'],New
2,aguruswamy@contosohotels.com,1,1.0,0.0,['atRisk'],[],['newCountry'],['67.168.169.80'],New
3,suzanac@contosohotels.com,1,1.0,0.0,['atRisk'],[],['newCountry'],['50.47.87.74'],New
4,asekstee@microsoft.com,1,0.0,0.0,['dismissed'],[],['newCountry'],['144.134.106.54'],New
5,takuyaot@microsoft.com,1,0.0,0.0,['dismissed'],[],['newCountry'],['153.240.206.142'],New
6,ragomeri@microsoft.com,1,0.0,0.0,['dismissed'],[],['newCountry'],['167.220.197.42'],New
7,elsherif@microsoft.com,1,0.0,0.0,['dismissed'],[],['newCountry'],['94.128.105.93'],New
8,aweinkopf@microsoft.com,1,0.0,0.0,['dismissed'],[],['newCountry'],['167.220.196.39'],New
9,rickkotlarz@microsoft.com,1,0.0,0.0,['dismissed'],[],['newCountry'],['45.22.1.222'],New
10,aauvinen@microsoft.com,1,0.0,0.0,['dismissed'],[],['newCountry'],['167.220.197.91'],New


## Sign-in Summary for users with mitigated risk
### [info only]

In [8]:
# History of mitigated risk users
if not mitigated_users_df.empty:
    mit_risk_user_hist_df = qry_prov.exec_query(
        risk_hist_query.format(
            users=get_user_param(mitigated_users_df),
            start=start,
            baseline_period=baseline_period
        )
    )


    # Isolate users that have no history of risk in previous period
    users_with_past_risk_criteria = mitigated_users_df.UserPrincipalName.isin(mit_risk_user_hist_df.UserPrincipalName.unique())
    mitigated_users_df = mitigated_users_df.copy()
    mitigated_users_df.loc[~users_with_past_risk_criteria, "RiskHistory"] = "New"
    mitigated_users_df.loc[users_with_past_risk_criteria, "RiskHistory"] = "Existing"
    df_caption(mitigated_users_df, "Sign-in risk summary - mitigated")
else:
    md("No users with mitigated risk in this period.")

Unnamed: 0,UserPrincipalName,SignIns,MeanAggRisk,MeanSigninRisk,RiskStates,RiskEvents,RiskEventsV2,SourceIPs,RiskHistory
1,pdemo@seccxpninja.onmicrosoft.com,148,1.385135,3.878378,"['confirmedSafe', 'atRisk', 'dismissed', 'confirmedCompromised']","['unfamiliarFeatures', 'unlikelyTravel']","['unfamiliarFeatures', 'unlikelyTravel']","['195.200.70.38', '182.1.122.187', '174.49.144.40', '187.145.130.238', '213.180.18.170', '86.188.84.35', '178.15.174.19', '114.4.214.32', '180.178.100.70', '89.64.60.74', '201.191.51.20', '140.186.246.113', '86.136.20.241', '212.81.187.84', '96.234.155.228', '81.228.197.31', '194.69.103.247', '147.161.137.90', '91.37.88.113', '194.69.103.19', '112.201.164.139', '165.225.112.139', '101.180.77.136', '122.161.73.80', '46.223.162.163', '95.130.222.34', '31.223.2.253', '167.220.197.43', '108.218.142.243', '136.226.252.97', '134.238.224.48', '169.159.144.109', '194.69.103.115', '13.79.0.98', '31.160.80.18', '194.69.103.94', '167.220.197.108', '92.97.154.99', '94.15.56.164', '195.146.138.78', '31.168.52.224', '213.123.211.228', '189.128.102.186', '136.35.205.71', '167.220.197.233', '194.69.103.30', '194.69.103.141', '194.69.103.187']",Existing
13,adm_pwatkins@seccxpninja.onmicrosoft.com,9,0.0,5.0,['remediated'],['anonymizedIPAddress'],['anonymizedIPAddress'],"['185.220.102.251', '192.42.116.17', '171.25.193.78']",Existing


# 4. Retrieve and Run UEBA hunting queries on risk-flagged users

> UEBA = User and Entity Behavior Analytics

The next cell retrieves the current UEBA hunting
queries and runs them against the risk-flagged users.

For more information see [Microsoft Sentinel UEBA](https://learn.microsoft.com/azure/sentinel/identify-threats-with-entity-behavior-analytics)
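
The repository queries carry `{{StartTimeISO}}` and `{{EndTimeISO}}` tokens that need to become Python format fields, while any literal braces in the KQL must survive `str.format`. The sketch below shows one way to do this. It is a simplified alternative to the notebook's own regex-based approach, and `to_format_template` is a hypothetical name.

```python
def to_format_template(query: str) -> str:
    """Turn {{StartTimeISO}}/{{EndTimeISO}} tokens into {start}/{end} format fields."""
    # Park the time tokens behind sentinels so they are not escaped below
    query = query.replace("{{StartTimeISO}}", "\0start\0")
    query = query.replace("{{EndTimeISO}}", "\0end\0")
    # Escape every remaining brace so str.format reproduces it literally
    query = query.replace("{", "{{").replace("}", "}}")
    # Restore the time tokens as real format fields
    return query.replace("\0start\0", "{start}").replace("\0end\0", "{end}")


raw = 'T | where TimeGenerated >= datetime({{StartTimeISO}}) | extend d = dynamic({"k": 1})'
template = to_format_template(raw)
kql = template.format(start="2023-03-20", end="2023-03-22")
print(kql)
```

Escaping every remaining brace means a query containing JSON literals can still be passed through `str.format` without raising `KeyError`.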

In [9]:
# Hunting Queries
_SENTINEL_REPO = "https://raw.githubusercontent.com/Azure/Azure-Sentinel/master"
_SI_LOG_ROOT = f"{_SENTINEL_REPO}/Hunting%20Queries/SigninLogs"
_GEN_HUNTING_QRY = [
    # "AnomalousUserAppSigninLocationIncreaseDetail.yaml",
    # "LegacyAuthAttempt.yaml",
    # "Signins-From-VPS-Providers.yaml",
    # "UserAccountsMeasurableincreaseofsuccessfulsignins.yaml",
    # "riskSignInWithNewMFAMethod.yaml",
    # "signinBurstFromMultipleLocations.yaml",
]

# UEBA Hunting Queries
_UEBA_HQ_ROOT = f"{_SENTINEL_REPO}/Solutions/UEBA%20Essentials/Hunting%20Queries"
_UEBA_HUNTING_QRY = [
    "anomaliesOnVIPUsers.yaml",
    "Anomalous AAD Account Manipulation.yaml",
    "Anomalous Account Creation.yaml",
    "Anomalous Activity Role Assignment.yaml",
    "Anomalous Code Execution.yaml",
    "Anomalous Data Access.yaml",
    "Anomalous Defensive Mechanism Modification.yaml",
    "Anomalous Failed Logon.yaml",
    "Anomalous Geo Location Logon.yaml",
    "Anomalous Login to Devices.yaml",
    "Anomalous Password Reset.yaml",
    "Anomalous RDP Activity.yaml",
    "Anomalous Resource Access.yaml",
    "Anomalous Role Assignment.yaml",
    "Anomalous Sign-in Activity.yaml",
    "anomalousActionInTenant.yaml",
    "dormantAccountActivityFromUncommonCountry.yaml",
    "firstConnectionFromGroup.yaml",
    "loginActivityFromBotnet.yaml",
    "newAccountAddedToAdminGroup.yaml",
    # "terminatedEmployeeAccessHVA.yaml",
    # "terminatedEmployeeActivity.yaml",
    "updateKeyVaultActivity.yaml",
]

ALL_QUERIES = {qry: _SI_LOG_ROOT for qry in _GEN_HUNTING_QRY}
ALL_QUERIES.update({qry: _UEBA_HQ_ROOT for qry in _UEBA_HUNTING_QRY})

TIME_TOKEN = re.compile(r"(\{\{StartTimeISO\}\}|\{\{EndTimeISO\}\})")
# Match a lone brace only; lookarounds keep neighboring characters out of the match
_LEFT_BRACE = r"(?<!\{)\{(?!\{)"
_RIGHT_BRACE = r"(?<!\})\}(?!\})"
_LB_TOKEN = "%%~[~%%"
_RB_TOKEN = "%%~]~%%"


def replace_time_params(query):
    repl_query = re.sub(_LEFT_BRACE, _LB_TOKEN, query)
    repl_query = re.sub(_RIGHT_BRACE, _RB_TOKEN, repl_query)
    repl_query = repl_query.replace("{{StartTimeISO}}", "{start}").replace("{{EndTimeISO}}", "{end}")
    return repl_query.replace(_LB_TOKEN, "{{").replace(_RB_TOKEN, "}}")


QueryProps = namedtuple("QueryProps", "name, query, req_time, description, url, raw_query")


def fetch_queries(query_dict: Dict[str, str], verbose: bool = False) -> Dict[str, QueryProps]:
    """Fetch queries from Sentinel GitHub repo."""
    discover_queries: Dict[str, QueryProps] = {}
    error_queries: Dict[str, str] = {}
    for query, path in tqdm(query_dict.items()):
        q_path = f"{path}/{urllib.parse.quote(query)}"
        resp = httpx.get(q_path)
        if resp.status_code != 200:
            print(f"invalid URL {q_path}")
            continue
        try:
            q_dict = yaml.safe_load(resp.content)
        except yaml.scanner.ScannerError as err:
            print(f"could not parse query {query} at {q_path}")
            error_queries[query] = resp.content
            continue

        query_text = q_dict.get("query")
        req_time = False
        if re.search(TIME_TOKEN, query_text):
            query_text = replace_time_params(query_text)
            req_time = True

        if "UEBA" in path:
            query_text = add_ueba_time_params(query_text)
        if verbose:
            print(f"Query {query}, {q_dict['name']}, req time: {req_time}")
        discover_queries[query] = QueryProps(
            name=q_dict.get("name"),
            query=query_text,
            req_time=req_time,
            description=q_dict.get("description"),
            url=q_path,
            raw_query=q_dict.get("query"),
        )
    return discover_queries


PRIM_TABLE_EXP = r"(?P<prefix>^|\n)(?P<table>(BehaviorAnalytics|AuditLogs|IdentityInfo|SigninLogs))(?=[\s\n\)\|])"
PRIM_TABLE_REPL = r"\g<prefix>\g<table>\n| where TimeGenerated > datetime({start})"
JOIN_TABLE_EXP = r"(?P<join>\|\s+join[^(]*\(\s*[^\s]+)(?=[\s\)\|])"
JOIN_TABLE_REPL = r"\g<join>\n| where TimeGenerated > datetime({start})"


def add_ueba_time_params(query):
    if isinstance(query, tuple):
        if query.req_time:
            return query.query
        query = query.query
    return re.sub(
        JOIN_TABLE_EXP,
        JOIN_TABLE_REPL,
        re.sub(PRIM_TABLE_EXP, PRIM_TABLE_REPL, query)
    )


def display_query_table(queries):
    ht_table = "<table>{rows}</table>"
    rows = [f"<tr><td>{q.name}</td><td>{q.url}</td></tr>"
        for q in queries.values()]
    from IPython.display import HTML
    display(HTML(ht_table.format(rows="".join(rows))))


hunting_queries = fetch_queries(ALL_QUERIES)

display_query_table(hunting_queries)

100%|██████████| 21/21 [00:04<00:00,  4.61it/s]


0,1
Anomalies on users tagged as VIP,https://raw.githubusercontent.com/Azure/Azure-Sentinel/master/Solutions/UEBA%20Essentials/Hunting%20Queries/anomaliesOnVIPUsers.yaml
Anomalous AAD Account Manipulation,https://raw.githubusercontent.com/Azure/Azure-Sentinel/master/Solutions/UEBA%20Essentials/Hunting%20Queries/Anomalous%20AAD%20Account%20Manipulation.yaml
Anomalous AAD Account Creation,https://raw.githubusercontent.com/Azure/Azure-Sentinel/master/Solutions/UEBA%20Essentials/Hunting%20Queries/Anomalous%20Account%20Creation.yaml
Anomalous Activity Role Assignment,https://raw.githubusercontent.com/Azure/Azure-Sentinel/master/Solutions/UEBA%20Essentials/Hunting%20Queries/Anomalous%20Activity%20Role%20Assignment.yaml
Anomalous Code Execution,https://raw.githubusercontent.com/Azure/Azure-Sentinel/master/Solutions/UEBA%20Essentials/Hunting%20Queries/Anomalous%20Code%20Execution.yaml
Anomalous Data Access,https://raw.githubusercontent.com/Azure/Azure-Sentinel/master/Solutions/UEBA%20Essentials/Hunting%20Queries/Anomalous%20Data%20Access.yaml
Anomalous Defensive Mechanism Modification,https://raw.githubusercontent.com/Azure/Azure-Sentinel/master/Solutions/UEBA%20Essentials/Hunting%20Queries/Anomalous%20Defensive%20Mechanism%20Modification.yaml
Anomalous Failed Logon,https://raw.githubusercontent.com/Azure/Azure-Sentinel/master/Solutions/UEBA%20Essentials/Hunting%20Queries/Anomalous%20Failed%20Logon.yaml
Anomalous Geo Location Logon,https://raw.githubusercontent.com/Azure/Azure-Sentinel/master/Solutions/UEBA%20Essentials/Hunting%20Queries/Anomalous%20Geo%20Location%20Logon.yaml
Anomalous Login to Devices,https://raw.githubusercontent.com/Azure/Azure-Sentinel/master/Solutions/UEBA%20Essentials/Hunting%20Queries/Anomalous%20Login%20to%20Devices.yaml


## Browser for UEBA queries - not used in notebook

In [10]:
import ipywidgets as widgets
import difflib

def browse_queries(queries: Dict[str, QueryProps]):
    """Browse hunting queries, showing original and modified query text with a diff."""
    select_query = widgets.Select(
        description="Query",
        options=[(qry.name, idx) for idx, qry in queries.items()],
        layout=widgets.Layout(height="200px", width="50%", padding="5pt")
    )
    layout_query = lambda x, y: widgets.Layout(height=x, width=y, padding="5pt")
    layout_w = lambda x: widgets.Layout(width=x, padding="5pt")
    qry_view = widgets.Textarea(layout=layout_query("200px", "95%"))
    qry_view_repl = widgets.Textarea(layout=layout_query("200px", "95%"))
    qry_view_diff = widgets.Textarea(layout=layout_query("150px", "50%"))
    qry_file = widgets.Label(layout=layout_w("60%"))
    orig_lbl = widgets.Label(value="Original query", layout=layout_w("60%"))
    mod_lbl = widgets.Label(value="Modified query", layout=layout_w("60%"))
    vbox = widgets.VBox([
        select_query,
        qry_file,
        widgets.HBox([
            widgets.VBox([orig_lbl, qry_view], layout=layout_query("250px", "45%")),
            widgets.VBox([mod_lbl, qry_view_repl], layout=layout_query("250px", "45%"))
        ]),
        qry_view_diff
    ])

    def update_query(change):
        query = queries[select_query.value]
        qry_file.value = query.url
        qry_view.value = query.raw_query
        qry_view_repl.value = query.query
        qry_view_diff.value = "\n".join(difflib.unified_diff(qry_view.value.splitlines(), qry_view_repl.value.splitlines()))

    select_query.observe(update_query, names="value")
    update_query(None)
    return vbox

# Uncomment the following line to browse the hunting queries
# browse_queries(hunting_queries)

## Run Hunting queries for time range on risky accounts

In [11]:

def run_ueba_queries(queries, start, end) -> pd.DataFrame:
    dfs = []
    query_params = {"end": end, "start": start}
    print(f"Running {len(queries)} queries...")
    for query in tqdm(queries.values()):
        if "UEBA" not in query.url:
            continue
        try:
            repl_query = query.query
            if "{start}" in repl_query or "{end}" in repl_query:
                try:
                    repl_query = repl_query.format(**query_params)
                except KeyError:
                    print(f"Format error: {query.name}")
            result_df = qry_prov.exec_query(repl_query)
            result_df["UEBAQuery"] = query.name
            dfs.append(result_df)
        except Exception as err:
            print("Exception:", type(err), query.name)
    return pd.concat(dfs)

ueba_df = run_ueba_queries(hunting_queries, start=start, end=end)

Running 21 queries...


100%|██████████| 21/21 [00:32<00:00,  1.55s/it]


In [12]:
ueba_summary = (
    ueba_df[ueba_df["UserPrincipalName"].str.lower().isin(risk_users_df.UserPrincipalName.str.lower())]
    .groupby(["UserPrincipalName", "UEBAQuery"])
    .agg(
        UEBAEventCount=pd.NamedAgg("TimeGenerated", "count"),
        StartTime=pd.NamedAgg("TimeGenerated", "min"),
        EndTime=pd.NamedAgg("TimeGenerated", "max"),
    )
)
summary_report.add_summary_data(
    data=ueba_summary.reset_index(),
    user_column="UserPrincipalName",
    section="UEBA Summary",
)
df_caption(
    ueba_summary,
    caption="UEBA entries for unmitigated risk users"
)

Unnamed: 0_level_0,Unnamed: 1_level_0,UEBAEventCount,StartTime,EndTime
UserPrincipalName,UEBAQuery,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
aguruswamy@contosohotels.com,Anomalous Sign-in Activity,10,2023-03-21 19:25:49+00:00,2023-03-21 19:29:54+00:00
elsherif@microsoft.com,Anomalous Sign-in Activity,2,2023-03-22 08:27:19+00:00,2023-03-22 08:27:19+00:00
jank@seccxpninja.onmicrosoft.com,Anomalous Sign-in Activity,4,2023-03-21 23:09:09+00:00,2023-03-21 23:26:52+00:00
suzanac@contosohotels.com,Anomalous Sign-in Activity,13,2023-03-22 17:35:16+00:00,2023-03-22 17:41:53+00:00
takuyaot@microsoft.com,Anomalous Sign-in Activity,1,2023-03-22 07:19:31+00:00,2023-03-22 07:19:31+00:00


In [13]:
df_caption(
    ueba_df[ueba_df["UserPrincipalName"].str.lower().isin(mitigated_users_df.UserPrincipalName.str.lower())]
    .groupby(["UserPrincipalName", "UEBAQuery"])
    .agg(
        UEBAEventCount=pd.NamedAgg("TimeGenerated", "count"),
        StartTime=pd.NamedAgg("TimeGenerated", "min"),
        EndTime=pd.NamedAgg("TimeGenerated", "max"),
    ),
    caption="UEBA entries for mitigated risk users"
)

Unnamed: 0_level_0,Unnamed: 1_level_0,UEBAEventCount,StartTime,EndTime
UserPrincipalName,UEBAQuery,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
PDemo@seccxpninja.onmicrosoft.com,Anomalous Sign-in Activity,1476,2023-03-20 19:48:55+00:00,2023-03-22 18:48:58+00:00
adm_pwatkins@seccxpninja.onmicrosoft.com,Anomalies on users tagged as VIP,9,2023-03-22 13:48:17.600813400+00:00,2023-03-22 13:50:40.154887600+00:00
adm_pwatkins@seccxpninja.onmicrosoft.com,Anomalous Sign-in Activity,9,2023-03-22 13:48:17.600813400+00:00,2023-03-22 13:50:40.154887600+00:00
adm_pwatkins@seccxpninja.onmicrosoft.com,"Anomalous login activity originated from Botnet, Tor proxy or C2",3,2023-03-22 13:48:35.564164900+00:00,2023-03-22 13:50:10.714615400+00:00


# Sign-in Summaries for prior week

Collect the distinct locations, IP addresses, client apps
and user agent strings used in user sign-ins.
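
The shape of this summary (one row per user, attribute and value, with a count) can be sketched in pandas with `melt`. The sample rows are invented, and the notebook itself builds the equivalent table in KQL rather than in pandas.

```python
import pandas as pd

# Invented sample data standing in for SigninLogs rows
signins = pd.DataFrame({
    "UserPrincipalName": ["a@contoso.com"] * 3,
    "IPAddress": ["10.0.0.1", "10.0.0.1", "10.0.0.2"],
    "Location": ["US", "US", "GB"],
})

# Reshape to long form: one row per (user, attribute, value) occurrence
long_df = signins.melt(
    id_vars="UserPrincipalName", var_name="Attribute", value_name="Value"
)
# Count how often each distinct value was seen per user and attribute
summary = (
    long_df.groupby(["UserPrincipalName", "Attribute", "Value"])
    .size()
    .reset_index(name="OpCount")
)
print(summary)
```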

In [14]:
query = """
SigninLogs
| where ResultType in (0, 50055, 50126)
| where TimeGenerated > ago(5d)
| project Id, UserPrincipalName, IsInteractive
"""
qry_prov.exec_query(query)

Unnamed: 0,Id,UserPrincipalName,IsInteractive
0,21e3cf35-1191-44a2-922d-927718699800,v-rniteesh@microsoft.com,True
1,1d52279e-d326-471a-a761-a5b59cab3a00,joanne.sensitive@contosohotels.com,True
2,50644068-10ed-4dd3-94db-6778ed4a3200,pdemo@seccxpninja.onmicrosoft.com,True
3,bb29c897-581e-4d60-9db2-08eeb032b200,pdemo@seccxpninja.onmicrosoft.com,True
4,75438e12-2502-41fd-93b5-c26d532e2100,pdemo@seccxpninja.onmicrosoft.com,True
...,...,...,...
9861,6cbdc387-bf42-474f-9a0e-eb6914d13b00,sync_ninja-dc_9d913db9dfd8@seccxpninja.onmicrosoft.com,True
9862,ef4c9ef1-d46c-43a2-b65e-53051a246900,sync_ninja-dc_9d913db9dfd8@seccxpninja.onmicrosoft.com,True
9863,a4454e8e-c501-470d-a2a8-9c2102971500,sync_ninja-dc_9d913db9dfd8@seccxpninja.onmicrosoft.com,True
9864,49affc23-1178-40db-b537-423e6d5b8900,joanne.sensitive@contosohotels.com,True


In [16]:
user_summary_query = """
let si_history = SigninLogs
| where TimeGenerated between (datetime({start}) .. datetime({end}))
| where UserPrincipalName in~ ({users})
| summarize count() by UserPrincipalName, ResultType, RiskLevelAggregated, RiskLevelDuringSignIn, ClientAppUsed, UserAgent, IPAddress, Location;
si_history
| summarize OpCount=sum(count_) by UserPrincipalName, ClientAppUsed
| project UserPrincipalName, Attribute="ClientAppUser", Value=ClientAppUsed, OpCount
| union ( 
si_history
| summarize OpCount=sum(count_) by UserPrincipalName, IPAddress
| project UserPrincipalName, Attribute="IPAddress", Value=IPAddress, OpCount
)
| union ( 
si_history
| summarize OpCount=sum(count_) by UserPrincipalName, UserAgent
| project UserPrincipalName, Attribute="UserAgent", Value=UserAgent, OpCount
)
| union ( 
si_history
| summarize OpCount=sum(count_) by UserPrincipalName, Location
| project UserPrincipalName, Attribute="Location", Value=Location, OpCount
)
"""
week_ago = (end - timedelta(7))
user_summary_df = qry_prov.exec_query(user_summary_query.format(
    users=get_user_param(risk_users_df),
    start=week_ago,
    end=end
))

summary_report.add_summary_data(
    data=user_summary_df,
    user_column="UserPrincipalName",
    section="Signin summary for previous week"
)
df_caption(
    user_summary_df.groupby(["UserPrincipalName", "Attribute"]).agg(
        Values=pd.NamedAgg("Value", "unique"),
        NumUniqueValues=pd.NamedAgg("Value", "nunique"),
        OpCount=pd.NamedAgg("OpCount", "sum"),
    )
    .reset_index()
    .pivot(index=['UserPrincipalName'], columns='Attribute', values=["Values", "NumUniqueValues"]),
    caption="Sign-in summary for previous week"
)

Unnamed: 0_level_0,Values,Values,Values,Values,NumUniqueValues,NumUniqueValues,NumUniqueValues,NumUniqueValues
Attribute,ClientAppUser,IPAddress,Location,UserAgent,ClientAppUser,IPAddress,Location,UserAgent
UserPrincipalName,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2
aauvinen@microsoft.com,['Browser'],['167.220.197.91'],['GB'],"['Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36 Edg/111.0.1661.44']",1,1,1,1
adstadel@microsoft.com,['Browser'],['85.7.181.19'],['CH'],"['Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36']",1,1,1,1
aguruswamy@contosohotels.com,['Browser'],['67.168.169.80'],['US'],"['Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36 Edg/111.0.1661.43']",1,1,1,1
asekstee@microsoft.com,['Browser'],['144.134.106.54'],['AU'],"['Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36 Edg/111.0.1661.44']",1,1,1,1
aweinkopf@microsoft.com,['Browser'],['84.75.253.35' '167.220.196.39'],['CH' 'GB'],"['Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36 Edg/111.0.1661.44']",1,2,2,1
elsherif@microsoft.com,['Browser'],['94.128.105.93'],['KW'],"['Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36 Edg/111.0.1661.51']",1,1,1,1
jank@seccxpninja.onmicrosoft.com,['Browser' 'Mobile Apps and Desktop clients'],['73.109.22.203'],['US'],"['Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36 Edg/111.0.1661.44']",2,1,1,1
malarkin@microsoft.com,['Browser'],['75.136.202.123'],['US'],"['Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36 Edg/111.0.1661.43']",1,1,1,1
ragomeri@microsoft.com,['Browser'],['167.220.197.42'],['GB'],"['Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36 Edg/111.0.1661.51']",1,1,1,1
rickkotlarz@microsoft.com,['Browser'],['45.22.1.222'],['US'],"['Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36 Edg/111.0.1661.44']",1,1,1,1


# 5. Get related alerts for users and user IPs

## Alerts that name the account explicitly

In [17]:
related_alerts_df = pd.concat([
    (
        qry_prov.SecurityAlert.list_related_alerts(account_name=acct)
        .assign(UserPrincipalName=acct)
    )
    for acct in tqdm(risk_users_df.UserPrincipalName)
])
summary_report.add_summary_data(
    data=related_alerts_df,
    user_column="UserPrincipalName",
    section="Related alerts for user"
)
df_caption(related_alerts_df.drop(
    columns=["Description", "RemediationSteps", "ExtendedProperties"]),
    caption="Related alerts for account")

100%|██████████| 12/12 [00:20<00:00,  1.75s/it]


Unnamed: 0,TenantId,TimeGenerated,AlertDisplayName,AlertName,Severity,ProviderName,VendorName,VendorOriginalId,SystemAlertId,ResourceId,SourceComputerId,AlertType,ConfidenceLevel,ConfidenceScore,IsIncident,StartTimeUtc,EndTimeUtc,ProcessingEndTime,Entities,SourceSystem,WorkspaceSubscriptionId,WorkspaceResourceGroup,ExtendedLinks,ProductName,ProductComponentName,AlertLink,Status,CompromisedEntity,Tactics,Techniques,Type,Computer,src_hostname,src_accountname,src_procname,host_match,acct_match,proc_match,UserPrincipalName
0,8ecf8077-cf51-4820-aadd-14040956f35d,2023-03-21 23:15:18.679918+00:00,Activity from infrequent country,Activity from infrequent country,Medium,MCAS,Microsoft,641a3a797612f527f42ed3d4,abf4cd24-8737-861d-fb55-2b9d52757ec4,,,MCAS_ALERT_ANUBIS_DETECTION_NEW_COUNTRY,,,False,2023-03-21 23:09:09.916000+00:00,2023-03-21 23:09:09.916000+00:00,2023-03-21 23:15:18.678815600+00:00,"[{""$id"":""2"",""Address"":""73.109.22.203"",""Location"":{""CountryCode"":""US""},""Asset"":false,""Roles"":[""Attacker""],""Type"":""ip""},{""$id"":""3"",""AppId"":11161,""SaasId"":11161,""Name"":""Office 365"",""InstanceId"":0,""Type"":""cloud-application""},{""$id"":""4"",""Name"":""jank"",""UPNSuffix"":""seccxpninja.onmicrosoft.com"",""AadUserId"":""0cecc121-df78-4350-8e73-d81d9925bcb6"",""CloudAppAccountId"":""11161|0|0cecc121-df78-4350-8e73-d81d9925bcb6"",""Type"":""account""}]",Detection,,,"[{""Href"":""https://seccxpninja.portal.cloudappsecurity.com/#/policy/?id=eq(5b2405b8185459b631739047,)"",""Category"":null,""Label"":""Defender for Cloud Apps policy ID"",""Type"":""webLink""},{""Href"":""https://seccxpninja.portal.cloudappsecurity.com/#/alerts/641a3a797612f527f42ed3d4"",""Category"":null,""Label"":""Defender for Cloud Apps alert ID"",""Type"":""webLink""}]",Microsoft Cloud App Security,,https://seccxpninja.portal.cloudappsecurity.com/#/alerts/641a3a797612f527f42ed3d4,New,jank@seccxpninja.onmicrosoft.com,DefenseEvasion,"[""T1078""]",SecurityAlert,,,jank,,False,True,False,jank@seccxpninja.onmicrosoft.com
1,8ecf8077-cf51-4820-aadd-14040956f35d,2023-03-22 00:58:03.079763900+00:00,Activity from infrequent country,Activity from infrequent country,Medium,MCAS,Microsoft,64109eb2d28315667fdec3c9,323bdd2e-45a9-9ed5-5c82-d9d785d954f4,,,MCAS_ALERT_ANUBIS_DETECTION_NEW_COUNTRY,,,False,2023-03-14 16:16:26.380000+00:00,2023-03-14 16:16:26.380000+00:00,2023-03-22 00:58:03.078047300+00:00,"[{""$id"":""2"",""Address"":""181.214.93.55"",""Location"":{""CountryCode"":""BR""},""Asset"":false,""Roles"":[""Attacker""],""Type"":""ip""},{""$id"":""3"",""AppId"":11161,""SaasId"":11161,""Name"":""Office 365"",""InstanceName"":""Office 365"",""InstanceId"":0,""Type"":""cloud-application""},{""$id"":""4"",""Name"":""jank"",""UPNSuffix"":""seccxpninja.onmicrosoft.com"",""AadUserId"":""0cecc121-df78-4350-8e73-d81d9925bcb6"",""CloudAppAccountId"":""11161|0|0cecc121-df78-4350-8e73-d81d9925bcb6"",""Type"":""account""}]",Detection,,,"[{""Href"":""https://seccxpninja.portal.cloudappsecurity.com/#/policy/?id=eq(5b2405b8185459b631739047,)"",""Category"":null,""Label"":""Defender for Cloud Apps policy ID"",""Type"":""webLink""},{""Href"":""https://seccxpninja.portal.cloudappsecurity.com/#/alerts/64109eb2d28315667fdec3c9"",""Category"":null,""Label"":""Defender for Cloud Apps alert ID"",""Type"":""webLink""}]",Microsoft Cloud App Security,,https://seccxpninja.portal.cloudappsecurity.com/#/alerts/64109eb2d28315667fdec3c9,Resolved,,DefenseEvasion,"[""T1078""]",SecurityAlert,,,jank,,False,True,False,jank@seccxpninja.onmicrosoft.com
0,8ecf8077-cf51-4820-aadd-14040956f35d,2023-03-21 19:27:18.303824600+00:00,Activity from infrequent country,Activity from infrequent country,Medium,MCAS,Microsoft,641a0510c6011d2c37ce2bbc,204e5a06-5560-20e0-6e7f-95795f988c46,,,MCAS_ALERT_ANUBIS_DETECTION_NEW_COUNTRY,,,False,2023-03-21 19:25:49.784000+00:00,2023-03-21 19:25:49.784000+00:00,2023-03-21 19:27:18.301896400+00:00,"[{""$id"":""2"",""Name"":""aguruswamy"",""UPNSuffix"":""contosohotels.com"",""AadUserId"":""4d10cc91-8c99-4f23-9470-7070cb2eaf4b"",""CloudAppAccountId"":""11161|0|4d10cc91-8c99-4f23-9470-7070cb2eaf4b"",""Type"":""account""},{""$id"":""3"",""Address"":""67.168.169.80"",""Location"":{""CountryCode"":""US""},""Asset"":false,""Roles"":[""Attacker""],""Type"":""ip""},{""$id"":""4"",""AppId"":51040,""SaasId"":51040,""Name"":""Microsoft MyAccount"",""InstanceId"":2047,""Type"":""cloud-application""}]",Detection,,,"[{""Href"":""https://seccxpninja.portal.cloudappsecurity.com/#/policy/?id=eq(5b2405b8185459b631739047,)"",""Category"":null,""Label"":""Defender for Cloud Apps policy ID"",""Type"":""webLink""},{""Href"":""https://seccxpninja.portal.cloudappsecurity.com/#/alerts/641a0510c6011d2c37ce2bbc"",""Category"":null,""Label"":""Defender for Cloud Apps alert ID"",""Type"":""webLink""}]",Microsoft Cloud App Security,,https://seccxpninja.portal.cloudappsecurity.com/#/alerts/641a0510c6011d2c37ce2bbc,New,aguruswamy@contosohotels.com,DefenseEvasion,"[""T1078""]",SecurityAlert,,,aguruswamy,,False,True,False,aguruswamy@contosohotels.com
0,8ecf8077-cf51-4820-aadd-14040956f35d,2023-03-22 17:37:12.607978+00:00,Activity from infrequent country,Activity from infrequent country,Medium,MCAS,Microsoft,641b3cc03261b058ed4ce348,32c9582c-419d-c77d-1140-55142b386500,,,MCAS_ALERT_ANUBIS_DETECTION_NEW_COUNTRY,,,False,2023-03-22 17:35:17.280000+00:00,2023-03-22 17:35:17.280000+00:00,2023-03-22 17:37:12.606159400+00:00,"[{""$id"":""2"",""Address"":""50.47.87.74"",""Location"":{""CountryCode"":""US""},""Asset"":false,""Roles"":[""Attacker""],""Type"":""ip""},{""$id"":""3"",""Name"":""suzanac"",""UPNSuffix"":""contosohotels.com"",""AadUserId"":""53ea4395-cd1b-4c86-a6ad-370aae4ce1b7"",""CloudAppAccountId"":""11161|0|53ea4395-cd1b-4c86-a6ad-370aae4ce1b7"",""Type"":""account""},{""$id"":""4"",""AppId"":51040,""SaasId"":51040,""Name"":""Microsoft MyAccount"",""InstanceId"":2047,""Type"":""cloud-application""}]",Detection,,,"[{""Href"":""https://seccxpninja.portal.cloudappsecurity.com/#/policy/?id=eq(5b2405b8185459b631739047,)"",""Category"":null,""Label"":""Defender for Cloud Apps policy ID"",""Type"":""webLink""},{""Href"":""https://seccxpninja.portal.cloudappsecurity.com/#/alerts/641b3cc03261b058ed4ce348"",""Category"":null,""Label"":""Defender for Cloud Apps alert ID"",""Type"":""webLink""}]",Microsoft Cloud App Security,,https://seccxpninja.portal.cloudappsecurity.com/#/alerts/641b3cc03261b058ed4ce348,New,suzanac@contosohotels.com,DefenseEvasion,"[""T1078""]",SecurityAlert,,,suzanac,,False,True,False,suzanac@contosohotels.com
0,8ecf8077-cf51-4820-aadd-14040956f35d,2023-03-22 07:21:11.969415700+00:00,Activity from infrequent country,Activity from infrequent country,Medium,MCAS,Microsoft,641aac60525cb17b1013eae5,bf1d5ca9-7e77-2a82-fa59-2715886aa153,,,MCAS_ALERT_ANUBIS_DETECTION_NEW_COUNTRY,,,False,2023-03-22 07:19:32.246000+00:00,2023-03-22 07:19:32.246000+00:00,2023-03-22 07:21:11.968285200+00:00,"[{""$id"":""2"",""AppId"":35931,""SaasId"":35931,""Name"":""Microsoft 365 security center"",""InstanceId"":2047,""Type"":""cloud-application""},{""$id"":""3"",""Address"":""153.240.206.142"",""Location"":{""CountryCode"":""JP""},""Asset"":false,""Roles"":[""Attacker""],""Type"":""ip""},{""$id"":""4"",""Name"":""takuyaot"",""UPNSuffix"":""microsoft.com"",""AadUserId"":""01b09ab3-09f3-485b-b55c-cc96d5d2ab9d"",""CloudAppAccountId"":""11161|0|01b09ab3-09f3-485b-b55c-cc96d5d2ab9d"",""Type"":""account""}]",Detection,,,"[{""Href"":""https://seccxpninja.portal.cloudappsecurity.com/#/policy/?id=eq(5b2405b8185459b631739047,)"",""Category"":null,""Label"":""Defender for Cloud Apps policy ID"",""Type"":""webLink""},{""Href"":""https://seccxpninja.portal.cloudappsecurity.com/#/alerts/641aac60525cb17b1013eae5"",""Category"":null,""Label"":""Defender for Cloud Apps alert ID"",""Type"":""webLink""}]",Microsoft Cloud App Security,,https://seccxpninja.portal.cloudappsecurity.com/#/alerts/641aac60525cb17b1013eae5,New,takuyaot@microsoft.com,DefenseEvasion,"[""T1078""]",SecurityAlert,,,takuyaot,,False,True,False,takuyaot@microsoft.com
0,8ecf8077-cf51-4820-aadd-14040956f35d,2023-03-22 08:29:07.284193+00:00,Activity from infrequent country,Activity from infrequent country,Medium,MCAS,Microsoft,641abc4e08e2065d7dca9b11,e61a55fc-acc5-cdde-d0d0-43f73c74b7d8,,,MCAS_ALERT_ANUBIS_DETECTION_NEW_COUNTRY,,,False,2023-03-22 08:27:20.870000+00:00,2023-03-22 08:27:20.870000+00:00,2023-03-22 08:29:07.282423500+00:00,"[{""$id"":""2"",""Address"":""94.128.105.93"",""Location"":{""CountryCode"":""KW""},""Asset"":false,""Roles"":[""Attacker""],""Type"":""ip""},{""$id"":""3"",""Name"":""elsherif"",""UPNSuffix"":""microsoft.com"",""AadUserId"":""b045505f-b621-4273-a434-7c83268a268e"",""CloudAppAccountId"":""11161|0|b045505f-b621-4273-a434-7c83268a268e"",""Type"":""account""},{""$id"":""4"",""AppId"":35931,""SaasId"":35931,""Name"":""Microsoft 365 security center"",""InstanceId"":2047,""Type"":""cloud-application""}]",Detection,,,"[{""Href"":""https://seccxpninja.portal.cloudappsecurity.com/#/policy/?id=eq(5b2405b8185459b631739047,)"",""Category"":null,""Label"":""Defender for Cloud Apps policy ID"",""Type"":""webLink""},{""Href"":""https://seccxpninja.portal.cloudappsecurity.com/#/alerts/641abc4e08e2065d7dca9b11"",""Category"":null,""Label"":""Defender for Cloud Apps alert ID"",""Type"":""webLink""}]",Microsoft Cloud App Security,,https://seccxpninja.portal.cloudappsecurity.com/#/alerts/641abc4e08e2065d7dca9b11,New,elsherif@microsoft.com,DefenseEvasion,"[""T1078""]",SecurityAlert,,,elsherif,,False,True,False,elsherif@microsoft.com
1,8ecf8077-cf51-4820-aadd-14040956f35d,2023-03-22 11:51:43.988203400+00:00,Authentication Attempt from New Country,Authentication Attempt from New Country,Medium,ASI Scheduled Alerts,Microsoft,96ad6b0e-ecda-4366-beae-3e292adbfc43,5f16ecd7-1788-64d3-9815-8882091bbb5e,,,8ecf8077-cf51-4820-aadd-14040956f35d_70a3191f-d92a-42e1-b305-de1d4d2becd0,,,False,2023-03-08 11:46:38.822000+00:00,2023-03-22 11:46:38.822000+00:00,2023-03-22 11:51:43.918945400+00:00,"[{""$id"":""2"",""Name"":""pdemo"",""UPNSuffix"":""seccxpninja.onmicrosoft.com"",""IsDomainJoined"":true,""DisplayName"":""pdemo@seccxpninja.onmicrosoft.com"",""Type"":""account""},{""$id"":""3"",""Name"":""hesaad"",""UPNSuffix"":""microsoft.com"",""IsDomainJoined"":true,""DisplayName"":""hesaad@microsoft.com"",""Type"":""account""},{""$id"":""4"",""Name"":""elsherif"",""UPNSuffix"":""microsoft.com"",""IsDomainJoined"":true,""DisplayName"":""elsherif@microsoft.com"",""Type"":""account""}]",Detection,d1d8779d-38d7-4f06-91db-9cbc8de0176f,soc,,Azure Sentinel,Scheduled Alerts,,New,,"InitialAccess, Reconnaissance","[""T1594"",""T1078""]",SecurityAlert,,,elsherif,,False,True,False,elsherif@microsoft.com


## Alerts related to sign-in IP addresses

In [18]:
related_alerts_ip_df = pd.concat([
    (
        qry_prov.SecurityAlert.list_alerts_for_ip(source_ip_list=ip_addr)
        .assign(UserPrincipalName=acct, IPAddress=ip_addr)
    )
    for acct, ip_addr in tqdm(
        risk_users_df.explode("SourceIPs")[["UserPrincipalName", "SourceIPs"]].apply(tuple, axis=1)
    )
])
summary_report.add_summary_data(
    data=related_alerts_ip_df,
    user_column="UserPrincipalName",
    section="Related alerts for user signin IP address"
)

df_caption(related_alerts_ip_df, caption="Related alerts for sign-in IP Address")


100%|██████████| 12/12 [00:24<00:00,  2.08s/it]


Unnamed: 0,TenantId,TimeGenerated,AlertDisplayName,AlertName,Severity,Description,ProviderName,VendorName,VendorOriginalId,SystemAlertId,ResourceId,SourceComputerId,AlertType,ConfidenceLevel,ConfidenceScore,IsIncident,StartTimeUtc,EndTimeUtc,ProcessingEndTime,RemediationSteps,ExtendedProperties,Entities,SourceSystem,WorkspaceSubscriptionId,WorkspaceResourceGroup,ExtendedLinks,ProductName,ProductComponentName,AlertLink,Status,CompromisedEntity,Tactics,Techniques,Type,SystemAlertId1,ExtendedProperties1,Entities1,MatchingIps,UserPrincipalName,IPAddress


# 6. Get Threat Intelligence reports for sign-in IPs

In [19]:
# look up IP addresses - join UserPrincipalName from source DF to output
ti_user_ip = IpAddress.tilookup_ip(
    risk_users_df.explode("SourceIPs")[["UserPrincipalName", "SourceIPs"]],
    column="SourceIPs",
    join="left"
).query("Severity != 'information'")

summary_report.add_summary_data(
    data=ti_user_ip,
    user_column="UserPrincipalName",
    section="Threat intel reports for user sign-in IP address(es)"
)

df_caption(ti_user_ip, caption="Threat intel reports for risky sign-in IPs")

Observables processed: 100%|██████████| 72/72 [00:45<00:00,  1.57obs/s]


Unnamed: 0,UserPrincipalName,SourceIPs,QuerySubtype,Result,Details,RawResult,Reference,Status,Ioc,IocType,SafeIoc,Severity,Provider


# 7. Look for unusual Azure Audit entries

Look for Azure audit operations by the selected accounts:
operation types that an account used in the current period but
had not used in the baseline period (default: the prior 28 days).
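
The comparison amounts to an anti-join: keep the current (user, operation) pairs that have no established match in the baseline. The sketch below shows that logic in plain Python on made-up events. The function name `new_operations` and the sample data are hypothetical, and unlike a KQL `leftanti` join it returns each unmatched pair only once.

```python
from collections import Counter


def new_operations(baseline_events, current_events, threshold=0):
    """Return current (user, operation) pairs not established in the baseline.

    A pair counts as established when it appears more than `threshold`
    times in the baseline events.
    """
    baseline_counts = Counter(baseline_events)
    established = {pair for pair, count in baseline_counts.items() if count > threshold}
    return sorted(set(current_events) - established)


# Hypothetical (user, operation) events for each period.
baseline = [
    ("alice@example.com", "Update user"),
    ("alice@example.com", "Update user"),
    ("bob@example.com", "Add member to group"),
]
current = [
    ("alice@example.com", "Update user"),
    ("alice@example.com", "Add service principal"),
    ("bob@example.com", "Add member to group"),
]

unusual = new_operations(baseline, current)
print(unusual)
```

Raising `threshold` treats operations seen only rarely in the baseline as unestablished, so they are reported as well.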

In [20]:
# Azure Audit
# Find any operation types for current period that weren't seen for
# that user in previous baseline period
azure_audit_query = """
let start = datetime("{start}");
let end = datetime("{end}");
let baseline_start = start - ({baseline_period} * 1d);
let bl_threshold = {threshold};
let operation_history = AuditLogs
| where TimeGenerated between(baseline_start .. start)
| where Identity !in ("Azure AD Cloud Sync", "Managed Service Identity", "Microsoft.Azure.SyncFabric")
| where bag_has_key(InitiatedBy, "user")
| extend UserPrincipalName = tostring(InitiatedBy["user"]["userPrincipalName"])
| where UserPrincipalName in~ ({users})
| summarize EventCount=count() by UserPrincipalName, OperationName
| where EventCount > bl_threshold;
AuditLogs
| where TimeGenerated between(start .. end)
| where Identity !in ("Azure AD Cloud Sync", "Managed Service Identity", "Microsoft.Azure.SyncFabric")
| where bag_has_key(InitiatedBy, "user")
| extend UserPrincipalName = tostring(InitiatedBy["user"]["userPrincipalName"]), IPAddress = InitiatedBy["user"]["ipAddress"]
| where UserPrincipalName in~ ({users})
| join kind=leftanti (operation_history) on UserPrincipalName, OperationName
| project Identity, UserPrincipalName, OperationName, LoggedByService, InitiatedBy, AdditionalDetails, TargetResources
"""

from datetime import datetime, timezone, timedelta

end = datetime.now(tz=timezone.utc)
start = end - timedelta(1)
fmt_query = azure_audit_query.format(
    start=start,
    end=end,
    baseline_period=baseline_period,
    threshold=0,
    users=get_user_param(risk_users_df),
)
az_audit_df = qry_prov.exec_query(fmt_query)
summary_report.add_summary_data(
    data=az_audit_df,
    user_column="UserPrincipalName",
    section="Unusual Azure Audit log entries for user"
)
df_caption(az_audit_df, caption="Azure audit activity types not seen in baseline period.")

Unnamed: 0,Identity,UserPrincipalName,OperationName,LoggedByService,InitiatedBy,AdditionalDetails,TargetResources


# 8. Look for unusual Office 365 activity

Office operations occurring in the measured period that had
not occurred or rarely occurred in the baseline period.
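
One way to read "rarely occurred" is as a threshold of mean plus N standard deviations over the daily operation counts in the baseline. The sketch below works through that arithmetic with the standard library. The function names and sample counts are hypothetical, `statistics.stdev` is the sample standard deviation, and treating a single-day baseline as zero spread is an assumption of this sketch.

```python
from statistics import mean, stdev


def baseline_threshold(daily_counts, num_stddev=2):
    """Mean + N * stddev of daily counts; stddev falls back to 1.0 when zero."""
    spread = stdev(daily_counts) if len(daily_counts) > 1 else 0.0
    return mean(daily_counts) + num_stddev * (spread if spread > 0 else 1.0)


def is_anomalous(current_count, daily_counts, num_stddev=2):
    """Flag a count above the threshold; with no baseline the threshold is 0."""
    if not daily_counts:
        return current_count > 0
    return current_count > baseline_threshold(daily_counts, num_stddev)


history = [10, 12, 11, 9, 13]  # hypothetical daily counts for one user/operation
print(round(baseline_threshold(history), 3))
print(is_anomalous(15, history), is_anomalous(14, history))
```

An operation with no baseline rows at all is flagged as soon as it occurs once, which covers the "had not occurred" case.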

In [21]:
o365_baseline_activity_query = """
let num_stddev = {std_dev_scale};
let bl_period = datetime_add("day", -{baseline_period}, datetime({start}));
OfficeActivity
| where TimeGenerated between (bl_period .. datetime({start}))
| where UserId in~ ({users})
// count operations by user and op type per day
| summarize OpCount = count() by UserId, OfficeWorkload, Operation, bin(TimeGenerated, 1d)
// calculate mean and standard deviation for the user/op combos
| summarize OpStdev = stdev(OpCount), OpMean = avg(OpCount) by UserId, OfficeWorkload, Operation
// Baseline score = mean + N * stddev (stddev defaults to 1 when variance is 0)
| extend OpBase = OpMean + (num_stddev * iif(OpStdev > 0, OpStdev, 1.0))
| extend RecType="baseline"
"""

o365_current_activity_query = """
OfficeActivity
| where TimeGenerated between (datetime({start}) .. datetime({end}))
| where UserId in~ ({users})
| summarize OpCount = count() by UserId, OfficeWorkload, Operation
| extend RecType="current"
"""

# set number of std deviations from mean to use as indicating
# anomalous activity
_STD_THRESHOLD = 2

end = datetime.now(tz=timezone.utc)
start = end - timedelta(1)
office_baseline_df = qry_prov.exec_query(
    o365_baseline_activity_query.format(
        users=get_user_param(risk_users_df),
        std_dev_scale=_STD_THRESHOLD,
        start=start,
        baseline_period=baseline_period,
    )
)
office_current_df = qry_prov.exec_query(
    o365_current_activity_query.format(
        users=get_user_param(risk_users_df),
        start=start,
        end=end
    )
)

# Pull out any current activity that exceeds the baseline threshold (mean + N*stddev)
office_activity_df = (
    office_current_df
    .merge(office_baseline_df, on=["UserId", "OfficeWorkload", "Operation"], how="left")
    .fillna({"OpBase": 0})
    .query("OpCount > OpBase")
)

In [22]:
df_caption(office_baseline_df, "Office baseline operations.")
df_caption(office_current_df, "Office current operations.")
df_caption(office_activity_df, "Office anomalous operations.")
summary_report.add_summary_data(
    data=office_activity_df,
    user_column="UserId",
    section="Unusual Office activity for user"
)
summary_report.add_summary_data(
    data=office_current_df,
    user_column="UserId",
    section="Summarized current Office activity for user"
)

Unnamed: 0,UserId,OfficeWorkload,Operation,OpStdev,OpMean,OpBase,RecType


Unnamed: 0,UserId,OfficeWorkload,Operation,OpCount,RecType


Unnamed: 0,OpCount,RecType_x,UserId,OfficeWorkload,Operation,OpStdev,OpMean,OpBase,RecType_y


# 9. Look for unusual Azure activity

Azure activity operations occurring in the measured period that had
not occurred in the baseline period.


In [23]:
# Azure Activity
azure_activity_query = """
let start = datetime("{start}");
let end = datetime("{end}");
let baseline_start = start - ({period} * 1d);
let bl_threshold = {threshold};
let operation_history = AzureActivity
| where TimeGenerated between(baseline_start .. start)
| where Caller in~ ({users})
| project UserPrincipalName=Caller, OperationNameValue
| summarize EventCount=count() by UserPrincipalName, OperationNameValue
| where EventCount > bl_threshold;
AzureActivity
| where TimeGenerated between(start .. end)
| where Caller in~ ({users})
| project-rename UserPrincipalName=Caller
| join kind=leftanti (operation_history) on UserPrincipalName, OperationNameValue
| project TimeGenerated, UserPrincipalName, OperationNameValue, IPAddress=CallerIpAddress,
  EventDataId, ActivityStatusValue, ResourceGroup, SubscriptionId, TenantId
"""

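# Compute the time range first so that start and end are consistent
end = datetime.now(tz=timezone.utc)
start = end - timedelta(1)
fmt_query = azure_activity_query.format(
    start=start,
    end=end,
    period=baseline_period,
    threshold=0,
    users=get_user_param(risk_users_df),
)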
azure_activity_df = qry_prov.exec_query(fmt_query)

aa_summary_cols = [
    "UserPrincipalName",
    "OperationNameValue",
    "IPAddress",
    "ResourceGroup",
    "SubscriptionId",
    "TenantId",
]

azure_activity_summary_df = azure_activity_df.groupby(aa_summary_cols).agg(
    EventCount=pd.NamedAgg("TimeGenerated", "count"),
    ActivityStatusValue=pd.NamedAgg("ActivityStatusValue", "unique"),
    StartTime=pd.NamedAgg("TimeGenerated", "min"),
    EndTime=pd.NamedAgg("TimeGenerated", "max"),
).reset_index().sort_values("StartTime", ascending=True)

summary_report.add_summary_data(
    data=azure_activity_summary_df,
    user_column="UserPrincipalName",
    section="Unusual Azure activity for user"
)
df_caption(azure_activity_summary_df, "Azure activity operations not seen in baseline period.")

Unnamed: 0,UserPrincipalName,OperationNameValue,IPAddress,ResourceGroup,SubscriptionId,TenantId,EventCount,ActivityStatusValue,StartTime,EndTime


# 10. Summarize and upload data

Create a dynamic summary for each user and upload it to Sentinel.

> Note: we could offer the option to group by report type instead
> of user. That would result in a Dynamic Summary entry for each
> report type (with consistent schema) but with data from (potentially)
> multiple users.
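
A minimal sketch of the alternative grouping mentioned in the note, assuming the reports are held as a nested mapping of user to report type to data. The function name `regroup_by_report_type` and the sample data are hypothetical; in the notebook the inner values are summary items rather than plain lists.

```python
from collections import defaultdict


def regroup_by_report_type(reports_by_user):
    """Invert {user: {report_type: data}} into {report_type: {user: data}}."""
    by_type = defaultdict(dict)
    for user, reports in reports_by_user.items():
        for report_type, data in reports.items():
            by_type[report_type][user] = data
    return dict(by_type)


# Hypothetical per-user reports keyed by section name.
reports_by_user = {
    "alice@example.com": {
        "Related alerts for user": ["alert-1"],
        "Unusual Azure activity for user": ["op-1"],
    },
    "bob@example.com": {
        "Related alerts for user": ["alert-2"],
    },
}

by_type = regroup_by_report_type(reports_by_user)
for report_type, users in by_type.items():
    print(report_type, "->", sorted(users))
```

Each key of the result would then map to one summary with a consistent schema across users.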

In [25]:
# Iterate through summary reports and create a summary for each user
try:
    from notebookutils import mssparkutils
    synapse_workspace = mssparkutils.env.getWorkspaceName()
except ImportError:
    synapse_workspace = "none"

def create_dynamic_summaries(summary_report):
    """Create a dynamic summary for each account."""
    dynamic_summaries = []
    for user, reports in summary_report.summary_reports.items():
        # Create a summary for each user
        user_ds = DynamicSummary(
            summary_name=f"AccountEvaluation - {user}",
            summary_description="Summary generated from AccountSignInEvaluation notebook.",
            source_info={
                "source": "notebooks",
                "notebook": "AccountSignInEvaluation.ipynb",
                "synapse_workspace": synapse_workspace
            }
        )

        for report_type, summary_item in reports.items():
            ds_item_params = {
                "event_time_utc": end,
                "search_key": user,
                "observable_type": "report_type",
                "observable_value": report_type
            }
            user_ds.add_summary_items(
                data=summary_item.data,
                **ds_item_params
            )
        dynamic_summaries.append(user_ds)
    return dynamic_summaries


def upload_dynamic_summaries(sentinel, dynamic_summaries):
    """Upload summaries to Sentinel."""
    for dyn_summary in dynamic_summaries:
        # Create or update the report
        print("Uploading", dyn_summary.summary_name)
        sentinel.create_dynamic_summary(dyn_summary)


dynamic_summaries = create_dynamic_summaries(summary_report)
upload_dynamic_summaries(sentinel, dynamic_summaries)

Uploading AccountEvaluation - aauvinen@microsoft.com
Dynamic summary created/updated.
Uploading AccountEvaluation - adstadel@microsoft.com
Dynamic summary created/updated.
Uploading AccountEvaluation - aguruswamy@contosohotels.com
Dynamic summary created/updated.
Uploading AccountEvaluation - asekstee@microsoft.com
Dynamic summary created/updated.
Uploading AccountEvaluation - aweinkopf@microsoft.com
Dynamic summary created/updated.
Uploading AccountEvaluation - elsherif@microsoft.com
Dynamic summary created/updated.
Uploading AccountEvaluation - jank@seccxpninja.onmicrosoft.com
Dynamic summary created/updated.
Uploading AccountEvaluation - malarkin@microsoft.com
Dynamic summary created/updated.
Uploading AccountEvaluation - ragomeri@microsoft.com
Dynamic summary created/updated.
Uploading AccountEvaluation - rickkotlarz@microsoft.com
Dynamic summary created/updated.
Uploading AccountEvaluation - suzanac@contosohotels.com
Dynamic summary created/updated.
Uploading AccountEvaluation - t

## Appendix - Pickling and restoring data

The following cells allow the summary data to be pickled
and stored as a file. The final cell stores the dynamic
summaries (base64-encoded) in a notebook cell, so that
they can be restored later.

These are all commented-out. Uncomment to use any of these.
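
The round trip behind these cells can be sketched with a plain Python object standing in for the dynamic summaries. The variable names and sample data below are hypothetical; restoring real summaries additionally requires their class to be importable. Only unpickle data from a source you trust, because unpickling can execute arbitrary code.

```python
import pickle
from base64 import b64decode, b64encode

# Hypothetical stand-in for the list of dynamic summaries.
summaries = [
    {"summary_name": "AccountEvaluation - user@example.com", "items": [1, 2, 3]},
]

# Pickle to bytes, then base64-encode so the data can be pasted as text.
encoded = b64encode(pickle.dumps(summaries))

# Decode and unpickle to recover an equal object.
restored = pickle.loads(b64decode(encoded))
print(restored == summaries)
```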

In [None]:
# # Save dynamic summaries to a pickle file
# import pickle
# obj = pickle.dumps(dynamic_summaries)

# with open("acct_nb_summaries.pkl", "wb") as pickle_file:
#     pickle_file.write(obj)

In [None]:
# # Restore dynamic summaries from a pickle file
# # note - you need to have the DynamicSummary class imported
# # to successfully restore the summary report
# import pickle
# from msticpy.context.azure.sentinel_dynamic_summary import DynamicSummary
# with open("acct_nb_summaries.pkl", "rb") as pickle_file:
#     summary_obj = pickle_file.read()
#     dynamic_summaries_copy = pickle.loads(summary_obj)

# print([ds.summary_name for ds in dynamic_summaries_copy])

['AccountEvaluation - tamuto@seccxpninja.onmicrosoft.com']


In [None]:
# # Save dynamic summaries to a base64-encoded cell

# from base64 import b64encode
# from IPython.core.getipython import get_ipython

# cell_code = """#########################################
# # Run this cell to restore cached data to
# # the object "{var_name}"
# #########################################

# from base64 import b64decode
# import pickle

# ## Store dynamic summaries as base64 byte string
# summary_data = {encoded_bytes}

# # decode and unpickle the summaries
# {var_name} = pickle.loads(b64decode(summary_data))
# {var_name}
# """

# def persist_to_cell(dynamic_summaries, var_name="acct_summaries"):
#     encoded_bytes = b64encode(pickle.dumps(dynamic_summaries))
#     cell_text = cell_code.format(
#         encoded_bytes=encoded_bytes,
#         var_name=var_name,
#     )
#     shell = get_ipython()
#     # create a new cell using `cell_code` as the code contents.
#     shell.set_next_input(cell_text)

# persist_to_cell(dynamic_summaries)