# 01_setup_environment – Sentinel APT Hunt

This notebook installs the required packages, authenticates to Azure, connects to Microsoft Sentinel through MSTICPy, and demonstrates a simple AutoEncoder-based anomaly-detection model for APT hunting.


## Install Python packages
Run the cell below **once** (or whenever you update `requirements.txt`).

In [None]:
!pip install --quiet -r ../requirements.txt
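
After installing, it can be useful to confirm that the packages imported later in this notebook are actually visible to the running kernel. The following is a minimal sketch using only the standard library; `REQUIRED_MODULES` and `find_missing` are illustrative names, and the list reflects the imports used in this notebook rather than the full contents of `requirements.txt`.

```python
import importlib.util

# Import names used later in this notebook (scikit-learn imports as "sklearn").
REQUIRED_MODULES = ["msticpy", "pyod", "pandas", "sklearn"]


def find_missing(modules):
    """Return the module names that cannot be located in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print(f"Missing packages: {', '.join(missing)} - re-run the install cell.")
else:
    print("All required packages are importable.")
```

If anything is reported missing, restarting the kernel after the install cell is often enough for the new packages to be picked up.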

## Authenticate to Azure

In [None]:

from msticpy.init import nbinit
nbinit.init_notebook(
    auth_methods=["cli", "msi", "devicecode"],
    verbose=False
)
print("MSTICPy initialized – authentication attempted using CLI/MSI/devicecode.")
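
The `cli` method depends on an existing Azure CLI sign-in, so a quick check that the `az` executable is on the `PATH` can save some debugging. This is a small sketch under that assumption; `azure_cli_path` is a hypothetical helper, not part of MSTICPy, and it does not verify that a login session is still valid.

```python
import shutil


def azure_cli_path():
    """Return the path to the `az` executable, or None if it is not on PATH."""
    return shutil.which("az")


cli = azure_cli_path()
if cli is None:
    print("Azure CLI not found on PATH; the 'cli' method cannot succeed here.")
else:
    print(f"Azure CLI found at {cli}; run `az login` first if you have not signed in.")
```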


## Connect to Sentinel and run a test query

In [None]:

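from msticpy.data import QueryProvider

# Create a Sentinel query provider. Called with no arguments, connect()
# relies on the workspace settings in your msticpyconfig.yaml.
sentinel = QueryProvider("AzureSentinel")
sentinel.connect()

# Smoke test: pull five rows to confirm that the connection works.
df = sentinel.exec_query("SecurityEvent | take 5")
display(df)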
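
Before moving from the five-row test query to larger time windows, it may help to compose the KQL text in one place so the look-back period and projected columns are easy to adjust. The sketch below uses a hypothetical helper, `build_query`, which is not part of MSTICPy; it only builds a string and does not contact Sentinel.

```python
def build_query(table, days, columns):
    """Compose a KQL query that limits a table to a recent time window."""
    if days <= 0:
        raise ValueError("days must be positive")
    if not columns:
        raise ValueError("columns must not be empty")
    return (
        f"{table}\n"
        f"| where TimeGenerated > ago({days}d)\n"
        f"| project {', '.join(columns)}"
    )


query = build_query("SecurityEvent", 7, ["TimeGenerated", "EventID", "Account", "Computer"])
print(query)
```

The resulting string can be passed to `sentinel.exec_query(query)`. Starting with a short window such as 7 days keeps the result small while you check that the columns exist in your workspace.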


## Train an AutoEncoder model for anomaly detection

In [None]:

import pandas as pd
from pyod.models.auto_encoder_torch import AutoEncoder
from sklearn.preprocessing import StandardScaler

# Example: fetch 30 days of security events (all event IDs, not only process creation)
proc_df = sentinel.exec_query("SecurityEvent | where TimeGenerated > ago(30d) | project TimeGenerated, EventID, Account, Computer")
if proc_df.empty:
    print("No data returned – adjust the query to match your environment.")
else:
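    # Basic feature encoding: demo purposes only.
    # Cast to str so the numeric EventID is one-hot encoded as a category
    # instead of being passed through as a number.
    features = proc_df[['EventID', 'Account', 'Computer']].fillna("missing").astype(str)
    X = pd.get_dummies(features, dtype=float)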
    scaler = StandardScaler()
    X_scaled = scaler.fit_transform(X)

    model = AutoEncoder(epochs=20, contamination=0.01, hidden_neurons=[32, 16, 8, 16, 32])
    model.fit(X_scaled)

    scores = model.decision_scores_
    proc_df['anomaly_score'] = scores
    # Flag the top 1% of scores as anomalies
    threshold = pd.Series(scores).quantile(0.99)
    anomalies = proc_df[proc_df['anomaly_score'] >= threshold]
    print(f"Detected {len(anomalies)} high-scoring anomalies:")
    display(anomalies.head())
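
One detail of the encoding step is worth checking on a small example: by default `pd.get_dummies` only expands string, object, and categorical columns, so an integer column such as `EventID` is left as a single numeric feature unless it is cast to a string first. The sketch below illustrates this on a tiny synthetic frame; the values are made up and stand in for real Sentinel results.

```python
import pandas as pd

# Tiny synthetic frame (made-up values) mirroring two of the notebook's feature columns.
events = pd.DataFrame({
    "EventID": [4624, 4688, 4624],
    "Account": ["alice", "bob", "alice"],
})

# Default behaviour: the integer EventID column is passed through unchanged.
as_is = pd.get_dummies(events)
print(list(as_is.columns))

# Casting to str first makes every column categorical, so EventID is expanded too.
as_categories = pd.get_dummies(events.astype(str), dtype=float)
print(list(as_categories.columns))
```

Treating `EventID` as a category avoids implying that, say, event 4688 is numerically "larger" than event 4624, which would distort the scaled input to the model.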
