# Scenario 02 — Incremental Help Notebook

This notebook provides **optional, step‑by‑step hints** for Scenario 02.

Each section includes:
- A **goal**
- A **first hint** (conceptual)
- A **second hint** (more concrete)
- A **reveal cell** with example code (optional)

Use this notebook only when you need it — the main scenario notebook is designed to be completed independently.

## 1. Understanding the Data Sources

**Goal:** Familiarize yourself with the structure of the three log files.

### First Hint
Use `.head()` and `.info()` to inspect each dataframe.

### Second Hint
Look for:
- Timestamp columns
- User identifiers
- Hostnames
- Event types
- Ports (for network logs)

### Reveal (optional)

In [ ]:
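# Example exploration code
auth_df.info()  # prints column names, dtypes, and non-null counts
auth_df.head()  # last expression in the cell, so the first rows are displayed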
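
To compare all three logs side by side, a small loop can summarize them in one table. This is a minimal sketch: the name `proc_df`, the column names, and the one-row frames are placeholders so the cell runs on its own. In the scenario, pass in the dataframes you already loaded.

```python
import pandas as pd

# Placeholder frames so this cell runs standalone (names and columns are assumptions).
auth_df = pd.DataFrame({"timestamp": ["2024-03-01T08:15:00Z"],
                        "user": ["alice"],
                        "event_type": ["failed_login"]})
proc_df = pd.DataFrame({"timestamp": ["2024-03-01T08:16:00Z"],
                        "host": ["ws-01"],
                        "process_name": ["powershell.exe"]})
net_df = pd.DataFrame({"timestamp": ["2024-03-01T08:17:00Z"],
                       "src_port": [50512],
                       "dst_port": [443]})


def summarize(frames):
    """Return one row per dataframe: row count, column count, column names."""
    rows = []
    for name, df in frames.items():
        rows.append({
            "log": name,
            "rows": len(df),
            "columns": df.shape[1],
            "column_names": ", ".join(df.columns),
        })
    return pd.DataFrame(rows).set_index("log")


overview = summarize({"auth": auth_df, "process": proc_df, "network": net_df})
print(overview)
```

Columns that appear in more than one log, such as a timestamp or host field, are the natural keys for correlating events later.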

## 2. Normalizing Timestamps

**Goal:** Convert timestamps into a consistent, timezone‑aware format.

### First Hint
Use `pd.to_datetime()` with `utc=True`.

### Second Hint
If parsing fails on values ending in `+00:00Z`, the string carries both a UTC offset and a `Z` suffix. Remove the trailing `Z` before parsing.

### Reveal (optional)

In [ ]:
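# Example timestamp normalization
auth_df['timestamp'] = pd.to_datetime(
    # Strip only a trailing 'Z'; a plain replace('Z', '') would remove every 'Z'
    auth_df['timestamp'].astype(str).str.replace(r'Z$', '', regex=True),
    utc=True,
    errors='coerce'  # unparseable values become NaT instead of raising
)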
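
Because `errors='coerce'` silently turns bad values into `NaT`, it helps to count them after parsing. The sketch below wraps the same idea in a hypothetical helper, `normalize_timestamps`, and runs it on made-up sample strings rather than the scenario data.

```python
import pandas as pd


def normalize_timestamps(series):
    """Parse ISO-like strings into timezone-aware UTC timestamps."""
    # Drop a redundant trailing 'Z' (as in '+00:00Z') but leave other characters alone.
    cleaned = series.astype(str).str.replace(r"Z$", "", regex=True)
    return pd.to_datetime(cleaned, utc=True, errors="coerce")


# Made-up sample values, including one that cannot be parsed.
raw = pd.Series([
    "2024-03-01T08:15:00+00:00Z",
    "2024-03-01T09:30:00+00:00Z",
    "not-a-time",
])

parsed = normalize_timestamps(raw)
print(parsed)
print("unparseable rows:", int(parsed.isna().sum()))
```

If the unparseable count is larger than expected, inspect those raw values before dropping them; they may use a different format rather than being junk.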

## 3. Initial SOC Sweep

**Goal:** Identify suspicious patterns in auth, process, and network logs.

### First Hint
Start by filtering for unusual or rare events.

### Second Hint
Examples:
- Failed logins
- Rare processes
- Unusual ports

### Reveal (optional)

In [ ]:
# Example: failed logins
auth_df[auth_df['event_type'] == 'failed_login'].head()
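
Rare processes and unusual ports can be surfaced with the same frequency-count pattern. The sketch below is illustrative only: `rare_values` is a hypothetical helper, the column names `process_name` and `dst_port` are assumptions about the log schema, and the data is made up.

```python
import pandas as pd


def rare_values(series, max_count=1):
    """Return values that occur at most max_count times, with their counts."""
    counts = series.value_counts()
    return counts[counts <= max_count]


# Made-up sample data; replace with the scenario's process and network logs.
proc_df = pd.DataFrame({
    "process_name": ["chrome.exe"] * 5 + ["svchost.exe"] * 4 + ["powershell.exe"],
})
net_df = pd.DataFrame({
    "dst_port": [443] * 6 + [80] * 3 + [4444],
})

rare_procs = rare_values(proc_df["process_name"])
rare_ports = rare_values(net_df["dst_port"])

print(rare_procs)
print(rare_ports)
```

Rarity alone is not proof of malice, so treat the output as a list of candidates to review rather than confirmed findings.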

## 4. Extracting IOCs

**Goal:** Identify suspicious IPs, users, hosts, or processes.

### First Hint
Start with the events you flagged as suspicious in Step 3.

### Second Hint
Look for values that:
- Appear rarely
- Are associated with failed logins
- Are linked to suspicious processes
- Use unusual ports

### Reveal (optional)

In [ ]:
# Example IOC extraction pattern
suspicious_ips = auth_df[auth_df['event_type'] == 'failed_login']['src_ip'].unique().tolist()
suspicious_ips
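
Taking every IP with a single failed login can produce a noisy list, since ordinary users mistype passwords too. One possible refinement is to keep only sources that exceed a failure threshold. In this sketch the threshold `MIN_FAILURES`, the `login_success` event type, and the sample rows (which use documentation-reserved IP ranges) are all assumptions.

```python
import pandas as pd

# Made-up sample data using documentation-reserved IP ranges.
auth_df = pd.DataFrame({
    "event_type": ["failed_login"] * 4 + ["login_success", "failed_login"],
    "src_ip": ["203.0.113.7"] * 4 + ["192.0.2.10", "192.0.2.10"],
})

MIN_FAILURES = 3  # hypothetical threshold; tune it to the scenario's data

# Count failed logins per source IP.
failed = auth_df[auth_df["event_type"] == "failed_login"]
failure_counts = failed["src_ip"].value_counts()

# Keep only sources at or above the threshold.
suspicious_ips = failure_counts[failure_counts >= MIN_FAILURES].index.tolist()

print(failure_counts)
print(suspicious_ips)
```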

## 5. MITRE ATT&CK Mapping

**Goal:** Map observed behavior to ATT&CK techniques.

### First Hint
Think about the *type* of activity: brute force, credential use, remote execution, etc.

### Second Hint
Common mappings:
- Brute force → `T1110`
- Valid accounts → `T1078`
- Suspicious scripting → `T1059`
- Remote services → `T1021`

### Reveal (optional)

In [ ]:
# Example mapping
mitre = ["T1110", "T1078"]  # Brute force, Valid accounts
mitre
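
The common mappings listed in the second hint can also be kept in a lookup table, so the technique list is derived from the behaviors you observed rather than typed by hand. The behavior labels and the helper `map_to_attack` below are hypothetical; the technique IDs are the ones listed above.

```python
# Behavior labels are hypothetical; technique IDs come from the hint above.
TECHNIQUES = {
    "brute_force": "T1110",
    "valid_accounts": "T1078",
    "suspicious_scripting": "T1059",
    "remote_services": "T1021",
}


def map_to_attack(observed):
    """Return sorted, de-duplicated technique IDs for the observed behaviors."""
    unknown = [b for b in observed if b not in TECHNIQUES]
    if unknown:
        raise KeyError(f"No mapping defined for: {unknown}")
    return sorted({TECHNIQUES[b] for b in observed})


mitre = map_to_attack(["brute_force", "valid_accounts", "brute_force"])
print(mitre)
```

Raising on unknown labels keeps a typo from silently dropping a technique from the final report.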

## 6. Feature Engineering for Network ML

**Goal:** Choose features that capture meaningful network behavior.

### First Hint
Start with simple numeric features.

### Second Hint
Try:
- `hour`
- `src_port`
- `dst_port`

### Reveal (optional)

In [ ]:
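# Example feature selection
# 'hour' is a derived feature; this assumes net_df['timestamp'] was parsed as in Step 2
net_df['hour'] = net_df['timestamp'].dt.hour
net_features = net_df[['hour', 'src_port', 'dst_port']]
net_features.head()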

## 7. Choosing an ML Model

**Goal:** Select an unsupervised anomaly detection model.

### First Hint
`IsolationForest` is a good default: it needs no labels and, being tree-based, is not sensitive to feature scaling.

### Second Hint
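Other options:
- `LocalOutlierFactor` (set `novelty=True` if you need to score data other than the training set)
- `OneClassSVM` (scale the features first; the RBF kernel is sensitive to feature magnitude)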

### Reveal (optional)

In [ ]:
# Example model initialization
from sklearn.ensemble import IsolationForest
# contamination is the expected share of anomalies; 0.05 is a starting guess to tune
model = IsolationForest(n_estimators=100, contamination=0.05, random_state=42)
model
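
For comparison, the two alternatives can be initialized in a similar way. This sketch uses random placeholder data instead of the scenario's network features, and the hyperparameters (`n_neighbors=20`, `nu=0.05`) are illustrative starting points, not recommendations from the scenario.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

# Random placeholder data standing in for the network feature matrix.
rng = np.random.default_rng(42)
X_train = rng.normal(size=(200, 3))
X_new = rng.normal(size=(10, 3))

candidates = {
    # novelty=True enables predict() and decision_function() on unseen data.
    "LocalOutlierFactor": LocalOutlierFactor(n_neighbors=20, novelty=True),
    # The RBF kernel is scale-sensitive, so standardize features first.
    "OneClassSVM": make_pipeline(
        StandardScaler(),
        OneClassSVM(kernel="rbf", gamma="scale", nu=0.05),
    ),
}

predictions = {}
for name, estimator in candidates.items():
    estimator.fit(X_train)
    # predict() returns 1 for inliers and -1 for outliers.
    predictions[name] = estimator.predict(X_new)
    print(name, predictions[name])
```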

## 8. Training the Model

**Goal:** Fit your model on the full network dataset.

### First Hint
Use the features you selected earlier.

### Second Hint
Call `.fit()` on your model.

### Reveal (optional)

In [ ]:
# Example training
model.fit(net_features)

## 9. Scoring Suspicious Events

**Goal:** Compute anomaly scores for suspicious network events.

### First Hint
Use the same features you trained on.

### Second Hint
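Use `decision_function()` or `score_samples()`. For both, lower scores mean more anomalous; with `IsolationForest`, `decision_function()` is shifted so that negative values indicate outliers.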

### Reveal (optional)

In [ ]:
# Example scoring
# sus_net_features: the suspicious rows of net_df, with the same feature columns used in training
sus_scores = model.decision_function(sus_net_features)
sus_scores[:5]
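
Steps 6 through 9 can be seen working together in a small end-to-end run. The sketch below uses synthetic data, and the rule for picking suspicious rows (any destination port other than 80 or 443) is only a stand-in for whatever you flagged during your own sweep.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

# Synthetic "normal" traffic: business hours, ephemeral source ports, web ports.
rng = np.random.default_rng(0)
n = 300
normal = pd.DataFrame({
    "hour": rng.integers(8, 18, size=n),
    "src_port": rng.integers(49152, 65535, size=n),
    "dst_port": rng.choice([80, 443], size=n),
})
# One made-up odd connection: early morning, low source port, rare destination port.
odd = pd.DataFrame({"hour": [3], "src_port": [1337], "dst_port": [4444]})
net_df = pd.concat([normal, odd], ignore_index=True)

feature_cols = ["hour", "src_port", "dst_port"]
model = IsolationForest(n_estimators=100, contamination=0.05, random_state=42)
model.fit(net_df[feature_cols])

# Stand-in selection rule for "suspicious" rows (an assumption for this sketch).
sus_mask = ~net_df["dst_port"].isin([80, 443])
sus_net_features = net_df.loc[sus_mask, feature_cols]

# Lower scores are more anomalous; predict() returns -1 for outliers, 1 for inliers.
sus_scores = model.decision_function(sus_net_features)
sus_labels = model.predict(sus_net_features)

print(sus_net_features.assign(score=sus_scores, label=sus_labels))
```

Selecting the suspicious rows with the same `feature_cols` list used for training keeps the column order and names consistent between `fit()` and scoring.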

## 10. Writing a Triage Summary

**Goal:** Summarize the incident clearly and concisely.

### First Hint
Describe what happened, how you know, and why it matters.

### Second Hint
Include:
- Key IOCs
- MITRE techniques
- Severity

### Reveal (optional)

In [ ]:
# Example summary structure
example_summary = """
Suspicious authentication failures were followed by unusual process execution
and outbound connections to rare ports. Indicators suggest credential misuse
and possible lateral movement.
"""
print(example_summary.strip())  # print so line breaks render instead of showing as \n
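
To make sure the summary always carries the items from the second hint, it can be assembled from structured fields. This is one possible layout, not a required format: `build_triage_summary` is a hypothetical helper, and the IOC and severity values are placeholders rather than scenario findings.

```python
def build_triage_summary(narrative, iocs, techniques, severity):
    """Assemble a short plain-text triage summary from structured fields."""
    lines = [
        f"Severity: {severity}",
        # Collapse line breaks and extra spaces in the narrative.
        f"Summary: {' '.join(narrative.split())}",
        "Key IOCs: " + ", ".join(iocs),
        "MITRE ATT&CK: " + ", ".join(techniques),
    ]
    return "\n".join(lines)


narrative = """
Suspicious authentication failures were followed by unusual process execution
and outbound connections to rare ports. Indicators suggest credential misuse
and possible lateral movement.
"""

report = build_triage_summary(
    narrative,
    iocs=["203.0.113.7"],           # placeholder value, not a scenario finding
    techniques=["T1110", "T1078"],
    severity="High",                # placeholder; justify severity from your evidence
)
print(report)
```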