# Notebook 05 – Interactive Inference and CLI Demo

This notebook demonstrates how to use the trained baseline model for real-time incident classification through an interactive command-line interface. The goal is to provide a simple, operational prototype that security analysts could use to quickly triage incoming incidents.

## Purpose and Use Cases

The interactive triage session simulates how the model would be deployed in a production SOC environment:

**Workflow:**
1. Analyst receives an incident description (from user report, SIEM alert, or ticket)
2. Analyst enters the description into the CLI tool
3. Model returns predicted event type with confidence scores
4. Analyst reviews probabilities to assess classification certainty
5. High-confidence predictions can be auto-triaged; low-confidence cases require manual review
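
Steps 4 and 5 amount to a routing rule on the predicted probabilities. A minimal sketch follows, assuming a probability dictionary shaped like the one the model returns. The function name `route_prediction` and the 0.80 cutoff are hypothetical, and a real threshold would need tuning on validation data.

```python
from typing import Dict, Tuple

# Hypothetical cutoff; not taken from the notebook. Tune on validation data.
AUTO_TRIAGE_THRESHOLD = 0.80


def route_prediction(proba: Dict[str, float],
                     threshold: float = AUTO_TRIAGE_THRESHOLD) -> Tuple[str, str, float]:
    """Return (route, top_label, top_probability) for one prediction."""
    if not proba:
        raise ValueError("proba must contain at least one class")
    # Pick the class with the highest predicted probability.
    top_label, top_p = max(proba.items(), key=lambda kv: kv[1])
    route = "auto_triage" if top_p >= threshold else "manual_review"
    return route, top_label, top_p


# Illustrative probabilities, not real model output.
print(route_prediction({"phishing": 0.93, "malware": 0.05, "benign_activity": 0.02}))
print(route_prediction({"data_exfiltration": 0.56, "insider_threat": 0.36, "policy_violation": 0.08}))
```

With this cutoff, the first call is routed to `auto_triage` and the second to `manual_review`.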

**Key Features:**
- **Real-time prediction**: Instant classification of free-text descriptions
- **Probability scores**: Computes a probability for every event type and displays the top 5
- **Interactive loop**: Continuous session for testing multiple scenarios
- **Text preprocessing**: Automatically applies the same cleaning pipeline used during training

**Example Use Cases:**
- Testing how the model handles edge cases or ambiguous descriptions
- Validating model behavior on real incident narratives from your environment
- Prototyping integration with ticketing systems or SIEM platforms
- Training new analysts on incident classification patterns

This notebook serves as a foundation for building production-ready inference APIs, web dashboards, or automated triage workflows.

In [1]:
import os
import sys

import joblib

sys.path.append(os.path.abspath("../src"))  # Make the local `triage` package importable
from triage.preprocess import clean_description  # Text cleaning used during training

vectorizer = joblib.load("../models/vectorizer.joblib")  # Load the vectorizer
clf = joblib.load("../models/baseline_logreg.joblib")  # Load the baseline classifier

print("---Baseline model and vectorizer loaded.---")

def predict_event_type(raw_text: str):
    """Return (label, {class: probability}) for one description.

    The second element is None if the classifier has no predict_proba method.
    """
    clean = clean_description(raw_text)  # Apply the training-time cleaning
    X = vectorizer.transform([clean])  # Vectorize a single-document batch
    label = clf.predict(X)[0]
    if hasattr(clf, "predict_proba"):
        proba = clf.predict_proba(X)[0]
        proba_dict = dict(zip(clf.classes_, proba))  # Map class name to probability
        return label, proba_dict
    return label, None

---Baseline model and vectorizer loaded.---


In [2]:
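def interactive_triage_session():
    """Prompt for incident descriptions in a loop and print each prediction.

    The session ends on an empty line, 'quit', 'exit', or 'q', or on Ctrl+C / EOF.
    """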
    print("=" * 60)
    print("Interactive Incident Triage Session")
    print("=" * 60)
    print("Enter an incident description to classify it.")
    print("To exit: press Enter on empty line, type 'quit', or type 'exit'")
    print("=" * 60 + "\n")

    while True:
        try:
            raw_text = input("Incident> ").strip()
            if raw_text == "" or raw_text.lower() in {"quit", "exit", "q"}:
                print("\n" + "=" * 60)
                print("Exiting interactive triage session.")
                print("=" * 60)
                break

            print(f"\nInput description: {raw_text}")

            label, proba = predict_event_type(raw_text)
            print(f"Predicted event_type: {label}")

            if proba is not None:
                print("\nTop 5 class probabilities:")
                for cls, p in sorted(proba.items(), key=lambda x: x[1], reverse=True)[:5]:
                    print(f"  {cls}: {p:.3f}")
            print("")
        except (KeyboardInterrupt, EOFError):
            print("\n\n" + "=" * 60)
            print("Session interrupted. Exiting...")
            print("=" * 60)
            break
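
Because the loop above reads from `input()`, it is awkward to script or unit-test. One possible variant, sketched here, takes the input lines and the prediction function as arguments. `run_session` and `stub_predict` are hypothetical names, and the stub stands in for `predict_event_type` so the block runs without the saved model files.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Prediction = Tuple[str, Optional[Dict[str, float]]]

EXIT_WORDS = {"quit", "exit", "q"}  # Same exit words as the interactive loop


def run_session(lines: Iterable[str],
                predict_fn: Callable[[str], Prediction],
                top_k: int = 5) -> List[str]:
    """Classify each line until an empty line or exit word; return report lines."""
    report: List[str] = []
    for line in lines:
        text = line.strip()
        if text == "" or text.lower() in EXIT_WORDS:
            break
        label, proba = predict_fn(text)
        report.append(f"Predicted event_type: {label}")
        if proba is not None:
            # Highest probability first, truncated to the top_k classes.
            ranked = sorted(proba.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
            report.extend(f"  {cls}: {p:.3f}" for cls, p in ranked)
    return report


def stub_predict(text: str) -> Prediction:
    """Hypothetical stand-in for predict_event_type; not a real model."""
    if "email" in text.lower():
        return "phishing", {"phishing": 0.9, "malware": 0.1}
    return "benign_activity", None


for row in run_session(["User got a suspicious email", "quit", "never reached"], stub_predict):
    print(row)
```

In the notebook, passing `predict_event_type` as `predict_fn` would exercise the real model with the same exit behavior.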

## Example Test Cases

Here are some example incident descriptions to try in the interactive session:

**Phishing Examples:**
- "User received an email claiming to be from IT asking them to verify their VPN password using a link."
- "Multiple employees reported emails about mandatory security training with suspicious links."

**Malware Examples:**
- "EDR detected a suspicious PowerShell process spawning from Outlook and connecting to an external IP."
- "Endpoint started encrypting user documents and displaying a ransom note demanding bitcoin."

**Access Abuse Examples:**
- "Multiple failed login attempts for a privileged admin account from a foreign country outside business hours."
- "SSO logs show the same user logging in from the US and Europe within 10 minutes."

**Data Exfiltration Examples:**
- "Employee downloaded a large number of files from confidential SharePoint and uploaded to personal Google Drive."
- "Proxy logs show multi-GB uploads to unfamiliar cloud storage from a finance workstation after hours."

**Web Attack Examples:**
- "WAF observed repeated HTTP requests with SQL injection payloads against the /login endpoint."
- "Large number of failed login attempts against customer portal from a small set of IPs."

**Policy Violation Examples:**
- "User installed an unauthorized remote access tool that started connecting to external IPs."
- "DLP detected sensitive files being copied to an unencrypted USB drive."

**Benign Activity Examples:**
- "Server performance degraded during planned Windows patch cycle and backup job in maintenance window."
- "User opened ticket about slow email, but logs show normal traffic and no suspicious activity."

Try these examples or create your own to explore model behavior!
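
The examples above can also be run as a batch smoke test instead of being typed one at a time. The sketch below assumes label names such as `phishing`, `malware`, and `web_attack`, inferred from the headings and from the snake_case labels in the session output; the actual label set may differ. `find_mismatches` and `keyword_stub` are hypothetical helpers, and the stub replaces `predict_event_type` so the block runs without the model files.

```python
from typing import Callable, Dict, List, Optional, Tuple

Prediction = Tuple[str, Optional[Dict[str, float]]]

# (expected_label, description) pairs; expected labels are assumptions.
EXAMPLES: List[Tuple[str, str]] = [
    ("phishing", "User received an email claiming to be from IT asking them "
                 "to verify their VPN password using a link."),
    ("malware", "Endpoint started encrypting user documents and displaying "
                "a ransom note demanding bitcoin."),
    ("web_attack", "WAF observed repeated HTTP requests with SQL injection "
                   "payloads against the /login endpoint."),
]


def find_mismatches(examples: List[Tuple[str, str]],
                    predict_fn: Callable[[str], Prediction]) -> List[Tuple[str, str, str]]:
    """Return (expected, predicted, text) for every example the predictor gets wrong."""
    mismatches = []
    for expected, text in examples:
        predicted, _ = predict_fn(text)
        if predicted != expected:
            mismatches.append((expected, predicted, text))
    return mismatches


def keyword_stub(text: str) -> Prediction:
    """Hypothetical stand-in for predict_event_type; matches on keywords only."""
    lowered = text.lower()
    if "email" in lowered:
        return "phishing", None
    if "ransom" in lowered:
        return "malware", None
    return "benign_activity", None


for expected, predicted, text in find_mismatches(EXAMPLES, keyword_stub):
    print(f"expected {expected}, got {predicted}: {text[:50]}...")
```

The keyword stub misses the web attack example, which illustrates how such a check would surface descriptions worth a closer look.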

## Run Interactive Session

Execute the cell below to start the interactive triage session. You can test the examples above or enter your own incident descriptions.

In [3]:
# Execute this cell to start the interactive triage session
interactive_triage_session()

Interactive Incident Triage Session
Enter an incident description to classify it.
To exit: press Enter on empty line, type 'quit', or type 'exit'


Input description: Employee downloaded a large number of files from confidential SharePoint and uploaded to personal Google Drive.
Predicted event_type: data_exfiltration

Top 5 class probabilities:
  data_exfiltration: 0.557
  insider_threat: 0.359
  policy_violation: 0.047
  benign_activity: 0.015
  access_abuse: 0.005


Exiting interactive triage session.
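
In the session output above, `data_exfiltration` (0.557) leads `insider_threat` (0.359) by a fairly small gap, which may be the kind of low-confidence case that calls for manual review. One way to quantify this, sketched below, is the margin between the top two probabilities. `top_two_margin` is a hypothetical helper and not part of the `triage` package.

```python
from typing import Dict, Tuple


def top_two_margin(proba: Dict[str, float]) -> Tuple[str, str, float]:
    """Return (best_label, runner_up_label, probability gap between them)."""
    if len(proba) < 2:
        raise ValueError("need at least two classes to compute a margin")
    ranked = sorted(proba.items(), key=lambda kv: kv[1], reverse=True)
    (best, p1), (second, p2) = ranked[0], ranked[1]
    return best, second, p1 - p2


# Probabilities copied from the session output above.
session_proba = {
    "data_exfiltration": 0.557,
    "insider_threat": 0.359,
    "policy_violation": 0.047,
    "benign_activity": 0.015,
    "access_abuse": 0.005,
}

best, second, margin = top_two_margin(session_proba)
print(f"{best} vs {second}: margin {margin:.3f}")
# prints: data_exfiltration vs insider_threat: margin 0.198
```

A small margin suggests the model is torn between two plausible event types, even when the top label on its own looks reasonable.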
