# Setup & Utils

In [None]:
import requests
import json

In [None]:
import re

def extract_model_response(text: str) -> str:
    """
    Extract the content between <think> and the first <|end_of_text|> marker.

    Parameters:
        text (str): The full response text containing <think> and <|end_of_text|> markers.

    Returns:
        str: The stripped content between the markers, or an empty string if no match is found.
    """
    # Non-greedy match, so extraction stops at the first <|end_of_text|> after <think>.
    match = re.search(r"<think>(.*?)<\|end_of_text\|>", text, re.DOTALL)
    return match.group(1).strip() if match else ""
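
The helper above keys on a single end marker and returns an empty string when that marker is missing. A minimal sketch of a more tolerant variant follows, assuming the model may close its output with either `<|end_of_text|>` or `<|eot_id|>`, and that a response with no end marker should be kept rather than dropped. The name `extract_after_think` and the `END_MARKERS` tuple are hypothetical and not part of the original notebook.

```python
# Assumption: either marker may terminate the model output.
END_MARKERS = ("<|end_of_text|>", "<|eot_id|>")


def extract_after_think(text: str, end_markers=END_MARKERS) -> str:
    """Return the text after <think>, cut at the earliest end marker if one exists."""
    start = text.find("<think>")
    if start == -1:
        return ""
    body = text[start + len("<think>"):]
    # Positions of every marker that actually occurs in the body.
    positions = [pos for pos in (body.find(marker) for marker in end_markers) if pos != -1]
    # With no marker present, keep the whole remainder (e.g. a truncated generation).
    cut = min(positions, default=len(body))
    return body[:cut].strip()
```

Unlike the regex version, this sketch returns partial text when the generation was cut off before an end marker. Whether that is desirable depends on how downstream cells treat incomplete answers.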


**Please enter the API key that was provided to you**

In [None]:
API_KEY = ""
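
Pasting the key into the cell works for a quick demo, but it is easy to commit a filled-in notebook by accident. A minimal sketch of an alternative reads the key from an environment variable and falls back to an interactive prompt. The variable name `BASETEN_API_KEY` and the helper `load_api_key` are assumptions made for illustration.

```python
import os
from getpass import getpass


def load_api_key(env_var: str = "BASETEN_API_KEY", prompt_if_missing: bool = True) -> str:
    """Read the API key from the environment, optionally prompting when it is unset."""
    key = os.environ.get(env_var, "").strip()
    if not key and prompt_if_missing:
        # getpass hides the typed value so it does not end up in the cell output.
        key = getpass(f"Enter the API key ({env_var} is not set): ").strip()
    if not key:
        raise ValueError(f"No API key found; set {env_var} or enter one when prompted.")
    return key


# Usage in the notebook (commented out so the sketch does not block on input):
# API_KEY = load_api_key()
```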

In [None]:
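def call_llm(prompt):
    """Send the prompt to the hosted model and return the extracted response text."""
    resp = requests.post(
        "https://model-dq4r501w.api.baseten.co/development/predict",
        headers={"Authorization": f"Api-Key {API_KEY}"},
        json={"prompt": prompt},
        timeout=120,  # arbitrary limit so the cell does not hang if the endpoint is unresponsive
    )
    # Fail loudly on HTTP errors (e.g. a missing or invalid API key) instead of returning "".
    resp.raise_for_status()
    return extract_model_response(resp.text)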


# Chat/Q&A with Cybersecurity Reasoning Model

In [None]:
user_prompt = """
What are best practices for security when deploying a multi-container Docker architecture?
"""

In [None]:
model_response = call_llm(user_prompt)


In [None]:
print(model_response)

When it comes to securing a multi-container Docker architecture, there are several key considerations and best practices that need to be followed. One of the primary concerns is ensuring that all containers are configured with the principle of least privilege in mind, which means running each container as a non-root user whenever possible. This can be achieved by using the USER directive in the Dockerfile to set the default user, or by using the --user flag during runtime.

Another important aspect is managing secrets securely. While some may think that storing secrets directly in environment variables is acceptable, this approach is actually not recommended due to the risk of them being exposed via /proc. Instead, it's better to use Docker Secrets or something like HashiCorp's Vault to manage secrets.

In terms of networking, it's crucial to understand how containers communicate with each other. By default, containers on the same host do not have an IP route to each other, so they can

# Use Case - Incident Investigation

**Use Case Description:** Automate key phases of incident investigation using a coordinated set of LLM-driven agents. These agents handle triage, summarization, next-step planning, and recommendation generation, and together produce detailed investigation reports for SOC teams with minimal manual effort.

## Summarize Incident

In [None]:
def summarize_incident(metadata: dict, alert_logs: str) -> str:
    """Generate a structured, SOC-ready summary of the incident."""
    prompt = (
        "You are a senior SOC analyst assisting with incident triage. "
        "Your task is to read the incident metadata and alert logs, and provide a clear summary of what occurred.\n\n"
        "Instructions:\n"
        "- Highlight the sequence of events (inferred from timestamps).\n"
        "- Think deeply about cause and effect and how artifacts relate to one another.\n"
        "- Mention key attack techniques used (if inferable from logs).\n"
        "- Describe how the attack began and progressed.\n"
        "- Use clear and concise language appropriate for L1/L2 analysts.\n\n"
        f"Incident Metadata:\n{metadata}\n\n"
        f"Alert Logs:\n{alert_logs}\n\n"
        "Summarize what happened in this incident in 2-3 sentences"
    )

    summary_text = call_llm(prompt)
    return summary_text


In [None]:
incident_meta = {
    "incident_id": "INC-1024",
    "type": "Unauthorized Access",
    "severity": "High",
    "timestamp": "2025-04-09T10:30:00Z"
}
raw_logs = """2025-04-09 10:00:23 - Alert: 5 failed login attempts for user 'alice' on host 'WS123'
2025-04-09 10:05:10 - Alert: Suspicious PowerShell execution on 'WS123' by 'alice' (malicious script blocked)
2025-04-09 10:10:45 - Alert: Process dumping LSASS memory on 'WS123' (possible credential theft)
2025-04-09 10:15:00 - Alert: Successful login of 'alice' to server 'DC1' from host 'WS123'"""
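
The summarization prompt asks the model to infer the sequence of events from timestamps. A minimal sketch of parsing the same `<timestamp> - <message>` lines into sorted events follows, which can help when checking the model's timeline by hand. The `AlertEvent` type and `parse_alert_logs` helper are hypothetical, and the sketch assumes every log line uses the format shown in `raw_logs`.

```python
from datetime import datetime
from typing import List, NamedTuple


class AlertEvent(NamedTuple):
    timestamp: datetime
    message: str


def parse_alert_logs(alert_logs: str) -> List[AlertEvent]:
    """Parse '<YYYY-MM-DD HH:MM:SS> - <message>' lines into events sorted by time."""
    events = []
    for line in alert_logs.splitlines():
        stamp, sep, message = line.strip().partition(" - ")
        if not sep:
            continue  # skip blank lines and lines without the separator
        try:
            ts = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # skip lines whose prefix is not a timestamp
        events.append(AlertEvent(ts, message.strip()))
    return sorted(events, key=lambda event: event.timestamp)


# Two lines taken from the notebook's sample logs, deliberately out of order,
# plus one malformed line that the parser should ignore.
sample_logs = """2025-04-09 10:05:10 - Alert: Suspicious PowerShell execution on 'WS123' by 'alice' (malicious script blocked)
not a log line
2025-04-09 10:00:23 - Alert: 5 failed login attempts for user 'alice' on host 'WS123'"""

for event in parse_alert_logs(sample_logs):
    print(event.timestamp.isoformat(), "|", event.message)
```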

In [None]:
summary = summarize_incident(incident_meta, raw_logs)

In [None]:
print(summary)

The incident started at 10:00 AM when user 'alice' had five failed login attempts on workstation 'WS123'. Following that, suspicious PowerShell activity was detected and a malicious script was blocked at 10:05 AM. At 10:10 AM, process dumping of LSASS memory was attempted, indicating potential credential theft. The attack culminated at 10:15 AM with a successful login of 'alice' onto server 'DC1' from 'WS123'.

Attack Summary: This unauthorized access incident involved a brute-force attempt followed by an attempt to extract credentials via LSASS process dumping, leading to a successful lateral movement to a domain controller.


## Identify Impacted Assets, Users, and MITRE Tactics


In [None]:
def extract_entities_and_tactics(metadata: dict, alert_logs: str) -> dict:
    """
    Identify impacted assets, users, and MITRE ATT&CK tactics/techniques from incident data.
    Returns a dictionary with keys: 'impacted_hosts', 'impacted_users', 'tactics'.
    """
    prompt = (
        "You are a cybersecurity expert specializing in threat analysis and incident response. Analyze the following incident logs and metadata, then:\n"
        "- List impacted host systems and IPs\n"
        "- List impacted user accounts\n"
        "- Identify MITRE ATT&CK tactics and techniques observed (with names or IDs)\n\n"
        f"Metadata: {metadata}\n"
        f"Logs:\n{alert_logs}\n\n"
        "Provide the result as a JSON object with keys 'impacted_hosts', 'impacted_users', 'tactics'."
    )
    response = call_llm(prompt)

    if not response:
        print("Empty response from LLM.")
        return {}

    # Try to extract JSON using regex in case there is surrounding text
    match = re.search(r'\{[\s\S]*\}', response)
    if match:
        try:
            return json.loads(match.group(0))
        except json.JSONDecodeError as e:
            print("Error decoding JSON:", e)
    else:
        print("No JSON object found in response.")

    return {}
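
The greedy pattern `\{[\s\S]*\}` spans from the first `{` to the last `}` in the response, so it can fail when the model adds text containing braces after the JSON. A minimal sketch of an alternative uses `json.JSONDecoder.raw_decode`, which parses one JSON value starting at a given index and ignores whatever follows. The helper name `extract_first_json_object` is hypothetical.

```python
import json
from typing import Optional


def extract_first_json_object(text: str) -> Optional[dict]:
    """Return the first parseable JSON object embedded in text, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses a single value at `start` and tolerates trailing text.
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        # Not valid JSON at this brace; try the next one.
        start = text.find("{", start + 1)
    return None


reply = 'Here is the result: {"impacted_hosts": ["WS123", "DC1"]} (format: {key: value})'
print(extract_first_json_object(reply))
```

In this example the trailing `{key: value}` note would break the greedy regex, while the decoder stops at the end of the first complete object.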


In [None]:
entities = extract_entities_and_tactics(incident_meta, raw_logs)

In [None]:
print(json.dumps(entities, indent=3))

{
   "impacted_hosts": [
      "WS123",
      "DC1"
   ],
   "impacted_users": [
      "alice"
   ],
   "tactics": [
      {
         "name": "Credential Access",
         "techniques": [
            {
               "ID": "T1003",
               "name": "OS Credential Dumping"
            },
            {
               "ID": "T1550.002",
               "name": "Credentials from Password Stores: LSASS Memory"
            }
         ]
      },
      {
         "name": "Execution",
         "techniques": [
            {
               "ID": "T1059.003",
               "name": "Command and Scripting Interpreter: PowerShell"
            }
         ]
      },
      {
         "name": "Lateral Movement",
         "techniques": [
            {
               "ID": "T1021.001",
               "name": "Remote Services: Remote Desktop Protocol"
            }
         ]
      }
   ]
}
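
Model-generated ATT&CK identifiers are worth a sanity check before they reach a report. A minimal sketch of a format check follows, assuming the `tactics` structure shown in the output above. It only verifies that each ID looks like `T####` or `T####.###`. It does not confirm that the ID exists or that it matches the technique name, which still requires a lookup against the ATT&CK catalog. The helper name `find_malformed_technique_ids` is hypothetical.

```python
import re

# Technique IDs are "T" plus four digits; sub-techniques add "." plus three digits.
TECHNIQUE_ID_PATTERN = re.compile(r"^T\d{4}(?:\.\d{3})?$")


def find_malformed_technique_ids(entities: dict) -> list:
    """Return technique IDs that do not match the ATT&CK ID format."""
    malformed = []
    for tactic in entities.get("tactics", []):
        for technique in tactic.get("techniques", []):
            technique_id = str(technique.get("ID", ""))
            if not TECHNIQUE_ID_PATTERN.match(technique_id):
                malformed.append(technique_id)
    return malformed
```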


## Recommend Remediation Steps and Next Investigative Actions

In [None]:
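def clean_llm_response(response: str) -> str:
    """
    Extract the JSON object from a markdown code block if present;
    otherwise fall back to the outermost {...} span, then to the stripped response.
    """
    match = re.search(r"```(?:json)?\s*(\{.*?\})\s*```", response, re.DOTALL)
    if match:
        return match.group(1)
    # Same fallback as extract_entities_and_tactics: tolerate prose around the JSON.
    match = re.search(r"\{[\s\S]*\}", response)
    return match.group(0) if match else response.strip()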

def recommend_actions(summary: str, entities: dict) -> dict:
    """
    Recommend remediation steps and next investigative actions based on the incident summary and entities.
    Returns a dict with 'remediation_steps' and 'next_steps'.
    """
    impacted_hosts = ", ".join(entities.get("impacted_hosts", []))
    impacted_users = ", ".join(entities.get("impacted_users", []))
    tactics_list = [
        f"{tactic['name']}: {technique['name']} ({technique['ID']})"
        for tactic in entities.get("tactics", [])
        for technique in tactic.get("techniques", [])
    ]

    context = (
        f"Incident Summary: {summary}\n"
        f"Impacted Hosts: {impacted_hosts}\n"
        f"Impacted Users: {impacted_users}\n"
        f"Observed Tactics: {tactics_list}\n\n"
    )

    prompt = (
        "You are a SOC incident response assistant. Based on the incident details, respond strictly in the following JSON format:\n"
        '{\n  "remediation_steps": ["..."],\n  "next_steps": ["..."]\n}\n\n'
        + context
    )

    response = call_llm(prompt)

    if not response:
        print("LLM returned no response.")
        return {}

    try:
        cleaned = clean_llm_response(response)
        return json.loads(cleaned)
    except json.JSONDecodeError:
        print("LLM returned invalid JSON:")
        print(response)
        return {}
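
Even when the response parses as JSON, the model may return a single string instead of a list, or objects instead of plain strings. A minimal sketch of normalizing the parsed result into the two expected lists follows. The helper name `normalize_actions` is hypothetical, and the handling of `{"text": ...}` items mirrors the shape that the report-compilation step in this notebook already tolerates.

```python
def normalize_actions(actions) -> dict:
    """Coerce parsed model output into {'remediation_steps': [...], 'next_steps': [...]}."""
    result = {"remediation_steps": [], "next_steps": []}
    if not isinstance(actions, dict):
        return result
    for key in result:
        value = actions.get(key, [])
        if isinstance(value, str):
            value = [value]  # a lone string becomes a one-item list
        if not isinstance(value, list):
            continue  # ignore unexpected types such as numbers
        for step in value:
            if isinstance(step, dict) and "text" in step:
                text = str(step["text"])
            else:
                text = str(step)
            text = text.strip()
            if text:
                result[key].append(text)
    return result
```

Running the parsed output through a step like this keeps later cells from failing on a missing key or an unexpected type.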

In [None]:
actions = recommend_actions(summary, entities)

In [None]:
print(json.dumps(actions, indent=3))

{
   "remediation_steps": [
      "Isolate 'WS123' from the network to prevent further lateral movement.",
      "Conduct a thorough analysis of the PowerShell logs to identify the malicious script and its origin.",
      "Perform memory dumps of 'LSASS.exe' on 'WS123' for forensic analysis to determine if credentials were extracted.",
      "Reset 'alice''s password and enable multi-factor authentication to prevent future brute-force attacks.",
      "Review and update firewall rules to block unnecessary ports and services, especially RDP.",
      "Scan 'WS123' and 'DC1' for malware using updated antivirus software.",
      "Monitor 'alice''s account activity for any unusual behavior post-incident.",
      "Implement additional monitoring on 'WS123' and 'DC1' for similar activities using EDR tools.",
      "Update the incident response plan to include immediate isolation procedures for compromised hosts."
   ],
   "next_steps": [
      "Collect and analyze all relevant logs from 'WS12

## Produce a Structured Incident Investigation Report

In [None]:
def compile_incident_report(metadata: dict, summary: str, entities: dict, actions: dict) -> str:
    """
    Compile a structured incident report using the summary, entities, and recommended actions.
    Returns the formatted report as text (Markdown or plain text).
    """
    hosts = ", ".join(entities.get("impacted_hosts", []))
    users = ", ".join(entities.get("impacted_users", []))
    tactics = [
        f"{tactic['name']}: {technique['name']} ({technique['ID']})"
        for tactic in entities.get("tactics", [])
        for technique in tactic.get("techniques", [])
    ]
    remediation_list = "- " + "\n- ".join(
        step["text"] if isinstance(step, dict) and "text" in step else str(step)
        for step in actions.get("remediation_steps", [])
    )
    next_steps_list = "- " + "\n- ".join(
        step["text"] if isinstance(step, dict) and "text" in step else str(step)
        for step in actions.get("next_steps", [])
    )

    prompt = (
        "You are a SOC analyst assistant tasked with writing the final incident report. "
        "Use the information below to fill out each section of the report.\n\n"
        f"Incident Metadata: {metadata}\n"
        f"Incident Summary: {summary}\n"
        f"Impacted Hosts: {hosts}\n"
        f"Impacted Users: {users}\n"
        f"Observed Tactics/Techniques: {tactics}\n"
        f"Recommended Remediation Steps:\n{remediation_list}\n"
        f"Next Investigation Steps:\n{next_steps_list}\n\n"
        "Now format the incident report with the following sections:\n"
        "## Incident Summary\n"
        "*(A brief overview of the incident.)*\n"
        "## Impacted Assets and Users\n"
        "*(Which systems and accounts were affected.)*\n"
        "## Adversary Tactics and Techniques\n"
        "*(ATT&CK tactics/techniques observed in this incident.)*\n"
        "## Incident Details (Timeline)\n"
        "*(Chronological log of key events.)*\n"
        "## Recommended Remediation Actions\n"
        "*(How to contain and mitigate the incident.)*\n"
        "## Next Steps and Recommendations\n"
        "*(Further investigation steps or preventive measures.)*\n"
        "Draft the report now, incorporating all provided details in the appropriate sections."
    )

    response = call_llm(prompt)
    return response
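
Because the final report is drafted by the model, it can be useful to have a deterministic rendering of the same inputs to compare against. A minimal sketch follows that builds a plain Markdown skeleton without calling the model. The function name `render_report_skeleton` and the section titles are illustrative choices, not part of the original notebook.

```python
def render_report_skeleton(metadata: dict, summary: str, entities: dict, actions: dict) -> str:
    """Build a plain Markdown report from the structured inputs, without calling the model."""

    def bullets(items) -> str:
        lines = [f"- {item}" for item in items]
        return "\n".join(lines) if lines else "- (none)"

    # .get() with defaults avoids a KeyError if the model omitted a field.
    techniques = [
        f"{tactic.get('name', 'Unknown')}: {tech.get('name', 'Unknown')} ({tech.get('ID', 'n/a')})"
        for tactic in entities.get("tactics", [])
        for tech in tactic.get("techniques", [])
    ]
    sections = [
        ("Incident Summary", summary.strip() or "(no summary)"),
        ("Impacted Hosts", bullets(entities.get("impacted_hosts", []))),
        ("Impacted Users", bullets(entities.get("impacted_users", []))),
        ("Adversary Tactics and Techniques", bullets(techniques)),
        ("Remediation Steps", bullets(actions.get("remediation_steps", []))),
        ("Next Steps", bullets(actions.get("next_steps", []))),
    ]
    header = "\n".join(f"**{key}:** {value}" for key, value in metadata.items())
    body = "\n\n".join(f"## {title}\n{content}" for title, content in sections)
    return f"{header}\n\n{body}"
```

Comparing the skeleton with the model's draft makes it easier to spot hosts, users, or techniques that the draft dropped or introduced.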


In [None]:
report = compile_incident_report(incident_meta, summary, entities, actions)

In [None]:
print(report)

**Incident Report**

---

**Incident ID:** INC-1024  
**Type:** Unauthorized Access  
**Severity:** High  
**Timestamp:** 2025-04-09T10:30:00Z  

---

## Incident Summary
On April 9th, 2025, at approximately 10:00 AM, an unauthorized access incident was detected involving user 'alice' attempting multiple failed logins on workstation 'WS123'. Subsequent detection of suspicious PowerShell activity at 10:05 AM indicated a malicious script was blocked. At 10:10 AM, there was an attempt to dump LSASS memory, suggesting credential theft. The attack peaked at 10:15 AM with a successful login of 'alice' onto the domain controller 'DC1' from 'WS123'. This incident combined brute-force techniques with credential extraction via LSASS process dumping, resulting in lateral movement to a critical asset.

## Impacted Assets and Users
* **Hosts:** WS123, DC1
* **Users:** alice

## Adversary Tactics and Techniques
The observed tactics and techniques align with the MITRE ATT&CK framework as follows:
- *