# Lab 00c: Prompt Engineering Mastery

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/depalmar/ai_for_the_win/blob/main/notebooks/lab00c_prompt_engineering.ipynb)

Master the art of writing effective prompts for security applications.

## Learning Objectives
- Write clear, structured prompts
- Use system prompts effectively
- Handle structured output
- Detect and prevent hallucinations

In [None]:
# Colab: Install dependencies (skip this cell locally - packages already in venv)
# %pip install -q anthropic python-dotenv

In [None]:
import os
from dotenv import load_dotenv

load_dotenv()

# For Colab, set your API key
# os.environ['ANTHROPIC_API_KEY'] = 'your-key-here'

## 1. Basic Prompt Structure

A good prompt has:
1. **Context**: What role should the AI take?
2. **Task**: What do you want it to do?
3. **Format**: How should the output look?
4. **Constraints**: Any limitations?
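
As a rough sketch, the four components can be assembled by a small helper. `build_prompt` and the sample log line are hypothetical and exist only for illustration; they are not part of any library.

```python
def build_prompt(context: str, task: str, output_format: str, constraints: str) -> str:
    """Join the four prompt components into one prompt string."""
    sections = [
        context.strip(),
        task.strip(),
        f"Format: {output_format.strip()}",
        f"Constraints: {constraints.strip()}",
    ]
    # Blank lines between sections keep each component visually distinct.
    return "\n\n".join(sections)


prompt = build_prompt(
    context="You are a security analyst specializing in log analysis.",
    # The log line below is a made-up sample.
    task="Classify the severity of this log entry: Failed password for root from 10.0.0.5",
    output_format="One word: LOW, MEDIUM, HIGH, or CRITICAL.",
    constraints="Use only information present in the log entry.",
)
print(prompt)
```

Keeping the components as separate arguments makes it easy to vary one (for example, the format) while holding the others fixed.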

In [None]:
# Example: Basic vs. Improved prompt

basic_prompt = "Analyze this log."

improved_prompt = """
You are a security analyst specializing in log analysis.

Analyze the following log entry and provide:
1. Event type
2. Severity (LOW/MEDIUM/HIGH/CRITICAL)
3. Recommended action

Log entry:
{log_entry}

Respond in JSON format with keys: event_type, severity, action
"""

print("Basic prompt lacks context and structure.")
print("Improved prompt provides role, task, and format.")

## 2. System Prompts

A system prompt sets the model's role and ground rules for the entire conversation, separately from the individual user messages.
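
In the Anthropic Python SDK, the system prompt is passed through the top-level `system` parameter of `client.messages.create`, not as a message with a `system` role. The sketch below separates building the request from sending it, so the request shape can be inspected without an API key. `build_request` and `ask` are hypothetical helper names, and the `max_tokens` value is an arbitrary choice.

```python
def build_request(system_prompt: str, user_message: str) -> dict:
    """Build keyword arguments for client.messages.create()."""
    return {
        "model": "claude-sonnet-4-20250514",  # same model ID used later in this lab
        "max_tokens": 500,  # arbitrary limit for this sketch
        "system": system_prompt,  # the system prompt is a top-level parameter
        "messages": [{"role": "user", "content": user_message}],
    }


def ask(system_prompt: str, user_message: str) -> str:
    """Send the request. Requires the anthropic package and ANTHROPIC_API_KEY."""
    from anthropic import Anthropic  # imported here so build_request works offline

    client = Anthropic()
    response = client.messages.create(**build_request(system_prompt, user_message))
    return response.content[0].text


request = build_request(
    system_prompt="You are a security analyst. When uncertain, say so.",
    user_message="Is a single failed SSH login worth escalating?",
)
print(request["system"])
```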

In [None]:
SECURITY_ANALYST_SYSTEM = """
You are an expert security analyst with deep knowledge of:
- MITRE ATT&CK framework
- Log analysis and SIEM/SOAR operations (Splunk, Elastic, Sentinel, etc.)
- Incident response procedures
- Threat intelligence

Guidelines:
- Always cite MITRE ATT&CK techniques when applicable (e.g., T1059)
- Provide actionable recommendations
- Flag any IOCs (IPs, domains, hashes) you identify
- Rate severity using: LOW, MEDIUM, HIGH, CRITICAL
- When uncertain, say so rather than guessing
"""

print(SECURITY_ANALYST_SYSTEM)

## 3. Structured Output

In [None]:
import json

# Prompt that requests JSON-only output (the model can still deviate, so parse defensively)
structured_prompt = """
Analyze this security event and respond ONLY with valid JSON.

Event: Failed SSH login from 45.33.32.156 to admin@server01

Required JSON schema:
{
    "event_type": "string",
    "severity": "LOW|MEDIUM|HIGH|CRITICAL",
    "source_ip": "string",
    "target": "string",
    "mitre_technique": "string or null",
    "recommended_actions": ["list of strings"]
}

Respond with ONLY the JSON, no other text.
"""

# Example response parsing
example_response = """{
    "event_type": "authentication_failure",
    "severity": "MEDIUM",
    "source_ip": "45.33.32.156",
    "target": "admin@server01",
    "mitre_technique": "T1110.001",
    "recommended_actions": [
        "Check IP reputation",
        "Review auth logs for patterns",
        "Consider IP blocking if repeated"
    ]
}"""

parsed = json.loads(example_response)
print(f"Severity: {parsed['severity']}")
print(f"MITRE: {parsed['mitre_technique']}")
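
A prompt can only request a schema, so it helps to check the response before using it. The following is a minimal validation sketch against the schema in the prompt above. `validate_event` is a hypothetical helper, and the checks are deliberately shallow (key presence, severity value, list type).

```python
import json

REQUIRED_KEYS = {
    "event_type", "severity", "source_ip",
    "target", "mitre_technique", "recommended_actions",
}
VALID_SEVERITIES = {"LOW", "MEDIUM", "HIGH", "CRITICAL"}


def validate_event(raw: str) -> list[str]:
    """Return a list of problems found in a JSON response; empty means valid."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]

    problems = []
    missing = REQUIRED_KEYS - set(data)
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if data.get("severity") not in VALID_SEVERITIES:
        problems.append(f"invalid severity: {data.get('severity')!r}")
    if not isinstance(data.get("recommended_actions"), list):
        problems.append("recommended_actions is not a list")
    return problems


good = json.dumps({
    "event_type": "authentication_failure",
    "severity": "MEDIUM",
    "source_ip": "45.33.32.156",
    "target": "admin@server01",
    "mitre_technique": "T1110.001",
    "recommended_actions": ["Check IP reputation"],
})
bad = '{"event_type": "authentication_failure", "severity": "urgent"}'

print(validate_event(good))
print(validate_event(bad))
```

Returning a list of problems, instead of raising on the first one, makes it possible to feed all of them back to the model in a retry prompt.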

## 4. Few-Shot Prompting

Provide examples to guide the model's responses.
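
One way to keep few-shot examples maintainable is to store them as data and render the prompt from them. This is a sketch, not a required pattern: `build_few_shot_prompt` is a hypothetical helper, the two examples are taken from this lab's few-shot prompt, and the new event is a made-up sample.

```python
def build_few_shot_prompt(examples: list[dict], new_event: str) -> str:
    """Render labeled examples followed by the unlabeled event to classify."""
    lines = ["Classify the following security events.", ""]
    for i, ex in enumerate(examples, start=1):
        lines += [
            f"Example {i}:",
            f"Event: {ex['event']}",
            f"Classification: {ex['classification']}",
            f"Severity: {ex['severity']}",
            "",
        ]
    # End with the same field label so the model continues the pattern.
    lines += ["Now classify:", f"Event: {new_event}", "Classification:"]
    return "\n".join(lines)


examples = [
    {
        "event": "Multiple failed logins from same IP",
        "classification": "Brute Force Attack (T1110)",
        "severity": "HIGH",
    },
    {
        "event": "User logged in from new device",
        "classification": "New Device Login",
        "severity": "LOW",
    },
]
prompt = build_few_shot_prompt(examples, "Outbound connection to a rare external domain")
print(prompt)
```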

In [None]:
few_shot_prompt = """
Classify the following security events.

Example 1:
Event: Multiple failed logins from same IP
Classification: Brute Force Attack (T1110)
Severity: HIGH

Example 2:
Event: User logged in from new device
Classification: New Device Login
Severity: LOW

Example 3:
Event: PowerShell encoded command execution
Classification: Command Execution (T1059.001)
Severity: CRITICAL

Now classify:
Event: {new_event}
"""

print("Few-shot prompting shows the model the expected format.")

## 5. Preventing Hallucinations

In [None]:
# Techniques to reduce hallucinations

anti_hallucination_prompt = """
Analyze the following log entry.

IMPORTANT INSTRUCTIONS:
1. Only report IOCs that are EXPLICITLY present in the log
2. If you cannot determine something, say "Unknown" or "Not enough information"
3. Do NOT invent or assume information not in the log
4. If asked about something not in the data, respond "Not present in log"

Log: {log_entry}

Extract:
- Source IP (or "Not present")
- Destination IP (or "Not present")
- User (or "Not present")
- Action performed
"""

print("Key anti-hallucination techniques:")
print("1. Explicit instructions to not invent data")
print("2. Provide 'unknown' as a valid option")
print("3. Ask for citations/evidence")
print("4. Verify outputs programmatically")
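
The fourth technique, programmatic verification, can be sketched as a check that every extracted value appears verbatim in the source log. `find_unsupported_values`, the log line, and the model output below are all hypothetical. The substring test is intentionally crude: it can be fooled by partial matches, so treat it as a first filter only.

```python
def find_unsupported_values(extracted: dict, source_text: str) -> dict:
    """Return extracted fields whose values do not appear verbatim in the source."""
    placeholders = {"Not present", "Unknown", "Not enough information"}
    unsupported = {}
    for field, value in extracted.items():
        if value in placeholders:
            continue  # the model declined to answer, which is acceptable
        if value not in source_text:
            unsupported[field] = value
    return unsupported


# Made-up log line and hand-written "model output" for illustration.
log = "Jan 10 03:14:07 server01 sshd[2211]: Failed password for admin from 45.33.32.156 port 52144 ssh2"
model_output = {
    "source_ip": "45.33.32.156",
    "destination_ip": "10.0.0.5",  # not in the log, so it should be flagged
    "user": "admin",
}
print(find_unsupported_values(model_output, log))
```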

## 6. Chain of Thought Prompting

Ask the model to show its reasoning.

In [None]:
cot_prompt = """
Analyze this security alert and determine if it's a true positive.

Alert: Suspicious PowerShell activity detected
Command: powershell -enc JABjAGwAaQBlAG4AdAAgAD0AIABOAGUAdwAtAE8AYgBqAGU
User: SYSTEM
Host: DC01

Think through this step by step:
1. What is the command doing? (decode if possible)
2. Is SYSTEM running PowerShell normal for this host type?
3. What MITRE techniques might this match?
4. What additional context would help?
5. Final verdict: True Positive, False Positive, or Needs Investigation?

Show your reasoning for each step.
"""

print("Chain of thought helps with:")
print("- Complex analysis tasks")
print("- Explainable decisions")
print("- Catching logical errors")
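
Chain-of-thought responses are free text, so downstream code still needs a way to pull out the conclusion. A minimal sketch, assuming the model uses one of the three verdict labels from the prompt: `extract_verdict` is a hypothetical helper, and `sample_response` is hand-written, not real model output.

```python
import re

VERDICTS = ("True Positive", "False Positive", "Needs Investigation")


def extract_verdict(response_text: str):
    """Return the last verdict label mentioned in the response, or None."""
    pattern = "|".join(re.escape(v) for v in VERDICTS)
    matches = re.findall(pattern, response_text, flags=re.IGNORECASE)
    if not matches:
        return None
    # The final mention is taken as the conclusion, since earlier steps
    # may discuss other verdicts before settling on one.
    last = matches[-1].lower()
    return next(v for v in VERDICTS if v.lower() == last)


sample_response = """
1. The command is base64-encoded, so its intent is hidden.
2. SYSTEM running encoded PowerShell on a domain controller is unusual.
5. Final verdict: Needs Investigation
"""
print(extract_verdict(sample_response))
```

A response with no recognizable verdict returns `None`, which is a reasonable signal to route the alert to a human.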

## 7. Real Example: IOC Extraction

In [None]:
import json

from anthropic import Anthropic


def extract_iocs_with_llm(text: str) -> dict:
    """Extract IOCs from text using Claude."""

    client = Anthropic()

    prompt = f"""
    Extract all Indicators of Compromise (IOCs) from the following text.

    Return ONLY a JSON object with these keys:
    - ips: list of IP addresses
    - domains: list of domain names
    - hashes: list of file hashes (MD5, SHA1, SHA256)
    - urls: list of URLs
    - emails: list of email addresses

    If a category has no IOCs, use an empty list.
    Only include IOCs explicitly present in the text.

    Text:
    {text}
    """

    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1000,
        messages=[{"role": "user", "content": prompt}],
    )

    try:
        return json.loads(response.content[0].text)
    except json.JSONDecodeError:
        return {"error": "Failed to parse response", "raw": response.content[0].text}


# Test (uncomment if API key is set)
# sample_text = """
# The malware connects to 185.220.101.1 and downloads payload from
# https://evil-domain.com/malware.exe. The file hash is
# d41d8cd98f00b204e9800998ecf8427e.
# """
# result = extract_iocs_with_llm(sample_text)
# print(json.dumps(result, indent=2))
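
Models sometimes wrap JSON in a Markdown code fence or add a sentence around it, which makes a plain `json.loads` fail. The sketch below shows one tolerant parsing approach. `parse_json_response` is a hypothetical helper, and the fallback (taking the outermost `{...}` span) is a heuristic that can still fail on unusual responses.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to avoid a literal fence in this example


def parse_json_response(text: str) -> dict:
    """Parse JSON from a model response, tolerating code fences and extra prose."""
    cleaned = text.strip()
    # Remove a leading fence (with optional language tag) and a trailing fence.
    cleaned = re.sub(rf"^{FENCE}[a-zA-Z]*\s*", "", cleaned)
    cleaned = re.sub(rf"\s*{FENCE}$", "", cleaned)
    try:
        return json.loads(cleaned)
    except json.JSONDecodeError:
        pass
    # Fall back to the outermost {...} span if prose surrounds the object.
    start, end = cleaned.find("{"), cleaned.rfind("}")
    if start != -1 and end > start:
        try:
            return json.loads(cleaned[start : end + 1])
        except json.JSONDecodeError:
            pass
    return {"error": "Failed to parse response", "raw": text}


fenced = FENCE + 'json\n{"ips": ["185.220.101.1"]}\n' + FENCE
chatty = 'Here are the IOCs: {"ips": []} Let me know if you need more.'

print(parse_json_response(fenced))
print(parse_json_response(chatty))
```

The error dictionary uses the same shape as `extract_iocs_with_llm`, so this helper could stand in for its `try`/`except` block.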

## Best Practices Summary

| Technique | When to Use |
|-----------|-------------|
| System Prompt | Set consistent behavior |
| Structured Output | When parsing is needed |
| Few-Shot | Complex format requirements |
| Chain of Thought | Complex reasoning tasks |
| Anti-Hallucination | Fact-critical applications |

## Next Steps
- **Lab 01**: Apply prompting to phishing classification
- **Lab 04**: Full LLM log analysis