# THE BASELINE: Where Most Systems Start

## The Bare Minimum

Let's establish a baseline, **simple pattern matching**, to understand the gap we need to close:
- Rule-based detection looking for suspicious phrases
- No understanding of context or intent
- High false positive rates
- Misses sophisticated violations
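
To make these limitations concrete, here is a small local sketch in plain Python, independent of the Snowflake tables used later. The phrase list is a shortened stand-in and the three messages are invented for illustration; none of them comes from the dataset.

```python
import re

# Hypothetical phrase list and messages, invented for illustration only.
phrases = ["insider", "confidential", "tip", "delete this"]
pattern = re.compile("|".join(re.escape(p) for p in phrases))

messages = {
    # Benign: 'tip' appears only as a substring of 'multiple'.
    "benign_substring": "We compared multiple vendors for the new laptops.",
    # Benign: 'confidential' is used in a routine, legitimate context.
    "benign_context": "Reminder: the confidential-data training is due Friday.",
    # Suspicious in intent, but contains none of the listed phrases.
    "evasive_violation": "Call my cell tonight. You will want to act before Thursday's announcement.",
}

def is_flagged(text: str) -> bool:
    """Return True if any phrase occurs anywhere in the lowercased text."""
    return pattern.search(text.lower()) is not None

results = {name: is_flagged(text) for name, text in messages.items()}
for name, hit in results.items():
    print(f"{name}: {'FLAGGED' if hit else 'CLEAN'}")
```

Both benign messages are flagged and the evasive one passes as clean, which is the false-positive and false-negative behavior the list above describes.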

---

## Let's See What Simple Detection Can (and Can't) Do

In [None]:
from snowflake.snowpark import Session

session = Session.builder.getOrCreate()
session.use_warehouse('COMPLIANCE_DEMO_WH')
session.use_database('COMPLIANCE_DEMO')
session.use_schema('EMAIL_SURVEILLANCE')

print("Connected. Let's examine 10,000 hedge fund emails.")

## Simple Pattern Matching (The Baseline)

In [None]:
SUSPICIOUS_PATTERNS = [
    'insider', 'confidential', 'secret', 'buy now', 'sell before',
    'off the record', 'between us', 'delete this', 'personal trade',
    'between you and me', 'heads up', 'tip', 'private', 'sensitive',
    'keep this quiet', 'early numbers', 'non-public', 'mnpi'
]

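# Join the phrases into a single alternation. None of them contains regex
# metacharacters, so no escaping is needed here. Matching is by substring,
# so 'tip' also matches inside words such as 'multiple'.
pattern_regex = '|'.join(SUSPICIOUS_PATTERNS)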

print("BASELINE: Simple Pattern Matching")
print(f"Scanning 10,000 emails for {len(SUSPICIOUS_PATTERNS)} suspicious patterns...")
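
As a side note, part of the baseline's noise comes from substring matching rather than from the phrases themselves. The sketch below, run locally with Python's `re` module, compares the plain alternation with a word-boundary variant. The phrase subset and sample sentences are hypothetical, and the `\b` behavior shown is Python's; it is not a claim about how the Snowflake query in this notebook behaves.

```python
import re

# Hypothetical subset of the phrase list, for illustration only.
phrases = ["tip", "heads up", "non-public"]
escaped = "|".join(re.escape(p) for p in phrases)

# Plain alternation: matches the phrase anywhere, even inside longer words.
substring_re = re.compile(escaped)
# Word-boundary variant: the phrase must start and end on a word boundary.
bounded_re = re.compile(r"\b(?:" + escaped + r")\b")

benign = "we reviewed multiple stipulations in the contract"
suspicious = "thanks for the tip on earnings"

substring_hit = substring_re.search(benign) is not None
bounded_hit = bounded_re.search(benign) is not None
tip_hit = bounded_re.search(suspicious) is not None

print(f"substring match on benign text: {substring_hit}")
print(f"bounded match on benign text:   {bounded_hit}")
print(f"bounded match on 'the tip':     {tip_hit}")
```

Word boundaries remove collisions like "multiple", but they do nothing about context or intent, so the larger gap remains.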

In [None]:
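# REGEXP_LIKE must match the entire string, hence the leading and trailing '.*'.
# The 's' parameter lets '.' match newlines so multi-line bodies are covered.
baseline_results = session.sql(f"""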
SELECT 
    EMAIL_ID,
    COMPLIANCE_LABEL as ACTUAL_LABEL,
    CASE 
        WHEN REGEXP_LIKE(LOWER(BODY), '.*({pattern_regex}).*', 's') 
        THEN 'FLAGGED' 
        ELSE 'CLEAN' 
    END as BASELINE_PREDICTION,
    SUBJECT,
    BODY
FROM COMPLIANCE_DEMO.EMAIL_SURVEILLANCE.EMAILS
""").to_pandas()

print(f"Total emails scanned: {len(baseline_results):,}")
print(f"Flagged for review: {(baseline_results['BASELINE_PREDICTION'] == 'FLAGGED').sum():,}")
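
The query wraps the alternation in `.*( ... ).*` and passes the `'s'` parameter because Snowflake's `REGEXP_LIKE` anchors the pattern at both ends of the string. A rough local analogue, using Python's `re.fullmatch` with `re.DOTALL`, shows why both pieces matter. This is an analogy under that assumption, not a reproduction of Snowflake's regex engine; the sample body and the two-phrase alternation are made up.

```python
import re

# Hypothetical multi-line email body and a two-phrase alternation.
body = "quarterly update\nplease delete this after reading"
alternation = "delete this|off the record"

# Full-string match with the bare alternation: fails, because the phrase
# alone does not span the whole body.
bare = re.fullmatch(alternation, body)

# Wrapped in '.*' but without DOTALL: still fails, because '.' does not
# match the newline between the two lines.
wrapped = re.fullmatch(f".*({alternation}).*", body)

# Wrapped and with DOTALL (the analogue of the 's' parameter): succeeds.
wrapped_dotall = re.fullmatch(f".*({alternation}).*", body, flags=re.DOTALL)

print("bare:", bare is not None)
print("wrapped:", wrapped is not None)
print("wrapped + DOTALL:", wrapped_dotall is not None)
```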

In [None]:
actual_violations = baseline_results['ACTUAL_LABEL'] != 'CLEAN'
flagged = baseline_results['BASELINE_PREDICTION'] == 'FLAGGED'

true_positives = (flagged & actual_violations).sum()
false_positives = (flagged & ~actual_violations).sum()
false_negatives = (~flagged & actual_violations).sum()
total_violations = actual_violations.sum()

precision = true_positives / (true_positives + false_positives) if (true_positives + false_positives) > 0 else 0
recall = true_positives / (true_positives + false_negatives) if (true_positives + false_negatives) > 0 else 0
f1 = 2 * (precision * recall) / (precision + recall) if (precision + recall) > 0 else 0

print("="*60)
print("BASELINE PERFORMANCE")
print("="*60)
print(f"\nPRECISION: {precision*100:.1f}%")
print(f"   Of flagged emails, only {precision*100:.0f}% are real violations")
print(f"   {false_positives:,} false alarms for analysts to review")
print(f"\nRECALL: {recall*100:.1f}%")
print(f"   Of {total_violations:,} actual violations, caught {true_positives:,}")
print(f"   MISSED {false_negatives:,} real violations")
print(f"\nF1 SCORE: {f1*100:.1f}%")
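
The metric definitions in the cell above can be checked by hand on a tiny example. The helper below is a minimal sketch with the same zero-division guards; the function name `precision_recall_f1` and the counts passed to it are hypothetical and are not results from the email dataset.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Compute precision, recall, and F1 from confusion-matrix counts.

    Each metric falls back to 0.0 when its denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

# Hypothetical counts: 30 true positives, 10 false positives, 70 false negatives.
p, r, f = precision_recall_f1(tp=30, fp=10, fn=70)
print(f"precision={p:.1%} recall={r:.1%} f1={f:.1%}")
# prints: precision=75.0% recall=30.0% f1=42.9%
```

With these made-up counts, three out of four flags are correct, yet 70 of 100 violations are missed. F1 is the harmonic mean of the two, so it is pulled toward the weaker metric.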

In [None]:
print("\nFALSE ALARMS (Clean emails flagged as suspicious):")
print("="*70)
false_alarm_examples = baseline_results[(flagged) & (~actual_violations)].head(2)
for _, row in false_alarm_examples.iterrows():
    print("\n[CLEAN - FALSE ALARM]")
    print(f"Subject: {row['SUBJECT']}")
    print(f"Body: {row['BODY'][:400]}...")
print("\n** These are LEGITIMATE emails that wasted analyst time **")

In [None]:
missed = baseline_results[(~flagged) & (actual_violations)].head(3)

print("\nMISSED VIOLATIONS (Real threats that slipped through):")
print("="*70)
for _, row in missed.iterrows():
    print(f"\n[{row['ACTUAL_LABEL']} - MISSED!]")
    print(f"Subject: {row['SUBJECT']}")
    print(f"Body: {row['BODY'][:400]}...")
print("\n** These violations had NO suspicious patterns but ARE real threats **")

---

## The Gap

| Issue | Impact |
|-------|--------|
| Low Precision (~71%) | Alert fatigue, wasted investigation time |
| Low Recall (~29%) | **Real violations slip through** |
| No Context | Can't distinguish legitimate use from actual violations |
| Static Rules | Can't adapt to new evasion patterns |

---

## The Question:

**Can we build a system that:**
1. Catches violations that simple patterns miss?
2. Dramatically reduces false positives?
3. Understands context and intent?
4. Learns and adapts to new patterns?

**Let's find out. →**