# TLM Anomaly Radar - Demo

**Energy-Based Anomaly Detection for LLM Outputs**

This notebook demonstrates how TLM (Thermodynamic Language Model) can detect anomalies in restaurant booking dialogues using energy-based scoring.


## Setup


In [1]:
import sys
import os
# Add the parent directory to the import path. The data paths used later
# ("../anomaly_radar/...") imply that the anomaly_radar package lives there.
# os.path.abspath('') resolves to the notebook's working directory.
sys.path.insert(0, os.path.dirname(os.path.abspath('')))

from anomaly_radar import AnomalyRadar, load_trained_model
from anomaly_radar.dialogue_data import load_real_dialogues_simple
import numpy as np
import matplotlib.pyplot as plt

print("Setup complete!")


## STEP 1: Show Real Restaurant Dataset

Here are 5 normal restaurant booking dialogues from our training data:


In [None]:
dialogues = load_real_dialogues_simple("../anomaly_radar/real_dialogues.txt")

print("Normal Restaurant Booking Dialogues:")
print("=" * 70)
for i, dialogue in enumerate(dialogues[:5], 1):
    print(f"\n{i}. {dialogue}")


## STEP 2: Show Anomalies

Here are 5 anomalous dialogues, covering off-topic content, repetition, impossible values, and mixed-language input:


In [None]:
anomalies = [
    "quantum computing research table booking for negative infinity people at yesterday time",
    "the sky is blue and elephants can fly through quantum space while making restaurant reservations",
    "hello hello hello hello hello hello hello hello hello hello",
    "i need a table for negative five people at 25pm tomorrow",
    "bonjour je veux une table pour deux à 7h pm reservation confirmée quantum"
]

print("Anomalous Dialogues:")
print("=" * 70)
for i, anomaly in enumerate(anomalies, 1):
    print(f"\n{i}. {anomaly}")


## STEP 3: Load TLM Model and Compute Energy


In [None]:
model = load_trained_model("../anomaly_radar/dialogue_tlm_weights.npy")
detector = AnomalyRadar(model)

# Compute baseline
baseline_mean, baseline_std = detector.compute_baseline_energy(dialogues[:50])
print(f"Baseline energy: {baseline_mean:.2f} ± {baseline_std:.2f}")
print(f"Normal range: [{baseline_mean - 2*baseline_std:.2f}, {baseline_mean + 2*baseline_std:.2f}]")


## STEP 4: Show TLM Energy Difference

TLM assigns each dialogue an energy score. Lower energy means the text is more stable (normal); higher energy means it is more anomalous.
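
As a rough intuition for what "lower energy = more normal" can mean, here is a toy sketch. It is not TLM's energy function, which this notebook does not show. It assumes energy is the mean negative log-probability of a text's tokens under a unigram model fitted on normal dialogues. The names `fit_unigram` and `toy_energy` and the training sentences are hypothetical.

```python
import math
from collections import Counter


def fit_unigram(corpus):
    """Count lowercased, whitespace-separated tokens across the corpus."""
    counts = Counter(tok for text in corpus for tok in text.lower().split())
    return counts, sum(counts.values())


def toy_energy(text, counts, total):
    """Mean negative log-probability per token, with add-one smoothing.

    Stand-in for an energy function (assumption): frequent, in-domain
    tokens get high probability and therefore low energy.
    """
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    vocab_size = len(counts) + 1  # one extra slot shared by all unseen tokens
    nll = sum(
        -math.log((counts[tok] + 1) / (total + vocab_size)) for tok in tokens
    )
    return nll / len(tokens)


# Made-up "normal" sentences, for illustration only.
corpus = [
    "i would like to book a table for two",
    "can i book a table for four at seven pm",
    "a table for two at eight pm please",
]
counts, total = fit_unigram(corpus)

normal_text = "book a table for two at seven pm"
odd_text = "elephants fly through quantum space"
print(f"normal: {toy_energy(normal_text, counts, total):.2f}")
print(f"odd:    {toy_energy(odd_text, counts, total):.2f}")
```

In this toy setup the off-topic sentence contains only unseen tokens, so its energy is higher than that of the in-domain sentence.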


In [None]:
print("Energy Scores:")
print("=" * 70)

print("\nNormal Dialogues:")
for i, dialogue in enumerate(dialogues[:5], 1):
    energy = detector.compute_energy(dialogue)
    print(f"{i}. {dialogue[:50]}...")
    print(f"   Energy: {energy:.2f}")

print("\nAnomalous Dialogues:")
for i, anomaly in enumerate(anomalies, 1):
    energy = detector.compute_energy(anomaly)
    print(f"{i}. {anomaly[:50]}...")
    print(f"   Energy: {energy:.2f}")


## STEP 5: Show Anomaly Scores

Anomaly scores range from 0 to 1; higher means more anomalous:
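
How `detect_anomaly` turns an energy into a 0-1 score is not shown in this notebook. One plausible mapping, sketched here purely as an assumption, is to standardize the energy against the baseline (a z-score) and squash it with a logistic function centered on the upper edge of the normal range. The function `toy_anomaly_score` and its parameters `k` and `eps` are hypothetical.

```python
import math


def toy_anomaly_score(energy, baseline_mean, baseline_std, k=2.0, eps=1e-8):
    """Map an energy to (0, 1) via a z-score and a logistic function.

    The score is about 0.5 when the energy sits k standard deviations
    above the baseline mean (assumption: k=2 mirrors the +2 std bound
    of the "normal range" printed earlier).
    """
    z = (energy - baseline_mean) / (baseline_std + eps)  # eps avoids division by zero
    x = max(-50.0, min(50.0, z - k))  # clamp so math.exp cannot overflow
    return 1.0 / (1.0 + math.exp(-x))


# Illustrative baseline values, not results from the model.
for energy in (10.0, 13.0, 20.0):
    print(energy, round(toy_anomaly_score(energy, 10.0, 1.5), 3))
```

With this choice, energies near the baseline mean score well below 0.5, and the score rises monotonically as energy grows.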


In [None]:
print("Anomaly Scores:")
print("=" * 70)

print("\nNormal Dialogues:")
for i, dialogue in enumerate(dialogues[:5], 1):
    result = detector.detect_anomaly(dialogue, baseline_energy=baseline_mean, baseline_std=baseline_std)
    print(f"{i}. {dialogue[:50]}...")
    print(f"   Anomaly Score: {result['anomaly_score']:.3f} (0=normal, 1=anomalous)")

print("\nAnomalous Dialogues:")
for i, anomaly in enumerate(anomalies, 1):
    result = detector.detect_anomaly(anomaly, baseline_energy=baseline_mean, baseline_std=baseline_std)
    print(f"{i}. {anomaly[:50]}...")
    print(f"   Anomaly Score: {result['anomaly_score']:.3f}")


## STEP 6: Show Classification

Final classification results:
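
To make the decision rule concrete, the sketch below assumes a dialogue is flagged when its energy exceeds the baseline mean plus `k` standard deviations, then tallies how often normal and anomalous examples are flagged. The actual rule behind `is_anomalous` is not shown here; `classify`, `summarize`, and the energy values are hypothetical.

```python
def classify(energies, baseline_mean, baseline_std, k=2.0):
    """Flag each energy above baseline_mean + k * baseline_std (assumed rule)."""
    threshold = baseline_mean + k * baseline_std
    return [e > threshold for e in energies]


def summarize(normal_flags, anomalous_flags):
    """Share of normal items flagged, and share of anomalies caught."""
    if not normal_flags or not anomalous_flags:
        raise ValueError("both groups need at least one example")
    return {
        "false_positive_rate": sum(normal_flags) / len(normal_flags),
        "recall": sum(anomalous_flags) / len(anomalous_flags),
    }


# Made-up energies for illustration; these are not outputs of the TLM model.
normal_energies = [9.2, 10.1, 10.8, 11.5, 13.4]
anomalous_energies = [12.5, 14.0, 15.2, 17.8, 21.3]

summary = summarize(
    classify(normal_energies, baseline_mean=10.0, baseline_std=1.5),
    classify(anomalous_energies, baseline_mean=10.0, baseline_std=1.5),
)
print(summary)
```

Raising `k` trades fewer false alarms on normal dialogues for more missed anomalies, so the threshold is worth tuning on held-out data.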


In [None]:
print("Classification Results:")
print("=" * 70)

print("\nNormal Dialogues:")
for i, dialogue in enumerate(dialogues[:5], 1):
    result = detector.detect_anomaly(dialogue, baseline_energy=baseline_mean, baseline_std=baseline_std)
    status = "✗ ANOMALOUS" if result['is_anomalous'] else "✓ NORMAL"
    print(f"{i}. {status} - {dialogue[:60]}...")

print("\nAnomalous Dialogues:")
for i, anomaly in enumerate(anomalies, 1):
    result = detector.detect_anomaly(anomaly, baseline_energy=baseline_mean, baseline_std=baseline_std)
    status = "✗ ANOMALOUS" if result['is_anomalous'] else "✓ NORMAL"
    print(f"{i}. {status} - {anomaly[:60]}...")


## Summary

**TLM Anomaly Radar works as a guardrail:**

- ✓ Identifies normal restaurant booking dialogues
- ✓ Flags anomalous/hallucinated content
- ✓ Provides interpretable energy-based scores
- ✓ Model-agnostic (scores the output text itself, so it does not depend on which LLM produced it)

**This is the moment Scale AI "gets it."**
