# ASAN: Adaptive Spectral Alignment Networks for Predictive AI Safety

This notebook demonstrates the complete ASAN system for predicting harmful LLM outputs before they are generated.

## Overview

ASAN transforms AI safety from reactive output monitoring to proactive behavioral prediction by:
1. **Monitoring** LLM internal states during generation
2. **Analyzing** temporal patterns using spectral decomposition
3. **Predicting** harmful behavior before it manifests
4. **Intervening** to prevent harmful outputs
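
The four steps above can be read as a loop wrapped around token generation. The following is a minimal sketch of that loop, not the ASAN implementation: `stub_step`, `risk_score`, `guarded_generate`, and the threshold value are hypothetical names chosen for illustration, the "model" is a stub that emits one scalar internal state per token, and the predictor is a simple mean-absolute-change score standing in for the spectral analysis.

```python
from typing import Callable, List, Tuple


def stub_step(t: int) -> Tuple[str, float]:
    """Hypothetical model step: returns a token and a scalar internal state.

    The state is calm for the first five tokens, then starts oscillating.
    """
    state = 0.1 if t < 5 else (1.0 if t % 2 else -1.0)
    return f"tok{t}", state


def risk_score(history: List[float], window: int = 4) -> float:
    """Toy predictor: mean absolute change across the last `window` transitions."""
    recent = history[-(window + 1):]
    if len(recent) < 2:
        return 0.0
    diffs = [abs(b - a) for a, b in zip(recent, recent[1:])]
    return sum(diffs) / len(diffs)


def guarded_generate(
    step_fn: Callable[[int], Tuple[str, float]],
    max_tokens: int = 12,
    threshold: float = 0.5,  # assumed value, chosen only for this demo
) -> Tuple[List[str], bool]:
    """Generate tokens, halting as soon as the risk score exceeds the threshold."""
    tokens: List[str] = []
    history: List[float] = []
    for t in range(max_tokens):
        token, state = step_fn(t)      # 1. monitor the internal state
        history.append(state)
        score = risk_score(history)    # 2-3. analyze the trajectory and predict risk
        if score > threshold:          # 4. intervene before emitting the token
            return tokens, True
        tokens.append(token)
    return tokens, False


tokens, halted = guarded_generate(stub_step)
print(f"emitted {len(tokens)} tokens, halted={halted}")
```

The point of the sketch is the ordering: the score is computed from states observed so far, and the check happens before the current token is appended, so an intervention withholds that token rather than retracting it afterwards.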

## Key Features

- Real-time monitoring of LLM internal states
- Spectral analysis of attention patterns and hidden states
- Early detection of harmful patterns (at least 5 tokens before the harmful output is complete)
- Multi-modal analysis combining attention, hidden states, and token probabilities
- Interpretable predictions with frequency band analysis
- Low latency (<20ms overhead per token)
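
To make "frequency band analysis" concrete, here is a small sketch of how a per-token signal could be summarized as relative energy in a few frequency bands. It rests on assumptions that are not taken from ASAN itself: the signal is a single scalar per token (for example, a hidden-state norm), the bands are equal-width slices of the one-sided spectrum, and `band_energies` is a hypothetical helper name.

```python
import numpy as np


def band_energies(signal: np.ndarray, n_bands: int = 3) -> np.ndarray:
    """Return the fraction of spectral energy in each of `n_bands` bands.

    Bands are contiguous, roughly equal-width slices of the one-sided
    power spectrum, ordered from low to high frequency.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                        # remove the constant (DC) offset
    power = np.abs(np.fft.rfft(x)) ** 2     # one-sided power spectrum
    power = power[1:]                       # drop the DC bin itself
    energy = np.array([band.sum() for band in np.array_split(power, n_bands)])
    total = energy.sum()
    return energy / total if total > 0 else energy


# Synthetic trajectories over 64 tokens: one slowly varying, one rapidly varying.
t = np.arange(64)
slow = np.sin(2 * np.pi * 2 * t / 64)    # 2 cycles across the window
fast = np.sin(2 * np.pi * 30 * t / 64)   # 30 cycles across the window

print("slow:", np.round(band_energies(slow), 3))
print("fast:", np.round(band_energies(fast), 3))
```

A slowly drifting signal concentrates its energy in the first band and a rapidly oscillating one in the last, which is the kind of per-band summary that can make a prediction easier to interpret.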


In [None]:
# Import required libraries
import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from typing import Dict, List, Tuple
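import warnings
# Suppress library warnings to keep the demo output readable
# (note: this also hides deprecation notices)
warnings.filterwarnings('ignore')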

# Set random seeds for reproducibility
torch.manual_seed(42)
np.random.seed(42)

print("ASAN Demo: Adaptive Spectral Alignment Networks for Predictive AI Safety")
print("=" * 70)
