# Day 18: Anomaly Detection for FL Security

**Real-Time Detection of Malicious Client Updates**

## Overview
- **Goal**: Detect malicious client updates in real-time
- **Methods**: L2 norm, cosine similarity, clustering, autoencoders
- **Deployment**: Server-side monitoring system

## What You'll Learn
1. **Update Anomalies**: What makes an update suspicious?
2. **Detection Methods**: L2 norm, cosine similarity, Euclidean distance
3. **Threshold Selection**: Statistical approaches
4. **Ensemble Detection**: Combining multiple methods

---

## 1. What Makes an Update Anomalous?

In [None]:
import numpy as np
import pandas as pd  # used for the results table in the ensemble section
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats

sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (12, 6)

print("""
ANOMALOUS UPDATE CHARACTERISTICS:

1. LARGE L2 NORM
   • Normal: ~0.1-1.0
   • Anomalous: >10.0
   • Detection: Compare to historical distribution

2. WRONG DIRECTION (Cosine Similarity)
   • Normal: Similar to other updates (cosine > 0.5)
   • Anomalous: Opposite direction (cosine < -0.5)
   • Detection: Angle to mean update

3. STATISTICAL OUTLIER
   • Normal: Within 2-3 standard deviations
   • Anomalous: Beyond 3 standard deviations
   • Detection: Z-score, IQR, isolation forest

4. RARE PATTERN
   • Normal: Follows expected distribution
   • Anomalous: Unusual pattern
   • Detection: Autoencoder, one-class SVM

""")
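
The list above names the IQR rule as one way to find statistical outliers, but it is not demonstrated in this notebook. The following is a minimal sketch of Tukey's fences applied to update norms. The helper name `iqr_outliers`, the multiplier `k=1.5`, and the example norms are illustrative assumptions, not part of the original material.

```python
import numpy as np

def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences)."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return (values < lower) | (values > upper)

# Hypothetical L2 norms: eight typical clients and two very large updates
example_norms = [0.9, 1.0, 1.1, 0.95, 1.05, 1.0, 0.98, 1.02, 12.0, 15.0]
flags = iqr_outliers(example_norms)
print(flags)
```

Because quartiles are barely affected by a few extreme values, this rule is less prone to masking than a mean/std z-score when several clients misbehave at once.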

## 2. L2 Norm Detection

In [None]:
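def l2_norm_detector(updates, threshold=None, z_score_threshold=3.0):
    """
    Detect anomalies using the L2 norm of each update.
    
    Args:
        updates: List of weight updates (1-D arrays)
        threshold: Fixed norm threshold (if None, use a statistical threshold)
        z_score_threshold: Robust z-score threshold for statistical detection
        
    Returns:
        anomalies: List of booleans (True if anomalous)
        norms: L2 norms for all updates
    """
    # Compute L2 norms
    norms = [np.linalg.norm(u) for u in updates]
    
    if threshold is not None:
        # Fixed threshold
        anomalies = [norm > threshold for norm in norms]
    else:
        # Statistical threshold using a robust z-score (median and MAD).
        # A plain mean/std z-score is masked by the outliers themselves:
        # with two similar-sized outliers among 11 clients it stays near
        # sqrt(9/2), about 2.12, so a 2.5 threshold would flag nothing.
        median_norm = np.median(norms)
        mad = np.median(np.abs(np.array(norms) - median_norm))
        # 0.6745 rescales the MAD so the score is comparable to a normal z-score
        z_scores = [0.6745 * (norm - median_norm) / (mad + 1e-10) for norm in norms]
        anomalies = [abs(z) > z_score_threshold for z in z_scores]
    
    return anomalies, norms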

# Test with mixed honest/malicious updates
np.random.seed(42)
honest_updates = [np.random.randn(100) * 0.1 for _ in range(9)]
malicious_updates = [
    np.random.randn(100) * 5,  # Large norm
    np.random.randn(100) * 0.1 * 50  # Scaled attack
]
all_updates = honest_updates + malicious_updates

# Detect
anomalies, norms = l2_norm_detector(all_updates, z_score_threshold=2.5)

# Visualize
plt.figure(figsize=(12, 6))
colors = ['green' if not a else 'red' for a in anomalies]
plt.bar(range(len(norms)), norms, color=colors, alpha=0.7)
plt.axhline(y=np.mean(norms[:9]), color='blue', linestyle='--', label='Honest mean')
plt.xlabel('Client ID', fontsize=12)
plt.ylabel('L2 Norm', fontsize=12)
plt.title('L2 Norm Anomaly Detection\n(Green=Honest, Red=Anomalous)', fontsize=14)
plt.legend()
plt.grid(True, alpha=0.3, axis='y')
plt.show()

print(f"Detected {sum(anomalies)} anomalies out of {len(all_updates)} clients")
print(f"Malicious clients detected: {sum(anomalies[-2:])}/2")
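
The detector above also accepts a fixed `threshold`. One way to choose it, in the spirit of comparing against a historical distribution, is to take a high percentile of norms from past rounds. The sketch below assumes such a history exists and is mostly honest. The helper `fit_norm_threshold`, the 99th percentile, and the 1.5 safety margin are hypothetical choices for illustration.

```python
import numpy as np

def fit_norm_threshold(historical_norms, percentile=99.0, margin=1.5):
    """Return a fixed norm threshold: a high percentile of past norms times a safety margin."""
    return margin * float(np.percentile(historical_norms, percentile))

rng = np.random.default_rng(0)

# Hypothetical history: norms of 500 past updates assumed to be honest
history = [np.linalg.norm(rng.normal(scale=0.1, size=100)) for _ in range(500)]
threshold = fit_norm_threshold(history)

# One update drawn like the history, one with a 50x larger scale
new_updates = [rng.normal(scale=0.1, size=100), rng.normal(scale=5.0, size=100)]
flags = [bool(np.linalg.norm(u) > threshold) for u in new_updates]
print(f"threshold={threshold:.2f}, flags={flags}")
```

The resulting value could be passed as the `threshold` argument of `l2_norm_detector`. A fixed threshold does not depend on the other clients in the current round, so it is not diluted when several attackers submit large updates together.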

## 3. Cosine Similarity Detection

In [None]:
def cosine_similarity_detector(updates, threshold=-0.5):
    """
    Detect anomalies using cosine similarity to mean update.
    
    Args:
        updates: List of weight updates
        threshold: Cosine similarity threshold (below = anomalous)
        
    Returns:
        anomalies: List of booleans
        similarities: Cosine similarities
    """
    # Compute mean direction
    mean_update = np.mean(updates, axis=0)
    mean_update /= (np.linalg.norm(mean_update) + 1e-10)
    
    # Compute cosine similarities
    similarities = []
    for update in updates:
        update_norm = update / (np.linalg.norm(update) + 1e-10)
        sim = np.dot(mean_update, update_norm)
        similarities.append(sim)
    
    # Detect anomalies (low similarity = wrong direction)
    anomalies = [sim < threshold for sim in similarities]
    
    return anomalies, similarities

# Test with sign-flipping attack
np.random.seed(42)
honest_updates = [np.random.randn(50) * 0.1 + 1 for _ in range(8)]
sign_flip_attack = [-honest_updates[0]]  # Opposite direction
all_updates = honest_updates + sign_flip_attack

# Detect
anomalies, similarities = cosine_similarity_detector(all_updates, threshold=-0.3)

# Visualize
plt.figure(figsize=(12, 6))
colors = ['green' if not a else 'red' for a in anomalies]
plt.bar(range(len(similarities)), similarities, color=colors, alpha=0.7)
plt.axhline(y=-0.3, color='orange', linestyle='--', label='Threshold')
plt.xlabel('Client ID', fontsize=12)
plt.ylabel('Cosine Similarity to Mean', fontsize=12)
plt.title('Cosine Similarity Anomaly Detection\n(Red=Opposite Direction)', fontsize=14)
plt.legend()
plt.grid(True, alpha=0.3, axis='y')
plt.ylim(-1.1, 1.1)
plt.show()

print(f"Detected {sum(anomalies)} sign-flipping attack(s)")
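
Cosine similarity depends only on direction, so rescaling an update leaves its score unchanged. The short sketch below illustrates this with a hypothetical reference direction and made-up updates. The helper `cosine` is defined here for the example only.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-10))

rng = np.random.default_rng(0)
reference = np.ones(50)                              # hypothetical mean direction
honest = reference + rng.normal(scale=0.1, size=50)  # small noise around the reference
scaled = 50 * honest                                 # scaling attack: same direction, 50x magnitude
flipped = -honest                                    # sign-flipping attack

print(f"honest:  {cosine(reference, honest):.3f}")
print(f"scaled:  {cosine(reference, scaled):.3f}")
print(f"flipped: {cosine(reference, flipped):.3f}")
```

The scaled update scores the same as the honest one, which is why a direction-based check is usually paired with a magnitude-based check such as the L2 norm detector.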

## 4. Ensemble Detection

In [None]:
def ensemble_anomaly_detector(updates):
    """
    Ensemble multiple detection methods.
    
    Methods:
    1. L2 norm (statistical)
    2. Cosine similarity
    3. Euclidean distance from mean
    
    Returns:
        anomalies: List of booleans (True if any method flags)
        scores: Dict of anomaly scores per method
    """
    # Method 1: L2 norm z-score
    l2_anomalies, _ = l2_norm_detector(updates, z_score_threshold=2.5)
    
    # Method 2: Cosine similarity
    cos_anomalies, _ = cosine_similarity_detector(updates, threshold=-0.5)
    
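    # Method 3: Euclidean distance from the mean update, scored with a
    # robust z-score (median/MAD) so that several outliers cannot mask
    # each other by inflating the mean and standard deviation
    mean_update = np.mean(updates, axis=0)
    distances = np.array([np.linalg.norm(u - mean_update) for u in updates])
    median_dist = np.median(distances)
    mad_dist = np.median(np.abs(distances - median_dist))
    z_scores = 0.6745 * (distances - median_dist) / (mad_dist + 1e-10)
    dist_anomalies = [abs(z) > 2.5 for z in z_scores]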
    
    # Combine (OR logic: flag if any method detects)
    anomalies = [
        l2 or cos or dist
        for l2, cos, dist in zip(l2_anomalies, cos_anomalies, dist_anomalies)
    ]
    
    scores = {
        'l2_anomaly': l2_anomalies,
        'cosine_anomaly': cos_anomalies,
        'distance_anomaly': dist_anomalies,
        'ensemble_anomaly': anomalies
    }
    
    return anomalies, scores

# Test ensemble
np.random.seed(42)
# Build the honest updates first so the sign-flip attack can reference one of them
honest = [np.random.randn(50) * 0.1 + 1 for _ in range(7)]  # Honest
updates = honest + [
    np.random.randn(50) * 5,                   # Large norm
    -honest[0] * 2,                            # Sign flip
    np.random.randn(50) * 0.1 + 5              # Far from mean
]

anomalies, scores = ensemble_anomaly_detector(updates)

# Display results
results_df = pd.DataFrame({
    'Client': range(len(updates)),
    'L2 Anomaly': scores['l2_anomaly'],
    'Cosine Anomaly': scores['cosine_anomaly'],
    'Distance Anomaly': scores['distance_anomaly'],
    'ENSEMBLE (Any)': scores['ensemble_anomaly']
})

print("\n" + "="*60)
print("ENSEMBLE ANOMALY DETECTION RESULTS")
print("="*60)
print(results_df.to_string(index=False))
print(f"\nTotal anomalies detected: {sum(anomalies)}/{len(updates)}")
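
The ensemble above uses OR logic, which flags a client as soon as any single method fires. A possible alternative is to require agreement between methods. The sketch below assumes per-method flags are available as lists of booleans. The helper `vote_ensemble` and its `min_votes` parameter are hypothetical, and the example flags are made up.

```python
def vote_ensemble(method_flags, min_votes=2):
    """
    Combine per-method boolean flags by voting.

    method_flags: dict mapping method name -> list of booleans (one per client)
    min_votes: number of methods that must agree before a client is flagged
    """
    per_client = zip(*method_flags.values())
    return [sum(bool(f) for f in flags) >= min_votes for flags in per_client]

# Hypothetical per-method flags for five clients
example_flags = {
    'l2_anomaly':       [False, True,  False, True,  False],
    'cosine_anomaly':   [False, False, True,  True,  False],
    'distance_anomaly': [False, True,  False, True,  True],
}
any_vote = vote_ensemble(example_flags, min_votes=1)       # equivalent to OR
majority_vote = vote_ensemble(example_flags, min_votes=2)  # two of three must agree
print(any_vote)
print(majority_vote)
```

To apply this to the `scores` dict returned by `ensemble_anomaly_detector`, the `'ensemble_anomaly'` entry would need to be dropped first so that only the three individual methods vote. Raising `min_votes` trades detection rate for fewer false positives.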

## 5. Summary

In [None]:
print("""

ANOMALY DETECTION SUMMARY:

Detection Methods:

1. L2 NORM (Magnitude-based)
   • Detects: Gradient scaling, large updates
   • Threshold: Statistical (z-score) or fixed
   • Pros: Simple, fast, effective for scaling attacks
   • Cons: Misses same-magnitude attacks (sign flip)

2. COSINE SIMILARITY (Direction-based)
   • Detects: Sign flipping, wrong direction
   • Threshold: < -0.5 (opposite direction)
   • Pros: Detects subtle direction attacks
   • Cons: Misses scaling attacks (same direction)

3. EUCLIDEAN DISTANCE
   • Detects: Any deviation from group norm
   • Threshold: Z-score > 2.5
   • Pros: General-purpose
   • Cons: Less specific

4. ENSEMBLE (Combine All)
   • Detects: Any attack caught by at least one of the methods above
   • Logic: OR (flag if any method detects)
   • Pros: Highest detection rate
   • Cons: Higher false positive rate

Deployment Considerations:
  • Run on server after receiving all client updates
  • Block detected anomalies before aggregation
  • Log detections for audit trail
  • Tune thresholds based on historical data

""")

## 6. Key Takeaways

### Anomaly Detection for FL Security:

**Key Insight:**
- Malicious updates look DIFFERENT from honest ones
- Multiple dimensions: magnitude, direction, distribution
- Ensemble detection covers more attack types than any single method

**Detection Pipeline:**
1. Client sends update → Server
2. Server collects all updates (wait for all or timeout)
3. Run anomaly detection (L2, cosine, distance)
4. Flag anomalous clients
5. Aggregate only honest updates (Krum, trimmed mean)
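
The pipeline steps above can be sketched as a single server-side function. This is a minimal illustration under stated assumptions, not a production design: `filter_and_aggregate`, the logger name, and the stand-in `norm_detector` with its `max_norm` value are all hypothetical, and a plain mean stands in for a robust aggregator such as Krum or trimmed mean.

```python
import logging
import numpy as np

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("fl_anomaly")  # hypothetical logger name

def filter_and_aggregate(client_updates, detector):
    """
    Run a detector on one round of updates, log flagged clients,
    and average the remaining updates.

    client_updates: dict mapping client ID -> 1-D update array
    detector: callable taking a list of updates and returning a list of booleans
    """
    client_ids = list(client_updates)
    updates = [client_updates[cid] for cid in client_ids]
    flags = detector(updates)

    flagged = [cid for cid, f in zip(client_ids, flags) if f]
    kept = [u for u, f in zip(updates, flags) if not f]
    for cid in flagged:
        logger.info("Rejected update from client %s", cid)
    if not kept:
        raise ValueError("All updates were flagged; nothing to aggregate")

    # Plain mean as a placeholder; a robust aggregator could be used instead
    return np.mean(kept, axis=0), flagged

def norm_detector(updates, max_norm=10.0):
    """Simple stand-in detector: flag updates whose L2 norm exceeds max_norm."""
    return [bool(np.linalg.norm(u) > max_norm) for u in updates]

rng = np.random.default_rng(0)
round_updates = {f"client_{i}": rng.normal(scale=0.1, size=20) + 1 for i in range(4)}
round_updates["client_4"] = rng.normal(scale=5.0, size=20)  # oversized update
aggregate, rejected = filter_and_aggregate(round_updates, norm_detector)
print(rejected, aggregate.shape)
```

Any callable with the same shape can be plugged in as `detector`, for example `lambda u: ensemble_anomaly_detector(u)[0]` once that function has been defined in the notebook.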

**Best Practices:**
- Use ensemble detection (multiple methods)
- Tune thresholds on validation data
- Log detections for monitoring
- Combine with robust aggregation (Day 17)

**Limitations:**
- Sophisticated attacks can evade detection
- Sybil attacks can overwhelm (use FoolsGold, Day 19)
- False positives reject honest clients

### Next Steps:
→ **Day 19**: FoolsGold (Sybil-resistant aggregation)

---

**📁 Project Location**: `04_defensive_techniques/anomaly_detection_system/`