<!--
Copyright (c) 2025 Milin Patel
Hochschule Kempten - University of Applied Sciences

Autonomous Driving: AI Safety and Security Workshop
This project is licensed under the MIT License.
See LICENSE file in the root directory for full license text.
-->

*Copyright © 2025 Milin Patel. All Rights Reserved.*

# Notebook 10: Adversarial Attacks on AV Perception Systems

**Session 2: Failure Modes and Edge Cases**

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/milinpatel07/Autonomous-Driving_AI-Safety-and-Security/blob/main/AV_Perception_Safety_Workshop/Session_2_Failure_Modes_and_Edge_Cases/notebooks/10_Adversarial_Attacks_on_Perception.ipynb)

**Author:** Milin Patel  
**Duration:** ~25 minutes

---

## 🎯 Learning Objectives

By the end of this notebook, you will:
- ✅ Understand adversarial examples and their threat to AVs
- ✅ Implement FGSM and PGD digital attacks
- ✅ Analyze physical adversarial patches on traffic signs
- ✅ Explore sensor spoofing attacks (LiDAR, camera)
- ✅ Implement defense mechanisms (adversarial training, input validation)
- ✅ Connect adversarial robustness to ISO/SAE 21434 cybersecurity
- ✅ Design security requirements for AV systems

---

## 📦 Setup and Imports

In [None]:
# Import required libraries
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision
import torchvision.transforms as transforms
from torchvision.models import resnet18
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from PIL import Image
import cv2
import warnings
warnings.filterwarnings('ignore')

# Check device
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"Using device: {device}")

# Set random seeds
torch.manual_seed(42)
np.random.seed(42)

print("✅ All libraries imported successfully!")
print(f"PyTorch version: {torch.__version__}")

---

## 1️⃣ What are Adversarial Examples?

### Definition

**Adversarial examples** are inputs intentionally designed to cause a machine learning model to make a mistake.

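**Key property:** Small, often **imperceptible** perturbations can cause **large** changes in model output. A common construction (FGSM, covered in Section 2) steps along the sign of the loss gradient:

$$x_{adv} = x + \epsilon \cdot \text{sign}(\nabla_x L(x, y))$$

Where:
- $x$ = original input
- $y$ = true label of $x$
- $x_{adv}$ = adversarial example
- $\epsilon$ = perturbation magnitude (small!)
- $\nabla_x L$ = gradient of the loss $L$ with respect to the input $x$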

### Example

**Original:** Stop sign → Classified as "stop sign" (99% confidence)  
**Adversarial:** Stop sign + tiny noise → Classified as "speed limit 45" (95% confidence)

**Danger for AVs:**
- Attacker can cause misclassification
- Imperceptible to humans
- Can be physical (stickers on real signs)

### Why do adversarial examples exist?

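1. **High dimensionality:** An image has tens of thousands to millions of input values, so many tiny per-pixel changes can add up to a large change in the model's output
2. **Local linearity:** Neural networks behave approximately linearly around an input, so a perturbation aligned with the gradient shifts the output efficiently
3. **Transferability:** Perturbations crafted for one model often fool other models too, which makes black-box attacks practical
4. **Optimization:** Training optimizes average-case accuracy, not worst-case robustness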
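
The first two points can be made concrete with a linear score $w \cdot x$. A step of $\epsilon \cdot \text{sign}(w)$ shifts the score by exactly $\epsilon \lVert w \rVert_1$, which grows with the number of input dimensions even though no single pixel moves by more than $\epsilon$. The sketch below is a toy illustration with random weights, not a model of a real perception network; the helper name `logit_shift` is hypothetical.

```python
import numpy as np

def logit_shift(w, epsilon):
    """Change in the linear score w.x caused by the perturbation epsilon * sign(w)."""
    perturbation = epsilon * np.sign(w)
    return float(w @ perturbation)  # equals epsilon * sum(|w|)

rng = np.random.default_rng(0)
epsilon = 0.01  # same per-pixel budget in every case

for dim in (100, 10_000, 150_528):  # 150,528 = 224 * 224 * 3 input values
    w = rng.normal(size=dim)
    print(f"dim={dim:>7}: score shift = {logit_shift(w, epsilon):.1f}")
```

The per-pixel budget stays fixed, yet the achievable score shift scales roughly linearly with the input dimension.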

In [None]:
# Visualize adversarial example concept
fig, axes = plt.subplots(1, 4, figsize=(16, 4))

# Simulate an example
np.random.seed(42)
original = np.random.rand(64, 64, 3) * 0.5 + 0.25  # Original image
perturbation = (np.random.rand(64, 64, 3) - 0.5) * 0.1  # Small noise
adversarial = np.clip(original + perturbation, 0, 1)
difference = np.abs(adversarial - original) * 10  # Amplified for visibility

axes[0].imshow(original)
axes[0].set_title('Original Image\n"Stop Sign"\nConfidence: 99%', fontsize=11, fontweight='bold')
axes[0].axis('off')

# Amplify 10x around mid-gray so the noise is visible (cmap is ignored for RGB data)
axes[1].imshow(np.clip(perturbation * 10 + 0.5, 0, 1))
axes[1].set_title('Perturbation\n(Amplified 10x)', fontsize=11, fontweight='bold')
axes[1].axis('off')

axes[2].imshow(adversarial)
axes[2].set_title('Adversarial Image\n"Speed Limit 45"\nConfidence: 95%', 
                  fontsize=11, fontweight='bold', color='red')
axes[2].axis('off')

axes[3].imshow(difference)
axes[3].set_title('Difference\n(Amplified 10x)', fontsize=11, fontweight='bold')
axes[3].axis('off')

plt.tight_layout()
plt.show()

print("⚠️ Danger: Tiny, imperceptible changes cause misclassification!")
print("   Human sees: Stop sign")
print("   AV sees: Speed limit 45")
print("   Result: Vehicle doesn't stop → Collision!")

---

## 2️⃣ Digital Attack 1: FGSM (Fast Gradient Sign Method)

**Idea:** Perturb the image with a single step along the sign of the loss gradient

**Algorithm:**
1. Compute loss $L(x, y_{true})$
2. Compute gradient $\nabla_x L$
3. Add perturbation: $x_{adv} = x + \epsilon \cdot \text{sign}(\nabla_x L)$

**Properties:**
- **Fast:** Single gradient computation
- **Simple:** One-step attack
- **Effective:** Often succeeds against undefended models, but is weaker than iterative attacks such as PGD

**Reference:** Goodfellow et al. (2014) - "Explaining and Harnessing Adversarial Examples"
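
For a model small enough to differentiate by hand, the three steps can be followed number by number. The sketch below applies one FGSM step to a logistic-regression classifier, where the input gradient of the cross-entropy loss is $(p - y)\,w$. The weights, the 4-value input, and the helper names are hypothetical and chosen only for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_logistic(x, y, w, b, epsilon):
    """One FGSM step against p(y=1|x) = sigmoid(w.x + b)."""
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w                      # gradient of the cross-entropy loss w.r.t. x
    x_adv = x + epsilon * np.sign(grad_x)     # step that increases the loss
    return np.clip(x_adv, 0.0, 1.0)           # keep values in the valid pixel range

# Hypothetical 4-value input with true label y = 1
w = np.array([2.0, -1.0, 0.5, -1.5])
b = 0.0
x = np.array([0.6, 0.3, 0.5, 0.2])
y = 1.0

x_adv = fgsm_logistic(x, y, w, b, epsilon=0.2)
print("clean p(y=1):      ", sigmoid(w @ x + b))
print("adversarial p(y=1):", sigmoid(w @ x_adv + b))
```

Here the clean score $w \cdot x = 0.85$ drops to $-0.15$ after the step, so the predicted class flips although no input value changed by more than 0.2.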

In [None]:
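# Load a pretrained ImageNet classifier for demonstration
# (the `weights` argument replaces the deprecated `pretrained=True` in torchvision >= 0.13)
model = resnet18(weights="IMAGENET1K_V1").to(device)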
model.eval()

# ImageNet normalization
mean = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1).to(device)
std = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1).to(device)

def normalize(x):
    return (x - mean) / std

def denormalize(x):
    return x * std + mean

print("✅ Model loaded (ResNet-18 pre-trained on ImageNet)")

In [None]:
# FGSM Attack Implementation
def fgsm_attack(image, epsilon, model, target=None):
    """
    Fast Gradient Sign Method attack
    
    Args:
        image: Input image tensor [1, 3, H, W]
        epsilon: Perturbation magnitude
        model: Target model
        target: Target class (if None, untargeted attack)
    
    Returns:
        adversarial image
    """
    # Ensure image requires grad
    image = image.clone().detach().requires_grad_(True)
    
    # Forward pass
    output = model(normalize(image))
    
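    # Compute loss
    if target is None:
        # Untargeted: increase the loss for the model's current prediction
        # (used as a stand-in for the true label, which is not passed in)
        pred = output.argmax(dim=1)
        loss = F.cross_entropy(output, pred)
    else:
        # Targeted: decrease the loss for the target class (hence the minus sign);
        # the label tensor matches the batch size so batched inputs also work
        target_labels = torch.full((image.shape[0],), target, dtype=torch.long, device=image.device)
        loss = -F.cross_entropy(output, target_labels)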
    
    # Backward pass
    model.zero_grad()
    loss.backward()
    
    # Get gradient sign
    grad_sign = image.grad.sign()
    
    # Create adversarial example
    adversarial = image + epsilon * grad_sign
    adversarial = torch.clamp(adversarial, 0, 1)  # Keep in valid range
    
    return adversarial.detach()

print("✅ FGSM attack function defined")

In [None]:
# Test FGSM attack on a sample image
# Use a random-noise image as a stand-in input for the demo (not a real traffic sign photo)
torch.manual_seed(42)
test_image = torch.rand(1, 3, 224, 224).to(device) * 0.5 + 0.25

# Original prediction
with torch.no_grad():
    orig_output = model(normalize(test_image))
    orig_pred = orig_output.argmax(dim=1).item()
    orig_conf = F.softmax(orig_output, dim=1).max().item()

print(f"Original prediction: Class {orig_pred}, Confidence: {orig_conf:.3f}")

# Generate adversarial examples with different epsilon values
epsilons = [0.0, 0.01, 0.03, 0.05, 0.1]
results = []

fig, axes = plt.subplots(1, len(epsilons), figsize=(16, 3))

for i, eps in enumerate(epsilons):
    if eps == 0:
        adv_image = test_image
    else:
        adv_image = fgsm_attack(test_image, eps, model)
    
    # Prediction on adversarial example
    with torch.no_grad():
        adv_output = model(normalize(adv_image))
        adv_pred = adv_output.argmax(dim=1).item()
        adv_conf = F.softmax(adv_output, dim=1).max().item()
    
    success = adv_pred != orig_pred
    results.append({'Epsilon': eps, 'Success': success, 'Pred': adv_pred, 'Conf': adv_conf})
    
    # Visualize
    img_np = adv_image.squeeze().cpu().permute(1, 2, 0).numpy()
    axes[i].imshow(img_np)
    color = 'red' if success else 'green'
    axes[i].set_title(f'ε={eps}\nClass {adv_pred}\nConf: {adv_conf:.2f}', 
                      fontsize=10, fontweight='bold', color=color)
    axes[i].axis('off')
    
    if success:
        axes[i].add_patch(plt.Rectangle((0, 0), 224, 224, fill=False, 
                                        edgecolor='red', linewidth=4))

plt.tight_layout()
plt.show()

# Display results
results_df = pd.DataFrame(results)
display(results_df)

print("\n📊 Attack Success Rate:")
success_rate = results_df[results_df['Epsilon'] > 0]['Success'].mean() * 100
print(f"   {success_rate:.1f}% of adversarial examples caused misclassification")

---

## 3️⃣ Digital Attack 2: PGD (Projected Gradient Descent)

**Idea:** Iterative version of FGSM (stronger attack)

**Algorithm:**
1. Start from the original image (Madry et al. additionally start from a random point inside the epsilon-ball)
2. Repeat N iterations:
   - Apply FGSM with small step size
   - Project back to epsilon-ball around original
3. Return final adversarial example

**Properties:**
- **Stronger:** More iterations find better perturbations
- **Slower:** Multiple gradient computations
- **More reliable:** Higher success rate than FGSM at the same epsilon

**Reference:** Madry et al. (2018) - "Towards Deep Learning Models Resistant to Adversarial Attacks"
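
The projection step is the part that distinguishes PGD from simply repeating FGSM, so it is worth seeing in isolation. The sketch below shows the L-infinity projection on a three-value example, plus the random start used by Madry et al. The function names and the sample numbers are hypothetical.

```python
import numpy as np

def project_linf(x_adv, x_orig, epsilon):
    """Project x_adv onto the L-infinity ball of radius epsilon around x_orig,
    then back into the valid pixel range [0, 1]."""
    delta = np.clip(x_adv - x_orig, -epsilon, epsilon)
    return np.clip(x_orig + delta, 0.0, 1.0)

def random_start(x_orig, epsilon, rng):
    """Random initial point inside the epsilon-ball around x_orig."""
    noise = rng.uniform(-epsilon, epsilon, size=x_orig.shape)
    return project_linf(x_orig + noise, x_orig, epsilon)

x = np.array([0.50, 0.50, 0.98])
x_stepped = np.array([0.58, 0.49, 1.05])  # hypothetical state after several sign steps
print(project_linf(x_stepped, x, epsilon=0.03))
```

The first value is pulled back from 0.58 to 0.53 (the edge of the ball), the second is already inside the ball and stays at 0.49, and the third is limited by the pixel range to 1.0.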

In [None]:
# PGD Attack Implementation
def pgd_attack(image, epsilon, model, alpha=0.01, num_iter=10, target=None):
    """
    Projected Gradient Descent attack
    
    Args:
        image: Input image tensor
        epsilon: Maximum perturbation
        model: Target model
        alpha: Step size per iteration
        num_iter: Number of iterations
        target: Target class (if None, untargeted)
    """
    original_image = image.clone().detach()
    adversarial = image.clone().detach()
    
    for i in range(num_iter):
        adversarial.requires_grad = True
        
        # Forward pass
        output = model(normalize(adversarial))
        
        # Compute loss
        if target is None:
            pred = output.argmax(dim=1)
            loss = F.cross_entropy(output, pred)
        else:
            loss = -F.cross_entropy(output, torch.tensor([target]).to(device))
        
        # Backward
        model.zero_grad()
        loss.backward()
        
        # Update adversarial example
        grad_sign = adversarial.grad.sign()
        adversarial = adversarial.detach() + alpha * grad_sign
        
        # Project back to epsilon-ball
        perturbation = adversarial - original_image
        perturbation = torch.clamp(perturbation, -epsilon, epsilon)
        adversarial = original_image + perturbation
        adversarial = torch.clamp(adversarial, 0, 1)
    
    return adversarial.detach()

print("✅ PGD attack function defined")

In [None]:
# Compare FGSM vs PGD
epsilon = 0.03

# Generate attacks
fgsm_adv = fgsm_attack(test_image, epsilon, model)
pgd_adv = pgd_attack(test_image, epsilon, model, alpha=0.007, num_iter=10)

# Predictions
with torch.no_grad():
    fgsm_output = model(normalize(fgsm_adv))
    fgsm_pred = fgsm_output.argmax(dim=1).item()
    fgsm_conf = F.softmax(fgsm_output, dim=1).max().item()
    
    pgd_output = model(normalize(pgd_adv))
    pgd_pred = pgd_output.argmax(dim=1).item()
    pgd_conf = F.softmax(pgd_output, dim=1).max().item()

# Visualize comparison
fig, axes = plt.subplots(1, 3, figsize=(14, 4))

images = [test_image, fgsm_adv, pgd_adv]
titles = [
    f'Original\nClass {orig_pred}\nConf: {orig_conf:.2f}',
    f'FGSM (ε={epsilon})\nClass {fgsm_pred}\nConf: {fgsm_conf:.2f}',
    f'PGD (ε={epsilon})\nClass {pgd_pred}\nConf: {pgd_conf:.2f}'
]
colors = ['green', 'orange' if fgsm_pred != orig_pred else 'green', 
          'red' if pgd_pred != orig_pred else 'green']

for i, (img, title, color) in enumerate(zip(images, titles, colors)):
    img_np = img.squeeze().cpu().permute(1, 2, 0).numpy()
    axes[i].imshow(img_np)
    axes[i].set_title(title, fontsize=11, fontweight='bold', color=color)
    axes[i].axis('off')

plt.tight_layout()
plt.show()

print("\n📊 Comparison:")
print(f"   Original: Class {orig_pred}")
print(f"   FGSM: Class {fgsm_pred} {'✗ Failed' if fgsm_pred == orig_pred else '✓ Success'}")
print(f"   PGD: Class {pgd_pred} {'✗ Failed' if pgd_pred == orig_pred else '✓ Success'}")
print("\n💡 PGD typically has a higher attack success rate than FGSM")

---

## 4️⃣ Physical Adversarial Attacks

**Challenge:** Digital attacks manipulate raw pixels directly, but a physical attack must survive printing, the camera pipeline, and changing viewing conditions.

### Physical Attack Considerations
1. **Printing:** Perturbations must survive printer limitations
2. **Viewing angle:** Must work from multiple angles
3. **Distance:** Must work at various distances
4. **Lighting:** Must work under different lighting conditions
5. **Weather:** Rain, fog, etc. affect appearance

### Adversarial Patch Attack

**Idea:** Instead of perturbing the entire image, place a **patch** (sticker) on the object

**Famous Example:** Adversarial stickers on stop sign
- Paper: Eykholt et al. (2018) - "Robust Physical-World Attacks on Deep Learning Visual Classification"
- Result: Stop sign misclassified as "Speed Limit 45"
- Method: Small stickers placed on sign

**Danger for AVs:**
- Attacker can physically modify traffic signs
- AV misinterprets sign
- Could cause the vehicle to not stop → collision
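
A real patch is optimized, not random, and it has to work wherever it ends up in the camera image. The sketch below shows that idea on a deliberately tiny problem: a linear "stop sign" score over an 8x8 grayscale image, with the patch gradient averaged over every possible placement (a simplified form of optimizing over transformations). It is not the method of Eykholt et al.; the scorer, sizes, and function names are hypothetical.

```python
import numpy as np

def paste_patch(image, patch, row, col):
    """Return a copy of image with patch pasted at (row, col)."""
    out = image.copy()
    h, w = patch.shape
    out[row:row + h, col:col + w] = patch
    return out

def optimize_patch(weights, patch_size=3, steps=50, lr=0.1):
    """Gradient descent on patch pixels to lower a linear score,
    averaged over all patch placements on a square image."""
    n_pos = weights.shape[0] - patch_size + 1
    positions = [(r, c) for r in range(n_pos) for c in range(n_pos)]
    patch = np.full((patch_size, patch_size), 0.5)
    for _ in range(steps):
        # For score = sum(weights * image), d(score)/d(patch) is the weight window
        grad = np.mean([weights[r:r + patch_size, c:c + patch_size]
                        for r, c in positions], axis=0)
        patch = np.clip(patch - lr * grad, 0.0, 1.0)
    return patch

rng = np.random.default_rng(1)
weights = rng.normal(size=(8, 8))  # hypothetical linear classifier
image = rng.uniform(size=(8, 8))   # hypothetical input image
patch = optimize_patch(weights)
gray = np.full((3, 3), 0.5)

def score(img):
    return float(np.sum(weights * img))

positions = [(r, c) for r in range(6) for c in range(6)]
mean_gray = np.mean([score(paste_patch(image, gray, r, c)) for r, c in positions])
mean_patched = np.mean([score(paste_patch(image, patch, r, c)) for r, c in positions])
print(f"mean score with gray patch:      {mean_gray:.3f}")
print(f"mean score with optimized patch: {mean_patched:.3f}")
```

Averaged over placements, the optimized patch lowers the score compared with a neutral gray patch of the same size. Attacks on deep networks follow the same pattern but obtain the gradient by backpropagation and also average over lighting, scale, and viewing angle.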

In [None]:
# Simulate adversarial patch attack
def create_adversarial_patch(size=(50, 50, 3)):
    """Create a random adversarial patch"""
    # In practice, this would be optimized via gradient descent
    # Here we simulate with random pattern
    np.random.seed(42)
    patch = np.random.rand(*size)
    return patch

def apply_patch(image, patch, position):
    """Apply patch to image at given position"""
    image_with_patch = image.copy()
    x, y = position
    h, w = patch.shape[:2]
    
    # Check bounds
    if x + w <= image.shape[1] and y + h <= image.shape[0]:
        image_with_patch[y:y+h, x:x+w] = patch
    
    return image_with_patch

# Create synthetic traffic sign (convert to uint8 for cv2)
sign_image = (np.ones((224, 224, 3)) * 0.8 * 255).astype(np.uint8)
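# Draw a red octagon (stop sign shape); colors are RGB because matplotlib displays the array
angles = np.deg2rad(22.5 + 45 * np.arange(8))
octagon = np.stack([112 + 80 * np.cos(angles), 112 + 80 * np.sin(angles)], axis=1).astype(np.int32)
cv2.fillPoly(sign_image, [octagon], (200, 25, 25))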
cv2.putText(sign_image, 'STOP', (70, 125), cv2.FONT_HERSHEY_SIMPLEX, 1.5, (255, 255, 255), 4)

# Convert back to float for display
sign_image = sign_image.astype(np.float64) / 255.0

# Create adversarial patch
adv_patch = create_adversarial_patch((40, 40, 3))

# Apply patch at different positions
fig, axes = plt.subplots(1, 4, figsize=(16, 4))

positions = [(20, 20), (160, 20), (90, 140), (90, 90)]
for i, pos in enumerate(positions):
    patched_sign = apply_patch(sign_image, adv_patch, pos)
    
    axes[i].imshow(patched_sign)
    axes[i].set_title(f'Patch at {pos}\n"Speed Limit 45" (simulated)', 
                      fontsize=10, fontweight='bold', color='red')
    axes[i].axis('off')
    
    # Highlight patch
    rect = plt.Rectangle(pos, 40, 40, fill=False, edgecolor='yellow', linewidth=3)
    axes[i].add_patch(rect)

plt.tight_layout()
plt.show()

print("⚠️ Physical Attack Scenario:")
print("   Attacker places small stickers on stop sign")
print("   AV perception system misclassifies sign")
print("   Vehicle fails to stop → Potential collision!")
print("\n💡 Defense: Anomaly detection, multiple viewpoints, sensor fusion")

---

## 5️⃣ Sensor Spoofing Attacks

Beyond camera attacks, adversaries can target other sensors:

### LiDAR Spoofing
- **Attack:** Fire timed laser pulses at the LiDAR receiver to inject fake returns
- **Result:** Phantom objects appear in the point cloud
- **Example:** Make the AV "see" a pedestrian that doesn't exist → Emergency brake

### Camera Blinding
- **Attack:** Use laser to saturate camera sensor
- **Result:** Temporary blindness
- **Danger:** Critical objects not detected

### GPS Spoofing
- **Attack:** Broadcast fake GPS signals
- **Result:** Vehicle thinks it's in wrong location
- **Danger:** Wrong map data, navigation errors

### Radar Jamming
- **Attack:** Broadcast interference on radar frequency
- **Result:** Radar sees noise instead of objects
- **Danger:** Miss vehicles, pedestrians
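
Some of these attacks can be caught with simple physical plausibility checks. As one example, the sketch below rejects a GPS fix that the vehicle could not have reached since the previous fix. The speed limit, margin, and function name are hypothetical values for a low-speed shuttle, and positions are assumed to be in metres in a local planar frame.

```python
import math

def gps_jump_is_plausible(prev_fix, new_fix, dt_s, max_speed_mps=15.0, margin_m=5.0):
    """Check whether the distance between two GPS fixes is physically reachable.

    prev_fix, new_fix: (x, y) positions in metres in a local planar frame.
    dt_s: time between the two fixes in seconds.
    """
    distance = math.dist(prev_fix, new_fix)
    return distance <= max_speed_mps * dt_s + margin_m

print(gps_jump_is_plausible((0.0, 0.0), (8.0, 6.0), dt_s=1.0))      # 10 m in 1 s
print(gps_jump_is_plausible((0.0, 0.0), (300.0, 400.0), dt_s=1.0))  # 500 m in 1 s
```

A check like this only catches abrupt jumps. A spoofer that drifts the position slowly would pass it, which is why comparison against an independent source such as inertial navigation is still needed.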

In [None]:
# Visualize sensor spoofing attacks
sensor_attacks = pd.DataFrame({
    'Sensor': ['Camera', 'Camera', 'LiDAR', 'LiDAR', 'Radar', 'GPS'],
    'Attack_Type': ['Adversarial patch', 'Laser blinding', 'Spoofing', 'Interference', 'Jamming', 'Spoofing'],
    'Difficulty': ['Medium', 'Easy', 'Hard', 'Medium', 'Medium', 'Hard'],
    'Impact': ['Misclassification', 'Temporary blindness', 'Phantom objects', 'Lost data', 'Lost data', 'Wrong location'],
    'Risk': ['High', 'Critical', 'Critical', 'High', 'High', 'High'],
    'Countermeasure': [
        'Adversarial training, ensemble',
        'Filter detection, redundancy',
        'Signal authentication, anomaly detection',
        'Frequency hopping, shielding',
        'Frequency diversity, sensor fusion',
        'INS backup, plausibility checks'
    ]
})

display(sensor_attacks)

# Visualize attack surface
fig, ax = plt.subplots(figsize=(12, 6))

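attack_counts = sensor_attacks.groupby('Sensor').size()
risk_levels = sensor_attacks.groupby('Sensor')['Risk'].apply(lambda x: (x == 'Critical').sum())
# groupby sorts the sensors alphabetically, so take the tick labels from the grouped index
# (.unique() keeps first-appearance order and would mislabel the bars)
sensors = attack_counts.index.tolist()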

x = np.arange(len(sensors))
width = 0.35

bars1 = ax.bar(x - width/2, attack_counts, width, label='Total Attack Types', alpha=0.8, color='orange')
bars2 = ax.bar(x + width/2, risk_levels, width, label='Critical Risk Attacks', alpha=0.8, color='red')

ax.set_xlabel('Sensor Type', fontsize=12, fontweight='bold')
ax.set_ylabel('Number of Attack Types', fontsize=12, fontweight='bold')
ax.set_title('AV Sensor Attack Surface', fontsize=14, fontweight='bold')
ax.set_xticks(x)
ax.set_xticklabels(sensors)
ax.legend()
ax.grid(axis='y', alpha=0.3)

plt.tight_layout()
plt.show()

print("\n⚠️ Key Insight: All sensors vulnerable to attacks!")
print("   Defense: Multi-sensor fusion + anomaly detection + authentication")

---

## 6️⃣ Defense Mechanisms

### Defense 1: Adversarial Training

**Idea:** Train model on adversarial examples

**Procedure:**
1. Generate adversarial examples during training
2. Include in training dataset
3. Model learns to be robust

**Pros:**
- Effective against known attacks
- Improves model robustness

**Cons:**
- Slower training
- May reduce accuracy on clean data
- Doesn't defend against novel attacks
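
Adversarial training is commonly written as a min-max problem, the formulation used by Madry et al. (2018), the PGD reference above. With model parameters $\theta$, training distribution $\mathcal{D}$, perturbation $\delta$, and the same $\epsilon$ and loss $L$ as before (the subscript $\theta$ marks the dependence on the model):

$$\min_{\theta} \; \mathbb{E}_{(x, y) \sim \mathcal{D}} \left[ \max_{\|\delta\|_\infty \le \epsilon} L_\theta(x + \delta,\, y) \right]$$

The inner maximization is what FGSM (one step) or PGD (several steps) approximates; the outer minimization is the usual training update.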

In [None]:
# Simulate adversarial training (conceptual)
def adversarial_training_epoch(model, train_loader, optimizer, epsilon=0.03):
    """
    One epoch of adversarial training
    
    In each batch:
    1. Generate adversarial examples
    2. Train on both clean and adversarial
    """
    model.train()
    total_loss = 0
    
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        
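        # Generate adversarial examples in eval mode so the attack's forward pass
        # does not update BatchNorm running statistics, then switch back to training
        model.eval()
        adv_images = fgsm_attack(images, epsilon, model)
        model.train()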
        
        # Combine clean and adversarial
        combined_images = torch.cat([images, adv_images], dim=0)
        combined_labels = torch.cat([labels, labels], dim=0)
        
        # Forward pass
        outputs = model(normalize(combined_images))
        loss = F.cross_entropy(outputs, combined_labels)
        
        # Backward pass
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        total_loss += loss.item()
    
    return total_loss / len(train_loader)

print("✅ Adversarial training procedure defined")
print("\n💡 In production:")
print("   - Train for multiple epochs with adversarial examples")
print("   - Use stronger attacks (PGD) for training")
print("   - Balance clean and adversarial accuracy")

### Defense 2: Input Validation

**Idea:** Detect adversarial inputs before feeding to model

**Methods:**
1. **Statistical tests:** Check for unusual pixel patterns
2. **Reconstruction:** Use autoencoder to "clean" input
3. **Ensemble disagreement:** Multiple models should agree
4. **Feature squeezing:** Reduce color depth, spatial resolution
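
Method 4, feature squeezing, can be sketched in a few lines: reduce the bit depth of the input and flag it if the prediction changes noticeably. The example below is an idealized toy in which the clean image already lies on the coarse color grid and the classifier is a fixed random linear layer, both hypothetical stand-ins. In practice the detection threshold would be tuned on validation data.

```python
import numpy as np

def squeeze_bit_depth(image, bits=3):
    """Quantize pixel values in [0, 1] to 2**bits levels."""
    levels = 2 ** bits - 1
    return np.round(image * levels) / levels

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def squeezing_score(image, predict_proba, bits=3):
    """L1 distance between predictions on the original and the squeezed input."""
    squeezed = squeeze_bit_depth(image, bits)
    return float(np.abs(predict_proba(image) - predict_proba(squeezed)).sum())

# Hypothetical stand-in classifier: fixed random linear layer + softmax
rng = np.random.default_rng(0)
W = 0.2 * rng.normal(size=(3, 64))

def predict_proba(img):
    return softmax(W @ img.ravel())

clean = squeeze_bit_depth(rng.uniform(size=(8, 8)))  # already on the coarse grid
perturbed = np.clip(clean + 0.03 * np.sign(W[0]).reshape(8, 8), 0.0, 1.0)

print("clean score:    ", squeezing_score(clean, predict_proba))
print("perturbed score:", squeezing_score(perturbed, predict_proba))
```

Squeezing removes the small perturbation (0.03 is less than half a quantization step), so the prediction on the perturbed input moves when squeezed while the clean prediction does not.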

**Example: Simple pixel statistics check**

In [None]:
# Simple input validation detector
def detect_adversarial_simple(image, clean_images):
    """
    Simple adversarial detector based on pixel statistics
    
    Check if image statistics deviate from clean distribution
    """
    # Compute statistics
    img_mean = image.mean().item()
    img_std = image.std().item()
    img_max = image.max().item()
    img_min = image.min().item()
    
    # Compare to clean distribution
    clean_mean = clean_images.mean().item()
    clean_std = clean_images.std().item()
    
    # Simple threshold-based detection
    mean_diff = abs(img_mean - clean_mean)
    std_diff = abs(img_std - clean_std)
    
    # Flag if statistics deviate significantly
    threshold = 0.1
    is_adversarial = (mean_diff > threshold) or (std_diff > threshold)
    
    return is_adversarial, {'mean_diff': mean_diff, 'std_diff': std_diff}

# Test detector
clean_batch = torch.rand(10, 3, 224, 224).to(device) * 0.5 + 0.25

# Test on clean vs adversarial
is_adv_clean, stats_clean = detect_adversarial_simple(test_image, clean_batch)
is_adv_fgsm, stats_fgsm = detect_adversarial_simple(fgsm_adv, clean_batch)
is_adv_pgd, stats_pgd = detect_adversarial_simple(pgd_adv, clean_batch)

print("Input Validation Results:\n" + "="*50)
print(f"Clean image: {'⚠️ Flagged' if is_adv_clean else '✅ Passed'}")
print(f"  Stats: {stats_clean}")
print(f"\nFGSM adversarial: {'⚠️ Flagged' if is_adv_fgsm else '✅ Passed'}")
print(f"  Stats: {stats_fgsm}")
print(f"\nPGD adversarial: {'⚠️ Flagged' if is_adv_pgd else '✅ Passed'}")
print(f"  Stats: {stats_pgd}")

print("\n💡 Note: Real detectors use more sophisticated methods:")
print("   - Deep learning-based detectors")
print("   - Autoencoder reconstruction error")
print("   - Ensemble model agreement")

### Defense 3: Sensor Fusion

**Idea:** Multiple sensors provide redundancy

**Strategy:**
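- Camera may be fooled by an adversarial patch
- BUT LiDAR is unaffected by the printed pattern and still measures the sign's position and octagonal outline (it cannot read the sign face)
- Fusion logic: Treat a camera/LiDAR disagreement as an anomaly and fall back to the conservative action (stop)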

**Benefit:** Attacking multiple sensors simultaneously is much harder
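
A minimal sketch of such a disagreement rule is shown below. It assumes, hypothetically, that the camera pipeline outputs a class string and that the LiDAR pipeline outputs a coarse shape label for the detected sign; the function name and return format are also assumptions.

```python
def fuse_sign_detection(camera_label, lidar_shape):
    """Combine camera class and LiDAR shape into (action, anomaly_flag).

    camera_label: class string from the camera classifier, e.g. "stop"
    lidar_shape: coarse outline from LiDAR, e.g. "octagon", "circle", "rectangle"
    """
    camera_says_stop = camera_label == "stop"
    lidar_says_stop = lidar_shape == "octagon"  # stop signs are the octagonal ones

    if camera_says_stop and lidar_says_stop:
        return "stop", False   # sensors agree
    if camera_says_stop or lidar_says_stop:
        return "stop", True    # disagreement: act conservatively and flag it
    return "proceed", False

print(fuse_sign_detection("speed_limit_45", "octagon"))
print(fuse_sign_detection("stop", "octagon"))
print(fuse_sign_detection("speed_limit_45", "circle"))
```

The first call models the patched stop sign: the camera is fooled, the outline still says octagon, so the vehicle stops and the event is flagged for monitoring.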

### Defense 4: Certified Defenses

**Idea:** Provable robustness guarantees

**Methods:**
- **Randomized smoothing:** Classify many noisy copies of the input and take the majority vote
- **Interval bound propagation:** Compute guaranteed bounds
- **Lipschitz constraints:** Limit model sensitivity

**Trade-off:** Often reduce model accuracy for guaranteed robustness
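
Randomized smoothing is simple enough to sketch directly. The example below takes a majority vote over Gaussian-noised copies of an input and converts the vote fraction into an L2 radius with the formula $\sigma \, \Phi^{-1}(p)$ from Cohen et al. (2019). The base classifier is a hypothetical threshold rule, and the radius uses the raw vote fraction instead of a statistical lower bound, so the number is indicative only and not a certificate.

```python
import numpy as np
from statistics import NormalDist

def smoothed_predict(x, base_classifier, sigma=0.25, n_samples=1000, seed=0):
    """Majority vote of base_classifier over Gaussian-noised copies of x.

    Returns (winning class, vote fraction of the winning class).
    """
    rng = np.random.default_rng(seed)
    noisy = x + rng.normal(scale=sigma, size=(n_samples,) + x.shape)
    votes = np.array([base_classifier(sample) for sample in noisy])
    classes, counts = np.unique(votes, return_counts=True)
    top = counts.argmax()
    return int(classes[top]), float(counts[top] / n_samples)

def l2_radius(p_top, sigma):
    """Indicative L2 radius sigma * Phi^{-1}(p_top); zero if there is no clear majority."""
    if p_top <= 0.5 or p_top >= 1.0:
        return 0.0 if p_top <= 0.5 else float("inf")
    return sigma * NormalDist().inv_cdf(p_top)

def base(img):
    """Hypothetical base classifier: class 1 if the mean pixel value exceeds 0.5."""
    return int(img.mean() > 0.5)

x = np.full((8, 8), 0.53)
label, p_top = smoothed_predict(x, base)
radius = l2_radius(p_top, sigma=0.25)
print(f"class={label}, vote fraction={p_top:.3f}, indicative L2 radius={radius:.3f}")
```

A larger `sigma` can increase the radius but also makes each noisy copy harder to classify, which is the accuracy trade-off noted above.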

---

## 7️⃣ ISO/SAE 21434: Cybersecurity Standard

**ISO/SAE 21434:2021** - Road vehicles - Cybersecurity engineering

### Key Requirements

**1. Threat Analysis and Risk Assessment (TARA):**
- Identify attack vectors
- Assess likelihood and impact
- Prioritize mitigation

**2. Security-by-Design:**
- Security considered from concept phase
- Secure architecture
- Regular security reviews

**3. Security Testing:**
- Penetration testing
- Vulnerability scanning
- Adversarial attack simulation

**4. Incident Response:**
- Monitor for attacks in deployment
- Rapid response to discovered vulnerabilities
- OTA updates for patches

### Application to Adversarial Attacks

**Threat:** Adversarial examples on traffic signs

**Risk Assessment:**
- **Likelihood:** Medium (requires physical access to the sign; ISO/SAE 21434 calls this rating *attack feasibility*)
- **Impact:** Critical (collision)
- **Risk:** HIGH

**Mitigation:**
1. Adversarial training (design)
2. Input validation (runtime)
3. Sensor fusion (redundancy)
4. Anomaly detection (monitoring)
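
The step from likelihood and impact to a risk level can be made explicit with a small lookup. The numeric scores and thresholds below are hypothetical, chosen only so that the function reproduces the ratings used in this notebook's examples; the standard does not prescribe these numbers.

```python
# Hypothetical ordinal scores for the two ratings
LIKELIHOOD = {"Low": 1, "Medium": 2, "High": 3}
IMPACT = {"Low": 1, "Medium": 2, "High": 3, "Critical": 4}

def risk_level(likelihood, impact):
    """Map a likelihood/impact pair to LOW, MEDIUM, or HIGH."""
    score = LIKELIHOOD[likelihood] * IMPACT[impact]
    if score >= 8:
        return "HIGH"
    if score >= 3:
        return "MEDIUM"
    return "LOW"

print(risk_level("Medium", "Critical"))  # adversarial patch on a stop sign
print(risk_level("Low", "Critical"))     # e.g. LiDAR spoofing
print(risk_level("Low", "High"))         # e.g. GPS spoofing
```

Writing the mapping down as code makes the assessment repeatable and easy to review when likelihood estimates change.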

In [None]:
# ISO/SAE 21434 Threat Analysis Example
threats = pd.DataFrame({
    'Threat': [
        'Adversarial patch on stop sign',
        'LiDAR spoofing',
        'Camera laser blinding',
        'GPS spoofing',
        'CAN bus injection',
        'OTA update compromise'
    ],
    'Attack_Vector': [
        'Physical modification',
        'Laser injection',
        'Laser attack',
        'RF spoofing',
        'Physical access to bus',
        'Network attack'
    ],
    'Likelihood': ['Medium', 'Low', 'Medium', 'Low', 'Low', 'Low'],
    'Impact': ['Critical', 'Critical', 'Critical', 'High', 'Critical', 'Critical'],
    'Risk_Level': ['HIGH', 'MEDIUM', 'HIGH', 'MEDIUM', 'MEDIUM', 'MEDIUM'],
    'Mitigation_Status': ['⚠️ Partial', '❌ None', '⚠️ Partial', '✅ Yes', '✅ Yes', '✅ Yes'],
    'Required_Actions': [
        'Deploy adversarial training',
        'Add signal authentication',
        'Add bright light detection',
        'Strengthen INS backup',
        'Already secured',
        'Already secured'
    ]
})

display(threats)

# Priority matrix
high_risk = threats[threats['Risk_Level'] == 'HIGH']
not_mitigated = high_risk[high_risk['Mitigation_Status'] != '✅ Yes']

print("\n🚨 HIGH PRIORITY SECURITY ACTIONS:\n" + "="*60)
for idx, row in not_mitigated.iterrows():
    print(f"\nThreat: {row['Threat']}")
    print(f"  Risk Level: {row['Risk_Level']}")
    print(f"  Current Status: {row['Mitigation_Status']}")
    print(f"  Required Action: {row['Required_Actions']}")

print("\n" + "="*60)
print("💡 ISO/SAE 21434 requires documented risk assessment and mitigation plan")

---

## ✏️ Exercise: Design Security Requirements

**Task:** For the campus shuttle, design security requirements against adversarial attacks.

**Consider:**
1. What are the attack vectors?
2. Which attacks are most feasible?
3. What defenses would you implement?
4. How would you test adversarial robustness?
5. What monitoring is needed in deployment?

In [None]:
# TODO: Complete your security design

your_security_design = pd.DataFrame({
    'Threat': [
        # Identify threats for campus shuttle
        # Example: 'Adversarial sticker on stop sign at campus entrance'
    ],
    'Likelihood': [],  # Low, Medium, High
    'Impact': [],      # Low, Medium, High, Critical
    'Defense': [],     # Your proposed defense
    'Testing': []      # How to test this defense
})

if len(your_security_design) > 0:
    display(your_security_design)
else:
    print("Add your security analysis above!")

print("\n🤔 Reflection Questions:")
print("   1. How do you balance security and performance?")
print("   2. What's the cost of false positives (flagging clean inputs)?")
print("   3. How do you update defenses as new attacks emerge?")
print("   4. What role does sensor fusion play in security?")

---

## 🎯 Key Takeaways

### Adversarial Attacks are Real
- **Digital attacks:** FGSM, PGD cause misclassification with tiny perturbations
- **Physical attacks:** Adversarial patches work in real world
- **Sensor attacks:** All sensors (camera, LiDAR, radar, GPS) vulnerable

### Attack Characteristics
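- **Often imperceptible:** Digital perturbations can be too small for humans to notice; physical patches are visible but can pass for ordinary stickers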
- **Transferable:** Attacks work across different models
- **Feasible:** Physical attacks demonstrated in research
- **Safety-critical:** Can cause accidents in AVs

### Defense Strategies
1. **Adversarial training:** Train on adversarial examples
2. **Input validation:** Detect adversarial inputs
3. **Sensor fusion:** Multiple sensors provide redundancy
4. **Certified defenses:** Provable robustness guarantees

### ISO/SAE 21434 Compliance
- **Threat analysis:** Identify and assess attack vectors
- **Security-by-design:** Build in defenses from start
- **Testing:** Simulate adversarial attacks
- **Monitoring:** Detect attacks in deployment

### Best Practices
1. ✅ **Defense-in-depth:** Multiple layers of security
2. ✅ **Sensor diversity:** Don't rely on single sensor
3. ✅ **Continuous testing:** Regular penetration testing
4. ✅ **Rapid response:** OTA updates for vulnerabilities
5. ✅ **Monitoring:** Detect anomalous inputs in deployment

### Open Research Questions
- How to guarantee robustness without sacrificing accuracy?
- Can we detect all adversarial examples?
- What about attacks we haven't thought of yet?

---

## 🎓 Session 2 Complete!

**Congratulations!** You've completed Session 2 on Failure Modes and Edge Cases.

**What we covered:**
1. ✅ Real-world AV failure case studies
2. ✅ Out-of-distribution detection methods
3. ✅ Corner cases and long-tail scenarios
4. ✅ Adversarial attacks and defenses

**Next:** Session 3 - Safety Standards and Verification
- ISO 26262 (Functional Safety)
- ISO 21448 (SOTIF)
- ISO/SAE 21434 (Cybersecurity)
- Verification and validation methods

---

*Notebook created by Milin Patel | Hochschule Kempten*  
*Last updated: 2025-01-18*