# DetectVoice Adversarial Suite - Robustness Demo

This notebook demonstrates:
1. Initializing a detector model
2. Generating adversarial examples
3. Evaluating robustness
4. Visualizing results

⚠️ **ETHICS NOTICE**: This demo is for DEFENSIVE research only.

In [None]:
import sys
from pathlib import Path

# Add project root to path
sys.path.append(str(Path.cwd().parent))

import torch
import numpy as np
import matplotlib.pyplot as plt

from src.models.cnn.detector import CNNDetector
from src.attacks import FGSM, PGD
from src.utils.audio import AudioFeatureExtractor

print("✓ Imports successful")

## 1. Initialize Model

In [None]:
# Create detector model
detector = CNNDetector(input_channels=1, num_classes=2, dropout=0.5)
detector.eval()

# NOTE: In practice, load trained weights here:
# checkpoint = torch.load('path/to/checkpoint.pt', map_location='cpu')
# detector.load_state_dict(checkpoint['model_state_dict'])

print(f"Model parameters: {sum(p.numel() for p in detector.parameters()):,}")
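The commented-out loading code above expects a checkpoint dictionary with a `model_state_dict` key. The following is a minimal sketch of that save/load round trip, not the project's actual checkpoint format: `toy_model` is a hypothetical stand-in for `CNNDetector`, and the file name is illustrative.

```python
import tempfile
from pathlib import Path

import torch
from torch import nn

# Hypothetical stand-in for the detector; any nn.Module is handled the same way.
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(128 * 94, 2))

with tempfile.TemporaryDirectory() as tmp:
    ckpt_path = Path(tmp) / "checkpoint.pt"

    # Save the weights under the key the loading code looks up.
    torch.save({"model_state_dict": toy_model.state_dict()}, ckpt_path)

    # map_location="cpu" lets a GPU-trained checkpoint load on a CPU-only machine.
    checkpoint = torch.load(ckpt_path, map_location="cpu")

# Rebuild the same architecture, then copy the saved weights into it.
restored = nn.Sequential(nn.Flatten(), nn.Linear(128 * 94, 2))
restored.load_state_dict(checkpoint["model_state_dict"])
restored.eval()  # inference mode: dropout is disabled
```

Calling `eval()` after loading matters for a detector that uses dropout, since attacks and accuracy measurements would otherwise vary from run to run.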

## 2. Create Dummy Data

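For demonstration, we use random tensors shaped like mel-spectrograms (128 frequency bins × 94 time frames).
Because the inputs are random and the model above is untrained, the accuracy figures below only exercise the pipeline; use real audio data and trained weights for meaningful results.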

In [None]:
# Create dummy mel-spectrograms
batch_size = 8
freq_bins = 128
time_bins = 94

# Dummy inputs (random spectrograms)
inputs = torch.randn(batch_size, freq_bins, time_bins)

# Dummy labels (4 real, 4 fake)
labels = torch.cat([torch.ones(4), torch.zeros(4)]).long()

print(f"Input shape: {inputs.shape}")
print(f"Labels: {labels}")
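When swapping in real audio, the spectrogram shape has to match what the detector expects. The notebook does not state which audio settings produce 94 time frames; the sketch below shows one combination that would, using the frame-count rule for a centered STFT. The sample rate, clip duration, and hop length are assumptions chosen for illustration.

```python
def num_stft_frames(num_samples: int, hop_length: int) -> int:
    """Frame count of a centered STFT (center=True in torch.stft and librosa)."""
    return 1 + num_samples // hop_length


# All three values are assumptions, not settings taken from the project.
sample_rate = 16_000  # Hz
duration_s = 3.0      # seconds per clip
hop_length = 512      # samples between frames

frames = num_stft_frames(int(sample_rate * duration_s), hop_length)
print(frames)  # 94
```

Other combinations give the same frame count, so check the project's feature extractor configuration before relying on these numbers.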

## 3. Clean Prediction

In [None]:
# Get clean predictions
with torch.no_grad():
    outputs = detector(inputs)
    probs = torch.softmax(outputs, dim=1)
    preds = outputs.argmax(dim=1)

clean_accuracy = (preds == labels).float().mean().item()

print(f"Clean Accuracy: {clean_accuracy:.2%}")
print(f"Predictions: {preds}")
print(f"Ground Truth: {labels}")

## 4. FGSM Attack

In [None]:
# Create FGSM attack
fgsm = FGSM(model=detector, epsilon=0.03)

# Generate adversarial examples
adv_inputs_fgsm, fgsm_metrics = fgsm.generate(inputs, labels)

print("FGSM Metrics:")
print(f"  L2 Norm: {fgsm_metrics['l2_norm']:.4f}")
print(f"  L-inf Norm: {fgsm_metrics['linf_norm']:.4f}")
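For readers unfamiliar with FGSM, the following is a minimal sketch of the standard untargeted attack: a single step of size `epsilon` along the sign of the loss gradient with respect to the input. It is not the implementation in `src.attacks`; `fgsm_sketch` and `toy_model` are hypothetical names, and the norms printed here may be aggregated differently from the `l2_norm` and `linf_norm` values the suite reports.

```python
import torch
import torch.nn.functional as F
from torch import nn


def fgsm_sketch(model, x, y, epsilon):
    """Untargeted FGSM: one signed-gradient step of size epsilon."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    (grad,) = torch.autograd.grad(loss, x)  # gradient w.r.t. the input only
    return (x + epsilon * grad.sign()).detach()


torch.manual_seed(0)
# Hypothetical stand-in model so the sketch runs on its own.
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(128 * 94, 2)).eval()
x = torch.randn(8, 128, 94)
y = torch.cat([torch.ones(4), torch.zeros(4)]).long()

x_adv = fgsm_sketch(toy_model, x, y, epsilon=0.03)
delta = x_adv - x

with torch.no_grad():
    clean_loss = F.cross_entropy(toy_model(x), y).item()
    adv_loss = F.cross_entropy(toy_model(x_adv), y).item()

# The L-inf norm is bounded by epsilon; L2 is averaged over the batch here.
print(f"L-inf norm: {delta.abs().max().item():.4f}")
print(f"Mean L2 norm: {delta.flatten(1).norm(dim=1).mean().item():.4f}")
print(f"Loss: {clean_loss:.4f} -> {adv_loss:.4f}")
```

Because every input element moves by at most `epsilon`, the L-inf norm is the natural budget for this attack, while the L2 norm grows with the number of spectrogram bins.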

In [None]:
# Evaluate on adversarial examples
with torch.no_grad():
    adv_outputs = detector(adv_inputs_fgsm)
    adv_preds = adv_outputs.argmax(dim=1)

fgsm_accuracy = (adv_preds == labels).float().mean().item()

print(f"FGSM Adversarial Accuracy: {fgsm_accuracy:.2%}")
print(f"Robustness Drop: {clean_accuracy - fgsm_accuracy:.2%}")
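The accuracy drop above mixes two things: samples the model already got wrong and samples the attack actually flipped. A complementary metric is the attack success rate, the fraction of originally correct predictions that become incorrect. The sketch below uses made-up prediction tensors purely for illustration; `attack_success_rate` is a hypothetical helper, not part of the suite.

```python
import torch


def attack_success_rate(clean_preds, adv_preds, labels):
    """Fraction of initially correct predictions that the attack flips."""
    correct_before = clean_preds == labels
    n_correct = correct_before.sum().item()
    if n_correct == 0:
        return float("nan")  # nothing to attack
    flipped = correct_before & (adv_preds != labels)
    return flipped.sum().item() / n_correct


# Illustrative values only, not results from the detector.
labels = torch.tensor([1, 1, 1, 1, 0, 0, 0, 0])
clean_preds = torch.tensor([1, 1, 0, 1, 0, 0, 1, 0])  # 6 of 8 correct
adv_preds = torch.tensor([0, 1, 0, 0, 0, 1, 1, 0])    # 3 of those 6 flipped

rate = attack_success_rate(clean_preds, adv_preds, labels)
print(f"Attack success rate: {rate:.2%}")  # 50.00%
```

In the notebook, the clean predictions are held in `preds` and the adversarial ones in `adv_preds`, so the same helper could be applied to them directly.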

## 5. PGD Attack

In [None]:
# Create PGD attack
pgd = PGD(model=detector, epsilon=0.03, alpha=0.01, num_iter=10)

# Generate adversarial examples
adv_inputs_pgd, pgd_metrics = pgd.generate(inputs, labels)

print("PGD Metrics:")
print(f"  L2 Norm: {pgd_metrics['l2_norm']:.4f}")
print(f"  L-inf Norm: {pgd_metrics['linf_norm']:.4f}")
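PGD can be read as FGSM applied repeatedly with a smaller step `alpha`, projecting back into the `epsilon` ball around the original input after each step. The following is a minimal sketch under that reading, not the code in `src.attacks`: `pgd_sketch` and `toy_model` are hypothetical, and the sketch omits the random start that many PGD implementations use.

```python
import torch
import torch.nn.functional as F
from torch import nn


def pgd_sketch(model, x, y, epsilon, alpha, num_iter):
    """Untargeted L-inf PGD without a random start."""
    x_orig = x.clone().detach()
    x_adv = x_orig.clone()
    for _ in range(num_iter):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        (grad,) = torch.autograd.grad(loss, x_adv)
        # Signed-gradient ascent step of size alpha.
        x_adv = x_adv.detach() + alpha * grad.sign()
        # Project back so that |x_adv - x_orig| <= epsilon element-wise.
        x_adv = x_orig + (x_adv - x_orig).clamp(-epsilon, epsilon)
    return x_adv.detach()


torch.manual_seed(0)
# Hypothetical stand-in model so the sketch runs on its own.
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(128 * 94, 2)).eval()
x = torch.randn(8, 128, 94)
y = torch.cat([torch.ones(4), torch.zeros(4)]).long()

x_adv = pgd_sketch(toy_model, x, y, epsilon=0.03, alpha=0.01, num_iter=10)

with torch.no_grad():
    clean_loss = F.cross_entropy(toy_model(x), y).item()
    adv_loss = F.cross_entropy(toy_model(x_adv), y).item()

print(f"L-inf norm: {(x_adv - x).abs().max().item():.4f}")
print(f"Loss: {clean_loss:.4f} -> {adv_loss:.4f}")
```

With `alpha=0.01` and `num_iter=10`, the total step budget (0.1) exceeds `epsilon` (0.03), so the projection is what keeps the perturbation within the same bound used for FGSM.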

In [None]:
# Evaluate on adversarial examples
with torch.no_grad():
    adv_outputs = detector(adv_inputs_pgd)
    adv_preds = adv_outputs.argmax(dim=1)

pgd_accuracy = (adv_preds == labels).float().mean().item()

print(f"PGD Adversarial Accuracy: {pgd_accuracy:.2%}")
print(f"Robustness Drop: {clean_accuracy - pgd_accuracy:.2%}")

## 6. Visualization

In [None]:
# Visualize clean vs adversarial spectrograms
sample_idx = 0

fig, axes = plt.subplots(1, 4, figsize=(16, 4))

# Clean
axes[0].imshow(inputs[sample_idx].numpy(), aspect='auto', origin='lower')
axes[0].set_title('Clean Spectrogram')
axes[0].set_xlabel('Time')
axes[0].set_ylabel('Frequency')

# FGSM
axes[1].imshow(adv_inputs_fgsm[sample_idx].detach().cpu().numpy(), aspect='auto', origin='lower')
axes[1].set_title('FGSM Adversarial')
axes[1].set_xlabel('Time')

# PGD
axes[2].imshow(adv_inputs_pgd[sample_idx].detach().cpu().numpy(), aspect='auto', origin='lower')
axes[2].set_title('PGD Adversarial')
axes[2].set_xlabel('Time')

# Perturbation (FGSM)
perturbation = (adv_inputs_fgsm[sample_idx] - inputs[sample_idx]).detach().cpu().numpy()
axes[3].imshow(perturbation, aspect='auto', origin='lower', cmap='RdBu')
axes[3].set_title('FGSM Perturbation')
axes[3].set_xlabel('Time')

plt.tight_layout()
plt.show()

## 7. Robustness Summary

In [None]:
# Summary plot
attacks = ['Clean', 'FGSM', 'PGD']
accuracies = [clean_accuracy, fgsm_accuracy, pgd_accuracy]

plt.figure(figsize=(8, 5))
bars = plt.bar(attacks, accuracies, color=['green', 'orange', 'red'])
plt.ylabel('Accuracy')
plt.title('Model Robustness Against Adversarial Attacks')
plt.ylim([0, 1])
plt.grid(axis='y', alpha=0.3)

# Add value labels on bars
for bar, acc in zip(bars, accuracies):
    height = bar.get_height()
    plt.text(bar.get_x() + bar.get_width()/2., height,
             f'{acc:.1%}',
             ha='center', va='bottom')

plt.tight_layout()
plt.show()

## Conclusion

This notebook demonstrated:

1. ✓ Initializing a detector model
2. ✓ Generating FGSM and PGD adversarial examples
3. ✓ Evaluating model robustness
4. ✓ Visualizing attacks and perturbations

**Next Steps:**
- Use real audio data
- Train models with adversarial training
- Evaluate with comprehensive robustness suite
- Export models for deployment
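As a starting point for the adversarial training step listed above, here is a minimal sketch of one common recipe: craft FGSM examples on the fly and train on an equal mix of clean and adversarial loss. The helper names, the stand-in model, the 50/50 weighting, and the optimizer settings are all assumptions for illustration, not the suite's training procedure.

```python
import torch
import torch.nn.functional as F
from torch import nn


def fgsm_perturb(model, x, y, epsilon):
    """One signed-gradient step; parameter gradients are left untouched."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    (grad,) = torch.autograd.grad(loss, x)
    return (x + epsilon * grad.sign()).detach()


def adversarial_training_step(model, optimizer, x, y, epsilon=0.03):
    """Update the model once on a mix of clean and FGSM-perturbed inputs."""
    model.eval()  # craft the attack with dropout disabled
    x_adv = fgsm_perturb(model, x, y, epsilon)

    model.train()
    optimizer.zero_grad()
    loss = 0.5 * (F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y))
    loss.backward()
    optimizer.step()
    return loss.item()


torch.manual_seed(0)
# Hypothetical stand-in model and dummy batch, matching the shapes used above.
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(128 * 94, 2))
optimizer = torch.optim.SGD(toy_model.parameters(), lr=0.01)
x = torch.randn(8, 128, 94)
y = torch.cat([torch.ones(4), torch.zeros(4)]).long()

weights_before = toy_model[1].weight.detach().clone()
losses = [adversarial_training_step(toy_model, optimizer, x, y) for _ in range(5)]
print(f"Mixed loss per step: {[round(l, 4) for l in losses]}")
```

Training against PGD examples instead of FGSM is a frequent variation; it costs more per step but targets a stronger attack.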

⚠️ **Remember**: Use this tool ethically and responsibly for defensive research only.