# AI Model Extraction Simulator
## Machine Learning Model Stealing Simulator

This notebook simulates how machine learning models can be stolen through black-box attacks.

⚠️ **Ethical Warning**: This tool is intended solely for education and security research. Unauthorized use against real systems is prohibited.

## How It Works
```
Attacker → API Queries → Victim Model → Responses → Transfer Learning → Clone Model
```
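
As a rough, self-contained illustration of this pipeline (not the project's implementation), the sketch below uses a hypothetical two-feature logistic "victim", queries it with random inputs, and fits a clone on the returned probabilities. All names here (`victim_predict`, `train_clone`, `fidelity`) and the secret weights are assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical victim: a fixed logistic model the attacker cannot inspect.
_SECRET_W, _SECRET_B = (2.0, -1.0), 0.5


def victim_predict(x):
    """Black-box API: returns only P(class 1) for an input point."""
    return sigmoid(_SECRET_W[0] * x[0] + _SECRET_W[1] * x[1] + _SECRET_B)


def train_clone(queries, soft_labels, epochs=200, lr=0.5):
    """Fit a logistic clone on the victim's probabilities with plain SGD."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, p in zip(queries, soft_labels):
            # Gradient of the cross-entropy loss with respect to the logit
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - p
            w[0] -= lr * err * x[0]
            w[1] -= lr * err * x[1]
            b -= lr * err
    return w, b


def fidelity(w, b, points):
    """Fraction of points on which clone and victim pick the same class."""
    same = sum(
        (sigmoid(w[0] * x[0] + w[1] * x[1] + b) >= 0.5) == (victim_predict(x) >= 0.5)
        for x in points
    )
    return same / len(points)


rng = random.Random(42)
queries = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]  # attacker queries
responses = [victim_predict(x) for x in queries]                           # victim responses
clone_w, clone_b = train_clone(queries, responses)                         # clone training
test_points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(500)]
clone_fidelity = fidelity(clone_w, clone_b, test_points)
print(f"clone weights: {clone_w}, bias: {clone_b:.3f}, fidelity: {clone_fidelity:.2%}")
```

The attacker never reads `_SECRET_W`; it only sees the probabilities returned for its own queries, which is what makes the setting black-box.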

In [None]:
# Import the required libraries
import sys
import os
from pathlib import Path

# Add the project root to sys.path (step up one level when run from examples/)
project_root = Path.cwd().parent if Path.cwd().name == 'examples' else Path.cwd()
sys.path.append(str(project_root))

import torch
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from tqdm import tqdm

# Project modules
from src.victim_model.victim_model import VictimModel, VictimModelAPI
from src.attacker.extraction_strategies import (
    RandomQueryStrategy, ActiveLearningStrategy, AdversarialQueryStrategy,
    ModelExtractor
)
from src.clone_model.clone_model import CloneModel
from src.utils.utils import (
    get_device, set_random_seeds, calculate_model_similarity,
    evaluate_model_performance, count_parameters
)

# Settings
plt.style.use('seaborn-v0_8')
device = get_device()
set_random_seeds(42)

print("🚀 AI Model Extraction Simulator")
print(f"Device: {device}")
print(f"PyTorch version: {torch.__version__}")

## 1. Creating the Victim Model

First, let's create the victim model we will try to steal. In the real world, such a model would sit behind a protected API.
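
The idea of a model sitting behind a protected API can be sketched as follows. This is an illustrative stand-in, not the project's `VictimModelAPI`: the class `ToyPredictionAPI`, the exception `RateLimitExceeded`, and the sign-based stand-in model are all assumptions made for this example.

```python
class RateLimitExceeded(Exception):
    """Raised when a caller exceeds the allowed number of queries."""


class ToyPredictionAPI:
    """Hypothetical black-box wrapper: exposes predictions, hides the model."""

    def __init__(self, predict_fn, rate_limit=None):
        self._predict_fn = predict_fn   # the protected model (any callable)
        self.rate_limit = rate_limit    # max total queries; None means unlimited
        self.query_count = 0

    def predict(self, batch):
        """Answer a batch of inputs, counting each input as one query."""
        requested = self.query_count + len(batch)
        if self.rate_limit is not None and requested > self.rate_limit:
            raise RateLimitExceeded(
                f"{requested} queries requested, limit is {self.rate_limit}"
            )
        self.query_count = requested
        return [self._predict_fn(x) for x in batch]


# Stand-in "model": classifies a number by its sign.
api = ToyPredictionAPI(lambda x: int(x > 0), rate_limit=5)
first = api.predict([-1.0, 2.0, 3.0])      # uses 3 of the 5 allowed queries
try:
    api.predict([1.0, 1.0, 1.0])           # would bring the total to 6
    blocked = False
except RateLimitExceeded:
    blocked = True
print(first, api.query_count, blocked)
```

Counting queries inside the wrapper is what later makes it possible to report how much of the budget an attack consumed.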

In [None]:
# Create the victim model
print("📦 Creating victim model...")
victim_model = VictimModel(
    model_type="resnet18",
    num_classes=10,  # for CIFAR-10
    pretrained=True,
    device=device
)

# Create the API wrapper
victim_api = VictimModelAPI(victim_model, rate_limit=None)

# Show model information
param_count = count_parameters(victim_model.model)
print("✅ Victim model created")
print(f"📊 Parameters: {param_count:,}")
print(f"📊 Model type: {victim_model.model_type}")
print(f"📊 Classes: {victim_model.num_classes}")

## 2. Testing the Victim Model

Let's make sure the victim model works.

In [None]:
# Create test data
test_input = torch.randn(5, 3, 32, 32)  # 5 random images

# Query the victim model
victim_response = victim_model.query(test_input, return_logits=True)

print("🧪 Victim model test:")
print(f"Input shape: {test_input.shape}")
print(f"Predictions: {victim_response['predictions']}")
print(f"Confidence scores: {victim_response['confidence']}")
print(f"Query count: {victim_model.query_count}")

## 3. Different Attack Strategies

Now let's test different query strategies.

### 3.1 Random Query Strategy
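
A random query strategy can be sketched in a few lines. The function `make_random_queries` below is hypothetical, and the exact distributions are assumptions (uniform on [0, 1), or a normal centered at 0.5 and clipped to the pixel range); the project's `RandomQueryStrategy` may sample differently.

```python
import math
import random


def make_random_queries(budget, input_shape, distribution="uniform", seed=0):
    """Generate `budget` flat random inputs with prod(input_shape) values each.

    "uniform" draws from [0, 1); "normal" draws from a normal distribution with
    mean 0.5 and standard deviation 0.25, clipped to [0, 1].
    """
    rng = random.Random(seed)
    size = math.prod(input_shape)
    queries = []
    for _ in range(budget):
        if distribution == "uniform":
            values = [rng.random() for _ in range(size)]
        elif distribution == "normal":
            values = [min(1.0, max(0.0, rng.gauss(0.5, 0.25))) for _ in range(size)]
        else:
            raise ValueError(f"unknown distribution: {distribution!r}")
        queries.append(values)
    return queries


uniform_batch = make_random_queries(budget=4, input_shape=(3, 32, 32), distribution="uniform")
normal_batch = make_random_queries(budget=4, input_shape=(3, 32, 32), distribution="normal")
print(len(uniform_batch), len(uniform_batch[0]))
```

Such queries need no knowledge of the victim's training data, which is why they are the cheapest baseline, but they also look nothing like natural images.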

In [None]:
# Random query strategy
print("🎲 Testing Random Query Strategy")

random_strategy = RandomQueryStrategy(
    data_distribution="uniform",
    input_shape=(3, 32, 32)
)

# Generate sample queries
random_queries = random_strategy.select_queries(budget=100)
print(f"Generated {len(random_queries)} random queries")
print(f"Query shape: {random_queries.shape}")
print(f"Value range: [{random_queries.min():.3f}, {random_queries.max():.3f}]")

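# Visualize the first few queries
fig, axes = plt.subplots(1, 4, figsize=(12, 3))
for i in range(4):
    # Convert from channels-first (C, H, W) to channels-last (H, W, C);
    # np.asarray lets this work whether the queries are NumPy arrays or CPU tensors
    img = np.asarray(random_queries[i]).transpose(1, 2, 0)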
    axes[i].imshow(img)
    axes[i].set_title(f"Random Query {i+1}")
    axes[i].axis('off')
plt.tight_layout()
plt.show()

### 3.2 Active Learning Strategy
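
Entropy-based uncertainty sampling, the idea behind `uncertainty_method="entropy"`, can be sketched as below. The helper names are hypothetical and the probabilities are made-up inputs for the example; how `ActiveLearningStrategy` implements the selection internally is not shown in this notebook.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(pool_probs, budget):
    """Return indices of the `budget` pool items with the highest entropy."""
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: entropy(pool_probs[i]),
                    reverse=True)
    return ranked[:budget]


# Clone-model probabilities for four candidate inputs (illustrative values)
clone_probs = [
    [0.98, 0.01, 0.01],   # confident prediction: little to learn
    [0.34, 0.33, 0.33],   # almost uniform: most informative
    [0.70, 0.20, 0.10],
    [0.50, 0.50, 0.00],
]
picked = select_most_uncertain(clone_probs, budget=2)
print(picked)
```

The attacker spends its budget on the inputs the current clone is least sure about, rather than sampling blindly.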

In [None]:
# Active learning strategy
print("🧠 Testing Active Learning Strategy")

active_strategy = ActiveLearningStrategy(
    initial_pool_size=500,
    uncertainty_method="entropy"
)

# Initial queries (no clone model yet)
active_queries = active_strategy.select_queries(budget=50)
print(f"Generated {len(active_queries)} active learning queries")
print("Strategy uses uncertainty-based selection")

## 4. The Model Extraction Process

Now let's start the actual model extraction process.

In [None]:
# Create the model extractor
print("🔓 Setting up model extractor")

# Start with the random strategy (for a faster demo)
extractor = ModelExtractor(
    query_strategy=random_strategy,
    victim_api=victim_api,
    query_budget=2000  # Small budget (for the notebook)
)

print(f"✅ Extractor ready with budget: {extractor.query_budget:,}")

In [None]:
# Extract knowledge
print("🕳️ Extracting knowledge from victim model...")

stolen_queries, stolen_responses = extractor.extract_knowledge(batch_size=64)

# Show statistics
extraction_stats = extractor.get_extraction_statistics()
print("\n📊 Extraction Statistics:")
for key, value in extraction_stats.items():
    print(f"  {key}: {value}")

print(f"\n✅ Collected {len(stolen_queries):,} query-response pairs")
print(f"📊 Data shape: {stolen_queries.shape} → {stolen_responses.shape}")

## 5. Creating and Training the Clone Model

Let's train the clone model on the stolen data.
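
Training on stolen soft labels is commonly done in the style of knowledge distillation, where a temperature flattens the probability distributions before they are compared. The sketch below illustrates that idea only; it assumes a KL-divergence loss and does not claim to reproduce what `train_with_stolen_data` does with its `temperature` argument. The logits are made-up values.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax with temperature; larger values give a flatter distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)                       # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) for two probability vectors."""
    return sum(t * math.log((t + eps) / (p + eps))
               for t, p in zip(target, predicted) if t > 0)


victim_logits = [4.0, 1.0, 0.5]   # illustrative values
clone_logits = [2.0, 1.5, 0.5]

sharp = softmax(victim_logits, temperature=1.0)
soft = softmax(victim_logits, temperature=3.0)
loss = kl_divergence(soft, softmax(clone_logits, temperature=3.0))
print([round(p, 3) for p in sharp], [round(p, 3) for p in soft], round(loss, 4))
```

A higher temperature exposes more of the relative ranking among the non-top classes, which gives the clone more signal per query than a hard label would.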

In [None]:
# Create the clone model
print("🤖 Creating clone model")

clone_model = CloneModel(
    architecture="simple_cnn",  # A simpler architecture
    num_classes=10,
    device=device
)

clone_params = count_parameters(clone_model.model)
print("✅ Clone model created")
print(f"📊 Parameters: {clone_params:,}")
print(f"📊 Architecture: {clone_model.architecture}")
print(f"📊 Clone/Victim parameter ratio: {clone_params/param_count:.2%}")

In [None]:
# Train the clone model
print("🎓 Training clone model with stolen data...")

training_results = clone_model.train_with_stolen_data(
    stolen_queries=stolen_queries,
    stolen_labels=stolen_responses,
    epochs=30,
    learning_rate=0.001,
    batch_size=64,
    temperature=3.0
)

print("\n✅ Training completed!")
print(f"📈 Final loss: {training_results['final_loss']:.4f}")
print(f"📈 Final accuracy: {training_results['final_accuracy']:.4f}")

## 6. Training Visualization

In [None]:
# Visualize the training history
fig, axes = plt.subplots(1, 2, figsize=(12, 4))

# Loss plot
axes[0].plot(training_results['training_losses'])
axes[0].set_title('Training Loss')
axes[0].set_xlabel('Epoch')
axes[0].set_ylabel('Loss')
axes[0].grid(True)

# Accuracy plot
axes[1].plot(training_results['training_accuracies'])
axes[1].set_title('Training Accuracy')
axes[1].set_xlabel('Epoch')
axes[1].set_ylabel('Accuracy')
axes[1].grid(True)

plt.tight_layout()
plt.show()

## 7. Model Evaluation and Comparison
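
The two similarity metrics used in this evaluation can be sketched as follows, assuming the usual definitions: fidelity as top-1 agreement between the two models, and cosine similarity averaged over the per-input probability vectors. The exact formulas inside `calculate_model_similarity` are not shown in this notebook, so treat this as an assumption; the probability values are made up for the example.

```python
import math


def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=values.__getitem__)


def fidelity(victim_probs, clone_probs):
    """Share of inputs on which both models predict the same top-1 class."""
    matches = sum(argmax(v) == argmax(c) for v, c in zip(victim_probs, clone_probs))
    return matches / len(victim_probs)


def mean_cosine_similarity(victim_probs, clone_probs):
    """Cosine similarity between paired probability vectors, averaged over inputs."""
    total = 0.0
    for v, c in zip(victim_probs, clone_probs):
        dot = sum(a * b for a, b in zip(v, c))
        norm = math.sqrt(sum(a * a for a in v)) * math.sqrt(sum(b * b for b in c))
        total += dot / norm
    return total / len(victim_probs)


victim = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4], [0.6, 0.3, 0.1]]
clone = [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.5, 0.3, 0.2], [0.6, 0.3, 0.1]]

fid = fidelity(victim, clone)
cos = mean_cosine_similarity(victim, clone)
print(f"fidelity={fid:.2f}, cosine similarity={cos:.3f}")
```

Fidelity only looks at the winning class, while cosine similarity also rewards matching the full shape of the distribution, so the two can disagree.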

In [None]:
# Create test data
print("🧪 Evaluating models...")

test_size = 500
test_queries = np.random.uniform(0, 1, (test_size, 3, 32, 32)).astype(np.float32)
test_tensor = torch.from_numpy(test_queries)

# Get predictions from both models
print("Getting predictions from victim model...")
victim_predictions = victim_model.query(test_tensor)

print("Getting predictions from clone model...")
clone_predictions = clone_model.predict(test_tensor)

# Compute similarity metrics
similarity_metrics = calculate_model_similarity(
    victim_predictions['probabilities'],
    clone_predictions['probabilities']
)

print("\n📊 Model Similarity Metrics:")
for key, value in similarity_metrics.items():
    print(f"  {key}: {value:.4f}")

## 8. Detailed Analysis and Visualization

In [None]:
# Compare prediction distributions
fig, axes = plt.subplots(2, 2, figsize=(12, 10))

# Fidelity analysis (np.asarray accepts NumPy arrays, lists, and CPU tensors alike)
victim_pred_labels = np.asarray(victim_predictions['predictions'])
clone_pred_labels = np.asarray(clone_predictions['predictions'])
agreement = victim_pred_labels == clone_pred_labels

# One bin per class (0-9), centered on the integer label
axes[0, 0].hist([victim_pred_labels, clone_pred_labels], bins=np.arange(11) - 0.5, alpha=0.7,
                label=['Victim', 'Clone'])
axes[0, 0].set_title('Prediction Distribution')
axes[0, 0].set_xlabel('Class')
axes[0, 0].set_ylabel('Count')
axes[0, 0].legend()

# Agreement analysis
axes[0, 1].pie([agreement.sum(), (~agreement).sum()],
               labels=['Agreement', 'Disagreement'],
               autopct='%1.1f%%',
               colors=['lightgreen', 'lightcoral'])
axes[0, 1].set_title(f'Model Agreement (Fidelity: {similarity_metrics["fidelity"]:.2%})')

# Confidence comparison
victim_conf = np.asarray(victim_predictions['confidence'])
clone_conf = np.asarray(clone_predictions['probabilities']).max(axis=1)

axes[1, 0].scatter(victim_conf, clone_conf, alpha=0.5)
axes[1, 0].plot([0, 1], [0, 1], 'r--', alpha=0.8)
axes[1, 0].set_xlabel('Victim Confidence')
axes[1, 0].set_ylabel('Clone Confidence')
axes[1, 0].set_title('Confidence Correlation')
axes[1, 0].grid(True)

# Query efficiency
query_efficiency = similarity_metrics['fidelity'] / extractor.query_count * 1000
metrics_data = {
    'Fidelity': similarity_metrics['fidelity'],
    'Cosine Sim': similarity_metrics['cosine_similarity'],
    'Query Eff.\n(×1000)': query_efficiency
}

bars = axes[1, 1].bar(metrics_data.keys(), metrics_data.values(), 
                      color=['skyblue', 'lightgreen', 'orange'])
axes[1, 1].set_title('Performance Metrics')
axes[1, 1].set_ylabel('Score')

# Add the bar values as labels
for bar, value in zip(bars, metrics_data.values()):
    axes[1, 1].text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.01,
                    f'{value:.3f}', ha='center', va='bottom')

plt.tight_layout()
plt.show()

## 9. Comparing Different Strategies

Now let's compare the performance of the different query strategies.
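
The comparison relies on query efficiency, computed in this notebook as fidelity divided by the number of queries, scaled by 1000. A minimal sketch of that calculation follows; the helper name `query_efficiency`, the strategy labels, and the numbers are hypothetical and are not measured results.

```python
def query_efficiency(fidelity, queries_used, per=1000):
    """Fidelity obtained per `per` queries (fidelity / queries_used * per)."""
    if queries_used <= 0:
        raise ValueError("queries_used must be positive")
    return fidelity / queries_used * per


# Made-up numbers purely for illustration, not measured results.
illustrative = {
    "strategy_a": {"fidelity": 0.60, "queries_used": 800},
    "strategy_b": {"fidelity": 0.66, "queries_used": 1200},
}
ranking = sorted(illustrative,
                 key=lambda name: query_efficiency(**illustrative[name]),
                 reverse=True)
print(ranking)
```

In this made-up case the strategy with the lower fidelity ranks first because it needed far fewer queries, which is why fidelity and efficiency are reported side by side.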

In [None]:
# Test different strategies
def test_strategy(strategy_name, strategy, budget=1000):
    """Run one extraction attack with the given strategy and return its metrics."""
    print(f"\n🧪 Testing {strategy_name} strategy...")

    # Reset the victim model's query log
    victim_model.reset_query_log()

    # Create the extractor
    extractor = ModelExtractor(
        query_strategy=strategy,
        victim_api=victim_api,
        query_budget=budget
    )
    
    # Extract knowledge
    queries, responses = extractor.extract_knowledge(batch_size=32)

    # Train a simple clone model
    clone = CloneModel("lightweight", 10, device)
    training_results = clone.train_with_stolen_data(
        queries, responses, epochs=15, learning_rate=0.01
    )
    
    # Evaluate
    test_inputs = torch.randn(200, 3, 32, 32)
    victim_out = victim_model.query(test_inputs)['probabilities']
    clone_out = clone.predict(test_inputs)['probabilities']
    
    similarity = calculate_model_similarity(victim_out, clone_out)
    
    return {
        'fidelity': similarity['fidelity'],
        'queries_used': extractor.query_count,
        'final_loss': training_results['final_loss'],
        'query_efficiency': similarity['fidelity'] / extractor.query_count * 1000
    }

# Run the strategy tests
strategies = {
    'Random (Uniform)': RandomQueryStrategy("uniform", (3, 32, 32)),
    'Random (Normal)': RandomQueryStrategy("normal", (3, 32, 32)),
    'Active Learning': ActiveLearningStrategy(500, "entropy")
}

results = {}
for name, strategy in strategies.items():
    results[name] = test_strategy(name, strategy, budget=800)

print("\n📊 Strategy Comparison Results:")
for name, result in results.items():
    print(f"\n{name}:")
    for metric, value in result.items():
        print(f"  {metric}: {value:.4f}")

In [None]:
# Visualize the strategy comparison
metrics = ['fidelity', 'query_efficiency', 'final_loss']
strategy_names = list(results.keys())

fig, axes = plt.subplots(1, 3, figsize=(15, 5))

for i, metric in enumerate(metrics):
    values = [results[name][metric] for name in strategy_names]
    
    bars = axes[i].bar(strategy_names, values, color=['skyblue', 'lightgreen', 'orange'])
    axes[i].set_title(f'{metric.replace("_", " ").title()}')
    axes[i].set_ylabel('Score' if metric != 'final_loss' else 'Loss')
    
    # Write the values above the bars
    for bar, value in zip(bars, values):
        axes[i].text(bar.get_x() + bar.get_width()/2, bar.get_height() + max(values)*0.01,
                    f'{value:.3f}', ha='center', va='bottom')
    
    # Rotate the x-axis labels
    axes[i].tick_params(axis='x', rotation=45)

plt.tight_layout()
plt.show()

## 10. Security Implications and Defense Methods

In this section, let's examine defenses against model stealing attacks.
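
One such defense, output perturbation, can be sketched on a single probability vector as below. The function `perturb_probabilities` and the noise level are assumptions for this example: Gaussian noise is added, negatives are clipped, and the result is renormalized, with a fallback for the rare case where every entry is clipped to zero.

```python
import random


def perturb_probabilities(probs, sigma=0.05, rng=random):
    """Add Gaussian noise to a probability vector, clip at zero, renormalize."""
    noisy = [max(p + rng.gauss(0.0, sigma), 0.0) for p in probs]
    total = sum(noisy)
    if total == 0.0:
        # Every entry was clipped: fall back to the clean output
        return list(probs)
    return [p / total for p in noisy]


rng = random.Random(0)
clean = [0.85, 0.10, 0.05]
noisy = perturb_probabilities(clean, sigma=0.05, rng=rng)
print([round(p, 3) for p in noisy])
```

The trade-off is that noise large enough to mislead a clone also degrades the probabilities seen by legitimate users, so `sigma` has to be tuned against both.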

In [None]:
# Simulate defense methods
def test_defense_mechanism(defense_name, defense_func):
    """Attack a defended API and return the clone's fidelity."""
    print(f"\n🛡️ Testing {defense_name}...")

    # Create the victim model to be defended
    defended_model = VictimModel("resnet18", 10, True, device)

    # Wrap it with the defense
    defended_api = defense_func(defended_model)

    # Attempt the attack
    extractor = ModelExtractor(
        RandomQueryStrategy("uniform", (3, 32, 32)),
        defended_api,
        query_budget=500
    )
    
    queries, responses = extractor.extract_knowledge()
    
    # Train the clone model
    clone = CloneModel("lightweight", 10, device)
    clone.train_with_stolen_data(queries, responses, epochs=10)
    
    # Evaluate against the undefended model's own outputs
    test_inputs = torch.randn(100, 3, 32, 32)
    original_out = defended_model.query(test_inputs)['probabilities']
    clone_out = clone.predict(test_inputs)['probabilities']
    
    similarity = calculate_model_similarity(original_out, clone_out)
    return similarity['fidelity']

# Defense mechanisms
def rate_limiting_defense(victim_model):
    """Rate-limiting defense"""
    return VictimModelAPI(victim_model, rate_limit=100)

def noise_defense(victim_model):
    """Output-noise defense"""
    class NoisyAPI(VictimModelAPI):
        def predict(self, image_data, return_probabilities=True):
            # Add noise to the predictions
            result = super().predict(image_data, return_probabilities)
            if 'probabilities' in result:
                probs = np.array(result['probabilities'])
                noise = np.random.normal(0, 0.05, probs.shape)
                # Clip negatives, then renormalize so each row sums to 1
                noisy_probs = np.maximum(probs + noise, 0)
                row_sums = noisy_probs.sum(axis=-1, keepdims=True)
                # Fall back to the clean output if a row was clipped to all zeros
                noisy_probs = np.where(row_sums > 0, noisy_probs / np.maximum(row_sums, 1e-12), probs)
                result['probabilities'] = noisy_probs.tolist()
            return result
    return NoisyAPI(victim_model)

# Test the defenses
defenses = {
    'No Defense': lambda vm: VictimModelAPI(vm),
    'Rate Limiting': rate_limiting_defense,
    'Output Noise': noise_defense
}

defense_results = {}
for name, defense in defenses.items():
    fidelity = test_defense_mechanism(name, defense)
    defense_results[name] = fidelity
    print(f"  Fidelity: {fidelity:.4f}")

print("\n🛡️ Defense Effectiveness:")
for name, fidelity in defense_results.items():
    print(f"  {name}: {fidelity:.4f}")

In [None]:
# Visualize defense effectiveness
plt.figure(figsize=(10, 6))

defense_names = list(defense_results.keys())
fidelities = list(defense_results.values())

bars = plt.bar(defense_names, fidelities, color=['red', 'orange', 'green'])
plt.title('Defense Mechanism Effectiveness\n(Lower Fidelity = Better Defense)')
plt.ylabel('Attack Fidelity')
plt.xlabel('Defense Method')

# Write the values above the bars
for bar, value in zip(bars, fidelities):
    plt.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.01,
            f'{value:.3f}', ha='center', va='bottom')

plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

## 11. Conclusions and Recommendations

### Key Findings:

1. **Model Extraction Feasibility**: Even models exposed only as black boxes can be stolen
2. **Query Efficiency**: Different strategies achieve different levels of efficiency
3. **Defense Mechanisms**: Even simple defenses can be effective

### Security Recommendations:

1. **Rate Limiting**: Enforce limits on API calls
2. **Output Perturbation**: Add controlled noise to outputs
3. **Query Monitoring**: Monitor for anomalous query patterns
4. **Differential Privacy**: Use privacy-preserving techniques
5. **Watermarking**: Embed watermarks in the model
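
As one possible shape for the query monitoring recommendation, the sketch below flags clients that send too many queries within a sliding time window. The class `QueryMonitor`, its thresholds, and the client identifier are hypothetical; a production system would likely also look at the content of the queries, not only their rate.

```python
from collections import deque


class QueryMonitor:
    """Hypothetical per-client monitor: flags clients that exceed
    `max_queries` within a sliding window of `window_seconds`."""

    def __init__(self, max_queries, window_seconds):
        self.max_queries = max_queries
        self.window_seconds = window_seconds
        self._history = {}   # client_id -> deque of query timestamps

    def record(self, client_id, timestamp):
        """Record one query; return True if the client is now over the limit."""
        times = self._history.setdefault(client_id, deque())
        times.append(timestamp)
        # Drop timestamps that have fallen out of the window
        while times and timestamp - times[0] >= self.window_seconds:
            times.popleft()
        return len(times) > self.max_queries


monitor = QueryMonitor(max_queries=3, window_seconds=10)
flags = [monitor.record("client-a", t) for t in (0, 1, 2, 3, 20)]
print(flags)
```

The fourth query arrives while the first three are still inside the window and is flagged; by the time of the last query the window has emptied again.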

In [None]:
# Final summary
print("📋 Experiment Summary")
print("=" * 50)
print("🎯 Original victim model accuracy: not measured in this notebook (pretrained backbone)")
print(f"🤖 Clone model achieved: {similarity_metrics['fidelity']:.2%} fidelity")
print(f"📊 Query budget used: {extractor.query_count:,}")
print(f"⚡ Query efficiency: {query_efficiency:.4f} (fidelity per 1000 queries)")
print(f"🛡️ Defense effectiveness: Noise injection reduced fidelity by {defense_results['No Defense'] - defense_results['Output Noise']:.2%}")

print("\n⚠️ Ethical Reminder:")
print("This simulation is for educational purposes only.")
print("Real-world model extraction attacks are illegal without permission.")
print("Always implement proper security measures for production ML systems.")