# Deepfake Detection Model Training with YOLOv11

This notebook trains a YOLOv11 model on the Roboflow deepfake dataset to create a `best.pt` model for detecting deepfake images.

## Overview
- **Dataset**: Roboflow deepfake dataset
- **Model**: YOLOv11
- **Output**: best.pt model file
- **Platform**: Google Colab (GPU recommended)

## 1. Setup Environment

First, check GPU availability and install required packages.

In [None]:
# Check GPU availability
import torch
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA version: {torch.version.cuda}")
    print(f"GPU Device: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.2f} GB")
else:
    print("⚠️ Warning: GPU not available. Training will be slow on CPU.")
    print("In Colab, go to Runtime > Change runtime type > Hardware accelerator > GPU")

In [None]:
# Install required packages
!pip install -q ultralytics roboflow
print("✅ Installation complete!")
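
If training later behaves unexpectedly, it helps to know which package versions the install step actually pulled in. The following is a minimal sketch using only the standard library; the helper name `installed_version` is illustrative and not part of either package.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the versions of the packages this notebook relies on.
for pkg in ("ultralytics", "roboflow", "torch"):
    print(f"{pkg}: {installed_version(pkg) or 'not installed'}")
```

Recording these versions alongside the trained weights makes a run easier to reproduce later.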

## 2. Download Dataset from Roboflow

Download the deepfake dataset from Roboflow in YOLOv11 format.

In [None]:
from roboflow import Roboflow

# Initialize Roboflow and download dataset
rf = Roboflow(api_key="SyZVxdrfu1ZXJfjQNklc")
project = rf.workspace("laboratorio-ia-yvlu2").project("deepfake-wuqrr")
version = project.version(5)
dataset = version.download("yolov11")

print("\n✅ Dataset downloaded successfully!")
print(f"Dataset location: {dataset.location}")
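
The download cell passes the API key as a string literal, which exposes the key to anyone who can read the notebook. One common alternative is to read it from an environment variable. The sketch below assumes a variable named `ROBOFLOW_API_KEY`; both that name and the helper `get_roboflow_api_key` are hypothetical choices, not something the Roboflow package requires.

```python
import os


def get_roboflow_api_key(env_var="ROBOFLOW_API_KEY"):
    """Read the API key from an environment variable (the variable name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before downloading the dataset."
        )
    return key


# Possible usage in the download cell (not executed here):
# rf = Roboflow(api_key=get_roboflow_api_key())
```

Failing early with a clear message is usually easier to debug than an authentication error raised partway through the download.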

In [None]:
# Verify dataset structure
import os
import yaml

data_yaml_path = os.path.join(dataset.location, "data.yaml")

# Read and display dataset configuration
with open(data_yaml_path, 'r') as f:
    data_config = yaml.safe_load(f)
    
print("Dataset Configuration:")
print(f"  - Train images: {data_config.get('train', 'N/A')}")
print(f"  - Validation images: {data_config.get('val', 'N/A')}")
print(f"  - Test images: {data_config.get('test', 'N/A')}")
print(f"  - Number of classes: {data_config.get('nc', 'N/A')}")
print(f"  - Class names: {data_config.get('names', 'N/A')}")

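# Count images in each split. data.yaml may store paths relative to itself
# (for example '../train/images'), so fall back to the standard export folders.
IMAGE_EXTS = ('.jpg', '.jpeg', '.png')

def resolve_split_dir(key, default):
    """Return the image folder for a split, preferring the path listed in data.yaml."""
    candidate = os.path.normpath(os.path.join(dataset.location, data_config.get(key) or default))
    return candidate if os.path.isdir(candidate) else os.path.join(dataset.location, default)

train_dir = resolve_split_dir('train', 'train/images')
val_dir = resolve_split_dir('val', 'valid/images')

if os.path.isdir(train_dir):
    train_count = len([f for f in os.listdir(train_dir) if f.lower().endswith(IMAGE_EXTS)])
    print(f"\n✅ Training images found: {train_count}")

if os.path.isdir(val_dir):
    val_count = len([f for f in os.listdir(val_dir) if f.lower().endswith(IMAGE_EXTS)])
    print(f"✅ Validation images found: {val_count}")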
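
Image counts alone do not reveal whether the classes are balanced. In the YOLO label format, each image has a `.txt` file whose lines start with a class ID followed by the normalized box coordinates, so counting those IDs gives a quick view of the class distribution. This is a sketch under the assumption that labels live in a `labels` folder next to `images`; the helper name `count_label_instances` is illustrative.

```python
import os
from collections import Counter


def count_label_instances(labels_dir):
    """Count annotated boxes per class ID across YOLO-format .txt label files."""
    counts = Counter()
    for name in os.listdir(labels_dir):
        if not name.endswith(".txt"):
            continue
        with open(os.path.join(labels_dir, name)) as f:
            for line in f:
                parts = line.split()
                if parts:  # skip blank lines
                    counts[int(parts[0])] += 1
    return counts


# Possible usage (assumes a train/labels folder exists in the export):
# counts = count_label_instances(os.path.join(dataset.location, "train", "labels"))
# for class_id, n in sorted(counts.items()):
#     print(f"{data_config['names'][class_id]}: {n}")
```

A heavily skewed distribution is worth knowing about before training, since it can inflate overall metrics while one class performs poorly.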

## 3. Configure Training Parameters

Set up the training configuration for YOLOv11 model.

In [None]:
from ultralytics import YOLO

# Training parameters
MODEL_SIZE = 'yolo11n.pt'  # Options: yolo11n.pt (nano), yolo11s.pt (small), yolo11m.pt (medium), yolo11l.pt (large)
EPOCHS = 100                # Number of training epochs
IMAGE_SIZE = 640           # Image size for training
BATCH_SIZE = 16            # Batch size (adjust based on GPU memory)
PATIENCE = 20              # Early stopping patience

print(f"Training Configuration:")
print(f"  - Model: {MODEL_SIZE}")
print(f"  - Epochs: {EPOCHS}")
print(f"  - Image Size: {IMAGE_SIZE}")
print(f"  - Batch Size: {BATCH_SIZE}")
print(f"  - Early Stopping Patience: {PATIENCE}")
print(f"  - Data YAML: {data_yaml_path}")
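
To get a feel for how `BATCH_SIZE` and `EPOCHS` translate into work, the number of batches per epoch is the image count divided by the batch size, rounded up. The sketch below uses a hypothetical training set of 1,000 images; substitute the count printed by the verification cell.

```python
import math


def batches_per_epoch(num_images, batch_size):
    """Number of batches needed to see every training image once."""
    return math.ceil(num_images / batch_size)


def max_total_batches(num_images, batch_size, epochs):
    """Upper bound on batches processed; early stopping may end training sooner."""
    return batches_per_epoch(num_images, batch_size) * epochs


# Hypothetical example: 1,000 images, batch size 16, 100 epochs.
print(batches_per_epoch(1000, 16))       # 63
print(max_total_batches(1000, 16, 100))  # 6300
```

Halving the batch size roughly doubles the number of batches per epoch, which is one reason smaller batches tend to lengthen training on the same GPU.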

## 4. Train the Model

Train YOLOv11 on the deepfake dataset. Ultralytics saves the best-scoring checkpoint as `best.pt` and the most recent one as `last.pt` under `deepfake_detection/yolo11_training/weights/`.

**Note:** Training may take 30 minutes to several hours depending on dataset size and GPU.

In [None]:
# Initialize model
model = YOLO(MODEL_SIZE)

# Train the model
results = model.train(
    data=data_yaml_path,
    epochs=EPOCHS,
    imgsz=IMAGE_SIZE,
    batch=BATCH_SIZE,
    patience=PATIENCE,
    save=True,
    project='deepfake_detection',
    name='yolo11_training',
    exist_ok=True,
    pretrained=True,
    optimizer='AdamW',
    verbose=True,
    seed=42,
    deterministic=True,
    device=0 if torch.cuda.is_available() else 'cpu'
)

print("\n✅ Training completed!")
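
With early stopping enabled, it is useful to know which epoch scored highest. Ultralytics writes per-epoch metrics to a `results.csv` file in the run folder; the sketch below scans that file for the best value of one metric. The column names `epoch` and `metrics/mAP50-95(B)` are assumptions based on typical detection runs, so check the header of your own file. Header whitespace is stripped because some versions pad the column names.

```python
import csv


def best_epoch(csv_path, metric="metrics/mAP50-95(B)"):
    """Return (epoch, value) for the row with the highest metric, or None if the file has no rows."""
    best = None
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            # Strip padding from column names and values.
            clean = {k.strip(): (v or "").strip() for k, v in row.items() if k is not None}
            value = float(clean[metric])
            if best is None or value > best[1]:
                best = (int(float(clean["epoch"])), value)
    return best


# Possible usage after training:
# print(best_epoch("deepfake_detection/yolo11_training/results.csv"))
```

If the best epoch is close to the final one, the model may still have been improving and a longer run could help.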

## 5. Evaluate the Model

Load the best model and evaluate its performance on the validation set.

In [None]:
# Load the best model
best_model_path = 'deepfake_detection/yolo11_training/weights/best.pt'
best_model = YOLO(best_model_path)

# Evaluate on validation set
metrics = best_model.val()

# Print evaluation metrics
print("\n" + "="*50)
print("MODEL EVALUATION RESULTS")
print("="*50)
print(f"mAP50: {metrics.box.map50:.4f}")
print(f"mAP50-95: {metrics.box.map:.4f}")
print(f"Precision: {metrics.box.mp:.4f}")
print(f"Recall: {metrics.box.mr:.4f}")
print("="*50)
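
Precision and recall are often summarized as a single F1 score, their harmonic mean: F1 = 2PR / (P + R). The helper below is a small sketch of that formula. Applying it to the mean precision and mean recall printed above gives a rough summary only, not a per-class F1.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Possible usage with the metrics from the evaluation cell:
# print(f"F1 (from mean P/R): {f1_score(metrics.box.mp, metrics.box.mr):.4f}")
```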

## 6. Visualize Training Results

Display training curves and confusion matrix.

In [None]:
from IPython.display import Image, display

# Display training results
results_dir = 'deepfake_detection/yolo11_training'

print("Training Curves:")
results_plot = os.path.join(results_dir, 'results.png')
if os.path.exists(results_plot):
    display(Image(filename=results_plot))
    
print("\nConfusion Matrix:")
confusion_matrix = os.path.join(results_dir, 'confusion_matrix.png')
if os.path.exists(confusion_matrix):
    display(Image(filename=confusion_matrix))

print("\nPrediction Examples:")
val_batch = os.path.join(results_dir, 'val_batch0_pred.jpg')
if os.path.exists(val_batch):
    display(Image(filename=val_batch))
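
The cell above looks for three specific file names, and the exact set of plots written to the run folder may differ between Ultralytics versions. A quick way to see what is actually available is to list the image files in that folder. This is a minimal sketch; `list_plot_files` is an illustrative helper name.

```python
import os


def list_plot_files(run_dir, extensions=(".png", ".jpg")):
    """Return the sorted image file names found directly inside a run folder."""
    if not os.path.isdir(run_dir):
        return []
    return sorted(f for f in os.listdir(run_dir) if f.lower().endswith(extensions))


# Possible usage:
# for name in list_plot_files("deepfake_detection/yolo11_training"):
#     print(name)
```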

## 7. Test Predictions

Run inference on a few sample images from the validation split to spot-check the model.

In [None]:
# Get a few sample images from the validation folder resolved earlier (val_dir)
test_images_dir = val_dir
test_images = []
if os.path.isdir(test_images_dir):
    test_images = [os.path.join(test_images_dir, f) for f in sorted(os.listdir(test_images_dir))
                   if f.lower().endswith(('.jpg', '.jpeg', '.png'))][:5]

if test_images:
    print(f"Running inference on {len(test_images)} test images...\n")
    
    # Run prediction
    results = best_model.predict(
        source=test_images,
        save=True,
        conf=0.25,
        project='deepfake_detection',
        name='predictions',
        exist_ok=True
    )
    
    # Display predictions
    for i, result in enumerate(results):
        print(f"\nImage {i+1}: {os.path.basename(test_images[i])}")
        print(f"  Detections: {len(result.boxes)}")
        for box in result.boxes:
            cls = int(box.cls[0])
            conf = float(box.conf[0])
            class_name = data_config['names'][cls]
            print(f"    - {class_name}: {conf:.2f}")
    
    print("\n✅ Predictions saved to: deepfake_detection/predictions/")
else:
    print("⚠️ No test images found.")
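
The loop above prints every detection, but an application usually needs a single verdict per image. One simple option is to keep the highest-confidence detection above a threshold. The sketch below works on plain `(class_name, confidence)` pairs; the class names `fake` and `real` in the example are hypothetical, since the actual names come from `data.yaml`.

```python
def image_verdict(detections, min_conf=0.25):
    """Pick the highest-confidence detection at or above min_conf.

    detections: list of (class_name, confidence) pairs.
    Returns ("no_detection", 0.0) when nothing passes the threshold.
    """
    kept = [d for d in detections if d[1] >= min_conf]
    if not kept:
        return ("no_detection", 0.0)
    return max(kept, key=lambda d: d[1])


# Hypothetical detections for one image.
print(image_verdict([("fake", 0.91), ("real", 0.40)]))  # ('fake', 0.91)
```

Other aggregation rules are possible, for example flagging an image whenever any box of a given class exceeds the threshold; which rule fits depends on how costly false positives and false negatives are in your application.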

## 8. Export and Download Model

The trained model is saved and ready to use. The cell below copies it to `best.pt` in the working directory and, when running in Colab, starts a browser download to your local system.

In [None]:
# Display model location
print("="*50)
print("TRAINED MODEL LOCATION")
print("="*50)
print(f"Best model: {best_model_path}")
print(f"Last model: deepfake_detection/yolo11_training/weights/last.pt")
print(f"\nModel file size: {os.path.getsize(best_model_path) / (1024*1024):.2f} MB")

# Copy to easy-to-find location
import shutil
output_path = 'best.pt'
shutil.copy(best_model_path, output_path)
print(f"\n✅ Model copied to: {output_path}")

# In Colab, download the model
try:
    from google.colab import files
    print("\n📥 Starting download...")
    files.download(output_path)
    print("✅ Download complete!")
except ImportError:
    print("\n💡 Not running in Colab. Model saved locally.")
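
After downloading, you may want to confirm that the local copy matches the file in the notebook. Comparing SHA-256 checksums is one common way to do that. The sketch below reads the file in chunks so that larger checkpoints do not need to fit in memory at once; `sha256_of_file` is an illustrative helper name.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Possible usage:
# print(sha256_of_file("best.pt"))
```

Run the same function on the downloaded file and compare the two digests; they should be identical.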

## 9. How to Use the Model

Once you have the `best.pt` model file, you can use it for inference:

### Python API
```python
from ultralytics import YOLO

# Load the model
model = YOLO('best.pt')

# Run inference
results = model.predict('path/to/image.jpg', conf=0.25)

# Process results
for result in results:
    boxes = result.boxes
    for box in boxes:
        class_id = int(box.cls[0])
        confidence = float(box.conf[0])
        print(f"Detected: {model.names[class_id]} ({confidence:.2f})")
```

### Command Line
```bash
yolo detect predict model=best.pt source='path/to/image.jpg' conf=0.25
```

### Export to Other Formats
```python
model = YOLO('best.pt')
model.export(format='onnx')  # Export to ONNX
model.export(format='torchscript')  # Export to TorchScript
model.export(format='tflite')  # Export to TensorFlow Lite
```

## Summary

✅ **Congratulations!** You have successfully:
1. Set up the training environment
2. Downloaded the Roboflow deepfake dataset
3. Trained a YOLOv11 model
4. Evaluated the model performance
5. Visualized training results
6. Generated predictions on sample validation images
7. Exported the `best.pt` model

The `best.pt` file is your trained model ready for deployment in deepfake detection applications.

### Next Steps:
- Fine-tune hyperparameters for better performance
- Increase epochs if the model is still improving
- Try different model sizes (yolo11s, yolo11m, yolo11l)
- Experiment with image augmentation
- Deploy the model in a production environment
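
For the model-size and hyperparameter experiments listed above, it can help to generate the run configurations up front so that each run gets a distinct name. The sketch below builds one configuration per combination of model size and image size. The dictionary keys mirror the arguments used in the training cell, while the helper name and the specific sizes are illustrative assumptions.

```python
from itertools import product


def build_run_configs(model_sizes, image_sizes):
    """Build one config dict per (model size, image size) combination."""
    return [
        {
            "model": f"yolo11{size}.pt",
            "imgsz": imgsz,
            "name": f"yolo11{size}_{imgsz}",
        }
        for size, imgsz in product(model_sizes, image_sizes)
    ]


# Example grid: two model sizes and two image sizes give four runs.
for cfg in build_run_configs(["n", "s"], [640, 960]):
    print(cfg)

# Each config could then drive a training run, for example:
# YOLO(cfg["model"]).train(data=data_yaml_path, imgsz=cfg["imgsz"],
#                          project="deepfake_detection", name=cfg["name"])
```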