# 🎬 Extract Frames & Test Models with FaceForensics++

## 📊 Available Dataset:
- **Original (Real):** 300 videos
- **Deepfakes:** 100 videos
- **FaceSwap:** 100 videos
- **Face2Face:** 100 videos

**Total:** 600 videos

## 🎯 What this notebook does:
1. Extract frames from the videos (every 60 frames, as configured below)
2. Crop faces with MTCNN
3. Test 3 models
4. Find the optimal ensemble weights
5. Create a new config

## ⚡ Compute Units:
- Extract + Crop: ~10-15 units
- Model testing: ~15-20 units
- **Total: ~25-35 units** (out of the 60.24 remaining)

## 🔧 Setup

In [None]:
# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

import os
os.chdir('/content/drive/MyDrive/DeepfakeProject')
print(f"📁 Current directory: {os.getcwd()}")

In [None]:
# Install dependencies
!pip install torch torchvision timm pillow scikit-learn matplotlib seaborn tqdm opencv-python
!pip install facenet-pytorch
!pip install git+https://github.com/openai/CLIP.git

In [None]:
import cv2
import numpy as np
from PIL import Image
import torch
from pathlib import Path
from tqdm import tqdm
import json
import matplotlib.pyplot as plt
import seaborn as sns
from facenet_pytorch import MTCNN

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"🔧 Using device: {device}")

# Check the GPU
if device.type == 'cuda':
    print(f"   GPU: {torch.cuda.get_device_name(0)}")
    print(f"   Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")

## 📁 Check the Dataset

In [None]:
# Check the dataset structure
import glob

BASE_PATH = '/content/drive/MyDrive/DeepfakeProject/datasets'

# Count the videos (sorted so that slicing later picks the same videos on every run)
original_videos = sorted(glob.glob(f'{BASE_PATH}/original_sequences/youtube/c40/videos/*.mp4'))
deepfakes_videos = sorted(glob.glob(f'{BASE_PATH}/manipulated_sequences/Deepfakes/c40/videos/*.mp4'))
faceswap_videos = sorted(glob.glob(f'{BASE_PATH}/manipulated_sequences/FaceSwap/c40/videos/*.mp4'))
face2face_videos = sorted(glob.glob(f'{BASE_PATH}/manipulated_sequences/Face2Face/c40/videos/*.mp4'))

print("📊 Dataset Summary:")
print(f"  Original (Real):  {len(original_videos)} videos")
print(f"  Deepfakes (Fake): {len(deepfakes_videos)} videos")
print(f"  FaceSwap (Fake):  {len(faceswap_videos)} videos")
print(f"  Face2Face (Fake): {len(face2face_videos)} videos")
print(f"  " + "="*40)
print(f"  Total Real:       {len(original_videos)} videos")
print(f"  Total Fake:       {len(deepfakes_videos) + len(faceswap_videos) + len(face2face_videos)} videos")
print(f"  Grand Total:      {len(original_videos) + len(deepfakes_videos) + len(faceswap_videos) + len(face2face_videos)} videos")

## 🎬 Extract Frames from Videos

In [None]:
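def extract_frames_from_video(video_path, output_dir, frame_interval=30, max_frames=10):
    """
    Extract frames from a video.
    
    Args:
        video_path: path to the video file
        output_dir: directory where the frames are saved
        frame_interval: keep every Nth frame (30 = 1 frame per second @ 30fps)
        max_frames: maximum number of frames to save per video
    
    Returns:
        Number of frames saved
    """
    os.makedirs(output_dir, exist_ok=True)
    
    video_path = Path(video_path)
    video_name = video_path.stem
    # FaceForensics++ reuses file names across manipulation methods, so prefix
    # each frame with the method folder (.../<method>/c40/videos/<name>.mp4).
    # Otherwise frames written to a shared output directory overwrite each other.
    source = video_path.parents[2].name if len(video_path.parents) > 2 else 'video'
    cap = cv2.VideoCapture(str(video_path))
    if not cap.isOpened():
        print(f"⚠️  Could not open video: {video_path}")
        return 0
    
    frame_count = 0
    saved_count = 0
    
    while saved_count < max_frames:
        ret, frame = cap.read()
        if not ret:
            break
        
        if frame_count % frame_interval == 0:
            # Save the frame
            frame_path = os.path.join(output_dir, f"{source}_{video_name}_frame{saved_count:03d}.jpg")
            cv2.imwrite(frame_path, frame)
            saved_count += 1
        
        frame_count += 1
    
    cap.release()
    return saved_count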

print("✅ Extract function ready")
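
To make the effect of `frame_interval` and `max_frames` concrete, the sketch below lists the frame indices a fixed-interval sampler like the one above would keep. The helper name `sampled_frame_indices` is hypothetical and is not part of the notebook; it only mirrors the selection rule (`frame_count % frame_interval == 0`, capped at `max_frames`).

```python
def sampled_frame_indices(total_frames, frame_interval=30, max_frames=10):
    """Return the frame indices that a fixed-interval sampler would keep."""
    if frame_interval <= 0:
        raise ValueError("frame_interval must be positive")
    # Every frame_interval-th index starting at 0, truncated to max_frames entries.
    return list(range(0, total_frames, frame_interval))[:max_frames]


# A 10-second clip at 30 fps has 300 frames.
print(sampled_frame_indices(300, frame_interval=60, max_frames=5))
# → [0, 60, 120, 180, 240]
```

Note that a clip shorter than `frame_interval * max_frames` frames yields fewer frames than `max_frames`, so the total number of extracted frames is an upper bound rather than a guarantee.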

In [None]:
# Number of videos to extract per class (kept small to save compute units)
NUM_VIDEOS_PER_CLASS = 30  # adjust as needed
FRAMES_PER_VIDEO = 5       # frames per video
FRAME_INTERVAL = 60        # keep every 60th frame (one every 2 seconds @ 30fps)

OUTPUT_BASE = f'{BASE_PATH}/extracted_frames'

print(f"⚙️  Configuration:")
print(f"  Videos per class: {NUM_VIDEOS_PER_CLASS}")
print(f"  Frames per video: {FRAMES_PER_VIDEO}")
print(f"  Total frames: {NUM_VIDEOS_PER_CLASS * FRAMES_PER_VIDEO * 4} frames")
print(f"\n🎬 Starting extraction...\n")

# Extract frames
extraction_stats = {}

# 1. Original (Real)
print("[1/4] Extracting REAL frames...")
real_output = f'{OUTPUT_BASE}/real'
total_frames = 0
for video_path in tqdm(original_videos[:NUM_VIDEOS_PER_CLASS], desc="Real"):
    frames = extract_frames_from_video(video_path, real_output, FRAME_INTERVAL, FRAMES_PER_VIDEO)
    total_frames += frames
extraction_stats['real'] = total_frames
print(f"   ✅ Extracted {total_frames} frames\n")

# 2. Deepfakes (Fake)
print("[2/4] Extracting DEEPFAKES frames...")
deepfakes_output = f'{OUTPUT_BASE}/fake'
total_frames = 0
for video_path in tqdm(deepfakes_videos[:NUM_VIDEOS_PER_CLASS], desc="Deepfakes"):
    frames = extract_frames_from_video(video_path, deepfakes_output, FRAME_INTERVAL, FRAMES_PER_VIDEO)
    total_frames += frames
extraction_stats['deepfakes'] = total_frames
print(f"   ✅ Extracted {total_frames} frames\n")

# 3. FaceSwap (Fake)
print("[3/4] Extracting FACESWAP frames...")
for video_path in tqdm(faceswap_videos[:NUM_VIDEOS_PER_CLASS], desc="FaceSwap"):
    frames = extract_frames_from_video(video_path, deepfakes_output, FRAME_INTERVAL, FRAMES_PER_VIDEO)
    total_frames += frames
extraction_stats['faceswap'] = total_frames - extraction_stats['deepfakes']
print(f"   ✅ Extracted {extraction_stats['faceswap']} frames\n")

# 4. Face2Face (Fake)
print("[4/4] Extracting FACE2FACE frames...")
for video_path in tqdm(face2face_videos[:NUM_VIDEOS_PER_CLASS], desc="Face2Face"):
    frames = extract_frames_from_video(video_path, deepfakes_output, FRAME_INTERVAL, FRAMES_PER_VIDEO)
    total_frames += frames
extraction_stats['face2face'] = total_frames - extraction_stats['deepfakes'] - extraction_stats['faceswap']
print(f"   ✅ Extracted {extraction_stats['face2face']} frames\n")

extraction_stats['fake_total'] = total_frames

print("\n" + "="*50)
print("📊 Extraction Summary:")
print(f"  Real frames:       {extraction_stats['real']}")
print(f"  Fake frames:       {extraction_stats['fake_total']}")
print(f"    - Deepfakes:     {extraction_stats['deepfakes']}")
print(f"    - FaceSwap:      {extraction_stats['faceswap']}")
print(f"    - Face2Face:     {extraction_stats['face2face']}")
print(f"  Total frames:      {extraction_stats['real'] + extraction_stats['fake_total']}")
print("="*50)
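
The counters above only record how many frames were written, not how many files ended up on disk. Because all three manipulation methods share one `fake` folder, and FaceForensics++ uses the same video file names for each method, a lower file count than expected would suggest that frames were overwritten. The following is a minimal sketch of such a sanity check; `count_frames_on_disk` and `check_extraction` are hypothetical helper names.

```python
from pathlib import Path


def count_frames_on_disk(frame_dir, pattern="*.jpg"):
    """Count the files matching pattern directly inside frame_dir."""
    frame_dir = Path(frame_dir)
    if not frame_dir.is_dir():
        return 0
    return sum(1 for _ in frame_dir.glob(pattern))


def check_extraction(frame_dir, expected):
    """Return True when the number of files on disk matches the expected count."""
    found = count_frames_on_disk(frame_dir)
    if found != expected:
        print(f"⚠️  {frame_dir}: expected {expected} frames, found {found}")
    return found == expected
```

In this notebook it could be called as `check_extraction(f'{OUTPUT_BASE}/fake', extraction_stats['fake_total'])`, assuming the output folder was empty before extraction.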

## 👤 Crop Faces with MTCNN

In [None]:
# Load the face detector
print("📥 Loading MTCNN face detector...")
face_detector = MTCNN(
    keep_all=False,
    device='cpu',  # run on CPU to avoid CUDA errors
    post_process=False,
    min_face_size=80,  # ignore faces smaller than 80 px
    thresholds=[0.6, 0.7, 0.7]  # per-stage detection thresholds (the facenet-pytorch defaults)
)
print("✅ MTCNN ready")

In [None]:
def crop_face_from_frame(frame_path, output_path, face_detector, min_confidence=0.90):
    """
    Crop the most confident face from a frame.
    
    Returns:
        True if the crop succeeded, False if no face was found or the confidence was too low
    """
    try:
        img = Image.open(frame_path).convert('RGB')
        
        # Detect faces
        boxes, probs = face_detector.detect(img)
        
        if boxes is None or len(boxes) == 0:
            return False
        
        # Pick the face with the highest confidence
        best_idx = np.argmax(probs)
        confidence = probs[best_idx]
        
        if confidence < min_confidence:
            return False
        
        # Crop the face with padding
        box = boxes[best_idx]
        x1, y1, x2, y2 = map(int, box)
        
        # Add padding, clamped to the image bounds
        padding = 30
        w, h = img.size
        x1 = max(0, x1 - padding)
        y1 = max(0, y1 - padding)
        x2 = min(w, x2 + padding)
        y2 = min(h, y2 + padding)
        
        # Crop and save
        face = img.crop((x1, y1, x2, y2))
        face.save(output_path, quality=95)
        
        return True
        
    except Exception as e:
        # Report the error instead of silently counting it as "no face"
        print(f"⚠️  Failed to process {frame_path}: {e}")
        return False

print("✅ Crop function ready")
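
The padding step is the part of the crop that is easiest to get wrong near the image border, so it helps to look at it in isolation. The sketch below reproduces the same arithmetic as a standalone function; `pad_and_clamp_box` is a hypothetical name used only for illustration.

```python
def pad_and_clamp_box(box, image_size, padding=30):
    """Expand a (x1, y1, x2, y2) box by padding pixels, clamped to the image bounds."""
    x1, y1, x2, y2 = (int(v) for v in box)  # truncate float coordinates, as map(int, box) does
    width, height = image_size
    return (
        max(0, x1 - padding),
        max(0, y1 - padding),
        min(width, x2 + padding),
        min(height, y2 + padding),
    )


# A face close to the left and right edges of a 220x400 image.
print(pad_and_clamp_box((10.7, 50.2, 200.9, 300.1), (220, 400)))
# → (0, 20, 220, 330)
```

A fixed 30-pixel padding is proportionally larger for small faces than for large ones; padding by a fraction of the box size is a common alternative, though the notebook itself uses the fixed value.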

In [None]:
# Crop faces
CROPPED_BASE = f'{BASE_PATH}/cropped_faces'
MIN_CONFIDENCE = 0.90  # adjustable

print(f"⚙️  Face Detection Confidence: {MIN_CONFIDENCE}")
print(f"\n✂️  Starting face cropping...\n")

crop_stats = {'real': 0, 'fake': 0, 'failed': 0}

# Crop Real faces
print("[1/2] Cropping REAL faces...")
real_frames = glob.glob(f'{OUTPUT_BASE}/real/*.jpg')
real_cropped_dir = f'{CROPPED_BASE}/real'
os.makedirs(real_cropped_dir, exist_ok=True)

for frame_path in tqdm(real_frames, desc="Real"):
    output_path = os.path.join(real_cropped_dir, Path(frame_path).name)
    if crop_face_from_frame(frame_path, output_path, face_detector, MIN_CONFIDENCE):
        crop_stats['real'] += 1
    else:
        crop_stats['failed'] += 1

print(f"   ✅ Cropped {crop_stats['real']} real faces\n")

# Crop Fake faces
print("[2/2] Cropping FAKE faces...")
fake_frames = glob.glob(f'{OUTPUT_BASE}/fake/*.jpg')
fake_cropped_dir = f'{CROPPED_BASE}/fake'
os.makedirs(fake_cropped_dir, exist_ok=True)

for frame_path in tqdm(fake_frames, desc="Fake"):
    output_path = os.path.join(fake_cropped_dir, Path(frame_path).name)
    if crop_face_from_frame(frame_path, output_path, face_detector, MIN_CONFIDENCE):
        crop_stats['fake'] += 1
    else:
        crop_stats['failed'] += 1

print(f"   ✅ Cropped {crop_stats['fake']} fake faces\n")

print("\n" + "="*50)
print("📊 Face Cropping Summary:")
print(f"  Real faces:    {crop_stats['real']}")
print(f"  Fake faces:    {crop_stats['fake']}")
print(f"  Total faces:   {crop_stats['real'] + crop_stats['fake']}")
print(f"  Failed:        {crop_stats['failed']} (no face or low confidence)")
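total_attempts = crop_stats['real'] + crop_stats['fake'] + crop_stats['failed']
# Guard against division by zero when no frames were found
success_rate = (crop_stats['real'] + crop_stats['fake']) / total_attempts * 100 if total_attempts else 0.0
print(f"  Success rate:  {success_rate:.1f}%")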
print("="*50)

## 🔍 Sample Cropped Images

In [None]:
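# Show sample images
import random

# Sample from the files actually on disk, so the cell also works after a kernel restart
real_cropped = glob.glob(f'{CROPPED_BASE}/real/*.jpg')
fake_cropped = glob.glob(f'{CROPPED_BASE}/fake/*.jpg')
real_samples = random.sample(real_cropped, min(5, len(real_cropped)))
fake_samples = random.sample(fake_cropped, min(5, len(fake_cropped)))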

fig, axes = plt.subplots(2, 5, figsize=(15, 6))

# Real samples
for idx, img_path in enumerate(real_samples):
    img = Image.open(img_path)
    axes[0, idx].imshow(img)
    axes[0, idx].set_title('REAL', color='green', fontweight='bold')
    axes[0, idx].axis('off')

# Fake samples
for idx, img_path in enumerate(fake_samples):
    img = Image.open(img_path)
    axes[1, idx].imshow(img)
    axes[1, idx].set_title('FAKE', color='red', fontweight='bold')
    axes[1, idx].axis('off')

plt.suptitle('Sample Cropped Faces', fontsize=16, fontweight='bold')
plt.tight_layout()
plt.savefig('sample_cropped_faces.png', dpi=150, bbox_inches='tight')
plt.show()

## 🤖 Load the Models

**Upload the model weights here:**
1. `xception_best.pth`
2. `f3net_best.pth`
3. `effort_clip_L14_trainOn_FaceForensic.pth`

In [None]:
# Upload the weights (if they are not already in Drive)
from google.colab import files

print("📁 Please upload 3 model weight files:")
print("1. xception_best.pth")
print("2. f3net_best.pth")
print("3. effort_clip_L14_trainOn_FaceForensic.pth")
print("\nIf the files are already in Drive, skip this cell")

# uploaded = files.upload()

In [None]:
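# Copy the project into /content (if it is not already there)
import os

# Check the absolute path: the working directory is the Drive project folder, not /content
if not os.path.exists('/content/deepfake-detection'):
    print("📥 Copying project from Drive...")
    # Copy the folder from Drive (instead of cloning)
    !cp -r /content/drive/MyDrive/deepfake-detection /content/
    print("✅ Project ready")
else:
    print("✅ Project already exists")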

In [None]:
# Set up paths and load the models
import sys
sys.path.insert(0, '/content/deepfake-detection/backend/app')

# Set the weights path
WEIGHTS_PATH = '/content'  # or wherever you keep the weights

# Import model classes (according to your project structure)
# Assumes the project structure created earlier

print("📥 Loading models...")
# Load the models here (use the code from the previous notebook)
# ...

print("✅ Models loaded")

## 🧪 Test the Models

**(Uses the code from `Model_Weight_Optimization.ipynb`, created earlier)**

Now that the cropped faces are available, reuse the code from the previous notebook,
changing only the test data path to:
```python
TEST_DATA_PATH = '/content/drive/MyDrive/DeepfakeProject/datasets/cropped_faces'
```
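
The evaluation code itself lives in the other notebook and is not shown here. As a rough sketch of how the `cropped_faces` folder could be turned into a labelled test set, the helper below walks the `real` and `fake` subfolders created above. The function name `list_labelled_faces` and the label convention (0 = real, 1 = fake) are assumptions and may differ from what `Model_Weight_Optimization.ipynb` expects.

```python
from pathlib import Path


def list_labelled_faces(root, real_dir="real", fake_dir="fake"):
    """Return a sorted list of (image_path, label) pairs; 0 = real, 1 = fake (assumed)."""
    samples = []
    for label, sub in ((0, real_dir), (1, fake_dir)):
        # Sorting keeps the sample order stable between runs.
        for path in sorted((Path(root) / sub).glob("*.jpg")):
            samples.append((str(path), label))
    return samples
```

Calling `list_labelled_faces(TEST_DATA_PATH)` would then give one entry per cropped face, ready to be passed through each model.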

## 🎯 Find the Optimal Weights

**(Uses the same code as the previous notebook)**
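
Since that code is not reproduced in this notebook, the following is only an illustrative sketch of one common approach: a grid search over three non-negative weights that sum to 1, scored by accuracy at a 0.5 threshold. The function names, the choice of accuracy as the metric, and the 0.1 step size are all assumptions; the previous notebook may use a different search or metric.

```python
from itertools import product


def ensemble_scores(model_scores, weights):
    """Weighted sum of per-model fake probabilities, one value per sample."""
    return [sum(w * s for w, s in zip(weights, row)) for row in zip(*model_scores)]


def accuracy(scores, labels, threshold=0.5):
    """Fraction of samples where (score >= threshold) matches the label (1 = fake)."""
    correct = sum((s >= threshold) == bool(y) for s, y in zip(scores, labels))
    return correct / len(labels)


def grid_search_weights(model_scores, labels, step=0.1):
    """Try every weight triple on a grid over the simplex; return (best_weights, best_accuracy)."""
    n = round(1 / step)
    best_weights, best_acc = None, -1.0
    for i, j in product(range(n + 1), repeat=2):
        if i + j > n:
            continue  # the third weight would be negative
        weights = (i / n, j / n, (n - i - j) / n)
        acc = accuracy(ensemble_scores(model_scores, weights), labels)
        if acc > best_acc:
            best_weights, best_acc = weights, acc
    return best_weights, best_acc
```

Here `model_scores` is a list of three equally long lists (for example the Xception, F3Net, and Effort-CLIP outputs, in that order) and `labels` holds the ground truth. Weights tuned this way should be checked on held-out data, because they are fitted to the same frames they are scored on.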

## 💾 Save Results

In [None]:
# Download the results to your computer
from google.colab import files

print("📥 Downloading results...")
files.download('config_optimized.json')
files.download('weight_optimization_report.json')
files.download('individual_vs_ensemble.png')
files.download('sample_cropped_faces.png')

print("✅ Done! Files downloaded to your computer")
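
`files.download` sends the files to the browser. To also keep a copy in Drive, a small helper like the one below could be used. This is a sketch only: `copy_results` is a hypothetical name, and the destination folder is whatever you choose.

```python
import shutil
from pathlib import Path


def copy_results(filenames, dest_dir, src_dir="."):
    """Copy the listed files from src_dir to dest_dir; return the names that were copied."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in filenames:
        src = Path(src_dir) / name
        if src.is_file():
            shutil.copy2(src, dest / name)
            copied.append(name)
        else:
            # Missing files are reported and skipped rather than raising.
            print(f"⚠️  Skipped missing file: {name}")
    return copied
```

For example, `copy_results(['config_optimized.json', 'weight_optimization_report.json'], '/content/drive/MyDrive/DeepfakeProject/results')`, where the `results` folder name is an assumption.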

## 📊 Test Results Summary

### Dataset:
- **Source:** FaceForensics++ (c40 compression)
- **Real videos:** 300 (30 used)
- **Fake videos:** 300 (90 used: 30 Deepfakes + 30 FaceSwap + 30 Face2Face)
- **Total frames extracted:** ~600 frames
- **Faces detected:** ~XXX faces

### Model Performance:
- **Xception:** Accuracy: X.XXX, F1: X.XXX
- **F3Net:** Accuracy: X.XXX, F1: X.XXX
- **Effort-CLIP:** Accuracy: X.XXX, F1: X.XXX

### Ensemble (Optimized):
- **Weights:** Xception: X.XX, F3Net: X.XX, Effort: X.XX
- **Accuracy:** X.XXX
- **F1 Score:** X.XXX
- **AUC:** X.XXX

### ✅ Next Steps:
1. Download `config_optimized.json`
2. Replace `backend/app/config.json`
3. Restart backend server
4. Test with real-world images!