# Dog Keypoints Training + Rule-Based Emotion Classification

**Approach, following QUICK_IMPLEMENTATION_PLAN.md:**

| Component | Source | Description |
|-----------|--------|-------------|
| Keypoints Model | Trained on DogFLW | CNN → heatmaps → 46 DogFLW landmarks, mapped to 20 project keypoints |
| Action Units | Geometry | 12 AUs computed from the keypoints |
| Emotion | Rule-based | Literature-derived weights (DogFACS) |

**Required Kaggle dataset:**
- `lovodkin/dogflw` - keypoint annotations (3855 images)

**After training:**
- `keypoints_dogflw.pt` - keypoints model, available for download

**Emotion classification does NOT require training** - it uses rules derived from DogFACS.

In [None]:
!pip install -q timm albumentations

In [None]:
import os
import json
import math
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader, random_split
from pathlib import Path
from tqdm.auto import tqdm
import matplotlib.pyplot as plt
from PIL import Image
import cv2
import timm
from torchvision import transforms
import albumentations as A
from albumentations.pytorch import ToTensorV2

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'Device: {device}')
print(f'PyTorch: {torch.__version__}')

## Configuration

In [None]:
# === PATHS ===
DOGFLW_PATH = '/kaggle/input/dogflw/DogFLW'

# === KEYPOINTS ===
NUM_KEYPOINTS_DOGFLW = 46
NUM_KEYPOINTS_PROJECT = 20
HEATMAP_SIZE = 64
IMAGE_SIZE = 256

# === ACTION UNITS ===
NUM_ACTION_UNITS = 12
KEYPOINTS_FEATURES = NUM_KEYPOINTS_PROJECT * 3  # 60

# === EMOTIONS (rule-based, no training) ===
EMOTION_CLASSES = ['happy', 'sad', 'angry', 'fearful', 'relaxed', 'neutral']
NUM_EMOTIONS = 6
NEUTRAL_THRESHOLD = 0.35

# === TRAINING ===
BATCH_SIZE_KP = 16
EPOCHS_KP = 50
LR_KP = 0.001

print('Config loaded!')
print(f'  Keypoints: {NUM_KEYPOINTS_DOGFLW} → {NUM_KEYPOINTS_PROJECT}')
print(f'  Action Units: {NUM_ACTION_UNITS}')
print(f'  Emotions: {NUM_EMOTIONS} (rule-based)')

## Keypoint Mapping (DogFLW 46 → Project 20)
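The mapping in this section selects 20 of the 46 DogFLW landmarks by index. A minimal sketch of the same idea, assuming a toy three-entry mapping (the helper `build_gather_index` and the toy indices are illustrative and not part of the notebook), checks that every target slot is filled exactly once and then gathers all points with a single NumPy indexing operation:

```python
import numpy as np

def build_gather_index(src_to_dst, num_dst):
    """Turn a {source_index: target_index} dict into a gather array."""
    dst_to_src = {dst: src for src, dst in src_to_dst.items()}
    if len(dst_to_src) != len(src_to_dst):
        raise ValueError('two source landmarks map to the same target slot')
    missing = sorted(set(range(num_dst)) - set(dst_to_src))
    if missing:
        raise ValueError(f'target slots without a source landmark: {missing}')
    return np.array([dst_to_src[i] for i in range(num_dst)])

# Toy example: 10 source points, 3 target slots (hypothetical indices)
toy_mapping = {5: 0, 2: 1, 7: 2}
gather_index = build_gather_index(toy_mapping, num_dst=3)

source_points = np.arange(20, dtype=np.float32).reshape(10, 2)  # (10, 2) array of (x, y)
target_points = source_points[gather_index]                     # (3, 2), gathered in one step
print(gather_index, target_points.shape)
```

Validating the mapping once up front turns a silent off-by-one in the index table into an immediate error.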

In [None]:
KEYPOINT_NAMES = [
    "left_eye", "right_eye", "nose",
    "left_ear_base", "right_ear_base", "left_ear_tip", "right_ear_tip",
    "left_mouth_corner", "right_mouth_corner", "upper_lip", "lower_lip", "chin",
    "left_cheek", "right_cheek", "forehead",
    "left_eyebrow", "right_eyebrow",
    "muzzle_top", "muzzle_left", "muzzle_right",
]

# DogFLW (46) → Project (20) mapping
# Based on DogFLW landmark definitions
DOGFLW_TO_PROJECT = {
    0: 0,   # left_eye
    1: 1,   # right_eye  
    14: 2,  # nose
    32: 3,  # left_ear_base (approximate)
    36: 4,  # right_ear_base (approximate)
    34: 5,  # left_ear_tip
    38: 6,  # right_ear_tip
    20: 7,  # left_mouth_corner
    24: 8,  # right_mouth_corner
    22: 9,  # upper_lip (mouth top)
    26: 10, # lower_lip (mouth bottom)
    28: 11, # chin
    4: 12,  # left_cheek
    8: 13,  # right_cheek
    40: 14, # forehead (top of head)
    42: 15, # left_eyebrow
    44: 16, # right_eyebrow
    16: 17, # muzzle_top
    6: 18,  # muzzle_left
    10: 19, # muzzle_right
}

PROJECT_TO_DOGFLW = {v: k for k, v in DOGFLW_TO_PROJECT.items()}

def map_keypoints_46_to_20(kp46):
    """Convert 46 DogFLW keypoints to the 20 project keypoints."""
    kp20 = []
    for i in range(NUM_KEYPOINTS_PROJECT):
        dogflw_idx = PROJECT_TO_DOGFLW[i]
        kp20.append(kp46[dogflw_idx])
    return np.array(kp20)

print(f'Keypoint mapping: {NUM_KEYPOINTS_DOGFLW} → {NUM_KEYPOINTS_PROJECT}')

---
# STAGE 1: Training the Keypoints Model
---

## 1.1 Keypoints Dataset
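The dataset in this section encodes each landmark as a 2D Gaussian on a 64×64 heatmap. A minimal standalone sketch of that encoding, assuming a 256×256 input and `sigma = 2.0` as in the configuration (the function name `gaussian_heatmap` is illustrative):

```python
import numpy as np

def gaussian_heatmap(x, y, image_size=256, heatmap_size=64, sigma=2.0):
    """Render one landmark at image coordinates (x, y) as a Gaussian heatmap."""
    scale = heatmap_size / image_size
    x_hm, y_hm = x * scale, y * scale
    xx, yy = np.meshgrid(np.arange(heatmap_size), np.arange(heatmap_size))
    return np.exp(-((xx - x_hm) ** 2 + (yy - y_hm) ** 2) / (2 * sigma ** 2))

hm = gaussian_heatmap(x=128, y=64)
peak_row, peak_col = np.unravel_index(hm.argmax(), hm.shape)
print(hm.shape, (peak_row, peak_col), hm.max())
```

Rows index y and columns index x, so the landmark at image position (128, 64) peaks at row 16, column 32.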

In [None]:
class DogFLWDataset(Dataset):
    """Dataset for DogFLW keypoints."""
    
    def __init__(self, root_path, split='train', image_size=256, heatmap_size=64):
        self.image_size = image_size
        self.heatmap_size = heatmap_size
        self.sigma = 2.0  # Gaussian sigma for heatmaps
        
        images_dir = os.path.join(root_path, split, 'images')
        labels_dir = os.path.join(root_path, split, 'labels')
        
        self.samples = []
        for label_file in os.listdir(labels_dir):
            if not label_file.endswith('.json'):
                continue
            
            name = label_file.replace('.json', '')
            # Try different image extensions
            for ext in ['.jpg', '.jpeg', '.png', '.JPEG']:
                img_path = os.path.join(images_dir, name + ext)
                if os.path.exists(img_path):
                    label_path = os.path.join(labels_dir, label_file)
                    self.samples.append((img_path, label_path))
                    break
        
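        # NOTE: A.HorizontalFlip is deliberately left out. Flipping mirrors the
        # coordinates but not the landmark order, so left/right landmarks would be
        # supervised on the wrong heatmap channels unless they are swapped as well.
        self.transform = A.Compose([
            A.Resize(image_size, image_size),
            A.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1, p=0.5),
            A.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
            ToTensorV2(),
        ], keypoint_params=A.KeypointParams(format='xy', remove_invisible=False))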
        
        self.transform_val = A.Compose([
            A.Resize(image_size, image_size),
            A.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
            ToTensorV2(),
        ], keypoint_params=A.KeypointParams(format='xy', remove_invisible=False))
        
        self.is_train = (split == 'train')
        print(f'{split}: {len(self.samples)} samples')
    
    def __len__(self):
        return len(self.samples)
    
    def _generate_heatmap(self, keypoints, h, w):
        """Generate one Gaussian heatmap per keypoint."""
        num_kp = len(keypoints)
        heatmaps = np.zeros((num_kp, self.heatmap_size, self.heatmap_size), dtype=np.float32)
        
        for i, (x, y) in enumerate(keypoints):
            # Scale to heatmap size
            x_hm = x * self.heatmap_size / w
            y_hm = y * self.heatmap_size / h
            
            if 0 <= x_hm < self.heatmap_size and 0 <= y_hm < self.heatmap_size:
                # Generate Gaussian
                xx, yy = np.meshgrid(np.arange(self.heatmap_size), np.arange(self.heatmap_size))
                heatmaps[i] = np.exp(-((xx - x_hm)**2 + (yy - y_hm)**2) / (2 * self.sigma**2))
        
        return heatmaps
    
    def __getitem__(self, idx):
        img_path, label_path = self.samples[idx]
        
        # Load image
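        image = cv2.imread(img_path)
        if image is None:
            # cv2.imread returns None instead of raising on unreadable files
            raise FileNotFoundError(f'Could not read image: {img_path}')
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)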
        h, w = image.shape[:2]
        
        # Load keypoints
        with open(label_path, 'r') as f:
            data = json.load(f)
        keypoints = [(float(x), float(y)) for x, y in data['landmarks']]
        
        # Apply transforms
        transform = self.transform if self.is_train else self.transform_val
        transformed = transform(image=image, keypoints=keypoints)
        
        image_tensor = transformed['image']
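        # Copy into a plain list of (x, y) tuples so that padding works
        # regardless of the container type the transform returns
        kp_transformed = [tuple(kp[:2]) for kp in transformed['keypoints']]

        # Pad keypoints if some were removed
        while len(kp_transformed) < NUM_KEYPOINTS_DOGFLW:
            kp_transformed.append((0.0, 0.0))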
        
        # Generate heatmaps
        heatmaps = self._generate_heatmap(kp_transformed, self.image_size, self.image_size)
        heatmaps_tensor = torch.from_numpy(heatmaps)
        
        return image_tensor, heatmaps_tensor

# Test dataset
train_kp_dataset = DogFLWDataset(DOGFLW_PATH, 'train')
test_kp_dataset = DogFLWDataset(DOGFLW_PATH, 'test')

In [None]:
# Visualize a sample
img, hm = train_kp_dataset[0]

fig, axes = plt.subplots(1, 3, figsize=(15, 5))

# Image
img_show = img.permute(1, 2, 0).numpy()
img_show = img_show * [0.229, 0.224, 0.225] + [0.485, 0.456, 0.406]
img_show = np.clip(img_show, 0, 1)
axes[0].imshow(img_show)
axes[0].set_title('Image')

# Heatmap sum
axes[1].imshow(hm.sum(0).numpy(), cmap='hot')
axes[1].set_title('Heatmaps (sum)')

# Single heatmap
axes[2].imshow(hm[0].numpy(), cmap='hot')
axes[2].set_title('Keypoint 0 heatmap')

plt.tight_layout()
plt.show()

## 1.2 Keypoints Model (SimpleBaseline)
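The head in this section upsamples the backbone features with three transposed convolutions. The spatial-size arithmetic can be checked without building the model. The sketch below assumes a total backbone stride of 32, the standard value for the final ResNet stage, and applies the `ConvTranspose2d` output-size formula for `kernel_size=4, stride=2, padding=1`:

```python
def conv_transpose_out(size, kernel=4, stride=2, padding=1, output_padding=0, dilation=1):
    """Output size of one ConvTranspose2d layer along a single spatial dimension."""
    return (size - 1) * stride - 2 * padding + dilation * (kernel - 1) + output_padding + 1

image_size = 256
backbone_stride = 32  # assumption: ResNet final-stage stride
size = image_size // backbone_stride

sizes = [size]
for _ in range(3):  # three deconv layers, each doubling the resolution
    size = conv_transpose_out(size)
    sizes.append(size)
print(sizes)
```

The final size matches `HEATMAP_SIZE = 64`, which is why predicted and target heatmaps can be compared directly by the loss.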

In [None]:
class SimpleBaselineNet(nn.Module):
    """
    Simple Baseline for keypoint detection.
    ResNet backbone + Deconv head → Heatmaps
    """
    
    def __init__(self, num_keypoints=46, backbone='resnet34'):
        super().__init__()
        
        # Backbone
        self.backbone = timm.create_model(
            backbone, 
            pretrained=True, 
            features_only=True,
            out_indices=[-1]
        )
        
        # Get backbone output channels
        with torch.no_grad():
            dummy = torch.zeros(1, 3, 256, 256)
            feat = self.backbone(dummy)[-1]
            in_channels = feat.shape[1]
        
        # Deconv head
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(in_channels, 256, 4, 2, 1, bias=False),
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True),
            
            nn.ConvTranspose2d(256, 256, 4, 2, 1, bias=False),
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True),
            
            nn.ConvTranspose2d(256, 256, 4, 2, 1, bias=False),
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True),
        )
        
        # Output layer
        self.head = nn.Conv2d(256, num_keypoints, 1)
        
    def forward(self, x):
        features = self.backbone(x)[-1]
        x = self.deconv(features)
        heatmaps = self.head(x)
        return heatmaps

# Create model
kp_model = SimpleBaselineNet(NUM_KEYPOINTS_DOGFLW, 'resnet34').to(device)

# Test forward pass
with torch.no_grad():
    dummy = torch.randn(1, 3, 256, 256).to(device)
    out = kp_model(dummy)
    print(f'Input: {dummy.shape}')
    print(f'Output: {out.shape}')
    print(f'Parameters: {sum(p.numel() for p in kp_model.parameters()):,}')

## 1.3 Training the Keypoints Model
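When reading the loss values produced in this section, it helps to know what a trivial predictor scores. Gaussian targets are almost entirely zeros, so the MSE of an all-zero prediction is already small. The sketch below estimates that baseline for one landmark under the notebook's settings (64×64 heatmap, `sigma = 2.0`); the landmark position is arbitrary:

```python
import numpy as np

heatmap_size, sigma = 64, 2.0
xx, yy = np.meshgrid(np.arange(heatmap_size), np.arange(heatmap_size))
target = np.exp(-((xx - 32) ** 2 + (yy - 16) ** 2) / (2 * sigma ** 2))

# MSE of predicting zeros everywhere
zero_baseline_mse = float(np.mean((np.zeros_like(target) - target) ** 2))
print(f'{zero_baseline_mse:.6f}')
```

The value is close to π·sigma² / 64² ≈ 0.0031, so a validation loss is only informative once it drops well below that level.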

In [None]:
# DataLoaders
train_kp_loader = DataLoader(train_kp_dataset, batch_size=BATCH_SIZE_KP, shuffle=True, num_workers=2, pin_memory=True)
test_kp_loader = DataLoader(test_kp_dataset, batch_size=BATCH_SIZE_KP, shuffle=False, num_workers=2, pin_memory=True)

# Loss and optimizer
criterion_kp = nn.MSELoss()
optimizer_kp = torch.optim.Adam(kp_model.parameters(), lr=LR_KP)
scheduler_kp = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer_kp, mode='min', factor=0.5, patience=5)

print(f'Train batches: {len(train_kp_loader)}')
print(f'Test batches: {len(test_kp_loader)}')

In [None]:
def train_keypoints_epoch(model, loader, criterion, optimizer):
    model.train()
    total_loss = 0
    
    for images, heatmaps in tqdm(loader, desc='Train', leave=False):
        images = images.to(device)
        heatmaps = heatmaps.to(device)
        
        optimizer.zero_grad()
        pred = model(images)
        loss = criterion(pred, heatmaps)
        loss.backward()
        optimizer.step()
        
        total_loss += loss.item()
    
    return total_loss / len(loader)


def eval_keypoints(model, loader, criterion):
    model.eval()
    total_loss = 0
    
    with torch.no_grad():
        for images, heatmaps in tqdm(loader, desc='Eval', leave=False):
            images = images.to(device)
            heatmaps = heatmaps.to(device)
            
            pred = model(images)
            loss = criterion(pred, heatmaps)
            total_loss += loss.item()
    
    return total_loss / len(loader)

In [None]:
# Training loop
print(f'\n{"="*60}')
print(f'STAGE 1: Training the keypoints model ({EPOCHS_KP} epochs)')
print(f'{"="*60}\n')

best_kp_loss = float('inf')
kp_history = {'train_loss': [], 'val_loss': []}

for epoch in range(EPOCHS_KP):
    train_loss = train_keypoints_epoch(kp_model, train_kp_loader, criterion_kp, optimizer_kp)
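    # NOTE: the test split doubles as the validation set (scheduler and checkpoint
    # selection), so best_kp_loss is an optimistic estimate
    val_loss = eval_keypoints(kp_model, test_kp_loader, criterion_kp)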
    
    scheduler_kp.step(val_loss)
    
    kp_history['train_loss'].append(train_loss)
    kp_history['val_loss'].append(val_loss)
    
    if val_loss < best_kp_loss:
        best_kp_loss = val_loss
        torch.save(kp_model.state_dict(), 'keypoints_dogflw.pt')
    
    if (epoch + 1) % 5 == 0 or epoch == 0:
        lr = optimizer_kp.param_groups[0]['lr']
        print(f'Epoch {epoch+1:3d}/{EPOCHS_KP} | Train: {train_loss:.6f} | Val: {val_loss:.6f} | Best: {best_kp_loss:.6f} | LR: {lr:.6f}')

print(f'\nKeypoints training done! Best loss: {best_kp_loss:.6f}')
print('Saved: keypoints_dogflw.pt')

In [None]:
# Plot keypoints training
plt.figure(figsize=(10, 4))
plt.plot(kp_history['train_loss'], label='Train')
plt.plot(kp_history['val_loss'], label='Val')
plt.xlabel('Epoch')
plt.ylabel('MSE Loss')
plt.title('Keypoints Model Training')
plt.legend()
plt.grid(True)
plt.savefig('keypoints_training.png', dpi=150)
plt.show()

## 1.4 Keypoints Visualization
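Decoding in this section takes the arg-max of each heatmap channel in a Python loop. A vectorized variant is sketched below for a `(K, H, W)` NumPy array; the function name `decode_heatmaps_vectorized` is illustrative, and like the loop version it returns coordinates quantized to heatmap cells:

```python
import numpy as np

def decode_heatmaps_vectorized(heatmaps, image_size):
    """Arg-max decode of a (K, H, W) heatmap stack into image-space (x, y) plus confidence."""
    img_h, img_w = image_size
    num_kp, hm_h, hm_w = heatmaps.shape
    flat = heatmaps.reshape(num_kp, -1)
    flat_idx = flat.argmax(axis=1)
    rows, cols = np.unravel_index(flat_idx, (hm_h, hm_w))
    xs = cols * img_w / hm_w
    ys = rows * img_h / hm_h
    conf = flat.max(axis=1)
    return np.stack([xs, ys], axis=1), conf

# Synthetic check: two channels with a single peak each
demo = np.zeros((2, 64, 64), dtype=np.float32)
demo[0, 16, 32] = 0.9  # row 16, column 32
demo[1, 40, 8] = 0.5   # row 40, column 8
coords, conf = decode_heatmaps_vectorized(demo, (256, 256))
print(coords, conf)
```

With a 64×64 heatmap and a 256×256 image, each cell spans 4 pixels, which bounds the precision of arg-max decoding.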

In [None]:
# Load best model
kp_model.load_state_dict(torch.load('keypoints_dogflw.pt', map_location=device))
kp_model.eval()

def decode_heatmaps(heatmaps, original_size):
    """Decode heatmaps into keypoint coordinates."""
    h, w = original_size
    num_kp, hm_h, hm_w = heatmaps.shape
    
    keypoints = []
    confidences = []
    
    for k in range(num_kp):
        hm = heatmaps[k]
        idx = hm.argmax()
        y_hm, x_hm = idx // hm_w, idx % hm_w
        conf = float(hm.max())
        
        x = float(x_hm) * w / hm_w
        y = float(y_hm) * h / hm_h
        
        keypoints.append((x, y))
        confidences.append(conf)
    
    return keypoints, confidences

# Test on sample
img_tensor, gt_hm = test_kp_dataset[0]

with torch.no_grad():
    pred_hm = kp_model(img_tensor.unsqueeze(0).to(device))[0].cpu().numpy()

kp_pred, conf = decode_heatmaps(pred_hm, (IMAGE_SIZE, IMAGE_SIZE))

# Visualize
fig, axes = plt.subplots(1, 3, figsize=(15, 5))

img_show = img_tensor.permute(1, 2, 0).numpy()
img_show = img_show * [0.229, 0.224, 0.225] + [0.485, 0.456, 0.406]
img_show = np.clip(img_show, 0, 1)

# Image with keypoints
axes[0].imshow(img_show)
for i, ((x, y), c) in enumerate(zip(kp_pred, conf)):
    if c > 0.1:
        axes[0].plot(x, y, 'ro', markersize=3)
axes[0].set_title('Predicted Keypoints')

# GT heatmaps
axes[1].imshow(gt_hm.sum(0).numpy(), cmap='hot')
axes[1].set_title('GT Heatmaps')

# Pred heatmaps
axes[2].imshow(pred_hm.sum(0), cmap='hot')
axes[2].set_title('Pred Heatmaps')

plt.tight_layout()
plt.savefig('keypoints_prediction.png', dpi=150)
plt.show()

---
# STAGE 2: Rule-Based Emotion Classification (no training!)
---

Emotion classification is based on literature-derived weights from DogFACS (Mota-Rojas et al. 2021).

**No training required** - it uses the following formulas, where `blink = 1 - AU_eye_opening` and `nose_lick` is a proxy derived from `AU_mouth_open` and `AU_jaw_drop`:
- happy = AU_mouth_open * 0.35 + AU_ear_forward * 0.25 + ...
- sad = AU_ear_back * 0.40 + blink * 0.15 + ...
- angry = (AU_mouth_open + AU_jaw_drop)/2 * 0.30 + AU_lip_corner_pull * 0.25 + ...
- fearful = AU_ear_back * 0.30 + nose_lick * 0.25 + ...
- relaxed = (1 - mean_activation) * 0.50 + ...
- neutral = (1 - mean_activation) * 0.70 + ...
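As a worked example of the scoring formulas, the sketch below evaluates the full `happy` expression from the classifier in section 2.1 for one hand-picked set of activations. The input values are made up for illustration:

```python
# Hypothetical activations in [0, 1]
mouth_open, ear_forward, brow_raise = 0.8, 0.6, 0.2
ear_back, nose_lick = 0.4, 0.0

happy_score = (
    mouth_open * 0.35 + ear_forward * 0.25 + brow_raise * 0.15 +
    (1 - ear_back) * 0.15 + (1 - nose_lick) * 0.10
)
print(round(happy_score, 2))
```

The weights sum to 1.0, so the score stays in [0, 1] whenever the inputs do.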

## 2.1 Rule-Based Emotion Classifier
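The classifier in this section converts raw scores into probabilities with a temperature-scaled softmax and falls back to `neutral` below a threshold. A minimal standalone sketch of that step, assuming six scores in [0, 1], `temperature = 2.0`, and a threshold of 0.35 as in the notebook (the function name `pick_emotion` is illustrative):

```python
import math

def pick_emotion(scores, temperature=2.0, neutral_threshold=0.35):
    """Softmax over temperature-scaled scores with a neutral fallback."""
    exp_scores = {k: math.exp(v * temperature) for k, v in scores.items()}
    total = sum(exp_scores.values())
    probs = {k: v / total for k, v in exp_scores.items()}
    best = max(probs, key=probs.get)
    if probs[best] < neutral_threshold:
        best = 'neutral'
    return best, probs

names = ['happy', 'sad', 'angry', 'fearful', 'relaxed', 'neutral']

# Extreme case: one score at 1.0, all others at 0.0
extreme = {n: 0.0 for n in names}
extreme['happy'] = 1.0
label_extreme, probs_extreme = pick_emotion(extreme)

# Closely spaced scores (hypothetical values)
close = {'happy': 0.5, 'sad': 0.45, 'angry': 0.4,
         'fearful': 0.4, 'relaxed': 0.45, 'neutral': 0.4}
label_close, probs_close = pick_emotion(close)

print(label_extreme, round(probs_extreme['happy'], 3))
print(label_close, round(probs_close['happy'], 3))
```

With scores bounded by [0, 1] and a temperature of 2.0, the top probability cannot exceed e² / (e² + 5) ≈ 0.60, and closely spaced scores fall below 0.35, so the neutral fallback may fire often. Raising the temperature or lowering the threshold shifts that balance.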

In [None]:
# Keypoint indices for AU computation
KP = {name: i for i, name in enumerate(KEYPOINT_NAMES)}

ACTION_UNIT_NAMES = [
    "AU_brow_raise", "AU_ear_forward", "AU_ear_back", "AU_ear_asymmetry",
    "AU_eye_opening", "AU_mouth_open", "AU_lip_corner_pull", "AU_jaw_drop",
    "AU_nose_wrinkle", "AU_muzzle_width", "AU_face_elongation", "AU_eye_distance",
]

def extract_action_units(keypoints_flat):
    """
    Extract 12 Action Units from the 60 keypoint features.
    keypoints_flat: [x0, y0, v0, x1, y1, v1, ...] (60 values)
    """
    kp = keypoints_flat.reshape(20, 3)
    coords = kp[:, :2]
    vis = kp[:, 2]
    
    def dist(i, j):
        return float(np.sqrt(np.sum((coords[i] - coords[j]) ** 2)))
    
    def ang(i, j):
        d = coords[j] - coords[i]
        return math.atan2(d[1], d[0])
    
    # Reference distance (between eyes)
    eye_d = max(dist(KP['left_eye'], KP['right_eye']), 1e-6)
    
    # Compute Action Units
    brow_d = (dist(KP['left_eyebrow'], KP['left_eye']) + dist(KP['right_eyebrow'], KP['right_eye'])) / 2
    au_brow = np.clip((brow_d / eye_d - 0.2) / 0.3, 0, 1)
    
    l_ear = ang(KP['left_ear_base'], KP['left_ear_tip'])
    r_ear = ang(KP['right_ear_base'], KP['right_ear_tip'])
    au_ear_fwd = np.clip(1.0 - (abs(l_ear) + abs(r_ear)) / 2 / math.pi, 0, 1)
    au_ear_back = 1.0 - au_ear_fwd
    au_ear_asym = np.clip(abs(l_ear - r_ear) / math.pi, 0, 1)
    
    au_eye = np.clip((vis[KP['left_eye']] + vis[KP['right_eye']]) / 2, 0, 1)
    au_mouth = np.clip(dist(KP['upper_lip'], KP['lower_lip']) / eye_d / 0.3, 0, 1)
    
    l_lip = ang(KP['upper_lip'], KP['left_mouth_corner'])
    r_lip = ang(KP['upper_lip'], KP['right_mouth_corner'])
    au_smile = np.clip((l_lip - r_lip + math.pi) / (2 * math.pi), 0, 1)
    
    au_jaw = np.clip((dist(KP['nose'], KP['chin']) / eye_d - 0.5), 0, 1)
    au_nose = np.clip(1.0 - dist(KP['nose'], KP['upper_lip']) / eye_d / 0.5, 0, 1)
    au_muzzle = np.clip((dist(KP['muzzle_left'], KP['muzzle_right']) / eye_d - 0.3) / 0.5, 0, 1)
    au_elong = np.clip((dist(KP['forehead'], KP['chin']) / eye_d - 1.0) / 1.5, 0, 1)
    au_eye_d = np.clip(eye_d / 100.0, 0, 1)
    
    return {
        'AU_brow_raise': float(au_brow),
        'AU_ear_forward': float(au_ear_fwd),
        'AU_ear_back': float(au_ear_back),
        'AU_ear_asymmetry': float(au_ear_asym),
        'AU_eye_opening': float(au_eye),
        'AU_mouth_open': float(au_mouth),
        'AU_lip_corner_pull': float(au_smile),
        'AU_jaw_drop': float(au_jaw),
        'AU_nose_wrinkle': float(au_nose),
        'AU_muzzle_width': float(au_muzzle),
        'AU_face_elongation': float(au_elong),
        'AU_eye_distance': float(au_eye_d),
    }


def classify_emotion_rule_based(au_values, neutral_threshold=NEUTRAL_THRESHOLD):
    """
    Rule-based emotion classification from Action Units.
    Based on DogFACS (Mota-Rojas et al. 2021).
    
    NO TRAINING REQUIRED - uses literature-derived weights.
    """
    # Get AU values
    brow_raise = au_values.get('AU_brow_raise', 0.0)
    eye_opening = au_values.get('AU_eye_opening', 0.5)
    mouth_open = au_values.get('AU_mouth_open', 0.0)
    jaw_drop = au_values.get('AU_jaw_drop', 0.0)
    nose_wrinkle = au_values.get('AU_nose_wrinkle', 0.0)
    lip_corner_pull = au_values.get('AU_lip_corner_pull', 0.0)
    ear_forward = au_values.get('AU_ear_forward', 0.0)
    ear_back = au_values.get('AU_ear_back', 0.0)
    ear_asymmetry = au_values.get('AU_ear_asymmetry', 0.0)
    
    # Derived values
    blink = 1.0 - eye_opening
    nose_lick = max(0.0, mouth_open * 0.5 - jaw_drop * 0.3)  # Proxy
    
    all_au = [brow_raise, blink, mouth_open, jaw_drop, nose_wrinkle,
              lip_corner_pull, ear_forward, ear_back, ear_asymmetry, nose_lick]
    mean_activation = sum(all_au) / len(all_au)
    
    # === EMOTION SCORING (per QUICK_IMPLEMENTATION_PLAN.md) ===
    
    happy_score = (
        mouth_open * 0.35 + ear_forward * 0.25 + brow_raise * 0.15 +
        (1 - ear_back) * 0.15 + (1 - nose_lick) * 0.10
    )
    
    sad_score = (
        ear_back * 0.40 + blink * 0.15 + (1 - brow_raise) * 0.15 +
        (1 - mouth_open) * 0.15 + nose_lick * 0.15
    )
    
    angry_score = (
        ((mouth_open + jaw_drop) / 2) * 0.30 + lip_corner_pull * 0.25 +
        nose_wrinkle * 0.15 + ((ear_back + ear_asymmetry) / 2) * 0.15 + blink * 0.15
    )
    
    fearful_score = (
        ear_back * 0.30 + nose_lick * 0.25 + blink * 0.20 +
        brow_raise * 0.12 + (1 - mouth_open) * 0.13
    )
    
    relaxed_score = (
        (1 - mean_activation) * 0.50 + (1 - ear_back) * 0.15 +
        (1 - ear_forward) * 0.15 + (1 - nose_lick) * 0.10 + (1 - nose_wrinkle) * 0.10
    )
    
    neutral_score = (
        (1 - mean_activation) * 0.70 +
        (1 - (mouth_open + jaw_drop) / 2) * 0.15 +
        (1 - (ear_back + ear_forward) / 2) * 0.15
    )
    
    scores = {
        'happy': happy_score,
        'sad': sad_score,
        'angry': angry_score,
        'fearful': fearful_score,
        'relaxed': relaxed_score,
        'neutral': neutral_score,
    }
    
    # Softmax-like normalization
    temperature = 2.0
    exp_scores = {k: np.exp(v * temperature) for k, v in scores.items()}
    total_exp = sum(exp_scores.values())
    probabilities = {k: v / total_exp for k, v in exp_scores.items()}
    
    # Find best emotion
    best_emotion = max(probabilities, key=probabilities.get)
    best_prob = probabilities[best_emotion]
    
    # Threshold for neutral
    if best_prob < neutral_threshold and best_emotion != 'neutral':
        best_emotion = 'neutral'
        best_prob = probabilities['neutral']
    
    return {
        'emotion': best_emotion,
        'confidence': best_prob,
        'probabilities': probabilities,
        'action_units': au_values,
    }

print(f'Action Units: {len(ACTION_UNIT_NAMES)}')
print(f'Rule-based classifier ready!')

## 2.2 Full Pipeline Demo
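The pipeline in this section rescales keypoints to the original image resolution before computing Action Units. The sketch below, using made-up coordinates, shows why that is mostly harmless: ratios normalized by the inter-eye distance are unchanged by uniform scaling, whereas `AU_eye_distance` divides by a fixed 100 pixels and therefore depends on resolution:

```python
import math

def mouth_open_au(upper_lip, lower_lip, left_eye, right_eye):
    """Lip gap relative to inter-eye distance, clipped to [0, 1]."""
    eye_d = max(math.dist(left_eye, right_eye), 1e-6)
    return min(max(math.dist(upper_lip, lower_lip) / eye_d / 0.3, 0.0), 1.0)

def eye_distance_au(left_eye, right_eye):
    """Inter-eye distance divided by a fixed 100 px, clipped to [0, 1]."""
    return min(max(math.dist(left_eye, right_eye) / 100.0, 0.0), 1.0)

def scale(point, factor):
    return (point[0] * factor, point[1] * factor)

# Hypothetical keypoints in pixels
left_eye, right_eye = (100.0, 100.0), (160.0, 100.0)
upper_lip, lower_lip = (130.0, 150.0), (130.0, 159.0)

small = (mouth_open_au(upper_lip, lower_lip, left_eye, right_eye),
         eye_distance_au(left_eye, right_eye))

# Same face at twice the resolution
ul2, ll2, le2, re2 = (scale(p, 2.0) for p in (upper_lip, lower_lip, left_eye, right_eye))
large = (mouth_open_au(ul2, ll2, le2, re2), eye_distance_au(le2, re2))

print(small, large)
```

The mouth ratio is identical at both scales, while the eye-distance value saturates at 1.0 for the larger image, so that unit reflects image size as much as facial geometry.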

In [None]:
class FullEmotionPipeline:
    """
    Full pipeline: Image → Keypoints → AU → Emotion
    
    Uses:
    - The trained keypoints model (SimpleBaseline)
    - Action Unit extraction from geometry
    - Rule-based emotion classification (DogFACS)
    """
    
    def __init__(self, kp_model, device):
        self.kp_model = kp_model
        self.kp_model.eval()
        self.device = device
        
        self.transform = transforms.Compose([
            transforms.ToPILImage(),
            transforms.Resize((IMAGE_SIZE, IMAGE_SIZE)),
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
        ])
    
    def extract_keypoints(self, image):
        """Extract the 20 project keypoints from an image."""
        if isinstance(image, str):
            image = cv2.imread(image)
            image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        
        h, w = image.shape[:2]
        tensor = self.transform(image).unsqueeze(0).to(self.device)
        
        with torch.no_grad():
            heatmaps = self.kp_model(tensor)[0].cpu().numpy()
        
        # Decode 46 keypoints
        hm_h, hm_w = heatmaps.shape[1:]
        kp46 = []
        for k in range(NUM_KEYPOINTS_DOGFLW):
            hm = heatmaps[k]
            idx = hm.argmax()
            y_hm, x_hm = idx // hm_w, idx % hm_w
            conf = float(hm.max())
            x = x_hm * w / hm_w
            y = y_hm * h / hm_h
            kp46.append((x, y, conf))
        
        # Map 46 → 20
        result = []
        for i in range(NUM_KEYPOINTS_PROJECT):
            dogflw_idx = PROJECT_TO_DOGFLW[i]
            x, y, v = kp46[dogflw_idx]
            result.extend([x, y, v])
        
        return np.array(result, dtype=np.float32)
    
    def predict(self, image):
        """
        Full prediction: Image → Emotion
        
        Returns:
            dict with emotion, confidence, probabilities, action_units
        """
        # 1. Extract keypoints
        keypoints = self.extract_keypoints(image)
        
        # 2. Compute Action Units
        au_values = extract_action_units(keypoints)
        
        # 3. Rule-based classification
        result = classify_emotion_rule_based(au_values)
        
        # Add keypoints to result
        result['keypoints'] = keypoints
        
        return result

# Create pipeline with trained model
pipeline = FullEmotionPipeline(kp_model, device)
print('Full Emotion Pipeline ready!')
print('  Image → Keypoints → Action Units → Emotion')

## 2.3 Demo on Sample Images

In [None]:
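# Demo on a few images from the test set
test_images_dir = os.path.join(DOGFLW_PATH, 'test', 'images')
image_exts = ('.jpg', '.jpeg', '.png')
# Sort for a reproducible selection and skip non-image files
test_images = sorted(f for f in os.listdir(test_images_dir) if f.lower().endswith(image_exts))[:5]  # First 5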

fig, axes = plt.subplots(1, 5, figsize=(20, 4))

for i, img_name in enumerate(test_images):
    img_path = os.path.join(test_images_dir, img_name)
    
    # Load image
    img = cv2.imread(img_path)
    img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    
    # Predict
    result = pipeline.predict(img_rgb)
    
    # Draw keypoints
    kp = result['keypoints'].reshape(20, 3)
    for x, y, v in kp:
        if v > 0.1:
            cv2.circle(img_rgb, (int(x), int(y)), 3, (255, 0, 0), -1)
    
    # Display
    axes[i].imshow(img_rgb)
    axes[i].set_title(f"{result['emotion']}\n{result['confidence']:.1%}")
    axes[i].axis('off')

plt.tight_layout()
plt.savefig('emotion_demo.png', dpi=150)
plt.show()

# Show AU values for last image
print('\nAction Units for the last image:')
for name, value in result['action_units'].items():
    print(f'  {name}: {value:.3f}')

In [None]:
# Probability distribution for the last image
probs = result['probabilities']

plt.figure(figsize=(10, 4))
colors = ['#2ecc71', '#3498db', '#e74c3c', '#9b59b6', '#f39c12', '#95a5a6']
plt.bar(probs.keys(), probs.values(), color=colors)
plt.xlabel('Emotion')
plt.ylabel('Probability')
plt.title('Emotion Probabilities (Rule-Based)')
plt.ylim(0, 1)
plt.savefig('emotion_probabilities.png', dpi=150)
plt.show()

## 2.4 Statistics on the Full Test Set

In [None]:
# Emotion analysis on the full test set
test_images_dir = os.path.join(DOGFLW_PATH, 'test', 'images')
all_test_images = [f for f in os.listdir(test_images_dir) if f.endswith(('.jpg', '.jpeg', '.png', '.JPEG'))]

emotion_counts = {e: 0 for e in EMOTION_CLASSES}
all_confidences = []

print(f'Analyzing {len(all_test_images)} images...')

for img_name in tqdm(all_test_images):
    img_path = os.path.join(test_images_dir, img_name)
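    try:
        img = cv2.imread(img_path)
        if img is None:
            raise ValueError('could not read image')
        img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        result = pipeline.predict(img_rgb)
        
        emotion_counts[result['emotion']] += 1
        all_confidences.append(result['confidence'])
    except Exception as e:
        # Skip problematic images, but report them instead of failing silently
        print(f'Skipped {img_name}: {e}')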

print('\nEmotion distribution on the test set:')
for emotion, count in emotion_counts.items():
    pct = 100 * count / len(all_test_images) if all_test_images else 0
    print(f'  {emotion}: {count} ({pct:.1f}%)')

print(f'\nMean confidence: {np.mean(all_confidences):.1%}')

In [None]:
# Visualize the emotion distribution
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))

# Emotion distribution
colors = ['#2ecc71', '#3498db', '#e74c3c', '#9b59b6', '#f39c12', '#95a5a6']
ax1.bar(emotion_counts.keys(), emotion_counts.values(), color=colors)
ax1.set_xlabel('Emotion')
ax1.set_ylabel('Count')
ax1.set_title('Emotion Distribution (Rule-Based on Test Set)')

# Confidence histogram
ax2.hist(all_confidences, bins=20, color='steelblue', edgecolor='black', alpha=0.7)
ax2.axvline(x=np.mean(all_confidences), color='red', linestyle='--', label=f'Mean: {np.mean(all_confidences):.2f}')
ax2.set_xlabel('Confidence')
ax2.set_ylabel('Count')
ax2.set_title('Confidence Distribution')
ax2.legend()

plt.tight_layout()
plt.savefig('test_set_analysis.png', dpi=150)
plt.show()

---
# Saving the Results
---

In [None]:
# Save metrics (handle inf for JSON)
metrics = {
    'approach': 'rule-based',
    'keypoints': {
        'best_loss': float(best_kp_loss) if best_kp_loss != float('inf') else None,
        'epochs': EPOCHS_KP,
        'num_keypoints_dogflw': NUM_KEYPOINTS_DOGFLW,
        'num_keypoints_project': NUM_KEYPOINTS_PROJECT,
    },
    'action_units': {
        'count': NUM_ACTION_UNITS,
        'names': ACTION_UNIT_NAMES,
    },
    'emotion': {
        'method': 'DogFACS rule-based (no training required)',
        'num_classes': NUM_EMOTIONS,
        'classes': EMOTION_CLASSES,
        'neutral_threshold': NEUTRAL_THRESHOLD,
        'test_distribution': emotion_counts,
        'mean_confidence': float(np.mean(all_confidences)),
    },
    'kp_history': kp_history,
}

with open('training_metrics.json', 'w') as f:
    json.dump(metrics, f, indent=2)

print('Saved files:')
print('  ✓ keypoints_dogflw.pt (keypoints model)')
print('  ✓ training_metrics.json (metrics)')
print('  ✓ *.png (plots)')
print()
print('NOTE: No emotion model is needed!')
print('      Emotion classification is rule-based.')

In [None]:
# Model verification
print('=== Verification ===\n')

# Test keypoints model
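test_kp = SimpleBaselineNet(NUM_KEYPOINTS_DOGFLW, 'resnet34')
# The checkpoint may hold CUDA tensors; map them to CPU for this CPU-only check
test_kp.load_state_dict(torch.load('keypoints_dogflw.pt', map_location='cpu'))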
test_kp.eval()

with torch.no_grad():
    out = test_kp(torch.randn(1, 3, 256, 256))
print(f'Keypoints Model:')
print(f'  Input: (1, 3, 256, 256)')
print(f'  Output: {tuple(out.shape)}')
print()

# Test full pipeline
print('Full Pipeline:')
dummy_kp = np.random.randn(60).astype(np.float32)  # 20 keypoints * 3
au = extract_action_units(dummy_kp)
result = classify_emotion_rule_based(au)
print(f'  Keypoints → AU → Emotion')
print(f'  Input: 60 keypoint values')
print(f'  AU: {len(au)} action units')
print(f'  Output: {result["emotion"]} ({result["confidence"]:.1%})')
print()
print('✓ Everything works!')

---
## After Training

**Download the file:**
- `keypoints_dogflw.pt` → copy to `models/keypoints_dogflw.pt`

**Expected results:**

| Model | Metric | Value |
|-------|--------|-------|
| Keypoints | MSE Loss | < 0.01 |
| Emotion | Method | Rule-based (no training) |

**Emotion classification:**
- Uses DogFACS (Mota-Rojas et al. 2021)
- Requires no training
- 6 classes: happy, sad, angry, fearful, relaxed, neutral
- Falls back to neutral when `max(prob) < 0.35`

**Advantages of the rule-based approach:**
1. Interpretability - we know exactly why the model predicts a given emotion
2. No emotion-labelled training data needed
3. Grounded in DogFACS research
4. Weights are easy to modify and tune