# 🏋️ Fitness-AQA Vision Pipeline (Google Colab)

This notebook extracts **2D pose keypoints** from exercise videos using **MMPose**.

---

## ⚙️ IMPORTANT: Run cells in order!

**Step 1** will install dependencies and **automatically restart the runtime**.  
After restart, **manually run Steps 2-7**.

---

## 🚀 Quick Start:
1. Runtime → Change runtime type → **GPU (T4)**
2. Run **Step 1** → wait for auto-restart
3. After restart, run **Steps 2-7**

---

## 📦 Step 1: Install Dependencies (AUTO-RESTART)

**⚠️ This cell will restart the runtime automatically!**  
After restart, continue with Step 2.

In [None]:
import sys
print(f"Python version: {sys.version}")

# Install numpy first (before scipy to avoid ABI issues)
!pip install "numpy<2.0.0" --force-reinstall -q

# Upgrade installers
!pip install --upgrade pip setuptools wheel -q

# Install OpenMIM
!pip install -U openmim -q

# Install MMPose stack
!mim install mmengine -q
!mim install "mmcv>=2.0.0,<2.2.0" -q
!mim install "mmdet>=3.0.0" -q
!mim install "mmpose>=1.0.0" -q

# Install scipy AFTER numpy is downgraded
!pip install scipy opencv-python matplotlib -q

print("\n✅ Installation complete! Restarting runtime...")

# Auto-restart to load new packages
import os
os.kill(os.getpid(), 9)

## ✅ Step 2: Verify Installation

**Run this after the runtime restarts to confirm everything installed correctly.**

In [None]:
import sys
import numpy as np
import scipy
import cv2
from mmpose.apis import MMPoseInferencer

print(f"✅ Python: {sys.version.split()[0]}")
print(f"✅ NumPy: {np.__version__}")
print(f"✅ SciPy: {scipy.__version__}")
print(f"✅ OpenCV: {cv2.__version__}")
print("✅ MMPose: Imported successfully")
print("\n🎉 All dependencies loaded correctly!")
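
Because Step 1 pins `numpy<2.0.0`, it can help to check that the pin actually held after the restart instead of only printing the version. The following is a minimal sketch using only the standard library; `major_version` and `check_pin` are hypothetical helper names, not part of MMPose or Colab.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '1.26.4'."""
    return int(version.split(".")[0])


def check_pin(package: str, max_major_exclusive: int) -> str:
    """Describe whether an installed package is below the given major version."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package}: not installed"
    ok = major_version(installed) < max_major_exclusive
    status = "OK" if ok else f"expected major version < {max_major_exclusive}"
    return f"{package} {installed}: {status}"


# Step 1 pins numpy<2.0.0, so the major version should be below 2.
print(check_pin("numpy", 2))
```

If the message reports an unexpected major version, a later install step probably upgraded NumPy, and re-running Step 1 is the simplest thing to try.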

## 📤 Step 3: Upload Your Video

In [None]:
from google.colab import files
import os

uploaded = files.upload()
video_path = list(uploaded.keys())[0]
print(f"✅ Uploaded: {video_path}")
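
The cell above takes the first uploaded filename, which raises `IndexError` when nothing was uploaded and silently accepts non-video files. A small guard could look like the sketch below; `pick_video` and the extension set are assumptions for illustration, not part of the Colab API.

```python
import os

# Assumed set of container formats; adjust to whatever your videos use.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv"}


def pick_video(filenames):
    """Return the first filename that looks like a video, or raise ValueError."""
    names = list(filenames)
    for name in names:
        if os.path.splitext(name)[1].lower() in VIDEO_EXTENSIONS:
            return name
    raise ValueError(f"No video file among uploads: {names}")


# In the notebook this would be: video_path = pick_video(uploaded.keys())
print(pick_video(["notes.txt", "squat_01.MP4"]))
```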

## 🔧 Step 4: Define Pipeline

In [None]:
import json
import logging
import os  # used by process_video; imported here so this cell does not depend on Step 3
import numpy as np
import cv2
from scipy.signal import savgol_filter
from mmpose.apis import MMPoseInferencer

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(message)s')
logger = logging.getLogger(__name__)

class PoseExtractor:
    def __init__(self, mode='human', device='cuda'):
        logger.info(f"Loading MMPose (device={device})...")
        self.inferencer = MMPoseInferencer(mode, device=device)

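    def smooth_signal(self, keypoints, window_length=5, polyorder=2):
        """Savitzky-Golay smooth each keypoint's x and y track over time.

        Expects an array of shape (frames, keypoints, 2).
        """
        # Clips shorter than the filter window are returned unchanged
        if len(keypoints) < window_length:
            return keypoints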
        smoothed = np.zeros_like(keypoints)
        for i in range(keypoints.shape[1]):
            smoothed[:, i, 0] = savgol_filter(keypoints[:, i, 0], window_length, polyorder)
            smoothed[:, i, 1] = savgol_filter(keypoints[:, i, 1], window_length, polyorder)
        return smoothed

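    def normalize_signal(self, keypoints):
        """Center each frame on the mid-hip and scale it by torso length.

        Indices 5/6 (shoulders) and 11/12 (hips) assume COCO-17 keypoint order.
        """
        normalized = np.zeros_like(keypoints)
        for f in range(len(keypoints)):
            frame_kps = keypoints[f]
            mid_shoulder = (frame_kps[5] + frame_kps[6]) / 2
            mid_hip = (frame_kps[11] + frame_kps[12]) / 2
            torso_len = np.linalg.norm(mid_shoulder - mid_hip)
            # Skip scaling for a degenerate torso (e.g. zero-filled frames)
            scale = 1.0 if torso_len < 1e-3 else 1.0 / torso_len
            normalized[f] = (frame_kps - mid_hip) * scale
        return normalized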

    def process_video(self, video_path, output_path=None):
        logger.info(f"Processing: {video_path}")
        result_generator = self.inferencer(video_path, return_vis=False)
        
        raw_keypoints, scores = [], []
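        for result in result_generator:
            # 'predictions' holds one entry per image in the batch; each entry
            # is the list of detected person instances for that frame.
            instances = result['predictions'][0]
            if instances:
                # Keep only the first detected instance
                raw_keypoints.append(instances[0]['keypoints'])
                scores.append(instances[0]['keypoint_scores'])
            else:
                # No detection: zero placeholders keep frame indices aligned
                raw_keypoints.append(np.zeros((17, 2)))
                scores.append(np.zeros(17))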

        raw_keypoints = np.array(raw_keypoints)
        scores = np.array(scores)
        
        smoothed = self.smooth_signal(raw_keypoints)
        normalized = self.normalize_signal(smoothed)
        
        data = {
            "video_id": os.path.basename(video_path),
            "frame_count": len(raw_keypoints),
            "raw_keypoints": raw_keypoints.tolist(),
            "smoothed_keypoints": smoothed.tolist(),
            "normalized_keypoints": normalized.tolist(),
            "scores": scores.tolist()
        }
        
        if output_path:
            with open(output_path, 'w') as f:
                json.dump(data, f)
            logger.info(f"Saved to {output_path}")
        return data

print("✅ PoseExtractor ready!")
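
To make the normalization step concrete, here is a single-frame worked example in plain Python. It mirrors the arithmetic in `normalize_signal` (subtract the mid-hip, divide by torso length) on a toy frame; the index constants assume COCO-17 keypoint order, and `midpoint` and `normalize_frame` are hypothetical names used only for this sketch.

```python
import math

# COCO-17 indices used by normalize_signal (assumed keypoint ordering)
L_SHOULDER, R_SHOULDER, L_HIP, R_HIP = 5, 6, 11, 12


def midpoint(a, b):
    """Midpoint of two (x, y) points."""
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)


def normalize_frame(frame):
    """Hip-center one frame of (x, y) keypoints and scale by torso length."""
    mid_shoulder = midpoint(frame[L_SHOULDER], frame[R_SHOULDER])
    mid_hip = midpoint(frame[L_HIP], frame[R_HIP])
    torso_len = math.hypot(mid_shoulder[0] - mid_hip[0],
                           mid_shoulder[1] - mid_hip[1])
    # Same guard as the pipeline: skip scaling for a degenerate torso
    scale = 1.0 if torso_len < 1e-3 else 1.0 / torso_len
    return [((x - mid_hip[0]) * scale, (y - mid_hip[1]) * scale)
            for x, y in frame]


# Toy frame: every keypoint at the origin except shoulders and hips.
frame = [(0.0, 0.0)] * 17
frame[L_SHOULDER], frame[R_SHOULDER] = (90.0, 100.0), (110.0, 100.0)
frame[L_HIP], frame[R_HIP] = (90.0, 200.0), (110.0, 200.0)

norm = normalize_frame(frame)
print(norm[L_SHOULDER])  # roughly (-0.1, -1.0): one torso length above the hips
print(norm[L_HIP])       # roughly (-0.1, 0.0): level with the mid-hip origin
```

After normalization the torso always has length 1, so the same movement filmed closer to or further from the camera yields comparable coordinates.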

## 🚀 Step 5: Process Video

In [None]:
extractor = PoseExtractor(mode='human', device='cuda')
result = extractor.process_video(video_path, output_path='analysis.json')

print("\n✅ Processing complete!")
print(f"📊 Frames: {result['frame_count']}")
print("Output: analysis.json")
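
Before handing the file off, a quick consistency check on `analysis.json` can catch truncated or mismatched output. The sketch below reads the keys written by `process_video` and verifies each per-frame list against `frame_count`; `validate_analysis` is a hypothetical helper, and the demo runs on a tiny synthetic file, not real pipeline output.

```python
import json
import os
import tempfile

# Per-frame lists written by process_video
EXPECTED_KEYS = ("raw_keypoints", "smoothed_keypoints",
                 "normalized_keypoints", "scores")


def validate_analysis(path):
    """Load a pipeline output file and check per-frame lists against frame_count."""
    with open(path) as f:
        data = json.load(f)
    n = data["frame_count"]
    for key in EXPECTED_KEYS:
        if len(data[key]) != n:
            raise ValueError(f"{key}: {len(data[key])} frames, expected {n}")
    return data


# Synthetic demo: two frames, one keypoint per frame.
demo = {
    "video_id": "demo.mp4",
    "frame_count": 2,
    "raw_keypoints": [[[1.0, 2.0]], [[1.5, 2.5]]],
    "smoothed_keypoints": [[[1.0, 2.0]], [[1.5, 2.5]]],
    "normalized_keypoints": [[[0.0, 0.0]], [[0.1, 0.1]]],
    "scores": [[0.9], [0.8]],
}
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "analysis.json")
    with open(demo_path, "w") as f:
        json.dump(demo, f)
    loaded = validate_analysis(demo_path)

print(loaded["video_id"], loaded["frame_count"])
```

In the notebook, the equivalent call would be `validate_analysis('analysis.json')` after Step 5 finishes.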

## 📊 Step 6: Visualize

In [None]:
import matplotlib.pyplot as plt

cap = cv2.VideoCapture(video_path)
ret, frame = cap.read()
cap.release()

if ret:
    frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    keypoints = np.array(result['smoothed_keypoints'][0])
    
    plt.figure(figsize=(12, 8))
    plt.imshow(frame_rgb)
    plt.scatter(keypoints[:, 0], keypoints[:, 1], c='red', s=100, marker='o', edgecolors='white', linewidths=2)
    
    for i, (x, y) in enumerate(keypoints):
        plt.text(x, y, str(i), color='yellow', fontsize=10, ha='center', va='center', weight='bold')
    
    plt.title("Detected Keypoints (Frame 0)", fontsize=16)
    plt.axis('off')
    plt.tight_layout()
    plt.show()
    
    # Plot trajectory
    left_wrist_idx = 9
    raw_y = [kp[left_wrist_idx][1] for kp in result['raw_keypoints']]
    smoothed_y = [kp[left_wrist_idx][1] for kp in result['smoothed_keypoints']]
    
    plt.figure(figsize=(14, 6))
    plt.plot(raw_y, 'r-', alpha=0.4, linewidth=1, label='Raw (Jittery)')
    plt.plot(smoothed_y, 'b-', linewidth=2.5, label='Smoothed')
    plt.xlabel('Frame', fontsize=12)
    plt.ylabel('Y Coordinate', fontsize=12)
    plt.title('Left Wrist Movement - Smoothing Effect', fontsize=14)
    plt.legend(fontsize=12)
    plt.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.show()
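
The trajectory plot shows the smoothing effect visually; a single number can back it up. The sketch below applies the 5-point, order-2 Savitzky-Golay convolution weights to interior samples and compares the mean frame-to-frame change before and after. It is an illustration in plain Python, not a replacement for `savgol_filter`: edge samples are left untouched here, whereas SciPy's default mode fits a polynomial at the edges. `smooth_interior` and `jitter` are hypothetical names, and the trajectory is synthetic.

```python
import random

# Savitzky-Golay convolution weights for window_length=5, polyorder=2
SG_5_2 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def smooth_interior(signal):
    """Apply the 5-point kernel to interior samples; edge samples stay as-is."""
    out = list(signal)
    for t in range(2, len(signal) - 2):
        out[t] = sum(w * signal[t + k - 2] for k, w in enumerate(SG_5_2))
    return out


def jitter(signal):
    """Mean absolute frame-to-frame change."""
    steps = [abs(b - a) for a, b in zip(signal, signal[1:])]
    return sum(steps) / len(steps)


# Synthetic wrist trajectory: a slow ramp plus pixel noise.
rng = random.Random(0)
raw_y = [200 + 0.5 * t + rng.gauss(0, 3) for t in range(120)]
smoothed_y = smooth_interior(raw_y)

print(f"raw jitter:      {jitter(raw_y):.2f}")
print(f"smoothed jitter: {jitter(smoothed_y):.2f}")
```

The same `jitter` function could be applied to the `raw_y` and `smoothed_y` lists built in the cell above to quantify how much the filter reduced frame-to-frame noise on a real video.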

## 💾 Step 7: Download

In [None]:
files.download('analysis.json')
print("✅ Download started!")

---

## ✅ Complete!

**You now have `analysis.json` ready for Vishal!**

**GitHub:** https://github.com/JCHETAN26/Form-Analyser  
**Handoff Doc:** `HANDOFF_TO_VISHAL.md`
