# Judo Throw Recognition with YOLO11 Pose

**Free GPU Testing** - Run this notebook on Google Colab

This notebook:
1. Downloads a test judo video from YouTube
2. Runs YOLO11 pose estimation
3. Analyzes biomechanics to detect throws
4. Shows results with skeleton overlays

**Usage:** Click Runtime → Run All

In [None]:
# Install dependencies
!pip install -q ultralytics yt-dlp opencv-python

In [None]:
# Download test video from YouTube
# Using one of the technique demonstration videos
VIDEO_URL = "https://www.youtube.com/watch?v=LMKgaMdm9UY"  # 80 tachi-waza techniques

!yt-dlp -f 'best[height<=720]' -o 'judo_test.mp4' --download-sections "*0:00-2:00" {VIDEO_URL}

print("✓ Downloaded 2 minutes of test video")

In [None]:
from ultralytics import YOLO
import cv2
import numpy as np
import json
from pathlib import Path
from google.colab import files

# Load YOLO11 pose model
print("Loading YOLO11x-pose model...")
model = YOLO('yolo11x-pose.pt')  # Automatically downloads if not present
print("✓ Model loaded")

In [None]:
# Run pose estimation on video
print("Running pose estimation...")

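# stream=True yields results one frame at a time instead of keeping every
# frame's results in memory; the annotated video is still written (save=True).
results = model.predict(
    source='judo_test.mp4',
    stream=True,
    save=True,
    project='output',
    name='pose_results',
    exist_ok=True,  # Reuse output/pose_results on re-runs so the path below stays valid
    conf=0.5,
    show_labels=True,
    show_conf=True,
    verbose=True
)
for _ in results:  # Consume the generator so every frame is processed and saved
    pass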

print("\n✓ Pose estimation complete!")
print("Output saved to: output/pose_results/judo_test.avi")

In [None]:
# Display sample frame with pose overlay
from IPython.display import Image, display
import matplotlib.pyplot as plt

# Load the output video and show a frame
cap = cv2.VideoCapture('output/pose_results/judo_test.avi')
cap.set(cv2.CAP_PROP_POS_FRAMES, 100)  # Jump to frame 100
ret, frame = cap.read()
cap.release()

if ret:
    plt.figure(figsize=(15, 10))
    plt.imshow(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    plt.title('Sample Frame with YOLO11 Pose Overlay')
    plt.axis('off')
    plt.show()
else:
    print("Could not load frame")

In [None]:
# Extract pose data for analysis
print("Extracting pose keypoints...")

# stream=True returns a generator, so frames are processed one at a time
results = model.predict(source='judo_test.mp4', stream=True, verbose=False)

pose_data = []
for frame_idx, result in enumerate(results):
    if result.keypoints is not None and len(result.keypoints.data) > 0:
        for person_idx, kpts in enumerate(result.keypoints.data):
            # kpts has shape (17, 3): x, y, confidence for each COCO keypoint
            keypoints_array = kpts.cpu().numpy()

            pose_data.append({
                'frame': frame_idx,
                'person': person_idx,  # Detection order within the frame, not a tracked identity
                'keypoints': keypoints_array.tolist()
            })

    if frame_idx % 100 == 0:
        print(f"  Processed {frame_idx} frames")

print(f"\n✓ Extracted {len(pose_data)} person-frames")

# Save pose data
with open('pose_data.json', 'w') as f:
    json.dump(pose_data, f)

print("✓ Pose data saved to pose_data.json")
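Each record in `pose_data.json` stores a person as 17 `[x, y, confidence]` triples in COCO keypoint order, so downstream analysis has to index by position. A minimal sketch of attaching names to those positions follows; the helper `name_keypoints` and its `min_conf` parameter are hypothetical names introduced for illustration, and the example record is synthetic rather than real model output.

```python
# COCO keypoint order (17 points), matching the indices used in this notebook
# (11 = left_hip, 12 = right_hip).
COCO_KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def name_keypoints(record, min_conf=0.5):
    """Map one pose_data record to {name: (x, y)}, skipping low-confidence points.

    Hypothetical helper: `record` is assumed to have the same shape as the
    entries appended to pose_data above.
    """
    named = {}
    for name, (x, y, conf) in zip(COCO_KEYPOINT_NAMES, record["keypoints"]):
        if conf >= min_conf:
            named[name] = (x, y)
    return named


# Synthetic record: only the two hip keypoints have high confidence.
example = {
    "frame": 0,
    "person": 0,
    "keypoints": [
        [float(i), float(2 * i), 0.9 if i in (11, 12) else 0.1]
        for i in range(17)
    ],
}

named = name_keypoints(example)
print(named)  # {'left_hip': (11.0, 22.0), 'right_hip': (12.0, 24.0)}
```

Looking up `named["left_hip"]` instead of `kpts[11]` makes later biomechanical code easier to read, at the cost of one dictionary per person-frame.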

In [None]:
# Simple biomechanical analysis
# Calculate hip heights over time

hip_heights = []
frame_numbers = []

for item in pose_data:
    kpts = np.array(item['keypoints'])
    
    # COCO keypoints: 11=left_hip, 12=right_hip
    left_hip = kpts[11]
    right_hip = kpts[12]
    
    # Check confidence
    if left_hip[2] > 0.5 and right_hip[2] > 0.5:
        avg_hip_y = (left_hip[1] + right_hip[1]) / 2
        hip_heights.append(avg_hip_y)
        frame_numbers.append(item['frame'])

# Plot hip height over time
# Image y-coordinates grow downward, so a larger y means the hips are closer to the floor.
plt.figure(figsize=(15, 5))
plt.plot(frame_numbers, hip_heights, linewidth=1)
plt.xlabel('Frame Number')
plt.ylabel('Hip y-coordinate (pixels, larger = lower in the frame)')
plt.title('Hip Height Trajectory - Drops indicate potential throws')
plt.grid(True, alpha=0.3)
plt.gca().invert_yaxis()  # Match image orientation so physical drops go down visually
plt.show()

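# Find significant hip drops (potential throws)
hip_heights_array = np.array(hip_heights)

# Read the real frame rate instead of assuming 30 fps (fall back to 30 if unavailable)
cap = cv2.VideoCapture('judo_test.mp4')
fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
cap.release()

# Slide a window over the hip samples and flag windows with a large vertical range.
# The range (max - min) captures movement in either direction, and overlapping
# windows can flag the same movement more than once.
window_size = 30  # Number of consecutive hip samples per window
throws_detected = []

for i in range(len(hip_heights_array) - window_size):
    window = hip_heights_array[i:i+window_size]
    drop = np.max(window) - np.min(window)

    if drop > 50:  # Significant drop threshold (pixels)
        throws_detected.append({
            'frame': frame_numbers[i],
            'time': frame_numbers[i] / fps,
            'drop_amount': drop
        })

print(f"\n✓ Detected {len(throws_detected)} potential throw windows based on hip drops:")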
for throw in throws_detected[:10]:  # Show first 10
    print(f"  - Frame {throw['frame']} ({throw['time']:.1f}s): {throw['drop_amount']:.0f}px drop")
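Because the sliding windows overlap, a single movement usually produces a run of consecutive detections rather than one. A minimal sketch of collapsing such runs into distinct events is shown below. The helper `merge_detections` and its `max_gap` parameter are hypothetical, the sample detections are synthetic, and the choice to keep the largest drop per run is only one reasonable option.

```python
def merge_detections(detections, max_gap=30):
    """Collapse runs of nearby detections into single events.

    A detection joins the current event when its frame is within `max_gap`
    frames of the previous detection; each event keeps the entry with the
    largest drop. Hypothetical helper, not part of any library.
    """
    events = []
    last_frame = None
    for det in sorted(detections, key=lambda d: d["frame"]):
        if last_frame is not None and det["frame"] - last_frame <= max_gap:
            # Same event: keep whichever detection has the larger drop.
            if det["drop_amount"] > events[-1]["drop_amount"]:
                events[-1] = det
        else:
            # Gap is too large (or this is the first detection): new event.
            events.append(det)
        last_frame = det["frame"]
    return events


# Synthetic detections in the same shape as the throws_detected entries.
sample = [
    {"frame": 100, "time": 100 / 30, "drop_amount": 55.0},
    {"frame": 101, "time": 101 / 30, "drop_amount": 80.0},
    {"frame": 102, "time": 102 / 30, "drop_amount": 60.0},
    {"frame": 400, "time": 400 / 30, "drop_amount": 70.0},
]

events = merge_detections(sample)
print([(e["frame"], e["drop_amount"]) for e in events])  # [(101, 80.0), (400, 70.0)]
```

In the notebook this would be called as `merge_detections(throws_detected)`, with `max_gap` tuned to roughly the duration of one throw in frames.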

In [None]:
# Download results
print("Preparing files for download...")

# Compress output video to reduce size
!ffmpeg -i output/pose_results/judo_test.avi -c:v libx264 -crf 28 -y judo_pose_annotated.mp4 -loglevel quiet

print("\n📥 Download these files:")
files.download('judo_pose_annotated.mp4')
files.download('pose_data.json')

print("\n✓ Complete! You now have:")
print("  1. judo_pose_annotated.mp4 - Video with skeleton overlays")
print("  2. pose_data.json - Raw pose keypoints for further analysis")

## Next Steps

1. **Analyze Quality**: Review the skeleton overlays - are keypoints accurate?
2. **Hip Detection**: Check if the hip drop detection correctly identifies throws
3. **Fine-tune**: Adjust threshold values for your specific videos
4. **Hybrid Approach**: Combine with Vision LLM (Gemini/Claude) for technique classification
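
For the fine-tuning step, one way to see how sensitive the detection is to its threshold is to count flagged windows at several threshold values. The sketch below mirrors the notebook's sliding-window range check in plain Python; `count_flagged_windows` is a hypothetical helper and the hip series is synthetic, so the counts only illustrate the mechanics.

```python
def count_flagged_windows(hip_y, window_size=30, threshold=50):
    """Count sliding windows whose max - min hip y exceeds `threshold` pixels."""
    count = 0
    for i in range(len(hip_y) - window_size):
        window = hip_y[i:i + window_size]
        if max(window) - min(window) > threshold:
            count += 1
    return count


# Synthetic series: hips steady at y=300 px, then an 80 px step to y=380 px.
hip_y = [300.0] * 60 + [380.0] * 60

for threshold in (25, 50, 100):
    print(threshold, count_flagged_windows(hip_y, threshold=threshold))
# 25 29
# 50 29
# 100 0
```

A threshold whose count changes sharply between neighboring values is a sign that it sits close to the size of typical movements in the video.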

**Cost Estimate**: This entire notebook runs free on Colab! For production:
- Hetzner GPU: ~$0.50/hour
- Processing a 2-hour video: ~10 min of GPU time ≈ $0.08
- Competitive with a Vision-LLM-only approach ($0.04-0.50/session)
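
The per-video figure above follows from the hourly rate and the processing time. A small sketch of that arithmetic, using the estimates quoted in this notebook as inputs (`gpu_cost` is a hypothetical helper):

```python
def gpu_cost(minutes, hourly_rate_usd):
    """Cost in USD of renting a GPU for `minutes` at `hourly_rate_usd` per hour."""
    return minutes / 60 * hourly_rate_usd


# Estimates from the list above: ~10 minutes of processing at ~$0.50/hour.
cost = gpu_cost(minutes=10, hourly_rate_usd=0.50)
print(f"${cost:.2f}")  # $0.08
```

Swapping in a measured processing time and the actual hourly rate gives a per-video cost for a specific setup.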