# Video Feature Extraction Tool - Demo Notebook

This notebook demonstrates the **Video Feature Extraction Tool** that analyzes video files and extracts key visual and temporal features.

## Features Implemented:
1. **Shot Cut Detection** - Detects hard cuts in videos
2. **Motion Analysis** - Quantifies motion using optical flow
3. **Text Detection (OCR)** - Detects text presence in frames
4. **Object vs Person Detection** - Analyzes object and person presence

---

## 1. Install and Import Required Libraries

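First, import the libraries used in this notebook. The cell below only imports them, so install any missing packages (for example `opencv-python`, `numpy`, and `matplotlib`) beforehand.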

In [None]:
from video_feature_extractor import VideoFeatureExtractor

import cv2
import numpy as np
import json
import matplotlib.pyplot as plt
from IPython.display import Video, display
import warnings
warnings.filterwarnings('ignore')

print("✓ All libraries imported successfully!")
print(f"OpenCV version: {cv2.__version__}")

## 2. Basic Usage - Quick Start

Let's start with a simple example of how to use the tool.

In [None]:
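# Path to the video to analyze; replace with your own file.
video_path = "sample_video.mp4"

# sample_rate controls how densely frames are sampled (higher values process fewer frames).
extractor = VideoFeatureExtractor(video_path, sample_rate=30)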

## 3. Extract All Features at Once

The easiest way to use the tool is to extract all features at once.

In [None]:
features = extractor.extract_all_features(
    shot_cut_threshold=30.0,
    text_sample_frames=20,
    object_sample_frames=30,
    object_confidence=0.5
)

extractor.print_summary()

## 4. Individual Feature Extraction

You can also extract features individually for more control.

### 4.1 Shot Cut Detection

Detects hard cuts (abrupt scene changes) in the video by flagging frames whose difference from the previous frame exceeds `threshold`.

In [None]:
shot_cuts = extractor.detect_shot_cuts(threshold=30.0)

print(f"Total cuts detected: {shot_cuts['total_cuts']}")
print(f"Average frame difference: {shot_cuts['average_frame_difference']:.2f}")

if shot_cuts['cut_timestamps']:
    print("\nFirst 5 cuts:")
    for cut in shot_cuts['cut_timestamps'][:5]:
        print(f"  Frame {cut['frame']} at {cut['timestamp']:.2f}s (diff: {cut['difference']:.2f})")
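
To make the role of `threshold` concrete, here is a minimal sketch of frame-difference cut detection. It is an illustration only: the function name `find_hard_cuts` and the use of mean absolute pixel difference are assumptions, and the tool's actual implementation may differ.

```python
import numpy as np

def find_hard_cuts(frames, threshold=30.0):
    """Return (frame_index, difference) pairs for likely hard cuts.

    Hypothetical helper: a cut is reported when the mean absolute pixel
    difference between consecutive grayscale frames exceeds `threshold`.
    """
    cuts = []
    for i in range(1, len(frames)):
        # Cast to float so uint8 subtraction cannot wrap around.
        prev = frames[i - 1].astype(np.float32)
        curr = frames[i].astype(np.float32)
        diff = float(np.mean(np.abs(curr - prev)))
        if diff > threshold:
            cuts.append((i, diff))
    return cuts

# Synthetic clip: three dark frames followed by three bright frames.
dark = np.full((4, 4), 10, dtype=np.uint8)
bright = np.full((4, 4), 200, dtype=np.uint8)
demo_frames = [dark, dark, dark, bright, bright, bright]

# The only large jump is between frame 2 and frame 3.
print(find_hard_cuts(demo_frames))
```

Lowering `threshold` makes this check more sensitive, which is why smaller values report more cuts.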

### 4.2 Motion Analysis

Quantifies motion intensity using optical flow.

In [None]:
motion = extractor.analyze_motion()

print(f"Motion Category: {motion['motion_category']}")
print(f"Average Motion: {motion['average_motion']:.2f}")
print(f"Max Motion: {motion['max_motion']:.2f}")
print(f"Motion Std Dev: {motion['std_motion']:.2f}")
print(f"Frames Analyzed: {motion['frames_analyzed']}")
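
The statistics printed above can be understood as summaries of per-frame flow magnitude. The sketch below assumes each flow field is an array of shape `(H, W, 2)` holding per-pixel `(dx, dy)` displacements, as a dense optical flow routine would produce. The function name and the `low`/`high` category cut-offs are hypothetical and are not taken from the tool.

```python
import numpy as np

def summarize_flow(flow_fields, low=1.0, high=5.0):
    """Summarize motion from a list of dense flow fields of shape (H, W, 2).

    The category thresholds `low` and `high` are illustrative assumptions.
    """
    if not flow_fields:
        raise ValueError("flow_fields must contain at least one flow field")

    # Per-frame motion = mean Euclidean length of the (dx, dy) vectors.
    per_frame = [float(np.mean(np.linalg.norm(f, axis=2))) for f in flow_fields]
    average = float(np.mean(per_frame))

    if average < low:
        category = "low"
    elif average < high:
        category = "medium"
    else:
        category = "high"

    return {
        "average_motion": average,
        "max_motion": float(np.max(per_frame)),
        "std_motion": float(np.std(per_frame)),
        "motion_category": category,
        "frames_analyzed": len(per_frame),
    }

# Two synthetic flow fields: uniform (3, 4) displacement and no motion at all.
moving = np.zeros((2, 2, 2), dtype=np.float32)
moving[..., 0] = 3.0
moving[..., 1] = 4.0
still = np.zeros((2, 2, 2), dtype=np.float32)

demo_motion = summarize_flow([moving, still])
print(demo_motion)
```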

### 4.3 Text Detection (OCR)

Detects text presence in video frames.

In [None]:
text_detection = extractor.detect_text(sample_frames=20)

if 'error' not in text_detection:
    print(f"Text Present Ratio: {text_detection['text_present_ratio']:.1%}")
    print(f"Frames with Text: {text_detection['frames_with_text']}/{text_detection['frames_analyzed']}")
    
    if text_detection['keywords']:
        print(f"\nTop Keywords: {', '.join(text_detection['keywords'][:10])}")
    else:
        print("\nNo keywords detected")
else:
    print(f"Text detection unavailable: {text_detection['error']}")
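
As an illustration of how a text-present ratio and a keyword list could be derived from raw OCR output, the sketch below works on one OCR string per sampled frame. The helper name `summarize_ocr`, the `min_length` filter, and the frequency-based ranking are assumptions, not the tool's documented behavior.

```python
import re
from collections import Counter

def summarize_ocr(frame_texts, top_n=10, min_length=3):
    """Summarize OCR strings (one per sampled frame).

    Hypothetical helper: a frame counts as containing text when its OCR
    string is non-empty after stripping whitespace, and keywords are the
    most frequent alphabetic words of at least `min_length` characters.
    """
    frames_with_text = sum(1 for text in frame_texts if text.strip())

    words = []
    for text in frame_texts:
        for word in re.findall(r"[A-Za-z]+", text.lower()):
            if len(word) >= min_length:
                words.append(word)

    keywords = [word for word, _ in Counter(words).most_common(top_n)]
    ratio = frames_with_text / len(frame_texts) if frame_texts else 0.0

    return {
        "text_present_ratio": ratio,
        "frames_with_text": frames_with_text,
        "frames_analyzed": len(frame_texts),
        "keywords": keywords,
    }

# Four sampled frames, two of which contain readable text.
demo_ocr = summarize_ocr(["SALE today", "", "Big SALE", "   "])
print(demo_ocr)
```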

### 4.4 Object and Person Detection

Analyzes the presence of objects and people using YOLO, or a Haar Cascade fallback for person detection when the YOLO model files are not available.

In [None]:
object_detection = extractor.detect_objects_and_people(confidence=0.5, sample_frames=30)

if 'error' not in object_detection:
    print(f"People Detected: {object_detection['total_people_detected']}")
    
    if 'total_objects_detected' in object_detection:
        print(f"Objects Detected: {object_detection['total_objects_detected']}")
        print(f"Person/Object Ratio: {object_detection['person_to_object_ratio']:.2f}")
    
    print(f"Average People per Frame: {object_detection['average_people_per_frame']:.2f}")
    print(f"Dominance: {object_detection['dominance'].capitalize()}")
    print(f"Detection Method: {object_detection.get('detection_method', 'YOLO')}")
else:
    print(f"Object detection unavailable: {object_detection['error']}")
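
The ratio and dominance fields can be read as simple aggregates over per-frame detection counts. The sketch below shows one plausible way to compute them. The function name, the `"balanced"` label, and the handling of a zero object count are assumptions made for illustration.

```python
def summarize_counts(people_per_frame, objects_per_frame):
    """Aggregate per-frame detection counts (hypothetical helper)."""
    total_people = sum(people_per_frame)
    total_objects = sum(objects_per_frame)
    frames = len(people_per_frame)

    # Avoid dividing by zero when no non-person objects were detected.
    if total_objects > 0:
        ratio = total_people / total_objects
    elif total_people > 0:
        ratio = float("inf")
    else:
        ratio = 0.0

    if total_people > total_objects:
        dominance = "person"
    elif total_people < total_objects:
        dominance = "object"
    else:
        dominance = "balanced"

    return {
        "total_people_detected": total_people,
        "total_objects_detected": total_objects,
        "person_to_object_ratio": ratio,
        "average_people_per_frame": total_people / frames if frames else 0.0,
        "dominance": dominance,
    }

# Three sampled frames with made-up counts.
demo_counts = summarize_counts([2, 1, 3], [1, 0, 2])
print(demo_counts)
```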

## 5. Visualizing Results

Let's create some visualizations of the extracted features.

In [None]:
fig, axes = plt.subplots(2, 2, figsize=(15, 10))

if 'shot_cuts' in features and features['shot_cuts']['cut_timestamps']:
    ax = axes[0, 0]
    cut_times = [cut['timestamp'] for cut in features['shot_cuts']['cut_timestamps']]
    ax.eventplot(cut_times, colors='red', linewidths=2)
    ax.set_xlabel('Time (seconds)')
    ax.set_title(f"Shot Cuts Timeline ({len(cut_times)} cuts)")
    ax.set_yticks([])
    ax.grid(True, alpha=0.3)

if 'motion_analysis' in features:
    ax = axes[0, 1]
    motion_data = features['motion_analysis']
    categories = ['Average\nMotion', 'Max\nMotion', 'Std\nDev']
    values = [motion_data['average_motion'], motion_data['max_motion'], motion_data['std_motion']]
    bars = ax.bar(categories, values, color=['#3498db', '#e74c3c', '#2ecc71'])
    ax.set_ylabel('Motion Magnitude')
    ax.set_title(f"Motion Analysis ({motion_data['motion_category']})")
    ax.grid(True, alpha=0.3, axis='y')
    
    for bar in bars:
        height = bar.get_height()
        ax.text(bar.get_x() + bar.get_width()/2., height,
                f'{height:.2f}', ha='center', va='bottom')

if 'text_detection' in features and 'error' not in features['text_detection']:
    ax = axes[1, 0]
    text_data = features['text_detection']
    ratio = text_data['text_present_ratio']
    colors = ['#27ae60', '#ecf0f1']
    sizes = [ratio, 1 - ratio]
    labels = [f'Text Present\n({ratio:.1%})', f'No Text\n({1-ratio:.1%})']
    ax.pie(sizes, labels=labels, colors=colors, autopct='%1.1f%%', startangle=90)
    ax.set_title('Text Presence in Video')

if 'object_detection' in features and 'error' not in features['object_detection']:
    ax = axes[1, 1]
    obj_data = features['object_detection']
    
    if 'total_objects_detected' in obj_data and obj_data['total_objects_detected'] > 0:
        labels = ['People', 'Objects']
        sizes = [obj_data['total_people_detected'], obj_data['total_objects_detected']]
        colors = ['#3498db', '#e67e22']
        ax.pie(sizes, labels=labels, colors=colors, autopct='%1.1f%%', startangle=90)
        ax.set_title(f"Object Detection ({obj_data['dominance'].capitalize()} Dominant)")
    else:
        ax.text(0.5, 0.5, f"People Detected: {obj_data['total_people_detected']}\n\n" +
                f"Avg per frame: {obj_data['average_people_per_frame']:.2f}",
                ha='center', va='center', fontsize=12, transform=ax.transAxes)
        ax.set_title('Object Detection Results')
        ax.axis('off')

plt.tight_layout()
plt.show()

## 6. Save Results to JSON

Save the extracted features to a JSON file for later use.

In [None]:
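# save_features() is assumed here to return the path of the file it wrote.
output_file = extractor.save_features()
print(f"Features saved to: {output_file}")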

print("\nExtracted Features (JSON format):")
print("=" * 60)
print(json.dumps(features, indent=2))
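
If the feature dictionary contains NumPy scalars or arrays, `json.dumps` can raise a `TypeError`. Whether that applies to this tool's output is not confirmed here. The sketch below shows a generic workaround using the `default` hook of `json.dumps`; the helper name `to_builtin` and the sample dictionary are made up for illustration.

```python
import json
import numpy as np

def to_builtin(value):
    """Fallback for json.dumps: convert NumPy values to plain Python types."""
    if isinstance(value, np.bool_):
        return bool(value)
    if isinstance(value, np.integer):
        return int(value)
    if isinstance(value, np.floating):
        return float(value)
    if isinstance(value, np.ndarray):
        return value.tolist()
    raise TypeError(f"Object of type {type(value).__name__} is not JSON serializable")

# Made-up feature values that mix NumPy types.
demo_features = {
    "total_cuts": np.int64(3),
    "average_motion": np.float32(1.5),
    "histogram": np.array([1, 2]),
}

demo_json = json.dumps(demo_features, indent=2, default=to_builtin)
restored = json.loads(demo_json)
print(restored)
```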


## 7. Advanced Usage: Batch Processing

Process multiple videos at once.

In [None]:
import os
import glob

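def process_videos_in_folder(folder_path, output_folder="results"):
    """Extract features for every video in folder_path.

    Writes one JSON file per video into output_folder and returns a dict
    mapping each video path to its success status and output or error.
    """
    os.makedirs(output_folder, exist_ok=True)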
    
    video_extensions = ['*.mp4', '*.avi', '*.mov', '*.mkv', '*.flv', '*.wmv']
    
    video_files = []
    for ext in video_extensions:
        video_files.extend(glob.glob(os.path.join(folder_path, ext)))
    
    print(f"Found {len(video_files)} video files")
    
    results = {}
    
    for i, video_path in enumerate(video_files, 1):
        print(f"\n{'='*60}")
        print(f"Processing video {i}/{len(video_files)}: {os.path.basename(video_path)}")
        print(f"{'='*60}")
        
        try:
            extractor = VideoFeatureExtractor(video_path, sample_rate=30)
            features = extractor.extract_all_features()
            
            base_name = os.path.splitext(os.path.basename(video_path))[0]
            output_path = os.path.join(output_folder, f"{base_name}_features.json")
            extractor.save_features(output_path)
            
            results[video_path] = {
                'success': True,
                'output': output_path,
                'features': features
            }
            
        except Exception as e:
            print(f"Error processing {video_path}: {e}")
            results[video_path] = {
                'success': False,
                'error': str(e)
            }
    
    return results
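
The function above returns a dictionary keyed by video path, with a `success` flag and either an `output` path or an `error` message. A small helper such as the following (the name `summarize_batch` is hypothetical) can separate successes from failures once a batch has run. The sample dictionary is hand-written so the example runs without any video files.

```python
def summarize_batch(results):
    """Split batch results into succeeded paths and a path -> error mapping."""
    succeeded = [path for path, info in results.items() if info["success"]]
    failed = {path: info["error"] for path, info in results.items() if not info["success"]}
    return succeeded, failed

# In the notebook this would come from a call such as:
#     results = process_videos_in_folder("videos")   # "videos" is a placeholder folder
demo_results = {
    "videos/a.mp4": {"success": True, "output": "results/a_features.json", "features": {}},
    "videos/b.avi": {"success": False, "error": "could not open file"},
}

succeeded, failed = summarize_batch(demo_results)
print(f"{len(succeeded)} succeeded, {len(failed)} failed")
for path, error in failed.items():
    print(f"  {path}: {error}")
```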

## 8. Tips and Best Practices

### Performance Optimization
- **Sample Rate**: Increase the `sample_rate` parameter to process fewer frames (faster, but short events between sampled frames may be missed)
- **Frame Selection**: For text detection, sample more frames if text is sparse
- **Video Resolution**: Consider downscaling very high-resolution videos
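
To get a feel for the sample-rate trade-off, the sketch below counts how many frames would be visited under the assumption that `sample_rate=N` means "process every Nth frame". That interpretation is inferred from the tip above rather than confirmed by the tool's documentation, and the helper name is hypothetical.

```python
def sampled_frame_indices(total_frames, sample_rate):
    """Return the frame indices visited when taking every `sample_rate`-th frame."""
    if sample_rate < 1:
        raise ValueError("sample_rate must be at least 1")
    return list(range(0, total_frames, sample_rate))

# A 30-second clip at 30 fps has 900 frames.
total_frames = 900
for rate in (1, 30, 60):
    count = len(sampled_frame_indices(total_frames, rate))
    print(f"sample_rate={rate}: {count} frames processed")
```

Under this assumption, doubling `sample_rate` roughly halves the work, at the cost of coarser temporal resolution.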

### Feature Tuning
- **Shot Cut Threshold**: Lower values detect more cuts but also produce more false positives (30-40 is typical)
- **OCR Accuracy**: Works best with clear, large text
- **Object Detection**: Requires YOLO model files or uses Haar Cascade as fallback

### Common Issues
1. **Missing pytesseract**: Install the Python wrapper with `pip install pytesseract`, and install the Tesseract OCR engine separately (it is a system program, not a pip package)
2. **YOLO models not found**: Tool falls back to Haar Cascade for person detection
3. **Memory issues**: Increase `sample_rate` or process shorter video segments
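
For the first issue, a quick environment check can tell you whether the Python wrapper or the Tesseract binary is missing. The sketch below uses only the standard library. The function name `check_ocr_setup` is hypothetical, and it assumes the engine is exposed as a `tesseract` executable on `PATH`.

```python
import importlib.util
import shutil

def check_ocr_setup():
    """Report whether the OCR prerequisites appear to be installed."""
    return {
        # True when the pytesseract package can be imported.
        "pytesseract_installed": importlib.util.find_spec("pytesseract") is not None,
        # True when a `tesseract` executable is found on PATH.
        "tesseract_on_path": shutil.which("tesseract") is not None,
    }

ocr_status = check_ocr_setup()
for name, ok in ocr_status.items():
    print(f"{name}: {'yes' if ok else 'no'}")
```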

---

## Summary

This tool provides comprehensive video analysis with:
- ✓ **Shot Cut Detection** - Identifies scene transitions
- ✓ **Motion Analysis** - Quantifies camera and subject movement
- ✓ **Text Detection** - Finds text in frames with OCR
- ✓ **Object/Person Detection** - Analyzes visual content composition

For questions or issues, refer to the README.md file.