# Automatic Video Summarization Demo

This notebook demonstrates the automatic video summarization pipeline using shot boundary detection and keyframe extraction.

## Pipeline Overview

1. **Load Video**: Load and sample frames from the input video
2. **Feature Extraction**: Compute HSV color histograms and edge histograms
3. **Distance Calculation**: Calculate frame-to-frame distances
4. **Smoothing**: Apply smoothing to the distance curve
5. **Boundary Detection**: Detect shot boundaries using adaptive thresholding
6. **Keyframe Extraction**: Extract one representative keyframe per shot
7. **Export**: Generate storyboard, JSON, and optional summary video

In [None]:
# Import required libraries
import sys
sys.path.insert(0, '..')

from src.video_summarizer import VideoSummarizer
from src.export_utils import export_all
import matplotlib.pyplot as plt
import cv2
import numpy as np

%matplotlib inline
plt.rcParams['figure.figsize'] = (12, 8)

## 1. Configuration

Set the input video path and pipeline parameters.

In [None]:
# Configuration
VIDEO_PATH = 'path/to/your/video.mp4'  # Change this to your video path
OUTPUT_DIR = '../output'
SAMPLE_RATE = 1  # Sample every Nth frame (1 = every frame)
THRESHOLD_PERCENTILE = 75  # Percentile for adaptive threshold
MIN_SHOT_LENGTH = 10  # Minimum frames per shot
WINDOW_SIZE = 5  # Smoothing window size
KEYFRAME_METHOD = 'middle'  # 'middle', 'first', or 'last'

## 2. Initialize Video Summarizer

Create a `VideoSummarizer` instance for the input video. Only the sample rate is set here; the detection and keyframe parameters are passed to the later steps.

In [None]:
# Initialize summarizer
summarizer = VideoSummarizer(VIDEO_PATH, sample_rate=SAMPLE_RATE)

## 3. Load Video

Load the video and sample frames.
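
When `SAMPLE_RATE` is greater than 1, positions in the sampled frame list no longer equal frame numbers in the source video. The sketch below shows one way to map between the two, assuming the loader keeps frames 0, N, 2N, and so on. This notebook does not confirm how `VideoSummarizer` samples, and the helper names and the `fps` value are hypothetical.

```python
def sampled_to_source(sampled_index: int, sample_rate: int) -> int:
    """Map an index in the sampled frame list back to a source frame number.

    Assumes frames 0, N, 2N, ... were kept (an assumption, not confirmed API).
    """
    if sample_rate < 1:
        raise ValueError("sample_rate must be >= 1")
    return sampled_index * sample_rate


def source_timestamp(sampled_index: int, sample_rate: int, fps: float) -> float:
    """Approximate timestamp in seconds of a sampled frame."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return sampled_to_source(sampled_index, sample_rate) / fps


# Example: sampled index 11 when every 5th frame is kept from a 25 fps video
print(sampled_to_source(11, 5))       # 55
print(source_timestamp(11, 5, 25.0))  # 2.2
```

The same conversion is useful later when reading boundary and keyframe indices, if those turn out to be expressed in sampled-frame positions.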

In [None]:
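# Load video
summarizer.load_video()

# Fail early with a clear message if nothing was read (e.g. VIDEO_PATH is wrong)
if len(summarizer.frames) == 0:
    raise RuntimeError(f"No frames loaded from {VIDEO_PATH!r}")

print(f"Loaded {len(summarizer.frames)} frames")
print(f"Frame shape: {summarizer.frames[0].shape}")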

### Visualize Sample Frames

In [None]:
# Display first few frames
fig, axes = plt.subplots(1, 4, figsize=(16, 4))
for i, ax in enumerate(axes):
    ax.axis('off')  # Hide axes on every panel, including unused ones
    if i < len(summarizer.frames):
        frame_rgb = cv2.cvtColor(summarizer.frames[i], cv2.COLOR_BGR2RGB)
        ax.imshow(frame_rgb)
        ax.set_title(f'Frame {i}')
plt.tight_layout()
plt.show()

## 4. Compute Frame Distances

Calculate frame-to-frame distances using color and edge histograms.
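
To make the idea concrete, here is a minimal sketch of a histogram-based frame distance. It covers only a hue histogram and uses half the L1 distance as the metric; both choices are assumptions for illustration, and the actual features, edge histogram, and weighting inside `compute_distances()` are not shown in this notebook. Frames are assumed to be already in OpenCV's 8-bit HSV layout (for example via `cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)`), and the function names are hypothetical.

```python
import numpy as np


def hue_histogram(hsv_frame, bins=16):
    """Normalized histogram of the hue channel (OpenCV 8-bit hue lies in [0, 180))."""
    hue = hsv_frame[..., 0].ravel()
    hist, _ = np.histogram(hue, bins=bins, range=(0, 180))
    total = hist.sum()
    return hist / total if total > 0 else hist.astype(float)


def histogram_distance(h1, h2):
    """Half the L1 distance between two normalized histograms, in [0, 1]."""
    return 0.5 * float(np.abs(h1 - h2).sum())


def frame_distances(hsv_frames):
    """Distance between each pair of consecutive frames (length N - 1)."""
    hists = [hue_histogram(f) for f in hsv_frames]
    return np.array([histogram_distance(a, b) for a, b in zip(hists, hists[1:])])


# Synthetic example: two identical frames with hue 0, then one frame with hue 60
frame_a = np.zeros((4, 4, 3), dtype=np.uint8)
frame_b = np.full((4, 4, 3), 60, dtype=np.uint8)
distances = frame_distances([frame_a, frame_a, frame_b])
print(distances)  # [0. 1.]
```

Identical frames give a distance of 0, and frames whose hues fall into disjoint bins give the maximum of 1, which is the kind of spike the later steps look for.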

In [None]:
# Compute distances
summarizer.compute_distances()

print("Distance statistics:")
print(f"  Min: {summarizer.distances.min():.3f}")
print(f"  Max: {summarizer.distances.max():.3f}")
print(f"  Mean: {summarizer.distances.mean():.3f}")
print(f"  Std: {summarizer.distances.std():.3f}")

### Visualize Distance Curve

In [None]:
# Plot raw distance curve
plt.figure(figsize=(14, 5))
plt.plot(summarizer.distances, linewidth=1, alpha=0.7)
plt.xlabel('Frame Index')
plt.ylabel('Distance')
plt.title('Frame-to-Frame Distance Curve (Raw)')
plt.grid(True, alpha=0.3)
plt.show()

## 5. Detect Shot Boundaries

Apply smoothing and adaptive thresholding to detect shot boundaries.
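
The sketch below illustrates one plausible reading of this step: a moving average, a percentile threshold on the smoothed curve, and a minimum shot length. It is not the code behind `detect_boundaries()`, which this notebook does not show. Grouping above-threshold samples into runs and placing the cut at the largest raw distance in each run is an assumption, as is the `[0, ..., total_frames]` boundary layout.

```python
import numpy as np


def smooth(distances, window_size):
    """Moving average with same-length output (edges are implicitly zero-padded)."""
    kernel = np.ones(window_size) / window_size
    return np.convolve(distances, kernel, mode="same")


def detect_boundaries(distances, threshold_percentile=75, min_shot_length=10, window_size=5):
    """Return [0, cut_1, ..., total_frames] for a frame-to-frame distance curve."""
    distances = np.asarray(distances, dtype=float)
    smoothed = smooth(distances, window_size)
    threshold = np.percentile(smoothed, threshold_percentile)
    above = smoothed > threshold

    boundaries = [0]  # the first shot starts at frame 0
    i = 0
    while i < len(above):
        if not above[i]:
            i += 1
            continue
        # Extend the run of consecutive above-threshold samples
        j = i
        while j + 1 < len(above) and above[j + 1]:
            j += 1
        # distances[k] compares frame k with frame k + 1, so the cut is at k + 1
        k = i + int(np.argmax(distances[i:j + 1]))
        cut = k + 1
        if cut - boundaries[-1] >= min_shot_length:
            boundaries.append(cut)
        i = j + 1
    boundaries.append(len(distances) + 1)  # total number of frames
    return boundaries


# Synthetic curve for 60 frames with abrupt changes between frames 19/20 and 39/40
d = np.full(59, 0.05)
d[19] = d[39] = 0.9
print(detect_boundaries(d, threshold_percentile=90, min_shot_length=10, window_size=3))
# [0, 20, 40, 60]
```

A percentile threshold marks a fixed fraction of samples as candidates by construction, so on a video with very few cuts a low percentile can sit inside the noise floor. Raising the percentile or the minimum shot length is the usual way to compensate.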

In [None]:
# Detect boundaries
summarizer.detect_boundaries(
    threshold_percentile=THRESHOLD_PERCENTILE,
    min_shot_length=MIN_SHOT_LENGTH,
    window_size=WINDOW_SIZE
)

print(f"Detected {len(summarizer.boundaries) - 1} shots")
# Show at most the first 10 boundaries, with the same label in both cases
shown = summarizer.boundaries[:10]
suffix = "..." if len(summarizer.boundaries) > 10 else ""
print(f"Boundaries at frames: {shown}{suffix}")

### Visualize Distance Curve with Boundaries

In [None]:
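# Plot smoothed curve with boundaries
# The threshold is recomputed here for display only; this assumes
# detect_boundaries() applies the same percentile to the same smoothed curve.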
smoothed = summarizer.smooth_distances(WINDOW_SIZE)
threshold = np.percentile(smoothed, THRESHOLD_PERCENTILE)

plt.figure(figsize=(14, 5))
plt.plot(smoothed, linewidth=1.5, label='Smoothed Distance', color='blue', alpha=0.7)
plt.axhline(y=threshold, color='red', linestyle='--', linewidth=2, label=f'Threshold: {threshold:.3f}')

# Mark boundaries
for boundary in summarizer.boundaries:
    if 0 < boundary < len(smoothed):
        plt.axvline(x=boundary, color='green', linestyle='-', linewidth=1, alpha=0.5)

plt.xlabel('Frame Index')
plt.ylabel('Distance')
plt.title('Frame-to-Frame Distance Curve with Shot Boundaries')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

## 6. Extract Keyframes

Extract one representative keyframe per shot.
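
As an illustration of the three `KEYFRAME_METHOD` options, the following sketch picks one index per shot from a boundary list. It assumes boundaries have the form `[start_0, start_1, ..., total_frames]`; the real `extract_keyframes()` may define shots and the "middle" frame differently, and `select_keyframes` is a hypothetical name.

```python
def select_keyframes(boundaries, method="middle"):
    """Pick one frame index per shot.

    `boundaries` is assumed to be [start_0, start_1, ..., total_frames], so
    shot i covers frames boundaries[i] .. boundaries[i + 1] - 1.
    """
    indices = []
    for start, end in zip(boundaries, boundaries[1:]):
        last = end - 1
        if method == "first":
            indices.append(start)
        elif method == "last":
            indices.append(last)
        elif method == "middle":
            indices.append((start + last) // 2)
        else:
            raise ValueError(f"unknown method: {method!r}")
    return indices


print(select_keyframes([0, 20, 40, 60], "middle"))  # [9, 29, 49]
print(select_keyframes([0, 20, 40, 60], "last"))    # [19, 39, 59]
```

The middle frame is a common default because the first and last frames of a shot are the most likely to contain transition artifacts such as fades or dissolves.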

In [None]:
# Extract keyframes
summarizer.extract_keyframes(method=KEYFRAME_METHOD)

print(f"Extracted {len(summarizer.keyframes)} keyframes")
print(f"Keyframe indices: {summarizer.keyframe_indices[:10]}..." if len(summarizer.keyframe_indices) > 10 else f"Keyframe indices: {summarizer.keyframe_indices}")

### Visualize Keyframes

In [None]:
# Display keyframes in a grid
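n_keyframes = min(len(summarizer.keyframes), 12)  # Show up to 12 keyframes
cols = 4
rows = max(1, (n_keyframes + cols - 1) // cols)  # At least one row, even with no keyframes

# squeeze=False always returns a 2D array of axes, so flatten() works for any grid size
fig, axes = plt.subplots(rows, cols, figsize=(16, rows * 3), squeeze=False)
axes = axes.flatten()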

for i in range(rows * cols):
    if i < n_keyframes:
        frame_rgb = cv2.cvtColor(summarizer.keyframes[i], cv2.COLOR_BGR2RGB)
        axes[i].imshow(frame_rgb)
        axes[i].set_title(f'Shot {i+1}\nFrame {summarizer.keyframe_indices[i]}')
    axes[i].axis('off')

plt.tight_layout()
plt.show()

## 7. Get Summary Data

Retrieve the complete summary information.
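
For orientation, here is a sketch of a dictionary with the same keys that the cell below reads (`video_path`, `total_shots`, `total_frames_sampled`, and per-shot `shot_id`, `start_frame`, `end_frame`, `keyframe`). How `get_summary_data()` actually fills these fields is not shown in this notebook; the 1-based `shot_id`, the inclusive `end_frame`, and the `build_summary` helper are assumptions.

```python
import json


def build_summary(video_path, boundaries, keyframe_indices):
    """Assemble a summary dict from [start_0, ..., total_frames] boundaries."""
    shots = []
    shot_spans = zip(boundaries, boundaries[1:], keyframe_indices)
    for shot_id, (start, end, key) in enumerate(shot_spans, start=1):
        shots.append({
            "shot_id": shot_id,          # assumed to be 1-based
            "start_frame": int(start),
            "end_frame": int(end) - 1,   # assumed to be inclusive
            "keyframe": int(key),
        })
    return {
        "video_path": video_path,
        "total_shots": len(shots),
        "total_frames_sampled": int(boundaries[-1]),
        "shots": shots,
    }


summary = build_summary("example.mp4", [0, 20, 40, 60], [9, 29, 49])
print(json.dumps(summary["shots"][0]))
# {"shot_id": 1, "start_frame": 0, "end_frame": 19, "keyframe": 9}
```

The `int()` conversions matter when indices are NumPy integers, because `json.dumps` raises `TypeError` on types such as `numpy.int64`.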

In [None]:
# Get summary data
summary_data = summarizer.get_summary_data()

print(f"Video: {summary_data['video_path']}")
print(f"Total shots: {summary_data['total_shots']}")
print(f"Total frames sampled: {summary_data['total_frames_sampled']}")
print("\nFirst 5 shots:")
for shot in summary_data['shots'][:5]:
    print(f"  Shot {shot['shot_id']}: frames {shot['start_frame']}-{shot['end_frame']}, keyframe: {shot['keyframe']}")

## 8. Export Results

Export all results including storyboard, JSON, distance curve, and optional summary video.
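
A storyboard is essentially the keyframes tiled into one image. The sketch below shows one way to build such a grid with NumPy, assuming all keyframes share the same height, width, and channel count. It is not the implementation inside `export_all()`, and `make_storyboard` is a hypothetical helper.

```python
import numpy as np


def make_storyboard(keyframes, cols=4):
    """Tile equally sized H x W x C frames into a grid, padding with black."""
    if len(keyframes) == 0:
        raise ValueError("no keyframes to tile")
    h, w, c = keyframes[0].shape
    rows = (len(keyframes) + cols - 1) // cols
    board = np.zeros((rows * h, cols * w, c), dtype=keyframes[0].dtype)
    for i, frame in enumerate(keyframes):
        r, col = divmod(i, cols)
        board[r * h:(r + 1) * h, col * w:(col + 1) * w] = frame
    return board


# Five synthetic 160x90 frames tiled four per row give a 2 x 4 grid
frames = [np.full((90, 160, 3), v, dtype=np.uint8) for v in (50, 100, 150, 200, 250)]
board = make_storyboard(frames, cols=4)
print(board.shape)  # (180, 640, 3)
```

The resulting array could then be written to disk with `cv2.imwrite`, which expects the same BGR layout that the frames in this notebook use.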

In [None]:
# Export all results
output_files = export_all(
    keyframes=summarizer.keyframes,
    keyframe_indices=summarizer.keyframe_indices,
    summary_data=summary_data,
    distances=summarizer.distances,
    boundaries=summarizer.boundaries,
    output_dir=OUTPUT_DIR,
    base_name='demo_summary',
    create_video=True  # Set to False to skip video creation
)

print("\nGenerated files:")
for output_type, path in output_files.items():
    print(f"  {output_type}: {path}")

## 9. Complete Pipeline (All-in-One)

Alternatively, run the entire pipeline with a single method call.

In [None]:
# Create a new summarizer instance
summarizer_fast = VideoSummarizer(VIDEO_PATH, sample_rate=SAMPLE_RATE)

# Run complete pipeline
summary_data_fast = summarizer_fast.run_pipeline(
    threshold_percentile=THRESHOLD_PERCENTILE,
    min_shot_length=MIN_SHOT_LENGTH,
    window_size=WINDOW_SIZE,
    keyframe_method=KEYFRAME_METHOD
)

# Export results
output_files_fast = export_all(
    keyframes=summarizer_fast.keyframes,
    keyframe_indices=summarizer_fast.keyframe_indices,
    summary_data=summary_data_fast,
    distances=summarizer_fast.distances,
    boundaries=summarizer_fast.boundaries,
    output_dir=OUTPUT_DIR,
    base_name='fast_summary',
    create_video=False
)

print(f"\nSummary: {summary_data_fast['total_shots']} shots detected")

## Conclusion

This notebook demonstrated the complete automatic video summarization pipeline:

1. ✅ Video loading and frame sampling
2. ✅ Feature extraction (HSV color + edge histograms)
3. ✅ Frame-to-frame distance calculation
4. ✅ Distance curve smoothing
5. ✅ Adaptive threshold-based boundary detection
6. ✅ Keyframe extraction (one per shot)
7. ✅ Export (storyboard, JSON, video)

### Next Steps

- Experiment with different parameter values (threshold, window size, etc.)
- Try different keyframe selection methods
- Test on different types of videos (lectures, movies, sports, etc.)
- Customize the export formats to suit your needs