# Isolating the Pitcher

This notebook shows how to crop each video to the pitcher and trim it to the pre-release phase.

## Why Isolate?

Raw broadcast videos include:
- Multiple camera angles (replays, crowd shots, etc.)
- Full frame with batter, catcher, umpire
- Post-pitch action (fielding, running, etc.)

For pitch analysis, we want:
- **Just the pitcher** - cropped to focus on the mound
- **Pre-release only** - the windup/set through release (for tipping analysis)
- **Consistent framing** - same crop for all videos

In [None]:
import sys
from pathlib import Path
sys.path.insert(0, str(Path.cwd().parent))

from mlb_pitcher_videos import PitcherIsolator

## Check Downloaded Videos

In [None]:
# Check for downloaded videos
videos_dir = Path('../data/videos')
output_dir = Path('../data/videos_isolated')

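videos = []  # Stays empty if the directory is missing, so later cells can still test `if videos:`
if not videos_dir.exists():
    print("No videos found!")
    print("Run 03_download_videos.ipynb first.")
else:
    # Sort so that videos[0] is the same file on every run
    videos = sorted(videos_dir.glob('*.mp4'))
    print(f"Found {len(videos)} videos in {videos_dir}")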

## Process a Single Video

Let's start by processing one video to see how it works.

In [None]:
# Initialize isolator
isolator = PitcherIsolator()

# CONFIGURE: Maximum duration (seconds)
MAX_DURATION = 2.5  # Keep first 2.5 seconds (pre-release)
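
`MAX_DURATION` is given in seconds, so the number of frames it keeps depends on the frame rate of the source video. The sketch below shows the arithmetic only. The helper `frames_for_duration` is hypothetical and not part of `mlb_pitcher_videos`, and whether `PitcherIsolator` truncates or rounds partial frames is an assumption.

```python
import math


def frames_for_duration(fps: float, max_duration: float) -> int:
    """Return the number of whole frames that fit in max_duration seconds.

    Hypothetical helper for illustration; it truncates partial frames,
    which may differ from what PitcherIsolator does internally.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    if max_duration < 0:
        raise ValueError("max_duration must be non-negative")
    # Small epsilon guards against float error such as 74.99999999 for 75
    return math.floor(fps * max_duration + 1e-9)


frames_at_30 = frames_for_duration(30, 2.5)
frames_at_60 = frames_for_duration(60, 2.5)
print(f"30 fps: {frames_at_30} frames, 60 fps: {frames_at_60} frames")
```

Comparing this estimate with `metadata['output_frames']` after processing is a quick way to confirm the trim behaved as expected.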

In [None]:
# Process one video as a test
if videos:
    test_video = videos[0]
    output_path = output_dir / f"{test_video.stem}_isolated.mp4"
    output_dir.mkdir(parents=True, exist_ok=True)  # Also create ../data if it is missing
    
    print(f"Processing: {test_video.name}")
    
    metadata = isolator.process_video(
        test_video,
        output_path,
        max_duration=MAX_DURATION,
    )
    
    print("\nResults:")
    print(f"  Input: {metadata['input_size'][0]}x{metadata['input_size'][1]}, {metadata['input_frames']} frames")
    print(f"  Output: {metadata['output_size'][0]}x{metadata['output_size'][1]}, {metadata['output_frames']} frames")
    print(f"  Duration: {metadata['duration']:.2f}s")
    print(f"  Saved to: {output_path}")

## Process All Videos

Now let's process all videos in the directory.

In [None]:
# CONFIGURE: How many videos to process (None = all)
MAX_VIDEOS = 10  # Start small, then increase

In [None]:
# Process all videos
print(f"Processing videos from: {videos_dir}")
print(f"Output directory: {output_dir}")
print(f"Max duration: {MAX_DURATION}s")
print()

results = isolator.process_directory(
    videos_dir,
    output_dir,
    max_duration=MAX_DURATION,
    max_videos=MAX_VIDEOS,
)

In [None]:
# Summary
print("\n" + "="*40)
print("Processing Summary")
print("="*40)
print(f"Videos processed: {len(results)}")

if results:
    avg_duration = sum(r['duration'] for r in results) / len(results)
    avg_frames = sum(r['output_frames'] for r in results) / len(results)
    print(f"Average duration: {avg_duration:.2f}s")
    print(f"Average frames: {avg_frames:.0f}")
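
Since consistent framing is one of the goals, it can be worth checking that every processed video ended up with the same output size. The sketch below assumes each entry in `results` carries the same `output_size` key as the single-video metadata shown earlier. The helper `summarize_output_sizes` and the sample data are hypothetical.

```python
from collections import Counter


def summarize_output_sizes(results):
    """Count how many results share each (width, height) output size."""
    return Counter(tuple(r["output_size"]) for r in results)


# Hypothetical sample data standing in for the real `results` list
sample_results = [
    {"output_size": (400, 600), "output_frames": 75, "duration": 2.5},
    {"output_size": (400, 600), "output_frames": 75, "duration": 2.5},
    {"output_size": (380, 600), "output_frames": 74, "duration": 2.47},
]

size_counts = summarize_output_sizes(sample_results)
is_consistent = len(size_counts) <= 1

for (width, height), count in size_counts.most_common():
    print(f"  {width}x{height}: {count} video(s)")
print(f"Consistent framing: {is_consistent}")
```

To run it on real output, pass `results` instead of `sample_results`.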

In [None]:
# List isolated videos
isolated = sorted(output_dir.glob('*.mp4'))  # Sorted for a stable listing
print(f"\nIsolated videos in {output_dir}:")
for v in isolated[:10]:
    size_mb = v.stat().st_size / (1024 * 1024)
    print(f"  {v.name} ({size_mb:.1f} MB)")
if len(isolated) > 10:
    print(f"  ... and {len(isolated) - 10} more")

## Process All Videos (with Resume)

For large batches, pass a checkpoint file so that an interrupted run can be resumed.

In [None]:
# Process with checkpoint (uncomment to run)
# results = isolator.process_directory(
#     videos_dir,
#     output_dir,
#     max_duration=MAX_DURATION,
#     max_videos=None,  # Process all
#     checkpoint_file=Path('../data/isolate_checkpoint.json'),
# )
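
How `process_directory` uses the checkpoint file is not shown in this notebook. As a rough illustration of the general idea, the sketch below records finished file names in a JSON file and skips them on the next run. All function names here are hypothetical, and the real checkpoint format may differ.

```python
import json
import tempfile
from pathlib import Path


def load_done(checkpoint_file: Path) -> set:
    """Return the set of file names already recorded as processed."""
    if checkpoint_file.exists():
        return set(json.loads(checkpoint_file.read_text()))
    return set()


def save_done(checkpoint_file: Path, done: set) -> None:
    """Write the checkpoint via a temp file so a crash cannot leave it half-written."""
    tmp_file = checkpoint_file.with_suffix(".tmp")
    tmp_file.write_text(json.dumps(sorted(done)))
    tmp_file.replace(checkpoint_file)


def process_with_resume(video_names, checkpoint_file, process_one):
    """Process each name not yet in the checkpoint; return the names handled this run."""
    done = load_done(checkpoint_file)
    newly_processed = []
    for name in video_names:
        if name in done:
            continue  # Finished in an earlier run
        process_one(name)
        done.add(name)
        save_done(checkpoint_file, done)  # Persist after every video
        newly_processed.append(name)
    return newly_processed


# Demo with a no-op processor and a temporary checkpoint file
with tempfile.TemporaryDirectory() as tmp_dir:
    ckpt = Path(tmp_dir) / "isolate_checkpoint.json"
    first_run = process_with_resume(["a.mp4", "b.mp4"], ckpt, lambda name: None)
    second_run = process_with_resume(["a.mp4", "b.mp4", "c.mp4"], ckpt, lambda name: None)

print(first_run, second_run)
```

Saving after every video costs a small write each time but means at most one video is repeated after an interruption.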

## Done!

Your isolated videos are ready for analysis. Each video now contains:
- **Cropped frame** focused on the pitcher
- **Up to 2.5 seconds** (`MAX_DURATION`) of pre-release footage (windup through release)
- **Consistent framing** across all videos

### Next Steps

You can now use these videos for:
- Pitch tipping analysis
- Pose estimation and biomechanics
- Training machine learning models
- Manual video review