# 🎬 Video Frame Extractor - Google Colab

Extract frames from videos and save them to Google Drive!

## ⚠️ Important: Avoiding Google Drive I/O Errors

Google Drive can time out when handling thousands of small files. This notebook uses a **local-first approach**:

1. Extract frames to **local Colab storage** (fast)
2. Copy to **Google Drive** in batches (reliable)
3. Clean up local files

This avoids the `Input/output error` you might encounter when writing directly to Drive.
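
The three steps above can be sketched in plain Python. This is only an illustration of the local-first idea, not the code inside `colab_helper.js`; the function name `copy_in_batches` and the parameters `batch_size`, `pause_s`, and `retries` are hypothetical, and the default values are guesses rather than tuned settings.

```python
import shutil
import time
from pathlib import Path


def copy_in_batches(local_dir, drive_dir, batch_size=100, pause_s=1.0, retries=3):
    """Copy every file under local_dir to drive_dir, one batch at a time.

    Hypothetical helper: retries a failed copy a few times and pauses
    between batches so the Drive mount is not flooded with writes.
    Returns the number of files copied.
    """
    local_dir, drive_dir = Path(local_dir), Path(drive_dir)
    files = sorted(p for p in local_dir.rglob('*') if p.is_file())
    copied = 0
    for start in range(0, len(files), batch_size):
        for src in files[start:start + batch_size]:
            # Mirror the local folder layout under the Drive folder.
            dst = drive_dir / src.relative_to(local_dir)
            dst.parent.mkdir(parents=True, exist_ok=True)
            for attempt in range(1, retries + 1):
                try:
                    shutil.copy2(src, dst)
                    copied += 1
                    break
                except OSError:
                    # Drive I/O errors surface as OSError; back off, then retry.
                    if attempt == retries:
                        raise
                    time.sleep(pause_s * attempt)
        # Pause between batches (skipped after the last one).
        if start + batch_size < len(files):
            time.sleep(pause_s)
    return copied
```

A call might look like `copy_in_batches('/content/frames_tmp', '/content/drive/MyDrive/extracted_frames')`, where `/content/frames_tmp` is an assumed local folder. Once the copy succeeds, `shutil.rmtree` on the local folder covers the clean-up step.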

## Step 1: Mount Google Drive

In [None]:
from google.colab import drive
drive.mount('/content/drive')

# Create output folder in Drive
import os
DRIVE_OUTPUT = '/content/drive/MyDrive/extracted_frames'
os.makedirs(DRIVE_OUTPUT, exist_ok=True)
print(f'Output will be saved to: {DRIVE_OUTPUT}')

## Step 2: Install Dependencies

In [None]:
# Install Node.js 18 (required)
!curl -fsSL https://deb.nodesource.com/setup_18.x | sudo -E bash - > /dev/null 2>&1
!sudo apt-get install -y nodejs > /dev/null 2>&1

# Install ffmpeg and yt-dlp
!apt-get install -qq ffmpeg > /dev/null 2>&1
!pip install -q yt-dlp

# Verify
!node --version
!ffmpeg -version 2>&1 | head -1

## Step 3: Setup Video Frame Extractor

In [None]:
%cd /content
!rm -rf video-frame-extractor 2>/dev/null
!git clone https://github.com/user/video-frame-extractor.git
%cd video-frame-extractor
!npm install --silent
print('\n✅ Setup complete!')

## Step 4: Add Your Video URLs

In [None]:
%%writefile links.txt
# Add your video URLs here (one per line)
# Lines starting with # are ignored

# Example:
# https://example.com/video1.mp4
# https://example.com/video2.mp4
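
The file format described in the cell above (one URL per line, `#` lines ignored) can be checked with a few lines of Python before running the extractor. This is a sketch of the rule as stated, not the parser used by `colab_helper.js`; the name `read_links` is hypothetical, and skipping blank lines and indented comments is an assumption.

```python
def read_links(text):
    """Return the URLs in a links file, skipping blank lines and '#' comments."""
    urls = []
    for line in text.splitlines():
        line = line.strip()  # assumption: surrounding whitespace is not significant
        if line and not line.startswith('#'):
            urls.append(line)
    return urls
```

For example, `read_links(open('links.txt').read())` returns an empty list for the template above, because every line in it is a comment. That is a quick way to confirm you have actually added at least one URL.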

## Step 5: Extract Frames (Recommended Method)

This extracts frames to **local storage first**, then copies to Drive in batches.
This is more reliable than writing directly to Drive.

In [None]:
# Configuration
DRIVE_OUTPUT = '/content/drive/MyDrive/extracted_frames'  # Where to save in Drive
FPS = 1  # Frames per second (lower = fewer files)

# Run the helper script (extracts locally, then copies to Drive)
!node colab_helper.js full -u links.txt --fps {FPS} {DRIVE_OUTPUT}

## Alternative: Step-by-Step Extraction

If you have many videos or want more control, do it in steps:

In [None]:
# Step A: Extract frames to LOCAL storage (fast)
!node colab_helper.js extract -u links.txt --fps 1

In [None]:
# Step B: Copy to Google Drive (batched, reliable)
!node colab_helper.js copy /content/drive/MyDrive/extracted_frames

In [None]:
# Step C: Clean up local temp files
!node colab_helper.js cleanup

## Check Results

In [None]:
import os

DRIVE_OUTPUT = '/content/drive/MyDrive/extracted_frames'

if os.path.exists(DRIVE_OUTPUT):
    folders = [f for f in os.listdir(DRIVE_OUTPUT) if os.path.isdir(os.path.join(DRIVE_OUTPUT, f))]
    print(f'📁 Found {len(folders)} video folder(s):\n')
    
    total_frames = 0
    for folder in sorted(folders):
        folder_path = os.path.join(DRIVE_OUTPUT, folder)
        frames = [f for f in os.listdir(folder_path) if f.endswith('.png')]
        total_frames += len(frames)
        print(f'  📂 {folder}: {len(frames)} frames')
    
    print(f'\n📊 Total: {total_frames} frames')
else:
    print('No frames extracted yet.')

## Preview Frames

In [None]:
from IPython.display import Image, display
import glob
import os  # needed for os.path below if this cell runs on its own

DRIVE_OUTPUT = '/content/drive/MyDrive/extracted_frames'
frames = sorted(glob.glob(f'{DRIVE_OUTPUT}/**/*.png', recursive=True))

if frames:
    print(f'Found {len(frames)} total frames. Showing first 3:\n')
    for frame in frames[:3]:
        print(f'📷 {os.path.basename(os.path.dirname(frame))}/{os.path.basename(frame)}')
        display(Image(filename=frame, width=400))
        print()
else:
    print('No frames found.')

---

## 💡 Tips for Large Batches

| Issue | Solution |
|-------|----------|
| Drive I/O errors | Use `colab_helper.js` (extracts locally first) |
| Too many files | Lower FPS (e.g., `--fps 0.5` = 1 frame every 2 seconds) |
| Session timeout | Re-run the same command; cached videos are skipped |
| Slow extraction | Use `-c 1` for reliability over speed |
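
To judge how far to lower the FPS, it helps to estimate the file count first. The sketch below assumes the frame count is roughly the video duration multiplied by the FPS value; the real count from ffmpeg can differ by a frame or so, and `estimate_frames` is a hypothetical helper, not part of the extractor.

```python
import math


def estimate_frames(duration_s, fps):
    """Rough number of frames extracted from a video of duration_s seconds."""
    return math.ceil(duration_s * fps)


# A 10-minute video (600 seconds):
print(estimate_frames(600, 1))    # 600 frames at --fps 1
print(estimate_frames(600, 0.5))  # 300 frames at --fps 0.5
```

Summing this estimate over all videos in `links.txt` gives a ballpark for how many files will end up in Drive.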

## 📋 Command Reference

```bash
# Full pipeline (recommended)
node colab_helper.js full -u links.txt --fps 1 /content/drive/MyDrive/frames

# Step by step
node colab_helper.js extract -u links.txt --fps 1
node colab_helper.js copy /content/drive/MyDrive/frames
node colab_helper.js cleanup

# Resume after restart (cached videos are skipped)
node colab_helper.js full -u links.txt --fps 1 /content/drive/MyDrive/frames

# Force re-extract
node colab_helper.js full -u links.txt --fps 1 --force /content/drive/MyDrive/frames
```