🎥 Smart Extract

Intelligent GPU-accelerated frame extraction for presentation and lecture videos

Smart Extract is a high-performance Python tool that automatically finds and exports the best possible still frames from long presentation recordings such as talks, lectures, demos, and keynotes.

Instead of dumping thousands of near-identical frames, Smart Extract scans the entire video, scores visual quality, enforces temporal diversity, and optionally applies AI super-resolution to produce a curated, portfolio-ready image set.



✨ What Makes Smart Extract Different?

Most frame-grab tools blindly sample frames by time.

Smart Extract understands quality.

It evaluates every candidate frame and keeps only the best moments.

Key Advantages

  • 🧠 Global Best-Frame Selection
    Scores frames by sharpness + brightness across the entire video

  • Temporal Diversity Enforcement
    Prevents near-duplicate frames from the same moment

  • GPU-Accelerated Scoring (Optional)
    Uses PyTorch with CUDA to speed up scoring on NVIDIA GPUs (RTX series recommended)

  • 🖼 AI Super-Resolution (4x)
    Optional RealESRGAN upscaling for ultra-clean slides and thumbnails

  • Auto-Crop for Presentation Screens
    Removes dead space and zooms into the slide content

  • 🎨 Optional Color Normalization
    Corrects projector color casts such as purple or yellow shifts

  • 🚀 Fast Video Decoding
    Uses Decord when available with OpenCV fallback
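The "Decord when available, OpenCV otherwise" behavior could be decided once at startup. The sketch below only checks which modules are importable; the function name `pick_decoder_backend` and its return convention are hypothetical and not taken from `smart_extract.py`.

```python
import importlib.util


def pick_decoder_backend(preferred=("decord", "cv2")):
    """Return the first importable backend name, or None if none is installed.

    The preference order mirrors the README (Decord first, OpenCV second).
    The function itself is an illustrative assumption.
    """
    for module_name in preferred:
        # find_spec returns None when a top-level module cannot be found.
        if importlib.util.find_spec(module_name) is not None:
            return module_name
    return None


backend = pick_decoder_backend()
print(f"video backend: {backend}")
```

Checking with `find_spec` avoids importing a heavy library just to learn whether it exists.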


🧠 How It Works (High Level)

  1. Scan the entire video at a configurable low FPS
  2. Score each sampled frame using:
    • Laplacian sharpness
    • Mean brightness
  3. Rank all frames globally by visual quality
  4. Select the top K frames while enforcing a minimum time gap
  5. Post-process selected frames:
    • Optional crop
    • Optional color correction
    • Optional AI upscaling
  6. Save only the final best frames

Result: A clean, diverse, high-quality image set instead of noise.
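To make the scoring step concrete, here is a minimal, dependency-free sketch of Laplacian sharpness and mean brightness on a tiny grayscale grid. With OpenCV the usual idiom is `cv2.Laplacian(gray, cv2.CV_64F).var()`; whether the project uses exactly that, and how it combines the two measures, is not stated here, so `quality_score` and its weight are assumptions.

```python
from statistics import fmean, pvariance


def mean_brightness(gray):
    """Average pixel value of a grayscale image given as rows of numbers."""
    return fmean(v for row in gray for v in row)


def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over interior pixels.

    Sharp edges produce large positive and negative responses, so a
    higher variance suggests a sharper frame.
    """
    h, w = len(gray), len(gray[0])
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    return pvariance(responses)


def quality_score(gray, brightness_weight=0.1):
    # The weighting is a made-up placeholder, not the project's formula.
    return laplacian_variance(gray) + brightness_weight * mean_brightness(gray)


flat = [[128] * 6 for _ in range(6)]  # uniform image, no detail
checker = [[255 * ((x + y) % 2) for x in range(6)] for y in range(6)]

print(laplacian_variance(flat), laplacian_variance(checker))
```

The uniform image scores zero sharpness, while the checkerboard, which is all edges, scores very high.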


🏗 Architecture Overview

flowchart TD
    A[Input Video File] --> B[Frame Sampling]
    B --> C[Sharpness Scoring]
    B --> D[Brightness Scoring]
    C --> E[Global Frame Ranking]
    D --> E[Global Frame Ranking]
    E --> F[Temporal Diversity Filter]
    F --> G[Best Frame Selection]
    G --> H[Optional Auto Crop]
    H --> I[Optional Color Correction]
    I --> J[Optional AI Super Resolution]
    J --> K[Final High Quality Frames]

📸 Ideal Use Cases

  • 🎤 Conference talks and keynote recordings
  • 🎓 University lectures and classes
  • 🧑‍🏫 Internal presentations and demos
  • 📊 Slide reconstruction from recorded sessions
  • 🖼 Thumbnail generation
  • 📁 Portfolio or documentation screenshots

⚙ Requirements

Core

  • Python 3.9 or higher
  • Windows, Linux, or macOS

Recommended

  • NVIDIA GPU with CUDA support
  • RTX series GPU for best performance

📦 Installation

Clone the repository:

git clone https://github.com/EyalPasha/smart-extract.git
cd smart-extract

Install dependencies:

pip install -r requirements.txt

This installs:

  • torch (CUDA-enabled if available)
  • opencv-python
  • numpy
  • decord for fast video decoding
  • realesrgan and basicsr for AI upscaling

🔧 Configuration

Nearly all behavior is controlled via config.py, so typical runs require no code changes.

Core Settings

VIDEO_PATH = "IMG_8313.MOV"
OUTPUT_DIR = "photos"
OUTPUT_FORMAT = "PNG"

🎯 Global Best-Frame Mode (Primary Feature)

USE_GLOBAL_BEST_FRAMES = True
GLOBAL_TARGET_FRAMES = 100
GLOBAL_SAMPLE_FPS = 2.0
GLOBAL_MIN_TIME_BETWEEN = 8.0
GLOBAL_START_TIME_SECONDS = 27.0

This mode:

  • Scans the entire video
  • Picks the top K frames globally
  • Ensures visual and temporal diversity
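One plausible way to combine "top K" with a minimum time gap is a greedy pass over candidates sorted by score. This is a sketch under that assumption; the function name and the `(timestamp, score)` tuple layout are hypothetical, and the parameters stand in for `GLOBAL_TARGET_FRAMES` and `GLOBAL_MIN_TIME_BETWEEN`.

```python
def select_best_frames(candidates, target_frames, min_gap_seconds):
    """Pick up to target_frames (timestamp, score) pairs, best score first,
    skipping any candidate closer than min_gap_seconds to one already kept."""
    kept = []
    for timestamp, score in sorted(candidates, key=lambda c: c[1], reverse=True):
        if all(abs(timestamp - t) >= min_gap_seconds for t, _ in kept):
            kept.append((timestamp, score))
            if len(kept) == target_frames:
                break
    return sorted(kept)  # chronological order for readability


candidates = [(30.0, 900), (31.0, 950), (45.0, 700), (46.5, 720), (80.0, 400)]
picked = select_best_frames(candidates, target_frames=3, min_gap_seconds=8.0)
print(picked)  # [(31.0, 950), (46.5, 720), (80.0, 400)]
```

The frames at 30.0 s and 45.0 s are dropped because a higher-scoring neighbor within 8 seconds was already kept.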

Output directory:

photos/best

⚡ GPU Acceleration and 🖼 AI Upscaling

USE_GPU_SCORING = True
ENABLE_SUPER_RESOLUTION = True

  • GPU scoring accelerates sharpness and brightness evaluation
  • RealESRGAN upscales selected frames by 4x for maximum clarity
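A CPU fallback for GPU scoring might look like the following sketch. `torch.cuda.is_available()` is the standard PyTorch check; the helper name `pick_device` and the way it consumes the `USE_GPU_SCORING` flag are assumptions for illustration.

```python
def pick_device(use_gpu_scoring=True):
    """Return "cuda" when GPU scoring is requested and usable, else "cpu"."""
    if not use_gpu_scoring:
        return "cpu"
    try:
        import torch  # optional dependency; absence means CPU-only
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(f"scoring device: {pick_device()}")
```

With this shape, leaving `USE_GPU_SCORING = True` on a machine without CUDA degrades to CPU scoring instead of failing.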

✂ Cropping and 🎨 Color Correction (Optional)

Disabled by default for maximum fidelity:

ENABLE_AUTO_CROP = False
ENABLE_COLOR_CORRECTION = False

Enable these options if your recording contains:

  • Excess dead space
  • Projector color casts
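This README does not say which color-correction method is used. As an illustration of how a color cast can be neutralized, the sketch below applies the classic gray-world assumption (scale each channel so the channel means match). It is a swapped-in technique, not necessarily what `ENABLE_COLOR_CORRECTION` does.

```python
def gray_world_balance(pixels):
    """Scale each channel so its mean matches the overall mean (gray-world).

    pixels is a list of (r, g, b) tuples with values in 0..255.
    """
    n = len(pixels)
    channel_means = [sum(p[c] for p in pixels) / n for c in range(3)]
    target = sum(channel_means) / 3
    # Guard against an all-zero channel to avoid dividing by zero.
    gains = [target / m if m > 0 else 1.0 for m in channel_means]
    return [
        tuple(min(255, round(p[c] * gains[c])) for c in range(3))
        for p in pixels
    ]


# A purple-tinted "slide": red and blue are boosted relative to green.
tinted = [(200, 150, 200), (100, 75, 100)]
balanced = gray_world_balance(tinted)
print(balanced)
```

After balancing, each pixel has equal red, green, and blue values, so the purple tint is gone. Gray-world can overcorrect slides that are legitimately dominated by one color, which is one reason to keep such a step optional.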

▶ Running Smart Extract

python smart_extract.py

You will see:

  • Device detection (CPU or CUDA)
  • Global scan progress bar
  • Final selection summary

Example output:

Done! Saved 100 global best frames to photos/best

📁 Output Naming

Each saved frame includes metadata in the filename:

best_042_t318.4s_score912.png

  • 042 is the rank
  • 318.4s is the timestamp in seconds
  • 912 is the final quality score

Perfect for sorting, filtering, and automation.
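For automation, the filename can be built and parsed with a small helper. The pattern below is inferred from the single example name above (three-digit rank, one decimal place, integer score); the actual formatting in `smart_extract.py` may differ, and both function names are hypothetical.

```python
import re

# Pattern inferred from the single example filename in this README.
FRAME_NAME = re.compile(
    r"^best_(?P<rank>\d+)_t(?P<time>\d+(?:\.\d+)?)s_score(?P<score>\d+)\.(?P<ext>\w+)$"
)


def build_frame_name(rank, timestamp, score, ext="png"):
    """Format a frame filename in the style of the README example."""
    return f"best_{rank:03d}_t{timestamp:.1f}s_score{score:d}.{ext}"


def parse_frame_name(name):
    """Split a frame filename back into rank, time, score, and extension."""
    match = FRAME_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized frame name: {name!r}")
    return {
        "rank": int(match["rank"]),
        "time": float(match["time"]),
        "score": int(match["score"]),
        "ext": match["ext"],
    }


info = parse_frame_name("best_042_t318.4s_score912.png")
print(info)
```

Parsing into numbers lets you sort by timestamp or filter by score rather than relying on lexical filename order.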


🧪 Advanced Notes

  • If frames look too similar, increase GLOBAL_MIN_TIME_BETWEEN
  • If key moments are missing, increase GLOBAL_SAMPLE_FPS
  • If VRAM is limited, reduce the RealESRGAN tile size in smart_extract.py
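Before tuning, it can help to estimate how many frames a given configuration can produce at all. The sketch below assumes sampling starts at `GLOBAL_START_TIME_SECONDS` and proceeds on a fixed grid of `1 / GLOBAL_SAMPLE_FPS` seconds; the helper `scan_budget` is hypothetical and the 15-minute duration is an invented example.

```python
import math


def scan_budget(duration_s, start_s, sample_fps, min_gap_s):
    """Estimate candidate count and the largest selection the gap allows."""
    usable = max(0.0, duration_s - start_s)
    steps = math.floor(usable * sample_fps)  # sampling intervals scanned
    candidates = steps + 1                   # frames actually scored
    # Selected frames sit on the sampling grid, so round the gap up to it.
    gap_steps = max(1, math.ceil(min_gap_s * sample_fps))
    max_selected = steps // gap_steps + 1
    return candidates, max_selected


# A hypothetical 15-minute recording with the example settings from this README.
candidates, max_selected = scan_budget(900.0, 27.0, 2.0, 8.0)
print(candidates, max_selected)  # 1747 110
```

The second number is an upper bound. With an 8-second gap, 100 frames need at least 99 × 8 = 792 seconds of usable video, so a shorter recording will return fewer than `GLOBAL_TARGET_FRAMES` frames no matter how the scores fall.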

🚀 Why This Project Matters

Smart Extract is a production-ready visual curation pipeline.

It demonstrates:

  • Computer vision fundamentals
  • GPU acceleration with PyTorch
  • Practical AI upscaling
  • Performance-aware video processing
  • Clean and configurable system design

Ideal for portfolio showcase, research tooling, or real-world media workflows.


📜 License

MIT License


⭐ If this project helped you, consider starring the repository.
