Video Snapshot & Metadata Scanner captures snapshots from videos, overlays metadata, and generates a composite image. Early-stage project; may have critical bugs.

KJH-x/scans_creator

Scan Creator

This project is a video snapshot and metadata scanner. It captures snapshots from a video at specified intervals, arranges them in a grid, and overlays essential metadata such as codec, resolution, bitrate, and language information. It also supports custom fonts and a logo overlay. Please note: this project is in its early stages and may contain critical bugs.

Features

  • Snapshot Extraction: Captures and resizes snapshots at set intervals across the video.
  • Metadata Display: Displays video, audio, and subtitle details within the final scan image.
  • Customizable Layout: Configurable grid layout for arranging snapshots.
  • Image Resizing & Logo Overlay: Optional resizing for the final image and support for logo integration.

Requirements

  • Python 3.12+

The project has been fully developed and tested under Python 3.12. Lower versions are not guaranteed to work.
You may attempt to run it under Python 3.8 by manually removing incompatible type annotations, but this is not officially supported.

  • FFmpeg (for extracting frames from the video; run `ffmpeg -version` in a terminal to check that it is installed)
  • Pillow (PIL library for image manipulation)
  • Pydantic v2 (used for configuration validation)
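
The FFmpeg check mentioned above can also be done from Python. The following is a minimal sketch and not part of the project: the helper name `ffmpeg_version` is hypothetical, and it assumes the executable is named `ffmpeg` and is found through `PATH`.

```python
import shutil
import subprocess
from typing import Optional


def ffmpeg_version(executable: str = "ffmpeg") -> Optional[str]:
    """Return the first line of `ffmpeg -version`, or None if FFmpeg is unavailable."""
    # Hypothetical helper: look the executable up on PATH first.
    path = shutil.which(executable)
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, "-version"],
            capture_output=True,
            text=True,
            check=True,
            timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        # Covers a failing exit code, a timeout, and an unlaunchable binary.
        return None
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    print(ffmpeg_version() or "FFmpeg not found on PATH")
```

If the function returns None, install FFmpeg or add it to `PATH` before running the script.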

Setup

  1. Clone the repository:

    git clone https://github.com/KJH-x/scans_creator.git
  2. Install dependencies:

    pip install pillow
    pip install 'pydantic ~=2.0'
  3. Download FFmpeg: Ensure FFmpeg is installed and accessible from the command line. You can download it from the official FFmpeg website.

  4. Fonts and Logo: Place your chosen fonts in the fonts/ directory.

Usage

You can run the script either interactively or by providing CLI arguments.

python cli.py --file "/path/to/video.mp4" --layout en.json --stream 0

# the `.json` suffix can be omitted
python cli.py -f "/path/to/video.mp4" -l zh-CN -s 0
  • --file / -f (optional): Path to the video file. If omitted, the script will ask interactively.
  • --layout / -l (optional): Layout preset to use (zh-CN by default). Determines font selection, grid size, and text arrangement.
  • --stream / -s (optional): Index of the video stream to use if multiple streams exist. If omitted and multiple streams exist, the script will prompt for one.
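
For reference, the options above map naturally onto `argparse`. This is only a sketch of how such a parser could look, not the project's actual cli.py; `build_parser` and `layout_filename` are hypothetical names, and the handling of the optional `.json` suffix is an assumption based on the comment in the usage example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the three optional arguments described above."""
    parser = argparse.ArgumentParser(description="Create a scan image from a video.")
    parser.add_argument("-f", "--file", default=None, help="Path to the video file.")
    parser.add_argument("-l", "--layout", default="zh-CN", help="Layout preset name.")
    parser.add_argument("-s", "--stream", type=int, default=None, help="Video stream index.")
    return parser


def layout_filename(name: str) -> str:
    """Accept both 'en' and 'en.json' and return the name with the suffix."""
    return name if name.endswith(".json") else f"{name}.json"


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["-f", "/path/to/video.mp4", "-l", "en", "-s", "0"])
print(args.file, layout_filename(args.layout), args.stream)
```

Leaving `--file` and `--stream` at None is what would let a script fall back to interactive prompts.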

What the script does

  1. Verifies required files (video, fonts, and logo) based on the configuration.
  2. Extracts video information (duration, resolution, file size, bitrate).
  3. Calculates snapshot times based on grid size and optional leading/ending avoidance.
  4. Captures snapshots at the calculated times.
  5. Generates a composite scan image containing snapshots and metadata.
  6. Optionally rescales the final image according to the configuration.
  7. Saves the scan image as PNG in the scans/ directory.
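
Step 3 can be illustrated with a small calculation. The sketch below is one plausible way to space snapshots evenly, not necessarily the formula the project uses: `snapshot_times` is a hypothetical function, and it assumes that avoiding the leading or ending part means skipping the first or last evenly spaced position.

```python
from typing import List, Tuple


def snapshot_times(
    duration: float,
    grid_size: Tuple[int, int],
    avoid_leading: bool = True,
    avoid_ending: bool = True,
) -> List[float]:
    """Return evenly spaced snapshot times (in seconds) for a grid of snapshots."""
    count = grid_size[0] * grid_size[1]
    if duration <= 0 or count < 1:
        raise ValueError("duration must be positive and the grid must not be empty")
    lead = 1 if avoid_leading else 0
    tail = 1 if avoid_ending else 0
    # Number of equal segments between the first and last candidate position.
    segments = count - 1 + lead + tail
    if segments == 0:
        return [0.0]  # a single snapshot with no avoidance: take the first frame
    step = duration / segments
    return [step * (index + lead) for index in range(count)]


# A 100-second video with a 2x2 grid, skipping both the very start and the very end.
print(snapshot_times(100.0, (2, 2)))  # [20.0, 40.0, 60.0, 80.0]
```

With both avoidance options disabled, the first and last snapshots fall exactly on the start and end of the video, where frames are often black or not decodable, which is presumably why the options exist.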

Example Output

The output will be a composite image arranged in a grid layout, displaying snapshots and video metadata with a custom logo.

Configuration Files

Default Configuration

Backups of the default configuration files are saved at schemas/defaults.json.bak. The SHA256 checksum of this file is hard-coded in the program, which refuses to run if the file's checksum does not match.
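
A checksum comparison of this kind can be sketched with the standard library. This is an illustration only: `sha256_of_file` and `verify_checksum` are hypothetical names, and the project's own check in its config_manager module may be structured differently.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hexadecimal SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with an expected value, ignoring letter case."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    backup = Path("schemas/defaults.json.bak")
    if backup.exists():
        print(sha256_of_file(backup))
```

Any edit to the backup file, even a whitespace change, produces a different digest, so the hard-coded value has to be updated whenever the defaults change.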

global.json

This file contains the following configuration items:

  • logo_file: Path to the logo image file to overlay on the scan.

  • fonts: Paths to the font files. Entries in font_list refer to these fonts by their sequence number.

  • resize_scale: Scaling factor for resizing the final scan image (e.g., 2 means resize to half size).

  • avoid_leading: If true, avoids taking snapshots from the very beginning of the video.

  • avoid_ending: If true, avoids taking snapshots from the very end of the video.

  • output_filename_format: Template for the output file name. Requirements:

    1. Must end with .png.
    2. Can include an optional {file_name} placeholder to insert the video file name.
    3. Can include an optional {timestamp:FORMAT} placeholder, where FORMAT is a valid datetime.strftime format string (e.g., %H%M%S) to include a timestamp in the file name.
  • max_text_multiline: Maximum number of line breaks allowed for a piece of text. Text that still exceeds the limit is truncated with an ellipsis.
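
The rules for output_filename_format can be illustrated with a short expansion routine. This is a sketch under assumptions, not the project's implementation: `render_output_name` is a hypothetical function, and it assumes that {file_name} expands to the video file name without its extension.

```python
import re
from datetime import datetime
from pathlib import Path

# Matches {timestamp:FORMAT}, where FORMAT is passed to datetime.strftime.
_TIMESTAMP = re.compile(r"\{timestamp:([^{}]+)\}")


def render_output_name(template: str, video_path: str, now: datetime) -> str:
    """Expand the {file_name} and {timestamp:FORMAT} placeholders in a template."""
    if not template.endswith(".png"):
        raise ValueError("output_filename_format must end with .png")
    # Expand timestamps first so the video name is never scanned for placeholders.
    name = _TIMESTAMP.sub(lambda match: now.strftime(match.group(1)), template)
    # Assumption: the placeholder stands for the name without the extension.
    return name.replace("{file_name}", Path(video_path).stem)


print(
    render_output_name(
        "{file_name}_{timestamp:%H%M%S}.png",
        "/path/to/video.mp4",
        datetime(2024, 1, 2, 3, 4, 5),
    )
)  # video_030405.png
```

Both placeholders are optional, so a fixed template such as scan.png passes through unchanged.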

layout/*.json

Each file in layout/ sets the layout style of the metadata (including font, font size, font color, layout, shadows, and the information to be displayed).

  • canvas_width: The width of the whole canvas.

  • font_list: The font used for each paragraph of text.

  • time_font: The font used for the snapshot timestamps.

  • shade_offset: The amount of shadow offset for the text.

  • text_color: The color of the text.

  • shade_color: The color of the shade.

  • text_list: The metadata to be displayed. (TODO: Details).

  • grid_size: A tuple defining the grid size for snapshot arrangement (e.g., [4, 4] for a 4x4 grid).

You can update these values to suit your project needs. For example, if you'd prefer a smaller grid (or bigger snapshots), change "grid_size": [4, 4] to "grid_size": [3, 3] for a 3x3 grid (9 snapshots).

  • spacing_*: See schema.json and the chart below.

  • timestamp_offset_y: Vertical offset for snapshot timestamp display in pixels.
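
To see why a smaller grid_size yields bigger snapshots, consider how a fixed canvas_width is shared between columns. The arithmetic below is purely illustrative: `snapshot_width` and its `spacing` parameter are hypothetical and do not correspond to specific spacing_* keys, and the project may reserve part of the canvas for other elements.

```python
def snapshot_width(canvas_width: int, columns: int, spacing: int = 0) -> int:
    """Width in pixels of one snapshot when `columns` snapshots share a row."""
    if columns < 1:
        raise ValueError("columns must be at least 1")
    # Assumption: one gap between neighbours plus one at each edge of the canvas.
    usable = canvas_width - spacing * (columns + 1)
    if usable <= 0:
        raise ValueError("spacing leaves no room for snapshots")
    return usable // columns


# On a 1600-pixel canvas with 10-pixel gaps, dropping from 4 to 3 columns
# widens each snapshot from 387 to 520 pixels.
print(snapshot_width(1600, 4, spacing=10), snapshot_width(1600, 3, spacing=10))
```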

flowchart LR
  subgraph root["Root"]
    direction LR

    subgraph mainContent["Main Content"]
      direction TB

      subgraph metadata["Metadata"]
        direction LR
        subgraph metaCol["Metadata Column"]
          direction TB
          subgraph metaCell["Metadata Cell"]
            direction LR
            label <--"spacing_label_to_value"--> value
          end
          metaCell 
          <-- "spacing_in_one<br>_metadata_column" --> 
          metaCell2["<span style='visibility:hidden'>0000</span>Metadata Cell 2<span style='visibility:hidden'>0000</span>"]
          <-- "spacing_in_one<br>_metadata_column" --> 
          metaCell3["<span style='visibility:hidden'>0000</span>Metadata Cell 3<span style='visibility:hidden'>0000</span>"]
          
          style metaCell2 white-space:nowrap
          style metaCell3 white-space:nowrap
        end

        metaCol  
        <-- "spacing_metadata_columns" --> 
        metaCol2["<br><br><br><br><br><br><br><br><br><br><br><br><br><br>Metadata Column 2<br><br><br><br><br><br><br><br><br><br><br><br><br><br><br>"]  
        <-- "spacing_metadata_columns" --> 
        metaCol3["<br><br><br><br><br><br><br><br><br><br><br><br><br><br>Metadata Column 3<br><br><br><br><br><br><br><br><br><br><br><br><br><br><br>"]  
        
      end
    
      Title["Title Title *TITLE* Title Title"] 
      <--"spacing_title_to_content"--> metadata

      style Title white-space:nowrap
    end

    mainContent ~~~ Logo
  end

Limitations & Known Issues

  • Early Development: This is an early-stage project, and bugs may result in crashes or incomplete image generation.
  • Error Handling: Error handling is limited, especially for issues related to FFmpeg processing or missing metadata fields.

Development

Some modules include embedded scripts for development purposes, with the following usage and functionality:

  1. python -m src.models.info_layout / python -m src.models.global_config

    Generates a schema.json file based on the Pydantic models. Combined with VSCode settings, this enables basic validation and field hints when editing configuration files.

  2. python -m src.utils.common

    Computes the checksum of the backup file defaults.json.bak. This checksum is hardcoded in the config_manager module.

License

MIT License


For any issues, please open an issue on the repository or contribute with a pull request.
