Video2frame is an easy-to-use tool to extract frames from videos, for deep learning and computer vision.

Why this tool

Forwchen's vid2frame tool is great, but I was always confused by its parameters. At the same time, I wanted to add a few features that I need.

So I rewrote the code, and it turned into a new wheel. It is hard to make a PR, since I changed the code style throughout.

How to use

  1. Set up the environment

    We recommend using conda to set up the environment. Just run

    conda env create -f install/conda-environment.yml

    You can also do it manually. This project relies on the following packages:

    • Python
    • FFmpeg
    • Python packages (can be installed using pip install -r install/pip-requirements.txt)
      • h5py
      • lmdb
      • numpy
      • easydict
      • tqdm
  2. Make the annotation JSON file

    The JSON file should look like this (the class names below are placeholders):

        {
            "meta": {
                "class_num": 2,
                "class_name": ["label1", "label2"]
            },
            "annotation": {
                "label1_abcdefg": {
                    "path": "path/to/the/video/file_1.mp4",
                    "class": 1
                },
                "label2_asdfghj": {
                    "path": "path/to/the/video/file_2.mp4",
                    "class": 2
                }
            }
        }
  3. Extract frames


    • Using the default options:

      python video2frame.py dataset.json
    • Specify the output file name:

      python video2frame.py dataset.json --db_name my_dataset
    • Use LMDB rather than HDF5:

      python video2frame.py dataset.json --db_type LMDB

      python video2frame.py dataset.json --db_name my_dataset.lmdb
    • Randomly clip 5 seconds from each video:

      python video2frame.py dataset.json --duration 5.0
    • Get 3 video clips with a length of 5 seconds each:

      python video2frame.py dataset.json --clips 3 --duration 5.0
    • Resize the frames to 320x240:

      python video2frame.py dataset.json --resize_mode 1 --resize 320x240
    • Keep the aspect ratio, and resize the shorter side to 320:

      python video2frame.py dataset.json --resize_mode 2 --resize S320
    • Keep the aspect ratio, and resize the longer side to 240:

      python video2frame.py dataset.json --resize_mode 2 --resize L240
    • Extract 5 frames per second:

      python video2frame.py dataset.json --fps 5
    • Uniformly sample 16 frames per video:

      python video2frame.py dataset.json --sample_mode 1 --sample 16
    • Randomly sample 16 continuous frames per video:

      python video2frame.py dataset.json --sample_mode 2 --sample 16
    • Use 16 threads to speed things up:

      python video2frame.py dataset.json --threads 16
    • Resize the frames to 320x240, extract one frame every two seconds, uniformly sample 32 frames per video, and use 20 threads:

      python video2frame.py dataset.json \
          --resize_mode 1 \
          --resize 320x240 \
          --fps 0.5 \
          --sample_mode 1 \
          --sample 32 \
          --threads 20

    All parameters

    usage: video2frame.py [-h] [--db_name DB_NAME]
                          [--db_type {LMDB,HDF5,FILE,PKL}] [--tmp_dir TMP_DIR]
                          [--duration DURATION] [--clips CLIPS] [--resize_mode {0,1,2}]
                          [--resize RESIZE] [--fps FPS] [--sample_mode {0,1,2,3,4}]
                          [--sample SAMPLE] [--threads THREADS] [--keep]
    positional arguments:
      annotation_file       The annotation file, in json format
    optional arguments:
      -h, --help            show this help message and exit
      --db_name DB_NAME     The database to store extracted frames
      --db_type {LMDB,HDF5,FILE,PKL}
                            Type of the database
      --tmp_dir TMP_DIR     Tmp dir
      --duration DURATION   Length of the clip
      --clips CLIPS         Num of clips per video
      --resize_mode {0,1,2}
                            Resize mode
                              0: Do not resize
                              1: 800x600: Resize to W*H
                              2: L600 or S600: keep the aspect ratio and scale the longer/shorter side to the given size
      --resize RESIZE       Parameter of resize mode
      --fps FPS             Sample the video at X fps
      --sample_mode {0,1,2,3,4}
                            Frame sampling options
                              0: Keep all frames
                              1: Uniformly sample n frames
                              2: Randomly sample n continuous frames
                              3: Randomly sample n frames
                              4: Sample 1 frame every n frames
      --sample SAMPLE       Parameter of sample mode
      --threads THREADS     Number of threads
      --keep                Do not delete tmp files at last

Tools

    A JSON generator for videos arranged in a predefined directory structure.
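    The generator script itself is not reproduced on this page. As a rough sketch, assuming one sub-folder per class, named after the class (the layout the real generator expects may differ):

```python
import json
from pathlib import Path


def scan_folders(root):
    """Build a video2frame-style annotation dict from a directory tree
    laid out as <root>/<class_name>/<video_file>. Class ids are assigned
    from the sorted class folder names, starting at 1."""
    root = Path(root)
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    annotation = {}
    for class_id, name in enumerate(class_names, start=1):
        for video in sorted((root / name).iterdir()):
            # The "<class>_<stem>" key format mirrors the example annotation above
            annotation[f"{name}_{video.stem}"] = {
                "path": str(video),
                "class": class_id,
            }
    return {
        "meta": {"class_num": len(class_names), "class_name": class_names},
        "annotation": annotation,
    }
```

    Dump the returned dict with json.dump to produce the dataset.json consumed by the extractor.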


    A JSON generator that converts the Something-Something dataset.


    A JSON generator that converts the UCF101 dataset.
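    The converter itself is not shown here. A hedged sketch, assuming the standard ucfTrainTestlist split files that ship with UCF101 (classInd.txt lines like "1 ApplyEyeMakeup", trainlist01.txt lines like "ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c01.avi 1"):

```python
from pathlib import Path


def convert_ucf101(class_ind_file, split_file, video_root):
    """Turn UCF101 split files into a video2frame-style annotation dict."""
    # classInd.txt: "<id> <ClassName>" per line, ids starting at 1
    with open(class_ind_file) as f:
        class_names = [line.split()[1] for line in f if line.strip()]
    annotation = {}
    with open(split_file) as f:
        for line in f:
            if not line.strip():
                continue
            rel_path = line.split()[0]            # "ClassName/v_..._g01_c01.avi"
            class_name = rel_path.split("/")[0]
            annotation[Path(rel_path).stem] = {
                "path": str(Path(video_root) / rel_path),
                "class": class_names.index(class_name) + 1,
            }
    return {
        "meta": {"class_num": len(class_names), "class_name": class_names},
        "annotation": annotation,
    }
```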



    Fetch frames with the skvideo package on the fly during training and evaluation. This is okay when your batch size is small and your CPUs are powerful enough.
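    A minimal sketch of this approach. The annotation format follows the JSON above; skvideo.io.vread decodes a whole video into a (T, H, W, C) array. The uniform-sampling helper is an assumption about what you want, so swap in your own frame selection:

```python
import json


def uniform_indices(total, n):
    """Pick n frame indices spread evenly over [0, total - 1]."""
    if n == 1:
        return [0]
    return [round(i * (total - 1) / (n - 1)) for i in range(n)]


class OnTheFlyVideoDataset:
    """Decode frames at load time instead of pre-extracting them.

    Only __len__/__getitem__ are needed for a class to act as a
    map-style dataset for torch.utils.data.DataLoader.
    """

    def __init__(self, annotation_file, num_frames=16):
        with open(annotation_file) as f:
            self.items = list(json.load(f)["annotation"].values())
        self.num_frames = num_frames

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        import skvideo.io  # imported lazily so the class loads without it
        item = self.items[idx]
        video = skvideo.io.vread(item["path"])    # (T, H, W, C) uint8
        frames = video[uniform_indices(len(video), self.num_frames)]
        return frames, item["class"]
```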


    A PyTorch Dataset example to read an LMDB dataset.


    A PyTorch Dataset example to read an HDF5 dataset.

    ALWAYS ensure num_workers=0 or num_workers=1 for your DataLoader, since an open HDF5 file cannot be safely shared across multiple worker processes.
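    A sketch of such a reader. The layout is an assumption (one uint8 dataset per video with the class id in an attribute); opening the file lazily inside __getitem__ is the usual workaround for the worker-process issue above:

```python
import h5py
import numpy as np


class HDF5VideoDataset:
    """Map-style dataset over an HDF5 file.

    Assumed layout (adapt to the real one): one dataset per video key,
    holding a (T, H, W, C) uint8 array, with the class id stored in a
    'class' attribute. The file handle is opened lazily in __getitem__,
    since an open h5py.File does not survive being forked into
    DataLoader worker processes.
    """

    def __init__(self, h5_path):
        self.h5_path = h5_path
        with h5py.File(h5_path, "r") as f:
            self.keys = sorted(f.keys())
        self._file = None

    def __len__(self):
        return len(self.keys)

    def __getitem__(self, idx):
        if self._file is None:
            self._file = h5py.File(self.h5_path, "r")
        ds = self._file[self.keys[idx]]
        return np.asarray(ds), int(ds.attrs["class"])
```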


    A PyTorch Dataset example to read a pickle dataset.
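    A sketch under an assumed layout: the file holds a single pickled dict mapping video keys to {"frames": ..., "class": ...} records, so the whole dataset lands in memory (fine for small data only):

```python
import pickle


class PickleVideoDataset:
    """Map-style dataset over a single pickle file.

    Assumed layout (adapt to the real one): one dict mapping video keys
    to {"frames": ..., "class": ...} records. Everything is loaded into
    memory up front.
    """

    def __init__(self, pkl_path):
        with open(pkl_path, "rb") as f:
            data = pickle.load(f)
        # Sort by key so iteration order is deterministic
        self.items = [data[key] for key in sorted(data)]

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        item = self.items[idx]
        return item["frames"], item["class"]
```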


    A PyTorch Dataset example to read a dataset of image files.
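    A sketch under an assumed layout: one sub-folder per video, <root>/<video_key>/<frame_number>.jpg, with class ids looked up by video key in the annotation JSON. Pillow is used for decoding:

```python
import json
from pathlib import Path

import numpy as np
from PIL import Image


class ImageFolderVideoDataset:
    """Map-style dataset over frames stored as individual image files.

    Assumed layout (adapt to the real one): one sub-folder per video,
    <root>/<video_key>/<frame_number>.jpg; class ids come from the
    annotation JSON, keyed by the folder name.
    """

    def __init__(self, root, annotation_file):
        with open(annotation_file) as f:
            ann = json.load(f)["annotation"]
        self.labels = {key: record["class"] for key, record in ann.items()}
        self.videos = sorted(p for p in Path(root).iterdir() if p.is_dir())

    def __len__(self):
        return len(self.videos)

    def __getitem__(self, idx):
        folder = self.videos[idx]
        # Stack all frames of the video into one (T, H, W, C) array
        frames = np.stack([
            np.asarray(Image.open(p)) for p in sorted(folder.glob("*.jpg"))
        ])
        return frames, self.labels[folder.name]
```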