# Extract frames from video

This notebook extracts `.jpg` frames from any local or remote video file at a specified frame rate (which must be less than or equal to the video’s actual frame rate). The resulting `.jpg` sequence is saved, with numbered frames, at the video’s native resolution in a new folder ("`$filename`-frames") created in the notebook’s working directory.

First, import the relevant libraries: OpenCV (`cv2`) for video/image processing, `numpy` for numerical processing, `os` for file and directory management, and `math` for basic calculations.

In [6]:
# import the relevant libraries
import cv2
import numpy as np
import os
import math

You can then specify the frame rate (1 fps by default) at which you’d like to extract frames.
<br> It must be less than or equal to the source video’s frame rate.

In [7]:
SAVING_FRAMES_PER_SECOND = 1.0

The next function calculates the timecodes at which to extract frames and stores them in a new list, `s`. It also prints the video’s total frame count, its duration, the extraction frame rate, and an estimate of the number of extracted frames to expect. Note that it is called from the main function (next cell), which itself is only called in the notebook’s last cell.

In [8]:
def get_saving_frames_durations(cap, saving_fps):
    """Return the list of timecodes (in seconds) at which to save frames."""
    # get the clip duration by dividing number of frames by the number of frames per second
    frame_count = cap.get(cv2.CAP_PROP_FRAME_COUNT)
    clip_duration = frame_count / cap.get(cv2.CAP_PROP_FPS)
    print("current frames in video:", math.floor(frame_count))
    print("duration of the video in seconds:", clip_duration)
    print("extracting frames:", saving_fps, "FPS")
    # rough estimate of how many frames will be saved
    frame_count_prediction = saving_fps * clip_duration
    print("total frames expected:", math.floor(frame_count_prediction-1))
    # use np.arange() to make floating-point steps
    s = list(np.arange(0, clip_duration, 1 / saving_fps))
    return s
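
To make the timecode list concrete, here is a minimal sketch of the same `np.arange()` call outside of OpenCV. The duration (15.015 s) and the 1 fps rate are assumptions taken from the sample run at the end of this notebook, and the variable names are only illustrative.

```python
import numpy as np

# assumed values, taken from the sample run shown at the end of the notebook
clip_duration = 15.015   # seconds
saving_fps = 1.0         # frames to save per second

# one timecode every 1 / saving_fps seconds, from 0 up to (not including) clip_duration
timecodes = list(np.arange(0, clip_duration, 1 / saving_fps))

print(len(timecodes))               # number of candidate timecodes
print(timecodes[0], timecodes[-1])  # first and last timecode, in seconds
```

With these values the list runs from 0.0 to 15.0 in steps of one second. A timecode is only used if a frame at or after it actually exists, so the last entry may not produce a saved frame.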

The main function below will: <br>
+ Retrieve the video file’s name from the URL or path<br>
+ Create a new directory named after the file name + `-frames`<br>
+ Check that the `SAVING_FRAMES_PER_SECOND` specified earlier is not higher than the video’s frame rate. If it is, use the video’s frame rate as a fallback<br>
+ Read the video frame by frame and, at each timecode calculated by `get_saving_frames_durations()` (the `s` list), save the frame as a new `.jpg` file named after the video’s file name + an index<br>

Note that `main` is only defined at this point; it is called in the notebook’s last cell.
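
The first two steps can be illustrated without opening any video. The sketch below applies `os.path.splitext()` and a split on `/` to the sample URL used later in this notebook. The helper name `derive_names` is hypothetical and only serves the illustration, and it assumes `/` as the path separator.

```python
import os

def derive_names(video_file):
    """Return (folder name, file-name prefix) for a video path or URL.

    Hypothetical helper for illustration; it assumes "/" as the path separator.
    """
    # drop the extension, e.g. ".mp4"
    filename, _ = os.path.splitext(video_file)
    # keep only the last component of the path or URL
    videoname = filename.split("/")[-1]
    return videoname + "-frames", videoname

folder, prefix = derive_names(
    "http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/ForBiggerBlazes.mp4"
)
print(folder)                                   # name of the output folder
print(os.path.join(folder, prefix + "-0.jpg"))  # path of the first saved frame
```

For this URL the folder is named `ForBiggerBlazes-frames`, and the frames inside it are numbered from 0.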

In [9]:
def main(video_file):
    filename, _ = os.path.splitext(video_file)
    newdir = filename + "-frames"
    # split on "/" so that only the last path component is used as the folder name
    folder = newdir.split("/")
    # if there isn’t already one, create a folder named after filename + "-frames"
    if not os.path.isdir(folder[-1]):
        os.mkdir(folder[-1])
    # read the video file
    cap = cv2.VideoCapture(video_file)
    # get the FPS of the video
    fps = cap.get(cv2.CAP_PROP_FPS)
    print("source video fps:", fps, "FPS")
    # if the SAVING_FRAMES_PER_SECOND is above video FPS, then set it to FPS (as maximum)
    saving_frames_per_second = min(fps, SAVING_FRAMES_PER_SECOND)
    # get the list of duration spots to save
    saving_frames_durations = get_saving_frames_durations(cap, saving_frames_per_second)
    # start the loop
    count = 0
    index = 0
    while True:
        is_read, frame = cap.read()
        if not is_read:
            # break out of the loop if there are no frames to read
            break
        # get the duration by dividing the frame count by the FPS
        frame_duration = count / fps
        try:
            # get the earliest duration to save
            closest_duration = saving_frames_durations[0]
        except IndexError:
            # the list is empty, all duration frames were saved
            break
        	
        if frame_duration >= closest_duration:
            # if closest duration is less than or equals the frame duration, 
            # then save the frame
            # name the file after the video (last path component) plus the frame index
            videoname = filename.split("/")[-1]
            newfile = os.path.join(folder[-1], videoname + "-" + str(index) + ".jpg")
            cv2.imwrite(newfile, frame)
            index += 1

            # drop the duration spot from the list, since this duration spot is already saved
            try:
                saving_frames_durations.pop(0)
            except IndexError:
                pass
        # increment the frame count
        count += 1
    # release the capture once the loop has finished
    cap.release()
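
The selection logic in the loop can be checked without decoding a real video. The sketch below replays it on frame indices only, assuming the frame rate (23.976 fps) and frame count (360) reported in the sample run at the end of this notebook. `select_frame_indices` is a hypothetical helper, not part of the notebook.

```python
import numpy as np

def select_frame_indices(frame_count, fps, saving_fps):
    """Return the indices of the frames the extraction loop would save."""
    # same timecodes as get_saving_frames_durations()
    timecodes = list(np.arange(0, frame_count / fps, 1 / saving_fps))
    saved = []
    for count in range(frame_count):
        if not timecodes:
            # every timecode has been served
            break
        # timestamp of the current frame, in seconds
        frame_duration = count / fps
        if frame_duration >= timecodes[0]:
            # first frame at or after the pending timecode: keep it
            saved.append(count)
            timecodes.pop(0)
    return saved

# assumed values, taken from the sample run
indices = select_frame_indices(frame_count=360, fps=23.976023976023978, saving_fps=1.0)
print(indices)
```

With these numbers the sketch picks one frame per second (indices 0, 24, 48, and so on). The last timecode, 15.0 s, lies past the final frame and is never matched.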

Finally, the main function is called below, prompting the user for the URL or local path of the video file. It prints a brief report containing the source video’s frame rate, its total number of frames, its duration, the frame rate at which frames are extracted, and the expected number of saved `.jpg` frames.

In [10]:
if __name__ == "__main__":
    print("enter path to video:")
    # input() already returns a string
    video_file = input()
    # e.g. "http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/ForBiggerBlazes.mp4"
    main(video_file)

enter path to video:
http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/ForBiggerBlazes.mp4
source video fps: 23.976023976023978 FPS
current frames in video: 360
duration of the video in seconds: 15.014999999999999
extracting frames: 1.0 FPS
total frames expected: 14
