## Process directory videos

This notebook walks through setting up your environment and running the object detection code that processes the video files in a chosen directory.

#### Purpose

For a given directory, this notebook determines which videos contain a given object (e.g. 'cat') and saves those videos to a new location. Every video is removed from the source directory: matches are moved and the rest are deleted. The idea is that if a camera with a sensor regularly stores files in a given location, we can run this notebook to keep any files that contain a 'cat' (or another object of interest), store them somewhere safe (e.g. Google Drive) and purge all other videos to keep storage costs down.

#### Setup

I have created a new environment using anaconda (python v3.8) and executed the following commands:

```shell
pip install tensorflow
pip install opencv-python
pip install pandas
```

The base python installation plus the libraries above should cover everything needed to run this notebook. Make sure the Yolo-v5 model file `yolo-v5.tflite` is in the same directory as this notebook.

#### Load project libraries

In [1]:
import numpy as np
import pandas as pd
import tensorflow as tf
import cv2
import glob
from pathlib import Path
import time
import os

#### Define functions

Here we're going to define a number of useful functions to execute the video review process. 

##### Load model

The purpose of this function is to load our Yolo-v5 model and extract key input and output details that we'll use to get our inputs/outputs into the correct format. 

In [2]:
# load the model
def load_model(model_path):

    # Load the TensorFlow Lite model
    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()

    # Get input and output details
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()
    
    return input_details, output_details, interpreter
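
The input details returned above also describe the frame size the model expects, which avoids hard-coding it later. A minimal sketch, assuming the model takes a single input tensor laid out as `[batch, height, width, channels]`; `expected_frame_size` is a hypothetical helper and not part of the original notebook:

```python
def expected_frame_size(input_details):
    """Return (width, height), the order cv2.resize expects, from TFLite input details.

    Assumes a single input tensor laid out as [batch, height, width, channels].
    """
    shape = input_details[0]["shape"]
    height, width = int(shape[1]), int(shape[2])
    return width, height


# Stand-in for interpreter.get_input_details() on a 320x320 model
example_details = [{"index": 0, "shape": [1, 320, 320, 3]}]
print(expected_frame_size(example_details))
```

The result could be passed straight to `cv2.resize(frame, ...)` in place of a fixed `(320, 320)`.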

##### Get videos

This function provides a list of paths to all files in a given directory with the extension .mp4 (i.e. video files!).

In [3]:
# get videos from a directory
def get_videos(directory):
    video_files = glob.glob(os.path.join(directory, '*.mp4'))
    return video_files
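
If the camera writes files with mixed-case extensions (e.g. `.MP4`), a `*.mp4` glob pattern can miss them on case-sensitive platforms. A minimal sketch of an alternative using `pathlib`; `list_videos` and its `extensions` parameter are hypothetical names, not part of the notebook:

```python
from pathlib import Path


def list_videos(directory, extensions=(".mp4",)):
    """Return sorted paths of files in `directory` whose suffix matches, ignoring case."""
    return sorted(
        str(p)
        for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )
```

Sorting the result also makes the processing order repeatable between runs.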

##### Set output names

Here we want to timestamp our outputs each time we run the process. Given an output directory and an input file, this function builds a timestamped output path that we can use for the processed video.

In [4]:
# set some names
def set_output_name(video_to_process,output_directory):
    name_prefix = Path(video_to_process).stem
    timestr = time.strftime("%Y%m%d_%H%M%S")
    output_video_file = output_directory + '//' + name_prefix + "_out_" + str(timestr) + ".mp4"
    
    return output_video_file
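
Joining paths with `'//'` works, but `os.path.join` picks the right separator for the platform. A sketch of the same naming scheme, assuming we also want to be able to pass the timestamp in (handy for testing); `build_output_path` and its `timestr` parameter are hypothetical:

```python
import os
import time
from pathlib import Path


def build_output_path(video_to_process, output_directory, timestr=None):
    """Build '<output_directory>/<stem>_out_<timestamp>.mp4'."""
    if timestr is None:
        timestr = time.strftime("%Y%m%d_%H%M%S")
    stem = Path(video_to_process).stem
    return os.path.join(output_directory, f"{stem}_out_{timestr}.mp4")


print(build_output_path("test_videos_to_process/cat.mp4", "processed_videos", "20231006_163445"))
```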

##### Open video streams

This function opens the cv2 read connection for the video we're processing and the cv2 write connection for the new video we're going to write.

In [5]:
# open connections to cv2
def open_video_streams(video_to_process,output_file_name):
    # Open the video stream
    video_capture = cv2.VideoCapture(video_to_process)
    
    # Define the output video file name and settings
    output_frame_size = (int(video_capture.get(3)), int(video_capture.get(4)))  # Use the same size as the input video
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    output_video = cv2.VideoWriter(output_file_name, fourcc, 30.0, output_frame_size)

    print('video connections opened')
    return video_capture, output_video
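
The writer above assumes 30 frames per second. If the source runs at a different rate, the output will play back faster or slower than the original. One option is to read the rate from the capture with `video_capture.get(cv2.CAP_PROP_FPS)`, which can report 0 when the backend does not know it. A small sketch of guarding that value; `safe_fps` is a hypothetical helper, and the 30.0 fallback simply mirrors the notebook's hard-coded value:

```python
import math


def safe_fps(reported_fps, fallback=30.0):
    """Return a usable frame rate, falling back when the reported one is missing or invalid."""
    if reported_fps is None or math.isnan(reported_fps) or reported_fps <= 0:
        return fallback
    return float(reported_fps)


# In the notebook this might be used as (not run here):
#   fps = safe_fps(video_capture.get(cv2.CAP_PROP_FPS))
#   output_video = cv2.VideoWriter(output_file_name, fourcc, fps, output_frame_size)
print(safe_fps(25.0), safe_fps(0.0))
```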

##### Class filter and yolo detect

The two functions below took some figuring out! Essentially, we want to interpret the prediction outputs from Yolo-v5 correctly by converting them into something we can read and analyse further. Each function is commented below.

In [6]:
# define model output processing functions

# class filter
def class_filter(classdata):
    classes = []  # create a list
    for i in range(classdata.shape[0]):         # loop through all predictions
        classes.append(classdata[i].argmax())   # get the best classification location
    return classes  # return classes (int)

# yolo detect
def yolo_detect(output_data):  # input = raw model output tensor, output is boxes(xyxy), classes, scores
    output_data = output_data[0]                
    boxes = np.squeeze(output_data[..., :4])    
    scores = np.squeeze( output_data[..., 4:5]) 
    classes = class_filter(output_data[..., 5:]) 
    # Convert nx4 boxes from [x, y, w, h] to [x1, y1, x2, y2] where xy1=top-left, xy2=bottom-right
    x, y, w, h = boxes[..., 0], boxes[..., 1], boxes[..., 2], boxes[..., 3] # xywh
    xyxy = [x - w / 2, y - h / 2, x + w / 2, y + h / 2]  # xywh to xyxy

    return xyxy, classes, scores  # output is boxes(x,y,x,y), classes(int), scores(float) [predictions length]
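
To make the decoding concrete, here is a vectorised sketch of the same idea run on a tiny made-up tensor. It assumes the output has shape `(1, N, 5 + num_classes)` with columns `x, y, w, h, objectness` followed by per-class scores, as the slicing above implies. `decode_yolo_output` is a hypothetical name, and unlike `yolo_detect` it returns the boxes as a single `(N, 4)` array rather than a list of four arrays:

```python
import numpy as np


def decode_yolo_output(output_data):
    """Decode a (1, N, 5 + num_classes) tensor into xyxy boxes, class ids and scores."""
    preds = output_data[0]  # drop the batch dimension -> (N, 5 + num_classes)
    x, y, w, h = preds[:, 0], preds[:, 1], preds[:, 2], preds[:, 3]
    # Centre/size -> top-left and bottom-right corners
    boxes_xyxy = np.stack([x - w / 2, y - h / 2, x + w / 2, y + h / 2], axis=1)
    scores = preds[:, 4]                    # objectness column
    classes = preds[:, 5:].argmax(axis=1)   # index of the best class per prediction
    return boxes_xyxy, classes, scores


# Made-up tensor: 2 predictions, 3 classes
fake = np.array([[[0.50, 0.50, 0.2, 0.4, 0.9, 0.1, 0.7, 0.2],
                  [0.25, 0.75, 0.1, 0.1, 0.2, 0.8, 0.1, 0.1]]], dtype=np.float32)
boxes, classes, scores = decode_yolo_output(fake)
print(boxes.shape, classes.tolist())
```

The first prediction has its centre at (0.5, 0.5) with width 0.2 and height 0.4, so its corners come out at roughly (0.4, 0.3) and (0.6, 0.7).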

##### Process video with ML

The function below is the crux of this notebook. This is where we parse a video frame by frame to determine whether or not our chosen objects are present, storing outputs along the way.

It performs 6 main actions, looping over each frame in a video.  

1. processes a video's input frame into the right format for the model
2. runs the frame through the model
3. saves the model scores to a results list
4. draws a rectangle around our object on the new 'processed video'
5. checks how many frames have been processed (so we can break the loop early)
6. returns results

In [7]:
# define running the model and outputs
def process_video_with_ml(video_capture, interpreter, input_details, output_details, output_video, max_frame_break=50):

    # initialise an empty list to store results
    results = []

    # initialise a frame counter
    frame_count = 0
    
    while True:
        # Read a frame from the video
        ret, frame = video_capture.read()
        if not ret:
            break

        # Store the original frame dimensions for later conversion
        original_frame_height, original_frame_width, _ = frame.shape

        # Preprocess the frame (resize, normalize, etc.) to match the model's input shape
        resized_frame = cv2.resize(frame,(320,320))
        normalized_frame = resized_frame / 255.0
        preprocessed_frame = np.expand_dims(normalized_frame, axis=0).astype(np.float32)

        # Perform inference
        interpreter.set_tensor(input_details[0]['index'], preprocessed_frame)
        interpreter.invoke()
        detections = interpreter.get_tensor(output_details[0]['index'])
        xyxy, classes, scores = yolo_detect(detections) #boxes(x,y,x,y), classes(int), scores(float)

        # draw on video and write results 
        for i in range(len(scores)):
            if ((scores[i] > 0.3) and (scores[i] <= 1.0)):
                H = frame.shape[0]
                W = frame.shape[1]
                # clamp x coordinates to the frame width and y coordinates to the frame height
                xmin = int(max(1,(xyxy[0][i] * W)))
                ymin = int(max(1,(xyxy[1][i] * H)))
                xmax = int(min(W,(xyxy[2][i] * W)))
                ymax = int(min(H,(xyxy[3][i] * H)))

                cv2.rectangle(frame, (xmin,ymin), (xmax,ymax), (10, 255, 0), 2)
                cv2.putText(frame, f"{classes[i]}: {scores[i]:.2f}", (xmin, ymin - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

                results.append({"Frame":frame_count,"Class": classes[i], "Confidence": scores[i]})

        # Write the processed frame to the output video
        output_video.write(frame)

        # Increment the frame counter
        frame_count += 1

        if frame_count % 10 == 0:
            print(frame_count)

        # Check if the maximum frame count has been reached
        if frame_count >= max_frame_break:
            break
        
    print('video processing complete')
    results_df = pd.DataFrame(results)
    return results_df
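
The box drawing step relies on turning the model's normalised coordinates into pixels. A minimal sketch of that conversion on its own, assuming boxes are normalised to the 0 to 1 range; `to_pixel_box` is a hypothetical helper, and it clamps the lower bound to 0 rather than the 1 used above:

```python
def to_pixel_box(x1, y1, x2, y2, frame_width, frame_height):
    """Scale a normalised xyxy box to pixels and clamp it to the frame."""
    xmin = int(max(0, x1 * frame_width))
    ymin = int(max(0, y1 * frame_height))
    xmax = int(min(frame_width, x2 * frame_width))    # x is bounded by the width
    ymax = int(min(frame_height, y2 * frame_height))  # y is bounded by the height
    return xmin, ymin, xmax, ymax


# A box spilling over the right edge of a 640x480 frame
print(to_pixel_box(0.5, 0.25, 1.2, 0.75, 640, 480))  # (320, 120, 640, 360)
```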

##### Close video connections

This one is self-explanatory. We want to release the cv2 connections to the videos we're processing.

In [8]:
def close_video_connections(video_capture,output_video):
    # close video connections
    video_capture.release()
    output_video.release()
    cv2.destroyAllWindows()
    print('connections closed')

##### Is in video? 

Here we compare the model outputs dataframe against a class ID and a confidence threshold to check whether the object appears in a video file.

In [9]:
def is_in_video(class_id,threshold,results):
    checks = []
    for i in results.index:
        if results.iloc[i,1] == class_id and results.iloc[i,2] >= threshold:
            checks.append(True)
        else:
            checks.append(False)
    if True in checks:
        return True
    else:
        return False
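
The same check can be written without an explicit loop. A minimal sketch using a boolean mask, assuming the results dataframe has the `Class` and `Confidence` columns created in `process_video_with_ml`; `is_in_video_vectorised` is a hypothetical name:

```python
import pandas as pd


def is_in_video_vectorised(class_id, threshold, results):
    """Return True if any detection matches the class at or above the threshold."""
    if results.empty:  # no detections at all, so the columns may not exist
        return False
    mask = (results["Class"] == class_id) & (results["Confidence"] >= threshold)
    return bool(mask.any())


example = pd.DataFrame([
    {"Frame": 0, "Class": 15, "Confidence": 0.8},
    {"Frame": 1, "Class": 77, "Confidence": 0.4},
])
print(is_in_video_vectorised(15, 0.5, example), is_in_video_vectorised(77, 0.5, example))
```

Looking columns up by name rather than by position also keeps the check working if the column order of the results ever changes.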

##### Keep video

This function combines the check above with a decision process that outputs True/False. The idea is that we can build this logic into a process that decides whether or not to delete videos.

In [10]:
def keep_video(criteria,results):
    decision = False
    for i in criteria.index:
        print('checking ' + str(criteria.iloc[i,1]))
        if is_in_video(criteria.iloc[i,0],criteria.iloc[i,2],results):
            decision = True
            print(True)
        else:
            print(False)
    print('outcome: ' + str(decision))
    return decision

##### Delete file

We need a process to delete a file from a path. Here it is!

In [11]:
def delete_file(path):
    os.remove(path)
    print(path + ' file deleted')
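
`os.remove` raises `FileNotFoundError` if the file has already gone, for example when another process cleaned it up first. A sketch of a more forgiving variant, assuming Python 3.8+ for `missing_ok`; `delete_file_if_present` is a hypothetical name:

```python
from pathlib import Path


def delete_file_if_present(path):
    """Delete `path`; return True if a file was there, False if it was already gone."""
    target = Path(path)
    existed = target.exists()
    target.unlink(missing_ok=True)  # missing_ok requires Python 3.8+
    return existed
```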

##### Move a video

We also need a process to move a file from one path to another (i.e. out of the source location that we're purging and into our nice safe 'keep' folder).

In [12]:
# relocate a successful file process to output folder
def move_video(vid,new_dir):
    new_loc = new_dir + '//' + Path(vid).stem + '.mp4'
    Path(vid).rename(new_loc)
    print(vid + ' file moved to ' + new_loc)
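
`Path.rename` can fail when the source and destination are on different filesystems (for example a local disk and a mounted cloud drive). A sketch using `shutil.move`, which falls back to copy-then-delete in that case; `move_video_safely` is a hypothetical name, and creating the output directory on the fly is an added assumption:

```python
import os
import shutil
from pathlib import Path


def move_video_safely(vid, new_dir):
    """Move `vid` into `new_dir` (created if missing) and return the new path."""
    os.makedirs(new_dir, exist_ok=True)
    new_loc = os.path.join(new_dir, Path(vid).name)
    shutil.move(vid, new_loc)
    return new_loc
```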

#### Execute the process

Here we're going to execute the functions, which broadly fall into the following steps:

1. set parameters for execution
2. load our model and get our input videos
3. iterate over each video
    a. determine whether our chosen objects (cats and teddy bears) are in the videos
    b. if so, move processed videos to output directory
    c. if not, delete videos

In [13]:
# set my params
chosen_model = "yolo-v5.tflite"
chosen_source_directory = 'test_videos_to_process'
chosen_output_directory = 'processed_videos'
chosen_frame_breaks = 10 # chosen cut off frame number for processing videos
d = {'class': [15, 77],'class_name': ['cat','teddy bear'] , 'threshold': [0.5, 0.5]} # see list here: https://github.com/ultralytics/yolov5/blob/master/data/coco128.yaml
my_criteria = pd.DataFrame(data=d) # set my parameters for keeping videos

In [14]:
# run process

input_details, output_details, interpreter = load_model(model_path=chosen_model)
video_files = get_videos(chosen_source_directory)

# run process for each video
for i in video_files:
    video_to_process = i
    output_video_file = set_output_name(video_to_process,chosen_output_directory)
    video_capture, output_video = open_video_streams(video_to_process,output_video_file)
    results_df = process_video_with_ml(video_capture, interpreter, input_details, output_details, output_video, max_frame_break=chosen_frame_breaks)
    close_video_connections(video_capture,output_video)

    kv = keep_video(my_criteria,results_df)

    if not kv:
        delete_file(output_video_file)
        delete_file(video_to_process)
    else:
        move_video(video_to_process,chosen_output_directory)

print('process complete')

video connections opened
10
video processing complete
connections closed
checking cat
False
checking teddy bear
False
outcome: False
processed_videos//apples_out_20231006_163445.mp4 file deleted
test_videos_to_process\apples.mp4 file deleted
video connections opened
10
video processing complete
connections closed
checking cat
True
checking teddy bear
False
outcome: True
test_videos_to_process\cat.mp4 file moved to processed_videos//cat.mp4
process complete


#### fin!

We can see that the video containing a cat, cat.mp4, was processed and kept (you can go and have a look at the processed video in the output folder), and the apples.mp4 file (which doesn't have a cat or a teddy bear in it!) was deleted. Success!

We're done. Thanks for reading. 