# Using OpenCV
## Overview
This document exists to organize and contextualize a family of Python functions written by Phil Fahn-Lai for working with scientific videos using the open-source computer vision library [OpenCV](https://opencv.org/). The code shown here was originally developed in 2019/2020 to pre-process [XROMM](https://www.xromm.org/) videos in preparation for automated labeling by the deep learning toolbox [DeepLabCut](http://www.mousemotorlab.org/deeplabcut), and addresses several needs common to large video data sets (e.g. concatenating, trimming, downsampling, compressing), as well as a couple that were particular to this project (e.g. merging two separate streams of grayscale video into separate channels of a color video).
## Prerequisites
Both Python and OpenCV are platform-independent, and a number of detailed guides exist on the internet for setting them up on your computer, although Unix-like systems (e.g. Mac and Linux) are frequently friendlier development environments than Windows (for reasons we won't go into here).

[Anaconda](https://www.anaconda.com/) is a popular distribution of Python that includes a package manager (Conda), a user-friendly GUI for managing development environments (Navigator), and a code editor/IDE (Spyder), and is the easiest way to get up and running with Python without mucking around in the command line. Anaconda's intuitive approach to creating and managing environments makes it easy to recommend to beginners, as messed-up paths and missing or redundant dependencies can be a frustrating and discouraging barrier to learning to program.

Going all-in on the convenience of the Anaconda ecosystem means also accepting its limitations: while many popular Python packages (e.g. NumPy, TensorFlow) are available on Conda's repositories, Conda's selection pales in comparison with the universe of code available through PIP, Python's default package manager. Several of the modules and packages (package = bundle of modules) required by the functions below (e.g. blend_modes) do not yet exist on Conda, and while it is possible to take their source code from PyPI (PIP's repository) and compile them for Conda, I have found that it is much quicker and far less frustrating to simply **handle environments with Anaconda** and **use PIP to install packages within them.**

If you're reading this, you've probably gotten to the point where you've installed not only Python, but also [Jupyter Notebook/Lab](https://jupyter.org/). Jupyter is an IDE (Integrated Development Environment) analogous to RStudio or Spyder, but instead of running as a self-contained desktop application, Jupyter performs back-end calculations in a local server process on your computer, and pipes the output to an interactive interface in your web browser. Jupyter notebooks (`.ipynb`) consist of text cells (written in formattable [Markdown](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet) (`.md`)) as well as executable code cells (written in Python). By allowing the results of computational analyses to be presented alongside the code that generated them as well as explanatory text, notebooks have become extremely popular with teachers (who value context and exposition) and data scientists (who value repeatability and, generally, transparency).

## How to use
Each code cell in this notebook contains the code of a single Python module. The project directory that these were taken from contained all of these modules as discrete `.py` files, as well as a `.ipynb` Jupyter notebook that imported them, like so: `from [module_filename] import [module_functionname]`. 

To actually use these functions, you should either replicate this folder structure or, preferably, decide to be more organized than I was and place them inside a subdirectory called 'modules' or some such, in which case you would import them with, e.g., `from [subdirectory].[module_filename] import [module_functionname]` (the `.` separates the subdirectory from the module inside it, and Python looks for the subdirectory relative to the folder the notebook is running from). Note that you can rename your imports for brevity: for instance, many people choose to `import numpy as np` so they can access the package's functions with `np.[function]` rather than `numpy.[function]`.

Alternatively, you can create a new Jupyter notebook and copy the function cells you want to use into there. Two things to keep in mind if you do this: 
1. These functions use a lot of the same imports, so make sure to go through and delete the redundant ones (or ideally consolidate your imports in a separate code cell at the top of the notebook). 
2. Variables in notebooks are *globally-scoped*—they exist outside of the cells they're created in. Each code cell in your notebook has access to every other code cell, since they all share the same namespace. What this means in practice is you should be careful not to assign overly-similar names to variables in different parts of the notebook, so as not to run the risk of accidentally overwriting them later on.
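
To make the second point concrete, here is a minimal sketch of how a shared namespace can bite. The variable and function names are hypothetical and do not come from the modules below; the comments marking "cells" simply stand in for separate notebook cells.

```python
# Cell 1: a helper that (carelessly) reads a notebook-level variable
frame_rate = 30

def seconds_to_frames(seconds):
    # frame_rate is looked up in the shared notebook namespace at call time
    return seconds * frame_rate

before = seconds_to_frames(2)

# Cell 2, much later: the same name gets reused for a different video
frame_rate = 250

after = seconds_to_frames(2)   # same call, different answer
print(before, after)

# Safer: pass the value in explicitly so the function no longer depends on notebook state
def seconds_to_frames_explicit(seconds, frame_rate):
    return seconds * frame_rate
```

The functions in this notebook mostly take what they need as arguments, which sidesteps this problem as long as the calling cells do the same.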

Or you could do the grown-up thing and wrap all of these inside a class.

Finally, a word on path separators. While Unix-like systems use a regular (forward) slash `/` as a separator in file paths (e.g. `[directory]/[file]`), Windows uses a backslash `\` instead (e.g. `[directory]\[file]`). This can cause *huge* issues when working with languages that use the backslash as an escape character *(a lot of them!)* To get around this, make sure you pass strings containing Windows paths as raw strings (by adding `r` before the opening quote, e.g. `r'[directory]\[file]'`)—this tells Python to treat the backslash literally, instead of as a prefix to a special character. 
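
A short sketch of the difference, using a made-up Windows path (the folder and file names are placeholders). The standard-library `ntpath` module applies Windows path rules on any operating system, so this runs the same on Mac and Linux.

```python
import ntpath  # Windows path rules, importable on any platform

escaped = 'C:\new_data\trial1.avi'   # \n and \t are silently turned into newline and tab
raw = r'C:\new_data\trial1.avi'      # raw string: backslashes are kept literally

print(repr(escaped))
print(repr(raw))

# The raw string splits into folder and file as intended
print(ntpath.basename(raw))

# Forward slashes are also understood by Windows path handling in Python
forward = 'C:/new_data/trial1.avi'
print(ntpath.basename(forward))
```

Building paths with `os.path.join` (as the functions below do) avoids hard-coding either separator.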

### scanDir.py
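Loops recursively through a specified folder `directory` and returns a list of paths to all files whose names end in `extension` *(defaults to `avi` if unspecified; pass it in lowercase, since file names are lowercased before comparison)*. The optional `filters` argument takes a list of regular-expression patterns (case-sensitive): when `filter_out` is `True` (the default), paths matching any pattern are dropped; when it is `False`, only paths matching every pattern are kept.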

In [1]:
import os
import re
        
def scanDir(directory, extension='avi', filters=[], filter_out=True, verbose=False):
    file_list=[]
    for root, dirs, files in os.walk(directory):
        for name in files:
            if name.lower().endswith(extension):
                filename = os.path.join(root, name)
                if verbose:
                    print("Found file with extension ."+ extension + ": " + filename)
                file_list.append(filename)
    if len(filters) != 0:
        if filter_out==True:
            for string in filters:
                file_list = [file for file in file_list if not re.search(string, file)]
        else:
            for string in filters:
                file_list = [file for file in file_list if re.search(string, file)]
    return(file_list)
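
For comparison, the same idea can be sketched with the standard-library `pathlib` module. This is an alternative under stated assumptions, not the code used in the project: the function name `scan_dir_pathlib` and the `exclude` parameter are hypothetical, and unlike `scanDir` it compares the file suffix exactly rather than checking how the name ends.

```python
import re
import tempfile
from pathlib import Path

def scan_dir_pathlib(directory, extension='avi', exclude=()):
    """Recursively list files with the given extension, skipping paths that match any pattern in exclude."""
    suffix = '.' + extension.lower().lstrip('.')
    matches = [str(p) for p in Path(directory).rglob('*')
               if p.is_file() and p.suffix.lower() == suffix]
    return sorted(m for m in matches
                  if not any(re.search(pattern, m) for pattern in exclude))

# Demo on a throwaway folder tree (all names are made up)
with tempfile.TemporaryDirectory() as tmp:
    for rel in ['day1/trial1/Cam1.avi', 'day1/trial1/Cam2.AVI',
                'day1/notes.txt', 'day1/calib/Cam1.avi']:
        path = Path(tmp, rel)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    found = scan_dir_pathlib(tmp, 'avi', exclude=['calib'])
    found_names = [Path(f).name for f in found]
    print(found_names)
```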

### cv2VideoWriterDummy.py
Demonstrates how to open a video stream with OpenCV's VideoCapture class, and save it as either: 
1. An MPEG-4 video with H.264 compression[<sup>1</sup>](#fn1) (using the VideoWriter class, which calls FFMPEG under the hood but exposes very few options to the user), or 
2. A stream of lossless `.PNGs` bound into an uncompressed `.AVI` (by piping the incoming video data to FFMPEG). 

<span id="fn1"><sup>1</sup> This requires the [H.264/AVC codec](https://www.videolan.org/developers/x264.html). Use 'mp4v' for less-efficient Apple MPEG encoding if installing H.264 is not an option.</span>


Option 1 is much faster, but provides less control over the quality of the output video. 

Option 2 is the only way to ensure near-zero degradation in video quality, but is quite a bit slower. 

A potential option 3) not shown here is to get OpenCV to output an uncompressed `.AVI` by passing it 0 as the codec, but this results in a file slightly larger than the original since there is no currently-working way to keep OpenCV from outputting a color video, and it is impossible to rule out data loss from converting between pixel formats. 

Option 4) is to pass the video data to FFMPEG as in 2), but use FFMPEG's much more comprehensive API to output a better-compressed video than OpenCV is capable of making. FFMPEG supports multiple lossless codecs (e.g. H.264 lossless, HUFFYUV, FFV1), and can be built to include next-generation H.265/HEVC support as well.

Ultimately, the best choice for output encoding depends on the nature of the project—**there is no single perfect solution, as pretty much everything involving video is a tradeoff.** Uncompressed AVIs preserve maximum quality, but are unwieldy to deal with and can be slow to save. Opting for compressed videos means choosing between older, less-efficient codecs like MP4V/Apple MPEG-4 (quick to encode and decode, lower quality) and newer codecs like H.264 and H.265 that produce higher-quality videos at the cost of longer encoding and decoding times. For long-term archival purposes it might be worth investing the encoding time upfront to transcode data into a slow-to-decode lossless format, while for immediate analysis on relatively powerful hardware you should see if you can get away with compressing without losing scientifically-relevant amounts of data.
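
As a rough illustration of Option 4, the sketch below builds an FFMPEG command line that reads raw frames from standard input and encodes them losslessly with FFV1. The helper name `build_ffv1_command`, the frame size, and the output file name are assumptions made for this example; it only assembles the argument list, so it can be inspected without FFMPEG installed.

```python
def build_ffv1_command(width, height, frame_rate, output_path, pix_fmt='gray'):
    """Return an FFMPEG argument list that reads raw frames from stdin and writes lossless FFV1."""
    return ['ffmpeg', '-y',
            '-f', 'rawvideo',            # input is headerless raw pixel data
            '-pix_fmt', pix_fmt,         # 'gray' = 8-bit single channel; 'bgr24' = OpenCV color frames
            '-s', f'{width}x{height}',   # frame size must be stated for raw input
            '-r', str(frame_rate),       # input frame rate
            '-i', '-',                   # read from stdin
            '-c:v', 'ffv1',              # lossless FFV1 encoder
            output_path]

cmd = build_ffv1_command(1024, 768, 30, 'trial1_lossless.mkv')
print(' '.join(cmd))

# To use it, start the process and write each frame's bytes to its stdin, e.g.:
#   p = subprocess.Popen(cmd, stdin=subprocess.PIPE)
#   p.stdin.write(gray_frame.tobytes())   # gray_frame: uint8 array of shape (height, width)
#   p.stdin.close(); p.wait()
```

Sending raw bytes skips the per-frame PNG encoding step used in Option 2, at the cost of having to tell FFMPEG the frame size and pixel format up front.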


In [8]:
from PIL import Image
from subprocess import Popen, PIPE
import numpy as np
import cv2
        
def cv2VideoWriterDummy(input_video, output_video, codec='avc1'):
    cap = cv2.VideoCapture(input_video)
    frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    frame_rate = round(cap.get(cv2.CAP_PROP_FPS),2)
    if codec == 'uncompressed':
        pix_format = 'gray'   ##change to 'yuv420p' for color or 'gray' for grayscale. 'pal8' doesn't play on macs
        p = Popen(['ffmpeg', '-y', '-f', 'image2pipe', '-vcodec', 'png', '-r', str(int(frame_rate)), '-i', '-', '-vcodec', 'rawvideo','-pix_fmt',pix_format,'-r', str(int(frame_rate)), output_video], stdin=PIPE)
    else:
        if codec == 0:
            fourcc = 0
        else:
            fourcc = cv2.VideoWriter_fourcc(*codec)    
        out = cv2.VideoWriter(output_video, 
                              fourcc, 
                              frame_rate,(frame_width, frame_height))
        
    while(cap.isOpened()):
        ret, frame = cap.read()
        if ret == True:
            cv2.imshow('frame',frame)
            if codec == 'uncompressed':
                im = Image.fromarray(frame)
                im.save(p.stdin, 'PNG') 
                if cv2.waitKey(1) & 0xFF == ord('q'):
                    break
            else:
                out.write(frame)
                if cv2.waitKey(1) & 0xFF == ord('q'):
                    break
        else:
            break
    if codec == 'uncompressed':
        p.stdin.close()
        p.wait()
    cap.release()
    if codec != 'uncompressed':
        out.release()
    cv2.destroyAllWindows()
    print("done!")

### getImmediateAncestors.py
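Returns the last `depth` components of `file_path` as a list, deepest first: the file (or final folder) name itself, then its parent directory, then its grandparent, and so on. For example, a `depth` of 3 applied to `[experiment]/[trial]/[video]` yields the video name, the trial folder, and the experiment folder, in that order.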

In [7]:
import os

def getImmediateAncestors(file_path, depth=1):
    result = []
    for i in range(depth):
        parent_path, child_name = os.path.split(file_path)
        file_path = parent_path
        result.append(child_name)
    return result        
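
A worked example may help here. The sketch below reproduces the same behavior with `pathlib` so that it runs identically on any operating system; the function name `last_components` and the folder names are made up for illustration.

```python
from pathlib import PurePosixPath

def last_components(file_path, depth=1):
    """Return the final `depth` components of a path, deepest first."""
    parts = PurePosixPath(file_path).parts
    start = max(len(parts) - depth, 0)
    return list(reversed(parts[start:]))

components = last_components('2019-11-04/treadmill/Cam1.avi', 3)
print(components)

# Dropping the first entry leaves only the folder names, which is how
# bakeMetadata (below) obtains the trial and experiment labels
metadata = components[1:]
print(metadata)
```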

### changeFilename.py
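Given any filepath, inserts a custom string `insertion` into the file name at character position `pos`. Default behavior is prefix (`pos=0`). If `new_extension` is given (including the leading dot, e.g. `.mp4`), it replaces the original extension.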

In [10]:
import os

def changeFileName(file_path, insertion, new_extension=None, pos=0):
    location, name = os.path.split(file_path)
    left, right = name[:pos], name[pos:]
    modified_name = left + insertion + right
    if new_extension is not None:
        temp_name, old_ext = os.path.splitext(modified_name)
        modified_name = temp_name + new_extension
    result = os.path.join(location, modified_name)
    return result
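
To show how `pos` and `new_extension` interact, here is an equivalent sketch written with `pathlib`. The name `insert_in_filename` and the example paths are hypothetical; the point is only to illustrate the inputs and the resulting file names.

```python
from pathlib import PurePosixPath

def insert_in_filename(file_path, insertion, new_extension=None, pos=0):
    """Insert a string into the file name at `pos`, optionally swapping the extension."""
    path = PurePosixPath(file_path)
    new_name = path.name[:pos] + insertion + path.name[pos:]
    renamed = path.with_name(new_name)
    if new_extension is not None:
        renamed = renamed.with_suffix(new_extension)  # expects a leading dot, e.g. '.mp4'
    return str(renamed)

prefixed = insert_in_filename('day1/trial1/Cam1.avi', 'trimmed_')
converted = insert_in_filename('day1/trial1/Cam1.avi', '_small', new_extension='.mp4', pos=4)
print(prefixed)
print(converted)
```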

### bakeMetadata.py
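Bakes experiment metadata (experiment day, trial condition, frame number) into each video frame as text in the lower-left corner. Expects the following folder structure: `[experiment]/[trial]/[video]`. Relies on `getImmediateAncestors` (defined above) being available in the namespace.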

In [4]:
import os
from PIL import Image
from subprocess import Popen, PIPE
import numpy as np
import cv2
# from getImmediateAncestors import getImmediateAncestors
        
def bakeMetadata(input_path, output_path, codec='avc1'):
    cap = cv2.VideoCapture(input_path)
    frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    frame_rate = round(cap.get(cv2.CAP_PROP_FPS),2)
    

    pos_x = round(frame_width/50)
    off_y = round(frame_height/50)
    off_y_initial = off_y
    frame_index = 1
    metadata = getImmediateAncestors(input_path, 3)[1:]
    metadata.append(frame_index)
    font_family = cv2.FONT_HERSHEY_SIMPLEX
    font_size = 0.4
    font_color = (255, 255, 255)
    
    if codec == 'uncompressed':
        pix_format = 'gray'   ##change to 'yuv420p' for color or 'gray' for grayscale. 'pal8' doesn't play on macs
        p = Popen(['ffmpeg', '-y', '-f', 'image2pipe', '-vcodec', 'png', '-r', str(int(frame_rate)), '-i', '-', '-vcodec', 'rawvideo','-pix_fmt',pix_format,'-r', str(int(frame_rate)), output_path], stdin=PIPE)
    else:
        if codec == 0:
            fourcc = 0
        else:
            fourcc = cv2.VideoWriter_fourcc(*codec)    
        out = cv2.VideoWriter(output_path, 
                              fourcc, 
                              frame_rate,(frame_width, frame_height))
        
    while(cap.isOpened()):
        ret, frame = cap.read()
        if ret == True:
            off_y = off_y_initial
            for metadatum in metadata:
                pos_y = frame_height-off_y

                frame = cv2.putText(frame, str(metadatum), (pos_x, pos_y), font_family,
                                    font_size, font_color)
                off_y = off_y+off_y_initial 
            cv2.imshow('frame',frame)
            frame_index += 1
            metadata.pop()
            metadata.append(frame_index)
            if codec == 'uncompressed':
                im = Image.fromarray(frame)
                im.save(p.stdin, 'PNG') 
                if cv2.waitKey(1) & 0xFF == ord('q'):
                    break
            else:
                out.write(frame)
                if cv2.waitKey(1) & 0xFF == ord('q'):
                    break
        else:
            break
    if codec == 'uncompressed':
        p.stdin.close()
        p.wait()
    cap.release()
    if codec != 'uncompressed':
        out.release()
    cv2.destroyAllWindows()
    print("done!")

### sortByCameraID.py
Given a list of file paths, returns a list of lists sorted by camera identifier (in the format prefix->camera#).

In [23]:
import re

def sortByCameraID(path_list, prefixes=['Cam','C00'], number_of_cameras=2):
    triaged_lists = [[] for cameras in range(number_of_cameras)]
    for i, triaged_list in enumerate(triaged_lists, start=1):
        for path in path_list:
            for prefix in prefixes:
                match_string = prefix+str(i)
                if re.search(match_string, path):
                    triaged_list.append(path)
                    break
    return triaged_lists
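
One thing to watch for: because the match is a plain substring search, a path containing `Cam10` also matches `Cam1`. If a project has ten or more cameras, a stricter pattern may be worth considering. The following is a sketch of one possible variant, not the original function; the name `sort_by_camera_strict` and the example paths are assumptions.

```python
import re

def sort_by_camera_strict(path_list, prefixes=('Cam', 'C00'), number_of_cameras=2):
    """Group paths by camera number; a digit right after the number blocks the match."""
    alternatives = '|'.join(re.escape(prefix) for prefix in prefixes)
    groups = [[] for _ in range(number_of_cameras)]
    for cam_number, group in enumerate(groups, start=1):
        # e.g. (?:Cam|C00)1(?!\d) matches 'Cam1' but not 'Cam10'
        pattern = re.compile('(?:' + alternatives + ')' + str(cam_number) + r'(?!\d)')
        group.extend(path for path in path_list if pattern.search(path))
    return groups

paths = ['trial1/Cam1.avi', 'trial1/Cam2.avi', 'trial2/C001.avi', 'trial2/Cam10.avi']
cam1, cam2 = sort_by_camera_strict(paths)
print(cam1)
print(cam2)
```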


### concatenateVideos.py
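Given a list of video paths, concatenates them into one long video. Passing in an optional downsampling factor `interval` tells the function to keep only one in every `interval` frames; the output frame rate is divided by the same factor, so playback speed is unchanged. All input videos are assumed to share the frame size and frame rate of the first one.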

In [3]:
import numpy as np
import cv2
        
def concatenateVideos(path_list, output_path, codec='avc1', interval=1):
    frame_index = 0
    video_index = 0
    cap = cv2.VideoCapture(path_list[0])
    frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    frame_rate = round(cap.get(cv2.CAP_PROP_FPS),2)/interval
    fourcc = cv2.VideoWriter_fourcc(*codec)
    out = cv2.VideoWriter(output_path, 
                          fourcc, 
                          frame_rate,(frame_width, frame_height))
    while(cap.isOpened()):
        ret, frame = cap.read()
        frame_index += 1
        if frame is None:
            print("end of video " + str(video_index) + " ... next one now")
            video_index += 1
            if video_index >= len(path_list):
                break
            cap.release()  # free the finished video before opening the next one
            cap = cv2.VideoCapture(path_list[ video_index ])
            frame_index = 0
        elif frame_index == interval:
            frame = frame.astype(np.uint8)
            cv2.imshow('frame',frame)
            out.write(frame)
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
            frame_index = 0         
    cap.release()
    out.release()
    cv2.destroyAllWindows()
    print("done!")
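
The downsampling logic can be checked without any video at all. The small sketch below (the helper name `kept_frame_numbers` is made up) lists which frames of a single input video survive for a given `interval`, counting from 1, and shows the matching output frame rate.

```python
def kept_frame_numbers(total_frames, interval=1):
    """1-based frame numbers that are written when keeping one frame in every `interval`."""
    return list(range(interval, total_frames + 1, interval))

source_fps = 30.0
interval = 3
kept = kept_frame_numbers(10, interval)
output_fps = source_fps / interval   # divided by the same factor, so duration is preserved
print(kept, output_fps)
```

Note that the frame counter restarts at the beginning of each input video, so leftover frames at the end of one video do not carry over into the next.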


### mergeRGB.py
Takes a dictionary containing two video paths in the format `{'A':[path A], 'B':[path B]}` and exports a single new video with video A written to the red channel and video B written to the green channel. The blue channel is, depending on the value passed as "mode", either the difference blend between A and B, the multiply blend, or just a black frame.

In [48]:
import os
import numpy as np
import cv2
import blend_modes
        
def mergeRGB(video_dict, output_path, codec='avc1', mode=None):
    capA = cv2.VideoCapture(video_dict['A'])
    capB = cv2.VideoCapture(video_dict['B'])
    frame_width = int(capA.get(cv2.CAP_PROP_FRAME_WIDTH))
    frame_height = int(capA.get(cv2.CAP_PROP_FRAME_HEIGHT))
    frame_rate = round(capA.get(cv2.CAP_PROP_FPS),2)
    fourcc = cv2.VideoWriter_fourcc(*codec)
    out = cv2.VideoWriter(output_path,
                         fourcc,
                         frame_rate,(frame_width, frame_height))
    while(capA.isOpened()):
        retA, frameA = capA.read()
        retB, frameB = capB.read()
        if retA and retB:  # stop as soon as either video runs out of frames
            ## give frames an alpha channel to prepare for blending; blend_modes requires 32bit
            frameA = cv2.cvtColor(frameA, cv2.COLOR_BGR2BGRA,4).astype(np.float32)
            frameB = cv2.cvtColor(frameB, cv2.COLOR_BGR2BGRA,4).astype(np.float32)
            frameA = cv2.normalize(frameA, None, 0, 255, norm_type=cv2.NORM_MINMAX)
            frameB = cv2.normalize(frameB, None, 0, 255, norm_type=cv2.NORM_MINMAX)
            if mode == "difference":
                extraChannel = blend_modes.difference(frameA,frameB,1)
            elif mode == "multiply":
                extraChannel = blend_modes.multiply(frameA,frameB,1)
            else:
                extraChannel = np.zeros((frame_height, frame_width,3),np.uint8)  # NumPy image shape is (rows, columns, channels)
                extraChannel = cv2.cvtColor(extraChannel, cv2.COLOR_BGR2BGRA,4).astype(np.float32)

            ## get rid of alpha channel in preparation for converting back to grayscale; opencv prefers 8bit
            frameA = cv2.cvtColor(frameA, cv2.COLOR_BGRA2BGR).astype(np.uint8)  
            frameB = cv2.cvtColor(frameB, cv2.COLOR_BGRA2BGR).astype(np.uint8)  
            extraChannel = cv2.cvtColor(extraChannel, cv2.COLOR_BGRA2BGR).astype(np.uint8)  

            ## convert to grayscale so we can merge into 3-channel image
            frameA = cv2.cvtColor(frameA, cv2.COLOR_BGR2GRAY)  
            frameB = cv2.cvtColor(frameB, cv2.COLOR_BGR2GRAY)  
            extraChannel = cv2.cvtColor(extraChannel, cv2.COLOR_BGR2GRAY)  

            ## merge, show and write                  
            merged = cv2.merge((extraChannel, frameB, frameA))
            cv2.imshow('merged',merged)
            out.write(merged)
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
        else:
            break
    capA.release()
    capB.release()
    out.release()
    cv2.destroyAllWindows()
    print("done!")
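
The channel packing itself is easy to see on a single pair of frames. The sketch below uses plain NumPy arithmetic in place of `blend_modes` (so the blended channel will not be numerically identical to what `mergeRGB` produces), and the function name `merge_frames` and the random test frames are assumptions for illustration only.

```python
import numpy as np

def merge_frames(gray_a, gray_b, mode=None):
    """Pack two uint8 grayscale frames into one BGR frame (A -> red, B -> green)."""
    if mode == 'difference':
        extra = np.abs(gray_a.astype(np.int16) - gray_b.astype(np.int16)).astype(np.uint8)
    elif mode == 'multiply':
        extra = (gray_a.astype(np.float32) * gray_b.astype(np.float32) / 255).astype(np.uint8)
    else:
        extra = np.zeros_like(gray_a)
    return np.dstack((extra, gray_b, gray_a))  # OpenCV channel order is B, G, R

rng = np.random.default_rng(0)
frame_a = rng.integers(0, 256, size=(4, 6), dtype=np.uint8)
frame_b = rng.integers(0, 256, size=(4, 6), dtype=np.uint8)
merged = merge_frames(frame_a, frame_b, mode='difference')

# Splitting the channels back out recovers both source frames exactly
recovered_b, recovered_a = merged[:, :, 1], merged[:, :, 2]
print(merged.shape)
print(np.array_equal(recovered_a, frame_a), np.array_equal(recovered_b, frame_b))
```

The round trip is only exact in memory; once the merged frames pass through a lossy codec, the recovered channels will differ slightly from the originals.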


### splitRGB.py
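Takes an RGB video with different grayscale data written to the R and G channels (as produced by `mergeRGB`) and splits it back into its two component source videos, written to the current working directory as `[input name]_split_c1` and `[input name]_split_c2`. The blue channel is discarded.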

In [9]:
import os
import numpy as np
import cv2
from PIL import Image
from subprocess import Popen, PIPE
        
def splitRGB(input_path, codec='avc1'):
    out_name = os.path.splitext(os.path.basename(input_path))[0]+'_split_'
    cap = cv2.VideoCapture(input_path)
    frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    frame_rate = round(cap.get(cv2.CAP_PROP_FPS),2)
    if codec == 'uncompressed':
        pix_format = 'gray'   ##change to 'yuv420p' for color or 'gray' for grayscale. 'pal8' doesn't play on macs
        p1 = Popen(['ffmpeg', '-y', '-f', 'image2pipe', '-vcodec', 'png', '-r', str(int(frame_rate)), '-i', '-', '-vcodec', 'rawvideo','-pix_fmt',pix_format,'-r', str(int(frame_rate)), out_name+'c1.avi'], stdin=PIPE)
        p2 = Popen(['ffmpeg', '-y', '-f', 'image2pipe', '-vcodec', 'png', '-r', str(int(frame_rate)), '-i', '-', '-vcodec', 'rawvideo','-pix_fmt',pix_format,'-r', str(int(frame_rate)), out_name+'c2.avi'], stdin=PIPE)
    else:
        if codec == 0:
            fourcc = 0
        else:
            fourcc = cv2.VideoWriter_fourcc(*codec)    
        out1 = cv2.VideoWriter(out_name+'c1.mp4', 
                              fourcc, 
                              frame_rate,(frame_width, frame_height))
        out2 = cv2.VideoWriter(out_name+'c2.mp4', 
                              fourcc, 
                              frame_rate,(frame_width, frame_height))
        
    while(cap.isOpened()):
        ret, frame = cap.read()
        if ret == True:
            B, G, R = cv2.split(frame)
            cv2.imshow('frame',R)
            if codec == 'uncompressed':
                imR = Image.fromarray(R)
                imG = Image.fromarray(G)
                imR.save(p1.stdin, 'PNG') 
                imG.save(p2.stdin, 'PNG') 
                if cv2.waitKey(1) & 0xFF == ord('q'):
                    break
            else:
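                # the writers were opened in the default color mode, so give them 3-channel frames
                out1.write(cv2.cvtColor(R, cv2.COLOR_GRAY2BGR))
                out2.write(cv2.cvtColor(G, cv2.COLOR_GRAY2BGR))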
                if cv2.waitKey(1) & 0xFF == ord('q'):
                    break
        else:
            break
    if codec == 'uncompressed':
        p1.stdin.close()
        p1.wait()
        p2.stdin.close()
        p2.wait()
    cap.release()
    if codec != 'uncompressed':
        out1.release()
        out2.release()
    cv2.destroyAllWindows()
    print("done!")

### vidToPngs.py
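Takes a video as input and exports each frame as a separate, zero-padded, numbered `.png`. By default the images are named after the video and saved to a folder of the same name next to it. If `indices_to_match` is given, only frames whose (zero-based) index appears in that list are exported.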

In [11]:
import os
import numpy as np
import cv2
        
def vidToPngs(video_path, output_dir=None, indices_to_match=[], name_from_folder=True):
    frame_index = 0
    frame_counter = 0
    if name_from_folder:
        out_name = os.path.splitext(os.path.basename(video_path))[0]
    else:
        out_name = 'img'
    if output_dir is None:
        out_dir = os.path.join(os.path.dirname(video_path),out_name)
    else:
        out_dir = output_dir
    if not os.path.exists(out_dir):
        os.mkdir(out_dir)
    cap = cv2.VideoCapture(video_path)
    frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    frame_rate = round(cap.get(cv2.CAP_PROP_FPS),2)
    while(cap.isOpened()):
        ret, frame = cap.read()
        frame_counter += 1
        if ret == True:
            if indices_to_match and frame_index not in indices_to_match:
                frame_index += 1
                continue
            else:
                png_name = out_name+str(frame_index).zfill(4)+'.png'
                png_path = os.path.join(out_dir, png_name)
                cv2.imshow('frame',frame)
                cv2.imwrite(png_path, frame)
                frame_counter = 0                
                if cv2.waitKey(1) & 0xFF == ord('q'):
                    break
                frame_index += 1
        else: 
            break
                
    cap.release()
    cv2.destroyAllWindows()
    print("done!")

In [5]:
import os

def matchFrames(extracted_dir):
    extracted_files = scanDir(extracted_dir, extension='png') 
    # file names look like 'img0042.png': skip the 3-character prefix; int() handles the zero padding
    extracted_indices = [int(os.path.splitext(os.path.basename(png))[0][3:]) for png in extracted_files]
    return extracted_indices   

def extractMatchedFrames(indices, output_dir = None, src_vids=[]):
    for video in src_vids:
        out_name = os.path.splitext(os.path.basename(video))[0]+'_matched'
        if output_dir is not None:
            output = output_dir
        else:
            output = os.path.join(os.path.dirname(video),out_name)
        vidToPngs(video, output, indices_to_match=indices, name_from_folder=False)
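
The index parsing depends on the naming scheme used by `vidToPngs` when `name_from_folder` is `False`: a three-character `img` prefix followed by a four-digit, zero-padded frame index. Here is a minimal sketch of that round trip; the helper name `frame_index_from_name` is hypothetical.

```python
import os

def frame_index_from_name(png_path, prefix_length=3):
    """Recover the frame index from a name like 'img0042.png'."""
    stem = os.path.splitext(os.path.basename(png_path))[0]
    return int(stem[prefix_length:])   # int() accepts zero-padded digits, including '0000'

names = ['img' + str(i).zfill(4) + '.png' for i in (0, 7, 120)]
indices = [frame_index_from_name(name) for name in names]
print(names)
print(indices)
```

If the images were named after the video instead (the default), the prefix length would need to match the length of that name.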
