# Overview

The goal here is to generate small-file-size copies of research videos so that DeepLabCut (DLC) and data management run more efficiently. This notebook includes code for experiments in which the recordings were saved (1) as an image sequence or (2) as individual movie files. As demonstrated in [this preprint](https://www.biorxiv.org/content/10.1101/457242v1), videos used for DLC can yield similar results when heavily downsampled and compressed, while running much faster.

## Software
This code assumes that your system includes an installation of [Anaconda](https://www.anaconda.com), as well as [ffmpeg](https://www.ffmpeg.org/download.html). ffmpeg is an open-source tool for converting video.
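
Before running the cells below, it can help to confirm that ffmpeg is reachable from Python. The following is a minimal sketch, not part of `videotools`; the helper name `ffmpeg_available` is hypothetical, and it only checks whether an executable is on the system PATH.

```python
import shutil

def ffmpeg_available(executable="ffmpeg"):
    """Return True if the named executable can be found on the system PATH."""
    return shutil.which(executable) is not None

if not ffmpeg_available():
    print("ffmpeg was not found on the PATH; install it before running the cells below.")
```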

## Video catalog
This code may be partially controlled by parameter values that are provided in a **video catalog**, which is a csv-formatted spreadsheet with each row corresponding to a particular experimental recording. The headings for the columns of this spreadsheet are as follows:

1. date - Date of recording.
1. trial_num - Trial number for the experiments.
1. fps - Frame rate (Hz) for the recording.
1. make_video - (1 or 0) Specifies whether to generate an output video.
1. roi_x, roi_y, roi_w, roi_h - (optional, in pixels) Specifies the coordinates of the lower-left corner of a region-of-interest and its width and height. Output videos will be cropped to these dimensions.

If the video catalog is not used (i.e., set parameter 'useCat = False'), then all available movies will be converted. In the case of image sequences, all directories within imPath that hold images will be used to create movies. In the case of movie files, all movies from vidInPath will be converted when the video catalog is not used.
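
As an illustration of the catalog layout described above, the sketch below reads a small catalog from a string and keeps only the rows with `make_video == 1`. The rows and values are made up for this example; a real catalog would be read from `video_catalog.csv`.

```python
import io
import pandas as pd

# Hypothetical catalog contents; the values are made up for illustration
catalog_text = """date,trial_num,fps,make_video,roi_x,roi_y,roi_w,roi_h
2022-01-25,1,30,1,100,50,640,480
2022-01-25,2,30,0,,,,
2022-01-26,1,30,1,,,,
"""

catalog = pd.read_csv(io.StringIO(catalog_text))

# Keep only the rows flagged for conversion and renumber them from zero
selected = catalog.loc[catalog.make_video == 1].reset_index(drop=True)

print(selected[["date", "trial_num", "fps"]])
```

Empty ROI cells are read as missing values (NaN), so rows without an ROI can share a catalog with rows that have one.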

The end of the notebook includes code for selecting a region-of-interest.

## Packages
We import the same packages for either image sequences or movie files. So, run the cell below for either case.

In [1]:
import pandas as pd
import numpy as np
import sys, os
import cv2 as cv  
from sources import videotools as vt
import glob

# Working with image sequences  ----------
This example draws video frames from an image sequence. The code assumes that the image files are saved in directories that are named after the date of the recording, as formatted in the 'date' column of the video catalog.
The video catalog may include the following:
1. start_image_filename  - Name of image file at the **start** of recording.
1. end_image_filename  - Name of image file at the **end** of recording.
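
The start and end filenames encode frame numbers after a fixed prefix. The sketch below shows one way to convert between a filename and its frame number, assuming names of the form prefix plus zero-padded number (for example `DSC00849.jpeg`). The helper names are hypothetical and are not part of `videotools`.

```python
import os

def frame_number(filename, prefix="DSC"):
    """Extract the integer frame number from a name such as 'DSC00849.jpeg'."""
    # Drop any directory and file extension before stripping the prefix
    stem = os.path.splitext(os.path.basename(filename))[0]
    if not stem.startswith(prefix):
        raise ValueError("Filename does not start with prefix: " + filename)
    return int(stem[len(prefix):])

def frame_filename(number, prefix="DSC", n_digits=5, suffix="jpeg"):
    """Build the filename for a frame number, zero-padded to n_digits."""
    return f"{prefix}{number:0{n_digits}d}.{suffix}"

print(frame_number("DSC00849.jpeg"))
print(frame_filename(849))
```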

Images for this example, along with a sample video catalog, may be downloaded from [here](https://drive.google.com/file/d/1pdCJX97nwkdKqtxPtK0LZlN1oDZLoMPB/view?usp=sharing).

## Parameters
Modify the paths listed below for your project and system.

In [4]:
# Change for each computer
root_path = '/Users/mmchenry/Documents/code/kineKit_files/image_sequence'
# root_path = '/Users/mmchenry/Documents/Projects/geotaxis/'

# Whether to use the video catalog to control the batch execution
useCat = True

if useCat:
    # Path to csv-formatted spreadsheet catalog of videos 
    catPath  = root_path + os.path.sep + 'video_catalog.csv'

# Path to image sequence
imPath   = root_path + os.path.sep + 'images'

# Path to output the new videos, which will be analyzed with DLC
vidPath  = root_path + os.path.sep + 'Videos'

# If the output video is to be downsampled
downSample = True

# Number of pixels in vertical dimension, if downsampling
vertPix = 480

# Suffix for source images or movies
suffixIn = 'jpeg'

# Suffix for output movies
suffixOut = 'mp4'

# Number of digits in input image filenames
nDigits = 5

# Prefix at the start of each image filename
prefix = 'DSC'

# Image quality (low to high: 0 to 1) for output video
# imQuality = 0.75
imQuality = 0.35

# Check for paths
if not os.path.isdir(imPath):
    raise ValueError('Image path not found: ' + imPath) 
elif not os.path.isdir(vidPath):
    raise ValueError('Path not found: ' + vidPath) 
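
An alternative to raising an error when the output directory is missing is to create it. The sketch below is one possible arrangement, assuming the same `images` and `Videos` subdirectory names used above; the function name `project_paths` is hypothetical.

```python
import os

def project_paths(root_path, make_output_dir=True):
    """Return (image path, video path) under root_path, checking that they exist."""
    im_path = os.path.join(root_path, "images")
    vid_path = os.path.join(root_path, "Videos")

    # The source images must already exist
    if not os.path.isdir(im_path):
        raise ValueError("Image path not found: " + im_path)

    # The output directory can be created on demand
    if make_output_dir:
        os.makedirs(vid_path, exist_ok=True)
    elif not os.path.isdir(vid_path):
        raise ValueError("Path not found: " + vid_path)

    return im_path, vid_path
```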

## Read catalog of image sequences

Imports the contents of the spreadsheet for the pre-processing job.

In [5]:
if useCat:
    # Open CSV file
    file = open(catPath)

    # Import CSV data
    d = pd.read_csv(file)

    # Extract only the 'make_video==1' rows
    d = d.loc[d.make_video==1]

    # Reset indices for the new rows
    d = d.reset_index(drop=True)

    # Number of videos to analyze
    nVids = int(np.nansum(d.make_video))

    # Extract mandatory parameters
    vDate           = d.date.astype(str)
    trialNum        = d.trial_num.astype(int)
    fpsIn           = d.fps.astype(float)             

    # If roi is provided. Imported as str, to allow for empty cells. Converted to int in next cell
    if 'roi_x' in d:
        roiX = d.roi_x.astype(str)
        roiY = d.roi_y.astype(str)
        roiW = d.roi_w.astype(str)
        roiH = d.roi_h.astype(str)

    # If image filenames are included
    if 'start_image_filename' in d:
        startImageName  = d.start_image_filename.astype(str)
        endImageName    = d.end_image_filename.astype(str)
    else:
        startImageName = None
        endImageName = None
        
    # Close CSV file
    file.close()
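
Because the ROI columns are imported as strings, an empty cell arrives as the text `'nan'`, which cannot be converted to an integer. The sketch below shows one way to convert the four ROI values while tolerating empty cells. The helper `parse_roi` is hypothetical and is not part of `videotools`.

```python
import math

def parse_roi(x, y, w, h):
    """Return [x, y, w, h] as ints, or None if any value is empty or 'nan'."""
    values = []
    for item in (x, y, w, h):
        try:
            number = float(item)
        except (TypeError, ValueError):
            # Empty strings and None cannot be converted
            return None
        if math.isnan(number):
            return None
        values.append(int(number))
    return values

print(parse_roi("100.0", "50", "640", "480"))
print(parse_roi("nan", "nan", "nan", "nan"))
```

Returning `None` for an incomplete ROI lets a row without ROI values be converted without cropping.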

## Generate videos from image sequences

In [6]:
# Verbose mode shows more output (from ffmpeg)
vMode = True

# If using the catalog
if useCat:
    # Loop thru each video listed in catalog where make_video==1
    for i in range(len(vDate)):
        
        # Paths for current output and input videos
        vidOutPath = vidPath + os.path.sep + vDate[i] + '_' + str(trialNum[i]) + '.' + suffixOut
        imagePath = imPath + os.path.sep + vDate[i] + os.path.sep

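        # Start and end frame numbers, parsed from the image filenames in the catalog
        # (requires the start_image_filename and end_image_filename columns)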
        frStart = int(startImageName[i][len(prefix):])
        frEnd   = int(endImageName[i][len(prefix):])

        # Match output with input frame rate
        # fpsOut = fpsIn[i]

        # Define ROI, if needed
        if 'roiX' in locals():
            r = [int(float(roiX[i])), int(float(roiY[i])), int(float(roiW[i])), int(float(roiH[i]))]
            # r = [roiX[i], roiY[i], roiW[i], roiH[i]]
        else:
            r = None

        # Create movie
        vt.vid_from_seq(imagePath, vidOutPath, frStart=frStart, frEnd=frEnd,
                        fps=fpsIn[i], imQuality=imQuality, prefix=prefix,
                        nDigits=nDigits, inSuffix=suffixIn, downSample=downSample,
                        vertPix=vertPix, roi=r, vMode=vMode)
        
        # Report counter
        print('Finished with ' + str(i+1) + ' of ' + str(len(vDate)) + ' videos.')

# If not using the catalog, convert all directories of images in imPath
else:

    # Get all directories in imPath
    imageDirs = glob.glob(imPath  + os.path.sep + '*')

    # Loop thru each directory
    for currDir in imageDirs:

        # Execute, if currDir is not a file
        if os.path.isdir(currDir):
            # Name video after directory
            vidName = os.path.basename(os.path.normpath(currDir))
            vidOutPath = vidPath + os.path.sep + vidName + '.' + suffixOut

            # List of images
            imFiles = glob.glob(currDir + os.path.sep + '*.' + suffixIn)

            # Run, if images present
            if len(imFiles)>0:
                # Create movie
                vt.vid_from_seq(currDir, vidOutPath, imQuality=imQuality,
                                prefix=prefix, nDigits=nDigits, inSuffix=suffixIn,
                                downSample=downSample, vertPix=vertPix, vMode=vMode)

                # Report result
                print('Finished converting ' + currDir)
            else:
                print('No images found in ' + currDir)

    Reading images from: /Users/mmchenry/Documents/Projects/DeepLabCut/preprocessing_examples/image_sequence/images/2022-01-25/
    Making output movie file: /Users/mmchenry/Documents/Projects/DeepLabCut/preprocessing_examples/image_sequence/Videos/2022-01-25_1.mp4
    Completed writing 174 frames
Finished with 1 of 2 videos.
    Reading images from: /Users/mmchenry/Documents/Projects/DeepLabCut/preprocessing_examples/image_sequence/images/2022-01-25/
    Making output movie file: /Users/mmchenry/Documents/Projects/DeepLabCut/preprocessing_examples/image_sequence/Videos/2022-01-25_2.mp4
    Completed writing 124 frames
Finished with 2 of 2 videos.


ffmpeg version 5.0.1 Copyright (c) 2000-2022 the FFmpeg developers
  built with Apple clang version 13.1.6 (clang-1316.0.21.2.5)
  configuration: --prefix=/opt/homebrew/Cellar/ffmpeg/5.0.1_2 --enable-shared --enable-pthreads --enable-version3 --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librist --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libspeex --enable-libsoxr --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack --enable-videotoo
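
For readers curious about what a conversion like this can look like at the ffmpeg level, the sketch below assembles a command for turning a numbered image sequence into a downsampled mp4. This is an assumption about one reasonable approach, not the actual implementation of `vt.vid_from_seq`. The function name, the default `crf` value, and the frame numbers in the example are hypothetical. The ffmpeg options used (`-framerate`, `-start_number`, `-frames:v`, `-vf scale`, `-crf`, `-pix_fmt`) are standard.

```python
import os

def seq_to_video_cmd(image_dir, out_path, fr_start, fr_end, fps,
                     prefix="DSC", n_digits=5, suffix="jpeg",
                     vert_pix=480, crf=28):
    """Build (but do not run) an ffmpeg command for an image sequence."""
    # printf-style pattern, e.g. DSC%05d.jpeg
    pattern = os.path.join(image_dir, f"{prefix}%0{n_digits}d.{suffix}")
    n_frames = fr_end - fr_start + 1
    return [
        "ffmpeg", "-y",
        "-framerate", str(fps),          # input frame rate
        "-start_number", str(fr_start),  # first frame number to read
        "-i", pattern,
        "-frames:v", str(n_frames),      # stop after the last requested frame
        "-vf", f"scale=-2:{vert_pix}",   # set height, keep aspect ratio (even width)
        "-c:v", "libx264",
        "-crf", str(crf),                # higher values give smaller files
        "-pix_fmt", "yuv420p",
        out_path,
    ]

# Hypothetical frame numbers for illustration
cmd = seq_to_video_cmd("images/2022-01-25", "out.mp4", 849, 1022, 30)
print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to execute the conversion.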

# Working with movie files
This example draws from videos saved as individual movie files. Example videos may be downloaded [here](https://drive.google.com/file/d/1_u4vb-7xDtzHxNtlZeRriFbXbdtCaQHN/view?usp=sharing).

## Parameters
Modify the paths listed below for your system.

In [3]:
# Change for each computer
root_path = '/Users/mmchenry/Documents/code/kineKit_files/movie_files'

# Path to csv-formatted spreadsheet catalog of videos 
catPath  = root_path + os.path.sep + 'video_catalog.csv'

# Path to the output (compressed) videos, which will be analyzed with DLC
vidOutPath   = root_path + os.path.sep + 'Compressed_videos'

# Path to the raw input videos
vidInPath  = root_path + os.path.sep + 'Raw_videos'

# If the output video is to be downsampled
downSample = True

# Number of pixels in vertical dimension, if downsampling
vertPix = 480

# Suffix for source images or movies
suffixIn = 'mov'

# Suffix for output movies
suffixOut = 'mp4'

# Image quality (low to high: 0 to 1) for output video
# imQuality = 0.75
imQuality = 0.35

# Whether to rename video files by date, trial_num, and a word
renameFiles = True
fileword = 'light'

## Read catalog of movie files

Imports the contents of the spreadsheet for the pre-processing job.

In [14]:
# Open CSV file
file = open(catPath)

# Import CSV data
d = pd.read_csv(file)

# Extract only the 'make_video==1' rows
d = d.loc[d.make_video==1]

# Reset indices for the new rows
d = d.reset_index(drop=True)

# Number of videos to analyze
nVids = int(np.nansum(d.make_video))

# Extract mandatory parameters
vDate           = d.date.astype(str)
trialNum        = d.trial_num.astype(int)
trialNum        = trialNum.map("{:03}".format)

# If filename not listed in catalog, then specify scheme for filenames
if not ('filename' in d):
    filenameIn = vDate + '_' + trialNum + '_' + fileword

# If input video filename specified in catalog
else:
    filenameIn = d.filename[d.make_video==1].astype(str)  

if renameFiles:
    filenameOut = vDate + '_' + trialNum + '_' + fileword
else:
    filenameOut = filenameIn

# Get start and end frame numbers
if 'frame_start' in d:
    frameStart  = d.frame_start[d.make_video==1].astype(int)
    frameEnd    = d.frame_end[d.make_video==1].astype(int)

# If roi is provided
if 'roi_x' in d:
    roiX = d.roi_x.astype(str)
    roiY = d.roi_y.astype(str)
    roiW = d.roi_w.astype(str)
    roiH = d.roi_h.astype(str)

# Close CSV file
file.close()
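
The default naming scheme in the cell above joins the date, a three-digit zero-padded trial number, and a word. A minimal sketch of the same scheme for a single recording follows; the function name `movie_basename` is hypothetical.

```python
def movie_basename(date, trial_num, word="light"):
    """Return a name such as '2022-01-25_001_light' (without file extension)."""
    return f"{date}_{int(trial_num):03d}_{word}"

print(movie_basename("2022-01-25", 1))  # 2022-01-25_001_light
```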

## Generate videos from movie files

In [15]:
# Verbose mode shows more output from ffmpeg
vMode = False

# Loop thru each video listed in catalog where make_video==1
for i in range(len(vDate)):
    
    # Paths for current output and input videos
    vInPath    = vidInPath + os.path.sep + filenameIn[i] + '.' + suffixIn
    vOutPath   = vidOutPath + os.path.sep + filenameOut[i] + '.' + suffixOut

    # Define ROI, if requested
    if 'roiX' in locals():
        # r = [roiX[i], roiY[i], roiW[i], roiH[i]]
        r = [int(float(roiX[i])), int(float(roiY[i])), int(float(roiW[i])), int(float(roiH[i]))]
    else:
        r = None

    # Create movie
    vt.vid_convert(vInPath, vOutPath, imQuality=imQuality, downSample=downSample, vertPix=vertPix, roi=r, vMode=vMode)

    print('Finished with ' + str(i+1) + ' of ' + str(len(vDate)) + ' videos.')
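
As a rough picture of what cropping and downsampling a movie can look like at the ffmpeg level, the sketch below builds a command with the standard `crop` and `scale` filters. It is an assumption about one possible approach, not the actual implementation of `vt.vid_convert`; the function name and the default `crf` value are hypothetical. Note that ffmpeg's `crop` filter takes `x` and `y` as the upper-left corner of the region.

```python
def convert_cmd(in_path, out_path, roi=None, vert_pix=None, crf=28):
    """Build (but do not run) an ffmpeg command to crop, downsample, and compress."""
    filters = []
    if roi is not None:
        x, y, w, h = roi
        # crop=width:height:x:y, with (x, y) at the upper-left corner
        filters.append(f"crop={w}:{h}:{x}:{y}")
    if vert_pix is not None:
        # Set the height and keep the aspect ratio (even width)
        filters.append(f"scale=-2:{vert_pix}")

    cmd = ["ffmpeg", "-y", "-i", in_path]
    if filters:
        cmd += ["-vf", ",".join(filters)]
    # -an drops any audio track
    cmd += ["-c:v", "libx264", "-crf", str(crf), "-pix_fmt", "yuv420p", "-an", out_path]
    return cmd

print(" ".join(convert_cmd("in.mov", "out.mp4", roi=[100, 50, 640, 480], vert_pix=480)))
```

Cropping before scaling means that `vert_pix` sets the height of the cropped region rather than of the full frame.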


# Manually select a region of interest ------------
You can run this code to interactively select an ROI from the first frame of the first video in the list of movie files. You can then enter the resulting values into the video catalog.

vt.find_roi can accept either an image or movie.

This code can be unreliable on my Mac, so ImageJ or other software may be a better solution for measuring the ROI.

Note that I get better results by pressing the spacebar after selecting the ROI. It sometimes also helps to select the window header before pressing the spacebar.

In [None]:
# Adjust this to the intended movie
inPath = '/Users/mmchenry/Documents/Projects/DeepLabCut/preprocessing_examples/movie_files/Raw_videos/2022-01-25_001.mov'

# Alternatively, use the path to a single image from an image sequence
# inPath = '/Users/mmchenry/Documents/Projects/DeepLabCut/preprocessing_examples/image_sequence/images/2022-01-25/DSC00849.jpeg'

# Find the ROI
r = vt.find_roi(inPath)
print(r)
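
When copying ROI values into the catalog, check which corner the measuring tool reports. The catalog description above refers to the lower-left corner, whereas tools such as OpenCV's `cv.selectROI` and ImageJ report the upper-left corner, with `y` increasing downward. The coordinate convention of `vt.find_roi` is not stated in this notebook, so the sketch below is only a hedged helper for converting between the two conventions; the function name `flip_roi_origin` is hypothetical.

```python
def flip_roi_origin(roi, frame_height):
    """Convert [x, y, w, h] between upper-left and lower-left corner conventions."""
    x, y, w, h = roi
    # The same formula works in both directions
    return [x, frame_height - y - h, w, h]

# Hypothetical ROI on a frame that is 1080 pixels tall
print(flip_roi_origin([100, 50, 640, 480], 1080))
```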