## OCR Code


### Basic Operation

To use this code:
* Ensure that your Python environment has all of the packages listed in the import cell below installed
* Create a lineup.csv and a sightings.csv file following the format of the example files in this repository
* Populate lineuppath, sightingpath, videopath, and writepath according to your computer's filesystem
* Run all of the cells EXCEPT the one specifying lineuppath, sightingpath, etc.
* Run the cell containing lineuppath, sightingpath, etc. to begin <b>ocr_complete</b>, and wait for it to finish. Depending on the number of sightings and videos provided, this process may take several minutes
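
The first step can be checked before running anything else. The snippet below is a minimal sketch: `missing_packages` is a hypothetical helper that is not part of this repository, and it only reports which modules from the import cell cannot be found in the current environment.

```python
import importlib.util

def missing_packages(names):
    """Return the module names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Third-party module names taken from the import cell
required = ['cv2', 'numpy', 'scipy', 'matplotlib', 'pytesseract']
print(missing_packages(required))
```

An empty list means every module was found. Note that `pytesseract` is only a wrapper, so the Tesseract program itself must also be installed on the computer.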

### Background

In order to estimate the size of a specimen on camera, the distance between the camera and that specimen must be known. When recording with the Splash Drone 2, proprietary flight log formats prevented us from extracting altitude data directly from the drone. As a workaround, we developed an OCR function that reads altitude information from the video stream that the Splash Drone 2 displays on its transmitter during flight.

This code requires:
 * An HD video stream from the drone
 * An SD video stream, originally displayed on the drone's transmitter screen and saved on an external computer. This is the stream that contains flight information such as altitude and battery status.
 * Two separate csv files: one listing the clips of the HD video being examined (in our case, any point where a whale appears on screen), and one giving the lineup point between the HD and SD videos (e.g. second X of HD video 01 corresponds to second Y of the SD video).
 
Format examples for the .csv files are included in this repository.
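
The lineup point is what allows a time in the HD video to be translated into a time in the SD stream. Below is a minimal sketch of that arithmetic, mirroring what `ocr_complete` does further down. The helper names and the lineup values (10 s in the HD video matching 61 s in the stream) are made up for illustration.

```python
def hms_to_seconds(timestamp):
    """Convert an 'HH:MM:SS' string into seconds."""
    hours, minutes, seconds = timestamp.split(':')
    return float(hours) * 3600 + float(minutes) * 60 + float(seconds)

def hd_to_stream_time(hd_time, hd_lineup, stream_lineup):
    """Translate a time in the HD video into the matching time in the SD stream."""
    # The offset is how far the HD clock runs ahead of the stream clock
    offset = hd_lineup - stream_lineup
    return hd_time - offset

# Hypothetical sighting starting at 00:00:00 in the HD video
start = hms_to_seconds('00:00:00')
print(hd_to_stream_time(start, hd_lineup=10.0, stream_lineup=61.0))  # 51.0
```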

In [70]:
# Only run this cell after running all of the cells below

masterpath = '/home/devynn/github/Github/omurasWhaleAnalysis/example_data/' # All of my paths have the same start up until a few folders, so I just have it once here

lineuppath = masterpath + 'csv/ocr-lineup-example.csv' # Path to lineup csv file
sightingpath = masterpath + 'csv/sightings-example.csv' # Path to sightings csv file
videopath =  masterpath + 'videos/' # Path to folder containing all sd and hd videos mentioned in lineup and sighting csvs
writepath = masterpath + 'ocr_example_results/' # Path you wish to save your image/csv results

ocr_complete(sightingpath, lineuppath, videopath, writepath)

Now working with video ocr_hd_01.MP4
hd start time is 0.0, hd end time is 10.0
stream start time is 51.0, stream end time is 61.0
Working on OCR....
Done!
The path to the corrected file is: ocr_hd_01_0_corrected.csv
savepath: /home/devynn/github/Github/omurasWhaleAnalysis/example_data/ocr_example_results/ocr_hd_01_0/
Now working with video ocr_hd_02.MP4
hd start time is 23.0, hd end time is 33.0
stream start time is 26.0, stream end time is 36.0
Working on OCR....
Done!
The path to the corrected file is: ocr_hd_02_552_corrected.csv
savepath: /home/devynn/github/Github/omurasWhaleAnalysis/example_data/ocr_example_results/ocr_hd_02_552/
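
Each sighting gets its own results folder, named after the HD video and the first HD frame of the sighting. The sketch below reproduces that naming. `sighting_folder_name` is a hypothetical helper, and the 24 fps value is an assumption, chosen only because it is consistent with the `552` suffix in the output above (23 s at 24 fps).

```python
def sighting_folder_name(hdname, start_seconds, hd_fps):
    """Build the results folder name for one sighting."""
    # First HD frame of the sighting, rounded to a whole frame
    start_frame = round(hd_fps * start_seconds)
    return hdname.split('.')[0] + '_' + str(start_frame)

print(sighting_folder_name('ocr_hd_02.MP4', 23.0, 24.0))  # ocr_hd_02_552
```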


In [67]:
# import cell
import cv2, numpy, scipy, os, csv
import math
import matplotlib as mpl
from matplotlib import pyplot as plt
import pytesseract

In [71]:
def ocr_complete(sightingpath, lineuppath, readpath, savepath):
    """Function that finds all whale sightings from csv, uses OCR to detect the altitude of the drone during those sightings, and writes information to grabbed video frames
    
        Inputs:
            sightingpath: Path to SD3 Sightings.csv
            lineuppath: Path to SD3 Stream Sync.csv
            readpath: Path pointing to videos being processed
            savepath: Path pointing to save location
            
        """
    
    # Grab videoname, path, sighting start time, and sighting end time for each item in sighting directory
    with open(sightingpath, 'r', newline = '') as sightingdirectory:
        csvreader = csv.reader(sightingdirectory)
        
        sightingcollection = []
        
        # Create sighting list
        for row in csvreader:
            if row[0] == 'Filename':
                pass
            else:
                # Add [hd file name, sighting start time, sighting end time] to sightingcollection
                sightingcollection.append([row[0], row[1], row[2]])

    # Match up information from sightingdirectory and lineup csv file
    with open(lineuppath, 'r', newline = '') as lineupdirectory:
        csvreader2 = csv.reader(lineupdirectory)
            
        for row in csvreader2:
            for sighting in sightingcollection:
                if row[0] == sighting[0]:
                    try:
                        sighting.extend((row[1], row[5], row[6]))
                    except IndexError:
                        # Lineup row is too short; the sighting is filtered out below
                        pass

    # Format of sightingcollection is [[hd file name, sighting start time, sighting end time, stream filename, hd lineup time, stream lineup time],...]
    # Double check sightingset for proper number of elements
    sightingset = [sighting for sighting in sightingcollection if len(sighting) == 6]
    
    # Main video processing section
    for video in sightingset:
        hdname = video[0]
        hdpath = readpath + hdname
        print('Now working with video ' + hdname)
        
        hdvideo = cv2.VideoCapture(hdpath)
        hd_fps = hdvideo.get(cv2.CAP_PROP_FPS)
        
        streamname = video[3]
        streampath = readpath + streamname
        
        streamvideo = cv2.VideoCapture(streampath)
        stream_fps = streamvideo.get(cv2.CAP_PROP_FPS)
        streamcreation = os.path.getmtime(streampath)
        
        hdvideo.release()
        streamvideo.release()
        
        offset = float(video[4]) - float(video[5])
        
        # Calculate sighting start and end times in hd video in seconds to translate to frames
        sightingstart = video[1].split(':')
        sightingend = video[2].split(':')
        
        hdstarttime = float(sightingstart[0])*3600 + float(sightingstart[1])*60 + float(sightingstart[2])
        hdendtime = float(sightingend[0])*3600 + float(sightingend[1])*60 + float(sightingend[2])
        streamstarttime = hdstarttime - offset
        streamendtime = hdendtime - offset
        print('hd start time is ' + str(hdstarttime) + ', hd end time is ' + str(hdendtime))
        print('stream start time is ' + str(streamstarttime) + ', stream end time is ' + str(streamendtime))
        
        # If the sighting is longer than 300 seconds, skip it and move on to the next one. This is a pretty arbitrary value and can be changed around.
        if hdendtime - hdstarttime > 300:
            print('Video ' + hdname + ' was too large.')
            continue
        
        # Calculate frame start/end points
        hdframestart = round(hd_fps*hdstarttime)
        hdframeend = round(hd_fps*hdendtime)
                
        hdframestotal = hdframeend-hdframestart
        streamframestotal = math.ceil((hdframestotal*stream_fps)/hd_fps)
                
        streamframestart = round(stream_fps*streamstarttime)

        # Make designated folders for files created in altfinder and hdvideolabeling functions called below
        try:
            os.mkdir(savepath + hdname.split('.')[0] + '_' + str(hdframestart))
        except FileExistsError:
            pass
        
        try:
            os.mkdir(savepath + hdname.split('.')[0] + '_' + str(hdframestart) + '/ocrdictionary') 
        except FileExistsError:
            pass
        
        # Savepaths for hd video frames and ocr information
        hdsavepath = savepath + hdname.split('.')[0] + '_' + str(hdframestart) + '/'
        sdsavepath = hdsavepath + 'ocrdictionary/'
        
        sightinginfo = [hdframestart, hdframestotal, streamframestart, streamframestotal]
                            
        ocrfilename = altfinder(hdpath, streampath, offset, streamcreation, sightinginfo, sdsavepath, saveimg = False)
        
        ocr_corrected_name = sdsavepath + ocrfilename
        
        hdvideolabeling(hdsavepath, hdpath, ocr_corrected_name, hdframestart, hdframestotal, stream_fps, False)

def altfinder(hdpath, streampath, offset, creationtime, sightings, savingpath, saveimg = False, maxalt = 50):
    """Uses OCR to find altitudes in stream video
    
    Inputs:
        -hdpath: Path to HD SD3 video + video title
        -streampath: Path to stream SD3 video + video title
        -offset: Offset time between hd and stream SD3 videos in seconds
        -creationtime: Time SD3 stream was created, in Unix time
        -sightings: List containing frame windows of whale sightings, for both hd and stream videos
                    formatted as [hd start frame, number of hd frames, stream start frame, number of stream frames]
        -savingpath: Path to save output csv
        -saveimg: Boolean value, saves analyzed stream frames to savingpath if set to True
        -maxalt: The maximum altitude reasonably attained during the flight. Default 50m
        
    Outputs:
        - filenames: name of CSV file output by function
        - CSV file containing calculated Unix times of stream frames and corresponding altitudes
        - If saveimg is True, altfinder will save a set of images to the same folder of the csv,
            containing frame/time/altitude information in the file name
        """
    
    # Open video
    hdname = hdpath.split('/')[-1]
    streamvideo = cv2.VideoCapture(streampath)
    streamfps = streamvideo.get(cv2.CAP_PROP_FPS)
    
    # Process the single sighting window described by sightings
    print('Working on OCR....')
    
    # Jump to correct frame in stream video
    streamvideo.set(cv2.CAP_PROP_POS_FRAMES, sightings[2])

    # Establish frame count, frame flag, alt and time lists
    count = 0
    frameflag = sightings[3]
    altlist = []
    timelist = []

    while(True):

        if count > frameflag: #Stop once endpoint has been reached
            break

        # Determine Unix time of a particular frame
        time = creationtime + (sightings[2] + count)/streamfps
        frame = sightings[2] + count

        # Read frame; stop if the stream runs out before the sighting window ends
        rval, frametest = streamvideo.read()
        if not rval:
            break

        # Convert color to greyscale, isolate desired portion for OCR, preprocess, and analyze
        frametestcvt = cv2.cvtColor(frametest, cv2.COLOR_BGR2GRAY)
        frameslice = frametestcvt[295:330, 55:175]
        framebigger = cv2.resize(frameslice, None, fx=3.5, fy=6.5, interpolation=cv2.INTER_AREA)
        frameblurrer = cv2.bilateralFilter(framebigger,9,90,75)
        
        # Inverse Binary Threshold to clean up image
        ret, threshtest = cv2.threshold(frameblurrer, 180, 255, cv2.THRESH_BINARY_INV)

        # OCR command, whitelisting only the digits that could show up
        alt = pytesseract.image_to_string(threshtest, config = '--psm 13 --oem 0 -c tessedit_char_whitelist=-1234567890.')

        # Check if alt is convertible to float - if not, or if greater than maxalt, dump it as it is likely an error
        try:
            alt = float(alt)
            if alt > maxalt:
                alt = 0.0
        except ValueError:
            alt = 0.0
        
        # If set to save images, frames of stream video will be saved with frame number and altitude in file name
        if saveimg:
            imagename = str(frame) + '_' + str(alt) + '.png'
            cv2.imwrite(savingpath + imagename, threshtest)

        # Append altitude and unix times to lists for later csv export
        altlist.append(alt)
        timelist.append(time)

        # Up count for loop
        count += 1

    filename = hdname.split('.')[0] + '_' + str(sightings[0]) + '.csv'

    csvname = savingpath + filename

    corrected_name = filename.split('.')[0] + '_corrected.csv'

    # Write csv files of individual sightings
    with open(csvname, 'w', newline = '') as alttimecsv:
        writer = csv.writer(alttimecsv, delimiter = ',')
        writer.writerow(timelist)
        writer.writerow(altlist)

    # Call csvcorrected function to save corrected csv file in same location as raw OCR csv file
    csvcorrected(csvname)

    print('Done!')
    # Release video
    streamvideo.release()
    print("The path to the corrected file is: " + corrected_name)
    return corrected_name

def hdvideolabeling(savepath, hdfile, OCRcorrected, startframe, frameflag, dataframerate, display = False):
    """Takes data collected via OCR and prepped with csvcorrected and framecalculations, as well as sighting data from
    altfinder to save a series of HD frames with frame number, unix timestamps, and altitude
    
    Inputs
        -savepath: path to save video labels
        -hdfile: path to hd sd3 video whose frames will be saved
        -OCRcorrected: path to csv file containing corrected OCR information and unix timestamps
        -startframe: Beginning frame of HD video
        -frameflag: Number of hd frames in sighting, i.e. the number of frames to be labeled in the hd video
        -dataframerate: framerate of data stream, passed on to framecalculations function
        -display: boolean passed onto framecalculations, default False. If True, prints frame allocation information
        """
    
    # Establish videocapture object and set stream to start at correct frame, grab framerate
    framerange = frameflag
    
    hdvideo = cv2.VideoCapture(hdfile)
    hdvideo.set(cv2.CAP_PROP_POS_FRAMES, startframe)
    hd_fps = hdvideo.get(cv2.CAP_PROP_FPS)
    
    # Establish frame count, frame flag, and ocr list
    count = 0
    
    ocrinfo = []
    
    # Move csv information into list of lists 
    with open(OCRcorrected, 'r', newline = '') as altitudes:
        reader = csv.reader(altitudes)
        for row in reader:
            ocrinfo.append(row)
    
    # Grab OCR data
    ocrtimes = ocrinfo[0]
    ocralts = ocrinfo[-1]
    
    # Feed frame rate info into framecalculations to get frame lineup info
    framesplitinfo = framecalculations(dataframerate, hd_fps, framerange, display = display)

    # Get those frames
    print("Saving to: " + savepath)
    while True:
        if dataframerate < hd_fps:
        # Break if frameflag is reached, update framecount if it falls behind framesplitinfo
            if count > framesplitinfo[-1][3] - 1:
                break
            else:
                frametime = round((float(ocrtimes[count]) * framesplitinfo[count][0] + float(ocrtimes[count + 1]) * framesplitinfo[count][1])/(dataframerate/hd_fps))
                framealt = round((float(ocralts[count]) * framesplitinfo[count][0] + float(ocralts[count + 1]) * framesplitinfo[count][1])/(dataframerate/hd_fps),2)
        
        else:    
        # Break if frameflag is reached, update framecount if it falls behind framesplitinfo
            if count > frameflag-2:
                break
            elif count < framesplitinfo[count][3]:
                pass
            
            frametime = round((float(ocrtimes[count]) * framesplitinfo[count][0] + float(ocrtimes[count + 1]) * framesplitinfo[count][1] + float(ocrtimes[count + 2]) * framesplitinfo[count][2])/(dataframerate/hd_fps))
            framealt = round((float(ocralts[count]) * framesplitinfo[count][0] + float(ocralts[count + 1]) * framesplitinfo[count][1] + float(ocralts[count + 2]) * framesplitinfo[count][2])/(dataframerate/hd_fps),2)
        
        imagename = str(startframe+count) + '_' + str(frametime) + '_' + str(framealt) + '.png'
        rval, hdframe = hdvideo.read()
        if not rval:
            break

        # Write that image
        cv2.imwrite(savepath + imagename, hdframe)
        
        count += 1
        
    # Release video
    hdvideo.release()
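
The sanity check that `altfinder` applies to each OCR reading can be looked at in isolation. The sketch below is a standalone restatement of that step, assuming the raw OCR result arrives as a string. `parse_altitude` is a hypothetical name and is not called anywhere in this notebook.

```python
def parse_altitude(ocr_text, maxalt=50):
    """Turn raw OCR text into an altitude, using 0.0 to flag an unusable reading."""
    try:
        alt = float(ocr_text)
    except ValueError:
        # Text that is not a number is treated as an OCR error
        return 0.0
    # Readings above the plausible maximum are also treated as OCR errors
    return alt if alt <= maxalt else 0.0

for text in ['12.5\n', '1 2.5', '120']:
    print(repr(text), '->', parse_altitude(text))
```

As in `altfinder`, only the upper bound is checked, so a negative reading passes through unchanged.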

In [72]:
def framecalculations(dataframerate, videoframerate, framerange, display = False):
    """Tool for assigning altitude and time information to hd video frames despite having different framerate from stream source
    
    Inputs:
        -dataframerate: Framerate of data stream from OCR work. May be lower than, equal to, or higher than videoframerate
        -videoframerate: Framerate of hd video stream
        -framerange: Number of hd frames being analyzed
        -display: Boolean value, if True prints out the ratios of frames, if False returns nothing (default false)
    Outputs:
        -proportions: a list of lists detailing how much each data frame lines up with an hd frame
            format is [[portion of frame[count], portion of frame[count+1], portion of frame[count+2], count], ...]
        """
    if dataframerate < videoframerate:
        # Grab total number of hd frames from framerange list
        frames = framerange
        
        step = dataframerate/videoframerate
        
        a = step
        b = 0
        c = step
        count = 0
        
        proportions = [[a, b, c, count]]
        
        if display == True:
            print('a = ' + str(a) + ', frame %s' % str(count))
            print('b = ' + str(b) + ', frame %s' % str(count + 1))
            print('c = ' + str(c) + ', count')
        
        for loop in range(frames):
            frameratio = step * loop
            if (c + step) <= 1:
                a = step
                b = 0
                c += step
                count += 0
                proportions.append([a, b, c, count])
            
                if display == True:
                    print('a = ' + str(a) + ', frame %s' % str(count))
                    print('b = ' + str(b) + ', frame %s' % str(count + 1))
                    print('c = ' + str(c) + ', count')
            elif (c+step) > 1:
                a = 1 - c
                b = step - a
                c = b
                count += 0
                proportions.append([a, b, c, count])
                
                if display == True:
                    print('a = ' + str(a) + ', frame %s' % str(count))
                    print('b = ' + str(b) + ', frame %s' % str(count + 1))
                    print('c = ' + str(c) + ', count')
                
                count += 1
                

    elif dataframerate == videoframerate:
        # I doubt this would ever happen, but here's code for the corner case anyway
        frames = framerange
        
        total = dataframerate/videoframerate
        
        a = 1
        b = 0
        c = 0
        count = 0
        
        proportions = [[a, b, c, count]]
        
        for loop in range(frames):
            count += 1
            proportions.append([a, b, c, count]) # a, b, c, don't change when framerates are equal
        
    else:
        # Grab total number of hd frames from framerange list
        frames = framerange

        # Because hdfps != stream fps,frames do not line up exactly.
        # a, b, and c are coefficients representing the proportion of a stream frame being used to calculate the value of an hd frame

        # Starting conditions: hd frame takes info from two adjacent stream frames
        a = 1
        b = dataframerate/videoframerate - 1
        step = b
        total = dataframerate/videoframerate
        c = 0
        count = 0

        # Create proportions as a list of lists to be returned at the end of the function
        proportions = [[a, b, c, count]]

        # Displays a, b, c values and which frames those coefficients correspond to if display == True
        if display == True:
            print('a = ' + str(a) + ', frame %s' % str(count))
            print('b = ' + str(b) + ', frame %s' % str(count + 1))
            print('c = ' + str(c) + ', frame %s' % str(count + 2))

        # Loop calculating all a, b, c values for a given number of frames, assuming starting conditions created above
        for loop in range(frames):            
            count += 1
            # Various if statements prevent more than 100% of a stream frame being used, from negative values occurring, etc
            if b == 1:
                if (a - step) < 0:
                    count += 1
                    a = total - (c + step)
                    b = c + step
                    c = 0
                    proportions.append([a, b, c, count])
                else:
                    a -= step
                    b = 1
                    c += step
                    proportions.append([a, b, c, count])
            elif (b + step) > 1:
                a -= step
                b = 1
                c = total - (a + b)
                proportions.append([a, b, c, count])
            else:
                a -= step
                b += step
                c = 0
                proportions.append([a, b, c, count])

            if display == True:
                frame1 = str(count)
                frame2 = str(count + 1)
                frame3 = str(count + 2)

                print('a = ' + str(a) + ', frame %s' % str(count))
                print('b = ' + str(b) + ', frame %s' % str(count + 1))
                print('c = ' + str(c) + ', frame %s' % str(count + 2))

    # Return list of lists for use in hdvideolabeling function
    return proportions
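
For comparison, the same goal (assigning a stream altitude to each HD frame when the frame rates differ) can be approached with plain linear interpolation. The sketch below is a simpler alternative and not the weighting scheme that `framecalculations` implements. `interpolate_altitude` is a hypothetical helper, and it assumes both videos start at the same instant.

```python
def interpolate_altitude(hd_frame, hd_fps, stream_fps, stream_alts):
    """Linearly interpolate the stream altitude at the time of one HD frame."""
    # Position of the HD frame on the stream's frame axis (may be fractional)
    pos = hd_frame * stream_fps / hd_fps
    last = len(stream_alts) - 1
    lower = min(int(pos), last)
    upper = min(int(pos) + 1, last)
    # Fraction of the way from the lower stream frame to the upper one
    weight = pos - int(pos)
    return (1 - weight) * stream_alts[lower] + weight * stream_alts[upper]

alts = [10.0, 12.0, 14.0]  # made-up altitudes for three stream frames
print(interpolate_altitude(1, hd_fps=60, stream_fps=30, stream_alts=alts))  # 11.0
```

HD frames that fall beyond the last stream frame are clamped to the last available altitude.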

In [73]:
def csvcorrected(csvfile):
    """Examines the csv created by altfinder and attempts to smooth potentially mislabeled altitudes
    
    Inputs:
        -csvfile: path pointing to the csv file to be analyzed 
    
    Outputs:
        -A list of corrected altitudes saved as a new csvfile
    """
    
    corrected_list = []
    
    # Collect info from csv in list for easier use
    with open(csvfile, 'r', newline = '') as uncorrected:
        checker = csv.reader(uncorrected)
        for row in checker:
            corrected_list.append(row)
    
    altitudes = corrected_list[-1].copy()

    # Check each altitude against the one(s) after it and overwrite values flagged as OCR errors
    for i in range(len(altitudes)):
        val1, val2 = altitudecorrection(altitudes, i, 1)
        altitudes[i] = val1
        try:
            altitudes[i+1] = val2
        except IndexError:
            pass

    correctedname = csvfile.split('.')[0] + '_corrected.csv'
    
    # Create corrected csv
    with open(correctedname, 'w', newline = '') as corrected:
        writer = csv.writer(corrected)
        writer.writerow(corrected_list[0])
        writer.writerow(altitudes)
        
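def checkvalue(alts, index):
    """Return alts[index] as a float, falling back to the next entry if it cannot be converted"""
    try:
        altvalue = float(alts[index])
    except ValueError:
        # Unreadable entry: fall back to the next value in the list
        altvalue = checkvalue(alts, index + 1)
    
    return altvalue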
        
def altitudecorrection(altlist, altindex, num = 1):
    """Function for detecting errors in OCR csvs
    
    Behavior: Recursive, and meant to be called in a for-loop.
        -Takes a list of altitudes grabbed from SD3 stream video using OCR, an index number, and an index difference value
        -Grabs two values from the altitude list, at the index number and the index number + difference value
            -e.g. if altindex = 22 and num = 1, it grabs the 23rd and 24th elements of the list
        -Compares those two values
            -if the values can be compared and the difference between them is acceptable, return both values
                -if the values are too far apart (and thus an OCR error is expected), check the next closest index value and try the function again (see below)
            -if the values cannot be compared, compares the first value with the next closest index value ahead of the previous one
                -e.g. 23rd and 25th values are compared instead of the 23rd and 24th
                -if those two values can be compared and the difference between them is acceptable, the in-between value is now the average between the two checked values
                -if the values aren't acceptable, keep looking at further and further index values until acceptable one found
    
    Inputs:
        -altlist: List of altitude measurements captured with OCR
        -altindex: Index value of altlist being checked 
        -num: Difference in index value between altindex and second altitude measurement, default 1
    
    Outputs:
        - Returns two altitude measurements for appending into corrected list
        """
    
    # Calculate maximum acceptable difference between values
    maxdifference = num * .5
    
    altvalue_float = checkvalue(altlist, altindex)
    
    if altvalue_float == 0:
        return altitudecorrection(altlist, altindex+1, 1)
    
    # If second value won't return float, call function again but with num + 1
    # If an IndexError pops up, return the first altvalue twice
    try:
        altvalue2_float = float(altlist[altindex + num])
    except ValueError:
        return altitudecorrection(altlist, altindex, num + 1)
    except IndexError:
        return altvalue_float, altvalue_float
        
    # Calculate the delta value
    delta = altvalue_float - altvalue2_float
    
    # If delta value is within allowable margin, return altlist and altlist + 1 without any changes

    if num <= 1:
        if abs(delta) < maxdifference:
            return altvalue_float, altvalue2_float
        else:
            # If the difference is too big, iterate again with num + 1
            return altitudecorrection(altlist, altindex, num + 1)
    else:
        # If num > 1, either return first altvalue and approximated second value or iterate again
        if abs(delta) < maxdifference:
            return altvalue_float, round((altvalue_float - delta/num),2)
        else:
            return altitudecorrection(altlist, altindex, num + 1)
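
The core test inside `altitudecorrection` can be illustrated without the recursion. The sketch below is a minimal standalone version of the `num > 1` branch: the allowed difference grows by 0.5 per index of separation, and an accepted pair is used to estimate the reading that directly follows the first one. `bridge_gap` is a hypothetical name and the altitudes are made up.

```python
def bridge_gap(first, later, gap, max_step=0.5):
    """Estimate the reading right after `first`, or return None if the pair is implausible.

    `later` is the reading found `gap` indices after `first`.
    """
    delta = first - later
    if abs(delta) >= gap * max_step:
        # Too large a jump for this separation: treat as an OCR error
        return None
    # Step one index along the straight line from `first` to `later`
    return round(first - delta / gap, 2)

print(bridge_gap(10.0, 10.6, gap=2))  # 10.3
print(bridge_gap(10.0, 14.0, gap=2))  # None
```

When the function returns `None`, `altitudecorrection` would instead widen the gap by one and try again.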

    