### Convert HDF5 to CSV
#### Implementation notes
- Code must be run in Python 2.
- If MultiTracker fails to run, check whether it conflicts with OpenCV2. If it does, edit your `.bashrc` file to change the PATH to ROS for the duration of this script only, then run `source ~/.bashrc` to apply the change.

#### List of tasks accomplished in this Jupyter Notebook:
- Translate Multitracker hdf5 files into Pandas dataframes and save to CSV
- Double check that all animals have one acclimate and experiment CSV file
- Find videos with quiescent animals at beginning of video
- Manually correct videos with quiescent animals at beginning of video

In [1]:
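# __future__ imports belong before all other statements
from __future__ import division
import glob, math, os
import numpy as np
import pandas as pd
# Only VideoFileClip is used below, so import it explicitly instead of `*`
from moviepy.editor import VideoFileClip
import multi_tracker_analysis as mta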

1.5.1
recommended version: 1.1.1 or greater


- Translate Multitracker hdf5 files into Pandas dataframes and save to CSV

In [2]:
df = pd.read_csv("./data/experiment_IDs/cleaned_static_data.csv")
print len(df), 'new animals to analyze'

for i, row in df.iterrows(): 
    pos = row['animal_ID'].split('-')[-1]
    num = row['animal_ID'].split('-')[-2]
    dat = row['animal_ID'].split('-')[0]
    
    for val, pos2 in zip(["A", "E"], ['acclimate', 'experiment']):
        filen = "/home/eleanor/Downloads/analysis_files_reviewed/"+dat+'-'+num+'-'+val+'-'+pos 
        video = "/home/eleanor/Downloads/videos/"+dat+'-'+num+'-'+val+'-'+pos
        save = "./data/trajectories/video_csvs/"+dat+'-'+num+'-'+pos+'-'+pos2

        filename = filen + "/data/"
        videoname = video + ".avi"
        savename = save + ".csv"
        
        # Do not overwrite CSV files that have already been made
        if not os.path.isfile(savename):
            print(filename)
            print(videoname)
            print(savename)
            try:
                # Use a separate name so the `df` of animal IDs being iterated over is not overwritten.
                tracks, config = mta.read_hdf5_file_to_pandas.load_and_preprocess_data(filename)
                video_clip = VideoFileClip(videoname)
                frame = np.array(video_clip.get_frame(0).astype(float))
                video_width, video_height = len(frame[0]), len(frame)

                tracks["pixel_width"] = video_width
                tracks["pixel_height"] = video_height

                # Mark frames detected by Multitracker to differentiate from manually entered frames.
                tracks['manual_tracker_fix'] = False

                # Remove default columns added by Multitracker that are wrong for mosquito larvae.
                del tracks['angle'], tracks['area'], tracks['time_epoch'], tracks['time_epoch_nsecs'], \
                    tracks['time_epoch_secs'], tracks['speed'], \
                    tracks['velocity_x'], tracks['velocity_y']

                tracks.to_csv(savename, index=False)
                del video_clip.reader, video_clip
            except Exception as err:
                # Report the failure instead of silently skipping the file.
                print("Could not convert {}: {}".format(savename, err))
            
print("--- All files converted ---")

499 new animals to analyze
--- All files converted ---
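The path construction inside the loop above can be factored into a small helper, which makes the naming scheme easier to test on its own. The following is a minimal sketch, not part of the original notebook: `build_paths`, its argument names, and the example ID `190101-03-left` are hypothetical, and it assumes animal IDs have the form `date-number-position`.

```python
import os

def build_paths(animal_id, analysis_root, video_root, csv_root):
    """Return tracking, video, and CSV paths for both recording phases of one animal.

    Hypothetical helper: mirrors the string concatenation used in the loop above.
    """
    parts = animal_id.split("-")
    dat, num, pos = parts[0], parts[-2], parts[-1]
    paths = {}
    for code, phase in zip(["A", "E"], ["acclimate", "experiment"]):
        # Input names use the one-letter code; output CSV names use the full phase name.
        stem = "-".join([dat, num, code, pos])
        paths[phase] = {
            "tracking_dir": os.path.join(analysis_root, stem, "data") + "/",
            "video": os.path.join(video_root, stem + ".avi"),
            "csv": os.path.join(csv_root, "-".join([dat, num, pos, phase]) + ".csv"),
        }
    return paths

# Made-up ID and directories, for illustration only
paths = build_paths("190101-03-left", "/analysis", "/videos", "./csv")
```

Keeping the naming logic in one place means a change to the file layout only has to be made once.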


- Double check that all animals have one acclimate and experiment CSV file
- Find videos with quiescent animals at beginning of video

Multitracker does not register objects until they begin to move. In some videos, larvae do not start to move until several seconds into the experiment. Each of these videos was manually inspected to confirm the lack of movement in the initial frames. The position at which the tracker first detected the larva was then propagated back to the beginning of the video. All manually corrected frames are marked with `['manual_tracker_fix'] == True`.

In [5]:
df = pd.read_csv("./data/experiment_IDs/cleaned_static_data.csv")
print len(df), 'animals to analyze'

# Collect the IDs of all animals that have an acclimate or an experiment CSV file
acc_files = glob.glob("./data/trajectories/video_csvs/*-acclimate.csv")
exp_files = glob.glob("./data/trajectories/video_csvs/*-experiment.csv")
acc_filestr = [x.split("video_csvs/")[-1].split("-acclimate")[0] for x in acc_files]
exp_filestr = [x.split("video_csvs/")[-1].split("-experiment")[0] for x in exp_files]

# Print the names of any files that have fewer than framemin frames. 
# 1800 frames total for a 15 minute video.
# Maximum 2 seconds data missing per video determined to be ok.

framemin = 1795
fnames = glob.glob("./data/trajectories/video_csvs/*.csv")
print len(fnames), 'files to analyze'

for name in sorted(fnames)[::-1]:
    if os.path.isfile(name):
        # Use a separate name so the `df` of animal IDs loaded above is not overwritten.
        traj = pd.read_csv(name)
        missed = framemin - len(traj)
        if missed > 0:
            print os.path.basename(name), ':', missed, 'frames fewer than minimum'
        
print("--- All files checked ---")

499 animals to analyze
998 files to analyze
--- All files checked ---
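The cell above builds `acc_filestr` and `exp_filestr` but never compares them against the list of animals. A minimal sketch of that comparison follows. The function name `unpaired_ids` and the toy IDs are hypothetical, and the sketch assumes that the ID embedded in each CSV file name matches the `animal_ID` column exactly.

```python
def unpaired_ids(expected_ids, acc_ids, exp_ids):
    """Report animals that lack an acclimate or experiment CSV, and CSVs with no matching animal."""
    expected, acc, exp = set(expected_ids), set(acc_ids), set(exp_ids)
    return {
        "missing_acclimate": sorted(expected - acc),
        "missing_experiment": sorted(expected - exp),
        "unexpected": sorted((acc | exp) - expected),
    }

# Toy example with made-up IDs: the second animal has no experiment file
report = unpaired_ids(
    expected_ids=["190101-01-left", "190101-01-right"],
    acc_ids=["190101-01-left", "190101-01-right"],
    exp_ids=["190101-01-left"],
)
```

In the notebook this could be called as `unpaired_ids(df['animal_ID'], acc_filestr, exp_filestr)`, provided `df` still holds the table of animal IDs. Three empty lists would mean every animal has exactly one file of each type.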


- Manually correct videos with quiescent animals at beginning of video

Once a video has been manually checked and confirmed to have a long period of no motion at the beginning, the first tracked data point can be propagated backwards automatically to the beginning of the video.

In [6]:
checked = pd.read_csv('./data/trajectories/manually_checked_beginning_pause.csv')
checked_fnames = checked['filename'].values
print len(checked_fnames), 'files to analyze'

for name in checked_fnames: 
    fname = "./data/trajectories/video_csvs/"+name+".csv"
    assert os.path.isfile(fname)
    
    df = pd.read_csv(fname)
    fmin = df["frames"].min()
    fmin_row = df[df['frames'] == fmin]
    assert len(fmin_row) == 1

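    if fmin > 2:
        # Copy the first detected position back to each missing frame, starting at frame 2.
        padding = []
        for n in range(2, fmin):
            fmin_copy = fmin_row.copy()
            # Assign whole columns: the copy has a single row whose index label is not necessarily 0.
            fmin_copy["objid"] = "manual"
            fmin_copy["frames"] = n
            # Flag the row as manually entered, as described in the text above.
            fmin_copy["manual_tracker_fix"] = True
            padding.append(fmin_copy)

        df = pd.concat([df] + padding).sort_values(by="frames")
        df.to_csv(fname, index=False)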

print("--- All files converted ---")

232 files to analyze
--- All files converted ---
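The back-propagation step can be illustrated on a toy trajectory without pandas or any files on disk. This is a sketch under stated assumptions rather than the notebook's implementation: `backfill_start` is a hypothetical helper, rows are plain dictionaries, and the `position_x` and `position_y` column names are used for illustration only.

```python
def backfill_start(rows, first_frame=2):
    """Copy the earliest tracked row back to first_frame and flag the copies as manual."""
    earliest = min(rows, key=lambda r: r["frames"])
    padding = []
    for n in range(first_frame, earliest["frames"]):
        filled = dict(earliest)              # keep the position columns unchanged
        filled["frames"] = n
        filled["objid"] = "manual"
        filled["manual_tracker_fix"] = True  # distinguish from Multitracker detections
        padding.append(filled)
    return sorted(padding + rows, key=lambda r: r["frames"])

# Toy trajectory: the larva is first detected at frame 5
track = [
    {"frames": 5, "objid": 1, "position_x": 10.0, "position_y": 20.0, "manual_tracker_fix": False},
    {"frames": 6, "objid": 1, "position_x": 11.0, "position_y": 20.5, "manual_tracker_fix": False},
]
filled = backfill_start(track)
```

Frames 2 to 4 are filled with the frame-5 position and flagged as manual, while the two tracked rows are left unchanged. Running the function again on its own output adds nothing, because the earliest frame is then already `first_frame`.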
