# Time Stamp Extract

Extracts the timestamps from the exported Trodes files of each recording session and gets the times at which the tones played.

TODO: Supplement the description

In [1]:
# Imports of all used packages and libraries
import sys
import os
import git
import glob
from collections import defaultdict

In [2]:
git_repo = git.Repo(".", search_parent_directories=True)
git_root = git_repo.git.rev_parse("--show-toplevel")

In [3]:
git_root

'/nancy/projects/reward_competition_extention'

In [4]:
sys.path.insert(0, os.path.join(git_root, 'src'))

In [5]:
# Data analysis and plotting libraries (glob is already imported above)
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [6]:
import spikeinterface.extractors as se
import spikeinterface.preprocessing as sp

In [7]:
import utilities.helper
import trodes.read_exported

# Functions

In [8]:
import re

def extract_floats(s):
    """
    Extracts all floats from a string and returns them as a list of strings.

    Parameters:
    - s (str): The string to extract floats from.

    Returns:
    - list: A list of strings, each representing a float found in the input string.
    """
    float_pattern = r"[-+]?\d*\.\d+|\d+"
    return [str(float(num)) for num in re.findall(float_pattern, s)]
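
As a quick check of `extract_floats`, the sketch below redefines the function so the block runs on its own and applies it to a few strings. The input strings are made up for illustration. Note that integers come back in float notation because every match passes through `float`.

```python
import re

def extract_floats(s):
    """Return every number found in `s` as a string in float notation."""
    float_pattern = r"[-+]?\d*\.\d+|\d+"
    return [str(float(num)) for num in re.findall(float_pattern, s)]

# Illustrative inputs only
decimals = extract_floats("3.1_and_3.4")
integers = extract_floats("box 7")
nothing = extract_floats("no digits here")
print(decimals, integers, nothing)
```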

## Inputs & Data

- Explanation of each input and where it comes from.

Inputs and required data loading
- Input variable names are in all-caps snake case
- Whenever an input is changed or used in processing, the resulting variable is in lowercase snake case

In [9]:
# Path of the directory that contains the Spike Gadgets recording and the exported timestamp files
# Exported with this tool https://docs.spikegadgets.com/en/latest/basic/ExportFunctions.html
# Export these files:
    # -raw – Continuous raw band export.
    # -dio – Digital IO channel state change export.
    # -analogio – Continuous analog IO export.
INPUT_DIR = "/scratch/back_up/reward_competition_extention/data/rce_cohort_3"
# TODO: Find way not to hard code this
# ALL_SESSION_DIR = glob.glob("/scratch/back_up/reward_competition_extention/data/standard/2023_06_*/*.rec")
ALL_SESSION_DIR = glob.glob("/scratch/back_up/reward_competition_extention/data/rce_cohort_3/omission/*.rec")

OUTPUT_DIR = r"./proc" # where data is saved should always be shown in the inputs
TONE_DIN = "dio_ECU_Din1"
TONE_STATE = 1
os.makedirs(OUTPUT_DIR, exist_ok=True)
OUTPUT_PREFIX = "rce_pilot_3"

In [10]:
COLS_TO_KEEP = ['session_dir', 'recording', 'metadata_dir', 'metadata_file',
                'original_file', 'filename', 'session_path', 'all_subjects',
                'current_subject', 'event_timestamps', 'video_name',
                'video_timestamps', 'event_frames', 'first_item_data']

In [11]:
RAW_COLS_TO_KEEP = ['session_dir',
 'recording',
 'original_file',
 'session_path',
 'current_subject',
 'first_item_data',
 'first_timestamp',
 'all_subjects']

In [12]:
STATE_COLS_TO_KEEP = ['session_dir',
 'metadata_file',
 'event_timestamps',
 'video_name',
 'video_timestamps',
 'event_frames',]

In [13]:
same_columns = ['session_dir', 'video_name']
different_columns = ['metadata_file', 'event_frames', 'event_timestamps']

In [14]:
ALL_SESSION_DIR

['/scratch/back_up/reward_competition_extention/data/rce_cohort_3/omission/20240302_131025_comp_om_subj_3-1_and_3-4.rec',
 '/scratch/back_up/reward_competition_extention/data/rce_cohort_3/omission/20240227_130241_comp_om_subj_4-2_and_4-3.rec',
 '/scratch/back_up/reward_competition_extention/data/rce_cohort_3/omission/20240229_152936_comp_om_subj_3-3_and_3-4.rec',
 '/scratch/back_up/reward_competition_extention/data/rce_cohort_3/omission/20240228_142038_comp_om_subj_3-1_and_3-3.rec',
 '/scratch/back_up/reward_competition_extention/data/rce_cohort_3/omission/20240228_154053_comp_om_subj_4-3_and_4-4.rec']

## Outputs

Describe each output that the notebook creates. 

- Is it a plot or is it data?

- How valuable is the output and why is it valuable or useful?

## Other documentation

raw directory
- raw_group0.dat
    - voltage_value: Array with voltage measurement for each channel at each timestamp
- timestamps.dat
    - voltage_time_stamp: The time stamp of each voltage measurement

parent directory
- 1.videoTimeStamps.cameraHWSync
    - frame_number: Calculated by getting the index of each video time stamp tuple
    - PosTimestamp: The time stamp of each video frame
    - HWframeCount: Unknown value. Starts at 30742 and increases by 1 for each tuple
    - HWTimestamp: Unknown value. All zeroes
    - video_time: Calculated by dividing the frame number by the fps (frames per second)
    - video_seconds: video_time, but rounded to seconds
    - These are filled-in versions of the above columns, with each gap taking the value from the most recent previous cell
        - filled_PosTimestamp 	
        - filledHWframeCount 	
        - filled_frame_number 	
        - filled_video_time 	
        - filled_video_seconds 	

DIO directory
- dio_ECU_Din1.dat
    - time: The time stamp that corresponds to the DIN input
    - state: Binary state of whether there is input from DIN or not
    - trial_number: Calculated by adding 1 every time there is a DIN input
    - These are filled-in versions of the above columns, with each gap taking the value from the most recent previous cell
        - filled_state 	
        - filled_trial_number
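
To make the `time` and `state` fields above concrete, here is a minimal sketch of turning DIO state changes into tone times in seconds. It rests on several assumptions: that a change into state 1 marks a tone onset (matching `TONE_STATE = 1` in the inputs), that `time` is a sample count at the 20000 Hz `clockrate` reported in the exported metadata, and that times are wanted relative to the recording's first timestamp. The function name `tone_onsets_in_seconds` is hypothetical, and the array reuses the first few values printed for `dio_ECU_Din1` in this notebook.

```python
import numpy as np

TONE_STATE = 1       # state value treated as "tone on" (assumption, same value as in the inputs)
CLOCK_RATE = 20000   # samples per second, from the `clockrate` metadata field

def tone_onsets_in_seconds(dio, first_timestamp, tone_state=TONE_STATE, clock_rate=CLOCK_RATE):
    """Return seconds since `first_timestamp` for each change into `tone_state`."""
    # Cast to a signed 64-bit type so the subtraction cannot wrap around
    onset_samples = dio["time"][dio["state"] == tone_state].astype(np.int64)
    return (onset_samples - first_timestamp) / clock_rate

dio = np.array([(6143560, 1), (6258287, 0), (7458702, 1), (7658707, 0)],
               dtype=[("time", "<u4"), ("state", "u1")])
onsets = tone_onsets_in_seconds(dio, first_timestamp=6143560)
print(onsets)
```

In this example the first state change falls exactly on the first timestamp, so it is worth checking against the session whether that row is a real tone or just the initial state of the channel.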

ss_output directory (spike sorting with SpikeInterface)
- firings.npz
    - unit_id: All the units that had a spike train for the given timestamp 	
    - number_of_units: Calculated by counting the number of units that had a spike train

## Functions

- function names are short and in snake case all lowercase
- a function name should be unique but does not have to describe the function
- doc strings describe functions not function names

## Processing

Describe what is done to the data here and how inputs are manipulated to generate outputs. 

In [15]:
# As much code and as many cells as required
# includes EDA and playing with data
# GO HAM!

# LOOP 1: Extracting all the Trodes

- Getting all the data from all the exported Trodes files and saving it to `session_to_trodes_data`
    - Creates a dictionary with the structure of:
        - `{dir_name: {file_name: metadata, file_name_2: metadata_2}, dir_name_2: {file_name_3: metadata_3, file_name_4: metadata_4}}`
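
The nested structure described above can be built without creating each level by hand. The sketch below assumes that `utilities.helper.create_recursive_dict` is a `defaultdict` that uses itself as its default factory, which is what the printed representation of `session_to_trodes_data` suggests; the actual helper may differ. The session, recording, and file names are placeholders.

```python
from collections import defaultdict

def create_recursive_dict():
    """Return a dict whose missing keys are filled with more dicts of the same kind."""
    return defaultdict(create_recursive_dict)

session_to_data = create_recursive_dict()
# Intermediate levels are created on first access, so no setup is needed
session_to_data["session_a"]["recording_1"]["DIO"]["dio_ECU_Din1"] = {"clockrate": "20000"}
session_to_data["session_a"]["recording_1"]["raw"]["timestamps"] = {"clockrate": "20000"}

metadata_dirs = sorted(session_to_data["session_a"]["recording_1"].keys())
print(metadata_dirs)
```

One side effect of this design is that merely reading a missing key creates it, so membership checks should use `in` rather than indexing.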

In [16]:
# Saving the trodes data for each session
# Each key is a session name
# Each value is a dictionary of every recording file in that session
session_to_trodes_data = utilities.helper.create_recursive_dict()


# Saving the path of the session recording
session_to_path = {}

# Going through each session recording
# Which includes all the recordings from all the miniloggers and cameras
for session_path in ALL_SESSION_DIR:   
    try:
        # Getting the name of the session from the path
        session_basename = os.path.splitext(os.path.basename(session_path))[0]
        print("Current Session: {}".format(session_basename))
        # Reading the trodes data for every recording file in the session directory
        session_to_trodes_data[session_basename] = trodes.read_exported.organize_all_trodes_export(session_path)
        
        session_to_path[session_basename] = session_path
    except Exception as e: 
        print(e)


Current Session: 20240302_131025_comp_om_subj_3-1_and_3-4
Skipping file 20240302_131025_comp_om_subj_3-4_t3b2_merged.timestampoffset.txt due to error: Settings format not supported




Skipping file 20240302_131025_comp_om_subj_3-1_t1b3_merged.timestampoffset.txt due to error: Settings format not supported
Current Session: 20240227_130241_comp_om_subj_4-2_and_4-3
Skipping file 20240227_130241_comp_om_subj_4-2_t1b1_merged.timestampoffset.txt due to error: Settings format not supported
Skipping file 20240227_130241_comp_om_subj_4-3_t2b2_merged.timestampoffset.txt due to error: Settings format not supported
Current Session: 20240229_152936_comp_om_subj_3-3_and_3-4
Skipping file 20240229_152936_comp_om_subj_3-4_t1b1_merged.timestampoffset.txt due to error: Settings format not supported
Skipping file 20240229_152936_comp_om_subj_3-3_t3b3_merged.timestampoffset.txt due to error: Settings format not supported
Current Session: 20240228_142038_comp_om_subj_3-1_and_3-3
Skipping file 20240228_142038_comp_om_subj_3_1_t1b1_merged.timestampoffset.txt due to error: Settings format not supported
Current Session: 20240228_154053_comp_om_subj_4-3_and_4-4
Skipping file 20240228_154053_

In [17]:
session_to_trodes_data

defaultdict(<function utilities.helper.create_recursive_dict()>,
            {'20240302_131025_comp_om_subj_3-1_and_3-4': defaultdict(dict,
                         {'20240302_131025_comp_om_subj_3-4_t3b2_merged': {'timestampoffset': {},
                           'DIO': {'dio_ECU_Dout1': {'description': 'State change data for one digital channel. Display_order is 1-based',
                             'byte_order': 'little endian',
                             'original_file': '20240302_131025_comp_om_subj_3-4_t3b2_merged.rec',
                             'clockrate': '20000',
                             'trodes_version': '2.4.1',
                             'compile_date': 'Jul 14 2023',
                             'compile_time': '12:16:26',
                             'qt_version': '6.2.2',
                             'commit_tag': 'heads/Release_2.4.1-0-g0088fd36',
                             'controller_firmware': '3.17',
                             'headstage_firmware': 

- Adding the video timestamps

In [18]:
for session_path in ALL_SESSION_DIR:
    try:
        session_basename = os.path.splitext(os.path.basename(session_path))[0]
        print("Current Session: {}".format(session_basename))
        # Each camera has its own <session>.<camera number>.videoTimeStamps.cameraHWSync file
        for video_timestamps in glob.glob(os.path.join(session_path, "*cameraHWSync")):
            video_basename = os.path.basename(video_timestamps)
            print("Current Video Name: {}".format(video_basename))
            timestamp_array = trodes.read_exported.read_trodes_extracted_data_file(video_timestamps)
            if "video_timestamps" not in session_to_trodes_data[session_basename][session_basename]:
                session_to_trodes_data[session_basename][session_basename]["video_timestamps"] = defaultdict(dict)
            # Keying the timestamps by the camera number taken from the file name
            session_to_trodes_data[session_basename][session_basename]["video_timestamps"][video_basename.split(".")[-3]] = timestamp_array
    except Exception as e:
        print(e)

Current Session: 20240302_131025_comp_om_subj_3-1_and_3-4
Current Video Name: 20240302_131025_comp_om_subj_3-1_and_3-4.2.videoTimeStamps.cameraHWSync
Current Video Name: 20240302_131025_comp_om_subj_3-1_and_3-4.1.videoTimeStamps.cameraHWSync
Current Session: 20240227_130241_comp_om_subj_4-2_and_4-3
Current Video Name: 20240227_130241_comp_om_subj_4-2_and_4-3.1.videoTimeStamps.cameraHWSync
Current Video Name: 20240227_130241_comp_om_subj_4-2_and_4-3.2.videoTimeStamps.cameraHWSync
Current Session: 20240229_152936_comp_om_subj_3-3_and_3-4
Current Video Name: 20240229_152936_comp_om_subj_3-3_and_3-4.2.videoTimeStamps.cameraHWSync
Current Video Name: 20240229_152936_comp_om_subj_3-3_and_3-4.1.videoTimeStamps.cameraHWSync
Current Session: 20240228_142038_comp_om_subj_3-1_and_3-3
Current Video Name: 20240228_142038_comp_om_subj_3-1_and_3-3.2.videoTimeStamps.cameraHWSync
Current Video Name: 20240228_142038_comp_om_subj_3-1_and_3-3.1.videoTimeStamps.cameraHWSync
Current Session: 20240228_154053
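
The key used for each video in the loop above comes from `video_basename.split(".")[-3]`. As a small sketch of why that index picks out the camera number, the block below applies the same split to one of the printed file names. The helper name `camera_number` and the `/data/` directory are hypothetical.

```python
import os

def camera_number(video_timestamps_path):
    """Return the camera index embedded in a `.videoTimeStamps.cameraHWSync` file name."""
    video_basename = os.path.basename(video_timestamps_path)
    # "<session>.<camera>.videoTimeStamps.cameraHWSync" -> third field from the end
    return video_basename.split(".")[-3]

# Hypothetical directory, real file name pattern
path = "/data/20240302_131025_comp_om_subj_3-1_and_3-4.2.videoTimeStamps.cameraHWSync"
camera = camera_number(path)
print(camera)
```

Counting from the end keeps the result stable even if the session name itself were to contain a period.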

In [19]:
session_to_trodes_data[session_basename][session_basename]["video_timestamps"]

defaultdict(dict,
            {'1': {'clock rate': '20000',
              'camera_name': 'HD USB Camera (\\\\?\\usb#vid_32e4&pid_9230&mi_00#6&bec0719&2&0000#{e5323777-f976-4f5b-9b55-b94699c46e44}\\global)',
              'fields': '<PosTimestamp uint32><HWframeCount uint32><HWTimestamp uint64>',
              'data': array([( 2116221, 0, 0), ( 2117607, 0, 0), ( 2117607, 0, 0), ...,
                     (76519401, 0, 0), (76519401, 0, 0), (76520787, 0, 0)],
                    dtype=[('PosTimestamp', '<u4'), ('HWframeCount', '<u4'), ('HWTimestamp', '<u8')]),
              'filename': '20240228_154053_comp_om_subj_4-3_and_4-4.1.videoTimeStamps.cameraHWSync'},
             '2': {'clock rate': '20000',
              'camera_name': 'HD USB Camera (\\\\?\\usb#vid_32e4&pid_9230&mi_00#7&f97c80&0&0000#{e5323777-f976-4f5b-9b55-b94699c46e44}\\global)',
              'fields': '<PosTimestamp uint32><HWframeCount uint32><HWTimestamp uint64>',
              'data': array([( 2116221, 0, 0), ( 2117607

- Creating a dataframe from the dictionary with a column for:
  - Session directory
  - Recording name
  - Metadata directory
  - Metadata file
  - And a column for each metadata
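
The flattening described in the list above can be sketched on a toy dictionary. Every name and value in `nested` is made up for illustration; only the four-level layout and the `level_0` to `level_3` renaming follow the notebook.

```python
import pandas as pd

# Toy stand-in for session_to_trodes_data (all values are made up)
nested = {
    "session_a": {
        "recording_1": {
            "DIO": {"dio_ECU_Din1": {"clockrate": "20000", "id": "ECU_Din1"}},
            "raw": {"timestamps": {"clockrate": "20000"}},
        }
    }
}

# One entry per (session, recording, directory, file) tuple
flat = {
    (session, recording, meta_dir, meta_file): metadata
    for session, recordings in nested.items()
    for recording, meta_dirs in recordings.items()
    for meta_dir, meta_files in meta_dirs.items()
    for meta_file, metadata in meta_files.items()
}

# Tuple keys become index levels, which reset_index turns into level_0 ... level_3
df = pd.DataFrame.from_dict(flat, orient="index").reset_index()
df = df.rename(columns={"level_0": "session_dir", "level_1": "recording",
                        "level_2": "metadata_dir", "level_3": "metadata_file"})
print(df)
```

Metadata fields that a file does not have, such as `id` for the raw timestamps here, end up as missing values in that row.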

In [20]:
# Creating a dataframe from the nested dictionary
trodes_metadata_df = pd.DataFrame.from_dict({(i,j,k,l): session_to_trodes_data[i][j][k][l] 
                           for i in session_to_trodes_data.keys() 
                           for j in session_to_trodes_data[i].keys()
                           for k in session_to_trodes_data[i][j].keys()
                           for l in session_to_trodes_data[i][j][k].keys()},
                           orient='index')

# Resetting the index and renaming the columns
trodes_metadata_df = trodes_metadata_df.reset_index()
trodes_metadata_df = trodes_metadata_df.rename(columns={'level_0': 'session_dir', 'level_1': 'recording', 'level_2': 'metadata_dir', 'level_3': 'metadata_file'}, errors="ignore")

# Adding the session path to the dataframe
trodes_metadata_df["session_path"] = trodes_metadata_df["session_dir"].map(session_to_path)

In [21]:
trodes_metadata_df.head()

Unnamed: 0,session_dir,recording,metadata_dir,metadata_file,description,byte_order,original_file,clockrate,trodes_version,compile_date,...,direction,id,display_order,fields,data,filename,decimation,clock rate,camera_name,session_path
0,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Dout1,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,output,ECU_Dout1,2,<time uint32><state uint8>,"[[6143560, 0]]",20240302_131025_comp_om_subj_3-4_t3b2_merged.d...,,,,/scratch/back_up/reward_competition_extention/...
1,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Dout2,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,output,ECU_Dout2,3,<time uint32><state uint8>,"[[6143560, 0]]",20240302_131025_comp_om_subj_3-4_t3b2_merged.d...,,,,/scratch/back_up/reward_competition_extention/...
2,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Din4,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,input,ECU_Din4,9,<time uint32><state uint8>,"[[6143560, 0]]",20240302_131025_comp_om_subj_3-4_t3b2_merged.d...,,,,/scratch/back_up/reward_competition_extention/...
3,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Din1,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,input,ECU_Din1,7,<time uint32><state uint8>,"[[6143560, 1], [6258287, 0], [7458702, 1], [76...",20240302_131025_comp_om_subj_3-4_t3b2_merged.d...,,,,/scratch/back_up/reward_competition_extention/...
4,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Din3,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,input,ECU_Din3,8,<time uint32><state uint8>,"[[6143560, 1], [6258287, 0], [37207877, 1], [3...",20240302_131025_comp_om_subj_3-4_t3b2_merged.d...,,,,/scratch/back_up/reward_competition_extention/...


In [22]:
trodes_metadata_df.tail()

Unnamed: 0,session_dir,recording,metadata_dir,metadata_file,description,byte_order,original_file,clockrate,trodes_version,compile_date,...,direction,id,display_order,fields,data,filename,decimation,clock rate,camera_name,session_path
104,20240229_152936_comp_om_subj_3-3_and_3-4,20240229_152936_comp_om_subj_3-3_and_3-4,video_timestamps,1,,,,,,,...,,,,<PosTimestamp uint32><HWframeCount uint32><HWT...,"[[2899241, 0, 0], [2900627, 0, 0], [2900627, 0...",20240229_152936_comp_om_subj_3-3_and_3-4.1.vid...,,20000,HD USB Camera (\\?\usb#vid_32e4&pid_9230&mi_00...,/scratch/back_up/reward_competition_extention/...
105,20240228_142038_comp_om_subj_3-1_and_3-3,20240228_142038_comp_om_subj_3-1_and_3-3,video_timestamps,2,,,,,,,...,,,,<PosTimestamp uint32><HWframeCount uint32><HWT...,"[[9739890, 0, 0], [9741276, 0, 0], [9741276, 0...",20240228_142038_comp_om_subj_3-1_and_3-3.2.vid...,,20000,HD USB Camera (\\?\usb#vid_32e4&pid_9230&mi_00...,/scratch/back_up/reward_competition_extention/...
106,20240228_142038_comp_om_subj_3-1_and_3-3,20240228_142038_comp_om_subj_3-1_and_3-3,video_timestamps,1,,,,,,,...,,,,<PosTimestamp uint32><HWframeCount uint32><HWT...,"[[9739871, 0, 0], [9739890, 0, 0], [9741276, 0...",20240228_142038_comp_om_subj_3-1_and_3-3.1.vid...,,20000,HD USB Camera (\\?\usb#vid_32e4&pid_9230&mi_00...,/scratch/back_up/reward_competition_extention/...
107,20240228_154053_comp_om_subj_4-3_and_4-4,20240228_154053_comp_om_subj_4-3_and_4-4,video_timestamps,1,,,,,,,...,,,,<PosTimestamp uint32><HWframeCount uint32><HWT...,"[[2116221, 0, 0], [2117607, 0, 0], [2117607, 0...",20240228_154053_comp_om_subj_4-3_and_4-4.1.vid...,,20000,HD USB Camera (\\?\usb#vid_32e4&pid_9230&mi_00...,/scratch/back_up/reward_competition_extention/...
108,20240228_154053_comp_om_subj_4-3_and_4-4,20240228_154053_comp_om_subj_4-3_and_4-4,video_timestamps,2,,,,,,,...,,,,<PosTimestamp uint32><HWframeCount uint32><HWT...,"[[2116221, 0, 0], [2117607, 0, 0], [2117607, 0...",20240228_154053_comp_om_subj_4-3_and_4-4.2.vid...,,20000,HD USB Camera (\\?\usb#vid_32e4&pid_9230&mi_00...,/scratch/back_up/reward_competition_extention/...


- Getting the first item from each tuple in the arrays in the `data` column
  - This first item is usually just the timestamp
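
Because each `data` entry is a NumPy structured array, a whole field can be pulled out by name instead of looping over the tuples. The sketch below shows this on a small array with the same `<time uint32><state uint8>` layout as the DIO files; the values are the first few printed for `dio_ECU_Din1` in this notebook.

```python
import numpy as np

# Same layout as the DIO arrays: <time uint32><state uint8>
dio = np.array([(6143560, 1), (6258287, 0), (7458702, 1), (7658707, 0)],
               dtype=[("time", "<u4"), ("state", "u1")])

first_name = dio.dtype.names[0]    # name of the first field
last_name = dio.dtype.names[-1]    # name of the last field
first_item_data = dio[first_name]  # every value of the first field
last_item_data = dio[last_name]    # every value of the last field

print(first_name, first_item_data.tolist())
print(last_name, last_item_data.tolist())
```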

In [23]:
trodes_metadata_df["data"].iloc[0]

array([(6143560, 0)], dtype=[('time', '<u4'), ('state', 'u1')])

In [24]:
# Getting the name of the first field of each structured numpy array
trodes_metadata_df["first_dtype_name"] = trodes_metadata_df["data"].apply(lambda x: x.dtype.names[0])
# Getting all the values of that first field (usually the timestamps)
trodes_metadata_df["first_item_data"] = trodes_metadata_df["data"].apply(lambda x: x[x.dtype.names[0]])


In [25]:
# Same as above but for the last field
trodes_metadata_df["last_dtype_name"] = trodes_metadata_df["data"].apply(lambda x: x.dtype.names[-1])
trodes_metadata_df["last_item_data"] = trodes_metadata_df["data"].apply(lambda x: x[x.dtype.names[-1]])

In [26]:
trodes_metadata_df.head()

Unnamed: 0,session_dir,recording,metadata_dir,metadata_file,description,byte_order,original_file,clockrate,trodes_version,compile_date,...,data,filename,decimation,clock rate,camera_name,session_path,first_dtype_name,first_item_data,last_dtype_name,last_item_data
0,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Dout1,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,"[[6143560, 0]]",20240302_131025_comp_om_subj_3-4_t3b2_merged.d...,,,,/scratch/back_up/reward_competition_extention/...,time,[6143560],state,[0]
1,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Dout2,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,"[[6143560, 0]]",20240302_131025_comp_om_subj_3-4_t3b2_merged.d...,,,,/scratch/back_up/reward_competition_extention/...,time,[6143560],state,[0]
2,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Din4,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,"[[6143560, 0]]",20240302_131025_comp_om_subj_3-4_t3b2_merged.d...,,,,/scratch/back_up/reward_competition_extention/...,time,[6143560],state,[0]
3,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Din1,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,"[[6143560, 1], [6258287, 0], [7458702, 1], [76...",20240302_131025_comp_om_subj_3-4_t3b2_merged.d...,,,,/scratch/back_up/reward_competition_extention/...,time,"[6143560, 6258287, 7458702, 7658707, 9458728, ...",state,"[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ..."
4,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Din3,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,"[[6143560, 1], [6258287, 0], [37207877, 1], [3...",20240302_131025_comp_om_subj_3-4_t3b2_merged.d...,,,,/scratch/back_up/reward_competition_extention/...,time,"[6143560, 6258287, 37207877, 37210877, 3721267...",state,"[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ..."


In [27]:
trodes_metadata_df.tail()

Unnamed: 0,session_dir,recording,metadata_dir,metadata_file,description,byte_order,original_file,clockrate,trodes_version,compile_date,...,data,filename,decimation,clock rate,camera_name,session_path,first_dtype_name,first_item_data,last_dtype_name,last_item_data
104,20240229_152936_comp_om_subj_3-3_and_3-4,20240229_152936_comp_om_subj_3-3_and_3-4,video_timestamps,1,,,,,,,...,"[[2899241, 0, 0], [2900627, 0, 0], [2900627, 0...",20240229_152936_comp_om_subj_3-3_and_3-4.1.vid...,,20000,HD USB Camera (\\?\usb#vid_32e4&pid_9230&mi_00...,/scratch/back_up/reward_competition_extention/...,PosTimestamp,"[2899241, 2900627, 2900627, 2902012, 2902012, ...",HWTimestamp,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ..."
105,20240228_142038_comp_om_subj_3-1_and_3-3,20240228_142038_comp_om_subj_3-1_and_3-3,video_timestamps,2,,,,,,,...,"[[9739890, 0, 0], [9741276, 0, 0], [9741276, 0...",20240228_142038_comp_om_subj_3-1_and_3-3.2.vid...,,20000,HD USB Camera (\\?\usb#vid_32e4&pid_9230&mi_00...,/scratch/back_up/reward_competition_extention/...,PosTimestamp,"[9739890, 9741276, 9741276, 9742522, 9742662, ...",HWTimestamp,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ..."
106,20240228_142038_comp_om_subj_3-1_and_3-3,20240228_142038_comp_om_subj_3-1_and_3-3,video_timestamps,1,,,,,,,...,"[[9739871, 0, 0], [9739890, 0, 0], [9741276, 0...",20240228_142038_comp_om_subj_3-1_and_3-3.1.vid...,,20000,HD USB Camera (\\?\usb#vid_32e4&pid_9230&mi_00...,/scratch/back_up/reward_competition_extention/...,PosTimestamp,"[9739871, 9739890, 9741276, 9741276, 9742662, ...",HWTimestamp,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ..."
107,20240228_154053_comp_om_subj_4-3_and_4-4,20240228_154053_comp_om_subj_4-3_and_4-4,video_timestamps,1,,,,,,,...,"[[2116221, 0, 0], [2117607, 0, 0], [2117607, 0...",20240228_154053_comp_om_subj_4-3_and_4-4.1.vid...,,20000,HD USB Camera (\\?\usb#vid_32e4&pid_9230&mi_00...,/scratch/back_up/reward_competition_extention/...,PosTimestamp,"[2116221, 2117607, 2117607, 2118993, 2120379, ...",HWTimestamp,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ..."
108,20240228_154053_comp_om_subj_4-3_and_4-4,20240228_154053_comp_om_subj_4-3_and_4-4,video_timestamps,2,,,,,,,...,"[[2116221, 0, 0], [2117607, 0, 0], [2117607, 0...",20240228_154053_comp_om_subj_4-3_and_4-4.2.vid...,,20000,HD USB Camera (\\?\usb#vid_32e4&pid_9230&mi_00...,/scratch/back_up/reward_competition_extention/...,PosTimestamp,"[2116221, 2117607, 2117607, 2117710, 2118993, ...",HWTimestamp,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ..."


In [28]:
trodes_metadata_df["recording"].unique()

array(['20240302_131025_comp_om_subj_3-4_t3b2_merged',
       '20240302_131025_comp_om_subj_3-1_t1b3_merged',
       '20240227_130241_comp_om_subj_4-2_t1b1_merged',
       '20240227_130241_comp_om_subj_4-3_t2b2_merged',
       '20240229_152936_comp_om_subj_3-4_t1b1_merged',
       '20240229_152936_comp_om_subj_3-3_t3b3_merged',
       '20240228_142038_comp_om_subj_3_1_t1b1_merged',
       '20240228_154053_comp_om_subj_4-3_t3b3_merged',
       '20240228_154053_comp_om_subj_4-4_t4b4_merged',
       '20240302_131025_comp_om_subj_3-1_and_3-4',
       '20240227_130241_comp_om_subj_4-2_and_4-3',
       '20240229_152936_comp_om_subj_3-3_and_3-4',
       '20240228_142038_comp_om_subj_3-1_and_3-3',
       '20240228_154053_comp_om_subj_4-3_and_4-4'], dtype=object)

## Getting the subject information from the metadata

In [29]:
def split_by_multiple_delimiters(s, delimiters):
    """
    Splits a string by multiple delimiters.

    Parameters:
    - s (str): The string to split.
    - delimiters (list): A list of delimiters to split the string by.

    Returns:
    - list: A list of substrings.
    """
    return re.split('|'.join(map(re.escape, delimiters)), s)


In [30]:
# Taking the part of the session name after "subj" and turning the "-" in each subject ID into "."
trodes_metadata_df["all_subjects"] = trodes_metadata_df["session_dir"].apply(lambda x: x.split("subj")[-1].strip("_").replace("-", "."))
# Extracting the subject IDs as a sorted list of strings
trodes_metadata_df["all_subjects"] = trodes_metadata_df["all_subjects"].apply(lambda x: sorted(extract_floats(x)))

In [31]:
trodes_metadata_df["session_dir"].iloc[0]

'20240302_131025_comp_om_subj_3-1_and_3-4'

In [32]:
trodes_metadata_df["all_subjects"].apply(lambda x: tuple(x)).unique()

array([('3.1', '3.4'), ('4.2', '4.3'), ('3.3', '3.4'), ('3.1', '3.3'),
       ('4.3', '4.4')], dtype=object)

In [33]:
# Same as above, but some recording names separate the subject ID parts with "_" instead of "-"
trodes_metadata_df["current_subject"] = trodes_metadata_df["recording"].apply(lambda x: x.split("subj")[-1].strip("_").replace("-", ".").replace("_", "."))
# The first number is the subject ID; any later ones come from the rest of the file name
trodes_metadata_df["current_subject"] = trodes_metadata_df["current_subject"].apply(lambda x: str(extract_floats(x)[0]).strip())


In [34]:
trodes_metadata_df["current_subject"].unique()

array(['3.4', '3.1', '4.2', '4.3', '3.3', '4.4'], dtype=object)
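
The two parsing steps above can be traced on individual names. The sketch below wraps them in two helpers, `subjects_from_session` and `subject_from_recording`, which are hypothetical names introduced only for this example; the string operations are the same as in the cells above, and the inputs are names that appear in this notebook.

```python
import re

def extract_floats(s):
    """Return every number found in `s` as a string in float notation."""
    return [str(float(num)) for num in re.findall(r"[-+]?\d*\.\d+|\d+", s)]

def subjects_from_session(session_dir):
    """All subjects of a session, e.g. '..._subj_3-1_and_3-4' -> ['3.1', '3.4']."""
    tail = session_dir.split("subj")[-1].strip("_").replace("-", ".")
    return sorted(extract_floats(tail))

def subject_from_recording(recording):
    """Subject of one recording; also handles names written as 'subj_3_1'."""
    tail = recording.split("subj")[-1].strip("_").replace("-", ".").replace("_", ".")
    # Later matches come from the 't1b1' style suffix, so only the first is kept
    return extract_floats(tail)[0]

session_subjects = subjects_from_session("20240302_131025_comp_om_subj_3-1_and_3-4")
dashed = subject_from_recording("20240302_131025_comp_om_subj_3-4_t3b2_merged")
underscored = subject_from_recording("20240228_142038_comp_om_subj_3_1_t1b1_merged")
print(session_subjects, dashed, underscored)
```

Replacing `_` with `.` is what lets the one recording named with `subj_3_1` resolve to the same `3.1` as the dashed names.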

## Dropping all the rows with unneeded metadata

In [35]:
trodes_metadata_df["metadata_dir"].unique()

array(['DIO', 'time', 'raw', 'video_timestamps'], dtype=object)

In [36]:
METADATA_TO_KEEP = ['raw', 'DIO', 'video_timestamps']

In [37]:
trodes_metadata_df = trodes_metadata_df[trodes_metadata_df["metadata_dir"].isin(METADATA_TO_KEEP)]

In [38]:
trodes_metadata_df = trodes_metadata_df[~trodes_metadata_df["metadata_file"].str.contains("out")]
trodes_metadata_df = trodes_metadata_df[~trodes_metadata_df["metadata_file"].str.contains("coordinates")]


In [39]:
trodes_metadata_df = trodes_metadata_df.reset_index(drop=True)
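
The row filtering in this section can be seen on a toy dataframe. The rows below are invented for illustration (in particular, where a `coordinates` file would live is an assumption); the three conditions mirror the cells above, combined into one mask.

```python
import pandas as pd

METADATA_TO_KEEP = ["raw", "DIO", "video_timestamps"]

# Made-up rows covering each case that the filters handle
df = pd.DataFrame({
    "metadata_dir": ["DIO", "DIO", "time", "raw", "raw", "video_timestamps"],
    "metadata_file": ["dio_ECU_Din1", "dio_ECU_Dout1", "time", "timestamps",
                      "coordinates", "1"],
})

keep_dir = df["metadata_dir"].isin(METADATA_TO_KEEP)
# regex=False treats the patterns as plain substrings
is_output = df["metadata_file"].str.contains("out", regex=False)
is_coordinates = df["metadata_file"].str.contains("coordinates", regex=False)

filtered = df[keep_dir & ~is_output & ~is_coordinates].reset_index(drop=True)
print(filtered)
```

Matching on the substring `out` drops the `Dout` channels, but it would also drop any other file whose name happens to contain `out`.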

# Getting the first time stamp of each recording

In [40]:
trodes_raw_df = trodes_metadata_df[(trodes_metadata_df["metadata_dir"] == "raw") & (trodes_metadata_df["metadata_file"] == "timestamps")].copy()


In [41]:
trodes_raw_df.head()

Unnamed: 0,session_dir,recording,metadata_dir,metadata_file,description,byte_order,original_file,clockrate,trodes_version,compile_date,...,decimation,clock rate,camera_name,session_path,first_dtype_name,first_item_data,last_dtype_name,last_item_data,all_subjects,current_subject
4,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,raw,timestamps,Raw timestamps,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,,,,/scratch/back_up/reward_competition_extention/...,time,"[6143560, 6143561, 6143562, 6143563, 6143564, ...",time,"[6143560, 6143561, 6143562, 6143563, 6143564, ...","[3.1, 3.4]",3.4
5,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-1_t1b3_merged,raw,timestamps,Raw timestamps,little endian,20240302_131025_comp_om_subj_3-1_t1b3_merged.rec,20000,2.4.1,Jul 14 2023,...,,,,/scratch/back_up/reward_competition_extention/...,time,"[6143560, 6143561, 6143562, 6143563, 6143564, ...",time,"[6143560, 6143561, 6143562, 6143563, 6143564, ...","[3.1, 3.4]",3.1
14,20240227_130241_comp_om_subj_4-2_and_4-3,20240227_130241_comp_om_subj_4-2_t1b1_merged,raw,timestamps,Raw timestamps,little endian,20240227_130241_comp_om_subj_4-2_t1b1_merged.rec,20000,2.4.0,May 24 2023,...,,,,/scratch/back_up/reward_competition_extention/...,time,"[1981794, 1981795, 1981796, 1981797, 1981798, ...",time,"[1981794, 1981795, 1981796, 1981797, 1981798, ...","[4.2, 4.3]",4.2
19,20240227_130241_comp_om_subj_4-2_and_4-3,20240227_130241_comp_om_subj_4-3_t2b2_merged,raw,timestamps,Raw timestamps,little endian,20240227_130241_comp_om_subj_4-3_t2b2_merged.rec,20000,2.4.0,May 24 2023,...,,,,/scratch/back_up/reward_competition_extention/...,time,"[1981794, 1981795, 1981796, 1981797, 1981798, ...",time,"[1981794, 1981795, 1981796, 1981797, 1981798, ...","[4.2, 4.3]",4.3
20,20240229_152936_comp_om_subj_3-3_and_3-4,20240229_152936_comp_om_subj_3-4_t1b1_merged,raw,timestamps,Raw timestamps,little endian,20240229_152936_comp_om_subj_3-4_t1b1_merged.rec,20000,2.4.1,Jul 14 2023,...,,,,/scratch/back_up/reward_competition_extention/...,time,"[2899243, 2899244, 2899245, 2899246, 2899247, ...",time,"[2899243, 2899244, 2899245, 2899246, 2899247, ...","[3.3, 3.4]",3.4


In [42]:
trodes_raw_df["first_timestamp"] = trodes_raw_df["first_item_data"].apply(lambda x: x[0])

In [43]:
trodes_raw_df["recording"].iloc[0]

'20240302_131025_comp_om_subj_3-4_t3b2_merged'

In [44]:
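# Keyed by session, not by recording: recordings from the same session collapse into one entry (the last row wins)
recording_to_first_timestamp = trodes_raw_df.set_index('session_dir')['first_timestamp'].to_dict()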

In [45]:
recording_to_first_timestamp

{'20240302_131025_comp_om_subj_3-1_and_3-4': 6143560,
 '20240227_130241_comp_om_subj_4-2_and_4-3': 1981794,
 '20240229_152936_comp_om_subj_3-3_and_3-4': 2899243,
 '20240228_142038_comp_om_subj_3-1_and_3-3': 9738506,
 '20240228_154053_comp_om_subj_4-3_and_4-4': 2116223}
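
Since each session has more than one recording, the index used to build this dictionary contains duplicate labels. The sketch below, on made-up rows, shows that `to_dict` then keeps the last value per label. It also shows a `groupby` alternative that makes the choice explicit; that alternative is not what the notebook does.

```python
import pandas as pd

# Made-up rows: two recordings in session_a, one in session_b
raw = pd.DataFrame({
    "session_dir": ["session_a", "session_a", "session_b"],
    "recording": ["rec_1", "rec_2", "rec_3"],
    "first_timestamp": [6143560, 6143560, 1981794],
})

# Duplicate index labels collapse to a single key; the last row wins
last_wins = raw.set_index("session_dir")["first_timestamp"].to_dict()

# Explicit alternative (an assumption, not the notebook's method): earliest value per session
earliest = raw.groupby("session_dir")["first_timestamp"].min().to_dict()
print(last_wins, earliest)
```

The two results agree whenever all recordings in a session start at the same timestamp, as in the values printed above.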

In [46]:
trodes_metadata_df["first_timestamp"] = trodes_metadata_df["session_dir"].map(recording_to_first_timestamp)

In [47]:
trodes_metadata_df["first_timestamp"]

0     6143560
1     6143560
2     6143560
3     6143560
4     6143560
5     6143560
6     6143560
7     6143560
8     6143560
9     6143560
10    1981794
11    1981794
12    1981794
13    1981794
14    1981794
15    1981794
16    1981794
17    1981794
18    1981794
19    1981794
20    2899243
21    2899243
22    2899243
23    2899243
24    2899243
25    2899243
26    2899243
27    2899243
28    2899243
29    2899243
30    9738506
31    9738506
32    9738506
33    9738506
34    9738506
35    2116223
36    2116223
37    2116223
38    2116223
39    2116223
40    2116223
41    2116223
42    2116223
43    2116223
44    2116223
45    6143560
46    6143560
47    1981794
48    1981794
49    2899243
50    2899243
51    9738506
52    9738506
53    2116223
54    2116223
Name: first_timestamp, dtype: int64

# Getting the event timestamps

In [48]:
trodes_metadata_df.head()

Unnamed: 0,session_dir,recording,metadata_dir,metadata_file,description,byte_order,original_file,clockrate,trodes_version,compile_date,...,decimation,clock rate,camera_name,session_path,first_dtype_name,first_item_data,last_dtype_name,last_item_data,all_subjects,current_subject
0,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Din4,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,,,,/scratch/back_up/reward_competition_extention/...,time,[6143560],state,[0],"[3.1, 3.4]",3.4
1,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Din1,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,,,,/scratch/back_up/reward_competition_extention/...,time,"[6143560, 6258287, 7458702, 7658707, 9458728, ...",state,"[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ...","[3.1, 3.4]",3.4
2,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Din3,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,,,,/scratch/back_up/reward_competition_extention/...,time,"[6143560, 6258287, 37207877, 37210877, 3721267...",state,"[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ...","[3.1, 3.4]",3.4
3,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Din2,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,,,,/scratch/back_up/reward_competition_extention/...,time,"[6143560, 6258287, 6291290, 6295090, 6301690, ...",state,"[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ...","[3.1, 3.4]",3.4
4,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,raw,timestamps,Raw timestamps,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,,,,/scratch/back_up/reward_competition_extention/...,time,"[6143560, 6143561, 6143562, 6143563, 6143564, ...",time,"[6143560, 6143561, 6143562, 6143563, 6143564, ...","[3.1, 3.4]",3.4


In [49]:
trodes_metadata_df.tail()

Unnamed: 0,session_dir,recording,metadata_dir,metadata_file,description,byte_order,original_file,clockrate,trodes_version,compile_date,...,decimation,clock rate,camera_name,session_path,first_dtype_name,first_item_data,last_dtype_name,last_item_data,all_subjects,current_subject
50,20240229_152936_comp_om_subj_3-3_and_3-4,20240229_152936_comp_om_subj_3-3_and_3-4,video_timestamps,1,,,,,,,...,,20000,HD USB Camera (\\?\usb#vid_32e4&pid_9230&mi_00...,/scratch/back_up/reward_competition_extention/...,PosTimestamp,"[2899241, 2900627, 2900627, 2902012, 2902012, ...",HWTimestamp,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[3.3, 3.4]",3.3
51,20240228_142038_comp_om_subj_3-1_and_3-3,20240228_142038_comp_om_subj_3-1_and_3-3,video_timestamps,2,,,,,,,...,,20000,HD USB Camera (\\?\usb#vid_32e4&pid_9230&mi_00...,/scratch/back_up/reward_competition_extention/...,PosTimestamp,"[9739890, 9741276, 9741276, 9742522, 9742662, ...",HWTimestamp,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[3.1, 3.3]",3.1
52,20240228_142038_comp_om_subj_3-1_and_3-3,20240228_142038_comp_om_subj_3-1_and_3-3,video_timestamps,1,,,,,,,...,,20000,HD USB Camera (\\?\usb#vid_32e4&pid_9230&mi_00...,/scratch/back_up/reward_competition_extention/...,PosTimestamp,"[9739871, 9739890, 9741276, 9741276, 9742662, ...",HWTimestamp,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[3.1, 3.3]",3.1
53,20240228_154053_comp_om_subj_4-3_and_4-4,20240228_154053_comp_om_subj_4-3_and_4-4,video_timestamps,1,,,,,,,...,,20000,HD USB Camera (\\?\usb#vid_32e4&pid_9230&mi_00...,/scratch/back_up/reward_competition_extention/...,PosTimestamp,"[2116221, 2117607, 2117607, 2118993, 2120379, ...",HWTimestamp,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[4.3, 4.4]",4.3
54,20240228_154053_comp_om_subj_4-3_and_4-4,20240228_154053_comp_om_subj_4-3_and_4-4,video_timestamps,2,,,,,,,...,,20000,HD USB Camera (\\?\usb#vid_32e4&pid_9230&mi_00...,/scratch/back_up/reward_competition_extention/...,PosTimestamp,"[2116221, 2117607, 2117607, 2117710, 2118993, ...",HWTimestamp,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[4.3, 4.4]",4.3


In [50]:
# trodes_state_df = trodes_metadata_df[trodes_metadata_df["last_dtype_name"] == "state"].copy()

# Filtering for digital IO channels
trodes_state_df = trodes_metadata_df[trodes_metadata_df["metadata_dir"].isin(["DIO"])].copy()
# Filtering for tone and port entry related channels
# (chained on the DIO subset so that the first filter is not discarded)
trodes_state_df = trodes_state_df[trodes_state_df["id"].isin(["ECU_Din1", "ECU_Din2", "ECU_Din3"])].copy()


In [51]:
trodes_state_df.head()

Unnamed: 0,session_dir,recording,metadata_dir,metadata_file,description,byte_order,original_file,clockrate,trodes_version,compile_date,...,decimation,clock rate,camera_name,session_path,first_dtype_name,first_item_data,last_dtype_name,last_item_data,all_subjects,current_subject
1,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Din1,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,,,,/scratch/back_up/reward_competition_extention/...,time,"[6143560, 6258287, 7458702, 7658707, 9458728, ...",state,"[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ...","[3.1, 3.4]",3.4
2,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Din3,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,,,,/scratch/back_up/reward_competition_extention/...,time,"[6143560, 6258287, 37207877, 37210877, 3721267...",state,"[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ...","[3.1, 3.4]",3.4
3,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Din2,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,,,,/scratch/back_up/reward_competition_extention/...,time,"[6143560, 6258287, 6291290, 6295090, 6301690, ...",state,"[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ...","[3.1, 3.4]",3.4
6,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-1_t1b3_merged,DIO,dio_ECU_Din2,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-1_t1b3_merged.rec,20000,2.4.1,Jul 14 2023,...,,,,/scratch/back_up/reward_competition_extention/...,time,"[6143560, 6258287, 6291290, 6295090, 6301690, ...",state,"[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ...","[3.1, 3.4]",3.1
7,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-1_t1b3_merged,DIO,dio_ECU_Din1,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-1_t1b3_merged.rec,20000,2.4.1,Jul 14 2023,...,,,,/scratch/back_up/reward_competition_extention/...,time,"[6143560, 6258287, 7458702, 7658707, 9458728, ...",state,"[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ...","[3.1, 3.4]",3.1


In [52]:
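# Pairing the index of every "on" sample (state == 1) with the index of the state change that follows it
trodes_state_df["event_indexes"] = trodes_state_df.apply(lambda x: np.column_stack([np.where(x["last_item_data"] == 1)[0], np.where(x["last_item_data"] == 1)[0]+1]), axis=1)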

In [53]:
# Dropping pairs whose second index would fall past the end of the timestamp array
trodes_state_df["event_indexes"] = trodes_state_df.apply(lambda x: x["event_indexes"][x["event_indexes"][:, 1] <= x["first_item_data"].shape[0] - 1], axis=1)

In [54]:
# Converting each pair of indexes into a pair of timestamps
trodes_state_df["event_timestamps"] = trodes_state_df.apply(lambda x: x["first_item_data"][x["event_indexes"]], axis=1)
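
The three cells above can be illustrated on a single toy channel. This is a minimal sketch with made-up values; `times` and `states` are hypothetical stand-ins for one row's `first_item_data` and `last_item_data`.

```python
import numpy as np

# Toy stand-ins for one DIO row (illustrative values, not from the recording)
times = np.array([100, 250, 400, 900, 1200])  # timestamps of the state changes
states = np.array([1, 0, 1, 0, 1])            # 1 = channel on, 0 = channel off

# Index of every "on" sample, paired with the index that follows it
on_idx = np.where(states == 1)[0]
event_indexes = np.column_stack([on_idx, on_idx + 1])

# Drop pairs whose second index would run past the end of the array
event_indexes = event_indexes[event_indexes[:, 1] <= times.shape[0] - 1]

# Indexing with the (n, 2) index array gives (n, 2) timestamp pairs
event_timestamps = times[event_indexes]
print(event_timestamps)
```

In this toy case the last "on" sample has no following state change, so its pair is dropped and two pairs remain.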

## Updating the video timestamps

## Syncing up the video frame data

In [55]:
# Getting the rows that are the metadata for the video timestamps
trodes_video_df = trodes_metadata_df[trodes_metadata_df["metadata_dir"] == "video_timestamps"].copy().reset_index(drop=True)



In [56]:
# Filtering for the first video only
# This only applies to this pilot data, where we are only looking at the competition data
# trodes_video_df = trodes_video_df[trodes_video_df["metadata_file"] == "1"].copy()

In [57]:
trodes_video_df.head()

Unnamed: 0,session_dir,recording,metadata_dir,metadata_file,description,byte_order,original_file,clockrate,trodes_version,compile_date,...,decimation,clock rate,camera_name,session_path,first_dtype_name,first_item_data,last_dtype_name,last_item_data,all_subjects,current_subject
0,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-1_and_3-4,video_timestamps,2,,,,,,,...,,20000,HD USB Camera (\\?\usb#vid_32e4&pid_9230&mi_00...,/scratch/back_up/reward_competition_extention/...,PosTimestamp,"[6144944, 6144944, 6146330, 6147716, 6148939, ...",HWTimestamp,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[3.1, 3.4]",3.1
1,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-1_and_3-4,video_timestamps,1,,,,,,,...,,20000,HD USB Camera (\\?\usb#vid_32e4&pid_9230&mi_00...,/scratch/back_up/reward_competition_extention/...,PosTimestamp,"[6144944, 6144944, 6146330, 6147716, 6147716, ...",HWTimestamp,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[3.1, 3.4]",3.1
2,20240227_130241_comp_om_subj_4-2_and_4-3,20240227_130241_comp_om_subj_4-2_and_4-3,video_timestamps,1,,,,,,,...,,20000,HD USB Camera (\\?\usb#vid_32e4&pid_9230&mi_00...,/scratch/back_up/reward_competition_extention/...,PosTimestamp,"[1981792, 1983178, 1983178, 1984564, 1984564, ...",HWTimestamp,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[4.2, 4.3]",4.2
3,20240227_130241_comp_om_subj_4-2_and_4-3,20240227_130241_comp_om_subj_4-2_and_4-3,video_timestamps,2,,,,,,,...,,20000,HD USB Camera (\\?\usb#vid_32e4&pid_9230&mi_00...,/scratch/back_up/reward_competition_extention/...,PosTimestamp,"[1981792, 1983178, 1983178, 1984564, 1984564, ...",HWTimestamp,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[4.2, 4.3]",4.2
4,20240229_152936_comp_om_subj_3-3_and_3-4,20240229_152936_comp_om_subj_3-3_and_3-4,video_timestamps,2,,,,,,,...,,20000,HD USB Camera (\\?\usb#vid_32e4&pid_9230&mi_00...,/scratch/back_up/reward_competition_extention/...,PosTimestamp,"[2899241, 2900627, 2900627, 2902012, 2902012, ...",HWTimestamp,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[3.3, 3.4]",3.3


In [58]:
# Using the exported video timestamps directly (no resampling is applied in this step)
trodes_video_df["video_timestamps"] = trodes_video_df["first_item_data"]

In [59]:
# Removing the columns that are no longer needed
trodes_video_df = trodes_video_df[["filename", "video_timestamps", "session_dir"]].copy()

In [60]:
# Renaming the filename so that we can merge with other dataframes with the same column name
trodes_video_df = trodes_video_df.rename(columns={"filename": "video_name"})

In [61]:
trodes_video_df.head()

Unnamed: 0,video_name,video_timestamps,session_dir
0,20240302_131025_comp_om_subj_3-1_and_3-4.2.vid...,"[6144944, 6144944, 6146330, 6147716, 6148939, ...",20240302_131025_comp_om_subj_3-1_and_3-4
1,20240302_131025_comp_om_subj_3-1_and_3-4.1.vid...,"[6144944, 6144944, 6146330, 6147716, 6147716, ...",20240302_131025_comp_om_subj_3-1_and_3-4
2,20240227_130241_comp_om_subj_4-2_and_4-3.1.vid...,"[1981792, 1983178, 1983178, 1984564, 1984564, ...",20240227_130241_comp_om_subj_4-2_and_4-3
3,20240227_130241_comp_om_subj_4-2_and_4-3.2.vid...,"[1981792, 1983178, 1983178, 1984564, 1984564, ...",20240227_130241_comp_om_subj_4-2_and_4-3
4,20240229_152936_comp_om_subj_3-3_and_3-4.2.vid...,"[2899241, 2900627, 2900627, 2902012, 2902012, ...",20240229_152936_comp_om_subj_3-3_and_3-4


- Merging on `session_dir` so that each state row is paired with every video from the same session

In [62]:
trodes_state_df = pd.merge(trodes_state_df, trodes_video_df, on=["session_dir"], how="inner")
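
An inner merge on a shared key pairs every left row with every right row that has the same key value. The sketch below shows this on toy frames; the session and video names are placeholders, not real files.

```python
import pandas as pd

# Toy state rows: two channels for session s1, one channel for session s2
state = pd.DataFrame({
    "session_dir": ["s1", "s1", "s2"],
    "metadata_file": ["dio_ECU_Din1", "dio_ECU_Din2", "dio_ECU_Din1"],
})

# Toy video rows: two videos for session s1, one video for session s2
video = pd.DataFrame({
    "session_dir": ["s1", "s1", "s2"],
    "video_name": ["s1.1", "s1.2", "s2.1"],
})

# Each state row is repeated once per video in the same session
merged = pd.merge(state, video, on=["session_dir"], how="inner")
print(merged)
```

Session `s1` contributes 2 x 2 rows and session `s2` contributes 1 x 1, so the merged frame has 5 rows.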

In [63]:
trodes_state_df.columns

Index(['session_dir', 'recording', 'metadata_dir', 'metadata_file',
       'description', 'byte_order', 'original_file', 'clockrate',
       'trodes_version', 'compile_date', 'compile_time', 'qt_version',
       'commit_tag', 'controller_firmware', 'headstage_firmware',
       'controller_serialnum', 'headstage_serialnum', 'autosettle', 'smartref',
       'gyro', 'accelerometer', 'magnetometer', 'time_offset',
       'system_time_at_creation', 'timestamp_at_creation', 'first_timestamp',
       'direction', 'id', 'display_order', 'fields', 'data', 'filename',
       'decimation', 'clock rate', 'camera_name', 'session_path',
       'first_dtype_name', 'first_item_data', 'last_dtype_name',
       'last_item_data', 'all_subjects', 'current_subject', 'event_indexes',
       'event_timestamps', 'video_name', 'video_timestamps'],
      dtype='object')

## Finding the closest frame to each event

In [64]:
trodes_state_df["event_timestamps"].iloc[1]

array([[ 6143560,  6258287],
       [ 7458702,  7658707],
       [ 9458728,  9658732],
       [10658745, 10858748],
       [11658755, 11858760],
       [13158774, 13358777],
       [14558798, 14758795],
       [15758808, 15958810],
       [17458832, 17658835],
       [18758849, 18958851],
       [20058865, 20258867],
       [21058878, 21258878],
       [22258893, 22458896],
       [23258905, 23458908],
       [24358919, 24558922],
       [26358944, 26558947],
       [27758962, 27958961],
       [28858973, 29058976],
       [30358992, 30558997],
       [31659008, 31859010],
       [32759024, 32959026],
       [34459043, 34659045],
       [40459119, 40659122],
       [41459132, 41659137],
       [42559146, 42759146],
       [43559155, 43759158],
       [45059176, 45259176],
       [46759198, 46959198],
       [49059226, 49259226],
       [50059238, 50259238],
       [51059248, 51259253],
       [52359264, 52559274],
       [53559281, 53759281],
       [55759306, 55959308],
       [570593

In [65]:
trodes_state_df["event_frames"] = trodes_state_df.apply(lambda x: utilities.helper.find_nearest_indices(x["event_timestamps"], x["video_timestamps"]), axis=1)
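
The implementation of `utilities.helper.find_nearest_indices` is not shown in this notebook. As a rough idea of what such a lookup could do, here is a hypothetical stand-in, `nearest_indices_sketch`, built on `np.searchsorted`. It assumes the reference array is sorted and has at least two entries; it is not the project's helper.

```python
import numpy as np

def nearest_indices_sketch(queries, sorted_reference):
    """Hypothetical stand-in: index of the closest reference value per query."""
    queries = np.asarray(queries)
    ref = np.asarray(sorted_reference)
    # Insertion position of each query, clipped so both neighbours exist
    right = np.clip(np.searchsorted(ref, queries), 1, len(ref) - 1)
    left = right - 1
    # Keep whichever neighbour is closer; ties go to the left neighbour
    choose_left = (queries - ref[left]) <= (ref[right] - queries)
    return np.where(choose_left, left, right)

# Toy data: one frame every 10 ticks, and two (onset, offset) event pairs
video_timestamps = np.array([0, 10, 20, 30, 40])
event_timestamps = np.array([[1, 14], [26, 39]])

event_frames = nearest_indices_sketch(event_timestamps, video_timestamps)
print(event_frames)
```

The result has the same shape as `event_timestamps`, so each (onset, offset) pair maps to a (start frame, end frame) pair.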

In [66]:
trodes_state_df.head()

Unnamed: 0,session_dir,recording,metadata_dir,metadata_file,description,byte_order,original_file,clockrate,trodes_version,compile_date,...,first_item_data,last_dtype_name,last_item_data,all_subjects,current_subject,event_indexes,event_timestamps,video_name,video_timestamps,event_frames
0,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Din1,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,"[6143560, 6258287, 7458702, 7658707, 9458728, ...",state,"[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ...","[3.1, 3.4]",3.4,"[[0, 1], [2, 3], [4, 5], [6, 7], [8, 9], [10, ...","[[6143560, 6258287], [7458702, 7658707], [9458...",20240302_131025_comp_om_subj_3-1_and_3-4.2.vid...,"[6144944, 6144944, 6146330, 6147716, 6148939, ...","[[0, 114], [1312, 1512], [3309, 3508], [4507, ..."
1,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Din1,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,"[6143560, 6258287, 7458702, 7658707, 9458728, ...",state,"[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ...","[3.1, 3.4]",3.4,"[[0, 1], [2, 3], [4, 5], [6, 7], [8, 9], [10, ...","[[6143560, 6258287], [7458702, 7658707], [9458...",20240302_131025_comp_om_subj_3-1_and_3-4.1.vid...,"[6144944, 6144944, 6146330, 6147716, 6147716, ...","[[0, 114], [1312, 1512], [3309, 3508], [4507, ..."
2,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Din3,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,"[6143560, 6258287, 37207877, 37210877, 3721267...",state,"[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ...","[3.1, 3.4]",3.4,"[[0, 1], [2, 3], [4, 5], [6, 7], [8, 9], [10, ...","[[6143560, 6258287], [37207877, 37210877], [37...",20240302_131025_comp_om_subj_3-1_and_3-4.2.vid...,"[6144944, 6144944, 6146330, 6147716, 6148939, ...","[[0, 114], [31124, 31126], [31127, 31151], [31..."
3,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Din3,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,"[6143560, 6258287, 37207877, 37210877, 3721267...",state,"[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ...","[3.1, 3.4]",3.4,"[[0, 1], [2, 3], [4, 5], [6, 7], [8, 9], [10, ...","[[6143560, 6258287], [37207877, 37210877], [37...",20240302_131025_comp_om_subj_3-1_and_3-4.1.vid...,"[6144944, 6144944, 6146330, 6147716, 6147716, ...","[[0, 114], [31115, 31118], [31119, 31143], [31..."
4,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Din2,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,"[6143560, 6258287, 6291290, 6295090, 6301690, ...",state,"[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ...","[3.1, 3.4]",3.4,"[[0, 1], [2, 3], [4, 5], [6, 7], [8, 9], [10, ...","[[6143560, 6258287], [6291290, 6295090], [6301...",20240302_131025_comp_om_subj_3-1_and_3-4.2.vid...,"[6144944, 6144944, 6146330, 6147716, 6148939, ...","[[0, 114], [147, 151], [158, 162], [165, 175],..."


## Combine raw and state dataframes

In [67]:
trodes_state_df

Unnamed: 0,session_dir,recording,metadata_dir,metadata_file,description,byte_order,original_file,clockrate,trodes_version,compile_date,...,first_item_data,last_dtype_name,last_item_data,all_subjects,current_subject,event_indexes,event_timestamps,video_name,video_timestamps,event_frames
0,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Din1,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,"[6143560, 6258287, 7458702, 7658707, 9458728, ...",state,"[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ...","[3.1, 3.4]",3.4,"[[0, 1], [2, 3], [4, 5], [6, 7], [8, 9], [10, ...","[[6143560, 6258287], [7458702, 7658707], [9458...",20240302_131025_comp_om_subj_3-1_and_3-4.2.vid...,"[6144944, 6144944, 6146330, 6147716, 6148939, ...","[[0, 114], [1312, 1512], [3309, 3508], [4507, ..."
1,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Din1,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,"[6143560, 6258287, 7458702, 7658707, 9458728, ...",state,"[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ...","[3.1, 3.4]",3.4,"[[0, 1], [2, 3], [4, 5], [6, 7], [8, 9], [10, ...","[[6143560, 6258287], [7458702, 7658707], [9458...",20240302_131025_comp_om_subj_3-1_and_3-4.1.vid...,"[6144944, 6144944, 6146330, 6147716, 6147716, ...","[[0, 114], [1312, 1512], [3309, 3508], [4507, ..."
2,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Din3,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,"[6143560, 6258287, 37207877, 37210877, 3721267...",state,"[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ...","[3.1, 3.4]",3.4,"[[0, 1], [2, 3], [4, 5], [6, 7], [8, 9], [10, ...","[[6143560, 6258287], [37207877, 37210877], [37...",20240302_131025_comp_om_subj_3-1_and_3-4.2.vid...,"[6144944, 6144944, 6146330, 6147716, 6148939, ...","[[0, 114], [31124, 31126], [31127, 31151], [31..."
3,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Din3,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,"[6143560, 6258287, 37207877, 37210877, 3721267...",state,"[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ...","[3.1, 3.4]",3.4,"[[0, 1], [2, 3], [4, 5], [6, 7], [8, 9], [10, ...","[[6143560, 6258287], [37207877, 37210877], [37...",20240302_131025_comp_om_subj_3-1_and_3-4.1.vid...,"[6144944, 6144944, 6146330, 6147716, 6147716, ...","[[0, 114], [31115, 31118], [31119, 31143], [31..."
4,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Din2,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,"[6143560, 6258287, 6291290, 6295090, 6301690, ...",state,"[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ...","[3.1, 3.4]",3.4,"[[0, 1], [2, 3], [4, 5], [6, 7], [8, 9], [10, ...","[[6143560, 6258287], [6291290, 6295090], [6301...",20240302_131025_comp_om_subj_3-1_and_3-4.2.vid...,"[6144944, 6144944, 6146330, 6147716, 6148939, ...","[[0, 114], [147, 151], [158, 162], [165, 175],..."
5,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,DIO,dio_ECU_Din2,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,20000,2.4.1,Jul 14 2023,...,"[6143560, 6258287, 6291290, 6295090, 6301690, ...",state,"[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ...","[3.1, 3.4]",3.4,"[[0, 1], [2, 3], [4, 5], [6, 7], [8, 9], [10, ...","[[6143560, 6258287], [6291290, 6295090], [6301...",20240302_131025_comp_om_subj_3-1_and_3-4.1.vid...,"[6144944, 6144944, 6146330, 6147716, 6147716, ...","[[0, 114], [147, 151], [158, 162], [165, 174],..."
6,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-1_t1b3_merged,DIO,dio_ECU_Din2,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-1_t1b3_merged.rec,20000,2.4.1,Jul 14 2023,...,"[6143560, 6258287, 6291290, 6295090, 6301690, ...",state,"[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ...","[3.1, 3.4]",3.1,"[[0, 1], [2, 3], [4, 5], [6, 7], [8, 9], [10, ...","[[6143560, 6258287], [6291290, 6295090], [6301...",20240302_131025_comp_om_subj_3-1_and_3-4.2.vid...,"[6144944, 6144944, 6146330, 6147716, 6148939, ...","[[0, 114], [147, 151], [158, 162], [165, 175],..."
7,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-1_t1b3_merged,DIO,dio_ECU_Din2,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-1_t1b3_merged.rec,20000,2.4.1,Jul 14 2023,...,"[6143560, 6258287, 6291290, 6295090, 6301690, ...",state,"[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ...","[3.1, 3.4]",3.1,"[[0, 1], [2, 3], [4, 5], [6, 7], [8, 9], [10, ...","[[6143560, 6258287], [6291290, 6295090], [6301...",20240302_131025_comp_om_subj_3-1_and_3-4.1.vid...,"[6144944, 6144944, 6146330, 6147716, 6147716, ...","[[0, 114], [147, 151], [158, 162], [165, 174],..."
8,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-1_t1b3_merged,DIO,dio_ECU_Din1,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-1_t1b3_merged.rec,20000,2.4.1,Jul 14 2023,...,"[6143560, 6258287, 7458702, 7658707, 9458728, ...",state,"[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ...","[3.1, 3.4]",3.1,"[[0, 1], [2, 3], [4, 5], [6, 7], [8, 9], [10, ...","[[6143560, 6258287], [7458702, 7658707], [9458...",20240302_131025_comp_om_subj_3-1_and_3-4.2.vid...,"[6144944, 6144944, 6146330, 6147716, 6148939, ...","[[0, 114], [1312, 1512], [3309, 3508], [4507, ..."
9,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-1_t1b3_merged,DIO,dio_ECU_Din1,State change data for one digital channel. Dis...,little endian,20240302_131025_comp_om_subj_3-1_t1b3_merged.rec,20000,2.4.1,Jul 14 2023,...,"[6143560, 6258287, 7458702, 7658707, 9458728, ...",state,"[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ...","[3.1, 3.4]",3.1,"[[0, 1], [2, 3], [4, 5], [6, 7], [8, 9], [10, ...","[[6143560, 6258287], [7458702, 7658707], [9458...",20240302_131025_comp_om_subj_3-1_and_3-4.1.vid...,"[6144944, 6144944, 6146330, 6147716, 6147716, ...","[[0, 114], [1312, 1512], [3309, 3508], [4507, ..."


In [68]:
trodes_state_df = trodes_state_df[STATE_COLS_TO_KEEP].drop_duplicates(subset=["session_dir", "video_name", "metadata_file"]).sort_values(["session_dir", "video_name", "metadata_file"]).reset_index(drop=True).copy()

In [69]:
trodes_state_df.head()

Unnamed: 0,session_dir,metadata_file,event_timestamps,video_name,video_timestamps,event_frames
0,20240227_130241_comp_om_subj_4-2_and_4-3,dio_ECU_Din1,"[[1981794, 2175214], [3375627, 3575632], [5775...",20240227_130241_comp_om_subj_4-2_and_4-3.1.vid...,"[1981792, 1983178, 1983178, 1984564, 1984564, ...","[[1, 242], [1737, 1988], [4730, 4978], [7597, ..."
1,20240227_130241_comp_om_subj_4-2_and_4-3,dio_ECU_Din2,"[[1981794, 2175214], [2175614, 2190011], [2486...",20240227_130241_comp_om_subj_4-2_and_4-3.1.vid...,"[1981792, 1983178, 1983178, 1984564, 1984564, ...","[[1, 242], [242, 261], [628, 644], [644, 648],..."
2,20240227_130241_comp_om_subj_4-2_and_4-3,dio_ECU_Din3,"[[1981794, 2175214], [37313669, 37324069], [37...",20240227_130241_comp_om_subj_4-2_and_4-3.1.vid...,"[1981792, 1983178, 1983178, 1984564, 1984564, ...","[[1, 242], [44043, 44055], [44696, 44699], [44..."
3,20240227_130241_comp_om_subj_4-2_and_4-3,dio_ECU_Din1,"[[1981794, 2175214], [3375627, 3575632], [5775...",20240227_130241_comp_om_subj_4-2_and_4-3.2.vid...,"[1981792, 1983178, 1983178, 1984564, 1984564, ...","[[1, 288], [2079, 2379], [5659, 5957], [9091, ..."
4,20240227_130241_comp_om_subj_4-2_and_4-3,dio_ECU_Din2,"[[1981794, 2175214], [2175614, 2190011], [2486...",20240227_130241_comp_om_subj_4-2_and_4-3.2.vid...,"[1981792, 1983178, 1983178, 1984564, 1984564, ...","[[1, 288], [288, 311], [751, 770], [770, 774],..."


In [70]:
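# Collapsing the per-channel rows into one row per group:
# columns in different_columns become lists, all remaining columns keep their first value
trodes_state_df = trodes_state_df.groupby(same_columns).agg({**{col: 'first' for col in trodes_state_df.columns if col not in same_columns + different_columns}, **{col: lambda x: x.tolist() for col in different_columns}}).reset_index()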
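
The aggregation above can be tried on a toy frame. In this sketch the values of `same_columns` and `different_columns` are assumptions chosen for illustration (their real definitions are set elsewhere in the notebook), and the data is made up.

```python
import pandas as pd

# Toy frame: one video with three DIO channels (illustrative values only)
df = pd.DataFrame({
    "session_dir": ["s1", "s1", "s1"],
    "video_name": ["s1.1", "s1.1", "s1.1"],
    "video_timestamps": [[0, 10], [0, 10], [0, 10]],
    "metadata_file": ["dio_ECU_Din1", "dio_ECU_Din2", "dio_ECU_Din3"],
    "event_frames": [[[0, 1]], [[2, 3]], [[4, 5]]],
})

# Assumed grouping keys and per-channel columns for this sketch
same_columns = ["session_dir", "video_name"]
different_columns = ["metadata_file", "event_frames"]

# Shared columns keep their first value; per-channel columns become lists
agg_spec = {col: "first" for col in df.columns
            if col not in same_columns + different_columns}
agg_spec.update({col: (lambda x: x.tolist()) for col in different_columns})

grouped = df.groupby(same_columns).agg(agg_spec).reset_index()
print(grouped.loc[0, "metadata_file"])
```

Because the rows were sorted by `metadata_file` before grouping, the position of each entry in the lists identifies its channel.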

In [71]:
trodes_state_df.head()

Unnamed: 0,session_dir,video_name,video_timestamps,metadata_file,event_frames,event_timestamps
0,20240227_130241_comp_om_subj_4-2_and_4-3,20240227_130241_comp_om_subj_4-2_and_4-3.1.vid...,"[1981792, 1983178, 1983178, 1984564, 1984564, ...","[dio_ECU_Din1, dio_ECU_Din2, dio_ECU_Din3]","[[[1, 242], [1737, 1988], [4730, 4978], [7597,...","[[[1981794, 2175214], [3375627, 3575632], [577..."
1,20240227_130241_comp_om_subj_4-2_and_4-3,20240227_130241_comp_om_subj_4-2_and_4-3.2.vid...,"[1981792, 1983178, 1983178, 1984564, 1984564, ...","[dio_ECU_Din1, dio_ECU_Din2, dio_ECU_Din3]","[[[1, 288], [2079, 2379], [5659, 5957], [9091,...","[[[1981794, 2175214], [3375627, 3575632], [577..."
2,20240228_142038_comp_om_subj_3-1_and_3-3,20240228_142038_comp_om_subj_3-1_and_3-3.1.vid...,"[9739871, 9739890, 9741276, 9741276, 9742662, ...","[dio_ECU_Din1, dio_ECU_Din2, dio_ECU_Din3]","[[[0, 465], [1961, 2210], [4953, 5202], [7821,...","[[[9738506, 10111886], [11312304, 11512304], [..."
3,20240228_142038_comp_om_subj_3-1_and_3-3,20240228_142038_comp_om_subj_3-1_and_3-3.2.vid...,"[9739890, 9741276, 9741276, 9742522, 9742662, ...","[dio_ECU_Din1, dio_ECU_Din2, dio_ECU_Din3]","[[[0, 556], [2346, 2644], [5927, 6224], [9359,...","[[[9738506, 10111886], [11312304, 11512304], [..."
4,20240228_154053_comp_om_subj_4-3_and_4-4,20240228_154053_comp_om_subj_4-3_and_4-4.1.vid...,"[2116221, 2117607, 2117607, 2118993, 2120379, ...","[dio_ECU_Din1, dio_ECU_Din2, dio_ECU_Din3]","[[[1, 243], [1441, 1640], [4372, 4620], [7237,...","[[[2116223, 2359198], [3559617, 3759619], [595..."


In [72]:
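# The lists follow the order of metadata_file ([dio_ECU_Din1, dio_ECU_Din2, dio_ECU_Din3]),
# so position 0 is the tone channel and positions 1 and 2 are the port entry channels
trodes_state_df["tone_timestamps"] = trodes_state_df["event_timestamps"].apply(lambda x: x[0])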
trodes_state_df["box_1_port_entry_timestamps"] = trodes_state_df["event_timestamps"].apply(lambda x: x[1])
trodes_state_df["box_2_port_entry_timestamps"] = trodes_state_df["event_timestamps"].apply(lambda x: x[2])

trodes_state_df["tone_frames"] = trodes_state_df["event_frames"].apply(lambda x: x[0])
trodes_state_df["box_1_port_entry_frames"] = trodes_state_df["event_frames"].apply(lambda x: x[1])
trodes_state_df["box_2_port_entry_frames"] = trodes_state_df["event_frames"].apply(lambda x: x[2])


In [73]:
trodes_state_df = trodes_state_df.drop(columns=["event_timestamps", "event_frames", "metadata_file"], errors="ignore")

In [74]:
trodes_state_df.head()

Unnamed: 0,session_dir,video_name,video_timestamps,tone_timestamps,box_1_port_entry_timestamps,box_2_port_entry_timestamps,tone_frames,box_1_port_entry_frames,box_2_port_entry_frames
0,20240227_130241_comp_om_subj_4-2_and_4-3,20240227_130241_comp_om_subj_4-2_and_4-3.1.vid...,"[1981792, 1983178, 1983178, 1984564, 1984564, ...","[[1981794, 2175214], [3375627, 3575632], [5775...","[[1981794, 2175214], [2175614, 2190011], [2486...","[[1981794, 2175214], [37313669, 37324069], [37...","[[1, 242], [1737, 1988], [4730, 4978], [7597, ...","[[1, 242], [242, 261], [628, 644], [644, 648],...","[[1, 242], [44043, 44055], [44696, 44699], [44..."
1,20240227_130241_comp_om_subj_4-2_and_4-3,20240227_130241_comp_om_subj_4-2_and_4-3.2.vid...,"[1981792, 1983178, 1983178, 1984564, 1984564, ...","[[1981794, 2175214], [3375627, 3575632], [5775...","[[1981794, 2175214], [2175614, 2190011], [2486...","[[1981794, 2175214], [37313669, 37324069], [37...","[[1, 288], [2079, 2379], [5659, 5957], [9091, ...","[[1, 288], [288, 311], [751, 770], [770, 774],...","[[1, 288], [52653, 52665], [53402, 53405], [53..."
2,20240228_142038_comp_om_subj_3-1_and_3-3,20240228_142038_comp_om_subj_3-1_and_3-3.1.vid...,"[9739871, 9739890, 9741276, 9741276, 9742662, ...","[[9738506, 10111886], [11312304, 11512304], [1...","[[9738506, 10111886], [10126486, 10131085], [1...","[[9738506, 10111886], [44207940, 44224940], [4...","[[0, 465], [1961, 2210], [4953, 5202], [7821, ...","[[0, 465], [482, 490], [490, 527], [730, 731],...","[[0, 465], [42981, 43002], [43345, 43397], [43..."
3,20240228_142038_comp_om_subj_3-1_and_3-3,20240228_142038_comp_om_subj_3-1_and_3-3.2.vid...,"[9739890, 9741276, 9741276, 9742522, 9742662, ...","[[9738506, 10111886], [11312304, 11512304], [1...","[[9738506, 10111886], [10126486, 10131085], [1...","[[9738506, 10111886], [44207940, 44224940], [4...","[[0, 556], [2346, 2644], [5927, 6224], [9359, ...","[[0, 556], [577, 585], [585, 631], [872, 874],...","[[0, 556], [51293, 51314], [51589, 51631], [51..."
4,20240228_154053_comp_om_subj_4-3_and_4-4,20240228_154053_comp_om_subj_4-3_and_4-4.1.vid...,"[2116221, 2117607, 2117607, 2118993, 2120379, ...","[[2116223, 2359198], [3559617, 3759619], [5959...","[[2116223, 2359198], [2359598, 2375598], [2421...","[[2116223, 2359198], [37276462, 37279065], [37...","[[1, 243], [1441, 1640], [4372, 4620], [7237, ...","[[1, 243], [243, 259], [305, 307], [308, 312],...","[[1, 243], [43409, 43412], [43414, 43419], [43..."


In [75]:
trodes_raw_df = trodes_raw_df[RAW_COLS_TO_KEEP].reset_index(drop=True).copy()

In [76]:
trodes_raw_df.head()

Unnamed: 0,session_dir,recording,original_file,session_path,current_subject,first_item_data,first_timestamp,all_subjects
0,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-4_t3b2_merged,20240302_131025_comp_om_subj_3-4_t3b2_merged.rec,/scratch/back_up/reward_competition_extention/...,3.4,"[6143560, 6143561, 6143562, 6143563, 6143564, ...",6143560,"[3.1, 3.4]"
1,20240302_131025_comp_om_subj_3-1_and_3-4,20240302_131025_comp_om_subj_3-1_t1b3_merged,20240302_131025_comp_om_subj_3-1_t1b3_merged.rec,/scratch/back_up/reward_competition_extention/...,3.1,"[6143560, 6143561, 6143562, 6143563, 6143564, ...",6143560,"[3.1, 3.4]"
2,20240227_130241_comp_om_subj_4-2_and_4-3,20240227_130241_comp_om_subj_4-2_t1b1_merged,20240227_130241_comp_om_subj_4-2_t1b1_merged.rec,/scratch/back_up/reward_competition_extention/...,4.2,"[1981794, 1981795, 1981796, 1981797, 1981798, ...",1981794,"[4.2, 4.3]"
3,20240227_130241_comp_om_subj_4-2_and_4-3,20240227_130241_comp_om_subj_4-3_t2b2_merged,20240227_130241_comp_om_subj_4-3_t2b2_merged.rec,/scratch/back_up/reward_competition_extention/...,4.3,"[1981794, 1981795, 1981796, 1981797, 1981798, ...",1981794,"[4.2, 4.3]"
4,20240229_152936_comp_om_subj_3-3_and_3-4,20240229_152936_comp_om_subj_3-4_t1b1_merged,20240229_152936_comp_om_subj_3-4_t1b1_merged.rec,/scratch/back_up/reward_competition_extention/...,3.4,"[2899243, 2899244, 2899245, 2899246, 2899247, ...",2899243,"[3.3, 3.4]"


In [77]:
trodes_final_df = pd.merge(trodes_raw_df, trodes_state_df, on=["session_dir"], how="inner")

In [78]:
trodes_final_df.shape

(18, 16)

In [79]:
trodes_final_df = trodes_final_df.rename(columns={"first_item_data": "raw_timestamps"})
trodes_final_df = trodes_final_df.drop(columns=["metadata_file"], errors="ignore")
trodes_final_df = trodes_final_df.sort_values(["session_dir", "recording"]).reset_index(drop=True).copy()

## Making the timestamps 0 indexed
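
Zero-indexing here means subtracting a recording's first timestamp from each of its timestamp arrays, so that the recording starts at 0. A minimal sketch for one recording, using the first two tone events shown in the tables above:

```python
import numpy as np

# First timestamp and first two tone events of one recording (from the tables above)
first_timestamp = 1981794
tone_timestamps = np.array([[1981794, 2175214], [3375627, 3575632]])

# Subtract the recording's first timestamp so that the recording starts at 0
zeroed = tone_timestamps.astype(np.int32) - np.int32(first_timestamp)
print(zeroed)
```

The subtraction broadcasts over the whole array, so the (onset, offset) shape is preserved.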

In [80]:
trodes_final_df[[col for col in trodes_final_df.columns if "timestamps" in col]].head()

Unnamed: 0,raw_timestamps,video_timestamps,tone_timestamps,box_1_port_entry_timestamps,box_2_port_entry_timestamps
0,"[1981794, 1981795, 1981796, 1981797, 1981798, ...","[1981792, 1983178, 1983178, 1984564, 1984564, ...","[[1981794, 2175214], [3375627, 3575632], [5775...","[[1981794, 2175214], [2175614, 2190011], [2486...","[[1981794, 2175214], [37313669, 37324069], [37..."
1,"[1981794, 1981795, 1981796, 1981797, 1981798, ...","[1981792, 1983178, 1983178, 1984564, 1984564, ...","[[1981794, 2175214], [3375627, 3575632], [5775...","[[1981794, 2175214], [2175614, 2190011], [2486...","[[1981794, 2175214], [37313669, 37324069], [37..."
2,"[1981794, 1981795, 1981796, 1981797, 1981798, ...","[1981792, 1983178, 1983178, 1984564, 1984564, ...","[[1981794, 2175214], [3375627, 3575632], [5775...","[[1981794, 2175214], [2175614, 2190011], [2486...","[[1981794, 2175214], [37313669, 37324069], [37..."
3,"[1981794, 1981795, 1981796, 1981797, 1981798, ...","[1981792, 1983178, 1983178, 1984564, 1984564, ...","[[1981794, 2175214], [3375627, 3575632], [5775...","[[1981794, 2175214], [2175614, 2190011], [2486...","[[1981794, 2175214], [37313669, 37324069], [37..."
4,"[9738506, 9738507, 9738508, 9738509, 9738510, ...","[9739871, 9739890, 9741276, 9741276, 9742662, ...","[[9738506, 10111886], [11312304, 11512304], [1...","[[9738506, 10111886], [10126486, 10131085], [1...","[[9738506, 10111886], [44207940, 44224940], [4..."


In [81]:
trodes_final_df["last_timestamp"] = trodes_final_df["raw_timestamps"].apply(lambda x: x[-1])

- Drop the raw timestamps to reduce memory use; `first_timestamp` and `last_timestamp` keep the bounds of each recording.

In [82]:
trodes_final_df = trodes_final_df.drop(columns=["raw_timestamps", "original_file"], errors="ignore")

In [83]:
copy_trodes_final_df = trodes_final_df.copy()

In [84]:
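# Subtract each row's first timestamp so every "*_timestamps" column starts at 0.
# "first_timestamp" and "last_timestamp" do not contain "timestamps" and keep their original values.
for col in [col for col in trodes_final_df.columns if "timestamps" in col]:
    trodes_final_df[col] = trodes_final_df.apply(lambda x: x[col].astype(np.int32) - np.int32(x["first_timestamp"]), axis=1)

# Frame indices are not shifted, only cast to int32
for col in [col for col in trodes_final_df.columns if "frames" in col]:
    trodes_final_df[col] = trodes_final_df[col].apply(lambda x: x.astype(np.int32))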
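
- As a small worked check of the zero-indexing step, the sketch below applies the same subtraction to the first two tone intervals of the first session shown above. The overflow guard is an addition for illustration and is not part of the original cell.

```python
import numpy as np

# Values copied from the first row of the timestamp table above
first_timestamp = 1981794
tone_timestamps = np.array([[1981794, 2175214], [3375627, 3575632]])

# Guard against silent overflow before narrowing to int32 (illustrative addition)
assert tone_timestamps.max() <= np.iinfo(np.int32).max

zero_indexed = tone_timestamps.astype(np.int32) - np.int32(first_timestamp)
print(zero_indexed)
# [[      0  193420]
#  [1393833 1593838]]
```

A signed dtype matters here: a video timestamp that falls just before `first_timestamp` (1981792 against 1981794 in this session) becomes -2 after the shift instead of wrapping around.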

In [85]:
# Order columns by the last word of their name ("dir", "frames", ..., "timestamps")
sorted_columns = sorted(trodes_final_df.columns, key=lambda x: x.split("_")[-1])
trodes_final_df = trodes_final_df[sorted_columns].copy()

## Saving to a file

In [86]:
trodes_final_df.to_pickle(os.path.join(OUTPUT_DIR, "{}_00_trodes_metadata.pkl".format(OUTPUT_PREFIX)))
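
- Pickle is used here because the cells hold NumPy arrays. A minimal round-trip sketch, assuming a toy frame and a temporary directory in place of `OUTPUT_DIR` (the file name and values below are hypothetical), shows that the arrays and their dtype come back unchanged with `pd.read_pickle`.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Build an object column explicitly so each cell holds one 2D interval array
cells = np.empty(2, dtype=object)
cells[0] = np.array([[0, 193420], [1393833, 1593838]], dtype=np.int32)
cells[1] = np.array([[0, 373380]], dtype=np.int32)
toy_df = pd.DataFrame({"session_dir": ["s1", "s2"], "tone_timestamps": cells})

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name that mimics the "{prefix}_00_trodes_metadata.pkl" pattern
    pkl_path = os.path.join(tmp_dir, "toy_00_trodes_metadata.pkl")
    toy_df.to_pickle(pkl_path)
    reloaded_df = pd.read_pickle(pkl_path)

restored = reloaded_df["tone_timestamps"].iloc[0]
print(restored.dtype, restored.shape)
```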

In [87]:
trodes_final_df.head()

Unnamed: 0,session_dir,tone_frames,box_1_port_entry_frames,box_2_port_entry_frames,video_name,session_path,recording,current_subject,all_subjects,first_timestamp,last_timestamp,video_timestamps,tone_timestamps,box_1_port_entry_timestamps,box_2_port_entry_timestamps
0,20240227_130241_comp_om_subj_4-2_and_4-3,"[[1, 242], [1737, 1988], [4730, 4978], [7597, ...","[[1, 242], [242, 261], [628, 644], [644, 648],...","[[1, 242], [44043, 44055], [44696, 44699], [44...",20240227_130241_comp_om_subj_4-2_and_4-3.1.vid...,/scratch/back_up/reward_competition_extention/...,20240227_130241_comp_om_subj_4-2_t1b1_merged,4.2,"[4.2, 4.3]",1981794,70466080,"[-2, 1384, 1384, 2770, 2770, 4156, 4156, 5541,...","[[0, 193420], [1393833, 1593838], [3793867, 39...","[[0, 193420], [193820, 208217], [504224, 51562...","[[0, 193420], [35331875, 35342275], [35856482,..."
1,20240227_130241_comp_om_subj_4-2_and_4-3,"[[1, 288], [2079, 2379], [5659, 5957], [9091, ...","[[1, 288], [288, 311], [751, 770], [770, 774],...","[[1, 288], [52653, 52665], [53402, 53405], [53...",20240227_130241_comp_om_subj_4-2_and_4-3.2.vid...,/scratch/back_up/reward_competition_extention/...,20240227_130241_comp_om_subj_4-2_t1b1_merged,4.2,"[4.2, 4.3]",1981794,70466080,"[-2, 1384, 1384, 2770, 2770, 4156, 4156, 5541,...","[[0, 193420], [1393833, 1593838], [3793867, 39...","[[0, 193420], [193820, 208217], [504224, 51562...","[[0, 193420], [35331875, 35342275], [35856482,..."
2,20240227_130241_comp_om_subj_4-2_and_4-3,"[[1, 242], [1737, 1988], [4730, 4978], [7597, ...","[[1, 242], [242, 261], [628, 644], [644, 648],...","[[1, 242], [44043, 44055], [44696, 44699], [44...",20240227_130241_comp_om_subj_4-2_and_4-3.1.vid...,/scratch/back_up/reward_competition_extention/...,20240227_130241_comp_om_subj_4-3_t2b2_merged,4.3,"[4.2, 4.3]",1981794,79962803,"[-2, 1384, 1384, 2770, 2770, 4156, 4156, 5541,...","[[0, 193420], [1393833, 1593838], [3793867, 39...","[[0, 193420], [193820, 208217], [504224, 51562...","[[0, 193420], [35331875, 35342275], [35856482,..."
3,20240227_130241_comp_om_subj_4-2_and_4-3,"[[1, 288], [2079, 2379], [5659, 5957], [9091, ...","[[1, 288], [288, 311], [751, 770], [770, 774],...","[[1, 288], [52653, 52665], [53402, 53405], [53...",20240227_130241_comp_om_subj_4-2_and_4-3.2.vid...,/scratch/back_up/reward_competition_extention/...,20240227_130241_comp_om_subj_4-3_t2b2_merged,4.3,"[4.2, 4.3]",1981794,79962803,"[-2, 1384, 1384, 2770, 2770, 4156, 4156, 5541,...","[[0, 193420], [1393833, 1593838], [3793867, 39...","[[0, 193420], [193820, 208217], [504224, 51562...","[[0, 193420], [35331875, 35342275], [35856482,..."
4,20240228_142038_comp_om_subj_3-1_and_3-3,"[[0, 465], [1961, 2210], [4953, 5202], [7821, ...","[[0, 465], [482, 490], [490, 527], [730, 731],...","[[0, 465], [42981, 43002], [43345, 43397], [43...",20240228_142038_comp_om_subj_3-1_and_3-3.1.vid...,/scratch/back_up/reward_competition_extention/...,20240228_142038_comp_om_subj_3_1_t1b1_merged,3.1,"[3.1, 3.3]",9738506,79291111,"[1365, 1384, 2770, 2770, 4156, 4156, 5541, 692...","[[0, 373380], [1573798, 1773798], [3973828, 41...","[[0, 373380], [387980, 392579], [393182, 42338...","[[0, 373380], [34469434, 34486434], [34761838,..."


In [106]:
trodes_final_df["tone_frames"].iloc[0]

array([[    1,   242],
       [ 1737,  1988],
       [ 4730,  4978],
       [ 7597,  7846],
       [ 9841, 10090],
       [11211, 11462],
       [12459, 12707],
       [14827, 15076],
       [16696, 16945],
       [19314, 19564],
       [20561, 20809],
       [22556, 22805],
       [24177, 24425],
       [25797, 26046],
       [27042, 27293],
       [28787, 29038],
       [31156, 31406],
       [33151, 33400],
       [34896, 35146],
       [37639, 37888],
       [39135, 39384],
       [46614, 46862],
       [48110, 48358],
       [49481, 49730],
       [51102, 51351],
       [52472, 52722],
       [53719, 53968],
       [54966, 55215],
       [57583, 57832],
       [60576, 60824],
       [63568, 63816],
       [66309, 66558],
       [68429, 68678],
       [71419, 71670],
       [72915, 73165],
       [74536, 74786],
       [76032, 76282],
       [77902, 78151],
       [80395, 80644],
       [82764, 83012],
       [84009, 84260]], dtype=int32)

In [88]:
trodes_final_df["session_dir"].unique()

array(['20240227_130241_comp_om_subj_4-2_and_4-3',
       '20240228_142038_comp_om_subj_3-1_and_3-3',
       '20240228_154053_comp_om_subj_4-3_and_4-4',
       '20240229_152936_comp_om_subj_3-3_and_3-4',
       '20240302_131025_comp_om_subj_3-1_and_3-4'], dtype=object)

In [89]:
trodes_final_df["video_name"].unique()

array(['20240227_130241_comp_om_subj_4-2_and_4-3.1.videoTimeStamps.cameraHWSync',
       '20240227_130241_comp_om_subj_4-2_and_4-3.2.videoTimeStamps.cameraHWSync',
       '20240228_142038_comp_om_subj_3-1_and_3-3.1.videoTimeStamps.cameraHWSync',
       '20240228_142038_comp_om_subj_3-1_and_3-3.2.videoTimeStamps.cameraHWSync',
       '20240228_154053_comp_om_subj_4-3_and_4-4.1.videoTimeStamps.cameraHWSync',
       '20240228_154053_comp_om_subj_4-3_and_4-4.2.videoTimeStamps.cameraHWSync',
       '20240229_152936_comp_om_subj_3-3_and_3-4.1.videoTimeStamps.cameraHWSync',
       '20240229_152936_comp_om_subj_3-3_and_3-4.2.videoTimeStamps.cameraHWSync',
       '20240302_131025_comp_om_subj_3-1_and_3-4.1.videoTimeStamps.cameraHWSync',
       '20240302_131025_comp_om_subj_3-1_and_3-4.2.videoTimeStamps.cameraHWSync'],
      dtype=object)

In [100]:

session_to_trodes_data["20240302_131025_comp_om_subj_3-1_and_3-4"]['20240302_131025_comp_om_subj_3-1_and_3-4']['video_timestamps']["1"]

{'clock rate': '20000',
 'camera_name': 'HD USB Camera (\\\\?\\usb#vid_32e4&pid_9230&mi_00#6&bec0719&2&0000#{e5323777-f976-4f5b-9b55-b94699c46e44}\\global)',
 'fields': '<PosTimestamp uint32><HWframeCount uint32><HWTimestamp uint64>',
 'data': array([( 6144944, 0, 0), ( 6144944, 0, 0), ( 6146330, 0, 0), ...,
        (69957308, 0, 0), (69958694, 0, 0), (69960080, 0, 0)],
       dtype=[('PosTimestamp', '<u4'), ('HWframeCount', '<u4'), ('HWTimestamp', '<u8')]),
 'filename': '20240302_131025_comp_om_subj_3-1_and_3-4.1.videoTimeStamps.cameraHWSync'}

In [104]:
np.diff(session_to_trodes_data["20240302_131025_comp_om_subj_3-1_and_3-4"]['20240302_131025_comp_om_subj_3-1_and_3-4']['video_timestamps']["1"]['data']["PosTimestamp"])[:50]

array([   0, 1386, 1386,    0, 1386, 1386, 1386,    0, 1385, 1386, 1386,
          0, 1386, 1386, 1386,    0, 1386, 1385,    0, 1386, 1386, 1386,
          0, 1386, 1386,    0, 1386, 1386, 1385,    0, 1386, 1386,    0,
       1386, 1386, 1386,    0, 1386, 1386, 1385,    0, 1386, 1386, 1386,
          0, 1386, 1386,    0, 1386, 1385], dtype=uint32)

In [105]:
np.diff(session_to_trodes_data["20240302_131025_comp_om_subj_3-1_and_3-4"]['20240302_131025_comp_om_subj_3-1_and_3-4']['video_timestamps']["1"]['data']["PosTimestamp"])[-50:]

array([   0, 1386, 1385, 1386,    0, 1386, 1386,    0, 1386, 1386,    0,
       1386, 1386, 1385,    0, 1386, 1386, 1386,    0, 1386, 1386, 1386,
          0, 1386, 1385,    0, 1386, 1386, 1386,    0, 1386, 1386,    0,
       1386, 1385, 1386,    0, 1386, 1386, 1386,    0, 1386, 1386,    0,
       1386, 1385, 1386,    0, 1386, 1386], dtype=uint32)
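
- A difference of 0 means two consecutive entries carry the same `PosTimestamp`; the other steps are about 1386 ticks. The sketch below rebuilds the first few timestamps from the outputs above (first value plus the printed differences) and converts the typical step to seconds using the `clock rate` of 20000 from the metadata. The variable names are illustrative only.

```python
import numpy as np

CLOCK_RATE_HZ = 20000  # 'clock rate' field from the metadata shown above

# First six PosTimestamp values, reconstructed from the printed data and diffs
pos_timestamps = np.array(
    [6144944, 6144944, 6146330, 6147716, 6147716, 6149102], dtype=np.uint32
)

# Cast to a signed type first: np.diff on unsigned integers wraps around
# instead of going negative if a timestamp ever decreases
ticks = np.diff(pos_timestamps.astype(np.int64))

n_repeated = int((ticks == 0).sum())  # entries that repeat the previous timestamp
frame_interval_s = np.median(ticks[ticks > 0]) / CLOCK_RATE_HZ

print(n_repeated, frame_interval_s)
```

On this toy slice the typical step is 1386 ticks, or roughly 0.0693 s between distinct timestamps.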