# Arduino Week 1 Data Analysis Jupyter Notebook

Author: Jun Ho Lee  
Last Update: 09/23/2019

**This notebook analyzes week 1 paradigm data (Port Habituation / Continuous Cue / Random Forced Choice)** 
- These paradigms log events with a single ':' as the delimiter
- The data does NOT contain counter values
- The notebook does NOT analyze hourly frequencies (resampled data)

____
<a id='Goals'></a>

### General Objective:
Streamline the data analysis workflow by using both **1. PyCharm** (GUI interactivity) and **2. Jupyter** (better dataframe interactivity) 

**Pipeline (Daily):** 
1. Concatenate daily data into a multilevel dataframe (using PyCharm)
2. Read in the multilevel dataframe with `pd.read_csv()` (using Jupyter) 
3. Use custom functions to parse out metrics of interest (using Jupyter)
4. Save the aggregated metrics to a csv file 
5. Plot daily data to gather information 'on the fly'
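
Steps 1 and 2 can be sketched as follows. This is a minimal illustration and not the contents of `import_files.py`: the two toy boxes, their values, and the in-memory buffer are assumptions made for the example. `dtype=str` is also an assumption, added here so that the all-numeric toy columns keep their leading zeros.

```python
import io
import pandas as pd

# Hypothetical per-box data; real files come from the Arduino logs.
box_1 = pd.DataFrame({"event_code": ["0113", "7071"], "timestamp": ["0", "1500"]})
box_2 = pd.DataFrame({"event_code": ["0113", "8071"], "timestamp": ["0", "900"]})

# Step 1: concatenate side by side; `keys` becomes the outer column level.
multi_df = pd.concat([box_1, box_2], axis=1, keys=["1", "2"],
                     names=["Box Number", "Columns"])

# Step 2: write to CSV and read back with a two-row header.
buffer = io.StringIO()
multi_df.to_csv(buffer)
buffer.seek(0)
round_trip = pd.read_csv(buffer, header=[0, 1], index_col=[0], dtype=str)

# Selecting the outer level returns one box's dataframe.
box_1_again = round_trip.loc[:, "1"]
print(box_1_again)
```

Every function in this notebook relies on the same `.loc[:, box_num]` selection to work on one box at a time.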

**Pipeline (After End of Paradigm / Experiment):** 
1. Use custom functions to aggregate daily metric data over a time series (by each subject)
2. Use custom plotting functions to plot data over time


**References: Error Handling and Indexing Docs**
1. [Built-in Exceptions](https://docs.python.org/3/library/exceptions.html)
2. [Errors and Exceptions](https://docs.python.org/3/tutorial/errors.html)
3. [Manually Raising an Error in Python](https://stackoverflow.com/questions/2052390/manually-raising-throwing-an-exception-in-python)
4. [Hierarchical Indexing Documentation](https://pandas.pydata.org/pandas-docs/stable/user_guide/advanced.html)


### Important Note:

> Do **NOT** save or modify the original csv file!  
> Modifying the csv file truncates leading zeros (e.g. in zero-padded codes such as `0113`), which breaks the analysis.
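
The same kind of loss can be reproduced inside pandas, which may help explain why the zeros matter. The sketch below uses a made-up two-row file; it is an illustration of the failure mode and not a description of how the original files get modified.

```python
import io
import pandas as pd

# Hypothetical two-row file with zero-padded event codes.
raw = "event_code,timestamp\n0113,0\n0114,2500\n"

# Default parsing infers integers, so '0113' becomes 113.
as_numbers = pd.read_csv(io.StringIO(raw))

# Reading every column as text keeps the codes exactly as written.
as_text = pd.read_csv(io.StringIO(raw), dtype=str)

print(as_numbers["event_code"].tolist())
print(as_text["event_code"].tolist())
```

Once a code has lost its leading zero, string comparisons such as `event_code == '0113'` no longer match.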


<a id='Table of Contents'></a>
___
* * * * * * * * * 

0. <a href='#Goals'>Objectives and Pipeline</a>

## Table of Contents


1. <a href='#Function List'>List of Functions</a>

2. <a href='#data wrangling'>Initial Data Wrangling</a>

3. <a href='#metric output'>Metric Output</a>

4. <a href='#metric checkpoint'>Save Metric Output to CSV (Checkpoint)</a>



**Appendix:**  
1. <a href='#Event Code'>Event Codes</a> (Run this First!!)

<a id='Function List'></a>
___
    
### 1. List of Functions

*A. <a href='#data extraction'>Data Extraction and Parsing</a>*
1. <a href='#return_header_dict'>return-multi-header-dict</a>
2. <a href='#return_body_df'>return-multi-body-df</a>
3. <a href='#get_start_end_time'>get-start-end-time</a>
4. <a href='#return_multi_dt'>return-multi-dt-df</a> 
5. <a href='#fill_counter_datetime_col'>fill-counter-datetime-col</a> 
6. <a href='#return_multi_parsed_dt'>return-multi-parsed-dt-df</a>
7. <a href='#final wrapper function'>final-m-header-and-parsed-dt-df</a>


*B. <a href='#metric calculation'>Metric Calculations</a>*
1. <a href='#counts_during_window'>counts-during-window</a>
2. <a href='#count_events'>count-events</a>



### *For Function Testing Purposes*

###### Import Basic Libraries

In [4]:
import numpy as np
import pandas as pd
from natsort import natsorted, ns
from datetime import datetime
# import matplotlib.pyplot as plt
# import seaborn as sns

file = "../0923 RFC.csv" 
test_multi_df = pd.read_csv(file, header=[0,1], index_col=[0])


# # To test out the effect of saving on a csv file 
# file1 = "0903 w_delay.csv" 
# test1 = pd.read_csv(file1, header=[0,1], index_col=[0])

In [5]:
# df_test1 = test1['6']
# df_test1.head(10)
# test1.info()

<a id='data extraction'></a>
___
#### A. Data Extraction and Parsing


<a id='return_header_dict'></a>

**1: return_multi_header_dict (multi_df)**

- **:multi_df:** multilevel dataframe that we created from "import_files.py"
- **:return:** `m_head_dict`: nested dictionary of headers $\rightarrow$ {box number(keys): {header info(values)}}
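
The nested structure can be pictured with a small sketch. The header rows below are hypothetical (only the start date and start time match values printed later in this notebook), and `dict(zip(...))` is used here as a compact stand-in for the row-by-row comprehension in the function.

```python
import pandas as pd

# Hypothetical header rows: first column holds the field name,
# second column holds its value (with stray whitespace, as raw logs may have).
ind_head = pd.DataFrame({
    "event_code": ["Start Date", "Start Time", "End Date", "End Time"],
    "timestamp": [" 09/23/2019", " 10-01-18 ", "09/23/2019 ", " 18-30-00"],
})
ind_head["timestamp"] = ind_head["timestamp"].str.strip()

# {first column: second column} for one box
ind_header_dict = dict(zip(ind_head["event_code"], ind_head["timestamp"]))

# Nest under the box number to mirror the shape of m_head_dict
example_head_dict = {"1": ind_header_dict}
print(example_head_dict["1"]["Start Time"])
```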

In [6]:
# # Will only need the initial "MultiLevel Dataframe" to run subsequent codes!!
def return_multi_header_dict(multi_df):
    
    m_head_dict = {}
    box_numbers = multi_df.columns.levels[0]  # Returns a "Frozen List" 
    sorted_box_nums = natsorted(box_numbers) # outputs a list of sorted box numbers

    for i in range(len(sorted_box_nums)):
        box_num = sorted_box_nums[i]
        ind_df = multi_df.loc[:, box_num]  # individual dataframe (box is type 'string')

        ind_df = ind_df.dropna(how='all')

        start_code_idx = ind_df.index[ind_df.event_code == '0113'].tolist()[0]  # the list will only contain ONE element
        end_date_info = ind_df[-2:]  # last two rows will always be end date info

        head = ind_df[:start_code_idx]

        ind_head = pd.concat([head, end_date_info], axis=0)   # header dictionary requires end date/time info so need to concatenate the top and bottom dfs
        ind_head['timestamp'] = ind_head['timestamp'].str.strip()

        # # {first column: second column}
        ind_header_dict = {row[0]: row[1] for row in ind_head.values}  # .values --> transforms into numpy array

        m_head_dict[box_num] = ind_header_dict

    return m_head_dict


In [7]:
test_m_head_dict = return_multi_header_dict(test_multi_df)

<a href='#Function List'>Back to List of Functions</a>

In [8]:
# test_multi_df
# pd.DataFrame.from_dict(test_m_head_dict, orient='index')

<a id='return_body_df'></a>

**2: return_multi_body_df (multi_df, columns):**

- **:multi_df:** multilevel dataframe that we created from "import_files.py"
- **:columns:** list of column names to keep in the body (3 columns without counters, 4 columns with counters)
- **:returns:** `m_body_df`: multilevel dataframe of the BODY portion of data  

*BODY*: from the row after the first IR initialization (9070) up to, but not including, the last two rows of the original dataframe (end date/time)
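
A minimal sketch of that slicing on a single toy box, assuming a hypothetical log with one header row, a start code, the 9070 marker, two body events, and the two trailing end rows:

```python
import pandas as pd

# Hypothetical single-box log: header rows, a marker row, body rows,
# then two trailing rows with the end date and end time.
ind_df = pd.DataFrame({
    "event_code": ["Start Date", "0113", "9070", "7071", "7070", "End Date", "End Time"],
    "timestamp": ["09/23/2019", "0", "10", "1500", "1800", "09/23/2019", "18-30-00"],
})

# Position of the first marker row, then everything after it except the last two rows.
marker_pos = ind_df.index[ind_df["event_code"] == "9070"][0]
body = ind_df.iloc[marker_pos + 1:-2].reset_index(drop=True)

# Timestamps in the body are numeric (milliseconds), so convert them.
body["timestamp"] = pd.to_numeric(body["timestamp"])
print(body)
```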

In [11]:
# Will only need the initial "MultiLevel Dataframe" to run subsequent codes!!

def return_multi_body_df(multi_df, columns):

    result = []; box_arr = []
    box_numbers = multi_df.columns.levels[0]
    sorted_box_nums = natsorted(box_numbers) # outputs a list of sorted box numbers
    
    for i in range(len(sorted_box_nums)):  # for all the boxes, (outermost index is box number)
        box_num = sorted_box_nums[i]
        ind_df = multi_df.loc[:, box_num]  # individual dataframe

        ind_df = ind_df.dropna(how='all')
#         ind_df['event_code'] = ind_df['event_code'].astype('str')  # Changed

        # Extracting ACTUAL BODY
        header_end_idx = ind_df.loc[ind_df[ind_df.columns[0]] == '9070'].index[0]
        body_start_idx = header_end_idx + 1

        body = ind_df[body_start_idx:-2].reset_index(drop=True)
        body.loc[:,'timestamp'] = pd.to_numeric(body['timestamp'])
        body['event_string'] = body['event_code'].map(event_code_dict)

        body = body[columns]  # 4 columns or 3 columns

        box_arr.append(box_num)
        result.append(body)

    m_body_df = pd.concat(result, axis=1, keys=box_arr, names=['Box Number', 'Columns'])

    return m_body_df


<a href='#Event Code'>Event Codes</a> (Run this First!!)

In [12]:
# Use this for Port Habituation + Continuous Cue + RFC  (paradigms that don't have counter columns) - single : 
columns = ['event_string', 'event_code', 'timestamp']   # Including event_string is up to user's choice! 
# columns = ['event_code', 'timestamp']  

# Use this for TIR (paradigms that have counter values)  - double :: 
# columns = ['event_string', 'event_code', 'timestamp', 'counter'] 
# columns = ['event_code', 'timestamp', 'counter'] 

test_m_body_df = return_multi_body_df(test_multi_df, columns)

In [13]:
# test_m_body_df

<a href='#Function List'>Back to List of Functions</a>

<a id='get_start_end_time'></a>

**3. get_start_end_time (m_head_dict):**

- **:m_head_dict:** nested dictionary of headers for all boxes
- **:returns:** `start_end_time_dict`: dictionary of datetime tuples {box_num: (start_time, end_time)}
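
The header stores the date with slashes and the time with hyphens, so the two pieces need a small clean-up before parsing. A short sketch using the start values printed further down in this notebook:

```python
from datetime import datetime

# Values as they appear in the header: date with slashes, time with hyphens.
start_date = "09/23/2019"
start_clock = "10-01-18"

# Join the two pieces, swap hyphens for colons, then parse them together.
combined = (start_date + " " + start_clock).replace("-", ":")
start_dt = datetime.strptime(combined, "%m/%d/%Y %H:%M:%S")
print(start_dt)
```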


In [14]:
def get_start_end_time(m_head_dict):

    start_end_time_dict = {}

    box_numbers = list(m_head_dict)   # keys of the header dictionary --> box numbers
    for i in range(len(box_numbers)):
        box_num = box_numbers[i]
        
        # Start Datetime
        start_datetime = m_head_dict[box_num]['Start Date'] + " " + m_head_dict[box_num]['Start Time']
        start_datetime = start_datetime.replace("-",":")

        # End Datetime
        end_datetime = m_head_dict[box_num]['End Date']  + " " + m_head_dict[box_num]['End Time']
        end_datetime = end_datetime.replace("-",":")

        start_time = datetime.strptime(start_datetime, '%m/%d/%Y %H:%M:%S')
        end_time = datetime.strptime(end_datetime, '%m/%d/%Y %H:%M:%S')

        start_end_time_dict[box_num] = (start_time, end_time)  # saves it as a tuple of datetimes
        # print(start_time, end_time)

    return start_end_time_dict
    

In [15]:
test_start_end_time_dict = get_start_end_time(test_m_head_dict)

<a href='#Function List'>Back to List of Functions</a>

<a id='return_multi_dt'></a>

**4. return_multi_dt_df (m_head_dict, m_body_df, start_end_time_dict):**
- **:m_head_dict:** dictionary for all boxes
- **:m_body_df:** multilevel dataframe of the BODY
- **:start_end_time_dict:** dictionary of start/end datetime tuples, keyed by box number
- **:returns:** `m_dt_df`: multilevel datetime dataframe
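
The core of this step is adding each row's millisecond offset to the session start. A minimal sketch, with hypothetical offsets and an example start time:

```python
from datetime import datetime
import pandas as pd

start_time = datetime(2019, 9, 23, 10, 1, 18)      # session start (example value)
timestamps_ms = pd.Series([0, 1500, 3_600_000])    # hypothetical ms offsets

# Offset each row from the session start to get wall-clock datetimes.
realtime = start_time + pd.to_timedelta(timestamps_ms, unit="ms")

# The .dt accessor exposes datetime parts for the whole column at once.
hours = realtime.dt.hour
days = realtime.dt.day
print(realtime)
```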


In [16]:

def return_multi_dt_df(m_head_dict, m_body_df, start_end_time_dict):

    result = []; box_arr = list(m_body_df.columns.levels[0])
    midx_shape = m_body_df.columns.levshape   # (returns a tuple)
    
    # # # Exception Handling 
    if (len(m_head_dict) != midx_shape[0]):   # This indicates the number of boxes
        raise ValueError('Number of boxes in dictionary and dataframe does not match')

    for i in range(len(box_arr)):  # for all the boxes in box_array
        box_num = box_arr[i]    
        ind_df = m_body_df.loc[:, box_num]  # individual dataframe / box_num --> class 'string'
    
        ind_df = ind_df.dropna(how='all')
        
        start_time = start_end_time_dict[box_num][0]
        end_time = start_end_time_dict[box_num][1]
        
        # # Broadcast new columns 
        ind_df['datetime_realtime'] = start_time + pd.to_timedelta(pd.to_numeric(ind_df['timestamp']), unit='ms')
        ind_df['day'] = ind_df['datetime_realtime'].dt.day
        ind_df['hour'] = ind_df['datetime_realtime'].dt.hour  # using the .dt accessor to access datetime object
    
        # box_arr.append(box_num)
        result.append(ind_df)
    
    m_dt_df = pd.concat(result, axis=1, keys=box_arr, names=['Box Number', 'Columns'])
    
    return m_dt_df
    

In [17]:
test_m_dt_df = return_multi_dt_df(test_m_head_dict, test_m_body_df, test_start_end_time_dict)
# test_m_dt_df

<a href='#Function List'>Back to List of Functions</a>

<a id='fill_counter_datetime_col'></a>

**5. fill_counter_datetime_col (m_dt_df):**
- **:m_dt_df:** multilevel datetime dataframe
- **:returns:** `m_dt_df_impute`: multilevel dataframe after datetime imputation (backfilled + ffilled) 


#### This imputation step is necessary because counter values don't have timestamps (and indexing is impossible without valid timestamps) 
- Uses ffill if the last row in the parsed dataframe is a counter value 
- Uses bfill if the first row in the parsed dataframe is a counter value

(Parsing the dataframe by index was tried first but ran into problems, so back-filling / forward-filling the missing datetime values for the counters turned out to be simpler.) 
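
The effect of the two fills can be seen on a toy column. This is a sketch with made-up datetimes, where the missing entries stand in for counter rows:

```python
import pandas as pd

# Hypothetical column: counter rows (no timestamp) show up as NaT.
realtime = pd.Series(pd.to_datetime([
    None, "2019-09-23 10:01:18", None, "2019-09-23 10:05:00", None,
]))

# Back-fill first, then forward-fill whatever is still missing at the end.
filled = realtime.bfill().ffill()
print(filled)
```

Back-filling alone would leave the last row empty, and forward-filling alone would leave the first row empty, which is why both are needed when counters sit at either end.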


In [18]:

def fill_counter_datetime_col(m_dt_df):
    
    result = []; box_arr = list(m_dt_df.columns.levels[0])
#     midx_shape = m_dt_df.columns.levshape   # (returns a tuple)
    
    for i in range(len(box_arr)):  # for all the boxes in box_array
        box_num = box_arr[i]    
        ind_df = m_dt_df.loc[:, box_num]  # individual dataframe / box_num --> class 'string'
    
        ind_df = ind_df.dropna(how='all')
        
        ind_df_impute = ind_df.copy()

        # # Counter rows have no timestamp, so their datetime is missing.
        # # bfill covers a missing first row (and any gaps in between);
        # # the trailing ffill covers a missing last row.
        # # Chaining both avoids a KeyError when ONLY the last row is missing,
        # # since 'datetime_filled' is always created before it is forward-filled.
        ind_df_impute['datetime_filled'] = ind_df_impute['datetime_realtime'].bfill().ffill()

        result.append(ind_df_impute)
    
    # box_arr from above (before the for loop)
    m_dt_df_imputed = pd.concat(result, axis=1, keys=box_arr, names=['Box Number', 'Columns'])
            
    return m_dt_df_imputed
    

In [19]:
test_m_dt_df_imputed = fill_counter_datetime_col(test_m_dt_df)
# test_m_dt_df_imputed

<a href='#Function List'>Back to List of Functions</a>

<a id='return_multi_parsed_dt'></a>

**6. return_multi_parsed_dt_df (m_head_dict, m_dt_df_imputed, start_parsetime, end_parsetime):**
- **:m_head_dict:** dictionary for all boxes
- **:m_dt_df_imputed:** multilevel datetime dataframe after imputation
- **:start_parsetime:** start of parsetime, as a `'YYYY/MM/DD HH:MM'` string
- **:end_parsetime:** end of parsetime, as a `'YYYY/MM/DD HH:MM'` string
- **:returns:** `m_parsed_dt_df`: multilevel parsed datetime dataframe (parsed by start/end times)
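
The time-window parse is a boolean mask on the filled datetime column. A minimal sketch on hypothetical rows, using the same string format that the function expects:

```python
from datetime import datetime
import pandas as pd

# Hypothetical rows with already-filled datetimes.
ind_df = pd.DataFrame({
    "event_code": ["7071", "8071", "9071"],
    "datetime_filled": pd.to_datetime([
        "2019-09-22 14:59:00", "2019-09-22 15:00:00", "2019-09-23 09:30:00",
    ]),
})

start_dt = datetime.strptime("2019/09/22 15:00", "%Y/%m/%d %H:%M")
end_dt = datetime.strptime("2019/09/23 09:00", "%Y/%m/%d %H:%M")

# Keep rows inside the window; both ends are inclusive.
in_window = (ind_df["datetime_filled"] >= start_dt) & (ind_df["datetime_filled"] <= end_dt)
parsed = ind_df[in_window]
print(parsed)
```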


In [20]:
def return_multi_parsed_dt_df(m_head_dict, m_dt_df_imputed, start_parsetime, end_parsetime):
    
    # # Parse Time Criteria for all files (boxes)
    start_dt = datetime.strptime(start_parsetime, '%Y/%m/%d %H:%M')
    end_dt = datetime.strptime(end_parsetime, '%Y/%m/%d %H:%M')
    
    # # Boilerplate for Multilevel Dataframe
    result = []; box_arr = list(m_dt_df_imputed.columns.levels[0])

    for i in range(len(box_arr)):  
        box_num = box_arr[i]
        ind_df = m_dt_df_imputed.loc[:, box_num]  # individual dataframe
        # No need for conversion to str(box_num) since box_num is already string 

        ind_df = ind_df.dropna(how='all')
        
        # 1. Parse by time
        # # : Problem --> counter values don't have timestamps, thus need to index the dataframe
        # # : Problem solved by imputing datetimes
        p_body = ind_df[(ind_df['datetime_filled'] >= start_dt) & (ind_df['datetime_filled'] <= end_dt)]
        
        result.append(p_body)
        
    m_parsed_dt_df = pd.concat(result, axis=1, keys=box_arr, names=['Box Number', 'Columns'])
        
    return m_parsed_dt_df



In [21]:
print(test_m_head_dict['1']['Start Date'])
print(test_m_head_dict['1']['Start Time'])


09/23/2019
10-01-18


In [22]:

start_parsetime = '2019/09/20 18:00'
end_parsetime = '2019/09/21 06:00'
test_m_parsed_dt_df = return_multi_parsed_dt_df(test_m_head_dict, test_m_dt_df_imputed, start_parsetime, end_parsetime)

# test_m_parsed_dt_df

# test_m_parsed_dt_df['10'].dropna(how='all')

<a href='#Function List'>Back to List of Functions</a>

<a id='final wrapper function'></a>

### Wrapper Function (of the above 6 functions)!

**7. final_m_header_and_parsed_dt_df (file, columns, start_parsetime, end_parsetime):**
- **:file:** name of csv file saved from PyCharm
- **:columns:** list of column names passed on to `return_multi_body_df`
- **:start_parsetime:** start of parsetime
- **:end_parsetime:** end of parsetime
- **:returns:** a tuple *(`m_head_dict, m_parsed_dt_df`)*   
    Note this function returns `m_parsed_dt_df`! (output from last function)


In [23]:
def final_m_header_and_parsed_dt_df(file, columns, start_parsetime, end_parsetime):

    # # Reading in multilevel dataframe 
    multi_df = pd.read_csv(file, header=[0,1], index_col=[0])
    
    m_head_dict = return_multi_header_dict(multi_df)
    m_body_df = return_multi_body_df(multi_df, columns)
    
    # # Dictionary of start/end time tuples
    m_start_end_time_dict = get_start_end_time(m_head_dict)
    
    # # Returns dataframe with imputed datetime 
    m_dt_df = return_multi_dt_df(m_head_dict, m_body_df, m_start_end_time_dict)
    m_dt_df_imputed = fill_counter_datetime_col(m_dt_df)
    
    m_parsed_dt_df = return_multi_parsed_dt_df(m_head_dict, m_dt_df_imputed, start_parsetime, end_parsetime)
    
    return m_head_dict, m_parsed_dt_df


No Test Output for this function

<a href='#Function List'>Back to List of Functions</a>

<a id='metric calculation'></a>
___
#### B. Metric Calculations


<a id='counts_during_window'></a>

### FUNCTION NOT USED

**1: counts_during_window (m_parsed_dt_df, start_parsetime, window_of_interest):**
- **:m_parsed_dt_df:** multilevel dataframe PARSED by inputted time window
- **:start_parsetime:** start of parsetime $\rightarrow$ date for metrics gets extracted from here! 
- **:window_of_interest:** list of window counters event codes
- **:returns:** `m_window_counter_df`: multilevel window counter dataframe with metric code as index

In [24]:

def counts_during_window(m_parsed_dt_df, start_parsetime, window_of_interest):

    result = []; box_arr = list(m_parsed_dt_df.columns.levels[0])
    # No need to check for ValueError (length of boxes in dict and dataframe since it's already been checked once above)

    for i in range(len(box_arr)):  # for all the boxes, (outermost index is box number)
        box_num = box_arr[i]
        ind_df = m_parsed_dt_df.loc[:, box_num]  # individual dataframe
        
        # individual dataframe!! 
        ind_df = ind_df.dropna(how='all')
        
#         ind_df['event_code'] = ind_df['event_code'].astype('str')  # CHANGED to string dtype (for event_code )
        
        window_df = ind_df[ind_df.event_code.isin(window_of_interest)]
        final_df = window_df[['event_code','counter']]
        
        # Pivot the counter dataframe (so that L/M/R becomes columns) 
        # NaN cells are left in place: .sum() below skips them
        # (the previous .replace(np.nan, '') call discarded its result, so it had no effect)
        pivoted = final_df.pivot(index=None, columns='event_code', values='counter')
        
        L_count = pivoted[window_of_interest[0]].sum()
        M_count = pivoted[window_of_interest[1]].sum()
        R_count = pivoted[window_of_interest[2]].sum()
        T_count = L_count + M_count + R_count

        # date 
        m_date = start_parsetime[5:10]  # Use parsetime value to extract dates! (since each file can have data for multiple days )
        # Metric Code: will become the name of index (for dataframe)
        metric_code = m_date + " code: oo" + window_of_interest[0][-2:]  # last two digits of counter_code

        # make it into a dataframe so that L/M/R is in the column!! 
        # (pass L/M/R with double brackets [[]])
        # Shape is (1, 4)
        window_counter_df = pd.DataFrame([[L_count, M_count, R_count, T_count]], index=[metric_code], columns=['Left','Middle','Right','Total'])
        result.append(window_counter_df)
    
    m_window_counter_df = pd.concat(result, axis=1, keys=box_arr, names=['Box Number', 'Columns'])
    
    return m_window_counter_df

In [25]:
# window = ['7529', '8529', '9529']
# # counts_during_window(test_m_parsed_dt_df, start_parsetime, window)


<a href='#Function List'>Back to List of Functions</a>

<a id='count_events'></a>

**2: count_events (m_parsed_dt_df, start_parsetime, code_of_interest):**
- **:m_parsed_dt_df:** multilevel dataframe PARSED by inputted time window
- **:start_parsetime:** start of parsetime $\rightarrow$ date for metrics gets extracted from here! 
- **:code_of_interest:** list of event codes to count
- **:returns:** `m_event_counter_df`: multilevel event counter dataframe with event code as index
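
For a single box, the counting and the row label can be sketched as below. The event list is hypothetical, and `value_counts` with `reindex` is swapped in here as a compact alternative to the per-code filtering used in the function; it is not the function's own method.

```python
import pandas as pd

# Hypothetical parsed events for one box.
ind_df = pd.DataFrame({"event_code": ["7071", "8071", "7071", "9071", "7070", "7071"]})

start_parsetime = "2019/09/22 15:00"
code_of_interest = ["7071", "8071", "9071"]   # left, middle, right

# Count each code; reindex so codes that never occur count as 0.
counts = ind_df["event_code"].value_counts().reindex(code_of_interest, fill_value=0)
left, middle, right = (int(n) for n in counts)
total = left + middle + right

# Row label: month/day from the parse time plus the last three digits of the code.
metric_code = start_parsetime[5:10] + " event_cts: x" + code_of_interest[0][-3:]
row = pd.DataFrame([[left, middle, right, total]], index=[metric_code],
                   columns=["Left", "Middle", "Right", "Total"])
print(row)
```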

In [26]:

def count_events(m_parsed_dt_df, start_parsetime, code_of_interest):
    result = []; box_arr = list(m_parsed_dt_df.columns.levels[0])
    # No need to check for ValueError (length of boxes in dict and dataframe since it's already been checked once above)

    for i in range(len(box_arr)):  # for all the boxes, (outermost index is box number)
        box_num = box_arr[i]
        ind_df = m_parsed_dt_df.loc[:, box_num]  # individual dataframe
        
        # individual dataframe!! 
        ind_df = ind_df.dropna(how='all')
        
        # filtered by the event codes in the array: 
        filtered = ind_df[ind_df.event_code.isin(code_of_interest)]
        
        # # Use (length of the index array at which dataframe evals to true) to count # of event occurrences
        # # This method will be more robust (logic similar to np.where()) and generalizable 
        left = len(filtered[filtered.event_code == code_of_interest[0]].index)
        middle = len(filtered[filtered.event_code == code_of_interest[1]].index)
        right = len(filtered[filtered.event_code == code_of_interest[2]].index)
        total = left + middle + right
        
        # date 
        m_date = start_parsetime[5:10]  # Use parsetime value to extract dates! (since each file can have data for multiple days )
        # Metric Code: will become the name of index (for dataframe)
        metric_code = m_date + " event_cts: x" + code_of_interest[0][-3:]  # last three digits of event_code
        
        event_counter_df = pd.DataFrame([[left, middle, right, total]], index=[metric_code], columns=['Left','Middle','Right', 'Total'])
        result.append(event_counter_df)
            
    m_event_counter_df = pd.concat(result, axis=1, keys=box_arr, names=['Box Number', 'Columns'])
    
    return m_event_counter_df
    

In [27]:
# # Omission Trial Counts
omission_trials = ['7540','8540','9540']
count_events(test_m_parsed_dt_df, start_parsetime, omission_trials)


Box Number,1,1,1,1,2,2,2,2,3,3,...,8,8,9,9,9,9,10,10,10,10
Columns,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,...,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total
09/20 event_cts: x540,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


<a href='#Function List'>Back to List of Functions</a>

<a href='#Table of Contents'>Back to Table of Contents</a>

<a id='data wrangling'></a>
___
### 2. Initial Data Wrangling 

**Import Basic Libraries**

In [28]:
# import os
import numpy as np
import pandas as pd
from datetime import datetime
import matplotlib.pyplot as plt 
import seaborn as sns

<a href='#Event Code'>Event Codes</a> (Run this First!!)

**Run the final Wrapper Function**

<a href='#Table of Contents'>Back to Table of Contents</a>

In [92]:
# Use this for Port Habituation + Continuous Cue + RFC  (paradigms that don't have counter columns) - single : 
columns = ['event_string', 'event_code', 'timestamp']   # Including event_string is up to user's choice! 


file = "../0923 TIR.csv"      # Change Here! 

test_multi_df = pd.read_csv(file, header=[0,1], index_col=[0])
m_head_dict = return_multi_header_dict(test_multi_df)
 

In [93]:
print(m_head_dict['1']['Start Date'])
print(m_head_dict['1']['Start Time'])
print(m_head_dict['1']['End Date'])
print(m_head_dict['1']['End Time'])

09/21/2019
14-40-17
09/23/2019
09-27-40


In [101]:
# # # Arguments 

# Use this for Port Habituation + Continuous Cue + RFC  (paradigms that don't have counter columns) - single : 
columns = ['event_string', 'event_code', 'timestamp']   # Including event_string is up to user's choice! 


file = "../0921-0922 RFC.csv"      # Change Here! 
 
start_parsetime = '2019/09/22 15:00'    # Change Here! 
end_parsetime = '2019/09/23 09:00'      # Change Here! 


# # Final Function 
(m_head_dict, m_parsed_dt_df) = final_m_header_and_parsed_dt_df(file, columns, start_parsetime, end_parsetime)
 


In [95]:
# m_parsed_dt_df

<a href='#final wrapper function'>To Final Wrapper Function</a>

<a id='metric output'></a>
___
### 3. Metric Outputs

In [102]:
# # Total Poke Counts
total_pokes = ['7071','8071','9071']
total_poke_count = count_events(m_parsed_dt_df, start_parsetime, total_pokes)
total_poke_count

Box Number,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,6,6,6,6,7,7,7,7,8,8,8,8,9,9,9,9,10,10,10,10
Columns,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total
09/22 event_cts: x071,198,371,211,780,118,119,97,334,105,119,148,372,393,137,137,667,258,99,112,469,355,423,312,1090,323,312,240,875,204,286,211,701,239,283,212,734,173,264,171,608


In [103]:
# # Reward Counts
reward_trials = ['7271','8271','9271']  # Solenoid Counts
reward_count = count_events(m_parsed_dt_df, start_parsetime, reward_trials)
reward_count

Box Number,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,6,6,6,6,7,7,7,7,8,8,8,8,9,9,9,9,10,10,10,10
Columns,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total
09/22 event_cts: x271,78,87,84,249,78,81,80,239,101,104,97,302,76,68,62,206,84,82,80,246,113,115,117,345,114,105,115,334,84,88,81,253,76,85,81,242,121,137,133,391


In [104]:
# # LED on Counts
led_trials = ['7171','8171','9171']  # LED ON Counts
led_count = count_events(m_parsed_dt_df, start_parsetime, led_trials)
led_count

Box Number,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,6,6,6,6,7,7,7,7,8,8,8,8,9,9,9,9,10,10,10,10
Columns,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total
09/22 event_cts: x171,78,87,84,249,78,80,81,239,100,105,97,302,77,68,61,206,84,82,80,246,113,115,117,345,115,104,115,334,85,88,80,253,76,85,81,242,122,136,133,391


<a href='#Table of Contents'>Back to Table of Contents</a>

<a id='metric checkpoint'></a>
___
### 4. Save Metric Output to CSV (Checkpoint)

#### Concatenated Metric Output

In [105]:
# # Add Metrics / Change the dataframe title as you wish 

pd.set_option('display.max_columns', 500)
rfc = pd.concat([total_poke_count, reward_count])  # stacks one row per metric
rfc

Box Number,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,6,6,6,6,7,7,7,7,8,8,8,8,9,9,9,9,10,10,10,10
Columns,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total,Left,Middle,Right,Total
09/22 event_cts: x071,198,371,211,780,118,119,97,334,105,119,148,372,393,137,137,667,258,99,112,469,355,423,312,1090,323,312,240,875,204,286,211,701,239,283,212,734,173,264,171,608
09/22 event_cts: x271,78,87,84,249,78,81,80,239,101,104,97,302,76,68,62,206,84,82,80,246,113,115,117,345,114,105,115,334,84,88,81,253,76,85,81,242,121,137,133,391


**Save the metrics into the directory - Make sure to check the date in title!!**

In [106]:
rfc.to_csv("Final_Metrics/23hrs/0922_RFC_15-14.csv")

<a href='#Table of Contents'>Back to Table of Contents</a>

<a id='Event Code'></a>
___
### Appendix:

***Event Codes***

In [10]:

event_code_dict = {'7071' :'L_Poke_Valid_IN',  '7171' :'L_led_Valid_ON',  '7271' :'L_sol_Valid_ON',
                   '7070' :'L_Poke_Valid_OUT', '7170' :'L_led_Valid_OFF', '7270' :'L_sol_Valid_OFF',
                   '8071' :'M_Poke_Valid_IN',  '8171' :'M_led_Valid_ON',  '8271' :'M_sol_Valid_ON',
                   '8070' :'M_Poke_Valid_OUT', '8170' :'M_led_Valid_OFF', '8270' :'M_sol_Valid_OFF',
                   '9071' :'R_Poke_Valid_IN',  '9171' :'R_led_Valid_ON',  '9271' :'R_sol_Valid_ON',
                   '9070' :'R_Poke_Valid_OUT', '9170' :'R_led_Valid_OFF', '9270' :'R_sol_Valid_OFF',

                   '7160' :'L_led_Invalid_OFF',
                   '8160' :'M_led_Invalid_OFF',
                   '9160' :'R_led_Invalid_OFF',

                   '7519' :'L_iw',  '7529' :'L_tw',  '7539' :'L_vw', '7559' :'L_delay_w',
                   '8519' :'M_iw',  '8529' :'M_tw',  '8539' :'M_vw', '8559' :'M_delay_w',
                   '9519' :'R_iw',  '9529' :'R_tw',  '9539' :'R_vw', '9559' :'R_delay_w',

                   '7540' :'Left Omission', '8540' :'Middle Omission', '9540' :'Right Omission',

                   '5520' :'Trial_Window_End',
                   '5521' :'Trial_Window_Start',

                   '0114' :'END'}
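
The dictionary is applied with `Series.map`, as in `return_multi_body_df`. The sketch below repeats a small subset of the codes so that it runs on its own, and adds one made-up code (`'1234'`) to show what happens to values that are not in the dictionary.

```python
import pandas as pd

# Small subset of the dictionary above, repeated so the sketch is self-contained.
codes = {"7071": "L_Poke_Valid_IN", "7070": "L_Poke_Valid_OUT", "0114": "END"}

events = pd.Series(["7071", "7070", "1234", "0114"])   # '1234' is a made-up, unmapped code
labels = events.map(codes)

# Unmapped codes come back as NaN, which makes them easy to list.
unmapped = events[labels.isna()].tolist()
print(labels)
print(unmapped)
```

Listing the unmapped codes this way may be a quick check that the dictionary covers every event code in a new paradigm.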




<a href='#data wrangling'>Back to Initial Data Wrangling</a>

<a href='#return_body_df'>Back to return-body-df function</a>

<a href='#Table of Contents'>Back to Table of Contents</a>

#### End of Notebook