# medPC wrangler 
***
medPC wrangler is a set of functions that convert medPC output files into a more usable format for further data analysis, and that pull event counts and latencies occurring within trials or within a set time window following a given event.

it is *imperative* that the files you are using follow the same structure as the example files. 
the number and values of event codes are flexible, provided each recorded value follows this format: event code + time in seconds (e.g., as in the examples, 30000 + 180.908 = 30180.908)
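
As a minimal sketch of this encoding (not necessarily how medpc_wrangler decodes values internally), a stored value can be split back into its event code and timestamp. The helper name `split_code_and_time` and the `code_step` of 10000 are assumptions based on the example codes, and the approach only holds while every timestamp stays below that step.

```python
def split_code_and_time(value, code_step=10000):
    """Split a combined medPC value into (event code, time in seconds).

    Assumes every event code is a multiple of `code_step` and every
    timestamp is smaller than `code_step` (both are assumptions).
    """
    code = int(value // code_step) * code_step
    time = round(value - code, 3)  # the example files store three decimals
    return code, time


print(split_code_and_time(30180.908))   # (30000, 180.908)
print(split_code_and_time(110060.030))  # (110000, 60.03)
```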

the functions treat each file as a single session, identified primarily by the "Experiment" label in each subject's header in the file.

the medpc_wrangler.py script must also be in the same folder as the files you would like to use.

data structures should appear similar to the following example, in which all data of interest is in the B array: 


File: C:\MED-PC IV\DATA\!2023-06-11



Start Date: 06/11/23  
End Date: 06/11/23  
Subject: C6_01  
Experiment: day_12  
Group: L  
Box: 1  
Start Time: 14:58:32  
End Time: 16:00:18  
MSN: TT_auto_left_TTL  
A:      25.000  
D:    7000.000  
F:      68.000  
G:       1.000  
H:       0.000  
I:      58.000  
J:       1.000  
K:       0.000  
L:       0.000  
M:      58.000  
N:       0.000  
O:       0.000  
P:      58.000  
Q:      50.000  
R:       0.000  
S:      25.000  
T:       0.000  
U:       0.000  
V:       0.000  
W:       0.000  
X:  353836.000  
Y:       0.000  
Z:     385.000  
B:  
     0:    30013.710    40013.720    50060.020   110060.030    10069.730  
     5:    60070.030   120070.040    30070.470    40070.480    30099.230  
    10:    40099.240    70130.870   130130.880    20131.100    80140.880  
C:  
     0:     6000.000     6000.000     7500.000     4500.000     7000.000  
     5:     5000.000     6500.000     5500.000  
E:  
     0:        1.000        2.000  
***
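
To make the layout above concrete, here is a rough sketch of how such a file could be read into header fields and lettered arrays. It is an illustration only and not the parser used by medpc_wrangler; the function name `parse_session` and the shortened `sample_text` are hypothetical.

```python
sample_text = """\
Subject: C6_01
Experiment: day_12
Start Time: 14:58:32
A:      25.000
B:
     0:    30013.710    40013.720    50060.020   110060.030    10069.730
     5:    60070.030   120070.040    30070.470    40070.480    30099.230
C:
     0:     6000.000     6000.000     7500.000
"""


def parse_session(text):
    """Return (fields, arrays) from medPC-style text (hypothetical helper)."""
    fields, arrays, current_array = {}, {}, None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if ":" not in line:
            continue  # skip blank lines and anything without a label
        key, _, value = line.partition(":")
        value = value.strip()
        if key.isdigit() and current_array is not None:
            # a row such as "5:  60070.030 ..." belongs to the open array
            arrays[current_array].extend(float(v) for v in value.split())
        elif value == "":
            # a bare label such as "B:" opens a new array
            current_array = key
            arrays[current_array] = []
        else:
            # a "Label: value" line; values are kept as strings
            fields[key] = value
            current_array = None
    return fields, arrays


fields, arrays = parse_session(sample_text)
print(fields["Experiment"], len(arrays["B"]))  # day_12 10
```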

***
to begin using medPC wrangler, first you must import the functions:
***

In [4]:
import medpc_wrangler as med

***
next, set your event codes as they appear in the .mpc code that runs your tasks:
***

In [5]:
csp_press = 10000 
csm_press = 20000
mag_start = 30000 
mag_end = 40000 
csp_start = 50000 
csp_end = 60000 
csm_start = 70000 
csm_end = 80000 
pel = 90000 
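
The example B array also contains codes (such as 110000) that are not named above. As an optional sketch, a lookup dictionary can make it easy to see which codes in your data have a name; the `event_names` dictionary and the "unnamed" label are illustrative choices, not part of medpc_wrangler.

```python
# map each event code to a readable name (same codes as defined above)
event_names = {
    10000: "csp_press",
    20000: "csm_press",
    30000: "mag_start",
    40000: "mag_end",
    50000: "csp_start",
    60000: "csp_end",
    70000: "csm_start",
    80000: "csm_end",
    90000: "pel",
}

# a few codes taken from the example B array
observed_codes = [30000, 40000, 50000, 110000, 10000]

# codes without a name are flagged rather than raising an error
labels = [event_names.get(code, "unnamed") for code in observed_codes]
print(labels)
# ['mag_start', 'mag_end', 'csp_start', 'unnamed', 'csp_press']
```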

***
you will likely want to process multiple files/sessions at once. the following loop runs each file through the necessary functions and combines the results into a single dataframe:
***

In [7]:
import glob
import pandas as pd

files = glob.glob('/Users/ericatownsend/Desktop/medpc_wrangler/medpc/*')
# the * will take every file in the folder. 

list_of_session_dfs = []
for file in files: # this will iterate through each datafile for you
    data_strings = med.mpc_to_strings(file)
    cleaned_data = med.extract_data_from_file(data_strings)
    time_event_df = med.time_event(cleaned_data)
    list_of_session_dfs.append(time_event_df)

final_df = pd.concat(list_of_session_dfs)
final_df['animal_session'] = final_df.session.values + '_' + final_df.subject.values

the dataframe will appear as follows, with subjects, sessions, events, and event times parsed out, plus an animal_session column for grouping if you choose to use the count and latency functions:

In [12]:
final_df.head(5)

Unnamed: 0,subject,session,event,time,animal_session
0,C6_01,day_01,30000,3.2,day_01_C6_01
0,C6_01,day_01,40000,3.21,day_01_C6_01
0,C6_01,day_01,30000,4.6,day_01_C6_01
0,C6_01,day_01,40000,4.61,day_01_C6_01
0,C6_01,day_01,30000,13.8,day_01_C6_01
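
As a small illustration of how the animal_session column can be used for grouping, the sketch below tallies events per session with plain pandas. The `toy_df` data is made up for the example and only mirrors the column names shown above.

```python
import pandas as pd

# toy data with the same columns as final_df (values are invented)
toy_df = pd.DataFrame({
    "subject": ["C6_01"] * 4 + ["C6_02"] * 2,
    "session": ["day_01"] * 6,
    "event": [30000, 40000, 30000, 40000, 30000, 40000],
    "time": [3.2, 3.21, 4.6, 4.61, 1.5, 1.52],
})
toy_df["animal_session"] = toy_df["session"] + "_" + toy_df["subject"]

# number of times each event code occurs in each animal_session
events_per_session = (
    toy_df.groupby(["animal_session", "event"])
    .size()
    .reset_index(name="n_events")
)
print(events_per_session)
```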


***
from this point, you can pull out the counts of an event within trials. you can also pull out the counts of an event within a time window following another event (e.g., 10 seconds after a cue ends).
***

In [17]:
# counting csp presses during a trial (in between event codes for csp start and end)
event_counts_during_trials = med.event_trial_counts(df = final_df, event1 = csp_start, event2 = csp_end, 
                                                    event_of_interest = csp_press)

event_counts_during_trials[100:110] # example output

Unnamed: 0,subject,session,trial,count
100,C6_01,day_12,1,1
101,C6_01,day_12,2,6
102,C6_01,day_12,3,4
103,C6_01,day_12,4,3
104,C6_01,day_12,5,4
105,C6_01,day_12,6,5
106,C6_01,day_12,7,1
107,C6_01,day_12,8,2
108,C6_01,day_12,9,5
109,C6_01,day_12,10,1
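
To show the idea behind counting within trials, here is a simplified sketch for a single session. It is not the implementation of `med.event_trial_counts`; the function name `count_events_in_trials`, the inclusive trial boundaries, and the assumption that start and end codes pair up in order are all illustrative.

```python
import pandas as pd


def count_events_in_trials(session_df, start_code, end_code, target_code):
    """Count target events between each start/end pair (hypothetical helper)."""
    starts = session_df.loc[session_df["event"] == start_code, "time"].tolist()
    ends = session_df.loc[session_df["event"] == end_code, "time"].tolist()
    targets = session_df.loc[session_df["event"] == target_code, "time"]
    rows = []
    # assumes the n-th start code belongs with the n-th end code
    for trial, (t_start, t_end) in enumerate(zip(starts, ends), start=1):
        in_trial = (targets >= t_start) & (targets <= t_end)
        rows.append({"trial": trial, "count": int(in_trial.sum())})
    return pd.DataFrame(rows)


# invented single-session data: two trials, presses only in the first
toy_session = pd.DataFrame({
    "event": [50000, 10000, 10000, 60000, 10000, 50000, 60000],
    "time": [10.0, 12.5, 15.0, 20.0, 25.0, 40.0, 50.0],
})
trial_counts = count_events_in_trials(toy_session, 50000, 60000, 10000)
print(trial_counts)
```

Note that the press at 25.0 s falls between trials, so it is not counted in either one.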


In [23]:
# counting magazine entries within 10 seconds after trial end event codes
event_counts_after_trials = med.event_timed_counts(df = final_df, event1 = csp_end,
                                                    seconds_post_event = 10, 
                                                    event_of_interest = mag_start)

event_counts_after_trials[100:110] # example output

Unnamed: 0,subject,session,instance,count
100,C6_01,day_12,1,1
101,C6_01,day_12,2,1
102,C6_01,day_12,3,1
103,C6_01,day_12,4,2
104,C6_01,day_12,5,1
105,C6_01,day_12,6,1
106,C6_01,day_12,7,2
107,C6_01,day_12,8,1
108,C6_01,day_12,9,2
109,C6_01,day_12,10,2
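
The time-window version can be sketched in the same way for a single session. Again, this is an illustration and not the implementation of `med.event_timed_counts`; the function name `count_events_after` and the choice to include events exactly at the end of the window are assumptions.

```python
import pandas as pd


def count_events_after(session_df, anchor_code, target_code, seconds_post_event):
    """Count target events in a window after each anchor event (hypothetical helper)."""
    anchors = session_df.loc[session_df["event"] == anchor_code, "time"].tolist()
    targets = session_df.loc[session_df["event"] == target_code, "time"]
    rows = []
    for instance, t_anchor in enumerate(anchors, start=1):
        in_window = (targets > t_anchor) & (targets <= t_anchor + seconds_post_event)
        rows.append({"instance": instance, "count": int(in_window.sum())})
    return pd.DataFrame(rows)


# invented single-session data: two cue ends followed by magazine entries
toy_session = pd.DataFrame({
    "event": [60000, 30000, 30000, 30000, 60000, 30000, 30000],
    "time": [20.0, 20.44, 26.0, 31.0, 50.0, 55.0, 70.0],
})
window_counts = count_events_after(toy_session, 60000, 30000, seconds_post_event=10)
print(window_counts)
```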


***
finally, you can pull out the latencies to a given event in a similar fashion.   
*NOTE: if the event did not occur in the trial/time window, the latency value for that trial/instance will be NaN.*
***

In [28]:
# pulling latency to the first csp press during a trial (in between event codes for csp start and end)
event_latency_during_trials = med.event_trial_latency(df = final_df, event1 = csp_start, event2 = csp_end, 
                                                    event_of_interest = csp_press)

event_latency_during_trials[100:110] # example output

Unnamed: 0,subject,session,trial,latency
100,C6_01,day_12,1,9.71
101,C6_01,day_12,2,2.41
102,C6_01,day_12,3,4.02
103,C6_01,day_12,4,3.66
104,C6_01,day_12,5,3.09
105,C6_01,day_12,6,3.95
106,C6_01,day_12,7,7.1
107,C6_01,day_12,8,4.33
108,C6_01,day_12,9,2.2
109,C6_01,day_12,10,8.07
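
The latency calculation, including the NaN case described in the note above, can be sketched for a single session as follows. This is not the implementation of `med.event_trial_latency`; the function name `first_latency_in_trials` and the in-order pairing of start and end codes are assumptions.

```python
import numpy as np
import pandas as pd


def first_latency_in_trials(session_df, start_code, end_code, target_code):
    """Latency from trial start to the first target event (hypothetical helper)."""
    starts = session_df.loc[session_df["event"] == start_code, "time"].tolist()
    ends = session_df.loc[session_df["event"] == end_code, "time"].tolist()
    targets = session_df.loc[session_df["event"] == target_code, "time"]
    rows = []
    for trial, (t_start, t_end) in enumerate(zip(starts, ends), start=1):
        hits = targets[(targets >= t_start) & (targets <= t_end)]
        # NaN marks a trial in which the target event never occurred
        latency = hits.min() - t_start if not hits.empty else np.nan
        rows.append({"trial": trial, "latency": latency})
    return pd.DataFrame(rows)


# invented single-session data: presses in the first trial only
toy_session = pd.DataFrame({
    "event": [50000, 10000, 10000, 60000, 10000, 50000, 60000],
    "time": [10.0, 12.5, 15.0, 20.0, 25.0, 40.0, 50.0],
})
trial_latencies = first_latency_in_trials(toy_session, 50000, 60000, 10000)
print(trial_latencies)
```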


In [29]:
# pulling latency to the first magazine entry within 10 seconds after trial end event codes
event_latency_after_trials = med.event_timed_latency(df = final_df, event1 = csp_end,
                                                    seconds_post_event = 10, 
                                                    event_of_interest = mag_start)

event_latency_after_trials[100:110] # example output

Unnamed: 0,subject,session,instance,latency
100,C6_01,day_12,1,0.44
101,C6_01,day_12,1,0.47
102,C6_01,day_12,1,0.46
103,C6_01,day_12,1,0.45
104,C6_01,day_12,1,0.55
105,C6_01,day_12,1,0.44
106,C6_01,day_12,1,0.56
107,C6_01,day_12,1,0.49
108,C6_01,day_12,1,0.41
109,C6_01,day_12,1,0.42
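
Because latencies are NaN when the event did not occur, it can help to report how many trials or instances were missing alongside any average. The sketch below uses invented data with the same column names as the latency outputs; pandas skips NaN values when computing the mean and the count.

```python
import numpy as np
import pandas as pd

# invented latency data: each subject has one trial without the event
toy_latencies = pd.DataFrame({
    "subject": ["C6_01"] * 3 + ["C6_02"] * 3,
    "session": ["day_12"] * 6,
    "latency": [0.5, np.nan, 1.5, 2.0, 4.0, np.nan],
})

latency_summary = (
    toy_latencies.groupby(["subject", "session"])["latency"]
    .agg(mean_latency="mean", n_with_event="count", n_total="size")
    .reset_index()
)
# "count" ignores NaN and "size" does not, so the difference is the NaN tally
latency_summary["n_missing"] = (
    latency_summary["n_total"] - latency_summary["n_with_event"]
)
print(latency_summary)
```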


***
*any of these dataframes can be saved as a csv file using the pandas dataframe method* df_name.to_csv('filename.csv')
***
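
As a short sketch of saving and reloading, passing `index=False` keeps the row index out of the file; pandas otherwise writes it as an unnamed first column, which is read back as `Unnamed: 0`. The data and file name below are made up, and a temporary folder is used so the example leaves nothing behind.

```python
import os
import tempfile

import pandas as pd

# invented counts with the same columns as the count outputs
toy_counts = pd.DataFrame({
    "subject": ["C6_01", "C6_01"],
    "session": ["day_12", "day_12"],
    "trial": [1, 2],
    "count": [1, 6],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "event_counts.csv")
    toy_counts.to_csv(csv_path, index=False)  # do not write the row index
    reloaded = pd.read_csv(csv_path)

print(reloaded.columns.tolist())
```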