# Create onset and duration files from log data

### Intention:
In this notebook I use the log files from the MR scanner to extract the timing information of our experiment and make it accessible to machine learning algorithms.
The log file contains all the information we need about the timing of the different events shown to the participant (e.g. which emotion category was presented when), but it also contains a lot of information that is of no interest to us. Additionally, it is formatted in a way that is not suitable for the nilearn machine learning tools we will use later in the analysis (e.g. a support vector machine with cross-validation).   
That is why we have to extract the relevant information (e.g. timing and type of trials) and create a file that is far less complex than the original log file and therefore easier to read and interpret.


### Underlying Experiment:
Our paradigm is implemented in PsychoPy3 and is presented to the participants while they are in the scanner.
In the experiment participants see short videos (1.5 s) of human avatars. These avatars change their facial expression throughout the video from neutral into one of several emotional facial expressions. The emotions used for this paradigm are:  

1. happy  
2. disgusted  
3. happily surprised  
4. happily disgusted  
5. angrily surprised  
6. fearfully surprised  
7. sadly fearful  
8. fearfully disgusted  

In one block, participants see 6 videos of different avatars moving into the same emotion. Participants then see the names of 2 emotions and have to select the one that better describes the videos presented beforehand. The experiment consists of 9 runs with 16 blocks each.

### Steps to be made:
1. **Set the starting point** of the experiment to time point 0. Because the scanner runs for a while before the actual experiment starts, the start of the log file is not the start of the experiment. We therefore remove everything that happened before the first trigger (the start of the experiment).
2. We **first extract and then rename the rows of interest** (e.g. the starts of the video presentation blocks, keypresses, etc.).
3. We then subtract the time that passed before the experiment started from every event to **normalize the time information** (e.g. the first block should start 6 seconds after the start of the experiment, but in the log file this event could be at second 200 if the first trigger was at approximately 194 seconds).
4. In compliance with the BIDS standard we have to **add a duration column**, so we add this column to the dataframe and fill it row by row. To do so we subtract the onset of the current event from the onset of the following event.
5. Time in the log files is measured in seconds. Our fMRI data, however, is 4D (the fourth dimension is the scan number), so we need to transform this dataframe into a format suitable for labeling such data, i.e. with the scan number instead of seconds as the time dimension. Since we know the TR, we can calculate how many scans were acquired during the experiment (last time point / TR). Dividing the onset of an event by the TR gives the scan number at which the event starts. We can then check how often the TR fits into the duration of the event to see how many scans it lasted. If two events fall within one scan, we can either leave this scan blank or assign it to the event that lasted longer than half of the TR. Doing this for every event in turn, we **transform the time dimension from seconds to scan numbers**.
6. Now we can **replace the placeholder "start_block" with the emotion category presented in this specific block**. We use a list containing every emotion category as a string, in the order of presentation. The first time we reach a row with event = "start_block" in our dataframe, we replace it with the ith element of the conditions list. As soon as a row occurs that does not have "start_block" as its event (i.e. the block is over), we increase the counter by one, so that the events of the next block are set to the next element of the conditions list.
7. Knowing the emotion category presented in each trial, we can now **fill the columns of the AUs**. Every emotion has an "AU code", i.e. a specific value (absent/present) for each AU. We fill the columns with 0 for trials in which the AU is absent and with 1 for trials in which it is present. NaN is filled in if no video was presented during that specific scan (e.g. in the time between blocks or during breaks).
8. **Adding the columns block and run** is essential if we want to use cross-validation, as we need parameters by which we can split the data. We therefore use a loop that writes the current value of the block counter into the block column in ascending order (start with the first block and jump to the next one as soon as the block is over). When the block counter reaches 16, the run counter increases by 1. Both counters start at 1.
9. We can now export the dataframe as a .csv file to our directory, or continue with the last step to avoid possibly biased results.
10. As the numbers of absent and present trials are not equal for each AU, we need to randomly remove rows of the overrepresented condition (e.g. if there are more trials showing the AU than trials not showing it, we have to remove some of the trials with the AU present). We do this by taking the absolute difference between the numbers of absent and present trials for every AU and then randomly removing that many trials of the condition that has more trials overall (i.e. is overrepresented). We can then save a separate dataframe for each AU, with a balanced number of absent/present trials, to our directory.
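Steps 1, 3, 4 and 5 can be illustrated on a few made-up events. The following is only a minimal sketch: the event times are hypothetical, the column names `start_scan` and `n_scans` are introduced here for illustration, and counting scans from 0 with `floor` is a simplifying assumption, not necessarily the exact rule used in the loop further down.

```python
import numpy as np
import pandas as pd

TR = 1.45  # repetition time in seconds, the value used in this notebook

# Hypothetical log times in seconds (first trigger at 194 s), not real data
events = pd.DataFrame({
    "trial_type": ["start_experiment", "start_block", "question", "start_block"],
    "time": [194.0, 200.0, 212.0, 220.0],
})

# Steps 1 and 3: the first trigger becomes time point 0
events["onset"] = events["time"] - events["time"].iloc[0]

# Step 4: duration = onset of the following event minus own onset
# (the last event has no following event, so its duration is set to 0)
events["duration"] = (events["onset"].shift(-1) - events["onset"]).fillna(0.0)

# Step 5: index of the scan (counted from 0) during which the event starts,
# and the number of whole TRs that fit into its duration
events["start_scan"] = np.floor(events["onset"] / TR).astype(int)
events["n_scans"] = np.floor(events["duration"] / TR).astype(int)

print(events[["trial_type", "onset", "duration", "start_scan", "n_scans"]])
```

With these numbers the first block starts 6 seconds after the first trigger, which corresponds to the example given in step 3.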

In [29]:
%matplotlib inline

## Import modules
Since we will be working with tabular data, we use pandas and numpy; pandas dataframes allow the data to be manipulated in many ways.

In [32]:
import pandas as pd
import numpy as np

#subject_ids = ["02","03","04","05","06","07","08","09","10","11"]
subject_ids = ["11"]
#log_names = ["AU_MVPA_05_2021_Feb_17_0957.log","AU_MVPA_0606_2021_Feb_17_1120.log"]
log_names = ["PyMVPA_test_02_2020_Jun_05_1235.log","test_02_2020_Jun_10_1236.log","Test03_2020_Jun_12_1025.log","AU_MVPA_04_2021_Feb_17_0838.log","AU_MVPA_05_2021_Feb_17_0957.log","AU_MVPA_0606_2021_Feb_17_1120.log"]

## Check out one log file to get an idea of what it looks like

In [26]:
# note: this first path is immediately overwritten by the subject-specific path below
path = '/media/lmn/86A406A0A406933B/Aaron_MA/data_bids/source/Log_Files/AU_MVPA_0606_2021_Feb_17_1120.log'
sid = "07"
path = '/media/lmn/86A406A0A406933B/Aaron_MA/data_bids/source/Log_Files/sub-'+sid+'/sub-'+sid+'_task-video_log.log'
df = pd.read_csv(path, delimiter = '\t')
print(df)

            time      type                                              event
1       506.9135      EXP   Created window1 = Window(allowGUI=False, allow...
2       506.9135      EXP               window1: recordFrameIntervals = False
3       507.0797      EXP                window1: recordFrameIntervals = True
4       507.2631      EXP               window1: recordFrameIntervals = False
...          ...       ...                                                ...
11552  3591.5072     DATA                                         Keypress: t
11553  3592.9569     DATA                                         Keypress: t
11554  3594.4066     DATA                                         Keypress: t
11555  3595.8563     DATA                                         Keypress: t
11556  3596.2223      EXP                        window1: mouseVisible = True

[11557 rows x 3 columns]
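Most of the rows in a log like the one above are irrelevant for the timing file. As a rough sketch of step 2 (extracting and renaming the rows of interest), the same string patterns that the main loop below searches for can be applied as boolean masks. The miniature log and the video file name in it are made up for illustration.

```python
import pandas as pd

# Hypothetical miniature log with the same three columns as the real file
log = pd.DataFrame({
    "time": [506.9, 570.0, 571.45, 576.0, 588.0, 589.0],
    "type": ["EXP", "DATA", "DATA", "EXP", "EXP", "DATA"],
    "event": [
        "window1: recordFrameIntervals = False",
        "Keypress: t",
        "Keypress: t",
        "New trial (rep=0, index=0): OrderedDict([('emocat1', 'happy_01.mp4')])",
        "text: text = 'happy or disgusted'",
        "Keypress: t",
    ],
})

# One boolean mask per kind of row we want to keep
is_trigger = log["event"].eq("Keypress: t")
first_trigger = is_trigger & (log.index == is_trigger.idxmax())
is_block = log["event"].str.startswith(
    "New trial (rep=0, index=0): OrderedDict([('emocat1',")
is_text = log["event"].str.contains(": text = '", regex=False)
keep = first_trigger | is_block | is_text

# Keep only those rows and rename the events of interest
events = log.loc[keep, ["event", "time"]].copy()
events.columns = ["trial_type", "onset"]
events.loc[is_block[keep], "trial_type"] = "start_block"
events.loc[first_trigger[keep], "trial_type"] = "start_experiment"
events = events.reset_index(drop=True)

print(events)
```

Only the first trigger is kept; the later `Keypress: t` rows are scanner pulses and are dropped like every other row that matches none of the masks.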


## Create the list with the order in which emotion categories have been presented:
In order to do this we take every column of Runs_datapaths_VIDEOS.csv (the file containing the order in which the emotion categories were presented) and convert it to a list. Since each block consists of 6 videos of the same category, we only keep every sixth row. We then concatenate the lists, starting with the first column and ending with the last, so that we end up with one list containing the order in which the emotion categories were presented.  
We will later use this list to fill the categories into our dataframe so that we know which AUs were present/absent during specific trials. 
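The idea can be tried out on a small made-up table before running it on the real file. This sketch assumes, as the cell below does, that each column holds one run and that each entry has the form `<category>_<suffix>`; the toy table has only 2 columns and 3 videos per block, and its entries are hypothetical.

```python
import pandas as pd

# Hypothetical miniature of the ordering file: 2 runs (columns),
# 2 blocks per run, 3 videos per block
toy = pd.DataFrame({
    "emocat1": ["happy_01", "happy_02", "happy_03",
                "disgusted_01", "disgusted_02", "disgusted_03"],
    "emocat2": ["sadlyfearful_01", "sadlyfearful_02", "sadlyfearful_03",
                "happy_01", "happy_02", "happy_03"],
})
videos_per_block = 3  # 6 in the real experiment

# Keep one row per block, then read the table column by column (run by run)
# and strip everything after the first underscore
per_block = toy.iloc[::videos_per_block, :]
order = [name.split("_")[0]
         for column in per_block.columns
         for name in per_block[column]]

print(order)  # ['happy', 'disgusted', 'sadlyfearful', 'happy']
```

For the real file the resulting list should have 9 x 16 = 144 entries, one per block.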

In [27]:
path_conditions = '/media/lmn/86A406A0A406933B/Aaron_MA/Runs_datapaths_VIDEOS.csv'
conditions = pd.read_csv(path_conditions, delimiter = ',')

conditions_simple = conditions.iloc[::6, :]
emocat1 = conditions_simple["emocat1"].tolist()
emocat2 = conditions_simple["emocat2"].tolist()
emocat3 = conditions_simple["emocat3"].tolist()
emocat4 = conditions_simple["emocat4"].tolist()
emocat5 = conditions_simple["emocat5"].tolist()
emocat6 = conditions_simple["emocat6"].tolist()
emocat7 = conditions_simple["emocat7"].tolist()
emocat8 = conditions_simple["emocat8"].tolist()
emocat9 = conditions_simple["emocat9"].tolist()
conditions_list = emocat1 + emocat2 + emocat3 + emocat4 + emocat5 + emocat6 + emocat7 + emocat8 + emocat9
#check out the list
#conditions_list
#make it nicer to look at by removing the endings, leaving us with just the emotion categories
conditions_list = [condition.split('_')[0] for condition in conditions_list]
#check it out again
#conditions_list
#9 runs x 16 blocks should give 144 entries
print(len(conditions_list))

144


## This for-loop does all that we need
To get from a log file to an interpretable timing file we need to extract the information about specific time points of the experiment.  
**The steps are roughly:** (for more information see above)
1. **Extract the starting point** of the experiment (the first trigger). We need it to normalize the rest of the data: time point 0 should be the start of the experiment, and every later event is expressed relative to it.
2. Then we **rename the rows of interest** (e.g. start_block when a presentation block starts, keypresses, etc.) and cut out everything else.
3. Transform the dataframe such that the **time dimension is defined by scan numbers**.
4. **Fill in the emotion categories** in the order given by the conditions list created beforehand, and afterwards fill the columns of the AUs with 0 or 1 for absent or present trials, or NaN if no video was presented at that point in time.
5. **Add the columns block and run and fill them accordingly.**
6. **Save the dataframe as .csv to my directory.**
7. **Create a downsampled version of the dataframe for each AU** to balance the number of absent/present trials and save them as .csv to my directory (four files for each subject, one for each AU).
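Steps 4 and 5 boil down to a lookup table and some index arithmetic. The sketch below copies the AU codes from the if-chains in the cell that follows and reproduces the block/run numbering of the block-average table; the names `AU_CODES`, `BLOCKS_PER_RUN` and `block_table` as well as the demo order of categories are made up for illustration and are not part of the original code.

```python
import pandas as pd

AU_NAMES = ["AU1", "AU2", "AU12", "AU20"]

# AU codes (1 = present, 0 = absent) per emotion category,
# in the order AU1, AU2, AU12, AU20
AU_CODES = {
    "angrilysurprised":   (0, 0, 0, 0),
    "disgusted":          (0, 0, 0, 0),
    "fearfullydisgusted": (1, 1, 0, 1),
    "fearfullysurprised": (1, 1, 0, 1),
    "happy":              (0, 0, 1, 0),
    "happilysurprised":   (1, 1, 1, 0),
    "happilydisgusted":   (0, 0, 1, 0),
    "sadlyfearful":       (1, 0, 0, 1),
}
BLOCKS_PER_RUN = 16


def block_table(categories):
    """Return one row per block with AU codes, block, run and run_total."""
    table = pd.DataFrame({"category": categories})
    codes = pd.DataFrame([AU_CODES[c] for c in categories], columns=AU_NAMES)
    table = pd.concat([table, codes], axis=1)
    position = pd.Series(range(len(categories)))
    table["block"] = position % BLOCKS_PER_RUN + 1    # 1..16 within a run
    table["run"] = position // BLOCKS_PER_RUN + 1     # increases every 16 blocks
    table["run_total"] = position + 1                 # running block number
    return table


# Hypothetical presentation order of 18 blocks
demo = block_table(["happy", "sadlyfearful"] * 9)
print(demo.iloc[[0, 1, 15, 16]])
```

Keeping the codes in one dictionary means a change to an AU code has to be made in only one place, whereas the if-chains below repeat the codes for the scan-wise and the block-wise dataframe.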

In [33]:
sid_counter = 0
for sid in subject_ids:
    path = '/media/lmn/86A406A0A406933B/Aaron_MA/data_bids/source/Log_Files/sub-'+sid+'/sub-'+sid+'_task-video_log.log'
    df = pd.read_csv(path, delimiter = '\t')
    #sid = str(subject_ids[sid_counter])
    
    #get starting point: index of the first trigger ('Keypress: t')
    start_exp = 0
    for index,row in df.iterrows():
        if (row['event'] == 'Keypress: t') and start_exp == 0:
            start_exp = index
            print(sid)
            print('Exp start at index = ' + str(start_exp))
            print(df.iloc[start_exp].time)
        
        
    #get stimulus onsets (and type)
    sotdf = pd.DataFrame(columns=['trial_type','onset'])
    start_exp = False
    counter = 0
    for index, row in df.iterrows():
        # get block init point
        if(row['event'].startswith('New trial (rep=0, index=0): OrderedDict([(\'emocat1\',')):
            sotdf.loc[counter]=['start_block',df.iloc[index].time]
            counter = counter+1
        #get first trigger event
        if (row['event'] == 'Keypress: t') and start_exp == False:
            start_exp = True
            sotdf.loc[counter]=['start_experiment',df.iloc[index].time]
            counter = counter+1
        # get movie
        #if(': movie = \'' in row['event']):
         #   sotdf.loc[counter]=[df.iloc[index].event,df.iloc[index].time]
            #counter = counter+1
        # response
        #if(row['event'].startswith('Keypress:') and row['event'] != 'Keypress: t'):
            #sotdf.loc[counter]=[df.iloc[index].event,df.iloc[index].time]
            #counter = counter+1

        if(': text = \'' in row['event']):
            sotdf.loc[counter]=[df.iloc[index].event,df.iloc[index].time]
            counter = counter+1
    
    #normalize all onsets to the first extracted event (the first trigger), so the experiment starts at time 0
    starting_point = sotdf.iloc[0].onset
    sotdf['onset'] = sotdf['onset'].astype(float) - starting_point
        
    #add a column for duration to match the BIDS standard
    #the duration of an event is the onset of the following event minus its own onset;
    #the last event has no following event, so its duration is set to 0
    df_length = len(sotdf.index)
    sotdf['duration'] = (sotdf['onset'].shift(-1) - sotdf['onset']).fillna(0.0)

    print(sotdf)
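    #BIDS events files are tab-separated and start with the columns onset and duration
    sotdf[['onset','duration','trial_type']].to_csv('/media/lmn/86A406A0A406933B/Aaron_MA/data_bids/data_bids_all/sub-'+sid+'/func/sub-'+sid+'_task-video_events.tsv', sep='\t', index=False)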
    #we have to transform the time dimension into scans instead of seconds
    #take the onset of the last event (seconds passed until that event) and divide it by the TR of the scanner
    #to get the number of scans that were acquired throughout the experiment
    tr = 1.45
    scan_nr = sotdf.iloc[len(sotdf)-1].onset / tr
    
    scans = range(1,int(round(scan_nr)))
    scan_nr_list = list(scans)
    scan_df = pd.DataFrame(columns=["scan_number","event","AU1","AU2","AU12","AU20","block","run","run_total","subject_id"])
    scan_df['scan_number'] = scan_nr_list
    
    scan_df_av = pd.DataFrame(columns=["category","AU1","AU2","AU12","AU20","block","run","run_total","subject_id"])
    scan_df_av['category'] = conditions_list
    #fill all the rows of one subject with its subject_id in order to merge data afterwards and single out subjects
    #this is important for leave-one-subject-out cross-validation
    scan_df['subject_id'] = sid
    scan_df_av['subject_id'] = sid
    
    i=0
    tr = 1.45
    for index,row in scan_df.iterrows():
        time = scan_df.iloc[index].scan_number * tr
        #if sotdf.iloc[i].onset == 0 and sotdf.iloc[i].duration > time:
         #   scan_df.loc[index, 'event'] = sotdf.iloc[i].trial_type

        #if sotdf.iloc[i].onset == 0 and (time-tr) < sotdf.iloc[i+1].onset and time > sotdf.iloc[i+1].onset:
         #   i=i+1

        if (sotdf.iloc[i].onset + sotdf.iloc[i].duration) >= time :
            scan_df.loc[index,'event'] = sotdf.iloc[i].trial_type
        else:
            #if the next line is uncommented, scans that span two events are labeled with the following event;
            #left commented out, we get a clean list that only contains events that lasted for the whole scan period
            #scan_df.loc[index,'event'] = sotdf.iloc[i+1].trial_type
            i=i+1
            # TODO: add a timer here as in the first if condition

        #if sotdf.iloc[i].onset < time and tr < sotdf.iloc[i].duration and  :
            #i = i+1
            #scan_df.loc[index,'event'] = sotdf.iloc[i].trial_type
            
    
    
    # i is the counter to iterate through the list of conditions 
    i = 0
    # we therefore iterate through our dataframe and replace the "start block" 
    # with the respective emotion category from the conditions list as soon
    # as we have a row that does not contain a start block we go to the next category
    for index, row in scan_df.iterrows():
        if(index<len(scan_df)-1):

            if scan_df.iloc[index].event == "start_block" and scan_df.iloc[index+1].event =="start_block" :
                scan_df.at[index,'event'] = conditions_list[i]
                #scan_df.loc[index].event = conditions_list[i]

            if scan_df.iloc[index+1].event != "start_block" and scan_df.iloc[index].event == "start_block":
                scan_df.at[index,'event'] = conditions_list[i]
                #scan_df.loc[index].event = conditions_list[i]
                i = i+1
                
    
    
    #loop over the dataframe with the scan numbers and events and fill in
    #the respective value for each AU depending on the emotion presented 
    #to the subject: 1 means the AU was present, 0 means the AU was absent
    for index, row in scan_df.iterrows() :

        if scan_df.iloc[index].event == "angrilysurprised" :
            scan_df.at[index,'AU1'] = 0
            scan_df.at[index,'AU2'] = 0
            scan_df.at[index,'AU12'] = 0
            scan_df.at[index,'AU20'] = 0

        if scan_df.iloc[index].event == "disgusted" :
            scan_df.at[index,'AU1'] = 0
            scan_df.at[index,'AU2'] = 0
            scan_df.at[index,'AU12'] = 0
            scan_df.at[index,'AU20'] = 0

        if scan_df.iloc[index].event == "fearfullydisgusted" :
            scan_df.at[index,'AU1'] = 1
            scan_df.at[index,'AU2'] = 1
            scan_df.at[index,'AU12'] = 0
            scan_df.at[index,'AU20'] = 1

        if scan_df.iloc[index].event == "fearfullysurprised" :
            scan_df.at[index,'AU1'] = 1
            scan_df.at[index,'AU2'] = 1
            scan_df.at[index,'AU12'] = 0
            scan_df.at[index,'AU20'] = 1

        if scan_df.iloc[index].event == "happy" :
            scan_df.at[index,'AU1'] = 0
            scan_df.at[index,'AU2'] = 0
            scan_df.at[index,'AU12'] = 1
            scan_df.at[index,'AU20'] = 0

        if scan_df.iloc[index].event == "happilysurprised" :
            scan_df.at[index,'AU1'] = 1
            scan_df.at[index,'AU2'] = 1
            scan_df.at[index,'AU12'] = 1
            scan_df.at[index,'AU20'] = 0

        if scan_df.iloc[index].event == "happilydisgusted" :
            scan_df.at[index,'AU1'] = 0
            scan_df.at[index,'AU2'] = 0
            scan_df.at[index,'AU12'] = 1
            scan_df.at[index,'AU20'] = 0

        if scan_df.iloc[index].event == "sadlyfearful" :
            scan_df.at[index,'AU1'] = 1
            scan_df.at[index,'AU2'] = 0
            scan_df.at[index,'AU12'] = 0
            scan_df.at[index,'AU20'] = 1
            
            
    #do the same for the block-average dataframe (one row per block instead of one row per scan):
    #fill in the respective value for each AU depending on the emotion category of the block,
    #1 means the AU was present, 0 means the AU was absent
    counter = 1
    for index, row in scan_df_av.iterrows() :

        if scan_df_av.iloc[index].category == "angrilysurprised" :
            scan_df_av.at[index,'AU1'] = 0
            scan_df_av.at[index,'AU2'] = 0
            scan_df_av.at[index,'AU12'] = 0
            scan_df_av.at[index,'AU20'] = 0

        if scan_df_av.iloc[index].category == "disgusted" :
            scan_df_av.at[index,'AU1'] = 0
            scan_df_av.at[index,'AU2'] = 0
            scan_df_av.at[index,'AU12'] = 0
            scan_df_av.at[index,'AU20'] = 0

        if scan_df_av.iloc[index].category == "fearfullydisgusted" :
            scan_df_av.at[index,'AU1'] = 1
            scan_df_av.at[index,'AU2'] = 1
            scan_df_av.at[index,'AU12'] = 0
            scan_df_av.at[index,'AU20'] = 1

        if scan_df_av.iloc[index].category == "fearfullysurprised" :
            scan_df_av.at[index,'AU1'] = 1
            scan_df_av.at[index,'AU2'] = 1
            scan_df_av.at[index,'AU12'] = 0
            scan_df_av.at[index,'AU20'] = 1
            
        if scan_df_av.iloc[index].category == "happy" :
            scan_df_av.at[index,'AU1'] = 0
            scan_df_av.at[index,'AU2'] = 0
            scan_df_av.at[index,'AU12'] = 1
            scan_df_av.at[index,'AU20'] = 0
            
        if scan_df_av.iloc[index].category == "happilysurprised" :
            scan_df_av.at[index,'AU1'] = 1
            scan_df_av.at[index,'AU2'] = 1
            scan_df_av.at[index,'AU12'] = 1
            scan_df_av.at[index,'AU20'] = 0

        if scan_df_av.iloc[index].category == "happilydisgusted" :
            scan_df_av.at[index,'AU1'] = 0
            scan_df_av.at[index,'AU2'] = 0
            scan_df_av.at[index,'AU12'] = 1
            scan_df_av.at[index,'AU20'] = 0
            
        if scan_df_av.iloc[index].category == "sadlyfearful" :
            scan_df_av.at[index,'AU1'] = 1
            scan_df_av.at[index,'AU2'] = 0
            scan_df_av.at[index,'AU12'] = 0
            scan_df_av.at[index,'AU20'] = 1
        # we can also fill the other columns in this loop, i.e. run, block and run_total
        scan_df_av.at[index,'run_total'] = index+1
        scan_df_av.at[index, 'run'] = counter
        scan_df_av.at[index, 'block'] = index%16 +1
        
        if index != 0 and (index+1)%16 == 0:
            counter = counter+1
    
    # in order to run the data through a cross-validation, we need to fill
    # the columns block and run of this dataframe
    # i is the counter that iterates through the blocks 
    # j is the counter that iterates through the runs
    # r is the counter for the running block number across all runs (run_total)
    # whenever the AU1 value stops being 0 or 1 (which is exactly the case when a block is over) we increase the block counter
    # whenever the block counter reaches 16 we set it to 0 again and add one to the run counter (one run has 16 blocks)
   
    i = 0
    j = 0
    r=0
    block_counter = range(1,160)
    for index, row in scan_df.iterrows():
        if(index<len(scan_df)-1):

            if (scan_df.iloc[index].AU1 == 0 or scan_df.iloc[index].AU1 == 1) and (scan_df.iloc[index+1].AU1 == 0 or scan_df.iloc[index+1].AU1 == 1):
                scan_df.at[index,'block'] = block_counter[i]
                scan_df.at[index,'run'] = block_counter[j]
                scan_df.at[index,'run_total'] = block_counter[r]
                #scan_df.loc[index].event = conditions_list[i]

            if scan_df.iloc[index+1].AU1 != 0  and scan_df.iloc[index].AU1 == 0 :
                scan_df.at[index,'block'] = block_counter[i]
                scan_df.at[index,'run'] = block_counter[j]
                scan_df.at[index,'run_total'] = block_counter[r]
                #scan_df.loc[index].event = conditions_list[i]
                i = i+1
                r = r+1

            if scan_df.iloc[index+1].AU1 != 1  and scan_df.iloc[index].AU1 == 1:
                scan_df.at[index,'block'] = block_counter[i]
                scan_df.at[index,'run'] = block_counter[j]
                scan_df.at[index, 'run_total'] = block_counter[r]
                #scan_df.loc[index].event = conditions_list[i]
                i = i+1
                r = r+1

            if i == 16:
                i = 0
                j = j+1
        
    scan_df.to_csv('/media/lmn/86A406A0A406933B/Aaron_MA/data_bids/derivatives/timing_data/sub-'+sid+'/sub-'+sid+'_task-video_events_scan.csv')
    scan_df_av.to_csv('/media/lmn/86A406A0A406933B/Aaron_MA/data_bids/derivatives/timing_data/sub-'+sid+'/sub-'+sid+'_task-video_events_BlockAverage.csv')
    #now we have to downsample the data to get reliable unbiased results
    
    #get number of occurences of every single AU (present)
    nrp_AU1 = scan_df.loc[scan_df.AU1 == 1 , 'AU1'].count()
    nrp_AU2 = scan_df.loc[scan_df.AU2 == 1 , 'AU2'].count()
    nrp_AU12 = scan_df.loc[scan_df.AU12 == 1 , 'AU12'].count()
    nrp_AU20 = scan_df.loc[scan_df.AU20 == 1 , 'AU20'].count()

    #get number of occurences of every single AU (absent)
    nra_AU1 = scan_df.loc[scan_df.AU1 == 0 , 'AU1'].count()
    nra_AU2 = scan_df.loc[scan_df.AU2 == 0 , 'AU2'].count()
    nra_AU12 = scan_df.loc[scan_df.AU12 == 0 , 'AU12'].count()
    nra_AU20 = scan_df.loc[scan_df.AU20 == 0 , 'AU20'].count()
    
    #get value of difference between present and absent trials
    diff_AU1 = abs(nrp_AU1-nra_AU1)
    diff_AU2 = abs(nrp_AU2-nra_AU2)
    diff_AU12 = abs(nrp_AU12-nra_AU12)
    diff_AU20 = abs(nrp_AU20-nra_AU20)
    
    #now I have to randomly delete rows in order to have an equal number
    #of present and absent trials of each AU throughout the entire experiment
    #if an AU is already balanced, the difference is 0 and no row is dropped
    AUs = ["AU1","AU2","AU12","AU20"]

    if nrp_AU1 < nra_AU1:
        scan_df_corrected_AU1 = scan_df.drop(scan_df[scan_df['AU1'].eq(0)].sample(diff_AU1).index)
    else:
        scan_df_corrected_AU1 = scan_df.drop(scan_df[scan_df['AU1'].eq(1)].sample(diff_AU1).index)


    if nrp_AU2 < nra_AU2:
        scan_df_corrected_AU2 = scan_df.drop(scan_df[scan_df['AU2'].eq(0)].sample(diff_AU2).index)
    else:
        scan_df_corrected_AU2 = scan_df.drop(scan_df[scan_df['AU2'].eq(1)].sample(diff_AU2).index)


    if nrp_AU12 < nra_AU12:
        scan_df_corrected_AU12 = scan_df.drop(scan_df[scan_df['AU12'].eq(0)].sample(diff_AU12).index)
    else:
        scan_df_corrected_AU12 = scan_df.drop(scan_df[scan_df['AU12'].eq(1)].sample(diff_AU12).index)


    if nrp_AU20 < nra_AU20:
        scan_df_corrected_AU20 = scan_df.drop(scan_df[scan_df['AU20'].eq(0)].sample(diff_AU20).index)
    else:
        scan_df_corrected_AU20 = scan_df.drop(scan_df[scan_df['AU20'].eq(1)].sample(diff_AU20).index)
        
   
    #create a new csv file for every AU which has a balanced number of present and absent trials
    scan_df_corrected_AU1.to_csv('/media/lmn/86A406A0A406933B/Aaron_MA/data_bids/derivatives/timing_data/sub-'+sid+'/sub-'+sid+'_task-video_events_scan_corrected_AU1.csv')

    scan_df_corrected_AU2.to_csv('/media/lmn/86A406A0A406933B/Aaron_MA/data_bids/derivatives/timing_data/sub-'+sid+'/sub-'+sid+'_task-video_events_scan_corrected_AU2.csv')

    scan_df_corrected_AU12.to_csv('/media/lmn/86A406A0A406933B/Aaron_MA/data_bids/derivatives/timing_data/sub-'+sid+'/sub-'+sid+'_task-video_events_scan_corrected_AU12.csv')

    scan_df_corrected_AU20.to_csv('/media/lmn/86A406A0A406933B/Aaron_MA/data_bids/derivatives/timing_data/sub-'+sid+'/sub-'+sid+'_task-video_events_scan_corrected_AU20.csv')
    
    #set the counter for the sid one up to not overwrite the data for sub-02/sub_id[0] several times 
    sid_counter = sid_counter+1


    

11
Exp start at index = 49
64.2521
                                            trial_type      onset  duration
0                                     start_experiment     0.0000    6.0310
1                                          start_block     6.0310   11.9891
2    text: text = 'ärgerlich-überrascht  \n \n  \t ...    18.0201    8.0224
3                                          start_block    26.0425   11.9898
4    text: text = ' glücklich  \n \n    oder   \n \...    38.0323    8.0230
..                                                 ...        ...       ...
284  text_13: text = '   traurig-ängstlich  \n \n  ...  2944.1516    8.0232
285                                        start_block  2952.1748   11.9891
286  text_13: text = 'traurig-ängstlich  \n \n     ...  2964.1639    8.0232
287                                        start_block  2972.1871   11.9890
288  text_13: text = 'glücklich-überrascht  \n \n  ...  2984.1761    0.0000

[289 rows x 3 columns]
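The last step of the loop, balancing present and absent scans for one AU, can be written as a small helper. This is a sketch rather than the code used above: the function name `balance_binary` and its `seed` argument are introduced here, and fixing the seed (which makes the random selection reproducible) is an addition that the loop above does not have.

```python
import pandas as pd


def balance_binary(df, column, seed=0):
    """Randomly drop rows of the over-represented value (0 or 1) in `column`
    until both values occur equally often. Rows with NaN are kept."""
    n_present = int((df[column] == 1).sum())
    n_absent = int((df[column] == 0).sum())
    majority = 1 if n_present > n_absent else 0
    surplus = abs(n_present - n_absent)
    # sample(n=0) returns no rows, so balanced data is returned unchanged
    to_drop = df[df[column] == majority].sample(n=surplus, random_state=seed).index
    return df.drop(to_drop)


# Hypothetical scan labels: 4 scans with AU1 present, 2 absent, 2 without video
demo = pd.DataFrame({"AU1": [1, 1, 1, 1, 0, 0, None, None]})
balanced = balance_binary(demo, "AU1")

print((balanced["AU1"] == 1).sum(), (balanced["AU1"] == 0).sum())
```

Because the rows are dropped per AU, each AU gets its own balanced copy of the dataframe, which is why four separate files are written per subject.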


In [20]:
scan_df_av

Unnamed: 0,category,AU1,AU2,AU12,AU20,block,run,run_total,subject_id
0,angrilysurprised,0,0,0,0,1,1,1,04
1,disgusted,0,0,0,0,2,1,2,04
2,fearfullydisgusted,1,1,0,1,3,1,3,04
3,happy,0,0,1,0,4,1,4,04
4,sadlyfearful,1,0,0,1,5,1,5,04
...,...,...,...,...,...,...,...,...,...
139,happilysurprised,1,1,1,0,5,9,140,04
140,angrilysurprised,0,0,0,0,6,9,141,04
141,fearfullydisgusted,1,1,0,1,7,9,142,04
142,sadlyfearful,1,0,0,1,8,9,143,04
