## Generate event files associated with functional images

*Zizhuang Miao*

*Dec 2023*

This script generates the `*events.tsv` and `*events.json` files associated with each functional run we are going to share. The files are written to a folder with the same structure as the raw imaging data, so that the two folders can later be merged into one for sharing.
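
The file names written throughout this script follow one pattern, `{sub}_{ses}_{task}_acq-mb8_run-{run}_events.tsv` (or `.json`), placed inside `{sub}/{ses}/func`. A minimal sketch of that pattern is given here. The helper name `events_path` and the root folder `events_files` in the usage line are hypothetical and only serve as an illustration.

```python
import os

def events_path(root, sub, ses, task, run, ext='tsv'):
    """Build the path of an events file (hypothetical helper)."""
    filename = f'{sub}_{ses}_{task}_acq-mb8_run-{run}_events.{ext}'
    return os.path.join(root, sub, ses, 'func', filename)

path = events_path('events_files', 'sub-0001', 'ses-02', 'task-narratives', '01')
print(os.path.basename(path))
```

Keeping the pattern in one place makes it harder for the `.tsv` and `.json` names of the same run to diverge.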

In this script, we deal with four tasks separately: narratives, self-referential (shortvideos), align videos, and faces.

### I. Make basic folder structure for BIDS

First, let's create an output folder with the same subject/session structure as the raw imaging data.

In [16]:
import os
import glob
import pandas as pd

outputDir = 'C:\\Users\\f006fkn\\Desktop\\dataPaper\\events_files'
behDataDir = 'C:\\Users\\f006fkn\\Desktop\\dataPaper\\d_beh'

In [15]:
# get a list of subjects with behavior data
folders = glob.glob(os.path.join(behDataDir, 'sub-*'))
subList = [os.path.basename(x) for x in folders]

# make directories for each subject
for sub in subList:
    newfolder = os.path.join(outputDir, sub)
    if os.path.exists(newfolder):
        continue
    os.makedirs(newfolder)
    for session in ['ses-01', 'ses-02', 'ses-03', 'ses-04']:
        os.makedirs(os.path.join(newfolder, session, 'func'))
    print(f'{sub} done.', end=' ')


sub-0001 done. sub-0002 done. sub-0003 done. sub-0004 done. sub-0005 done. sub-0006 done. sub-0007 done. sub-0008 done. sub-0009 done. sub-0010 done. sub-0011 done. sub-0013 done. sub-0014 done. sub-0015 done. sub-0016 done. sub-0017 done. sub-0018 done. sub-0019 done. sub-0020 done. sub-0021 done. sub-0023 done. sub-0024 done. sub-0025 done. sub-0026 done. sub-0028 done. sub-0029 done. sub-0030 done. sub-0031 done. sub-0032 done. sub-0033 done. sub-0034 done. sub-0035 done. sub-0036 done. sub-0037 done. sub-0038 done. sub-0039 done. sub-0040 done. sub-0041 done. sub-0043 done. sub-0044 done. sub-0046 done. sub-0047 done. sub-0050 done. sub-0051 done. sub-0052 done. sub-0053 done. sub-0055 done. sub-0056 done. sub-0057 done. sub-0058 done. sub-0059 done. sub-0060 done. sub-0061 done. sub-0062 done. sub-0063 done. sub-0064 done. sub-0065 done. sub-0066 done. sub-0068 done. sub-0069 done. sub-0070 done. sub-0071 done. sub-0073 done. sub-0074 done. sub-0075 done. sub-0076 done. sub-0077 d
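
As an alternative to the loop above, the sketch below creates the same `sub-*/ses-*/func` layout with `os.makedirs(..., exist_ok=True)`, so rerunning it also fills in session folders that are missing for an existing subject. The function name `make_func_dirs` and the temporary root folder are assumptions for illustration, not part of the original script.

```python
import os
import tempfile

def make_func_dirs(root, subjects, sessions=('ses-01', 'ses-02', 'ses-03', 'ses-04')):
    """Create root/sub/ses/func for every subject and session (hypothetical helper)."""
    for sub in subjects:
        for ses in sessions:
            # exist_ok=True makes the call safe to repeat
            os.makedirs(os.path.join(root, sub, ses, 'func'), exist_ok=True)

demo_root = tempfile.mkdtemp()    # throwaway folder for the demonstration
make_func_dirs(demo_root, ['sub-0001', 'sub-0002'])
make_func_dirs(demo_root, ['sub-0001', 'sub-0002'])    # repeating the call changes nothing
print(sorted(os.listdir(os.path.join(demo_root, 'sub-0001'))))
```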

### II. Create event files for each task

#### a. task-narratives

For the narratives task, the behavioral data are already organized, so I only need to select a subset of them and put them into the events files.

Values in the `trial_type` column to be included in the events files:
+ `presentation_audio`: when the narrative audios were played
+ `presentation_text`: when the narrative texts were presented
+ `rating_feeling`: the entire rating period for feelings
+ `rating_expectation`: the entire rating period for expectations
+ `feeling_mouse_trajectory`: the period when the trackball is moving during the feeling rating
+ `expectation_mouse_trajectory`: the period when the trackball is moving during the expectation rating

Columns of the events files:
+ `onset`, `duration`, `trial_type`
+ `response_x`, `response_y`: for the rating and mouse-trajectory events (`feeling` and `expectation`)
+ `stim_file`: for `presentation_audio` and `presentation_text`
+ `situation`, `context`: for all events
+ `modality`: for all events
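
To illustrate the layout described above, here is a small sketch that builds two events for one made-up trial and writes them as tab-separated text. All numbers and the stimulus file name are invented for illustration. The `na_rep='n/a'` argument follows the BIDS convention of writing missing values as `n/a`.

```python
import io
import pandas as pd

columns = ["onset", "duration", "trial_type", "response_x", "response_y",
           "situation", "context", "modality", "stim_file"]

# made-up values for one trial; each event only fills the columns that apply to it
rows = [
    {"onset": 2.0, "duration": 9.5, "trial_type": "presentation_audio",
     "situation": "Pursuit", "context": "park", "modality": "Audio",
     "stim_file": "task-narratives/example.mp3"},
    {"onset": 11.5, "duration": 2.25, "trial_type": "rating_feeling",
     "response_x": 1200, "response_y": 500,
     "situation": "Pursuit", "context": "park", "modality": "Audio"},
]

# passing `columns` fixes the column order and adds the columns a row left out
events = pd.DataFrame(rows, columns=columns)

buffer = io.StringIO()
events.to_csv(buffer, sep='\t', index=False, na_rep='n/a')
tsv_text = buffer.getvalue()
print(tsv_text)
```

Collecting the rows in a list and building the DataFrame once also avoids growing a DataFrame row by row.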

In [1]:
import os
import glob
import pandas as pd

outputDir = 'C:\\Users\\f006fkn\\Desktop\\dataPaper\\events_files'
behDataDir = 'C:\\Users\\f006fkn\\Desktop\\SpacetopNarrativesStudy\\BehaviorData\\beh02_preproc'

# get a list of subjects with behavior data
folders = glob.glob(os.path.join(outputDir, 'sub-*'))
subList = [os.path.basename(x) for x in folders]

In [5]:
for sub in subList:
    
    for run in ['01', '02', '03', '04']:
        
        dataFile = os.path.join(behDataDir, sub, 'task-narratives', f'{sub}_ses-02_task-narratives_run-{run}_beh-preproc.csv')
        if not os.path.isfile(dataFile):
            # print(f'No file for {sub}_run-{run}')
            continue

        oriData = pd.read_csv(dataFile)
        # collect one dict per event, then build the DataFrame once after the loop
        columns = ["onset", "duration", "trial_type",
                   "response_x", "response_y",
                   "situation", "context", "modality", "stim_file"]
        rows = []
        # runs 01-02 are audio runs, runs 03-04 are text runs
        modality = 'Audio' if run in ['01', '02'] else 'Text'

        t_runStart = oriData.loc[0, 'param_trigger_onset']    # start time of this run; all onsets calibrated by this

        for t in range(len(oriData)):    # each trial
            # fields shared by all events of this trial
            shared = {"situation": oriData.loc[t, 'situation'],
                      "context": oriData.loc[t, 'context'],
                      "modality": modality}

            # stimuli presentation
            stim_file = oriData.loc[t, 'param_stimulus_filename']
            stim_file = 'task-narratives/' + stim_file[:20] + '.mp3'
            rows.append({"onset": oriData.loc[t, 'event02_administer_onset'] - t_runStart,
                         "duration": oriData.loc[t, 'event03_feel_displayonset'] - oriData.loc[t, 'event02_administer_onset'],
                         "trial_type": "presentation_" + oriData.loc[t, 'event02_administer_type'],
                         "stim_file": stim_file, **shared})

            # feeling and expectation ratings: (name in trial_type, abbreviation in column names, display onset column)
            for rating, abbr, onsetCol in [('feeling', 'feeling', 'event03_feel_displayonset'),
                                           ('expectation', 'expec', 'event04_expect_displayonset')]:
                response = {"response_x": oriData.loc[t, f'{abbr}_end_x'],
                            "response_y": oriData.loc[t, f'{abbr}_end_y']}

                # entire rating period
                onset = oriData.loc[t, onsetCol] - t_runStart
                rows.append({"onset": onset, "duration": oriData.loc[t, f'RT_{abbr}'],
                             "trial_type": f'rating_{rating}', **response, **shared})

                # period when the trackball was moving, relative to the rating onset
                rows.append({"onset": onset + oriData.loc[t, f'D_onset_{abbr}'],
                             "duration": oriData.loc[t, f'D_dur_{abbr}'],
                             "trial_type": f'{rating}_mouse_trajectory', **response, **shared})

        newData = pd.DataFrame(rows, columns=columns)
        
        # change precisions
        precision_dic = {'onset': 3, 'duration': 3}
        newData = newData.round(precision_dic)

        # save new events file
        newFilename = os.path.join(outputDir, sub, 'ses-02', 'func', f'{sub}_ses-02_task-narratives_acq-mb8_run-{run}_events.tsv')
        newData.to_csv(newFilename, sep='\t', index=False, na_rep='n/a')    # BIDS writes missing values as n/a

Create `.json` file associated with the events.

In [7]:
import json

for sub in subList:
    
    for run in ['01', '02', '03', '04']:
        
        dataFile = os.path.join(behDataDir, sub, 'task-narratives', f'{sub}_ses-02_task-narratives_run-{run}_beh-preproc.csv')
        if not os.path.isfile(dataFile):
            continue
        
        des_duration = {"Description": "For the two 'rating' trial types, if no response was provided, the duration was set as n/a. If you would like to model the rating period, you can use the maximum time for rating (4 seconds) to replace those missing values."}
        des_trial = {"LongName": "Event category", 
                     "Description": "A categorical variable indicating event types within a trial", 
                     "Levels": {
                         "presentation_audio": "when audios of the narratives were played", 
                         "presentation_text": "when texts of the narratives were presented", 
                         "rating_feeling": "the rating period for feelings, whose duration is response time", 
                         "rating_expectation": "the rating period for expectations, whose duration is response time", 
                         "feeling_mouse_trajectory": "the period when the participant was moving the trackball during the feeling rating", 
                         "expectation_mouse_trajectory": "the period when the participant was moving the trackball during the expectation rating"
                     }}
        des_responseX = {"LongName": "The x coordinate of the rating", 
                         "Description": "The screen coordinate system uses the upper left corner as the origin. Left-right is along the x axis, and up-down is along the y axis. All x and y values on the screen are positive. The rating scale was in a rectangle whose two upper corners are (478, 300) (pole of 'Bad') and (1442, 300) (pole of 'Good'), and two lower corners are (478, 780) and (1442, 780). In each trial, the rating dot started at (960, 707). Then the participant moved the dot to the place they thought was appropriate before clicking to lock the rating. The x value of the position of the click was recorded in this column."}
        des_responseY = {"LongName": "The y coordinate of the rating", 
                         "Description": "This is the y value of the position of the click during ratings. Note: if no rating was provided by clicking, the value, as well as response_x, was imputed by finding the last time the trackball moved."}
        des_situation = {"LongName": "The situation of the narrative in this trial", 
                         "Description": "There were a total of 32 situations used in the study, sampled from Polti's 36 dramatic situations. Examples include 'Pursuit', 'Adultery', and 'Conflict with a god'."}
        des_context = {"LongName": "The context where stories happened in this trial", 
                       "Description": "Contexts in the current study mean the general place of the stories. There were 8 places used: park, hospital, prison, town, city, beach, forest, swamp."}
        des_modality = {"LongName": "The modality where narratives were presented in this trial",
                        "Levels": {"Audio": "the narrative was played as audio",
                                   "Text": "the narrative was presented as text"}}
        des_stimFile = {"LongName": "The name of the stimulus file in this trial"}

        dataToWrite = {"duration": des_duration, "trial_type": des_trial, "response_x": des_responseX, "response_y": des_responseY,
                       "situation": des_situation, "context": des_context, "modality": des_modality,
                       "stim_file": des_stimFile}        
        newFilename = os.path.join(outputDir, sub, 'ses-02', 'func', f'{sub}_ses-02_task-narratives_acq-mb8_run-{run}_events.json')

        with open(newFilename, 'w') as json_file:
            json.dump(dataToWrite, json_file, indent=4) 
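
Because the sidecar keys are typed by hand, they can drift from the column names in the `.tsv` file. A quick consistency check is sketched below with a hypothetical helper `compare_sidecar` and toy inputs. It lists columns that have no description and sidecar keys that match no column. The `onset` column is ignored on the assumption that it is intentionally left undescribed.

```python
def compare_sidecar(tsv_columns, sidecar, ignore=('onset',)):
    """Return (columns missing from the sidecar, sidecar keys missing from the columns)."""
    cols = [c for c in tsv_columns if c not in ignore]
    undocumented = [c for c in cols if c not in sidecar]
    unused = [k for k in sidecar if k not in tsv_columns]
    return undocumented, unused

# toy example with a deliberate mismatch between 'stim_file' and 'stim_fil'
toy_columns = ['onset', 'duration', 'trial_type', 'stim_file']
toy_sidecar = {'duration': {}, 'trial_type': {}, 'stim_fil': {}}
undocumented, unused = compare_sidecar(toy_columns, toy_sidecar)
print(undocumented, unused)
```

Two empty lists mean that every column except `onset` is described and that no sidecar key is orphaned.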

#### b. task-shortvideos

For the self-referential task, or task-shortvideos, there is no "preprocessed" behavior data. I will work out the specific behavior data for each event.

Event types to include in the `trial_type` column in events files:
+ `cue`: when the cue of the question was shown
+ `video`: when the videos were played
+ `rating`: rating periods
+ `rating_mouse_trajectory`: when the trackball was moving

Columns of the events files:
+ `onset`, `duration`, `trial_type`
+ `response_angle`, `response_label`: for `rating` and `rating_mouse_trajectory`
+ `stim_file`: for `video`
+ `block_condition`: for all events, with three levels: likeability, similarity, and mentalizing (mental attribution)
+ `mentalizing_level`: the specific condition of mentalizing blocks, seven levels
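
The `response_angle` and `response_label` columns are derived from the final position of the rating dot. The sketch below shows one way to do this with `math.atan2`, assuming the scale is a semicircle above its center, with 0 degrees at the leftmost end and 180 degrees at the rightmost end (screen y grows downward). The function names, the center coordinates in the usage lines, and the choice to return `None` for missing or out-of-range angles are assumptions for illustration. The label cutoffs are the ones used in the code cell below.

```python
import math

def rating_angle(x, y, xcenter, ycenter):
    """Angle in degrees of (x, y) around the center; 0 = leftmost, 180 = rightmost."""
    # ycenter - y flips the downward screen y axis; xcenter - x puts 0 degrees on the left
    return math.degrees(math.atan2(ycenter - y, xcenter - x))

# (upper bound in degrees, label)
CUTOFFS = [(3, 'Barely detectable'), (10, 'Weak'), (29, 'Moderate'),
           (64, 'Strong'), (98, 'Very strong'), (180, 'Strongest possible')]

def angle_to_label(angle):
    """Map an angle to its label; return None for NaN or out-of-range angles."""
    if angle == 0:
        return 'No sensation'
    for upper, label in CUTOFFS:
        if 0 < angle <= upper:
            return label
    return None

print(rating_angle(900, 700, 1000, 700))    # leftmost end of the scale
print(rating_angle(1000, 600, 1000, 700))   # top of the scale
print(rating_angle(1100, 700, 1000, 700))   # rightmost end of the scale
print(angle_to_label(rating_angle(1000, 600, 1000, 700)))
```

Using `atan2` handles the left half, the right half, and the vertical case in one expression. Returning `None` for a missing rating means a trial without a response cannot silently inherit the label of the previous trial.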

In [8]:
import os
import glob
import pandas as pd
import numpy as np
import math

outputDir = 'C:\\Users\\f006fkn\\Desktop\\dataPaper\\events_files'
behDataDir = 'C:\\Users\\f006fkn\\Desktop\\dataPaper\\d_beh'

# get a list of subjects with behavior data
folders = glob.glob(os.path.join(outputDir, 'sub-*'))
subList = [os.path.basename(x) for x in folders]
taskname = 'task-shortvideos'
session = 'ses-03'

In [12]:
run = '01'
xcenter = 980
ycenter = 707    # starting point and center point of the rating

for sub in subList: 
    dataFile = os.path.join(behDataDir, sub, taskname, f'{sub}_{session}_{taskname}_beh-preproc.csv')
    if not os.path.isfile(dataFile):
        print(f'No file for {sub}_run-{run}')
        continue
    
    oriData = pd.read_csv(dataFile)
    newData = pd.DataFrame(columns=["onset", "duration", "trial_type", 
                            "response_angle", "response_label",
                            "block_condition", "mentalizing_level", "stim_file"])    # new events to store
    
    t_runStart = oriData.loc[0, 'param_trigger_onset']    # start time of this run; all onsets calibrated by this

    for t in range(len(oriData)):    # each trial
        # cue
        if t%3 == 0:
            onset = oriData.loc[t, 'event01_block_cue_onset'] - t_runStart
            duration = oriData.loc[t, 'event02_video_onset'] - oriData.loc[t, 'event01_block_cue_onset']
            trial_type = 'cue'
            block_condition = oriData.loc[t, 'event01_block_cue_type']
            mentalizing_level = np.nan
            if block_condition == 'mentalizing':
                mentalizing_level = oriData.loc[t, 'event03_rating_type']
            newRow = pd.DataFrame({"onset": onset, "duration": duration, "trial_type": trial_type, \
                            "block_condition": block_condition, "mentalizing_level": mentalizing_level}, index=[0])
            newData = pd.concat([newData, newRow], ignore_index=True)

        # video playing
        onset = oriData.loc[t, 'event02_video_onset'] - t_runStart
        duration = oriData.loc[t, 'event02_video_stop'] - oriData.loc[t, 'event02_video_onset']
        trial_type = 'video'
        stim_file = oriData.loc[t, 'event02_video_filename']
        stim_file = taskname + '/' + stim_file
        newRow = pd.DataFrame({"onset": onset, "duration": duration, "trial_type": trial_type, \
                            "block_condition": block_condition, "mentalizing_level": mentalizing_level, "stim_file": stim_file}, index=[0])
        newData = pd.concat([newData, newRow], ignore_index=True)

        # rating
        onset = oriData.loc[t, 'event03_rating_displayonset'] - t_runStart
        duration = oriData.loc[t, 'event03_rating_RT']
        trial_type = 'rating'
        x = oriData.loc[t, 'rating_end_x']
        y = oriData.loc[t, 'rating_end_y']
        # calculating angles of ratings and corresponding labels
        if x > xcenter:
            response_angle = math.atan((ycenter-y)/(x-xcenter))
            response_angle = math.pi - response_angle
        elif x == xcenter:
            response_angle = math.pi/2
        else:
            response_angle = math.atan((ycenter-y)/(xcenter-x))
        response_angle = 180*response_angle/math.pi
        if response_angle == 0:
            response_label = 'No sensation'
        elif response_angle <= 3:
            response_label = 'Barely detectable'
        elif response_angle <= 10:
            response_label = 'Weak'
        elif response_angle <= 29:
            response_label = 'Moderate'
        elif response_angle <= 64:
            response_label = 'Strong'
        elif response_angle <= 98:
            response_label = 'Very strong'
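        elif response_angle <= 180:
            response_label = 'Strongest possible'
        else:    # missing (NaN) or out-of-range angle: do not reuse the label of the previous trial
            response_label = np.nan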
        newRow = pd.DataFrame({"onset": onset, "duration": duration, "trial_type": trial_type, \
                        "response_angle": response_angle, "response_label": response_label, \
                        "block_condition": block_condition, "mentalizing_level": mentalizing_level}, index=[0])
        newData = pd.concat([newData, newRow], ignore_index=True)
        
        # rating_motion
        onset += oriData.loc[t, 'motion_onset']
        duration = oriData.loc[t, 'motion_dur']
        trial_type = 'rating_mouse_trajectory'
        newRow = pd.DataFrame({"onset": onset, "duration": duration, "trial_type": trial_type, \
                        "response_angle": response_angle, "response_label": response_label, \
                        "block_condition": block_condition, "mentalizing_level": mentalizing_level}, index=[0])
        newData = pd.concat([newData, newRow], ignore_index=True)
    
    # change precisions
    precision_dic = {'onset': 3, 'duration': 3, 'response_angle': 2}
    newData = newData.round(precision_dic)

    # save new events file
    newFilename = os.path.join(outputDir, sub, session, 'func', f'{sub}_{session}_{taskname}_acq-mb8_run-{run}_events.tsv')
    newData.to_csv(newFilename, sep='\t', index=False, na_rep='n/a')    # BIDS writes missing values as n/a

No file for sub-0015_run-01
No file for sub-0020_run-01
No file for sub-0028_run-01
No file for sub-0030_run-01
No file for sub-0047_run-01
No file for sub-0063_run-01
No file for sub-0068_run-01
No file for sub-0071_run-01
No file for sub-0081_run-01
No file for sub-0085_run-01
No file for sub-0097_run-01
No file for sub-0114_run-01
No file for sub-0117_run-01
No file for sub-0118_run-01
No file for sub-0119_run-01
No file for sub-0120_run-01
No file for sub-0123_run-01
No file for sub-0147_run-01


In [13]:
import json

run = '01'

for sub in subList:

    dataFile = os.path.join(behDataDir, sub, taskname, f'{sub}_{session}_{taskname}_beh-preproc.csv')
    if not os.path.isfile(dataFile):
        continue

    des_duration = {"Description": "For the 'rating' events, if no response was provided, the duration was set as n/a. If you would like to model the rating period, you can use the maximum time for rating (4 seconds) to replace those missing values."}
    des_trial = {"LongName": "Event category", 
                    "Description": "A categorical variable indicating event types within a trial", 
                    "Levels": {
                        "cue": "when the cue of question type was displayed at the start of each block", 
                        "video": "when the video was played in each trial", 
                        "rating": "the rating period", 
                        "rating_mouse_trajectory": "the period when the participant was moving the trackball during the rating", 
                    }}
    des_responseAngle = {"LongName": "The angle of the rating", 
                        "Description": "The scale has a semi-circular shape, so a single angle value can represent a rating. Here the leftmost end of the scale was defined as 0 degrees, while the rightmost end is 180 degrees. Thus, all rating angles were within [0, 180]."}
    des_responseLabel = {"LongName": "The label of the rating", 
                        "Description": "According to a set of rules (see main text for details), we assigned each rating a label based on its angle."}
    des_blockCond = {"LongName": "The experimental condition of the current block (of three trials)", 
                     "Description": "The value in each Level below is the specific cue used in each type of blocks.", 
                     "Levels": {
                         "likeability": "Consider how much you like this character", 
                         "similarity": "How similar are you to the character?", 
                         "mentalizing": "Consider what the character is thinking"
                     }}
    des_mentalLev = {"LongName": "The specific condition for mentalizing blocks", 
                      "Description": "Seven levels of mentalizing questions were used in the study. This column indicates which level was used if the block is mentalizing. The values in each level below are the questions used for that level.", 
                      "Levels": {"angry": "Was the character feeling angry?", 
                                 "calm": "Was the character feeling calm?", 
                                 "danger": "Did the character feel in danger?", 
                                 "enjoy": "Was the character enjoying themselves?", 
                                 "heights": "Was the character afraid of heights?", 
                                 "remember": "Was the character remembering something?", 
                                 "tired": "Was the character feeling tired?"}}
    des_stimFile = {"LongName": "The name of the stimulus file in this trial"}

    dataToWrite = {"duration": des_duration, "trial_type": des_trial, "response_angle": des_responseAngle, "response_label": des_responseLabel,
                    "block_condition": des_blockCond, "mentalizing_level": des_mentalLev, "stim_file": des_stimFile}

    newFilename = os.path.join(outputDir, sub, session, 'func', f'{sub}_{session}_{taskname}_acq-mb8_run-{run}_events.json')

    with open(newFilename, 'w') as json_file:
        json.dump(dataToWrite, json_file, indent=4)

#### c. task-alignvideos

For the align videos task, or task-alignvideos, there is no "preprocessed" behavior data. However, only one mouse position was saved per trial (the final rating position and its time), so no preprocessing is needed to generate the event files.

Event types (`trial_type`) to include in the events files:
+ `video`: when the videos were played
+ `rating_relevance`: rating periods for personal relevance
+ `rating_happy`: rating periods for happy
+ `rating_sad`: rating periods for sad
+ `rating_afraid`: rating periods for afraid
+ `rating_disgusted`: rating periods for disgusted
+ `rating_warm`: rating periods for warm and tender
+ `rating_engaged`: rating periods for engaged

Columns of the events files:
+ `onset`, `duration`, `trial_type`
+ `response_value`: for the seven `rating_*` events
+ `stim_file`: for all events

In [15]:
import os
import glob
import pandas as pd
import numpy as np

outputDir = 'C:\\Users\\f006fkn\\Desktop\\dataPaper\\events_files'
behDataDir = 'C:\\Users\\f006fkn\\Desktop\\dataPaper\\d_beh'

# get a list of subjects with behavior data
folders = glob.glob(os.path.join(outputDir, 'sub-*'))
subList = [os.path.basename(x) for x in folders]
taskname = 'task-alignvideos'
sessionDict = {'ses-01': 4, 'ses-02': 4, 'ses-03': 3, 'ses-04': 2}    # different sessions have different numbers of runs

In [19]:
for sub in subList:    
    for session in sessionDict:
        for r in range(sessionDict[session]):
            run = 'run-0' + str(r+1)

            dataFile = os.path.join(behDataDir, sub, taskname, session, f'{sub}_{session}_{taskname}_{run}_beh.csv')
            if not os.path.isfile(dataFile):
                print(f'No file for {sub}_{session}_{run}')
                continue

            oriData = pd.read_csv(dataFile)
            newData = pd.DataFrame(columns=["onset", "duration", "trial_type", 
                                    "response_value", "stim_file"])    # new events to store
            
            t_runStart = oriData.loc[0, 'param_trigger_onset']    # start time of this run; all onsets calibrated by this

            for t in range(len(oriData)):    # each trial
                # stimuli presentation
                onset = oriData.loc[t, 'event01_video_onset'] - t_runStart
                duration = oriData.loc[t, 'event01_video_end'] - oriData.loc[t, 'event01_video_onset']
                trial_type = 'video'
                stim_file = oriData.loc[t, 'param_video_filename']
                stim_file = taskname + '/' + stim_file
                newRow = pd.DataFrame({"onset": onset, "duration": duration, "trial_type": trial_type, \
                                "stim_file": stim_file}, index=[0])
                newData = pd.concat([newData, newRow], ignore_index=True)

                # ratings: columns event02_rating01 ... event02_rating07 map to these seven rating types
                ratingNames = ['relevance', 'happy', 'sad', 'afraid', 'disgusted', 'warm', 'engaged']
                for i, name in enumerate(ratingNames, start=1):
                    prefix = f'event02_rating{i:02d}'    # e.g., event02_rating01
                    onset = oriData.loc[t, f'{prefix}_onset'] - t_runStart
                    duration = oriData.loc[t, f'{prefix}_RT']
                    trial_type = 'rating_' + name
                    response_value = oriData.loc[t, f'{prefix}_rating']
                    newRow = pd.DataFrame({"onset": onset, "duration": duration, "trial_type": trial_type, \
                                    "response_value": response_value, "stim_file": stim_file}, index=[0])
                    newData = pd.concat([newData, newRow], ignore_index=True)
            
            # change precisions
            precision_dic = {'onset': 3, 'duration': 3, 'response_value': 2}
            newData = newData.round(precision_dic)

            # save new events file
            newFilename = os.path.join(outputDir, sub, session, 'func', f'{sub}_{session}_{taskname}_acq-mb8_{run}_events.tsv')
            newData.to_csv(newFilename, sep='\t', index=False, na_rep='n/a')    # BIDS writes missing values as n/a

No file for sub-0001_ses-01_run-01
No file for sub-0001_ses-01_run-02
No file for sub-0001_ses-01_run-03
No file for sub-0001_ses-01_run-04
No file for sub-0009_ses-01_run-04
No file for sub-0015_ses-02_run-01
No file for sub-0015_ses-02_run-02
No file for sub-0015_ses-02_run-03
No file for sub-0015_ses-02_run-04
No file for sub-0015_ses-03_run-01
No file for sub-0015_ses-03_run-02
No file for sub-0015_ses-03_run-03
No file for sub-0015_ses-04_run-01
No file for sub-0015_ses-04_run-02
No file for sub-0016_ses-04_run-01
No file for sub-0016_ses-04_run-02
No file for sub-0017_ses-01_run-04
No file for sub-0017_ses-02_run-01
No file for sub-0021_ses-01_run-03
No file for sub-0024_ses-01_run-04
No file for sub-0028_ses-01_run-02
No file for sub-0028_ses-01_run-03
No file for sub-0028_ses-01_run-04
No file for sub-0028_ses-02_run-01
No file for sub-0028_ses-02_run-02
No file for sub-0028_ses-02_run-03
No file for sub-0028_ses-02_run-04
No file for sub-0028_ses-03_run-01
No file for sub-0028

Create `.json` file associated with the events.

In [20]:
import json

for sub in subList:
    
    for session in sessionDict:
        for r in range(sessionDict[session]):
            run = 'run-0' + str(r+1)

            dataFile = os.path.join(behDataDir, sub, taskname, session, f'{sub}_{session}_{taskname}_{run}_beh.csv')
            if not os.path.isfile(dataFile):
                continue
            
            des_duration = {"Description": "For the seven 'rating_*' trial types, the durations are the response time in each rating; if no response was provided, the duration was set as 0. If you would like to model the rating period, you can use the maximum time for rating (5 seconds) to replace those 0s."}
            des_trial = {"LongName": "Event category", 
                        "Description": "A categorical variable indicating event types within a trial", 
                        "Levels": {
                            "video": "when videos were played",  
                            "rating_relevance": "the rating period for personal relevance", 
                            "rating_happy": "the rating period for happy", 
                            "rating_sad": "the rating period for sad", 
                            "rating_afraid": "the rating period for afraid", 
                            "rating_disgusted": "the rating period for disgusted", 
                            "rating_warm": "the rating period for warm and tender", 
                            "rating_engaged": "the rating period for engaged", 
                        }}
            des_responseValue = {"LongName": "The value of the rating", 
                            "Description": "This value ranges from 0 ('Barely at all') to 100 ('Strongest imaginable'). Note that if the 'duration' of one rating event was 0, the response value would also be 0, but no response was provided; in this case, the response value of 0 should be treated as a missing value."}
            des_stimFile = {"LongName": "The name of the stimulus file in this trial"}

            dataToWrite = {"duration": des_duration, "trial_type": des_trial, "response_value": des_responseValue, "stim_file": des_stimFile}        
            newFilename = os.path.join(outputDir, sub, session, 'func', f'{sub}_{session}_{taskname}_acq-mb8_{run}_events.json')

            with open(newFilename, 'w') as json_file:
                json.dump(dataToWrite, json_file, indent=4) 
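
The sidecar above codes an unanswered rating as a duration of 0 together with a response value of 0. A minimal sketch of how a user of the shared files might recode those rows before analysis is given below. The function name `mark_missing_ratings` and the toy table are illustrative assumptions, not part of the dataset or of this script.

```python
import numpy as np
import pandas as pd

MAX_RATING_SEC = 5.0  # maximum rating time stated in the sidecar description


def mark_missing_ratings(events, fill_duration=MAX_RATING_SEC):
    """Return a copy in which unanswered ratings get a NaN response_value
    and a duration equal to the maximum rating time."""
    out = events.copy()
    is_rating = out["trial_type"].str.startswith("rating_")
    no_response = is_rating & (out["duration"] == 0)
    out.loc[no_response, "response_value"] = np.nan
    out.loc[no_response, "duration"] = fill_duration
    return out


# toy events table with made-up values (hypothetical, for illustration only)
demo = pd.DataFrame({
    "onset": [0.0, 30.0, 35.0],
    "duration": [30.0, 2.1, 0.0],
    "trial_type": ["video", "rating_happy", "rating_sad"],
    "response_value": [np.nan, 62.0, 0.0],
})
cleaned = mark_missing_ratings(demo)
print(cleaned)
```

Whether to fill the duration with 5 seconds or to drop the event entirely depends on the model; the sketch only makes the missing responses explicit.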

#### d. task-faces

For the Faces task (`task-faces`), I generated "preprocessed" behavior files (`*_beh-preproc.csv`) for each run; the events files are built from them.

Event types (`trial_type`) to include in the events files:
+ `face`: when the videos of faces were played
+ `rating`: rating periods (intensity, age, or sex, depending on the run)
+ `rating_mouse_trajectory`: the period when participants were moving the trackball

Columns of the events files:
+ `onset`, `duration`, `trial_type`
+ `response_value`: the converted value within [0, 1]
+ `rating_type`: for all events, intensity, age, or sex
+ `expression`: for all events, eight levels
+ `sex`: for all events, male or female
+ `race`: for all events, African, WC (White Caucasian), or EA (East Asian)
+ `age`: for all events, young or old
+ `stim_file`: for `face` events only
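
The `expression`, `sex`, `race`, and `age` columns are read from the stimulus file name in the conversion loop: the name is split on underscores and the one-character prefix of each field is dropped. A minimal sketch of that parsing is given below. The file name `eHappy_i3_sFemale_rWC_aYoung.mp4` is invented for illustration and the real naming scheme may differ; `parse_face_filename` is a hypothetical helper.

```python
import os


def parse_face_filename(filename):
    """Split a stimulus file name into condition labels.

    Assumes five underscore-separated fields, each starting with a
    one-character prefix, in the order used by the conversion loop:
    field 0 = expression, 2 = sex, 3 = race, 4 = age.
    """
    stem, _extension = os.path.splitext(filename)
    fields = stem.split("_")
    return {
        "expression": fields[0][1:].lower(),
        "sex": fields[2][1:].lower(),
        "race": fields[3][1:],          # kept as-is, e.g. 'WC' or 'EA'
        "age": fields[4][1:].lower(),
    }


# hypothetical file name, not taken from the dataset
labels = parse_face_filename("eHappy_i3_sFemale_rWC_aYoung.mp4")
print(labels)
```

The sketch strips the extension with `os.path.splitext` instead of slicing off the last four characters, so it does not depend on the extension length.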

In [1]:
import os
import glob
import pandas as pd
import numpy as np

outputDir = 'C:\\Users\\f006fkn\\Desktop\\dataPaper\\events_files'
behDataDir = 'C:\\Users\\f006fkn\\Desktop\\dataPaper\\d_beh'

# get a list of subjects (the output folders mirror the subjects with behavior data)
folders = glob.glob(os.path.join(outputDir, 'sub-*'))
subList = [os.path.basename(x) for x in folders]
taskname = 'task-faces'
session = 'ses-02'

In [11]:
for sub in subList: 
    
    if int(sub[-4:])%2 == 0:    # determine rating type 
        runDict = {'run-01': 'age', 'run-02': 'sex', 'run-03': 'intensity'}
    else:
        runDict = {'run-01': 'intensity', 'run-02': 'sex', 'run-03': 'age'}

    for run in runDict:
        dataFile = os.path.join(behDataDir, sub, taskname, f'{sub}_{session}_{taskname}_{run}-{runDict[run]}_beh-preproc.csv')
        if not os.path.isfile(dataFile):
            print(f'No file for {sub}_{run}')
            continue
        
        oriData = pd.read_csv(dataFile)
        columns = ["onset", "duration", "trial_type", 
                   "response_value", "rating_type",
                   "expression", "sex", "race", "age", "stim_file"]
        rows = []    # new events to store, one dict per event
        
        t_runStart = oriData.loc[0, 'param_trigger_onset']    # start time of this run; all onsets calibrated by this

        for t in range(len(oriData)):    # each trial
            # video playing
            onset = oriData.loc[t, 'event02_face_onset'] - t_runStart
            duration = oriData.loc[t, 'event03_rating_displayonset'] - oriData.loc[t, 'event02_face_onset']
            stim_file = oriData.loc[t, 'param_video_filename']
            conditions = stim_file.split('_')    # condition labels are encoded in the file name
            stim_file = taskname + '/' + stim_file
            rating_type = runDict[run]
            expression = conditions[0][1:].lower()
            sex = conditions[2][1:].lower()
            race = conditions[3][1:]
            age = conditions[4][1:-4].lower()    # [1:-4] also drops the last four characters (file extension)
            # columns shared by the three events of this trial
            shared = {"rating_type": rating_type, "expression": expression, "sex": sex, 
                      "race": race, "age": age}
            rows.append({"onset": onset, "duration": duration, "trial_type": 'face', 
                         "stim_file": stim_file, **shared})

            # rating
            onset = oriData.loc[t, 'event03_rating_displayonset'] - t_runStart
            duration = oriData.loc[t, 'event03_rating_RT']
            response_value = oriData.loc[t, 'rating_converted']
            rows.append({"onset": onset, "duration": duration, "trial_type": 'rating', 
                         "response_value": response_value, **shared})
            
            # rating_mouse_trajectory; motion_onset is added to the rating onset
            onset += oriData.loc[t, 'motion_onset']
            duration = oriData.loc[t, 'motion_dur']
            rows.append({"onset": onset, "duration": duration, "trial_type": 'rating_mouse_trajectory', 
                         "response_value": response_value, **shared})

        # build the table once; columns missing from a row are left as NaN
        newData = pd.DataFrame(rows, columns=columns)
        
        # change precisions
        precision_dic = {'onset': 3, 'duration': 3, 'response_value': 4}
        newData = newData.round(precision_dic)

        # check response_value
        if any(newData.response_value > 1) or any(newData.response_value < 0):
            print(f"Please check the rating values of {sub}_{run}!")

        # save new events file; missing values are written as 'n/a', as BIDS requires
        newFilename = os.path.join(outputDir, sub, session, 'func', f'{sub}_{session}_{taskname}_acq-mb8_{run}_events.tsv')
        newData.to_csv(newFilename, sep='\t', index=False, na_rep='n/a')

No file for sub-0001_run-01
No file for sub-0001_run-02
No file for sub-0001_run-03
No file for sub-0003_run-01
No file for sub-0003_run-02
No file for sub-0003_run-03
No file for sub-0004_run-01
No file for sub-0004_run-02
No file for sub-0004_run-03
No file for sub-0005_run-01
No file for sub-0005_run-02
No file for sub-0005_run-03
No file for sub-0008_run-01
No file for sub-0008_run-02
No file for sub-0008_run-03
No file for sub-0015_run-01
No file for sub-0015_run-02
No file for sub-0015_run-03
No file for sub-0028_run-01
No file for sub-0028_run-02
No file for sub-0028_run-03
No file for sub-0029_run-03
No file for sub-0030_run-01
No file for sub-0030_run-02
No file for sub-0030_run-03
No file for sub-0047_run-01
No file for sub-0047_run-02
No file for sub-0047_run-03
No file for sub-0063_run-01
No file for sub-0063_run-02
No file for sub-0063_run-03
No file for sub-0068_run-01
No file for sub-0068_run-02
No file for sub-0068_run-03
No file for sub-0071_run-01
No file for sub-0071
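
BIDS tabular files mark missing values with `n/a`, whereas pandas writes an empty field for NaN unless told otherwise. The sketch below shows the round trip on two made-up rows shaped like the faces events; the values and the file name `task-faces/example.mp4` are hypothetical.

```python
import io

import numpy as np
import pandas as pd

# made-up rows: a face event without a rating, and a rating without a response
rows = [
    {"onset": 1.0, "duration": 4.0, "trial_type": "face",
     "stim_file": "task-faces/example.mp4"},
    {"onset": 5.0, "duration": np.nan, "trial_type": "rating",
     "response_value": np.nan},
]
events = pd.DataFrame(
    rows, columns=["onset", "duration", "trial_type", "response_value", "stim_file"]
)

# write to an in-memory buffer instead of a file, with NaN rendered as 'n/a'
buffer = io.StringIO()
events.to_csv(buffer, sep="\t", index=False, na_rep="n/a")
tsv_text = buffer.getvalue()
print(tsv_text)

# reading back: 'n/a' is parsed as a missing value again
reloaded = pd.read_csv(io.StringIO(tsv_text), sep="\t", na_values=["n/a"])
```

Passing `na_values=["n/a"]` is redundant with the pandas defaults but makes the intent explicit to a reader.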

Create the `.json` sidecar file associated with each events file.

In [12]:
import json

for sub in subList:
    
    if int(sub[-4:])%2 == 0:    # determine rating type 
        runDict = {'run-01': 'age', 'run-02': 'sex', 'run-03': 'intensity'}
    else:
        runDict = {'run-01': 'intensity', 'run-02': 'sex', 'run-03': 'age'}

    for run in runDict:
        dataFile = os.path.join(behDataDir, sub, taskname, f'{sub}_{session}_{taskname}_{run}-{runDict[run]}_beh-preproc.csv')
        if not os.path.isfile(dataFile):
            continue
        
        des_duration = {"Description": "For the 'rating' trial type, a duration of n/a indicates that the participant provided no response. If you would like to model the rating period, you can replace those missing values with the maximum time for rating (1.875 seconds) or with the duration of the mouse trajectory (see below)."}
        des_trial = {"LongName": "Event category", 
                     "Description": "A categorical variable indicating event types within a trial", 
                     "Levels": {
                         "face": "when videos of faces were played", 
                         "rating": "the rating period", 
                         "rating_mouse_trajectory": "the period when the participant was moving the trackball during the rating. The onset of this event is when the participant began to move the trackball, and the duration is the length of time the trackball was being moved."
                     }}
        des_responseValue = {"LongName": "The value of the rating", 
                         "Description": "The ratings were made on a linear scale (see details in the paper). All rating values were then converted into the [0, 1] range via a simple linear scaling (0 and 1 correspond to the left and right ends of the scale, respectively)."}        
        des_ratingType = {"LongName": "The type of ratings in this trial", 
                         "Description": "There were three types of ratings (about the faces) used in this experiment: sex (male to female), age (young to old), and intensity of facial expressions (neutral to strongest imaginable)."}
        
        des_expression = {"LongName": "The facial expressions of the face presented in this trial", 
                       "Description": "Eight kinds of facial expressions were used: happy, pleasure, sad, anger, fear, surprise, disgust, pain"}
        des_age = {"LongName": "The age of the face presented in this trial", 
                       "Description": "Two levels of age were used: young and old."}
        des_race = {"LongName": "The race of the face presented in this trial", 
                       "Description": "Three kinds of race were used: African, WC (White Caucasian), and EA (East Asian)."}
        des_sex = {"LongName": "The sex of the face presented in this trial", 
                       "Description": "Two levels of sex were used: female and male."}
        des_stimFile = {"LongName": "The name of the stimulus file in this trial"}

        dataToWrite = {"duration": des_duration, "trial_type": des_trial, "response_value": des_responseValue, "rating_type": des_ratingType, \
                       "expression": des_expression, "age": des_age, "race": des_race, "sex": des_sex, "stim_file": des_stimFile}        
        newFilename = os.path.join(outputDir, sub, session, 'func', f'{sub}_{session}_{taskname}_acq-mb8_{run}_events.json')

        with open(newFilename, 'w') as json_file:
            json.dump(dataToWrite, json_file, indent=4) 
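
A quick consistency check between an events table and its sidecar can catch a column that was renamed in one place but not the other. The sketch below compares the TSV header with the top-level keys of the JSON; `compare_columns` and the inline texts are hypothetical examples, not files from the dataset. Columns such as `onset` are defined by BIDS itself, so appearing in the "undescribed" list is not necessarily a problem.

```python
import csv
import io
import json


def compare_columns(tsv_text, sidecar_text):
    """Return (columns without a sidecar entry, sidecar keys without a column)."""
    header = next(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    keys = set(json.loads(sidecar_text))
    undescribed = [col for col in header if col not in keys]
    unknown = sorted(keys - set(header))
    return undescribed, unknown


# made-up header/row and sidecar, for illustration only
tsv_text = (
    "onset\tduration\ttrial_type\tresponse_value\trating_type\n"
    "0.0\t4.0\tface\tn/a\tage\n"
)
sidecar_text = json.dumps({
    "duration": {"Description": "..."},
    "trial_type": {"LongName": "Event category"},
    "rating_type": {"LongName": "The type of ratings in this trial"},
    "sex": {"LongName": "The sex of the face presented in this trial"},
})

undescribed, unknown = compare_columns(tsv_text, sidecar_text)
print("columns without a sidecar entry:", undescribed)
print("sidecar keys without a column:", unknown)
```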