**READ ME (provided by Aitana Grasso Cladera)**  
You will receive 7 .CSV files. Most files have the structure 59x176 (participants/trials); test trials have already been removed. The 59 rows correspond to participants. We recorded 61, but two participants were excluded due to bad eye data (Subjects 22 and 61, with less than 90% valid data).

1. dominantEye = 122x1. This follows the same structure as the behavioral data: each value is duplicated (rows 1 and 2 both correspond to Participant 1), so you should take this into account in your analysis. 1 means Right, and 0 means Left.
2. onsetTimePicture = 59x176 (participants/trials). This shows the time on the computer clock when the picture was presented. The unit is arbitrary.
Remember that for us the time of picture onset represents time 0, and that we have a sampling rate of 500 Hz. This information should be enough to compute reaction times.
3. onsetTimeSaccade = 59x176 (participants/trials). This shows the time on the computer clock when the first valid saccade was made. The unit is arbitrary.
4. sideLookedLeftEye = 59x176 (participants/trials). This shows the side the participant looked at, according to the information provided by the left eye. 1 means Right, and 0 means Left.
5. sideLookedRightEye = 59x176 (participants/trials). This shows the side the participant looked at, according to the information provided by the right eye. 1 means Right, and 0 means Left.
Consider the dominant eye information here.
6. valenceLeftEye = 59x176 (participants/trials). This shows the emotional valence of the picture that was on the side that the participant looked at first, according to the information provided by the left eye. 1 means Positive, and 0 means Negative.
7. valenceRightEye = 59x176 (participants/trials). This shows the emotional valence of the picture that was on the side that the participant looked at first, according to the information provided by the right eye. 1 means Positive, and 0 means Negative.

The information provided by the left and right eyes should match. They may not match when the two eyes did not cross the threshold at the same time. This is why we only consider the information coming from the dominant eye.
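The dominant-eye rule described above can be sketched on toy data. This is only an illustration with made-up values: the names `side_left`, `side_right` and `dominant_eye_blocks` are hypothetical stand-ins for the real files, and the sketch assumes the dominant-eye rows have already been aligned with the participants in the eye matrices.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical values): 3 participants x 4 trials, 1 = Right, 0 = Left
side_left = pd.DataFrame([[0, 1, 1, 0],
                          [1, 1, 0, 0],
                          [0, 0, 1, 1]])
side_right = pd.DataFrame([[0, 1, 0, 0],
                           [1, 0, 0, 0],
                           [0, 0, 1, 0]])

# dominantEye has two identical rows per participant, so keep every second row
dominant_eye_blocks = pd.DataFrame([1, 1, 0, 0, 1, 1])
dominant_eye = dominant_eye_blocks.iloc[::2, 0].to_numpy()  # one value per participant

# Broadcast the per-participant flag over all trials:
# take the right-eye value where the right eye is dominant, else the left-eye value
is_right_dominant = (dominant_eye == 1)[:, None]
side_looked = pd.DataFrame(
    np.where(is_right_dominant, side_right.to_numpy(), side_left.to_numpy())
)
print(side_looked)
```

The same selection would apply to the valence matrices; trials where the two eyes disagree simply take the dominant eye's value.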




In [None]:
### MY CODE

# --- Import Libraries and Data ---

import scipy.io
from google.colab import drive
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

drive.mount('/content/drive')

In [None]:
# Dominant Eye (122x1)
dominant_eye_raw = pd.read_csv('/YOUR_PATH/dominantEye_Task2Exp1.csv', header=None)
# Onset Time Picture (59x176)
onset_time_picture_raw = pd.read_csv('/YOUR_PATH/onsetTimePicture_Task2Exp1.csv', header=None)
# Onset Time Saccade (59x176)
onset_time_saccade_raw = pd.read_csv('/YOUR_PATH/onsetTimeSaccade_Task2Exp1.csv', header=None)
# Side Looked Left Eye (59x176)
side_looked_left_eye_raw = pd.read_csv('/YOUR_PATH/sideLookedLeftEye_Task2Exp1.csv', header=None)
# Side Looked Right Eye (59x176)
side_looked_right_eye_raw = pd.read_csv('/YOUR_PATH/sideLookedRightEye_Task2Exp1.csv', header=None)
# Valence Left Eye (59x176)
valence_left_eye_raw = pd.read_csv('/YOUR_PATH/valenceLeftEye_Task2Exp1.csv', header=None)
# Valence Right Eye (59x176)
valence_right_eye_raw = pd.read_csv('/YOUR_PATH/valenceRightEye_Task2Exp1.csv', header=None)
# Sequence of positive images (92x122)
sequence_positive_images_raw = pd.read_csv('/YOUR_PATH/positiveSequence_Task2Exp1.csv', header=None)
# Sequence of negative images (92x122)
sequence_negative_images_raw = pd.read_csv('/YOUR_PATH/negativeSequence_Task2Exp1.csv', header=None)


# get arousal ratings
arousal_ratings = pd.read_csv('/YOUR_PATH/median_arousal_ratings.csv')
arousal_dict = dict(zip(arousal_ratings['img_id'], arousal_ratings['arousal_levels']))

In [None]:
# --- Functions ---

def participants_to_trials(participants):
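  '''Returns the block (row) indices that belong to the given participants

  Each participant has two consecutive blocks, so participant p maps to the
  0-based rows (p-1)*2 and (p-1)*2+1.

  Args:
  participants: a list containing the 1-based participant ids

  Returns:
  trials: a list of the 0-based block indices of these participants
  '''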
  trials = []
  for i in range(len(participants)):
    trials.append((participants[i]-1)*2)
    trials.append((participants[i]-1)*2+1)
  return trials


def reshape_image_sequence_data(data):
  '''Reshapes the image sequence data from 88x122 to 61x176

  Args:
  data: the image sequence data

  Returns:
  reshaped_data: a dataframe containing the reshaped image sequence
  '''
  reshaped_data = []

  for block in range(0, len(data.columns), 2):
    col1, col2 = data.iloc[:, block], data.iloc[:, block+1]
    joint_blocks = pd.concat([col1, col2], ignore_index=True)
    reshaped_data.append(joint_blocks)

  return pd.DataFrame(reshaped_data)
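As a sanity check on the reshaping logic, the pairwise concatenation of block columns can be compared against a vectorized transpose-and-reshape on a small toy matrix. This is a sketch under the assumption that each participant owns two adjacent block columns; the toy sizes (3 trials per block, 4 blocks) are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy layout (hypothetical): 3 trials per block, 4 block columns = 2 participants
data = pd.DataFrame(np.arange(12).reshape(3, 4))

# Loop version, following the same steps as reshape_image_sequence_data
rows = []
for block in range(0, len(data.columns), 2):
    rows.append(pd.concat([data.iloc[:, block], data.iloc[:, block + 1]],
                          ignore_index=True))
looped = pd.DataFrame(rows).reset_index(drop=True)

# Vectorized version: one row per block after transposing,
# then each pair of consecutive block rows is merged into one participant row
n_participants = data.shape[1] // 2
vectorized = pd.DataFrame(data.to_numpy().T.reshape(n_participants, -1))

print(vectorized)
print((looped.to_numpy() == vectorized.to_numpy()).all())
```

Both versions place the first block's trials before the second block's trials within each participant row.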


In [None]:
# --- Preprocessing the Eye Tracking Data ---

# 1. Exclude test trials from positive and negative image sequence data

# remove test trials
sequence_positive_images = sequence_positive_images_raw.drop(range(4), axis=0)
sequence_negative_images = sequence_negative_images_raw.drop(range(4), axis=0)
# reset indices
sequence_positive_images.reset_index(drop=True, inplace=True)
sequence_negative_images.reset_index(drop=True, inplace=True)

# reshapes image sequence data to match structure of other data
reshaped_pos_img_seq = reshape_image_sequence_data(sequence_positive_images)
reshaped_neg_img_seq = reshape_image_sequence_data(sequence_negative_images)

In [None]:
# 2. Remove participants
# remove participants 22 and 61 from the dominant_eye and image sequence data because of too little valid eye data
# (the other files already exclude them)
# remove participants 41, 43 and 44 from all data because of accuracy < 90% in Classic AAT

# exclude Participants 22 and 61
rows_to_drop_a = participants_to_trials([22,61])
dominant_eye = dominant_eye_raw.drop(rows_to_drop_a, axis=0)
reshaped_neg_img_seq = reshaped_neg_img_seq.drop([22-1,61-1], axis=0)
reshaped_pos_img_seq = reshaped_pos_img_seq.drop([22-1,61-1], axis=0)

# reset indices
dominant_eye.reset_index(drop=True, inplace=True)
reshaped_neg_img_seq.reset_index(drop=True, inplace=True)
reshaped_pos_img_seq.reset_index(drop=True, inplace=True)

# exclude Participants 41, 43 and 44
# since participant 22 has already been dropped, every participant after 22 has moved up by one position
rows_to_drop_b = participants_to_trials([41-1,43-1,44-1])
dominant_eye = dominant_eye.drop(rows_to_drop_b, axis=0)
participants_to_drop = [41-2, 43-2, 44-2] # -2: one for 0-based indexing, one for the shift caused by dropping participant 22
onset_time_picture = onset_time_picture_raw.drop(participants_to_drop, axis=0)
onset_time_saccade = onset_time_saccade_raw.drop(participants_to_drop, axis=0)
side_looked_left_eye = side_looked_left_eye_raw.drop(participants_to_drop, axis=0)
side_looked_right_eye = side_looked_right_eye_raw.drop(participants_to_drop, axis=0)

valence_left_eye = valence_left_eye_raw.drop(participants_to_drop, axis=0)
valence_right_eye = valence_right_eye_raw.drop(participants_to_drop, axis=0)
reshaped_neg_img_seq = reshaped_neg_img_seq.drop(participants_to_drop, axis=0)
reshaped_pos_img_seq = reshaped_pos_img_seq.drop(participants_to_drop, axis=0)

# reset indices
onset_time_picture.reset_index(drop=True, inplace=True)
onset_time_saccade.reset_index(drop=True, inplace=True)
side_looked_left_eye.reset_index(drop=True, inplace=True)
side_looked_right_eye.reset_index(drop=True, inplace=True)
valence_left_eye.reset_index(drop=True, inplace=True)
valence_right_eye.reset_index(drop=True, inplace=True)
dominant_eye.reset_index(drop=True, inplace=True)
reshaped_pos_img_seq.reset_index(drop=True, inplace=True)
reshaped_neg_img_seq.reset_index(drop=True, inplace=True)

# 3. Compute the first fixation reaction times

rt_eye_movement = (onset_time_saccade - onset_time_picture)
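The index arithmetic used for the exclusions above (`41-2`, `43-2`, `44-2`) can be cross-checked by labelling rows with the original participant IDs. The sketch below is an alternative bookkeeping approach, not the method used in this notebook; `eye_data` is a hypothetical stand-in for any of the 59-row matrices.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a 59-row matrix that already lacks participants 22 and 61
recorded_ids = np.arange(1, 62)                       # participants 1..61
remaining_ids = recorded_ids[~np.isin(recorded_ids, [22, 61])]
eye_data = pd.DataFrame(np.zeros((len(remaining_ids), 3)), index=remaining_ids)

# With participant IDs as the index, exclusions can be written with the original IDs
low_accuracy_ids = [41, 43, 44]
positions = [eye_data.index.get_loc(pid) for pid in low_accuracy_ids]
print(positions)          # row positions these IDs occupy in the 59-row matrix

eye_data = eye_data.drop(index=low_accuracy_ids)
print(eye_data.shape[0])  # number of remaining participants
```

Keeping the IDs in the index avoids having to track how earlier exclusions shift the row positions.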

In [None]:
# 4. Derive the Valence, Arousal, and Side information from the dominant eye

# create dominant_eye_part 56x1 (56 remaining participants) out of dominant_eye 112x1 (112 blocks) by taking every second entry
# this is necessary because for the ET data we only have one row per participant (all other data frames now have the structure 56x176)
dominant_eye_part = pd.DataFrame(dominant_eye.values[::2].flatten())

# create empty arrays with the same shape
side_looked = np.empty_like(side_looked_right_eye)
valence_looked = np.empty_like(valence_right_eye)
arousal_looked = np.empty_like(valence_right_eye)
arousal_difference = np.empty_like(valence_right_eye)
picture_looked = np.empty_like(valence_right_eye)

# fill side_looked and valence_looked with the information from the dominant eye

# go through participants
for i in range(len(dominant_eye_part)):
  # go through 176 trials
  for j in range(len(side_looked_right_eye.columns)):

    # if the right eye is dominant save right eye information for side_looked and valence_looked
    if dominant_eye_part.iloc[i, 0] == 1:
        side_looked[i,j] = side_looked_right_eye.iloc[i, j]
        valence_looked[i,j] = valence_right_eye.iloc[i, j]

    # if the left eye is dominant save information of left eye
    else:
        side_looked[i,j] = side_looked_left_eye.iloc[i, j] # save the side the left eye looked at in side_looked
        valence_looked[i,j] = valence_left_eye.iloc[i, j] # save the valence the left eye looked at in valence_looked

    # if the positive image was looked at get the respective arousal rating from the positive image sequence
    if valence_looked[i,j] == 1: # valence 1 = positive
      picture_looked[i,j] = reshaped_pos_img_seq.iloc[i, j]
      arousal_looked[i,j] = arousal_dict.get(reshaped_pos_img_seq.iloc[i, j])

      # compute Arousal-Difference
      # (-) indicates the looked-at image is lower in arousal than the other image, (+) indicates it is higher
      arousal_difference[i,j] = arousal_dict.get(reshaped_pos_img_seq.iloc[i, j]) - arousal_dict.get(reshaped_neg_img_seq.iloc[i, j])

    # and if the negative image was looked at get the respective arousal rating from the negative image sequence
    else: # valence 0 = negative
      picture_looked[i,j] = reshaped_neg_img_seq.iloc[i, j]
      arousal_looked[i,j] = arousal_dict.get(reshaped_neg_img_seq.iloc[i, j])

      # compute Arousal-Difference
      # (-) indicates the looked-at image is lower in arousal than the other image, (+) indicates it is higher
      arousal_difference[i,j] = arousal_dict.get(reshaped_neg_img_seq.iloc[i, j]) - arousal_dict.get(reshaped_pos_img_seq.iloc[i, j])

# turn into a data frame
side_looked = pd.DataFrame(side_looked)
valence_looked = pd.DataFrame(valence_looked)
arousal_looked = pd.DataFrame(arousal_looked)
arousal_difference = pd.DataFrame(arousal_difference)
picture_looked = pd.DataFrame(picture_looked)


In [None]:
# 5. Exclude very fast reaction times (RTs more than 2 standard deviations below the mean)

# compute the bound and set respective reaction times to NaN
global_mean = np.nanmean(rt_eye_movement.values)
global_std = np.nanstd(rt_eye_movement.values)
lower_bound = global_mean - 2 * global_std
rt_eye_movement[rt_eye_movement < lower_bound] = np.nan

# 6. Log-transform the data using log10

# flatten and clean the data
reaction_times_flat_untransformed = rt_eye_movement.values.flatten()
reaction_times_flat_untransformed = reaction_times_flat_untransformed[~np.isnan(reaction_times_flat_untransformed)]

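# log-transform the data (non-positive and missing RTs become NaN), then flatten and clean
# vectorized instead of DataFrame.applymap, which is deprecated in recent pandas versions
reaction_time_log_transformed = np.log10(rt_eye_movement.where(rt_eye_movement > 0))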
reaction_times_flat_log = reaction_time_log_transformed.values.flatten()
reaction_times_flat_log = reaction_times_flat_log[~np.isnan(reaction_times_flat_log)]

# --- plot the log-transformed and untransformed distribution ---

# create a plot
fig, ax1 = plt.subplots(figsize=(10, 6))

sns.histplot(reaction_times_flat_untransformed, kde=True, bins=30, color='lightblue', alpha=0.7, label='non-transformed', ax=ax1)

# create a second axis for the log-transformed reaction time distribution
ax2 = ax1.twiny()
sns.histplot(reaction_times_flat_log, kde=True, bins=30, color='dodgerblue', alpha=0.7, label='log-transformed', ax=ax2)

# define plot labels and design details
ax1.set_xlabel("Reaction Time (ms)", fontsize=12)
ax2.set_xlabel("Reaction Time (log-transformed)", fontsize=12)
ax1.set_ylabel("Frequency", fontsize = 12)
fig.legend(loc='upper right', bbox_to_anchor=(0.89, 0.87), fontsize=12)

plt.show()

# --- Save the Eye Tracking Data in a Data Frame ---

# initialize a list to create the data frame
data_frame_ET = []

# go through participants
for participant in range(len(rt_eye_movement)):

    # go through trials and save all the information for each trial
    for trial in range(len(rt_eye_movement.columns)):
        data_frame_ET.append({
            'participant_id': participant+1, # so that participants start at 1 and not 0
            'picture_looked': picture_looked.iloc[participant, trial],
            'rt': rt_eye_movement.iloc[participant, trial],
            'log-transformed rt': reaction_time_log_transformed.iloc[participant, trial],
            'side_looked': side_looked.iloc[participant, trial],
            'valence_looked': valence_looked.iloc[participant, trial],
            'arousal_looked': arousal_looked.iloc[participant, trial],
            'arousal_difference': arousal_difference.iloc[participant, trial]
        })

# turn into a data frame
df_ET = pd.DataFrame(data_frame_ET)

# remove NaN rows
df_ET = df_ET.dropna()

# save the data frame as a csv file
#df_ET.to_csv("/YOUR_PATH/EyeTrackingData_T2.csv", index=False)
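The nested participant/trial loop that builds the long-format table can also be expressed with `np.repeat` and `ravel`. The following is a minimal sketch on toy matrices with made-up values, showing only two of the columns; it is offered as an equivalent formulation, not as a replacement for the cell above.

```python
import numpy as np
import pandas as pd

# Toy matrices (hypothetical values): 2 participants x 3 trials
rt = pd.DataFrame([[210.0, np.nan, 305.0],
                   [180.0, 250.0, 199.0]])
side = pd.DataFrame([[0, 1, 1],
                     [1, 0, 0]])

n_participants, n_trials = rt.shape

# ravel() walks the matrix row by row, i.e. participant by participant,
# which is the same order the nested participant/trial loop produces
long_df = pd.DataFrame({
    'participant_id': np.repeat(np.arange(1, n_participants + 1), n_trials),
    'rt': rt.to_numpy().ravel(),
    'side_looked': side.to_numpy().ravel(),
}).dropna()

print(long_df)
```

As in the loop version, `dropna()` removes every trial that has a missing value in any column.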


In [None]:
# --- Descriptive Plots ---

# compute the mean reaction times and standard deviations for the different
# Valence-Side combinations
valence_side_combinations = df_ET.groupby(['valence_looked', 'side_looked'])['rt'].agg(['mean', 'std']).reset_index()

# adjust the order of combinations
valence_side_combinations = valence_side_combinations.sort_values(['side_looked', 'valence_looked'], ascending=[True, False])

# save means and standard deviations
combinations = ['left positive', 'left negative', 'right positive', 'right negative']
means = valence_side_combinations['mean']
std_devs = valence_side_combinations['std']

# plot the different mean reaction times
plt.figure(figsize=(10, 6))
bars = plt.bar(combinations, means, yerr=std_devs, capsize=5,
               color=['lightblue', 'gold', 'blue', 'orange'], label=combinations)

# add values for means and standard deviations to the bars
for i, (bar, mean, std_dev) in enumerate(zip(bars, means, std_devs)):
    plt.text(
        bar.get_x() + bar.get_width()/2,
        mean,
        f"{mean:.2f}\n±{std_dev:.2f}",
        ha="center",
        va="center",
        fontsize=12,
        color="black",
        fontweight="bold",
        bbox=dict(boxstyle="round,pad=0.3", facecolor="white", edgecolor="none", alpha=0.7)
    )

# define plot labels and design details
plt.ylabel('Mean Reaction Time (ms)', fontsize = 13)
plt.xticks([])  # hide the x ticks; the legend identifies the bars
plt.grid(axis="y", linestyle="--", alpha=0.7)
plt.legend()
plt.tight_layout()

plt.show()


In [None]:
# compute the mean reaction times for different Valence Arousal-Difference combinations
valence_arousal_diff_combinations = df_ET.groupby(['valence_looked', 'arousal_difference'])['rt'].agg(['mean']).reset_index()

# create a data frame with all combinations for Valence and Arousal-Difference
all_combinations = {
    'arousal_difference': np.append(np.arange(-6,7), np.arange(-6,7)),
    'valence_looked': np.append(np.zeros(13, dtype = int), np.ones(13, dtype=int))
}
all_combs = pd.DataFrame(all_combinations)

# merge both frames together for plotting
df_complete = pd.merge(all_combs, valence_arousal_diff_combinations, on=['arousal_difference', 'valence_looked'], how='left')

# split up into positive and negative for color-coding in the plot
pos_data = df_complete[df_complete['valence_looked'] == 1]
neg_data = df_complete[df_complete['valence_looked'] == 0]

# create subplots
fig, ax = plt.subplots(figsize=(12, 8))

# define the positions of the bars
bar_position_positive = np.arange(len(pos_data))
bar_position_negative = [x + 0.3 for x in bar_position_positive]

# plot the positive and negative bars
plt.bar(bar_position_positive, pos_data['mean'], color='g', width=0.3, label='positive')
plt.bar(bar_position_negative, neg_data['mean'], color='r', width=0.3, label='negative')

# define plot labels and design details
plt.xlabel('Arousal-Difference', fontsize=15)
plt.ylabel('Mean Reaction Time (ms)', fontsize=15)
plt.xticks(np.arange(0.15, 13.15, step=1), labels = pos_data['arousal_difference'], fontsize=12)
plt.yticks(fontsize=12)
plt.legend(loc='upper right', fontsize=12)
plt.grid(axis="y", linestyle="--", alpha=0.7)

plt.show()
