## Experiment 2 (semantic priming)

Experiment 2 is based on the first experiment in:

Besner, D., Smith, M. C., & MacLeod, C. M. (1990). Visual word recognition: A dissociation of lexical and semantic processing. _Journal of Experimental Psychology: Learning, Memory, and Cognition, 16_(5), 862.

If we replicate their findings, our results should resemble those reported in their Table 1:

<img src="https://github.com/ethanweed/ExPsyLing/blob/master/Slides/Images/Besner%20et%20al_1990_Table%201.png?raw=true" width=""/>


In [2]:
# install packages (uncomment and run once, then comment out again and re-run to clear the installation output)

#%pip install jsonlines
#%pip install seaborn

In [2]:

# import packages you might need

import jsonlines                        # turn the json data blob into a dataframe
import pandas as pd                     # make dataframes
import pingouin as pg                   # do statistical tests
import seaborn as sns                   # make plots
from matplotlib import pyplot as plt    # make your plots prettier
import os

# silence annoying (but also useful!) warnings
import warnings
warnings.filterwarnings('ignore')

In [8]:
os.getcwd()

'/Users/Nikita/Desktop/EPL/experimental_psycholinguistics_2023/experiment_2/src'

In [9]:
raw = os.path.join("..", "data", "data_semantic-relatedness_2023.txt")
pathout = os.path.join("..", "out", "data.csv")

In [11]:
# convert data json blob to csv

# the code in this cell is adapted from michedini's post at:
# https://forum.cogsci.nl/discussion/8257/problem-with-jatos-result-conversion

# read one data frame per line of the results file, then concatenate them once
frames = []

with jsonlines.open(raw) as reader:
    for line in reader:
        frames.append(pd.DataFrame(line))

df = pd.concat(frames)

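# fill gaps in the url column with the last recorded value
df['url'] = df['url'].ffill()
# keep only the srid field of each url entry, as an integer
df['url'] = [int(x['srid']) for x in list(df['url'])]
# drop the meta column
del df['meta']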

# save the data to a csv file
df.to_csv(pathout)
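
The `url` step above can be tried on a toy data frame. This is only a sketch: the records below are hypothetical, and the assumption that `url` is filled in on some rows and missing on the rest is inferred from the use of `ffill()`, not checked against the real file.

```python
import pandas as pd

# hypothetical records for one participant: only the first row carries the url field
records = [
    {"sender": "Instructions1", "url": {"srid": "2252"}},
    {"sender": "Stimulus"},
    {"sender": "Stimulus"},
]
toy = pd.DataFrame(records)

# propagate the last seen url entry down into the rows where it is missing
toy["url"] = toy["url"].ffill()

# replace each url entry with its srid, converted to an integer
toy["url"] = [int(x["srid"]) for x in toy["url"]]
print(toy)
```

After this step every row carries the same integer in `url`, which is why that column can later serve as a participant identifier.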

In [18]:
df = pd.read_csv(pathout)
df

Unnamed: 0.1,Unnamed: 0,url,sender,sender_type,sender_id,response,response_action,ended_on,duration,time_run,...,timestamp,time_switch,counterbalance,Unnamed: 17,stim,condition,block,correct_response,correctResponse,correct
0,0,2252,Instructions1,canvas.Screen,0,,keypress,response,4679.644,10931.9,...,2023-10-05T12:39:22.095Z,15646.283,,,,,,,,
1,1,2252,Instructions2,canvas.Screen,1,,keypress,response,5033.217,15641.1,...,2023-10-05T12:39:27.136Z,20685.162,,,,,,,,
2,2,2252,Instructions3,canvas.Screen,2,,keypress,response,3670.538,20681.5,...,2023-10-05T12:39:30.811Z,24372.782,,,,,,,,
3,3,2252,Instructions4,canvas.Screen,3,,keypress,response,5168.818,24356.8,...,2023-10-05T12:39:35.998Z,29567.617,,,,,,,,
4,4,2252,Instructions5,canvas.Screen,4,,keypress,response,3231.283,29543.4,...,2023-10-05T12:39:39.255Z,32815.263,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
9679,330,2225,Block B Loop,flow.Loop,7_0_3,,,skipped,,174474.3,...,2023-10-05T12:30:23.759Z,174485.434,1.0,,,,,,,
9680,331,2225,Block C Loop,flow.Loop,7_0_4,,,skipped,,174474.6,...,2023-10-05T12:30:23.759Z,174485.434,1.0,,,,,,,
9681,332,2225,Trial Sequence,flow.Sequence,7_0,,,completion,139990.209,34472.6,...,2023-10-05T12:30:23.759Z,174485.434,1.0,,,,,,,
9682,333,2225,Counterbalance Loop,flow.Loop,7,,,completion,139990.209,34472.6,...,2023-10-05T12:30:23.759Z,174485.434,,,,,,,,
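
The `Unnamed: 0` column in this output is most likely the row index that `to_csv` wrote to the file. A minimal sketch on toy data (the values are made up, and `io.StringIO` stands in for the csv file) shows the difference that `index=False` makes:

```python
import io
import pandas as pd

# toy data frame standing in for df (values are made up for illustration)
toy = pd.DataFrame({"sender": ["Instructions1", "Stimulus"],
                    "duration": [4679.6, 612.3]})

# with the default index=True, the row labels are stored as an unnamed first column
with_index = io.StringIO()
toy.to_csv(with_index)
with_index.seek(0)
reread = pd.read_csv(with_index)
print(list(reread.columns))

# with index=False, only the real columns are written
without_index = io.StringIO()
toy.to_csv(without_index, index=False)
without_index.seek(0)
clean = pd.read_csv(without_index)
print(list(clean.columns))
```

Whether to drop the index depends on whether the per-participant row numbers are needed later. If they are not, `index=False` keeps the saved file tidier.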


In [32]:
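# keep correct responses to the stimuli (not used further in this cell)
new_df = df[(df["sender"] == "Stimulus") & (df["correct"] == True)]

# select the columns of interest from the full data frame, so rows from every sender are still included
data = df[["sender", "url", "duration", "condition", "stim"]].copy()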

data


Unnamed: 0,sender,url,duration,condition,stim
0,Instructions1,2252,4679.644,,
1,Instructions2,2252,5033.217,,
2,Instructions3,2252,3670.538,,
3,Instructions4,2252,5168.818,,
4,Instructions5,2252,3231.283,,
...,...,...,...,...,...
9679,Block B Loop,2225,,,
9680,Block C Loop,2225,,,
9681,Trial Sequence,2225,139990.209,,
9682,Counterbalance Loop,2225,139990.209,,


In [34]:
# select the columns needed for the analysis
new_df = df[['url', 'duration', 'condition', 'correct', 'stim']]
# keep only rows where 'sender' is 'Stimulus' and the response was correct
stim_df = new_df[(df['sender'] == 'Stimulus') & (new_df['correct'] == True)]

# reset the index
stim_df = stim_df.reset_index(drop=True)

stim_df


Unnamed: 0,url,duration,condition,correct,stim
0,2252,2613.908,Unrelated,True,sand-pepper
1,2252,1718.625,Unrelated,True,shark-dull
2,2252,4212.088,Nonword,True,hort-sain
3,2252,1300.176,Related,True,buy-sell
4,2252,2442.458,Nonword,True,ip-bown
...,...,...,...,...,...
2089,2225,595.032,Nonword,True,slom-wast
2090,2225,491.098,Filler,True,boeh-ale
2091,2225,589.492,Related,True,army-soldier
2092,2225,534.656,Unrelated,True,pot-cold


In [39]:
# aggregate the data: for now, count the rows in each condition (not yet the mean for each participant in each condition)
data_agg = df.groupby('condition').size().reset_index(name='count')
data_agg
# the unique values of the condition column can be checked with:
# set(stim_df["condition"])
# {'Filler', 'Nonword', 'Related', 'Unrelated'}

Unnamed: 0,condition,count
0,Filler,2240
1,Nonword,2288
2,Related,2512
3,Unrelated,2224
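
The counts above are not yet the per-participant means mentioned in the comment. One possible way to compute them is sketched below on toy data. The numbers are made up, `toy` stands in for `stim_df`, and treating `url` as the participant identifier is an assumption based on the conversion step earlier.

```python
import pandas as pd

# toy data standing in for stim_df (values are made up for illustration)
toy = pd.DataFrame({
    "url": [1, 1, 1, 1, 2, 2, 2, 2],
    "condition": ["Related", "Related", "Unrelated", "Unrelated"] * 2,
    "duration": [500.0, 520.0, 560.0, 580.0, 600.0, 640.0, 700.0, 720.0],
})

# mean reaction time for each participant (url) in each condition
toy_agg = toy.groupby(["url", "condition"], as_index=False)["duration"].mean()

# one row per participant, one column per condition
toy_wide = toy_agg.pivot(index="url", columns="condition", values="duration")

# difference between the unrelated and related means, per participant
toy_wide["priming"] = toy_wide["Unrelated"] - toy_wide["Related"]
print(toy_wide)
```

The long format (`toy_agg`) is the shape most plotting and testing functions expect, while the wide format makes the Related versus Unrelated comparison easy to read off per participant.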
