# Statistics: The Science of Decisions Project Instructions

## Background Information

In a Stroop task, participants are presented with a list of words, with each word displayed in a color of ink. The participant’s task is to say out loud the color of the ink in which the word is printed. The task has two conditions: a congruent words condition, and an incongruent words condition. In the congruent words condition, the words being displayed are color words whose names match the colors in which they are printed: for example RED, BLUE. In the incongruent words condition, the words displayed are color words whose names do not match the colors in which they are printed: for example PURPLE, ORANGE. In each case, we measure the time it takes to name the ink colors in equally-sized lists. Each participant will go through and record a time from each condition.


## Questions For Investigation


#### 1. What is our independent variable? What is our dependent variable?  ####



   **Dependent variable**: The time it takes a participant to name the ink colors in equally-sized lists.

   **Independent variable**: The word condition (congruent or incongruent), i.e. whether the ink color matches the word.
   
   
     
   
#### 2.  What is an appropriate set of hypotheses for this task? What kind of statistical test do you expect to perform? Justify your choices.  ####




   **Null Hypothesis:**   
   
   $H_0 : \mu_I - \mu_C = 0$       
   
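   The incongruent word condition does not change the mean time needed to name the ink colors. Here $\mu_I$ and $\mu_C$ are the population mean times in the incongruent and congruent conditions.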
    
   **Alternative Hypothesis:** 
   
   $H_A : \mu_I - \mu_C > 0$       
   
   The incongruent word condition increases the mean time needed to name the ink colors.

  
   I will perform a **one-tailed paired t-test**. A t-test is appropriate because the population standard deviation is unknown, the sample size is below 30, and I assume the differences are approximately normally distributed. The test is paired because each participant is measured under both conditions (a repeated-measures design). I expect the incongruent word condition to increase the reaction time, so I will perform a one-tailed test. I will choose an alpha level of $\alpha = .05$.
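
The paired design can be illustrated with a minimal sketch. The times below are made-up illustrative values, not the project data, and `paired_t` is a hypothetical helper name; only the Python standard library is assumed.

```python
import math
import statistics

def paired_t(after, before):
    """Return (t, df) for a paired t-test on two equally long lists."""
    if len(after) != len(before):
        raise ValueError("paired samples must have the same length")
    # A paired test works on the per-participant differences
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (divides by n - 1)
    sem = sd_diff / math.sqrt(n)       # standard error of the mean difference
    return mean_diff / sem, n - 1

# Made-up illustrative times in seconds (NOT the project data)
congruent = [10.0, 12.0, 14.0, 16.0]
incongruent = [15.0, 16.0, 20.0, 21.0]

t_value, df = paired_t(incongruent, congruent)
print(f"t({df}) = {t_value:.2f}")
```

For a one-tailed test at $\alpha = .05$, the resulting t-value would be compared against the upper critical value for `df` degrees of freedom.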
   
   
      

#### 3. Report some descriptive statistics regarding this dataset. Include at least one measure of central tendency and at least one measure of variability. ####



   Congruent word condition (calculated by hand): 

   $\bar{x}_1 = 14.05$
   
   $s_1 = 3.56$
   
   $n = 24$
   
   Incongruent word condition (calculated by hand):
   
   $\bar{x}_2 = 22.02$
   
   $s_2 = 4.80$
   
   $n = 24$
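
As a rough cross-check of hand calculations like these, the same measures could be computed with the standard library. This is only a sketch: `describe` is a hypothetical helper, and the sample values are made up for illustration rather than taken from the dataset.

```python
import statistics

def describe(times):
    """Return sample size, mean, median and sample standard deviation."""
    return {
        "n": len(times),
        "mean": statistics.mean(times),      # central tendency
        "median": statistics.median(times),  # central tendency
        "sd": statistics.stdev(times),       # variability (n - 1 in the denominator)
    }

# Made-up illustrative times in seconds (NOT the project data)
sample_times = [12.0, 14.0, 16.0, 18.0]
summary = describe(sample_times)
print(summary)
```

Note that `statistics.stdev` uses the sample (Bessel-corrected) formula, which is the one appropriate for $s_1$ and $s_2$ here.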
   
   
   
   
#### 4. Provide one or two visualizations that show the distribution of the sample data. Write one or two sentences noting what you observe about the plot or plots. ####


   
    

In [8]:
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt

# Load the Stroop data: one row per participant, one column per condition
dataFrame = pd.read_csv('data.csv')
# Per-participant difference: incongruent time minus congruent time
dataFrame['Diff'] = dataFrame['Incongruent'] - dataFrame['Congruent']

# Measures of central tendency and variability for each condition
print('Calculated measures of central tendency and variability (Incongruent)')
print(dataFrame['Incongruent'].describe())
print('\n')

print('Calculated measures of central tendency and variability (Congruent)')
print(dataFrame['Congruent'].describe())
print('\n')

# The same measures for the paired differences
print('Calculated measures of central tendency and variability (Diff)')
print(dataFrame['Diff'].describe())
print('\n')

# Histogram of the differences, normalized so the bar areas sum to 1
plt.hist(dataFrame['Diff'], bins=10, density=True, facecolor="blue")
plt.xlabel("Time difference (seconds)")
plt.ylabel("Density")
plt.title("Time Difference Stroop Effect")
plt.show()





All subjects needed more time in the incongruent word condition than in the congruent word condition.


#### 5. Now, perform the statistical test and report your results. What is your confidence level and your critical statistic value? Do you reject the null hypothesis or fail to reject it? Come to a conclusion in terms of the experiment task. Did the results match up with your expectations? ####


    
   Sample size: $n = 24$

   Degrees of freedom: $df = n-1 = 23$
    
   T critical for a one-sided test: $t_{critical}(\alpha = .05, df = 23) = 1.714$
    
   Mean of the differences: $\bar{x}_{x_2,x_1} = \frac{\sum_{i=1}^{n} (x_{2_i}-x_{1_i})}{n} = 7.96$
    
   Sample difference standard deviation: $SD_{x_2,x_1} = 4.86$
    
   Standard error of mean: $SEM_{x_2,x_1} = \frac{SD_{x_2,x_1}}{\sqrt{n}} = 0.99$
    
   t-value: $t = \frac{\bar{x}_{x_2,x_1}}{SEM_{x_2,x_1}} = 8.02$
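
The arithmetic above can be reproduced with a short sketch that plugs in the reported summary values. The variable names are illustrative, and the inputs are the rounded figures from this section, so later digits may differ slightly from a calculation on the raw data.

```python
import math

# Rounded values reported in this section
n = 24
mean_diff = 7.96     # mean of the paired differences
sd_diff = 4.86       # sample standard deviation of the differences
t_critical = 1.714   # one-sided, alpha = .05, df = 23

sem = sd_diff / math.sqrt(n)        # standard error of the mean difference
t_value = mean_diff / sem           # paired t statistic
reject_h0 = t_value > t_critical    # one-tailed decision rule

print(f"SEM = {sem:.2f}, t = {t_value:.2f}, reject H0: {reject_h0}")
```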
    
   ### Result
    
   $t(23) = 8.02,\ p < .0001,\ \text{one-tailed}$
    
   $95\%\ \text{CI} = (5.91, 10.01)$
   
   Cohen's $d$: $d = \frac{\bar{x}_{x_2,x_1}}{SD_{x_2,x_1}} = 1.64$
    
   $r^2 = \frac{t^2}{t^2 + df} = .74$
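
The interval and effect sizes can be checked the same way. This sketch assumes the interval is the usual two-sided 95 % confidence interval for the mean difference and takes the critical value 2.069 ($df = 23$) from a standard t-table; that value is an assumption, since it is not stated in this section.

```python
import math

# Rounded values reported in this section
n = 24
df = n - 1
mean_diff = 7.96
sd_diff = 4.86
# Assumption: two-sided critical t for a 95 % interval with df = 23 (t-table value)
t_two_sided = 2.069

sem = sd_diff / math.sqrt(n)
t_value = mean_diff / sem

# Confidence interval for the mean difference
half_width = t_two_sided * sem
ci = (mean_diff - half_width, mean_diff + half_width)

# Effect sizes
cohens_d = mean_diff / sd_diff                  # standardized mean difference
r_squared = t_value**2 / (t_value**2 + df)      # proportion of variance explained

print(f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"d = {cohens_d:.2f}, r^2 = {r_squared:.2f}")
```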
    
    
   I **reject $H_0$** because the t-value ($8.02$) lies in the critical region ($t > 1.714$), with $p < .0001$. The difference is therefore **statistically significant** at $\alpha = .05$. This means that the incongruent word condition increases the reaction time substantially: Cohen's $d = 1.64$ indicates a large effect, and $r^2 = .74$ means that about 74 % of the variation in the time differences is attributable to the word condition. 
   I expected the incongruent word condition to increase the reaction time (as the reaction times of the individual subjects show), but I did not expect the effect to be this large.
   
   


#### 6. *What do you think is responsible for the effects observed? Can you think of an alternative or similar task that would result in a similar effect? Some research about the problem will be helpful for thinking about these two questions!* ####

When reading words, we seem to operate in a kind of automatic mode. When the ink color and the word differ, this automaticity is disturbed. I noticed that reading the word was more automatic than naming the color.
I can think of similar experiments in which we also operate in an automatic mode (for example, a daily routine). We could introduce a disturbance there as well and measure its influence.
