# Questions

1. What is our **independent variable**? What is our **dependent variable**?
2. What is an **appropriate set of hypotheses** for this task? What kind of **statistical test** do you expect to perform? Justify your choices.
3. **Report some descriptive statistics** regarding this dataset. Include at least one measure of central tendency and at least one measure of variability.
4. **Provide one or two visualizations** that show the distribution of the sample data. Write one or two sentences noting what you observe about the plot or plots.
5. Now, **perform the statistical test and report your results**. What is your **confidence level and your critical statistic value**? Do you reject the null hypothesis or fail to reject it? **Come to a conclusion** in terms of the experiment task. Did the results match up with your expectations?
6. Optional: What do you think is responsible for the effects observed? Can you think of an alternative or similar task that would result in a similar effect? Some research about the problem will be helpful for thinking about these two questions!


**1.** An independent variable is a variable that changes or is controlled in an experiment to test the effects on the dependent variable.

In our experiment, the independent variable is the word condition: whether the ink color matches the word (congruent) or does not match it (incongruent). The dependent variable is the time taken to name the ink colors.

**2.** The working hypothesis is that incongruent words take longer to identify because the word does not match the color in which it is printed. This mismatch demands more effort and, therefore, more time.

The null hypothesis is therefore: the average time spent on incongruent words is not longer than the average time spent on congruent words (words that match the color in which they are printed).

The alternative hypothesis is: the average time spent on incongruent words is longer than the average time spent on congruent words.

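$ H_0: $ There is no difference in mean response time between the congruent and incongruent conditions ( $ H_0: \mu_C = \mu_I $ ).

$ H_1: $ The mean response time under the incongruent condition is longer than under the congruent condition ( $ H_1: \mu_C < \mu_I $ ).

Here $ \mu_C $ and $ \mu_I $ are the population mean response times for the congruent and incongruent conditions. Because each participant is timed under both conditions and the population standard deviation is unknown, a one-tailed dependent-samples (paired) t-test is the expected test.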
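Since every participant contributes one time per condition, one natural way to test these hypotheses is a paired t-test on the per-participant differences. The following is a minimal sketch of that statistic using only the standard library. The helper name `paired_t_statistic` is hypothetical, and the data are just the five rows shown in the dataset preview, so the printed value is illustrative and is not the result for the full sample.

```python
import math
import statistics

# Illustrative data only: the five rows visible in the preview of stroopdata.csv.
congruent = [12.079, 16.791, 9.564, 8.63, 14.669]
incongruent = [19.278, 18.741, 21.214, 15.687, 22.803]


def paired_t_statistic(first, second):
    """Return (t, degrees of freedom) for the paired differences second - first."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    standard_error = sd_diff / math.sqrt(n)
    return mean_diff / standard_error, n - 1


t_stat, dof = paired_t_statistic(congruent, incongruent)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof}")
```

A positive t supports the direction stated in $ H_1 $; whether it is large enough depends on the critical value for the chosen confidence level and degrees of freedom. On the full dataset, `scipy.stats.ttest_rel` computes the same statistic.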

**3.** Descriptive statistics

In [2]:
#Import Required modules
import pandas as pd
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
import seaborn as sns
import math
%matplotlib inline

# Read data
df = pd.read_csv('stroopdata.csv')
df.head()

|   | Congruent | Incongruent |
|---|-----------|-------------|
| 0 | 12.079 | 19.278 |
| 1 | 16.791 | 18.741 |
| 2 | 9.564 | 21.214 |
| 3 | 8.630 | 15.687 |
| 4 | 14.669 | 22.803 |


In [5]:
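# Central tendency: mean of each condition
con_mean = df['Congruent'].mean()
inc_mean = df['Incongruent'].mean()

# Central tendency: median of each condition
con_median = df['Congruent'].median()
inc_median = df['Incongruent'].median()

# Variability: sample standard deviation (pandas uses ddof=1 by default)
con_std = df['Congruent'].std()
inc_std = df['Incongruent'].std()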


In [15]:
print("Variable             ", "Congruent       ", "Incongruent")
print("Means                ", con_mean, inc_mean)
print("Medians              ", con_median, inc_median)
print("Standard Deviations  ", con_std, inc_std)

Variable              Congruent        Incongruent
Means                 14.051125000000004 22.01591666666667
Medians               14.3565 21.0175
Standard Deviations   3.559357957645195 4.797057122469138
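The standard deviations above come from pandas' `Series.std()`, which by default divides by n - 1 (the sample standard deviation). As a rough sketch of what that means, the snippet below computes both the sample and the population version by hand. The helper names are hypothetical, and the input is only the five Congruent values from the dataset preview, so the numbers will not match the full-sample figures reported above.

```python
import math
import statistics

# Illustrative data only: the five Congruent values shown in the dataset preview.
congruent = [12.079, 16.791, 9.564, 8.63, 14.669]


def sum_of_squared_deviations(values):
    """Sum of squared distances from the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


def sample_std(values):
    """Standard deviation with an n - 1 denominator (what pandas reports by default)."""
    return math.sqrt(sum_of_squared_deviations(values) / (len(values) - 1))


def population_std(values):
    """Standard deviation with an n denominator."""
    return math.sqrt(sum_of_squared_deviations(values) / len(values))


print("sample std:    ", sample_std(congruent))
print("population std:", population_std(congruent))
# Cross-check against the standard library implementations.
print("stdlib stdev:  ", statistics.stdev(congruent))
print("stdlib pstdev: ", statistics.pstdev(congruent))
```

The sample version is always slightly larger than the population version, and it is the appropriate one here because the participants are a sample from a larger population.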
