### Analyzing the Stroop Effect
Perform the analysis in the space below. Remember to follow [the instructions](https://docs.google.com/document/d/1-OkpZLjG_kX9J6LIQ5IltsqMzVWjh36QpnP2RYpVdPU/pub?embedded=True) and review the [project rubric](https://review.udacity.com/#!/rubrics/71/view) before submitting. Once you've completed the analysis and write-up, download this file as a PDF or HTML file, upload that PDF/HTML into the workspace here (click on the orange Jupyter icon in the upper left then Upload), then use the Submit Project button at the bottom of this page. This will create a zip file containing both this .ipynb doc and the PDF/HTML doc that will be submitted for your project.


(1) What is the independent variable? What is the dependent variable?

For these experiments, the independent variable is the type of word list the subject is asked to read: congruent or incongruent. The dependent variable is the time, measured in seconds, that the subject takes to read the list aloud.

(2) What is an appropriate set of hypotheses for this task? Specify your null and alternative hypotheses, and clearly define any notation used. Justify your choices.

Let $\mu_{C}$ be the average time the subjects take to read the *congruent* list of words, and let $\mu_{I}$ be the average time the subjects take to read the *incongruent* list. Our null hypothesis is that the type of list does not affect the reading time. In notation, this is
$$
  H_{0}: \mu_{C} = \mu_{I}
$$
and the alternative, which we'd expect based on published work about this experiment, is that the average time for reading incongruent lists is longer, or
$$
  H_{1}: \mu_{C} < \mu_{I}
$$
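
Because each participant reads both lists, the same hypotheses can be restated in terms of the per-participant difference in reading time. The symbol $\mu_{D}$ is notation introduced here for convenience and is not part of the original setup:
$$
  \mu_{D} = \mu_{I} - \mu_{C}, \qquad H_{0}: \mu_{D} = 0, \qquad H_{1}: \mu_{D} > 0
$$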

In [2]:
# Import packages
import pandas as pd
import scipy.stats as stats
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

In [3]:
# import data and report on a few rows
df = pd.read_csv('stroopdata.csv')
df.head()

| | Congruent | Incongruent |
|---|---|---|
| 0 | 12.079 | 19.278 |
| 1 | 16.791 | 18.741 |
| 2 | 9.564 | 21.214 |
| 3 | 8.63 | 15.687 |
| 4 | 14.669 | 22.803 |


(3) Report some descriptive statistics regarding this dataset. Include at least one measure of central tendency and at least one measure of variability. The name of the data file is 'stroopdata.csv'.

In [6]:
# report some descriptive statistics
df.describe()

| | Congruent | Incongruent |
|---|---|---|
| count | 24.0 | 24.0 |
| mean | 14.051125 | 22.015917 |
| std | 3.559358 | 4.797057 |
| min | 8.63 | 15.687 |
| 25% | 11.89525 | 18.71675 |
| 50% | 14.3565 | 21.0175 |
| 75% | 16.20075 | 24.0515 |
| max | 22.328 | 35.255 |


The dataset contains the times, in seconds, that participants took to read congruent and incongruent lists of color names. A common measure of central tendency is the _mean_ (or _average_), reported as *mean* above. For this dataset, the mean reading times for the congruent and incongruent lists are 14.051 and 22.016 seconds, respectively.

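A common measure of variability is the _sample standard deviation_, shown as *std* above: 3.559 and 4.797 seconds for the congruent and incongruent reading times, respectively. Note that `describe()` computes the sample standard deviation, which divides by $n - 1$. Because the hypotheses above concern the average reading times of subjects in general, the 24 participants are a sample, and the sample standard deviation is the natural estimate. For comparison, treating these 24 participants as the whole (albeit small) population gives the _population standard deviation_, which divides by $n$ and is given by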

In [10]:
(df['Congruent'].std(ddof=0),df['Incongruent'].std(ddof=0))

(3.484415712766633, 4.696055134513317)

or 3.484 and 4.696 for the congruent and incongruent reading times, respectively. 
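
The two kinds of standard deviation differ only by a constant factor, $\sqrt{(n-1)/n}$. As a quick check, the sketch below recovers the population values from the `std` row of `describe()` alone. The variable names are illustrative, and the numbers are copied from the output above.

```python
import math

n = 24  # number of participants, from the count row of describe()
sample_std = {"Congruent": 3.559358, "Incongruent": 4.797057}  # std row of describe()

# Population SD (ddof=0) equals sample SD (ddof=1) times sqrt((n - 1) / n)
population_std = {
    name: s * math.sqrt((n - 1) / n) for name, s in sample_std.items()
}

for name, value in population_std.items():
    print(f"{name}: {value:.3f}")
# prints:
# Congruent: 3.484
# Incongruent: 4.696
```

With $n = 24$ the factor is about 0.979, so the two definitions differ by roughly 2%.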

(4) Provide one or two visualizations that show the distribution of the sample data. Write one or two sentences noting what you observe about the plot or plots.

In [None]:
# Build the visualizations here
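
One possible way to show both distributions is a pair of overlaid histograms. The following is a minimal sketch, not a prescribed solution. The helper name `plot_reading_times`, the bin count, and the axis labels are assumptions chosen for illustration. It expects a DataFrame with the `Congruent` and `Incongruent` columns loaded earlier.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_reading_times(df: pd.DataFrame):
    """Overlay histograms of the Congruent and Incongruent reading times."""
    fig, ax = plt.subplots(figsize=(8, 5))
    for column in ("Congruent", "Incongruent"):
        # alpha < 1 keeps both distributions visible where they overlap
        ax.hist(df[column], bins=8, alpha=0.6, label=column)
    ax.set_xlabel("Time to read the list (seconds)")
    ax.set_ylabel("Number of participants")
    ax.set_title("Distribution of reading times by list type")
    ax.legend()
    return fig, ax
```

In the notebook, calling `plot_reading_times(df)` would draw the figure inline. A histogram of the per-participant differences (`df['Incongruent'] - df['Congruent']`) is another reasonable choice, since the differences are what a paired test examines.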

--write answer here--

(5)  Now, perform the statistical test and report your results. What is your confidence level or Type I error associated with your test? What is your conclusion regarding the hypotheses you set up? Did the results match up with your expectations? **Hint:**  Think about what is being measured on each individual, and what statistic best captures how an individual reacts in each environment.

In [1]:
# Perform the statistical test here
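
Since each participant is timed on both lists, a paired (dependent-samples) t-test on the per-participant differences fits the hypotheses stated in (2). The sketch below computes the one-sided test by hand so each step is visible. The function name `paired_one_sided_t_test` is hypothetical, and no results for the full dataset are claimed here.

```python
import numpy as np
from scipy import stats


def paired_one_sided_t_test(congruent, incongruent):
    """Paired t-test of H0: mu_C = mu_I against H1: mu_C < mu_I."""
    # Per-participant difference; positive means the incongruent list was slower.
    differences = (
        np.asarray(incongruent, dtype=float) - np.asarray(congruent, dtype=float)
    )
    n = differences.size
    # Standard error of the mean difference, using the sample SD (ddof=1).
    standard_error = differences.std(ddof=1) / np.sqrt(n)
    t_statistic = differences.mean() / standard_error
    # Upper-tail probability of a t distribution with n - 1 degrees of freedom.
    p_value = stats.t.sf(t_statistic, df=n - 1)
    return t_statistic, p_value
```

In the notebook, `paired_one_sided_t_test(df['Congruent'], df['Incongruent'])` would return the t statistic and one-sided p-value. The null hypothesis is rejected when the p-value falls below the chosen Type I error rate, for example $\alpha = 0.05$ (an illustrative choice, not one stated in the original).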

--write answer here--

(6) Optional: What do you think is responsible for the effects observed? Can you think of an alternative or similar task that would result in a similar effect? Some research about the problem will be helpful for thinking about these two questions!

--write answer here--