In [1]:
%matplotlib inline

# Pandas and DataFrames

Often, we have tables of data: collections of named columns arranged in rows.  The **Pandas** package gives us a **DataFrame** class that lets us index these columns by name, the same way as with dicts, while still getting the benefit of NumPy arrays, meaning we can still write vectorized code.
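As a minimal sketch of what that means, the cell below builds a tiny DataFrame by hand and indexes one column by name. The column names and values here are made up for illustration and are not taken from any real dataset.

```python
import pandas as pd

# Made-up values, used only to illustrate column indexing.
df = pd.DataFrame({"Angle": [0, 50, 100], "Time": [1200, 1850, 2400]})

# Index a column by name, as with a dict; the result is a pandas Series.
times = df["Time"]

# Vectorized arithmetic applies to every row at once, with no explicit loop.
times_doubled = times * 2

print(times_doubled.tolist())
```

The Series behaves much like a NumPy array here, which is why the multiplication needs no loop.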

Let's start playing with the analysis now.  We'll examine Pandas in more depth in the coming days.

In [2]:
import pandas as pd

## Making DataFrames from a Data File

Pandas has functions that can make DataFrames from a wide variety of file types.  To do this, use one of the functions in Pandas that start with "read_".  Here is a non-exhaustive list of examples:

| File Type | Function Name |
| :----:    |  :---:  |
| Excel | pd.read_excel() |
| CSV | pd.read_csv() |
| TSV | pd.read_table() |
| H5, HDF, HDF5 | pd.read_hdf() |
| JSON  | pd.read_json() |
| SQL | pd.read_sql_table() |


If your file type isn't listed here, chances are that there is a Python package out there with a DataFrame-creating function for your data. It's a very popular data structure!
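To see two of these readers in action without needing a file on disk, here is a small sketch that feeds in-memory text to **pd.read_csv()** and **pd.read_table()**. The CSV contents and column names are hypothetical stand-ins; with a real file you would pass its path instead of the `io.StringIO` object.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for a real file on disk.
csv_text = "Subject,Angle,Time\n1,0,1200\n1,50,1850\n2,100,2400\n"

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# The same data with tabs instead of commas, read with read_table,
# which uses a tab as its default separator.
tsv_text = csv_text.replace(",", "\t")
df_tsv = pd.read_table(io.StringIO(tsv_text))

print(df.shape)
print(df.equals(df_tsv))
```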

## Loading the Data

Please load the file “MentalRotation.csv” with **pd.read_csv()** and use it to answer the following questions about the results of the Mental Rotation psychology experiment. If you reach the end of the exercises, explore the dataset and DataFrames further and see what else you can find out about this experiment!

## Examining the Dataset

**head()**, **tail()**, **sample()**

Look at the first 5 lines of the dataset.

Look at the last 5 lines of the dataset.

Check 3 random lines in the dataset.

How many total trials (rows) are in the study?

What is the maximum number of trials that one subject performed?
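The methods named above can be tried out on toy data first. The sketch below is not the exercise solution: the values are made up, and the "Subject" column name is an assumption about how subjects might be identified in the real file.

```python
import pandas as pd

# Toy data; the "Subject" column name is an assumption.
df = pd.DataFrame({
    "Subject": [1, 1, 1, 2, 2, 3],
    "Time": [1200, 1850, 2400, 900, 1500, 3100],
})

first_rows = df.head(2)      # first 2 rows (the default is 5)
last_rows = df.tail(2)       # last 2 rows (the default is 5)
random_rows = df.sample(3)   # 3 rows chosen at random

n_rows = len(df)             # total number of rows

# Count the rows belonging to each subject, then take the largest count.
rows_per_subject = df["Subject"].value_counts()
max_rows = rows_per_subject.max()

print(n_rows, max_rows)
```

`value_counts()` is one way to count rows per subject; `df.groupby("Subject").size()` would give the same counts.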

### Making New Columns

Convert the Time column to seconds by dividing it by 1000.

Change the "Correct" column to *bool* (True/False) using the **astype()** method
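Both steps follow the same pattern: compute a new Series and assign it to a column name. A minimal sketch on made-up data, where `TimeSecs` is a hypothetical name for the new column and "Correct" is assumed to be stored as 0/1 integers:

```python
import pandas as pd

# Toy data; values are made up and Time is assumed to be in milliseconds.
df = pd.DataFrame({"Time": [1200, 1850, 2400], "Correct": [1, 0, 1]})

# Assigning to a new name creates a column; the division is vectorized.
df["TimeSecs"] = df["Time"] / 1000

# astype() returns a converted copy, so assign it back to keep the result.
df["Correct"] = df["Correct"].astype(bool)

print(df.dtypes)
```

Assigning to an existing column name, as done for "Correct", overwrites that column in place.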

### The mean() method

What is the mean response time, across all trials?

What percent of trials were answered correctly?

What percent of trials were “Matching” trials?
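A useful detail for the percentage questions: taking the mean of a bool column treats True as 1 and False as 0, so the result is the proportion of True values. A short sketch on made-up data:

```python
import pandas as pd

# Toy data with made-up values.
df = pd.DataFrame({
    "Time": [1200, 1850, 2400, 950],
    "Correct": [True, False, True, True],
})

# Mean of a numeric column.
mean_time = df["Time"].mean()

# Mean of a bool column is the fraction of True values; scale to percent.
percent_correct = df["Correct"].mean() * 100

print(mean_time, percent_correct)
```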

### Slicing

Is there a difference in accuracy between matching and non-matching trials?

Is there a response time difference between matching and non-matching trials?

Is there a response time difference between matching and non-matching trials, for different rotation Angles?
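One way to approach comparisons like these is to slice the DataFrame with a boolean mask and summarize each slice. The sketch below uses made-up data and assumes "Matching" and "Correct" are bool columns. It also shows `groupby()`, which is an alternative to slicing rather than something this section requires.

```python
import pandas as pd

# Toy data with made-up values; bool dtypes are an assumption.
df = pd.DataFrame({
    "Matching": [True, True, False, False],
    "Angle": [0, 50, 0, 50],
    "Time": [1000, 2000, 1500, 2500],
    "Correct": [True, True, True, False],
})

# A boolean mask keeps only the rows where the condition is True.
matching = df[df["Matching"]]
nonmatching = df[~df["Matching"]]   # ~ inverts the mask

acc_matching = matching["Correct"].mean()
acc_nonmatching = nonmatching["Correct"].mean()

# Alternative: group by one or more columns and summarize each group.
time_by_group = df.groupby(["Matching", "Angle"])["Time"].mean()

print(acc_matching, acc_nonmatching)
print(time_by_group)
```

Slicing keeps each subset available for further inspection, while `groupby()` scales more easily once there are several grouping columns.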

### Plotting

Plot the response time distribution as a histogram.

Plot the average response time for each stimulus category (matching and non-matching).

Is there a correlation between Angle of mental rotation and response time?  Visualize the relationship.

Does the relationship between Angle of mental rotation and response time differ between stimulus categories?
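Pandas objects have a `.plot` accessor that draws with Matplotlib, which covers the plot types asked for here. The sketch below uses made-up data and assumed column names. The `matplotlib.use("Agg")` line is only there so the code runs as a plain script without a display; skip it in a notebook that already uses `%matplotlib inline`.

```python
import matplotlib
matplotlib.use("Agg")  # script-only: non-interactive backend; skip in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Toy data with made-up values.
df = pd.DataFrame({
    "Matching": [True, True, True, False, False, False],
    "Angle": [0, 50, 100, 0, 50, 100],
    "Time": [1000, 1500, 2100, 1200, 1800, 2600],
})

fig, axes = plt.subplots(1, 3, figsize=(12, 3))

# Histogram of one column.
df["Time"].plot.hist(ax=axes[0], bins=5, title="Response times")

# Bar plot of a per-group mean.
df.groupby("Matching")["Time"].mean().plot.bar(ax=axes[1], title="Mean time by category")

# Scatter plot of two columns against each other.
df.plot.scatter(x="Angle", y="Time", ax=axes[2], title="Time vs. Angle")

# Pearson correlation coefficient between the two columns.
r = df["Angle"].corr(df["Time"])
print(r)
```

To compare stimulus categories, the same scatter plot and `corr()` call can be repeated on each slice of the DataFrame.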