Using Python
============

In [None]:
# This is called a comment. Anything in the Python code cells below that starts with a # is just there to explain the code
# for human readers. Python doesn't interpret it.
# Code cells have an "In []:" next to them.
# You can edit code cells if you download this file (File->Download As->Notebook) and open the .ipynb file in Canopy.
# To run anything in a code cell, click it to highlight it, and press ctrl+enter

This cell is written in [_markdown_](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet#emphasis). You can edit it, too. Double-click this cell to see the raw markdown text, then press ctrl+enter to return to the typeset version.
You can also insert cells above or below this one by selecting this cell and choosing Insert->Insert Cell Below (or Above) from the menu at the top of the page. To make a new cell markdown rather than code, press esc and then m, then click in the cell again to edit it.

In [None]:
print("hello")

I just told the computer to say "hello".

You can use Python as a big, expensive calculator:

In [None]:
2+2

You can store numbers in variables:

In [None]:
a = 10

In [None]:
a # this outputs the value of a

In [None]:
a+3

Here is a list of numbers stored in a variable:

In [None]:
b=[5,25]

In [None]:
sum(b+b)
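A small sketch of why the cell above gives 60: for Python lists, `+` joins the lists end to end instead of adding their elements. The names `b_example`, `joined`, and `total` are made up for this illustration.

```python
# Illustration only: b_example mirrors the list b defined above.
b_example = [5, 25]

joined = b_example + b_example  # list "+" concatenates: [5, 25, 5, 25]
total = sum(joined)             # adds up all four numbers

print(joined)
print(total)
```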

The numpy package gives you many more ways to work with lists of numbers. In numpy, these are stored as arrays:

In [None]:
import numpy as np
help(np)

In [None]:
c=np.array([a]+b)
d=np.arange(1,5)
print('c =', c)
print('d =', d)
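Two details in the cell above are easy to miss, so here is a minimal sketch of them: `np.arange(1,5)` stops before 5, and `+` on arrays adds element by element, unlike `+` on lists. The variable names below are hypothetical and chosen so they do not overwrite anything in this notebook.

```python
import numpy as np

values = np.arange(1, 5)       # starts at 1 and stops BEFORE 5
print(values)

doubled = values + values      # arrays add element by element
print(doubled)

# Converting to plain lists first makes "+" concatenate instead
as_list = values.tolist() + values.tolist()
print(as_list)
```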

The `np.concatenate` command sticks arrays together.

In [None]:
cd=np.concatenate((c,d,b)) # Note that the arrays and lists you want to stick together must be grouped in their own parentheses
print('cd =', cd)

In [None]:
s=cd.reshape(3,3)
print('s =\n', s, '\n... a 3x3 matrix', sep='') # '\n' means 'Insert a new line here'
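The reshape above works because `cd` holds exactly 9 numbers, which fill a 3x3 grid row by row. The sketch below shows that, and what happens when the sizes do not match. It assumes the same nine values as `cd`; the names `flat`, `grid`, and `reshape_failed` are illustrative only.

```python
import numpy as np

# Same nine values as cd above, typed out so this cell runs on its own
flat = np.array([10, 5, 25, 1, 2, 3, 4, 5, 25])

grid = flat.reshape(3, 3)   # 3 rows x 3 columns = 9 values, filled row by row
print(grid.shape)
print(grid[1])              # the second row holds the 4th to 6th values

# Nine values cannot fill a 2x4 grid, so numpy raises an error
reshape_failed = False
try:
    flat.reshape(2, 4)
except ValueError as err:
    reshape_failed = True
    print("reshape failed: %s" % err)
```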

In [None]:
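mu,sigma=1,0.5 # Mean and standard deviation of the first distribution
mu2,sigma2=1.5,0.5 # Mean and standard deviation of the second distribution
r=np.random.normal(mu,sigma,500) # 500 random numbers drawn from a normal distribution
r2=np.random.normal(mu2,sigma2,500)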

In [None]:
print('sum of r =', np.sum(r))
print('sum of r2 =', np.sum(r2))
print('mean of r =', np.mean(r))
print('mean of r2 =', np.mean(r2))
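Because `r` and `r2` are random, the sums and means above change a little every time the cells are rerun. One possible way to get repeatable numbers is to seed the random number generator first. This is a sketch only: the seed value 42 is arbitrary, and `first` and `second` are hypothetical names.

```python
import numpy as np

np.random.seed(42)                      # arbitrary seed, chosen for illustration
first = np.random.normal(1, 0.5, 500)

np.random.seed(42)                      # same seed again...
second = np.random.normal(1, 0.5, 500)  # ...gives exactly the same numbers

print(np.array_equal(first, second))
print(np.mean(first))                   # close to, but not exactly, mu = 1
```

With 500 points the sample mean usually lands within a few hundredths of `mu`, which is why the means printed above are near 1 and 1.5 rather than exactly equal to them.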

Matplotlib lets you plot things (note that, depending on your setup, these may show up as separate windows):

In [None]:
import matplotlib.pyplot as plt
n,bins,patches=plt.hist(r,50)
plt.show()

To get matplotlib plots to show up inside this notebook instead, run these commands that start with `%`:

In [None]:
%matplotlib inline
%config InlineBackend.figure_formats = {'svg',}

In [None]:
n,bins,patches=plt.hist(r,20,alpha=0.5) # This not only plots a histogram, but stores some useful info in a few variables
n2,bins2,patches2=plt.hist(r2,bins=bins,alpha=0.5) # We use the 'bins' variable again here to plot the second histogram.
plt.xlabel('Random Number')
plt.ylabel('Number of occurrences')
plt.show()
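The "useful info" stored by `plt.hist` is the count in each bin and the bin edges. The sketch below uses `np.histogram`, which returns counts and edges in the same form, so it can be inspected without drawing anything. The sample data and variable names here are assumptions for illustration, not the notebook's own `r` and `r2`.

```python
import numpy as np

np.random.seed(0)                        # arbitrary seed so the sketch is repeatable
sample = np.random.normal(1, 0.5, 500)
other = np.random.normal(1.5, 0.5, 500)

counts, edges = np.histogram(sample, bins=20)
print(len(counts))    # one count per bin
print(len(edges))     # one more edge than there are bins
print(counts.sum())   # every point in sample falls in some bin

# Reusing the edges puts the second data set in identical bins.
# Points of `other` outside the range of the edges are left out.
counts_other, edges_other = np.histogram(other, bins=edges)
print(counts_other.sum())
```

Reusing the same edges is what makes the two overlaid histograms directly comparable, bar for bar.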

In [None]:
plt.plot(r,r2,'k.')
plt.axis('equal') # Same scale on both axes
plt.xlabel('r')
plt.ylabel('r2')
plt.show()


Complex things you can't easily do in Excel:

In [None]:
from matplotlib import cm # Colormaps
from mpl_toolkits.axes_grid1 import make_axes_locatable # Helper for attaching extra plots to an existing one

binwidth = 0.25
xymax = np.max([np.max(r), np.max(r2)])
xymin = np.min([np.min(r), np.min(r2)])
bins = np.arange(xymin, xymax + binwidth, binwidth) # Figure out bins for histograms (the extra binwidth makes sure the largest value is included)
with plt.style.context('ggplot'): # My favorite "style" for plots
    fig, axScatter = plt.subplots(figsize=(5.5, 5.5)) # Scatter plot
    plt.xlabel('Value of r')
    plt.ylabel('Value of r2')
    axScatter.hist2d(r,r2,cmap=cm.RdBu_r) # 2D histogram shown with a red-blue colormap
    axScatter.scatter(r, r2,alpha=0.3) # Scatter plot with translucent points
    axScatter.set_aspect(1.)
    divider = make_axes_locatable(axScatter) # Place other plots
    axHistx = divider.append_axes("top", 1.2, pad=0.15, sharex=axScatter)
    axHisty = divider.append_axes("right", 1.2, pad=0.15, sharey=axScatter)
    plt.setp(axHistx.get_xticklabels() + axHisty.get_yticklabels(),
             visible=False)
    axHistx.hist(r, bins=bins,color='blue')
    axHistx.set_ylabel('Count of r')
    axHisty.hist(r2, bins=bins, orientation='horizontal',color='blue')
    axHisty.set_xlabel('Count of r2')
    plt.draw()
    plt.show()

Also: stats!

In [None]:
from scipy import stats
stats.ttest_ind(r,r2)

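A p-value below 0.05 means you can reject the hypothesis that `r` and `r2` have the same mean at the 5% significance level (often described as 95% confidence). It does not prove the means differ: it says a difference this large would be unlikely if the true means were equal.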
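To see roughly what goes into the test statistic, here is a hand-rolled sketch of the standard two-sample t statistic with pooled variance. By default `stats.ttest_ind` assumes the two groups have equal variances, which is the case this formula covers. The function name `pooled_t_statistic` and the tiny data sets are hypothetical, and this is an illustration, not a replacement for scipy.

```python
import numpy as np

def pooled_t_statistic(x, y):
    """Two-sample t statistic assuming both groups share the same variance."""
    n_x, n_y = len(x), len(y)
    var_x = np.var(x, ddof=1)   # ddof=1 gives the sample variance
    var_y = np.var(y, ddof=1)
    # Weighted average of the two sample variances
    pooled_var = ((n_x - 1) * var_x + (n_y - 1) * var_y) / (n_x + n_y - 2)
    # Difference in means, measured in units of its standard error
    return (np.mean(x) - np.mean(y)) / np.sqrt(pooled_var * (1.0 / n_x + 1.0 / n_y))

t_small = pooled_t_statistic([1, 2, 3], [2, 3, 4])
t_same = pooled_t_statistic([1, 2, 3], [1, 2, 3])
print(t_small)
print(t_same)
```

The further the statistic is from zero, the smaller the p-value. Identical groups give a statistic of exactly zero.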

Exercises
---------
1. Change the number of points in `r` and `r2` from 500 to 5000 and see what happens.
2. Find where the `np.random.normal` function generates the list of random numbers in the variable `r`. Change the random numbers in `r` from a normal (Gaussian) distribution to a logistic distribution (`np.random.logistic`).
3. Insert a new cell below. Generate 100 random numbers from a [von Mises distribution](http://docs.scipy.org/doc/numpy/reference/routines.random.html) with a mu of 0 and a kappa of 1. Use `plt.subplot(111,projection='polar')` to create a polar plot, and make a [histogram](http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.hist) of your random numbers (hint: use `bins=np.arange(-np.pi,np.pi,0.2)` in your histogram command to get bins of appropriate widths; don't forget `plt.show()`!).
4. What happens if you change kappa in the previous question to 2? 5? 10?
5. What happens if you use `np.concatenate` to stick two lists of von Mises-distributed numbers together, one with a mu of 0 and one with a mu of `np.pi/2`?