In [None]:
import matplotlib
matplotlib.__version__

In [None]:
import matplotlib.pyplot as plt
import numpy as np
# Render figures inline; unlike %pylab, this does not star-import names into the namespace
%matplotlib inline

# Learning Objectives

1. Name and recognize the two interfaces to matplotlib
2. Be able to use both interfaces to generate charts
3. Understand the connection between matplotlib and seaborn/pandas
4. Make plots with multiple datasets and multiple figures
5. Know the recommended functional form for writing your own plotting functions

# Plotting in Python

There are many libraries for plotting in Python. Some you may encounter:
* Plotly
* Bokeh
* **Matplotlib**
* **Seaborn**
* **Pandas**
* ggplot (port of R package of same name)

All of these aim to solve the same problem: allowing you to visualize your data.

# Appreciating the challenges

A good plotting library should:

* Be easy to use.
* Allow plotting of all kinds of data.
* Support arbitrarily fine-grained control.
* Support a variety of backends to make graphs in various formats.

# Matplotlib

Opinions differ about which library is best, but nearly everyone who plots in Python has used matplotlib. This makes it the de facto choice for plotting in Python.

## How does it work?

In an effort to make easy things easy, and hard things possible, matplotlib has a number of different levels at which it can be accessed. They are:

| Level | Control | Complexity |
|-------|---------|------------|
| plt | minimal, fast interface for plots, annotations | low |
| OO interface w/ pyplot | fine-grained control over figure, axes, etc. | medium |
| pure OO interface | e.g. embedding plots in GUI applications | too high |

# plt example

In [None]:
x_data = np.arange(0, 4, .011)
y_data = np.sin(x_data)
plt.plot(x_data, y_data)
plt.show()

## Weird

`plt` was imported as a library, but it appears to be keeping some state between the last two lines above, behavior that we'd usually associate with objects.

In fact, `plt` operates in a not-very-Pythonic way.

In [None]:
x_data = np.arange(0, 4, .01)
y_data = np.sin(x_data)
plt.plot(x_data, y_data)
# We can keep adding state here; all of it is drawn when we finally call show.
plt.plot(x_data, np.cos(x_data))
plt.title("sin(x) & cos(x)")
plt.show()

If you thought it was strange that we were working in Python, but there didn't seem to be any objects required to make our image, join the club!
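The hidden state can be inspected directly. The sketch below uses `plt.gcf()` and `plt.gca()` to retrieve the "current" figure and axes that pyplot tracks for us. The `Agg` backend line and the variable names are assumptions added so the snippet runs as a standalone script; in a notebook you would drop the backend line.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs outside a notebook
import matplotlib.pyplot as plt
import numpy as np

plt.close("all")  # start from a clean slate
x = np.arange(0, 4, 0.01)

plt.plot(x, np.sin(x))   # pyplot creates a Figure and an Axes behind the scenes
fig = plt.gcf()          # the Figure pyplot is currently tracking
ax = plt.gca()           # the Axes that plt.plot just drew on

plt.plot(x, np.cos(x))   # a second call reuses the same current Axes
n_lines = len(ax.lines)  # number of curves that ended up on that Axes
same_axes = plt.gca() is ax
plt.close(fig)
```

So the objects are there after all; `plt` simply keeps track of them so that we do not have to.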

# Behind the curtain

![Matplotlib diagram](http://matplotlib.org/_images/fig_map.png)



In [None]:
fig = plt.figure()
fig.add_subplot?

In [None]:
ax = fig.add_subplot(111)  # create an Axes on the figure so there is something to inspect
ax

In [None]:
fig = plt.figure()
ax = fig.add_subplot(111)  # nrows, ncols, index of this subplot
ax.plot(x_data, y_data)
ax.plot(x_data, np.cos(x_data))
ax.set_title('sin(x) and cos(x)')
plt.show()

In this example, the fact that state is maintained is less surprising, because it lives on the explicit `fig` and `ax` objects.

## Polling break 1

# Mix 'n' Match

It turns out you can combine `plt` calls with the object-oriented approach.

In [None]:
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(x_data, y_data)
ax.plot(x_data, np.cos(x_data))
plt.title('sin(x) and cos(x)')
plt.show()
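Mixing works because `plt` functions act on whichever axes is currently active. The sketch below, with assumed variable names and a headless backend so it runs as a script, checks where `plt.title` lands with one axes and then with two.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs outside a notebook
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 4, 0.01)

# One axes: plt.title is routed to the axes we created ourselves.
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(x, np.sin(x))
plt.title("set via plt")
single_title = ax.get_title()

# Two axes: plt.title only affects the current axes (the most recently created one).
fig2, (top, bottom) = plt.subplots(2, 1)
plt.title("where does this go?")
top_title = top.get_title()
bottom_title = bottom.get_title()
plt.close("all")
```

With several axes on a figure, calling `ax.set_title(...)` on the axes you mean is less ambiguous than relying on the current one.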

# Why should we use the OO approach?

Because it gives us fine-grained control over our plots that the `plt` shortcuts do not offer.

In [None]:
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(x_data, y_data, label='sin(x)')
ax.plot(x_data, np.cos(x_data), label='cos(x)')
plt.title('sin(x) and cos(x)')
ax.legend()
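As a further illustration of fine-grained control, here is a minimal sketch that customizes tick positions, tick labels, and the plot frame through the axes object. The specific tick values and styling choices are illustrative assumptions, not part of the original lesson.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs outside a notebook
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 4, 0.01)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin(x)")

# Put ticks at multiples of pi/2 and give them readable labels.
ax.set_xticks([0, np.pi / 2, np.pi])
ax.set_xticklabels(["0", "π/2", "π"])

# Hide the top and right edges of the frame.
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)

ax.legend(loc="lower left")
tick_positions = ax.get_xticks()
plt.close(fig)
```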

# Multiple plots

In [None]:
fig, ax_list = plt.subplots(2, 1)
y_funcs = [np.sin, np.cos]
for subp, y_func in zip(ax_list, y_funcs):
    subp.plot(x_data, y_func(x_data))

In [None]:
ax_list
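The object returned by `plt.subplots` is a NumPy array of axes whose shape follows the grid. A small sketch, with an assumed 2 by 2 layout and an assumed choice of functions, shows one way to loop over a 2-D grid using `.flat`:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs outside a notebook
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 4, 0.01)

# A single column of two plots gives a 1-D array of axes.
fig_col, ax_col = plt.subplots(2, 1)
col_shape = ax_col.shape

# A 2 by 2 grid gives a 2-D array, indexed as [row, column].
fig, ax_grid = plt.subplots(2, 2)
grid_shape = ax_grid.shape

funcs = [np.sin, np.cos, np.tanh, np.exp]
for ax, func in zip(ax_grid.flat, funcs):  # .flat walks the grid row by row
    ax.plot(x, func(x))
    ax.set_title(func.__name__)

fig.tight_layout()  # keep titles and tick labels from overlapping
last_title = ax_grid[1, 1].get_title()
plt.close("all")
```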

# What about pandas?

In [None]:
import pandas as pd

In [None]:
df = pd.DataFrame({'x':x_data, 'sinx':np.sin(x_data), 'cosx':np.cos(x_data)})
df = df.set_index('x')
df.head()

In [None]:
ax = df.cosx.plot()
ax.set_title('cos(x)')  # only the cosx series is drawn here

Calling the plot method on a pandas series returns a familiar matplotlib axes object.
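This can be checked directly, and the returned object can then be customized with ordinary matplotlib calls. The sketch below rebuilds a small frame so that it is self-contained; the label text and the reference line are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs outside a notebook
import matplotlib.pyplot as plt
from matplotlib.axes import Axes
import numpy as np
import pandas as pd

x = np.arange(0, 4, 0.01)
df = pd.DataFrame({"sinx": np.sin(x), "cosx": np.cos(x)},
                  index=pd.Index(x, name="x"))

ax = df["cosx"].plot()           # pandas does the drawing...
is_axes = isinstance(ax, Axes)   # ...and hands back a matplotlib Axes

# From here on, it is plain matplotlib.
ax.set_ylabel("cos(x)")
ax.axhline(0, color="gray", linewidth=0.5)
plt.close("all")
```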

In [None]:
# We can also supply an Axes object on which to draw!
fig, ax_list = plt.subplots(2,1)
cols = ['sinx', 'cosx']
for ax, col in zip(ax_list, cols):
    df[col].plot(ax=ax)
    ax.legend()
top_ax = ax_list[0]
top_ax.set_ylim(bottom=-1, top=1)

# Writing plotting functions

In [None]:
def my_plotter(ax, data1, data2, param_dict):
    """
    A helper function to make a graph

    Parameters
    ----------
    ax : Axes
        The axes to draw to

    data1 : array
       The x data

    data2 : array
       The y data

    param_dict : dict
       Dictionary of kwargs to pass to ax.plot

    Returns
    -------
    out : list
        list of artists added
    """
    out = ax.plot(data1, data2, **param_dict)
    return out
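A short usage sketch for this pattern follows. The helper is repeated in compact form so the snippet is self-contained, and the styling dictionaries are illustrative assumptions. Because the function receives the axes as its first argument, the caller decides where the plot goes.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs outside a notebook
import matplotlib.pyplot as plt
import numpy as np

def my_plotter(ax, data1, data2, param_dict):
    """Same helper as above, repeated so this sketch runs on its own."""
    return ax.plot(data1, data2, **param_dict)

x = np.arange(0, 4, 0.01)
fig, (ax1, ax2) = plt.subplots(1, 2)

# The same function draws on whichever axes it is given.
lines1 = my_plotter(ax1, x, np.sin(x), {"color": "C0", "label": "sin(x)"})
lines2 = my_plotter(ax2, x, np.cos(x), {"linestyle": "--", "label": "cos(x)"})

ax1.legend()
ax2.legend()
plt.close(fig)
```

Returning the artists lets the caller restyle or remove them later without redrawing.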

# Example:
Let's write a function that draws some data, plus horizontal
lines marking the 25th and 75th percentiles.

We'll call it iqr_plot.

### I do: 
a function for drawing a horizontal line at some point.

### We do: 
write the iqr_plot function.

### You do: 
make a 2 by 2 grid of plots using this function.
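One possible sketch of the three steps, for reference after attempting them. The names `draw_hline` and `iqr_plot`, their signatures, the gray dashed styling, and the random test data are all assumptions; this is one way to follow the axes-first pattern, not the only solution.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs outside a notebook
import matplotlib.pyplot as plt
import numpy as np

def draw_hline(ax, y, xmin, xmax, **kwargs):
    """Draw a horizontal line at height y from xmin to xmax on ax."""
    return ax.hlines(y, xmin, xmax, **kwargs)

def iqr_plot(ax, x, y, **plot_kwargs):
    """Plot y against x and mark the 25th and 75th percentiles of y."""
    q25, q75 = np.percentile(y, [25, 75])
    artists = ax.plot(x, y, **plot_kwargs)  # list of Line2D objects
    for q in (q25, q75):
        artists.append(
            draw_hline(ax, q, np.min(x), np.max(x), colors="gray", linestyles="--")
        )
    return artists

# A 2 by 2 grid, each panel showing a different random sample.
rng = np.random.default_rng(0)
x = np.arange(0, 4, 0.01)
fig, grid = plt.subplots(2, 2)
results = [iqr_plot(ax, x, rng.normal(size=x.size)) for ax in grid.flat]

# Keep one known dataset around to check the line positions.
y_check = np.sin(x)
fig_check, ax_check = plt.subplots()
check_artists = iqr_plot(ax_check, x, y_check)
lower_line_height = check_artists[1].get_segments()[0][0][1]
plt.close("all")
```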


In [None]:
ax.hlines?

In [None]:
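# scatter_matrix lives in pd.plotting; the top-level pd.scatter_matrix was removed from pandas
axes_array = pd.plotting.scatter_matrix(df)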
axes_array[1,0].hlines(0,-1,1)

# Seaborn

A specialized data visualization library, *built on matplotlib*, for drawing statistical graphics.

In [None]:
import seaborn as sns

In [None]:
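# Seaborn draws the regression plot and returns a FacetGrid
seaborn_grid = sns.lmplot(x="x", y="sinx", data=df.reset_index())
# The grid holds ordinary matplotlib Axes, so matplotlib methods still work
seaborn_grid.axes[0,0].hlines(0, -1, 5)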

In [None]:
seaborn_grid.axes