# Matplotlib

Matplotlib is a plotting and visualization library for Python. Alternatives include seaborn (which is built on top of Matplotlib) and bokeh. Matplotlib covers most of the plot types you may know from MATLAB and R, and its ```pyplot``` interface is modeled on MATLAB's plotting commands.

In [2]:
import matplotlib.pyplot as plt

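To display Matplotlib plots interactively in an IPython/Jupyter notebook, run the ```%pylab notebook``` magic command in a cell. It also imports the numpy and matplotlib names into the interactive namespace, as the output below shows. (```%matplotlib notebook``` enables the same interactive plots without the extra imports.)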

In [3]:
%pylab notebook

Populating the interactive namespace from numpy and matplotlib


In [4]:
import pandas as pd

In [5]:
from pandas import DataFrame

In [6]:
eggs = pd.read_table('report.txt',parse_dates=True,index_col=0)


In [6]:
eggs

Date,Region,Egg Class,Weighted Avg,Price Unit
2017-01-06,COMBINED REGIONAL,LARGE,117.65,CENTS PER DOZEN
2017-01-06,SOUTH CENTRAL,EXTRA LARGE,127.50,CENTS PER DOZEN
2017-01-06,SOUTHEAST,EXTRA LARGE,124.50,CENTS PER DOZEN
2017-01-06,MIDWEST,LARGE,111.50,CENTS PER DOZEN
2017-01-06,NORTHEAST,MEDIUM,75.00,CENTS PER DOZEN
2017-01-06,SOUTHEAST,LARGE,122.50,CENTS PER DOZEN
2017-01-06,COMBINED REGIONAL,MEDIUM,76.18,CENTS PER DOZEN
2017-01-06,NORTHEAST,LARGE,112.00,CENTS PER DOZEN
2017-01-06,COMBINED REGIONAL,EXTRA LARGE,120.70,CENTS PER DOZEN
2017-01-06,SOUTH CENTRAL,LARGE,123.50,CENTS PER DOZEN


In [7]:
southeast = eggs[eggs.Region=="SOUTHEAST"]

In [8]:
southeast_large = southeast[southeast['Egg Class']=="LARGE"]

In [9]:
southeast_large

Date,Region,Egg Class,Weighted Avg,Price Unit
2017-01-06,SOUTHEAST,LARGE,122.5,CENTS PER DOZEN
2017-01-13,SOUTHEAST,LARGE,72.5,CENTS PER DOZEN
2017-01-20,SOUTHEAST,LARGE,73.5,CENTS PER DOZEN
2017-01-27,SOUTHEAST,LARGE,78.5,CENTS PER DOZEN
2017-02-03,SOUTHEAST,LARGE,85.5,CENTS PER DOZEN
2017-02-10,SOUTHEAST,LARGE,82.5,CENTS PER DOZEN
2017-02-17,SOUTHEAST,LARGE,67.5,CENTS PER DOZEN
2017-02-24,SOUTHEAST,LARGE,65.5,CENTS PER DOZEN
2017-03-03,SOUTHEAST,LARGE,57.5,CENTS PER DOZEN
2017-03-10,SOUTHEAST,LARGE,57.5,CENTS PER DOZEN
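
The two filtering steps above can also be combined into a single boolean mask. The following is a minimal sketch of that idea, not the original workflow: it rebuilds a few rows from the tables above by hand so that it runs without ```report.txt```, and the names ```sample``` and ```mask``` are made up for this example.

```python
import pandas as pd

# A few rows copied by hand from the tables above, so no input file is needed.
rows = [
    ("2017-01-06", "SOUTHEAST", "EXTRA LARGE", 124.50),
    ("2017-01-06", "MIDWEST", "LARGE", 111.50),
    ("2017-01-06", "SOUTHEAST", "LARGE", 122.50),
    ("2017-01-13", "SOUTHEAST", "LARGE", 72.50),
]
sample = pd.DataFrame(rows, columns=["Date", "Region", "Egg Class", "Weighted Avg"])
sample["Date"] = pd.to_datetime(sample["Date"])
sample = sample.set_index("Date")

# Combine both conditions with & (each condition needs its own parentheses).
mask = (sample["Region"] == "SOUTHEAST") & (sample["Egg Class"] == "LARGE")
sample_southeast_large = sample[mask]
print(sample_southeast_large)
```

Filtering in two steps, as in the cells above, gives the same rows; the single mask just avoids the intermediate DataFrame.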


## Basic Plotting

You can use pyplot from matplotlib for most of your plotting needs. As in R, there are different functions to create different types of plots.

The simplest scatterplot can be created by just passing in x and y values to the ```pyplot.scatter``` method.

In [10]:
plt.scatter(southeast_large.index, southeast_large['Weighted Avg'])
plt.show()


### Adding labels

You can use the ```xlabel``` and ```ylabel``` methods to add labels to the x and y axes respectively.

The ```xticks``` method can be used to adjust the tick labels on the x-axis, for example by rotating them so that the dates do not overlap.

In [12]:
plt.scatter(southeast_large.index, southeast_large['Weighted Avg'])
plt.xlabel("Date")
plt.ylabel("Price in cents")
plt.xticks(rotation=70)
plt.show()


### Formatting Layout

You can use the ```tight_layout``` method to adjust the padding so that the labels fit inside the figure.

In [13]:
plt.scatter(southeast_large.index, southeast_large['Weighted Avg'])
plt.xlabel("Date")
plt.ylabel("Price in cents")
plt.xticks(rotation=70)
plt.tight_layout()
plt.show()

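
The same labelling and layout steps can also be written against explicit figure and axes objects, the interface used later in this notebook. The sketch below is one possible equivalent, assuming a handful of prices typed in from the table above; the names ```dates``` and ```prices``` are made up for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs outside a notebook; omit in Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# First five SOUTHEAST / LARGE prices from the table above.
dates = pd.to_datetime(["2017-01-06", "2017-01-13", "2017-01-20", "2017-01-27", "2017-02-03"])
prices = [122.5, 72.5, 73.5, 78.5, 85.5]

fig, ax = plt.subplots()
ax.scatter(dates, prices)
ax.set_xlabel("Date")                       # same role as plt.xlabel
ax.set_ylabel("Price in cents")             # same role as plt.ylabel
ax.tick_params(axis="x", labelrotation=70)  # same role as plt.xticks(rotation=70)
fig.tight_layout()                          # same role as plt.tight_layout
```

Working on ```ax``` directly makes it explicit which plot is being changed, which matters once a figure holds more than one subplot.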

In [14]:
eggs2016 = pd.read_table('report-2016-2017.txt',parse_dates=True,index_col=0)
southeast2016 = eggs2016[eggs2016.Region=="SOUTHEAST"]
southeast_large2016 = southeast2016[southeast2016['Egg Class']=="LARGE"]

In [15]:
plt.scatter(southeast_large2016.index, southeast_large2016['Weighted Avg'])
plt.xlabel("Date")
plt.ylabel("Price in cents")
plt.xticks(rotation=70)
plt.tight_layout()
plt.show()


### Adding Layers

You can add layers to a plot by calling plotting functions multiple times before showing it. Here a line is drawn through the same points with ```plot```.

In [16]:
plt.scatter(southeast_large2016.index, southeast_large2016['Weighted Avg'])
plt.plot(southeast_large2016.index, southeast_large2016['Weighted Avg'])
plt.xlabel("Date")
plt.ylabel("Price in cents")
plt.xticks(rotation=70)
plt.tight_layout()
plt.show()

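
When the points and the line show the same data, a single ```plot``` call with a ```marker``` can produce a similar picture. This is only a sketch of that alternative, using a few prices typed in from the table above; ```dates``` and ```prices``` are made-up names.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs outside a notebook; omit in Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# First five SOUTHEAST / LARGE prices from the table above.
dates = pd.to_datetime(["2017-01-06", "2017-01-13", "2017-01-20", "2017-01-27", "2017-02-03"])
prices = [122.5, 72.5, 73.5, 78.5, 85.5]

fig, ax = plt.subplots()
# One call draws the connecting line and a circle marker at every data point.
(line,) = ax.plot(dates, prices, marker="o")
ax.set_xlabel("Date")
ax.set_ylabel("Price in cents")
ax.tick_params(axis="x", labelrotation=70)
fig.tight_layout()
```

Separate ```scatter``` and ```plot``` calls remain useful when the two layers need independent styling or come from different data.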

## Adding Complexity

In [17]:
large2016=eggs2016[eggs2016['Egg Class']=="LARGE"]

In [18]:
plt.scatter(large2016.index, large2016['Weighted Avg'])
plt.show()


We can combine functionality from pandas and matplotlib to add complexity. We can use the ```groupby``` method from pandas to split the data into groups, and then use matplotlib to plot each group in a different color.

In [19]:
groups = large2016.groupby('Region')

In [20]:
fig, ax = plt.subplots()

for name, group in groups:
    ax.scatter(group.index, group['Weighted Avg'])
    ax.plot(group.index, group['Weighted Avg'])
ax.legend()
plt.show()


The legend is empty because none of the plotted elements has a label. We can use the ```label``` argument to give each element a legend entry.

In [21]:
fig, ax = plt.subplots()

for name, group in groups:
    ax.scatter(group.index, group['Weighted Avg'], label=name)
    ax.plot(group.index, group['Weighted Avg'], label=name)
ax.legend()
plt.show()


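Because both the points and the line carry a label, every region now appears twice. We can pass a list of labels to ```legend``` so that we don't get duplicate entries. Labels passed this way are matched to the plotted elements purely by order, so check that each label ends up on the element you intended.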

In [22]:
fig, ax = plt.subplots()

names = []

for name, group in groups:
    names.append(name)
    ax.scatter(group.index, group['Weighted Avg'], label=name)
    ax.plot(group.index, group['Weighted Avg'],label=name)
ax.legend(names)
plt.xticks(rotation=70)
plt.tight_layout()

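
Another way to avoid duplicate legend entries is to label only one element per group and then call ```legend``` without arguments. The sketch below illustrates this under stated assumptions: the ```demo``` DataFrame holds made-up illustrative prices (it is not read from the report files), and reusing the line color for the points is a design choice of this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs outside a notebook; omit in Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Made-up illustrative prices, not taken from the report files.
demo = pd.DataFrame(
    {
        "Region": ["MIDWEST", "MIDWEST", "NORTHEAST", "NORTHEAST", "SOUTHEAST", "SOUTHEAST"],
        "Weighted Avg": [111.5, 70.0, 112.0, 75.0, 122.5, 72.5],
    },
    index=pd.to_datetime(["2017-01-06", "2017-01-13"] * 3),
)

fig, ax = plt.subplots()
for name, group in demo.groupby("Region"):
    # Only the line carries a label, so each region appears once in the legend.
    (line,) = ax.plot(group.index, group["Weighted Avg"], label=name)
    # Reuse the line's color so the points and the line visibly belong together.
    ax.scatter(group.index, group["Weighted Avg"], color=line.get_color())

legend = ax.legend()
legend_labels = [text.get_text() for text in legend.get_texts()]
print(legend_labels)
```

Because each label is attached to the element it describes, the legend stays correct even if the plotting order changes.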

## Histograms

We can use the ```pyplot.hist``` method to create histograms.

In [23]:
plt.hist(southeast_large2016['Weighted Avg'],histtype="bar")
plt.show()


We can use the ```bins``` argument to set the number of bins in the histogram.

In [24]:
plt.hist(southeast_large2016['Weighted Avg'],bins=20)
plt.show()


You can set the width of the bars as a fraction (0-1) of the bin width using the ```rwidth``` argument. This allows you to adjust the spacing between the bars.

In [25]:
plt.hist(southeast_large2016['Weighted Avg'],bins=20,rwidth=.75)
plt.show()

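
It can help to look at what ```hist``` returns: the counts per bin, the bin edges, and the drawn bars. The sketch below uses the ten SOUTHEAST / LARGE prices from the table near the top of this notebook, typed in by hand; ```prices``` is a made-up name, and 5 bins are used instead of 20 only to keep the output short.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs outside a notebook; omit in Jupyter
import matplotlib.pyplot as plt

# The ten SOUTHEAST / LARGE prices shown in the table near the top of the notebook.
prices = [122.5, 72.5, 73.5, 78.5, 85.5, 82.5, 67.5, 65.5, 57.5, 57.5]

fig, ax = plt.subplots()
# bins=5 splits the range from min(prices) to max(prices) into 5 equal-width bins;
# rwidth=0.75 draws each bar at 75% of its bin width.
counts, edges, patches = ax.hist(prices, bins=5, rwidth=0.75)

print(counts)  # number of values falling in each bin
print(edges)   # bin boundaries: always one more edge than there are bins
```

Note that ```rwidth``` only changes how wide the bars are drawn; the counts and bin edges are unaffected.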

In [26]:
plt.hist(southeast2016['Weighted Avg'],bins=20,rwidth=.75)
plt.show()


We could add multiple histograms to the plot, but the result isn't very satisfactory. 

In [27]:
plt.figure()
plt.hist(southeast2016[southeast2016['Egg Class']=="LARGE"]['Weighted Avg'],bins=20,rwidth=.75)
plt.hist(southeast2016[southeast2016['Egg Class']=="MEDIUM"]['Weighted Avg'],bins=20,rwidth=.75)
plt.hist(southeast2016[southeast2016['Egg Class']=="EXTRA LARGE"]['Weighted Avg'],bins=20,rwidth=.75)
plt.show()


By adding all the data to one histogram, and using the ```stacked=True``` argument we can create a less cluttered plot, and one that is easier to understand.

In [28]:
plt.hist([southeast2016[southeast2016['Egg Class']=="LARGE"]['Weighted Avg'],
          southeast2016[southeast2016['Egg Class']=="MEDIUM"]['Weighted Avg'],
          southeast2016[southeast2016['Egg Class']=="EXTRA LARGE"]['Weighted Avg']], 
         stacked=True, bins=20)
plt.show()


You can change the type of histogram by using the ```histtype``` argument. The options are ```bar```, ```barstacked```, ```step```, and ```stepfilled``` with the default being ```bar```.

In [29]:
plt.hist([southeast2016[southeast2016['Egg Class']=="LARGE"]['Weighted Avg'],
          southeast2016[southeast2016['Egg Class']=="MEDIUM"]['Weighted Avg'],
          southeast2016[southeast2016['Egg Class']=="EXTRA LARGE"]['Weighted Avg']], 
         stacked=True, histtype="step",bins=20)
plt.show()


Currently our histogram isn't clear, as we don't know which color corresponds to which egg size. To fix this we can add a legend.

In [30]:
labels = ["Large","Medium","Extra Large"]
plt.hist([southeast2016[southeast2016['Egg Class']=="LARGE"]['Weighted Avg'],
          southeast2016[southeast2016['Egg Class']=="MEDIUM"]['Weighted Avg'],
          southeast2016[southeast2016['Egg Class']=="EXTRA LARGE"]['Weighted Avg']], 
         stacked=True, bins=20, histtype="step",label = labels)
plt.legend()
plt.show()

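
The repeated filter expressions in the cells above can be generated with a list comprehension, which keeps the data and the labels in the same order. This is a sketch under stated assumptions: ```demo``` contains made-up illustrative prices rather than the report data, and ```classes```, ```series``` and ```legend_labels``` are names invented for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs outside a notebook; omit in Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Made-up illustrative prices, not taken from the report files.
demo = pd.DataFrame(
    {
        "Egg Class": ["LARGE"] * 4 + ["MEDIUM"] * 3 + ["EXTRA LARGE"] * 4,
        "Weighted Avg": [72.5, 73.5, 78.5, 85.5, 55.0, 58.0, 60.0, 90.0, 95.0, 99.0, 101.0],
    }
)

classes = ["LARGE", "MEDIUM", "EXTRA LARGE"]
# One Series of prices per egg class, in the same order as `classes`.
series = [demo.loc[demo["Egg Class"] == egg_class, "Weighted Avg"] for egg_class in classes]
# Derive the legend labels from the class names so the two lists cannot drift apart.
labels = [egg_class.title() for egg_class in classes]

fig, ax = plt.subplots()
ax.hist(series, stacked=True, bins=5, histtype="step", label=labels)
legend = ax.legend()
legend_labels = [text.get_text() for text in legend.get_texts()]
```

Adding another egg class then only requires one more entry in ```classes```.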

We can create figures with multiple plots inside by using the ```subplots``` method.

In [31]:
fig, axes = plt.subplots(nrows=1, ncols=2)
ax0, ax1 = axes.flatten()
plt.show()


To fill in the subplots you treat each one like an individual plot. 

In [32]:
fig, axes = plt.subplots(nrows=1, ncols=2)
ax0, ax1 = axes.flatten()

labels = ["Large","Medium","Extra Large"]
ax0.hist([southeast2016[southeast2016['Egg Class']=="LARGE"]['Weighted Avg'],
          southeast2016[southeast2016['Egg Class']=="MEDIUM"]['Weighted Avg'],
          southeast2016[southeast2016['Egg Class']=="EXTRA LARGE"]['Weighted Avg']], 
         stacked=True, histtype="step",label = labels)
ax0.legend()

names = []

for name, group in groups:
    names.append(name)
    ax1.scatter(group.index, group['Weighted Avg'], label=name)
    ax1.plot(group.index, group['Weighted Avg'],label=name)
ax1.legend(names)
plt.xticks(rotation=70)
plt.tight_layout()


## Titles

You can use the ```set_title``` method to add a title to each subplot.

In [33]:
fig, axes = plt.subplots(nrows=1, ncols=2)
ax0, ax1 = axes.flatten()

labels = ["Large","Medium","Extra Large"]
ax0.hist([southeast2016[southeast2016['Egg Class']=="LARGE"]['Weighted Avg'],
          southeast2016[southeast2016['Egg Class']=="MEDIUM"]['Weighted Avg'],
          southeast2016[southeast2016['Egg Class']=="EXTRA LARGE"]['Weighted Avg']], 
         stacked=True, histtype="step",label = labels)
ax0.legend()
ax0.set_title('Egg Size and Price Histogram')

names = []

for name, group in groups:
    names.append(name)
    ax1.scatter(group.index, group['Weighted Avg'], label=name)
    ax1.plot(group.index, group['Weighted Avg'],label=name)
ax1.legend(names)
ax1.set_title('Large Egg Price by Region')
plt.xticks(rotation=70)
plt.tight_layout()


You can adjust the layout by modifying the ```nrows``` and ```ncols``` arguments.

In [34]:
fig, axes = plt.subplots(nrows=2, ncols=1)
ax0, ax1 = axes.flatten()

labels = ["Large","Medium","Extra Large"]
ax0.hist([southeast2016[southeast2016['Egg Class']=="LARGE"]['Weighted Avg'],
          southeast2016[southeast2016['Egg Class']=="MEDIUM"]['Weighted Avg'],
          southeast2016[southeast2016['Egg Class']=="EXTRA LARGE"]['Weighted Avg']], 
         stacked=True, histtype="step",label = labels)
ax0.legend()
ax0.set_title('Egg Size and Price Histogram')

names = []

for name, group in groups:
    names.append(name)
    ax1.scatter(group.index, group['Weighted Avg'], label=name)
    ax1.plot(group.index, group['Weighted Avg'],label=name)
ax1.legend(names)
ax1.set_title('Large Egg Price by Region')
plt.xticks(rotation=70)
plt.tight_layout()

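
When both ```nrows``` and ```ncols``` are greater than one, ```subplots``` returns a two-dimensional array of axes, and ```flatten``` turns it into a flat sequence that is easy to loop over. The sketch below shows this for a 2 by 2 grid; the panel titles are placeholders invented for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs outside a notebook; omit in Jupyter
import matplotlib.pyplot as plt

fig, axes = plt.subplots(nrows=2, ncols=2)
print(axes.shape)  # the axes array has one entry per grid cell

# flatten() walks the grid row by row: top-left, top-right, bottom-left, bottom-right.
for number, ax in enumerate(axes.flatten(), start=1):
    ax.set_title(f"Panel {number}")  # placeholder title

fig.tight_layout()  # keeps titles and tick labels of neighbouring panels from overlapping
```

Individual panels can also be addressed by position, for example ```axes[1, 0]``` for the bottom-left plot.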

## Customizing Plot Output

You can change the width of the line by using the ```linewidth``` argument.

In [35]:
plt.plot_date(southeast_large2016.index, southeast_large2016['Weighted Avg'])
plt.plot(southeast_large2016.index, southeast_large2016['Weighted Avg'], linewidth=5.0)
plt.xlabel("Date")
plt.ylabel("Price in cents")
plt.xticks(rotation=70)
plt.tight_layout()
plt.show()


Matplotlib draws lines in the order the commands run, so things plotted first end up behind things plotted last. We can bring the points in front of the line by simply plotting them after the line.

In [36]:
plt.plot(southeast_large2016.index, southeast_large2016['Weighted Avg'], linewidth=5.0)
plt.plot_date(southeast_large2016.index, southeast_large2016['Weighted Avg'])
plt.xlabel("Date")
plt.ylabel("Price in cents")
plt.xticks(rotation=70)
plt.tight_layout()
plt.show()

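
The stacking order can also be set explicitly with the ```zorder``` argument, where elements with a higher value are drawn on top. The sketch below is one way to use it, with a few prices typed in from the table near the top of the notebook; ```dates``` and ```prices``` are made-up names, and the points are deliberately plotted first to show that call order no longer decides the result.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs outside a notebook; omit in Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# First five SOUTHEAST / LARGE prices from the table near the top of the notebook.
dates = pd.to_datetime(["2017-01-06", "2017-01-13", "2017-01-20", "2017-01-27", "2017-02-03"])
prices = [122.5, 72.5, 73.5, 78.5, 85.5]

fig, ax = plt.subplots()
# The points are plotted first but get the higher zorder, so they are drawn on top.
(points,) = ax.plot(dates, prices, "o", zorder=3)
(thick_line,) = ax.plot(dates, prices, linewidth=5.0, zorder=2)
ax.tick_params(axis="x", labelrotation=70)
fig.tight_layout()
```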

You can adjust how the line is displayed by using the ```linestyle``` argument. The options are ```solid```, ```dashed```, ```dashdot```, ```dotted```, or the equivalent shorthand ```-```, ```--```, ```-.```, ```:```.

In [37]:
plt.plot(southeast_large2016.index, southeast_large2016['Weighted Avg'], linestyle="-.")
plt.plot_date(southeast_large2016.index, southeast_large2016['Weighted Avg'])
plt.xlabel("Date")
plt.ylabel("Price in cents")
plt.xticks(rotation=70)
plt.tight_layout()
plt.show()


You can adjust the color of the line by using the ```color``` argument. You can pass in colors by name, by abbreviation for simple colors (```b``` for blue, ```g``` for green, etc.), as hex strings, or as RGB tuples with values between 0 and 1. Opacity is controlled by alpha, either as a fourth value in the tuple or through the optional ```alpha``` argument.

In [39]:
plt.plot(southeast_large2016.index, southeast_large2016['Weighted Avg'],color="purple")
plt.plot_date(southeast_large2016.index, southeast_large2016['Weighted Avg'])
plt.xlabel("Date")
plt.ylabel("Price in cents")
plt.xticks(rotation=70)
plt.tight_layout()
plt.show()


In [40]:
plt.plot(southeast_large2016.index, southeast_large2016['Weighted Avg'],color="#6ef442")
plt.plot_date(southeast_large2016.index, southeast_large2016['Weighted Avg'])
plt.xlabel("Date")
plt.ylabel("Price in cents")
plt.xticks(rotation=70)
plt.tight_layout()
plt.show()


In [41]:
plt.plot(southeast_large2016.index, southeast_large2016['Weighted Avg'],color=(0.4296875,0.953,0.2578))
plt.plot_date(southeast_large2016.index, southeast_large2016['Weighted Avg'])
plt.xlabel("Date")
plt.ylabel("Price in cents")
plt.xticks(rotation=70)
plt.tight_layout()
plt.show()


In [42]:
plt.plot(southeast_large2016.index, southeast_large2016['Weighted Avg'],color=(0.4296875,0.953,0.2578,0.25))
plt.plot_date(southeast_large2016.index, southeast_large2016['Weighted Avg'])
plt.xlabel("Date")
plt.ylabel("Price in cents")
plt.xticks(rotation=70)
plt.tight_layout()
plt.show()

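
To see how the different color notations relate, ```matplotlib.colors.to_rgba``` converts any of them into a tuple of red, green, blue and alpha values between 0 and 1. The sketch below is a small illustration; the plotted numbers are placeholder x values with three prices from the table near the top of the notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs outside a notebook; omit in Jupyter
import matplotlib.colors as mcolors
import matplotlib.pyplot as plt

# A hex string becomes (red, green, blue, alpha), each channel scaled to 0-1.
hex_rgba = mcolors.to_rgba("#6ef442")
# A named color with an explicit alpha (opacity) value.
named_rgba = mcolors.to_rgba("purple", alpha=0.25)
print(hex_rgba)
print(named_rgba)

fig, ax = plt.subplots()
# The alpha argument applies the same opacity without building a tuple by hand.
(line,) = ax.plot([0, 1, 2], [122.5, 72.5, 73.5], color="purple", alpha=0.25)
```

Each hex pair is divided by 255, so ```6e``` (110 in decimal) becomes roughly 0.43 for the red channel.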

You can also change the marker type by using the ```marker``` argument. There are many different options, including ```+```, ```^```, ```o```, ```s``` (square), etc.

In [43]:
plt.plot(southeast_large2016.index, southeast_large2016['Weighted Avg'])
plt.plot_date(southeast_large2016.index, southeast_large2016['Weighted Avg'],
              marker="+")
plt.xlabel("Date")
plt.ylabel("Price in cents")
plt.xticks(rotation=70)
plt.tight_layout()
plt.show()


In [44]:
plt.plot(southeast_large2016.index, southeast_large2016['Weighted Avg'])
plt.plot_date(southeast_large2016.index, southeast_large2016['Weighted Avg'],
              marker="^")
plt.xlabel("Date")
plt.ylabel("Price in cents")
plt.xticks(rotation=70)
plt.tight_layout()
plt.show()


You can change the marker size by using the ```markersize``` argument.

In [45]:
plt.plot(southeast_large2016.index, southeast_large2016['Weighted Avg'])

plt.plot_date(southeast_large2016.index, southeast_large2016['Weighted Avg'],
              marker="+",markersize=10.0)
plt.xlabel("Date")
plt.ylabel("Price in cents")
plt.xticks(rotation=70)
plt.tight_layout()
plt.show()


There are two arguments for customizing the marker color: ```markerfacecolor``` (or ```mfc```) sets the fill, and ```markeredgecolor``` (or ```mec```) sets the outline.

In [46]:
plt.plot(southeast_large2016.index, southeast_large2016['Weighted Avg'])
plt.plot_date(southeast_large2016.index, southeast_large2016['Weighted Avg'],
              markerfacecolor="red")
plt.xlabel("Date")
plt.ylabel("Price in cents")
plt.xticks(rotation=70)
plt.tight_layout()
plt.show()


In [47]:
plt.plot(southeast_large2016.index, southeast_large2016['Weighted Avg'])
plt.plot_date(southeast_large2016.index, southeast_large2016['Weighted Avg'],
              markerfacecolor="red",markeredgecolor="red")
plt.xlabel("Date")
plt.ylabel("Price in cents")
plt.xticks(rotation=70)
plt.tight_layout()
plt.show()


In [48]:
plt.plot(southeast_large2016.index, southeast_large2016['Weighted Avg'])
plt.plot_date(southeast_large2016.index, southeast_large2016['Weighted Avg'],
              mfc="red",mec="red")
plt.xlabel("Date")
plt.ylabel("Price in cents")
plt.xticks(rotation=70)
plt.tight_layout()
plt.show()

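
The line and marker options covered so far can be combined in a single ```plot``` call. The sketch below is one possible combination, using a few prices typed in from the table near the top of the notebook; ```dates``` and ```prices``` are made-up names and the specific style values are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs outside a notebook; omit in Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# First five SOUTHEAST / LARGE prices from the table near the top of the notebook.
dates = pd.to_datetime(["2017-01-06", "2017-01-13", "2017-01-20", "2017-01-27", "2017-02-03"])
prices = [122.5, 72.5, 73.5, 78.5, 85.5]

fig, ax = plt.subplots()
(line,) = ax.plot(
    dates,
    prices,
    color="purple",     # line color
    linestyle="-.",     # dash-dot line
    linewidth=2.0,      # line width in points
    marker="^",         # triangle markers at each data point
    markersize=10.0,    # marker size in points
    mfc="red",          # marker fill (markerfacecolor)
    mec="black",        # marker outline (markeredgecolor)
)
ax.tick_params(axis="x", labelrotation=70)
fig.tight_layout()
```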

You can change the overall look of your plots with the ```pyplot.style``` module, for example ```plt.style.use('ggplot')```. You can even write and customize your own style.

In [49]:
plt.style.use('ggplot')
plt.plot(southeast_large2016.index, southeast_large2016['Weighted Avg'])
plt.plot_date(southeast_large2016.index, southeast_large2016['Weighted Avg'])
plt.xlabel("Date")
plt.ylabel("Price in cents")
plt.xticks(rotation=70)
plt.tight_layout()
plt.show()


In [50]:
print(plt.style.available)

[u'seaborn-darkgrid', u'seaborn-notebook', u'classic', u'seaborn-ticks', u'grayscale', u'bmh', u'seaborn-talk', u'dark_background', u'ggplot', u'fivethirtyeight', u'seaborn-colorblind', u'seaborn-deep', u'seaborn-whitegrid', u'seaborn-bright', u'seaborn-poster', u'seaborn-muted', u'seaborn-paper', u'seaborn-white', u'seaborn-pastel', u'seaborn-dark', u'seaborn', u'seaborn-dark-palette']


In [51]:
plt.xkcd()
plt.plot(southeast_large2016.index, southeast_large2016['Weighted Avg'])
plt.plot_date(southeast_large2016.index, southeast_large2016['Weighted Avg'])
plt.xlabel("Date")
plt.ylabel("Price in cents")
plt.xticks(rotation=70)
plt.tight_layout()
plt.show()

plt.rcdefaults()


In [52]:
plt.xkcd()
plt.style.use('ggplot')
plt.plot(southeast_large2016.index, southeast_large2016['Weighted Avg'])
plt.plot_date(southeast_large2016.index, southeast_large2016['Weighted Avg'])
plt.xlabel("Date")
plt.ylabel("Price in cents")
plt.xticks(rotation=70)
plt.tight_layout()
plt.show()

plt.rcdefaults()
