# Table of Contents
* [Learning Objectives:](#Learning-Objectives:)
	* [The `matplotlib` object-oriented API](#The-matplotlib-object-oriented-API)
	* [Plotting with Pandas DataFrames](#Plotting-with-Pandas-DataFrames)
		* [Using Index](#Using-Index)
		* [Using `data=` keyword](#Using-data=-keyword)
	* [Legends, labels and titles](#Legends,-labels-and-titles)
	* [Setting colors, linewidths, linetypes](#Setting-colors,-linewidths,-linetypes)
		* [Colors](#Colors)
		* [Line and marker styles](#Line-and-marker-styles)
	* [Control over axis appearance](#Control-over-axis-appearance)
		* [Plot range](#Plot-range)
		* [Logarithmic scale](#Logarithmic-scale)
		* [Placement of ticks and custom tick labels](#Placement-of-ticks-and-custom-tick-labels)
		* [Axis number and axis label spacing](#Axis-number-and-axis-label-spacing)
			* [Axis position adjustments](#Axis-position-adjustments)
		* [Scientific notation](#Scientific-notation)
		* [Axis grid](#Axis-grid)
		* [Axis spines](#Axis-spines)
		* [Twin axes](#Twin-axes)
		* [Axes where x and y are zero](#Axes-where-x-and-y-are-zero)
		* [Text annotation](#Text-annotation)
	* [Formatting text: LaTeX, fontsize, font family](#Formatting-text:-LaTeX,-fontsize,-font-family)
		* [Some LaTeX shown in Markdown cell](#Some-LaTeX-shown-in-Markdown-cell)
		* [Figure size, aspect ratio and DPI](#Figure-size,-aspect-ratio-and-DPI)
		* [Saving figures](#Saving-figures)
			* [What formats are available, and which ones should be used for best quality?](#What-formats-are-available,-and-which-ones-should-be-used-for-best-quality?)


# Learning Objectives:

After completion of this module, learners should be able to:

* construct reproducible scripts for generation of figures
* construct basic two-dimensional plots
* customize `matplotlib` objects (e.g., position, size, color, line thickness, axis ticks, colormaps, etc.)
* annotate `matplotlib` plots with custom text, legends, labels, etc.

In [None]:
import matplotlib
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

## The `matplotlib` object-oriented API

The main idea of object-oriented programming is to work with objects to which functions and actions can be applied, so that no object or program state needs to be global (as it is in the MATLAB-like API). The real advantage of this approach becomes apparent when more than one figure is created, or when a figure contains more than one subplot. 

To use the object-oriented API, we start out very much like in the previous example, but instead of creating a new global figure instance, we store a reference to the newly created figure instance in the `fig` variable, and from it we create a new axes instance `graph` using the `add_axes` method of the `Figure` class instance `fig`:

In [None]:
fig = plt.figure()
x = np.linspace(0, 5, 100)
y = x ** 2
graph = fig.add_axes([0, 0, 1, 0.3]) # left, bottom, width, height (range 0 to 1)

graph.plot(x, y, 'r')

graph.set_xlabel('x')
graph.set_ylabel('y')
graph.set_title('title');

Although a bit more code is involved, the advantage is that we now have full control of where the plot axes are placed, and we can easily add more than one axis to the figure:

In [None]:
fig = plt.figure()

graph1 = fig.add_axes([0.1, 0.1, 0.8, 0.8]) # main axes
graph2 = fig.add_axes([0.2, 0.5, 0.4, 0.3]) # inset axes

# main figure
graph1.plot(x, y, 'r')
graph1.set_xlabel('x')
graph1.set_ylabel('y')
graph1.set_title('Title\n')

# inset
graph2.plot(y, x, 'g')
graph2.set_xlabel('y')
graph2.set_ylabel('x')
graph2.set_title('Inset Title');
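
To make the point about avoiding global state concrete, here is a minimal sketch with two independent figures. The names `fig_a`, `ax_a`, `fig_b` and `ax_b` are illustrative only; each axes object is addressed directly, so the order in which we draw does not matter.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 100)

# Two independent figures, each with its own axes object
fig_a = plt.figure()
ax_a = fig_a.add_axes([0.1, 0.1, 0.8, 0.8])
fig_b = plt.figure()
ax_b = fig_b.add_axes([0.1, 0.1, 0.8, 0.8])

# We can draw on either figure at any time through its axes object;
# there is no "current figure" to keep track of
ax_b.plot(x, x ** 3, 'g')
ax_a.plot(x, x ** 2, 'r')
ax_a.set_title('quadratic')
ax_b.set_title('cubic')
```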

If we don't care about being explicit about where our plot axes are placed in the figure canvas, then we can use one of the many axis layout managers in matplotlib. My favorite is `subplots`, which can be used like this:

In [None]:
fig, axes = plt.subplots()

axes.plot(x, y, 'r')
axes.set_xlabel('x')
axes.set_ylabel('y')
axes.set_title('title');

In [None]:
x = np.linspace(0, 5, 20)
y = x ** 2
fig, axes = plt.subplots(nrows=1, ncols=3)

styles = ['go-','r+', 'bx-']
for style, ax in zip(styles, axes):
    ax.plot(x, y, style)
    ax.set_xlabel('x')
    ax.set_ylabel('y')
    ax.set_title('title')

That was easy, but it isn't so pretty with overlapping figure axes and labels, right?

We can deal with that by using the `fig.tight_layout` method, which automatically adjusts the positions of the axes on the figure canvas so that there is no overlapping content:

In [None]:
fig, axes = plt.subplots(nrows=1, ncols=2)

for ax in axes:
    ax.plot(x, y, 'r')
    ax.set_xlabel('x')
    ax.set_ylabel('y')
    ax.set_title('title')
    
fig.tight_layout()
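
The same pattern extends to grids with several rows and columns. The sketch below assumes a 2x2 layout, in which case `plt.subplots` returns a two-dimensional NumPy array of axes; iterating over `axes.flat` is one convenient way to visit all of them.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 20)

# With more than one row and more than one column,
# subplots returns a 2-D array of axes objects
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(8, 6))

# axes.flat iterates over all four axes in row-major order
for power, ax in enumerate(axes.flat, start=1):
    ax.plot(x, x ** power, 'r')
    ax.set_title('x^%d' % power)

fig.tight_layout()
```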

## Plotting with Pandas DataFrames

Matplotlib 1.5 added improved support for plotting directly from Pandas DataFrames.

This data set provides the percentage of bachelor degrees awarded to women in 17 fields from 1970 to 2011.

In [None]:
df=pd.read_csv('data/percent-bachelors-degrees-women-usa.csv', index_col='Year')
df.info()

### Using Index

You can pass the `.index` object as the `x` array and plot each column using `.plot`.

In [None]:
fig, ax = plt.subplots()

ax.plot(df.index, df['Computer Science'], 'g--')

### Using `data=` keyword

With the `data=` keyword argument, columns are referred to by name: we pass the column label as a string, and the DataFrame index is used automatically for the x values.

In [None]:
fig, ax = plt.subplots()

ax.plot('Agriculture', 'k', data=df)
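
When the x values live in an ordinary column rather than in the index, both column names can be passed. The following sketch uses a small made-up DataFrame (`toy`, with hypothetical columns `year` and `share_pct`) so that it runs without the CSV file.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up numbers for illustration only (not taken from the CSV file)
toy = pd.DataFrame({'year': [1970, 1980, 1990],
                    'share_pct': [10.0, 20.0, 30.0]})

fig, ax = plt.subplots()
# Two column names plus data=: the first is x, the second is y
(line,) = ax.plot('year', 'share_pct', data=toy)
```

Note that a second string that happens to be a valid format shorthand (such as `'k'`) is interpreted as a format rather than a column name, so descriptive column names are the safer choice.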

## Legends, labels and titles

Now that we have covered the basics of how to create a figure canvas and add axes instances to the canvas, let's look at how to decorate a figure with titles, axis labels, and legends.

**Figure titles**

A title can be added to each axis instance in a figure. To set the title, use the `set_title` method in the axes instance:

In [None]:
fig, ax = plt.subplots(figsize=(15,7))
ax.plot('Physical Sciences', data=df)
ax.set_title("% Degrees awarded to women in physical sciences");

**Axis labels**

Similarly, with the methods `set_xlabel` and `set_ylabel`, we can set the labels of the X and Y axes:

In [None]:
ax.set_xlabel("year")
ax.set_ylabel("%")
fig

To add a legend, pass the `label="label text"` keyword argument when plots or other objects are added to the figure, and then call the `legend` method without arguments: 

In [None]:
fig, ax = plt.subplots(figsize=(15,7))
ax.set_title("% Degrees awarded to women in physical sciences");

ax.plot('Physical Sciences', data=df, label='Physical Sciences')
ax.plot('Psychology', data=df, label='Psychology')
ax.plot('Health Professions', data=df, label='Health Professions')

ax.legend()
plt.show()

The advantage with this method is that if curves are added or removed from the figure, the legend is automatically updated accordingly.

The `legend` method takes an optional keyword argument `loc` that specifies where in the figure the legend is drawn. The allowed values of `loc` are numerical codes, or equivalent strings such as `'upper right'`, for the various places the legend can be drawn. See http://matplotlib.org/users/legend_guide.html#legend-location for details. Some of the most common `loc` values are:

```python
ax.legend(loc=0) # let matplotlib decide the optimal location
ax.legend(loc=1) # upper right corner
ax.legend(loc=2) # upper left corner
ax.legend(loc=3) # lower left corner
ax.legend(loc=4) # lower right corner
# .. many more options are available
```

In [None]:
ax.legend(loc=0)
fig
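
The location strings are often easier to read than the numeric codes. A minimal sketch, using sine and cosine curves as stand-in data:

```python
import matplotlib.pyplot as plt
import numpy as np

t = np.linspace(0, 2 * np.pi, 50)
fig, ax = plt.subplots()
ax.plot(t, np.sin(t), label='sin')
ax.plot(t, np.cos(t), label='cos')

# 'upper left' is the string equivalent of loc=2
legend = ax.legend(loc='upper left')

# The legend entries follow the order in which the curves were added
labels = [text.get_text() for text in legend.get_texts()]
```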

## Setting colors, linewidths, linetypes

### Colors

With matplotlib, we can define the colors of lines and other graphical elements in a number of ways. First of all, we can use the MATLAB-like syntax where `'b'` means blue, `'g'` means green, etc. The MATLAB API for selecting line styles is also supported: for example, `'b.-'` means a blue line with dots:

In [None]:
# MATLAB style line color and style 
fig, ax = plt.subplots()
ax.plot(x, x**2, 'b.-') # blue line with dots
ax.plot(x, x**3, 'g--') # green dashed line

We can also define colors by their names or RGB hex codes and optionally provide an alpha value using the `color` and `alpha` keyword arguments:

In [None]:
fig, ax = plt.subplots()

ax.plot(x, 1.5*x, color="#15cc55")        # RGB hex code for a greenish color
ax.plot(x, x+2, color="#1155dd", linewidth=3) # RGB hex code for a bluish color
ax.plot(x, x+1, color="red", alpha=0.5, linewidth=4) # half-transparent red
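
To see how a color name or hex code is interpreted, `matplotlib.colors.to_rgba` converts it to a tuple of red, green, blue and alpha values between 0 and 1. This is a small illustrative sketch; the variable names are arbitrary.

```python
from matplotlib import colors

# Named color plus an alpha value -> (red, green, blue, alpha) in 0..1
half_red = colors.to_rgba('red', alpha=0.5)

# Hex code -> RGBA; each pair of hex digits is one channel (0..255)
greenish = colors.to_rgba('#15cc55')

print(half_red)
print(greenish)
```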

### Line and marker styles

To change the line width, we can use the `linewidth` or `lw` keyword argument. The line style can be selected using the `linestyle` or `ls` keyword arguments:

In [None]:
fig, ax = plt.subplots(figsize=(12,6))

ax.plot(x, x+1, color="blue", linewidth=0.25, label="A")
ax.plot(x, x+2, color="blue", linewidth=0.50, label="B")
ax.plot(x, x+3, color="blue", linewidth=1.00, label="C")
ax.plot(x, x+4, color="blue", linewidth=2.00)

# possible linestyle options: '-', '--', '-.', ':' (for steps, use drawstyle='steps')
ax.plot(x, x+5, color="red", lw=2, linestyle='-')
ax.plot(x, x+6, color="red", lw=2, ls='-.')
ax.plot(x, x+7, color="red", lw=2, ls=':')

# custom dash
line, = ax.plot(x, x+8, color="black", lw=1.50)
line.set_dashes([5, 10, 15, 10]) # format: line length, space length, ...

# possible marker symbols: marker = '+', 'o', '*', 's', ',', '.', '1', '2', '3', '4', ...
ax.plot(x, x+ 9, color="green", lw=2, ls='--', marker='+')
ax.plot(x, x+10, color="green", lw=2, ls='--', marker='o')
ax.plot(x, x+11, color="green", lw=2, ls='--', marker='s')
ax.plot(x, x+12, color="green", lw=2, ls='--', marker='1')

# marker size and color
ax.plot(x, x+13, color="purple", lw=1, ls='-', marker='o', markersize=2, label="D")
ax.plot(x, x+14, color="purple", lw=1, ls='-', marker='o', markersize=4)
ax.plot(x, x+15, color="purple", lw=1, ls='-', marker='o', markersize=8, markerfacecolor="red")
ax.plot(x, x+16, color="purple", lw=1, ls='-', marker='s', markersize=8, 
        markerfacecolor="yellow", markeredgewidth=2, markeredgecolor="blue");
ax.legend(loc=4, fancybox=True, title="Legend")
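
Rather than memorizing the available codes, we can ask matplotlib for them. The sketch below reads the `lineStyles` and `markers` dictionaries that are attached to the `Line2D` class; the keys are the codes accepted by `linestyle` and `marker`.

```python
from matplotlib.lines import Line2D

# Keys are the accepted codes; values are internal names or descriptions
line_styles = Line2D.lineStyles
markers = Line2D.markers

for code in line_styles:
    print('linestyle:', repr(code))

for code, description in markers.items():
    print('marker:', repr(code), '->', description)
```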

## Control over axis appearance

The appearance of the axes is an important aspect of a figure that we often need to modify to make publication-quality graphics. We need to be able to control where the ticks and labels are placed, modify the font size and possibly the labels used on the axes. In this section we will look at controlling those properties in a matplotlib figure.

### Plot range

First, let's configure the ranges of the axes. We can use the `set_ylim` and `set_xlim` methods in the axis object, or `axis('tight')` for automatically getting "tightly fitted" axes ranges:

In [None]:
fig, axes = plt.subplots(1, 3, figsize=(12, 4))

axes[0].plot(x, x**2, label="Parabola")
axes[0].plot(x, x**3, label="Cube")
axes[0].set_title("default axes ranges")
axes[0].legend(loc=0, fancybox=True, title="Legend")

axes[1].plot(x, x**2, x, x**3)
axes[1].axis('tight')
axes[1].set_title("tight axes")

axes[2].plot(x, x**2, x, x**3)
axes[2].set_ylim([0, 60])
axes[2].set_xlim([2, 5])
axes[2].set_title("custom axes range");

### Logarithmic scale

It is also possible to set a logarithmic scale for one or both axes. This functionality is in fact only one application of a more general transformation system in Matplotlib. The scale of each axis is set separately using the `set_xscale` and `set_yscale` methods, which accept one parameter (with the value "log" in this case):

In [None]:
fig, axes = plt.subplots(1, 3, figsize=(10,4))
      
axes[0].plot(x, x**2)
axes[0].plot(x, np.exp(x))
axes[0].set_title("Linear scale")

axes[1].plot(x, x**2, x, np.exp(x))
axes[1].set_yscale("log")
axes[1].set_title("Log-Linear scale (y)")

axes[2].plot(x, x**2, x, np.exp(x))
axes[2].set_yscale("log")
axes[2].set_xscale("log")
axes[2].set_title("Log-Log scale");
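
One reason log-log axes are useful: a power law $y = x^n$ becomes a straight line with slope $n$. The sketch below checks this numerically for $n = 2$, using a hypothetical array `x_pos` that starts at 1 because the logarithm of zero is undefined.

```python
import matplotlib.pyplot as plt
import numpy as np

# Start above zero: log(0) is undefined
x_pos = np.linspace(1, 5, 50)

fig, ax = plt.subplots()
ax.plot(x_pos, x_pos ** 2)
ax.set_xscale('log')
ax.set_yscale('log')

# Fitting log(y) against log(x) recovers the exponent of the power law
slope, intercept = np.polyfit(np.log10(x_pos), np.log10(x_pos ** 2), 1)
print(slope)
```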

### Placement of ticks and custom tick labels

We can explicitly determine where we want the axis ticks with `set_xticks` and `set_yticks`, which both take a list of values for where on the axis the ticks are to be placed. We can also use the `set_xticklabels` and `set_yticklabels` methods to provide a list of custom text labels for each tick location:

In [None]:
fig, ax = plt.subplots(figsize=(10, 4))

ax.plot(x, x**2, x, x**3, lw=2)

ax.set_xticks([1, 2, 3, 4, 5])
ax.set_xticklabels([r'$\alpha$', r'$\beta$', r'$\gamma$', r'$\delta$', r'$\epsilon$'], 
                   fontsize=18)

yticks = [0, 50, 100, 150]
ax.set_yticks(yticks)
ax.set_yticklabels(["$%.1f$" % y for y in yticks], fontsize=18); # use LaTeX formatted labels
plt.show()

In [None]:
# Non-functional form scatterplot
fig, ax = plt.subplots(figsize=(6, 6), dpi=200)
x1 = np.random.random(2000)
y1 = np.random.randn(2000)
ax.plot(x1, y1, 'g.', markersize=.5)
x2 = np.random.randn(2000)
y2 = np.random.randn(2000)*0.5
ax.plot(x2, y2, 'r.', markersize=.5);

There are a number of more advanced methods for controlling major and minor tick placement in matplotlib figures, such as automatic placement according to different policies. See http://matplotlib.org/api/ticker_api.html for details.
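
As one example of such a placement policy, here is a minimal sketch using `ticker.MultipleLocator`, which puts ticks at integer multiples of a base value. The bases 1 and 0.25 are arbitrary choices for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import ticker

x = np.linspace(0, 5, 100)
fig, ax = plt.subplots()
ax.plot(x, x ** 2)

# Major ticks at every integer, minor ticks every 0.25
ax.xaxis.set_major_locator(ticker.MultipleLocator(1))
ax.xaxis.set_minor_locator(ticker.MultipleLocator(0.25))

major_ticks = ax.get_xticks()
print(major_ticks)
```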

### Axis number and axis label spacing

In [None]:
# distance between x and y axis and the numbers on the axes
matplotlib.rcParams['xtick.major.pad'] = 5
matplotlib.rcParams['ytick.major.pad'] = 5

fig, ax = plt.subplots(1, 1)
      
ax.plot(x, x**2, x, np.exp(x))
ax.set_yticks([0, 50, 100, 150])

ax.set_title("label and axis spacing")

# padding between axis label and axis numbers
ax.xaxis.labelpad = 5
ax.yaxis.labelpad = 5

ax.set_xlabel("x")
ax.set_ylabel("y");

In [None]:
# restore defaults
matplotlib.rcParams['xtick.major.pad'] = 3
matplotlib.rcParams['ytick.major.pad'] = 3

#### Axis position adjustments

Unfortunately, when saving figures, the labels are sometimes clipped. When necessary, adjust the positions of the axes slightly using `subplots_adjust`:

In [None]:
fig, ax = plt.subplots(1, 1)
      
ax.plot(x, x**2, x, np.exp(x))
ax.set_yticks([0, 50, 100, 150])

ax.set_title("title")
ax.set_xlabel("x")
ax.set_ylabel("y")

fig.subplots_adjust(left=0.15, right=.8, bottom=0.1, top=0.9);

### Scientific notation

With large numbers on axes, it is often better to use scientific notation:

In [None]:
x = np.linspace(0, 5, 100)
fig, ax = plt.subplots(1, 1)
      
ax.plot(x, x**2, x, np.exp(x))
ax.set_title("scientific notation")

ax.set_yticks([0, 50, 100, 150])

from matplotlib import ticker
formatter = ticker.ScalarFormatter(useMathText=True)
formatter.set_scientific(True) 
formatter.set_powerlimits((-1,1)) 
ax.yaxis.set_major_formatter(formatter)
plt.show()

### Axis grid

With the `grid` method in the axis object, we can turn on and off grid lines. We can also customize the appearance of the grid lines using the same keyword arguments as the `plot` function:

In [None]:
fig, axes = plt.subplots(1, 2, figsize=(10,3))

# default grid appearance
axes[0].plot(x, x**2, x, x**3, lw=2)
axes[0].grid(True)

# custom grid appearance
axes[1].plot(x, x**2, x, x**3, lw=2)
axes[1].grid(color='b', alpha=0.5, linestyle='dashed', linewidth=0.5)

### Axis spines

We can also change the properties of axis spines:

In [None]:
fig, ax = plt.subplots(figsize=(6,2))

ax.spines['bottom'].set_color('blue')
ax.spines['top'].set_color('green')

ax.spines['left'].set_color('red')
ax.spines['left'].set_linewidth(2)

# turn off axis spine to the right
ax.spines['right'].set_color("none")
ax.yaxis.tick_left() # only ticks on the left side

### Twin axes

Sometimes it is useful to have dual x or y axes in a figure; for example, when plotting curves with different units together. Matplotlib supports this with the `twinx` and `twiny` functions:

In [None]:
fig, ax1 = plt.subplots()

ax1.plot(x, x**2, lw=2, color="blue")
ax1.set_ylabel(r"area $(m^2)$", fontsize=18, color="blue")
for label in ax1.get_yticklabels():
    label.set_color("blue")
    
ax2 = ax1.twinx()
ax2.plot(x, x**3, lw=2, color="red")
ax2.set_ylabel(r"volume $(m^3)$", fontsize=18, color="red")
for label in ax2.get_yticklabels():
    label.set_color("red")
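
With twin axes, each axes object keeps its own list of labeled curves, so a plain `legend()` call on one of them shows only half of the entries. One possible approach, sketched here with the same area and volume curves, is to collect the handles and labels from both axes and build a single legend.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 100)
fig, ax1 = plt.subplots()
ax1.plot(x, x ** 2, color='blue', label='area')

ax2 = ax1.twinx()
ax2.plot(x, x ** 3, color='red', label='volume')

# Collect handles and labels from both axes and draw a single legend
handles1, labels1 = ax1.get_legend_handles_labels()
handles2, labels2 = ax2.get_legend_handles_labels()
legend = ax2.legend(handles1 + handles2, labels1 + labels2, loc='upper left')
legend_labels = [text.get_text() for text in legend.get_texts()]
```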

### Axes where x and y are zero

In [None]:
fig, ax = plt.subplots()

ax.spines['right'].set_color('none')
ax.spines['top'].set_color('none')

ax.xaxis.set_ticks_position('bottom')
ax.spines['bottom'].set_position(('data',0)) # move the x-axis spine to y=0

ax.yaxis.set_ticks_position('left')
ax.spines['left'].set_position(('data',0))   # move the y-axis spine to x=0

xx = np.linspace(-0.75, 1., 100)
ax.plot(xx, xx**3);

### Text annotation

We can annotate text in matplotlib figures using the `text` function. It supports LaTeX formatting just like axis label texts and titles:

In [None]:
fig, ax = plt.subplots()

ax.plot(xx, xx**2, xx, xx**3)

ax.text(0.15, 0.2, r"$y=x^2$", fontsize=20, color="blue")
ax.text(0.65, 0.1, r"$y=x^3$", fontsize=20, color="green");
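
When the label should point at a specific location, the `annotate` method can be used instead of `text`; it additionally draws an arrow from the label to the annotated point. A minimal sketch, with coordinates chosen only for illustration:

```python
import matplotlib.pyplot as plt
import numpy as np

xx = np.linspace(-0.75, 1.0, 100)
fig, ax = plt.subplots()
ax.plot(xx, xx ** 2)

# xy is the point being annotated, xytext is where the label is placed
# (both in data coordinates by default)
note = ax.annotate(r"$y=x^2$", xy=(0.5, 0.25), xytext=(-0.5, 0.8),
                   fontsize=14, arrowprops=dict(arrowstyle='->'))
```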

## Formatting text: LaTeX, fontsize, font family

The figure above is functional, but it does not (yet) satisfy the criteria for a figure used in a publication. First and foremost, we need LaTeX-formatted text, and second, we need to be able to adjust the font size so that it is legible in a publication.

Matplotlib has great support for LaTeX. All we need to do is to use dollar signs to encapsulate LaTeX in any text (legend, title, label, etc.). For example, `"$y=x^3$"` ... $y=x^3$

But here we can run into a slightly subtle problem with LaTeX code and Python text strings. In LaTeX, we frequently use the backslash in commands, for example `\alpha` to produce the symbol $\alpha$. But the backslash already has a meaning in Python strings (the escape code character). To avoid Python messing up our LaTeX code, we need to use "raw" text strings. Raw text strings are prepended with an '`r`', like `r"\alpha"` or `r'\alpha'` instead of `"\alpha"` or `'\alpha'`:
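
The difference is easy to check directly in Python. In this small sketch, `"\alpha"` silently loses its backslash because `\a` is the escape code for the ASCII bell character, while the raw string keeps it:

```python
plain = "\alpha"    # "\a" is the escape code for the ASCII bell character
raw = r"\alpha"     # raw string: the backslash is kept as typed

print(len(plain))   # 5
print(len(raw))     # 6
print(repr(plain))  # '\x07lpha'
```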

### Some LaTeX shown in Markdown cell

*A figure*

$$x = \sum_0^{\infty} \frac{a}{b}$$

*Inline*

The formula is: $x = \sum_0^{\infty} \frac{a}{b}$.

In [None]:
fig, ax = plt.subplots()

ax.plot(x, x**2, label=r"$y = \alpha^2$")
ax.plot(x, x**3, label=r"$y = \alpha^3$")
ax.legend(loc=2) # upper left corner
ax.set_xlabel(r'$\alpha$', fontsize=18)
ax.set_ylabel(r'$y$', fontsize=18)
ax.set_title("Title");

We can also change the global font size and font family, which applies to all text elements in a figure (tick labels, axis labels and titles, legends, etc.):

In [None]:
# Update the matplotlib configuration parameters:
matplotlib.rcParams.update({'font.size': 18, 'font.family': 'serif'})

In [None]:
fig, ax = plt.subplots()

ax.plot(x, x**2, label=r"$\Omega = \alpha^2$")
ax.plot(x, x**3, label=r"$\Omega = \alpha^3$")
ax.legend(loc=2) # upper left corner
ax.set_xlabel(r'$\alpha$')
ax.set_ylabel(r'$\Omega$')
ax.set_title(r'$\aleph_0$');

A good choice of global fonts are the STIX fonts: 

In [None]:
# Update the matplotlib configuration parameters:
matplotlib.rcParams.update({'font.size': 18, 'font.family': 
                            'STIXGeneral', 'mathtext.fontset': 'stix'})

In [None]:
fig, ax = plt.subplots()

ax.plot(x, x**2, label=r"$y = \alpha^2$")
ax.plot(x, x**3, label=r"$y = \alpha^3$")
ax.legend(loc=2) # upper left corner
ax.set_xlabel(r'$\alpha$')
ax.set_ylabel(r'$y$')
ax.set_title('title');

Or, alternatively, we can request that matplotlib use LaTeX to render the text elements in the figure:

In [None]:
# Not installed here!
matplotlib.rcParams.update({'font.size': 18, 'text.usetex': True})

In [None]:
fig, ax = plt.subplots()

ax.plot(x, x**2, label=r"$y = \alpha^2$")
ax.plot(x, x**3, label=r"$y = \alpha^3$")
ax.legend(loc=2) # upper left corner
ax.set_xlabel(r'$\alpha$')
ax.set_ylabel(r'$y$')
ax.set_title('title');

In [None]:
# restore
matplotlib.rcParams.update({'font.size': 12, 'font.family': 'sans', 'text.usetex': False})
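
Restoring settings by hand is easy to forget. As an alternative, the sketch below uses `plt.rc_context`, which applies settings only inside a `with` block and restores the previous values when the block exits. The variable names are illustrative.

```python
import matplotlib
import matplotlib.pyplot as plt

size_before = matplotlib.rcParams['font.size']

# The settings apply only inside the with-block
with plt.rc_context({'font.size': 18, 'font.family': 'serif'}):
    fig, ax = plt.subplots()
    ax.set_title('temporary serif title')
    size_inside = matplotlib.rcParams['font.size']

# Back to the previous value without any manual restore step
size_after = matplotlib.rcParams['font.size']
```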

### Figure size, aspect ratio and DPI

Matplotlib allows the aspect ratio, DPI and figure size to be specified when the `Figure` object is created, using the `figsize` and `dpi` keyword arguments. `figsize` is a tuple of the width and height of the figure in inches, and `dpi` is the dots-per-inch (pixels per inch). To create an 800x400 pixel, 100 dots-per-inch figure, we can do: 

In [None]:
fig = plt.figure(figsize=(8,4), dpi=100)
x = np.linspace(0, 5, 20)
y = x ** 2
plt.plot(x,y)
plt.show()

The same arguments can also be passed to layout managers, such as the `subplots` function:

In [None]:
fig, axes = plt.subplots(figsize=(12,3))

axes.plot(x, y, 'r')
axes.set_xlabel('x')
axes.set_ylabel('y')
axes.set_title('title');
plt.show()
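
The relation between `figsize`, `dpi` and the pixel dimensions can be verified on the figure object itself. This sketch assumes the backend does not rescale the DPI (some high-resolution notebook settings do):

```python
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(8, 4), dpi=100)

# Size in inches times dots per inch gives the size in pixels
width_px, height_px = fig.get_size_inches() * fig.dpi
print(width_px, height_px)
```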

### Saving figures

To save a figure to a file, we can use the `savefig` method in the `Figure` class:

In [None]:
fig.savefig("tmp/filename.png")
!open tmp/filename.png

Here we can also optionally specify the DPI and choose between different output formats:

In [None]:
fig.savefig("tmp/filename.png", dpi=200)
!open tmp/filename.png

In [None]:
fig.savefig("tmp/filename.pdf")
!open tmp/filename.pdf

#### What formats are available, and which ones should be used for best quality?

Matplotlib can generate high-quality output in a number of formats, including PNG, JPG, EPS, SVG, PGF and PDF. For scientific papers, I recommend using PDF whenever possible. (LaTeX documents compiled with `pdflatex` can include PDFs using the `\includegraphics` command.) In some cases, PGF can also be a good alternative.
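
The formats supported by the active backend can be listed from the figure canvas. The sketch below prints them and then saves one vector (PDF) and one raster (PNG) copy into a temporary directory; the file names and the `bbox_inches='tight'` option are choices made for this example.

```python
import os
import tempfile
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 3))
ax.plot([0, 1, 2], [0, 1, 4])
ax.set_xlabel('x')
ax.set_ylabel('y')

# Mapping of file extension -> format description for the active backend
formats = fig.canvas.get_supported_filetypes()
for extension, description in sorted(formats.items()):
    print(extension, '->', description)

# Save a vector (PDF) and a raster (PNG) copy into a temporary directory;
# bbox_inches='tight' fits the saved area to the drawn content
out_dir = tempfile.mkdtemp()
pdf_path = os.path.join(out_dir, 'figure.pdf')
png_path = os.path.join(out_dir, 'figure.png')
fig.savefig(pdf_path, bbox_inches='tight')
fig.savefig(png_path, dpi=200, bbox_inches='tight')
```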