# Introduction to matplotlib

matplotlib is a popular data visualization library in Python that can produce a wide variety of plots. It is often used in scientific publications and helped create one of the first-ever images of a black hole (below) in April 2019.

As you have seen a few times in this book, pandas can create plots, so you might be wondering why we need to learn about a completely separate library. All of the default plots created with pandas are done internally with matplotlib. Whenever you create a plot with pandas, a matplotlib plotting object is returned. pandas can only create a fraction of the plots available to matplotlib.

The major benefit of using pandas for visualization is that the plots are easier to create. The trade-off is that you won't have as much control as you do with matplotlib directly. We will eventually cover the pandas and seaborn libraries for visualization in great detail, but will begin with matplotlib.

![0]

## Two interfaces of matplotlib

matplotlib was originally created by the late John Hunter in the early 2000s to mimic the plotting functionality of [MATLAB][1], a popular scientific computing application. In essence, it is a "MATLAB-like plotting library". One issue with porting a library from one language to another is that the idioms and practices of the two languages are often incompatible. Python is usually programmed differently than MATLAB.

Over the course of matplotlib's development, two separate ways to interface with matplotlib evolved. They are the **state-machine environment** and the **object-oriented** approach. The state-machine environment (known as **pyplot** from here on out) implicitly handles some of the plotting for you. 

The **object-oriented** approach gives you full control over each element of the plot and fits the style of how Python is usually developed. Most plots can be reproduced with either interface, but the object-oriented approach is explicit and, in my opinion, makes it easier to see what is happening.

### Using only the object-oriented approach

The chapters on matplotlib use only the object-oriented approach, as attempting to learn both at the start is unnecessary and confusing. Many tutorials online use pyplot, so it is something that you might need to understand. Thankfully, the code for the two approaches looks quite similar when making a single plot.
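
To illustrate how similar the two interfaces look for a single plot, here is a minimal sketch that draws the same line twice. The data values and titles are made up for illustration, and `plot` and `title` are just one pair of equivalent calls. The rest of these chapters use only the second style.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3]   # illustrative data, not from the text
y = [2, 4, 1]

# pyplot (state-machine) style: functions act on an implicit "current" axes
plt.figure()
plt.plot(x, y)
plt.title('pyplot style')
pyplot_title = plt.gca().get_title()

# object-oriented style: methods are called on explicit figure and axes objects
fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_title('object-oriented style')
oo_title = ax.get_title()
```

In the first version, matplotlib keeps track of which axes is "current". In the second, the `ax` variable makes the target of each call explicit.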

## Figure - Axes hierarchy

There is an important hierarchy that must be understood when working with matplotlib. The highest and outermost part of a plot is the **figure**, which contains all of the other plotting elements. Typically, you do not interact with it much. Inside the figure are the **axes**. This is the actual plotting surface that you normally would refer to as a 'plot'. 

![2]

A figure may contain any number of axes. An axes is a container for most of the plotting elements that get drawn onto your screen. This includes the x and y axis, lines, text, points, legends, images, and others.
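
The hierarchy can be inspected directly from the objects themselves. The following is a small sketch, assuming a figure with a single axes created by pyplot's `subplots` function (introduced later in this chapter); the variable names are only for illustration.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# The figure keeps a list of the axes it contains
axes_in_figure = fig.axes

# Each axes keeps a reference back to its parent figure
parent = ax.figure

# The x and y axis are themselves objects stored inside the axes
x_axis, y_axis = ax.xaxis, ax.yaxis
```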

### Axes is a confusing word

The term **axes** is not actually plural and does not mean more than one axis. It literally stands for a single 'plot'. It's unfortunate that this fundamental element has a name that is so confusing. I usually pronounce it "axeez" when I am teaching to help differentiate it from the word 'axis'.


[0]: images/blackhole.png
[1]: https://en.wikipedia.org/wiki/MATLAB
[2]: images/fig_ax.png

### Importing the pyplot module

Importing matplotlib into your workspace is done a little differently than numpy or pandas. You rarely will import matplotlib itself directly like this:

```python
import matplotlib
```

The above is perfectly valid code, but the matplotlib developers decided not to put all the main functionality in the top level module. When you `import pandas as pd`, you get access to nearly all of the available functions and classes of the pandas library. This isn't true with matplotlib. Instead, much of the functionality for quickly plotting is found in the `pyplot` module. This is the module that we want to import into our workspace. There are dozens of other modules in the matplotlib library, some of which will be imported when needed. Let's import the pyplot module now and alias it as `plt`, which is done by convention.

In [None]:
import matplotlib.pyplot as plt

### Use pyplot to begin

pyplot does provide lots of useful functions, one of which creates the figure and any number of axes that you desire. You can do this without pyplot, but it involves more syntax. It's also quite standard to begin the object-oriented approach by laying out the figure and axes first with pyplot and then proceed by calling methods from these objects.
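
For comparison, here is one way the figure and axes might be created without pyplot. This is only a sketch of the "more syntax" route: it builds the figure from the `Figure` class and adds one axes with `add_subplot`. A figure made this way is not tracked by pyplot, so it may need extra steps to display, which is one reason pyplot is the usual starting point.

```python
from matplotlib.figure import Figure

# Create the figure directly from its class instead of through pyplot
fig = Figure()

# Add a single axes occupying position 1 of a 1-by-1 grid
ax = fig.add_subplot(1, 1, 1)
```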

### Use the `subplots` function

The pyplot `subplots` function creates a single figure and any number of axes. If you call it with the default arguments, a single axes is created within a figure. 

### Unpack the `subplots` tuple

The `subplots` function returns a two-item tuple containing the figure and the axes. We unpack each of these objects as their own variable.

In [None]:
fig, ax = plt.subplots()
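
When more than one axes is requested, the second item of the tuple is no longer a single axes. The sketch below assumes a 1-row, 2-column layout chosen purely for illustration; the names `axs`, `left_ax`, and `right_ax` are arbitrary.

```python
import matplotlib.pyplot as plt

# One figure holding a 1-row, 2-column grid of axes
fig, axs = plt.subplots(nrows=1, ncols=2)

# With more than one axes, the second item is a NumPy array of axes objects
left_ax, right_ax = axs
number_of_axes = len(fig.axes)
```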

### Verify the returned types of the `subplots` function

Let's verify that we indeed have a figure and axes.

In [None]:
type(fig)

In [None]:
type(ax)
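
The exact class name displayed by `type` can vary between matplotlib versions. A check that should hold regardless is `isinstance` against the `Figure` and `Axes` classes, sketched here with explicit imports:

```python
import matplotlib.pyplot as plt
from matplotlib.axes import Axes
from matplotlib.figure import Figure

fig, ax = plt.subplots()

# True when the objects are instances (or subclass instances) of these classes
is_figure = isinstance(fig, Figure)
is_axes = isinstance(ax, Axes)
```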

### Distinguishing the figure from the axes

It's not obvious from looking at the image which part is the figure and which is the axes. To help distinguish them, we will set the 'facecolor' (background color) of each to a different color. Both objects have a `set_facecolor` method, which we call in an object-oriented fashion and pass a named color. Colors will be covered in greater detail in an upcoming chapter.

In [None]:
fig.set_facecolor('skyblue')
ax.set_facecolor('sandybrown')

### Where is the figure?

When using the object-oriented approach, you need to place the figure variable name as the last line in a cell to view it in the notebook. This should now hopefully distinguish the figure from the axes.

In [None]:
fig

### Why is there no assignment statement?

Notice that the two calls above to the `set_facecolor` method were made without an assignment statement. Both of these operations happened **in-place**. The calling figure and axes objects were updated without a new one getting created.
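
One way to convince yourself that the change happens in-place is to capture what the setter returns and then read the color back with the matching getter. This is a minimal sketch; `to_rgba` from `matplotlib.colors` is used only to show what the named colors look like as RGBA tuples.

```python
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba

fig, ax = plt.subplots()

# The setter returns None; the change is stored on the existing object
result = fig.set_facecolor('skyblue')
ax.set_facecolor('sandybrown')

# The matching getters report the stored colors as RGBA tuples
fig_color = fig.get_facecolor()
ax_color = ax.get_facecolor()
skyblue_rgba = to_rgba('skyblue')
```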

### Can only view the entire figure not the axes

You cannot view an axes independent of a figure. Running a cell with the axes variable name as the last line will simply output a default representation of the object. You can only view a figure in the notebook.

In [None]:
ax

## Setting the size of the figure upon creation

By default, figures in the Jupyter Notebook setup used here have a width of 6 inches and a height of 4 inches. (matplotlib's own default is 6.4 by 4.8 inches, so you may see those values instead, depending on your versions and settings.) If you are working in a Jupyter Notebook, you'll probably notice that the size of the figure on your screen isn't actually 6 inches by 4 inches. A deeper discussion on what these "inches" really mean can be found in the upcoming chapter, "Matplotlib Resolution". For now, think of these two numbers as the relative width and height of the figure.

We can change the dimensions of the figure when creating it by setting the `figsize` parameter to a two-item tuple of the width and height of the figure. Below, we create a figure with a width of 40 inches and height of 8 inches. Notebooks will always scale down the figure so that it fits in the output area.

In [None]:
fig, ax = plt.subplots(figsize=(40, 8))
fig.set_facecolor('skyblue')
ax.set_facecolor('sandybrown')

Let's create one more figure and axes with dimensions of 4 by 2 and a dots-per-inch (DPI) of 144. The DPI is the number of pixels created per inch and will be discussed in detail in the 'Matplotlib Resolution' chapter. For now, think of it as increasing the sharpness of the image.

You can also set the face color for the figure in the `subplots` function. Notice that the tick labels of the image below are much larger than those of the image above. Each tick label has a default font size that does not depend on the width of the image. The tick labels above appear very small because they are relative to a figure that has a width of 40. These same tick labels appear larger in the plot below because the width is 4.

In [None]:
fig, ax = plt.subplots(figsize=(4, 2), facecolor='skyblue', dpi=144)
ax.set_facecolor('sandybrown')
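
Since the DPI is the number of pixels per inch, the pixel dimensions of a figure are its size in inches multiplied by its DPI. The sketch below works this out for a 4 by 2 figure at 144 DPI, giving 4 × 144 by 2 × 144 pixels. It reads the values back with `get_size_inches` and the `dpi` attribute. The image a notebook actually displays may differ, depending on how the notebook renders and scales it.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(4, 2), dpi=144)

# Size in inches, as (width, height)
width_in, height_in = fig.get_size_inches()

# Pixel dimensions are inches multiplied by dots per inch
width_px = width_in * fig.dpi
height_px = height_in * fig.dpi
```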

### Beginning the object-oriented approach

Both calls to the `set_facecolor` method demonstrated the object-oriented approach to matplotlib. With this approach, every plotting object that is created may be manipulated by calling its methods.

As we will see, everything on our plot is a separate object. Each axis, tick mark, tick label, axis label, plot title, line, and many others are separate objects. Each of these objects may be explicitly referenced and assigned to a variable name. Once we have a reference to a particular object, we can then modify it by calling its methods. Thus far we have two references, `fig` and `ax`. 
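
As a small illustration of holding a reference to an object other than `fig` and `ax`, the sketch below keeps the object returned by `set_title` and modifies it through its own methods. The variable name `title` and the color are arbitrary choices for this example.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# set_title returns the Text object that represents the title
title = ax.set_title('My First Matplotlib Graph')

# With a reference in hand, modify the object by calling its methods
title.set_color('firebrick')
title_color = title.get_color()
```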

## Axes methods

Even though we've called both figure and axes methods, it is far more common to call axes methods. It is the axes that contains most of the plotting methods and is the object you will interact with most frequently. The figure is analogous to the frame of a picture. It plays a role, but isn't the main attraction. The bulk of the visualization commands will come from the axes. We'll now begin our exploration of the many axes methods.

### Getter and setter methods

Many axes methods begin with either `get_` or `set_` followed by the part of the axes that will get retrieved or modified. These kinds of methods are often referred to as 'getter' and 'setter' methods. The following list shows several of the most common properties that can be set on our axes. We will see examples of each one below.

* `title`
* `xlabel`/`ylabel`
* `xlim`/`ylim`
* `xticks`/`yticks`
* `xticklabels`/`yticklabels`
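
If you are curious how many such methods exist, one rough way to find out is to scan the attribute names of an axes object. This is only an exploratory sketch using Python's built-in `dir`; the exact list depends on your matplotlib version.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Collect the names of all attributes that start with 'set_' or 'get_'
setters = [name for name in dir(ax) if name.startswith('set_')]
getters = [name for name in dir(ax) if name.startswith('get_')]

# Properties that have both a getter and a setter
both = sorted({name[4:] for name in setters} & {name[4:] for name in getters})
```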

### Getting and setting the title of the axes

The `get_title` method returns the title of the axes as a string. There is no title at this moment, so an empty string is returned.

In [None]:
ax.get_title()

The `set_title` method places a centered title on our axes when passing it a string. Notice that a matplotlib Text object has been returned in the output area. This will be discussed later.

In [None]:
ax.set_title('My First Matplotlib Graph')

Again, the figure variable name must be placed as the last line in a cell to show in the notebook.

In [None]:
fig

Running the `get_title` method again returns the string that was just set as the title.

In [None]:
ax.get_title()

### Getting and setting the labels for the x and y axis

The x and y axis can each be labeled with a single string. By default, there are no x and y axis labels and using their getter methods returns an empty string.

In [None]:
ax.get_xlabel()

In [None]:
ax.get_ylabel()

We can provide labels for both the x and y axis using the `set_xlabel` and `set_ylabel` methods. We set both labels in the same cell and output the figure.

In [None]:
ax.set_xlabel('X Axis')
ax.set_ylabel('Y Axis')
fig

Let's verify that the getter methods now return the new axis labels.

In [None]:
ax.get_xlabel()

In [None]:
ax.get_ylabel()

### Getting and setting the x and y limits

By default, the limits of both the x and y axis begin at 0 and end at 1. Let's verify this with the `get_xlim` and `get_ylim` methods, which return a tuple of the limits.

In [None]:
ax.get_xlim()

In [None]:
ax.get_ylim()

We can change these limits with the `set_xlim` and `set_ylim` methods by passing them a new lower and upper boundary as the first two arguments. Notice that the size of the figure remains the same. Only the limits of the x and y axis have changed.

In [None]:
ax.set_xlim(0, 5)
ax.set_ylim(-20, 60, auto=True)
fig
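
The limits can also be set one side at a time by keyword. The following sketch, on a fresh axes, sets both x limits positionally and then changes only the lower y limit with the `bottom` keyword, leaving the default upper limit of 1 in place.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Positional form: lower bound, then upper bound
ax.set_xlim(0, 5)

# Keyword form: change only the lower bound and keep the default upper bound
ax.set_ylim(bottom=-20)

xlim = ax.get_xlim()
ylim = ax.get_ylim()
```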

### Getting and setting the location of the x and y ticks

In the graph above, ticks are placed every 1 unit along the x-axis and every 20 units along the y-axis. matplotlib chooses reasonable default locations for the ticks. Retrieve the location of these ticks with the `get_xticks` and `get_yticks` methods.

In [None]:
ax.get_xticks()

In [None]:
ax.get_yticks()

The tick locations are returned as numpy arrays. We can specify the exact location of the x and y ticks with the `set_xticks` and `set_yticks` methods. We pass them a list of numbers indicating where we want the ticks. When we set the y-ticks we use a number outside of the current bounds of the axis. This forces matplotlib to change the limits.

In [None]:
ax.set_xticks([1.8, 3.44, 4.4])
ax.set_yticks([-99, -29, -1, 22, 44])
fig

Let's verify that the y-axis limits have indeed changed and that the x-axis limits have not.

In [None]:
ax.get_xlim()

In [None]:
ax.get_ylim()
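
The same behavior can be reproduced in one self-contained cell. This sketch starts from a fresh axes with the limits used earlier, sets the same tick locations, and stores the limits afterward so they can be compared with the originals.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.set_xlim(0, 5)
ax.set_ylim(-20, 60)

# All x ticks fall inside the current x limits
ax.set_xticks([1.8, 3.44, 4.4])

# -99 is below the current lower y limit of -20
ax.set_yticks([-99, -29, -1, 22, 44])

xlim_after = ax.get_xlim()
ylim_after = ax.get_ylim()
```

Only the y-axis needed to grow to make all of its ticks visible, so only its limits should differ from what was set.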

### Getting and setting the x and y tick labels

matplotlib has separate objects for the tick labels, which are the values printed directly below each tick location. The current tick labels for the x-axis and y-axis are the same as the tick locations. Let's view them with the `get_xticklabels` method.

In [None]:
ax.get_xticklabels()

Pass the `set_xticklabels` and `set_yticklabels` methods a list of values (which can be strings) to use as the new labels.

In [None]:
ax.set_xticklabels(['dog', 'cat', 'snake'])
ax.set_yticklabels(['Texas', 'Oklahoma', 'Alabama', 'Arkansas', 'Florida'])
fig

Retrieve both the x and y tick labels.

In [None]:
ax.get_xticklabels()

In [None]:
ax.get_yticklabels()
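
The getters return a list of Text objects rather than plain strings. To recover the strings, call each object's `get_text` method. The sketch below uses a fresh axes with three made-up tick locations.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Fix the tick locations first, then assign one label per location
ax.set_xticks([1, 2, 3])
ax.set_xticklabels(['dog', 'cat', 'snake'])

# Each tick label is a Text object; get_text returns its string
label_strings = [label.get_text() for label in ax.get_xticklabels()]
locations = list(ax.get_xticks())
```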

### The difference between the tick locations and labels

The tick locations are a completely separate concept from the tick labels. The tick locations are always numeric and determine where on the axis the tick marks appear. The tick labels, on the other hand, are the strings that are used on the graph. By default, the tick labels are just the string version of the numeric tick location, but you can set them to be any value you want, as we did above.

### Styling text

All of the text we placed on our plot used the default matplotlib styling. Many parameters are available to customize the appearance of text; some of the most common are listed below. To view all of the options, [visit the text tutorial in the documentation][3]. Notice that most of these properties have aliases. Below, we use some of these properties when setting the title.

| Property                      | Possible Values                                                           |
|-------------------------------|---------------------------------------------------------------------------|
| `fontsize` or `size`          | size in "points", where 1 point is 1/72 of an inch                        |
| `fontname` or `name`          | name of font as a string                                                  |
| `fontweight` or `weight`      | `'normal'`, `'bold'`, `'heavy'`, `'light'`, `'ultrabold'`, `'ultralight'` |
| `fontstyle` or `style`        | `'normal'`, `'italic'`, `'oblique'`                                       |
| `color` or `c`                | text color                                                                |
| `backgroundcolor`             | color of rectangular background of text                                   |
| `horizontalalignment` or `ha` | `'left'`, `'center'`, `'right'`                                           |
| `verticalalignment` or `va`   | `'center'`, `'top'`, `'bottom'`, `'baseline'`                             |
| `rotation`                    | degree of rotation                                                        |

[3]: https://matplotlib.org/tutorials/text/text_props.html

In [None]:
ax.set_title('Tests', fontsize=15, fontname='Verdana', fontweight='bold',
             color='firebrick', backgroundcolor='steelblue', rotation=10)
fig

Any other text may be stylized with those same parameters. Below we do so with the x-label.

In [None]:
ax.set_xlabel('New and Improved X-Axis Stylized Label', fontsize=10,
              color='indigo', fontname='Times New Roman', rotation=-5)
fig
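
The aliases from the table can be used in place of the full property names. This sketch sets an x-label with the short forms `size`, `weight`, and `ha`, then reads the values back through the Text object that `set_xlabel` returns. The specific values are arbitrary.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Short aliases: size -> fontsize, weight -> fontweight, ha -> horizontalalignment
label = ax.set_xlabel('X Axis', size=12, weight='bold', ha='right')

font_size = label.get_fontsize()
font_weight = label.get_fontweight()
alignment = label.get_horizontalalignment()
```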

## Change tick label and tick line properties with `tick_params`

The `tick_params` method sets properties for both the tick labels and tick lines. Although it does not have the word "set" in its name, it is a setter method.

We just saw how the `set_xticklabels` and `set_yticklabels` methods can set the string values of the tick labels. They can also change the text style of those labels. However, the `tick_params` method has the ability to set some of those properties plus a few more. It also sets properties of the tick lines.

### Changing tick label properties

We'll begin by using `tick_params` to change tick label properties. First, set the `axis` parameter to the string `'x'`, `'y'`, or `'both'` to select which labels to change. The parameters beginning with 'label' change the tick labels, all of which are listed below.

* `labelsize`, `labelrotation`, `labelcolor` - change label size, degree of rotation, and color
* `labelleft`, `labelright`, `labelbottom`, `labeltop` - boolean to control whether labels are visible on specific sides of the axes

Here, we use `tick_params` to change properties of the y-axis labels.

In [None]:
ax.tick_params(axis='y', labelleft=False, labelright=True, labelsize=8,
               labelrotation=-15, labelcolor='red')
fig
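
To change the labels on both axes in one call, pass `axis='both'`. The sketch below does this on a fresh axes and then inspects the first y tick label to confirm that the size and color were applied. The chosen values are only examples.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Change the size and color of the tick labels on both axes
ax.tick_params(axis='both', labelsize=8, labelcolor='red')

# Inspect one of the tick label Text objects
first_label = ax.get_yticklabels()[0]
label_size = first_label.get_fontsize()
label_color = first_label.get_color()
```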

### Changing tick line properties

The tick lines are the tiny lines on each axis that denote the tick locations and point to the tick labels. There exists a `get_xticklines` method, but no `set_xticklines` to change them. Instead, use the `tick_params` method to set the properties on the lines themselves. Some of its parameters are:

* `length` - length of tick line in points
* `width` - width of tick line in points
* `pad` - distance in points of tick label from tick line
* `direction` - 'in', 'out', or 'inout'
* `top`, `bottom`, `left`, `right` - boolean corresponding to the top/bottom x-axis and left/right y-axis that determines whether ticks are visible.

In [None]:
ax.tick_params(axis='x', color='gold', length=15, width=6,
               pad=10, direction='inout', top=True, right=True)
fig
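
Although the tick lines cannot be set through a `set_xticklines` method, they can still be inspected with `get_xticklines`. The sketch below relies on an implementation detail that may change between versions: each tick line is a line object drawn as a marker, so its length and width are reported as the marker size and marker edge width.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.tick_params(axis='x', length=15, width=6)

# Tick lines are drawn as line markers, so their length and width
# are stored as the marker size and marker edge width
first_tick_line = ax.get_xticklines()[0]
tick_length = first_tick_line.get_markersize()
tick_width = first_tick_line.get_markeredgewidth()
```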

## Setting multiple properties at the same time with `set`

Most matplotlib objects have a `set` method that can be used to set many properties at once in a single line of code. Use the property name as the parameter name and set it equal to the value you would like. Here, we set the title, face color, x-axis limits, and y-axis label at once.

In [None]:
fig, ax = plt.subplots(figsize=(4, 2), facecolor='skyblue', dpi=144)
ax.set(title='Some title', facecolor='tan', xlim=(-5, 3), ylabel='mpl');
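
Each keyword passed to `set` corresponds to the setter method of the same name, so the results can be checked with the usual getters. A minimal sketch on a fresh axes:

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Equivalent to calling set_title, set_xlim, and set_ylabel separately
ax.set(title='Some title', xlim=(-5, 3), ylabel='mpl')

title = ax.get_title()
xlim = ax.get_xlim()
ylabel = ax.get_ylabel()
```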

## Exercises

### Exercise 1

<span style="color:green; font-size:16px">Create a figure with dimensions 5 inches by 3 inches with 144 DPI containing a single axes. Set the facecolor of the figure and the axes. Set a title and labels for the x and y axis. Set three ticks on the y-axis, and two on the x-axis. Give the three ticks on the y-axis a new string label. Change the limits of the x-axis and y-axis so they are larger than the minimum and maximum tick values. Change the size, shape, and color of the y-axis tick lines. Increase the size of the x tick labels and rotate them.</span>