<div class="alert alert-danger">
  <strong><h1>MATPLOTLIB TUTORIAL FOR NEWCOMERS</h1></strong>
</div>

>by Dr Juan H Klopper

- Research Fellow
- School for Data Science and Computational Thinking
- Division of Biostatistics and Epidemiology

<img src='SUN.png' width=400>

## INTRODUCTION

The matplotlib package is arguably the best known and the largest plotting package in the Python ecosystem. In this notebook I touch on some of the useful aspects when using matplotlib. It is an enormous package and I will concentrate on topics pertaining to its use in science in general.

The [matplotlib homepage](https://matplotlib.org/stable/gallery/index.html#examples-index) lists many examples. [Full documentation](https://matplotlib.org/stable/Matplotlib.pdf) is available as a PDF file for download.

We can set values for almost all of the aspects of a plot when creating the plot. Some of these can also be set on a global scale using the `rcParams` object. A full list of these parameters can be viewed [here](https://matplotlib.org/stable/api/matplotlib_configuration_api.html). The default settings are saved in a file called `matplotlibrc`. To customise this file, read [here](https://matplotlib.org/stable/tutorials/introductory/customizing.html). You can find this file by running the code cell below.

In [None]:
import matplotlib
matplotlib.matplotlib_fname()
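
As a minimal sketch of the global settings mentioned above, the snippet below changes one `rcParams` entry for the session and then overrides it temporarily with `rc_context`. The key `lines.linewidth` and the values used are illustrative choices only.

```python
import matplotlib as mpl

# Change a default for the rest of the session
mpl.rcParams['lines.linewidth'] = 2.5

# Override the setting only inside the with block
with mpl.rc_context({'lines.linewidth': 0.5}):
    inside = mpl.rcParams['lines.linewidth']

after = mpl.rcParams['lines.linewidth']  # Restored on exit from the block
print(inside, after)
```

The temporary form is convenient when a single figure needs different defaults from the rest of a notebook.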

There are many other plotting packages available in Python. A few are listed below:

- [Plotly](https://plotly.com/python/)
- [Seaborn](https://seaborn.pydata.org)
- [Bokeh](https://docs.bokeh.org/en/latest/)
- [Altair](https://altair-viz.github.io)

## TABLE OF CONTENTS

A set of links to the sections and subsections in this notebook.

- [PACKAGES USED IN THIS NOTEBOOK](#PACKAGES-USED-IN-THIS-NOTEBOOK)
- [QUICK PLOTS WITH THE PYPLOT INTERFACE](#QUICK-PLOTS-WITH-THE-PYPLOT-INTERFACE)
    - [LINE AND SCATTER PLOTS](#LINE-AND-SCATTER-PLOTS)
    - [FREQUENCY PLOTS](#FREQUENCY-PLOTS)
    - [BAR PLOTS FOR STATISTICS](#BAR-PLOTS-FOR-STATISTICS)
    - [BOX AND WHISKER PLOTS](#BOX-AND-WHISKER-PLOTS)
- [OBJECT ORIENTED INTERFACE](#OBJECT-ORIENTED-INTERFACE)
- [PLOTS FOR FUNCTIONS AND VECTORS](#PLOTS-FOR-FUNCTIONS-AND-VECTORS)
    - [CONTOUR PLOTS](#CONTOUR-PLOTS)
    - [QUIVER PLOTS](#QUIVER-PLOTS)
    - [STREAM PLOTS](#STREAM-PLOTS)

## PACKAGES USED IN THIS NOTEBOOK

In [None]:
import os # Interacting with the operating system and directory structure

In [None]:
import sympy as sym # Symbolic python (computer algebra system)

In [None]:
import numpy as np # Numerical computation

In [None]:
import matplotlib.pyplot as plt # Plotting package

In [None]:
import matplotlib.gridspec as gridspec

In [None]:
%config InlineBackend.figure_format = 'retina'

The `style.available` attribute from the pyplot module shows a list of all the available plot styles. Note that this Python environment has the scienceplots package installed using pip. A TeX installation is required for many of these themes.

In [None]:
plt.style.available

I will use the classic plotting style.

In [None]:
plt.style.use(['classic'])

A backend can also be set for matplotlib. By default, plots are generated inline with the notebook cells that create them. The code below sets this default behavior explicitly and does not need to be executed.

In [None]:
%matplotlib inline

In [None]:
import matplotlib
matplotlib.get_backend()

We can set `%matplotlib notebook`. The `get_backend` function then returns `nbAgg`. Each plot will now be interactive and has to be shut down (button on the top right corner), before a new plot can be created.

## QUICK PLOTS WITH THE PYPLOT INTERFACE

There are many ways to use matplotlib. We start by exploring the __pyplot interface__. We will take a look at the __object oriented interface__ later. The latter allows for more control, but requires more code.

A list of links to pyplot plots and settings is available [here](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.html).

### LINE AND SCATTER PLOTS

We generate points that serve as _x_ axis values for the expression $\sin{\left( x \right)}$ with some random noise taken from the standard normal distribution.

In [None]:
xvals = np.linspace(0, 4 * np.pi, 50) # Start, stop, number of points
yvals_1 = np.sin(xvals) + np.random.randn(len(xvals)) * 0.1 # Experiment 1
yvals_2 = np.sin(xvals) + np.random.randn(len(xvals)) * 0.1 # Experiment 2

We start with a bare bones line chart. While we only have $50$ data points, the `plot` function connects consecutive points with straight line segments to create a continuous line.

In [None]:
plt.plot(xvals, yvals_1)

Placing a semicolon after the last plot command suppresses the extra information printed to the screen.

In [None]:
plt.plot(xvals, yvals_1); # Semicolon suppresses other output to the screen

We can add a variety of arguments to the `plot` function. We start by changing the line style. Below, the `linestyle` argument is set to `solid`.

In [None]:
plt.plot(xvals, yvals_1, linestyle='solid');

We can use the shorthand `-` notation instead of the string `solid`.

In [None]:
plt.plot(xvals, yvals_1, linestyle='-');

Setting the argument value to `dashed` or using `--` produces an evenly spaced dashed line.

In [None]:
plt.plot(xvals, yvals_1, linestyle='dashed');

In [None]:
plt.plot(xvals, yvals_1, linestyle='--');

We can also use `dotted` and `dashdot`.

In [None]:
plt.plot(xvals, yvals_1, linestyle='dotted');

In [None]:
plt.plot(xvals, yvals_1, linestyle=':');

In [None]:
plt.plot(xvals, yvals_1, linestyle='dashdot');

In [None]:
plt.plot(xvals, yvals_1, linestyle='-.');
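
The four named line styles can also be compared in a single plot. The sketch below is illustrative only: it uses its own noise-free sine curve and shifts each curve vertically so that the styles do not overlap.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 4 * np.pi, 50)
styles = ['solid', 'dashed', 'dotted', 'dashdot']  # Named line styles

plt.figure()
for offset, style in enumerate(styles):
    # Shift each curve up by one unit so the styles are easy to compare
    plt.plot(x, np.sin(x) + offset, linestyle=style, label=style)
plt.legend();
```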

In fact, we can use a tuple to parameterize the length of dashes and spaces, as well as an offset. Below, we have an offset of $0$, then a dash of $3$ points, a space of $5$ points, a dash of $1$ point, and a space of $5$ points. We also introduce the `linewidth` argument and set it to `2`.

In [None]:
plt.plot(xvals, yvals_1, linestyle=(0, (3, 5, 1, 5)), linewidth=2);

We have used the `plot` function to generate a continuous line (even with dashes and dots) that connects our $50$ data points. We can place markers on the actual data points. There are numerous markers. A list of all the markers is available [here](https://matplotlib.org/stable/api/markers_api.html). Below, we commence with a point, setting the `marker` argument to `.`

In [None]:
plt.plot(xvals, yvals_1, linestyle='-', marker='.');

We can increase the marker size by setting the `markersize` argument.

In [None]:
plt.plot(xvals, yvals_1, linestyle='-', marker='.', markersize=10); # Using 10pt

Instead of a point, we can use a circle, setting the `marker` argument to `o`.

In [None]:
plt.plot(xvals, yvals_1, linestyle='-', marker='o', markersize=5);

There are shorthand keyword arguments. Below we use `ls` for `linestyle`, `ms` for `markersize`, and `lw` for `linewidth`. We also set the marker to `x`.

In [None]:
plt.plot(xvals, yvals_1, ls='-', marker='x', ms=8, lw=1);

The markers module in the matplotlib package provides direct use of marker names.

In [None]:
plt.plot(xvals, yvals_1, ls='-', marker=matplotlib.markers.CARETDOWNBASE, ms=8, lw=1);

It is common to see a shorthand positional argument (a format string) that combines the marker and the line style.

In [None]:
plt.plot(xvals, yvals_1, 'x-', ms=8, lw=1);

A title and axes labels can be added using the `title`, `xlabel` and `ylabel` functions in the pyplot module. These functions allow for the use of TeX notation. TeX is enclosed in a pair of `$` symbols. A full list of the subset of TeX markup that can be used in matplotlib is available [here](https://matplotlib.org/stable/tutorials/text/mathtext.html).

In [None]:
plt.plot(xvals, yvals_1, 'x-', ms=8, lw=1)
plt.title(r'Position at time $t$')
plt.xlabel(r'Time $\left[ \mu s \right]$')
plt.ylabel(r'Position $\left[ m \right]$');

The title and axes labels are too small. We can use the `fontdict` argument to change font parameters. Note that we could also simply use the `size` argument.

In [None]:
title_font_size = 16 # Setting a default title font size
plt.plot(xvals, yvals_1, 'x-', ms=8, lw=1)
plt.title(r'Position at time $t$', fontdict={'size':title_font_size})
plt.xlabel(r'Time $\left[ \mu s \right]$', fontdict={'size':title_font_size - 4})
plt.ylabel(r'Position $\left[ m \right]$', size=title_font_size - 4);

The `legend` function generates a legend. For this, we need to set a label name in the `plot` function. Below, we set the legend position and font size.

In [None]:
title_font_size = 16 # Setting a default title font size
plt.plot(xvals, yvals_1, 'x-', ms=8, lw=1, label='Experiment 1')
plt.title(r'Position at time $t$', fontdict={'size':title_font_size})
plt.xlabel(r'Time $\left[ \mu s \right]$', fontdict={'size':title_font_size - 4})
plt.ylabel(r'Position $\left[ m \right]$', fontdict={'size':title_font_size - 4})
plt.legend(loc='upper right', fontsize=10);

To add a second data set, we simply call the `plot` function again.

Below, we combine all of our knowledge and add a grid and change the axes ticks to face outwards. We also use keywords for the color of each data set and we use the `figure` function to set the `figsize` argument. This argument takes a tuple, indicating the width and height of the figure.

In [None]:
marker_type = 'o--' # Dashes with dots at actual values
color_1 = 'dodgerblue' # Color
color_2 = 'orangered' # Color
lw = 1 # Line width
ms = 5 # Marker size
label_1 = 'Experiment 1' # Label for values
label_2 = 'Experiment 2' # Label for values

In [None]:
plt.figure(figsize=(10, 6)) # Size of the plot
plt.plot(xvals, yvals_1, marker_type, color=color_1, lw=lw, ms=ms, label=label_1)
plt.plot(xvals, yvals_2, marker_type, color=color_2, lw=lw, ms=ms, label=label_2)
plt.legend(loc='upper right', fontsize=10) # Legend with position and font size
plt.xlabel(r'Time $\left[ \mu s \right]$', fontdict={'size':12}) # x axis title with TeX
plt.ylabel(r'Position [cm]', fontdict={'size':12})# y axis title
plt.title(r'Position over time $\left[ \mu s \right]$', fontdict={'size':16}) # Plot title
plt.grid() # Add x and y grid lines
plt.tick_params(which='both', direction='out'); # Ticks face out of the plot area

There are eight _shortcut_ color keywords: `k` for black, `w` for white, `r` for red, `g` for green, `b` for blue, `c` for cyan, `m` for magenta, and `y` for yellow. By using a fraction in a string such as `color='0.5'` we set a shade of grey. We can also specify HEX colors such as `color='#aabbcc'`. Finally, there is a list of named colors. The list is available [here](https://en.wikipedia.org/wiki/Web_colors).
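
The different color specifications can be inspected with the `matplotlib.colors` module. The snippet below is a small sketch that converts a shortcut keyword, a grey level, and a named color, and checks that a HEX string is a valid color. The particular colors are chosen for illustration.

```python
import matplotlib.colors as mcolors

red_rgb = mcolors.to_rgb('r')               # Shortcut keyword to an RGB tuple
grey_rgb = mcolors.to_rgb('0.5')            # Grey level given as a string
blue_hex = mcolors.to_hex('dodgerblue')     # Named color to a HEX string
valid = mcolors.is_color_like('#aabbcc')    # Check a HEX specification

print(red_rgb, grey_rgb, blue_hex, valid)
```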

Below, we add a model, $\sin{\left( x \right)}$, indicated by more points and a solid line. We also remove the dashed lines of the data interpolation.

In [None]:
xvals_model = np.linspace(0, 4 * np.pi, 200)
yvals_model = np.sin(xvals_model)

In [None]:
plt.figure(figsize=(10, 6)) # Size of the plot
plt.plot(xvals, yvals_1, 'o', color=color_1, ms=ms, label=label_1)
plt.plot(xvals, yvals_2, 'o', color=color_2, ms=ms, label=label_2)
plt.plot(xvals_model, yvals_model, '-', color='#555555', lw=2, label='Model')
plt.legend(loc='upper right', fontsize=10, ncol=3) # Legend with position and font size and columns
plt.xlabel(r'Time $\left[ \mu s \right]$', fontdict={'size':12}) # x axis title with TeX
plt.ylabel('Position [cm]', fontdict={'size':12})# y axis title
plt.ylim(top=1.5) # Adding space for the legend
plt.title('Position over time and model', fontdict={'size':16}) # Plot title
plt.grid()
plt.tick_params(which='both', direction='in');

Without the interpolated lines, these plots are actually scatter plots. Below, we generate data for an independent and a dependent variable.

In [None]:
indep = np.random.randint(100, 300, 50) / 2
dep = indep * 0.8 + np.random.randint(-10, 10, 50)

In [None]:
plt.figure(figsize=(5, 5))
plt.plot(indep, dep, 'o')
plt.title('Scatter plot')
plt.xlabel('Independent variable')
plt.ylabel('Dependent variable')
plt.grid(True, color='0.5', dashes=(5, 2, 1, 2)) # For dashes we set a tuple of line and break lengths
plt.tick_params(which='both', direction='out');

The `scatter` function also generates scatter plots. We can visualise another numerical variable by setting the size of the markers. A fourth numerical variable can be included by coloring the markers.

In [None]:
indep2 = np.random.randint(50, 200, 50)
indep3 = np.random.randint(1000, 2000, 50)

In [None]:
plt.figure(figsize=(5, 5))
plt.scatter(indep, dep, s=indep2, c=indep3, alpha=0.5)
plt.title('Scatter plot')
plt.xlabel('Independent variable')
plt.ylabel('Dependent variable')
plt.grid()
plt.tick_params(which='both', direction='out');

The `colorbar` function adds a color bar for the _fourth_ dimension used here for the `indep3` variable.

In [None]:
plt.figure(figsize=(5, 5))
plt.scatter(indep, dep, s=indep2, c=indep3, alpha=0.5)
plt.title('Scatter plot')
plt.xlabel('Independent variable')
plt.ylabel('Dependent variable')
plt.grid()
plt.tick_params(which='both', direction='out')
plt.colorbar();

Both the _x_ and _y_ axes are linear. We can change this to log scales. We start by generating values for plots and then plot with the _y_ axis scale set to $\log_{10}$.

In [None]:
range(10)

In [None]:
list(range(10))

In [None]:
vals_for_log = [10**i for i in range(10)]

plt.plot(vals_for_log, 'o--') # Single list defaults to y axis values
plt.yscale('log')
plt.title(r'Log scale for $y$ axis', size=16)
plt.ylabel(r'$\log_{10}$ scale', size=12)
plt.grid()
plt.tick_params(which='both', direction='out');

We do the same for the _x_ axis.

In [None]:
plt.plot(vals_for_log, range(10), 'o--')
plt.xscale('log')
plt.title('Log scale for $x$ axis', size=16)
plt.xlabel(r'$\log_{10}$ scale', size=12)
plt.grid()
plt.tick_params(which='both', direction='out');

The axes can use different log scales.

In [None]:
vals_for_x = [10**i for i in range(1, 6)]
vals_for_y = [2**i for i in range(1, 6)]

In [None]:
plt.plot(vals_for_x, vals_for_y, 'o--')
plt.semilogx() # Default is base=10
plt.semilogy(base=2)
plt.title('Log base 10 and log base 2', size=16)
plt.xlabel(r'$\log_{10}$ scale', size=12)
plt.ylabel(r'$\log_{2}$ scale', size=12)
plt.grid()
plt.tick_params(which='both', direction='out');

If both axes share the same log scale, we can use the `loglog` function.

In [None]:
vals_for_x = [10**i for i in range(1, 6)]
vals_for_y = [10**i for i in range(6, 11)]

In [None]:
plt.plot(vals_for_x, vals_for_y, 'o--')
plt.loglog(base=10)
plt.title('Log base 10 on both axes', size=16)
plt.xlabel(r'$\log_{10}$ scale', size=12)
plt.ylabel(r'$\log_{10}$ scale', size=12)
plt.grid()
plt.tick_params(which='both', direction='out');
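
The `xscale` and `yscale` functions used earlier also accept a base, which gives an alternative to `semilogx`, `semilogy`, and `loglog`. The sketch below assumes a matplotlib version in which the keyword is called `base` (as in the cells above) and uses its own variable names.

```python
import matplotlib.pyplot as plt

x_vals = [10**i for i in range(1, 6)]
y_vals = [2**i for i in range(1, 6)]

plt.figure()
plt.plot(x_vals, y_vals, 'o--')
plt.xscale('log')          # Base 10 by default
plt.yscale('log', base=2)  # Base 2 for the y axis
plt.grid()

current_axes = plt.gca()
print(current_axes.get_xscale(), current_axes.get_yscale())
```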

### FREQUENCY PLOTS

Frequency plots indicate the count of binned numerical variables (intervals) or the classes (sample space elements) of categorical variables. The former is called a __histogram__ and the latter a __bar plot__. In a relative frequency plot, we divide by the sample size.

Below, we generate random variables for two groups taken from normal distributions.

In [None]:
var_group_1 = np.random.normal(loc=100, scale=10, size=1000)
var_group_2 = np.random.normal(loc=98, scale=15, size=1000)

We start once again with a bare bones histogram.

In [None]:
plt.hist(var_group_1)

Two arrays and a container of bar patches are returned. The first array indicates the frequency (count) of values in each bin. The second array holds the bin edges created by matplotlib. We can override the intervals. Below, we create eight intervals (requiring nine values).

In [None]:
bin_intv = [10 * i for i in range(6, 15)]
bin_intv

The `bins` argument can now be set to these intervals.

In [None]:
plt.hist(var_group_1, bins=bin_intv, color='gray');
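
The values returned by `hist` can be unpacked and inspected. The sketch below uses its own seeded sample (an assumption made for reproducibility) and compares the counts with those from the `histogram` function in numpy. Values that fall outside the specified intervals are not counted.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)  # Seeded generator for a reproducible sample
values = rng.normal(loc=100, scale=10, size=1000)
edges_in = [10 * i for i in range(6, 15)]  # 60, 70, ..., 140

plt.figure()
counts, edges, patches = plt.hist(values, bins=edges_in, color='gray')

print(counts)        # Frequency in each of the eight bins
print(edges)         # The nine bin edges
print(len(patches))  # One rectangle per bin

# The same counts without drawing a plot
np_counts, np_edges = np.histogram(values, bins=edges_in)
```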

We can also simply specify the number of bins. Below, we create a histogram for both groups. The first sets the number of bins to $10$ and the second specifies the intervals. The histogram type, `histtype`, is set to `step`. This shows the outline of the histogram only. We also add a legend, specifying a single column setup, with the legend placed outside of the plot using the `bbox_to_anchor` argument. The assigned tuple uses relative positioning where the bottom left corner of the axes is `(0, 0)` and the top right is `(1, 1)`. We also specify the `loc` argument to fit the top left corner of the legend to the specified spot of the `bbox_to_anchor` argument. Finally, we remove the border from the legend and add a title.

In [None]:
plt.figure(figsize=(10, 5))
plt.hist(var_group_1, bins=10, histtype='step', label='Group 1')
plt.hist(var_group_2, bins=bin_intv, histtype='step', label='Group 2')
plt.legend(fontsize=8, bbox_to_anchor=(1.05, 1), loc='upper left', frameon=False, title='Groups')
plt.title('Frequency plot')
plt.xlabel('Variable value')
plt.ylabel('Frequency')
plt.tick_params(which='both', direction='out');

The `stepfilled` value for the `histtype` argument fills the histogram with color. We also add some transparency using the `alpha` argument.

In [None]:
plt.figure(figsize=(10, 5))
plt.hist(var_group_1, bins=10, histtype='stepfilled', alpha=0.5, label='Group 1')
plt.hist(var_group_2, bins=bin_intv, histtype='stepfilled', alpha=0.5, label='Group 2')
plt.legend(fontsize=8, bbox_to_anchor=(1.05, 1), loc='upper left', frameon=False, title='Groups')
plt.title('Frequency plot')
plt.xlabel('Variable value')
plt.ylabel('Frequency')
plt.tick_params(which='both', direction='out');

The `hist2d` function creates a two-dimensional histogram (heat map).

In [None]:
x = np.random.normal(loc=0, scale=1, size=10000)
y = np.random.normal(loc=0, scale=1, size=10000)

plt.hist2d(x, y, bins=30)
plt.colorbar().set_label('Frequency in bin');

The `hexbin` function generates regular six-sided polygons.

In [None]:
plt.hexbin(x, y, gridsize=30, cmap='Reds')
plt.colorbar().set_label('Frequency in bin');

The classes or sample space elements of a categorical variable can also be visualised as a frequency chart. In this case, we use a bar chart. The spaces between the bars indicate that the variable is not continuous, in contrast to a histogram, which has no spaces between its bars.

In [None]:
classes = ['A', 'B', 'C', 'D']

In [None]:
cat_group_1 = np.random.choice(classes, 500)
cat_group_2 = np.random.choice(classes, 500)

In [None]:
np.unique(cat_group_1, return_counts=True)

In [None]:
cnts_1 = np.unique(cat_group_1, return_counts=True)[1]
cnts_2 = np.unique(cat_group_2, return_counts=True)[1]

We have to subtract and add to the _x_ axis value using a numpy array, and then set the width to generate grouped bar plots. Below, we subtract $0.2$ and add $0.2$ and set a width of $0.4$. The four numerical _x_ axis values are then changed using the `xticks` function.

In [None]:
plt.figure(figsize=(10, 5))
plt.bar(np.arange(4)-0.2, cnts_1, 0.4, label='Group 1', color='gray')
plt.bar(np.arange(4)+0.2, cnts_2, 0.4, label='Group 2', color='lightgray')
plt.xticks(range(4), classes) # Replace 0, 1, 2, 3 with A, B, C, D
plt.legend(loc='upper right', fontsize=10, ncol=2)
plt.ylim(top=160); # Adding a margin for legend
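
The offsets of $-0.2$ and $+0.2$ above follow from centring two bars of width $0.4$ on each tick. As a sketch, the hypothetical helper `grouped_positions` below (not part of matplotlib) generalises this calculation to any number of groups.

```python
import numpy as np

def grouped_positions(n_classes, n_groups, total_width=0.8):
    """Return the bar centres for each group and the width of one bar."""
    width = total_width / n_groups
    base = np.arange(n_classes)
    # Centre the cluster of bars on each integer tick position
    offsets = (np.arange(n_groups) - (n_groups - 1) / 2) * width
    return [base + offset for offset in offsets], width

positions, width = grouped_positions(n_classes=4, n_groups=2)
print(positions[0], positions[1], width)
```

Each array in `positions` can then be passed as the first argument of a separate `bar` call, with `width` as the bar width.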

### BAR PLOTS FOR STATISTICS

Bar plots can also be used to indicate statistics such as mean and standard deviation. Below, we generate some data for the coefficient of thermal expansion of two metals. The heights of the bars indicate the means and the error bars visualise the standard deviations.

In [None]:
aluminium = np.random.normal(loc=0.00004, scale=0.000015, size=100)
copper = np.random.normal(loc=0.000025, scale=0.00001, size=100)

In [None]:
al_mean = np.mean(aluminium)
al_std = np.std(aluminium, ddof=1)

cu_mean = np.mean(copper)
cu_std = np.std(copper, ddof=1)
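
The `ddof=1` argument above requests the sample standard deviation, which divides by $n - 1$ instead of $n$. The small example below, with made-up values, shows the difference between the two.

```python
import numpy as np

sample = np.array([2, 4, 4, 4, 5, 5, 7, 9])  # Illustrative values, mean is 5

pop_std = np.std(sample)             # Divides the sum of squares by n
sample_std = np.std(sample, ddof=1)  # Divides the sum of squares by n - 1

print(pop_std, sample_std)
```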

The `bar` function generates a bar plot. On the $x$ axis we place the values $0$ and $1$ by using the `range` function. We overwrite the numerical $x$ axis values with the `xticks` function.

In [None]:
plt.figure(figsize=(5, 5))
plt.bar(range(2),
        [al_mean, cu_mean],
        0.8, # Width
        yerr=[al_std, cu_std],
        align='center',
        capsize=8,
        color='gray',
        alpha=0.5)
plt.title('Coefficient of thermal expansion')
plt.xticks(range(2), ['Aluminium', 'Copper'])
plt.ylabel('Coefficient of thermal expansion $[{}^{o}C^{-1}]$');

Since the bar can appear at the left and right limits of the plot we can add limits to the $x$ axis. Below we use the `xlim` function and set the arguments `-0.5, 1.5`.

In [None]:
plt.figure(figsize=(5, 5))
plt.bar(range(2),
        [al_mean, cu_mean],
        0.4,
        yerr=[al_std, cu_std],
        align='center',
        capsize=8,
        color='gray',
        alpha=0.5)
plt.title('Coefficient of thermal expansion')
plt.xticks(range(2), ['Aluminium', 'Copper'])
plt.ylabel('Coefficient of thermal expansion $[{}^{o}C^{-1}]$')
plt.xlim(-0.5, 1.5);

The `errorbar` function can be used separately.

In [None]:
plt.figure(figsize=(5, 5))
plt.bar(range(2),
        [al_mean, cu_mean],
        0.4,
        color='lightgray')
plt.errorbar(range(2),
             [al_mean, cu_mean],
             yerr=[al_std, cu_std],
             fmt="o", # Data marker
             color="r")
plt.title('Coefficient of thermal expansion', size=16)
plt.xticks(range(2), ['Aluminium', 'Copper'])
plt.ylabel('Coefficient of thermal expansion $[{}^{o}C^{-1}]$', size=12)
plt.xlim(-0.5, 1.5);

The error bars can be used on their own.

In [None]:
plt.errorbar(range(2),
             [al_mean, cu_mean],
             yerr=[al_std, cu_std],
             fmt="d", # Diamond
             color="r")
plt.title(r'Coefficient of thermal expansion ($\mu , \; \sigma$)')
plt.xticks(range(2), ['Aluminium', 'Copper'])
plt.ylabel('Coefficient of thermal expansion $[{}^{o}C^{-1}]$')
plt.ylim(0, 0.00006) # Including zero
plt.grid()
plt.xlim(-0.5, 1.5);

### BOX AND WHISKER PLOTS

Box-and-whisker plots also give an indication of the distribution of values for a numerical variable. Below, we use the `boxplot` function to visualise the same information as in the histograms above.

In [None]:
plt.figure(figsize=(10, 5))
plt.boxplot([var_group_1, var_group_2],
            flierprops={'marker':'D'}, # Suspected outliers are diamonds
            meanline=True, # Calculate mean
            showmeans=True, # Show mean
            notch=True, # To indicate CI
            bootstrap=5000) # 95% CI around the median
plt.title('Box-and-whisker plot for variable values in two groups')
plt.xlabel('Group')
plt.ylabel('Variable value')
plt.xticks([1, 2],['Group 1', 'Group 2']) # Change 1 and 2 to Group 1 and Group 2
plt.tick_params(which='both', direction='in')
plt.grid()
ax = plt.gca() # Note on these two last lines below
ax.set_frame_on(False);
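
By default, `boxplot` marks values as suspected outliers when they lie more than $1.5$ times the interquartile range beyond the first or third quartile. The sketch below reproduces that rule with numpy for a small made-up data set.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 50])  # Illustrative values

q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1                   # Interquartile range
lower_fence = q1 - 1.5 * iqr    # Values below this are suspected outliers
upper_fence = q3 + 1.5 * iqr    # Values above this are suspected outliers

fliers = data[(data < lower_fence) | (data > upper_fence)]
print(lower_fence, upper_fence, fliers)
```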

## OBJECT ORIENTED INTERFACE

More options become available when we use the object oriented matplotlib interface. You can read more about subplots [here](https://matplotlib.org/stable/gallery/subplots_axes_and_figures/subplots_demo.html).

When using the pyplot module, a single plot is created.

In [None]:
plt.plot(xvals, yvals_1, 'o')
plt.title('Simple scatter plot');

This is actually a subplot within a figure. In matplotlib these two entities are usually assigned to the variables `fig` and `ax`. We can create a figure with a single subplot and then return more information about the current subplot using the `gca` function.

In [None]:
plt.plot(xvals, yvals_1, 'o'); # Don't add anything else to the plot
ax = plt.gca()

The `type` function shows the class of `ax`. In older matplotlib releases this is `matplotlib.axes._subplots.AxesSubplot`, while more recent releases report `matplotlib.axes._axes.Axes`.

In [None]:
type(ax)

We use the object oriented interface to separate the two entities.

In [None]:
fig, ax = plt.subplots() # Single plot

Below, we create three rows and two columns of subplots in a figure. Since this is an array along two dimensions, we use indexing to reference each axes in the figure.

In [None]:
# Subplot [row] [column] indexing from single ax 
fig, ax = plt.subplots(3, 2, figsize=(16, 9)) # Three rows and two columns of plots

ax1 = ax[0][0] # Indices are [row][column]
ax1.set_title('Top left plot')

ax6 = ax[2][1]
ax6.set_title('Bottom right')

plt.tight_layout() # Can fix overlapping by using padding
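
The array of axes returned by `subplots` is a numpy array, so the usual indexing and iteration tools apply. The sketch below, with illustrative titles and labels, shows comma indexing and the `flat` iterator.

```python
import matplotlib.pyplot as plt

fig, axes = plt.subplots(3, 2, figsize=(8, 6))
print(axes.shape)  # Rows and columns of the grid

axes[0, 0].set_title('Top left')      # Equivalent to axes[0][0]
axes[2, 1].set_title('Bottom right')

# The flat iterator visits every axes row by row
for i, axis in enumerate(axes.flat):
    axis.set_xlabel(f'Axes {i}')

fig.tight_layout()
```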

There are numerous ways to add axes to a figure. We will see some in the rest of the notebook. Below, we generate a figure first and then add two axes, using the `add_axes` method. It takes a list as argument, with all values given as fractions of the figure, where the bottom left is `(0, 0)` and the top right is `(1, 1)`. The first two values are the distances of the axes from the left edge and the bottom edge of the figure. The last two values are the width and height. We then use the same technique to generate a second axes inside the first.

In [None]:
# Add subplot from plt
fig_1 = plt.figure(figsize=(6, 4))
axes_1 = fig_1.add_axes([0.05, 0.05, 0.95, 0.95]) # [left, bottom, width, height]
axes_1.set_title('Main plot')

axes_2 = fig_1.add_axes([0.6, 0.6, 0.3, 0.3])
axes_2.set_title('Inside');

The frames (borders) around a plot, called spines, can be removed. To remove all spines, we call the `set_frame_on` method with the argument `False`.

In [None]:
fig, ax = plt.subplots(1, 1)
ax.set_frame_on(False) # Remove all spines

Below, we remove the top and right spines individually using the `set_visible` method. There are also bottom and left spines.

In [None]:
fig, ax = plt.subplots(1, 1)
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

As an example of using subplots, we create a single row of plots, with two columns. We also add a text box to the histogram.

In [None]:
txt = '\n'.join((r'$\sigma_{1}=%.2f$'%(np.std(var_group_1)), r'$\sigma_{2}=%.2f$'%(np.std(var_group_2))))

In [None]:
# Separate axes as a tuple
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(18, 8), frameon=False)

ax1.hist(var_group_1, bins=bin_intv, histtype='step', label='Group 1', lw=3, color='deepskyblue')
ax1.hist(var_group_2, bins=bin_intv, histtype='step', label='Group 2', lw=3, color='orangered')
ax1.legend(loc='upper right', fontsize=8, ncol=2, edgecolor='gray', facecolor='lightgrey')
ax1.set_title('Frequency plot')
ax1.set_xlabel('Variable value')
ax1.set_ylabel('Frequency')
ax1.text(65, 200, txt, bbox={'facecolor':'white', 'edgecolor':'gray'})
ax1.tick_params(which='both', direction='out')
ax1.grid()

ax2.boxplot([var_group_1, var_group_2], flierprops={'marker':'D'})
ax2.set_title('Box-and-whisker plot for variable values in two groups')
ax2.set_xlabel('Group')
ax2.set_ylabel('Variable value')
ax2.yaxis.grid(True)
ax2.set_xticklabels(['Group 1', 'Group 2'])
ax2.tick_params(which='both', direction='out')
ax2.set_frame_on(False)

fig.text(0.5, -0.05, 'Experiment 1', ha='center', fontsize=12);

In the `ax1.text` call above, we set the _x_ and _y_ coordinates according to the axes tick values of the plot. If we use the keyword argument and value `transform=ax1.transAxes`, the coordinates will be detached from the tick values. Now $(0,0)$ will be the bottom left corner and $(1,1)$ will be the top right corner of the axes.
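
The difference between the two coordinate systems can be seen in a small sketch. The data, positions, and strings below are illustrative only.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 100], [0, 1000])

# Data coordinates: the position follows the tick values
ax.text(50, 500, 'data coordinates')

# Axes coordinates: (0, 0) is bottom left and (1, 1) is top right
label = ax.text(0.05, 0.95, 'axes coordinates',
                transform=ax.transAxes, va='top',
                bbox={'facecolor': 'white', 'edgecolor': 'gray'})
```

The second text box stays in the top left corner of the axes even if the axes limits change.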

The gridspec module allows even more flexibility when it comes to plots. Below, we incorporate three of our previous plots in two rows. The top row spans a single plot and the bottom row has plots in two columns.

In [None]:
fig = plt.figure(tight_layout=True, figsize=(16, 9))
gs = gridspec.GridSpec(2, 2)

ax1 = fig.add_subplot(gs[0, :])
ax1.plot(xvals, yvals_1)
ax1.set_title('Basic plot')

ax2 = fig.add_subplot(gs[1, 0])
ax2.plot(xvals, yvals_1, marker_type, color=color_1, lw=lw, ms=ms, label=label_1)
ax2.plot(xvals, yvals_2, marker_type, color=color_2, lw=lw, ms=ms, label=label_2)
ax2.legend(loc='upper right', fontsize=8)
ax2.set_xlabel(r'Time $\left[ \mu s \right]$')
ax2.set_ylabel('Position [cm]')
ax2.set_title('Position over time')
ax2.tick_params(which='both', direction='in')
ax2.grid(True)

ax3 = fig.add_subplot(gs[1, 1])
ax3.plot(xvals, yvals_1, 'o', color=color_1, ms=ms, label=label_1)
ax3.plot(xvals, yvals_2, 'o', color=color_2, ms=ms, label=label_2)
ax3.plot(xvals_model, yvals_model, '-', label='Model')
ax3.legend(loc='upper right', fontsize=8, ncol=3)
ax3.set_xlabel(r'Time $\left[ \mu s \right]$')
ax3.set_ylim(top=1.5)
ax3.set_title('Position over time and model')
ax3.tick_params(which='both', direction='in')
ax3.grid(True);

You will have noticed that not all functions have the same names when comparing the pyplot and the object oriented interfaces. Below is a list of some of the differences.

- `xlabel` $\to$ `set_xlabel`
- `ylabel` $\to$ `set_ylabel`
- `xlim` $\to$ `set_xlim`
- `ylim` $\to$ `set_ylim`
- `title` $\to$ `set_title`

There is also a general `set` method that can take many of these settings as keyword arguments.

In [None]:
fig, ax = plt.subplots()
ax.plot(xvals, yvals_1)
ax.set(xlim=(0, 13), xlabel=r'Time $\left[ \mu s \right]$', ylabel=r'Position $\left[ cm \right]$',
      title='Position over time');

## PLOTS FOR FUNCTIONS AND VECTORS

### CONTOUR PLOTS

There are two types of contour plots. The `contourf` function creates filled contour plots and the `contour` function creates contour lines only (without the colour fill).

Below, we consider the function $f \left( x,y \right) = x^{2} + y^{2}$.

We have to use numpy to set up a grid of _x_ and _y_ coordinates. The `linspace` function can generate an array of values that we can use for both axes. The `meshgrid` function then creates the coordinate values for each point on the grid, and we evaluate the function at every point.

In [None]:
_ = np.linspace(-1, 1, 200)
X, Y = np.meshgrid(_, _)
f = X**2 + Y**2
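
To see what `meshgrid` returns, it helps to look at a very small grid. The sketch below uses three _x_ values and two _y_ values, with its own variable names so that the arrays above are not overwritten.

```python
import numpy as np

x_axis = np.array([0, 1, 2])
y_axis = np.array([10, 20])

grid_x, grid_y = np.meshgrid(x_axis, y_axis)
print(grid_x)  # Each row repeats the x values
print(grid_y)  # Each column repeats the y values

# Element-wise arithmetic evaluates the function at every grid point
grid_f = grid_x**2 + grid_y**2
print(grid_f.shape)
```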

Now, we create a filled contour plot.

In [None]:
plt.contourf(X, Y, f, levels=60, cmap='inferno')
plt.colorbar(label='Value of $f(x,y)$')
plt.title('Filled contour plot');

We can add the `vmin` or the `vmax` arguments to the `contourf` function. Either of these can be used to set the limit beyond which the colour is constant. This is helpful for infinities.

Instead of a filled contour plot, we can view only the contour lines. Below, we add values to the contours themselves instead of adding a color bar.

In [None]:
C = plt.contour(X, Y, f, levels=10, cmap='plasma')
plt.clabel(C, fontsize=12)
plt.title('Contour plot with values');

### QUIVER PLOTS

Quiver plots draw vectors as arrows. Each vector takes an _x_ and a _y_ coordinate for its tail and then a component along each of the two coordinate directions.

In [None]:
# Coordinate positions for tail of vector
x_pos = 0
y_pos = 0

# Magnitude for x and y directions (not to scale)
x_direct = 2
y_direct = 1

We now plot the vector with the values above and set the _x_ axis and _y_ axis limits.

In [None]:
fig, ax = plt.subplots(figsize=(5, 5))
ax.quiver(x_pos, y_pos, x_direct, y_direct, scale=5) # Scale the magnitude
ax.axis([-0.01,0.03, -0.01, 0.03]) # Axes limits
ax.set_aspect('equal'); # Set equal aspect ratio for the axes

We do not create individual vectors for a vector field. Below we see a function, $\mathbf{F}$, representing a vector field.

$$\mathbf{F} = \frac{x}{5} \hat{i} - \frac{y}{5} \hat{j} \\ \mathbf{F}\left( x,y \right) = \left( \frac{x}{5}, -\frac{y}{5} \right)$$

We use the `meshgrid` function again. The variables `u` and `v` hold the components of $\mathbf{F}$.

In [None]:
_ = np.arange(0, 2.2, 0.2) # Starting points

In [None]:
X, Y = np.meshgrid(_, _) # Mesh of x and y coordinates
u = X/5 # x direction
v = -Y/5 # y direction

The `quiver` function creates multiple vectors demonstrating the vector field.

In [None]:
fig, ax = plt.subplots(figsize=(7,7))
ax.quiver(X, Y, u, v)
ax.axis([-0.2, 2.3, -0.2, 2.1])
ax.set_aspect('equal')
ax.set_title(r'Vector field $\mathbf{F}$')
ax.text(0, 0,
        r'$\vec{F} = \frac{x}{5} \hat{i} - \frac{y}{5} \hat{j}$',
        fontdict={'fontsize':14},
        bbox={'facecolor':'white', 'edgecolor':'gray'});

We can also use the `quiver` function for a gradient field. We start with a multivariable function shown below.

$$f \left( x,y \right) = x^{2} - y^{2}$$

We are interested in the gradient.

$$\nabla f = \left( \frac{\partial{f}}{\partial{x}} , \frac{\partial{f}}{\partial{y}} \right) = \left( f_{x} , f_{y} \right)$$

Sympy can be used to calculate the partial derivatives analytically. We set the variables `x` and `y` to be mathematical symbols.

In [None]:
x, y = sym.symbols('x y')

The variable `f` holds our symbolic function.

In [None]:
f = x**2 - y**2
f

We use the `diff` method to calculate the partial derivative of $f$ with respect to $x$, written as $f_{x}$.

In [None]:
f.diff(x) # First derivative of f with respect to x

The partial derivative of $f$ with respect to $y$ is written as $f_{y}$.

In [None]:
f.diff(y) # First derivative of f with respect to y

We consider the point $p \left( 2,2 \right)$. Below, we calculate $f_{x} \left( p \right)$. The `subs` method allows us to substitute values for our variables.

In [None]:
f.diff(x).subs(x, 2).subs(y, 2)

We also calculate $f_{y} \left( p \right)$.

In [None]:
f.diff(y).subs(x, 2).subs(y, 2)
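
The two substitutions can also be done in one step. The following is a minimal sketch, assuming sympy is imported as `sym`. The names `grad_f`, `p`, `grad_at_p` and `grad_func` are my own and do not appear elsewhere in this notebook. It collects both partial derivatives in a list, substitutes the coordinates of $p$ with a dictionary, and uses `lambdify` to turn the symbolic gradient into a numerical function.

```python
import sympy as sym

x, y = sym.symbols('x y')
f = x**2 - y**2

# Gradient as a list of the two partial derivatives
grad_f = [f.diff(var) for var in (x, y)]

# Substitute both coordinates of p = (2, 2) at once with a dictionary
p = {x: 2, y: 2}
grad_at_p = [g.subs(p) for g in grad_f]
print(grad_at_p)  # [4, -4]

# Numerical version of the gradient that also accepts numpy arrays
grad_func = sym.lambdify((x, y), grad_f, 'numpy')
print(grad_func(2, 2))
```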

We now have that $\nabla f \left( p \right) = \left( 4,-4 \right)$. We can view this point on our gradient plot to make sure that our plot is accurate.

In [None]:
_ = np.linspace(-2, 2, 20)
X, Y = np.meshgrid(_, _)

In [None]:
u = 2 * X # Partial derivative of f with respect to x
v = -2 * Y # Partial derivative of f with respect to y

We now create the gradient plot. This time we assign the plot to a variable and pass it to the `quiverkey` function, which adds a reference arrow. The position of the key is given in axes coordinates, with `(0, 0)` in the bottom left and `(1, 1)` in the top right. We state the length that the key represents and place its label to the east of the arrow using `labelpos='E'`.

In [None]:
fig, ax = plt.subplots(figsize=(7, 7))
Q = ax.quiver(X, Y, u, v)
ax.quiverkey(Q, 0.85, 1.02, 10, '[10 units]', labelpos='E')
ax.set_aspect('equal')
ax.set_title('Gradient plot')
ax.set_xlabel('$x$ axis')
ax.set_ylabel('$y$ axis');

At the point $p$, the plotted vector does indeed point in the direction of $\left( 4, -4 \right)$.

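Instead of using sympy for analytical partial differentiation, we can use the `gradient` function from numpy. The step size that we pass must match the spacing of the meshgrid object. With 20 points from $-2$ to $2$ the spacing is $4/19 \approx 0.21$, so we read it from `X` rather than typing it by hand. Since the spacing is the same in both directions, we only pass a single value. Note that the function returns the derivative with respect to $y$ first, hence `v, u`.

In [None]:
my_f = X**2 - Y**2
step = X[0, 1] - X[0, 0] # Grid spacing in both directions
v, u = np.gradient(my_f, step) # Derivative along the rows (y) is returned first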
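
A quick check can confirm the order of the returned arrays and the effect of the step size. The sketch below rebuilds the grid with the name `pts` (my own choice) and compares the numerical result with the analytical partial derivatives $2x$ and $-2y$. Only interior points are compared, because `gradient` uses one-sided differences on the edges of the grid, which are less accurate.

```python
import numpy as np

pts = np.linspace(-2, 2, 20)
X, Y = np.meshgrid(pts, pts)
step = pts[1] - pts[0]  # Grid spacing of the 20 points on [-2, 2]

my_f = X**2 - Y**2
# Axis 0 (rows) follows y and axis 1 (columns) follows x
df_dy, df_dx = np.gradient(my_f, step)

# Central differences are exact for a quadratic, up to rounding error
interior = (slice(1, -1), slice(1, -1))
print(np.allclose(df_dx[interior], 2 * X[interior]))
print(np.allclose(df_dy[interior], -2 * Y[interior]))
```

If the step size passed to `gradient` differs from the true grid spacing, every vector is scaled by the same factor, so the directions in the plot stay correct but the lengths do not.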

In [None]:
fig, ax = plt.subplots(figsize=(7, 7))
Q = ax.quiver(X, Y, u, v)
ax.quiverkey(Q, 0.85, 1.02, 10, '[10 units]', labelpos='E')
ax.set_aspect('equal')
ax.set_title('Gradient plot')
ax.set_xlabel('$x$ axis')
ax.set_ylabel('$y$ axis');

Now we add the gradient vectors to a contour plot.

In [None]:
fig, ax = plt.subplots(figsize=(7, 7))
Q = ax.quiver(X, Y, u, v)
ax.quiverkey(Q, 0.85, 1.02, 10, '[10 units]', labelpos='E')
C = ax.contour(X, Y, my_f, levels=10, cmap='plasma')
ax.clabel(C, inline=True, fontsize=10)
ax.set_aspect('equal')
ax.set_title('Gradient plot')
ax.set_xlabel('$x$ axis')
ax.set_ylabel('$y$ axis');

The gradient is pointing uphill and the vectors are perpendicular to the contours.

### STREAM PLOTS

Stream plots can likewise help us visualise a vector field. Below, we use subplots to create three plots. The first is a basic stream plot of the gradient field of our previous function $f$. The magnitude of the vector at each coordinate point is demonstrated using colour in the second plot. The last plot uses variable line width to demonstrate the magnitude of the vector at each coordinate point. We calculate the magnitude in the usual way for $\mathbb{R}^{2}$, using $\sqrt{u^{2} + v^{2}}$ for the components $u$ and $v$.

In [None]:
fig, axes = plt.subplots(1, 3, figsize=(12, 4), sharey=True) # Share the y axis between the three plots

ax = axes[0]
ax.streamplot(X, Y, u, v)
ax.set_title('Basic stream plot')

ax = axes[1]
magn = np.sqrt(u**2 + v**2) # Magnitude of each vector
ax.streamplot(X, Y, u, v, color=magn, cmap='magma')
ax.set_title('Colour stream plot')

ax = axes[2]
lw = 4 * magn / np.max(magn)
ax.streamplot(X, Y, u, v, linewidth=lw)
ax.set_title('Velocity stream plot');
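
The colours in the second plot are easier to read with a colour bar. As a minimal sketch, the object returned by `streamplot` has a `lines` attribute that can be passed to `fig.colorbar`. The grid is rebuilt here so that the example stands on its own, and the names `pts` and `strm`, the figure size and the label text are my own choices.

```python
import numpy as np
import matplotlib.pyplot as plt

pts = np.linspace(-2, 2, 20)
X, Y = np.meshgrid(pts, pts)
u, v = 2 * X, -2 * Y         # Gradient of f = x**2 - y**2
magn = np.sqrt(u**2 + v**2)  # Magnitude of each vector

fig, ax = plt.subplots(figsize=(5, 4))
strm = ax.streamplot(X, Y, u, v, color=magn, cmap='magma')
fig.colorbar(strm.lines, ax=ax, label='Magnitude')  # strm.lines holds the coloured lines
ax.set_title('Colour stream plot with colour bar')
```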

## SAVING PLOTS

A plot can be saved in many formats using the `savefig` function. The os package allows us to interact with the folder structure on our computer. We can get the working directory in which our notebook is saved using the `getcwd` function, in case we need to use it when saving the plot.

In [None]:
os.getcwd()

In this case, I have a subfolder named `images` in the current working folder. This is where the plot will be saved.

In [None]:
fig.savefig('images/streamplots.png')

We can also specify the full path to the file directly.

In [None]:
fig.savefig('/Users/juan/Desktop/streamplots.png')
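
The `savefig` function raises an error if the target folder does not exist. The sketch below creates the folder first and saves the same figure in two formats, since the file extension determines the format. A temporary folder stands in for the working directory, and the file names, the small example plot and the `dpi` value are assumptions made for illustration.

```python
import os
import tempfile
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

# A temporary folder stands in for the working directory here
out_dir = os.path.join(tempfile.mkdtemp(), 'images')
os.makedirs(out_dir, exist_ok=True)  # Create the folder if it is missing

png_path = os.path.join(out_dir, 'example.png')
pdf_path = os.path.join(out_dir, 'example.pdf')
fig.savefig(png_path, dpi=300, bbox_inches='tight')  # Raster image at 300 dots per inch
fig.savefig(pdf_path, bbox_inches='tight')           # Vector format, chosen by the extension
```

The `bbox_inches='tight'` argument trims the white space around the figure, which is often useful when the image is placed in a document.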

## ONLINE TEXTBOOK

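The matplotlib chapter of the [Python Data Science Handbook](https://jakevdp.github.io/PythonDataScienceHandbook/04.00-introduction-to-matplotlib.html) by Jake VanderPlas is available online.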